Last year when Framework announced the Framework Desktop I immediately ordered one. I’d been wanting a new gaming PC, but I’d also been kicking around the idea of running a local LLM. When it finally arrived it worked great for gaming… but from an LLM standpoint there wasn’t much that would run on the AMD hardware. Over the next few months more tools became available, but it was very slow going. I had many long nights where I’d work and work and work and end up right back where I started.
So I got a Claude Code subscription and used it to help me build out my LLM setup. I made a lot of progress, but now I was comparing my local LLM to Claude, and there was no comparison.
Then I started messing with OpenClaw. First with Claude (expensive, fast), then with my local llama.cpp (cheap, frustrating). I didn’t know enough about it, so I used Claude to help me build a custom app around my llama.cpp. That was fun and I learned a lot, but I was spending most of my time chasing bugs instead of actually optimizing anything.
Around that time I heard about Qwen3-Coder-Next, dropped it into llama.cpp, and wow that was a huge step forward. Better direction-following, better tool calls, just better. I felt like my homegrown app was now holding the model back, so I converted over to OpenClaw. Some growing pains, but once things settled I was impressed again.
We built a lot of tooling along the way: a vector database memory system that cleans itself up each night, a filesystem-based context system, speech-to-text and text-to-speech, and a vision model. At this point my local LLM could see me, hear me, speak to me, and remember things about me, and all of it was built to be LLM-agnostic so Claude and my local system could share the same tools.
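For anyone curious what the nightly memory cleanup looks like conceptually, here’s a minimal sketch. The function name and age threshold are illustrative, not my actual code, and a real setup would prune entries in a vector database rather than a plain list — but the idea is the same:

```python
import time

# Illustrative threshold: keep memories for ~30 days (my real cutoff differs).
MAX_AGE_SECONDS = 30 * 24 * 3600

def nightly_cleanup(memories, now=None):
    """Drop entries older than the cutoff; return the surviving list."""
    now = time.time() if now is None else now
    return [m for m in memories if now - m["created_at"] <= MAX_AGE_SECONDS]

# Each entry pairs some text (stand-in for an embedding + metadata) with a timestamp.
memories = [
    {"text": "user likes chess", "created_at": time.time()},             # fresh -> kept
    {"text": "stale note", "created_at": time.time() - 60 * 24 * 3600},  # ~60 days -> pruned
]
memories = nightly_cleanup(memories)
print(len(memories))  # 1
```

Running this on a schedule each night keeps the store from growing without bound, which matters when every remembered item is a candidate for retrieval into a limited context window.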
I was still leaning on Claude heavily for coding, because honestly it’s amazing at it. I decided to give Qwen a small test project: build a web-based kanban board that works on both desktop and mobile. It built it… but it sucked. Drag between columns? Broken. Fixed that, now you can’t add items. Fixed that, dragging broke on mobile. I kept asking Claude to help troubleshoot and it kept wanting to just rewrite the app. Finally I gave in and said “just fix it,” and Claude rewrote the whole thing and it was great. I was disheartened. On top of that, Qwen kept getting into these loops, sometimes running for hours doing nothing productive.
So about a week and a half ago I decided to rethink what I even wanted my local LLM to do. Coding was obviously out. I decided to start fresh and use it to help me journal. A few times a day it reaches out, asks what I’m doing, and if it’s relevant, adds an entry to my journal.
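The “reaches out a few times a day” part doesn’t need anything fancy; a plain cron schedule is enough. The script path below is a placeholder for whatever kicks off the check-in prompt in your setup:

```shell
# Hypothetical crontab entries: trigger the journal check-in at 9am, 2pm, and 8pm.
# /opt/agent/checkin.sh is a placeholder for whatever sends the prompt to the model.
0 9,14,20 * * * /opt/agent/checkin.sh
```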
I went through a couple more model swaps trying to get it stable. Qwen3.5 was better than Coder-Next for this use case, but I was still hitting loop issues. Even so, it was consistently prompting me and doing a decent job with the journal, which was at least a step in the right direction.
Then Qwen3.6 dropped. I loaded the Q6 quant the same day it was released and could immediately tell it was faster and that the output quality was much higher. And I realized earlier today that since I switched to Qwen3.6 I haven’t had to ask Claude to check in on Qwen even once. The looping is gone. It’s actually following the anti-loop protocols I’ve been trying to get models to follow for months.
I haven’t tried coding with it yet (I don’t have high hopes there) but I’ve given it the ability to create and modify its own skills and it’s been doing that beautifully. Scheduled tasks, multiple agents (voice assistant, primary, Home Assistant), all running smoothly.
My reliance on Claude has dropped off sharply since moving to Qwen3.6, and my system resource usage has gone down significantly too. If you’ve tried to get a local LLM setup running and gave up out of frustration… now might be a good time to jump back in, especially if you know your hardware should be able to handle it.


This is very exciting to hear! I might see about creating a dedicated coding agent to test it out.
Creating dedicated agents isn’t something I’ve really done much of yet; I’ve mostly just kept it as a single agent. But this might be a good use case for it.
If you do, please write back and let us all know how it fares. I think a lot of us are pinning our hopes on the Qwen line as “Claude at Home”. I don’t think 3.6 is it… but 3.7? 3.8?
I did some analysis on OpenCode, Little Coder, Aider, OpenClaude and ClawCode, and each one had potential issues with my specific setup (OpenCode and OpenClaude have a looping bug at the moment; Little Coder is brand new and has potential, but I’m leery of new projects; Aider is promising but had some unknowns). Anyway, as I was evaluating these I realized that I already use Claude Code — all I had to do was point it at my model instead of theirs. A few quick changes and now I have Qwen3.6 in Claude Code. I asked it to create a web-based Tetris game… 7m13s later I had a fully functional, minimally buggy Tetris game in my browser!
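For anyone wanting to try the same thing, the “few quick changes” were basically environment variables. A rough sketch, assuming an Anthropic-compatible endpoint (e.g. a translation proxy in front of llama.cpp’s server) is listening locally — the port, token, and model name are all placeholders for your setup:

```shell
# Point Claude Code at a local endpoint instead of api.anthropic.com.
export ANTHROPIC_BASE_URL="http://localhost:8080"
export ANTHROPIC_AUTH_TOKEN="local-dummy-key"   # the local server ignores this, but the CLI expects one
export ANTHROPIC_MODEL="qwen3.6"                # whatever name your server exposes

claude   # launch Claude Code as usual; requests now go to the local model
```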
That’s awesome! Full circle :)
I have to admit, once you have your work environment set to how you like it, it can be difficult to transition away. I find OpenCode actively hostile to how I work, to say nothing of how shit the OOBE / on ramping is. I was willing to eat shit to get it set with Little Coder (because as you say, if that works as advertised, it will be a hell of a thing) but time will tell.
I might need Little Coder eventually / as I try to step back from cloud-based, because the best local coding models I can run at good speeds are in the 8-14B range. Anything that can automatically enforce structured output discipline to get more out of smaller models would be a huge win.
Did you happen to see the tests on the new Qwen3.6 27B dense?
https://www.youtube.com/watch?v=N-0WtgxJ7ZU
Last night I decided to try Little-Coder on a game that’s been in the ‘planning’ phase for a while. I had a few issues getting Little-Coder to work with my system, but once it was working I pointed it at the project fact sheet in my Obsidian vault and told it to get familiar with the project and ask any questions. It had a few questions and then wanted to know if I wanted it to get started on a prototype of the gameplay. The game is a narrative-driven politics game, and I asked it to build the prototype as a browser game. It ran for about 40 minutes and then had a working prototype. There were a couple of bugs with some buttons not working, but those were easy to fix.
This is really impressive. I’m making this game for my daughter and I let her play the prototype (which is very repetitive at this point since the ‘narrative’ hasn’t really been written yet) and she loved it. She played it for about 15 minutes before realizing that it had started repeating. I’ve been working on the game all day today and she came in a bit ago and asked when she can play again.
And thanks for that video on Qwen3.6 27B dense… I might have to throw my system at it and see how it handles things.
That’s awesome and I’m jealous of you on both fronts. I spent yesterday installing Luanti (free clone of Minecraft) for the kids to play and they loved it.
I’d love to figure out a way to get the assets and modpacks from that into the Wii version of Minecraft, as that would make it a one-stop shop for family gaming. It’s not an impossible ask… but it’s beyond what little coding skill I have.
I run my homelab / AI on a Lenovo P330 Tiny (a 1L shoebox, circa 2018), so I’m limited in what GPUs I can install. Basically the best that can be shoehorned in there is an Nvidia RTX A2000 12GB, which costs around $700-$1,000 AUD secondhand… at that point the whole thing becomes uneconomical and I may as well buy an old Optiplex and stick in a Tesla P40 or 3090. But that carries its own issues. TL;DR: it’s about $1,000 to get 27B running at anything like OK speed here. Hard to justify given the rest of the stack works just fine.
On my half-length / half-height Tesla P4 8GB (old server card), I can maybe get 5 tok/s on 27B dense. Not really good enough for coding.
https://github.com/itayinbarr/little-coder