Cline and LM Studio: the local coding stack with Qwen3 Coder 30B

76 points by Terretta on 8/31/2025, 2:50:17 PM | 19 comments | cline.bot

Comments (19)

dzikibaz · 1d ago
Open-weights models are catching up and are now viable for many tasks.

Keep in mind that closed, proprietary models:

1) Use your data internally for training, analytics, and more - because "the data is the moat"

2) Are out of your control - one day something might work, another day it might fail because of a model update, a new "internal" system prompt, or a new guardrail that simply blocks your task

3) Are built on the "biggest intellectual property theft" of this century, so they should be open and free ;-)

42lux · 1d ago
I have to admit I was sceptical about the 30B benchmarks, but after testing it over the weekend it's pretty good. It needs more help with architecture-related questions, but for well-defined coding tasks (for me, primarily Python) it's on par with the commercial models.
thecolorblue · 1d ago
I just ran a test giving the same prompt to Claude, Gemini, Grok, and Qwen3 Coder running locally. Qwen did great by last year's standards and was very useful in building out boilerplate code. That being said, if you looked at the code side by side with the cloud-hosted models, I don't think anyone would pick Qwen.

If you have 32 GB of memory you are not using, it is worth running for small tasks. Otherwise, I would stick with a cloud-hosted model.

blackoil · 1d ago
That should remain true for the foreseeable future. A 30B model can't beat a 300B one, and running a 300B model locally is prohibitively expensive. By the time that becomes feasible, the cloud will have moved to a 10x larger model.
dcreater · 1d ago
Please share the results
wendythehacker · 1d ago
Cline seems to have some security vulnerabilities that aren't being addressed, e.g. https://embracethered.com/blog/posts/2025/cline-vulnerable-t...

Raises the question of long-term support, etc...

hiatus · 1d ago
This person keeps banging the drum of agents running on untrusted inputs doing unexpected things. The proof of concept doesn't prove anything and doesn't even have working code. It's not clear why this is classed as a markdown rendering bug when it appears Cline is calling out to a remote server with the contents of an env file as parameters in a URL.

edit: are you the author? You seem to post a lot from that blog and the blog author's other accounts.

jasonjmcghee · 1d ago
I took the time to build an agent from scratch in Rust, copying a lot of ideas from Claude Code, and Qwen3 Coder 30B (the A3B variant, 3.3B active parameters) does really well with it. Replicating the str_replace / text editor tools, the bash tool, and the todo list, plus a bit of prompt engineering, goes really far.

I didn't do anything fancy, and it did much better than my experience with Codex CLI, with similar quality to Claude Code if I used Sonnet or Opus.

Honestly the CLI stuff was the hardest part, but I chose not to use something like crossterm.
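
A minimal sketch of the str_replace idea (an illustration of the approach, not the commenter's actual code): the tool applies an edit only when the target string occurs exactly once in the file, which forces the model to supply enough surrounding context to make each edit unambiguous.

    use std::fs;

    /// Replace exactly one occurrence of `old` with `new` in the file at `path`.
    /// Zero or multiple matches are errors, so every edit is unambiguous.
    fn str_replace(path: &str, old: &str, new: &str) -> Result<(), String> {
        let content = fs::read_to_string(path).map_err(|e| e.to_string())?;
        match content.matches(old).count() {
            0 => Err(format!("no match for `{old}` in {path}")),
            1 => fs::write(path, content.replacen(old, new, 1)).map_err(|e| e.to_string()),
            n => Err(format!("{n} matches for `{old}`; include more surrounding context")),
        }
    }

Exposed as a tool the model can call, that uniqueness check is a large part of what keeps small local models from clobbering the wrong spot.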

ptrj_ · 1d ago
How have you found the current experience of (async) networking in Rust? This is something that's stupidly easy out of the box in Python -- semi-seriously, async/await in Python was _made_ for interacting w/ a chat completions/messages API.

(As an aside, my "ideal" language mix would be a pairing of Rust with Python, though the PyO3 interface could be improved.)

Would also love to learn more about your Rust agent + Qwen3!

jasonjmcghee · 19h ago
I would pick Rust over Python for async every time, if that's the only consideration.

In Python there are hidden sharp edges, and depending on which dependencies you use, you can hit deadlocks in production without ever knowing you were in danger.

Rust's Send and Sync traits protect against this at compile time. Async in Rust is great.
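
A sketch of what that protection looks like in practice (assuming tokio): a future that holds a std::sync::MutexGuard across an .await point is not Send, so tokio::spawn rejects it at compile time instead of letting it deadlock at runtime.

    use std::sync::Mutex;

    async fn holds_lock_across_await(m: &Mutex<i32>) {
        let guard = m.lock().unwrap(); // std::sync::MutexGuard is !Send
        tokio::task::yield_now().await; // the guard is held across this .await
        println!("{}", *guard);
    }

    // tokio::spawn(holds_lock_across_await(&m)) fails to compile with
    // "future cannot be sent between threads safely", so the bug
    // surfaces before production rather than in it.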

I'd do something like:

    let (tx, rx) = std::sync::mpsc::channel();
    std::thread::spawn(move || {
        // blocking request on a dedicated thread
        let response = reqwest::blocking::get(url).unwrap();
        tx.send(response.text().unwrap());
    });

Or

    let (tx, mut rx) = tokio::sync::mpsc::channel(100);
    let client = reqwest::Client::new();
    tokio::spawn(async move {
        let response = client.get(url).send().await;
        tx.send(response).await;
    });
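
(The first variant parks the blocking reqwest call on its own OS thread so it never stalls the async runtime; the second stays fully async on the tokio runtime and uses the async reqwest::Client instead.)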

ptrj_ · 18h ago
> In Python there are hidden sharp edges, and depending on which dependencies you use, you can hit deadlocks in production without ever knowing you were in danger.

I've heard of deadlocks when using aiohttp or maybe httpx (e.g. due to hidden async-related globals), but I've never managed to get any system based on asyncio + concurrent.futures + urllib (i.e. stdlib-only) to deadlock, including with some mix of asyncio and threading locks.

johnisgood · 1d ago
Hardware requirements?

"What you need" only includes software requirements.

DrAwdeOccarim · 1d ago
The author says 36 GB unified RAM in the article. I run an M3 Pro with the same memory and use LM Studio daily with various models up to the 30B-parameter one listed, and it flies. I can't differentiate my OpenAI chats from the local ones, aside from more current context, though I have a Puppeteer MCP which works well for web search and site reading.
jszymborski · 1d ago
The 30B runs at a reasonable speed on my desktop, which has an RTX 2080 (8 GB VRAM) and 32 GB of RAM.
Havoc · 1d ago
A 30B-class model should run on a consumer 24 GB card when quantised, though it would need a pretty aggressive quant to make room for context. I don't think you'll get the full 256K context, though.

So, about 700 bucks for a 3090 on eBay.
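
Rough back-of-envelope (an estimate, not the commenter's numbers): at ~4.5 bits per weight, a 30B model's weights take about 30e9 × 4.5 / 8 ≈ 17 GB, leaving roughly 7 GB of a 24 GB card for KV cache, activations, and overhead; enough for a sizeable context, but nowhere near the full 256K.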

magicalhippo · 1d ago
I have a 5070 Ti and a 2080 Ti, but I'm running Windows, so roughly 25-26 GB is available. With Flash Attention enabled, I can just about squeeze in Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL from Unsloth with 64K context entirely on the GPUs.

With a 3090 I guess you'd have to reduce context or go for a slightly more aggressive quantization level.

Summarizing llama-arch.cpp, which is roughly 40K tokens, I get ~50 tok/sec generation speed and ~14 seconds to first token.

For short prompts I get more like ~90 tok/sec and <1 sec to first token.
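
For anyone wanting to reproduce a setup like this, a llama.cpp server invocation along these lines should be close (a sketch; exact flag spellings vary between llama.cpp versions):

    llama-server -m Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf \
        -c 65536 -ngl 99 -fa
    # -c 65536 sets the 64K context, -ngl 99 offloads all layers
    # to the GPUs, -fa enables Flash Attention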

thecolorblue · 1d ago
I am running it on an M1 Max.
nurettin · 1d ago
I tried Qwen Code yesterday. I don't recommend it for code editing unless you've committed your code first. It destroyed a lot of files in just 10 minutes.
dcreater · 1d ago
Why do I feel like there's a plug for LM Studio baked in here?