Ask HN: Qwen3 – is it ready for driving AI agents?

1 morisil 0 5/4/2025, 8:43:39 AM

It seems that Qwen3 is not capable of driving independent reasoning - it lacks the quality needed to power fully autonomous AI agents.

Initially I was quite impressed with its problem-solving capabilities when it output code through the chat interface. It addressed certain problems much better than Claude or Gemini. However, as soon as I switched to Alibaba Cloud's API to provide a DashScope-based implementation of the cognizer interface of my new generation of AI agents (chain of code), the whole charm was gone.

Qwen3 struggles with structured generation attempts, quite often falling into an infinite loop when spitting out tokens.
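For illustration, here is a minimal sketch of a guard that could sit between a streaming API and the agent to cut off runaway generation. It assumes the tokens arrive as an iterable of strings. The names `is_looping` and `guarded_stream` and all thresholds are hypothetical, not part of any Qwen or DashScope API.

```python
from typing import Iterable, Iterator, List


def is_looping(history: List[str], max_period: int = 16, min_repeats: int = 4) -> bool:
    """Return True if the tail of `history` is one short pattern repeated `min_repeats` times."""
    for period in range(1, max_period + 1):
        span = period * min_repeats
        if len(history) < span:
            break  # longer periods need even more tokens, so stop checking
        tail = history[-span:]
        # The tail is periodic if every token equals the token one period earlier.
        if all(tail[i] == tail[i % period] for i in range(span)):
            return True
    return False


def guarded_stream(tokens: Iterable[str], max_tokens: int = 2048) -> Iterator[str]:
    """Yield tokens until a hard cap is hit or a repetition loop is detected."""
    history: List[str] = []
    for token in tokens:
        history.append(token)
        if len(history) > max_tokens or is_looping(history):
            return  # stop consuming the stream; the caller can retry or report failure
        yield token
```

The thresholds are arbitrary: legitimately repetitive output (a long run of identical whitespace tokens, say) would trip this heuristic, so they would need tuning per tokenizer and per task.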

It has trouble crossing language boundaries, which is crucial for my agents, which are "thinking in code": writing a Kotlin script containing JavaScript, containing SQL, etc. Therefore it will not work well as an automated software engineer.

It is "stubborn": even when the syntax error in the generated code is clearly indicated, it tends to output the same erroneous code again and again instead of testing another hypothesis.
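One way a feedback loop could react to this is to notice when the model resubmits byte-identical code and to escalate the instruction. This is only a sketch under assumptions: `generate` stands in for whatever call returns code from the model, `check` for whatever compiles or runs it and returns an error message (or `None` on success), and `repair_loop` is a hypothetical name.

```python
import hashlib
from typing import Callable, Optional, Set


def repair_loop(
    generate: Callable[[str], str],
    check: Callable[[str], Optional[str]],
    prompt: str,
    max_attempts: int = 4,
) -> Optional[str]:
    """Retry generation, escalating the instruction when the model repeats itself."""
    seen: Set[str] = set()
    feedback = prompt
    for _ in range(max_attempts):
        code = generate(feedback)
        error = check(code)
        if error is None:
            return code  # the code passed the check
        digest = hashlib.sha256(code.encode("utf-8")).hexdigest()
        if digest in seen:
            # Same failing code as an earlier attempt: ask for a new hypothesis.
            feedback = (
                f"{prompt}\n\nYou already submitted this exact code and it failed with: "
                f"{error}\nPropose a different approach."
            )
        else:
            feedback = f"{prompt}\n\nThe previous attempt failed with: {error}\nFix it."
        seen.add(digest)
    return None  # give up after max_attempts
```

Hashing only catches exact repeats. A model that changes whitespace but keeps the same error would need a comparison on the error message or on normalized code instead.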

It lacks a theory of mind and an understanding of its context and environment. For example, when asked to check the recent news, it always responds by trying to use a BBC API URL with an unfilled API key as part of the request, and it passes this URL to the Files tool instead of the WebBrowser tool, which obviously fails.
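A pre-dispatch check can at least turn these two mistakes into explicit feedback for the model instead of a failed tool call. The sketch below assumes tool calls reduce to a tool name plus a single string argument. The tool names `Files` and `WebBrowser` come from the setup described above; `validate_tool_call`, the placeholder markers, and the example URL are hypothetical.

```python
from typing import List
from urllib.parse import parse_qsl, urlparse

# Hypothetical markers for an API key the model never filled in.
PLACEHOLDER_MARKERS = ("YOUR_API_KEY", "<API_KEY>", "{API_KEY}")


def validate_tool_call(tool: str, argument: str) -> List[str]:
    """Return a list of problems to feed back to the model; empty means the call looks sane."""
    problems: List[str] = []
    parsed = urlparse(argument)
    is_url = parsed.scheme in ("http", "https")
    if is_url and tool != "WebBrowser":
        problems.append(f"'{argument}' is a URL; use the WebBrowser tool instead of {tool}.")
    if is_url:
        for key, value in parse_qsl(parsed.query, keep_blank_values=True):
            if value == "" or value.upper() in PLACEHOLDER_MARKERS:
                problems.append(f"Query parameter '{key}' looks like an unfilled placeholder.")
    return problems


# Example: a URL with a placeholder key sent to the wrong tool yields two problems.
issues = validate_tool_call("Files", "https://example.com/news?apiKey=YOUR_API_KEY")
```

This does not give the model any understanding of its environment. It only makes the failure explicit enough to include in the next prompt.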

Last but not least: censorship. For example, Qwen3 will refuse to search for information on the most recent anti-government protests in China. I wouldn't be surprised if these censorship blockers were partially responsible for the poor quality of cognition in other areas.

Maybe I'm doing something wrong, and you are getting much better results with this model for fully autonomous agents with a feedback loop?

