
I've seen reports of qwen3.5-35b-a3b spending a ton of time reasoning if the context window is nearly empty. Supposedly it reasons less if you provide a long system prompt or some file contents, as happens when you use it in a coding agent.

I'm too GPU-poor to run it, but r/LocalLLaMA is full of people using it.



Can confirm. I gave it a variant of the car wash question on an M4 MacBook with 32 GB of RAM. It produced output at a conversational speed, sure, but only after 6 minutes of thinking output. 6 minutes.

On the plus side, it did figure out the question even without the first sentence that's intended as a bit of a giveaway.


There's definitely something wrong with the thinking mode on this one. I wouldn't be surprised if it gets fixed, either by the Qwen team themselves or with a fine-tune.



