How Qwen3.6-35B-A3B fails differently as a sub-agent compared to solo use
r/LocalLLaMA
I've been running Qwen3.6-35B-A3B as a sub-agent on a single 4090 for a few weeks. The failure modes are different from solo use, and I haven't seen this written up anywhere. In solo use, you notice drift fast: the model produces something confused, you see it, and you can fix it. When it's a sub-agent receiving tasks from an orchestrator, the orchestrator treats a confused or partial response the same as a legitimate one unless you've explicitly built a validation layer. Most of us haven't. The response is confidently formatted, so it passes through, and the bad output goes downstream.
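For anyone wondering what a validation layer could look like, here's a minimal sketch in Python. It assumes the orchestrator asks the sub-agent to reply with a JSON object containing a `status` and a `result` field. That contract, the `Verdict` class, and `validate_subagent_reply` are all hypothetical names made up for illustration, not part of any real agent framework or of my actual setup. The idea is just that a reply has to pass explicit checks before it goes downstream, instead of being trusted because it looks well-formed.

```python
import json
from dataclasses import dataclass


@dataclass
class Verdict:
    ok: bool
    reason: str = ""


# Hypothetical contract (an assumption for this sketch): the sub-agent
# must return a JSON object with these keys and one of these statuses.
REQUIRED_KEYS = {"status", "result"}
ALLOWED_STATUS = {"done", "partial", "failed"}


def validate_subagent_reply(raw: str) -> Verdict:
    """Reject replies that look plausible but do not meet the contract."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return Verdict(False, f"not valid JSON: {exc.msg}")

    if not isinstance(payload, dict):
        return Verdict(False, "top-level value is not an object")

    missing = REQUIRED_KEYS - payload.keys()
    if missing:
        return Verdict(False, f"missing keys: {sorted(missing)}")

    status = payload["status"]
    if not isinstance(status, str) or status not in ALLOWED_STATUS:
        return Verdict(False, f"unknown status: {status!r}")

    # A partial or failed reply is well-formed, but it must not be
    # treated as a finished result by the orchestrator.
    if status != "done":
        return Verdict(False, f"sub-agent reported status {status!r}")

    result = payload["result"]
    if not isinstance(result, str) or not result.strip():
        return Verdict(False, "empty or non-string result")

    return Verdict(True)
```

The orchestrator would call this on every sub-agent reply and retry or escalate when `ok` is false, rather than forwarding the text. It only catches structural problems (free text instead of JSON, missing fields, partial status, empty result); a reply that is well-formed but wrong on content would still need a task-specific check.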