Agent Execution Tax: new procurement metric for browser agent benchmarks?
r/LocalLLaMA
•
Generative AI
Open Source AI
AI Research
One model paid a 22.9% Agent Execution Tax (wasted / productive inference). The same model that looked cheapest per token cost 2.3x per successful task. Ran 720 browser agent tasks across these four models on the WebVoyager benchmark. Open-weight models held their own against Gemini 2.5 Flash.