Apex-Testing: real-world, real repos, agentic coding benchmark (Update)
r/LocalLLaMA
BIG Apex-Testing update! The Real-World Agentic Coding benchmark is now about 95% updated with all recent models! It is based on 65-70 actual private GitHub repos made specifically to test the proper agentic coding capabilities of models. For those who don't know the project and are seeing it for the first time, here's the excerpt from the website: "What is APEX Testing? Every week there's a new model that's "the best ever." Every provider promises 10x performance at a fraction of the cost. Benchmarks get cherry-picked, their s get curated, influencers get paid, and people keep falling for it.