Arena.ai is running possibly the most fraudulent benchmark thus far

Previously they placed GPT 5.5 below Meta's Muse Spark in terms of coding ability. Their latest benchmark has Grok Imagine surpassing Seedance in video generation. If anyone is currently using both, it's fair to say this is objectively dishonest.

submitted by /u/Cagnazzo82