AI RESEARCH
Knowledge Index of Noah's Ark
arXiv CS.AI
arXiv:2606.05104v1 Announce Type: new
Knowledge benchmarks for LLMs face three issues: scaling-driven designs that do not operationalize disciplinary representativeness; flat-payment annotation that permits lazy consensus; and unaudited ranking instability under bounded test budgets. We