AI RESEARCH

Knowledge Index of Noah's Ark

arXiv cs.AI

arXiv:2606.05104v1 Announce Type: new Knowledge benchmarks for LLMs face three issues: scaling-driven designs that do not operationalize disciplinary representativeness; flat-payment annotation that permits lazy consensus; and unaudited ranking instability under bounded test budgets. We