AI RESEARCH
How Much Is a Dataset Worth? Scaling Laws, the Vendi Score, and Matrix Spectral Functions
arXiv CS.AI
arXiv:2605.29448v1 Announce Type: cross Neural scaling laws appraise data through dataset size, while the Vendi Score uses quantum entropy to measure dataset value. We show that both common neural-scaling-law objectives and the Vendi Score are submodular. We further show that the Vendi Score is a special case of a broader class of submodular objectives that we call matrix spectral functions. This class also includes determinantal point process (DPP) objectives, as well as many others. We also
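
To make the "quantum entropy" idea concrete, here is a minimal sketch of how a Vendi Score can be computed. It assumes the standard definition: the exponential of the von Neumann entropy of the normalized similarity matrix K/n, where K is positive semidefinite with ones on the diagonal. The function names `vendi_score` and `cosine_kernel` are illustrative and are not taken from the paper.

```python
import numpy as np


def cosine_kernel(X):
    """Cosine-similarity kernel (hypothetical helper): PSD with unit diagonal."""
    X = np.asarray(X, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    return X @ X.T


def vendi_score(K):
    """Exponential of the von Neumann entropy of K / n.

    Assumes K is a symmetric PSD similarity matrix with K[i, i] == 1.
    """
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    # Eigenvalues of K / n are nonnegative and sum to 1 under the assumptions.
    eigvals = np.linalg.eigvalsh(K / n)
    # Drop numerically zero eigenvalues, using the convention 0 * log(0) = 0.
    eigvals = eigvals[eigvals > 1e-12]
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))


if __name__ == "__main__":
    identical = np.ones((4, 4))  # four copies of the same item
    distinct = np.eye(4)         # four mutually dissimilar items
    print(vendi_score(identical))
    print(vendi_score(distinct))
```

Under this definition the score behaves like an effective count of distinct items: four identical items score approximately 1, and four mutually orthogonal items score approximately 4.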