AI RESEARCH

AI for Monitoring and Classifying Data Used in Research Literature

arXiv CS.CL

arXiv:2605.30582v1 Announce Type: new

While platforms like Google Scholar and Semantic Scholar track citations for academic papers, no comparable infrastructure exists for monitoring dataset usage in research literature, leaving the landscape of data use largely opaque. Addressing this gap is critical for transparency, reproducibility, and monitoring of impact, yet progress is hindered by inconsistent citation practices, scarce labeled data, and ambiguous references to datasets in the wild.