Automated Essay Scoring and Language Certification: Assessing Generalizability, Agreement and Validity for French

arXiv:2606.02009v1 Announce Type: new

In Automated Essay Scoring (AES), benchmarking has fostered minimalist evaluation practices, in contrast with the broader recommendations of evaluation frameworks such as the argument-based validation (ABV) framework, which argues for a multidimensional assessment of systems, especially in the context of high-stakes language tests. In this paper, we