TSM-Bench: Detecting LLM-Generated Text in Real-World Wikipedia Editing Practices

arXiv:2605.31113v1 Announce Type: new Automatically detecting machine-generated text (MGT) is critical to maintaining the knowledge integrity of user-generated content (UGC) platforms such as Wikipedia. Existing detection benchmarks primarily focus on generic text generation tasks (e.g., "Write an article about machine learning."). However, editors frequently employ LLMs for specific writing tasks (e.g., summarisation). These task-specific MGT instances tend to closely resemble human-written text because of their constrained task formulation and contextual conditioning.