AI RESEARCH
A Reproducible Universal Dependencies-Style Pipeline for Katharevousa Greek Parliamentary Text
arXiv cs.CL
arXiv:2605.22978v1. Katharevousa Greek remains poorly served by contemporary NLP pipelines despite its importance for legal, administrative, and parliamentary archives. We present a reproducible workflow for building and evaluating a Universal Dependencies-style parsing resource for Katharevousa parliamentary questions from Greece's early post-junta period. The pipeline links OCR-aware reconstruction, schema-constrained LLM-assisted annotation, automatic validation, deterministic CoNLL-U snapshotting, fixed-split evaluation, and model-family comparison.
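To make three of the stages named above more concrete (automatic validation, deterministic CoNLL-U snapshotting, and fixed splits), here is a minimal Python sketch. It is not the authors' implementation: the function names `validate_sentence`, `snapshot`, and `assign_split`, the specific checks, the 80/10/10 split proportions, and the toy sentence with its annotation are all assumptions made for illustration. The only thing taken as given is the standard ten-column, tab-separated CoNLL-U token layout.

```python
import hashlib

# Toy three-token sentence in the ten CoNLL-U columns:
# ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC.
# The annotation is illustrative only, not taken from the paper's data.
GOOD = [
    ["1", "Ερωτάται", "ερωτώ", "VERB", "_", "_", "0", "root", "_", "_"],
    ["2", "ο", "ο", "DET", "_", "_", "3", "det", "_", "_"],
    ["3", "Υπουργός", "υπουργός", "NOUN", "_", "_", "1", "nsubj:pass", "_", "_"],
]


def validate_sentence(rows):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    n = len(rows)
    roots = 0
    for i, row in enumerate(rows, start=1):
        if len(row) != 10:
            problems.append(f"token {i}: expected 10 columns, got {len(row)}")
            continue
        if row[0] != str(i):
            problems.append(f"token {i}: ID is {row[0]!r}, expected {i}")
        head = row[6]
        if not head.isdigit() or int(head) > n:
            problems.append(f"token {i}: HEAD {head!r} is out of range")
        elif int(head) == i:
            problems.append(f"token {i}: token is its own head")
        elif int(head) == 0:
            roots += 1
    if roots != 1:
        problems.append(f"expected exactly one root, found {roots}")
    return problems


def snapshot(sentences):
    """Serialise {sent_id: rows} in sorted sent_id order and hash the result.

    Sorting makes the output independent of insertion order, so the same
    annotations always produce the same bytes and the same SHA-256 digest.
    """
    blocks = []
    for sent_id in sorted(sentences):
        lines = [f"# sent_id = {sent_id}"]
        lines += ["\t".join(row) for row in sentences[sent_id]]
        blocks.append("\n".join(lines))
    text = "\n\n".join(blocks) + "\n\n"
    return text, hashlib.sha256(text.encode("utf-8")).hexdigest()


def assign_split(sent_id, dev_pct=10, test_pct=10):
    """Map a sentence ID to train/dev/test from a hash of the ID alone."""
    bucket = int(hashlib.sha256(sent_id.encode("utf-8")).hexdigest(), 16) % 100
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + dev_pct:
        return "dev"
    return "train"
```

Because the split depends only on the sentence ID, adding or re-annotating sentences never moves an existing sentence between train, dev, and test, and the snapshot digest gives a compact way to record which version of the data an evaluation used.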