AI RESEARCH

OpenSTBench: Beyond Semantic Evaluation for Speech Translation

arXiv CS.AI

arXiv:2605.30792v1 Announce Type: cross Speech translation systems increasingly span speech-to-text translation (S2TT), speech-to-speech translation (S2ST), offline translation, and streaming generation, producing outputs that differ in modality, speech realization, and timing behavior. Existing evaluation practices assess important aspects such as translation quality, speech quality, and temporal quality, but these aspects are often evaluated under separate protocols, making it difficult to compare heterogeneous systems comprehensively.