AI RESEARCH

Context-aware child-directed speech detection from long-form recordings

arXiv CS.LG

arXiv:2606.01134v1 Announce Type: cross Automatically distinguishing child-directed speech from adult-directed speech in long-form recordings is key to scalable analyses of children's language environments. Existing approaches process utterances in isolation and have been evaluated primarily on English. We address these gaps along three dimensions. First, we fine-tune and evaluate six self-supervised models on a multilingual dataset of 182 children, showing that in-domain pre-