CASTLE2026 Team WDL Technical Report
arXiv cs.CV
arXiv:2606.00712v1 (announce type: new)
The CASTLE Challenge @ EgoVis 2026 evaluates long-form egocentric video question answering over more than 600 hours of multi-perspective recordings. Each question offers four answer choices, and answering it requires combining evidence from videos, transcripts, and auxiliary photos with context about people, days, rooms, and timing. We propose an evidence-aware multimodal reasoning pipeline based on Qwen.
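To make the task format concrete, the following is a minimal sketch of how a four-choice question and its supporting evidence might be packed into a single text prompt. It is not the team's pipeline: the names `Evidence`, `Question`, and `build_prompt`, the field layout, and the prompt wording are all hypothetical, and no model call is shown.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    """One piece of supporting evidence (hypothetical schema)."""
    source: str  # e.g. "video", "transcript", "photo"
    day: int     # recording day, used here for temporal ordering
    room: str    # where the evidence was recorded
    text: str    # caption, transcript excerpt, or description


@dataclass
class Question:
    """A four-choice question with its collected evidence (hypothetical schema)."""
    prompt: str
    choices: list[str]
    evidence: list[Evidence] = field(default_factory=list)


def build_prompt(q: Question) -> str:
    """Format evidence and answer choices into one prompt string."""
    if len(q.choices) != 4:
        raise ValueError("expected exactly four answer choices")
    lines = ["Evidence:"]
    # Sort by day so the model sees evidence in temporal order.
    for ev in sorted(q.evidence, key=lambda e: e.day):
        lines.append(f"- [{ev.source} | day {ev.day} | {ev.room}] {ev.text}")
    lines.append(f"Question: {q.prompt}")
    for label, choice in zip("ABCD", q.choices):
        lines.append(f"{label}. {choice}")
    lines.append("Answer with a single letter (A-D).")
    return "\n".join(lines)
```

The resulting string could be passed to any instruction-following model. Tagging each evidence line with its source, day, and room is an assumption about what "evidence-aware" might mean in practice, chosen so that the answer can be traced back to a specific recording.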