AI RESEARCH
LogDx-CI: Benchmarking Log Reduction Tools for LLM Root-Cause Diagnosis
arXiv CS.AI
arXiv:2605.28876v1 Announce Type: cross CI failure logs are large (median 5k lines, max 200k in this corpus) and noisy. Coding agents that try to debug them depend on an upstream tool to reduce the log to a manageable context, but the field has had no public empirical comparison of which reductions preserve enough evidence for downstream LLM diagnosis. We