MARGIN: Runtime Confidence Calibration for Multi-Agent Foundation Model Coordination

arXiv:2605.22949v1 Announce Type: new Foundation model agents increasingly operate in multi-agent deployments where a coordinator must decide which agent's response to trust. The standard approach weights agents by their self-reported confidence, but recent evidence shows that foundation model confidence is systematically miscalibrated and, on hard tasks, inversely correlated with accuracy.
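
To make the baseline concrete, here is a minimal Python sketch of confidence-weighted coordination, assuming each agent returns an answer together with a self-reported confidence in [0, 1]. The function name `confidence_weighted_vote` and the sample values are hypothetical illustrations, not taken from the paper, and this is not the paper's proposed method.

```python
from collections import defaultdict


def confidence_weighted_vote(responses):
    """Pick the answer with the largest summed self-reported confidence.

    `responses` is a list of (answer, confidence) pairs, one per agent.
    This is an illustrative baseline, not the method the paper proposes.
    """
    totals = defaultdict(float)
    for answer, confidence in responses:
        totals[answer] += confidence
    # The coordinator trusts whichever answer accumulated the most confidence.
    return max(totals, key=totals.get)


# Hypothetical example: one highly confident agent outweighs two
# agreeing agents that report lower confidence.
responses = [("A", 0.95), ("B", 0.60), ("B", 0.30)]
chosen = confidence_weighted_vote(responses)
```

In this example the coordinator selects "A" even though two of the three agents agree on "B". If the confident agent is the one that is wrong, which is the failure case the abstract describes for hard tasks, the weighting actively steers the coordinator toward the incorrect answer.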