AI RESEARCH

Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor

arXiv cs.LG

arXiv:2605.20402v1 Announce Type: new MXFP4 arithmetic can dramatically accelerate reinforcement learning (RL) post-