Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor
arXiv CS.LG
arXiv:2605.20402v1 Announce Type: new MXFP4 arithmetic can dramatically accelerate reinforcement learning (RL) post-