CoRe-Code: Collaborative Reinforcement Learning for Code Generation

arXiv:2605.24812v1 Announce Type: new

Large language models (LLMs) have achieved strong performance in code generation, but most methods rely on autoregressive decoding without global planning, which often yields locally coherent yet globally suboptimal solutions (e.g., code that fails test cases or has poor algorithmic complexity). While recent approaches such as Chain-of-Thought (CoT) and multi-agent systems