ARCA: Adapter-Residual Credit Assignment When Token Signals Degenerate
arXiv CS.AI
arXiv:2606.00257v1 Announce Type: cross Token-level credit assignment for language-model reinforcement learning is usually formulated as if the policy were fully trainable, while practical LLM-RL pipelines often rely on parameter-efficient fine-tuning, especially LoRA. We argue that this separation hides a structural failure mode.