RUBAS: Rubric-Based Reinforcement Learning for Agent Safety

arXiv:2606.04051v1 Announce Type: cross The evolution of LLMs into tool-enabled agents creates a new class of safety challenges associated with real-world execution rather than simple text generation. Existing alignment methods often rely on coarse refusal signals or static supervision, making it difficult to balance safety with useful tool execution across diverse agentic risks. We