AI RESEARCH
Safety Game: Inference-Time Alignment of Black-Box LLMs via Constrained Optimization
arXiv cs.LG
arXiv:2510.09330v3 Announce Type: replace Ensuring that large language models (LLMs) comply with safety requirements is a central challenge in AI deployment. Existing alignment approaches primarily operate during