Dialectics of Alignment: Harnessing Unsafe Knowledge for Dynamic Safety Routing
arXiv cs.LG
arXiv:2606.00686v1 Announce Type: new
The prevailing paradigm in large language model (LLM) alignment operates via erasure, filtering unsafe data or