AI RESEARCH

Optimus: A Robust Defense Framework for Mitigating Toxicity while Fine-Tuning Conversational AI

arXiv CS.CL

arXiv:2507.05660v3 Announce Type: replace-cross Customizing Large Language Models (LLMs) on untrusted datasets poses severe risks of injecting toxic behaviors. In this work, we