AI RESEARCH

Position: Retire the "Positive Backdoor" Label -- Secret Alignment Requires Strict and Systematic Evaluation

arXiv cs.AI

arXiv:2605.28597v1 Announce Type: cross

This position paper argues that the AI/ML community should stop overclaiming and retire the label "positive backdoor," and instead treat trigger-activated hidden behaviors as Secret Alignment. Crucially, protective claims based on Secret Alignment should be presumed not secure by default unless supported by rigorous, standardized evaluation. The Private AI era, enabled by open-weight LLMs and accessible