It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs

arXiv:2605.20258v1 Announce Type: new Contextual Integrity (CI) defines privacy not merely as keeping information hidden, but as governing information flows according to the norms of a given context. As large language models are increasingly deployed as personal agents handling sensitive workflows, adhering to CI becomes critical. However, even frontier models remain unreliable in making disclosure decisions, and existing mitigation strategies often degrade underlying task performance.