AI SAFETY & ETHICS

Testing Gemini models for scheming tendencies

LessWrong AI

As AI models become increasingly capable and autonomous, keeping them safely aligned with human intentions is critical. Extending our previous work on evaluating scheming capabilities, we