AI SAFETY & ETHICS
Testing Gemini models for scheming tendencies
LessWrong AI
As AI models become increasingly capable and autonomous, keeping them safely aligned with human intentions is critical. Extending our previous work on evaluating scheming capabilities, we