Mechanistic Interpretability: We Built the Most Powerful Minds in History. We Can't Read Them.
Towards AI
•
AI Safety
Mechanistic Interpretability: We Built the Most Powerful Minds in History. We Can’t Read Them. We are flying blind inside the most powerful systems ever built. Here is the map being drawn in real time. 14 min read · AI Research I want to be honest about what pulled me into this topic. I had been following AI developments closely for a couple of years - reading papers, tinkering with APIs, following the usual dis