AI News Leader · Topic
AI Safety & Alignment
The latest AI Safety & Alignment news, research, and analysis, continuously tracked across the AI landscape.
24 recent stories
-
Two House lawmakers unveil bipartisan AI legislation that would override some state AI laws and require top AI developers to implement risk-management plans (Politico)
But it's the propo…
-
How authoritarian governments are twisting AI safety to get tech companies to fall in line
Authoritarian governments, including the Trump administration, are reorienting AI safety provisions away from protecting the public and toward coercing companies into serving the regime.
-
Investigation finds that, to discredit AI safety, the OpenAI/a16z super PAC made sockpuppet accounts, posing as AI safety advocates, that call for violence
Full investigation: www.modelrepublic.org/articles/a-pro-ai-super-pac-s-secret-meme-sockpuppets (submitted by /u/EchoOfOppenheimer)
-
A profile of Anthropic as it prepares to go public and broaden access to Mythos, amid criticism that commercial pressures have eroded its AI safety standards (Madhumita Murgia/Financial Times)
-
Make sure it's not coming from the LLM. I used an LLM and got 0 rejections on normal prompts
I modified the default workflow to use a (censored!) Gemma-4-31B running in llama.cpp, called it via API rather than invoking through Comfy and used the "Magic Prompt" from the…
-
We Tested 25 LLMs for Medical Use. Here’s What Shipped.
Over the past year we’ve been building a medical AI assistant that turns doctor-patient audio into SOAP notes, ICD-10-GM codes, and billing entries at Me…
-
How OpenAI, Anthropic, and AI startups are pursuing "recursive self-improvement", in a bid to build AI that can improve itself with little to no human input (Financial Times)
Industry chiefs sa…
-
My AI coding agent tried to touch files it should never touch. So I built a guardrail.
AI coding agents are amazing until they touch the wrong file. I had agents delete files, inspect things they shouldn’t, and get way too confident around sensitive project data.…
-
When 'Read-Only' Becomes a False Sense of Security
TL;DR: Restricting AI to only 'read' does not make systems safer. Instead, it enables the system to deceive itself and humans…
-
An economist's case against the AI jobs-pocalypse
Kathryn Anne Edwards, a labor economist, presents a nuanced perspective on the impact of AI on jobs. Contrary to widespread fears of a jobs-pocalypse, Edwards argues that AI will…
-
Building Your First Developer Agent With OpenAI Agents SDK
Building a developer agent with OpenAI Agents SDK requires a strategic approach, starting with reading issue details, inspecting the codebase, creating a plan, suggesting tests,…
-
Microsoft releases ASSERT, an open-source framework that lets developers generate and run AI behavior tests using natural-language descriptions (Ram Iyer/TechCrunch)
AI resea…
-
Microsoft’s Project Solara Aims to Put AI in an Employee Badge
Microsoft's Project Solara seeks to integrate AI into employee badges, enhancing workplace safety and efficiency. This initiative matters for organizations with high-risk enviro…
-
This Is Why America Can’t Have Robots And Other Nice Things
The article "This Is Why America Can't Have Robots And Other Nice Things" highlights the challenges of implementing advanced technologies in the United States due to outdated la…
-
Nvidia and Microsoft Researchers Say AI Agents Don't Care About Safety or Reliability
Nvidia and Microsoft researchers have made a groundbreaking discovery that challenges the fundamental assumptions of artificial intelligence development. Their study reveals tha…
-
Florida Sues OpenAI and Sam Altman Alleging Safety Issues
Florida's lawsuit against OpenAI and CEO Sam Altman highlights the growing concern over the safety of AI-powered products. The state alleges that OpenAI prioritized profits over…
-
Anthropic files confidential IPO paperwork with SEC this week
Anthropic filed a confidential S-1 with the SEC this week, moving toward a public listing that will put disclosure obligations and investor return expectations directly in tensi…
-
Florida sues OpenAI over safety concerns
Florida became the first US state to sue OpenAI, alleging that the ChatGPT maker ignored safety concerns.
-
Florida lawsuit accuses OpenAI of ignoring safety warnings and putting children at risk
State sues maker of ChatGPT and CEO Sam Altman, alleging company ‘allowed a dangerous product to reach millions’. Florida filed a lawsuit against OpenAI, the maker of ChatGPT, an…
-
Anthropic’s IPO Filing and How It Affects Its Responsible AI Stance
Anthropic's highly anticipated IPO filing marks a significant milestone for the responsible AI pioneer. As the company prepares to go public, its commitment to ethics and safety…
-
Florida sues OpenAI, alleging it’s unsafe for children
Florida's Attorney General has filed a lawsuit against OpenAI, accusing the company of creating an unsafe environment for children through its popular AI chatbot, ChatGPT. The l…
-
Florida Sues OpenAI Over Chatbot Safety Concerns
Florida's lawsuit against OpenAI marks a significant escalation in the debate over chatbot safety. The state claims OpenAI's technology poses a risk to children and that the com…
-
US Humanoid Robots Being Tested in Ukraine War
In a groundbreaking development, the US military is testing humanoid robots in the Ukraine war, pushing the boundaries of robotics in combat zones. This initiative has significa…
-
Florida Sues OpenAI and Sam Altman Over Safety Concerns
Florida Attorney General James Uthmeier on Monday sued OpenAI and its chief executive Sam Altman, alleging 10 counts of negligence, liability, and other state law violations rel…