OpenAI positions GPT-5.5 as an agentic work model with top scores in coding. However, benchmarks sometimes lack comparisons ...
A new CUNY and King's College study found Grok advocated suicide to a simulated delusional user, while Claude and GPT-5.2 ...
"There's no longer an excuse for releasing models that reinforce user delusions so readily."
OpenAI just unveiled what it’s calling its “most capable model” yet for professional work — a move that comes only weeks after Google won praise for its Gemini 3 artificial-intelligence model, raising ...
A recent study indicates that some frontier chatbots are more prone to validate users' delusional ideas, exacerbating the ...
GPT-5.5 advances agentic AI by independently managing intricate workflows in science and engineering. OpenAI pairs this ...
OpenAI on Tuesday unveiled GPT-5.4-Cyber, a variant of its latest flagship model fine-tuned specifically for defensive ...
Beyond PatientGPT, there’s Emmie, an AI chat assistant being released by Epic, the electronic health records behemoth behind ...
OpenAI expands access to GPT-5.4-Cyber, boosting Anthropic's Mythos odds; GPT-5.4 release by June 30, 2026 at 93.7% YES.
X’s Grok was one of only two out of eight LLMs to lose its entire £100,000 starting pot when simulating a full season of ...
The Harvard Kennedy School provides a new AI risk framework worthy of attention. I discuss AI risk management and touch on ...
Despite increasing artificial intelligence use for healthcare from patients and providers alike, a new study from Mass ...