Journal

ai-safety

Articles tagged with “ai-safety”.

Agentico Teams Up with Apart Labs for AI Research

December 19, 2024

Agentico is committing one day of staff time a week to Apart Lab's AI safety and interpretability research. With models like OpenAI o1 now matching physicians on reasoning, advising on AI has become less like automation engineering and more like recruiting.

We Under-Imagined the Zombie

April 21, 2026

Anthropic measured something like emotions inside Claude: not mimicry, but internal representations that direct its behaviour. Intervene on them and the behaviour changes. The debate fixates on one question: is AI conscious? This finding suggests both sides are asking the wrong question. What does it mean for business?

Commercial Rewards from AI Safety

June 09, 2024

Human-in-the-loop oversight doesn't scale — people suffer 'vigilance decrement', and nobody could review the thousands of lines of code GPT-5 writes daily. Meanwhile AI safety researchers build techniques where systems check each other. So how do these methods become a commercial edge?

Understand, Edit & Steer AI - via API

November 28, 2024

At a Goodfire and Apart hackathon, our team built a tool to catch hallucinations in medical diagnostic AI, then steer the model around them neuron by neuron via a simple API. What does it mean when a business can edit a model's internals this easily?
