Journal

ai-safety

Articles tagged with “ai-safety”.

Agentico Teams Up with Apart Labs for AI Research

Agentico is committing one day of staff time a week to Apart Lab's AI safety and interpretability research. With models like OpenAI o1 now matching physicians on reasoning, advising on AI has become less like automation engineering and more like recruiting.

We Under-Imagined the Zombie

Anthropic measured something like emotions inside Claude: not mimicry, but representations that direct it. Intervene on them and the behaviour changes. The debate fixates on one question: is AI conscious? This finding suggests both sides are asking the wrong question. What does it mean for business?

Commercial Rewards from AI Safety

Human-in-the-loop oversight doesn't scale — people suffer 'vigilance decrement', and nobody could review the thousands of lines of code GPT-5 writes daily. Meanwhile AI safety researchers build techniques where systems check each other. So how do these methods become a commercial edge?

Understand, Edit & Steer AI - via API

At a Goodfire and Apart hackathon, our team built a tool to catch hallucinations in medical diagnostic AI, then steer the model around them neuron by neuron via a simple API. What does it mean when a business can edit a model's internals this easily?