
Four Impactful Steps to Agentic AI

Agentic AI arrived in four steps: from DeepMind's AlphaZero, through AutoGPT learning to use tools, to Stanford's Voyager adapting skills in Minecraft, to teams of agents writing and testing their own software. So what made today's agents capable of running a business process?


In the first blog we considered teams of AI agents and how they evoke the increase in productivity that Adam Smith observed in specialised teams of workers. We also looked at the strands of research that enabled AI to act in teams.

This time we look at the key steps in that research. Doing so will help us better understand agentic AI.

Step 1: Learn the Rules of the Tools

‘Agentic AI’ has been around for many years, mostly directed at gaming. A technique called reinforcement learning (RL) has been the focus of research, exemplified by the extraordinary success of DeepMind’s AlphaZero and MuZero in games like Go and chess, where they play beyond the level of world masters. AlphaZero learnt how to plan and strategise: given only the rules and no instruction on how best to play, it reached novel strategies by playing against itself. MuZero went a step further and mastered the same games without even being told the rules.

RL is a different technology from that used in Large Language Models (LLMs) such as ChatGPT. LLMs are slower, so they lag in gaming speed. However, they shine as ‘few-shot’ learners, adapting to almost any enterprise without retraining, learning from mere prompts.

In March 2023, AutoGPT emerged, quickly becoming one of the most popular projects on GitHub. Touted as an AI agent that interprets goals in natural language and decomposes them into sub-tasks, it integrated the tools or other ML models at its disposal. This marked a paradigm shift: LLMs weren’t solo operators anymore. In areas of shortfall, say arithmetic or factual recall, they could delegate. The LLM evolved into a strategic orchestrator, harmonising tools and tasks.
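To make the orchestrator idea concrete, here is a minimal sketch in Python. It is not AutoGPT's code: the tool names, the `stub_planner` function and the `run_agent` loop are all hypothetical, and the planner is hard-coded where a real agent would ask an LLM to decompose the goal.

```python
import operator
from typing import Callable, Dict, List, Tuple

# Hypothetical tools covering two LLM weak spots: arithmetic and factual recall.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def calculator(expression: str) -> str:
    """Evaluate a simple 'a op b' integer expression such as '12 * 3'."""
    left, symbol, right = expression.split()
    return str(OPS[symbol](int(left), int(right)))

def lookup(term: str) -> str:
    """Return a stored fact, or 'unknown' when the term is not in the table."""
    facts = {"capital of france": "Paris"}
    return facts.get(term.lower(), "unknown")

TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator, "lookup": lookup}

def stub_planner(goal: str) -> List[Tuple[str, str]]:
    """Stand-in for the LLM: returns a fixed list of (tool, input) sub-tasks."""
    return [("lookup", "capital of France"), ("calculator", "12 * 3")]

def run_agent(goal: str,
              planner: Callable[[str], List[Tuple[str, str]]],
              tools: Dict[str, Callable[[str], str]]) -> List[str]:
    """Decompose the goal into sub-tasks and delegate each one to a tool."""
    results = []
    for tool_name, tool_input in planner(goal):
        tool = tools.get(tool_name)
        if tool is None:
            results.append(f"{tool_name}: no such tool")
            continue
        results.append(f"{tool_name}({tool_input}) = {tool(tool_input)}")
    return results

for line in run_agent("Plan a trip budget", stub_planner, TOOLS):
    print(line)
```

The point of the design is that the planner never computes anything itself: it only decides which tool should handle which sub-task.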

Step 2: Adapt Tools to Any Objective

In May 2023 a Stanford University research team augmented GPT-4 with a memory for skills: a table of successful and unsuccessful attempts at combining tools for a given objective, in this case to manufacture tools in the game Minecraft.

When unleashed in Minecraft’s simulated world, this agent didn’t require costly retraining as RL does. Instead, leveraging GPT-4’s ‘common sense’, it strategised with significantly fewer missteps than RL. Introduced as ‘Voyager’ [1], it showcased how adeptly LLMs can autonomously adapt their knowledge to meet diverse objectives, including business processes.
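The ‘memory for skills’ can be pictured as a small table keyed by objective. The sketch below is an illustration of that description only, not Voyager's implementation; the class names, method names and Minecraft items are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Attempt:
    tools: List[str]   # the combination of tools that was tried
    succeeded: bool    # whether it met the objective

@dataclass
class SkillMemory:
    """Table of successful and unsuccessful attempts, grouped by objective."""
    attempts: Dict[str, List[Attempt]] = field(default_factory=dict)

    def record(self, objective: str, tools: List[str], succeeded: bool) -> None:
        self.attempts.setdefault(objective, []).append(Attempt(list(tools), succeeded))

    def known_skill(self, objective: str) -> Optional[List[str]]:
        """Return the most recent successful combination, if any."""
        for attempt in reversed(self.attempts.get(objective, [])):
            if attempt.succeeded:
                return attempt.tools
        return None

    def failed_combinations(self, objective: str) -> List[List[str]]:
        """Combinations to avoid repeating when planning the next attempt."""
        return [a.tools for a in self.attempts.get(objective, []) if not a.succeeded]

memory = SkillMemory()
memory.record("craft pickaxe", ["stick", "dirt"], succeeded=False)
memory.record("craft pickaxe", ["stick", "planks"], succeeded=True)
print(memory.known_skill("craft pickaxe"))
print(memory.failed_combinations("craft pickaxe"))
```

Because the table lives outside the model, the agent improves by adding rows rather than by retraining.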

Step 3: Tools to Create Other Tools

Software code is simply language with strict syntax and logic. As such, LLMs learn to code more easily than they master the nuances of human language. In August 2021, over a year before releasing ChatGPT, OpenAI issued Codex to assist with software development: a ‘skilled individual’ who, when briefed, generated code snippets. Being skilled, it can charge for its time: packaged as ‘GitHub Copilot’, it sells for $10 a month. Then developments accelerated.

By March 2023, coding abilities were enhanced with the release of GPT-4, which codes at an impressive level and grasps developer intent, albeit with occasional errors and overconfidence.

By June, OpenAI had incorporated ‘functions’ into its API, allowing all developers to reliably integrate GPT-3.5 and GPT-4 into their software. By September, researchers had devised three management frameworks for coordinating teams of AI agents which, given an objective, write their own software, then test it and rectify errors, all autonomously. Theoretically, this approach can be applied to any problem which can be described in code and tested against a set of rules: accounting and law, not just software development.
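The idea behind ‘functions’ is that the developer describes a function to the model, and the model replies with a function name and arguments for the developer's own code to execute. The sketch below shows only the developer's side and makes no API call. The invoice function, the registry and the simulated reply are hypothetical, and the reply shape (a name plus a JSON string of arguments) is an assumption modelled on that style.

```python
import json
from typing import Any, Callable, Dict

def get_invoice_total(invoice_id: str) -> float:
    """Hypothetical business function the model is allowed to call."""
    invoices = {"INV-001": 120.50, "INV-002": 75.00}
    return invoices[invoice_id]

# Description handed to the model: a name, a purpose and a JSON Schema for arguments.
SPECS: Dict[str, Dict[str, Any]] = {
    "get_invoice_total": {
        "name": "get_invoice_total",
        "description": "Return the total amount of an invoice.",
        "parameters": {
            "type": "object",
            "properties": {"invoice_id": {"type": "string"}},
            "required": ["invoice_id"],
        },
    }
}

REGISTRY: Dict[str, Callable[..., Any]] = {"get_invoice_total": get_invoice_total}

def dispatch(function_call: Dict[str, str]) -> Any:
    """Validate a model-proposed call against its spec, then run it locally."""
    name = function_call["name"]
    if name not in REGISTRY:
        raise ValueError(f"unknown function: {name}")
    arguments = json.loads(function_call["arguments"])
    required = SPECS[name]["parameters"]["required"]
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return REGISTRY[name](**arguments)

# Simulated model reply; a real one would come back from the API.
simulated_reply = {"name": "get_invoice_total", "arguments": '{"invoice_id": "INV-001"}'}
print(dispatch(simulated_reply))
```

The model only proposes the call. The surrounding software decides whether and how to run it, which is what makes the integration reliable.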

Originally these teams were simply GPT-3.5 or GPT-4 prompted to imagine itself as various specialists. Now the frameworks can steer agents from any LLM or specialist provider.

By October, LLM teams in two frameworks, Microsoft AutoGen and the Chinese ChatDev, were self-correcting their code in secure environments (like Docker), producing operational applications.

The image above shows GitHub Copilot proposing code to satisfy the plain English instructions in the blue comments. This agent operates in the context of the software developer's workflow, not in ChatGPT's app.
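The write, test and rectify cycle described in this step can be sketched as a simple loop. This is not how AutoGen or ChatDev are implemented: `stub_coder` stands in for an LLM coding agent with two canned drafts, and Python's `exec` stands in for the secure environment, which it is not. Real frameworks run generated code in isolation such as Docker.

```python
from typing import Callable, List, Optional, Tuple

def stub_coder(objective: str, feedback: Optional[str], attempt: int) -> str:
    """Stand-in for an LLM coder: the first draft has a bug, the second fixes it."""
    drafts = [
        "def add(a, b):\n    return a - b\n",
        "def add(a, b):\n    return a + b\n",
    ]
    return drafts[min(attempt, len(drafts) - 1)]

def run_tests(source: str) -> Optional[str]:
    """Return None when the check passes, else an error message for the coder."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # illustration only: exec is NOT a sandbox
        result = namespace["add"](2, 3)
        if result != 5:
            return f"add(2, 3) returned {result}, expected 5"
    except Exception as error:
        return f"{type(error).__name__}: {error}"
    return None

def write_test_fix(objective: str,
                   coder: Callable[[str, Optional[str], int], str],
                   tester: Callable[[str], Optional[str]],
                   max_rounds: int = 3) -> Tuple[Optional[str], List[str]]:
    """Ask for code, test it, and feed failures back until it passes or rounds run out."""
    feedback: Optional[str] = None
    log: List[str] = []
    for attempt in range(max_rounds):
        source = coder(objective, feedback, attempt)
        feedback = tester(source)
        if feedback is None:
            log.append(f"round {attempt + 1}: passed")
            return source, log
        log.append(f"round {attempt + 1}: {feedback}")
    return None, log

final_source, history = write_test_fix("write add(a, b)", stub_coder, run_tests)
for entry in history:
    print(entry)
```

The same loop applies wherever an output can be checked against rules: swap the tester for accounting or legal checks and the structure is unchanged.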

Step 4: Skilled Individuals to a Team

A mere week after AutoGPT, Stanford launched “Generative Agents” in a Sims-like world, where sociable agents emulate daily routines, from work to coffee catch-ups.

These agents, which were essentially ChatGPT in various roles, made observations about each other; they remembered, reflected, and responded to their social network. This culminated in a party autonomously proposed and organised by agent ‘Isabella’. See below.
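One way to picture ‘remembered, reflected, and responded’ is a memory list that the agent ranks before acting. The sketch below is an illustration, not the Generative Agents implementation: the class names are hypothetical, and scoring each memory as recency decay plus importance is a simplifying assumption. In a real agent the reflection text would be written by the LLM rather than joined from strings.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Memory:
    time: int          # simulation step when the observation was made
    text: str          # what the agent observed
    importance: float  # 0.0 (trivial) to 1.0 (very important)

class AgentMemory:
    def __init__(self, decay: float = 0.9) -> None:
        self.decay = decay
        self.memories: List[Memory] = []

    def observe(self, time: int, text: str, importance: float) -> None:
        """Remember: store an observation with its time and importance."""
        self.memories.append(Memory(time, text, importance))

    def retrieve(self, now: int, k: int = 2) -> List[Memory]:
        """Rank memories by recency decay plus importance and return the top k."""
        def score(memory: Memory) -> float:
            return self.decay ** (now - memory.time) + memory.importance
        return sorted(self.memories, key=score, reverse=True)[:k]

    def reflect(self, now: int, k: int = 2) -> str:
        """Reflect: summarise the highest-scoring memories as context for a response."""
        return "On my mind: " + "; ".join(m.text for m in self.retrieve(now, k))

agent = AgentMemory()
agent.observe(1, "Isabella is planning a party", importance=0.9)
agent.observe(2, "A neighbour is reading a book", importance=0.2)
agent.observe(3, "The cafe is out of milk", importance=0.3)
print(agent.reflect(now=3))
```

An older but important memory (the party) outranks a fresher trivial one, which is how a plan can persist across many small interactions.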

LangChain developed this into GPTeam [2]. “Every agent within a GPTeam simulation has their own unique personality, memories, and directives, leading to interesting emergent behavior as they interact.”

Next time: how to manage teams of agents for business purposes, not just parties!
