Google & Andrew Ng on Agentic AI

In one week Google launched its Vertex AI agent framework and Andrew Ng devoted The Batch to multi-agent collaboration. So why does splitting a task across 'software engineer', 'QA' and 'product manager' agents outperform a single agent — and how should leaders manage them?

In the past week, Google launched its own AI agent framework under the Vertex AI brand. More about that in another blog post; in the meantime, you can read a professional analysis of Google's AI strategy here.

Meanwhile, Andrew Ng has also highlighted agentic AI. The following is a repost from today's DeepLearning.AI newsletter 'The Batch', in which Andrew writes:

------

AI Agents, There's Nothing Like Results!

Multi-agent collaboration is the last of the four key AI agentic design patterns that I’ve described in recent letters. Given a complex task like writing software, a multi-agent approach would break down the task into subtasks to be executed by different roles — such as a software engineer, product manager, designer, QA (quality assurance) engineer, and so on — and have different agents accomplish different subtasks.

Different agents might be built by prompting one LLM (or, if you prefer, multiple LLMs) to carry out different tasks. For example, to build a software engineer agent, we might prompt the LLM: “You are an expert in writing clear, efficient code. Write code to perform the task . . ..”
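As a rough illustration of building several agents from one LLM, here is a minimal Python sketch. The `stub_llm` function, the `Agent` class, and the QA prompt wording are all hypothetical stand-ins introduced for this example; a real system would replace `stub_llm` with a call to an actual model.

```python
from dataclasses import dataclass
from typing import Callable


def stub_llm(system_prompt: str, user_message: str) -> str:
    """Hypothetical stand-in for a real LLM call: it only echoes its inputs."""
    return f"[{system_prompt}] {user_message}"


@dataclass
class Agent:
    name: str
    system_prompt: str               # the role-defining prompt
    llm: Callable[[str, str], str]   # the same callable can back every agent

    def run(self, task: str) -> str:
        return self.llm(self.system_prompt, task)


# Two "agents" that differ only in the prompt wrapped around one shared LLM.
engineer = Agent(
    "software_engineer",
    "You are an expert in writing clear, efficient code.",
    stub_llm,
)
qa_engineer = Agent(
    "qa_engineer",
    "You are a QA engineer. Find bugs and missing test cases.",  # assumed wording
    stub_llm,
)

draft = engineer.run("Write code to perform the task: parse a CSV file.")
review = qa_engineer.run(f"Review this draft: {draft}")
print(review)
```

The point of the sketch is that "multiple agents" here means multiple prompts around one model, not multiple models.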

It might seem counterintuitive that, although we are making multiple calls to the same LLM, we apply the programming abstraction of using multiple agents. I’d like to offer a few reasons:

  • It works! Many teams are getting good results with this method, and there’s nothing like results! Further, ablation studies (for example, in the AutoGen paper cited below) show that multiple agents give superior performance to a single agent.

  • Even though some LLMs today can accept very long input contexts (for instance, Gemini 1.5 Pro accepts 1 million tokens), their ability to truly understand long, complex inputs is mixed. An agentic workflow in which the LLM is prompted to focus on one thing at a time can give better performance. By telling it when it should play software engineer, we can also specify what is important in that role’s subtask. For example, the prompt above emphasized clear, efficient code as opposed to, say, scalable and highly secure code. By decomposing the overall task into subtasks, we can optimize the subtasks better.

  • Perhaps most important, the multi-agent design pattern gives us, as developers, a framework for breaking down complex tasks into subtasks. When writing code to run on a single CPU, we often break our program up into different processes or threads. This is a useful abstraction that lets us decompose a task, like implementing a web browser, into subtasks that are easier to code. I find thinking through multi-agent roles to be a useful abstraction as well.
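The decomposition described in the points above could look something like the following Python sketch, in which each role is prompted to focus on one subtask and sees only the previous role's output. The stage names, the focus wording, and `stub_llm` are assumptions made for illustration, not part of any framework.

```python
from typing import Callable, List, Tuple

LLM = Callable[[str], str]


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: reports how much text it was given."""
    return f"<output for {len(prompt)}-char prompt>"


# Each stage is (role name, what that role should focus on). Wording is assumed.
STAGES: List[Tuple[str, str]] = [
    ("product_manager", "Turn the request into a short list of requirements."),
    ("software_engineer", "Write clear, efficient code that meets the requirements."),
    ("qa_engineer", "List test cases and likely bugs for the code."),
]


def run_pipeline(request: str, llm: LLM = stub_llm) -> List[Tuple[str, str]]:
    """Run the stages in order; each one sees only the previous stage's output."""
    transcript = []
    current_input = request
    for role, focus in STAGES:
        prompt = f"Role: {role}\nFocus: {focus}\nInput: {current_input}"
        current_input = llm(prompt)
        transcript.append((role, current_input))
    return transcript


for role, output in run_pipeline("Build a command-line to-do list app."):
    print(f"{role}: {output}")
```

A strictly linear pipeline is only one possible arrangement; it is used here because it keeps each prompt short and focused on a single subtask.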

Manage AI Agents Like People

In many companies, managers routinely decide what roles to hire, and then how to split complex projects — like writing a large piece of software or preparing a research report — into smaller tasks to assign to employees with different specialties.

Using multiple agents is analogous. Each agent implements its own workflow, has its own memory (itself a rapidly evolving area in agentic technology: how can an agent remember enough of its past interactions to perform better on upcoming ones?), and may ask other agents for help.

Agents can also engage in Planning and Tool Use. This results in a cacophony of LLM calls and message passing between agents that can result in very complex workflows.
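To make the ideas of per-agent memory and agents asking each other for help more concrete, here is a small Python sketch. Everything in it is hypothetical: the `Agent` class, the `handle` and `ask` methods, the five-entry memory window, and the canned `stub_llm` are illustrative choices, not how any particular framework works.

```python
from typing import Callable, List


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; replies to the prompt's last line."""
    return f"reply to: {prompt.splitlines()[-1]}"


class Agent:
    def __init__(self, name: str, role: str, llm: Callable[[str], str] = stub_llm):
        self.name = name
        self.role = role
        self.llm = llm
        self.memory: List[str] = []  # private to this agent

    def handle(self, message: str, sender: str = "user") -> str:
        """Answer a message, using (and then extending) this agent's own memory."""
        context = "\n".join(self.memory[-5:])  # naive memory: last five entries
        reply = self.llm(f"You are a {self.role}.\n{context}\n{sender}: {message}")
        self.memory.append(f"{sender}: {message}")
        self.memory.append(f"{self.name}: {reply}")
        return reply

    def ask(self, other: "Agent", question: str) -> str:
        """Ask another agent for help and remember the answer."""
        answer = other.handle(question, sender=self.name)
        self.memory.append(f"{other.name} said: {answer}")
        return answer


engineer = Agent("engineer", "software engineer")
designer = Agent("designer", "UI designer")
answer = engineer.ask(designer, "Which layout should the settings page use?")
print(answer)
```

Even in this toy version, one request produces a message between agents, an LLM call, and updates to two separate memories, which hints at how quickly real workflows can become complex.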

While managing people is hard, it's a sufficiently familiar idea that it gives us a mental framework for how to "hire" and assign tasks to our AI agents. Fortunately, the damage from mismanaging an AI agent is much lower than that from mismanaging humans!

The Race in Agentic Development Frameworks

Emerging frameworks like AutoGen, Crew AI, and LangGraph provide rich ways to build multi-agent solutions to problems. If you're interested in playing with a fun multi-agent system, also check out ChatDev, an open source implementation of a set of agents that run a virtual software company. I encourage you to check out their GitHub repo and perhaps clone the repo and run the system yourself. While it may not always produce what you want, you might be amazed at how well it does.

Like the design pattern of Planning, I find the output quality of multi-agent collaboration hard to predict, especially when allowing agents to interact freely and providing them with multiple tools. The more mature patterns of Reflection and Tool Use are more reliable. I hope you enjoy playing with these agentic design patterns and that they produce amazing results for you!

If you're interested in learning more, I recommend:

------
