
Agentic AI Ingredients

Updated: Jan 22

To understand both the opportunity of agentic AI and the challenges, it helps to break it down into component parts:


Agentic AI = LLM + Data + Tools + Environment


First, let's get motivated about how powerful this combination is.


Below is a video of a team of agents given all of these components. They are able to code a solution to a complex problem, handling difficulties and errors as they go.



  • LLM

    • GPT-4-Turbo

  • Data

    • A spreadsheet of 10,000 AI apps and descriptions of their usages

  • Tools

    • Python, a programming language

  • Environment

    • Jupyter Notebook, which allows notes, code and code outputs to live alongside each other

The team are asked to inspect the data, clean it, then 'cluster' the apps into a manageable number of groups and give each cluster a representative name. Clustering text takes some involved logic; it's a fairly complex task.
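To make the task concrete, here is a minimal sketch of the clustering step. The app names and the similarity threshold are hypothetical, and the agents in the video would likely use embeddings rather than this simple greedy word-overlap grouping — but it illustrates why the logic is 'involved'.

```python
def tokens(text):
    """Lowercased word set for a description."""
    return set(text.lower().split())

def jaccard(a, b):
    """Word-overlap similarity between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster(descriptions, threshold=0.2):
    """Greedily assign each description to the first cluster whose
    representative is similar enough, else start a new cluster."""
    clusters = []  # list of (representative_tokens, members)
    for desc in descriptions:
        t = tokens(desc)
        for rep, members in clusters:
            if jaccard(rep, t) >= threshold:
                members.append(desc)
                break
        else:
            clusters.append((t, [desc]))
    return [members for _, members in clusters]

# Hypothetical sample of the 10,000 app descriptions
apps = [
    "AI image generator for marketing",
    "Image generator for product marketing",
    "Chatbot for customer support",
    "Support chatbot with FAQ search",
]
groups = cluster(apps)
```

Even this toy version involves choices (similarity measure, threshold, ordering) that the agents must reason about and test for themselves.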


Compare how naturally the team collaborate on a solution versus how you might interact with a single AI in ChatGPT. The team are natural and fluid, reacting to problems as they arise, without need for human prompting or cajoling.


Let's look at the components in more detail...


LLM (Large Language Model)


This means ChatGPT, pi.ai, Claude, Google Bard, Perplexity etc. LLMs alone have been revolutionary, giving us the freedom to converse with a machine in natural language rather than code.


They are also informative for product developers, marketers and salespeople: users tend to say far more to a fairly intelligent chatbot than they would to a human.


LLM + Data


The ability for staff and customers to chat with documents and data is by far the most popular application of AI.


OpenAI felt it was important enough to take a detour from foundation-model building into 'GPT Assistants'. Note that there are many other suppliers of such functionality, not least Microsoft's Copilot, which searches the data in your SharePoint or OneDrive, allowing staff (not clients) to 'chat' with all the material they have access to.
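The pattern behind 'chatting with your data' can be sketched simply: retrieve the most relevant piece of text, then hand it to an LLM as context. The scoring below is naive keyword overlap (production systems use vector embeddings), and the documents are hypothetical examples; `build_prompt` stands in for the final call to whichever LLM you use.

```python
def score(question, chunk):
    """Count how many words from the question appear in the chunk."""
    q = set(question.lower().split())
    return sum(1 for w in chunk.lower().split() if w in q)

def retrieve(question, chunks):
    """Return the chunk most relevant to the question."""
    return max(chunks, key=lambda c: score(question, c))

def build_prompt(question, context):
    """Assemble the prompt that would be sent to the LLM."""
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical internal documents
docs = [
    "Holiday policy: staff receive 25 days of annual leave.",
    "Expenses: claims must be submitted within 30 days.",
]
question = "How many days of annual leave do staff get?"
context = retrieve(question, docs)
prompt = build_prompt(question, context)
```

The design point is separation of concerns: retrieval narrows the data down, and the LLM only ever reasons over what it was given access to.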


LLM + Data + Tools


An agent is born when it can take action on the data it has. That action may be to interact with a database, write and execute code, conduct research via the web, etc.


An agent may have multiple, or even thousands of, tools made available to it. These are properly called 'functions', and they require some consideration: we need to minimise or eliminate the risk of incorrect actions or of inaction. Reward follows from managed risk; it is only via these 'functions' that real productivity gains can be won.
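One way to manage that risk is to register every tool with an explicit risk flag, so the agent can never run a dangerous action without a human sign-off. This is a minimal sketch; the tools (`lookup_price`, `delete_record`) and the approval rule are hypothetical examples, not any particular framework's API.

```python
REGISTRY = {}

def tool(name, requires_approval=False):
    """Decorator registering a function as an agent tool."""
    def wrap(fn):
        REGISTRY[name] = {"fn": fn, "requires_approval": requires_approval}
        return fn
    return wrap

@tool("lookup_price")
def lookup_price(item):
    """Read-only tool: safe to run without approval."""
    prices = {"widget": 9.99}
    return prices.get(item)

@tool("delete_record", requires_approval=True)
def delete_record(record_id):
    """Destructive tool: flagged as requiring human approval."""
    return f"deleted {record_id}"

def call_tool(name, approved=False, **kwargs):
    """Run a registered tool, refusing risky actions without approval."""
    entry = REGISTRY[name]
    if entry["requires_approval"] and not approved:
        return "BLOCKED: human approval required"
    return entry["fn"](**kwargs)
```

The agent only ever sees `call_tool`; the allow-list and approval gate live outside its control, which is where the 'managed risk' comes from.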


LLM + Data + Tools + Reasoning


Satya Nadella (CEO of Microsoft) advises us to understand LLMs as 'reasoning engines'. Existing software can then call upon that reasoning as required to meet the user's objective.


The importance of quality reasoning is clear when seeking productivity gains from LLMs equipped with tools. When machines act on our behalf, we want them to be as intelligent and trustworthy as anyone else we would delegate tasks to.


When we establish teams of agents to co-operate on an objective, it quickly becomes apparent that only GPT-4 can reason its way through complex problems. Reasoning remains somewhat expensive, although much cheaper than people. This will soon change, and even if higher levels of reasoning prove hard to deliver, simply having faster models will be revolutionary.


Go Faster


Faster models with existing levels of reasoning will allow us to 'brute force' novel solutions to problems. Google's Gemini recently demonstrated how reading thousands of research papers concurrently (aka 'brute force') allowed an entirely novel chemistry solution to be discovered over a lunchtime.


This approach sounds wasteful, but much of our modern world operates on delivering mundane calculations so fast that the experience is magical. Displaying a YouTube video or using GPS to determine our location are exactly such processes. Automation of science, engineering and business processes may be no different, given a fast reasoning engine.


LLM + Data + Tools + Reasoning + Environment


The final ingredient, and perhaps the most overlooked, is the environment. An agent's actions take place in an environment with rules and outcomes. Code is executed in a sandbox, database actions are implemented in a database, planning is conducted in an ERP system.


A test version of the environment allows the agent, or team of agents, to test proposals, correct errors and explore new scenarios whilst taking no risk in the real world.
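A minimal sketch of such a test environment, assuming the agent proposes Python code: run it in a separate interpreter process with a timeout, so a mistake costs nothing in the real world. A production sandbox would also restrict filesystem and network access.

```python
import subprocess
import sys

def run_in_sandbox(code, timeout=5):
    """Execute agent-proposed code in a child interpreter.
    Returns (ok, output) — failures are captured, never raised."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if result.returncode != 0:
        return False, result.stderr.strip()
    return True, result.stdout.strip()

# A correct proposal succeeds; a faulty one fails safely.
ok, out = run_in_sandbox("print(2 + 2)")
bad, err = run_in_sandbox("1 / 0")
```

Because errors come back as data rather than real-world consequences, the agent can loop: propose, test, read the error, and try again.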


In this way hallucinations are ironed out. Repeated simulations can explore the landscape of possible solutions, finding the optimal plan and discounting faulty reasoning. Robust and proven proposals can then be presented to a human for review.


There may be a successful future for games adapted as simulation environments. Consider an AI agent concurrently exploring multiple strategies in Farming Simulator, Sim Companies, or even GTA.






