Over the past few years, use cases for LLMs (Large Language Models) have come on in leaps and bounds, and these rapid developments have brought a myriad of new terms, many of which are now commonplace in the world of data. “RAG”, “Prompt”, and “Fine-tune” get mentioned on a daily basis and have comfortably become part of business as usual. However, innovation is nothing if not relentless, and you may have increasingly heard a new term being mentioned: “Agents”.
Agents, or “Agentic Frameworks”, are an area of Generative AI which is gaining traction at a phenomenal rate, as evidenced by how much of the spotlight (i.e. nearly all of it) they received at Microsoft Ignite this year. However, there is still little consensus on what exactly the term means. There are currently many different definitions floating around for where “traditional” chains end and where agents begin, and this post will not attempt to settle these disputes or draw a consensus. We will, however, do our best to unify the thoughts behind these terms, highlight what makes them so exciting, and discuss how they can apply to you as an AI enthusiast!
Well, that depends who you ask!
As pointed out by Andrew Ng[1], who seems to be reasonably qualified on the subject, sticking to the noun “agent” is incredibly limiting and an easy target for pedants. As we will see, an “agent” can take many forms and behave in many ways, which is largely to be expected as this young yet rapidly developing area of AI finds its way forward. However, classifying systems as either “an agent” or “not an agent” leaves much room for uncertainty.
Therefore, it is perhaps more appropriate to consider agent status not as a binary classification, but rather as a sliding scale – how “agentic” a system may be.
Rather than having to choose whether or not something is an agent in a binary way, I thought, it would be more useful to think of systems as being agent-like to different degrees. Unlike the noun “agent,” the adjective “agentic” allows us to contemplate such systems and include all of them in this growing movement.
~ Andrew Ng, The Batch #253[1]
As further discussed by Harrison Chase[2], broadly speaking we can say that the more a system uses an LLM to make its own behavioural decisions, the more agentic it becomes.
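As a rough illustration of this sliding scale, the sketch below contrasts a router whose path is fixed by the developer with one that defers the choice to a model. It is a minimal sketch, not any particular library's API: `fake_llm` is a hypothetical stand-in for a real model call, and the handler names are invented for the example.

```python
from typing import Callable, Dict


def summarise(text: str) -> str:
    return f"summary of: {text}"


def translate(text: str) -> str:
    return f"translation of: {text}"


# Hypothetical steps the system can take.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "summarise": summarise,
    "translate": translate,
}


def hard_coded_route(request: str) -> str:
    """Less agentic: the developer decides the path with a fixed rule."""
    name = "translate" if request.lower().startswith("translate") else "summarise"
    return HANDLERS[name](request)


def llm_route(request: str, llm: Callable[[str], str]) -> str:
    """More agentic: the model picks the next step, the code only validates it."""
    prompt = f"Pick one of {sorted(HANDLERS)} for this request: {request}"
    choice = llm(prompt).strip()
    if choice not in HANDLERS:  # fall back if the model returns something unexpected
        choice = "summarise"
    return HANDLERS[choice](request)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    return "translate" if "French" in prompt else "summarise"


print(hard_coded_route("Translate this"))
print(llm_route("Put this into French", fake_llm))
```

The fixed rule misses "Put this into French" because it does not start with the word "translate", whereas the model-driven router can in principle handle phrasing the developer never anticipated. The trade-off is that its output has to be validated.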
*****
This means that if you’re working with AI, you’re probably already working with compound systems! In fact, in the same post, BAIR referenced findings from Databricks that some 60% of LLM applications incorporate RAG, and 30% leverage more than one LLM call.
Agentic systems are therefore a subclass of compound systems, or to put it another way, all agentic systems are compound, but not all compound systems are agentic![4]
*****
So, having said all that, we can take the view that, at their core, agentic frameworks are just compound systems with the added ability to use an LLM to make dynamic decisions rather than relying on hard-coded actions. These decisions can be made in a variety of ways, and the various methods are still evolving rapidly. However, at a high level, agentic workflows tend to rely on:
This list is by no means exhaustive, and a deeper dive into these methods will follow in a future blog post, so keep an eye on our feed for that!
A key takeaway from agentic frameworks is how they are inspiring a shift away from the "bigger = better" mindset which has dominated LLM development over the past few years. New models are, mostly, much larger with each new iteration, and rightly so as their requirements become more intricate. However, agents are starting to challenge that notion, and are instead promoting the idea that many "smaller" models working together effectively can outperform larger alternatives.
Performance
Just as compound systems can outperform basic LLM calls, so too can agentic systems outperform compound ones. For example, a RAG system can provide a more grounded and detailed answer to a question, but an agentic framework can iteratively refine the answer and remove mistakes. It has even been found that effective implementation of agentic workflows can outstrip the performance gain between LLM generations.
Take, for example, the HumanEval benchmark. Zero-shot performance increased from 48.1% with GPT-3.5 to 67.0% with GPT-4. However, GPT-3.5 leveraging various agentic workflows scored between 75% and 95%.
Source: Andrew Ng, Joaquin Dominguez, John Santerre, DeepLearning.AI[5]
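The iterative refinement mentioned above can be pictured as a generate, test, and retry loop. The sketch below is only an illustration of that idea, not the method used in the cited results: `fake_generate` is a hypothetical stand-in for a model that writes code, and the test cases are invented.

```python
from typing import Callable, List, Optional, Tuple

Candidate = Callable[[int], int]


def run_tests(candidate: Candidate) -> Optional[str]:
    """Return None if every check passes, else a description of the first failure."""
    for x, expected in [(0, 0), (2, 4), (3, 9)]:
        got = candidate(x)
        if got != expected:
            return f"f({x}) returned {got}, expected {expected}"
    return None


def refine(
    generate: Callable[[Optional[str]], Candidate], max_rounds: int = 3
) -> Tuple[Optional[Candidate], List[str]]:
    """Ask for a candidate, test it, and feed any failure back into the next attempt."""
    feedback: Optional[str] = None
    history: List[str] = []
    for _ in range(max_rounds):
        candidate = generate(feedback)
        feedback = run_tests(candidate)
        if feedback is None:
            return candidate, history
        history.append(feedback)
    return None, history  # gave up after max_rounds attempts


def fake_generate(feedback: Optional[str]) -> Candidate:
    """Hypothetical model: the first attempt doubles, the corrected attempt squares."""
    if feedback is None:
        return lambda x: x * 2
    return lambda x: x * x


solution, history = refine(fake_generate)
print(history)
```

The first attempt happens to pass two of the three checks, which is exactly the kind of near miss a single zero-shot call would return as its final answer.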
This affects not only performance but also cost-effectiveness. For example, GPT-3.5 can cost up to 40x less per token than GPT-4, so in this context effective use of agentic frameworks could massively reduce costs whilst maintaining competitive performance. Be warned, though: using agents can also considerably increase token usage, and different language models are better suited to different tasks, so balancing the decrease in token cost against the increase in token usage is critical!
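To make that balance concrete, here is a small back-of-the-envelope sketch. It assumes a single blended per-token price for each model and uses the "up to 40x" figure quoted above; the token multipliers are hypothetical and will vary widely between workflows.

```python
def relative_cost(price_ratio: float, token_multiplier: float) -> float:
    """Cost of an agentic small-model workflow relative to one large-model call.

    price_ratio: how many times cheaper per token the small model is.
    token_multiplier: how many times more tokens the agentic workflow consumes
        (a hypothetical figure; measure it for your own workload).
    """
    if price_ratio <= 0 or token_multiplier <= 0:
        raise ValueError("both arguments must be positive")
    return token_multiplier / price_ratio


# A workflow that uses 10x the tokens at 1/40th of the price costs a quarter as much.
print(relative_cost(price_ratio=40, token_multiplier=10))

# A workflow that uses 60x the tokens costs more than the single large-model call.
print(relative_cost(price_ratio=40, token_multiplier=60))
```

Under these assumptions the break-even point is simply a token multiplier equal to the price ratio, so a 40x price advantage is used up once the agentic workflow consumes 40 times as many tokens.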
Autonomy
Agents can fix their own mistakes, reason their way around issues, and overcome hurdles without the need for human intervention. Of course, they are not all-powerful and may well encounter issues they cannot solve within their confines, but equally they are not as helpless in the face of errors as deterministic code.
Flexibility
Agentic frameworks allow you to leverage a variety of tools for different purposes, and crucially this allows you to bring the creativity of Language Models together with the predictability of code. For example, say there is a deterministic function that you need to call. It is relatively straightforward to code but absolutely critical that it runs correctly. An agentic framework enables you to leverage this function while dynamically interpreting the results and selecting subsequent actions, eliminating the need to hard code every possible outcome and edge case.
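The sketch below illustrates that split of responsibilities under some loud assumptions: `check_stock` is an invented deterministic function, the two follow-up actions are hypothetical, and `fake_choose_action` stands in for the LLM that would normally read the tool result and pick the next step.

```python
from typing import Callable, Dict, List


def check_stock(item: str) -> int:
    """Deterministic, critical tool: plain code that is easy to unit test."""
    inventory = {"widget": 12, "gadget": 0}
    return inventory.get(item, 0)


# Hypothetical follow-up actions the system is allowed to take.
ACTIONS: Dict[str, Callable[[str], str]] = {
    "confirm_order": lambda item: f"order confirmed for {item}",
    "offer_backorder": lambda item: f"{item} is out of stock; backorder offered",
}


def handle_order(item: str, choose_action: Callable[[str, List[str]], str]) -> str:
    stock = check_stock(item)  # the code, not the model, computes the fact
    observation = f"check_stock({item!r}) returned {stock}"
    action = choose_action(observation, sorted(ACTIONS))  # the model interprets it
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return ACTIONS[action](item)


def fake_choose_action(observation: str, options: List[str]) -> str:
    """Hypothetical stand-in for an LLM choosing among the allowed actions."""
    return "offer_backorder" if observation.endswith("returned 0") else "confirm_order"


print(handle_order("widget", fake_choose_action))
print(handle_order("gadget", fake_choose_action))
```

The stock figure always comes from tested code, while the choice of what to do with it is delegated. Restricting the model to a fixed set of allowed actions keeps that delegation checkable.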
Agentic frameworks should be considered over traditional compound frameworks when the application requires autonomous decision-making and goal-oriented behaviour.
Agentic frameworks are ideal for tasks which involve complex interactions with dynamic environments, such as code generation, assessment of feedback, or use cases which require adaptive learning. Unlike traditional frameworks, they enable the LLM not only to process information but also to act upon it, making decisions based on predefined goals and real-time feedback.
This approach is beneficial when the system needs to exhibit a higher degree of independence and flexibility, ensuring more efficient and effective performance in unpredictable scenarios. However, agentic frameworks are not best suited to processes which are very linear or predictable.
Agents bring tremendous benefits but also some risks, including the ability to "run away" and fall into endless feedback loops, which can be costly in terms of time and money. There are ways to alleviate these risks, such as implementing timeouts or iteration limits after which an escape tool is called. However, the best way to avoid them is not to take the risk if you don't have to!
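A minimal sketch of such guard rails might look like the following. The `escape` function is a hypothetical escape tool, `step` stands in for one turn of an agent loop, and the default limits are arbitrary values chosen for illustration.

```python
import time
from typing import Callable, Optional


def escape(reason: str) -> str:
    """Hypothetical escape tool: stop cleanly and report why."""
    return f"stopped: {reason}"


def run_guarded(
    step: Callable[[int], Optional[str]],
    max_iterations: int = 5,
    timeout_seconds: float = 30.0,
) -> str:
    """Run an agent step repeatedly until it returns an answer or a limit is hit."""
    deadline = time.monotonic() + timeout_seconds
    for i in range(max_iterations):
        if time.monotonic() > deadline:
            return escape("timeout")
        result = step(i)  # one turn of the agent; None means "not finished yet"
        if result is not None:
            return result
    return escape("iteration limit reached")


# A step that never finishes is cut off by the iteration limit.
print(run_guarded(lambda i: None, max_iterations=3))

# A step that finishes on its third turn returns normally.
print(run_guarded(lambda i: "done" if i == 2 else None))
```

Note that this timeout is only checked between steps, so it cannot interrupt a single step that hangs. A real system would also need a timeout on the model or tool call itself.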
At present, reliability is inversely proportional to the agent's level of freedom[6], so if you expect your LLM's environment and workload to be predictable, it is often safer to stick to traditional chains with hardcoded logic.
Furthermore, as we've already discussed, agentic frameworks are undergoing rapid development and change at the moment. Best practices, techniques, and libraries change regularly, meaning that what you've built could quickly become outdated or even deprecated! Agentic libraries such as LangChain are moving towards production-ready versions for agentic systems, and some companies have already exhibited successful production-quality agentic solutions, but this space is far less stable than other areas of Gen AI and should be treated with caution when thinking in the long term.
We're working on some great agentic resources, so keep an eye on our blog and YouTube channel! But in the meantime if you want to get started in the world of agentic frameworks you can check out the following brilliant libraries:
We should also give a special mention to Databricks' Mosaic AI and Microsoft's newly announced AI Foundry, both of which provide massive boosts to the development and evaluation of agentic frameworks through low-code and UI-driven interactions.
Still need a bit of guidance? Get in touch to schedule a chat with one of our experts who will be happy to discuss how you can leverage the power of agentic AI in your business!