Microsoft has joined the race for large language model (LLM) application frameworks with its open source Python library, AutoGen.
As described by Microsoft, AutoGen is “a framework for simplifying the orchestration, optimization, and automation of LLM workflows.” The fundamental concept behind AutoGen is the creation of “agents,” which are programming modules powered by LLMs such as GPT-4. These agents interact with each other through natural language messages to accomplish various tasks.
Agents can be customized and augmented using prompt engineering techniques and external tools that enable them to retrieve information or execute code. With AutoGen, developers can create an ecosystem of agents that specialize in different tasks and cooperate with each other.
A simple way to think about the agent ecosystem is to treat each agent as an individual ChatGPT session with its own system instruction. For instance, one agent could be instructed to act as a programming assistant that generates Python code based on user requests. Another agent could be a code reviewer that takes Python code snippets and troubleshoots them. The response from the first agent can then be passed on as input to the second agent. Some of these agents might even have access to external tools, the equivalent of ChatGPT plugins such as Code Interpreter or Wolfram Alpha.
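To make that mental model concrete, here is a minimal sketch in plain Python. It is not AutoGen's API: the `Agent` class, the `fake_llm` stand-in, and the instruction strings are all hypothetical, and a real application would replace `fake_llm` with a call to an actual model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for an LLM call: (system_instruction, message) -> text.
LLM = Callable[[str, str], str]


def fake_llm(system_instruction: str, message: str) -> str:
    """Placeholder model that only labels its input; swap in a real LLM client."""
    return f"[{system_instruction}] {message}"


@dataclass
class Agent:
    """One 'session': a fixed system instruction plus its own message history."""
    name: str
    system_instruction: str
    llm: LLM = fake_llm
    history: List[str] = field(default_factory=list)

    def reply(self, message: str) -> str:
        # Record the incoming message, ask the model, record the response.
        self.history.append(message)
        response = self.llm(self.system_instruction, message)
        self.history.append(response)
        return response


# Two specialized agents; the first agent's output becomes the second's input.
coder = Agent("coder", "Write Python code")
reviewer = Agent("reviewer", "Review Python code")

draft = coder.reply("sum a list")
review = reviewer.reply(draft)
print(review)
```

Each agent keeps its own history, which mirrors the idea of separate sessions that only share what is explicitly passed between them.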
Image source: Microsoft blog
AutoGen provides the necessary tools for creating these agents and enabling them to interact automatically.
Multi-agent applications can be fully autonomous or moderated through “human proxy agents,” which allow users to step into the conversation between the AI agents, acting as another voice to provide oversight and control over their process. In a way, the human user is turned into a team leader overseeing a team of multiple AIs.
Human proxy agents are useful for applications where the agent framework must make sensitive decisions that require confirmation from the user, such as making purchases or sending emails.
They can also let users help agents correct course when they start going in the wrong direction. For example, the user can start with an initial idea for an application and gradually refine it, adding or modifying features as they write the code with the help of agents.
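The confirmation step for sensitive decisions can be pictured as a gate in front of certain actions. The sketch below is an illustration in plain Python rather than AutoGen code; the action names, `run_action`, and `console_approver` are assumptions made up for the example.

```python
from typing import Callable

# Hypothetical names for actions that should never run without a human's consent.
SENSITIVE_ACTIONS = {"send_email", "make_purchase"}


def console_approver(prompt: str) -> bool:
    """Ask the human at the terminal; anything other than 'y' counts as a refusal."""
    return input(f"{prompt} [y/N] ").strip().lower() == "y"


def run_action(
    action: str,
    payload: str,
    approve: Callable[[str], bool] = console_approver,
) -> str:
    """Execute an agent-requested action, pausing for approval if it is sensitive."""
    if action in SENSITIVE_ACTIONS and not approve(f"Agent wants to {action}: {payload}"):
        return f"{action} blocked by human"
    # A real system would perform the action here; this sketch only reports it.
    return f"{action} executed"


# Scripted approvers stand in for the human so the example runs unattended.
blocked = run_action("send_email", "Quarterly report", approve=lambda prompt: False)
allowed = run_action("make_purchase", "1 x keyboard", approve=lambda prompt: True)
routine = run_action("search_docs", "agent frameworks", approve=lambda prompt: False)
print(blocked, allowed, routine, sep="\n")
```

Passing the approver in as a function keeps the gate testable and lets the same logic sit behind a terminal prompt, a chat window, or any other interface.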
The modular architecture of AutoGen allows developers to create general-purpose reusable components that can be assembled together to rapidly build custom applications.
Multiple AutoGen agents can collaborate to accomplish complex tasks. For example, a human agent might request assistance in writing code for a specific task.
A coding assistant agent can generate and return the code, which the AI user agent can then verify using a code execution module. Together, the two AI agents can then troubleshoot the code and produce a final executable version, with the human user able to interrupt or provide feedback at any point.
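The generate, execute, and repair cycle described here can be sketched as a short loop. This is a simplified illustration under stated assumptions, not AutoGen's implementation: `scripted_coder` is a hypothetical stand-in for an LLM-backed coding assistant, and `execute` runs code with Python's built-in `exec` without any sandboxing, which a real system should not do with untrusted code.

```python
from typing import Callable, Optional, Tuple


def execute(code: str) -> Tuple[bool, str]:
    """Run code in a fresh namespace and report (succeeded, result or error text)."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # Unsandboxed: acceptable only for this illustration.
        return True, repr(namespace.get("result"))
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"


def scripted_coder(task: str, feedback: Optional[str]) -> str:
    """Hypothetical coding assistant: buggy first draft, corrected after feedback."""
    if feedback is None:
        return "result = total([1, 2, 3])"  # Bug: 'total' is not defined.
    return "result = sum([1, 2, 3])"


def solve(
    task: str,
    coder: Callable[[str, Optional[str]], str] = scripted_coder,
    max_rounds: int = 3,
) -> Tuple[str, str, int]:
    """Alternate between generating code and executing it until a run succeeds."""
    feedback: Optional[str] = None
    for round_number in range(1, max_rounds + 1):
        code = coder(task, feedback)
        succeeded, output = execute(code)
        if succeeded:
            return code, output, round_number
        feedback = output  # The error message becomes the next prompt's feedback.
    raise RuntimeError("no working code within max_rounds")


final_code, final_output, rounds = solve("sum the numbers 1, 2 and 3")
print(final_code, final_output, rounds)
```

A human could be given the chance to interrupt between rounds by adding an approval step before `feedback` is sent back to the coder.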
This collaborative approach can lead to significant efficiency gains. According to Microsoft, AutoGen can speed up coding by up to four times.
AutoGen also supports more complex scenarios and architectures, such as the hierarchical arrangement of LLM agents. For instance, a group chat manager agent could moderate conversations between multiple human users and LLM agents and pass on messages between them according to a set of rules.
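As a rough illustration of a manager that passes messages along according to a set of rules, the sketch below picks the next speaker with two simple rules: an explicit @mention wins, otherwise turns go round-robin. The rules, function names, and participants are assumptions for the example and do not reflect how AutoGen's group chat manager selects speakers.

```python
from typing import Callable, Dict, List, Tuple

# A participant is modeled as a function from the last message to a reply.
Responder = Callable[[str], str]


def next_speaker(message: str, participants: List[str], last: str) -> str:
    """Rule 1: an explicit @mention wins. Rule 2: otherwise go round-robin."""
    for name in participants:
        if f"@{name}" in message:
            return name
    index = participants.index(last)
    return participants[(index + 1) % len(participants)]


def run_group_chat(
    responders: Dict[str, Responder],
    opener: str,
    first_message: str,
    max_turns: int,
) -> List[Tuple[str, str]]:
    """Route messages between participants and return the (speaker, message) log."""
    names = list(responders)
    transcript = [(opener, first_message)]
    speaker, message = opener, first_message
    for _ in range(max_turns):
        speaker = next_speaker(message, names, speaker)
        message = responders[speaker](message)
        transcript.append((speaker, message))
    return transcript


# Scripted participants: one human and two stand-ins for LLM agents.
responders: Dict[str, Responder] = {
    "alice": lambda message: "noted",
    "planner": lambda message: "@coder please implement",
    "coder": lambda message: "done",
}

transcript = run_group_chat(responders, "alice", "@planner build a parser", max_turns=3)
for speaker, message in transcript:
    print(f"{speaker}: {message}")
```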
A competitive field
The field of LLM application frameworks is developing quickly, and AutoGen is competing with many other contenders. LangChain is a framework for creating various types of LLM applications, from chatbots to text summarizers and agents. LlamaIndex offers rich tools for connecting LLMs to external data sources such as documents and databases.
Libraries like AutoGPT, MetaGPT, and BabyAGI are specifically focused on LLM agents and multi-agent applications. ChatDev uses LLM agents to emulate an entire software development team. And Hugging Face’s Transformers Agents library enables developers to create conversational applications that connect LLMs to external tools.
LLM agents are a hot area of research and development, with prototypes already created for tasks ranging from product development to executive functions, shopping, and market research. Studies have also shown how LLM agents can be used to simulate mass population behavior or create realistic non-player characters in games. However, much of this work remains proof of concept and is not yet production-ready because of challenges such as hallucinations and unpredictable agent behavior.
Despite these challenges, the future of LLM applications appears bright, with agents set to play a significant role. Big tech companies are already betting heavily on AI copilots becoming a major part of future applications and operating systems, and LLM agent frameworks will enable companies to create their own customized copilots. Microsoft’s entrance into this field with AutoGen is a testament to the intensifying competition around LLM agents and their future potential.