In the previous post, we started to explore the capabilities of AutoGen, a framework by Microsoft to create multi-agent solutions. In the post, we started approaching it with AutoGen Studio, a “low code” web application that you can use as a playground to validate your scenario before actually writing code. As a scenario, we used a fun one to demonstrate the potential of AutoGen: a rap battle between two agents, with an MC agent introducing the battle and a judge agent deciding who has created the best rap lyrics.
In this post, we’re going to enhance our rap battle by using one of the capabilities of AutoGen that we haven’t used so far but that is very powerful: tools! Tools allow you to add extra capabilities to an agent, and they are represented as Python functions. Being code, they can do anything, from calling an API to querying a database or reading the content of a file. This means we can expand an agent’s skills beyond what it can do using only the LLM that powers it.
Let’s start!
Adding our first tool
For our demonstration, we’re going to tweak our rap battle scenario a bit. The Rap MC, instead of just introducing the topic, will now share with the two rappers a list of words that they have to use in their rap lyrics. However, this list of words won’t be hardcoded in the instructions or randomly generated by the LLM; it will be returned by a Python function that we’re going to create. For the sake of this post, the Python function will be incredibly simple:
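Here is the function as it appears in the Source code step later in this post, with its line breaks restored. The comments and the call at the bottom are additions for a quick local check; they are not part of the tool itself.

```python
def get_list_of_words() -> str:
    # Log the call so it is easy to spot when the agent invokes the tool
    print("Get_list_of_words invoked")
    # Static, comma-separated list of words the rappers must use
    words = "apple, banana, cherry, date, elderberry, Christmas"
    return words


if __name__ == "__main__":
    # Quick local check before pasting the function into AutoGen Studio
    print(get_list_of_words())
```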
The function simply returns a static list of words, nothing more, nothing less. We could have retrieved the list from an API or read it from a text file; from the agent’s point of view, the result would have been the same.
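As a rough illustration of the text file option, here is a minimal sketch. The function name `get_list_of_words_from_file` and the default file name `words.txt` are assumptions made up for this example; the file is assumed to contain one word per line.

```python
from pathlib import Path


def get_list_of_words_from_file(path: str = "words.txt") -> str:
    """Return the words stored one per line in `path` as a comma-separated string."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Skip blank lines and trim surrounding whitespace
    words = [line.strip() for line in lines if line.strip()]
    return ", ".join(words)
```

Since it still returns a plain comma-separated string, it could replace the static version without changing anything else in the scenario.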
Let’s see how we can add this function to our project. In AutoGen Studio, open the Team Builder section and select the team you created in the previous post. In the Component Library, under Tools, you will find the tool that we removed in the previous post, called `calculator`. Drag and drop it under the Tools section of the Rap MC agent:
Click on the pencil icon near the agent’s name and scroll down to the end of the Configuration section. First, make sure to turn on the option Reflect on Tool Use. Then, click on the FunctionTool to open the editor. We’ll need to customize a few things since, by default, the tool contains a Python function to perform basic math operations. You can leave the standard description as it is (it is only for human readers), so let’s focus on the Configuration section:
- Name: call the function `get_list_of_words`
- Description: this is the description of the function that will be used by the LLM to understand what it does. Use the following one:

  Get a list of words that can be used to generate the lyrics for a rap song. Participants to the rap battle must use all the words returned by this function.

- Source code: this is the actual code of the function. Copy and paste the Python function I shared before:

  ```python
  def get_list_of_words() -> str:
      print("Get_list_of_words invoked")
      words = "apple, banana, cherry, date, elderberry, Christmas"
      return words
  ```
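To see why the Name and Description fields matter, here is a rough sketch of the kind of metadata a tool exposes so that an LLM can decide when to call it. This is not AutoGen’s actual implementation: `describe_tool` is a hypothetical helper and the dictionary layout is an assumption for illustration only.

```python
import inspect
from typing import Any, Callable, Dict


def get_list_of_words() -> str:
    words = "apple, banana, cherry, date, elderberry, Christmas"
    return words


def describe_tool(func: Callable[..., Any], description: str) -> Dict[str, Any]:
    """Hypothetical helper: collect the metadata an LLM would need to pick a tool."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": description,
        # Parameter names only; get_list_of_words takes none
        "parameters": list(signature.parameters),
    }


tool_info = describe_tool(
    get_list_of_words,
    "Get a list of words that can be used to generate the lyrics for a rap song.",
)
print(tool_info["name"], tool_info["parameters"])
```

The model never reads the function body, so a vague name or description leaves it with little to go on when deciding whether to call the tool.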
Then click Save Changes. The next step is to tweak the instructions of the Rap MC agent. We need to make it aware that it must now use this function before starting the rap battle, so that it can share the list of words with the rappers. Click again on the pencil icon near the agent’s name and scroll down to the System Message section. Replace the instructions with the following ones:
As you can see, we have instructed the agent that now it has access to a skill that gives it a list of words for rap lyrics and that it must use it to get the list of words before starting the rap battle. Click on Save Changes.
Testing our changes
We don’t need to make further changes for the moment. This tool will be needed only by the Rap MC agent, so we don’t need to add it to the other agents. Let’s test our changes. Click on the Playground button and create a new session, by choosing the team you have just updated. Then, start a new rap battle by providing a topic, for example:
If everything goes as planned, you will see that, this time, the Rap MC agent will call the function `get_list_of_words()` before starting the rap battle:
And if you read the message generated by the Rap MC agent, you will notice that, this time, other than introducing the topic, it also shares the list of words with the rappers:
Finally, if you read the lyrics generated by the two rapper agents, you will see that they have used all the words in the list.
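If you would rather not check the lyrics by eye, a small helper can do it. This is a minimal sketch: `missing_words` and the sample lyrics are made up for illustration, and the matching is deliberately simple (whole words, case-insensitive), so a plural such as "apples" would not count as "apple".

```python
import re
from typing import List


def missing_words(lyrics: str, words: str) -> List[str]:
    """Return the words from the comma-separated `words` that do not appear in `lyrics`."""
    # Tokenize the lyrics into lowercase words so matching ignores case and punctuation
    tokens = set(re.findall(r"[a-z']+", lyrics.lower()))
    required = [w.strip() for w in words.split(",") if w.strip()]
    return [w for w in required if w.lower() not in tokens]


words = "apple, banana, cherry, date, elderberry, Christmas"
sample_lyrics = "Apple and banana, cherry on a date, elderberry wine at Christmas, great!"
print(missing_words(sample_lyrics, words))  # an empty list means every word was used
```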
Spicing up the competition
Let’s spice up the competition a bit. Let’s change our scenario so that the judge agent, after scoring the lyrics, can call for another round if the two scores are too close. You will soon understand why we’re doing this 😊
Let’s go back to the Team Builder section of AutoGen Studio and let’s edit the Judge agent, by clicking on the pencil icon near its name. Update the System Instructions with the following ones:
This is the part that we have changed, compared to the previous instructions:
This way, the judge agent will be able to ask for a new round in case the two scores are too close. Click on Save Changes.
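In the post, this decision is made by the LLM from the judge’s instructions, not by code. To make the rule concrete, it could be written as the following sketch. The function name and the `threshold` value are assumptions; the limit of 3 rounds matches the one mentioned later in this post.

```python
def needs_another_round(
    score_a: float,
    score_b: float,
    rounds_played: int,
    threshold: float = 1.0,  # assumed meaning of "too close"
    max_rounds: int = 3,
) -> bool:
    """Return True when the scores are too close and the round limit is not reached."""
    too_close = abs(score_a - score_b) <= threshold
    return too_close and rounds_played < max_rounds


print(needs_another_round(8, 8, rounds_played=1))  # tie after round 1
print(needs_another_round(9, 6, rounds_played=1))  # clear winner
```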
Now go back to the Playground section and start a new session. Start a new chat experience by sharing the topic you want to use, like:
You will see that things go almost as planned, but not quite. The judge agent will score the two rappers and, in case of a tie or a very close score, it will ask to run a new round. However, instead of giving the stage directly to the two rappers to generate a new round of lyrics, you will see the workflow starting again from scratch:
- The Rap MC agent will call the function `get_list_of_words()` to get the list of words
- The Rap MC agent will introduce the topic and share the list of words with the two rappers
- The two rappers will generate a new round of lyrics
- The judge agent will score the two rappers
This is not what we want. We want the judge agent to give the stage directly to the two rappers, without going through the Rap MC agent again. Why is this happening?
The answer lies in the type of team we are using, which is `RoundRobinGroupChat`. As the name says, this type of team can only run a round-robin workflow, where each agent speaks in turn, in a fixed order. This means that, when the judge agent asks for a new round, the workflow starts again from scratch, with the Rap MC agent speaking first.
An easy way to observe this behavior is to change the order in which we have added the agents to the team. If we add them in an order different than the current one (Rap MC, Rappers, Judge), our scenario will be broken, because the order doesn’t match the workflow we have envisioned.
The following image shows what happens when you scramble the order of the agents in the team (first judge, then one rapper, then Rap MC and then another rapper). As you can notice, in this case the conversation ended not because the battle is over, but because we have reached the maximum number of turns (10) since the judge agent wasn’t able to properly evaluate and conclude the battle.
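The round-robin behavior described above can be sketched in a few lines of plain Python. This is only an illustration of the turn order, not AutoGen code, and the agent names are placeholders.

```python
from itertools import cycle, islice
from typing import List


def round_robin_order(agents: List[str], turns: int) -> List[str]:
    """Return who speaks on each turn when agents simply take turns in list order."""
    return list(islice(cycle(agents), turns))


team = ["rap_mc", "rapper_1", "rapper_2", "judge"]
# The turn after the judge always goes back to the first agent in the list
print(round_robin_order(team, 6))

scrambled = ["judge", "rapper_1", "rap_mc", "rapper_2"]
# With a scrambled list, the judge speaks before any lyrics exist
print(round_robin_order(scrambled, 4))
```

Nothing in this loop looks at what was said, which is why a request for a new round cannot change who speaks next.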
Is there a way to solve this problem? The answer is yes!
Introducing the SelectorGroupChat
The `SelectorGroupChat` is another, more powerful type of team supported by AutoGen. Unlike the `RoundRobinGroupChat`, the `SelectorGroupChat` allows you to create a workflow where each agent speaks when it is needed, without following a strict order. This is possible thanks to three extra features:
- The selector prompt: using natural language, we can provide a prompt that explains the expected workflow of the conversation.
- LLM integration: the `SelectorGroupChat` is powered by an LLM of its own, which enables AutoGen to reason about the conversation and the context and decide which agent should speak next.
- The ability to call an agent multiple times.
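To make the contrast with round-robin concrete, here is a toy, rule-based stand-in for the selector. In the real `SelectorGroupChat` the choice is made by an LLM guided by the selector prompt; the function, agent names, and the "new round" keyword check below are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Message = Tuple[str, str]  # (speaker, text)


def select_next_speaker(history: List[Message]) -> Optional[str]:
    """Toy stand-in for the LLM: pick the next speaker from the last message."""
    if not history:
        return "rap_mc"
    speaker, text = history[-1]
    if speaker == "rap_mc":
        return "rapper_1"
    if speaker == "rapper_1":
        return "rapper_2"
    if speaker == "rapper_2":
        return "judge"
    # After the judge: on a new round, skip the MC and go straight back to the rappers
    if speaker == "judge" and "new round" in text.lower():
        return "rapper_1"
    return None  # the judge declared a winner, so nobody speaks next


print(select_next_speaker([("judge", "Too close to call, let's run a new round!")]))
```

Unlike the fixed turn order, the decision here depends on the content of the conversation, which is what lets the battle loop back to the rappers.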
To use a `SelectorGroupChat`, we must switch to JSON mode in our Team Builder, since it isn’t supported yet by the UI. The first step is to change the provider property of the main object. When you switch to JSON mode, the first line will be:
You must change it to:
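The exact provider strings are not reproduced here. As a sketch, assuming the provider value is the fully qualified class path of the team (for example `autogen_agentchat.teams.RoundRobinGroupChat`), the change amounts to swapping the class name at the end. The JSON below is a trimmed, assumed shape of the team definition, not a full export.

```python
import json

# Assumed, heavily trimmed shape of the team JSON; only "provider" matters here
team_json = '{"provider": "autogen_agentchat.teams.RoundRobinGroupChat", "component_type": "team"}'

team = json.loads(team_json)
# Swap the class name at the end of the provider path
team["provider"] = team["provider"].replace("RoundRobinGroupChat", "SelectorGroupChat")
print(json.dumps(team, indent=2))
```

Check the value shown in your own JSON editor before changing it, since it may differ between AutoGen Studio versions.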
This is enough since, once you do that, the UI will be updated to support the extra properties which are required by the `SelectorGroupChat`. Go back to visual mode and click on the Edit icon. You will see that we have some new properties we can customize:
Let’s start with the **Selector Prompt**, which is the set of instructions that describes the workflow we want to implement. Copy and paste the following text:
This prompt is quite detailed and explains the various situations that can occur during the rap battle. We are also explicitly calling out the possibility that an agent can be called multiple times.
You can see that we also have a series of placeholders, like `{roles}` or `{history}`. These are required by AutoGen: they are replaced with the agents’ roles and the conversation so far, so that the team has the entire context it needs to make the best possible decision.
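As an illustration of how such placeholders work, here is a minimal sketch that fills a template with Python’s `str.format`. The prompt text, roles, and history below are made up for the example and are much shorter than the prompt used in this post; AutoGen performs the substitution itself, and whether it uses this exact mechanism internally is an assumption.

```python
# Hypothetical selector prompt, much shorter than the one used in the post
selector_prompt = (
    "The following roles take part in a rap battle:\n{roles}\n\n"
    "Conversation so far:\n{history}\n\n"
    "Select the role that should speak next. Only return the role name."
)

roles = "rap_mc: introduces the battle\njudge: scores the lyrics"
history = "rap_mc: Welcome to the battle!"

# Each placeholder is replaced with the current value before the prompt is sent
filled = selector_prompt.format(roles=roles, history=history)
print(filled)
```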
The next step is to turn on the option Allow repeated speaker, which enables the team to call the same agent multiple times if needed.
Finally, we need to set up the LLM which will be used by the `SelectorGroupChat` to reason about the conversation. The process is exactly the same one we followed for the agents in the previous post:
- If you want to use OpenAI, you can just provide your API key in the API Key field.
- If you want to use Azure OpenAI or another LLM, you will need to switch to JSON mode and add the missing properties.
Another option you may want to adjust is Max Turns in the team configuration. It’s currently set to 10 but, since we have tuned the instructions so that the Judge agent can run up to 3 rounds, we may want to increase this value to 15 or 20 to make sure the battle doesn’t terminate early.
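A rough count shows why 10 is tight. The sketch below assumes that each agent response counts as one turn and that a round consists of the two rappers followed by the judge; tool calls or the user’s initial message may count as well, depending on how the limit is applied, so treat the numbers as a lower bound.

```python
def estimated_turns(rounds: int, rappers: int = 2) -> int:
    """Rough turn count: one MC introduction, then the rappers plus one judge verdict per round."""
    return 1 + rounds * (rappers + 1)


for rounds in (1, 2, 3):
    print(rounds, estimated_turns(rounds))
```

Under these assumptions a full 3-round battle needs at least 10 turns, which leaves no headroom with the default value.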
Now we’re ready to test our changes!
Running our improved rap battle
Go back to the Playground section and start a new session, as we have done so far, by sharing the topic you want to use, like:
You should see that the battle is now much more dynamic and, in case of a tie or a very close score, the Judge agent will ask for a new round without going through the Rap MC agent again, like in the following screenshot:
The following workflow created by AutoGen Studio shows the entire conversation and makes it clear that we are no longer using a round-robin approach: right after the Judge declares that the battle is not over yet, the two rappers are called again to generate new lyrics:
Please note that, when you use the `SelectorGroupChat`, it becomes more important than ever to provide a proper description of each agent in the Description property, which you can find in the Configuration section of the agent. This is because the `SelectorGroupChat` uses this description to understand what the agent is capable of doing and, if the description is not clear, it may select the wrong agent to perform a task.
Wrapping up
In this post, we have explored two new features offered by AutoGen:
- How we can empower agents with tools, which expand the capabilities of an agent beyond what it can do with the LLM alone and enable scenarios like calling an API, querying a database or reading the content of a file.
- How we can use the `SelectorGroupChat` to create a more dynamic workflow, where agents can be called multiple times and the order in which they speak is not fixed.
In the next post, we’ll move from playground to production and we’ll see how we can use AutoGen in our code, both by exporting the workflow we have created with AutoGen Studio and by writing code from scratch using the AutoGen APIs.
Happy coding!