
A smarter way to manage prompts with Semantic Kernel using Prompty

Prompty is a new solution from Microsoft to standardize prompts and their execution into a single asset. Let’s take an overview of it and see how we can use it with Semantic Kernel.

One of the needs that often comes up when you’re building an AI-powered application is a smart way to manage a prompt library. Even if, due to the nature of LLMs, chat is the predominant interface offered by AI applications, there are many scenarios where AI stays behind the scenes and augments the capabilities of the application. Think, for example, of Copilot for Microsoft 365. Many experiences are chat based, but there are also many experiences which are triggered directly from the user interface. In Outlook, when you want to get a summary of a thread, you click on the Summary by Copilot banner at the top of the conversation; in Word, when you want to convert some content into a table, you select it, right-click on it and choose Copilot -> Rewrite as a table. In all these scenarios, the user doesn’t see “the magic behind” but, in the background, Copilot is just executing a prompt.

In all these scenarios, you can’t just rely on a prompt hard coded in the application. You need a way to manage it, test it and easily change it without rebuilding the application. On this blog, we have already learned a way to do that by using one of the features offered by Semantic Kernel: prompt functions. However, Microsoft has recently introduced a new solution to standardize prompts and their execution into a single asset that we can use to improve the management of prompts in our applications. It’s called Prompty and it’s part of Prompt Flow, a suite of development tools from Microsoft to streamline the end-to-end development of LLM-based applications.

Specifically, Prompty relies on Prompt Flow to give you the ability to test prompts without needing to write the code to execute them or to run the full application.

Let’s take a deeper look!

Start with Prompty

The easiest way to start with Prompty is by installing the dedicated Visual Studio Code extension. Once you have installed it, you will get the ability to right-click in the Explorer panel and choose New prompty. This will create a new file in the folder called basic.prompty, which contains an example prompt together with a YAML configuration section.

As you can see, the file is more complex than the prompt functions we have seen in the past and it allows much more flexibility. Let’s take a look at the content of the file and, along the way, we’re going to customize it.

At the top of the file, you will see a section wrapped between three dashes (---). This is the configuration section, which we will cover later; you’ll understand why shortly. Let’s start with what we find below it, which is the actual prompt. You can immediately notice how you can express a full conversation using the Chat Completion API style:

  • We can define multiple actors in the conversation (system prompt, user prompt, etc.)
  • We can provide variables that, at runtime, will be replaced with values coming from the application.

This is how this section looks for my prompt:


system:
You are an AI assistant who helps people find information. As the assistant, 
you answer questions briefly, succinctly, and in a personable manner using 
markdown and even add some personal flair with appropriate emojis.

# Customer
You are helping {{firstName}} to find answers to their questions.
Use their name to address them in your responses.

# Context
Use the following context to provide a more personalized response to {{firstName}}:
{{context}}

user:
{{question}}

First, we can notice how we are basically setting up a whole agent, since we are providing a system prompt (to instruct the LLM on how to respond) and a user prompt (which is the message we’ll get from the application).

Second, even though the prompt is generic (you are an AI assistant who helps people find information), we are using variables to make it more dynamic. We are instructing the LLM to always address the user by their name and we are providing some information the LLM can use to generate an answer (think of it as a super simplified RAG implementation).

All three variables (the name of the user, the context and the question) are supplied as parameters, using the {{variable}} syntax. When we invoke this prompt from our application, we’ll have to provide the values for these variables.
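Just to anticipate how this works (the full code is in the Semantic Kernel section later in this post), the application supplies these values through a dictionary-like collection, where each key matches one of the placeholders. Here is a minimal sketch using Semantic Kernel’s KernelArguments:

using Microsoft.SemanticKernel;

// Each key matches one of the {{variable}} placeholders defined in the prompt
KernelArguments arguments = new()
{
    { "firstName", "Matteo" },
    { "context", "Contoso Electronics is a leader in the aerospace industry..." },
    { "question", "Which type of services does Contoso Electronics provide?" }
};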

The sample section

At the beginning of the post, we said that one of the benefits of Prompty is that it allows you to test prompts without needing to write the code to execute them or to run the full application. How can we do that? That’s where the sample section comes into play. In this section, which belongs to the Prompty configuration at the top, you can provide the values for the variables you have defined in the prompt, and they will be used when you execute the prompt using the built-in tools provided by Visual Studio Code.

Let’s take a look at my sample section:

sample:
  firstName: Matteo
  context: >
    Contoso Electronics is a leader in the aerospace industry, providing advanced electronic 
    components for both commercial and military aircraft. We specialize in creating cutting edge systems that are both reliable and efficient. Our mission is to provide the highest 
    quality aircraft components to our customers, while maintaining a commitment to safety 
    and excellence. We are proud to have built a strong reputation in the aerospace industry 
    and strive to continually improve our products and services. Our experienced team of 
    engineers and technicians are dedicated to providing the best products and services to our 
    customers. With our commitment to excellence, we are sure to remain a leader in the aerospace industry for years to come    
  question: Which type of services does Contoso Electronics provide?

As you can see, I’m providing the values for the variables I have defined in the prompt. When I execute the prompt, these values will be used to generate the response. Again, it’s very simplistic, but it’s essentially a basic RAG implementation: we are asking a question (about the type of services offered by a fictional company called Contoso Electronics) and, in the context, we are providing the information that the LLM can use to answer it.

Testing the prompt

Now that we have completed the creation of the prompty file, we can actually test it without needing to write any code. There are multiple ways to do that; let’s start with the easiest one, which is simply running it. Before doing that, however, we must set up the AI service we want to use to execute the prompt. To do that, go to the Settings of Visual Studio Code and search for prompty with the internal search engine. You’ll see a section like the following one:

The Visual Studio Code settings to configure prompty

Click on Edit in settings.json below Model configurations to open the settings file, in which you will find a section named prompty.modelConfigurations. Here, you can configure one or more AI services, which can be based on Azure OpenAI or OpenAI. For example, here is the configuration to use gpt-4o from my Azure OpenAI instance:

   "prompty.modelConfigurations": [
        {
            "name": "gpt4-o",
            "type": "azure_openai",
            "api_version": "2024-02-15-preview",
            "azure_endpoint": "<endpoint>",
            "azure_deployment": "gpt-4o",
            "api_key": "<api-key>"
        },
        {
            "name": "gpt-3.5-turbo",
            "type": "openai",
            "api_key": "<api-key>",
            "organization": "<org-id>",
            "base_url": "<base-url>"
        }
    ],

As you can see, you can specify more than one configuration. They will all show up in the Visual Studio Code status bar at the bottom:

The model selection in Visual Studio Code

If you click on it, you will be able to choose one of the services that you have previously added in the prompty.modelConfigurations section.

Once you have selected the model you want to use, make sure the prompty file is the active one in Visual Studio Code, then click on the play button at the top right of the file to execute the prompt:

The button to test a prompt in Visual Studio Code

If you have set up the AI service correctly, Visual Studio Code will run the prompt against the LLM and fill the variables with the values we have set in the sample section. You will see the response in the output panel:

Hey Matteo! Contoso Electronics provides advanced electronic components for both commercial and military aircraft. 
Our services include designing and manufacturing cutting-edge systems that are reliable and efficient. 
We pride ourselves on our commitment to safety, excellence, and continual improvement in our products and services. ✈️🔧

If you prefer, in the panel list you will also find a panel called Prompty Output (Verbose), which shows the entire API communication with the AI service, including the JSON request and response.

Pretty cool, right? We have been able to test the quality of our prompt without needing to write any code. And, if we want to make any change, we can just edit the prompty file and run it again. For example, let’s slightly change the system prompt:

system:
You are an AI assistant who helps people find information. As the assistant, 
you answer questions briefly, succinctly, and in a personable manner using 
markdown and even add some personal flair with appropriate emojis.
Respond using the JSON format, by including the original question and the response.

We have added an instruction to return the response in JSON format, rather than in Markdown.

We can just hit Run again to see how the response changes:

2024-07-24 11:45:15.179 [info] Calling https://semantickernel-gpt.openai.azure.com//openai/deployments/gpt-4o/chat/completions?api-version=2024-02-15-preview
2024-07-24 11:45:17.827 [info] ```json
{
  "question": "Which type of services does Contoso Electronics provide?",
  "response": "Hey Matteo! Contoso Electronics provides advanced electronic components for both commercial and military aircraft. Our services include designing and manufacturing cutting-edge systems that are reliable and efficient. We pride ourselves on our commitment to safety, excellence, and continual improvement in our products and services. ✈️🔧"
}

But that’s not all! We can do more advanced testing thanks to Prompt Flow.

Let’s take a look!

Advanced testing with Prompt Flow

When you worked on the Prompty file, you might have noticed that Visual Studio Code was showing a few options at the top:

The options offered when you work with a Prompty file

All of them are based on Prompt Flow, and the most interesting one is Open test chat page. The Test option, in fact, will simply execute the prompt, like we have done before, but with the additional requirement of needing to set up Prompt Flow. Open test chat page, instead, will give us a chat interface that allows us to do deeper experiments with the prompt.

Before using it, however, we must set up Prompt Flow. This technology is based on Python, so the first step is to make sure you have a Python environment installed on your machine. If you don’t have one, you can quickly install it thanks to WinGet if you’re on Windows. Open a terminal and run the following command:

winget install Python.Python.3.11

Once you have Python installed, you can install Prompt Flow using the Python package manager, called pip:

pip install promptflow

When we use this approach to test our prompt, there’s an important difference: we are no longer executing the prompt through Visual Studio Code (which directly calls the Azure OpenAI APIs), but through the Prompt Flow service. As such, we can’t leverage the model configuration we have previously created in the Visual Studio Code settings; instead, we must add a new section to our prompty file called model, like in the following example:

model:
  api: chat
  configuration:
    type: azure_openai
    azure_endpoint: <endpoint>
    azure_deployment: gpt-4o
    api_version: 2024-02-15-preview
  parameters:
    max_tokens: 3000

This is the meaning of the various parameters:

  • api: it’s the type of API you want to use. In our case, we use chat to use the chat completion service.
  • configuration is used to specify the service configuration with the following parameters:
    • type: this is the AI service you want to use. In my case I’m using Azure OpenAI, so I set this value to azure_openai. Based on the type, the other parameters will change, since Azure OpenAI and OpenAI have different connection methods.
    • azure_endpoint: this is specific to Azure OpenAI and it’s the URL of the service you want to use.
    • azure_deployment: this is specific to Azure OpenAI and it’s the name of the deployment of the model you want to use.
    • api_version: it’s the version of the API you want to use.
  • parameters allows you to specify different parameters to customize the model interaction, like the temperature or, as in this case, the maximum number of tokens to use (see the sketch after this list for how similar settings can also be supplied from code).
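As a side note, if you prefer to control these parameters from the application rather than from the file, Semantic Kernel also lets you pass execution settings at invocation time. The following is a minimal sketch, assuming the OpenAIPromptExecutionSettings type from the OpenAI connector; it’s just an illustration, not part of the Prompty tooling:

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Execution settings roughly equivalent to the parameters section of the prompty file
var executionSettings = new OpenAIPromptExecutionSettings
{
    MaxTokens = 3000,
    Temperature = 0.2
};

// KernelArguments can be initialized with execution settings and then filled
// with the values for the prompt variables
KernelArguments arguments = new(executionSettings)
{
    { "question", "Which type of services does Contoso Electronics provide?" }
};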

As you can see, unlike what we did in the Visual Studio Code settings, we don’t have a way to specify the API key. This is because Prompt Flow requires Microsoft Entra authentication in order to securely connect with Azure OpenAI. For this reason, you will need to have the Azure CLI installed on your machine, which provides a series of commands to work with Azure.

Once you have installed it, run the following command:

az login

You will be guided through a series of steps to authenticate with your Azure account and to choose the subscription in which you have deployed the Azure OpenAI service. Once you have gone through this process, you will need to assign a dedicated role to your account to be able to leverage the integrated Microsoft Entra authentication. Open the Azure portal and go to the Azure OpenAI instance that you’re using. Click on Access Control (IAM) and then Add -> Add role assignment.

The option to add a role assignment to an Azure resource

In the first step, in the Role tab, look for a role called Cognitive Services OpenAI User and select it. Then click Next.

The role to select to enable authenticated access on Azure OpenAI

In the Members tab, make sure that the option Assign access to is set to User, group or service principal and click on Select members. From there, search for your account and click on Select:

Choosing a user for the selected role in the Azure portal

Click on Review + assign twice and wait for the operation to complete.

Now you should be good to go. To quickly check if you did everything properly, you can just click on the Test option at the top of the Prompty file. Visual Studio Code will launch the Prompt Flow service and execute the prompt. You should get a result similar to the one we have seen before.

The test executed by Prompt Flow

However, we said that the most interesting integration is the chat interface, so click on Open test chat page. This option will launch the Prompt Flow service again but, at the end, you should see your browser opening on a local server that shows a chat interface like the following one:

The Prompt Flow chat interface

On the panel on the left, you can set up the inputs and outputs of the conversation:

  • Under Chat input/output field config, the most important parameter to set is Chat input. You must specify which of the parameters supported by your prompt is the one mapped to the user input. In my case, it’s the question variable.
  • Under Prompty inputs, you must assign a value to each of the parameters you have included in your prompt (except the one you’re going to get from the chat input). In my case, I had to provide a value for context (the paragraph about the fictional company Contoso Electronics) and for firstName (the name of the user).

Now you can just chat using the chat interface on the left. Prompt Flow will send the prompt to the LLM and it will fill all the parameters with the values you have provided, either in the chat or in the Settings panel. Let’s ask, for example, What is Contoso Electronics?

The response to our question

You can notice how the instructions we provided in the prompt are followed: the LLM addresses the user by their name and uses the context to provide a more personalized response. If you want, you can click on View trace to see the inner details of the operation, like the raw API calls.

Using Prompty with Semantic Kernel

Now that we have tested our prompt, we are ready to use it in our application. Again, the Visual Studio Code extension simplifies the process. Right-click on the Prompty file and you will see the option to generate code for various AI frameworks, including Semantic Kernel:

The options to generate code starting from a Prompty file

Click on Add Semantic Kernel code and you will get a new file with the following code:

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var deployment = "gpt-4o";
var endpoint = "https://semantickernel-gpt.openai.azure.com/";
var key = "aoai_key";
var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(deployment, endpoint, key)
    .Build();

    // update the input below to match your prompty
    KernelArguments kernelArguments = new()
    {
        { "question", "what's my question?" },
    };

var prompty = kernel.CreateFunctionFromPromptyFile("basic.prompty");
var result = await prompty.InvokeAsync<string>(kernel, kernelArguments);
Console.WriteLine(result);

If you already have experience with Semantic Kernel, the code will be familiar to you. We are creating a new instance of the Kernel class and setting it up with our Azure OpenAI instance. However, we can notice a new method called CreateFunctionFromPromptyFile(), which creates a prompt function starting from a Prompty file. We just need to pass, as a parameter, the relative path of the Prompty file we want to use. To use this feature, however, we must install a new NuGet package, which is currently in preview: Microsoft.SemanticKernel.Prompty. As such, in the NuGet Package Manager, you’ll need to enable the Include prerelease option to see it.
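If you prefer the command line over the NuGet Package Manager, you can add the prerelease package with the .NET CLI:

dotnet add package Microsoft.SemanticKernel.Prompty --prerelease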

The code generated by Visual Studio Code requires a few additional changes. As you can see, through the KernelArguments collection, we are providing a value for only one of the three parameters used by the prompt, question. As such, we need to enhance the collection to add the missing ones, firstName and context. We also need to wrap the CreateFunctionFromPromptyFile() method in a pragma directive, since the feature is marked as experimental; otherwise, the code won’t compile.

This is the final version of the code:

using Microsoft.Extensions.Configuration;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var configuration = new ConfigurationBuilder()
    .AddUserSecrets("9aea1645-00e5-48dc-b396-a39b7d6821ca")
    .Build();

string apiKey = configuration["AzureOpenAI:ApiKey"];
string deploymentName = configuration["AzureOpenAI:DeploymentName"];
string endpoint = configuration["AzureOpenAI:Endpoint"];

var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(deploymentName, endpoint, apiKey)
    .Build();

KernelArguments kernelArguments = new()
    {
        { "question", "Which type of services does Contoso Electronics provide?" },
        { "firstName", "Matteo" },
        { "context", @"Contoso Electronics is a leader in the aerospace industry, providing advanced electronic components for both commercial and military aircraft. 
                    We specialize in creating cutting edge systems that are both reliable and efficient. Our mission is to provide the highest quality aircraft components to our customers, 
                    while maintaining a commitment to safety and excellence. We are proud to have built a strong reputation in the aerospace industry and strive to continually 
                    improve our products and services. Our experienced team of engineers and technicians are dedicated to providing the best products and services to our customers. 
                    With our commitment to excellence, we are sure to remain a leader in the aerospace industry for years to come" }
    };

#pragma warning disable SKEXP0040 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.
var prompty = kernel.CreateFunctionFromPromptyFile("basic.prompty");
#pragma warning restore SKEXP0040 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.
var result = await prompty.InvokeAsync<string>(kernel, kernelArguments);
Console.WriteLine(result);

If you run the application (in my case, it’s a .NET console application), you will see an output similar to the one we observed during our tests:

The prompt executed in a console app in .NET
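As a closing note on authentication: since we already assigned the Cognitive Services OpenAI User role to our account for Prompt Flow, we could use the same keyless approach in the Semantic Kernel application instead of an API key. Here is a minimal sketch, assuming the Azure.Identity NuGet package is installed and that your Semantic Kernel version exposes an AddAzureOpenAIChatCompletion overload that accepts a TokenCredential:

using Azure.Identity;
using Microsoft.SemanticKernel;

// DefaultAzureCredential reuses, among others, the account signed in with "az login"
var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(
        "gpt-4o",                                     // deployment name
        "https://<your-resource>.openai.azure.com/",  // Azure OpenAI endpoint
        new DefaultAzureCredential())
    .Build();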

Wrapping up

In this post, we have seen a new way to manage prompts and their execution in our applications, thanks to Prompty, a new solution from Microsoft to standardize prompts and templates. Compared to prompt functions, with Prompty we can use a more powerful syntax, enable more complex scenarios and, most of all, test the quality of our prompt engineering skills right in Visual Studio Code, before writing any code.

Once you have validated your work, you can easily import the Prompty file into a Semantic Kernel based application. And if you are a Python developer, Visual Studio Code has you covered, since you can quickly generate a code file for LangChain as well.

You can find a Semantic Kernel sample about Prompty in my catalog of samples on GitHub.

Happy coding!
