After a long preview, Microsoft finally released version 1.0 of Semantic Kernel! The good news is that the SDK is now stable and has matured a lot over time, introducing new features, improving performance and aligning naming and features with the latest innovations introduced by OpenAI. The bad news is that all these features come at a cost: 1.0 introduces tons of breaking changes, which means that you will have to apply a lot of updates to your code.
In this post, we’re going to review the basic changes that must be applied to every project, like the new way to set up the kernel or to invoke a function. In the next posts, instead, we’ll look in more detail at the deeper changes that affected some of the features we have discussed on this blog, like semantic functions (now called prompt functions) and native plugins.
Setting up the kernel
The first important change is the way we set up the kernel. With the goal of aligning the naming with the terms used in the AI industry by OpenAI and Hugging Face, all the methods to set up an AI service have been renamed. The new names are the following:
- AddChatCompletion() to use chat completion models.
- AddTextEmbeddingGeneration() to use embedding models.
- AddTextGeneration() to use text generation models.
- AddTextToImage() to use image generation models.
All these methods, as before, are available in two variants, based on the AI service you want to use. For example, if you want to use a chat completion model, you’re going to use:
- AddOpenAIChatCompletion() to use OpenAI models.
- AddAzureOpenAIChatCompletion() to use Azure OpenAI models.
Another important change is the way you initialize the kernel. Instead of manually creating a new instance of the KernelBuilder class, you must now obtain a KernelBuilder object using the CreateBuilder() method, which is a static method of the Kernel class. Here is an example of the new initialization code:
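A minimal sketch, assuming an Azure OpenAI chat completion model (the deployment name, endpoint and API key are placeholders for your own values):

```csharp
using Microsoft.SemanticKernel;

// Obtain the builder through the static factory method...
var builder = Kernel.CreateBuilder();

// ...register the AI service you want to use...
builder.AddAzureOpenAIChatCompletion(
    deploymentName: "gpt-35-turbo",
    endpoint: "https://your-resource.openai.azure.com/",
    apiKey: "your-api-key");

// ...and build the kernel.
Kernel kernel = builder.Build();
```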
Executing a basic function
The way you execute a plain prompt (without using any plugin) has also changed. Let’s take a look at the following example:
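A sketch of how this could look, assuming a simple prompt with a single {{$topic}} variable (the prompt text and values are illustrative, and kernel is the instance built in the previous snippet):

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var prompt = "Write a short summary about {{$topic}}.";

// The settings that customize how the LLM executes the prompt.
var executionSettings = new OpenAIPromptExecutionSettings
{
    Temperature = 0.7,
    MaxTokens = 500
};

// Turn the prompt into a function the kernel can invoke.
var function = kernel.CreateFunctionFromPrompt(prompt, executionSettings);

// The input variables for the prompt.
var arguments = new KernelArguments
{
    ["topic"] = "the history of Semantic Kernel"
};

// Note the order: first the function, then the arguments.
var result = await kernel.InvokeAsync(function, arguments);
Console.WriteLine(result);
```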
The first change is that, to create a function out of a prompt, we must use a new method called CreateFunctionFromPrompt(). Additionally, if we want to customize the LLM parameters, we must supply the method with a new object called OpenAIPromptExecutionSettings, which offers various properties to customize the execution of the prompt. In the example, you can see that we have changed the values of Temperature and MaxTokens.
Another important change is that the ContextVariables collection has been renamed to KernelArguments. However, it works exactly the same way as before: you supply a collection of key / value pairs, where each of them maps to a variable defined in the prompt.
Finally, the way we invoke a function has also changed: the RunAsync() method has been renamed to InvokeAsync() and the parameters must be passed in the reverse order (first the function, then the input variables).
Making the code shorter and simpler
Semantic Kernel 1.0 has also added a simpler way to invoke a prompt, thanks to the InvokePromptAsync() method. This is how we can simplify the previous code:
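Again a sketch, reusing the same illustrative prompt and kernel:

```csharp
// Invoke the prompt directly, without creating a function first.
var result = await kernel.InvokePromptAsync(
    "Write a short summary about {{$topic}}.",
    new KernelArguments
    {
        ["topic"] = "the history of Semantic Kernel"
    });

Console.WriteLine(result);
```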
As you can see, we don’t have to create a function out of the prompt anymore: we can just invoke it directly. The only parameters we must pass are the prompt itself and the KernelArguments collection with the variables required by the prompt.
Streaming the response
If you have ever played with ChatGPT, you have noticed that responses aren’t returned as a “one shot” message, but they are “streamed”: a few words at a time, the response is gradually composed in front of the eyes of the user. This approach isn’t used just to create a more compelling visual effect: it mitigates the fact that LLMs take a while before they can generate a full response. Simplifying a lot, what an LLM does is “guess” the next word based on the previous ones, using probability. This means that, to generate a full response, the LLM must “guess” the next word, then the next one, then the next one, and so on. This process takes time, so the response is gradually streamed to the user to provide a better experience.
Semantic Kernel 1.0 offers an easy way to implement the same approach, by providing streaming variants of the InvokeAsync() and InvokePromptAsync() methods, called InvokeStreamingAsync() and InvokePromptStreamingAsync(). Let’s see how the previous example can be changed to use a streaming response:
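A sketch with the same illustrative prompt:

```csharp
await foreach (var chunk in kernel.InvokePromptStreamingAsync(
    "Write a short summary about {{$topic}}.",
    new KernelArguments
    {
        ["topic"] = "the history of Semantic Kernel"
    }))
{
    // Each chunk contains a piece of the response, printed as it arrives.
    Console.Write(chunk);
}
```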
These methods make use of one of the most recent features introduced in C#: asynchronous iterators. We call the InvokePromptStreamingAsync() method (which works exactly like the standard variant, meaning you must provide the prompt and the input variables) and we iterate over the result using the await foreach statement. The result is an asynchronous stream of StreamingKernelContent objects, each containing a chunk of the response, which is yielded continuously as new content gets generated.
Wrapping up
In this post, we have learned how to migrate a Semantic Kernel project to the new 1.0 version. In the next posts, we’ll look in more detail at the changes that affected semantic functions, native plugins and the planner.
You can find the updated projects on GitHub.
Happy coding!