After a long preview, Microsoft finally released version 1.0 of Semantic Kernel! The good news is that the SDK is now stable and has matured a lot over time, introducing new features, improving performance and aligning naming and features with the latest innovations introduced by OpenAI. The bad news is that all these features come at a cost: 1.0 introduces tons of breaking changes, which means that you will have to apply a lot of updates to your code.
In this post, we’re going to review the basic changes that must be applied to every project, like the new way to set up the kernel or to invoke a function. In the next posts, instead, we’ll look in more detail at the deeper changes that affected some of the features we have discussed on this blog, like semantic functions (now called prompt functions) and native plugins.
Setting up the kernel
The first important change is the way we set up the kernel. With the goal of aligning the naming with the terms used in the AI industry by OpenAI and Hugging Face, all the methods to set up an AI service have been renamed. The new names are the following:
- AddChatCompletion() to use chat completion models.
- AddTextEmbeddingGeneration() to use embedding models.
- AddTextGeneration() to use text generation models.
- AddTextToImage() to use image generation models.
All these methods, as before, are available in two variants, based on the AI service you want to use. For example, if you want to use a chat completion model, you’re going to use:
- AddOpenAIChatCompletion() to use OpenAI models.
- AddAzureOpenAIChatCompletion() to use Azure OpenAI models.
Another important change is the way you initialize the kernel. Instead of manually creating a new instance of the KernelBuilder class, you must now obtain a KernelBuilder object using the CreateBuilder() method, which is a static method of the Kernel class. Here is an example of the new initialization code:
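A minimal sketch, assuming an Azure OpenAI chat completion model (the deployment name, endpoint and API key are placeholders for your own values):

```csharp
using Microsoft.SemanticKernel;

// Obtain the builder through the static factory method...
var builder = Kernel.CreateBuilder();

// ...register the AI service you want to use...
builder.AddAzureOpenAIChatCompletion(
    deploymentName: "gpt-35-turbo",
    endpoint: "https://your-resource.openai.azure.com/",
    apiKey: "your-api-key");

// ...and build the kernel.
Kernel kernel = builder.Build();
```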
Executing a basic function
The way you execute a plain prompt (without using any plugin) has also changed. Let’s take a look at the following example:
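A sketch of how this could look, assuming a simple prompt with a single {{$topic}} variable (the prompt text and values are illustrative, and kernel is the instance built in the previous snippet):

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var prompt = "Write a short summary about {{$topic}}.";

// The settings that customize how the LLM executes the prompt.
var executionSettings = new OpenAIPromptExecutionSettings
{
    Temperature = 0.7,
    MaxTokens = 500
};

// Turn the prompt into a function the kernel can invoke.
var function = kernel.CreateFunctionFromPrompt(prompt, executionSettings);

// The input variables for the prompt.
var arguments = new KernelArguments
{
    ["topic"] = "the history of Semantic Kernel"
};

// Note the order: first the function, then the arguments.
var result = await kernel.InvokeAsync(function, arguments);
Console.WriteLine(result);
```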
The first change is that, to create a function out of a prompt, we must use a new method called CreateFunctionFromPrompt(). Additionally, if we want to customize the LLM parameters, we must supply the method with a new object called OpenAIPromptExecutionSettings, which offers various properties to customize the execution of the prompt. In the example, you can see that we have changed the values of Temperature and MaxTokens.
Another important change is that the ContextVariables collection has been renamed to KernelArguments. However, it works exactly the same way as before: you supply a collection of key / value pairs, where each of them maps to a variable defined in the prompt.
Finally, the way we invoke a function has also changed: the RunAsync() method has been renamed to InvokeAsync() and the parameters must be passed in the reverse order (first the function, then the input variables).
Making the code shorter and simpler
Semantic Kernel 1.0 has also added a simpler way to invoke a prompt, thanks to the InvokePromptAsync() method. This is how we can simplify the previous code:
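Again a sketch, reusing the same illustrative prompt and kernel:

```csharp
// Invoke the prompt directly, without creating a function first.
var result = await kernel.InvokePromptAsync(
    "Write a short summary about {{$topic}}.",
    new KernelArguments
    {
        ["topic"] = "the history of Semantic Kernel"
    });

Console.WriteLine(result);
```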
As you can see, we don’t have to create a function out of the prompt anymore: we can just invoke it directly. The only parameters we must pass are the prompt itself and the KernelArguments collection with the variables required by the prompt.
Streaming the response
If you have ever played with ChatGPT, you have noticed that responses aren’t returned as a “one shot” message, but they are “streamed”: a few words at a time, the response is gradually composed in front of the eyes of the user. This approach isn’t used just to create a more compelling visual effect: it mitigates the fact that LLMs take a while before they can generate a full response. Simplifying a lot, what an LLM does is “guess” the next word based on the previous ones, using probability. This means that, to generate a full response, the LLM must “guess” the next word, then the next one, then the next one, and so on. This process takes time, so the response is gradually streamed to the user to provide a better experience.
Semantic Kernel 1.0 offers an easy way to implement the same approach, by providing streaming variants of the InvokeAsync() and InvokePromptAsync() methods, called InvokeStreamingAsync() and InvokePromptStreamingAsync(). Let’s see how the previous example can be changed to use a streaming response:
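A sketch with the same illustrative prompt:

```csharp
await foreach (var chunk in kernel.InvokePromptStreamingAsync(
    "Write a short summary about {{$topic}}.",
    new KernelArguments
    {
        ["topic"] = "the history of Semantic Kernel"
    }))
{
    // Each chunk contains a piece of the response, printed as it arrives.
    Console.Write(chunk);
}
```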
These methods make use of one of the most recent features introduced in C#: asynchronous iterators. We call the InvokePromptStreamingAsync() method (which works exactly like the standard variant, meaning you must provide the prompt and the input variables) and we iterate over the result using the await foreach statement. The result is an asynchronous stream of StreamingKernelContent objects, each containing a chunk of the response, which is yielded continuously as new content gets generated.
Wrapping up
In this post, we have learned how to migrate a Semantic Kernel project to the new 1.0 version. In the next posts, we’ll look in more detail at the changes that affected semantic functions, native plugins and the planner.
You can find the updated projects on GitHub.
Happy coding!