Name: Semantic Tests for SemanticKernel Plugins using skUnit
Rating: 2.4 (669 reviews)
Author: mehrandvd

Exploring SemanticKernel

This week, I had the chance to explore the SemanticKernel code base, particularly the core plugins. SemanticKernel comes equipped with these built-in plugins:

ConversationSummaryPlugin
FilePlugin
HttpPlugin
MathPlugin
TextPlugin
TimePlugin
WaitPlugin

When I looked at the Plugins.UnitTests project, I noticed that all the unit tests are passing. But there's something interesting:

Each plugin has a corresponding test file, except for ConversationSummaryPlugin.

You might wonder why.

Here's the thing. All the other plugins have deterministic outputs, so a test can compare the result against an exact expected value. But ConversationSummaryPlugin is a different story.

For instance, it has a function called SummarizeConversation that does exactly what it says: it summarizes a conversation. But how do you test something like that? The summary comes from a language model, so its wording can change from run to run. You need to check the meaning of the output, not just whether the strings are identical.

Let's consider this test case:

USER: Is Eiffel tall?
AGENT: Yes it is
USER: What about Everest mountain?
AGENT: Yes it is tall too
USER: What about a mouse?
AGENT: No it is not tall.

If you call SummarizeConversation with this input, you should get something like:

Expected output: The conversation is about the heights of different things. Both the Eiffel Tower and Mount Everest are considered tall, while a mouse is not.
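To see why a plain string comparison falls short here, consider this minimal sketch. The class `ExactMatchDemo` and the two summaries are made up for illustration (they are not taken from SemanticKernel or its tests), and `ToUpper` simply stands in for any deterministic plugin function.

```csharp
using System;

public static class ExactMatchDemo
{
    // Stand-in for a deterministic function: the same input always
    // yields the same output, so an exact comparison is a reliable test.
    public static string ToUpper(string input) => input.ToUpperInvariant();

    // Two hypothetical summaries a model might return for the same conversation.
    public const string RunA =
        "Both the Eiffel Tower and Mount Everest are tall, while a mouse is not.";
    public const string RunB =
        "The Eiffel Tower and Everest are described as tall; a mouse isn't.";

    // The kind of check a classic unit test relies on.
    public static bool ExactMatch(string expected, string actual) =>
        string.Equals(expected, actual, StringComparison.Ordinal);
}
```

`ExactMatchDemo.ExactMatch(ExactMatchDemo.ToUpper("abc"), "ABC")` is true every time, while `ExactMatchDemo.ExactMatch(ExactMatchDemo.RunA, ExactMatchDemo.RunB)` is false even though both summaries say the same thing.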

But how do you write a test for that? You need to use semantic assertions, something like:

SemanticAssert.HasCondition(
   output,
   "It mentions that both the Eiffel Tower and Mount Everest are tall.");

While you can do this now with the SemanticValidation library, I'm going to introduce an even simpler way in this post: using the skUnit library for semantic unit testing. Sounds exciting, right?
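For a rough idea of what a semantic assertion involves, you can think of it as asking a model a yes/no question about the output. Here's a minimal sketch of that idea. The names `SemanticAssertSketch`, `BuildPrompt`, and the `askModel` delegate are my own assumptions for illustration; this is not how SemanticValidation or skUnit is actually implemented, and the delegate stands in for a real model call.

```csharp
using System;

public static class SemanticAssertSketch
{
    // Hypothetical: builds the yes/no question a judge model would be asked.
    public static string BuildPrompt(string output, string condition) =>
        $"Text:\n{output}\n\nCondition:\n{condition}\n\n" +
        "Does the text satisfy the condition? Answer YES or NO.";

    // Hypothetical assertion: `askModel` is a placeholder for a real LLM call.
    // Throws when the judge's answer does not start with YES.
    public static void HasCondition(
        string output, string condition, Func<string, string> askModel)
    {
        string verdict = askModel(BuildPrompt(output, condition)).Trim();
        if (!verdict.StartsWith("YES", StringComparison.OrdinalIgnoreCase))
        {
            throw new InvalidOperationException(
                $"Semantic condition not met: {condition}");
        }
    }
}
```

In a real test, `askModel` would send the prompt to a model. Passing a fake such as `prompt => "YES"` is enough to exercise the control flow without any network call.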

Let's Dive into Testing with skUnit

With skUnit, you can whip up scenarios in markdown files. Here's an example:

# SCENARIO Height Discussion

## PARAMETER input
USER: Is Eiffel tall?
AGENT: Yes it is
USER: What about Everest mountain?
AGENT: Yes it is tall too
USER: What about a mouse?
AGENT: No it is not tall.


## ANSWER
The conversation revolves around the heights of different things. Both the Eiffel Tower and Mount Everest get the tall vote, while a mouse doesn't.

## CHECK SemanticCondition
It mentions that both the Eiffel Tower and Mount Everest are tall.

## CHECK SemanticCondition
It mentions that a mouse isn't tall.

As you can see, in a scenario, you can set the parameters and the expected answer. Then, you can specify the semantic conditions that the output should meet. The best part? skUnit can run this test for you automatically, and you can see it acing the test. How cool is that?
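To make the structure of such a file concrete, here's a small sketch that splits a scenario into its `##` sections. It is a hypothetical helper written for this post, not skUnit's own parser, and it only assumes the heading layout shown in the example above (`PARAMETER`, `ANSWER`, `CHECK`).

```csharp
using System;
using System.Collections.Generic;

public static class ScenarioSketch
{
    // Splits scenario text into (heading, body) pairs at each "## " line.
    // Anything before the first "## " heading (such as the "# SCENARIO" title) is skipped.
    public static List<(string Heading, string Body)> Split(string markdown)
    {
        var sections = new List<(string Heading, string Body)>();
        var body = new List<string>();
        string heading = "";
        bool inSection = false;

        foreach (var raw in markdown.Split('\n'))
        {
            var line = raw.TrimEnd('\r');
            if (line.StartsWith("## ", StringComparison.Ordinal))
            {
                // Close the previous section before starting a new one.
                if (inSection)
                    sections.Add((heading, string.Join("\n", body).Trim()));
                heading = line.Substring(3).Trim();
                body.Clear();
                inSection = true;
            }
            else if (inSection)
            {
                body.Add(line);
            }
        }

        if (inSection)
            sections.Add((heading, string.Join("\n", body).Trim()));
        return sections;
    }
}
```

Run against the Height Discussion scenario, this gives four sections: one `PARAMETER input`, one `ANSWER`, and two `CHECK SemanticCondition` entries, each paired with the text beneath its heading.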

What’s great is that these scenarios are valid .md files. This means they’re not just for the tech-savvy among us - anyone can read and understand them! Isn’t that neat?

Finally

I enjoy writing semantic tests for SemanticKernel plugins, and I have created a repository to share some of them: https://github.com/mehrandvd/semantic-kernel-skunit-tests

You can see an example of a test scenario for the ConversationSummaryPlugin here.