In June 2023, OpenAI introduced function calling, a feature that underpins the ChatGPT plugin ecosystem.
What I found particularly useful was not the capability to call external APIs and embed their responses into conversations, but the promise that the new models have been tuned to return structured data:
These models have been fine-tuned to both detect when a function needs to be called (depending on the user’s input) and to respond with JSON that adheres to the function signature. Function calling allows developers to more reliably get structured data back from the model.
Previously, one had to ask the model to respond in JSON/XML/whatever and provide a schema definition somewhere in the prompt or conversation. And one has likely faced issues with many responses not adhering to that expectation. Now there's a dedicated field for it in the API call, and OpenAI guarantees that the model will not reply in other formats (as long as it determines a function call is needed).
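For contrast, here is a minimal sketch of the old approach (the schema text and wording are illustrative, not from the original project):

// Pre-function-calling approach: the schema lives in the prompt text,
// and nothing stops the model from deviating from it.
const oldStylePrompt = `Please respond ONLY with JSON matching this schema:
{ "status": "OK" | "NOT-OK", "recommendations": "string" }
Do not include any other text.`;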
Smart validation
One of the use cases I've seen implemented in practice is the semantic validation of large forms. Instead of the usual validators (checking for empty fields, minimal length, or regexes for emails), LLMs can now understand the meaning of the data.
By crafting a good prompt with enough explanation and clear criteria, you can have an AI clerk do the hard work of reviewing online submissions. In our case, that was a recognition process where people were nominated for an award (think of an internal company Oscars with hundreds of submissions).
Code and Prompt
const body = JSON.stringify({
  "messages": [
    {
      "role": "system",
      "content": priming
    },
    {
      "role": "user",
      "content": descriptions
    },
    {
      "role": "user",
      "content": nominee
    }
  ],
  "functions": [
    {
      "name": functionName,
      "description":
        "Sends validation results of nominee's submission and determines " +
        "if the submission requires rework or can be marked as OK and sent for further processing",
      "parameters": {
        "type": "object",
        "properties": {
          "status": {
            "type": "string",
            "description": "Determines if validation is OK or not",
            "enum": ["OK", "NOT-OK"]
          },
          "recommendations": {
            "type": "string",
            "description":
              "Empty if there are no objections and the submission has passed validation. " +
              "Otherwise explains why validation was not passed. The result is a slightly styled text " +
              "formatted as HTML (using paragraphs, lists, bold fonts where necessary)"
          }
        },
        "required": ["status", "recommendations"]
      }
    }
  ],
  "temperature": 0.0,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "n": 1,
  "max_tokens": 400,
  "top_p": 0.95
});
This is a typical OpenAI request body for a chat completion, with one new field: functions.
The idea is to pretend there is a function call to be issued and only use the parameters returned by the LLM:
const obj = JSON.parse(data.choices[0].message.function_call.arguments);
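In practice it is worth guarding this line: function_call is only present when the model actually decided to call the function, and arguments is model-generated text that can still fail to parse. A defensive variant (my own sketch, reusing data and functionName from the sample above):

// Verify the model returned a function call before trusting the arguments string.
const message = data.choices[0].message;
if (message.function_call?.name !== functionName) {
  throw new Error("Model replied with plain text instead of a function call");
}
let result: { status: "OK" | "NOT-OK"; recommendations: string };
try {
  result = JSON.parse(message.function_call.arguments);
} catch {
  throw new Error(`Malformed JSON in arguments: ${message.function_call.arguments}`);
}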
To steer GPT into returning a function call (in addition to providing descriptions of the function and its parameters, as shown above), I reinforced the requirement by adding this part to the system prompt:
Your output must be a function call with 2 fields:
• status - possible values are 'OK' (no complaints, nominee submission can go to the next step) and 'NOT-OK' (you have identified deficiencies and have recommendations)
• recommendations - should you have concerns regarding the submission (as instructed above) and would like to send it back for rework, please provide your review results and list your arguments here.
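As a side note, this API version also exposes a function_call request field that can force the call outright: "auto" (the default) lets the model decide, "none" forbids calls, and naming a function forces it. I have not verified whether it makes the prompt reinforcement redundant, but it is worth knowing about:

// Added to the same request body as above, alongside "functions":
"function_call": { "name": functionName }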
Also, note that you need to use a more recent version of the API when issuing the call (make sure you have ?api-version=2023-07-01-preview in the URL).
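For reference, the api-version query parameter comes from the Azure OpenAI flavor of the API, where the full URL looks roughly like this (the resource and deployment names are placeholders, not from the original project):

// Hypothetical Azure OpenAI endpoint; substitute your own resource and deployment.
const url = "https://my-resource.openai.azure.com/openai/deployments/my-gpt35-16k" +
  "/chat/completions?api-version=2023-07-01-preview";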
The full sample in TypeScript is here.
Results
It worked like a charm with gpt-3.5 and the 16k context. OpenAI always returned valid JSON (at least while debugging; I didn't check prod telemetry).
I tried both function calling and the old "Please respond with JSON..." prompts. The latter used to produce anomalies: sometimes the recommendations field contained a list of JSON items instead of free text.
Vendor lock-in
If you later decide to switch LLM providers, you might find there is no function calling capability in their models, or that the corresponding API fields are different. Free-text conversations are still more portable.
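One way to soften the lock-in (my own sketch, not part of the original project) is to hide the provider-specific request and parsing behind a small interface, so a switch only means writing a new adapter:

// A thin provider-agnostic seam: the rest of the app only sees ValidationResult.
interface ValidationResult {
  status: "OK" | "NOT-OK";
  recommendations: string;
}

interface SubmissionValidator {
  validate(nominee: string): Promise<ValidationResult>;
}

// The OpenAI function-calling specifics (request body, function_call parsing)
// live inside one implementation; a provider without function calling gets a
// prompt-based implementation instead.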
P.S.: Some time ago, I came across this repo, which hints at one of the implementation techniques (besides fine-tuning) behind features like function calling that enforce structured replies: a regex can be used as an intermediate filter during token generation, boosting the probability of tokens that follow the required pattern.
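A toy sketch of that idea (my own illustration, not the repo's actual code): at each generation step, discard candidate tokens that would break the required pattern and renormalize the rest.

// Toy constrained decoding: keep only candidates whose text keeps the partial
// output a valid prefix of the target format, then renormalize probabilities.
type Candidate = { token: string; p: number };

function constrain(
  partial: string,
  candidates: Candidate[],
  isValidPrefix: (s: string) => boolean // e.g. a prefix check derived from a regex
): Candidate[] {
  const allowed = candidates.filter(c => isValidPrefix(partial + c.token));
  const total = allowed.reduce((sum, c) => sum + c.p, 0);
  return allowed.map(c => ({ token: c.token, p: c.p / total }));
}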