This publication is part of the AI Advent Calendar 2023, an initiative led by Héctor Pérez, Alex Rostan, Pablo Piovano, and Luis Beltrán. Check this link for more interesting articles about AI created by the community.
In preparation for my upcoming participation in the Global AI Conference 2023 with the topic Fine-tuning an Azure Open AI model: Lessons learned, let's see how to actually customize a model with your own data using Azure Open AI.
First of all, a definition
I like the definition presented here. Fine-tuning is:
the process that takes a model that has already been trained for one given task and then tunes or tweaks the model to make it perform a second similar task.
It is a way of applying transfer learning, a technique that uses knowledge which was gained from solving one problem and applies it to a new but related problem.
Azure Open AI and Fine-tuning
Azure Open AI is a cloud-based platform that enables everyone to build and deploy AI models quickly and easily. One of the capabilities of this service is fine-tuning pre-trained models with your own datasets. Some advantages include:
- Achieving better results than prompt engineering.
- Sending less text per request (thus, fewer tokens are processed on each API call).
- Saving costs and improving request latency.
What do you need?
- An Azure subscription with access to Azure Open AI services.
- An Azure Open AI resource created in one of the supported regions for fine-tuning, with a supported deployed model.
- The Cognitive Services OpenAI Contributor role.
- The most important element to consider: do you really need to fine-tune a model? I'll discuss this during my talk next week; for the moment, you can read about it here.
Let's fine-tune a model using Azure OpenAI Studio.
Steps
- Create an Azure Open AI resource.
- Prepare your data.
- Use Azure OpenAI Studio's Create custom model wizard.
- Wait until the model is fine-tuned.
- Deploy your custom model for use.
- Use it.
Let's do it!
Step 1. Create an Azure Open AI resource
Use the wizard to create an Azure Open AI resource. You only need to be careful about the region: currently, only North Central US and Sweden Central support the fine-tuning capability, so choose one of them.
Once the resource is created, go to Azure OpenAI Studio.
Step 2. Prepare your data.
You must prepare two datasets: one for training and a second one for validation. Each contains samples of inputs and their expected outputs in JSONL (JSON Lines) format. However, depending on the base model that you deployed, you will need specific properties for each element:
- If you are fine-tuning recent models, such as GPT-3.5 Turbo, here's an example of the file format:
{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: No-Bake Nut Cookies\n\nIngredients: [\"1 c. firmly packed brown sugar\", \"1/2 c. evaporated milk\", \"1/2 tsp. vanilla\", \"1/2 c. broken nuts (pecans)\", \"2 Tbsp. butter or margarine\", \"3 1/2 c. bite size shredded rice biscuits\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"brown sugar\", \"milk\", \"vanilla\", \"nuts\", \"butter\", \"bite size shredded rice biscuits\"]"}]}
{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: Jewell Ball'S Chicken\n\nIngredients: [\"1 small jar chipped beef, cut up\", \"4 boned chicken breasts\", \"1 can cream of mushroom soup\", \"1 carton sour cream\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"beef\", \"chicken breasts\", \"cream of mushroom soup\", \"sour cream\"]"}]}
Please notice that for each item (line) you provide a messages element containing an array of role-content pairs for the system (the behavior), the user (the input), and the assistant (the output).
- On the other hand, if you are fine-tuning older models (such as Babbage or Davinci), here's a sample file format that works with both of them:
{"prompt": "You guys are some of the best fans in the NHL", "completion": "hockey"}
{"prompt": "The Red Sox and the Yankees play tonight!", "completion": "baseball"}
{"prompt": "Pelé was one of the greatest", "completion": "soccer"}
You can notice that each element contains a prompt-completion pair, representing the input and the output we'd like the fine-tuned model to generate.
More information about JSON Lines can be found here.
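Whichever of the two formats applies to your base model, it's worth sanity-checking the dataset before you upload it in the next steps. Here's a minimal sketch in Python (standard library only; the file name training.jsonl is a placeholder) that parses every line and flags the most obvious problems:

```python
import json

REQUIRED_ROLES = {"system", "user", "assistant"}

def check_jsonl(path: str, chat_format: bool = True) -> None:
    """Parse every line of a JSONL dataset and report obvious format problems."""
    # utf-8-sig tolerates a byte-order mark if the file has one
    with open(path, encoding="utf-8-sig") as f:
        for number, line in enumerate(f, start=1):
            if not line.strip():
                print(f"Line {number}: empty line")
                continue
            try:
                item = json.loads(line)
            except json.JSONDecodeError as error:
                print(f"Line {number}: invalid JSON ({error})")
                continue
            if chat_format:
                roles = {m.get("role") for m in item.get("messages", [])}
                if not REQUIRED_ROLES.issubset(roles):
                    print(f"Line {number}: missing roles {REQUIRED_ROLES - roles}")
            elif "prompt" not in item or "completion" not in item:
                print(f"Line {number}: expected 'prompt' and 'completion' keys")

check_jsonl("training.jsonl", chat_format=True)
```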
In order to generate a JSONL file, there are several approaches:
- Manual approach: Write an application that creates a text file (with a .jsonl extension), then loop over your data collection and serialize each item into a JSON string (don't forget that you need specific properties). Write each JSON string into a new line of the recently created file.
- Library approach: Depending on the programming language you are using, it's highly probable that there are libraries which can export your data in JSONL format, for example jsonlines for Python.
- Website approach: There are some websites which can convert your Excel, SQL, CSV (and other) data into JSON Lines format, for example Table Convert or Code Beautify.
By the way, here are some characteristics of JSONL:
- Each line is a valid JSON item
- Each line is separated by a \n character
- The file is encoded using UTF-8
Moreover, for Azure Open AI usage, the file must include a byte-order mark (BOM).
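As a concrete illustration of the manual approach, here's a short Python sketch (the sample data and file name are made up for illustration) that serializes chat-format records one per line and writes the file as UTF-8 with a BOM via the utf-8-sig encoding, matching the characteristics listed above:

```python
import json

SYSTEM_PROMPT = ("You are a helpful recipe assistant. You are to extract the "
                 "generic ingredients from each of the recipes provided.")

# Hypothetical source data: (user input, expected assistant output) pairs.
samples = [
    ("Title: No-Bake Nut Cookies\n\nIngredients: [...]\n\nGeneric ingredients: ",
     '["brown sugar", "milk", "vanilla", "nuts", "butter"]'),
]

# "utf-8-sig" writes the byte-order mark expected by the service.
with open("training.jsonl", "w", encoding="utf-8-sig") as f:
    for user_text, assistant_text in samples:
        record = {"messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_text},
            {"role": "assistant", "content": assistant_text},
        ]}
        # One valid JSON object per line, separated by \n.
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```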
Step 3. Use Azure OpenAI Studio's Create custom model wizard.
Only some models can be fine-tuned. As of today, babbage-002, davinci-002, and gpt-35-turbo models support this feature.
In Azure Open AI Studio, click on Models and then on Create a custom model.
A wizard shows up. First of all, you need to select one of the supported models for fine-tuning, which will be used as the base for customization.
For this example, I have selected gpt-35-turbo. The suffix is optional, although it makes it easier to identify the generated model, so it is recommended to provide one. Click on Next.
Now, you need to provide a JSONL file, which serves as the training dataset. You can either select a local file or enter the URL of a public online resource (such as an Azure blob or a web location).
For this example, I have chosen a local JSONL file which contains examples of a helpful virtual assistant that extracts generic ingredients from a provided recipe. Click on Upload file and, once the task is completed, click on Next.
Now, you should provide the JSONL file/resource for the validation dataset. For this example, I have chosen another local JSONL file with examples of a helpful virtual assistant that extracts generic ingredients from a provided recipe. Click on Upload file and then on Next once the process finishes.
Afterwards, you can set advanced options, such as the number of epochs to train the model for. Let's keep the Default value for this example and click on Next.
Finally, a summary with the selected choices from the previous steps is presented. Click on Start Training job.
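By the way, if you prefer scripting over the wizard, the same upload-and-train flow can be sketched with the openai Python package (version 1.x) pointed at your Azure resource. The endpoint, key, API version, and file names below are placeholders, so check them against the current documentation for your region:

```python
from openai import AzureOpenAI  # requires the openai 1.x package

# Placeholder credentials; the API version is an assumption and may differ for your resource.
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2023-12-01-preview",
)

# Upload the training and validation datasets.
training_file = client.files.create(file=open("training.jsonl", "rb"), purpose="fine-tune")
validation_file = client.files.create(file=open("validation.jsonl", "rb"), purpose="fine-tune")

# Start the fine-tuning job on the chosen base model.
job = client.fine_tuning.jobs.create(
    model="gpt-35-turbo",
    training_file=training_file.id,
    validation_file=validation_file.id,
)
print("Job id:", job.id)
```

Jobs created this way should also show up in the Studio, so the rest of the walkthrough applies either way.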
Step 4. Wait until the model is fine-tuned.
You can check the status of the fine-tuning job in the Custom models tab of the Models menu.
Click on the Refresh button to see the current progress of the task.
Training the model will take some time depending on the amount of data provided, the number of epochs, the base model, and other parameters selected for the task. Furthermore, since your job enters a queue, the server might be handling other training tasks, which can delay the process.
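If you'd rather poll the job from code than refresh the Studio page, a sketch along these lines works with the same client setup as before (the credentials and the job id are placeholders):

```python
import time
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2023-12-01-preview",  # assumption; use the version valid for your resource
)

job_id = "<your-fine-tuning-job-id>"  # returned when the job was created

# Poll until the job reaches a terminal state.
while True:
    job = client.fine_tuning.jobs.retrieve(job_id)
    print("Status:", job.status)
    if job.status in ("succeeded", "failed", "cancelled"):
        break
    time.sleep(60)  # jobs wait in a queue first, and training itself takes a while
```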
Once you see that the Status is Succeeded, it means that your custom, fine-tuned model has been created. Well done!
However, an extra step is needed before you can try using it.
Step 5. Deploy your custom model for use.
Select the recently created model, then click on the Deploy button.
Afterwards, provide a deployment name. You can also change the model version in case the model has been fine-tuned several times. Click on Create.
You can monitor the deployment progress under the Deployments menu:
When the job finishes (Status = Succeeded), you are ready to use this model.
Step 6. Use it.
You can use the deployed fine-tuned model for inference anywhere: in an application that you develop, in the Playground, as part of an API request, etc. For example:
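Here's a minimal sketch of such a request from Python with the openai 1.x package; the endpoint, key, API version, and deployment name are placeholders for your own values:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2023-12-01-preview",  # assumption; use the version valid for your resource
)

response = client.chat.completions.create(
    model="<your-deployment-name>",  # the deployment created in Step 5, not the base model name
    messages=[
        {"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."},
        {"role": "user", "content": "Title: Pancakes\n\nIngredients: [\"2 c. flour\", \"2 eggs\", \"1 c. milk\"]\n\nGeneric ingredients: "},
    ],
)
print(response.choices[0].message.content)
```

Note that with Azure OpenAI the model parameter takes your deployment name rather than the underlying model identifier.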
As you can see, the process of fine-tuning an Open AI model using Azure is quite straightforward, and it offers several benefits. However, you should also consider whether this is the best solution for your needs. Join my session at the Global AI Conference later this month to learn more about it!
Well, this was a long post, but hopefully it was also useful for you. Remember to follow the rest of the interesting publications of the AI Advent Calendar 2023. You can also follow the conversation on Twitter with the hashtag #AIAdvent.
Thank you for reading. Until next time!
Luis