In the first part of this series, we set up the environment by installing Ubuntu, Python, pip, and a virtual environment. Now we can get started with the actual chatbot.
Install Required Libraries
Hugging Face is a company and community platform making AI accessible through open-source tools, libraries, and models. It is most notable for its transformers Python library, built for natural language processing applications. This library provides developers a way to integrate ML models hosted on Hugging Face into their projects and build comprehensive ML pipelines.
PyTorch is a powerful and flexible deep learning framework that offers a rich set of features for building and training neural networks.
To install the correct version of torch, you need to visit the PyTorch website and follow the instructions for your setup. For example, I chose the following:
- PyTorch Build: Stable (2.3.0)
- OS: Linux
- Package: Pip
- Language: Python
- Compute Platform: CPU
These settings resulted in the following command:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
It can take a while to install torch. After it is done, you can install transformers by running the following command:
pip3 install transformers
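If you want to make sure both libraries installed correctly, a quick optional sanity check like the following should print the installed versions without errors:

import torch
import transformers

print(torch.__version__)
print(transformers.__version__)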
Developing a Simple Chatbot
For this example, we'll use the GPT-2 model from Hugging Face.
Here is a basic script for creating a chatbot:
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer
# Load the pre-trained GPT-2 model and tokenizer
model_name = "gpt2-large"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
# Set the model to evaluation mode
model.eval()
def generate_response(prompt, max_length=100):
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    # Generate response
    with torch.no_grad():
        output = model.generate(input_ids, max_length=max_length, num_return_sequences=1, pad_token_id=tokenizer.eos_token_id)
    response = tokenizer.decode(output[0], skip_special_tokens=True)
    return response
print("Chatbot: Hi there! How can I help you?")
while True:
    user_input = input("You: ")
    if user_input.lower() == "exit":
        print("Chatbot: Goodbye!")
        break
    response = generate_response(user_input)
    print("Chatbot:", response)
Let's go through the code together.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer
The first line imports the PyTorch library and the second imports two classes from the transformers library: GPT2LMHeadModel and GPT2Tokenizer. GPT2LMHeadModel is used to load the GPT-2 model, and GPT2Tokenizer is used to preprocess and tokenize text input.
# Load the pre-trained GPT-2 model and tokenizer
model_name = "gpt2-large"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
- model_name = "gpt2-large": Sets the variable model_name to the string gpt2-large, indicating the specific model to be loaded.
- tokenizer = GPT2Tokenizer.from_pretrained(model_name): Loads the pre-trained tokenizer corresponding to the GPT-2 model. The tokenizer is responsible for converting text to token IDs that the model can process.
- model = GPT2LMHeadModel.from_pretrained(model_name): Loads the pre-trained GPT-2 model using the specified model name. The from_pretrained method downloads the model's weights and configuration.
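To get a feel for what the tokenizer does, you can try it on its own. This little snippet is just for illustration and assumes the tokenizer loaded above; it is not part of the chatbot script:

ids = tokenizer.encode("Hello! How are you?")
print(ids)                    # a list of integer token IDs
print(tokenizer.decode(ids))  # the original text reconstructed from the IDs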
# Set the model to evaluation mode
model.eval()
In PyTorch, the model.eval() method is used to set the model to evaluation mode. This is important for layers that behave differently during training and evaluation, such as dropout layers, which are disabled when evaluating.
def generate_response(prompt, max_length=100):
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    # Generate response
    with torch.no_grad():
        output = model.generate(input_ids, max_length=max_length, num_return_sequences=1, pad_token_id=tokenizer.eos_token_id)
    response = tokenizer.decode(output[0], skip_special_tokens=True)
    return response
Then we define a function that takes a text prompt and an optional maximum length for the generated text.
- input_ids = tokenizer.encode(prompt, return_tensors="pt"): Encodes the text prompt into token IDs and returns a tensor suitable for PyTorch (hence return_tensors="pt"). This tensor will be used as input for the model.
- with torch.no_grad(): This is a context manager that disables gradient calculation. Since we are only generating text and not training the model, disabling gradients speeds up the computation and reduces memory usage.
- output = model.generate(input_ids, max_length=max_length, num_return_sequences=1, pad_token_id=tokenizer.eos_token_id): Generates a response based on the input_ids. The generate method creates a sequence of tokens with a maximum length of max_length. num_return_sequences=1 specifies that only one sequence should be generated, and pad_token_id specifies that the padding token ID should be the same as the end-of-sequence (EOS) token ID defined by the tokenizer. When generating text, the model uses the EOS token to signify the conclusion of a sequence.
- response = tokenizer.decode(output[0], skip_special_tokens=True): This method converts the generated sequence of token IDs back into a human-readable string. The skip_special_tokens parameter ensures that special tokens (like padding or end-of-text tokens) are not included in the final decoded string.
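As a side note, the generate method also accepts sampling parameters such as do_sample, top_k, top_p, temperature, and no_repeat_ngram_size, which usually make GPT-2 less repetitive than the default greedy decoding. The following sketch shows one possible combination of values inside generate_response; the specific numbers are only illustrative, and the script in this post sticks to the simpler call shown above:

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=max_length,
        do_sample=True,           # sample instead of always picking the most likely token
        top_k=50,                 # consider only the 50 most likely next tokens
        top_p=0.95,               # nucleus sampling over 95% of the probability mass
        temperature=0.8,          # slightly soften the probability distribution
        no_repeat_ngram_size=2,   # do not repeat the same 2-gram within the output
        pad_token_id=tokenizer.eos_token_id,
    )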
print("Chatbot: Hi there! How can I help you?")
while True:
    user_input = input("You: ")
    if user_input.lower() == "exit":
        print("Chatbot: Goodbye!")
        break
    response = generate_response(user_input)
    print("Chatbot:", response)
- print("Chatbot: Hi there! How can I help you?"): Prints an initial greeting message from the chatbot to the console.
- while True:: This is the main loop of the chatbot. It allows a continuous conversation until the user types the "exit" command.
- response = generate_response(user_input): This calls the function we defined above with the prompt entered by the user.
- print("Chatbot:", response): Finally, the response is printed to the console.
Running the Script
Save the script to a file named, for example, simple-chatbot.py, and run it using:
python3 simple-chatbot.py
It can take a while for the script to run. Eventually you can chat with the chatbot. However, as seen in the conversation below, the chatbot has some trouble with its response.
Chatbot: Hi there! How can I help you?
You: Hello! How are you?
Chatbot: Hello! How are you?
I'm a little bit nervous. I'm not sure if I'm going to be able to do this, but I'm going to be able to do it. I'm going to be able to do it. I'm going to be able to do it. I'm going to be able to do it. I'm going to be able to do it. I'm going to be able to do it. I'm going to be able to do it
You:
Sometimes the chatbot repeats the same sentence over and over on the same line, and sometimes it returns the response multiple times, each on its own line. We can clean the response by adding helper functions that strip out the prompt, merge the lines, and cut the text off at the first repeated sentence:
import re
def remove_repeated_sentences(text):
    sentences = re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s', text)
    unique_sentences = []
    for sentence in sentences:
        if sentence not in unique_sentences:
            unique_sentences.append(sentence)
        else:
            break  # Stop adding sentences after the first repetition
    return ' '.join(unique_sentences)

def clean_response(response_text, prompt):
    # Remove the prompt from the response
    stripped_response = response_text.replace(prompt, '').strip()
    # Split the stripped response text into lines
    lines = stripped_response.split('\n')
    combined_lines = " ".join(line.strip() for line in lines if line.strip())
    return remove_repeated_sentences(combined_lines)
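Just to illustrate what remove_repeated_sentences does, here is a tiny made-up example (not part of the script):

text = "I can do it. I can do it. I can do it."
print(remove_repeated_sentences(text))  # prints: I can do it.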
Then, in the generate_response function, we can call this clean_response function:
def generate_response(prompt, max_length=100):
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    # Generate response
    with torch.no_grad():
        output = model.generate(input_ids, max_length=max_length, num_return_sequences=1, pad_token_id=tokenizer.eos_token_id)
    response = tokenizer.decode(output[0], skip_special_tokens=True)
    # New: clean up the raw model output before returning it
    cleaned_response = clean_response(response, prompt)
    return cleaned_response
Now the chatbot should generate responses that look good! There might still be some corner cases that I haven't thought of, but those can then be added to the clean_response function.
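For example, one corner case you can see in the conversation below is that a response sometimes ends mid-sentence, for instance when the max_length limit is hit. If that bothers you, a small helper along these lines (just a sketch, not part of the final script) could be added to clean_response to drop the unfinished tail:

def drop_incomplete_tail(text):
    # If the text does not end with sentence-ending punctuation, cut it back
    # to the last complete sentence (if there is one).
    text = text.strip()
    if text and text[-1] not in ".!?":
        last_end = max(text.rfind("."), text.rfind("!"), text.rfind("?"))
        if last_end != -1:
            return text[:last_end + 1]
    return text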
The following is an example of a conversation I had with the chatbot:
Chatbot: Hi there! How can I help you?
You: Hello! How are you?
Chatbot: I'm so glad you're here.
You: That´s nice to hear!
Chatbot: I´m not sure if I´m going to be able to do it, but I´m going to try. I´m going to try to do it.
You: You can do it!
Chatbot: You can
You: What do you think about AI?
Chatbot: Let us know in the comments below! Images courtesy of Shutterstock, and Pixabay.
You: You are funny
Chatbot: .
You:
It doesn't necessarily make that much sense all the time since this is just a simple example, but at least now you know how to develop a chatbot!
Here's the finalized code so you can better see what was changed from the first version:
import re
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer
# Load the pre-trained GPT-2 model and tokenizer
model_name = "gpt2-large"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
# Set the model to evaluation mode
model.eval()
def remove_repeated_sentences(text):
    sentences = re.split(r'(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s', text)
    unique_sentences = []
    for sentence in sentences:
        if sentence not in unique_sentences:
            unique_sentences.append(sentence)
        else:
            break  # Stop adding sentences after the first repetition
    return ' '.join(unique_sentences)

def clean_response(response_text, prompt):
    # Remove the prompt from the response
    stripped_response = response_text.replace(prompt, '').strip()
    # Split the stripped response text into lines
    lines = stripped_response.split('\n')
    combined_lines = " ".join(line.strip() for line in lines if line.strip())
    return remove_repeated_sentences(combined_lines)
def generate_response(prompt, max_length=100):
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    # Generate response
    with torch.no_grad():
        output = model.generate(input_ids, max_length=max_length, num_return_sequences=1, pad_token_id=tokenizer.eos_token_id)
    response = tokenizer.decode(output[0], skip_special_tokens=True)
    cleaned_response = clean_response(response, prompt)
    return cleaned_response
print("Chatbot: Hi there! How can I help you?")
while True:
    user_input = input("You: ")
    if user_input.lower() == "exit":
        print("Chatbot: Goodbye!")
        break
    response = generate_response(user_input)
    print("Chatbot:", response)
That's it!
In this blog post, we developed a simple chatbot and cleaned the response so it looks a bit better!
You can also follow my Instagram @whatminjahacks if you are interested to see more about my days as a Cyber Security consultant and learn more about cyber security with me!