Learn How to Build a LangChain Audio App with Python in Just 5 Minutes!

Pavan Belagatti - Sep 15 '23 - Dev Community

This guide walks through importing audio data with LangChain and building an application that can answer questions about an audio file, using LangChain's recent integration with AssemblyAI.

What is LangChain?

Created by Harrison Chase and first released in October 2022, LangChain is an open-source framework for building robust applications powered by Large Language Models (LLMs), such as ChatGPT-style chatbots and other custom applications.

LangChain aims to give data engineers a comprehensive toolkit for using LLMs in a range of use cases, such as chatbots, automated question answering, and text summarization.

Learn more about LangChain and Large Language Models (LLMs) in my other tutorial.

What is AssemblyAI?

AssemblyAI offers the quickest route to AI-powered audio solutions. Utilize a straightforward API to tap into ready-to-use AI models designed for speech transcription and comprehension. As a company specializing in applied AI, AssemblyAI is committed to the development, training, and deployment of cutting-edge AI models that developers and product teams can seamlessly incorporate into their applications or products.

Tutorial

LangChain offers an integration with AssemblyAI that lets you import audio data in just a few lines of code.

Create and activate a new virtual environment

# Mac/Linux:
python3 -m venv venv
. venv/bin/activate
# Windows:
python -m venv venv
.\venv\Scripts\activate.bat
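To confirm that the activation step worked, you can ask the interpreter itself. The following is a minimal sketch, assuming the environment was created with the standard venv module as shown above; the function name in_virtualenv is only an illustrative choice.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
```

If this prints False after activation, the shell is probably still using the system-wide interpreter.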

Install both LangChain and the AssemblyAI Python package

pip install langchain
pip install assemblyai
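Before moving on, it can help to check that both packages are importable from the active environment. This is a small sketch using only the standard library; missing_packages is a hypothetical helper name, and the module names passed to it are the import names of the two packages installed above.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# Import names of the packages installed with pip in the previous step.
missing = missing_packages(["langchain", "assemblyai"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```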

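Set your AssemblyAI API key. You can get a free API key from your AssemblyAI account dashboard. The first script below sets the key directly in code; the AssemblyAI Python SDK can also read it from the ASSEMBLYAI_API_KEY environment variable.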

[Note: Here, for our tutorial example, we will be using an mp3 audio file link. The audio is an interview with Peter DiCarlo, an associate professor in the Department of Environmental Health and Engineering at Johns Hopkins University, discussing the impact of Canadian wildfires on air quality in the United States. The interview covers the factors contributing to the spread of smoke, the health risks associated with high levels of particulate matter in the air, vulnerable populations, and the potential for worsening conditions due to climate change.]

Create a Python file named demo.py and add the following code.

import assemblyai as aai

# Replace with your AssemblyAI API key
aai.settings.api_key = "YOUR_API_KEY"

# URL of the file to transcribe
FILE_URL = "https://github.com/AssemblyAI-Examples/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(FILE_URL)

print(transcript.text)

Now, run the application with the following command.

python3 demo.py

You should see the transcript of the audio file printed in your terminal.
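If nothing useful is printed, the transcription itself may have failed, for example because the URL could not be downloaded. Below is a hedged sketch of a guard you could put around transcript.text. It assumes the transcript object exposes an error attribute that is None on success, as the AssemblyAI SDK's transcript does; text_or_raise is a hypothetical helper, and the SimpleNamespace objects are stand-ins so the sketch runs without an API call.

```python
from types import SimpleNamespace


def text_or_raise(transcript):
    """Return transcript.text, or raise if the transcript carries an error."""
    # Assumption: `error` is None (or absent) when transcription succeeded.
    error = getattr(transcript, "error", None)
    if error:
        raise RuntimeError(f"Transcription failed: {error}")
    return transcript.text


# Stand-in object in place of a real transcript, for illustration only.
ok = SimpleNamespace(text="example transcript text", error=None)
print(text_or_raise(ok))
```

In demo.py, the same helper could wrap the real result: print(text_or_raise(transcript)).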

Let's Add Question & Answer Capabilities Using OpenAI

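Get an OpenAI API key, then set it in your terminal from inside the application folder. LangChain's OpenAI integration also requires the openai Python package, so run pip install openai if it is not already installed.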

export OPENAI_API_KEY=<Your API Key>
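A missing key usually surfaces only as an authentication error deep inside a request, so a quick check at startup can save time. The sketch below is one possible approach: missing_keys is a hypothetical helper, and it assumes that the AssemblyAI key is also supplied through the ASSEMBLYAI_API_KEY environment variable rather than set in code.

```python
import os

# Assumption: both services are configured through these environment variables.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ASSEMBLYAI_API_KEY")


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the names of required variables that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


missing = missing_keys(os.environ)
if missing:
    print("Set these environment variables before running demo.py:",
          ", ".join(missing))
```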

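Go back to demo.py and replace its contents with the following code, which loads the transcript through LangChain and answers a question about it. Because this version no longer sets aai.settings.api_key, the AssemblyAI key must be available in the ASSEMBLYAI_API_KEY environment variable.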

from langchain.document_loaders import AssemblyAIAudioTranscriptLoader
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

FILE_URL = "https://github.com/AssemblyAI-Examples/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3"

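# Transcribe the audio file and load the transcript as LangChain documents
loader = AssemblyAIAudioTranscriptLoader(FILE_URL)
docs = loader.load()

# The "stuff" chain type passes all documents to the LLM in a single prompt
llm = OpenAI()
qa_chain = load_qa_chain(llm, chain_type="stuff")

answer = qa_chain.run(input_documents=docs,
                      question="Where did the wildfire start?")
print(answer)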
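The chain_type="stuff" argument means that all loaded documents are placed ("stuffed") into a single prompt together with the question. The sketch below illustrates that idea in plain Python. It is not LangChain's actual prompt template: the Doc class, the build_stuff_prompt function, and the prompt wording are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain document (text only)."""
    page_content: str


def build_stuff_prompt(docs, question):
    """Join every document into one context block and append the question."""
    context = "\n\n".join(doc.page_content for doc in docs)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder chunks in place of a real transcript.
example_docs = [Doc("first transcript chunk"), Doc("second transcript chunk")]
print(build_stuff_prompt(example_docs, "Where did the wildfire start?"))
```

Because everything goes into one prompt, this approach works best when the transcript fits within the model's context window; very long recordings may need a different chain type.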

Run the application with the command python3 demo.py. The output should be an answer to your question, such as the following.

The wildfires started in Canada.

Now change the question. Ask for the name of the professor who was invited to talk about the wildfires in Canada.

The answer should be Peter DiCarlo.

Next, ask how the wildfires affected people's health.
You should receive an answer like the one below.

Exposure to high levels of particulate matter in the air can lead to a host of health problems, including impacts to the respiratory system, cardiovascular system, and neurological system. People most vulnerable are those whose bodies are still developing (children), the elderly, and people with preexisting health conditions.

Keep asking questions about the audio, and the application will keep answering them.
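Rather than editing the question in demo.py each time, you could wrap the chain in a small input loop. Here is a minimal sketch of that idea: run_qa_loop is a hypothetical helper, and answer_fn stands for any function that maps a question to an answer string. The read and write parameters exist only so the loop can be exercised without a terminal.

```python
def run_qa_loop(answer_fn, read=input, write=print):
    """Read questions until a blank line, writing one answer per question.

    Returns the number of questions answered.
    """
    asked = 0
    while True:
        question = read("Question (blank to quit): ").strip()
        if not question:
            return asked
        write(answer_fn(question))
        asked += 1


# Possible use at the end of demo.py, reusing qa_chain and docs from above:
# run_qa_loop(lambda q: qa_chain.run(input_documents=docs, question=q))
```

Since the transcript is loaded once and reused, each follow-up question costs only one additional LLM call.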

I hope this short tutorial helped you learn how to set up a virtual environment, install the necessary packages, and write Python code to transcribe audio files. Combining LangChain with AssemblyAI keeps the audio-loading step down to a few lines of code. More importantly, you've integrated OpenAI's API to add a question-answering feature, turning the application from a transcription tool into an interactive platform for audio data analysis.
