Introduction
Deepgram’s Nova-3 is the latest evolution in speech-to-text AI, offering real-time multilingual transcription, improved accuracy, and instant vocabulary updates. If you’re working on AI-driven transcription, you’ll want to explore this model.
At Transgate.ai, we took our first look at Nova-3, and the results were promising! (Check out our insights here: Transgate’s article).
In this guide, let’s get started with Nova-3 by setting up a simple transcription pipeline using Deepgram’s API.
Step 1: Set Up Your Deepgram API Key
First, sign up at Deepgram and grab your API key.
Then, install the Deepgram SDK along with the ws WebSocket package, which the script in Step 2 imports:
npm install @deepgram/sdk ws
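Pasting the key straight into a script is fine for a quick test, but one common alternative is to read it from an environment variable. Here is a minimal sketch of that idea. The variable name DEEPGRAM_API_KEY and the helpers getApiKey and authHeaders are our own illustrative choices, not something Deepgram requires.

```javascript
// Hypothetical helper: DEEPGRAM_API_KEY is an assumed variable name, not an official one.
function getApiKey(env = process.env) {
  const key = env.DEEPGRAM_API_KEY;
  if (!key || key.trim() === '') {
    throw new Error('Set the DEEPGRAM_API_KEY environment variable first');
  }
  return key.trim();
}

// Builds the Authorization header in the "Token <key>" form used in this guide.
function authHeaders(env = process.env) {
  return { Authorization: `Token ${getApiKey(env)}` };
}
```

You would then run the script with the variable set in your shell and pass authHeaders() as the headers option when opening the WebSocket.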
Step 2: Basic Real-Time Transcription
Using Node.js, we’ll open a WebSocket connection and stream an audio file to it, the same way you would send live audio.
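// Save as transcribe.mjs (or set "type": "module" in package.json) so import syntax works.
import WebSocket from 'ws';
import fs from 'fs';

// Replace with your Deepgram API key
const deepgramApiKey = 'YOUR_DEEPGRAM_API_KEY';
const audioFile = 'sample.wav'; // Path to your audio file

// This script talks to the WebSocket endpoint directly, so no SDK client is created here.
const ws = new WebSocket('wss://api.deepgram.com/v1/listen', {
  headers: { Authorization: `Token ${deepgramApiKey}` },
});

ws.on('open', () => {
  console.log('Connected to Deepgram WebSocket');
  const stream = fs.createReadStream(audioFile);
  stream.on('data', (chunk) => ws.send(chunk));
  // Signal the end of audio instead of closing immediately,
  // so the remaining transcripts can arrive before the server closes the connection.
  stream.on('end', () => ws.send(JSON.stringify({ type: 'CloseStream' })));
});

ws.on('message', (message) => {
  const data = JSON.parse(message.toString());
  // Not every message is a transcript (metadata messages have no channel), so guard the lookup.
  const transcript = data.channel?.alternatives?.[0]?.transcript;
  if (transcript) console.log('Transcript:', transcript);
});

ws.on('error', (err) => console.error('WebSocket error:', err.message));
ws.on('close', () => console.log('Connection closed'));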
What This Script Does:
✅ Connects to Deepgram’s real-time transcription API
✅ Streams an audio file for processing
✅ Logs transcriptions in real time
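If you want to do more than log each message, it can help to pull the parsing into its own function. The sketch below assumes the streaming response shape we saw in our tests: a JSON object with a channel.alternatives array and an is_final flag that separates interim results from final ones. The function name extractTranscript is ours, and the field names are worth double-checking against Deepgram's API reference.

```javascript
// Assumed message shape (verify against Deepgram's docs):
// { is_final: boolean, channel: { alternatives: [{ transcript: string }] } }
function extractTranscript(raw) {
  let data;
  try {
    data = JSON.parse(raw.toString());
  } catch {
    return null; // Not JSON, so there is nothing to extract
  }
  const text = data?.channel?.alternatives?.[0]?.transcript;
  // Metadata messages and silent stretches carry no usable transcript text.
  if (typeof text !== 'string' || text === '') return null;
  return { text, isFinal: data.is_final === true };
}
```

Inside the message handler you could call extractTranscript(message) and print only the results where isFinal is true, which avoids logging the same phrase several times as interim results get revised.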
Step 3: Customizing the Transcription
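Nova-3 lets you choose the model and language and supply custom vocabulary through query parameters on the connection URL. According to Deepgram’s documentation, Nova-3 takes custom terms through the keyterm parameter (repeated once per term) rather than the older keywords parameter, so replace the connection from Step 2 with something like this:
const url = 'wss://api.deepgram.com/v1/listen?model=nova-3&language=en&keyterm=AI&keyterm=transcription';
const ws = new WebSocket(url, { headers: { Authorization: `Token ${deepgramApiKey}` } });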
This boosts accuracy for domain-specific terms like AI, medical jargon, or industry-specific words.
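Once the term list grows, writing the query string by hand gets error-prone, especially for terms that contain spaces. Here is a small sketch that builds the URL with URLSearchParams. The helper buildListenUrl is hypothetical, and the name of the vocabulary parameter is left configurable: it defaults to keyterm, which is our understanding of what Nova-3 expects, but you can pass 'keywords' for models that use that parameter instead.

```javascript
// Hypothetical helper for building the streaming URL with custom terms.
function buildListenUrl({ model = 'nova-3', language = 'en', terms = [], termParam = 'keyterm' } = {}) {
  const params = new URLSearchParams({ model, language });
  // Repeat the parameter once per term rather than joining terms with commas.
  for (const term of terms) params.append(termParam, term);
  // URLSearchParams encodes spaces as "+"; switch to %20 to keep the URL unambiguous.
  // (A literal "+" inside a term is already encoded as %2B, so this replace is safe.)
  const query = params.toString().replace(/\+/g, '%20');
  return `wss://api.deepgram.com/v1/listen?${query}`;
}
```

For example, buildListenUrl({ terms: ['AI', 'speech to text'] }) returns wss://api.deepgram.com/v1/listen?model=nova-3&language=en&keyterm=AI&keyterm=speech%20to%20text, which you can pass to new WebSocket together with the Authorization header.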
Final Thoughts
Deepgram’s Nova-3 is fast, multilingual, and highly customizable. It’s a powerful tool for anyone building real-time voice applications.
🚀 Next Steps:
- Try it with your own audio files 🎙️
- Experiment with different languages 🌍
- Fine-tune with custom vocabulary 🔧
Check out Transgate.ai’s first impressions: Read here.
What do you think about Nova-3? Let’s discuss in the comments! 👇