Automatically transcribe your video files uploaded to S3 using AWS Transcribe

Overview
Imagine you have a collection of files, and you want to make changes to them automatically as they're uploaded without any manual intervention.

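For example, you might want to resize images, mask personal data in documents, or convert a file to another format. With S3 event notifications and a Lambda function, you can process these files automatically as soon as they land in the bucket. (S3 Object Lambda is a related feature, but it transforms objects as they are retrieved rather than as they are uploaded.)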

Auto Transcriber
Now let's dive into the project I built: a pipeline that transcribes audio and video files uploaded to an S3 bucket, using S3 event notifications, a Lambda function, and Amazon Transcribe.

This is the architecture diagram of the project

Prerequisites

AWS Account - If you don't already have one, you can sign up for a free account.

That's it. Everything else can be managed via the AWS Console.

Here’s how it works:

Uploading the Video: You start by uploading your video (like an interview or a lecture) into an S3 bucket. Think of this as a cloud folder where all your files are stored.
Triggering the Transcription: As soon as the video is uploaded, an S3 event notification invokes a Lambda function with the details of the new file. This function then tells another service, called Amazon Transcribe, to listen to the video’s audio and convert everything spoken into text.
Saving the Transcript: Once the transcription is done, the result is saved in a designated location, but this time as a JSON file. This file holds all the text, along with useful information like timestamps for when each word was spoken.
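
To make these steps more concrete, here is a minimal Python sketch of what such a Lambda function could look like. It is not the code from the repo, just an illustration under a few assumptions: the output bucket name, the language code, the `transcripts/` prefix, and the helper names `build_job_params` and `words_with_timestamps` are all hypothetical. The first helper turns an S3 event record into the arguments for Transcribe's `start_transcription_job` call, and the second reads word-level timestamps back out of the resulting JSON.

```python
import re
import uuid
from urllib.parse import unquote_plus

# Hypothetical values for illustration; not taken from the original project.
OUTPUT_BUCKET = "my-transcripts-bucket"
LANGUAGE_CODE = "en-US"


def build_job_params(record, output_bucket=OUTPUT_BUCKET):
    """Turn one S3 event record into StartTranscriptionJob arguments."""
    bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(record["s3"]["object"]["key"])
    # Job names may only contain letters, digits, '.', '_' and '-'.
    safe_name = re.sub(r"[^0-9a-zA-Z._-]", "_", key)[:150]
    return {
        # A random suffix keeps job names unique across repeated uploads.
        "TranscriptionJobName": f"{safe_name}-{uuid.uuid4().hex[:8]}",
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "LanguageCode": LANGUAGE_CODE,
        "OutputBucketName": output_bucket,
        "OutputKey": f"transcripts/{safe_name}.json",
    }


def lambda_handler(event, context):
    """Entry point invoked by the S3 event notification."""
    import boto3  # imported here so the helpers above work without boto3

    transcribe = boto3.client("transcribe")
    job_names = []
    for record in event.get("Records", []):
        params = build_job_params(record)
        transcribe.start_transcription_job(**params)
        job_names.append(params["TranscriptionJobName"])
    return {"started": job_names}


def words_with_timestamps(transcript):
    """Extract (word, start_seconds, end_seconds) from a Transcribe result."""
    words = []
    for item in transcript["results"]["items"]:
        if item["type"] != "pronunciation":
            continue  # punctuation items carry no timestamps
        words.append((
            item["alternatives"][0]["content"],
            float(item["start_time"]),
            float(item["end_time"]),
        ))
    return words
```

One design note: if the transcript is written back to the same bucket that triggers the function, the new JSON file could fire the trigger again. Writing to a separate bucket, or filtering the event notification by prefix or suffix, avoids that loop.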

References

Look at this cool demo

demo-auto-transcribe.mp4 - Google Drive

GitHub Repo - Code for the Lambda function

Leave a comment if you'd like a detailed implementation walkthrough or have any questions about this project.
