Build an Amazing Real-time Transcription App using React Hooks

Sagar Kava - May 28 - Dev Community

TL;DR

This is a step-by-step tutorial for building your own real-time transcription app.

In this tutorial, you'll learn how to harness the power of React Hooks and the VideoSDK to build an app that allows users to join or create meetings, record sessions, and transcribe them in real-time. By the end, you'll have a functional and efficient transcription app ready for use.

Real-time transcription services are crucial in today's digital landscape, helping to bridge communication gaps and enhance accessibility. By following this tutorial, you will gain practical experience with React's latest features, including Hooks for state management, and how to integrate third-party services like VideoSDK for real-time video handling.

Real-time Transcription

👍 Like and save this post if you find it useful!
💬 Comment below with your thoughts and suggestions.

  1. Prerequisites: Ensure you have Node.js and npm installed, and sign up at VideoSDK.live (https://app.videosdk.live/api-keys) for an API key.
  2. Steps: Set up the project, create the API functions, build the components, and run the app.
  3. Outcome: A fully functional real-time transcription app

Real-time Transcription App using React

Step 1: Set Up the Project

1. Create a React App: If you don't already have a React app, create one using the following commands:

npx create-react-app my-video-app
cd my-video-app

2. Install Dependencies: Install the necessary packages.

npm install @videosdk.live/react-sdk react-player

3. Configure ESLint and Prettier: Set up ESLint and Prettier for code quality and formatting; sample config files are shown after the project structure below.

npm install --save-dev eslint prettier eslint-plugin-react eslint-config-prettier eslint-plugin-prettier

4. Project Structure: Organize your project with a logical structure.

my-video-app/
├── public/
├── src/
│   ├── components/
│   │   ├── JoinScreen.js
│   │   ├── ParticipantView.js
│   │   ├── Controls.js
│   │   └── MeetingView.js
│   ├── utils/
│   │   └── API.js
│   ├── App.css
│   ├── App.js
│   ├── index.css
│   └── index.js
├── .eslintrc.json
├── .prettierrc
├── package.json
└── README.md
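
VideoSDK doesn't require any particular lint setup, so the .eslintrc.json and .prettierrc referenced above are up to you. A minimal starting point (adjust the rules to your own preferences) could look like this.

.eslintrc.json:

{
  "extends": ["react-app", "plugin:prettier/recommended"]
}

.prettierrc:

{
  "semi": true,
  "singleQuote": false,
  "printWidth": 80
}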

Step 2: Set Up the API Functions

Create an API.js file in the src/utils directory (matching the structure above) to handle API interactions.

"YOUR_VIDEO_SDK_AUTH_TOKEN" >> https://app.videosdk.live/api-keys

// API.js
// Hard-coding the token is fine for a demo; in production, generate
// short-lived tokens on your server instead of shipping them to the client.
export const authToken = "YOUR_VIDEO_SDK_AUTH_TOKEN";

// Calls the VideoSDK REST API to create a new meeting and returns its ID.
export const createMeeting = async ({ token }) => {
  const response = await fetch("https://api.videosdk.live/v1/meetings", {
    method: "POST",
    headers: {
      Authorization: token,
      "Content-Type": "application/json",
    },
  });
  const data = await response.json();
  return data.meetingId;
};
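
The snippet above uses VideoSDK's older v1 meetings endpoint. More recent VideoSDK documentation creates rooms via the v2 API (POST https://api.videosdk.live/v2/rooms), which returns a roomId instead of a meetingId. If the v1 call doesn't work for your account, a sketch of the equivalent v2 version (the rest of the app stays the same):

// Alternative based on current VideoSDK docs: create a room via the v2 API.
export const createMeeting = async ({ token }) => {
  const response = await fetch("https://api.videosdk.live/v2/rooms", {
    method: "POST",
    headers: {
      Authorization: token,
      "Content-Type": "application/json",
    },
  });
  const data = await response.json();
  return data.roomId; // the v2 API returns roomId rather than meetingId
};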

Step 3: Build the Components

Create the component files below in src/components, then update src/App.js to tie them together.

JoinScreen Component

Allows users to enter a meeting ID or create a new meeting.

import React, { useState } from "react";

function JoinScreen({ getMeetingAndToken }) {
  const [meetingId, setMeetingId] = useState(null);
  const onClick = async () => {
    await getMeetingAndToken(meetingId);
  };
  return (
    <div>
      <input
        type="text"
        placeholder="Enter Meeting Id"
        onChange={(e) => {
          setMeetingId(e.target.value);
        }}
      />
      <button onClick={onClick}>Join</button>
      {" or "}
      {/* Pass null so a new meeting is created even if an ID was typed */}
      <button onClick={() => getMeetingAndToken(null)}>Create Meeting</button>
    </div>
  );
}

export default JoinScreen;

ParticipantView Component

Displays the video and audio of a participant.

import React, { useEffect, useMemo, useRef } from "react";
import { useParticipant } from "@videosdk.live/react-sdk";
import ReactPlayer from "react-player";

function ParticipantView(props) {
  const micRef = useRef(null);
  const { webcamStream, micStream, webcamOn, micOn, isLocal, displayName } =
    useParticipant(props.participantId);

  const videoStream = useMemo(() => {
    if (webcamOn && webcamStream) {
      const mediaStream = new MediaStream();
      mediaStream.addTrack(webcamStream.track);
      return mediaStream;
    }
  }, [webcamStream, webcamOn]);

  useEffect(() => {
    if (micRef.current) {
      if (micOn && micStream) {
        const mediaStream = new MediaStream();
        mediaStream.addTrack(micStream.track);

        micRef.current.srcObject = mediaStream;
        micRef.current
          .play()
          .catch((error) =>
            console.error("videoElem.current.play() failed", error)
          );
      } else {
        micRef.current.srcObject = null;
      }
    }
  }, [micStream, micOn]);

  return (
    <div>
      <p>
        Participant: {displayName} | Webcam: {webcamOn ? "ON" : "OFF"} | Mic:{" "}
        {micOn ? "ON" : "OFF"}
      </p>
      <audio ref={micRef} autoPlay playsInline muted={isLocal} />
      {webcamOn && (
        <ReactPlayer
          playsinline
          pip={false}
          light={false}
          controls={false}
          muted={true}
          playing={true}
          url={videoStream}
          height={"300px"}
          width={"300px"}
          onError={(err) => {
            console.log(err, "participant video error");
          }}
        />
      )}
    </div>
  );
}

export default ParticipantView;

Controls Component

Provides buttons to leave the meeting and toggle mic/webcam.

import React from "react";
import { useMeeting } from "@videosdk.live/react-sdk";

function Controls() {
  const { leave, toggleMic, toggleWebcam } = useMeeting();
  return (
    <div>
      <button onClick={() => leave()}>Leave</button>
      <button onClick={() => toggleMic()}>Toggle Mic</button>
      <button onClick={() => toggleWebcam()}>Toggle Webcam</button>
    </div>
  );
}

export default Controls;

MeetingView Component

Main component to handle meeting functionalities like transcription, recording, and displaying participants.

import React, { useState } from "react";
import { useMeeting, useTranscription, Constants } from "@videosdk.live/react-sdk";
import ParticipantView from "./ParticipantView";
import Controls from "./Controls";

function MeetingView(props) {
  const [transcript, setTranscript] = useState("Transcription");
  const [transcriptState, setTranscriptState] = useState("Not Started");
  // The transcription webhook URL is a placeholder; point it at an endpoint you control.
  const tConfig = { webhookUrl: "https://www.example.com" };
  const { startTranscription, stopTranscription } = useTranscription({
    onTranscriptionStateChanged: (data) => {
      const { status } = data;
      if (status === Constants.transcriptionEvents.TRANSCRIPTION_STARTING) {
        setTranscriptState("Transcription Starting");
      } else if (status === Constants.transcriptionEvents.TRANSCRIPTION_STARTED) {
        setTranscriptState("Transcription Started");
      } else if (status === Constants.transcriptionEvents.TRANSCRIPTION_STOPPING) {
        setTranscriptState("Transcription Stopping");
      } else if (status === Constants.transcriptionEvents.TRANSCRIPTION_STOPPED) {
        setTranscriptState("Transcription Stopped");
      }
    },
    onTranscriptionText: (data) => {
      const { participantName, text, timestamp } = data;
      console.log(`${participantName}: ${text} ${timestamp}`);
      // Use the functional form of setState so rapid transcription events
      // don't overwrite each other with a stale `transcript` value.
      setTranscript((prev) => `${prev}\n${participantName}: ${text} ${timestamp}`);
    },
  });

  const [joined, setJoined] = useState(null);

  // A single useMeeting call exposes join/participants as well as the
  // recording controls, and registers the meeting lifecycle callbacks.
  const { join, participants, startRecording, stopRecording } = useMeeting({
    onMeetingJoined: () => setJoined("JOINED"),
    onMeetingLeft: () => props.onMeetingLeave(),
  });

  const handleStartRecording = () => {
    // "YOUR_WEB_HOOK_URL" and "AWS_Directory_Path" are placeholders;
    // replace them with your own webhook URL and storage path.
    startRecording("YOUR_WEB_HOOK_URL", "AWS_Directory_Path", {
      layout: { type: "GRID", priority: "SPEAKER", gridSize: 4 },
      theme: "DARK",
      mode: "video-and-audio",
      quality: "high",
      orientation: "landscape",
    });
  };

  const handleStopRecording = () => stopRecording();
  const handleStartTranscription = () => startTranscription(tConfig);
  const handleStopTranscription = () => stopTranscription();

  const joinMeeting = () => {
    setJoined("JOINING");
    join();
  };

  return (
    <div className="container">
      <h3>Meeting Id: {props.meetingId}</h3>
      {joined && joined === "JOINED" ? (
        <div>
          <Controls />
          <button onClick={handleStartRecording}>Start Recording</button>
          <button onClick={handleStopRecording}>Stop Recording</button>
          <button onClick={handleStartTranscription}>Start Transcription</button>
          <button onClick={handleStopTranscription}>Stop Transcription</button>
          {[...participants.keys()].map((participantId) => (
            <ParticipantView participantId={participantId} key={participantId} />
          ))}
          <p>State: {transcriptState}</p>
          <p>{transcript}</p>
        </div>
      ) : joined && joined === "JOINING" ? (
        <p>Joining the meeting...</p>
      ) : (
        <button onClick={joinMeeting}>Join</button>
      )}
    </div>
  );
}

export default MeetingView;
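
The transcript above only lives in component state and disappears when the page reloads. As an optional, hypothetical helper (not part of the VideoSDK API), you could let users download the accumulated text as a file:

// utils/downloadTranscript.js — hypothetical helper, not from VideoSDK.
// Saves the accumulated transcript string as a plain-text file in the browser.
export function downloadTranscript(transcript, filename = "transcript.txt") {
  const blob = new Blob([transcript], { type: "text/plain" });
  const url = URL.createObjectURL(blob);
  const link = document.createElement("a");
  link.href = url;
  link.download = filename;
  link.click();
  URL.revokeObjectURL(url);
}

You could wire it to a button in MeetingView, e.g. <button onClick={() => downloadTranscript(transcript)}>Download Transcript</button>.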

App Component

Main component that manages the meeting state and provides the necessary context.

import React, { useState } from "react";
import { MeetingProvider } from "@videosdk.live/react-sdk";
import JoinScreen from "./components/JoinScreen";
import MeetingView from "./components/MeetingView";
import { authToken, createMeeting } from "./utils/API";
import "./App.css"; // styles from the optional step below

function App() {
  const [meetingId, setMeetingId] = useState(null);

  const getMeetingAndToken = async (id) => {
    const meetingId =
      id == null ? await createMeeting({ token: authToken }) : id;
    setMeetingId(meetingId);
  };

  const onMeetingLeave = () => setMeetingId(null);

  return authToken && meetingId ? (
    <MeetingProvider
      config={{
        meetingId,
        micEnabled: true,
        webcamEnabled: true,
        name: "C.V. Raman",
      }}
      token={authToken}
    >
      <MeetingView meetingId={meetingId} onMeetingLeave={onMeetingLeave} />
    </MeetingProvider>
  ) : (
    <JoinScreen getMeetingAndToken={getMeetingAndToken} />
  );
}

export default App;
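
Create React App generates src/index.js for you, and nothing VideoSDK-specific is needed there. For reference, a standard entry point that mounts App looks roughly like this (assuming React 18's react-dom/client):

// src/index.js — standard Create React App entry point that mounts <App />
import React from "react";
import ReactDOM from "react-dom/client";
import "./index.css";
import App from "./App";

const root = ReactDOM.createRoot(document.getElementById("root"));
root.render(<App />);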

Step 4: Run the Application

  1. Start the React App:

   npm start

  2. Navigate to Your Browser: Open http://localhost:3000 to view the app.

VideoSDK Real-time Transcription Output

Optional Step: Add CSS for Styling

To enhance the visual appeal of your app, you can add the following CSS.

Create an App.css file in your src directory with the following content (the App component above imports it via import "./App.css";):

/* App.css */

body {
  background-color: #121212;
  color: #e0e0e0;
  font-family: 'Roboto', sans-serif;
  margin: 0;
  padding: 0;
}

input, button {
  background-color: #1e1e1e;
  border: 1px solid #333;
  color: #e0e0e0;
  padding: 10px;
  margin: 5px;
  border-radius: 5px;
}

button:hover {
  background-color: #333;
  cursor: pointer;
}

.container {
  max-width: 800px;
  margin: auto;
  padding: 20px;
  text-align: center;
}

h3 {
  color: #f5f5f5;
}

p {
  margin: 10px 0;
}

audio, .react-player__preview {
  background-color: #333;
  border: 1px solid #555;
  border-radius: 5px;
  margin: 10px 0;
}

.react-player__preview img {
  border-radius: 5px;
}

.react-player__shadow {
  border-radius: 5px;
}

Conclusion

Congratulations! You have successfully built a functional and robust real-time transcription app using React Hooks and the VideoSDK.live SDK. This application empowers users to join or create meetings, manage participants effectively, and utilize advanced features such as real-time transcription and recording. By following the detailed steps and incorporating the optional CSS, you have ensured that the app not only performs seamlessly but also provides a visually appealing user interface.
