Perform text translation using Vertex AI, Gemini, and NodeJS

Connie Leung - Jun 7 - Dev Community

Introduction

Internationalization (i18n) is an essential aspect of commercial websites because business owners want to sell their products to customers worldwide. Even though English is one of the most widely used languages in the world, not everyone can read and write it fluently. Therefore, websites typically offer additional languages, such as Spanish and Chinese, so that visitors can change the language of the web pages. Before generative AI became popular, people hired agencies to perform text translations. Software engineers can now write code that uses Vertex AI and the Gemini 1.5 Pro model to translate texts and persist the results in a database. Client applications can then retrieve these translations and update the texts based on the selected language.

In this blog post, I will describe how I use NodeJS, Vertex AI, and the Gemini model to translate English phrases into Spanish.

Create a Google Cloud Project

Navigate to the Google Cloud Console, https://console.cloud.google.com/, and create a Google Cloud project. Then, add a billing account to the project because the Gemini 1.5 Pro model consumes tokens to translate texts, and that usage is billed. Fortunately, the tokens are cheap, so the cost of the text translations in this demo is low.

Create a new NodeJS Project

mkdir nodejs-gemini-translation
cd ./nodejs-gemini-translation
touch model.ts
touch index.ts

I created a folder and added two new TypeScript files. model.ts defines the generative AI model. index.ts contains the main program that uses the LLM to perform the text translations.

Install dependencies

npm i --save-exact @google-cloud/vertexai
npm i --save-exact --save-dev @types/node ts-node

Add Script in package.json

"scripts": {
    "start": "TARGET=es node -r ts-node/register --env-file=.env index.ts"
  }

This program runs on Node 20 (version 20.6 or later); therefore, I use the built-in --env-file flag to load the environment variables from the .env file. The TARGET=es prefix sets the target language to Spanish.

Define Google Cloud variables

# .env.example

GOOGLE_PROJECT_ID=<google project id>
GOOGLE_LOCATION=asia-east2
GOOGLE_MODEL=gemini-1.5-pro-001

Copy .env.example to .env in the same folder. Replace the values of GOOGLE_PROJECT_ID, GOOGLE_LOCATION, and GOOGLE_MODEL with your project, location, and model, respectively.

  • GOOGLE_PROJECT_ID - Google Cloud Project Id
  • GOOGLE_LOCATION - Google Cloud region. The default value is asia-east2, which is Hong Kong.
  • GOOGLE_MODEL - Large language model. The default value is gemini-1.5-pro-001.
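A missing project ID is easy to overlook, so it can help to validate these variables once at startup. The following is a minimal sketch, not part of the original project: readVertexConfig and VertexConfig are hypothetical names, and the fallback values are simply the defaults listed above.

```typescript
// Hypothetical helper (an assumption, not from the original code): reads the three
// variables from an env-like object and fails fast when the project ID is missing.
type VertexConfig = { project: string; location: string; model: string };

function readVertexConfig(env: Record<string, string | undefined>): VertexConfig {
    const project = (env.GOOGLE_PROJECT_ID || '').trim();
    if (!project) {
        throw new Error('GOOGLE_PROJECT_ID is not set; check the .env file');
    }
    return {
        project,
        // Fall back to the defaults documented in the list above
        location: env.GOOGLE_LOCATION || 'asia-east2',
        model: env.GOOGLE_MODEL || 'gemini-1.5-pro-001',
    };
}

// In the real program this would be: readVertexConfig(process.env)
const config = readVertexConfig({ GOOGLE_PROJECT_ID: 'my-project' });
console.log(config);
```

Passing the environment in as a parameter keeps the helper easy to test without touching process.env.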

Add .env to the .gitignore file to prevent accidentally committing the project ID to the GitHub repo.

# .gitignore

node_modules
.env

Set up Application Default Credentials

gcloud auth application-default login 

Set up Application Default Credentials (ADC) for the Vertex AI SDK to use in the local development environment.

gcloud auth application-default revoke 

When the Application Default Credentials are no longer needed, execute the above command in a terminal to revoke them.

Next, I call the Vertex AI SDK to create a Gemini model to generate translations between two languages.

Create Gemini 1.5 Pro Large Language Model

// model.ts

import { HarmBlockThreshold, HarmCategory, VertexAI } from '@google-cloud/vertexai';

const project = process.env.GOOGLE_PROJECT_ID || '';
// Fallbacks match the defaults in .env.example
const location = process.env.GOOGLE_LOCATION || 'asia-east2';
const model = process.env.GOOGLE_MODEL || 'gemini-1.5-pro-001';

const vertexAi = new VertexAI({ project, location });
export const generativeModel = vertexAi.getGenerativeModel({
    model,
    generationConfig: {
        candidateCount: 1,
        maxOutputTokens: 1024,
        temperature: 0,
        topP: 0.5,
        topK: 10,
    },
    safetySettings: [
        {
            category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
            threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        },
        {
            category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
            threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        },
        {
            category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
            threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        },
        {
            category: HarmCategory.HARM_CATEGORY_HARASSMENT,
            threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        }
    ],
});

The Vertex AI LLM requires a valid project, Google Cloud location, and model. The .env file provides these values.

generativeModel is a Gemini 1.5 Pro model, with temperature, topP, topK, maxOutputTokens, and safetySettings.
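To build some intuition for topK and topP, here is a toy sketch of how the two settings narrow down the next-token candidates. It is only an illustration of the general idea and not how Vertex AI implements sampling internally; filterCandidates, TokenProb, and the probabilities are made up for this example.

```typescript
// Toy illustration only: NOT the Vertex AI implementation.
type TokenProb = { token: string; prob: number };

// Keep the topK most likely tokens, then keep the smallest prefix of them
// whose cumulative probability reaches topP.
function filterCandidates(probs: TokenProb[], topK: number, topP: number): TokenProb[] {
    const sorted = [...probs].sort((a, b) => b.prob - a.prob).slice(0, topK);
    const kept: TokenProb[] = [];
    let cumulative = 0;
    for (const item of sorted) {
        kept.push(item);
        cumulative += item.prob;
        if (cumulative >= topP) {
            break;
        }
    }
    return kept;
}

// Made-up next-token probabilities
const nextTokens: TokenProb[] = [
    { token: 'días', prob: 0.7 },
    { token: 'dias', prob: 0.2 },
    { token: 'tardes', prob: 0.1 },
];

// With topP = 0.5, only the most likely token survives the filter
const survivors = filterCandidates(nextTokens, 10, 0.5);
console.log(survivors.map((t) => t.token));
```

A low topP and a temperature of 0 both push the model toward its most likely token, which suits translation, where repeatable output matters more than variety.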

Invoke model to perform text translations

// index.ts

import { generativeModel } from './model';

const source = 'en';
const target = process.env.TARGET || 'ja';

async function main() {
    const samples = [
        'Good morning',
        'Hello, how are you today?',
        'How much does 3 apples, 5 pineapples and 4 oranges cost?',
        'My favorite hobby is riding bicycle.',
    ]

    const arrTranslations: Record<string, string>[] = [];

    for (const str of samples) {
        const generatedContents = await generativeModel.generateContent({
            systemInstruction: `You are a translation expert that can translate between source and target languages.
                If you don't know the translation, then return "UNKNOWN TRANSLATION".
                For numbers, please use words to represent them instead of Arabic values.
                The response is target_language_code|translation.
            `,
            contents: [
                {
                    role: 'user',
                    parts: [{ text: `${source}:${str} ${target}:` }],
                }
            ],
        });

        const candidates = generatedContents.response.candidates || [];
        for (const candidate of candidates) {
            const translations: Record<string, string> = {
                [source]: str
            };

            (candidate.content.parts || []).forEach((part) => {
                console.log('part', part);
                const line = (part.text || '').trim();
                const [targetLanguage, text = ''] = line.split('|');    
                translations[targetLanguage] = text;
            });

            arrTranslations.push(translations);
        }
    }

    console.log(arrTranslations);
}

main();

index.ts imports generativeModel and uses it to generate the Spanish translations. The generateContent call includes a system instruction that tells the model how to behave.

systemInstruction: `You are a translation expert that can translate between source and target languages. 
If you don't know the translation, then return "UNKNOWN TRANSLATION". 
For numbers, please use words to represent them instead of Arabic values.
The response is target_language_code|translation.`

The model acts as a translation expert that translates texts. If the expert does not have the answer, it outputs "UNKNOWN TRANSLATION". For fun, I force the model to spell out numbers in words instead of Arabic numerals. When the text is "3 apples", the output is "tres manzanas". The prompt separates the language code from the text with ":", but I ask for "|" as the separator in the response by including "The response is target_language_code|translation." in the instruction.

contents: [
      {
              role: 'user',
              parts: [{ text: `${source}:${str} ${target}:` }],
      }
],

The role is user, and the part consists of the source language code, the source text, and the target language code. For example, parts: [{ text: 'en:Good morning es:' }].
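The prompt format can be isolated in a small function, which makes it easy to check the exact string sent to the model. This is a sketch only: buildPrompt is a hypothetical helper that mirrors the template literal in index.ts.

```typescript
// Hypothetical helper (not in the original code) that mirrors the
// template literal `${source}:${str} ${target}:` used in index.ts.
function buildPrompt(source: string, text: string, target: string): string {
    return `${source}:${text} ${target}:`;
}

const userPrompt = buildPrompt('en', 'Good morning', 'es');
console.log(userPrompt); // en:Good morning es:
```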

const candidates = generatedContents.response.candidates || [];
 for (const candidate of candidates) {
        const translations: Record<string, string> = {
                [source]: str
         };

         (candidate.content.parts || []).forEach((part) => {
                console.log('part', part);
                const line = (part.text || '').trim();
                const [targetLanguage, text = ''] = line.split('|');    
                translations[targetLanguage] = text;
         });

        arrTranslations.push(translations);
 }

Each candidate stores its translation in content.parts. The console.log statement prints the following parts:

part { text: 'es|Buenos días \n' }
part { text: 'es|Hola, ¿cómo estás hoy? \n' }
part {
  text: 'es|¿Cuánto cuestan tres manzanas, cinco piñas y cuatro naranjas?'
}
part { text: 'es|Mi pasatiempo favorito es andar en bicicleta. \n' }

The function trims each part and splits it by '|' to obtain the target language code and the translated text. These become a key and a value in the translations map. Finally, the translations map is appended to the arrTranslations array. The loop repeats the same steps for every source text and then the program terminates.
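The plain split('|') works for the sample phrases, but it drops text when a translation itself contains "|" and it stores "UNKNOWN TRANSLATION" under the wrong key. Below is a more defensive sketch of the same parsing step; parseTranslation and ParsedTranslation are hypothetical names and not part of the original code.

```typescript
// Hypothetical helper (an assumption, not from the original code).
type ParsedTranslation = { language: string; text: string } | null;

function parseTranslation(line: string): ParsedTranslation {
    const trimmed = line.trim();
    const separatorIndex = trimmed.indexOf('|');
    // No separator or no language code, e.g. a bare "UNKNOWN TRANSLATION" reply
    if (separatorIndex <= 0) {
        return null;
    }
    const language = trimmed.slice(0, separatorIndex).trim();
    // Split on the first '|' only, so a '|' inside the translation is preserved
    const text = trimmed.slice(separatorIndex + 1).trim();
    if (!text || text === 'UNKNOWN TRANSLATION') {
        return null;
    }
    return { language, text };
}

console.log(parseTranslation('es|Buenos días \n'));
console.log(parseTranslation('UNKNOWN TRANSLATION'));
```

Returning null lets the caller decide whether to skip the phrase, retry, or fall back to the source text.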

Test the translations

Run the start script in package.json in the terminal to generate the Spanish translations.

npm start

The terminal should output an array of JSON objects with the source and the target texts.

[
    {
        "en": "Good morning",
        "es": "Buenos días"
    },
    {
        "en": "Hello, how are you today?",
        "es": "Hola, ¿cómo estás hoy?"
    },
    {
        "en": "How much does 3 apples, 5 pineapples and 4 oranges cost?",
        "es": "¿Cuánto cuestan tres manzanas, cinco piñas y cuatro naranjas?"
    },
    {
        "en": "My favorite hobby is riding bicycle.",
        "es": "Mi pasatiempo favorito es andar en bicicleta."
    }
]
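A client application that retrieves these translations usually needs to look up one phrase at a time. As a rough sketch of that step, the array can be indexed by the source phrase; toLookup is a hypothetical helper, and how the rows are stored and fetched is left open here.

```typescript
// Hypothetical helper (not in the original code): index the generated
// array by source phrase so a client can look up one translation at a time.
function toLookup(
    rows: Record<string, string>[],
    source: string,
    target: string,
): Map<string, string> {
    const lookup = new Map<string, string>();
    for (const row of rows) {
        const from = row[source];
        const to = row[target];
        // Skip rows that lack either the source or the target text
        if (from && to) {
            lookup.set(from, to);
        }
    }
    return lookup;
}

const rows: Record<string, string>[] = [
    { en: 'Good morning', es: 'Buenos días' },
    { en: 'Hello, how are you today?', es: 'Hola, ¿cómo estás hoy?' },
];
const spanish = toLookup(rows, 'en', 'es');

// Fall back to the source text when a phrase has no translation
const label = spanish.get('Good morning') ?? 'Good morning';
console.log(label);
```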

This concludes my blog post about using Vertex AI, Gemini, and NodeJS to perform text translations. I only scratched the surface of Vertex AI because it offers many large language models and services to build interesting projects and solve real-world problems in different domains. I hope you like the content and continue to follow my learning experience in Angular, NestJS, and other technologies.
