Create a generative AI application with Angular and Gemini REST API

Connie Leung - Mar 18 - Dev Community

Introduction

In this blog post, I show how to create a generative AI application with Angular and the Gemini REST API. Because Angular calls the REST API directly, a basic generative AI application can be built without a backend.

The application demonstrates two use cases:

  • Generate text from a prompt
  • Generate text from multimodal input (text + image)

Generate Gemini API Key

Go to https://aistudio.google.com/app/apikey to generate an API key for a new or existing Google Cloud project.

Install dependency

npm i --save-exact ngx-markdown

Generate configuration file

It is bad practice to include the Gemini API key in the source code and commit it to the GitHub repo. Anyone who finds the key can make requests to the Gemini REST API, and the Google Cloud bill can become very expensive.

My solution is to write a shell script that generates a configuration file containing the API key.

// generate-config-file.sh

if [ $# -lt 1 ]; then
    echo "Usage: $0 <api key>"
    exit 1
fi

apiConfig='{ 
    "apiKey": "'"$1"'"
}'

outputFile='src/assets/config.json'
# Quote both variables so the JSON keeps its line breaks and the path is not word-split
echo "$apiConfig" > "$outputFile"

This script accepts an API key and writes a JSON object to src/assets/config.json.

./generate-config-file.sh some-api-key

The command creates the following JSON object in the configuration file:

{
    "apiKey": "some-api-key"
}

Add src/assets/config.json to the .gitignore file to ensure we don't accidentally commit the Gemini API Key to the GitHub repo.

// .gitignore

src/assets/config.json

Then, the Angular application can import the JSON file and provide the API key through EnvironmentProviders in the next step. Importing a JSON file in TypeScript requires the resolveJsonModule compiler option.
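
Because config.json is generated outside the build, it can be missing or malformed. The following is a minimal sketch of a runtime check that could run before the key is provided. `ApiConfig`, `isApiConfig`, and `readApiKey` are hypothetical names that are not part of the original application.

```typescript
// Assumed shape of src/assets/config.json: a single apiKey field.
interface ApiConfig {
  apiKey: string;
}

// Type guard: true only when the value is an object with a string apiKey.
function isApiConfig(value: unknown): value is ApiConfig {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as Record<string, unknown>)['apiKey'] === 'string'
  );
}

// Returns the key, or throws when the config is malformed or the key is blank.
function readApiKey(raw: unknown): string {
  if (!isApiConfig(raw) || raw.apiKey.trim() === '') {
    throw new Error('config.json must contain a non-empty "apiKey" string');
  }
  return raw.apiKey;
}

// Usage: parse the generated JSON and validate it before use.
const validatedKey = readApiKey(JSON.parse('{ "apiKey": "some-api-key" }'));
console.log(validatedKey.length > 0);
```

Failing early with a clear message is easier to diagnose than a rejected request from the Gemini endpoint later on.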

Define custom Gemini Provider

// harm-category.enum.ts

export enum HARM_CATEGORY {
    HARM_CATEGORY_UNSPECIFIED="HARM_CATEGORY_UNSPECIFIED",
    HARM_CATEGORY_DEROGATORY="HARM_CATEGORY_DEROGATORY",
    HARM_CATEGORY_TOXICITY="HARM_CATEGORY_TOXICITY",
    HARM_CATEGORY_VIOLENCE="HARM_CATEGORY_VIOLENCE",
    HARM_CATEGORY_SEXUAL="HARM_CATEGORY_SEXUAL",
    HARM_CATEGORY_MEDICAL="HARM_CATEGORY_MEDICAL",
    HARM_CATEGORY_DANGEROUS="HARM_CATEGORY_DANGEROUS",
    HARM_CATEGORY_HARASSMENT="HARM_CATEGORY_HARASSMENT",
    HARM_CATEGORY_HATE_SPEECH="HARM_CATEGORY_HATE_SPEECH",
    HARM_CATEGORY_SEXUALLY_EXPLICIT="HARM_CATEGORY_SEXUALLY_EXPLICIT",
    HARM_CATEGORY_DANGEROUS_CONTENT="HARM_CATEGORY_DANGEROUS_CONTENT"
}

// threshold.enum.ts

export enum THRESHOLD {
    HARM_BLOCK_THRESHOLD_UNSPECIFIED = "HARM_BLOCK_THRESHOLD_UNSPECIFIED",
    BLOCK_LOW_AND_ABOVE = "BLOCK_LOW_AND_ABOVE",
    BLOCK_MEDIUM_AND_ABOVE = "BLOCK_MEDIUM_AND_ABOVE",
    BLOCK_ONLY_HIGH = "BLOCK_ONLY_HIGH",
    BLOCK_NONE = "BLOCK_NONE"
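}

// gemini.interface.ts

// Paths inferred from the imports in gemini.provider.ts and gemini.constant.ts
import { HARM_CATEGORY } from '../enums/harm-category.enum';
import { THRESHOLD } from '../enums/threshold.enum';

export interface GeminiConfig {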
    maxOutputTokens: number,
    temperature: number,
    topP: number,
    topK: number
};

export interface GeminiSafetySetting {
    category: HARM_CATEGORY,
    threshold: THRESHOLD
}

// gemini.constant.ts

import { InjectionToken } from '@angular/core';
import { GeminiConfig, GeminiSafetySetting } from './interfaces/genmini.interface';

export const GEMINI_API_KEY = new InjectionToken<string>('API_KEY');
export const GEMINI_PRO_URL = new InjectionToken<string>('GEMINI_PRO_URL');
export const GEMINI_PRO_VISION_URL = new InjectionToken<string>('GEMINI_PRO_VISION_URL');

export const GEMINI_GENERATION_CONFIG = new InjectionToken<GeminiConfig>('GEMINI_GENERATION_CONFIG');
export const GEMINI_SAFETY_SETTINGS = new InjectionToken<GeminiSafetySetting[]>('GEMINI_SAFETY_SETTINGS');

// gemini.provider.ts

import { EnvironmentProviders, inject, makeEnvironmentProviders } from '@angular/core';
import config from '../../assets/config.json';
import { CORE_GUARD } from '../core/core.constant';
import { HARM_CATEGORY } from './enums/harm-category.enum';
import { THRESHOLD } from './enums/threshold.enum';
import { GEMINI_API_KEY, GEMINI_GENERATION_CONFIG, GEMINI_PRO_URL, GEMINI_PRO_VISION_URL, GEMINI_SAFETY_SETTINGS } from './gemini.constant';

export function provideGeminiApi(): EnvironmentProviders {
    const genAIBase = 'https://generativelanguage.googleapis.com/v1beta/models';

    return makeEnvironmentProviders([
        {
            provide: GEMINI_API_KEY,
            useValue: config.apiKey,
        },
        {
            provide: GEMINI_GENERATION_CONFIG,
            useValue: {
                "maxOutputTokens": 1024,
                "temperature": 0.2,
                "topP": 0.5,
                "topK": 3
            },
        },
        {
            provide: GEMINI_SAFETY_SETTINGS,
            useValue: [
                {
                    "category": HARM_CATEGORY.HARM_CATEGORY_HATE_SPEECH,
                    "threshold": THRESHOLD.BLOCK_MEDIUM_AND_ABOVE
                },
                {
                    "category": HARM_CATEGORY.HARM_CATEGORY_DANGEROUS_CONTENT,
                    "threshold": THRESHOLD.BLOCK_MEDIUM_AND_ABOVE
                },
                {
                    "category": HARM_CATEGORY.HARM_CATEGORY_SEXUALLY_EXPLICIT,
                    "threshold": THRESHOLD.BLOCK_MEDIUM_AND_ABOVE
                },
                {
                    "category": HARM_CATEGORY.HARM_CATEGORY_HARASSMENT,
                    "threshold": THRESHOLD.BLOCK_MEDIUM_AND_ABOVE
                }
            ],
        },
        {
            provide: GEMINI_PRO_URL,
            useFactory: () => {
                const coreGuard = inject(CORE_GUARD, { self: true, optional: true });
                if (coreGuard) {
                    throw new TypeError('provideGeminiApi cannot load more than once');
                }

                const apiKey = inject(GEMINI_API_KEY);
                return `${genAIBase}/gemini-pro:generateContent?key=${apiKey}`;
            }
        },
        {
            provide: GEMINI_PRO_VISION_URL,
            useFactory: () => {
                const apiKey = inject(GEMINI_API_KEY);
                return `${genAIBase}/gemini-pro-vision:generateContent?key=${apiKey}`;
            }
        },
    ]);
}

In gemini.provider.ts, I provide the API key, Gemini generation configuration, Gemini safety settings, Gemini Pro URL, and Gemini Pro Vision URL.

  • Gemini Pro URL - the Gemini endpoint that generates a text response from text input
  • Gemini Pro Vision URL - the Gemini endpoint that generates a text response from multimodal input. Multimodal input means text and images supplied by users.
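
Both URLs follow the same pattern: base URL, model name, the generateContent action, and the key as a query parameter. A minimal sketch of that pattern follows. `buildGenerateContentUrl` and `GeminiModel` are hypothetical names, and the `encodeURIComponent` call is an addition that the original provider does not make.

```typescript
// Base URL taken from the provider above.
const GEN_AI_BASE = 'https://generativelanguage.googleapis.com/v1beta/models';

// The two model names used by the endpoints in this post.
type GeminiModel = 'gemini-pro' | 'gemini-pro-vision';

// Hypothetical helper: builds the generateContent URL for a model.
// encodeURIComponent (an assumption, not in the original code) guards
// against characters in the key that are not URL-safe.
function buildGenerateContentUrl(model: GeminiModel, apiKey: string): string {
  return `${GEN_AI_BASE}/${model}:generateContent?key=${encodeURIComponent(apiKey)}`;
}

console.log(buildGenerateContentUrl('gemini-pro', 'some-api-key'));
console.log(buildGenerateContentUrl('gemini-pro-vision', 'some-api-key'));
```

With a helper like this, each factory would only need to pass a different model name.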

Bootstrap Application

Next, I register the provideGeminiApi provider in appConfig, which is passed to bootstrapApplication.

// app.config.ts

import { provideHttpClient } from '@angular/common/http';
import { provideRouter, withComponentInputBinding } from '@angular/router';
import { provideMarkdown } from 'ngx-markdown';
import { routes } from './app.routes';
import { provideGeminiApi } from './gemini/gemini.provider';

export const appConfig = {
    providers: [
      // ... other providers ...
      provideGeminiApi(),
    ]
};

// main.ts

import { bootstrapApplication } from '@angular/platform-browser';
import { appConfig } from '~app/app.config';
import { AppComponent } from './app/app.component';

bootstrapApplication(AppComponent, appConfig)
  .catch(err => console.error(err));

I have bootstrapped the Angular application, and the next step is to create a Gemini service that receives prompts and generates text.

Create GeminiService

// generate-text.operator.ts

import { Observable, catchError, map, of, retry, tap } from 'rxjs';

export function generateText(numRetries: number) {
  return function (source: Observable<any>) {
    return source.pipe(
      // Resubscribe to the HTTP request up to numRetries times on error
      retry(numRetries),
      tap((response) => console.log(response)),
      // Optional chaining at every level guards against a missing candidate, content, or part
      map((response) => response.candidates?.[0]?.content?.parts?.[0]?.text || 'No response'),
      // Once the retries are exhausted, fall back to a fixed message
      catchError((err) => {
        console.error(err);
        return of('Error occurs');
      })
    );
  };
}

// gemini.service.ts

import { HttpClient } from '@angular/common/http';
import { Injectable, inject } from '@angular/core';
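import { Observable } from 'rxjs';
// Assumed path to the custom operator defined above; adjust to the project layout
import { generateText } from '../operators/generate-text.operator';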
import { GEMINI_GENERATION_CONFIG, GEMINI_PRO_URL, GEMINI_PRO_VISION_URL, GEMINI_SAFETY_SETTINGS } from '../gemini.constant';
import { GeminiResponse } from '../interfaces/generate-response.interface';
import { MultimodalInquiry } from '../interfaces/genmini.interface';

@Injectable({
  providedIn: 'root'
})
export class GeminiService {
  private readonly geminiProUrl = inject(GEMINI_PRO_URL);
  private readonly geminiProVisionUrl = inject(GEMINI_PRO_VISION_URL);
  private readonly generationConfig = inject(GEMINI_GENERATION_CONFIG);
  private readonly safetySetting = inject(GEMINI_SAFETY_SETTINGS);
  private httpClient = inject(HttpClient);

  generateText(prompt: string): Observable<string> {
    return this.httpClient.post<GeminiResponse>(this.geminiProUrl, {
      "contents": [
        {
            "role": "user",
            "parts": [
              {
                "text": prompt
              }
          ]
        }
      ],
      "generation_config": this.generationConfig,
      "safetySettings": this.safetySetting
    }, {
      headers: {
        "Content-Type": "application/json"
      }
    })
    .pipe(generateText(3));
  }

  generateTextFromMultimodal({ prompt, mimeType, base64Data }: MultimodalInquiry): Observable<string> {
    return this.httpClient.post<GeminiResponse>(this.geminiProVisionUrl, {
      "contents": [
        {
            "role": "user",
            "parts": [
              {
                "text": prompt
              },
              {
                "inline_data": {
                  "mime_type": mimeType,
                  "data": base64Data
                }
              }
          ]
        }
      ],
      "generation_config": this.generationConfig,
      "safetySettings": this.safetySetting
    }, {
      headers: {
        "Content-Type": "application/json"
      }
    })
    .pipe(generateText(3));
  }
}

  • generateText - this method receives a prompt and generates the text. A failed HTTP request is retried up to 3 times. The method returns "No response" when the response contains no text and "Error occurs" when all attempts fail.
  • generateTextFromMultimodal - this method receives a prompt, mime type, and the inline Base64 data of the image. Similarly, a failed request is retried up to 3 times before the method returns "Error occurs", and an empty response yields "No response".
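
The two methods differ only in the parts they send, and both read the answer from the same place in the response. A minimal, framework-free sketch of those two steps follows. The type and function names are hypothetical, the types cover only the fields used here, and the generation config and safety settings are left out for brevity.

```typescript
// Minimal local types (assumptions): only the fields this sketch touches.
interface Part {
  text?: string;
  inline_data?: { mime_type: string; data: string };
}

interface GenerateContentRequest {
  contents: { role: 'user'; parts: Part[] }[];
}

interface GenerateContentResponse {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

// Hypothetical helper: one user turn with a text part and an optional image part.
function buildRequest(
  prompt: string,
  image?: { mimeType: string; base64Data: string }
): GenerateContentRequest {
  const parts: Part[] = [{ text: prompt }];
  if (image) {
    parts.push({ inline_data: { mime_type: image.mimeType, data: image.base64Data } });
  }
  return { contents: [{ role: 'user', parts }] };
}

// Hypothetical helper: the first candidate's first text part, or the fallback string.
function extractText(response: GenerateContentResponse): string {
  return response.candidates?.[0]?.content?.parts?.[0]?.text || 'No response';
}

// Usage: a text-only request and a response without candidates.
console.log(JSON.stringify(buildRequest('Tell me a joke')));
console.log(extractText({ candidates: [] }));
```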

Build shared components for the user interfaces

Create an Angular component that accepts a prompt and submits the request when the user clicks the "Ask me anything" button.

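// prompt-box.component.ts

import { ChangeDetectionStrategy, Component, computed, input, model, output } from '@angular/core';
import { FormsModule } from '@angular/forms';

@Component({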
  selector: 'app-prompt-box',
  standalone: true,
  imports: [FormsModule],
  template: `
    <div>
      <textarea rows="3" [(ngModel)]="prompt"></textarea>
      <button (click)="askMe.emit()" [disabled]="vm.isLoading">{{ vm.buttonText }}</button>
    </div>
  `,
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class PromptBoxComponent {
  prompt = model.required<string>();
  loading = input.required<boolean>();

  viewModel = computed(() => ({
    isLoading: this.loading(),
    buttonText: this.loading() ? 'Processing' : 'Ask me anything',
  }));

  get vm() {
    return this.viewModel();
  }

  askMe = output();
}

An Image Preview component allows users to select an image (jpg, jpeg, png) from a file dialog and preview it.

// gemini.interface.ts

export interface ImageInfo {
    base64DataURL: string;
    base64Data: string;
    mimeType: string;
    filename: string;
}

// image-preview.component.ts

import { ChangeDetectionStrategy, Component, model } from '@angular/core';

@Component({
  selector: 'app-image-preview',
  standalone: true,
  template: `
    <div>
      <label for="fileInput">Select an image:</label>
      <input id="fileInput" name="fileInput" (change)="fileChange($event)"
        alt="image input" type="file" accept=".jpg,.jpeg,.png" />
    </div>
    @if(imageInfo(); as imageInfo) {
      <img [src]="imageInfo.base64DataURL" [alt]="imageInfo.filename" width="250" height="250" />
    }
  `,
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class ImagePreviewComponent {
  imageInfo = model.required<ImageInfo | null>();

  fileChange(event: any) {
    const imageFile: File | undefined = event.target.files?.[0];
    if (!imageFile) {
      return;
    }

    const reader = new FileReader();
    reader.readAsDataURL(imageFile);
    reader.onloadend = () => {
      const fileResult = reader.result;
      if (fileResult && typeof fileResult === 'string') {
        const data = fileResult.substring(`data:${imageFile.type};base64,`.length);
        this.imageInfo.set({
          base64DataURL: fileResult,
          base64Data: data,
          mimeType: imageFile.type,
          filename: imageFile.name
        });
      }
    }
  }
}

When users select a file, fileChange creates a FileReader and reads the image file as a data URL. When the read completes, I update the imageInfo model with the data URL, the base64 image data, the mime type, and the file name.
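
A data URL has the form `data:<mime type>;base64,<payload>`, and the Gemini request needs only the payload. The component strips the prefix using the file's type. The sketch below shows an alternative that parses the data URL itself, assuming the URL carries no extra parameters between the mime type and `;base64,`. `parseDataUrl` and `ParsedDataUrl` are hypothetical names.

```typescript
interface ParsedDataUrl {
  mimeType: string;
  base64Data: string;
}

// Hypothetical helper: splits "data:<mime>;base64,<payload>" into its parts.
// Returns null when the input is not a base64 data URL.
function parseDataUrl(dataUrl: string): ParsedDataUrl | null {
  const prefix = 'data:';
  const marker = ';base64,';
  const markerIndex = dataUrl.indexOf(marker);
  if (!dataUrl.startsWith(prefix) || markerIndex === -1) {
    return null;
  }
  return {
    // Everything between "data:" and ";base64," is the mime type
    mimeType: dataUrl.substring(prefix.length, markerIndex),
    // Everything after ";base64," is the payload sent to the API
    base64Data: dataUrl.substring(markerIndex + marker.length),
  };
}

// Usage with a truncated PNG payload
console.log(parseDataUrl('data:image/png;base64,iVBORw0KGgo='));
```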

// chat-history.component.ts

import { ChangeDetectionStrategy, Component, input } from '@angular/core';
import { MarkdownComponent } from 'ngx-markdown';
import { HistoryItem } from '../interfaces/history-item.interface';
import { LineBreakPipe } from '../pipes/line-break.pipe';

@Component({
  selector: 'app-chat-history',
  standalone: true,
  imports: [MarkdownComponent, LineBreakPipe],
  template: `
    <h3>Chat History</h3>
    @if (chatHistory().length > 0) {
      <div class="scrollable-list">
        <ol>
          @for (history of chatHistory(); track history) {
            <li>
              <p>{{ history.prompt }}</p>
              <markdown [data]="lineBreakPipe.transform(history.response)" />
            </li>
          }
        </ol>
      </div>
    } @else {
      <p>No history</p>
    }
  `,
  changeDetection: ChangeDetectionStrategy.OnPush,
})
export class ChatHistoryComponent {
  chatHistory = input.required<HistoryItem[]>();
  lineBreakPipe = new LineBreakPipe();
}

ChatHistoryComponent lists all the prompts and the generated texts from the earliest to the latest.
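
The earliest-to-latest order comes from appending each new item to the end of the list. The sketch below illustrates that accumulation with `Array.reduce` standing in for a stream of responses. The `HistoryItem` shape is inferred from the template (`prompt` and `response`), since the post does not show the interface, and `appendHistory` is a hypothetical name.

```typescript
// Assumed shape, inferred from history.prompt and history.response in the template.
interface HistoryItem {
  prompt: string;
  response: string;
}

// Pure accumulator: returns a new array with the item appended at the end.
function appendHistory(history: HistoryItem[], item: HistoryItem): HistoryItem[] {
  return history.concat(item);
}

// Array.reduce stands in for responses arriving one at a time.
const incoming: HistoryItem[] = [
  { prompt: 'first', response: 'answer 1' },
  { prompt: 'second', response: 'answer 2' },
];
const chatHistory = incoming.reduce(
  (acc, item) => appendHistory(acc, item),
  [] as HistoryItem[]
);

console.log(chatHistory.map((item) => item.prompt));
```

Returning a new array on every step, rather than mutating the old one, also suits OnPush change detection, which reacts to changed input references.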

Build the user interfaces for the application

GenerateTextComponent is a component that displays a prompt box for a user to input a prompt and generate the text.

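// generate-text.component.ts

import { AsyncPipe } from '@angular/common';
import { ChangeDetectionStrategy, Component, OnInit, inject, signal, viewChild } from '@angular/core';
import { outputToObservable } from '@angular/core/rxjs-interop';
import { FormsModule } from '@angular/forms';
import { Observable, filter, finalize, scan, startWith, switchMap, tap } from 'rxjs';

@Component({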
  selector: 'app-generate-text',
  standalone: true,
  imports: [FormsModule, ChatHistoryComponent, PromptBoxComponent, AsyncPipe],
  template: `
    <h3>Input a prompt to receive an answer from the Google Gemini AI</h3>
    <app-prompt-box [loading]="loading()" [(prompt)]="prompt" />
    @if (chatHistory$ | async; as chatHistory) {
      <app-chat-history [chatHistory]="chatHistory" />
    }
  `,
  changeDetection: ChangeDetectionStrategy.OnPush,
})
export class GenerateTextComponent implements OnInit {
  promptBox = viewChild.required(PromptBoxComponent);

  geminiService = inject(GeminiService);
  prompt = signal('');
  loading = signal(false);

  chatHistory$!: Observable<HistoryItem[]>;

  ngOnInit(): void {
    this.chatHistory$ = outputToObservable(this.promptBox().askMe)
      .pipe(
        filter(() => this.prompt() !== ''),
        tap(() => this.loading.set(true)),
        switchMap(() => 
          this.geminiService.generateText(this.prompt()).pipe(finalize(() => this.loading.set(false)))
        ),
        scan((acc, response) => acc.concat({ prompt: this.prompt(), response }), [] as HistoryItem[]),
        startWith([] as HistoryItem[])
      );
  }
}

GenerateTextMultimodalComponent is a component that displays an image selector and a prompt box. A user selects an image, inputs a prompt, and clicks the button to generate text.

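// generate-text-multimodal.component.ts

import { AsyncPipe } from '@angular/common';
import { ChangeDetectionStrategy, Component, OnInit, computed, inject, signal, viewChild } from '@angular/core';
import { outputToObservable } from '@angular/core/rxjs-interop';
import { FormsModule } from '@angular/forms';
import { Observable, filter, finalize, scan, startWith, switchMap, tap } from 'rxjs';

@Component({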
  selector: 'app-generate-text-multimodal',
  standalone: true,
  imports: [
    FormsModule, 
    ChatHistoryComponent, 
    ImagePreviewComponent, 
    PromptBoxComponent, 
    AsyncPipe
  ],
  template: `
    <h3>Input a prompt and select an image to receive an answer from the Google Gemini AI</h3>
    <div class="container">
      <app-image-preview class="image-preview" [(imageInfo)]="imageInfo" />
      <app-prompt-box [loading]="loading()" [(prompt)]="prompt" />
    </div>
    @if (chatHistory$ | async; as chatHistory) {
      <app-chat-history [chatHistory]="chatHistory" />
    }
  `,
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class GenerateTextMultimodalComponent implements OnInit {
  promptBox = viewChild.required(PromptBoxComponent);

  geminiService = inject(GeminiService);
  prompt = signal('');
  loading = signal(false);
  imageInfo = signal<ImageInfo | null>(null);

  viewModel = computed(() => ({
    isLoading: this.loading(),
    base64Data: this.imageInfo()?.base64Data,
    mimeType: this.imageInfo()?.mimeType,
    prompt: this.prompt(),   
  }));

  chatHistory$!: Observable<HistoryItem[]>;

  get vm() {
    return this.viewModel();
  }

  ngOnInit(): void {
    this.chatHistory$ = outputToObservable(this.promptBox().askMe)
      .pipe(
        filter(() => this.vm.prompt !== '' && !!this.vm.base64Data),
        tap(() => this.loading.set(true)),
        switchMap(() => {
          const { isLoading, ...rest } = this.vm;
          return this.geminiService.generateTextFromMultimodal(rest as MultimodalInquiry)
            .pipe(finalize(() => this.loading.set(false)))
        }),
        scan((acc, response) => acc.concat({ prompt: this.prompt(), response }), [] as HistoryItem[]),
        startWith([] as HistoryItem[])
      );
  }
}
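
The component filters out clicks that lack a prompt or an image and then casts the remaining fields to MultimodalInquiry. As a hedged alternative, the check and the conversion can live in one function so that no cast is needed. The `MultimodalInquiry` shape below is an assumption, because the post imports the interface without showing it, and `toMultimodalInquiry` and `MultimodalViewModel` are hypothetical names.

```typescript
// Assumed shape: matches the parameters destructured in generateTextFromMultimodal.
interface MultimodalInquiry {
  prompt: string;
  mimeType: string;
  base64Data: string;
}

// Mirrors the component's view model, where image fields may be undefined.
interface MultimodalViewModel {
  isLoading: boolean;
  prompt: string;
  mimeType?: string;
  base64Data?: string;
}

// Hypothetical helper: returns a complete inquiry, or null when input is missing.
function toMultimodalInquiry(vm: MultimodalViewModel): MultimodalInquiry | null {
  const { prompt, mimeType, base64Data } = vm;
  if (prompt === '' || !mimeType || !base64Data) {
    return null;
  }
  // isLoading is dropped because the service does not need it
  return { prompt, mimeType, base64Data };
}

// Usage: no image selected yet, so nothing should be sent
console.log(toMultimodalInquiry({ isLoading: false, prompt: 'Describe the image' }));
```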

Add routes to navigate to the user interfaces

Create a navigation bar to route to either GenerateTextComponent or GenerateTextMultimodalComponent.

// app.routes.ts

import { Route } from '@angular/router';

export const routes: Route[] = [
    {
        path: '',
        pathMatch: 'full',
        loadComponent: () => import('./gemini/generate-text/generate-text.component')
            .then((m) => m.GenerateTextComponent)
    },
    {
        path: 'text-multimodal',
        loadComponent: () => import('./gemini/generate-text-multimodal/generate-text-multimodal.component')
            .then((m) => m.GenerateTextMultimodalComponent)
    },
    {
        path: '**',
        redirectTo: '',
    }
];

// app-menu.component.ts

import { ChangeDetectionStrategy, Component } from '@angular/core';
import { RouterLink } from '@angular/router';

@Component({
  selector: 'app-app-menu',
  standalone: true,
  imports: [RouterLink],
  template: `
    <div class="menu-container">
      <ul class="menu">
        <li><a routerLink="/">Generate Text from Text Input</a></li>
        <li><a routerLink="/text-multimodal">Generate Text from Text and Image Inputs</a></li>
      </ul>
    </div>
  `,
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class AppMenuComponent {}

In appConfig, provideRouter(routes) registers the routes that navigate to different components.

// app.config.ts

import { provideRouter, withComponentInputBinding } from '@angular/router';
import { routes } from './app.routes';

export const appConfig = {
    providers: [
      // ... other providers ...
      provideRouter(routes),
    ]
};

Add the navigation bar and router outlet to the AppComponent.

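// app.component.ts

import { ChangeDetectionStrategy, Component } from '@angular/core';
import { RouterOutlet } from '@angular/router';

@Component({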
  selector: 'app-root',
  standalone: true,
  imports: [RouterOutlet, AppMenuComponent],
  template: `
    <div>
      <app-app-menu />
      <h2>{{ title }}</h2>
      <router-outlet />
    </div>
  `,
  changeDetection: ChangeDetectionStrategy.OnPush
})
export class AppComponent {
  title = 'Gemini AI Generate Text Demo';
}

That is it. I have created a generative AI application with only Angular and the Gemini REST API. The application can later be extended into a full-stack application by replacing the direct REST API calls with calls to backend endpoints.

This is the end of the blog post that uses Angular to build a generative AI application. I hope you like the content and continue to follow my learning experience in Angular, NestJS, and other technologies.
