tl;dr;

Migrating image data from Cloud SQL to GCS significantly improved database performance and reduced storage costs. By automating the process with Kubernetes Jobs and GitHub Actions, I ensured a repeatable and secure migration path.

Key Takeaways:

Use Appropriate Storage Solutions: Databases are not ideal for large binary data.
Automate with CI/CD: Reduces errors and streamlines deployments.
Secure Authentication: Utilize Workload Identity to avoid managing service account keys.
Cleanup After One-Time Tasks: Remove temporary scripts and resources to maintain a clean codebase.

In a recent project, I faced a challenge with efficiently storing image data. The service I was working on is built on Google Cloud Platform (GCP), utilizing Cloud SQL for relational data and Google Cloud Storage (GCS) for object storage. Initially, image data in base64 format was stored directly within the MessageTable of Cloud SQL. However, as the volume of images grew, I noticed increased database load and potential performance degradation.

To address this, I decided to migrate the base64-encoded images from the database to GCS. This blog post outlines the technical steps I took to achieve this migration using a Kubernetes Job, TypeScript scripts, and GitHub Actions.

The problem with storing images in the database

Storing large binary data like images in a relational database is suboptimal. Databases are designed for structured data and transactions, not for handling large blobs of data. Doing so can:

Increase backup sizes and times: Large data sizes make backups slower and consume more storage.
Slow down query performance: Large blobs can degrade the performance of queries, especially if the images are retrieved frequently.
Consume more memory and storage resources: Databases are not optimized for storing large files, leading to inefficient resource usage.

Learn more on this blog.

Solution Overview

To solve this, I planned to:

Extract the base64 images from Cloud SQL.
Upload them to GCS, which is optimized for storing large objects.
Update the application to reference the images from GCS instead of the database.

This process involved creating a TypeScript script to handle the migration, setting up a Kubernetes Job to execute the script within a private network, and automating the deployment with GitHub Actions.

Preparing the Migration Script

I used TypeScript and the Drizzle ORM for database interactions. The script performs the following steps:

Retrieve messages containing base64 image data from the database.
Decode the base64 string into a binary buffer.
Determine the image file type using the buffer.
Upload the image to GCS under a structured path using the message ID.
Handle errors and log them appropriately.

Migration Script (task.ts):

const migrateDBToStorage = async () => {
  const messages = await db.select().from(MessageTable).execute();

  for (const message of messages) {
    const base64Image = message.base64ImageData.replace(/^data:image\/\w+;base64,/, '');
    const buffer = Buffer.from(base64Image, 'base64');
    const fileType = await fileTypeFromBuffer(buffer);

    if (!fileType) {
      logger.error(`Failed to get file type for message ID: ${message.id}`);
      continue;
    }

    const uploadResult = await storageClient.uploadBlobsInBatch([
      {
        path: `messages/${message.id}.${fileType.ext}`,
        content: buffer,
      },
    ]);

    if (!uploadResult.isSuccess) {
      logger.error(`Failed to upload image for message ID: ${message.id}`, uploadResult.getError());
    }
  }
};

Key Points:

Data Extraction: Retrieves all messages with image data.
Data Transformation: Removes the base64 header and decodes the string.
File Type Detection: Uses fileTypeFromBuffer to determine the file extension.
Uploading to GCS: Utilizes storageClient.uploadBlobsInBatch for efficient uploading.

Setting Up the Kubernetes Job

To execute the script securely within the private network, I used a Kubernetes Job on Google Kubernetes Engine (GKE). I utilized kustomize to manage environment-specific configurations while keeping a common base.

Directory Structure:

.
├── base
│   ├── external-secret.yaml
│   ├── job.yaml
│   └── kustomization.yaml
└── overlays
    ├── dev
    │   ├── config.properties
    │   ├── external-secret.yaml
    │   ├── job.yaml
    │   ├── kustomization.yaml
    │   └── serviceaccount.yaml
    ├── stage
    └── prod

Job Definition (job.yaml):

apiVersion: batch/v1
kind: Job
metadata:
  name: db-to-storage-migration
spec:
  template:
    spec:
      containers:
      - name: db-to-storage-migration
        image: <artifact-registry-image>
        command:
          - "sh"
          - "-c"
          - "bun run /app/index.js migrate-db-to-storage"
      restartPolicy: Never

Explanation:

apiVersion & kind: Defines this resource as a Kubernetes Job.
metadata.name: Names the job db-to-storage-migration.
spec.template.spec.containers: Specifies the container to run.
- image: Points to the container image stored in Artifact Registry.
- command: Executes the migration script using bun.
restartPolicy: Set to Never to avoid re-running upon completion.

Managing Service Accounts and Permissions

To access Cloud SQL and GCS from the Kubernetes pods, I set up the necessary permissions:

Cloud SQL: Assigned roles/cloudsql.client to allow connections.
GCS: Assigned roles/storage.objectCreator to enable uploading objects.

I used Workload Identity to map a Kubernetes Service Account to a Google Service Account with these permissions.

Service Account Configuration (serviceaccount.yaml):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: db-to-storage-migration-sa
  annotations:
    iam.gke.io/gcp-service-account: <gcp-service-account-email>

Steps Taken:

Created a Google Service Account with the required IAM roles.
Annotated the Kubernetes Service Account to link it with the Google Service Account.
Enabled Workload Identity to allow pods to authenticate as the Google Service Account without using service account keys.

Automating Deployment with GitHub Actions

To apply the Kubernetes Job on GKE, I utilized GitHub Actions. This approach:

Reduces Manual Errors: Automation prevents mistakes that can occur during manual deployment.
Leverages Existing CI/CD Pipelines: Integrates smoothly with current workflows.
Enhances Security: Uses Workload Identity and OIDC for authentication, avoiding service account keys.

GitHub Actions Workflow (db-to-storage-migration.yaml):

name: Database to Storage Migration

on:
  workflow_dispatch:

env:
  WORKLOAD_IDENTITY_PROVIDER: ${{ secrets.WORKLOAD_IDENTITY_PROVIDER }}
  GHA_SERVICE_ACCOUNT: ${{ secrets.GHA_SERVICE_ACCOUNT }}
  REGION: <region>
  GCP_PROJECT: <project-id>
  GKE_CLUSTER: <cluster-name>

jobs:
  db-to-storage-migration:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      id-token: write
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.ref_name || 'development' }}
      - uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: ${{ env.WORKLOAD_IDENTITY_PROVIDER }}
          service_account: ${{ env.GHA_SERVICE_ACCOUNT }}
      - uses: google-github-actions/setup-gcloud@v2
        with:
          project_id: ${{ env.GCP_PROJECT }}
      - run: |
          gcloud components install gke-gcloud-auth-plugin
          gcloud container clusters get-credentials $GKE_CLUSTER --region $REGION --project $GCP_PROJECT
          kubectl apply -k deployment/k8s/manifests/db-to-storage-migration/overlays/dev

Workflow Breakdown:

Trigger: Manually via workflow_dispatch.
Environment Variables: Securely injected via GitHub Secrets.
Jobs:
- Authentication: Uses google-github-actions/auth with Workload Identity.
- Setup Cloud SDK: Ensures gcloud CLI is available.
- Execute Commands:
- Installs the GKE auth plugin.
- Fetches cluster credentials.
- Applies the Kubernetes manifests using kubectl apply -k.

Execution and Cleanup

After setting everything up:

Ran the GitHub Actions workflow for each environment (dev, stage, prod).
Migration Script Execution:
- The Kubernetes Job pulled the container image.
- Executed the migration script inside the cluster.
- Logs were collected for monitoring and troubleshooting.
Post-Migration:
- Updated the application code to reference images from GCS.
- Tested thoroughly to ensure functionality.
Cleanup:
- Deleted the Kubernetes Job and related resources.
- Removed the migration script from the codebase to prevent unintended reuse.

If you found this guide helpful and are dealing with similar challenges, don't hesitate to reach out to me on X. For more tech insights and updates, consider following me on GitHub. Let's innovate together!

Optimizing database performance by migrating Images to GCS