Live Image Editor with JavaScript: Canvas API and Tesseract.js (OCR)

Sk - Aug 27 '23 - - Dev Community

One of my favorite charting libraries, Chart.js, is built around the canvas.

TensorFlow.js, the GOAT of ML and deep learning in the browser, can use WebGL for computation, which is often faster than its CPU and WebAssembly backends.

Guess what, the canvas API provides a WebGL context:

const canvas = document.createElement("canvas");
const webgl = canvas.getContext("webgl");

All this is usually reserved for lower-level languages. It's undeniable that the browser is powerful, made even more so by web APIs that grant us lower-level access and control.

We can do animation, game graphics and engines, physics simulation, data visualization, photo manipulation, real-time video processing and more, all with the canvas.

This article explores one of those possibilities: pixel-by-pixel image manipulation.

The Canvas API: an overview

A book on the power of the canvas and its potential applications could span thousands of pages, so we will cover only the necessary basics.

Create a new project folder with the following structure:

 imgs/
 src/
   app.js
 index.html


Copy and paste the HTML starter below:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Live Img editor</title>

    <style>
        *, *::before, *::after {
            box-sizing: border-box;
        }

        body{
            margin: 0;
            padding: 0;
            height: 100vh;

        }

        .graphics{
            width: 100%;
            height: 100%;
            display: flex;
            align-items: center;
            justify-content: center;
            gap: 5em;
            padding: .9em;
        }
    </style>
</head>
<body>

   <div class="graphics">
     <div style="display: grid;">
         <h2>Edit</h2>
         <canvas id="canvas" width="600" height="400"></canvas>
    </div>
         <div style="display: grid;">
            <h2>Original</h2>
            <img src="./imgs/1.jpg" width="400" height="255" id="i"/>
         </div>
   </div>


   <script type="module" src="./src/app.js"></script>
</body>
</html>

Clone the repo to access the image files (optional, you can use your own):

  git clone https://github.com/SfundoMhlungu/graphics-live-img-editor.git

Navigate to app.js. It is advisable to wait for the DOMContentLoaded event before touching the DOM:

// app.js

document.addEventListener("DOMContentLoaded", () => {
  /**
   * @type {HTMLCanvasElement}
   */
  const canvas = document.getElementById("canvas");

  /**
   * @type {CanvasRenderingContext2D}
   */
  const ctx = canvas.getContext("2d");
});

We get a reference to the canvas and, from the canvas, the 2D context, which gives us an API for drawing 2D graphics onto the canvas.

For example, let's draw a simple square in the middle:

ctx.fillStyle = "green";

// offset the top-left corner by half the square's size to center it
ctx.fillRect(canvas.width / 2 - 100, canvas.height / 2 - 100, 200, 200);

The canvas API is imperative: we have to provide every instruction, step by step, in contrast to declarative programming.

We first tell the context which color to use from now on, meaning anything we draw after ctx.fillStyle = "green" will be filled with green.

For example, to draw a yellow square after the green one, we have to set the fill style again explicitly:

ctx.fillStyle = "green";
ctx.fillRect(canvas.width / 2 - 100, canvas.height / 2 - 100, 200, 200);

// same position and size, so the yellow square is painted over the green one
ctx.fillStyle = "yellow";
ctx.fillRect(canvas.width / 2 - 100, canvas.height / 2 - 100, 200, 200);

The canvas provides a coordinate system with the origin (0, 0) at the top left. Moving right along the x axis is +n and moving left is -n, n being a number of pixels.

The y axis is flipped compared to the usual math convention: +n goes down and -n goes up.

[Figure: graphics coordinate system and units example]

For example, fillRect takes the coordinates (x, y) of the top-left corner, a width (w) and a height (h):

ctx.fillRect(x, y, w, h);
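
To make the coordinate rules concrete, here is a minimal sketch of the arithmetic behind centering a rectangle. The helper name centeredRect is made up for this illustration and is not part of the canvas API; it only computes numbers, so it runs outside the browser too.

```javascript
// Hypothetical helper: computes the top-left corner needed to center
// a w x h rectangle inside a canvas of the given size.
function centeredRect(canvasWidth, canvasHeight, w, h) {
  return {
    x: (canvasWidth - w) / 2, // distance from the left edge
    y: (canvasHeight - h) / 2, // distance from the top edge (y grows downward)
    w,
    h,
  };
}

const rect = centeredRect(600, 400, 200, 200);
console.log(rect); // { x: 200, y: 100, w: 200, h: 200 }

// In the browser you would then call:
// ctx.fillRect(rect.x, rect.y, rect.w, rect.h);
```

For the 600 x 400 canvas from the HTML starter, this gives the same position as the green square drawn earlier.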

We can draw all sorts of complex shapes using the canvas; consult this article for more basics.

It would be an injustice not to show the potential of the canvas with a more complex example.

One of my favorite books as a beginner was Daniel Shiffman's The Nature of Code, originally written in Java; there is also a JavaScript p5 version.

I translated most of it directly to the canvas API, and I will be posting articles related to it in some form in the near future. Graphics programming is fun.

The example below draws on content spanning over 148 pages of The Nature of Code. To make it easier, I created a simple npm package for educational purposes.

In the same folder, initialize a package.json file:

  npm init

and install the package:

npm i natureofcode-canvas-ed

It only has two classes, Vector and Mover: the first for vector math and the latter a physics object.

Create a new script, particles.js, under src, and comment out app.js and the image for now:

<!-- <img src="./imgs/1.jpg" width="400" height="255" id="i"/> -->

<script type="module" src="./src/particles.js"></script>
<!-- <script type="module" src="./src/app.js"></script> -->

Copy and paste the following into particles.js:

import { Vector, Mover } from "natureofcode-canvas-ed";

/**
 * @type {HTMLCanvasElement}
 */
const canvas = document.getElementById("canvas");

/**
 * @type {CanvasRenderingContext2D}
 */
const ctx = canvas.getContext("2d");

canvas.width = window.innerWidth;
canvas.height = window.innerHeight;

let particle = new Mover(5, 100, 20, 200, [1, 1, 1], canvas);
// blowing to positive x direction
const wind = new Vector(0.01, 0);
const gravity = new Vector(0, 1);

let particles = [];

function update() {
  // clear the canvas before we draw
  ctx.clearRect(0, 0, canvas.width, canvas.height);

  // emit 5 new particles on every frame
  for (let i = 0; i < 5; i++) {
    particles.push(
      // note: ^ is bitwise XOR in JavaScript, not exponentiation
      new Mover(1, 200, 20, 100 * i, [(155 * i) ^ 2, 155, 155 * i], canvas)
    );
  }

  particles.forEach(p => {
    // p.applyForce(gravity)
    p.applyForce(wind);
    p.checkEdges();
    p.update();
    p.display();
  });

  // iterate backwards (down to index 0) so splice does not skip elements
  for (let i = particles.length - 1; i >= 0; i--) {
    if (particles[i].finished()) {
      // console.log("removed")
      particles.splice(i, 1);
    }
  }

  particle.applyForce(gravity);
  particle.applyForce(wind);
  particle.checkEdges();
  particle.update();
  particle.display(false);
  requestAnimationFrame(update);
}

update();

Use parcel to bundle and run:

 npx parcel index.html

The code above creates a physics object bound by physical laws (nature), e.g. wind and gravity.

The Mover object can respond to these forces and simulate their effect.

Its parameters:

Mover(mass, x, y, lifespan, color, canvas)

// constructor(m, x, y, life = 200, color = [155, 155, 155], canvas)
let particle = new Mover(5, 100, 20, 200, [1, 1, 1], canvas);

The update function applies gravity and wind to the Mover object on every tick:

particle.applyForce(gravity);
particle.applyForce(wind);

then checks the bounds and commits the calculations:

particle.checkEdges();
particle.update();
particle.display(false);
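
To give a feel for what applying a force and then updating can mean, here is a minimal sketch of the Euler integration described in The Nature of Code. SimpleMover is a hypothetical stand-in written for this illustration; it is an assumption that the package's Mover works along similar lines, and its real internals may differ.

```javascript
// Hypothetical stand-in for a physics object, NOT the package's Mover class.
class SimpleMover {
  constructor(mass, x, y) {
    this.mass = mass;
    this.position = { x, y };
    this.velocity = { x: 0, y: 0 };
    this.acceleration = { x: 0, y: 0 };
  }

  // Newton's second law, a = F / m, accumulated over all forces in this tick
  applyForce(force) {
    this.acceleration.x += force.x / this.mass;
    this.acceleration.y += force.y / this.mass;
  }

  // Commit the tick: acceleration changes velocity, velocity changes position
  update() {
    this.velocity.x += this.acceleration.x;
    this.velocity.y += this.acceleration.y;
    this.position.x += this.velocity.x;
    this.position.y += this.velocity.y;
    // forces do not carry over to the next frame
    this.acceleration.x = 0;
    this.acceleration.y = 0;
  }
}

const mover = new SimpleMover(5, 100, 20);
mover.applyForce({ x: 0, y: 1 }); // gravity
mover.applyForce({ x: 0.01, y: 0 }); // wind
mover.update();
console.log(mover.position, mover.velocity);
```

Because the force is divided by the mass, a heavier object accelerates less under the same wind, which is why changing the first constructor argument changes how the particle moves.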

You can play with forces, for example making the wind stronger:

const wind = new Vector(0.1, 0);

or change the direction:

const wind = new Vector(0.2, 0.15);

[Figure: physics canvas GIF]

The canvas can do much more, on top of being fun; we will explore more of this in upcoming articles.

Now let's move on to the issue at hand and go back to app.js:

<img src="./imgs/1.jpg" width="400" height="255" id="i" />

<!-- <script type="module" src="./src/particles.js"></script>-->
<script type="module" src="./src/app.js"></script>

Image manipulation

Comment out the lines that draw the squares; we still need the canvas and the context for image manipulation.

First we need to turn our image into pixel data. We could load the image into the canvas from the <img> tag, but for this tutorial the <img> will serve as a preview only, so we will create a new Image in JS:

// app.js

const img = new Image();

// handle events

img.src = "";

This article will be followed by a Chrome extension version, where we take this image editor and combine it with Optical Character Recognition (OCR) to extract text from images.

You can find the OCR part of the tutorial on dev.to; it's only a 4-minute read.

In the extension, the editor will allow users to enhance the image for better text extraction by the OCR engine.

Here is the entire code to load the image, before we implement the function that paints it onto the canvas:

const img = new Image();
img.onload = function () {
  // paint into the canvas
  drawImageOnCanvas(img, canvas, ctx);
};
img.src = "./imgs/1.jpg";

drawImageOnCanvas draws the image onto the canvas, where we can then read it back as pixel data.

Create a new file, imageEdit.js, in the src directory and export the following function:

/**
 *
 * @param {HTMLImageElement} img
 * @param {HTMLCanvasElement} canvas
 *  @param {CanvasRenderingContext2D} ctx
 */
export function drawImageOnCanvas(img, canvas, ctx) {
  const maxSide = Math.max(img.width, img.height);
  const scale = 400 / maxSide;
  const canvasWidth = img.width * scale;
  const canvasHeight = img.height * scale;
  canvas.width = canvasWidth;
  canvas.height = canvasHeight;
  ctx.clearRect(0, 0, canvasWidth, canvasHeight);
  ctx.drawImage(img, 0, 0, canvasWidth, canvasHeight);
}

The image may not be the same size as the canvas. An image can be, for example, 3000 x 2400 pixels, which we cannot fit on a screen.

The first few lines handle that by scaling the image so that its longest side is 400 pixels. To change the size, tweak the 400 value in const scale = 400 / maxSide;

const maxSide = Math.max(img.width, img.height);
const scale = 400 / maxSide;
const canvasWidth = img.width * scale;
const canvasHeight = img.height * scale;
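
The scaling arithmetic can be checked on its own, without a canvas. The sketch below uses a hypothetical helper, fitWithin, that mirrors the four lines above; rounding to whole pixels is an addition of this sketch, since canvas dimensions are integers.

```javascript
// Hypothetical helper mirroring the scaling lines in drawImageOnCanvas.
// The longest side becomes maxSide and the aspect ratio is preserved.
function fitWithin(width, height, maxSide = 400) {
  const scale = maxSide / Math.max(width, height);
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}

console.log(fitWithin(3000, 2400)); // { width: 400, height: 320 }
console.log(fitWithin(1200, 1600, 800)); // { width: 600, height: 800 }
```

Note that an image smaller than maxSide is scaled up by the same formula, so a very small source image will look blurry on the canvas.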

The following call draws the scaled image onto the canvas, which is what lets us read it back as pixel data for manipulation:

ctx.drawImage(img, 0, 0, canvasWidth, canvasHeight);

It takes as parameters the image, the destination x and y, and the destination width and height.

Let's look at the pixel data.

Pixel data

To get the pixel data from the canvas we use getImageData:

let pixelData = ctx.getImageData(0, 0, canvasWidth, canvasHeight);

which returns an ImageData object with a data property, a one-dimensional Uint8ClampedArray containing the pixel data:

pixelData.data;

A pixel is made up of four values, RGBA (red, green, blue and alpha), each from 0 to 255, meaning that when we process the array we stride by 4.

In simple terms, the data only makes sense if we process the values four at a time:

[Figure: image data array drawing]

The array is a repetition of rgba values. We will talk about data representation in the upcoming ML in JavaScript series.

For example, to read the first pixel:

const r = pixelData.data[0];
const g = pixelData.data[1];
const b = pixelData.data[2];
const a = pixelData.data[3];

For the second pixel we move forward by four, and so on.
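
The same stride lets us jump to any pixel by its (x, y) position, because the data is stored row by row, left to right and top to bottom. The helpers pixelIndex and getPixel below are hypothetical names for this sketch, and the 2 x 2 array stands in for what getImageData would return.

```javascript
// Index of the red value of the pixel at (x, y) in row-major RGBA data
function pixelIndex(x, y, width) {
  return (y * width + x) * 4;
}

// Read the four values of one pixel
function getPixel(data, x, y, width) {
  const i = pixelIndex(x, y, width);
  return { r: data[i], g: data[i + 1], b: data[i + 2], a: data[i + 3] };
}

// A 2x2 image: red and green on the first row, blue and white on the second
const data = new Uint8ClampedArray([
  255, 0, 0, 255,   0, 255, 0, 255,
  0, 0, 255, 255,   255, 255, 255, 255,
]);

console.log(pixelIndex(1, 0, 2)); // 4
console.log(getPixel(data, 0, 1, 2)); // { r: 0, g: 0, b: 255, a: 255 }
```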

Let's create a grayscale function:

/**
 *
 * @param {CanvasRenderingContext2D} ctx
 */
export function grayScale(ctx, canvasWidth, canvasHeight) {
  let toEditImage = ctx.getImageData(0, 0, canvasWidth, canvasHeight);

  for (let i = 0; i < toEditImage.data.length; i += 4) {
    const avg =
      (toEditImage.data[i] +
        toEditImage.data[i + 1] +
        toEditImage.data[i + 2]) /
      3;

    toEditImage.data[i] =
      toEditImage.data[i + 1] =
      toEditImage.data[i + 2] =
        avg;

    toEditImage.data[i + 3] = toEditImage.data[i + 3];
  }

  ctx.putImageData(toEditImage, 0, 0);
}

To get the grayscale value of a pixel, we take the average of its red, green and blue values, which gives us a single intensity for that pixel:

// getting the average
const avg =
  (toEditImage.data[i] + toEditImage.data[i + 1] + toEditImage.data[i + 2]) / 3;

After we get the average, we assign it to the red, green and blue values while preserving the alpha value:

toEditImage.data[i] = toEditImage.data[i + 1] = toEditImage.data[i + 2] = avg;
toEditImage.data[i + 3] = toEditImage.data[i + 3];

We do this for all the pixels in the image, from top to bottom. When the process is complete, we draw the image data back onto the canvas:

ctx.putImageData(toEditImage, 0, 0);
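
The plain average treats the three channels equally. A common alternative, swapped in here only for comparison, weights them with the Rec. 601 luma coefficients (0.299, 0.587, 0.114), since green contributes more to perceived brightness than blue. The function name grayscaleWeighted is hypothetical, and the sketch works directly on a pixel array so it can run without a canvas.

```javascript
// Luma-weighted grayscale over RGBA data (modifies the array in place).
// This is an alternative to the simple average, not the article's method.
function grayscaleWeighted(data) {
  for (let i = 0; i < data.length; i += 4) {
    const luma = 0.299 * data[i] + 0.587 * data[i + 1] + 0.114 * data[i + 2];
    // Uint8ClampedArray rounds the fractional value to the nearest integer
    data[i] = data[i + 1] = data[i + 2] = luma;
    // alpha at data[i + 3] is left untouched
  }
  return data;
}

const redPixel = new Uint8ClampedArray([255, 0, 0, 255]);
grayscaleWeighted(redPixel);
console.log(Array.from(redPixel)); // [ 76, 76, 76, 255 ]
```

With the plain average, the same pure red pixel would become 85 instead of 76. In the browser you would pass toEditImage.data to the function and then call putImageData as before.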

Let's test it out by importing the grayScale function into app.js.

We will take a programmatic approach for this article, focusing only on functionality; we will worry about the UI in the extension part.

import { drawImageOnCanvas, grayScale } from "./imageEdit.js";

const img = new Image();
img.onload = function () {
  drawImageOnCanvas(img, canvas, ctx);
  grayScale(ctx, canvas.width, canvas.height);
};
img.src = "./imgs/1.jpg";

The image should now appear in shades of gray. Another thing we can do is isolate color channels by turning off the unwanted ones.

By turning off, we mean setting their values to 0. Navigate to imageEdit.js:

export function colorChanel(ctx, canvas) {
  let editedImage = ctx.getImageData(0, 0, canvas.width, canvas.height);

  for (let i = 0; i < editedImage.data.length; i += 4) {
    // editedImage.data[i] = 0
    // editedImage.data[i + 1] = 0
    // turning off blue
    editedImage.data[i + 2] = 0;
  }
  ctx.putImageData(editedImage, 0, 0);
}

The above turns off blue; uncomment either of the other lines to see the difference.

How about putting a tint color on all the pixels:

/**
 *
 * @param {CanvasRenderingContext2D} ctx
 */
export function tint(ctx, tintColor, canvasWidth, canvasHeight) {
  const editedImage = ctx.getImageData(0, 0, canvasWidth, canvasHeight);

  //  tintColor = [255, 0, 0]; // Red tint color

  for (let i = 0; i < editedImage.data.length; i += 4) {
    editedImage.data[i] = editedImage.data[i] + tintColor[0];
    editedImage.data[i + 1] = editedImage.data[i + 1] + tintColor[1];
    editedImage.data[i + 2] = editedImage.data[i + 2] + tintColor[2];
    editedImage.data[i + 3] = editedImage.data[i + 3]; // Preserve alpha value
  }

  ctx.putImageData(editedImage, 0, 0);
}
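
Adding a tint can push a value past 255. That is safe here because the pixel data is a Uint8ClampedArray, which clamps out-of-range values instead of wrapping around. The sketch below shows this on a single pixel; tintPixels is a hypothetical name for a canvas-free version of the loop above.

```javascript
// Same arithmetic as tint(), applied directly to RGBA data
function tintPixels(data, tintColor) {
  for (let i = 0; i < data.length; i += 4) {
    data[i] += tintColor[0];
    data[i + 1] += tintColor[1];
    data[i + 2] += tintColor[2];
    // alpha at data[i + 3] is left untouched
  }
  return data;
}

const pixel = new Uint8ClampedArray([250, 10, 0, 128]);
tintPixels(pixel, [32, 38, 51]);
console.log(Array.from(pixel)); // [ 255, 48, 51, 128 ]

// For contrast, a plain Uint8Array wraps around instead of clamping
const wrapped = new Uint8Array([250]);
wrapped[0] += 32;
console.log(wrapped[0]); // 26
```

The red value 250 + 32 = 282 is stored as 255, so a strong tint saturates a channel rather than producing random-looking colors.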

We could keep going; there is a lot we can do with pixel data: rotate, scale, compress etc.

These examples are enough for our purpose. To test them, import the tint and color channel functions in app.js:

import {
  drawImageOnCanvas,
  grayScale,
  tint,
  colorChanel,
} from "./imageEdit.js";

Testing tint:

img.onload = function () {
  drawImageOnCanvas(img, canvas, ctx);
  // grayScale(ctx, canvas.width, canvas.height)
  tint(ctx, [32, 38, 51], canvas.width, canvas.height);
};

Isolating color channels:

img.onload = function () {
  drawImageOnCanvas(img, canvas, ctx);
  // grayScale(ctx, canvas.width, canvas.height)
  // tint(ctx, [32, 38, 51], canvas.width, canvas.height)
  colorChanel(ctx, canvas);
};

You can modify colorChanel to accept a parameter and use a switch statement to decide which channels to isolate.
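
One possible shape for that change is sketched below. It uses a small lookup table rather than a switch statement, which keeps the loop short; isolateChannel and CHANNEL_OFFSET are hypothetical names, and the function takes the pixel array directly so it can run without a canvas.

```javascript
// Offset of each color within a pixel's four values
const CHANNEL_OFFSET = { red: 0, green: 1, blue: 2 };

// Keep one channel and turn the other two off (modifies the array in place)
function isolateChannel(data, channel) {
  const keep = CHANNEL_OFFSET[channel];
  if (keep === undefined) {
    throw new Error(`Unknown channel: ${channel}`);
  }
  for (let i = 0; i < data.length; i += 4) {
    for (let offset = 0; offset < 3; offset++) {
      if (offset !== keep) data[i + offset] = 0;
    }
    // alpha at data[i + 3] is left untouched
  }
  return data;
}

const sample = new Uint8ClampedArray([10, 20, 30, 255]);
isolateChannel(sample, "green");
console.log(Array.from(sample)); // [ 0, 20, 0, 255 ]
```

In the browser you would pass editedImage.data to the function and then call putImageData as in colorChanel.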

After all this is done, we need a way to save our modified image as an actual image file (PNG, JPEG etc).

Saving the Image

The canvas provides a convenient method for that:

canvas.toDataURL("image/jpeg", 1.0);

The first parameter is the image type and the second the quality, a number between 0 and 1 that applies to lossy formats such as JPEG, 1.0 being the highest.
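
The method returns a string of the form data:image/jpeg;base64,... and, if the browser does not support the requested type, it falls back to PNG. A minimal sketch for inspecting such a string is shown below; describeDataUrl is a hypothetical helper, and the short input is a made-up payload rather than a real image.

```javascript
// Extract the MIME type and the decoded size in bytes from a base64 data URL
function describeDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) throw new Error("Not a base64 data URL");
  const [, mimeType, payload] = match;
  // every 4 base64 characters encode 3 bytes, minus the padding
  const padding = payload.endsWith("==") ? 2 : payload.endsWith("=") ? 1 : 0;
  return { mimeType, bytes: (payload.length / 4) * 3 - padding };
}

console.log(describeDataUrl("data:image/png;base64,AAAA"));
// { mimeType: 'image/png', bytes: 3 }
```

Checking the MIME type is a quick way to see whether you actually got a JPEG back, and the byte count gives a rough idea of how the quality setting affects file size.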

Let's replace the preview image with the edited one:

const img = new Image();
img.onload = function () {
  // resizeImgtoCanvas(canvas, ctx, img)
  drawImageOnCanvas(img, canvas, ctx);
  // grayScale(ctx, canvas.width, canvas.height)
  colorChanel(ctx, canvas);
  // tint(ctx, [32, 38, 51], canvas.width, canvas.height)
  setTimeout(() => {
    const i = document.getElementById("i");

    i.src = canvas.toDataURL("image/jpeg", 1.0);
  }, 2000);
};
img.src = "./imgs/1.jpg";

After 2 seconds the edited image should replace the preview.

We could add more functionality like scale and rotate, which is more complex but easy to look up. To avoid an overly long post, we will pause here until we pick this up for the OCR Chrome extension.

In this article we looked at image manipulation using the canvas, which allows us to turn an image into pixel data, in preparation for the combined image editor and OCR Chrome extension.

Thanks for reading, please let me know your thoughts and any ideas or questions in the comments. Oh and don't forget to give this article a ❤ and a 🦄, it really does help and is appreciated!

You can connect with me on Twitter, I am new!

And if you like concise content, note that I will be posting fuller articles and projects on my site skdev, as they do not make good blogging content.

Be sure to check it out!

Articles on Machine Learning, Desktop development, Backend, Tools etc for the web with JavaScript, Golang and Python.
