We previously implemented image convolution in JS, Golang WebAssembly, and Rust WebAssembly, which showed that WebAssembly does perform better. When dealing with large data such as images, though, another common practice is to split the data into blocks and process them in parallel. This is where Web Workers come in.

The method is simple: on the main thread, we divide the image into several blocks, send each block to a Web Worker for processing, and then join the processed results back together. However, copying large amounts of data between threads is inefficient, so we use SharedArrayBuffer to share memory between the main thread and each Web Worker. That way we only need to pass a reference to the SharedArrayBuffer:

// main.js
const worker = new Worker('...')
// Assumption: imageData is the Uint8ClampedArray of RGBA bytes, e.g. ctx.getImageData(...).data
const sharedArrayBuffer = new SharedArrayBuffer(imageData.buffer.byteLength)
new Uint8ClampedArray(sharedArrayBuffer).set(imageData)

worker.postMessage({start, end, width, sharedArrayBuffer})

// webworker.js
onmessage = async (e: MessageEvent) => {
  const {
    data: {sharedArrayBuffer, start, end, width},
  } = e
  const uint8ClampedArray = new Uint8ClampedArray(sharedArrayBuffer)

  // Write through the view to update the shared memory in place
  for (let i = start; i < end; i++) {
    for (let j = 0; j < width; j++) {
      // Each pixel occupies 4 bytes (RGBA), so scale the pixel index by 4
      const offset = (i * width + j) * 4
      uint8ClampedArray[offset] = 100 // Red
      uint8ClampedArray[offset + 1] = 100 // Green
      uint8ClampedArray[offset + 2] = 100 // Blue
    }
  }
}

In the above code, start and end are the first row (inclusive) and the last row (exclusive) that the current Web Worker needs to process, and width is the width of the Canvas in pixels. It can be explained like this:
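As a minimal sketch of how the main thread could compute these row ranges, the helper below splits the image height into contiguous blocks, one per worker. The name `splitRows` and the `RowBlock` type are hypothetical and not part of the original code; the only assumption is that `end` is exclusive, matching the `i < end` loop in the worker.

```typescript
// Hypothetical type: a block of rows, where `end` is exclusive.
interface RowBlock {
  start: number
  end: number
}

// Hypothetical helper: split `height` rows into at most `workerCount`
// contiguous blocks of (almost) equal size.
function splitRows(height: number, workerCount: number): RowBlock[] {
  const blocks: RowBlock[] = []
  const rowsPerBlock = Math.ceil(height / Math.max(1, workerCount))
  for (let start = 0; start < height; start += rowsPerBlock) {
    // The last block may be shorter when height is not evenly divisible.
    blocks.push({ start, end: Math.min(start + rowsPerBlock, height) })
  }
  return blocks
}
```

Each block could then be sent to its own worker, for example with `worker.postMessage({...block, width, sharedArrayBuffer})`.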

After increasing the width and height of the Canvas, the performance is significantly better than the previous JS version:

At this point, you must be thinking: if we use WebAssembly inside the Web Worker, won't the result be even better?

Let's try it out. Inside the Web Worker, we just need to copy the block's data into WASM memory, run the filter, and then update the block in the SharedArrayBuffer from the modified memory.buffer:

onmessage = async (e: MessageEvent) => {
  const {
    data: {sharedArrayBuffer, start, end, width},
  } = e
  const ptr = return_pointer()
  const uint8ClampedArrayForMemBuf = new Uint8ClampedArray(memory.buffer)
  const uint8ClampedArrayForSharedBuf = new Uint8ClampedArray(sharedArrayBuffer)
  const blockByteLength = (end - start) * width * 4
  // Sync the block data to WASM, starting at the address returned by return_pointer()
  uint8ClampedArrayForMemBuf.set(
    uint8ClampedArrayForSharedBuf.slice(start * width * 4, end * width * 4),
    ptr
  )
  // Image convolution
  filter_shared_mem(
    ptr,
    width,
    end - start,
    new Float32Array([].concat(...kernel))
  )
  // Update the block in SharedArrayBuffer using modified memory.buffer.
  // slice() takes an end index, not a length, so the end is ptr + blockByteLength.
  uint8ClampedArrayForSharedBuf.set(
    new Uint8ClampedArray(memory.buffer).slice(ptr, ptr + blockByteLength),
    start * width * 4
  )
}

Note that when you synchronize data from the SharedArrayBuffer to memory.buffer, or update the SharedArrayBuffer from the modified memory.buffer, you only need to process the block that belongs to the current Web Worker.
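The block-only copy can be sketched with plain typed arrays, independent of WASM. The helper names `blockByteRange`, `copyBlockIn`, and `copyBlockOut` are hypothetical; `wasmView` stands in for a view over memory.buffer and `ptr` for the address where the block is placed. This is one possible way to write it, assuming 4 bytes per pixel (RGBA).

```typescript
const BYTES_PER_PIXEL = 4 // RGBA

// Hypothetical helper: byte range [from, to) covered by rows start..end (end exclusive).
function blockByteRange(start: number, end: number, width: number): [number, number] {
  return [start * width * BYTES_PER_PIXEL, end * width * BYTES_PER_PIXEL]
}

// Copy only this worker's block from the shared view into the WASM view at `ptr`.
function copyBlockIn(
  shared: Uint8ClampedArray,
  wasmView: Uint8ClampedArray,
  ptr: number,
  start: number,
  end: number,
  width: number
): void {
  const [from, to] = blockByteRange(start, end, width)
  // subarray() creates a view without copying; set() then copies the bytes once.
  wasmView.set(shared.subarray(from, to), ptr)
}

// Copy the processed block from the WASM view back to its place in the shared view.
function copyBlockOut(
  shared: Uint8ClampedArray,
  wasmView: Uint8ClampedArray,
  ptr: number,
  start: number,
  end: number,
  width: number
): void {
  const [from, to] = blockByteRange(start, end, width)
  shared.set(wasmView.subarray(ptr, ptr + (to - from)), from)
}
```

Using subarray() instead of slice() avoids an intermediate copy of the block; the bytes outside the block's range are never touched.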

The result is indeed better:

This approach also solves another memory-related problem. Increasing the width and height of the Canvas increases the size of the image data to be processed. If you select the Rust WebAssembly (Shared Memory) option in the example above, you will get the error "RangeError: offset is out of bounds". The reason is that the image size exceeds the initial Memory size in WebAssembly. (According to MDN, the memory can be expanded by calling grow(), but I got another error after using it.) With Web Workers, each worker only copies its own, much smaller block into WASM memory, so it is far less likely to exceed the initial size.
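For readers who want to try grow() anyway, the arithmetic can be sketched as follows. WebAssembly memory is sized in pages of 64 KiB, and grow() takes a number of pages, not bytes. The helper name `pagesToGrow` is hypothetical, and this is only a sketch of the size calculation, not a fix for the error mentioned above.

```typescript
const WASM_PAGE_SIZE = 64 * 1024 // one WebAssembly page is 64 KiB

// Hypothetical helper: number of extra pages needed so that `requiredBytes`
// fit into a memory that currently holds `currentBytes`.
function pagesToGrow(currentBytes: number, requiredBytes: number): number {
  if (requiredBytes <= currentBytes) return 0
  return Math.ceil((requiredBytes - currentBytes) / WASM_PAGE_SIZE)
}
```

It could be used as `memory.grow(pagesToGrow(memory.buffer.byteLength, ptr + byteLength))`, where `byteLength` is the size of the data to copy. One thing to watch for: after grow() on a non-shared memory, the previous memory.buffer is detached, so any typed array view created before the call has to be recreated from the new memory.buffer. Whether that is related to the error I ran into is not confirmed.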

Use Web Worker and SharedArrayBuffer for Image Convolution