GIFs, or JIFs, are one of those strange oddities of the internet. The format is slow, inefficient, and from the 80s (!). Did we even have color screens then?
Yet today, they're everywhere. But if you want to work with this format on the web (rather than just displaying it in an <img>), the best practice is to read a GIF's raw bytes and parse it using JavaScript.
This is admittedly esoteric, but pure JavaScript is slow compared to doing it in native C, as there's just a lot of low-level byte manipulation and work to do. How do we get the best of both worlds? WebAssembly, or WASM. In this post, I'll show you how to parse GIFs directly in the browser using WASM, which gives about a 2x speed increase over JS only.
To do this, we'll use Wuffs: a library for Wrangling Untrusted File Formats Safely. It generates modern and provably safe C code for dealing with multimedia or other encoded file formats. You can read more about it here.
The Demo
Here's fastgif, a library to decode GIFs based on Wuffs and WASM, on the left; and on the right, we use a popular JS-only GIF library. The time to parse the GIF is the key number, and Wuffs on a modern MacBook is typically about 2x as fast as the JS-only version.
(Of course, you might not see such amazing results, but fastgif is faster in most environments. You can also try another browser! 🤞😅)
Take It Home Today
If you'd like to use fastgif in your site, check it out here. The library and demo work in all modern, evergreen browsers: Edge, Safari, Firefox and Chrome. Neat! 🤖📸
The Numbers
nb: this is a log graph. Smaller is better.
Some thoughts:
- On the web, fastgif is almost exactly the same speed as a native, unoptimized binary of Wuffs (the C program running from a command line).
- When we use -O3 to compile the native binary, it speeds up enormously; those optimizations seemingly don't apply to WASM.
- The second example, with just a single frame, is very fast for the native case: this hints that moving lots of bytes around is costly for the web.
- There's also a bit of startup/parse time that's encapsulated in the web-based approaches. Repeated decodes would probably be faster.
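If you want to check the "repeated decodes" hunch yourself, a rough harness might look like the following. This is only a sketch: fakeDecode is a made-up stand-in for whichever decoder you're measuring, and timeDecodes is a hypothetical helper, not part of fastgif or Wuffs.

```javascript
// Hypothetical stand-in for a decoder: any function that takes bytes works.
function fakeDecode(bytes) {
  let sum = 0;
  for (let i = 0; i < bytes.length; i++) {
    sum = (sum + bytes[i]) >>> 0;
  }
  return sum;
}

// Call `decode` several times, reporting the first call separately from
// the average of the remaining calls (all values in milliseconds).
function timeDecodes(decode, bytes, runs = 5) {
  const times = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    decode(bytes);
    times.push(performance.now() - start);
  }
  const [first, ...rest] = times;
  const restAverage = rest.length
    ? rest.reduce((a, b) => a + b, 0) / rest.length
    : NaN;
  return {first, restAverage, times};
}

// 1 MiB of placeholder data, not a real GIF.
const sample = new Uint8Array(1 << 20).fill(7);
const result = timeDecodes(fakeDecode, sample, 5);
console.log('first:', result.first.toFixed(2), 'ms');
console.log('rest (avg):', result.restAverage.toFixed(2), 'ms');
```

Separating the first run from the rest is the point here: any one-off startup or compile cost lands in the first number only.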
Nonetheless, fastgif is probably the fastest way to decode GIFs on the web. As an aside: this sort of work is best suited for a Worker, off the main thread, but that's out of scope for this post.
Want To Know More?
If you'd like to learn more about how fastgif takes Wuffs and makes it speedy ⚡💨 for the web, read on!
The rest of this post is for those folks who:
- Have a good understanding of JavaScript
- Have a basic knowledge of C and the command-line
- Want to port any native library—not necessarily just Wuffs—to the web
Step Zero: fetch Emscripten
While you can write WASM files by hand or with other tools, the main toolchain to build WebAssembly is Emscripten. You can follow its install instructions here.
Once you're set up, or whenever you want to resume coding, be sure to source the emsdk_env.sh for your platform:
# Linux/macOS
source ~/Desktop/path/to/emsdk/emsdk_env.sh
# Windows
C:\path\to\emsdk\emsdk_env.bat # or .ps1 for PowerShell
Emscripten needs certain versions of Node and Clang, and some environment variables to be set. Because the script only sets these up for your current shell, you can set up Emscripten without installing it as root, as it won't replace anything on your system.
Step One: build a naïve demo
Wuffs, like many other native C libraries, has some demo applications: those with an entry point of int main(). These read input from the command line and write output to the shell. This doesn't map to what we want to do on the web, but it's a good place to start.
In Wuffs' case, there are a few examples under example/. I started by trying to just compile the GIF player, which normally outputs ASCII art to the terminal.
git clone git@github.com:google/wuffs.git
cd wuffs/
source ~/Desktop/emsdk/emsdk_env.sh
# finally, compile:
emcc -s WASM=1 -o gifplayer.html example/gifplayer/gifplayer.c
# and run a quick webserver of choice, for example one of these
# (the URL used below assumes port 5000):
python3 -m http.server 5000
serve
This will generate gifplayer.html, gifplayer.js and gifplayer.wasm. This is Emscripten being helpful, mostly for debugging and getting started. Now, if you open up http://localhost:5000/gifplayer.html in your browser, you'll see:
... actually, we see an error message. 💥
Some quick Googling 🔍 later, and it looks like we need to allow the program more memory. Let's recompile:
emcc -s WASM=1 -s TOTAL_MEMORY=128MB -o gifplayer.html example/gifplayer/gifplayer.c
Great! Open now, and we'll see this:
When you ask for input using e.g. scanf or reading from the command line, Emscripten will by default use the JavaScript method prompt() to ask for data. Try pasting in some text: the browser will continue prompting until you Cancel the input, which counts as an EOF.
Unfortunately, even copying and pasting raw GIF bytes into this form will do... nothing. If you do this and read the browser's console, you'll see a message about "gif failed parsing header".
So the code is running, yay 🎉! But since we can't give it real bytes, only a JavaScript string, nothing happens 🙅.
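To see why a string is a poor container for binary data, here's a small sketch. It assumes the string gets encoded as UTF-8 somewhere on its way to the program, which may not be exactly what happened with the prompt() above, and the two trailing bytes are made up for illustration.

```javascript
// The ASCII signature "GIF89a", followed by two made-up bytes above 0x7F
// standing in for binary image data.
const original = new Uint8Array([0x47, 0x49, 0x46, 0x38, 0x39, 0x61, 0x80, 0xff]);

// Pretend the bytes were pasted as text: one character per byte.
const pasted = String.fromCharCode(...original);

// Encode that text as UTF-8 (an assumption about the string-to-bytes step).
const roundTripped = new TextEncoder().encode(pasted);

console.log(original.length);     // 8
console.log(roundTripped.length); // 10: 0x80 and 0xff each became two bytes
```

The ASCII part survives, but any byte of 0x80 or above turns into two bytes, so the data the decoder sees is no longer the file we started with.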
Step Two: Send raw bytes to the GIF decoder
As gifplayer.c has an int main() method in C, it's going to run something (waiting for input) when we start it. While it's a good place to start, I want to be able to pass it raw bytes, perhaps from a window.fetch or AJAX request.
Let's modify it to do so.
a. Remove main method
I can open up the gifplayer.c file and modify it, removing the main and fail methods just by commenting them out:
/*
int fail(const char* msg) {
...
}
int main(int argc, char** argv) {
...
}
*/
b. Replace reading from input, with accepting a passed buffer
Let's comment out the read_stdin method, and add a read_buffer method to replace it.
/*
const char* read_stdin() {
while (src_len < SRC_BUFFER_SIZE) {
...
}
return "input is too large";
}
*/
// add this method
const char* read_buffer(uint8_t *buf, size_t len) {
src_buffer = buf;
src_len = len;
return NULL;
}
However, src_buffer used to be a fixed-size buffer based on SRC_BUFFER_SIZE. This bounded the amount of data read from the command line. Instead, we're going to accept a pointer to somewhere in memory. Let's update the declaration of src_buffer:
// don't need a fixed buffer, now just a pointer
//uint8_t src_buffer[SRC_BUFFER_SIZE] = {0};
uint8_t *src_buffer;
size_t src_len = 0;
c. Recompile and pass data
Now, let's recompile and pass in data via the read_buffer method directly in our JavaScript console. To expose a method to JS properly, we need to indicate it in our compile pass. We'll also need the play method to display the output.
emcc -s WASM=1 -s TOTAL_MEMORY=128MB \
-s EXPORTED_FUNCTIONS="['_read_buffer','_play']" \
-o gifplayer.html \
example/gifplayer/gifplayer.c
Ok, so now let's re-open the Emscripten window. Nothing happens and no relevant error messages appear in the console, because we're no longer doing anything in main().
However, we can run the Module._read_buffer method to load our buffer in. By default, the Emscripten toolchain exposes these methods on the global Module. It also exposes the C standard library: methods like malloc and free.
So to test, there are a few steps. We need to get the source of the image, malloc some memory for it, copy the bytes in, pass that to read_buffer, and then play it. The following code does just that, so paste it into the JavaScript console:
// the following base64 string is really long- just copy and paste this whole section
// it's the base64 encoded version of:
// https://raw.githubusercontent.com/google/wuffs/master/test/data/muybridge.gif
const testGifB64 = "";
const buf = Uint8Array.from(atob(testGifB64), (c) => c.charCodeAt(0));
const at = Module._malloc(buf.length);
Module.HEAP8.set(buf, at);
Module._read_buffer(at, buf.length);
Module._play();
With any luck, you should see a decoded, moving ASCII art horse 🐎💨 playing in your console:
Congrats! At this point, I rewarded myself with a donut. You should do the same. 🍩
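If the malloc-and-copy dance above felt opaque, here's the same idea in isolation, runnable without Emscripten. Everything here is a stand-in for illustration: heap plays the role of Module.HEAPU8, and fakeMalloc is a toy bump allocator, not how Emscripten's real malloc works.

```javascript
// Stand-ins for Emscripten's runtime (assumptions for illustration only).
const heap = new Uint8Array(1024);
let nextFree = 8; // keep address 0 unused so it can mean NULL, as in C
function fakeMalloc(size) {
  if (nextFree + size > heap.length) {
    throw new Error('out of memory');
  }
  const address = nextFree;
  nextFree += size;
  return address;
}

// "R0lGODlh" is the base64 encoding of the six-byte signature "GIF89a".
const bytes = Uint8Array.from(atob('R0lGODlh'), (c) => c.charCodeAt(0));

// Reserve space, then copy the bytes to that address. A "pointer" is just
// an integer offset into the heap.
const at = fakeMalloc(bytes.length);
heap.set(bytes, at);

// What the C side would see when reading `len` bytes starting at `at`.
const seen = heap.subarray(at, at + bytes.length);
console.log(at, String.fromCharCode(...seen)); // 8 GIF89a
```

The takeaway is that C only ever receives two numbers, an address and a length; the actual bytes have to already be sitting in the shared memory before the call.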
Step Three: Don't display to the screen
Right now, we just play the ASCII art to the console. Emscripten even implements the usleep method with a busy loop: the GIF player, which pauses between frames, actually just loops to block the browser from proceeding. Yuck!
Instead of this, let's get access to the actual frames. First, inside the play() method, comment out anything to do with sleep and displaying to the screen:
#ifdef _POSIX_TIMERS
/*
if (started) {
...
}
*/
#endif
//ignore_return_value(write(stdout_fd, print_buffer, n));
Now, we want to use the EM_ASM_ macro, which lets us call JavaScript inline inside C code. This is pure magic: it takes the code inside the macro and puts it in Emscripten's output JS helper. So the rest of the method should now be:
// .. continued from above
cumulative_delay_micros +=
(1000 * wuffs_base__image_buffer__duration(&ib)) /
WUFFS_BASE__FLICKS_PER_MILLISECOND;
// .. add this bit
EM_ASM_({
onframe($0, $1, $2);
}, print_buffer, n, cumulative_delay_micros);
// TODO: should a zero duration mean to show this frame forever?
}
return NULL;
}
Finally, be sure to add the Emscripten header at the top of the file, as now we're using new macros:
#include <emscripten.h>
You can now recompile using the same command as before:
emcc -s WASM=1 -s TOTAL_MEMORY=128MB \
-s EXPORTED_FUNCTIONS="['_read_buffer','_play']" \
-o gifplayer.html \
example/gifplayer/gifplayer.c
And reload the page. Now, run our helper blob again, but this time provide an onframe method:
const frames = [];
const decoder = new TextDecoder();
function onframe(buf, len, micros) {
const s = decoder.decode(Module.HEAP8.slice(buf, buf + len));
frames.push({s, ms: micros / 1000});
}
const testGifB64 = "";
const buf = Uint8Array.from(atob(testGifB64), (c) => c.charCodeAt(0));
const at = Module._malloc(buf.length);
Module.HEAP8.set(buf, at);
Module._read_buffer(at, buf.length);
Module._play();
Once you've run the above code, be sure to log the frames object, like this.
Phew! Now, you could modify them, print them out, or use them at your leisure. This definitely isn't perfect, because you need to add onframe to your global scope (!). But Emscripten's helper libraries are already pretty bad at this: they're already adding Module to your window.
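One thing to watch if you do play the frames back yourself: the C code passes cumulative_delay_micros, so each frame's ms is a running total rather than its own duration. A small sketch of converting between the two, assuming ms marks the time at which each frame ends (frameDurations and the sample data are made up for illustration):

```javascript
// Convert cumulative end times (as pushed by an onframe handler like the
// one above) into per-frame display durations, in milliseconds.
function frameDurations(frames) {
  let previous = 0;
  return frames.map((frame) => {
    const duration = frame.ms - previous;
    previous = frame.ms;
    return duration;
  });
}

// Hypothetical frames, not real decoder output.
const exampleFrames = [
  {s: 'frame 0', ms: 100},
  {s: 'frame 1', ms: 200},
  {s: 'frame 2', ms: 350},
];
console.log(frameDurations(exampleFrames)); // [ 100, 100, 150 ]
```

These durations could then feed setTimeout or requestAnimationFrame, which keeps the browser responsive instead of busy-looping like usleep.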
Step Four: Get the image bytes
The demo we've been fixing actually just renders ASCII art to the console. Let's simplify the code and just get the raw pixel bytes of each frame.
Let's update gifplayer.c to pass more, varying arguments to the onframe JavaScript method:
// update the forward declaration at top
extern void onframe(uint32_t *, uint32_t, uint32_t, uint32_t);
// -- removed for brevity --
// update EM_ASM_
EM_ASM_({
onframe($0, $1, $2, $3);
}, dst_buffer, width, height, cumulative_delay_micros);
Now, recompile the code using the same command as before (just hit ⬆️ in your terminal).
In your browser, reload and run this snippet of code to generate ImageData instances, enough to render for us:
const frames = [];
function onframe(ptr, width, height, micros) {
  const len = width * height;
  // ptr is a byte address, but HEAPU32 is indexed in 4-byte words
  const start = ptr >> 2;
  // slice() copies the pixels out of the WASM heap
  const pixels = Module.HEAPU32.slice(start, start + len);
  const clamped = new Uint8ClampedArray(pixels.buffer);
  const imageData = new ImageData(clamped, width, height);
  frames.push({imageData, ms: micros / 1000});
}
const testGifB64 = "";
const buf = Uint8Array.from(atob(testGifB64), (c) => c.charCodeAt(0));
const at = Module._malloc(buf.length);
Module.HEAP8.set(buf, at);
Module._read_buffer(at, buf.length);
Module._play();
Finally, run this code to display frames[0] to the screen:
const canvas = document.createElement('canvas');
document.body.appendChild(canvas);
const context = canvas.getContext('2d');
context.putImageData(frames[0].imageData, 0, 0);
Congratulations! You made an image appear! 🖼️🎉
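One detail that's easy to trip over when pulling pixels out of the heap: a pointer from C is a byte address, while a Uint32Array view is indexed in 4-byte words. Here's a sketch of that conversion using a plain ArrayBuffer as a stand-in for WASM memory. The 2x1 "image" and its RGBA byte order are made up for illustration; the byte order of the real decoder's output may differ.

```javascript
// A stand-in for WASM memory: one ArrayBuffer viewed both as bytes and as
// 32-bit words, similar to Emscripten's HEAPU8 and HEAPU32.
const memory = new ArrayBuffer(64);
const heapU8 = new Uint8Array(memory);
const heapU32 = new Uint32Array(memory);

// Pretend C wrote a 2x1 image (one red pixel, one blue) at byte address 16.
const ptr = 16;
heapU8.set([255, 0, 0, 255, 0, 0, 255, 255], ptr);

const width = 2;
const height = 1;
const pixelCount = width * height;

// Divide the byte address by 4 to get an index into the 32-bit view.
const start = ptr >> 2;
// slice() copies, so later writes to the heap can't change this frame.
const pixels = heapU32.slice(start, start + pixelCount);
const rgba = new Uint8ClampedArray(pixels.buffer);

console.log(rgba.join(',')); // 255,0,0,255,0,0,255,255
```

In a browser, rgba could be handed to new ImageData(rgba, width, height). The copy matters whenever the C side writes each new frame into the same buffer.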
There are still, obviously, a few stepping stones between what we've just done and a usable library like the fastgif you saw earlier.
One challenge is that Emscripten's helper layer is quite enormous (~100kb of JS, more HTML), and is really designed for monolithic programs, where all your code is in C, rather than for wrapping up a single library.
If you'd like to read more about Emscripten and how to avoid using its helper layer, its "runtime", check out my follow-up post here. It's way more technical than this post, and that's saying something—some knowledge of C required.
Done
I hope you've learned either:
a) that WASM is cool, and can speed up traditionally complex computational tasks like GIF decoding; and/or
b) that it's not too hard to port a native C library, albeit in a very basic way, to the Web.
Thanks for reading! ✨