Building C and C++ libraries for WebAssembly

31 Dec 2020

Tags: WebAssembly


One of the main attractions of WebAssembly is that it allows us to leverage the vast library of existing open-source C and C++ code in our web applications. This post aims to give some general advice on building C/C++ libraries for WebAssembly and consuming them from JavaScript. It will demonstrate this via a concrete case study using the popular image processing library, ImageMagick.

Introducing Emscripten: C/C++ toolchain for the browser

A web browser is like any other platform: to build C/C++ software for it, we require a C/C++ toolchain that can target it. Emscripten is such a toolchain. It includes a cross-compiler that emits WebAssembly binaries, some convenient wrappers around common buildsystem tooling, and a C/C++ runtime for the browser.

Follow the Emscripten documentation for instructions on how to install the SDK. The rest of this post will assume that you have activated the latest Emscripten SDK in your PATH via source ./emsdk_env.sh (or emsdk_env.bat on Windows).

Building your library

Overview

Our goal is to produce a WebAssembly module (typically denoted by the .wasm file extension) that contains (and exports) all of the functionality from the library that we want to consume from JavaScript. Preparing this module will typically happen in two stages.

First we must cross-compile the library to WebAssembly. This will involve using the Emscripten buildsystem wrappers but otherwise is much the same as building the library natively, plus or minus some configuration changes. Unfortunately this doesn’t (usually) output a .wasm module that we can immediately ship (just like building a native static library doesn’t output anything we can run directly).

Second, we must link together a final WebAssembly module. This is done by either exporting functions directly from the library, or by writing a custom wrapper and exporting the wrapper’s functions. In either case the Emscripten compiler will package everything up into a shippable WebAssembly module, ready to be consumed from JavaScript.

Unless the datatypes that flow across the library’s interface are very simple, you will likely want to go down the wrapper route. Although Emscripten provides the ability to interoperate with native concepts like pointers and memory management, they are fiddly to use. With a wrapper we can expose our own higher-level interface to JavaScript that is easier to consume and hides some of the messier details. This choice is partly subjective, however.
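To make the wrapper idea concrete, here is a minimal sketch. The Canvas type and the canvas_* functions are hypothetical stand-ins for a real library's pointer-based interface, and filled_canvas_sum is an invented wrapper; none of them come from a real library. The wrapper lets only plain integers cross the boundary, so JavaScript never has to allocate or free anything itself:

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical library interface: callers must manage a Canvas pointer.
struct Canvas {
    int width;
    int height;
    unsigned char *pixels;  // width * height bytes, owned by the canvas
};

Canvas *canvas_create(int width, int height) {
    if (width <= 0 || height <= 0) return nullptr;
    Canvas *canvas = static_cast<Canvas *>(std::malloc(sizeof(Canvas)));
    if (canvas == nullptr) return nullptr;
    canvas->width = width;
    canvas->height = height;
    canvas->pixels = static_cast<unsigned char *>(
        std::calloc(static_cast<std::size_t>(width) * height, 1));
    if (canvas->pixels == nullptr) {
        std::free(canvas);
        return nullptr;
    }
    return canvas;
}

void canvas_fill(Canvas *canvas, unsigned char value) {
    std::size_t count = static_cast<std::size_t>(canvas->width) * canvas->height;
    for (std::size_t i = 0; i < count; ++i) canvas->pixels[i] = value;
}

int canvas_sum(const Canvas *canvas) {
    std::size_t count = static_cast<std::size_t>(canvas->width) * canvas->height;
    int sum = 0;
    for (std::size_t i = 0; i < count; ++i) sum += canvas->pixels[i];
    return sum;
}

void canvas_destroy(Canvas *canvas) {
    std::free(canvas->pixels);
    std::free(canvas);
}

// Hypothetical wrapper: creates, uses and destroys the Canvas internally,
// so the pointer never leaves the C++ side.
extern "C" int filled_canvas_sum(int width, int height, int value) {
    Canvas *canvas = canvas_create(width, height);
    if (canvas == nullptr) return -1;  // signal failure with a plain integer
    canvas_fill(canvas, static_cast<unsigned char>(value));
    int sum = canvas_sum(canvas);
    canvas_destroy(canvas);
    return sum;
}
```

In a setup like this, only the wrapper function would need to be exported; the library functions stay reachable through it.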

The following sections are written for a spherical cow example. As every C/C++ library has its own buildsystem and dependency nuances, you will likely have to tweak these steps. The library’s own documentation should be your best friend.

Cross-compilation

If the library has an autotools buildsystem, then we’ll use Emscripten’s wrapper around configure, named emconfigure:

emconfigure ./configure <OPTIONS> --prefix="$(pwd)/install"

Otherwise if the library uses CMake, we’ll similarly use Emscripten’s CMake wrapper, named emcmake:

emcmake cmake <OPTIONS> -DCMAKE_INSTALL_PREFIX="$(pwd)/install"

Note:

  1. We set the prefix for an in-tree install to prevent us from accidentally deploying WebAssembly binaries to our default system path.
  2. Dynamic linking is still a little clunky in WebAssembly, so build a static library if at all possible. To disable dynamic linking you’ll have to consult the build configuration options for your particular library. For autotools projects this will often be something like passing the following options to configure: --enable-shared=no --enable-static=yes.
  3. The Emscripten runtime technically supports pthreads via Web Workers, however there are some caveats due to the asynchronous nature of the browser environment. It’s recommended to disable threads to begin with, and tentatively enable them again once you have the single-threaded library working. Also threads and memory growth together are very slow (open issue). To disable threads consult the configure options for your chosen library. For autotools projects this will often be something like passing the following option to configure: --without-threads.

Once configured, we can then use Emscripten’s wrapper around make (named—you guessed it—emmake), and install, as usual:

emmake make
make install

Linking

Identify the functions that you want to export and thereby make accessible to JavaScript. These might be in your wrapper C/C++ module, if you’ve decided to go that route.

Note that C++ name mangling might make function export tricky, therefore it’s recommended to either write your wrapper functions in C, or declare them extern "C".
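As a small illustration (the function names are made up for this sketch), the first function below gets a mangled symbol, while the second keeps its plain C name and could therefore be exported as _scale:

```cpp
// Plain C++ linkage: the compiler mangles the symbol name to encode the
// parameter types (for example _Z3addii under the Itanium C++ ABI), so the
// name "add" on its own does not identify this function at link time.
int add(int a, int b) { return a + b; }

// extern "C" linkage: the symbol is simply "scale", with no mangling.
// It can still call ordinary C++ code internally.
extern "C" int scale(int value, int factor) {
    int total = 0;
    for (int i = 0; i < factor; ++i) total = add(total, value);
    return total;
}
```

The body of an extern "C" function is still compiled as C++, so only its name and signature need to be C-compatible.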

We then run the Emscripten compiler to link everything together and to specify our exports:

emcc -o <OUTPUT> <INPUT> <OPTIONS> -s EXPORTED_FUNCTIONS='[<FUNCTIONS>]'

Note:

  1. The <OUTPUT> file can be .wasm, in which case only the raw WebAssembly binary is output. You will typically want to specify .js instead, in which case a JavaScript helper module is also output. This helper takes care of fetching and instantiating the WebAssembly module and is what we import into our web application. For debugging purposes you may even want to specify .html, which additionally outputs an HTML page that loads the JavaScript helper module and WebAssembly module and displays console output.
  2. The <INPUT> file(s) may either be your .c or .cpp wrapper, or could be the WebAssembly object files (i.e. static libraries) that were built in the cross-compilation stage.
  3. <OPTIONS> could be whatever options you require for your use case. This might include flags for linking to other dependencies. Emscripten has built-in support for various common C/C++ libraries including the standard libraries, pthreads, SDL, and a subset of OpenGL. However if your library has any unsupported third party dependencies, you’ll have to also build them for WebAssembly, and link them here.
  4. The -s EXPORTED_FUNCTIONS option is a list of the functions to make available to JavaScript. Note that you must include the leading underscore that is a C binary naming convention. For example, if we wanted to export the functions foo and bar then this would take the form -s EXPORTED_FUNCTIONS='["_foo", "_bar"]'. Also note that Emscripten does aggressive tree shaking, so any code that is not ultimately reachable by one of the exported functions will be optimised out of the resultant WebAssembly module.

Consume from web application

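If all goes well, you should now have a .wasm module, and probably also a .js helper module that you can include in your web application. There are many ways to do this, but perhaps the simplest is to load the JavaScript helper module with a script tag and call your functions from the global scope, remembering the leading underscore. Because the WebAssembly module is fetched and instantiated asynchronously, the calls should wait until the runtime is ready, for example via the Module.onRuntimeInitialized callback:

<script>
  var Module = {
    onRuntimeInitialized: function () {
      _foo();
      _bar();
    }
  };
</script>
<script type="text/javascript" src="foobar.js"></script>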

The nuances and alternatives to this approach are beyond the scope of this post. Please see the Emscripten documentation about connecting C++ and JavaScript for more details.

Case study: ImageMagick

The following case study details the steps I took to build ImageMagick for the interactive demos on my GIF test suite post.

Cross-compilation

As ImageMagick uses autotools, configuring and making is straightforward. Again note that we’re building static libraries, have disabled threads, and are installing in-tree:

emconfigure ./configure \
  --enable-shared=no --enable-static=yes --without-threads \
  --prefix="$(pwd)/install"
emmake make
make install

Linking

The target function that we want to call from JavaScript is in MagickWand/mogrify.c. It’s essentially the back-end to the magick CLI app and its signature looks like:

MagickBooleanType MagickCommandGenesis(ImageInfo *image_info, MagickCommand command, int argc,
    char **argv, char **metadata, ExceptionInfo *exception)

We could potentially call this function directly from JavaScript, however we’d need to construct whatever an ImageInfo object is and potentially deal with other messiness. As all we want to do from JavaScript is execute a command to generate a new image from scratch, I decided that a wrapper would make more sense. The wrapper function lives in gif.cpp and looks like:

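#include <stdio.h>

#include "MagickWand/MagickWand.h"

// extern "C" keeps the symbol unmangled so it can be exported as _make_gif.
extern "C" int make_gif(char *commands[], int ncommands) {
    // Initialise the MagickWand environment before any other API call.
    MagickWandGenesis();

    ImageInfo *image = CloneImageInfo(NULL);

    if (image == NULL) {
        // Trailing newline so the message is emitted as a complete line.
        printf("Failed to create image!\n");
        return MagickFalse;
    }

    ExceptionInfo *ex = AcquireExceptionInfo();

    // Run the same back-end as the magick CLI, with commands acting as argv.
    int ret = MagickCommandGenesis(image, ConvertImageCommand, ncommands, commands, NULL, ex);

    // Release the ImageMagick objects allocated above.
    DestroyImageInfo(image);
    DestroyExceptionInfo(ex);

    return ret;
}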

Again note the extern "C" to disable C++ name mangling for this function.

We can now invoke the Emscripten compiler to generate a WebAssembly module from our wrapper and the underlying library:

export PKG_CONFIG_PATH="$(pwd)/install/lib/pkgconfig"
emcc \
  -o gif.js \
  gif.cpp \
  `pkg-config --cflags --libs ImageMagick` \
  -lMagickWand-7.Q16HDRI \
  -s EXPORTED_FUNCTIONS='["_make_gif"]' \
  -s ALLOW_MEMORY_GROWTH=1 \
  -s EXTRA_EXPORTED_RUNTIME_METHODS='["setValue"]'

There’s a fair bit going on here so let’s break it down:

  1. export PKG_CONFIG_PATH="$(pwd)/install/lib/pkgconfig": sets the pkg-config path to our in-tree ImageMagick installation. This will be used in the next command to inject various configuration that will allow us to link to this installation.
  2. -o gif.js: output a JavaScript helper module (in addition to the .wasm binary module).
  3. gif.cpp: source file containing our wrapper function from above.
  4. `pkg-config --cflags --libs ImageMagick`: this uses pkg-config to inject additional flags and configuration pertinent to the ImageMagick library, for example include and library file paths. Note that this is not WebAssembly specific: it is the recommended way to build any application against ImageMagick, as described in their documentation. The combination of this with setting the PKG_CONFIG_PATH environment variable means that we’ll be linking against our in-tree installation of ImageMagick (as opposed to the native system installation).
  5. -lMagickWand-7.Q16HDRI: link against the relevant ImageMagick library.
  6. -s EXPORTED_FUNCTIONS='["_make_gif"]': export the function defined in our wrapper so it can be called from JavaScript.
  7. -s ALLOW_MEMORY_GROWTH=1: allows unbounded memory usage [1].
  8. -s EXTRA_EXPORTED_RUNTIME_METHODS='["setValue"]': export the Emscripten Module.setValue method for use from JavaScript (we use it to construct the array of strings to pass into our make_gif wrapper function).

Consume from web application

The above command will output our WebAssembly module gif.wasm and the helper JavaScript module gif.js. The full use of this function, including marshalling arrays of strings into it, goes beyond the scope of this post. However, a snippet of its main call site and its import into an HTML web app looks like:

<script>
  function make_gif_from_params(params) {
    params.push("output.gif");
    var commands = alloc_params(params);
    let ret = _make_gif(commands, params.length);
    Module._free(commands);
    return ret;
  }
</script>

<script async type="text/javascript" src="gif.js"></script>
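For orientation, the commands parameter of make_gif has the same shape as a C argv: a contiguous table of pointers, each pointing at a NUL-terminated string. The JavaScript side has to reproduce that layout inside the module's memory. The native C++ sketch below builds the same layout. Here total_arg_length is a hypothetical stand-in with make_gif's signature, so that the example does not depend on ImageMagick, and make_argv and call_with_args are likewise illustrative helpers rather than the alloc_params function used above:

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in with the same signature as make_gif. It only
// measures its input, which is enough to show how the arguments arrive.
extern "C" int total_arg_length(char *commands[], int ncommands) {
    int total = 0;
    for (int i = 0; i < ncommands; ++i) {
        total += static_cast<int>(std::strlen(commands[i]));
    }
    return total;
}

// Build the pointer table. Each entry points at the NUL-terminated buffer
// of one string in args, so args must outlive the returned table.
std::vector<char *> make_argv(std::vector<std::string> &args) {
    std::vector<char *> argv;
    argv.reserve(args.size());
    for (std::string &arg : args) {
        argv.push_back(&arg[0]);
    }
    return argv;
}

// Takes its own copy of the arguments so the string buffers stay alive
// for the duration of the call.
int call_with_args(std::vector<std::string> args) {
    std::vector<char *> argv = make_argv(args);
    return total_arg_length(argv.data(), static_cast<int>(argv.size()));
}
```

On the JavaScript side, the equivalent work is allocating the strings and the pointer table in the module's memory and writing each pointer into the table, which is where setValue comes in.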


  1. I believe this is not strictly necessary in newer versions of Emscripten as it’s primarily a legacy throwback to asm.js which WebAssembly has superseded.