Description
I have a loose proposal for a layer 1 encoding. The basic idea is to split the current sequential stream-of-structs encoding into a set of smaller data streams - integer constants, function indexes, opcodes, etc. Doing this dramatically increases the compression ratio when compressing with weaker codecs like gzip (and is a slight size reduction for stronger codecs like brotli). This also creates opportunities for parallel decoding and better icache/dcache occupancy while decoding (in particular for varints).
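To make the splitting idea concrete, here is a minimal sketch in Python (the prototype itself was C#). It uses a toy instruction format, one opcode byte followed by one LEB128 immediate, which is an assumption for illustration and not the real wasm encoding. The helper names (`encode_interleaved`, `split_streams`, `merge_streams`) are hypothetical as well.

```python
import zlib


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uleb128(data: bytes, pos: int) -> tuple[int, int]:
    """Decode one unsigned LEB128 value; return (value, new position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def encode_interleaved(instrs: list[tuple[int, int]]) -> bytes:
    """Layer-0 style: opcode and immediate interleaved in one stream."""
    out = bytearray()
    for opcode, imm in instrs:
        out.append(opcode)
        out += encode_uleb128(imm)
    return bytes(out)


def split_streams(instrs: list[tuple[int, int]]) -> tuple[bytes, bytes]:
    """Proposed style: same data, same order, one stream per data type."""
    opcodes = bytes(op for op, _ in instrs)
    immediates = b"".join(encode_uleb128(imm) for _, imm in instrs)
    return opcodes, immediates


def merge_streams(opcodes: bytes, immediates: bytes) -> list[tuple[int, int]]:
    """Rebuild the instruction list by walking both streams in lockstep."""
    instrs = []
    pos = 0
    for op in opcodes:
        imm, pos = decode_uleb128(immediates, pos)
        instrs.append((op, imm))
    return instrs
```

With this in place, checking whether a particular split helps is a matter of comparing `len(zlib.compress(encode_interleaved(instrs)))` against `len(zlib.compress(opcodes + immediates))` on real data. The outcome depends on the input, so the sketch makes no claim about which side wins.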
I spent a couple of days prototyping this a while ago and got it to the point where it can round-trip the Tanks Unity demo and one of the UE4 demos. One minor caveat: you can't get a byte-identical round trip of a wasm module unless it was produced by the same encoder, because encoders like binaryen seem to insert filler bytes and do other unusual things to simplify the encoding process.
In my simple prototype (1.5k lines of C# to decode and encode wasm) it takes around 2 seconds to convert the UE4 demo (38.5 MB) to my proposed format, and 1.4 seconds to convert it back to the .wasm format.
I believe my proposed format is compatible with streaming compilation and can trivially be integrated into existing decoders - you essentially just maintain a separate read pointer for each data type, and when reading from layer 0 files all of those pointers are the same (and increment in sync). This works because the overall ordering of the data is not altered; the data is just sliced up and redistributed. Because the format is compatible with streaming, you can also layer gzip or brotli transport compression on top of it without having to buffer the whole file in memory before decompressing or compiling.
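The read-pointer idea can be sketched roughly as follows. This is an illustration under assumptions, not the prototype's code: the `Cursor` class, the `decode` functions, and the toy instruction format (one opcode byte plus one unsigned LEB128 immediate) are all hypothetical. The point is that one decoder loop serves both layouts, and for layer 0 both pointers are simply the same object.

```python
class Cursor:
    """A read pointer over a byte buffer."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0

    def at_end(self) -> bool:
        return self.pos >= len(self.data)

    def read_byte(self) -> int:
        byte = self.data[self.pos]
        self.pos += 1
        return byte

    def read_uleb128(self) -> int:
        """Read one unsigned LEB128 integer and advance past it."""
        result = 0
        shift = 0
        while True:
            byte = self.read_byte()
            result |= (byte & 0x7F) << shift
            if not (byte & 0x80):
                return result
            shift += 7


def decode(opcode_cur: Cursor, imm_cur: Cursor) -> list[tuple[int, int]]:
    """Decode (opcode, immediate) pairs, reading each type from its own cursor."""
    instrs = []
    while not opcode_cur.at_end():
        opcode = opcode_cur.read_byte()
        immediate = imm_cur.read_uleb128()
        instrs.append((opcode, immediate))
    return instrs


def decode_layer0(data: bytes) -> list[tuple[int, int]]:
    """Layer 0: a single interleaved stream, so every pointer is the same cursor."""
    cur = Cursor(data)
    return decode(cur, cur)


def decode_split(opcodes: bytes, immediates: bytes) -> list[tuple[int, int]]:
    """Split layout: one independent cursor per data type."""
    return decode(Cursor(opcodes), Cursor(immediates))
```

Because the element order within each stream matches the order in the interleaved file, the decoder never needs to seek. That is what keeps this shape compatible with streaming.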
The current proposal produces respectable gains and I think there are ways to improve the file size further - I only tried a couple dozen things. Choosing what to split and how to encode it can have a pretty significant impact - for example splitting the 'flags' and 'offset' elements of the memory immediate into their own streams makes the post-gzip file slightly larger, and I'm not totally sure why that is. The bright side is that you can experimentally verify whether a change is a file size improvement very easily. I suspect some conversations with the designers of codecs like Brotli might lead to ideas on how to improve this further.
I'm including a basic illustration of the proposal below along with data from my tests.

