The Data Compression News Blog

All about the most recent compression techniques, algorithms, patents, products, tools and events.

Subscribe

Posts: RSS Feed
Comments: RSS Feed

Asymmetric Binary System

Posted by Sachin Garg on 17th December 2007 | Permanent Link

Update 16th Jan, 2008: Andrew Polar has written an alternate implementation of ABS. Check the comments below for details.

In an unpublished paper, Jarek Duda from Jagiellonian University in Poland has written about what he calls the “Asymmetric Binary System”.

It promises to be an alternate to arithmetic coding, having all the advantages, but being much simpler. Whether it holds up to that promise is yet to be seen but there have been a few interesting developments in last few weeks.

After a lot of hiccups in the process, Matt Mahoney has finally managed to implement the asymmetric coder in an order 0 compressor, fpaqa, released under GPL. It is designed as a drop in replacement for arithmetic coders in other programs like PAQ, LPAQ, BBB, and SR2.

There might be a few confusions when you go through the paper, so here are some details that came up in the comp.compression discussion on this. I am not detailing all the maths, refer the the paper for that.

It might seem that algorithm is O(n^2) because as x grows larger, so does the complexity of the arithmetic. But this is not true because x can be normalized by outputting the low bits to keep it in a fixed range (e.g. under 2^32) with insignificant compression loss.

It has only one state variable instead of 2 (low and high range), which means a lookup table can be used for both coding and decoding to fairly high precision.

Unlike arithmetic coding, the decoder moves x through the sequence of states in the opposite order as the encoder. This means that the bits of d are decompressed in the reverse order that they were compressed. This almost rules out all the usual predictive models like PPM, PAQ, LZMA, etc. because the model sees different context during encoding and decoding.

You have to ensure decoding/encoding to be reverses of each other - if one first encode/decode than put/get lowest bits, than the second has to do it in the reverse order - get/put lowest bits such that after decoding/encoding we will stay in the selected range. To do it unambiguously, what we need to have the reverse, we should ensure constant bit-length of x - select the range [2^R,2^{R+1}-1].

To solve the asymmetry problem in fpaqa, what Matt has done is that the compressor models the input in the forward direction and saves q and d in a stack of size B=500K. When the stack is full or at end of input the saved q and d are encoded to a second stack of packed bits, and that stack is written to the output file in reverse order along with the first stack size and final state x. The decompressor does everything in the forward direction, periodically reading x to reset the state. The compressor does all the data reversal, so it is about 15% slower than the decompressor. B is large enough that the extra headers are negligible (5 byte overhead per block).

Overall, it looks a promising alternate to arithmetic coding but there aren’t any compelling reasons yet which will make you want to replace it. But as with all fresh ideas, it might get there with time. And, did I forgot to tell you that its free of any patents?