The Data Compression News Blog

All about the most recent compression techniques, algorithms, patents, products, tools and events.

Recent Posts

  • Bijective BWT (2 Comments)

    David Scott has written a bijective BWT transform, which brings all the advantages of bijectiveness to BWT-based compressors. Among other things, this makes BWT more suitable for compression-before-encryption and also gives (slightly) better compression.

  • Asymmetric Binary System (107 Comments)

    Jarek Duda’s “Asymmetric Binary System” promises to be an alternative to arithmetic coding, with all of its advantages but much simpler. Matt has coded a PAQ-based compressor using ABS for back-end encoding. Update: Andrew Polar has written an alternative implementation of ABS.

  • Precomp: More Compression for your Compressed Files

    So many of today’s files are already compressed (using old, outdated algorithms) that newer algorithms don’t even get a chance to touch them. Christian Schneider’s Precomp comes to the rescue by undoing the harm.

  • On2 Technologies is Hiring

    There aren’t too many companies working on cutting-edge codecs, and of those few, this one is hiring. Best of luck.

  • China’s AVS Specifications Available (2 Comments)

    It’s old news that China has developed its own Advanced Video Standard to avoid high licensing fees. An English translation of the standard is now available, along with the IPR policy. Finally, something technical that you can get your hands on to feed your appetite.

The Calgary Corpus Compression Challenge Update

Posted by Sachin Garg on 5th August 2006 | Permanent Link

Alexander Ratushnyak has updated his December 2005 entry (593620 bytes) to the Calgary corpus compression challenge. The new record is 589862 bytes.

Matt Mahoney provides a summary of the PAQAR-based compressor in a comp.compression post:

I looked at the decompressor. It is based on PAQAR with a tiny dictionary like the previous submission last December. But it appears there are two main differences. First, it uses bitwise contexts to model pic (in PicModel), as in PAQ7. Second, it adds an indirect context model (included in SparssModel), as in PAQ8F.

In other news, on May 21, 2006, the Calgary corpus compression challenge completed ten years since its inception.

5 Responses to “The Calgary Corpus Compression Challenge Update”

  1. Tom "Spike" Says:

    Does this 0.5% improvement matter to anyone?

    You researchers must be excited about this, but all this effort is useless overkill.

    And PAQ is sooooooo slow…

  2. Sachin Garg Says:

    Tom, you are correct that it doesn’t matter to normal users “yet”, but almost all the technology that normal users rely on today (or will rely on in the future) is a result of such state-of-the-art work by researchers.

  3. Evangelist Says:

    If you look at the 10 year history of the challenge, most entries were small gains, but over time all this adds up to a significant improvement.

    Size    Date     Name
    759881  09/1997  Malcolm Taylor
    692154  08/2001  Maxim Smirnov
    680558  09/2001  Maxim Smirnov
    653720  11/2002  Serge Voskoboynikov
    645667  01/2004  Matt Mahoney
    637116  04/2004  Alexander Ratushnyak
    608980  12/2004  Alexander Ratushnyak
    603416  04/2005  Przemysław Skibiński
    596314  10/2005  Alexander Ratushnyak
    593620  12/2005  Alexander Ratushnyak
    589863  05/2006  Alexander Ratushnyak

  4. Sachin Garg Says:

    I wish there was a record of how the speed and memory requirements changed over time.

  5. Niels Says:

    Probably quadric anti-proportional. :-)
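
For anyone who wants to check how the small gains in the history table from comment 3 add up, here is a minimal sketch in Python. The sizes, dates and names are copied from that table as posted; the names `HISTORY`, `step_gains` and `total_gain` are made up for this illustration and are not part of any challenge tooling.

```python
# Sizes (bytes), dates and names copied from the table in comment 3.
HISTORY = [
    (759881, "09/1997", "Malcolm Taylor"),
    (692154, "08/2001", "Maxim Smirnov"),
    (680558, "09/2001", "Maxim Smirnov"),
    (653720, "11/2002", "Serge Voskoboynikov"),
    (645667, "01/2004", "Matt Mahoney"),
    (637116, "04/2004", "Alexander Ratushnyak"),
    (608980, "12/2004", "Alexander Ratushnyak"),
    (603416, "04/2005", "Przemysław Skibiński"),
    (596314, "10/2005", "Alexander Ratushnyak"),
    (593620, "12/2005", "Alexander Ratushnyak"),
    (589863, "05/2006", "Alexander Ratushnyak"),
]


def step_gains(history):
    """Percentage size reduction of each entry relative to the one before it."""
    sizes = [size for size, _, _ in history]
    return [100.0 * (prev - cur) / prev for prev, cur in zip(sizes, sizes[1:])]


def total_gain(history):
    """Percentage size reduction from the first entry to the last."""
    first, last = history[0][0], history[-1][0]
    return 100.0 * (first - last) / first


if __name__ == "__main__":
    # Pair every entry after the first with its gain over the previous entry.
    for (size, date, name), gain in zip(HISTORY[1:], step_gains(HISTORY)):
        print(f"{date}  {size}  {name:<22} -{gain:.2f}%")
    print(f"Total since {HISTORY[0][1]}: -{total_gain(HISTORY):.1f}%")
```

Run as a script, it prints the relative gain of each entry over its predecessor, followed by the overall reduction since the first entry. As comment 4 notes, the table records nothing about speed or memory, so this only measures compressed size.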
