The Data Compression News Blog

All about the most recent compression techniques, algorithms, patents, products, tools and events.

Subscribe

Posts: RSS Feed
Comments: RSS Feed

Sponsored Links

Recent Posts

  • Bijective BWT (7 Comments)

    David Scott has written a bijective BWT transform, which brings all the advantages of bijectiveness to BWT based compressors. Among other things, making BWT more suitable for compression-before-encryption and also give (slightly) better compression.

  • Asymmetric Binary System (113 Comments)

    Jarek Duda’s “Asymmetric Binary System” promises to be an alternate to arithmetic coding, having all the advantages, but being much simpler. Matt has coded a PAQ based compressor using ABS for back-end encoding. Update: Andrew Polar has written an alternate implementation of ABS.

  • Precomp: More Compression for your Compressed Files (3 Comments)

    So many of today’s files are already compressed (using old, outdated algorithms) that newer algorithms don’t even get a chance to touch them. Christian Schneider’s Precomp comes to rescue by undoing the harm.

  • On2 Technologies is Hiring

    There aren’t too many companies working on cutting edge codecs, and of those few this one is hiring. Best of luck.

  • China’s AVS Specifications Available (2 Comments)

    Its old news that China has developed their own Advanced Video Standard to avoid high licensing fees. English translation of the standard is now available, along with the IPR policy. Finally something technical that you can get your hands on to feed your appetite.

Compressed Index Web Site

Posted by Mark Nelson on 3rd January 2006 | Permanent Link

Paolo Ferragina of the University of Pisa and Gonzalo Navarro of the University of Chile have a web site dedicated to the exploration of compressed indices. Their mission statement explains it better than I can:

The new millennium has seen the born of a new class of full-text indexes which are structurally similar to Suffix Trees and Suffix Arrays, in that they support the powerful substring search operation, but are succinct in space, in that it is close to the empirical entropy of the indexed data. They are therefore called compressed Suffix Trees and compressed Suffix Arrays, or in general compressed indexes.

In general, old-school compression guys like me are more or less resigned to the notion that once you compress a block of data, you need to decompress the whole thing in order to get back even a little piece of it. However, Paolo and Gonzalo have posted links to quite a few papers on full text compressed indices, which expound the notion that you can pick and choose exactly what you want to decompress. The site has collections of texts, links to people and papers, and a proposed API for testing work in the future.

One Response to “Compressed Index Web Site”

  1. Brady Hauth Says:

    Definitely clever, definitely room for improvement. It would be nice if they would put up some free explanations of the algorithms on that page, with lots of nice pictures…static LZW with a special symbol to mark block beginnings can’t be optimal on BWTs.

Leave a Reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>