The Hutter Prize - Accelerating Future
Posted by Sachin Garg on 14th November 2006 | Permanent Link
Not only has the first Hutter Prize been won; the prize itself has achieved its goal of accelerating research in the compression of human knowledge and bringing the potential of artificial intelligence closer to realization.
On October 31st, the first Hutter Prize was awarded to Alexander Ratushnyak and Przemyslaw Skibinski for their work on the paq8hp5 text compressor. Prize money: 3416€ (500€ for each percent of improvement).
At Alexander Ratushnyak’s request, part of the prize will go to Przemyslaw Skibinski for his early contributions to the underlying PAQ compression algorithm.
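As a quick check on the prize figure, the improvement it implies can be worked out from the two numbers quoted above. This is only a sketch: it assumes the payout scales linearly at 500€ per percent with no rounding, and the function name is made up for illustration.

```python
EUROS_PER_PERCENT = 500  # rate quoted in the post


def implied_improvement_percent(prize_euros, rate=EUROS_PER_PERCENT):
    """Percent improvement implied by a payout, assuming a purely linear rule."""
    return prize_euros / rate


# 3416 EUR at 500 EUR per percent corresponds to roughly a 6.8% improvement.
print(implied_improvement_percent(3416))
```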
The announcement of the prize was slightly delayed because the source code, which is based on GPLed code, was not yet available. The paq8hp5 code is now available.
The prize is a more concrete reincarnation of the previously announced C-Prize and has successfully fulfilled its goal of advancing the state of the art in compression of human knowledge and bringing the potential of artificial intelligence closer to realization. Here is an excerpt from Matt Mahoney’s comp.compression post:
Both paq8hp5 (top ranked on enwik8, 100 MB) and durilca4linux_2 by Dmitry Shkarin (top ranked on enwik9, 1 GB, no prize money) incorporate low level syntactic and semantic language modeling. Both compressors preprocess text by replacing words with dictionary codes using a dictionary of the most frequent words occurring in the benchmark. In paq8hp5 related words were manually grouped. In durilca4linux_2 the grouping was done automatically to cluster words that appear in similar context. In both, these groups are also suffix sorted to improve dictionary compression. Although language models of syntax and semantics have been around for awhile, these techniques have never been incorporated into data compressors until after the Hutter prize was launched in August.
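To give a rough feel for the dictionary preprocessing described in the excerpt above, here is a minimal Python sketch. It is not how paq8hp5 or durilca4linux_2 actually work: it only replaces the most frequent lowercase words with integer codes and leaves out the grouping of related words, the suffix sorting, and the real byte-level encoding. The function names and the tiny sample text are made up for illustration.

```python
import re
from collections import Counter


def build_dictionary(text, size):
    """Return the `size` most frequent lowercase words, most frequent first."""
    counts = Counter(re.findall(r"[a-z]+", text))
    # Sort by descending count, then alphabetically so the order is deterministic.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ranked[:size]


def encode(text, dictionary):
    """Replace dictionary words with their integer index; keep everything else."""
    index = {word: i for i, word in enumerate(dictionary)}
    # Split into runs of lowercase letters and runs of anything else,
    # so that joining the tokens reproduces the text exactly.
    tokens = re.findall(r"[a-z]+|[^a-z]+", text)
    return [index.get(tok, tok) for tok in tokens]


def decode(tokens, dictionary):
    """Invert encode(): integers become words again, strings pass through."""
    return "".join(dictionary[t] if isinstance(t, int) else t for t in tokens)


sample = "the cat and the dog and the bird"
words = build_dictionary(sample, 2)   # the two most frequent words
coded = encode(sample, words)
assert decode(coded, words) == sample  # the transform is lossless
```

The idea is that the later modeling stage sees short, regular codes for common words instead of spelling them out each time. In a real compressor the codes would be bytes rather than Python integers.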
Many people failed to understand the importance of a few additional percent of compression, and some were sure that the compressor’s speed made it useless for practical purposes. Even more were confused about what compression has to do with AI. Check out the rationale for this.
As far as “just compression” is concerned, the prize helps maintain the historic rate of progress in text compression of approximately 3% per year. And the story doesn’t end here: Matt came across paq8hp6, which further improved the ratio by 1.00052%, so there is more coming :-)