The Data Compression News Blog

All about the most recent compression techniques, algorithms, patents, products, tools and events.

18 Years of ZIP format: Happy Birthday

Posted by Sachin Garg on 19th July 2006

The ZIP format celebrated its 18th birthday recently. Considering that it was designed at a time when disk capacities were measured in tens of megabytes and modems ran at 300-1200 baud, it sure has come a long way.

The ZIP format’s single biggest strength was its open specification, which resulted in remarkable stability and compatibility over all these years. Large file support was the only change needed (added so that ZIP files could grow beyond 2 GB in size).

The ZIP file format was originally created in 1989 by Phil Katz, founder of PKWARE, after a prolonged legal dispute between PKWARE and System Enhancement Associates (SEA) over the trademark name “ARC” (short for “Archive”), the file name extension .arc, and copyright issues over SEA’s published code. The name “zip” (meaning speed) implied that the product would be faster than ARC and other compression formats of the time.

Soon, an open-source implementation of Phil Katz’s “deflate” and “inflate” routines was released by the Info-ZIP project under a BSD license. This resulted in a horde of PKZIP imitators, establishing the ZIP file format as a de facto industry standard.

In the mid-1990s, as GUIs became more common, WinZip became popular by offering a graphical user interface for ZIP files. In the late 1990s, various file managers started integrating ZIP support directly into their user interfaces (Windows Explorer, the Mac OS Finder, GNOME, KDE, and others).

Mr. Katz died of complications from chronic alcoholism in 2000. A sad end to a true pioneer in the field of data compression.

Thinking about all the advances made in data compression research since then, it doesn’t feel good to know that the solution used by most people is so far behind the current state of the art.

Then again, many more recent formats and algorithms have tried to hit it big without much success, primarily due to a lack of open specifications. The next best open-source option, bzip2, has gained popularity but couldn’t displace ZIP, as it is considerably slower than ZIP’s Deflate algorithm.
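To get a rough feel for that trade-off on your own data, here is a minimal sketch using Python's standard `zlib` and `bz2` modules. The `compare` helper and the sample text are made up for illustration. Note that `zlib` wraps Deflate in its own stream format rather than a ZIP container, and that timings depend heavily on the input and the machine, so treat the numbers as indicative only.

```python
import bz2
import time
import zlib


def compare(data: bytes) -> dict:
    """Compress data with Deflate (via zlib) and bzip2; report size and time.

    Hypothetical helper for illustration, not part of any library.
    """
    results = {}
    codecs = (
        ("deflate", lambda d: zlib.compress(d, 9), zlib.decompress),
        ("bzip2", lambda d: bz2.compress(d, 9), bz2.decompress),
    )
    for name, compress, decompress in codecs:
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Make sure the round trip is lossless before trusting the numbers.
        assert decompress(packed) == data
        results[name] = {"bytes": len(packed), "seconds": elapsed}
    return results


# Illustrative input only; real files will behave differently.
sample = b"The quick brown fox jumps over the lazy dog. " * 20000

if __name__ == "__main__":
    for name, stats in compare(sample).items():
        print(f"{name}: {stats['bytes']} bytes in {stats['seconds']:.4f} s")
```

Running it on a few of your own files (read with `open(path, "rb").read()`) gives a better picture than any synthetic sample.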

However, the ZIP format has also been splintering recently. In 2003, both PKWARE and WinZip introduced their own, mutually incompatible, encryption extensions to the format. More recently, both added new algorithms: PKWARE introduced Deflate64, and WinZip incorporated bzip2 and PPMd as algorithm options in ZIP archives. This splintering of the format, with several incompatible variations going around, might prove to be fatal.
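The incompatibility comes from the fact that each entry in a ZIP archive records which compression method was used, and a tool can only extract the methods it knows about. As a small sketch, Python's standard `zipfile` module can write Deflate and bzip2 entries into the same archive and report the method stored for each one. The file names and payload below are made up for illustration.

```python
import io
import zipfile

# Illustrative payload; any bytes would do.
payload = b"compatibility matters " * 1000

# Build a small archive in memory with one entry per compression method.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("classic.txt", payload, compress_type=zipfile.ZIP_DEFLATED)
    archive.writestr("newer.txt", payload, compress_type=zipfile.ZIP_BZIP2)

# Read it back and collect the method ID stored for each entry.
with zipfile.ZipFile(buffer) as archive:
    methods = {info.filename: info.compress_type for info in archive.infolist()}
    restored = {name: archive.read(name) for name in archive.namelist()}

if __name__ == "__main__":
    for name, method in methods.items():
        print(f"{name}: compression method {method}")
```

An older unzip tool that only understands Deflate would be able to extract `classic.txt` from such an archive but not `newer.txt`, which is exactly the kind of surprise that keeps people on plain old ZIP.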

These sugar-coated ‘advancements’, which result in incompatible archives, were probably just an attempt to cash in on the ZIP format’s brand value. In any case, most users continue to use plain old ZIP for reasons of compatibility.

These attempts to hack newer technology into the aging ZIP format clearly show that the format is, well, showing its age. But there is little reason to doubt that it will keep going strong for many more years to come, for the same reasons that have made it the obvious choice until now.