Lempel-Ziv-Welch (LZW) Compression

Download:

LZW.h

LZWStream.h

LZW

Store LZW Compressed Data

This is a small suite of simple compression classes that use the LZW algorithm, optionally with the GIF coding format.
They were designed to store short text messages efficiently.
Unisys patented the LZW algorithm, but the patent has now expired and you are free to use the algorithm without a license.
Each class can act on a File, a CString, or an area of memory.
The Encoder can be told to create GIF codes in the Stream.
The Decoder automatically decodes GIF or normal LZW.
The code has the following behaviour:

The Encoder checks to see if Encoding really did compress the data.

If the data wasn't compressed, the Encoder output is just a copy of the input.

If the data was compressed the Encoder output starts with a 0xFF Byte.

Following the 0xFF Byte is a DWORD containing the Uncompressed data length. Use this to create the buffer for your Decompression.

The Decoder looks for a leading byte of 0xFF and if it doesn't find one, just copies input to output.

So if the original data has 0xFF as its first byte and doesn't get smaller after LZW, the Encoder output is a plain copy that the Decoder mistakes for compressed data, and the Decoder will fail.

For all other cases you can happily Encode data and know you'll never end up with larger data.

If you use the default constructor, the function you call to Compress or Decompress returns an error message if anything goes wrong (and an empty string otherwise).
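The header rules above can be sketched in standard C++ without MFC. This is only an illustration of the layout described here, not the code from LZW.h: the names `LzwHeaderInfo`, `InspectLzwHeader` and `FrameEncoderOutput` are hypothetical, the "did it get smaller" test is an assumed criterion, and the length is assumed to be stored as a 32-bit value in the machine's native byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper type (not part of LZW.h): what a buffer's first bytes say.
struct LzwHeaderInfo {
    bool compressed;            // true if the buffer starts with the 0xFF marker
    std::uint32_t rawLength;    // uncompressed length (from the header, or the buffer size)
    std::size_t payloadOffset;  // where the LZW codes (or the raw data) begin
};

// Reads the 0xFF marker and the length that follows it.
LzwHeaderInfo InspectLzwHeader(const unsigned char* data, std::size_t size) {
    const std::size_t headerSize = 1 + sizeof(std::uint32_t);
    if (size >= headerSize && data[0] == 0xFF) {
        std::uint32_t rawLength = 0;
        // memcpy avoids an unaligned read; native byte order is an assumption.
        std::memcpy(&rawLength, data + 1, sizeof rawLength);
        return {true, rawLength, headerSize};
    }
    // No marker: the Decoder would simply copy input to output.
    return {false, static_cast<std::uint32_t>(size), 0};
}

// Chooses between "0xFF + length + LZW codes" and a plain copy of the input.
// Assumed criterion: keep the compressed form only if it is strictly smaller.
std::vector<unsigned char> FrameEncoderOutput(const std::vector<unsigned char>& raw,
                                              const std::vector<unsigned char>& lzwPayload) {
    const std::size_t headerSize = 1 + sizeof(std::uint32_t);
    if (headerSize + lzwPayload.size() >= raw.size())
        return raw;  // no gain, so the output is just a copy of the input

    std::vector<unsigned char> out(headerSize);
    out[0] = 0xFF;
    const std::uint32_t rawLength = static_cast<std::uint32_t>(raw.size());
    std::memcpy(&out[1], &rawLength, sizeof rawLength);
    out.insert(out.end(), lzwPayload.begin(), lzwPayload.end());
    return out;
}
```

A caller would use `rawLength` to size the destination buffer before decoding, and should remember the one ambiguous case: incompressible data that itself begins with 0xFF.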

The simplest way is to use a CString to hold the data.
Here's example code for some of the constructors:
  CString S("anna and nanna banned bananas and bandannas");
  S+=S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S+S;
  int Length=S.GetLength();
  CLZWEncoder E(S);
  // S is now the compressed data (or the original data if no compression was possible).
  // Note the leading 0xFF Byte. S may contain NULLs before its end,
  // but GetLength returns the correct length for the compressed data.
  DWORD CompressedLength=S.GetLength();
  if((BYTE)*S==0xFF) {
    DWORD UnCompressedLength=*((DWORD*)(&*S+1)); //You can access the UnCompressed data Length from the compressed code
  }
  CLZWDecoder D(S);
  //S is now back to the original data or empty if it ran out of memory or the data was corrupt.
  if(S.IsEmpty() && CompressedLength) AfxMessageBox("LZW Decompression Failed");
  if(Length!=S.GetLength()) AfxMessageBox("LZW Decompression lost Byte(s)");
A file is specified simply by its Path; a section of memory is specified by its Base and Length.
  CLZWEncoder Encoder1;
  const char* src="anna and nanna banned bananas and bandannas";
  DWORD UnCompressedLength=strlen(src)+1; // +1 to include the NULL terminator.
  char* LZW=new char[UnCompressedLength];
  S=Encoder1.Encode(src, UnCompressedLength, LZW, UnCompressedLength);
  if(!S.IsEmpty()) AfxMessageBox(S);
  else{
    CompressedLength=Encoder1.GetOutSize();
    double Ratio=UnCompressedLength*1./CompressedLength; // compression ratio (multiply by 1. to force floating-point division)
    CLZWDecoder Decoder1;
    char* dst=LZW; //start assuming no compression happened, so destination is the same as the source.
    if((BYTE)*LZW==0xFF) { // Decoding is necessary
      ASSERT(UnCompressedLength==*((DWORD*)(LZW+1))); //You can access the UnCompressed data Length from the compressed code
      dst=new char[UnCompressedLength];
      Decoder1.Decode(LZW, CompressedLength, dst, UnCompressedLength);
      ASSERT(strcmp(src,dst)==0);
      delete[] dst; // allocated with new[]
    }
  }
  delete[] LZW; // allocated with new[]

  DeleteFile("ReadMeNow.txt");
  CLZWEncoder Encoder2;
  Encoder2.Encode("ReadMe.txt", "ReadMe.lzw"); // You could use a memory buffer instead of the .lzw file
  CLZWDecoder Decoder2;
  Decoder2.Decode("ReadMe.lzw", "ReadMeNow.txt");
The bulk of the code handles the input/output; the compression and decompression themselves are short sections of code.
If you remove the GIF decoding block, the Decoder involves very little code: it should be very clear and easy to alter.
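To give a feel for how short the core of LZW is, here is a minimal textbook sketch in standard C++. It is not the code from LZW.h: the function names `LzwEncodeSketch` and `LzwDecodeSketch` are hypothetical, the codes are kept as whole integers rather than packed into a bit stream, there are no GIF clear or end codes, and the dictionary is allowed to grow without limit.

```cpp
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Textbook LZW encoder: emits one integer code per longest known string.
std::vector<std::uint32_t> LzwEncodeSketch(const std::string& input) {
    // Start with one dictionary entry per possible byte value.
    std::map<std::string, std::uint32_t> dictionary;
    for (int i = 0; i < 256; ++i)
        dictionary[std::string(1, static_cast<char>(i))] = static_cast<std::uint32_t>(i);

    std::vector<std::uint32_t> codes;
    std::string current;
    for (char c : input) {
        const std::string extended = current + c;
        if (dictionary.count(extended)) {
            current = extended;  // keep growing the match
        } else {
            codes.push_back(dictionary[current]);
            const std::uint32_t next = static_cast<std::uint32_t>(dictionary.size());
            dictionary[extended] = next;  // learn the new string
            current = std::string(1, c);
        }
    }
    if (!current.empty()) codes.push_back(dictionary[current]);
    return codes;
}

// Textbook LZW decoder: rebuilds the same dictionary one step behind the encoder.
std::string LzwDecodeSketch(const std::vector<std::uint32_t>& codes) {
    std::vector<std::string> dictionary;
    for (int i = 0; i < 256; ++i)
        dictionary.push_back(std::string(1, static_cast<char>(i)));

    std::string output;
    std::string previous;
    for (std::uint32_t code : codes) {
        std::string entry;
        if (code < dictionary.size()) {
            entry = dictionary[code];
        } else if (code == dictionary.size() && !previous.empty()) {
            entry = previous + previous[0];  // code refers to the entry being built
        } else {
            throw std::runtime_error("corrupt LZW code stream");
        }
        output += entry;
        if (!previous.empty())
            dictionary.push_back(previous + entry[0]);
        previous = entry;
    }
    return output;
}
```

Calling `LzwDecodeSketch(LzwEncodeSketch(text))` should give back `text` unchanged. A real implementation, like the classes here, also has to pack the codes into bytes and handle the input/output, which is where most of the code goes.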