Python: Inflate and Deflate implementations
<p>I am interfacing with a server that requires data sent to it to be compressed with the <em>Deflate</em> algorithm (Huffman encoding + LZ77), and that also sends data that I need to <em>Inflate</em>.</p>
<p>I know that Python includes Zlib, and that the C libraries in Zlib support calls to <em>Inflate</em> and <em>Deflate</em>, but these apparently are not provided by the Python Zlib module. It does provide <em>Compress</em> and <em>Decompress</em>, but when I make a call such as the following:</p>
<pre><code>result_data = zlib.decompress( base64_decoded_compressed_string )
</code></pre>
<p>I receive the following error:</p>
<pre><code>Error -3 while decompressing data: incorrect header check
</code></pre>
<p>Gzip does no better; when making a call such as:</p>
<pre><code>result_data = gzip.GzipFile( fileobj = StringIO.StringIO( base64_decoded_compressed_string ) ).read()
</code></pre>
<p>I receive the error:</p>
<pre><code>IOError: Not a gzipped file
</code></pre>
<p>which makes sense, as the data is a <em>Deflated</em> file, not a true <em>Gzipped</em> file.</p>
<p>Now I know that there is a <em>Deflate</em> implementation available (Pyflate), but I do not know of an <em>Inflate</em> implementation.</p>
<p>It seems that there are a few options:</p>
<ol>
<li><strong>Find an existing implementation (ideal) of <em>Inflate</em> and <em>Deflate</em> in Python</strong></li>
<li>Write my own Python extension to the zlib C library that includes <em>Inflate</em> and <em>Deflate</em></li>
<li>Call something else that can be executed from the command line (such as a Ruby script, since <em>Inflate</em>/<em>Deflate</em> calls in zlib are fully wrapped in Ruby)</li>
<li>?</li>
</ol>
<p>I am seeking a solution, but failing that I will be thankful for insights, constructive opinions, and ideas.</p>
<p><strong>Additional information</strong>: The result of deflating (and encoding) a string should, for the purposes I need, give the same result as the following snippet of C#
code, where the input parameter is an array of UTF bytes corresponding to the data to compress:</p>
<pre><code>public static string DeflateAndEncodeBase64(byte[] data)
{
    if (null == data || data.Length &lt; 1) return null;
    string compressedBase64 = "";

    //write into a new memory stream wrapped by a deflate stream
    using (MemoryStream ms = new MemoryStream())
    {
        using (DeflateStream deflateStream = new DeflateStream(ms, CompressionMode.Compress, true))
        {
            //write byte buffer into memorystream
            deflateStream.Write(data, 0, data.Length);
            deflateStream.Close();

            //rewind memory stream and write to base 64 string
            byte[] compressedBytes = new byte[ms.Length];
            ms.Seek(0, SeekOrigin.Begin);
            ms.Read(compressedBytes, 0, (int)ms.Length);
            compressedBase64 = Convert.ToBase64String(compressedBytes);
        }
    }
    return compressedBase64;
}
</code></pre>
<p>Running this .NET code for the string "deflate and encode me" gives the result:</p>
<pre><code>7b0HYBxJliUmL23Ke39K9UrX4HShCIBgEyTYkEAQ7MGIzeaS7B1pRyMpqyqBymVWZV1mFkDM7Z28995777333nvvvfe6O51OJ/ff/z9cZmQBbPbOStrJniGAqsgfP358Hz8iZvl5mbV5mi1nab6cVrM8XeT/Dw==
</code></pre>
<p>When "deflate and encode me" is run through Python's zlib.compress() and then base64 encoded, the result is "eJxLSU3LSSxJVUjMS1FIzUvOT0lVyE0FAFXHB6k=".</p>
<p>It is clear that zlib.compress() does not produce the same output as the .NET DeflateStream.</p>
<p><strong>More Information</strong>:</p>
<p>The first 2 bytes of the .NET deflate data ("7b0HY..."), after b64 decoding, are 0xEDBD, which does not correspond to Gzip data (0x1f8b), BZip2 data (0x425A), or Zlib data (0x789C).</p>
<p>The first 2 bytes of the Python compressed data ("eJxLS..."), after b64 decoding, are 0x789C.
This is a Zlib header.</p>
<p><strong>SOLVED</strong></p>
<p>To handle raw deflate and inflate, without header and checksum, the following things needed to happen:</p>
<p>On deflate/compress: strip the first two bytes (header) and the last four bytes (checksum) from the zlib.compress() output.</p>
<p>On inflate/decompress: zlib.decompress() takes a second argument for the window size (wbits). If this value is negative, zlib expects a raw stream with no header or checksum. Here are my methods currently, including the base64 encoding/decoding, and working properly:</p>
<pre><code>import zlib
import base64

def decode_base64_and_inflate( b64string ):
    decoded_data = base64.b64decode( b64string )
    # negative window size: raw deflate stream, no header or checksum
    return zlib.decompress( decoded_data , -15)

def deflate_and_base64_encode( string_val ):
    zlibbed_str = zlib.compress( string_val )
    # strip the 2-byte zlib header and the 4-byte checksum
    compressed_string = zlibbed_str[2:-4]
    return base64.b64encode( compressed_string )
</code></pre>
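<p>The header-stripping step on the compress side can probably be avoided as well. The following is a minimal sketch, assuming Python 3 and bytes input: zlib.compressobj() accepts the same negative window size and then emits a raw Deflate stream directly. The function names simply mirror the ones in the post. The compressed bytes are not expected to match the .NET output byte for byte, since two Deflate encoders can produce different valid streams for the same input; what matters is that either side can inflate what the other produced.</p>

```python
import base64
import zlib


def deflate_and_base64_encode(data: bytes) -> bytes:
    """Compress to a raw Deflate stream and base64-encode it."""
    # Negative wbits: no zlib header and no Adler-32 trailer are written.
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    raw = compressor.compress(data) + compressor.flush()
    return base64.b64encode(raw)


def decode_base64_and_inflate(b64string: bytes) -> bytes:
    """Base64-decode and inflate a raw Deflate stream."""
    return zlib.decompress(base64.b64decode(b64string), -zlib.MAX_WBITS)


if __name__ == "__main__":
    original = "deflate and encode me".encode("utf-8")
    encoded = deflate_and_base64_encode(original)
    # Round trip: inflating the encoded value gives back the input bytes.
    assert decode_base64_and_inflate(encoded) == original
```

<p>zlib.MAX_WBITS is 15, so -zlib.MAX_WBITS is the same -15 used above; the constant just avoids a magic number.</p>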