StackOverflow2013

<p>DotNetZip would do what you want, but I understand your concerns about legal approval.</p>
<p>On a side note, it might be good for you to navigate the legal jungle associated with getting an open-source library approved for use in the company, just to understand what's involved. But I'll leave that up to you.</p>
<hr>
<p>Getting back to rolling your own... DotNetZip is pretty full-featured, and it handles a number of scenarios you probably don't care about: Unicode filenames and comments, setting Windows timestamps and permissions on extracted files, reading timestamps from zip files created on old Unix systems, split archives, encrypted archives, files over 2 GB, self-extracting archives, and so on. Many zip files use none of those things.</p>
<p>DotNetZip also does eventing, zip updates, and zip creation. All the code associated with these things is probably not of interest to you if you confine yourself to the requirements you described in your question.</p>
<p>You could, though, grab the DotNetZip code and use it to help you roll your own solution. If you constrain yourself to JUST reading zip files and not dealing with all the possible special cases, the zip format is not difficult to parse.</p>
<p>Here's how to do it:</p>
<ol>
<li><p>Open the zip file using <code>new FileStream()</code> or <code>File.Open</code>. You want a <code>FileStream</code> object.</p></li>
<li><p>Read 4 bytes. Verify that they are the zip entry header signature (0x04034b50). The value is stored little-endian, so in the file the bytes appear in the order 50 4b 03 04.</p></li>
<li><p>If you find a match, you're in business. All offsets below are relative to the start of the entry header:</p>
<ul>
<li>At offset 14 is a 4-byte CRC. Get it. (Same byte ordering as above.)</li>
<li>At offset 18 is the 4-byte length of the compressed blob. Get it (N).</li>
<li>At offset 22 is the 4-byte length of the UNcompressed blob. Get it (U).</li>
<li>At offset 26 is the 2-byte length of the filename. Get it (L).</li>
<li>At offset 28 is the 2-byte length of the "extra field". Get it (E).</li>
</ul></li>
<li><p>Right after those fixed-size fields, at offset 30, is the actual filename. Read L bytes for the filename and call <code>System.Text.Encoding.ASCII.GetString()</code>. The result will include a directory path that uses forward slashes (Unix style) instead of backslashes. Use <code>String.Replace()</code> to turn the slashes back into backslashes.</p></li>
<li><p>After the filename comes the extra field. Seek E bytes to get beyond it; you can mostly ignore it. The position right after the extra field is where the compressed data starts.</p></li>
<li><p>Open a <code>System.IO.Compression.DeflateStream</code> on the zip <code>FileStream</code>, using <code>CompressionMode.Decompress</code>; it starts reading at the current offset of the <code>FileStream</code>. Open a new <code>FileStream</code> for output, with the file path you read in step 4. In a loop, call <code>inflater.Read()</code> and <code>output.Write()</code> to write the decompressed output of the <code>DeflateStream</code> to a filesystem file with the correct name. You will need to stop reading from the <code>DeflateStream</code> when you have read exactly U (uncompressed) bytes.</p></li>
<li><p>Check the uncompressed size (U) against the number of bytes you actually wrote out from the <code>DeflateStream</code> (after decompression). They should match.</p></li>
<li><p>If you are fancy, you can check the CRC of the output against what was in the header.</p></li>
<li><p>Go to step 2 to look for the next entry in the file. The next entry header starts N bytes after the start of the compressed data; seek there explicitly, because the <code>DeflateStream</code> may have read ahead in the underlying <code>FileStream</code>.</p></li>
</ol>
<p>The most complicated part is step 3. Working code for that is easily found in <a href="http://dotnetzip.codeplex.com/SourceControl/changeset/view/65587#679992" rel="nofollow">this source module</a>; look for the ReadHeader method.</p>
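<p>For illustration, the steps above could be sketched roughly like this. It is a minimal sketch, not DotNetZip code: the <code>MiniUnzip</code> class and its method names are made up for this example. It assumes each entry is either stored or deflated and that the sizes are present in the entry header; anything else throws. It skips the CRC check from step 8.</p>

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// MiniUnzip is a hypothetical helper for illustration; it is not DotNetZip code.
public static class MiniUnzip
{
    const uint EntrySignature = 0x04034b50; // bytes on disk: 50 4b 03 04

    public static void ExtractAll(string zipPath, string targetDir)
    {
        using (FileStream zip = File.OpenRead(zipPath))           // step 1
        using (var reader = new BinaryReader(zip))                // reads little-endian
        {
            while (zip.Position + 4 <= zip.Length &&
                   reader.ReadUInt32() == EntrySignature)         // step 2
            {
                // Step 3: fixed-size header fields, read in file order.
                reader.ReadUInt16();                   // offset 4: version needed
                ushort flags = reader.ReadUInt16();    // offset 6: general purpose flags
                ushort method = reader.ReadUInt16();   // offset 8: 0 = stored, 8 = deflate
                reader.ReadUInt32();                   // offset 10: DOS time and date
                reader.ReadUInt32();                   // offset 14: CRC (not verified here)
                long n = reader.ReadUInt32();          // offset 18: compressed size (N)
                long u = reader.ReadUInt32();          // offset 22: uncompressed size (U)
                int nameLength = reader.ReadUInt16();  // offset 26: filename length (L)
                int extraLength = reader.ReadUInt16(); // offset 28: extra field length (E)

                // Assumption of this sketch: sizes live in the header. Flag bit 3
                // means they follow the data instead, which is not handled here.
                if ((flags & 0x0008) != 0 || (method != 0 && method != 8))
                    throw new NotSupportedException("Entry uses a feature this sketch skips.");

                // Step 4: the name sits at offset 30 and uses forward slashes.
                string name = Encoding.ASCII.GetString(reader.ReadBytes(nameLength));
                zip.Seek(extraLength, SeekOrigin.Current);        // step 5: skip extra field
                long dataStart = zip.Position;

                string outPath = Path.Combine(
                    targetDir, name.Replace('/', Path.DirectorySeparatorChar));
                if (name.EndsWith("/"))
                {
                    Directory.CreateDirectory(outPath);           // directory entry, no data
                }
                else
                {
                    Directory.CreateDirectory(Path.GetDirectoryName(outPath));
                    using (FileStream output = File.Create(outPath))
                    {
                        if (method == 8)
                        {
                            // Step 6: leaveOpen = true keeps the zip stream usable afterwards.
                            using (var inflater = new DeflateStream(
                                       zip, CompressionMode.Decompress, true))
                                CopyExactly(inflater, output, u);
                        }
                        else
                        {
                            CopyExactly(zip, output, u);          // stored: copy bytes as-is
                        }
                    }
                }

                // Step 9: the inflater may have read ahead, so seek to the next header.
                zip.Seek(dataStart + n, SeekOrigin.Begin);
            }
        }
    }

    // Steps 6 and 7: copy exactly `count` bytes, failing if the data ends early.
    static void CopyExactly(Stream input, Stream output, long count)
    {
        var buffer = new byte[8192];
        long remaining = count;
        while (remaining > 0)
        {
            int read = input.Read(buffer, 0, (int)Math.Min(buffer.Length, remaining));
            if (read == 0)
                throw new InvalidDataException("Entry ended before U bytes were read.");
            output.Write(buffer, 0, read);
            remaining -= read;
        }
    }
}
```

<p>With that in place, a call such as <code>MiniUnzip.ExtractAll("archive.zip", "out")</code> (both paths are placeholders) would walk the entries until it hits something that is not an entry header. The sketch does not guard against entry names containing <code>..</code>, so it should only be pointed at archives you trust.</p>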
