
  1. In PowerShell, what's the most efficient way to split a large text file by record type?
    <p>I am using PowerShell for some ETL work: reading compressed text files and splitting them into separate output files based on the first three characters of each line.</p>
    <p>If I were just filtering the input file, I could pipe the filtered stream to Out-File and be done with it. But I need to redirect the output to more than one destination, and as far as I know this can't be done with a simple pipe. I'm already using a .NET StreamReader to read the compressed input files, and I'm wondering if I need to use a StreamWriter to write the output files as well.</p>
    <p>The naive version looks something like this:</p>
    <pre><code>while (!$reader.EndOfStream) {
    $line = $reader.ReadLine()
    # Route each line by its three-character record type
    switch ($line.Substring(0, 3)) {
        "001" { Add-Content "output001.txt" $line }
        "002" { Add-Content "output002.txt" $line }
        "003" { Add-Content "output003.txt" $line }
    }
}
</code></pre>
    <p>That just looks like bad news: finding, opening, writing and closing a file once per row. The input files are huge, 500MB+ monsters.</p>
    <p>Is there an idiomatic way to handle this efficiently with PowerShell constructs, or should I turn to the .NET StreamWriter?</p>
    <p>Are there methods of a (New-Item "path" -type "file") object I could use for this?</p>
    <p><strong>EDIT for context:</strong></p>
    <p>I'm using the <a href="http://www.codeplex.com/DotNetZip" rel="noreferrer">DotNetZip</a> library to read ZIP files as streams; thus <code>StreamReader</code> rather than <code>Get-Content</code>/<code>gc</code>. Sample code:</p>
    <pre><code>[System.Reflection.Assembly]::LoadFrom("\Path\To\Ionic.Zip.dll")
$zipfile = [Ionic.Zip.ZipFile]::Read("\Path\To\File.zip")

foreach ($entry in $zipfile) {
    # Wrap the entry's decompressed stream in a line-oriented reader
    $reader = New-Object System.IO.StreamReader($entry.OpenReader())
    while (!$reader.EndOfStream) {
        $line = $reader.ReadLine()
        # do something here
    }
}
</code></pre>
    <p>I should probably <code>Dispose()</code> of both the $zipfile and $reader, but that is for another question!</p>
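    <p>For comparison, here is a minimal sketch of what the StreamWriter route might look like: one writer per record type, kept open in a hashtable for the whole pass, so each output file is opened once instead of once per row. The function name <code>Split-LinesByPrefix</code> and its parameters are hypothetical, made up for illustration. Unlike the naive version, it creates a file for every prefix it meets, not only 001 to 003, and that is an assumption about the desired behavior. StreamWriter's default encoding may also differ from what Add-Content would produce.</p>
    <pre><code># Sketch only: Split-LinesByPrefix and its parameter names are hypothetical.
function Split-LinesByPrefix {
    param(
        [System.IO.TextReader] $Reader,   # e.g. the StreamReader over a zip entry
        [string] $OutputDirectory,        # should be an absolute path (see note)
        [int] $PrefixLength = 3
    )

    $writers = @{}   # maps each record type to its open StreamWriter
    try {
        # ReadLine() returns $null once the input is exhausted
        while ($null -ne ($line = $Reader.ReadLine())) {
            # Assumption: lines shorter than the prefix are skipped
            if ($line.Length -lt $PrefixLength) { continue }

            $key = $line.Substring(0, $PrefixLength)
            if (-not $writers.ContainsKey($key)) {
                # Open each output file once, in append mode like Add-Content
                $path = Join-Path $OutputDirectory "output$key.txt"
                $writers[$key] = New-Object System.IO.StreamWriter($path, $true)
            }
            $writers[$key].WriteLine($line)
        }
    }
    finally {
        # Flush and close every writer, even if reading failed part-way
        foreach ($writer in $writers.Values) { $writer.Dispose() }
    }
}
</code></pre>
    <p>It would be called inside the <code>foreach ($entry in $zipfile)</code> loop, for example <code>Split-LinesByPrefix -Reader $reader -OutputDirectory "C:\etl\out"</code> (the path is a placeholder). The output directory is best given as an absolute path, because .NET resolves relative paths against the process working directory rather than PowerShell's current location.</p>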
 
