<p>This is definitely a job worthy of a parsing library. My primary goal is normally (i.e., for anything I intend to use more than once or twice) to get the data into a non-textual form ASAP, something like</p>
<pre><code>module ReportParser where

import Prelude hiding (takeWhile)
import Data.Text hiding (takeWhile)
import Control.Applicative
import Data.Attoparsec.Text

data ReportHeaderData = Company Text
                      | Program Text
                      | State Text
                      -- ...
                      | FieldNames [Text]

data ReportData = ReportData Int Text Int Int Int Int Date Int Text Text

data Date = Date Int Int Int
</code></pre>
<p>and we can say, for the sake of argument, that a report is</p>
<pre><code>data Report = Report [ReportHeaderData] [ReportData]
</code></pre>
<p>Now, I generally create a parser for each data type, as a function with the same name as the type (lower-cased):</p>
<pre><code>-- Ending condition for a field
doubleSpace :: Parser Char
doubleSpace = space &gt;&gt; space

-- Clears leading spaces
clearSpaces :: Parser Text
clearSpaces = takeWhile (== ' ') -- Naively assumes no tabs

-- Throws away everything up to and including a newline character
-- (naively assumes unix line endings)
clearNewline :: Parser ()
clearNewline = (anyChar `manyTill` char '\n') *&gt; pure ()

-- Parse a date
date :: Parser Date
date = Date &lt;$&gt; decimal &lt;*&gt; (char '/' *&gt; decimal) &lt;*&gt; (char '/' *&gt; decimal)

-- Parse one row of report data
reportData :: Parser ReportData
reportData = let f1  = decimal &lt;* clearSpaces
                 f2  = (pack &lt;$&gt; manyTill anyChar doubleSpace) &lt;* clearSpaces
                 f3  = decimal &lt;* clearSpaces
                 f4  = decimal &lt;* clearSpaces
                 f5  = decimal &lt;* clearSpaces
                 f6  = decimal &lt;* clearSpaces
                 f7  = date &lt;* clearSpaces
                 f8  = decimal &lt;* clearSpaces
                 f9  = (pack &lt;$&gt; manyTill anyChar doubleSpace) &lt;* clearSpaces
                 f10 = (pack &lt;$&gt; manyTill anyChar doubleSpace) &lt;* clearNewline
             in ReportData &lt;$&gt; f1 &lt;*&gt; f2 &lt;*&gt; f3 &lt;*&gt; f4 &lt;*&gt; f5 &lt;*&gt; f6 &lt;*&gt; f7 &lt;*&gt; f8 &lt;*&gt; f9 &lt;*&gt; f10
</code></pre>
<p>By running <a href="http://hackage.haskell.org/package/attoparsec-0.10.4.0/docs/Data-Attoparsec-ByteString.html#g:5" rel="nofollow">one of the parse functions</a> together with one of the combinators (such as <a href="http://hackage.haskell.org/package/attoparsec-0.10.4.0/docs/Data-Attoparsec-Combinator.html" rel="nofollow"><code>many</code></a>, and possibly <code>feed</code> if you end up with a <code>Partial</code> result), you should end up with a list of <code>ReportData</code> values. You can then convert them to CSV with a function of your own.</p>
<p>Note that I didn't deal with the header. It should be relatively straightforward to write a parser for it and build a <code>Report</code> with e.g.</p>
<pre><code>-- Not tested; reportHeader is the header parser left unwritten above
parseReport = Report &lt;$&gt; (many reportHeader) &lt;*&gt; (many reportData)
</code></pre>
<p>Note that I prefer the <a href="http://hackage.haskell.org/package/base-4.6.0.1/docs/Control-Applicative.html#g:1" rel="nofollow">Applicative</a> form, but it's also possible to use the monadic form if you prefer (I did in <code>doubleSpace</code>). The <a href="http://hackage.haskell.org/package/base-4.6.0.1/docs/Control-Applicative.html#g:2" rel="nofollow"><code>Alternative</code></a> class, also in <code>Control.Applicative</code>, is useful too, for reasons implied by the name.</p>
<p>For playing with this, I highly recommend GHCi and the <code>parseTest</code> function. GHCi is handy overall and a good way to test individual parsers, while <code>parseTest</code> takes a parser and an input string and prints the status of the run, the parsed value, and any input left unparsed. Very useful when you're not quite sure what's going on.</p>
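<p>To make the "run the parser, then convert to CSV" step a bit more concrete, here is a minimal, self-contained sketch. It is not the full report format: <code>Row</code>, <code>row</code>, <code>toCsv</code> and the sample input are hypothetical, cut-down stand-ins for <code>ReportData</code> and friends. It assumes a reasonably recent attoparsec and uses <code>parseOnly</code>, which never returns a <code>Partial</code> result, so no <code>feed</code> is needed. The field terminator is written as <code>string "  "</code> rather than <code>space &gt;&gt; space</code> because, as far as I know, <code>space</code> also accepts newlines.</p>
<pre><code>{-# LANGUAGE OverloadedStrings #-}
module Main where

import Control.Applicative (many)
import Data.Attoparsec.Text
  ( Parser, anyChar, char, decimal, endOfInput, endOfLine
  , manyTill, parseOnly, skipWhile, string )
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Hypothetical, cut-down row: an id, a free-text name, and a date.
data Date = Date Int Int Int deriving (Show, Eq)
data Row  = Row Int Text Date deriving (Show, Eq)

-- Two consecutive spaces end a free-text field.
doubleSpace :: Parser ()
doubleSpace = () &lt;$ string "  "

-- Skip any run of plain spaces (no tabs, as in the parsers above).
skipSpaces :: Parser ()
skipSpaces = skipWhile (== ' ')

date :: Parser Date
date = Date &lt;$&gt; decimal &lt;*&gt; (char '/' *&gt; decimal) &lt;*&gt; (char '/' *&gt; decimal)

-- One line of input: id, name, date, end of line.
row :: Parser Row
row = Row &lt;$&gt; (decimal &lt;* skipSpaces)
          &lt;*&gt; (T.pack &lt;$&gt; manyTill anyChar doubleSpace &lt;* skipSpaces)
          &lt;*&gt; (date &lt;* skipSpaces &lt;* endOfLine)

-- Render one row as a CSV line, quoting the text field.
toCsv :: Row -&gt; Text
toCsv (Row n name (Date m d y)) =
  T.intercalate "," [tshow n, quote name, T.intercalate "/" (map tshow [m, d, y])]
  where
    tshow :: Int -&gt; Text
    tshow = T.pack . show
    quote t = T.concat ["\"", T.replace "\"" "\"\"" t, "\""]

-- Made-up sample input, two lines, each ending in a newline.
sample :: Text
sample = T.unlines ["1  Acme Widgets  12/31/2013", "2  Foo Bar Ltd  1/2/2014"]

main :: IO ()
main =
  case parseOnly (many row &lt;* endOfInput) sample of
    Left err   -&gt; putStrLn ("parse error: " ++ err)
    Right rows -&gt; mapM_ (TIO.putStrLn . toCsv) rows
</code></pre>
<p>Run against the sample, this should print <code>1,"Acme Widgets",12/31/2013</code> and <code>2,"Foo Bar Ltd",1/2/2014</code>. Requiring <code>endOfInput</code> after <code>many row</code> is a deliberate choice: without it, <code>many</code> stops quietly at the first line that fails to parse and you get a shorter list instead of an error.</p>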