Scala: Iterate over CSV files in a functional way?
<p>I have CSV files with comments that give column names, where the columns change throughout the file:</p>
<pre><code>#c1,c2,c3
a,b,c
d,e,f
#c4,c5
g,h
i,j
</code></pre>
<p>I want to provide a way to iterate over (only) the data rows of the file as Maps of column names to values (all Strings). So the above would become:</p>
<pre><code>Map(c1 -&gt; a, c2 -&gt; b, c3 -&gt; c)
Map(c1 -&gt; d, c2 -&gt; e, c3 -&gt; f)
Map(c4 -&gt; g, c5 -&gt; h)
Map(c4 -&gt; i, c5 -&gt; j)
</code></pre>
<p>The files are very large, so reading everything into memory is not an option. Right now I have an <code>Iterator</code> class that keeps some ugly state between <code>hasNext()</code> and <code>next()</code>; I also provide accessors for the current line number and the actual last line and comment read (in case consumers care about field order). I'd like to try to do things in a more functional way.</p>
<p>My first idea was a for comprehension: I can iterate over the lines of the file, skipping the comment lines with a filter clause. I can <code>yield</code> a tuple containing the map, the line number, etc. The problem is I need to remember the last column names seen so I can create Maps from them. For loops understandably try to discourage keeping state, by only letting you set new <code>val</code>s. I learned from <a href="https://stackoverflow.com/questions/7087353/can-a-scala-for-loop-modify-variables-outside-its-scope">this question</a> that I can update member variables in the <code>yield</code> block, but that's precisely when I <em>don't</em> want to update them in my case!</p>
<p>I could call a function in the iteration clause that updates state, but that seems dirty. So, what is the best way to do this in a functional style? Abuse for comprehensions? Hack <a href="https://stackoverflow.com/questions/4469538/maintaining-a-state-throughout-a-scala-fold-operation">scanLeft</a>? Use a library? Bring out the <a href="https://stackoverflow.com/questions/5063022/use-scala-parser-combinator-to-parse-csv-files">parser combinator</a> big guns? Or is a functional style just not a good match for this problem?</p>
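<p>For reference, the <code>scanLeft</code> idea mentioned above might look roughly like the sketch below. This is only one possible shape, not a settled answer. The names <code>CsvRows</code>, <code>State</code> and <code>rows</code> are made up for illustration, and it assumes plain comma-separated fields with no quoting or escaping. The most recent header travels through the scan as an immutable value, so nothing is mutated between lines, and because <code>scanLeft</code> on an <code>Iterator</code> is lazy the file is not read into memory.</p>

```scala
object CsvRows {
  // Hypothetical state threaded through the scan: the most recent header,
  // plus the row produced by the current line (None for comments and blanks).
  final case class State(header: Vector[String], row: Option[Map[String, String]])

  // Assumes simple comma-separated fields without quoting or escaping.
  private def fields(s: String): Vector[String] =
    s.split(",").map(_.trim).toVector

  def rows(lines: Iterator[String]): Iterator[Map[String, String]] =
    lines
      .scanLeft(State(Vector.empty, None)) { (state, line) =>
        if (line.startsWith("#"))
          State(fields(line.drop(1)), None)   // new header, no row emitted
        else if (line.trim.isEmpty)
          state.copy(row = None)              // blank line: keep header, no row
        else
          state.copy(row = Some(state.header.zip(fields(line)).toMap))
      }
      .collect { case State(_, Some(row)) => row } // keep only the data rows
}

object CsvRowsDemo extends App {
  val sample = Iterator("#c1,c2,c3", "a,b,c", "d,e,f", "#c4,c5", "g,h", "i,j")
  CsvRows.rows(sample).foreach(println)
}
```

<p>For a real file, the lines could presumably come from <code>scala.io.Source.fromFile(path).getLines()</code>. If the line number is also needed, one option would be to apply <code>zipWithIndex</code> to the lines before the scan and carry the index in the state as well.</p>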