    <p>Reading delimited text files is not as simple as it might first appear.</p>
    <p>If your semicolon-delimited file has 16 columns, the array resulting from splitting a line <em>should</em> be of length 16 (meaning the highest offset into the array is +15). It <em>might</em> be less, if any of the following is true for any line in the source data:</p>
    <ol> <li>You have a short record in the file.</li> <li>You have a record where one field contains an embedded CR, LF or CR+LF pair, thus splitting the record into two (or more) lines and resulting in case #1 above.</li> </ol>
    <p>You might wind up with more columns than you think, too. The primary reason is that real-world data is often unclean. People have been known to litter data with delimiter characters, such as commas or semicolons. When you do a naive <code>Split()</code> on the text, you don't always get what you want. This is especially true for "CSV" files, the format being rather [cough] <em>loosely</em> defined, and even more loosely implemented.</p>
    <p>You might want to look at using Sebastien Lorion's <a href="http://www.codeproject.com/Articles/9258/A-Fast-CSV-Reader" rel="nofollow">Fast CSV Reader</a> from <a href="http://www.codeproject.com/" rel="nofollow">CodeProject</a> for this. It works quite well and takes care of a lot of the... unexpected cases you might encounter.</p>
    <p>Other references you might want to take a look at:</p>
    <ul> <li><a href="http://www.rfc-editor.org/info/rfc4180" rel="nofollow">RFC 4180</a>, <em>Common Format and MIME Type for Comma-Separated Values (CSV) Files</em></li> <li><a href="http://creativyst.com/Doc/Articles/CSV/CSV01.htm" rel="nofollow">http://creativyst.com/Doc/Articles/CSV/CSV01.htm</a></li> <li><a href="http://en.wikipedia.org/wiki/Comma-separated_values" rel="nofollow">http://en.wikipedia.org/wiki/Comma-separated_values</a></li> </ul>
    <p><strong>Edited to note:</strong> The Library of Congress seems to have weighed in on the CSV format as well: <a href="http://www.digitalpreservation.gov/formats/fdd/fdd000323.shtml" rel="nofollow">http://www.digitalpreservation.gov/formats/fdd/fdd000323.shtml</a></p>
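<p>To make the failure modes concrete, here is a minimal sketch in Python rather than .NET, so it uses Python's standard <code>csv</code> module as a stand-in for a quote-aware parser and not the Fast CSV Reader mentioned above. The sample data, the three-column layout and the helper names (<code>naive_rows</code>, <code>parsed_rows</code>, <code>bad_rows</code>) are all made up for illustration.</p>

```python
import csv
import io

# Hypothetical sample: 3 semicolon-delimited columns. The second record has
# one quoted field with an embedded delimiter and another with an embedded
# CR+LF pair.
SAMPLE = 'id;name;note\r\n1;"Smith; John";"line one\r\nline two"\r\n2;Jones;ok\r\n'


def naive_rows(text):
    """Split on line breaks, then on the delimiter, ignoring quoting."""
    return [line.split(";") for line in text.splitlines()]


def parsed_rows(text):
    """Let the csv module deal with quotes, embedded delimiters and newlines."""
    # newline="" stops the stream from translating line endings itself,
    # so line breaks inside quoted fields reach the parser unchanged.
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=";"))


def bad_rows(rows, expected):
    """Return (row_number, field_count) for every row of the wrong width."""
    return [(n, len(r)) for n, r in enumerate(rows, start=1) if len(r) != expected]


if __name__ == "__main__":
    print("naive :", bad_rows(naive_rows(SAMPLE), 3))
    print("parsed:", bad_rows(parsed_rows(SAMPLE), 3))
```

<p>With this sample, the naive split reports one row that is too wide (the embedded semicolon) and one that is too short (the tail of the record broken by the embedded line break), while the quote-aware parser yields three rows of three fields each. Checking the field count of every row before indexing into it is cheap insurance whichever parser you use.</p>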