<p>In the "Details" part of <code>?scan</code> (used by <code>read.table</code>, <code>read.csv</code> and so on):</p>
<pre><code>If ‘sep’ is non-default, the fields may be quoted in the style of ‘.csv’
files where separators inside quotes (‘''’ or ‘""’) are ignored and
quotes may be put inside strings by doubling them. However, if
‘sep = "\n"’ it is assumed by default that one wants to read entire
lines verbatim.
</code></pre>
<p>So it looks like the backslash-escaped quote <code>\"</code> in that line is causing the trouble: R expects a quote inside a quoted CSV field to be written as a doubled quote <code>""</code>, not as a backslashed quote <code>\"</code>.</p>
<p>I think your best bet here is to replace the backslashed quotes with doubled quotes, either with a shell tool or in R (R example below):</p>
<pre><code>txt &lt;- readLines("tmp.txt")
# Four backslashes are needed to match one literal backslash:
# the R string literal halves them, and the regex engine halves them again.
txt &lt;- gsub('\\\\"', '""', txt)
# if you `cat(txt, sep='\n')` you will see that the `\"` is now `""`
</code></pre>
<p>Then you can use <code>read.csv</code> or <code>scan</code> like before (note the <code>textConnection(txt)</code>, which turns the character vector into a file-like connection that <code>read.csv</code> or <code>scan</code> can read from):</p>
<pre><code>read.csv(textConnection(txt), ...)
</code></pre>
<hr>
<h2>Edit/Addition</h2>
<p>Re OP's comment: the file is 1.4GB and there are difficulties reading it all into R at once, so how to do the sanitizing?</p>
<h3>Option 1</h3>
<p>You appear to be on Linux, so you could use <code>sed</code> to edit the file in place (this overwrites the file, so keep a copy if you need the original):</p>
<pre><code>sed -i -e 's!\\"!""!g' myfile.txt
</code></pre>
<p>(Depending on where your data comes from, perhaps you could adjust the program that produces it to write the format you need in the first place, but this is not always possible.)</p>
<h3>Option 2</h3>
<p>If you would rather not use the shell or want a pure-R solution, use the <code>n</code> parameter of <code>readLines</code> to read in only a few lines at a time:</p>
<pre><code># create the file object and open it, see ?file
f &lt;- file('tmp.txt')
open(f)
# now read in 100 lines at a time, say; stop when nothing is left
while (length(txt &lt;- readLines(f, n=100)) &gt; 0) {
    # now do the sanitizing/coercing into a data frame, store.
    # ...
}
close(f)
</code></pre>
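<p>To make Option 2 concrete, here is a minimal sketch of the chunked approach that writes a sanitized copy of the file instead of holding everything in memory. The function name <code>sanitize_quotes</code>, the chunk size and the temporary file names are assumptions for illustration only; they are not part of the original question.</p>

```r
# Hypothetical helper (not from the original answer): stream a large file
# chunk by chunk, rewriting backslash-escaped quotes (\") as doubled quotes ("").
sanitize_quotes <- function(infile, outfile, chunk_size = 10000L) {
  fin  <- file(infile, open = "r")
  fout <- file(outfile, open = "w")
  on.exit({ close(fin); close(fout) })
  repeat {
    txt <- readLines(fin, n = chunk_size, warn = FALSE)
    if (length(txt) == 0L) break
    # fixed = TRUE matches the two characters \" literally, so no regex escaping
    writeLines(gsub('\\"', '""', txt, fixed = TRUE), fout)
  }
  invisible(outfile)
}

# Small demonstration on a throwaway file
tmp_in  <- tempfile(fileext = ".csv")
tmp_out <- tempfile(fileext = ".csv")
writeLines(c('id,comment', '1,"he said \\"hi\\", then left"'), tmp_in)
sanitize_quotes(tmp_in, tmp_out, chunk_size = 1L)
df <- read.csv(tmp_out)
```

<p>Because the replacement is done line by line, the chunk size only affects memory use and speed, not the result. The sanitized copy can then be read with <code>read.csv</code> as usual, or in pieces if it is still too large for memory.</p>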