Problems importing csv file/converting from integer to double in R
    <p>Today I finally decided to start climbing R's steep learning curve. I have spent a few hours on it and managed to import my dataset and do a few other basic things, but I am having trouble with the data type: <strong>a column which contains decimals is imported as integer, and conversion to double changes the values</strong>.</p>
    <p>In trying to get a small csv file to put here as an example, I discovered that <strong>the problem only happens when the data file is too large</strong> (my original file is a 1048418 by 12 matrix, but even with "only" 5000 rows I have the same problem; when I only have 100, 1000 or even 2000 rows, the column is imported correctly as double).</p>
    <p><a href="http://dl.getdropbox.com/u/1885087/exampleshort.csv">Here</a> is a smaller dataset (still 500 kB, but again, if the dataset is small the problem is not replicated). The code is:</p>
<pre><code>&gt; ex &lt;- read.csv("exampleshort.csv",header=TRUE)
&gt; typeof(ex$RET)
[1] "integer"
</code></pre>
    <p>Why is the column of returns being imported as integer when the file is large, when it is clearly of type double?</p>
    <p>The worst thing is that if I try to convert it to double, the values are changed:</p>
<pre><code>&gt; exdouble &lt;- as.double(ex$RET)
&gt; typeof(exdouble)
[1] "double"
&gt; ex$RET[1:5]
[1] 0.005587 -0.005556 -0.005587 0.005618 -0.001862
2077 Levels: -0.000413 -0.000532 -0.001082 -0.001199 -0.0012 -0.001285 -0.001337 -0.001351 -0.001357 -0.001481 -0.001486 -0.001488 ... 0.309524
&gt; exdouble[1:5]
[1] 1305 321 322 1307 41
</code></pre>
    <p>This is not the only column that is imported incorrectly, but I figured that if I find a solution for one column, I should be able to sort the other ones out. Here is some more information:</p>
<pre><code>&gt; sapply(ex,class)
   PERMNO      DATE    COMNAM     SICCD       PRC       RET      RETX    SHROUT    VWRETD    VWRETX    EWRETD    EWRETX
"integer" "integer"  "factor" "integer"  "factor"  "factor"  "factor" "integer" "numeric" "numeric" "numeric" "numeric"
</code></pre>
    <p>They should be in this order: integer, date, string, integer, double, double, double, integer, double, double, double, double (the type names are probably wrong, but hopefully you will get what I mean).</p>
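The symptoms in the post (class "factor", typeof "integer", and as.double() returning values such as 1305) can be reproduced in a minimal sketch. The sketch assumes one common cause that the post itself does not confirm: a non-numeric entry somewhere in the column, which makes read.csv treat the whole column as text. The inline data and the marker "C" are hypothetical stand-ins, not taken from exampleshort.csv.

```r
# Hypothetical stand-in for the RET column. ASSUMPTION: a single non-numeric
# entry ("C") is present, so read.csv cannot parse the column as numbers.
csv_text <- "RET\n0.005587\n-0.005556\nC\n0.005618\n"

# stringsAsFactors = TRUE mirrors the default of R versions before 4.0.
ex <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = TRUE)

class(ex$RET)   # "factor"
typeof(ex$RET)  # "integer": a factor is stored as integer level codes

# as.double() on a factor returns the level codes, not the printed values.
codes <- as.double(ex$RET)

# Going through character recovers the numbers; unparseable entries become NA.
values <- suppressWarnings(as.numeric(as.character(ex$RET)))

# Alternative: declare the non-numeric marker as missing at import time,
# so the column is read as numeric directly.
ex2 <- read.csv(text = csv_text, header = TRUE, na.strings = "C")
typeof(ex2$RET)
```

If this assumption holds for the real file, inspecting the entries that fail to parse, for example with unique(ex$RET[is.na(suppressWarnings(as.numeric(as.character(ex$RET))))]), would show which values are responsible. It would also be consistent with the problem appearing only once enough rows are read to include such an entry.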