  1. Importing CSV data containing commas, thousand separators and trailing minus sign
<p>R 2.13.1 on Mac OS X. I'm trying to import a data file that uses a point as the thousand separator and a comma as the decimal point, with a trailing minus for negative values.</p>
<p>Basically, I'm trying to convert from:</p>
<pre><code>"A|324,80|1.324,80|35,80-"
</code></pre>
<p>to</p>
<pre><code>  V1     V2     V3     V4
1  A 324.80 1324.8 -35.80
</code></pre>
<p>Now, interactively, both of the following work:</p>
<pre><code>gsub("\\.","","1.324,80")
[1] "1324,80"
gsub("(.+)-$","-\\1", "35,80-")
[1] "-35,80"
</code></pre>
<p>and so does combining them:</p>
<pre><code>gsub("\\.", "", gsub("(.+)-$","-\\1","1.324,80-"))
[1] "-1324,80"
</code></pre>
<p>However, I'm not able to remove the thousand separator when reading with read.table:</p>
<pre><code>setClass("num.with.commas")
setAs("character", "num.with.commas",
      function(from) as.numeric(gsub("\\.", "", sub("(.+)-$","-\\1",from))) )
mydata &lt;- "A|324,80|1.324,80|35,80-"
mytable &lt;- read.table(textConnection(mydata), header=FALSE, quote="",
                      comment.char="", sep="|", dec=",", skip=0,
                      fill=FALSE, strip.white=TRUE,
                      colClasses=c("character","num.with.commas",
                                   "num.with.commas", "num.with.commas"))
Warning messages:
1: In asMethod(object) : NAs introduced by coercion
2: In asMethod(object) : NAs introduced by coercion
3: In asMethod(object) : NAs introduced by coercion
mytable
  V1 V2 V3 V4
1  A NA NA NA
</code></pre>
<p>Note that if I change the pattern from "\\." to "," in the function, things look a bit different:</p>
<pre><code>setAs("character", "num.with.commas",
      function(from) as.numeric(gsub(",", "", sub("(.+)-$","-\\1",from))) )
mytable &lt;- read.table(textConnection(mydata), header=FALSE, quote="",
                      comment.char="", sep="|", dec=",", skip=0,
                      fill=FALSE, strip.white=TRUE,
                      colClasses=c("character","num.with.commas",
                                   "num.with.commas", "num.with.commas"))
mytable
  V1    V2     V3    V4
1  A 32480 1.3248 -3580
</code></pre>
<p>I think the problem is that read.table with dec="," converts the incoming "," to "." BEFORE calling as(from, "num.with.commas"), so that the input string can be e.g. "1.324.80".</p>
<p>I want as("1.123,80-","num.with.commas") to return -1123.80 and as("1.100.123,80", "num.with.commas") to return 1100123.80.</p>
<p>How can I make my num.with.commas remove every point <strong>except the last</strong> one (the decimal point) in the input string?</p>
<p><strong>Update</strong>: First, I added a negative lookahead and got as() working in the console:</p>
<pre><code>setAs("character", "num.with.commas",
      function(from) as.numeric(gsub("(?!\\.\\d\\d$)\\.", "",
                                     gsub("(.+)-$","-\\1",from), perl=TRUE)) )
as("1.210.123.80-","num.with.commas")
[1] -1210124
as("10.123.80-","num.with.commas")
[1] -10123.8
as("10.123.80","num.with.commas")
[1] 10123.8
</code></pre>
<p>However, read.table still had the same problem. Adding some print()s to my function showed that num.with.commas in fact received the comma and not the point.</p>
<p>So my current solution is to additionally replace "," with "." inside num.with.commas:</p>
<pre><code>setAs("character", "num.with.commas",
      function(from) as.numeric(gsub(",","\\.",
                                     gsub("(?!\\.\\d\\d$)\\.", "",
                                          gsub("(.+)-$","-\\1",from), perl=TRUE))) )
mytable &lt;- read.table(textConnection(mydata), header=FALSE, quote="",
                      comment.char="", sep="|", dec=",", skip=0,
                      fill=FALSE, strip.white=TRUE,
                      colClasses=c("character","num.with.commas",
                                   "num.with.commas", "num.with.commas"))
mytable
  V1    V2      V3    V4
1  A 324.8 1101325 -35.8
</code></pre>
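<p>For illustration, the same conversion can be sketched without a lookahead. This is only a minimal sketch, under two assumptions: a point only ever appears as a thousand separator and a comma only as the decimal mark, and the conversion function receives the raw field text, as the print() check described above suggested. The helper name <code>parse_eu_number</code> is hypothetical and not part of the original code.</p>
<pre><code># Hypothetical helper: convert strings such as "1.324,80-" to numeric.
# Assumes "." is only a thousand separator and "," only the decimal mark.
parse_eu_number &lt;- function(x) {
  x &lt;- sub("^(.*)-$", "-\\1", x)        # move a trailing minus to the front
  x &lt;- gsub(".", "", x, fixed = TRUE)   # drop every thousand separator
  x &lt;- sub(",", ".", x, fixed = TRUE)   # decimal comma becomes decimal point
  as.numeric(x)
}

# Checks against the values asked for in the question.
stopifnot(isTRUE(all.equal(parse_eu_number("1.123,80-"), -1123.80)))
stopifnot(isTRUE(all.equal(parse_eu_number("1.100.123,80"), 1100123.80)))

# Wiring the helper into read.table through the same class name as above.
setClass("num.with.commas")
setAs("character", "num.with.commas", function(from) parse_eu_number(from))

mydata &lt;- "A|324,80|1.324,80|35,80-"
mytable &lt;- read.table(textConnection(mydata), header = FALSE, quote = "",
                      comment.char = "", sep = "|", strip.white = TRUE,
                      colClasses = c("character", "num.with.commas",
                                     "num.with.commas", "num.with.commas"))
</code></pre>
<p>Because the helper deals with the comma itself, this sketch leaves out dec=","; if the columns really do arrive as raw text, that argument should make no difference for them.</p>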