Regexp/Perl code for handling both dots and commas as valid decimal separators
<p>I'm trying to create a method that provides "best effort" parsing of decimal inputs in cases where I do not know which of these two mutually exclusive ways of writing numbers the end user is using:</p>
<ul>
<li>"." as thousands separator and "," as decimal separator</li>
<li>"," as thousands separator and "." as decimal separator</li>
</ul>
<p>The method is implemented as <code>parse_decimal(..)</code> in the code below. Furthermore, I've defined 20 test cases that show how the heuristics of the method should work.</p>
<p>While the code below passes the tests, it is quite horrible and unreadable. I'm sure there is a more compact and readable way to implement the method, possibly including smarter use of regexps.</p>
<p>My question is simply: <b>Given the code below and the test cases, how would you improve parse_decimal(...) to make it more compact and readable while still passing the tests?</b></p>
<p>Clarifications:</p>
<ul>
<li>Clarification #1: As pointed out in the comments, the case <code>^\d{1,3}[\.,]\d{3}$</code> is ambiguous in that one cannot determine logically which character is used as thousands separator and which is used as decimal separator. In ambiguous cases we'll simply assume that US-style decimals are used: "," as thousands separator and "." as decimal separator.</li>
<li>Clarification #2: If you believe that any of the test cases is wrong, please state which tests should be changed and how.</li>
</ul>
<p>The code in question, including the test cases:</p>
<pre><code>#!/usr/bin/perl -wT

use strict;
use warnings;
use Test::More tests =&gt; 20;

ok(&amp;parse_decimal("1,234,567") == 1234567);
ok(&amp;parse_decimal("1,234567") == 1.234567);
ok(&amp;parse_decimal("1.234.567") == 1234567);
ok(&amp;parse_decimal("1.234567") == 1.234567);
ok(&amp;parse_decimal("12,345") == 12345);
ok(&amp;parse_decimal("12,345,678") == 12345678);
ok(&amp;parse_decimal("12,345.67") == 12345.67);
ok(&amp;parse_decimal("12,34567") == 12.34567);
ok(&amp;parse_decimal("12.34") == 12.34);
ok(&amp;parse_decimal("12.345") == 12345);
ok(&amp;parse_decimal("12.345,67") == 12345.67);
ok(&amp;parse_decimal("12.345.678") == 12345678);
ok(&amp;parse_decimal("12.34567") == 12.34567);
ok(&amp;parse_decimal("123,4567") == 123.4567);
ok(&amp;parse_decimal("123.4567") == 123.4567);
ok(&amp;parse_decimal("1234,567") == 1234.567);
ok(&amp;parse_decimal("1234.567") == 1234.567);
ok(&amp;parse_decimal("12345") == 12345);
ok(&amp;parse_decimal("12345,67") == 12345.67);
ok(&amp;parse_decimal("1234567") == 1234567);

sub parse_decimal($) {
    my $input = shift;
    $input =~ s/[^\d,\.]//g;
    if ($input !~ /[,\.]/) {
        return &amp;parse_with_separators($input, '.', ',');
    } elsif ($input =~ /\d,\d+\.\d/) {
        return &amp;parse_with_separators($input, '.', ',');
    } elsif ($input =~ /\d\.\d+,\d/) {
        return &amp;parse_with_separators($input, ',', '.');
    } elsif ($input =~ /\d\.\d+\.\d/) {
        return &amp;parse_with_separators($input, ',', '.');
    } elsif ($input =~ /\d,\d+,\d/) {
        return &amp;parse_with_separators($input, '.', ',');
    } elsif ($input =~ /\d{4},\d/) {
        return &amp;parse_with_separators($input, ',', '.');
    } elsif ($input =~ /\d{4}\.\d/) {
        return &amp;parse_with_separators($input, '.', ',');
    } elsif ($input =~ /\d,\d{3}$/) {
        return &amp;parse_with_separators($input, '.', ',');
    } elsif ($input =~ /\d\.\d{3}$/) {
        return &amp;parse_with_separators($input, ',', '.');
    } elsif ($input =~ /\d,\d/) {
        return &amp;parse_with_separators($input, ',', '.');
    } elsif ($input =~ /\d\.\d/) {
        return &amp;parse_with_separators($input, '.', ',');
    } else {
        return &amp;parse_with_separators($input, '.', ',');
    }
}

sub parse_with_separators($$$) {
    my $input = shift;
    my $decimal_separator = shift;
    my $thousand_separator = shift;
    my $output = $input;
    $output =~ s/\Q${thousand_separator}\E//g;
    $output =~ s/\Q${decimal_separator}\E/./g;
    return $output;
}
</code></pre>
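<p>For illustration only, the heuristics implied by the 20 test cases can be condensed into three rules: if both separators occur, the one that appears last is the decimal separator; a single separator is a decimal separator unless the input has the ambiguous shape "1 to 3 digits, separator, 3 digits"; in every other case all separators are thousands separators. The sketch below is one possible reading of those rules, not a definitive implementation. The name <code>parse_decimal_sketch</code> is hypothetical, and the behaviour has only been reasoned through for the 20 inputs listed in the question. Note that it follows the test cases, which expect both "12,345" and "12.345" to yield 12345.</p>

```perl
use strict;
use warnings;

# Hypothetical compact variant of parse_decimal(); an assumption-laden sketch
# that is only meant to reproduce the 20 test cases from the question.
sub parse_decimal_sketch {
    my ($input) = @_;
    $input =~ s/[^\d,.]//g;               # keep digits and separators only

    my $commas = ($input =~ tr/,//);      # number of "," characters
    my $dots   = ($input =~ tr/.//);      # number of "." characters

    my $decimal;                          # stays undef if there is no decimal part
    if ($commas && $dots) {
        # Both kinds present: the one that appears last is the decimal separator.
        $decimal = rindex($input, ',') > rindex($input, '.') ? ',' : '.';
    }
    elsif ($commas + $dots == 1 && $input !~ /^\d{1,3}[.,]\d{3}$/) {
        # Exactly one separator that does not look like a thousands group.
        $decimal = $commas ? ',' : '.';
    }

    if (defined $decimal) {
        my $thousands = $decimal eq ',' ? '.' : ',';
        $input =~ s/\Q$thousands\E//g;    # drop thousands separators
        $input =~ s/\Q$decimal\E/./;      # normalise the decimal separator to "."
    }
    else {
        $input =~ s/[.,]//g;              # every separator is a thousands separator
    }
    return $input;
}

print parse_decimal_sketch("12.345,67"), "\n";
print parse_decimal_sketch("1,234567"), "\n";
```

<p>Counting the separators with <code>tr</code> and comparing positions with <code>rindex</code> replaces the chain of eleven pattern matches with two decisions, at the cost of possibly behaving differently from the original on inputs outside the listed test cases.</p>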