Testing query string Unicode handling in Perl
<p>I was writing up an example of testing query string parsing when I got stumped by a Unicode issue. In short, the letter "Omega" (Ω) doesn't seem to be decoded correctly.</p>
<ul>
<li>Unicode: U+2126</li>
<li>3-byte UTF-8 sequence: \xe2\x84\xa6</li>
<li>URI encoded: %E2%84%A6</li>
</ul>
<p>So I wrote this test program to verify that I could "decode" Unicode query strings with URI::Encode.</p>
<pre><code>use strict;
use warnings;
use utf8::all;    # use before Test::Builder clones STDOUT, etc.
use URI::Encode 'uri_decode';
use Test::More;

sub parse_query_string {
    my $query_string = shift;
    my @pairs = split /[&amp;;]/ =&gt; $query_string;

    my %values_for;
    foreach my $pair (@pairs) {
        my ( $key, $value ) = split( /=/, $pair );
        $_ = uri_decode($_) for $key, $value;
        $values_for{$key} ||= [];
        push @{ $values_for{$key} } =&gt; $value;
    }
    return \%values_for;
}

my $omega = "\N{U+2126}";
my $query = parse_query_string('alpha=%E2%84%A6');
is_deeply $query, { alpha =&gt; [$omega] }, 'Unicode should decode correctly';

diag $omega;
diag $query-&gt;{alpha}[0];

done_testing;
</code></pre>
<p>And the output of the test:</p>
<pre><code>query.t ..
not ok 1 - Unicode should decode correctly
#   Failed test 'Unicode should decode correctly'
#   at query.t line 23.
#     Structures begin differing at:
#          $got-&gt;{alpha}[0] = 'â¦'
#     $expected-&gt;{alpha}[0] = 'Ω'
# Ω
# ⦠
1..1
# Looks like you failed 1 test of 1.
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/1 subtests

Test Summary Report
-------------------
query.t (Wstat: 256 Tests: 1 Failed: 1)
  Failed test:  1
  Non-zero exit status: 1
Files=1, Tests=1,  0 wallclock secs ( 0.03 usr  0.01 sys +  0.05 cusr  0.00 csys =  0.09 CPU)
Result: FAIL
</code></pre>
<p>It looks to me like URI::Encode may be broken here, but switching to URI::Escape and using the uri_unescape function produces the same failure. What am I missing?</p>
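<p>For reference, the three representations listed in the question (code point, UTF-8 bytes, percent-encoded form) can be related using only core Perl. This is a minimal sketch, not a confirmed diagnosis of the failing test: <code>percent_decode_octets</code> is a hypothetical helper written for illustration and is not part of URI::Encode or URI::Escape. It assumes that percent-decoding yields raw octets, which then need a separate UTF-8 decoding step before they compare equal to a character string.</p>

```perl
use strict;
use warnings;
use Encode qw(decode encode);    # core module

# Hypothetical helper (an assumption for illustration only):
# replace each %XX escape with the single octet it names.
sub percent_decode_octets {
    my ($escaped) = @_;
    ( my $octets = $escaped ) =~ s/%([0-9A-Fa-f]{2})/chr( hex($1) )/ge;
    return $octets;
}

my $omega  = "\N{U+2126}";                          # one character
my $octets = percent_decode_octets('%E2%84%A6');    # three octets
my $chars  = decode( 'UTF-8', $octets );            # octets -> characters

my $same_octets = $octets eq encode( 'UTF-8', $omega );
my $same_chars  = $chars eq $omega;

printf "octets: %d, characters: %d\n", length($octets), length($chars);
print "octets match the UTF-8 encoding of U+2126: ", ( $same_octets ? "yes" : "no" ), "\n";
print "decoded string equals U+2126: ",              ( $same_chars  ? "yes" : "no" ), "\n";
```

<p>If the same distinction applies to the test program, the undecoded three-octet string would be one plausible source of the mismatch that <code>is_deeply</code> reports, though that is an assumption worth checking against the modules' documentation.</p>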
 
