<p>Unfortunately, your operating system's pathname API is another "binary interface" where you have to use <code>Encode::encode</code> and <code>Encode::decode</code> to get predictable results.</p>
<p>Most operating systems treat pathnames as a sequence of octets (i.e. bytes). Whether that sequence should be interpreted as Latin-1, UTF-8 or some other character encoding is an application decision. Consequently, the value returned by <code>readdir()</code> is simply a sequence of octets, and <code>File::Find</code> doesn't know that you want the path name as Unicode code points. It forms <code>$File::Find::name</code> by simply concatenating the directory path (which you supplied) with the value returned by your OS via <code>readdir()</code>, and that's how you got code points mashed with octets.</p>
<p>Rule of thumb: whenever you pass a path name to the OS, <code>Encode::encode()</code> it first so that it is a sequence of octets. Whenever you get a path name from the OS, <code>Encode::decode()</code> it from the encoding the file system uses into the character string your application wants.</p>
<p>You can make your program work by calling <code>find</code> this way:</p>
<pre><code>find( sub { ... }, Encode::encode('utf8', 'Delibes, Léo') );
</code></pre>
<p>And then calling <code>Encode::decode()</code> when using the value of <code>$File::Find::name</code>:</p>
<pre><code>my $path = Encode::decode('utf8', $File::Find::name);
</code></pre>
<p>To be clearer, this is how <code>$File::Find::name</code> was formed:</p>
<pre><code>use feature 'say';
use Encode;

# This is a way to get $dir to be stored internally as a UTF-8 string:
# appending chr(256) upgrades the string, and chop removes that character again
my $dir = 'L' . chr(233) . 'o' . chr(256);
chop $dir;

say "dir: ", d($dir); # length = 3

# This is what readdir() is returning:
my $leaf = encode('utf8', 'Lakem' . chr(233));

say "leaf: ", d($leaf); # length = 7

$File::Find::name = $dir . '/' . $leaf;

say "File::Find::name: ", d($File::Find::name);

# Dump a string as space-separated hex values, one per character
sub d { join(' ', map { sprintf("%02X", ord($_)) } split('', $_[0])) }
</code></pre>
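<p>As a minimal sketch of the rule of thumb above, the encode and decode steps can be wrapped in one helper. Note that <code>find_decoded</code> is a hypothetical name, not part of <code>File::Find</code>, and the example assumes the file system stores names as UTF-8; adjust the encoding argument if yours does not:</p>
<pre><code>use strict;
use warnings;
use utf8;                 # this source file contains a literal "é"
use feature 'say';
use Encode qw(encode decode);
use File::Find;

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: encode the start directory on the way in,
# decode every path that File::Find builds on the way out.
sub find_decoded {
    my ($dir, $encoding) = @_;
    my @paths;
    find(
        sub {
            my $octets = $File::Find::name;   # copy, so decode() cannot touch the original
            push @paths, decode($encoding, $octets);
        },
        encode($encoding, $dir),              # the OS gets octets, never code points
    );
    return @paths;
}

# Assumption: the directory name and encoding are illustrative only.
say for find_decoded('Delibes, Léo', 'UTF-8');
</code></pre>
<p>Returning a list of decoded paths keeps all octet handling inside the helper, so the rest of the program only ever sees character strings.</p>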