<p>The Perl implementation uses the <a href="http://en.wikipedia.org/wiki/Mmap" rel="nofollow noreferrer">mmap</a> system call. That call maps the contents of a file into a region of memory and returns a pointer to it, so to the program the file appears to be a normal memory segment or buffer. There are performance advantages to doing this vs normal file IO (read): one is that no user-space library calls are necessary to get access to the data, another is that there are often fewer copy operations necessary (e.g. moving data between kernel and user space).</p> <p>Perl's strings and regular expressions are 8-bit byte based (as opposed to UTF-16 for Java, for example), so Perl's native 'character type' has the same encoding as the mmapped file.</p> <p>When the regular expression engine then operates on the mmap-backed variable, it accesses the file data directly via the mmapped memory region, without going through Perl's IO functions or even libc's IO functions.</p> <p>The mmap is probably largely responsible for the performance difference vs the Python version, which uses the normal Python IO libraries and additionally incurs the overhead of looking for line breaks.</p> <p>The Perl program also supports a -J option to parallelize the processing, where the open "-|" causes a fork() and the file handle in the parent is connected to the child's stdout. The child processes serialize their results to stdout and the parent deserializes them to coordinate and summarize the results.</p>
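<p>As a rough illustration of the mmap-plus-regex idea, here is a minimal sketch in Python rather than Perl. It is not the code of either program discussed above; the function name <code>count_matches</code>, the sample data, and the pattern are all hypothetical. It only shows that a bytes regular expression can scan a memory-mapped file directly, with no read() calls and no line splitting.</p>

```python
import mmap
import os
import re
import tempfile


def count_matches(path, pattern):
    """Count regex matches by searching a read-only mmap of the file."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # A bytes pattern can scan the mapping via the buffer protocol,
            # so the data is never copied into a separate bytes object first.
            return len(re.findall(pattern, mm))


# Hypothetical sample data, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"GET /a 200\nGET /b 404\nPOST /c 200\n")
    sample_path = tmp.name

hits = count_matches(sample_path, rb"(?m)^GET ")
os.remove(sample_path)
print(hits)  # 2
```

<p>The pattern has to be a bytes pattern (<code>rb"..."</code>) because the mapping exposes raw bytes, which mirrors the point above about Perl's byte-based strings matching the file's encoding.</p>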
 
