
<p>According to the <a href="http://docs.python.org/library/functions.html#open" rel="noreferrer">documentation for <code>open()</code></a>, you should add a <code>U</code> to the mode:</p>
<pre><code>open('textbase.txt', 'Ur')
</code></pre>
<p>This enables "<a href="http://www.python.org/dev/peps/pep-0278/" rel="noreferrer">universal newlines</a>", which normalizes them to <code>\n</code> in the strings it gives you.</p>
<p>However, the correct thing to do is to decode the UTF-16BE into Unicode objects <em>first</em>, before translating the newlines. Otherwise, a chance <code>0x0d</code> byte could get erroneously turned into a <code>0x0a</code>, resulting in</p>
<blockquote>
<p>UnicodeDecodeError: 'utf16' codec can't decode byte 0x0a in position 12: truncated data.</p>
</blockquote>
<p>Python's <a href="http://docs.python.org/library/codecs.html" rel="noreferrer"><code>codecs</code> module</a> supplies an <code>open</code> function that can decode Unicode and handle newlines at the same time:</p>
<pre><code>import codecs
for line in codecs.open('textbase.txt', 'Ur', 'utf-16be'):
    ...
</code></pre>
<p>If the file has a byte order mark (BOM) and you specify <code>'utf-16'</code>, then it detects the endianness and hides the BOM for you. If it does not (since the BOM is optional), then that decoder will just go ahead and use your system's endianness, which probably won't be good.</p>
<p>Specifying the endianness yourself (with <code>'utf-16be'</code>) will not hide the BOM, so you might wish to use this hack:</p>
<pre><code>import codecs
firstline = True
for line in codecs.open('textbase.txt', 'Ur', 'utf-16be'):
    if firstline:
        firstline = False
        line = line.lstrip(u'\ufeff')
</code></pre>
<p>See also: <a href="http://www.amk.ca/python/howto/unicode" rel="noreferrer">Python Unicode HOWTO</a></p>
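The decode-first approach described above can also be sketched with the standard `io.open` function, which decodes the bytes and only then translates newlines. This is a minimal sketch, not the answer's own method: it assumes Python 3 (or Python 2.6+ through the `io` module), and the helper name `read_utf16_lines` is hypothetical.

```python
import io

BOM = u"\ufeff"


def read_utf16_lines(path, encoding="utf-16-be"):
    """Return the decoded lines of a UTF-16 file with newlines normalized.

    Hypothetical helper: the name and the default encoding are assumptions
    made for illustration.
    """
    lines = []
    # In text mode the bytes are decoded first; newline=None then maps
    # "\r\n" and "\r" to "\n" on the decoded characters, not on raw bytes.
    with io.open(path, "r", encoding=encoding, newline=None) as handle:
        for index, line in enumerate(handle):
            # An explicit-endianness codec such as utf-16-be keeps the BOM,
            # so drop it from the start of the first line if it is there.
            if index == 0 and line.startswith(BOM):
                line = line[len(BOM):]
            lines.append(line)
    return lines
```

Called as `read_utf16_lines('textbase.txt')`, it returns a list of Unicode strings. Passing `encoding="utf-16"` instead lets the decoder consume a leading BOM itself, in which case the explicit BOM check simply finds nothing to strip.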
 
