<p>In Python 2 there are unicode strings and bytestrings. If you only use bytestrings, you can read from and write to a file opened with <code>open()</code> just fine. After all, the strings are just bytes.</p>
<p>The problem comes when, say, you have a unicode string and you do the following:</p>
<pre><code>&gt;&gt;&gt; example = u'Μου αρέσει Ελληνικά'
&gt;&gt;&gt; open('sample.txt', 'w').write(example)
Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in &lt;module&gt;
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
</code></pre>
<p>So here you either encode your unicode string explicitly (for example as UTF-8) or use <code>codecs.open</code> to do it for you transparently.</p>
<p>If you're only ever using bytestrings, there is no problem:</p>
<pre><code>&gt;&gt;&gt; example = 'Μου αρέσει Ελληνικά'
&gt;&gt;&gt; open('sample.txt', 'w').write(example)
&gt;&gt;&gt;
</code></pre>
<p>It gets more involved than this, because when you concatenate a unicode string and a bytestring with the <code>+</code> operator you get a unicode string: Python 2 implicitly decodes the bytestring as ASCII, which fails if it contains non-ASCII bytes. Easy to get bitten by that one.</p>
<p>Also, <code>codecs.open</code> doesn't like bytestrings with non-ASCII characters being passed in:</p>
<pre><code>&gt;&gt;&gt; codecs.open('test', 'w', encoding='utf-8').write('Μου αρέσει')
Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in &lt;module&gt;
  File "/usr/lib/python2.7/codecs.py", line 691, in write
    return self.writer.write(data)
  File "/usr/lib/python2.7/codecs.py", line 351, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xce in position 0: ordinal not in range(128)
</code></pre>
<p>The usual advice about strings for input/output is "convert to unicode as early as possible and back to bytestrings as late as possible". Using <code>codecs.open</code> allows you to do the latter very easily.</p>
<p>Just be careful that you are giving it unicode strings and not bytestrings that may contain non-ASCII characters.</p>
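<p>A minimal sketch of the two options mentioned above (explicit encoding versus <code>codecs.open</code>), written so that it should run on both Python 2.7 and Python 3. The file names and the temporary directory are assumptions made for illustration only.</p>

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

# A unicode (text) string; the u'' prefix is required on Python 2.
text = u'Μου αρέσει Ελληνικά'

# Hypothetical scratch files, used only for this illustration.
workdir = tempfile.mkdtemp()
explicit_path = os.path.join(workdir, 'explicit.txt')
codecs_path = os.path.join(workdir, 'codecs.txt')

# Option 1: encode explicitly and write the resulting bytes.
with open(explicit_path, 'wb') as f:
    f.write(text.encode('utf-8'))

# Option 2: hand codecs.open a unicode string and let it encode on write.
with codecs.open(codecs_path, 'w', encoding='utf-8') as f:
    f.write(text)

# Read both files back as raw bytes to compare what was stored.
with open(explicit_path, 'rb') as f:
    raw_explicit = f.read()
with open(codecs_path, 'rb') as f:
    raw_codecs = f.read()

# "Decode early": turn the bytes back into unicode right after reading.
round_tripped = raw_explicit.decode('utf-8')
```

<p>Both files should end up with the same UTF-8 bytes, and decoding them gives back the original unicode string. The difference is only where the encoding step happens: in your own code, or inside the file object.</p>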
 
