Unicode (UTF-8) reading and writing to files in Python
<p>I'm having some brain failure in understanding reading and writing text to a file (Python 2.4).</p>
<pre><code># The string, which has an a-acute in it.
ss = u'Capit\xe1n'
ss8 = ss.encode('utf8')
repr(ss), repr(ss8)
</code></pre>
<blockquote>
<p>("u'Capit\xe1n'", "'Capit\xc3\xa1n'")</p>
</blockquote>
<pre><code>print ss, ss8
print &gt;&gt; open('f1','w'), ss8
&gt;&gt;&gt; file('f1').read()
'Capit\xc3\xa1n\n'
</code></pre>
<p>So I type in <code>Capit\xc3\xa1n</code> into my favorite editor, in file f2.</p>
<p>Then:</p>
<pre><code>&gt;&gt;&gt; open('f1').read()
'Capit\xc3\xa1n\n'
&gt;&gt;&gt; open('f2').read()
'Capit\\xc3\\xa1n\n'
&gt;&gt;&gt; open('f1').read().decode('utf8')
u'Capit\xe1n\n'
&gt;&gt;&gt; open('f2').read().decode('utf8')
u'Capit\\xc3\\xa1n\n'
</code></pre>
<p>What am I not understanding here? Clearly there is some vital bit of magic (or good sense) that I'm missing. What does one type into text files to get proper conversions?</p>
<p>What I'm truly failing to grok here, is what the point of the UTF-8 representation is, if you can't actually get Python to recognize it, when it comes from outside. Maybe I should just JSON dump the string, and use that instead, since that has an asciiable representation! More to the point, is there an ASCII representation of this Unicode object that Python will recognize and decode, when coming in from a file? If so, how do I get it?</p>
<pre><code>&gt;&gt;&gt; print simplejson.dumps(ss)
'"Capit\u00e1n"'
&gt;&gt;&gt; print &gt;&gt; file('f3','w'), simplejson.dumps(ss)
&gt;&gt;&gt; simplejson.load(open('f3'))
u'Capit\xe1n'
</code></pre>
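<p>A minimal sketch of the difference between the two files, under some assumptions: it is written for Python 3 rather than the Python 2.4 used above, it works on in-memory byte strings instead of the files f1 and f2, and the variable names are illustrative only. The point it tries to show is that f1 holds real UTF-8 bytes, while f2 holds the literal characters backslash, <code>x</code>, <code>c</code>, <code>3</code> and so on, which need an extra escape-decoding step (the <code>unicode_escape</code> codec here; in Python 2 the <code>string_escape</code> codec plays a similar role for byte strings).</p>

```python
# Assumption: Python 3. The original question targets Python 2.4.

# What f1 contains: genuine UTF-8 bytes (a-acute is the two bytes C3 A1).
f1_bytes = "Capit\u00e1n\n".encode("utf-8")

# What f2 contains: the escape sequence typed out as plain ASCII characters.
f2_bytes = rb"Capit\xc3\xa1n" + b"\n"

# f1 decodes directly, because its bytes already are UTF-8.
decoded_f1 = f1_bytes.decode("utf-8")

# f2 needs two steps:
# 1. interpret the backslash escapes, giving code points U+00C3 and U+00A1;
# 2. map those code points back to bytes (latin-1) and decode them as UTF-8.
unescaped = f2_bytes.decode("unicode_escape")
decoded_f2 = unescaped.encode("latin-1").decode("utf-8")

# An ASCII-only representation that round-trips, similar in spirit to JSON.
ascii_form = "Capit\u00e1n".encode("unicode_escape")
round_trip = ascii_form.decode("unicode_escape")
```

<p>Under these assumptions, both <code>decoded_f1</code> and <code>decoded_f2</code> end up as the same text, and <code>ascii_form</code> is a pure-ASCII byte string that decodes back to the original. Typing the character itself into an editor that saves as UTF-8 avoids the extra step entirely.</p>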