Note that there are some explanatory texts on larger screens.

plurals
  1. POpython: unicode problem
    primarykey
    data
    text
    <p>I am trying to decode a string I took from file:</p> <pre><code>file = open ("./Downloads/lamp-post.csv", 'r') data = file.readlines() data[0] </code></pre> <blockquote> <p>'\xff\xfeK\x00e\x00y\x00w\x00o\x00r\x00d\x00\t\x00C\x00o\x00m\x00p\x00e\x00t\x00i\x00t\x00i\x00o\x00n\x00\t\x00G\x00l\x00o\x00b\x00a\x00l\x00 \x00M\x00o\x00n\x00t\x00h\x00l\x00y\x00 \x00S\x00e\x00a\x00r\x00c\x00h\x00e\x00s\x00\t\x00D\x00e\x00c\x00 \x002\x000\x001\x000\x00\t\x00N\x00o\x00v\x00 \x002\x000\x001\x000\x00\t\x00O\x00c\x00t\x00 \x002\x000\x001\x000\x00\t\x00S\x00e\x00p\x00 \x002\x000\x001\x000\x00\t\x00A\x00u\x00g\x00 \x002\x000\x001\x000\x00\t\x00J\x00u\x00l\x00 \x002\x000\x001\x000\x00\t\x00J\x00u\x00n\x00 \x002\x000\x001\x000\x00\t\x00M\x00a\x00y\x00 \x002\x000\x001\x000\x00\t\x00A\x00p\x00r\x00 \x002\x000\x001\x000\x00\t\x00M\x00a\x00r\x00 \x002\x000\x001\x000\x00\t\x00F\x00e\x00b\x00 \x002\x000\x001\x000\x00\t\x00J\x00a\x00n\x00 \x002\x000\x001\x000\x00\t\x00A\x00d\x00 \x00s\x00h\x00a\x00r\x00e\x00\t\x00S\x00e\x00a\x00r\x00c\x00h\x00 \x00s\x00h\x00a\x00r\x00e\x00\t\x00E\x00s\x00t\x00i\x00m\x00a\x00t\x00e\x00d\x00 \x00A\x00v\x00g\x00.\x00 \x00C\x00P\x00C\x00\t\x00E\x00x\x00t\x00r\x00a\x00c\x00t\x00e\x00d\x00 \x00F\x00r\x00o\x00m\x00 \x00W\x00e\x00b\x00 \x00P\x00a\x00g\x00e\x00\t\x00L\x00o\x00c\x00a\x00l\x00 \x00M\x00o\x00n\x00t\x00h\x00l\x00y\x00 \x00S\x00e\x00a\x00r\x00c\x00h\x00e\x00s\x00\n'</p> </blockquote> <p>Adding ignore do not really help...:</p> <blockquote> <p>In [69]: data[2] Out[69]: u'\u6700\u6100\u7200\u6400\u6500\u6e00\u2000\u6c00\u6100\u6d00\u7000\u2000\u7000\u6f00\u7300\u7400\u0900\u3000\u2e00\u3900\u3400\u0900\u3800\u3800\u3000\u0900\u2d00\u0900\u3300\u3200\u3000\u0900\u3300\u3900\u3000\u0900\u3300\u3900\u3000\u0900\u3400\u3800\u3000\u0900\u3500\u3900\u3000\u0900\u3500\u3900\u3000\u0900\u3700\u3200\u3000\u0900\u3700\u3200\u3000\u0900\u3300\u3900\u3000\u0900\u3300\u3200\u3000\u0900\u3200\u3600\u3000\u0900\u2d00\u0900\u2d00\u0900\ua300\u3200\u2e00\u3100\u3800\u0900\u2d00\u0900\u3400\u3800\u3000\u0a00'</p> <p>In [70]: data[2].decode("utf-8", "replace") --------------------------------------------------------------------------- Traceback (most recent call last)</p> <p>/Users/oleg/ in ()</p> <p>/opt/local/lib/python2.5/encodings/utf_8.py in decode(input, errors) 14 15 def decode(input, errors='strict'): ---> 16 return codecs.utf_8_decode(input, errors, True) 17 18 class IncrementalEncoder(codecs.IncrementalEncoder):</p> <p>: 'ascii' codec can't encode characters in position 0-87: ordinal not in range(128)</p> <p>In [71]:</p> </blockquote>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload