UnicodeDecodeError in Python 2.7
<p>I am trying to read a UTF-8 encoded XML file in Python, and I process the lines read from the file like this:</p>
<pre><code>next_sent_separator_index = doc_content.find(word_value, int(characterOffsetEnd_value) + 1)
</code></pre>
<p>Here doc_content is a line read from the file and word_value is one of the strings from the same line. I get an encoding-related error on this line whenever doc_content or word_value contains non-ASCII characters. So I tried decoding them as UTF-8 first (instead of relying on the default ASCII codec), as below:</p>
<pre><code>next_sent_separator_index = doc_content.decode('utf-8').find(word_value.decode('utf-8'), int(characterOffsetEnd_value) + 1)
</code></pre>
<p>But I still get an error, this time a UnicodeEncodeError, as shown below:</p>
<pre><code>Traceback (most recent call last):
  File "snippetRetriver.py", line 402, in &lt;module&gt;
    sentences_list,lemmatised_sentences_list = getSentenceList(form_doc)
  File "snippetRetriver.py", line 201, in getSentenceList
    next_sent_separator_index = doc_content.decode('utf-8').find(word_value.decode('utf-8'), int(characterOffsetEnd_value) + 1)
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 8: ordinal not in range(128)
</code></pre>
<p>Can anyone suggest a suitable way to avoid these kinds of encoding errors in Python 2.7?</p>
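<p>One possible reading of the traceback: in Python 2, calling .decode('utf-8') on a value that is already a unicode object makes the interpreter first encode it implicitly with the ASCII codec, which would explain a UnicodeEncodeError raised from inside decode. The sketch below assumes that this is what happens here, that is, that doc_content and word_value are sometimes byte strings and sometimes unicode. The helper names to_text, find_next and read_lines are hypothetical and not part of the original script; the idea is simply to decode only byte strings, once, and to search on text afterwards.</p>

```python
# -*- coding: utf-8 -*-
# Minimal sketch; works on Python 2.7 and Python 3.
# to_text, find_next and read_lines are hypothetical helper names.
from __future__ import unicode_literals

import io


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding only if it is a byte string."""
    # On Python 2, bytes is an alias for str, so unicode objects are
    # returned untouched and never go through an implicit ASCII encode.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def find_next(doc_content, word_value, offset_end):
    """Find word_value in doc_content, starting just past offset_end."""
    doc_text = to_text(doc_content)
    word_text = to_text(word_value)
    return doc_text.find(word_text, int(offset_end) + 1)


def read_lines(path, encoding="utf-8"):
    """Read a file as text so every line is already decoded."""
    with io.open(path, encoding=encoding) as handle:
        return handle.read().splitlines()


# Mixed input: a UTF-8 byte string for the line, text for the word.
line = "caf\u00e9 au lait, caf\u00e9 noir".encode("utf-8")
word = "caf\u00e9"
print(find_next(line, word, "0"))  # → 14
```

<p>Note that once both values are text, the index returned by find counts characters rather than bytes, so any offsets passed in need to be character offsets as well.</p>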