Trouble with UTF-8 CSV input in Python
<p>This seems like it should be an easy fix, but so far a solution has eluded me. I have a single-column CSV file with non-ASCII characters, saved as UTF-8, that I want to read in and store in a list. I'm attempting to follow the principle of the <a href="http://nedbatchelder.com/text/unipain.html" rel="nofollow noreferrer">"Unicode Sandwich"</a> and decode upon reading the file in:</p>
<pre><code>import codecs
import csv

with codecs.open('utf8file.csv', 'rU', encoding='utf-8') as file:
    input_file = csv.reader(file, delimiter=",", quotechar='|')
    list = []
    for row in input_file:
        list.extend(row)
</code></pre>
<p>This produces the dreaded 'codec can't encode characters in position, ordinal not in range(128)' error.</p>
<p>I've also tried adapting a solution from <a href="https://stackoverflow.com/questions/904041/reading-a-utf8-csv-file-with-python">this answer</a>, which returns a similar error.</p>
<pre><code>def unicode_csv_reader(utf8_data, dialect=csv.excel, **kwargs):
    csv_reader = csv.reader(utf8_data, dialect=dialect, **kwargs)
    for row in csv_reader:
        yield [unicode(cell, 'utf-8') for cell in row]

filename = 'inputs\encode.csv'
reader = unicode_csv_reader(open(filename))
target_list = []
for field1 in reader:
    target_list.extend(field1)
</code></pre>
<p>A very similar solution adapted from the <a href="http://docs.python.org/library/csv.html#examples" rel="nofollow noreferrer">docs</a> returns the same error.</p>
<pre><code>def unicode_csv_reader(utf8_data, dialect=csv.excel):
    csv_reader = csv.reader(utf_8_encoder(utf8_data), dialect)
    for row in csv_reader:
        yield [unicode(cell, 'utf-8') for cell in row]

def utf_8_encoder(unicode_csv_data):
    for line in unicode_csv_data:
        yield line.encode('utf-8')

filename = 'inputs\encode.csv'
reader = unicode_csv_reader(open(filename))
target_list = []
for field1 in reader:
    target_list.extend(field1)
</code></pre>
<p>Clearly I'm missing something. Most of the questions that I've seen regarding this problem seem to predate Python 2.7, so an update here might be useful.</p>
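<p>For comparison, here is a minimal sketch of the same "decode at the boundary" idea, assuming Python 3 is an option. This is not what the question runs: the Python 2 documentation states that its csv module does not support Unicode input, which is why handing it already-decoded text can trigger an implicit ASCII encode. The helper names <code>read_cells</code> and <code>read_utf8_csv</code> are hypothetical, chosen only for illustration; the delimiter and quote character are the ones used in the first snippet.</p>

```python
import csv
import io


def read_cells(stream, delimiter=",", quotechar="|"):
    """Flatten every cell of a CSV text stream into one list of str."""
    cells = []
    for row in csv.reader(stream, delimiter=delimiter, quotechar=quotechar):
        cells.extend(row)
    return cells


def read_utf8_csv(path):
    """Read a UTF-8 encoded CSV file and return all of its cells."""
    # encoding="utf-8" decodes bytes to str as the file is read;
    # newline="" leaves line-ending handling to the csv module.
    with io.open(path, "r", encoding="utf-8", newline="") as handle:
        return read_cells(handle)
```

<p>Under these assumptions, <code>read_utf8_csv('utf8file.csv')</code> would return the column as a list of text strings with no separate decoding step. On Python 2.7 the approach in the linked answer (read bytes, parse, then decode each cell) remains the usual pattern.</p>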