Fixing encodings
<p>I have ended up with messed-up character encodings in one of our MySQL columns.</p>
<p>Typically I have</p>
<p>√© instead of é<br> √∂ instead of ö<br> √≠ instead of í</p>
<p>and so on...</p>
<p>I'm fairly certain that someone here knows what happened and how to fix it.</p>
<p><strong>UPDATE:</strong> Based on bobince's answer, and since I had this data in a file, I did the following:</p>
<pre><code>#!/usr/bin/env python
import codecs

# Read the file as UTF-8; the text it yields is the mojibake (e.g. "√©").
f = codecs.open('./file.csv', 'r', 'utf-8')
f2 = codecs.open('./file-fixed.csv', 'w', 'utf-8')
for line in f:
    # Re-encode as Mac Roman to recover the original bytes, then decode them as UTF-8.
    f2.write(line.encode('macroman').decode('utf-8'))
f.close()
f2.close()
</code></pre>
<p>after which</p>
<pre><code>load data infile 'file-fixed.csv'
into table list1
fields terminated by ',' optionally enclosed by '"'
ignore 1 lines;
</code></pre>
<p>properly imported the data.</p>
<p><strong>UPDATE2:</strong> Hammerite, just for completeness, here are the requested details...</p>
<pre><code>mysql&gt; SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
</code></pre>
<p>The <code>SHOW CREATE TABLE</code> for the table I am importing to has <code>DEFAULT CHARSET=utf8</code>.</p>
<p><strong>EDIT3:</strong></p>
<p>Actually, with the above settings the <code>load</code> <strong>didn't</strong> do the right thing (I could not compare to existing utf8 fields, and my loaded data only <em>looked</em> as if it was loaded correctly; I assume because of the <em>wrong, but matching</em> client, connection and results charsets), so I updated the settings to:</p>
<pre><code>+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
</code></pre>
<p>I then loaded the data again, and this time it was loaded correctly (comparable with existing data).</p>
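<p>For anyone wondering why the fix works: the symptoms are consistent with UTF-8 bytes having been decoded as Mac OS Roman somewhere along the way. The sketch below reproduces the damage on the three sample characters and then reverses it. The helper name <code>fix_mojibake</code> is made up for illustration, and the fallback of returning the text unchanged is an assumption about what you would want for values that were never damaged.</p>

```python
def fix_mojibake(text):
    """Undo UTF-8 bytes that were mistakenly decoded as Mac OS Roman.

    Hypothetical helper for illustration; not part of any library.
    """
    try:
        # Mac Roman encoding recovers the original bytes; decode them as UTF-8.
        return text.encode("mac_roman").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of mojibake (or already correct): leave it untouched.
        return text


# Reproduce the damage from the question, then repair it.
broken = "é ö í".encode("utf-8").decode("mac_roman")
print(broken)                # √© √∂ √≠
print(fix_mojibake(broken))  # é ö í
```

<p>Text that is already correct generally fails the round trip and is returned as-is, so running this over a column with mixed good and bad values should be fairly safe, though it is worth spot-checking the output before re-importing.</p>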