  1. Importing text to MySQL: strange format
<p>I'm importing some data from a .txt file into a MySQL database table, using mysqlimport. It seems to import OK (no error messages) but looks very odd when displayed, and can't be searched as expected.</p>

<p>Here are the details. The original text file is saved in UTF-8, with records that look (in a text editor) like this. The second field includes line breaks:</p>

<pre><code>WAR-16,52 ~~~~~ Lorem ipsum dolor sit. Lorem ipsum dolor sit. ~~~~~ ENDOFRECORD
WAR-16,53~~~~~Lorem ipsum dolor sit. Lorem ipsum dolor sit. Lorem ipsum dolor sit. Lorem ipsum dolor sit. ~~~~~ ENDOFRECORD
</code></pre>

<p>The database table into which I am importing is very simple:</p>

<pre><code>+-------+---------------+------+-----+---------+-------+
| Field | Type          | Null | Key | Default | Extra |
+-------+---------------+------+-----+---------+-------+
| id    | varchar(100)  | YES  |     | NULL    |       |
| text  | varchar(5000) | YES  |     | NULL    |       |
+-------+---------------+------+-----+---------+-------+
2 rows in set (0.00 sec)
</code></pre>

<p>When I import the file, here's the command I use:</p>

<pre><code>$ mysqlimport -u root db textfile.txt --fields-terminated-by="~~~~~" --lines-terminated-by="ENDOFTHELINE" --default-character-set='utf8'
db.records_list: Records: 18778 Deleted: 0 Skipped: 0 Warnings: 18787
</code></pre>

<p>Here's what I see if I then ask MySQL to display the records:</p>

<pre><code>mysql&gt; select * from textfile;
| W A R - 1 6 , 5 2 | L o r e m i p s u m d o l o r s i t . L o r e m i p s u m d o l o r s i t .
(etc)
</code></pre>

<p>So, it looks as though spaces, or some strange encoding extras, are being added to the text.</p>

<p>And here's the problem with the database query:</p>

<pre><code>mysql&gt; select * from textfile where id like "%WAR%";
</code></pre>

<p>returns nothing; nor does adding spaces:</p>

<pre><code>mysql&gt; select * from textfile where id like "%W A R%";
</code></pre>

<p>Only this command returns anything:</p>

<pre><code>mysql&gt; select * from textfile where id like "%W%";
</code></pre>

<p>Can anyone guess what might be happening? I feel like it must be an encoding problem, but I can't work it out.</p>

<p>------ UPDATE --------</p>

<p>OK, I've checked the database and connection encoding.</p>

<pre><code>mysql&gt; show variables like "character_set_%";
+--------------------------+----------------------------------------+
| Variable_name            | Value                                  |
+--------------------------+----------------------------------------+
| character_set_client     | latin1                                 |
| character_set_connection | latin1                                 |
| character_set_database   | latin1                                 |
| character_set_filesystem | binary                                 |
| character_set_results    | latin1                                 |
| character_set_server     | latin1                                 |
| character_set_system     | utf8                                   |
| character_sets_dir       | /usr/local/mysql/share/mysql/charsets/ |
+--------------------------+----------------------------------------+
8 rows in set (0.01 sec)
</code></pre>

<p>And <code>show table status</code> says the table is <code>latin1_swedish_ci</code>.</p>

<p>I have re-saved the text file in "Western (Windows Latin 1)" (using TextEdit on Snow Leopard) and tried to import it using the same command as above. However, I still have the same encoding problem.</p>

<p>I also tried, again with no luck:</p>

<ul>
<li>creating a new table with UTF-8 and importing the existing file</li>
<li>copying &amp; pasting the text into another text file that I've previously imported fine, and trying to import that.</li>
</ul>

<p>Still totally baffled :(((</p>
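<p>One possibility worth ruling out, given the "W A R" spacing and the fact that only single-character <code>LIKE</code> patterns match, is that the file on disk is actually a two-byte encoding such as UTF-16 rather than UTF-8, so every ASCII character is followed by a NUL byte. The sketch below is only a rough heuristic for checking that; the function name <code>guess_encoding</code> and the 25% NUL threshold are assumptions made for illustration, not part of MySQL or mysqlimport.</p>

```python
import codecs


def guess_encoding(raw: bytes) -> str:
    """Heuristically guess a text file's encoding from its raw bytes.

    Hypothetical helper for illustration only; it is not a reliable detector.
    """
    # A byte-order mark, if present, is the strongest hint.
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Check UTF-32 before UTF-16: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
    if raw.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"

    # No BOM: mostly-ASCII text stored as UTF-16 has a NUL in every other byte.
    sample = raw[:4096]
    if sample and sample.count(b"\x00") > len(sample) // 4:  # assumed threshold
        nul_even = sample[0::2].count(b"\x00")
        nul_odd = sample[1::2].count(b"\x00")
        # Little-endian puts the NUL after each ASCII byte (odd offsets).
        return "utf-16-le" if nul_odd >= nul_even else "utf-16-be"

    # Otherwise, see whether the bytes are valid UTF-8; fall back to Latin-1.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "latin-1"


if __name__ == "__main__":
    import sys

    # Usage: python guess_encoding.py textfile.txt
    with open(sys.argv[1], "rb") as fh:
        print(guess_encoding(fh.read()))
```

<p>If this reports a UTF-16 variant for the import file, re-saving or converting it to UTF-8 before running mysqlimport would be the next thing to try. If it reports UTF-8, the cause lies elsewhere and this check does not explain the symptom.</p>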