How to make MySQL aware of multi-byte characters in LIKE and REGEXP?
<p>I have a MySQL table with two columns, both using the utf8_unicode_ci collation. Besides ASCII characters, the second column also contains Unicode code points such as U+02C8 (MODIFIER LETTER VERTICAL LINE) and U+02D0 (MODIFIER LETTER TRIANGULAR COLON). The table contains the following rows:</p>
<pre><code>  word  | ipa
--------+----------
  Hallo | haˈloː
  IPA   | ˌiːpeːˈʔaː
</code></pre>
<p>I need to search the second column with LIKE and REGEXP, but MySQL (5.0.77) seems to treat these values as sequences of bytes, not characters.</p>
<pre><code>SELECT * FROM pronunciation WHERE ipa LIKE '%ha?lo%'; -- 0 rows
SELECT * FROM pronunciation WHERE ipa LIKE '%ha??lo%'; -- 1 row
SELECT * FROM pronunciation WHERE ipa REGEXP 'ha.lo'; -- 0 rows
SELECT * FROM pronunciation WHERE ipa REGEXP 'ha..lo'; -- 1 row
</code></pre>
<p>I'm quite sure that the data is stored correctly: it looks right when I retrieve it and it displays fine in phpMyAdmin. I'm on a shared host, so I can't really install programs.</p>
<p>How can I solve this problem? If it's not possible, is there a plausible work-around that does not involve processing the entire table with PHP every time? There are 40 000 rows, and I'm not dead-set on using MySQL (or UTF-8, for that matter). I only have access to PHP and MySQL on the host.</p>
<p><strong>Edit:</strong> There is an open 4-year-old MySQL bug report, <a href="http://bugs.mysql.com/bug.php?id=30241" rel="noreferrer">Bug #30241 Regular expression problems</a>, which notes that the regexp engine works byte-wise. Thus, I'm looking for a work-around.</p>
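To illustrate the byte-wise behaviour described in the question, here is a minimal sketch in Python (not MySQL, and not part of the original post). It assumes the column value is stored as UTF-8 and compares a character-level regular expression with one applied to the raw bytes. The variable names are made up for illustration.

```python
import re

ipa = "haˈloː"               # sample value from the question
raw = ipa.encode("utf-8")    # what a byte-wise regexp engine would see

# U+02C8 is a single code point but occupies two bytes in UTF-8.
stress_bytes = len("ˈ".encode("utf-8"))

# Character-level matching: one dot covers the stress mark.
char_one_dot = re.search(r"ha.lo", ipa) is not None

# Byte-level matching: one dot covers only half of the stress mark,
# so two dots are needed, as in the REGEXP queries above.
byte_one_dot = re.search(rb"ha.lo", raw) is not None
byte_two_dots = re.search(rb"ha..lo", raw) is not None

print(stress_bytes, char_one_dot, byte_one_dot, byte_two_dots)
```

This only mirrors the symptom. It does not show how MySQL's regexp engine is implemented, and it is not itself a work-around on the database side.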