How to handle user input of invalid UTF-8 characters?
    <p>I'm looking for a general strategy or advice on how to handle invalid UTF-8 input from users.</p>
    <p>Even though my webapp uses UTF-8, some users somehow enter invalid characters. This causes errors in PHP's <a href="http://us2.php.net/json_encode" rel="noreferrer">json_encode()</a>, and overall it seems like a bad idea to keep such data around.</p>
    <p><a href="http://www.w3.org/International/questions/qa-forms-utf-8" rel="noreferrer">W3C I18N FAQ: Multilingual Forms</a> says "If non-UTF-8 data is received, an error message should be sent back.".</p>
    <ul>
    <li>How exactly should this be done in practice, throughout a site with dozens of different places where data can be input?</li>
    <li>How do you present the error to the user in a helpful way?</li>
    <li>How do you temporarily store and display bad form data so the user doesn't lose all their text? Strip bad characters? Use a replacement character, and if so, how?</li>
    <li>For existing data in the database, when invalid UTF-8 data is detected, should I try to convert it and save it back (how? <a href="http://us3.php.net/utf8_encode" rel="noreferrer">utf8_encode()</a>? <a href="http://php.net/manual/en/function.mb-convert-encoding.php" rel="noreferrer">mb_convert_encoding()</a>?), or leave it as-is in the database but do something (what?) before json_encode()?</li>
    </ul>
    <p><strong>EDIT: I'm very familiar with the mbstring extension and am not asking "how does UTF-8 work in PHP". I'd like advice from people with real-world experience on how they've handled this.</strong></p>
    <p><strong>EDIT2: As part of the solution, I'd really like to see a <em>fast</em> method to convert invalid characters to U+FFFD.</strong></p>
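    <p>To make the problem concrete, here is a minimal sketch of the failure and of one possible U+FFFD replacement, not a settled solution. It assumes PHP 7 or later with the mbstring extension loaded and UTF-8 as the internal encoding. The helper name <code>replace_invalid_utf8</code> and the sample string are hypothetical, chosen only for illustration. How many replacement characters a given bad sequence produces can differ between PHP versions, so the sketch makes no claim about the exact output.</p>

```php
<?php
// Hypothetical helper (not from any library): returns $input with every
// ill-formed UTF-8 sequence replaced by U+FFFD.
function replace_invalid_utf8(string $input): string
{
    // Fast path: valid input is returned untouched.
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }

    // Remember the current substitute setting so it can be restored.
    $previous = mb_substitute_character();
    mb_substitute_character(0xFFFD);

    // Converting UTF-8 to UTF-8 swaps ill-formed sequences for the
    // substitute character and leaves valid sequences alone.
    $clean = mb_convert_encoding($input, 'UTF-8', 'UTF-8');

    mb_substitute_character($previous);
    return $clean;
}

// Sample input: "é" sent as a single Latin-1 byte, which is ill-formed UTF-8.
$bad = "caf\xE9 ok";

$encodedBad = json_encode($bad);         // fails because of the invalid byte
$clean = replace_invalid_utf8($bad);
$encodedClean = json_encode($clean);     // succeeds on the cleaned string
```

    <p>On PHP 7.2 and later, <code>json_encode()</code> also accepts the <code>JSON_INVALID_UTF8_SUBSTITUTE</code> flag, which would cover the encoding step alone without cleaning the stored data.</p>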