<p>I recommend simply not letting garbage in. Don't rely on custom functions, which can bog your system down. Instead, design an alphabet of acceptable characters and walk the submitted data against it, byte by byte, as if it were an array. Push acceptable characters to a new string, and omit unacceptable characters. The data you store in your database is then triggered by the user, but it is not actually user-supplied data.</p>
<p>EDIT #4: Replacing bad characters with an entity: &#65533;</p>
<p>EDIT #3: Updated: Sept 22 2010 @ 1:32pm. Reason: The returned string is now UTF-8, and I used the test file you provided as proof.</p>
<pre><code>&lt;?php
// build alphabet
// optionally you can remove characters from this array

$alpha[]= chr(0); // null
$alpha[]= chr(9); // tab
$alpha[]= chr(10); // new line
$alpha[]= chr(11); // vertical tab
$alpha[]= chr(13); // carriage return

for ($i = 32; $i &lt;= 126; $i++) {
    $alpha[]= chr($i);
}

/* remove comment to check ascii ordinals */

// /*
// foreach ($alpha as $key=&gt;$val){
//     print ord($val);
//     print '&lt;br/&gt;';
// }
// print '&lt;hr/&gt;';
//*/
//
// //test case #1
//
// $str = 'afsjdfhasjhdgljhasdlfy42we875y342q8957y2wkjrgSAHKDJgfcv kzXnxbnSXbcv '.chr(160).chr(127).chr(126);
//
// $string = teststr($alpha,$str);
// print $string;
// print '&lt;hr/&gt;';
//
// //test case #2
//
// $str = ''.'©?™???';
// $string = teststr($alpha,$str);
// print $string;
// print '&lt;hr/&gt;';
//
// $str = '©';
// $string = teststr($alpha,$str);
// print $string;
// print '&lt;hr/&gt;';

$file = 'http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt';
$testfile = implode(chr(10),file($file));

$string = teststr($alpha,$testfile);
print $string;
print '&lt;hr/&gt;';

function teststr(&amp;$alpha, &amp;$str){
    $strlen = strlen($str);
    $newstr = chr(0); //null
    $x = 0;
    if($strlen &gt;= 2){
        for ($i = 0; $i &lt; $strlen; $i++) {
            $x++;
            if(in_array($str[$i],$alpha)){
                // passed
                $newstr .= $str[$i];
            }else{
                // failed
                print 'Found out of scope character. (ASCII: '.ord($str[$i]).')';
                print '&lt;br/&gt;';
                $newstr .= '&amp;#65533;';
            }
        }
    }elseif($strlen &lt;= 0){
        // failed to qualify for test
        print 'Non-existent.';
    }elseif($strlen === 1){
        $x++;
        if(in_array($str,$alpha)){
            // passed
            $newstr = $str;
        }else{
            // failed
            print 'Total character failed to qualify.';
            $newstr = '&amp;#65533;';
        }
    }else{
        print 'Non-existent (scope).';
    }

    if(mb_detect_encoding($newstr, "UTF-8") == "UTF-8"){
        // skip
    }else{
        // note: utf8_encode() is deprecated as of PHP 8.2
        $newstr = utf8_encode($newstr);
    }

    // test encoding:
    if(mb_detect_encoding($newstr, "UTF-8")=="UTF-8"){
        print 'UTF-8 :D&lt;br/&gt;';
    }else{
        print 'ENCODED: '.mb_detect_encoding($newstr, "UTF-8").'&lt;br/&gt;';
    }

    return $newstr.' (scope: '.$x.', '.$strlen.')';
}
</code></pre>
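<p>For comparison, here is a minimal sketch of the same byte-by-byte whitelist walk with the diagnostics stripped out. The function name <code>filter_bytes</code> and its parameters are hypothetical and do not appear in the code above. It assumes a slightly narrower alphabet (tab, line feed, carriage return, and printable ASCII 32 to 126, with null and vertical tab left out), and it swaps the per-byte <code>in_array()</code> scan for an <code>isset()</code> lookup keyed by byte value.</p>

```php
<?php
// Hypothetical helper, not part of the original answer: keeps whitelisted
// bytes and substitutes a replacement string for everything else.
function filter_bytes(string $input, string $replacement = '&#65533;'): string
{
    // Assumed alphabet: tab, line feed, carriage return, printable ASCII.
    // Built once and reused across calls.
    static $allowed = null;
    if ($allowed === null) {
        $allowed = [9 => true, 10 => true, 13 => true];
        for ($i = 32; $i <= 126; $i++) {
            $allowed[$i] = true;
        }
    }

    $out = '';
    $len = strlen($input);
    for ($i = 0; $i < $len; $i++) {
        $byte = $input[$i];
        // isset() on an array key is a constant-time check per byte.
        $out .= isset($allowed[ord($byte)]) ? $byte : $replacement;
    }
    return $out;
}

// Two out-of-alphabet bytes are each replaced by the entity.
echo filter_bytes("ok\x80\xFFdone"), PHP_EOL; // prints ok&#65533;&#65533;done
```

<p>Because the output contains only bytes from the ASCII alphabet plus the ASCII replacement string, it is already valid UTF-8, so this sketch does not need a separate encoding step.</p>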