
plurals
  1. PO
    primarykey
    data
    text
    <p>Here is a function I wrote to convert all non-ASCII characters to their corresponding numeric character references. It might help you sanitize some PCDATA content before output.</p>
<pre><code>/**
 * Creates XML character references for non-ASCII characters in the given String.
 */
public static String xmlEntitify(String in) {
    StringBuilder b = new StringBuilder();
    for (int i = 0; i &lt; in.length(); ) {
        // Read a whole code point so that surrogate pairs are not split
        // into two invalid references.
        int c = in.codePointAt(i);
        i += Character.charCount(c);
        if (c &lt; 128) {
            b.append((char) c);
        } else if (c == 0xFEFF) {
            // BOM character, just remove it
        } else {
            // Hexadecimal reference, zero-padded to at least 4 digits
            b.append("&amp;#x");
            b.append(String.format("%04X", c));
            b.append(";");
        }
    }
    return b.toString();
}
</code></pre>
    <p>Read your input stream into a <code>String content</code>, and write <code>xmlEntitify(content)</code> to the output stream.</p>
    <p>The output is guaranteed to contain only ASCII characters, so it reads the same in any ASCII-compatible encoding and the encoding problem goes away.</p>
    <p><strong>UPDATE</strong></p>
    <p>Given the comments, I'll be even bolder: if you are not sanitizing your data, you are asking for trouble. I guess you are at least already replacing the <code>&lt;</code> and <code>&amp;</code> characters in your PCDATA. If not, you definitely should. I have another version of the above method which, instead of the first <code>if</code>, has:</p>
<pre><code>if (c &lt; 128 &amp;&amp; c != '&amp;' &amp;&amp; c != '&lt;' &amp;&amp; c != '&gt;' &amp;&amp; c != '"') {
    b.append((char) c);
}
</code></pre>
    <p>so that these characters are also converted to their corresponding numeric character references. This converts all of my PCDATA to Unicode-friendly, ASCII-only strings. I have had no encoding problems since I started using this technique. I never output XML PCDATA that has not been passed through this method. This is not sweeping the problem under the carpet; it is getting rid of it by being as generic as possible.</p>
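The "read the input stream into a String, then write xmlEntitify(content)" step described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name XmlEntitifyDemo and the helper copyEntitified are hypothetical, the input is assumed to be UTF-8, and InputStream.readAllBytes requires Java 9 or later. The escaping logic is repeated in a code-point-aware form so that the example compiles on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class XmlEntitifyDemo {

    // Escapes every non-ASCII code point as a hexadecimal character reference
    // and drops the BOM (U+FEFF). Repeated here so the sketch is self-contained.
    public static String xmlEntitify(String in) {
        StringBuilder b = new StringBuilder();
        for (int i = 0; i < in.length(); ) {
            int c = in.codePointAt(i);
            i += Character.charCount(c);
            if (c < 128) {
                b.append((char) c);
            } else if (c != 0xFEFF) {
                b.append("&#x").append(String.format("%04X", c)).append(';');
            }
        }
        return b.toString();
    }

    // Hypothetical helper: decodes the whole input as UTF-8 (an assumption),
    // escapes it, and writes the ASCII-only result to the output stream.
    public static void copyEntitified(InputStream in, OutputStream out) throws IOException {
        String content = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        out.write(xmlEntitify(content).getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "caf\u00e9 \u20ac".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyEntitified(in, out);
        System.out.println(out.toString("US-ASCII"));
    }
}
```

Because the escaped text is pure ASCII, writing it with US_ASCII loses nothing. If the real input is not UTF-8, the charset passed to the String constructor would need to match the actual source encoding.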
 
