<p>Depending on how much you know about the input, you may have to take into account that <a href="http://www.w3.org/TR/xml11/#charsets" rel="noreferrer">not all Unicode characters are valid XML characters</a>.</p>
<p>Both <em>Server.HtmlEncode</em> and <em>System.Security.SecurityElement.Escape</em> seem to ignore illegal XML characters, while <em>System.XML.XmlWriter.WriteString</em> throws an <em>ArgumentException</em> when it encounters illegal characters (unless you disable that check, in which case it ignores them). An overview of library functions is available <a href="http://weblogs.sqlteam.com/mladenp/archive/2008/10/21/Different-ways-how-to-escape-an-XML-string-in-C.aspx" rel="noreferrer">here</a>.</p>
<p><strong>Edit 2011/8/14:</strong> Seeing that at least a few people have consulted this answer in the last couple of years, I decided to completely rewrite the original code, which had numerous issues, including <a href="https://stackoverflow.com/questions/1049947/should-utf-16-be-considered-harmful">horribly mishandling UTF-16</a>.</p>
<pre class="lang-cs prettyprint-override"><code>using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

/// &lt;summary&gt;
/// Encodes data so that it can be safely embedded as text in XML documents.
/// &lt;/summary&gt;
public class XmlTextEncoder : TextReader {
    public static string Encode(string s) {
        using (var stream = new StringReader(s))
        using (var encoder = new XmlTextEncoder(stream)) {
            return encoder.ReadToEnd();
        }
    }

    /// &lt;param name="source"&gt;The data to be encoded in UTF-16 format.&lt;/param&gt;
    /// &lt;param name="filterIllegalChars"&gt;It is illegal to encode certain
    /// characters in XML. If true, silently omit these characters from the
    /// output; if false, throw an error when encountered.&lt;/param&gt;
    public XmlTextEncoder(TextReader source, bool filterIllegalChars=true) {
        _source = source;
        _filterIllegalChars = filterIllegalChars;
    }

    readonly Queue&lt;char&gt; _buf = new Queue&lt;char&gt;();
    readonly bool _filterIllegalChars;
    readonly TextReader _source;

    public override int Peek() {
        PopulateBuffer();
        if (_buf.Count == 0) return -1;
        return _buf.Peek();
    }

    public override int Read() {
        PopulateBuffer();
        if (_buf.Count == 0) return -1;
        return _buf.Dequeue();
    }

    void PopulateBuffer() {
        const int endSentinel = -1;
        while (_buf.Count == 0 &amp;&amp; _source.Peek() != endSentinel) {
            // Strings in .NET are assumed to be UTF-16 encoded [1].
            var c = (char) _source.Read();
            if (Entities.ContainsKey(c)) {
                // Encode all entities defined in the XML spec [2].
                foreach (var i in Entities[c]) _buf.Enqueue(i);
            }
            else if (!(0x0 &lt;= c &amp;&amp; c &lt;= 0x8) &amp;&amp;
                     !new[] { 0xB, 0xC }.Contains(c) &amp;&amp;
                     !(0xE &lt;= c &amp;&amp; c &lt;= 0x1F) &amp;&amp;
                     !(0x7F &lt;= c &amp;&amp; c &lt;= 0x84) &amp;&amp;
                     !(0x86 &lt;= c &amp;&amp; c &lt;= 0x9F) &amp;&amp;
                     !(0xD800 &lt;= c &amp;&amp; c &lt;= 0xDFFF) &amp;&amp;
                     !new[] { 0xFFFE, 0xFFFF }.Contains(c)) {
                // Allow if the Unicode codepoint is legal in XML [3].
                _buf.Enqueue(c);
            }
            else if (char.IsHighSurrogate(c) &amp;&amp;
                     _source.Peek() != endSentinel &amp;&amp;
                     char.IsLowSurrogate((char) _source.Peek())) {
                // Allow well-formed surrogate pairs [1].
                _buf.Enqueue(c);
                _buf.Enqueue((char) _source.Read());
            }
            else if (!_filterIllegalChars) {
                // Note that we cannot encode illegal characters as entity
                // references due to the "Legal Character" constraint of
                // XML [4]. Nor are they allowed in CDATA sections [5].
                throw new ArgumentException(
                    String.Format("Illegal character: '{0:X}'", (int) c));
            }
        }
    }

    static readonly Dictionary&lt;char,string&gt; Entities =
        new Dictionary&lt;char,string&gt; {
            { '"', "&amp;quot;" }, { '&amp;', "&amp;amp;"}, { '\'', "&amp;apos;" },
            { '&lt;', "&amp;lt;" }, { '&gt;', "&amp;gt;" },
        };

    // References:
    // [1] http://en.wikipedia.org/wiki/UTF-16/UCS-2
    // [2] http://www.w3.org/TR/xml11/#sec-predefined-ent
    // [3] http://www.w3.org/TR/xml11/#charsets
    // [4] http://www.w3.org/TR/xml11/#sec-references
    // [5] http://www.w3.org/TR/xml11/#sec-cdata-sect
}
</code></pre>
<p>Unit tests and full code can be found <a href="https://github.com/mkropat/.NET-Snippets/blob/master/XmlTextEncoder.cs" rel="noreferrer">here</a>.</p>