Which essential characters are converted to HTML entities?
<p>I am trying to write a routine that compares the $(element).text() and $(element).html() outputs to determine the locations of HTML tags. This will later be used for applying formatting tags like "strong" and "em" to a contenteditable, without resorting to document.execCommand().</p>
<p>At this point I realize that to make the comparison work, characters such as '&gt;', '&lt;', and '&amp;' in the $(element).text() output need to be converted to their respective HTML entities. From Firebug I see that these characters are automatically converted in the innerHTML property. I have tried other characters, such as quotes and umlauts, and these do not get converted.</p>
<p>My questions are:</p>
<ol>
<li><p>Is there an essential set of characters (my guess would be &gt;, &lt;, and &amp;) that gets converted consistently across browsers? My target browsers are Firefox and Chrome, no IE for this, thank goodness.</p></li>
<li><p>Is this set of characters respected by jQuery's .html() method, or is jQuery doing its own thing to level the differences across browsers?
If so, where can I find a comprehensive list of just the essential characters that jQuery converts to entities?</p></li>
</ol>
<p>Further clarification:</p>
<p>If, on a contenteditable, I have a paragraph with this text entered manually:</p>
<pre><code>some text, and some characters &gt;, &lt;, ", &amp;, ', ë
</code></pre>
<p><code>$('p').text()</code> will give me:</p>
<pre><code>some text, and some characters &gt;, &lt;, ", &amp;, ', ë
</code></pre>
<p>while <code>$('p').html()</code> will give me:</p>
<pre><code>some text, and some characters &amp;gt;, &amp;lt;, ", &amp;amp;, ', ë
</code></pre>
<p>This is also the result I see in both Firebug and the Chrome developer tools.</p>
<p>The &lt;, &gt;, and &amp; are obviously essential in order for the whole thing to work, while quotes and special characters are not.</p>
<p>I want to convert the result of <code>$('p').text()</code> via find/replace all, so that it matches the output of <code>$('p').html()</code>, minus the tags themselves.</p>
<p>I need to know which other characters besides the obvious &lt;, &gt;, and &amp; need to be converted to HTML entities to get a perfect match.</p>
<p>What this is for:</p>
<p>I am trying to build a simple WYSIWYM editor with a contenteditable div, without resorting to the existing WYMEditor and the iframe it comes with.</p>
<p>This will be used in a controlled environment (my custom CMS) and will allow for a subset of the features expected in an HTML editor. Basically the whole thing is a bunch of P, H1-H6, and UL&gt;LI, OL&gt;LI tags located in a contenteditable div.</p>
<p>The content tags (P, H1-H6, and LI tags that don't have UL or OL children) will be allowed to contain only STRONG, EM, A, SUB, SUP and SPAN tags.</p>
<p>I am not targeting IE, but would like to have this working in FF and Chrome without the platform differences. One of these platform differences is the way document.execCommand() is carried out when bolding or italicizing text: FF and Chrome wrap the selection in different tags.
I have decided to use the following way to apply formatting:</p>
<ol>
<li>Get the selection range.</li>
<li>List all "content tags" within the range.</li>
<li>Using the range object and its relation to each "content tag", I define three chunks of text: before selection, selection, and after selection. These come as straight text, with special characters not converted to entities.</li>
<li>For each "content tag" innerHTML, I parse character by character to decompose it into a "map" of each kind of tag. I have established a hierarchy of tags: a &gt; span &gt; sub|sup &gt; strong &gt; em. The "map" will be something like this:</li>
</ol>
<p>For innerHTML: <code>this &lt;em&gt;is &lt;strong&gt;a&lt;/strong&gt;&lt;/em&gt; &lt;a href="#"&gt;&lt;strong&gt;test&lt;/strong&gt; text&lt;/a&gt;</code></p>
<pre><code>text:   this is a test text
a:      __________XXXXXXXXX
strong: ________X_XXXX_____
em:     _____XXXX__________
</code></pre>
<ol start="5">
<li>Using the before selection, selection and after selection text, as well as the formatting operation, I then create a mask. For instance, if 'this is' needs to be bolded, the mask would be:</li>
</ol>
<blockquote>
<pre><code>text:   this is a test text
strong: XXXXXXX____________
</code></pre>
</blockquote>
<ol start="6">
<li>After combining the mask with the map, the resulting map is:</li>
</ol>
<blockquote>
<pre><code>text:   this is a test text
a:      __________XXXXXXXXX
strong: XXXXXXX_X_XXXX_____
em:     _____XXXX__________
</code></pre>
</blockquote>
<ol start="7">
<li>This map gets converted to HTML:</li>
</ol>
<blockquote>
<pre><code>&lt;strong&gt;this &lt;em&gt;is&lt;/em&gt;&lt;/strong&gt;&lt;em&gt; &lt;/em&gt;&lt;strong&gt;&lt;em&gt;a&lt;/em&gt;&lt;/strong&gt; &lt;a href="#"&gt;&lt;strong&gt;test&lt;/strong&gt; text&lt;/a&gt;
</code></pre>
</blockquote>
<ol start="8">
<li>Replace the "content tag"'s innerHTML with the resulting HTML.</li>
</ol>
<p>Now the reason I asked this question is that I need the text chunks extracted from the HTML and the texts given to me by the ranges to match perfectly.
So I cannot convert just any special character, only the "essential" ones.</p>
<p>I am aware that this might not be the easiest or fastest way to approach this problem, but I am a visual thinker, and somehow laying out the problem in a two-dimensional grid greatly helps.</p>
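<p>To make the intended find/replace concrete, here is a minimal sketch of the conversion I have in mind. The function name <code>escapeTextLikeInnerHTML</code> is made up for illustration. The character set is an assumption: the three obvious characters, plus the non-breaking space (U+00A0), which the HTML fragment serialization rules also escape in text nodes. Whether that set is complete in Firefox and Chrome is exactly what I am asking.</p>

```javascript
// Hypothetical helper: escape plain text so it resembles what innerHTML
// returns for a text node.
// ASSUMPTION: the "essential" set is &, <, > and U+00A0 (non-breaking space).
// Quotes and accented letters such as ë are left untouched.
function escapeTextLikeInnerHTML(text) {
  return text
    .replace(/&/g, '&amp;')        // must run first, or later entities get double-escaped
    .replace(/\u00A0/g, '&nbsp;')  // assumed: non-breaking space is serialized as &nbsp;
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// The sample paragraph from the question, as plain text.
var sample = 'some text, and some characters >, <, ", &, \', ë';
var escaped = escapeTextLikeInnerHTML(sample);
console.log(escaped);
```

<p>In the browser, the idea would be to check that <code>escapeTextLikeInnerHTML($('p').text())</code> equals <code>$('p').html()</code> for a paragraph that contains no child tags, and only then rely on it for the map.</p>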