<p>While waiting for answers I started writing my own parser. It's a bit crude: there is <strong>no cross-browser support</strong>, and I don't modify the text in any way, which means any line breaks and other whitespace from the HTML will be preserved.</p>
<p>There is also a lot of redundancy which I haven't cleaned up yet, such as traversing children of nodes I already know to be hidden.</p>
<p>Anyway, the code:</p>
<pre><code>function ParsedRange(range){
    this.text = "";
    this.nodeIndices = [];

    this.highlight = function(startIndex, endIndex){
        var selection = window.getSelection();
        var startNode = this.nodeIndices[startIndex].node;
        var endNode = this.nodeIndices[endIndex].node;
        var startOffset = startIndex - this.nodeIndices[startIndex].startIndex;
        var endOffset = endIndex - this.nodeIndices[endIndex].startIndex + 1;

        // Scroll into view (scrollIntoViewIfNeeded is a non-standard method)
        startNode.parentNode.scrollIntoViewIfNeeded();

        // Highlight
        range.setStart(startNode, startOffset);
        range.setEnd(endNode, endOffset);
        selection.removeAllRanges();
        selection.addRange(range);
    };

    // Parsing starts here
    var startIndex;
    var rootNode = range.commonAncestorContainer;
    var startNode = range.startContainer;
    var endNode = range.endContainer;
    var treeWalker = document.createTreeWalker(rootNode, NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_TEXT, null, false); // Only walk text and element nodes
    var currentNode = treeWalker.currentNode;

    // Move to start node
    while (currentNode &amp;&amp; currentNode != startNode) currentNode = treeWalker.nextNode();

    // Extract text
    var nodeText;
    while (currentNode &amp;&amp; currentNode != endNode){ // Handle end node separately
        // Continue to next node if current node is hidden
        if (isHidden(currentNode)){
            currentNode = treeWalker.nextNode();
            continue;
        }

        // Extract text if text node
        if (currentNode.nodeType == 3){
            if (currentNode == startNode) nodeText = currentNode.nodeValue.substring(range.startOffset); // Extract from start of selection if first node
            else nodeText = currentNode.nodeValue; // Else extract entire node
            this.text += nodeText;

            if (currentNode == startNode) startIndex = range.startOffset * -1;
            else startIndex = this.nodeIndices.length;
            for (var i=0; i&lt;nodeText.length; i++){
                this.nodeIndices.push({ startIndex: startIndex, node: currentNode });
            }
        }

        // Continue to next node
        currentNode = treeWalker.nextNode();
    }

    // Extract text from end node if it's a text node
    if (currentNode == endNode &amp;&amp; currentNode.nodeType == 3 &amp;&amp; !isHidden(currentNode)){
        if (endNode == startNode) nodeText = currentNode.nodeValue.substring(range.startOffset, range.endOffset); // Extract only selected part if end and start nodes are the same
        else nodeText = currentNode.nodeValue.substring(0, range.endOffset); // Else extract up to where the selection ends in the end node
        this.text += nodeText;

        if (currentNode == startNode) startIndex = range.startOffset * -1;
        else startIndex = this.nodeIndices.length;
        for (var i=0; i&lt;nodeText.length; i++){
            this.nodeIndices.push({ startIndex: startIndex, node: currentNode });
        }
    }

    return this;
}

ParsedRange.removeHighlight = function(){
    window.getSelection().removeAllRanges();
};

function isHidden(element){
    // Get parent node if element is a text node
    if (element.nodeType == 3) element = element.parentNode;

    // Only check visibility of the element itself
    if (window.getComputedStyle(element, null).getPropertyValue("visibility") == "hidden") return true;

    // Check display and dimensions for element and its parents
    while (element){
        if (element.nodeType == 9) return false; // Document
        if (element.tagName == "NOSCRIPT") return true;
        if (window.getComputedStyle(element, null).getPropertyValue("display") == "none") return true;
        if (element.offsetWidth == 0 || element.offsetHeight == 0){
            // If element does not have overflow:visible it is hidden
            if (window.getComputedStyle(element, null).getPropertyValue("overflow") != "visible"){
                return true;
            }
        }
        element = element.parentNode;
    }
    return false;
}
</code></pre>
<p>I've made it a class (apart from the <code>isHidden()</code> helper function) because of the way it's integrated in my project.</p>
<p>That aside, the class works by passing it a valid range. It then extracts the visible text inside the range and saves references to the text nodes that text came from. These references are used in the <code>highlight()</code> function, which uses the browser selection to highlight text based on start and end character indices (the end index is inclusive).</p>
<p>An extra note on the <code>nodeIndices</code> property (seeing as that might not make sense). <code>nodeIndices</code> is an array containing objects of the form:</p>
<pre><code>{
    startIndex: // Int
    node:       // Reference to text node
}
</code></pre>
<p>For every single character I extract into my resulting text I push one of those objects onto <code>nodeIndices</code>. The <code>node</code> property is simply a reference to the text node the character came from. <code>startIndex</code> defines at which character the node begins in the entire text. For a start node that is only partly selected it is the negated start offset, so that subtracting it from a character index still gives the correct offset inside the node.</p>
<p>Using this array I can translate from a character index in <code>ParsedRange.text</code> to an HTML node and the index of the corresponding character inside that node.</p>
<p><strong>Example of use:</strong></p>
<pre><code>// Get start/end nodes and offsets for range
var startNode;   // Code to get start node goes here, can be a text node or an element node
var startOffset; // Offset into the start node
var endNode;     // Code to get end node goes here, can be a text node or an element node
var endOffset;   // Offset into the end node

// Create the range
var range = document.createRange();
range.setStart(startNode, startOffset);
range.setEnd(endNode, endOffset);

// Parse the range using the ParsedRange class
var parsedRange = new ParsedRange(range);
parsedRange.text; // Contains visible text with whitespace preserved
parsedRange.highlight(startIndex, endIndex); // startIndex/endIndex are character indices into parsedRange.text; highlights that text using browser selection
</code></pre>
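<p>To make the <code>nodeIndices</code> lookup more concrete, here is a minimal sketch that runs without a DOM. The plain objects <code>nodeA</code> and <code>nodeB</code> and the helpers <code>buildNodeIndices()</code> and <code>locate()</code> are hypothetical names used only for illustration; they are not part of the class above. The sketch also assumes whole nodes are extracted, so it skips the negated start offset used for a partly selected start node.</p>

```javascript
// Hypothetical stand-ins for DOM text nodes: only nodeValue is needed here.
var nodeA = { nodeValue: "Hello " };
var nodeB = { nodeValue: "world" };

// Build the per-character lookup the same way ParsedRange does for whole nodes:
// one entry per extracted character, each pointing at its source node.
function buildNodeIndices(nodes) {
    var nodeIndices = [];
    var text = "";
    nodes.forEach(function (node) {
        // Position of this node's first character in the combined text
        var startIndex = nodeIndices.length;
        for (var i = 0; i < node.nodeValue.length; i++) {
            nodeIndices.push({ startIndex: startIndex, node: node });
        }
        text += node.nodeValue;
    });
    return { text: text, nodeIndices: nodeIndices };
}

// Translate an index into the combined text to a (node, offset) pair.
function locate(nodeIndices, charIndex) {
    var entry = nodeIndices[charIndex];
    return { node: entry.node, offset: charIndex - entry.startIndex };
}

var parsed = buildNodeIndices([nodeA, nodeB]);
var hit = locate(parsed.nodeIndices, parsed.text.indexOf("world"));
console.log(hit.node === nodeB, hit.offset); // true 0
```

<p>This is the same subtraction <code>highlight()</code> performs for its start offset. The array trades memory for simplicity: one entry per character makes the lookup a plain array access rather than a search over node boundaries.</p>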
 
