<p>You could write your own XML parser. No, seriously! Check out <a href="http://kschiess.github.com/parslet/" rel="nofollow">Parslet</a>. In fact, the examples it comes with include an <a href="https://github.com/kschiess/parslet/blob/master/example/simple_xml.rb" rel="nofollow">XML parser</a>.</p>
<p>Something like this:</p>
<pre><code>require 'parslet'
require 'parslet/convenience' # provides parse_with_debug

class XML &lt; Parslet::Parser
  root :document

  # A document is one or more formatted spans or runs of plain text.
  rule(:document) { (formatting | text).repeat(1) }

  rule(:formatting) {
    tag_pair('b').as(:bold) |
      tag_pair('u').as(:underline) |
      tag_pair('i').as(:italic)
  }

  # Matches a single tag such as &lt;b&gt; or &lt;/b&gt;.
  def tag(type)
    str('&lt;') &gt;&gt; str(type) &gt;&gt; str('&gt;')
  end

  # Matches an opening tag, optional nested content, and the closing tag.
  def tag_pair(type)
    tag(type) &gt;&gt; document.maybe &gt;&gt; tag("/" + type)
  end

  rule(:text) { match('[^&lt;&gt;]').repeat(1).as(:text) }
end

parser = XML.new
input = ARGV[0]
puts parser.parse_with_debug(input).inspect
</code></pre>
<p>Running it produces something like this:</p>
<pre><code>&gt; ruby xmlparser.rb "&lt;b&gt;bold&lt;i&gt;italic&lt;/i&gt; bold again &lt;u&gt;underlined&lt;/u&gt;&lt;/b&gt;"
</code></pre>
<pre><code>[{:bold=&gt;[{:text=&gt;"bold"@3}, {:italic=&gt;[{:text=&gt;"italic"@10}]}, {:text=&gt;" bold again "@21}, {:underline=&gt;[{:text=&gt;"underlined"@36}]}]}]
</code></pre>
<p>As you can see, this tree has style nodes for bold, italic, etc., and the content inside them.</p>
<p>It could easily be extended to handle whitespace and other tags you care about. Tags you don't care about are a little harder to deal with.</p>
<p>Anyway, this is just to show the possibilities.</p>
<p>With Parslet you typically then write a Transform class to convert this tree structure into whatever you need in the end. I love the way Parslet splits parsing from using the parsed data.</p>
<p>Hope this helps.</p>
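<p>To make the "parse first, use the data later" idea concrete, here is a minimal sketch of the second step. It is not Parslet's own <code>Parslet::Transform</code>. It is a plain-Ruby stand-in that walks a tree of the same shape, so it runs without the gem. The <code>render</code> method, its Markdown-like markers, and the hand-written <code>tree</code> (plain strings in place of the slices the parser returns) are all assumptions made for illustration.</p>

```ruby
# Hypothetical renderer: walks a parse tree shaped like the parser output
# and emits Markdown-like text. Not part of Parslet.
def render(node)
  # Assumed mapping from style node to the marker that wraps its content.
  markers = { bold: "**", italic: "*", underline: "_" }

  case node
  when Array
    # A list of sibling nodes: render each one and concatenate.
    node.map { |child| render(child) }.join
  when Hash
    node.map do |key, value|
      if key == :text
        value.to_s
      else
        marker = markers.fetch(key) # raises KeyError on an unknown style
        "#{marker}#{render(value)}#{marker}"
      end
    end.join
  else
    node.to_s
  end
end

# Same shape as the tree printed by the parser, with plain strings.
tree = [{ bold: [{ text: "bold" },
                 { italic: [{ text: "italic" }] },
                 { text: " bold again " },
                 { underline: [{ text: "underlined" }] }] }]

puts render(tree)
```

<p>The parser never needs to know what the output format is. Swapping the markers, or replacing <code>render</code> with a real Transform class, leaves the grammar untouched.</p>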
 
