<p>Using <code>elements = doc.search('//[@box and not(ancestor::@box)]')</code> isn't correct: <code>//</code> must be followed by a node test such as <code>*</code> or <code>div</code>, and <code>ancestor::@box</code> is not a valid step.</p>
<p>Use <code>elements = doc.at('//div[@box]')</code>, which will find the first occurrence.</p>
<p>I'd recommend using <a href="http://nokogiri.org" rel="nofollow">Nokogiri</a> over Hpricot. Nokogiri is well supported, very flexible and more robust.</p>
<hr>
<p>EDIT: Added because the original question changed:</p>
<blockquote> <p>Thanks that worked perfectly, except I forgot to mention that I want to return multiple outer elements. Sorry about that, I have updated the question. I will look into Nokogiri further, I didn't choose it originally because Hpricot seemed more approachable.</p> </blockquote>
<p>Remember that, in its simplest form, an XPath expression works like a path to a file in a directory tree, so you can drill down and search in "subdirectories". If you only want the outer <code>&lt;div&gt;</code> tags, then look inside the <code>&lt;body&gt;</code> level and no further:</p>
<pre><code>doc.search('/html/body/div')
</code></pre>
<p>or, if you might have unadorned <code>div</code> tags along with the targets:</p>
<pre><code>doc.search('/html/body/div[@box]')
</code></pre>
<p>Regarding Hpricot seeming more approachable:</p>
<p>Nokogiri implements a superset of Hpricot's accessors, allowing you to drop it into place for most uses. It supports both XPath and CSS accessors, which gives you a more intuitive way of getting at data if you live in CSS and HTML and don't grok XPath. In addition, there are several equivalent ways to find your desired target:</p>
<pre><code>doc.search('body &gt; div[box]')
(doc / 'body &gt; div[box]')
doc.css('body &gt; div[box]')
</code></pre>
<p>Nokogiri also supports Hpricot's <code>at</code> and its <code>%</code> synonym, along with <code>at_css</code>, if you only want the first occurrence of something.</p>
<p>I started using Nokogiri after running into some situations where Hpricot exploded because it couldn't handle malformed news feeds in the wild.</p>
 
