How can this XPath query (PHP) be more flexible?
<p>I'm parsing an XHTML document using PHP's SimpleXML. I need to search a series of nested <code>ul</code>s in the document for a node containing a specific value, then find the <code>li</code> that directly precedes that node's parent list. Code will help explain!</p>
<p>Given the following dummy XHTML:</p>
<pre><code>&lt;html&gt;
&lt;head&gt;&lt;/head&gt;
&lt;body&gt;
...
&lt;ul class="attr-list"&gt;
  &lt;li&gt;Active Life (active)&lt;/li&gt;
  &lt;ul&gt;
    &lt;li&gt;Amateur Sports Teams (amateursportsteams)&lt;/li&gt;
    &lt;li&gt;Amusement Parks (amusementparks)&lt;/li&gt;
    &lt;li&gt;Fitness &amp; Instruction (fitness)&lt;/li&gt;
    &lt;ul&gt;
      &lt;li&gt;Dance Studios (dancestudio)&lt;/li&gt;
      &lt;li&gt;Gyms (gyms)&lt;/li&gt;
      &lt;li&gt;Martial Arts (martialarts)&lt;/li&gt;
      &lt;li&gt;Pilates (pilates)&lt;/li&gt;
      &lt;li&gt;Swimming Lessons/Schools (swimminglessons)&lt;/li&gt;
    &lt;/ul&gt;
    &lt;li&gt;Go Karts (gokarts)&lt;/li&gt;
    &lt;li&gt;Mini Golf (mini_golf)&lt;/li&gt;
    &lt;li&gt;Parks (parks)&lt;/li&gt;
    &lt;ul&gt;
      &lt;li&gt;Dog Parks (dog_parks)&lt;/li&gt;
      &lt;li&gt;Skate Parks (skate_parks)&lt;/li&gt;
    &lt;/ul&gt;
    &lt;li&gt;Playgrounds (playgrounds)&lt;/li&gt;
    &lt;li&gt;Rafting/Kayaking (rafting)&lt;/li&gt;
    &lt;li&gt;Tennis (tennis)&lt;/li&gt;
    &lt;li&gt;Zoos (zoos)&lt;/li&gt;
  &lt;/ul&gt;
  &lt;li&gt;Arts &amp; Entertainment (arts)&lt;/li&gt;
  &lt;ul&gt;
    &lt;li&gt;Arcades (arcades)&lt;/li&gt;
    &lt;li&gt;Art Galleries (galleries)&lt;/li&gt;
    &lt;li&gt;Wineries (wineries)&lt;/li&gt;
  &lt;/ul&gt;
  &lt;li&gt;Automotive (auto)&lt;/li&gt;
  &lt;ul&gt;
    &lt;li&gt;Auto Detailing (auto_detailing)&lt;/li&gt;
    &lt;li&gt;Auto Glass Services (autoglass)&lt;/li&gt;
    &lt;li&gt;Auto Parts &amp; Supplies (autopartssupplies)&lt;/li&gt;
  &lt;/ul&gt;
  &lt;li&gt;Nightlife (nightlife)&lt;/li&gt;
  &lt;ul&gt;
    &lt;li&gt;Bars (bars)&lt;/li&gt;
    &lt;ul&gt;
      &lt;li&gt;Dive Bars (divebars)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/ul&gt;
&lt;/ul&gt;
...
&lt;/body&gt;
&lt;/html&gt;
</code></pre>
<p>I need to be able to query the <code>ul.attr-list</code> for a child element and discover its "root" category. I cannot change how the XHTML is formed.</p>
<p>So, if I have "galleries" as a category, I need to know that it is in the "arts" root category. Or, if I have "dog_parks", I need to know that it is in the "active" root category. The following code gets the job done, but only on the assumption that there are at most two nested levels:</p>
<pre><code>function get_root_category($shortCategoryName){
    $url = "http://www.yelp.com/developers/documentation/category_list";
    $result = file_get_contents($url);
    $dom = new DOMDocument();
    @$dom-&gt;loadHTML($result);
    $dom-&gt;preserveWhiteSpace = false;
    $sxml = simplexml_import_dom($dom);
    $lvl1 = $sxml-&gt;xpath("//li[contains(., '".$shortCategoryName."')]/parent::ul/preceding-sibling::li");
    $lvl2 = $sxml-&gt;xpath("//li[contains(., '".$shortCategoryName."')]/parent::ul/preceding-sibling::li/parent::ul/preceding-sibling::li");
    if($lvl2){
        return array_pop($lvl2);
    } else {
        return array_pop($lvl1);
    }
}
</code></pre>
<p>There has to be a better way to write that XPath, so that only one query is needed and it holds up across any number of nested levels.</p>
<p>EDIT: Thanks to those who pointed out that this HTML is not valid. However, the structure of the page is set and I cannot edit it; I can only use it as a resource, and have to make do with what it is.</p>
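One possible single-query approach, sketched under assumptions rather than offered as the definitive answer: instead of walking up one level at a time, use the `ancestor` axis to jump straight to the sub-list that sits directly under `ul.attr-list`, then take the nearest preceding `li`. The function name `find_root_category`, the trimmed inline sample, and the choice to match on the parenthesised form `(name)` are all assumptions made for illustration; the original code matches on the bare name and fetches the live page.

```php
<?php
// Trimmed copy of the sample markup from the question (assumption: the real
// page keeps the same li / sibling-ul layout at every depth).
$html = <<<'HTML'
<html><head></head><body>
<ul class="attr-list">
  <li>Active Life (active)</li>
  <ul>
    <li>Fitness &amp; Instruction (fitness)</li>
    <ul>
      <li>Martial Arts (martialarts)</li>
    </ul>
    <li>Parks (parks)</li>
    <ul>
      <li>Dog Parks (dog_parks)</li>
    </ul>
  </ul>
  <li>Arts &amp; Entertainment (arts)</li>
  <ul>
    <li>Art Galleries (galleries)</li>
  </ul>
  <li>Nightlife (nightlife)</li>
  <ul>
    <li>Bars (bars)</li>
    <ul>
      <li>Dive Bars (divebars)</li>
    </ul>
  </ul>
</ul>
</body></html>
HTML;

libxml_use_internal_errors(true);   // collect parser warnings instead of using @
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Hypothetical helper: returns the text of the root category, or null when the
// name is not found or is itself a root category.
function find_root_category(DOMXPath $xpath, string $shortCategoryName): ?string
{
    // XPath 1.0 has no string escaping, so refuse names containing a quote.
    if (strpos($shortCategoryName, "'") !== false) {
        return null;
    }
    // Matching "(name)" keeps "arts" from also matching "martialarts".
    $needle = '(' . $shortCategoryName . ')';
    $query = "//ul[@class='attr-list']//li[contains(., '" . $needle . "')]"
           // the ancestor list that is a direct child of ul.attr-list ...
           . "/ancestor::ul[parent::ul[@class='attr-list']]"
           // ... and the li immediately before it, which labels that list
           . "/preceding-sibling::li[1]";
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

echo find_root_category($xpath, 'dog_parks'), "\n";  // Active Life (active)
echo find_root_category($xpath, 'galleries'), "\n";  // Arts & Entertainment (arts)
```

Because `ancestor::` considers every enclosing list, the same expression covers any nesting depth. The same XPath string should also work with `SimpleXMLElement::xpath()` if the element is preferred over its text.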