<p>Here is a solution using lxml. The main idea is to collect the <code>href</code> attributes of the parent, of the preceding elements (nearest first), and of the following elements, then iterate over those three lists round-robin and return the first value found:</p>
<pre><code>def find_nearest(elt):
    preceding = elt.xpath('preceding::*/@href')[::-1]
    following = elt.xpath('following::*/@href')
    parent = elt.xpath('parent::*/@href')
    for href in roundrobin(parent, preceding, following):
        return href
</code></pre>
<p>A similar solution using BeautifulSoup's (bs4's) <a href="http://www.crummy.com/software/BeautifulSoup/bs4/doc/#next-elements-and-previous-elements" rel="nofollow">next_elements and previous_elements</a> should also be possible.</p>
<hr>
<pre><code>import lxml.html as LH
import itertools

def find_nearest(elt):
    # Reverse the preceding hrefs so the closest one comes first.
    preceding = elt.xpath('preceding::*/@href')[::-1]
    following = elt.xpath('following::*/@href')
    parent = elt.xpath('parent::*/@href')
    # Return the first href produced by the round-robin iteration.
    for href in roundrobin(parent, preceding, following):
        return href

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --&gt; A D E B F C"
    # http://docs.python.org/library/itertools.html#recipes
    # Author: George Sakkis
    pending = len(iterables)
    # Python 3: use __next__ (the Python 2 version used .next).
    nexts = itertools.cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for n in nexts:
                yield n()
        except StopIteration:
            # Drop the exhausted iterator and keep cycling over the rest.
            pending -= 1
            nexts = itertools.cycle(itertools.islice(nexts, pending))

tekst = '''
&lt;li&gt;
  &lt;div class="views-field-field-webrubrik-value"&gt;
    &lt;h3&gt;
      &lt;a href="/307046"&gt;Claus Hjort spiller med mærkede kort&lt;/a&gt;
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="views-field-field-skribent-uid"&gt;
    &lt;div class="byline"&gt;Af: &lt;span class="authors"&gt;Dennis Kristensen&lt;/span&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="views-field-field-webteaser-value"&gt;
    &lt;div class="webteaser"&gt;Claus Hjort Frederiksens argumenter for at afvise trepartsforhandlinger har ikke hold i virkeligheden.
Hans ærinde er nok snarere at forberede det ideologiske grundlag for en Løkke Rasmussens genkomst som statsministe
    &lt;/div&gt;
  &lt;/div&gt;
  &lt;span class="views-field-view-node"&gt;
    &lt;span class="actions"&gt;
      &lt;a href="/307046"&gt;Læs mere&lt;/a&gt;
      |
      &lt;a href="/307046/#comments"&gt;Kommentarer (4)&lt;/a&gt;
    &lt;/span&gt;
  &lt;/span&gt;
&lt;/li&gt;
'''

to_find = "Rasmussen"
doc = LH.fromstring(tekst)
for x in doc.xpath('//*[contains(text(),{s!r})]'.format(s = to_find)):
    print(find_nearest(x))
</code></pre>
<p>prints</p>
<pre><code>/307046
</code></pre>
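<p>To see why <code>/307046</code> comes out first, the round-robin step can be traced on its own. The sketch below is not the recipe used above: it swaps in a shorter <code>zip_longest</code>-based variant (the <code>_MISSING</code> sentinel is a name introduced here for illustration), which should produce the same order for this example. The three lists are the href values the XPath queries would be expected to return for the matched <code>webteaser</code> element.</p>

```python
from itertools import chain, zip_longest

_MISSING = object()  # hypothetical sentinel marking exhausted inputs

def roundrobin(*iterables):
    """Yield one item from each iterable in turn, skipping exhausted ones."""
    rounds = zip_longest(*iterables, fillvalue=_MISSING)
    return (item for item in chain.from_iterable(rounds) if item is not _MISSING)

# Assumed XPath results for the element containing "Rasmussen":
parent = []                                   # the parent div has no href
preceding = ["/307046"]                       # the headline link, nearest first
following = ["/307046", "/307046/#comments"]  # the two action links

order = list(roundrobin(parent, preceding, following))
nearest = order[0] if order else None

print(order)
print(nearest)
```

<p>Because the parent list is empty, the first value produced is the nearest preceding href, which is what <code>find_nearest</code> returns.</p>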
 
