<p>This is actually a very subtle problem and I think a great question.</p>
<p>My understanding is that an (abbreviated) XPath points to an attribute if and only if its <em>last</em> <code>@</code> is not within a predicate, that is, something of the form <code>[...]</code>, and has no steps after it (something like <code>/...</code>). I think this has the relatively simple regular expression <code>@[^]/]*$</code>, that is, there must be an <code>@</code> that has no <code>]</code> or <code>/</code> after it. Also, if you want to cover unabbreviated XPaths, you can use <code>(@|attribute::)[^]/]*$</code></p>
<p>I've included a test harness that may prove useful in checking this or other tests. Note also that there may be whitespace between tokens, which can complicate some regexes.</p>
<h3>Positive (an attribute)</h3>
<ul> <li><code>@*</code> or <code>@a</code> or <code>../@a</code> or <code>a/@b</code></li> <li><code>a[@b and @c]/@d</code></li> <li><code>a[b[@c="d"]/e[@f and @g]]/h[@i="j"]/@k</code></li> </ul>
<h3>Negative (not an attribute)</h3>
<ul> <li><code>a[@b]</code> or <code>a[@b and @c]</code></li> <li><code>a[b[@c and @d]/@e]</code></li> <li><code>a[b[@c="d"]/e[@f and @g]]/h[@i="j"]/k[5][@l="m"]</code></li> </ul>
<p>I can't think of a legal example where there is a <code>/</code> but not a <code>]</code> after the last <code>@</code>, but I think there might be one.</p>
<p>Hopefully these examples make it at least a little clear that there can be arbitrary nesting of <code>[</code> and <code>]</code>, with <code>@</code>s anywhere in between. Luckily, I think only the very last <code>@</code> and its nesting level matter.</p>
<p>(For reference, the OP's regex fails on <code>@a</code>. My original regex failed on <code>a[@b and @c]</code>.)</p>
<p><em>Edit</em>: It turns out that there are more corner cases, which convinces me that there is no perfectly correct regular expression. 
For example, once you have an attribute node, there are many ways of keeping it, <em>e.g.</em> <code>//@a//</code> or <code>//@a/.</code> in the abbreviated syntax. There are also a variety of more creative ways, such as <code>//@f//[node()]</code>. All in all, it seems that if you want to cover these cases, you need to be able to match <code>[</code> and <code>]</code>, which a basic regular expression cannot do. On the other hand, you could decide this is too contrived ...</p>
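<p>The harness mentioned above is not reproduced here. As a rough stand-in, the following is a minimal Python sketch that runs the unabbreviated-aware regex against the positive and negative examples listed in this answer. The function name <code>looks_like_attribute_path</code> and the list names are made up for illustration, and the <code>]</code> inside the character class is backslash-escaped, which is equivalent to the pattern given above but less dependent on regex dialect.</p>

```python
import re

# Pattern from the answer: the last "@" (or "attribute::") must have
# no "]" and no "/" between it and the end of the expression.
ATTRIBUTE_TAIL = re.compile(r"(@|attribute::)[^\]/]*$")


def looks_like_attribute_path(xpath: str) -> bool:
    """Heuristic only: True if the regex thinks the XPath selects an attribute.

    The name is hypothetical; this is the regex test, not a real XPath parser.
    """
    return ATTRIBUTE_TAIL.search(xpath.strip()) is not None


# Examples copied from the lists in the answer.
POSITIVE = [
    "@*",
    "@a",
    "../@a",
    "a/@b",
    "a[@b and @c]/@d",
    'a[b[@c="d"]/e[@f and @g]]/h[@i="j"]/@k',
]
NEGATIVE = [
    "a[@b]",
    "a[@b and @c]",
    "a[b[@c and @d]/@e]",
    'a[b[@c="d"]/e[@f and @g]]/h[@i="j"]/k[5][@l="m"]',
]

if __name__ == "__main__":
    for xpath in POSITIVE + NEGATIVE:
        print(f"{looks_like_attribute_path(xpath)!s:5}  {xpath}")
    # Corner case from the edit: "//@a/." still yields the attribute node,
    # but the trailing "/." step makes the regex reject it.
    print(looks_like_attribute_path("//@a/."))  # False (a known miss)
```

<p>The last line shows the limitation discussed in the edit: the regex only inspects what follows the final <code>@</code>, so any trailing step that keeps the attribute node is reported as a non-attribute.</p>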
 
