  How can I substitute an expression {{ text }} with re.sub() when 'text' may include further {{ text }} blocks?
    <p>I'm trying to parse raw Wikipedia article content, e.g. <a href="http://en.wikipedia.org/w/index.php?title=Sweden&amp;action=raw" rel="nofollow">the article on Sweden</a>, using <code>re.sub()</code>. However, I am running into problems substituting blocks of <code>{{some text}}</code>, because they can contain further blocks of <code>{{some text}}</code>.</p>
    <p>Here's an abbreviated example from the above article:</p>
    <pre><code>{{Infobox country
| conventional_long_name = Kingdom of Sweden
| native_name = {{native name|sv|Konungariket Sverige|icon=no}}
| common_name = Sweden
}}
Some text I do not want parsed.
{{Link GA|eo}}
</code></pre>
    <p>In theory, the curly-brace blocks could be nested to any number of levels.</p>
    <p>If I match the greedy block of <code>{{.+}}</code>, everything is matched from <code>{{Infobox</code> to <code>eo}}</code>, including the text I do not want matched.</p>
    <p>If I match the ungreedy block of <code>{{.+?}}</code>, the part from <code>{{Infobox</code> to <code>icon=no}}</code> is matched, as is <code>{{Link GA|eo}}</code>. But then I'm left with the string <code>| common_name [...] not want parsed.</code></p>
    <p>I also tried variants of <code>\{\{.+(\{\{.+\}\})*.+\}\}</code> and <code>\{\{[^\{]+(\{\{[^\{]+\}\})*[^\{]+\}\}</code>, in the hope of matching only sub-blocks within the larger block, but to no avail.</p>
    <p>I'd list everything I've tried, but I honestly can't remember half of it and I doubt it would be of much use anyway. It always comes back to the same problem: for the closing double braces <code>}}</code> to match, the same number of <code>{{</code> occurrences must have come before them.</p>
    <p>Is this even solvable using regular expressions, or do I need another solution?</p>
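To make the balancing problem above concrete, here is a minimal sketch of one possible workaround, not a definitive answer: instead of matching a whole nested block in one pattern, repeatedly remove the innermost blocks (those containing no further `{{` or `}}`) until none remain. The names `INNERMOST` and `strip_templates` are made up for this illustration, and the sketch assumes plain `{{...}}` pairs only; it makes no attempt to handle other wiki markup such as triple braces.

```python
import re

# Matches an innermost block: "{{", then any characters that do not start
# another "{{" or "}}", then the closing "}}". DOTALL lets a block span lines.
INNERMOST = re.compile(r"\{\{(?:(?!\{\{|\}\}).)*\}\}", re.DOTALL)


def strip_templates(text, replacement=""):
    """Remove {{...}} blocks from the inside out (hypothetical helper).

    The replacement must not itself contain a {{...}} block,
    otherwise the loop would never terminate.
    """
    while True:
        text, count = INNERMOST.subn(replacement, text)
        if count == 0:  # nothing left to peel off
            return text


sample = (
    "{{Infobox country\n"
    "| conventional_long_name = Kingdom of Sweden\n"
    "| native_name = {{native name|sv|Konungariket Sverige|icon=no}}\n"
    "| common_name = Sweden\n"
    "}}\n"
    "Some text I do not want parsed.\n"
    "{{Link GA|eo}}\n"
)

print(strip_templates(sample).strip())
# prints: Some text I do not want parsed.
```

Each pass removes one nesting level, so the number of passes equals the deepest nesting in the input. A single-pass scanner that counts open and close braces would be an alternative if the input is large or deeply nested.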