    <p>Parsing MediaWiki content in any generic sense is pretty much impossible short of using MediaWiki itself. In order to parse it you need to be able to fully parse HTML and CSS (since they can be embedded), and handle full template instantiation and expansion, as well as any parser additions the relevant content may have been using. That template instantiation is equivalent to a preprocessor.</p> <p>It is in some senses similar to parsing C++, except the parser must also handle malformed input and arbitrary syntax additions made by parser extensions. The actual MediaWiki implementation is a lot like Perl 5: the original implementation was not so bad, because all the edge cases just fall out of however things are linked together, but getting any subsequent implementation to do the same thing is really complicated, especially since the behaviors are often emergent and undocumented, not designed.</p> <p>If you do not need 100% of pages to work, or do not need to extract all content, you might be able to cobble something together that works for you, and as you have noted there are some packages that do that. Short of knowing your precise needs, I doubt anyone can give you a substantially better answer on how to parse it. If you need to work on every page and correctly parse everything, you had better have a fairly large team and several years to work, and even then you will still have lots of small edge cases.</p> <p>So in short: no, an EBNF grammar is not well suited to parsing MediaWiki markup, but nothing really is...</p>
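    <p>To make the "cobble something together" option concrete, here is a minimal sketch in Python of a best-effort text extractor. It is an illustration only, not a MediaWiki parser: the function name <code>strip_simple_wikitext</code> is hypothetical, and it assumes you only care about plain text and can tolerate anything unusual (unbalanced templates, parser extensions, embedded HTML, multi-parameter file links) passing through untouched or being lost.</p>

```python
import re


def strip_simple_wikitext(text: str) -> str:
    """Best-effort plain-text extraction; NOT a real MediaWiki parser."""
    # Drop HTML comments.
    text = re.sub(r"<!--.*?-->", "", text, flags=re.DOTALL)

    # Templates are simply discarded, never expanded. Removing the innermost
    # {{...}} repeatedly makes balanced nesting disappear layer by layer.
    innermost = re.compile(r"\{\{[^{}]*\}\}")
    while True:
        text, count = innermost.subn("", text)
        if count == 0:
            break

    # [[target|label]] -> label, [[target]] -> target.
    # Links with more than one pipe (e.g. file links) are left as they are.
    text = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]", r"\1", text)

    # '''bold''' and ''italic'' quote runs are removed.
    text = re.sub(r"'{2,5}", "", text)
    return text


sample = "'''Foo''' is a [[Bar|bar]] in [[Baz]].{{cite|x={{y}}}}"
print(strip_simple_wikitext(sample))  # prints: Foo is a bar in Baz.
```

    <p>Every rule here is an approximation. Because templates are dropped rather than expanded, any text they would have produced is gone, which is exactly the preprocessor problem described above.</p>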