Converting a regular expression for skipping over missing information from Python to Java 7
<p>I have the following test cases, for which I need a regular expression that captures particular pieces of information in named groups.</p>
<p>The test cases are:</p>
<pre><code>Title v01
Title v01 c01
Title v01 c01-02
Title c01
Title c01-02
</code></pre>
<p>The groups to look for are: Title, Volume, ChapterStart and ChapterEnd. For example, in the case below they would capture:</p>
<pre><code>Title v02 c05-08
1      2   3  4

1. Title: "Title"
2. Volume: "02"
3. ChapterStart: "05"
4. ChapterEnd: "08"
</code></pre>
<p>The only mandatory group is Title; the others are optional, as can be seen in the test cases listed above.</p>
<p>So far I have come up with the following regex, which successfully handles the first three cases:</p>
<pre><code>(?P&lt;Title&gt;.*)((((( |\.)v))(?P&lt;Volume&gt;\d+))(( |\.)c(?P&lt;ChapterStart&gt;\d+)(-(?P&lt;ChapterEnd&gt;\d+))?)?)
</code></pre>
<p>However, I don't know how to handle the last two cases in a single regular expression without reusing group names. The group names are fixed because the program that receives this regex uses them to extract particular pieces of information (series names, titles, volume and chapter numbers, etc.).</p>
<p>I have two problems now:</p>
<ol>
<li><p>How to make the regular expression handle the case where the volume information is missing.</p></li>
<li><p>How to convert this regular expression from Python <code>re</code> syntax to Java 1.7 regex syntax. The Java version currently fails to match the exact same test case. (I only changed the named group notation by removing the 'P'.)</p></li>
</ol>
<p>Any help will be greatly appreciated.</p>
<p>Note: The title can be any number of words containing uppercase letters, lowercase letters, numbers, special characters and Unicode characters, separated by spaces, dots, underscores and/or dashes, etc. So the only way to identify the end of the title is to find a v or c preceded by a separator character (space, dot, underscore, etc.) or to reach the end of the string.</p>
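<p>The requirements above can be pinned down with a small Python sketch. It is one possible pattern, not a verified solution for the receiving program: the separator class <code>[ ._]</code> is an assumption taken from the note (space, dot, underscore), and the helper name <code>parse</code> is hypothetical. The idea is to make Title lazy and anchor the pattern at both ends, so the optional Volume and Chapter groups can each be skipped independently.</p>

```python
import re

# Assumption: separators before "v"/"c" are space, dot or underscore.
PATTERN = re.compile(
    r"^(?P<Title>.+?)"                 # lazy: take as little as possible
    r"(?:[ ._]v(?P<Volume>\d+))?"      # optional volume, e.g. " v01"
    r"(?:[ ._]c(?P<ChapterStart>\d+)"  # optional chapter, e.g. " c01"
    r"(?:-(?P<ChapterEnd>\d+))?)?"     # optional chapter range end, e.g. "-02"
    r"$"                               # anchor so the lazy Title must stretch
)


def parse(name):
    """Return the named groups as a dict, or None if nothing matches."""
    m = PATTERN.match(name)
    return m.groupdict() if m else None


if __name__ == "__main__":
    for case in ["Title v01", "Title v01 c01", "Title v01 c01-02",
                 "Title c01", "Title c01-02"]:
        print(case, "->", parse(case))
```

<p>Groups that do not take part in a match come back as <code>None</code>. For Java 7 the named-group syntax is <code>(?&lt;Title&gt;...)</code> rather than <code>(?P&lt;Title&gt;...)</code>. Note also that <code>Matcher.matches()</code> requires the whole input to match, whereas <code>Matcher.find()</code> searches for a match anywhere in the input, so the choice of method can change the result for an unanchored pattern.</p>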
 
