  1. Sentence parsing with regular expressions including bullet lists in Java
    <p>Currently, I use the following regular expression to parse sentences in a document:</p>
    <pre><code>Pattern.compile("(?&lt;=\\w[\\w\\)\\]](?&lt;!Mrs?|Dr|Rev|Mr|Ms|vs|abd|ABD|Abd|resp|St|wt)[\\.\\?\\!\\:\\@]\\s)");
</code></pre>
    <p>This almost works. For example, given this string:</p>
    <p>"Mary had a little lamb (i.e. lamby pie). Here are its properties: 1. It has four feet 2. It has fleece 3. It is a mammal. It had white fleese. Her father, Mr. Lamb, lives on Mulbery St. in a little white house."</p>
    <p>I get the following sentences:</p>
    <pre><code>Mary had a little lamb (i.e. lamby pie).
Here are its properties:
1. It has four feet 2. It has fleece 3. It is a mammal.
It had white fleese.
Her father, Mr. Lamb, lives on Mulbery St. in a little white house.
</code></pre>
    <p>However, what I would like is:</p>
    <pre><code>Mary had a little lamb (i.e. lamby pie).
Here are its properties:
1. It has four feet
2. It has fleece
3. It is a mammal.
It had white fleese.
Her father, Mr. Lamb, lives on Mulbery St. in a little white house.
</code></pre>
    <p>Is there any way to do this by altering the existing regular expression?</p>
    <p>Right now, to accomplish this task, I first do an initial split and then check each piece for bullets. The following code works, but I'm wondering if there is a more elegant solution:</p>
    <pre><code>public static void doHomeMadeSentenceParser(String temp) {
    Pattern p = Pattern
            .compile("(?&lt;=\\w[\\w\\)\\]](?&lt;!Mrs?|Dr|Rev|Mr|Ms|vs|abd|ABD|Abd|resp|St|wt)[\\.\\?\\!\\:\\@]\\s)");
    String[] sentences = p.split(temp);
    Vector psentences = new Vector();
    Pattern p1 = Pattern.compile("\\b\\d+[.)]\\s");
    for (int x = 0; x &lt; sentences.length; x++) {
        Matcher matcher = p1.matcher(sentences[x]);
        int bstart = 0;
        boolean bulletfound = false;
        while (matcher.find()) {
            bulletfound = true;
            String bullet = sentences[x].substring(bstart, matcher.start());
            if (bullet.length() &gt; 0) {
                psentences.add(bullet);
            }
            bstart = matcher.start();
        }
        if (bulletfound)
            psentences.add(sentences[x].substring(bstart));
        else
            psentences.add(sentences[x]);
    }
    for (int x = 0; x &lt; psentences.size(); x++) {
        String s = (String) psentences.get(x);
        System.out.println(s.trim());
    }
}
</code></pre>
    <p>Thanks in advance for any help.</p>
    <p>Elliott</p>
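One possible direction for folding the bullet check into a single pattern is sketched below. It is not a confirmed solution, only an illustration under stated assumptions: the sentence-boundary lookbehind is copied unchanged from the question, and a second, hypothetical alternative is added that matches the zero-width position preceded by whitespace and followed by a numbered bullet (digits, then "." or ")", then whitespace). The class name SentenceSplitSketch and the helper split are made up for this example. Note that the bullet alternative requires whitespace before the number, which is slightly stricter than the \b used in the two-pass code, and like that code it would also split before a sentence-final number such as a year followed by a period.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SentenceSplitSketch {
    // Sentence-boundary lookbehind taken verbatim from the question.
    private static final String SENTENCE_END =
        "(?<=\\w[\\w\\)\\]](?<!Mrs?|Dr|Rev|Mr|Ms|vs|abd|ABD|Abd|resp|St|wt)[\\.\\?\\!\\:\\@]\\s)";

    // Assumption (not from the question): also split at a position that is
    // preceded by whitespace and followed by a numbered bullet such as "2. " or "2) ".
    private static final String BULLET_START = "(?<=\\s)(?=\\d+[.)]\\s)";

    private static final Pattern SPLITTER =
        Pattern.compile(SENTENCE_END + "|" + BULLET_START);

    // Splits the text in one pass, trims each piece, and drops empty pieces.
    static List<String> split(String text) {
        List<String> result = new ArrayList<>();
        for (String piece : SPLITTER.split(text)) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Mary had a little lamb (i.e. lamby pie). Here are its properties: "
            + "1. It has four feet 2. It has fleece 3. It is a mammal. It had white fleese. "
            + "Her father, Mr. Lamb, lives on Mulbery St. in a little white house.";
        for (String sentence : split(text)) {
            System.out.println(sentence);
        }
    }
}
```

Both alternatives are zero-width, so no characters are consumed by the split, and each piece keeps its own punctuation and bullet number. Whether the combined pattern is more elegant than the two-pass loop is a judgment call, since the bullet heuristic is equally approximate in either form.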