<p>Your sample output is confusing. The first line implies that you want to break the street address down into its individual components, but in the rest of the lines it's all bunched together. I would expect the desired result to be either:</p>
<pre><code>"123 SUNNYSIDE AVENUE", "BROOKLYN"
"59 MAIDEN LANE", "MANHATTAN"
"59 MAIDEN LANE", "MANHATTAN"
"39-076 46 STREET", "SUNNYSIDE"
"39-076 46 STREET", "SUNNYSIDE"
"59 MAIDEN LANE", "MANHATTAN"
</code></pre>
<p>...or:</p>
<pre><code>"123", "SUNNYSIDE", "AVENUE", "BROOKLYN"
"59", "MAIDEN", "LANE", "MANHATTAN"
"59", "MAIDEN", "LANE", "MANHATTAN"
"39-076", "46", "STREET", "SUNNYSIDE"
"39-076", "46", "STREET", "SUNNYSIDE"
"59", "MAIDEN", "LANE", "MANHATTAN"
</code></pre>
<p>In either case, I would start by matching it with this regex:</p>
<pre><code>^(\S+(?:\s+\S+)*)\s+(MANHATTAN|BROOKLYN|SUNNYSIDE)
</code></pre>
<p>The first group is greedy, so it will initially consume the entire address string. If what follows is not whitespace and a city name (that is, it doesn't match the <code>(MANHATTAN|BROOKLYN|SUNNYSIDE)</code> group), the first group "gives up" one word at a time until the second group <em>does</em> match.</p>
<p>Assuming the string actually contains a city name, and the name is included in the second group's subexpression, it will be captured in group #2. Group #1 will contain the whole street address; if you want it broken up as shown above, you can split it on whitespace.</p>
<p><strong>EDIT:</strong> Here's some sample code to demonstrate. Note especially the use of <code>find()</code> instead of <code>matches()</code>. The behavior of Java's <code>matches()</code> method surprises many people, and it occurred to me that it might be part of the problem here: <code>matches()</code> succeeds only if the regex matches the <em>entire</em> string, whereas <code>find()</code> looks for a match anywhere in it. In a nutshell, using <code>find()</code> is why I had to add <code>^</code> to the beginning of the regex, and why I <em>didn't</em> have to add <code>.*</code> to the end. ;)</p>
<pre><code>import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test {
  public static void main(String[] args) {
    String[] ss = {
      "123 SUNNYSIDE AVENUE BROOKLYN",
      "59 MAIDEN LANE MANHATTAN",
      "59 MAIDEN LANE MANHATTAN 10038",
      "39-076 46 STREET SUNNYSIDE",
      "39-076 46 STREET SUNNYSIDE 11104",
      "59 MAIDEN LANE MANHATTAN NY USA"
    };
    Pattern p = Pattern.compile("^(\\S+(?:\\s+\\S+)*)\\s+(MANHATTAN|BROOKLYN|SUNNYSIDE)");
    // Create the Matcher once and reset() it for each input string.
    Matcher m = p.matcher("");
    for (String s : ss) {
      // find(), not matches(): anything after the city name is ignored.
      if (m.reset(s).find()) {
        System.out.printf("%naddr: '%s'%ncity: '%s'%n", m.group(1), m.group(2));
      }
    }
  }
}
</code></pre>
<p>output:</p>
<pre><code>
addr: '123 SUNNYSIDE AVENUE'
city: 'BROOKLYN'

addr: '59 MAIDEN LANE'
city: 'MANHATTAN'

addr: '59 MAIDEN LANE'
city: 'MANHATTAN'

addr: '39-076 46 STREET'
city: 'SUNNYSIDE'

addr: '39-076 46 STREET'
city: 'SUNNYSIDE'

addr: '59 MAIDEN LANE'
city: 'MANHATTAN'
</code></pre>
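<p>For the fully split form (the second of the two desired results), here is a minimal sketch that reuses the same regex and then splits group #1 on whitespace. The class name <code>AddressSplitDemo</code> and the helper <code>parts()</code> are hypothetical names chosen for illustration, and the city list is still just the three sample names. The <code>main</code> method also contrasts <code>matches()</code> with <code>find()</code> on an input that has a trailing ZIP code.</p>

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AddressSplitDemo {
    // Same regex as above; only the three sample city names are recognized.
    static final Pattern ADDRESS = Pattern.compile(
            "^(\\S+(?:\\s+\\S+)*)\\s+(MANHATTAN|BROOKLYN|SUNNYSIDE)");

    // Hypothetical helper: returns the street-address words followed by the
    // city name, or null if no known city name is found.
    static String[] parts(String s) {
        Matcher m = ADDRESS.matcher(s);
        if (!m.find()) {
            return null;
        }
        // Group 1 is the whole street address; break it up on whitespace.
        String[] words = m.group(1).split("\\s+");
        String[] result = Arrays.copyOf(words, words.length + 1);
        result[words.length] = m.group(2);
        return result;
    }

    public static void main(String[] args) {
        String s = "59 MAIDEN LANE MANHATTAN 10038";
        // matches() must consume the whole input, so the trailing ZIP code makes it fail.
        System.out.println(ADDRESS.matcher(s).matches()); // false
        // find() only needs a match somewhere; the ^ pins it to the start.
        System.out.println(ADDRESS.matcher(s).find());    // true
        System.out.println(Arrays.toString(parts(s)));    // [59, MAIDEN, LANE, MANHATTAN]
    }
}
```

<p>Because group #1 is greedy, an address such as <code>123 SUNNYSIDE AVENUE BROOKLYN</code> still splits at the <em>last</em> city name, so <code>SUNNYSIDE</code> stays part of the street address.</p>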