<p><strong>Currently (including Java 8) this can be done with <code>split()</code>, but do not use this approach in real-world code, because it appears to rely on a bug: look-behind in Java should have an obvious maximum length, yet this solution uses <code>\w+</code>, which does not respect that limitation. Use the <code>Pattern</code> and <code>Matcher</code> classes instead to avoid overcomplicating things and creating a maintenance headache, since this behaviour may change in later versions of Java or in Java-like environments such as Android.</strong></p>
<hr>
<p>Is this what you are looking for? <br/><sub>(you can replace <code>\\w</code> with <code>\\S</code> to include all non-space characters, but for this example I will keep <code>\\w</code>, since a regex with <code>\\w\\s</code> is easier to read than one with <code>\\S\\s</code>)</sub></p>
<pre><code>String input = "one two three four five six seven";
String[] pairs = input.split("(?&lt;!\\G\\w+)\\s");
System.out.println(Arrays.toString(pairs)); // needs: import java.util.Arrays;
</code></pre>
<p>output:</p>
<pre><code>[one two, three four, five six, seven]
</code></pre>
<hr>
<p><code>\G</code> matches at the position where the previous match ended, and <code>(?&lt;!regex)</code> is a negative look-behind.</p>
<p>In <code>split</code> we are trying to </p>
<ol>
<li>find spaces -> <code>\\s</code></li>
<li>that are not preceded -> <code>(?&lt;!negativeLookBehind)</code> </li>
<li>by some word -> <code>\\w+</code> </li>
<li>with the previously matched space -> <code>\\G</code> </li>
<li>before it -> <code>\\G\\w+</code>.</li>
</ol>
<p>The only thing that confused me at first was how this works for the first space, since we want that space to be ignored. <em>The important detail is that at the start <code>\\G</code> matches the start of the String, like <code>^</code></em>. </p>
<p>So before the first iteration the regex in the negative look-behind effectively looks like <code>(?&lt;!^\\w+)</code>, and since the first space <strong>does</strong> have <code>^\\w+</code> before it, it cannot be a match for the split. 
The next space will not have this problem, so it will be matched, and information about it (such as its <strong><em>position</em></strong> in the <code>input</code> String) will be stored in <code>\\G</code> and used later in the next negative look-behind.</p>
<p>So for the 3rd space the regex will check whether the previously matched space <code>\\G</code> followed by a word <code>\\w+</code> comes directly before it. Since that is the case, the negative look-behind rejects it and this space won't be matched. The 4th space won't have this problem, because the space before it is not the one stored in <code>\\G</code> (it has a different position in the <code>input</code> String).</p>
<hr>
<p>Also, if someone would like to split on, let's say, every 3rd space, this form can be used (based on <a href="https://stackoverflow.com/users/1284661/maybewecouldstealavan">@maybeWeCouldStealAVan</a>'s <a href="https://stackoverflow.com/a/16486384/1393766">answer</a>, which was deleted when I posted this fragment of the answer) </p>
<pre><code>input.split("(?&lt;=\\G\\w{1,100}\\s\\w{1,100}\\s\\w{1,100})\\s")
</code></pre>
<p>Instead of 100 you can use a bigger value; it has to be at least the length of the longest word in the String.</p>
<hr>
<p>I just noticed that we can also use <code>+</code> instead of <code>{1,maxWordLength}</code> if we want to split on every odd-numbered separator, for example every 3rd, 5th or 7th</p>
<pre><code>String data = "0,0,1,2,4,5,3,4,6,1,3,3,4,5,1,1";
String[] array = data.split("(?&lt;=\\G\\d+,\\d+,\\d+,\\d+,\\d+),"); // every 5th comma
</code></pre>
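The answer recommends Pattern and Matcher over the split() trick but does not show them. A minimal sketch of that alternative could look like the following. The class name GroupWords and the helper groupWords are hypothetical names chosen for illustration, and the sketch uses \\S+ rather than \\w+, so any run of non-space characters counts as a word. It matches the groups themselves instead of locating the separators, so it needs no look-behind at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupWords {
    // Hypothetical helper: collects consecutive runs of up to n
    // whitespace-separated tokens from input.
    public static List<String> groupWords(String input, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // One token, then up to n-1 further "whitespace + token" pairs.
        // The quantifier is greedy, so every group except possibly the
        // last one contains exactly n tokens.
        Pattern p = Pattern.compile("\\S+(?:\\s+\\S+){0," + (n - 1) + "}");
        Matcher m = p.matcher(input);
        List<String> groups = new ArrayList<>();
        while (m.find()) {
            groups.add(m.group());
        }
        return groups;
    }

    public static void main(String[] args) {
        // prints [one two, three four, five six, seven]
        System.out.println(groupWords("one two three four five six seven", 2));
    }
}
```

Passing 3 instead of 2 groups the words in threes, which covers the "every 3rd space" case without guessing a maximum word length. Unlike split(), the whitespace inside a group is kept exactly as it appears in the input.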