<blockquote> <p>This all works fine, I'm worried about the efficiency of the cleaning section...</p> </blockquote>
<p>Really? Did you inspect the resulting "string"? On my machine the data in <code>forumContentCleaned</code> still contains the <code>&amp;amp;sid=...</code> data.</p>
<p>That's because</p>
<pre><code>forumContentList.removeAll(Arrays.asList(sidPattern));
</code></pre>
<p>tries to remove a <code>List&lt;byte[]&gt;</code> from a <code>List&lt;Byte&gt;</code>. This does nothing, because no <code>Byte</code> element ever equals a <code>byte[]</code>. And even if you replace the argument of <code>removeAll</code> with a real <code>List&lt;Byte&gt;</code> containing the bytes of <code>"&amp;amp;sid="</code>, you will remove <strong>ALL</strong> occurrences of each <code>a</code>, each <code>m</code>, each <code>p</code> and so forth. The resulting data will look like this:</p>
<pre><code>&lt;l cl"con-logout"&gt;&lt; href"./uc.h?oelogout34043284674572e35881e022c68fc8" ttle....
</code></pre>
<p>Well, strictly speaking, the <code>&amp;amp;sid=</code> part is gone, but I'm quite sure this is not what you wanted.</p>
<p>Therefore take a step back and think: you are doing string manipulation here, so use a <code>StringBuilder</code>, feed it with the <code>String(forumContent)</code> and do your manipulation there.</p>
<p><strong>Edit</strong></p>
<p>Looking at the given example input string, I guess that the <em>value</em> of <code>sid</code> should also be removed, not only the key.
This code should do it efficiently without regular expressions:</p>
<pre><code>String removeSecrets(String input){
    StringBuilder sb = new StringBuilder(input);
    String sidStart = "&amp;amp;sid=";
    String sidEnd = "\"";
    int posStart = 0;
    while ((posStart = sb.indexOf(sidStart, posStart)) &gt;= 0) {
        int posEnd = sb.indexOf(sidEnd, posStart);
        if (posEnd &lt; 0) // delete as far as possible - YMMV
            posEnd = sb.length();
        sb.delete(posStart, posEnd);
    }
    return sb.toString();
}
</code></pre>
<p><strong>Edit 2</strong></p>
<p>Here is a small benchmark between <code>StringBuilder</code> and <code>String.replaceAll</code>:</p>
<pre><code>// Assumes removeSecrets(String) from above is pasted into this class as a static method.
public class ReplaceAllBenchmark {
    public static void main(String[] args) throws Throwable {
        final int N = 1000000;
        String input = "&lt;li class=\"icon-logout\"&gt;&lt;a href=\"./ucp.php?mode=logout&amp;amp;sid=3a4043284674572e35881e022c68fcd8\" title=\"Logout [ barry ]\" accesskey=\"x\"&gt;Logout [ barry ]&lt;/a&gt;&amp;amp;sid=3a4043284674572e35881e022c68fcd8\"&lt;/li&gt;";
        stringBuilderBench(input, N);
        regularExpressionBench(input, N);
    }

    // Times N calls of the StringBuilder version, repeated 5 times.
    static void stringBuilderBench(String input, final int N) throws Throwable{
        for(int run=0; run&lt;5; ++run){
            long t1 = System.nanoTime();
            for(int i=0; i&lt;N; ++i)
                removeSecrets(input);
            long t2 = System.nanoTime();
            System.out.println("sb: "+(t2-t1)+"ns, "+(t2-t1)/N+"ns/call");
            Thread.sleep(1000);
        }
    }

    // Times N calls of the String.replaceAll version, repeated 5 times.
    static void regularExpressionBench(String input, final int N) throws Throwable{
        for(int run=0; run&lt;5; ++run){
            long t1 = System.nanoTime();
            for(int i=0; i&lt;N; ++i)
                removeSecrets2(input);
            long t2 = System.nanoTime();
            System.out.println("regexp: "+(t2-t1)+"ns, "+(t2-t1)/N+"ns/call");
            Thread.sleep(1000);
        }
    }

    static String removeSecrets2(String input){
        return input.replaceAll("&amp;amp;sid=[^\"]*\"", "\"");
    }
}
</code></pre>
<p>Results:</p>
<pre><code>java version "1.6.0_20"
OpenJDK Runtime Environment (IcedTea6 1.9.9) (6b20-1.9.9-0ubuntu1~10.04.2)
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)
sb: 538735438ns, 538ns/call
sb: 457107726ns, 457ns/call
sb: 443282145ns, 443ns/call
sb: 453978805ns, 453ns/call
sb: 458895308ns, 458ns/call
regexp: 2404818405ns, 2404ns/call
regexp: 2196834572ns, 2196ns/call
regexp: 2239056178ns, 2239ns/call
regexp: 2164337638ns, 2164ns/call
regexp: 2177091893ns, 2177ns/call
</code></pre>
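<p>If you want to see the <code>removeAll</code> behaviour described earlier for yourself, here is a minimal sketch. The class name <code>RemoveAllPitfall</code>, its helper methods and the shortened input string are made up for illustration and are not taken from the original code; it also assumes the content is plain ASCII.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the reviewed code.
public class RemoveAllPitfall {

    // Box a byte[] into a List<Byte> so that List.removeAll can be applied.
    static List<Byte> toList(byte[] bytes) {
        List<Byte> list = new ArrayList<>(bytes.length);
        for (byte b : bytes) {
            list.add(b);
        }
        return list;
    }

    // Turn a List<Byte> back into a String (ASCII input assumed).
    static String asString(List<Byte> list) {
        byte[] bytes = new byte[list.size()];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = list.get(i);
        }
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Arrays.asList(byte[]) is a List<byte[]> with ONE element (the array itself),
    // so no Byte in the content list ever matches and nothing is removed.
    static String removeWithArrayArgument(String input, String pattern) {
        List<Byte> content = toList(input.getBytes(StandardCharsets.US_ASCII));
        byte[] patternBytes = pattern.getBytes(StandardCharsets.US_ASCII);
        content.removeAll(Arrays.asList(patternBytes));
        return asString(content);
    }

    // With a real List<Byte>, removeAll drops every occurrence of every byte
    // in the pattern, wherever it appears, not just the contiguous sequence.
    static String removeWithByteList(String input, String pattern) {
        List<Byte> content = toList(input.getBytes(StandardCharsets.US_ASCII));
        content.removeAll(toList(pattern.getBytes(StandardCharsets.US_ASCII)));
        return asString(content);
    }

    public static void main(String[] args) {
        String input = "mode=logout&amp;sid=abc";
        String pattern = "&amp;sid=";
        System.out.println(removeWithArrayArgument(input, pattern)); // mode=logout&amp;sid=abc
        System.out.println(removeWithByteList(input, pattern));      // oelogoutbc
    }
}
```

<p>The first call leaves the input untouched, the second one mangles it: <code>removeAll</code> treats its argument as a set of elements to drop, not as a sequence to search for.</p>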