Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    text
    copied!<p>The Boyer-Moore algorithm that is used in grep is pretty efficient, and gets more efficient for longer pattern sizes. I'm pretty sure you could make it work for a byte array without too much difficulty, and its <a href="http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm" rel="noreferrer">wikipedia page</a> has an implementation in Java that should be fairly easy to port to C#.</p> <p>UPDATE:</p> <p>Here's an implementation of a simplified version of the Boyer-Moore algorithm for byte arrays in C#. It only uses the <a href="http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm#The_second_table" rel="noreferrer">second jump table</a> of the full algorithm. Based on the array sizes that you said (haystack: 2000000 bytes, needle: 10 bytes), it's about 5-8 times faster than a simple byte by byte algorithm.</p> <pre><code> static int SimpleBoyerMooreSearch(byte[] haystack, byte[] needle) { int[] lookup = new int[256]; for (int i = 0; i &lt; lookup.Length; i++) { lookup[i] = needle.Length; } for (int i = 0; i &lt; needle.Length; i++) { lookup[needle[i]] = needle.Length - i - 1; } int index = needle.Length - 1; var lastByte = needle.Last(); while (index &lt; haystack.Length) { var checkByte = haystack[index]; if (haystack[index] == lastByte) { bool found = true; for (int j = needle.Length - 2; j &gt;= 0; j--) { if (haystack[index - needle.Length + j + 1] != needle[j]) { found = false; break; } } if (found) return index - needle.Length + 1; else index++; } else { index += lookup[checkByte]; } } return -1; } </code></pre>
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload