Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>strstr is hard to optimize because (a) \0-termination means you have to read every byte anyway, and (b) it has to be good on all the edge cases, too.</p> <p>With that said, you can beat standard strstr by a factor of 10, using SSE2 ops. I've noticed that gcc 4.4 uses these ops for strlen now, but not for the other string ops. More on how to use SSE2 registers for strlen, strchr, strpbrk, etc. at <a href="http://mischasan.wordpress.com" rel="nofollow">mischasan.wordpress.com</a>. Pardon my super-terse code layout.</p> <pre><code>#include &lt;emmintrin.h&gt; // Other standard #includes you can figure out... static inline unsigned under(unsigned x) { return (x - 1) &amp; ~x; } static inline __m128i xmfill(char b) { return _mm_set1_epi8(b); } static inline __m128i xmload(void const*p) { return _mm_load_si128((__m128i const*)p); } static inline unsigned xmatch(__m128i a, __m128i b) { return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)); } char const *sse_strstr(char const *tgt, char const *pat) { unsigned len = sse_strlen(pat); if (len == 0) return tgt; if (len == 1) return sse_strchr(tgt,*pat); __m128i x, zero = {}; __m128i p0 = _m_set1_epi8(pat[0]), p1 = _m_set1_epi8(pat[1]); uint16_t pair = *(uint16_t const*)pat; unsigned z, m, f = 15 &amp; (uintptr_t)tgt; char const* p; // Initial unaligned chunk of tgt: if (f) { z = xmatch(x = xmload(tgt - f), zero) &gt;&gt; f; m = under(z) &amp; ((xmatch(x,p0) &amp; (xmatch(x,p1) &gt;&gt; 1)) &gt;&gt; f); for (; m; m &amp;= m - 1) if (!memcmp((p = tgt+ffs(m)-1)+2, pat+2, len-2)) return p; if (z) return NULL; tgt += 16 - f; if (*(uint16_t const*)(tgt - 1) == pair &amp;&amp; !memcmp(tgt+1, pat+2, len-2)) return tgt - 1; } // 16-byte aligned chunks of tgt: while (!(z = xmatch(x = xmload(tgt), zero))) { m = xmatch(x,p0) &amp; (xmatch(x,p1) &gt;&gt; 1); for (; m; m &amp;= m - 1) if (!memcmp((p = tgt+ffs(m)-1)+2, pat+2, len-2)) return p; tgt += 16; if (*(uint16_t const*)(tgt - 1) == pair &amp;&amp; !memcmp(tgt+1, pat+2, len-2)) return tgt - 1; } // Final 0..15 bytes of tgt: m = under(z) &amp; xmatch(x,p0) &amp; (xmatch(x,p1) &gt;&gt; 1); for (; m; m &amp;= m - 1) if (!memcmp((p = tgt+ffs(m)-1)+2, pat+2, len-2)) return p; return NULL; } </code></pre>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload