Note that there are some explanatory texts on larger screens.

plurals
  1. PO
    primarykey
    data
    text
    <p>That's referring to "variable integer encoding", where the number of bits used to store an integer when serialized is not fixed at 4 bytes. There is a good description of <a href="http://code.google.com/apis/protocolbuffers/docs/encoding.html#varints" rel="nofollow noreferrer">varint in the protocol buffer documentation</a>.</p> <p>It is used in encoding <a href="http://code.google.com/apis/protocolbuffers/" rel="nofollow noreferrer">Google's protocol buffers</a>, and you can browse the <a href="http://www.google.com/codesearch/p?hl=en#WTeibokF6gE/trunk/src/google/protobuf/wire_format.h&amp;q=varint%20package:http://protobuf%5C.googlecode%5C.com&amp;d=4" rel="nofollow noreferrer">protocol buffer source code</a>.</p> <p>The <code>CodedOutputStream</code> contains the exact encoding function <a href="http://www.google.com/codesearch/p?hl=en#WTeibokF6gE/trunk/src/google/protobuf/io/coded_stream.cc&amp;q=varint%20package:http://protobuf%5C.googlecode%5C.com&amp;d=4&amp;l=632" rel="nofollow noreferrer">WriteVarint32FallbackToArrayInline</a>:</p> <pre><code>inline uint8* CodedOutputStream::WriteVarint32FallbackToArrayInline( uint32 value, uint8* target) { target[0] = static_cast&lt;uint8&gt;(value | 0x80); if (value &gt;= (1 &lt;&lt; 7)) { target[1] = static_cast&lt;uint8&gt;((value &gt;&gt; 7) | 0x80); if (value &gt;= (1 &lt;&lt; 14)) { target[2] = static_cast&lt;uint8&gt;((value &gt;&gt; 14) | 0x80); if (value &gt;= (1 &lt;&lt; 21)) { target[3] = static_cast&lt;uint8&gt;((value &gt;&gt; 21) | 0x80); if (value &gt;= (1 &lt;&lt; 28)) { target[4] = static_cast&lt;uint8&gt;(value &gt;&gt; 28); return target + 5; } else { target[3] &amp;= 0x7F; return target + 4; } } else { target[2] &amp;= 0x7F; return target + 3; } } else { target[1] &amp;= 0x7F; return target + 2; } } else { target[0] &amp;= 0x7F; return target + 1; } } </code></pre> <p>The cascading <code>if</code>s will only add additional bytes onto the end of the <code>target</code> array if the magnitude of <code>value</code> warrants those extra bytes. The <code>0x80</code> masks the byte being written, and the <code>value</code> is shifted down. From what I can tell, the <code>0x7f</code> mask causes it to signify the "last byte of encoding". (When OR'ing <code>0x80</code>, the highest bit will always be <code>1</code>, then the last byte clears the highest bit (by AND'ing <code>0x7f</code>). So, when reading varints you read until you get a byte with a zero in the highest bit.</p> <p>I just realized you asked about "Group VarInt encoding" specifically. Sorry, that code was about basic VarInt encoding (still faster than 7-bit). The basic idea looks to be similar. Unfortunately, it's <em>not</em> what's being used to store 64bit numbers in protocol buffers. I wouldn't be surprised if that code was open sourced somewhere though.</p> <p>Using the ideas from <code>varint</code> and the diagrams of "Group varint" from the slides, it shouldn't be too too hard to cook up your own :)</p> <p>Here is another page describing <a href="http://www.ir.uwaterloo.ca/book/addenda-06-index-compression.html" rel="nofollow noreferrer">Group VarInt compression</a>, which contains decoding code. Unfortunately they allude to publicly available implementations, but they don't provide references.</p> <pre><code>void DecodeGroupVarInt(const byte* compressed, int size, uint32_t* uncompressed) { const uint32_t MASK[4] = { 0xFF, 0xFFFF, 0xFFFFFF, 0xFFFFFFFF }; const byte* limit = compressed + size; uint32_t current_value = 0; while (compressed != limit) { const uint32_t selector = *compressed++; const uint32_t selector1 = (selector &amp; 3); current_value += *((uint32_t*)(compressed)) &amp; MASK[selector1]; *uncompressed++ = current_value; compressed += selector1 + 1; const uint32_t selector2 = ((selector &gt;&gt; 2) &amp; 3); current_value += *((uint32_t*)(compressed)) &amp; MASK[selector2]; *uncompressed++ = current_value; compressed += selector2 + 1; const uint32_t selector3 = ((selector &gt;&gt; 4) &amp; 3); current_value += *((uint32_t*)(compressed)) &amp; MASK[selector3]; *uncompressed++ = current_value; compressed += selector3 + 1; const uint32_t selector4 = (selector &gt;&gt; 6); current_value += *((uint32_t*)(compressed)) &amp; MASK[selector4]; *uncompressed++ = current_value; compressed += selector4 + 1; } } </code></pre>
    singulars
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    1. VO
      singulars
      1. This table or related slice is empty.
    2. VO
      singulars
      1. This table or related slice is empty.
    3. VO
      singulars
      1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload