StackOverflow2013

<p>Regardless of the back-end chosen (memcached, mongodb, redis, mysql, carrier pigeon), the most speed-efficient way to store data in it is as a simple opaque block of data, so the back-end has no knowledge of its structure. Whether that's a <code>string</code>, a <code>byte[]</code>, or a <code>BLOB</code> is really all the same.</p> <p>Each language will need an agreed mechanism to convert objects to a storable data format and back. You:</p> <ul> <li>Shouldn't build your own mechanism; that's just reinventing the wheel.</li> <li>Should think about whether 'invalid' objects might end up in the back-end <em>(either because of a bug in a writer, or because objects from a previous revision are still present)</em>.</li> </ul> <p>When it comes to choosing a format, I'd recommend two: <a href="http://json.org" rel="nofollow">JSON</a> or <a href="http://code.google.com/apis/protocolbuffers/docs/overview.html" rel="nofollow">Protocol Buffers</a>. This is because their encoded size and encode/decode speed are among the smallest/fastest of all the available encodings.</p> <h2>Comparison</h2> <p><strong>JSON:</strong></p> <ul> <li>Libraries available for dozens of languages, sometimes as part of the standard library.</li> <li>Very simple format: human-readable when stored, and human-writable!</li> <li>No coordination required between different systems, just agreement on object structure.</li> <li>No set-up needed in many languages, e.g. PHP: <code>$data = json_encode($object); $object = json_decode($data);</code></li> <li>No inherent schema, so readers need to validate decoded messages manually.</li> <li>Takes more space than Protocol Buffers.</li> </ul> <p><strong>Protocol Buffers:</strong></p> <ul> <li>Code-generation tools provided for several languages.</li> <li>Minimal size, difficult to beat.</li> <li>Defined schema (externally) through <code>.proto</code> files.</li> <li>Auto-generated interface objects for encoding/decoding, e.g. C++: <code>person.SerializeToOstream(&amp;output);</code></li> 
<li>Supports differing versions of an object schema: new <code>optional</code> members can be added, so that existing objects aren't necessarily invalidated.</li> <li>Not human-readable or writable, so possibly harder to debug.</li> <li>Defined schema introduces some configuration management overhead.</li> </ul> <h2>Unicode</h2> <p>When it comes to Unicode support, both handle it without issues:</p> <ul> <li>JSON: Will typically escape non-ASCII characters inside the string as <code>\uXXXX</code>, so no compatibility problem there. Depending on the library, it may also be possible to emit raw UTF-8 instead.</li> <li>Protocol Buffers: <code>string</code> fields are specified to contain UTF-8 (or 7-bit ASCII) text, though Google's documentation doesn't say so in 3-foot-high letters.</li> </ul> <h2>Summary</h2> <p>Which one you go with will depend on how exactly your system will behave, how often changes to the data structure occur, and how all the above points will affect you.</p>
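<p>As a rough illustration of the JSON points above (storing an opaque block, validating decoded objects by hand, and choosing between <code>\uXXXX</code> escapes and raw UTF-8), here is a minimal Python sketch. The record layout and the names <code>REQUIRED_FIELDS</code>, <code>encode</code> and <code>decode</code> are made up for the example; only the standard <code>json</code> module is assumed.</p>

```python
import json

# Hypothetical record layout, used only for illustration.
REQUIRED_FIELDS = {"id": int, "name": str}


def encode(obj, ascii_only=True):
    """Serialize a dict to bytes so the back-end only sees an opaque block."""
    # ensure_ascii=True escapes non-ASCII characters as \uXXXX;
    # ensure_ascii=False keeps them and relies on the UTF-8 encoding step.
    return json.dumps(obj, ensure_ascii=ascii_only).encode("utf-8")


def decode(blob):
    """Parse bytes read from the back-end; return None if they are unusable."""
    try:
        obj = json.loads(blob.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and malformed JSON.
        return None
    if not isinstance(obj, dict):
        return None
    # JSON has no inherent schema, so check the expected fields manually.
    for field, kind in REQUIRED_FIELDS.items():
        if not isinstance(obj.get(field), kind):
            return None
    return obj
```

<p>One way to use this is for a reader to call <code>decode()</code> on whatever bytes come back from the store and treat <code>None</code> like a cache miss, which also covers objects left behind by an older revision of the writer.</p>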
