<p>Gzip is fast. <em>Very fast.</em> And I have a good deal of confidence that it's the best (in terms of both practicality and efficiency) solution for lean object transportation.</p>

<p>To illustrate the point, I've built a quick sample on one of my staging sites.</p>

<p><a href="http://staging.svidgen.com/ajax/test-a.js" rel="nofollow">http://staging.svidgen.com/ajax/test-a.js</a> generates 5k rows of simple data and outputs raw, untainted JSON.</p>

<pre><code>$data = array();
for ($i = 0; $i &lt; 5000; $i++) {
    $data[] = array(
        'value-a' =&gt; $i,
        'value-b' =&gt; pow($i, 2),
        'value-c' =&gt; pow($i, 3)
    );
}
print json_encode($data);
</code></pre>

<p>The gzipped response is <strong>65KB</strong> and takes about <strong>357ms</strong> to request, build, serialize, and transmit. Omitting client-side parsing from the equation, that's a <em>throughput</em> of <strong>182KB/s</strong>. Considering the <strong>274KB</strong> of raw data transmitted, that's an <em>effective</em> throughput of <strong>767KB/s</strong>. The response looks like this:</p>

<pre><code>[{"value-a":0,"value-b":0,"value-c":0},{"value-a":1,"value-b":1,"value-c":1} /* etc. */]
</code></pre>

<p>The alternative format, <a href="http://staging.svidgen.com/ajax/test-b.js" rel="nofollow">http://staging.svidgen.com/ajax/test-b.js</a>, generates the same 5k rows of simple data, but restructures it into a more efficient, indexed JSON serialization.</p>

<pre><code>$data = array();
for ($i = 0; $i &lt; 5000; $i++) {
    $data[] = array(
        'value-a' =&gt; $i,
        'value-b' =&gt; pow($i, 2),
        'value-c' =&gt; pow($i, 3)
    );
}

$out_index = array();
$out = array();
foreach ($data as $row) {
    $new_row = array();
    foreach ($row as $k =&gt; $v) {
        if (!isset($out_index[$k])) {
            $out_index[$k] = sizeof($out_index);
        }
        $new_row[$out_index[$k]] = $v;
    }
    $out[] = $new_row;
}

print json_encode(array(
    'index' =&gt; $out_index,
    'rows' =&gt; $out
));
</code></pre>

<p>The gzipped response is <strong>59.4KB</strong> and takes about <strong>515ms</strong> to request, build, serialize, and transmit. Omitting client-side parsing from the equation, that's a <em>throughput</em> of <strong>115KB/s</strong>. Considering the <strong>128KB</strong> of raw data transmitted, that's an <em>effective</em> throughput of <strong>248KB/s</strong>. The response looks like this:</p>

<pre><code>{"index":{"value-a":0,"value-b":1,"value-c":2},"rows":[[0,0,0],[1,1,1] /* etc. */]}
</code></pre>

<p>So, in our fairly simple example, the raw, restructured data is over 50% smaller than the original raw data. But it's only 9% smaller when gzipped. And the cost, in this case, is a 44% increase in total request time.</p>

<p>If you wrote a binary library to restructure the data, I expect you could significantly reduce that 44%. But it's still highly unlikely to be worthwhile. You'd need the restructured serialization to take no more than 9% longer than encoding the structure "normally" to see any gain at all.</p>

<p>The only way to avoid the restructuring or "alternative serialization" hit is to work with all your objects server-side in the awkward, indexed manner from start to finish. And, unless you're really pressed to squeeze every negligible ounce of performance out of your hardware, that's just a terrible idea.</p>

<p>And in both cases, the space savings of gzip go well beyond what we're able to accomplish with an alternative JavaScript-compatible format.</p>

<p>(And we haven't even taken client-side parsing into account -- which is going to be <em>very significant</em> for anything that isn't "normal" JSON or XML.)</p>

<h2>In Conclusion</h2>

<p>Just use the built-in serialization libraries and gzip.</p>
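<p>For completeness: a client consuming the indexed format has to map the rows back into ordinary objects. A minimal sketch in plain JavaScript (<code>decodeIndexed</code> is a hypothetical helper name; the payload mirrors the <code>index</code>/<code>rows</code> shape produced by the PHP above):</p>

```javascript
// Decode the indexed format back into an array of plain objects.
// `payload` is assumed to be the already-parsed JSON response.
function decodeIndexed(payload) {
  // Invert the index: column position -> original key name.
  const keys = [];
  for (const name in payload.index) {
    keys[payload.index[name]] = name;
  }
  // Rebuild one plain object per row.
  return payload.rows.map(row => {
    const obj = {};
    row.forEach((value, i) => { obj[keys[i]] = value; });
    return obj;
  });
}

// Example: the first two rows of the response shown above.
const payload = {
  index: { "value-a": 0, "value-b": 1, "value-c": 2 },
  rows: [[0, 0, 0], [1, 1, 1]]
};
console.log(decodeIndexed(payload)[1]);
// → { "value-a": 1, "value-b": 1, "value-c": 1 }
```

<p>This extra pass is exactly the client-side cost the answer warns about: it runs after <code>JSON.parse</code>, on every row, where the plain format needs no post-processing at all.</p>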
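<p>The size comparison itself is easy to reproduce without a staging server. A quick Node.js sketch using the built-in <code>zlib</code> module (the byte counts won't match the PHP measurements above exactly, but the relationship -- a large raw saving that mostly evaporates under gzip -- should hold):</p>

```javascript
const zlib = require('zlib');

// Build the same 5000 rows as the PHP examples.
const data = [];
for (let i = 0; i < 5000; i++) {
  data.push({ 'value-a': i, 'value-b': i ** 2, 'value-c': i ** 3 });
}

// Format A: plain JSON, keys repeated on every row.
const plain = JSON.stringify(data);

// Format B: indexed serialization, keys stored once.
const index = { 'value-a': 0, 'value-b': 1, 'value-c': 2 };
const rows = data.map(r => [r['value-a'], r['value-b'], r['value-c']]);
const indexed = JSON.stringify({ index, rows });

const gzPlain = zlib.gzipSync(Buffer.from(plain)).length;
const gzIndexed = zlib.gzipSync(Buffer.from(indexed)).length;

console.log('raw bytes:     plain', plain.length, 'vs indexed', indexed.length);
console.log('gzipped bytes: plain', gzPlain, 'vs indexed', gzIndexed);
```

<p>The interesting number is the ratio between the two gzipped sizes: gzip already deduplicates the repeated key names in the plain format, which is why the hand-rolled restructuring buys so little.</p>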
 
