Python: Limiting the size of a JSON string for posting to a server
<p>I'm posting hundreds of thousands of JSON records to a server that has a maximum data upload limit of 1 MB. My records vary widely in size, from a few hundred bytes to a few hundred thousand.</p>
<pre><code>def checkSize(payload):
    return len(payload) &gt;= bytesPerMB

toSend = []
for row in rows:
    toSend.append(row)
    # re-serializes the whole batch on every iteration
    postData = json.dumps(toSend)
    if checkSize(postData):
        sendToServer(postData)
        toSend = []  # start a new batch
</code></pre>
<p>The payload is then posted to the server. It currently works, but constantly re-dumping toSend to a JSON string seems really heavy and almost entirely redundant, although I can't find another way to do it. Would it be OK to stringify each new record individually and keep a running tally of their combined size?</p>
<p>I'm sure there must be a cleaner way of doing this, but I just don't know it.</p>
<p>Thanks for any and all help given.</p>
<hr>
<p>This is the answer I'm now using. I came up with it at the same time as @rsegal below, and I'm posting it for clarity and completeness (sendToServer is just a dummy function to show that things are working correctly):</p>
<pre><code>import pickle
import json

# pickle files must be opened in binary mode
with open("userProfiles", "rb") as f:
    rows = pickle.load(f)

bytesPerMB = 1024 * 1024
comma = ","
appendSize = len(comma)

def sendToServer(obj):
    # send to server
    pass

def checkSize(numBytes):
    return numBytes &gt;= bytesPerMB

def jsonDump(obj):
    # compact separators, so the tally matches the final payload
    return json.dumps(obj, separators=(comma, ":"))

leftover = []
numRows = len(rows)
rowsSent = 0

# also loop while a leftover row is pending, so the last row is not dropped
while rows or leftover:
    toSend = leftover[:]
    toSendSize = len(jsonDump(toSend))
    leftover = []
    first = len(toSend) == 0

    while True:
        try:
            row = rows.pop()
        except IndexError:
            break

        # every row after the first costs one extra comma
        rowSize = len(jsonDump(row)) + (0 if first else appendSize)
        first = False

        if checkSize(toSendSize + rowSize):
            # row does not fit; carry it over to the next batch
            leftover.append(row)
            break

        toSend.append(row)
        toSendSize += rowSize

    rowsSent += len(toSend)
    postData = jsonDump(toSend)
    print("assuming to send '{0}' bytes, actual size '{1}'. rows sent {2}, total {3}".format(toSendSize, len(postData), rowsSent, numRows))
    sendToServer(postData)
</code></pre>
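<p>The running-tally idea raised in the question can also be sketched as a generator that serializes each row exactly once and joins the pieces by hand. This is only a minimal sketch under stated assumptions: batch_payloads and MAX_BYTES are hypothetical names that do not appear in the original post, the limit is treated as inclusive, and len() is taken as the byte count, which holds because json.dumps escapes non-ASCII characters by default (ensure_ascii=True).</p>

```python
import json

MAX_BYTES = 1024 * 1024  # assumed limit: the 1 MB mentioned in the question


def batch_payloads(rows, max_bytes=MAX_BYTES):
    """Yield compact JSON array strings built from consecutive rows.

    Hypothetical helper: each row is serialized once, and the batch size
    is kept as a running tally instead of re-dumping the whole list.
    """
    parts = []  # already-serialized rows in the current batch
    size = 2    # the enclosing "[" and "]"
    for row in rows:
        encoded = json.dumps(row, separators=(",", ":"))
        # A comma precedes every row except the first one in a batch.
        extra = len(encoded) + (1 if parts else 0)
        if parts and size + extra > max_bytes:
            # Row does not fit: flush the current batch and start a new one.
            yield "[" + ",".join(parts) + "]"
            parts, size = [], 2
            extra = len(encoded)
        parts.append(encoded)
        size += extra
    if parts:
        yield "[" + ",".join(parts) + "]"
```

<p>Each yielded string could then be passed to sendToServer. Note that a single row larger than the limit is still emitted, alone, as an oversized payload; how to handle that case depends on the server and is left open here.</p>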
 
