Getting fast translation of string data transmitted via a socket into objects in Python
    <p>I currently have a Python application where newline-terminated ASCII strings are being transmitted to me via a TCP/IP socket. These strings arrive at a high data rate and I need to parse them as quickly as possible. Currently, the strings are being transmitted as CSV, and if the data rate is high enough, my Python application starts to lag behind the input data rate (probably not all that surprising).</p>
    <p>The strings look something like this:</p>
    <pre><code>chan,2007-07-13T23:24:40.143,0,0188878425-079,0,0,True,S-4001,UNSIGNED_INT,name1,module1,...
</code></pre>
    <p>I have a corresponding object that will parse these strings and store all of the data into an object. Currently the object looks something like this:</p>
    <pre><code>class ChanVal(object):
    def __init__(self, csvString=None, **kwargs):
        if csvString is not None:
            self.parseFromCsv(csvString)
        for key in kwargs:
            setattr(self, key, kwargs[key])

    def parseFromCsv(self, csvString):
        lst = csvString.split(',')
        self.eventTime = lst[1]
        self.eventTimeExact = long(lst[2])
        self.other_clock = lst[3]
        ...
</code></pre>
    <p>To read the data in from the socket, I'm using a basic "socket.socket(socket.AF_INET,socket.SOCK_STREAM)" (my app is the server socket) and then I'm using the "select.poll()" object from the "select" module to constantly poll the socket for new input using its "poll(...)" method.</p>
    <p>I have some control over the process sending the data (meaning I can get the sender to change the format), but it would be really convenient if we could speed up the ASCII processing enough to not have to use fixed-width or binary formats for the data.</p>
    <p>So far, here are the things I've tried that haven't really made much of a difference:</p>
    <ol>
    <li>Using the string "split" method and then indexing the list of results directly (see above), but "split" seems to be really slow.</li>
    <li>Using the "reader" object in the "csv" module to parse the strings.</li>
    <li>Changing the strings being sent to a string format that I can use to directly instantiate an object via "eval" (e.g. sending something like "ChanVal(eventTime='2007-07-13T23:24:40.143',eventTimeExact=0,...)").</li>
    </ol>
    <p>I'm trying to avoid going to a fixed-width or binary format, though I realize those would probably ultimately be much faster.</p>
    <p>Ultimately, I'm open to suggestions on better ways to poll the socket, better ways to format/parse the data (though hopefully we can stick with ASCII) or anything else you can think of.</p>
    <p>Thanks!</p>
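One thing worth ruling out before blaming "split" is how the incoming bytes are cut into records. The question does not show that code, so the following is only a minimal sketch, assuming Python 3 and assuming that each `recv()` call returns an arbitrary chunk that may end mid-record. The names `LineBuffer` and `parse_record` are hypothetical, and the chunks are simulated rather than read from a real socket.

```python
class LineBuffer:
    """Accumulates raw chunks and hands back complete newline-terminated records."""

    def __init__(self):
        self._pending = b""  # bytes of a record that has not been terminated yet

    def feed(self, chunk):
        """Add one chunk (as recv() would return it); return the finished lines."""
        data = self._pending + chunk
        # Everything before the last newline is complete; the tail is kept for later.
        *lines, self._pending = data.split(b"\n")
        return lines


def parse_record(line):
    """Split one record into fields (hypothetical helper, no CSV quoting support)."""
    return line.decode("ascii").split(",")


buf = LineBuffer()
records = []
# Simulated recv() results: the first record is cut in half, and so is the second.
chunks = (
    b"chan,2007-07-13T23:24:40.143,0,01888",
    b"78425-079,0\nchan,2007",
    b"-07-13T23:24:41.000,1,x,0\n",
)
for chunk in chunks:
    records.extend(parse_record(line) for line in buf.feed(chunk))
```

The point of the design is that each chunk is split once as a whole, rather than being scanned a byte or a line at a time in Python code. Whether this helps depends on how the existing receive loop is written.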
    1. I just wanted to caution you on using "eval", especially if you are running some sort of server. You are opening yourself up to all sorts of potential hacks by executing arbitrary code. Even if you give it empty namespaces (e.g. `eval(expression, {"__builtins__":None},{})`), someone could enter something like `2 ** 1234567890` and that would tie up your server for a long time.
    2. I'd like to see your socket handling code; it's possible that you're not using poll() in the most effective way. poll() is good enough for your needs, but it's very easy to misuse any I/O in such a way that you cancel out the effects of using the best function for the job. I'd also like to see where you're breaking apart the input into discrete messages.
    3. @Zvarberg True. Luckily the process that is the client socket is actually spawned by this Python process (it makes sense in my application, I promise) and is therefore the only thing actually sending data to me. Barring port sniffing attempts and a rather intelligent and malicious user, I think I'm OK (especially seeing as how I'm inside two firewalls). Good comment though.
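On the "eval" caution in the first comment: if the sender were changed to emit plain Python literals instead of a constructor call, the standard library's `ast.literal_eval` could parse them without evaluating expressions. This is a swapped-in technique, not something proposed in the thread, and the dict-literal wire format and the helper name `parse_literal` are assumptions made for illustration. It is a sketch in Python 3 and makes no claim about speed.

```python
import ast


def parse_literal(text):
    """Parse a Python literal; return None for anything that is not a literal."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Expressions, names and calls are rejected instead of being executed.
        return None


# Hypothetical wire format: one dict literal per record.
fields = parse_literal("{'eventTime': '2007-07-13T23:24:40.143', 'eventTimeExact': 0}")

# The expensive expression from the comment is refused rather than computed.
rejected = parse_literal("2 ** 1234567890")
```

Here `fields` is an ordinary dict that could be passed to `ChanVal(**fields)`, while `rejected` is `None` because the power operator is not part of the literal syntax that `ast.literal_eval` accepts.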