Capture TCP-Packets with Python
<p>I am trying to capture an HTTP download with Python using dpkt and pcap. The code looks like this:</p>
<pre><code>...
pc = pcap.pcap(iface)
for ts, pkt in pc:
    handle_packet(pkt)

def handle_packet(pkt):
    eth = dpkt.ethernet.Ethernet(pkt)

    # Ignore non-IP and non-TCP packets
    if eth.type != dpkt.ethernet.ETH_TYPE_IP:
        return
    ip = eth.data
    if ip.p != dpkt.ip.IP_PROTO_TCP:
        return

    tcp = ip.data
    data = tcp.data

    # current connection
    c = (ip.src, ip.dst, tcp.sport, tcp.dport)

    # Handle only new HTTP-responses and TCP-packets
    # of existing connections.
    if c in conn:
        handle_tcp_packet(c, tcp)
    elif data[:4] == 'HTTP':
        handle_http_response(c, tcp)
...
</code></pre>
<p>In <code>handle_http_response()</code> and <code>handle_tcp_packet()</code> I read the data of the TCP packets (<code>tcp.data</code>) and write it to a file. However, I noticed that I often get packets with the same TCP sequence number (<code>tcp.seq</code>) on the same connection, and they seem to contain the same data. Moreover, it seems that not all packets are captured. For example, if I sum up the packet sizes, the resulting value is lower than the one listed in the HTTP header (<code>content-length</code>). But in Wireshark I can see all packets.</p>
<p>Does anyone have an idea why I get those duplicate packets and how I can capture every packet belonging to the HTTP response?</p>
<p><strong>EDIT:</strong><br> Here you can find the complete code: <a href="http://pastebin.com/25hYxgKi" rel="nofollow">pastebin.com</a>. When running, it prints something like this to stdout:</p>
<pre><code>Waiting for HTTP-Audio-responses ...
...
New TCP-Packet, len=1440, tcp-payload=5107680, con-len=5197150 , dups=57 , dup-bytes=82080
New TCP-Packet, len=1440, tcp-payload=5109120, con-len=5197150 , dups=57 , dup-bytes=82080
New TCP-Packet, len=1440, tcp-payload=5110560, con-len=5197150 , dups=57 , dup-bytes=82080
----------&gt; FIN &lt;----------
New TCP-Packet, len=1937, tcp-payload=5112497, con-len=5197150 , dups=57 , dup-bytes=82080
New TCP-Packet, len=0, tcp-payload=5112497, con-len=5197150 , dups=57 , dup-bytes=82080
</code></pre>
<p>As you can see, the TCP payload plus the duplicate received bytes (5112497+82080=5194577) is lower than the file size of the download (5197150). Moreover, you can see that I receive 57 duplicate packets (same SEQ and same TCP data) and that packets are still received after the packet with the FIN flag.</p>
<p>So does anyone have an idea how I can capture all packets belonging to the connection? Wireshark sees all packets, and I think it uses libpcap too.</p>
<p>I don't even know whether I am doing something wrong or whether the pcap library is doing something wrong.</p>
<p><strong>EDIT2:</strong><br> OK, it seems that my code is correct: in Wireshark I saved the captured packets and used the capture file in my code (<code>pcap.pcap('/home/path/filename')</code> instead of <code>pcap.pcap('eth0')</code>). My code read all packets perfectly (in multiple tests)! Since Wireshark uses libpcap too (afaik), I think the problem is the library pypcap, which does not give me all packets.</p>
<p>Any idea how to test that?</p>
<p>I already compiled pypcap myself (trunk), but that didn't change anything -.-</p>
<p><strong>EDIT3:</strong><br> OK, I changed my code to work with pcapy instead of pypcap and have the same problem:<br> When reading the packets from a previously captured file (created with Wireshark), everything is fine, but when I capture the packets directly from eth0, I miss some packets.</p>
<p>Interesting: when running both programs (the one using pypcap and the one using pcapy) in parallel, they capture different packets, e.g. one program receives one packet more.</p>
<p>But I still have no idea why -.-<br> I thought Wireshark uses the same base library (libpcap).</p>
<p>Please help :)</p>
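<p>One way to narrow this down might be to account for the payload by sequence number instead of summing packet lengths. Segments that repeat an already seen sequence range (which could simply be TCP retransmissions) can then be told apart from ranges that were never captured. The following is only a minimal sketch under stated assumptions: <code>reassemble()</code> is a hypothetical helper that is not part of dpkt, pypcap, or the pastebin code, it expects <code>(tcp.seq, tcp.data)</code> pairs collected per connection, it treats the lowest sequence number as the start of the stream, and it ignores 32-bit sequence wraparound. It is written for Python 3 and byte strings.</p>

```python
def reassemble(segments):
    """Rebuild a byte stream from (seq, payload) pairs of one connection.

    Returns (stream, duplicate_bytes, gaps). `gaps` is a list of
    (offset, length) ranges that no captured segment covered; those
    ranges are zero-filled in `stream`.

    Assumptions: the lowest sequence number marks the start of the
    stream, and sequence numbers do not wrap around 2**32.
    """
    # Drop empty segments (pure ACKs, FIN without data) and sort by seq.
    ordered = sorted((seq, data) for seq, data in segments if data)
    if not ordered:
        return b"", 0, []

    base = ordered[0][0]
    stream = bytearray()
    duplicate_bytes = 0
    gaps = []

    for seq, data in ordered:
        offset = seq - base
        end = len(stream)
        if offset > end:
            # Nothing captured for [end, offset): record it as a gap.
            gaps.append((end, offset - end))
            stream.extend(b"\x00" * (offset - end))
            end = offset
        # Bytes of this segment that were already seen (retransmission
        # or overlap) are counted but not written a second time.
        overlap = min(end - offset, len(data))
        duplicate_bytes += overlap
        stream.extend(data[overlap:])

    return bytes(stream), duplicate_bytes, gaps


if __name__ == "__main__":
    # Toy data: one repeated segment and one missing 4-byte range.
    captured = [
        (1000, b"HTTP"),
        (1004, b"/1.1"),
        (1004, b"/1.1"),   # same seq, same data
        (1012, b"OK"),     # bytes 1008..1011 were never captured
    ]
    stream, dup, gaps = reassemble(captured)
    print(len(stream), dup, gaps)
```

<p>If the gaps list stays empty for a capture file saved by Wireshark but not for a live capture on the same interface, that would point at packets being dropped before they reach the Python code rather than at the parsing logic.</p>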
 
