TCP Socket hanging - both sides stuck in sendto()
<p>We have a Linux application (we don't have the source) that seems to be hanging. The socket between the two processes is reported as ESTABLISHED, and there is some data in the kernel socket buffers (although nowhere near the 16M configured via wmem/rmem). Both ends of the socket seem to be stuck in sendto().</p>
<p>Below is some investigation using netstat, lsof and strace:</p>
<h2>HOST A (10.152.20.28)</h2>
<pre><code>[root@hosta ~]# lsof -n -u df01 | grep 12959 | grep 12u
q 12959 df01 12u IPv4 4398449 TCP 10.152.20.28:38521-&gt;10.152.20.29:gsigatekeeper (ESTABLISHED)
[root@hosta ~]# netstat -anp | grep 38521
tcp 268754 90712 10.152.20.28:38521 10.152.20.29:2119 ESTABLISHED 12959/q
[root@hosta ~]# strace -p 12959
Process 12959 attached - interrupt to quit
sendto(12, "sometext\0somecode\0More\0exJKsss"..., 542, 0, NULL, 0 &lt;unfinished ...&gt;
Process 12959 detached
[root@hosta~]#
</code></pre>
<h2>HOST B (10.152.20.29)</h2>
<pre><code>[root@hostb ~]# netstat -anp | grep 38521
tcp 72858 110472 10.152.20.29:2119 10.152.20.28:38521 ESTABLISHED 25512/q
[root@hostb ~]# lsof -n -u df01 | grep 38521
q 25512 df01 14u IPv4 6456715 TCP 10.152.20.29:gsigatekeeper-&gt;10.152.20.28:38521 (ESTABLISHED)
[root@hostb ~]# strace -p 25512
Process 25512 attached - interrupt to quit
sendto(14, "\0\10\0\0\0Owner\0sym\0Type\0Ctpy\0Time\0Lo"..., 207, 0, NULL, 0 &lt;unfinished ...&gt;
Process 25512 detached
[root@hostb~]#
</code></pre>
<p>We have upgraded the NIC driver to the latest version. The systems are running RHEL 5.6 x64 (2.6.18-238.el5). I have checked the errata for RHEL 5.7 and 5.8, but I can see no mention of relevant bugs in the bnx2 driver or the kernel.</p>
<p>Does anyone have any ideas on how to debug this further?</p>
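<p>For anyone trying to reproduce the symptom: the netstat output above shows a non-empty Recv-Q and Send-Q on both hosts while both processes sit in sendto(). One mechanism that can produce that picture (an assumption here, not a confirmed diagnosis of this application) is that each peer keeps writing without reading, so each send eventually stalls on a full buffer. The sketch below is a minimal illustration of that mechanism only. It assumes a local <code>socket.socketpair()</code> instead of a real TCP connection between two hosts, uses non-blocking sockets so the script reports the stall rather than hanging, and all function names and the chunk size are hypothetical.</p>

```python
import socket

CHUNK = b"x" * 4096  # arbitrary payload size chosen for this sketch


def fill_until_blocked(sock):
    """Send on a non-blocking socket until the kernel refuses more data."""
    total = 0
    while True:
        try:
            total += sock.send(CHUNK)
        except BlockingIOError:
            # A blocking sendto() would sleep here instead of raising.
            return total


def drain(sock):
    """Read everything currently queued on a non-blocking socket."""
    total = 0
    while True:
        try:
            data = sock.recv(65536)
        except BlockingIOError:
            return total
        if not data:
            return total
        total += len(data)


def simulate_mutual_send_stall():
    """Both peers write and never read; then one peer finally reads."""
    a, b = socket.socketpair()  # stand-in for the two 'q' processes
    try:
        a.setblocking(False)
        b.setblocking(False)
        # Neither side calls recv(), so both directions fill up.
        queued_a_to_b = fill_until_blocked(a)
        queued_b_to_a = fill_until_blocked(b)
        # At this point a blocking send on either side would hang.
        read_by_b = drain(b)              # b starts reading again
        sent_after_drain = a.send(CHUNK)  # a can make progress once more
        return {
            "queued_a_to_b": queued_a_to_b,
            "queued_b_to_a": queued_b_to_a,
            "read_by_b": read_by_b,
            "sent_after_drain": sent_after_drain,
        }
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(simulate_mutual_send_stall())
```

<p>If this is what is happening, the stall only clears when at least one side reads from its socket, which the sketch shows by draining one end before sending again. Whether the real application behaves this way cannot be told from the strace output alone.</p>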