Concurrency in the Linux network drivers: probe() VS ndo_open(), ndo_start_xmit() VS NAPI poll()
<p>Could anyone explain whether additional synchronization, e.g., locking, is needed in the following two situations in a Linux network driver? I am interested in kernel 2.6.32 and newer.</p>
<h2>1. .probe VS .ndo_open</h2>
<p>In a driver for a PCI network card, the <code>net_device</code> instance is usually registered in the <code>.probe()</code> callback. Suppose a driver specifies an <code>.ndo_open</code> callback in its <code>net_device_ops</code>, performs other necessary operations and then calls <code>register_netdev()</code>.</p>
<p>Is it possible for that <code>.ndo_open</code> callback to be called by the kernel after <code>register_netdev()</code> but before the end of the <code>.probe</code> callback? I suppose it is, but maybe there is a stronger guarantee, something that ensures that the device can be opened no earlier than <code>.probe</code> ends?</p>
<p>In other words, if the <code>.probe</code> callback accesses, say, the private part of the <code>net_device</code> struct after <code>register_netdev()</code>, and the <code>.ndo_open</code> callback accesses that part too, do I need to use locks or other means to synchronize these accesses?</p>
<h2>2. .ndo_start_xmit VS NAPI poll</h2>
<p>Is there any guarantee that, for a given network device, the <code>.ndo_start_xmit</code> callback and the NAPI <code>poll</code> callback provided by a driver never execute concurrently?</p>
<p>I know that <code>.ndo_start_xmit</code> is executed with at least BH disabled, and that <code>poll</code> runs in softirq context, and hence in BH context. But this serializes execution of these callbacks on the local CPU only. Is it possible for <code>.ndo_start_xmit</code> and <code>poll</code> for the same network device to execute simultaneously on different CPUs?</p>
<p>As above, if these callbacks access the same data, do I need to protect the data with a lock or something similar?</p>
<p>References to the kernel code and/or the docs are appreciated.</p>
<p><strong>EDIT:</strong></p>
<p>To check the first situation, I conducted an experiment and added a 1-minute delay right before the end of the call to <code>register_netdev()</code> in the e1000 driver (kernel: 3.11-rc1). I also added debug prints to the <code>.probe</code> and <code>.ndo_open</code> callbacks. Then I loaded e1000.ko and tried to access the network device it services before the delay ended (in fact, NetworkManager did that before me), then checked the system log.</p>
<p><strong>Result:</strong> yes, it is possible for <code>.ndo_open</code> to be called even before the end of <code>.probe</code>, although the "race window" is usually rather small.</p>
<p>The second situation (<code>.ndo_start_xmit</code> VS NAPI <code>poll</code>) is still unclear to me and any help is appreciated.</p>
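<p>To make the first situation concrete, here is a minimal userspace sketch of the ordering problem the experiment exposed. It is not kernel code: <code>model_netdev</code>, <code>model_priv</code>, <code>model_register()</code>, <code>model_open()</code> and <code>ring_size</code> are hypothetical stand-ins, and the "open" that another task may issue once the device is visible is modeled as a direct call from the registration helper. Under those assumptions, it only illustrates why private data touched by the open path should be set up before registration.</p>

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct model_priv {
    int ring_size;              /* assumed: must be set before open */
};

struct model_netdev {
    bool registered;
    struct model_priv priv;
    int (*open)(struct model_netdev *dev);
};

/* Stand-in for .ndo_open: fails if the private data is not ready yet. */
static int model_open(struct model_netdev *dev)
{
    return dev->priv.ring_size > 0 ? 0 : -1;
}

/*
 * Stand-in for register_netdev(). Once the device is visible, another
 * task (NetworkManager in the experiment) may open it at any moment;
 * that is modeled here as an immediate call to the open callback.
 */
static int model_register(struct model_netdev *dev)
{
    dev->registered = true;
    return dev->open(dev);
}

/* Racy ordering: private data is written only after registration. */
static int probe_racy(struct model_netdev *dev)
{
    int open_result;

    dev->open = model_open;
    open_result = model_register(dev);
    dev->priv.ring_size = 256;  /* too late for the modeled open */
    return open_result;
}

/* Safer ordering: everything the open path reads is ready first. */
static int probe_init_first(struct model_netdev *dev)
{
    dev->open = model_open;
    dev->priv.ring_size = 256;
    return model_register(dev);
}
```

<p>In this model, <code>probe_racy()</code> reports a failed open while <code>probe_init_first()</code> reports a successful one. If <code>.probe</code> really must touch shared private data after registration, the sketch suggests that some explicit synchronization with the open path would be needed, which is exactly what the question asks about.</p>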
 
