How to debug deadlock problems in the kernel
<p>I have a buggy kernel module that I am trying to fix. While this module is running, it causes other tasks to hang for more than 120 seconds. Almost all of the hung tasks are waiting for either mm->mmap_sem or a file-system lock (inode->i_mutex), so I suspect that the module does not take mmap_sem and some file-system-level lock (such as inode->i_mutex) in a consistent order, which could cause a deadlock. My module does not take those locks directly, so I assume that some function I call takes them. I am now trying to figure out which function calls in my module cause the problem.</p>
<p>However, I am having a hard time debugging it, for the following reasons:</p>
<ol>
<li><p>I don't know exactly which lock the hung task is trying to take. I have the call trace of the hung task and know at what point it hangs. The kernel also gives me information such as: "1 lock held by automount/3115: 0: (&amp;type->i_mutex_dir_key#2){--..}, at: [] real_lookup+0x24/0xc5". However, to figure out the problem I want to know exactly which lock a task holds and exactly which lock it is trying to acquire. Because the kernel does not print the arguments of function calls along with the call trace, I find this information difficult to obtain.</p></li>
<li><p>I am using gdb and VMware to debug this, which lets me set breakpoints, step into functions, and so on. However, because which task hangs, and at what point, is nondeterministic, I don't know where to set breakpoints and inspect. It would be great if I could somehow "attach" to the task that the kernel reports as blocked for more than 120 seconds and get some information about it.</p></li>
</ol>
<p>So my questions are as follows:</p>
<ol>
<li><p>Where can I get, along with the call trace, the arguments of the functions in the call trace, so that I can figure out exactly which lock a task is trying to take?</p></li>
<li><p>Is it possible to use gdb to somehow "attach" to a hung task in the kernel? If not, is there some way to at least examine the data structure that represents that task? I am also having a hard time examining global data structures in the kernel: gdb always complains "can't access memory 0x3200" or something similar.</p></li>
<li><p>It would also be very helpful if I could print, for every task in the kernel, which locks it is currently holding. Is there a way to do that?</p></li>
</ol>
<p>Thank you very much!</p>
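<p>To make the suspected ordering problem concrete, here is a minimal user-space sketch in Python of the kind of check I have in mind. It is not kernel code and does not reflect how the kernel itself tracks locks. The class name LockOrderTracker, its methods, and the task names "A" and "B" are all hypothetical. Only the lock names mmap_sem and i_mutex come from my traces. It records which lock was taken while another was held and warns when a later acquisition reverses an order seen earlier.</p>

```python
from collections import defaultdict


class LockOrderTracker:
    """Hypothetical helper: records lock acquisition order and flags inversions."""

    def __init__(self):
        # edges[a] holds every lock that was acquired while a was already held
        self.edges = defaultdict(set)
        # held[task] lists the locks the task currently holds, in acquisition order
        self.held = defaultdict(list)

    def _reachable(self, start, target):
        """Return True if target can be reached from start along recorded edges."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, task, lock):
        """Record that task takes lock; return a list of inversion warnings."""
        warnings = []
        for prior in self.held[task]:
            # An existing path lock -> ... -> prior means the opposite order was seen before
            if self._reachable(lock, prior):
                warnings.append(f"{task}: {prior} -> {lock} inverts an earlier order")
            self.edges[prior].add(lock)
        self.held[task].append(lock)
        return warnings

    def release(self, task, lock):
        """Record that task drops lock."""
        self.held[task].remove(lock)


if __name__ == "__main__":
    tracker = LockOrderTracker()
    # Task A takes mmap_sem first, then i_mutex, and releases both
    tracker.acquire("A", "mmap_sem")
    tracker.acquire("A", "i_mutex")
    tracker.release("A", "i_mutex")
    tracker.release("A", "mmap_sem")
    # Task B takes the same two locks in the opposite order
    tracker.acquire("B", "i_mutex")
    for warning in tracker.acquire("B", "mmap_sem"):
        print(warning)
```

<p>The two tasks never actually block each other in this run. The warning comes purely from the recorded order, which is what I would like to get for the real locks, since the real hang is nondeterministic.</p>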