Looking for guidance on a deadlock scenario
<p>I have a program that spawns lots of children and runs for long periods of time. The program contains a SIGCHLD handler to reap defunct processes. Occasionally, this program freezes. I believe that pstack is indicating a deadlock scenario. Is that the proper interpretation of this output?</p>
<pre><code>10533:  ./asyncsignalhandler
 ff3954e4 lwp_park (0, 0, 0)
 ff391bbc slow_lock (ff341688, ff350000, 0, 0, 0, 0) + 58
 ff2c45c8 localtime_r (ffbfe7a0, 0, 0, 0, 0, 0) + 24
 ff2ba39c __posix_ctime_r (ffbfe7a0, ffbfe80e, ffbfe7a0, 0, 0, 0) + c
 00010bd8 gettimestamp (ffbfe80e, ffbfe828, 40, 0, 0, 0) + 18
 00010c50 sig_chld (12, 0, ffbfe9f0, 0, 0, 0) + 30
 ff3956fc __sighndlr (12, 0, ffbfe9f0, 10c20, 0, 0) + c
 ff38f354 call_user_handler (12, 0, ffbfe9f0, 0, 0, 0) + 234
 ff38f504 sigacthandler (12, 0, ffbfe9f0, 0, 0, 0) + 64
 --- called from signal handler with signal 18 (SIGCLD) ---
 ff391c14 pthread_mutex_lock (20fc8, 0, 0, 0, 0, 0) + 48
 ff2bcdec getenv (ff32a9ac, 770d0, 0, 0, 0, 0) + 1c
 ff2c6f40 getsystemTZ (0, 79268, 0, 0, 0, 0) + 14
 ff2c4da8 ltzset_u (4ede65ba, 0, 0, 0, 0, 0) + 14
 ff2c45d0 localtime_r (ffbff378, 0, 0, 0, 0, 0) + 2c
 ff2ba39c __posix_ctime_r (ffbff378, ffbff402, ffbff378, ff33e000, 0, 0) + c
 00010bd8 gettimestamp (ffbff402, ffbff402, 2925, 29a7, 79c38, 10b54) + 18
 00010ae0 main (1, ffbff4ac, ffbff4b4, 20c00, 0, 0) + 190
 00010928 _start (0, 0, 0, 0, 0, 0) + 108
</code></pre>
<p>I don't really fancy myself a C coder and am not familiar with the nuances of the language. I'm specifically using the re-entrant version of ctime(_r) in the program. Why is this still deadlocking?</p>
<pre><code>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;time.h&gt;

// import pid_t type
#include &lt;sys/types.h&gt;

// import _exit function
#include &lt;unistd.h&gt;

// import WNOHANG definition
#include &lt;sys/wait.h&gt;

// import errno variable
#include &lt;errno.h&gt;

// header for signal functions
#include &lt;signal.h&gt;

// function prototypes
void sig_chld(int);
char * gettimestamp(char *);

// begin
int main(int argc, char **argv)
{
    time_t sleepstart;
    time_t sleepcheck;
    pid_t childpid;
    int i;
    unsigned int sleeptime;
    char sleepcommand[20];
    char ctime_buf[26];

    struct sigaction act;

    /* set stdout to line buffered for logging purposes */
    setvbuf(stdout, NULL, _IOLBF, BUFSIZ);

    /* Assign sig_chld as our SIGCHLD handler */
    act.sa_handler = sig_chld;

    /* We don't want to block any other signals */
    sigemptyset(&amp;act.sa_mask);

    /*
     * We're only interested in children that have terminated, not ones
     * which have been stopped (eg user pressing control-Z at terminal)
     */
    act.sa_flags = SA_NOCLDSTOP;

    /* Make these values effective. */
    if (sigaction(SIGCHLD, &amp;act, NULL) &lt; 0)
    {
        printf("sigaction failed\n");
        return 1;
    }

    while (1)
    {
        for (i = 0; i &lt; 20; i++)
        {
            /* fork/exec child program */
            childpid = fork();
            if (childpid == 0) // child
            {
                //sleeptime = 30 + i;
                sprintf(sleepcommand, "sleep %d", i);

                printf("\t[%s][%d] Executing /bin/sh -c %s\n", gettimestamp(ctime_buf), getpid(), sleepcommand);

                execl("/bin/sh", "/bin/sh", "-c", sleepcommand, NULL);

                // only executed if exec fails
                printf("[%s][%d] Error executing program, errno: %d\n", gettimestamp(ctime_buf), getpid(), errno);
                _exit(1);
            }
            else if (childpid &lt; 0) // error
            {
                printf("[%s][%d] Error forking, errno: %d\n", gettimestamp(ctime_buf), getpid(), errno);
            }
            else // parent
            {
                printf("[%s][%d] Spawned child, pid: %d\n", gettimestamp(ctime_buf), getpid(), childpid);
            }
        }

        // sleep is interrupted by SIGCHLD, so we can't simply sleep(5)
        printf("[%s][%d] Sleeping for 5 seconds\n", gettimestamp(ctime_buf), getpid());
        time(&amp;sleepstart);
        while (1)
        {
            time(&amp;sleepcheck);
            if (difftime(sleepcheck, sleepstart) &lt; 5)
            {
                sleep(1);
            }
            else
            {
                break;
            }
        }
    }

    return(0);
}

char * gettimestamp(char *ctime_buf)
{
    time_t now;

    time(&amp;now);

    // format the timestamp and chomp the newline
    ctime_r(&amp;now, ctime_buf);
    ctime_buf[strlen(ctime_buf) - 1] = '\0';

    return ctime_buf;
}

/*
 * The signal handler function -- only gets called when a SIGCHLD
 * is received, ie when a child terminates.
 */
void sig_chld(int signo)
{
    pid_t childpid;
    int childexitstatus;
    char ctime_buf[26];

    while (1)
    {
        childpid = waitpid(-1, &amp;childexitstatus, WNOHANG);
        if (childpid &gt; 0)
            printf("[%s][%d] Reaped child, pid: %d, exitstatus: %d\n", gettimestamp(ctime_buf), getpid(), childpid, WEXITSTATUS(childexitstatus));
        else
            return;
    }
}
</code></pre>
<p>I'm running in a Solaris 9 environment. The program was compiled with Sun WorkShop 6 update 2 C 5.3 Patch 111679-15 2009/09/10 using the following syntax:</p>
<pre><code>cc -o asyncsignalhandler asyncsignalhandler.c -mt -D_POSIX_PTHREAD_SEMANTICS
</code></pre>
<p>Is there a flaw in the program? Are there better ways to handle logging (with timestamps) from a signal handler?</p>
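For context on the last question: ctime_r is re-entrant in the thread-safety sense, but it is not on the POSIX list of async-signal-safe functions, and neither is printf. If the handler interrupts main while main is inside one of those calls and holds a library lock, the handler can wait on that same lock forever, which is consistent with the pstack trace. One commonly suggested pattern is to keep the handler trivial and do the reaping and logging from the main loop. The following is a minimal sketch of that idea, not a definitive fix. It assumes a short delay between a child exiting and the log line is acceptable. The names got_sigchld, install_sigchld_handler and reap_if_signalled are hypothetical helpers introduced only for illustration.

```c
/* Assumption about the build environment: request POSIX declarations
 * (sigaction, ctime_r) on systems that hide them by default. */
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical flag: set by the handler, cleared by the main loop. */
static volatile sig_atomic_t got_sigchld = 0;

/* The handler only records that SIGCHLD arrived. It calls no library
 * functions, so it cannot block on a lock held by the interrupted code. */
static void sig_chld(int signo)
{
    (void)signo;
    got_sigchld = 1;
}

/* Hypothetical helper: install the handler with the same flags as the
 * original program. Returns 0 on success, -1 on failure. */
static int install_sigchld_handler(void)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);
    act.sa_handler = sig_chld;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &act, NULL);
}

/* Hypothetical helper: called from normal (non-handler) context, where
 * ctime_r and printf are fine. Returns the number of children reaped. */
static int reap_if_signalled(void)
{
    int reaped = 0;
    int status;
    pid_t pid;
    char buf[26];
    time_t now;

    if (!got_sigchld)
        return 0;

    /* Clear the flag before reaping so that a SIGCHLD arriving during
     * the loop sets it again and triggers another pass later. */
    got_sigchld = 0;

    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        time(&now);
        if (ctime_r(&now, buf) != NULL)
            buf[strlen(buf) - 1] = '\0';   /* chomp the newline */
        else
            strcpy(buf, "unknown time");
        printf("[%s][%d] Reaped child, pid: %d, exitstatus: %d\n",
               buf, (int)getpid(), (int)pid, WEXITSTATUS(status));
        reaped++;
    }
    return reaped;
}
```

In the original program this would mean calling install_sigchld_handler() in place of the inline sigaction setup, and calling reap_if_signalled() on each pass of the inner wait loop just before sleep(1). A signal that lands between the flag check and the sleep is then picked up at most about one second later.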
 
