  1. Segfault occurs due to one line of code in C file and entire program does not run
    <p>I've created a C program to write to a serial port (/dev/ttyS0) on an embedded ARM system. The kernel running on the embedded ARM system is Linux version 3.0.4, built with the same cross-compiler as the one listed below.</p>
<p>My cross-compiler is arm-linux-gcc (Buildroot 2011.08) 4.3.6, running on an Ubuntu x86_64 host (3.0.0-14-generic #23-Ubuntu SMP). I have used the stty utility to set up the serial port from the command line.</p>
<p>Mysteriously, it seems that the program will refuse to run on the embedded ARM system if a single line of code is present. If the line is removed, the program will run.</p>
<p>Here is a full code listing replicating the problem:</p>
<p>EDIT: I now close the file on error, as suggested in the comments below.</p>
<pre><code>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;
#include &lt;errno.h&gt;
#include &lt;termios.h&gt;

int test();
void run_experiment();

int main()
{
    run_experiment();
    return 0;
}

void run_experiment()
{
    printf("Starting program\n");
    test();
}

int test()
{
    int fd;
    int ret;

    fd = open("/dev/ttyS0", O_RDWR | O_NOCTTY);
    printf("fd = %u\n", fd);
    if (fd &lt; 0)
    {
        close(fd);
        return 0;
    }

    fcntl(fd, F_SETFL, 0);
    printf("Now writing to serial port\n");

    //TODO:
    // segfault occurs due to line of code here
    // removing this line causes the program to run properly
    ret = write( fd, "test\r\n", sizeof("test\r\n") );
    if (ret &lt; 0)
    {
        close(fd);
        return 0;
    }

    close(fd);
    return 1;
}
</code></pre>
<p>The output of this program on the ARM system is the following:</p>
<pre><code>Segmentation fault
</code></pre>
<p>However, if I remove the line listed above and recompile the program, the problem goes away, and the output is the following:</p>
<pre><code>Starting program
fd = 3
Now writing to serial port
</code></pre>
<p>What could be going wrong here, and how do I debug
the problem? Would this be an issue with the code, with the cross-compiler, or with a version of the OS?</p>
<p>I have also tried various combinations of O_WRONLY and O_RDWR without O_NOCTTY when opening the file, but the problem still persists.</p>
<p>As suggested by @wildplasser in the comments below, I have replaced the test function with the following code, heavily based on the code at another site (http://www.warpspeed.com.au/cgi-bin/inf2html.cmd?..\html\book\Toolkt40\XPG4REF.INF+112).</p>
<p>However, the program still doesn't run, and I receive the mysterious <code>Segmentation Fault</code> again.</p>
<p>Here is the code:</p>
<pre><code>int test()
{
    int fh;
    FILE *fp;
    char *cp;

    if (-1 == (fh = open("/dev/ttyS0", O_RDWR)))
    {
        perror("Unable to open");
        return EXIT_FAILURE;
    }

    if (NULL == (fp = fdopen(fh, "w")))
    {
        perror("fdopen failed");
        close(fh);
        return EXIT_FAILURE;
    }

    for (cp = "hello world\r\n"; *cp; cp++)
        fputc( *cp, fp);

    fclose(fp);
    return 0;
}
</code></pre>
<p>This is very mysterious, since using other programs that I have written, I can use the <code>write()</code> function in a similar fashion to write to sysfs files, without any problem.</p>
<p>HOWEVER, if the program is exactly in the same structure, then I cannot write to /dev/null.</p>
<p>BUT I can successfully write to a sysfs file using exactly the same program!</p>
<p>If the segfault occurred at a particular line in the function, then I would assume that the function call would be causing the segfault. However, the full program does not run!</p>
<p>UPDATE: To provide more information, here is the cross-compiler information used to build for the ARM system:</p>
<p>$ arm-linux-gcc --v
Using built-in specs.
Target: arm-unknown-linux-uclibcgnueabi
Configured with: /media/RESEARCH/SAS2-version2/device-system/buildroot/buildroot-2011.08/output/toolchain/gcc-4.3.6/configure --prefix=/media/RESEARCH/SAS2-version2/device-system/buildroot/buildroot-2011.08/output/host/usr --build=x86_64-unknown-linux-gnu --host=x86_64-unknown-linux-gnu --target=arm-unknown-linux-uclibcgnueabi --enable-languages=c,c++ --with-sysroot=/media/RESEARCH/SAS2-version2/device-system/buildroot/buildroot-2011.08/output/host/usr/arm-unknown-linux-uclibcgnueabi/sysroot --with-build-time-tools=/media/RESEARCH/SAS2-version2/device-system/buildroot/buildroot-2011.08/output/host/usr/arm-unknown-linux-uclibcgnueabi/bin --disable-__cxa_atexit --enable-target-optspace --disable-libgomp --with-gnu-ld --disable-libssp --disable-multilib --enable-tls --enable-shared --with-gmp=/media/RESEARCH/SAS2-version2/device-system/buildroot/buildroot-2011.08/output/host/usr --with-mpfr=/media/RESEARCH/SAS2-version2/device-system/buildroot/buildroot-2011.08/output/host/usr --disable-nls --enable-threads --disable-decimal-float --with-float=soft --with-abi=aapcs-linux --with-arch=armv5te --with-tune=arm926ej-s --disable-largefile --with-pkgversion='Buildroot 2011.08' --with-bugurl=http://bugs.buildroot.net/
Thread model: posix
gcc version 4.3.6 (Buildroot 2011.08) </p>
<p>Here is the makefile that I am using to compile my code:</p>
<pre><code>CC=arm-linux-gcc
CFLAGS=-Wall
datacollector: datacollector.o
clean:
	rm -f datacollector datacollector.o
</code></pre>
<p>UPDATE: Using the debugging suggestions given in the comments and answers below, I found that the segfault was caused by including the <code>\r</code> escape sequence in the string.
For some strange reason, the compiler doesn't seem to like the <code>\r</code> escape sequence, and the resulting binary segfaults without running any of the code.</p>
<p>If the <code>\r</code> escape sequence is removed, then the code runs as expected.</p>
<p>Thus, the offending line of code should be changed to the following:</p>
<p>ret = write( fd, "test\n", sizeof("test\n") );</p>
<p>So for the record, a full test program that actually runs is the following (could someone comment?):</p>
<pre><code>#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;
#include &lt;errno.h&gt;
#include &lt;termios.h&gt;

int test();
void run_experiment();

int main()
{
    run_experiment();
    return 0;
}

void run_experiment()
{
    printf("Starting program\n");
    fflush(stdout);
    test();
}

int test()
{
    int fd;
    int ret;
    char *msg = "test\n";

    // NOTE: This does not work and will cause a segfault!
    // even if the fflush is called after each printf,
    // the program will still refuse to run
    //char *msg = "test\r\n";

    fd = open("/dev/ttyS0", O_RDWR | O_NOCTTY);
    printf("fd = %u\n", fd);
    fflush(stdout);
    if (fd &lt; 0)
    {
        close(fd);
        return 0;
    }

    fcntl(fd, F_SETFL, 0);
    printf("Now writing to serial port\n");
    fflush(stdout);

    ret = write( fd, msg, strlen(msg) );
    if (ret &lt; 0)
    {
        close(fd);
        return 0;
    }

    close(fd);
    return 1;
}
</code></pre>
<p>EDIT: As an aside to all of this, is it better to use:</p>
<pre><code>ret = write( fd, msg, sizeof(msg) );
</code></pre>
<p>or is it better to use:</p>
<pre><code>ret = write( fd, msg, strlen(msg) );
</code></pre>
<p><strong>Which is better? Is it better to use sizeof() or strlen()?
It appears that some of the data in the string is truncated and not written to the serial port when using sizeof().</strong></p>
<p>As I understand from Pavel's comment below, it is better to use <code>strlen()</code> if <code>msg</code> is declared as <code>char*</code>, because <code>sizeof(msg)</code> then yields the size of the pointer rather than the length of the string.</p>
<p>Moreover, it appears that gcc is not creating a proper binary when the escape sequence <code>\r</code> is being used to write to a tty.</p>
<p>Referring to the last test program given in my post above, the following line of code causes a segfault without the program running:</p>
<pre><code>char *msg = "test\r\n";
</code></pre>
<p>As suggested by Igor in the comments, I have run the gdb debugger on the binary with the offending line of code. I had to compile the program with the <code>-g</code> switch. <strong>The gdb debugger is being run natively on the ARM system, and all binaries are being built for the ARM architecture on the host using the same Makefile. All binaries are being built using the arm-linux-gcc cross-compiler.</strong></p>
<p>The output of gdb (running natively on the ARM system) is as follows:</p>
<pre><code>GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later &lt;http://gnu.org/licenses/gpl.html&gt;
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-unknown-linux-uclibcgnueabi"...
"/programs/datacollector": not in executable format: File format not recognized
(gdb) run
Starting program:
No executable file specified.
Use the "file" or "exec-file" command.
(gdb) file datacollector
"/programs/datacollector": not in executable format: File format not recognized
(gdb)
</code></pre>
<p>However, if I change the single line of code to the following, the binary compiles and runs properly.
Note that the <code>\r</code> escape sequence is missing:</p>
<pre><code>char *msg = "test\n";
</code></pre>
<p>Here is the output of gdb after changing the single line of code:</p>
<pre><code>GNU gdb 6.8
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later &lt;http://gnu.org/licenses/gpl.html&gt;
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "arm-unknown-linux-uclibcgnueabi"...
(gdb) run
Starting program: /programs/datacollector
Starting program
fd = 4
Now writing to serial port
test
Program exited normally.
(gdb)
</code></pre>
<p>UPDATE:</p>
<p>As suggested by Zack in an answer below, I have now run a test program on the embedded Linux system. Although Zack gives a detailed script to run on the embedded system, I was unable to run the script due to the lack of development tools (compiler and headers) installed in the root file system. In lieu of installing these tools, I simply compiled the nice test program that Zack provided in the script and used the strace utility. The strace utility was run on the embedded system.</p>
<p>At last, I think that I understand what is happening.</p>
<p>The bad binary was transferred to the embedded system over FTP, using an SPI-to-Ethernet bridge (KSZ8851SNL). There is a driver for the KSZ8851SNL in the Linux kernel.</p>
<p><strong>It appears that either the Linux kernel driver, the proftpd server software running on the embedded system, or the actual hardware itself (KSZ8851SNL) was somehow corrupting the binary.
The binary runs well on the embedded system when it is transferred on an SD card instead.</strong></p>
<p>Here is the output of strace on the testz binary transferred to the embedded Linux system over the Ethernet serial link:</p>
<p>Bad binary tests:</p>
<pre><code># strace ./testz /dev/null
execve("./testz", ["./testz", "/dev/null"], [/* 17 vars */]) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x4000000, -1, 0) = 0x40089000
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Segmentation fault
# strace ./testz /dev/ttyS0
execve("./testz", ["./testz", "/dev/ttyS0"], [/* 17 vars */]) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x4000000, -1, 0) = 0x400ca000
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Segmentation fault
#
</code></pre>
<p>Here is the output of strace on the testz binary transferred on SD card to the embedded Linux system:</p>
<p>Good binary tests:</p>
<pre><code># strace ./testz /dev/null
execve("./testz", ["./testz", "/dev/null"], [/* 17 vars */]) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x4000000, -1, 0) = 0x40058000
open("/lib/libc.so.0", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=298016, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x4000000, -1, 0) = 0x400b8000
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\240\230\0\0004\0\0\0"..., 4096) = 4096
mmap2(NULL, 348160, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40147000
mmap2(0x40147000, 290576, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x40147000
mmap2(0x40196000, 4832, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x47) = 0x40196000
mmap2(0x40198000, 14160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40198000
close(3) = 0
munmap(0x400b8000, 4096) = 0
stat("/lib/ld-uClibc.so.0", {st_mode=S_IFREG|0755, st_size=25296, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x4000000, -1, 0) = 0x400c4000
set_tls(0x400c4470, 0x400c4470, 0x4007b088, 0x400c4b18, 0x40) = 0
mprotect(0x40196000, 4096, PROT_READ) = 0
mprotect(0x4007a000, 4096, PROT_READ) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B115200 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B115200 opost isig icanon echo ...}) = 0
open("/dev/null", O_RDWR|O_NOCTTY|O_NONBLOCK) = 3
write(3, "1\n", 2) = 2
write(3, "12\n", 3) = 3
write(3, "123\n", 4) = 4
write(3, "1234\n", 5) = 5
write(3, "12345\n", 6) = 6
write(3, "1\r\n", 3) = 3
write(3, "12\r\n", 4) = 4
write(3, "123\r\n", 5) = 5
write(3, "1234\r\n", 6) = 6
close(3) = 0
exit_group(0) = ?
# strace ./testz /dev/ttyS0
execve("./testz", ["./testz", "/dev/ttyS0"], [/* 17 vars */]) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x4000000, -1, 0) = 0x400ed000
open("/lib/libc.so.0", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=298016, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x4000000, -1, 0) = 0x40176000
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\240\230\0\0004\0\0\0"..., 4096) = 4096
mmap2(NULL, 348160, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40238000
mmap2(0x40238000, 290576, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x40238000
mmap2(0x40287000, 4832, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x47) = 0x40287000
mmap2(0x40289000, 14160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40289000
close(3) = 0
munmap(0x40176000, 4096) = 0
stat("/lib/ld-uClibc.so.0", {st_mode=S_IFREG|0755, st_size=25296, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x4000000, -1, 0) = 0x400d1000
set_tls(0x400d1470, 0x400d1470, 0x40084088, 0x400d1b18, 0x40) = 0
mprotect(0x40287000, 4096, PROT_READ) = 0
mprotect(0x40083000, 4096, PROT_READ) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B115200 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B115200 opost isig icanon echo ...}) = 0
open("/dev/ttyS0", O_RDWR|O_NOCTTY|O_NONBLOCK) = 3
write(3, "1\n", 21 ) = 2
write(3, "12\n", 312 ) = 3
write(3, "123\n", 4123 ) = 4
write(3, "1234\n", 51234 ) = 5
write(3, "12345\n", 612345 ) = 6
write(3, "1\r\n", 31 ) = 3
write(3, "12\r\n", 412 ) = 4
write(3, "123\r\n", 5123 ) = 5
write(3, "1234\r\n", 61234 ) = 6
close(3) = 0
exit_group(0) = ?
</code></pre>