A faulty program corrupts some of its data, preventing correct core generation (Failed to write core file for process postgres (error 14))

Konstantin Belousov kostikbel at gmail.com
Tue Jul 5 11:48:16 UTC 2016


On Mon, Jul 04, 2016 at 10:26:25PM -0700, Maxim Sobolev wrote:
> Hi all, I am investigating some random postgresql-9.1.21 server crashes on
> FreeBSD 10.3. We started seeing them after upgrading from postgres
> 9.1.18 on more than one system, so hardware problems (e.g. RAM issues) are
> very unlikely. I suspect that postgres is at fault; however, I am also curious
> how it could be that the kernel is not capable of generating a core file when
> an application does something silly. Is it that some ELF-related data
> structures got corrupted, or something else? Are we protecting the page
> where the ELF header is mapped with the R/O flag? I am looking at possibly
> recreating this by poking around the ELF header(s) to see if I can corrupt
> them in a similar manner reliably. Any pointers or suggestions are appreciated.
> 
> Jun 27 04:10:18 dal12 kernel: Failed to write core file for process postgres (error 14)
> Jun 27 04:10:18 dal12 kernel: pid 41361 (postgres), uid 70: exited on signal 11
> Jul  1 05:21:46 dal12 kernel: Failed to write core file for process postgres (error 14)
> Jul  1 05:21:46 dal12 kernel: pid 1722 (postgres), uid 70: exited on signal 11
> 
> #define EFAULT          14              /* Bad address */
> 
> The resulting files are truncated and are not really usable for anything.
> We've seen the same issue
> 
> -rw-------    1 pgsql     wheel     1310720 Jun 27 04:10 postgres.41361.core
> -rw-------    1 pgsql     wheel     1310720 Jul  1 05:21 postgres.1722.core
> 
> [ssp-root@dal12 /var/tmp]$ sudo gdb711 postgres postgres.1722.core
> GNU gdb (GDB) 7.11 [GDB v7.11 for FreeBSD]
> Copyright (C) 2016 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-portbld-freebsd10.3".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from postgres...(no debugging symbols found)...done.
> BFD: Warning: /var/tmp/postgres.1722.core is truncated: expected core file size >= 517120000, found: 1310720.
> [New LWP 100261]
> Core was generated by `postgres'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x0000000800cfba67 in ?? () from /lib/libthr.so.3
> (gdb) where
> #0  0x0000000800cfba67 in ?? () from /lib/libthr.so.3
> Backtrace stopped: Cannot access memory at address 0x7fffffffdd08
> (gdb) q
> 
https://lists.freebsd.org/pipermail/freebsd-stable/2016-June/084877.html
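
For readers following along, the numbers in the quoted report can be checked with a short script. This is only a minimal sketch: the helper names describe_errno and truncation_report are hypothetical, and the 4096-byte page size is an assumption (the usual value on amd64), not something stated in the report. It decodes errno 14 and shows how little of the expected core was written before the dump was aborted.

```python
import errno
import os

PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual size on amd64


def describe_errno(code):
    """Return (symbolic name, message) for a numeric errno value."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def truncation_report(expected, found, page_size=PAGE_SIZE):
    """Summarize how much of an expected core file was actually written."""
    pages, remainder = divmod(found, page_size)
    return {
        "missing_bytes": expected - found,
        "fraction_written": found / expected,
        "whole_pages": pages,
        "page_aligned": remainder == 0,
    }


if __name__ == "__main__":
    # Error code from the kernel log line in the report.
    print(describe_errno(14))
    # Sizes taken from the BFD warning printed by gdb in the report.
    print(truncation_report(517120000, 1310720))
```

Under the stated page-size assumption, both truncated files (1310720 bytes each) end on a page boundary, which would be consistent with the dump stopping at the first region the kernel could not read, though the report itself does not confirm this.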

