DEBUG - analysing core dumps

Damien Fleuriot ml at my.gd
Thu May 26 09:29:15 UTC 2011


On 26 May 2011 09:51, Damien Fleuriot <ml at my.gd> wrote:
>
>
> On 5/25/11 7:10 PM, Garrett Cooper wrote:
>> On Wed, May 25, 2011 at 9:36 AM, Damien Fleuriot <ml at my.gd> wrote:
>>> Hello list,
>>>
>>>
>>>
>>> We've got these boxes at work running FreeBSD 8.1-STABLE amd64 and
>>> serving as firewalls and openvpn gateways.
>>>
>>> We use CARP interfaces to provide an active-passive fault tolerant system.
>>>
>>>
>>> Today, we received a Nagios alert from the master box saying its
>>> rsyslogd process had crashed.
>>>
>>> I logged on to it and tried to relaunch it, to no avail:
>>> pid 2303 (rsyslogd), uid 0: exited on signal 11 (core dumped)
>>>
>>>
>>>
>>>
>>> I would like advice on how to debug the output from the core dump.
>>>
>>> This is what I get from gdb:
>>>
>>> # gdb
>>> GNU gdb 6.1.1 [FreeBSD]
>>> Copyright 2004 Free Software Foundation, Inc.
>>> GDB is free software, covered by the GNU General Public License, and you are
>>> welcome to change it and/or distribute copies of it under certain
>>> conditions.
>>> Type "show copying" to see the conditions.
>>> There is absolutely no warranty for GDB.  Type "show warranty" for details.
>>> This GDB was configured as "amd64-marcel-freebsd".
>>> (gdb) core rsyslogd.core
>>> Core was generated by `rsyslogd'.
>>> Program terminated with signal 11, Segmentation fault.
>>> #0  0x00000000004258ec in ?? ()
>>>
>>>
>>>
>>>
>>> Sadly, getting a backtrace with "bt" gives me more lines with "??",
>>> which is totally not helpful:
>>> [SNIP]
>>> #13 0x00007fffff1f9d70 in ?? ()
>>> #14 0x0000000000000000 in ?? ()
>>> #15 0x6f70732f7261762f in ?? ()
>>> #16 0x6c737973722f6c6f in ?? ()
>>> #17 0x5f6e70766f2f676f in ?? ()
>>> #18 0x746174732e676f6c in ?? ()
>>> #19 0x0000000000000065 in ?? ()
>>> #20 0x0000000000000000 in ?? ()
>>> [SNIP]
>>>
>>> I am not sure what steps I should follow to get more information.
>>>
>>>
>>>
>>> Also, I believe that core dumps on signal 11 often point to RAM
>>> problems, and I would like confirmation of that here.
>>>
>>> I am concerned because rsyslogd is the only process that crashes in this
>>> way, even after I rebooted the firewall.
>>
>>     Rebuild and reinstall rsyslogd with debug symbols and see if you
>> can get a reasonable stack trace. Before that, you can narrow down the
>> problem section of code by running it under ktrace/kdump or truss and
>> checking whether it's trying to open or read a file and failing.
>> Thanks,
>> -Garrett
>
>
>
>
> Thanks everyone for your answers, I'll recompile with DEBUG and obtain a
> new core dump.
>
> I'll also investigate the possibility of corrupted spool files and post
> the resolution here :)
>
>
> --
> dfl
>


Turns out that after rebuilding rsyslog4-relp with -DWITH_DEBUG, the
new daemon works just fine and doesn't sig11 anymore.
Odd, but well, solves my problem.

I will upgrade it on all the other boxes then.
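
For the record, frames #15 to #19 of the backtrace quoted above do not
look like return addresses; they look like ASCII text. Here is a minimal
Python sketch that decodes them, assuming the words are stored
little-endian as they would be on amd64. The names FRAMES and
decode_stack_words are made up for this illustration; the values are
copied from the gdb output.

```python
import struct

# Frames #15 to #19 from the quoted gdb backtrace, copied verbatim.
FRAMES = [
    0x6f70732f7261762f,
    0x6c737973722f6c6f,
    0x5f6e70766f2f676f,
    0x746174732e676f6c,
    0x0000000000000065,
]


def decode_stack_words(words):
    """Pack each 64-bit word little-endian and read the bytes as ASCII."""
    raw = b"".join(struct.pack("<Q", w) for w in words)
    # Stop at the first NUL byte, as a C string would.
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


if __name__ == "__main__":
    print(decode_stack_words(FRAMES))
```

This prints /var/spool/rsyslog/ovpn_log.state, which fits the suspicion
about the spool files. It does not prove a buffer overflow: without
symbols gdb may simply have been misreading ordinary stack data as
frames.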

Thanks for the help, guys.

--
dfl


More information about the freebsd-hackers mailing list