Stack overflow with kernel r254683

msch at snafu.de msch at snafu.de
Tue Feb 4 09:09:05 UTC 2014


It seems that my previous mail was sent encoded, so I am trying once
more, this time verified as 'plain text' from my private e-mail account.
 **************************************************************************

 Hello,

 I finally managed to upgrade and test my server last weekend.

 There is good news: so far kernel r261208 (FreeBSD 9.2-STABLE) runs
 without problems.

 I could not apply the patch you supplied, but I saw that the code had
 been modified nonetheless, so I gave it a try :-)

 It seems that the problem has been solved.

 Thank you very much! :-)

 with best regards
 Matthias Schuendehuette

 > -----Original Message-----
 > From: Rick Macklem [mailto:rmacklem at uoguelph.ca]
 > Sent: Sunday, January 19, 2014 03:19
 > To: Schuendehuette, Matthias
 > Cc: Konstantin Belousov
 > Subject: Re: Stack overflow with kernel r254683
 > 
 > I just found a bug that causes a stack overflow in the file handle
 > affinity code done by ken at . It occurs for an NFSv2 client mounting
 > a server, where sizeof(fhandle_t) < 32.
 > 
 > I've attached the patch that fixes this, in case you can test it?
 > 
 > Since your stack trace looks completely different, I won't guess if
 > this was the bug, but this bug definitely trashed the stack.
 > 
 > rick
 > 
 > ----- Original Message -----
 > > On Mon, Aug 26, 2013 at 07:11:48PM -0400, Rick Macklem wrote:
 > > > Matthias Schuendehuette wrote:
 > > > > Hello,
 > > > >
 > > > > yesterday I got a kernel crash on my server (a ProLiant DL380
 > > > > G5):
 > > > >
 > > > > "panic: stack overflow detected; backtrace may be corrupted"
 > > > >
 > > > > Kernel is "9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #7 r254683"
 > > > >
 > > > >
 > > > > The stack trace reads:
 > > > >
 > > > > #0 doadump (textdump=1) at pcpu.h:249
 > > > > 249 pcpu.h: No such file or directory.
 > > > > in pcpu.h
 > > > > (kgdb) #0 doadump (textdump=1) at pcpu.h:249
 > > > > #1 0xc0668a4d in kern_reboot (howto=260)
 > > > > at /usr/src/sys/kern/kern_shutdown.c:449
 > > > > #2 0xc0668f07 in panic (fmt=0x104 )
 > > > > at /usr/src/sys/kern/kern_shutdown.c:637
 > > > > #3 0xc0691da2 in __stack_chk_fail ()
 > > > > at /usr/src/sys/kern/stack_protector.c:17
 > > > > #4 0xc7fdb175 in nfsrvd_setattr (nd=0xc73b4400, isdgram=-952596480,
 > > > > vp=0xc8001140, p=0xf405ecc8, exp=0xc07af7f0)
 > > > > at /usr/src/sys/modules/nfsd/../../fs/nfsserver/nfs_nfsdserv.c:371
 > > > > #5 0xc7fdb6e0 in nfsrvd_releaselckown (nd=0xc7442a00, isdgram=-952596480,
 > > > > vp=0xc7388848, p=0xf405ecb8, exp=0x0)
 > > > > at /usr/src/sys/modules/nfsd/../../fs/nfsserver/nfs_nfsdserv.c:3481
 > > > > #6 0xc07af7f0 in svc_run_internal (pool=0xc7de8b80, ismaster=0)
 > > > > at /usr/src/sys/rpc/svc.c:1109
 > > > > #7 0xc07b006d in svc_thread_start (arg=0xc7de8b80)
 > > > > at /usr/src/sys/rpc/svc.c:1200
 > > > > #8 0xc06384f7 in fork_exit (callout=0xc07b0060 ,
 > > > > arg=0xc7de8b80, frame=0xf405ed08)
 > > > > at /usr/src/sys/kern/kern_fork.c:992
 > > > > #9 0xc08787c4 in fork_trampoline ()
 > > > > at /usr/src/sys/i386/i386/exception.s:279
 > > > >
 > > > Well, when I've looked on i386, the nfsd threads normally don't use
 > > > 1 page and the stacks are 2 pages, so I doubt an nfsd thread is
 > > > blowing the stack.
 > > It is overflowing the frame, not the whole stack. In other words,
 > > something overwrote the canary that was put on the stack between
 > > the local variables and the return address, possibly corrupting the
 > > return address as well.
 > >
 > > > Also, nfsrvd_releaselckown() doesn't call nfsrvd_setattr(), so the
 > > > backtrace doesn't make much sense.
 > > Yes, this might be one of the consequences of the stack smashing.
 > >
 > > >
 > > > Afraid I can't help more than this. Good luck with it, rick
 > > >
 > > > >
 > > > > I have all the files in /var/crash, so if someone wants additional
 > > > > information, I should be able to deliver it.
 > > > >
 > > > > The kernel config file is customized in the sense that I have
 > > > > removed kernel items that aren't used on that machine.
 > > > >
 > > > > One major difference: I use
 > > > >
 > > > > < options NFSCLIENT # Network Filesystem Client
 > > > > < options NFSSERVER # Network Filesystem Server
 > > > >
 > > > > instead of
 > > > >
 > > > > > options NFSCL # New Network Filesystem Client
 > > > > > options NFSD # New Network Filesystem Server
 > > > >
 > > > > because a kernel from a few weeks ago crashed immediately with the
 > > > > new NFS code.
 > > > >
 > > > > But it now seems that the old NFS code is also somehow damaged.
 > > > >
 > > > > Ah, and I still have the following loader options from older
 > > > > releases of FreeBSD. Do they still make sense?
 > > > >
 > > > > geom_vinum_load="YES"
 > > > > kern.maxdsiz="734003200"
 > > > > vm.pmap.shpgperproc=256
 > > > > vm.pmap.pv_entry_max=3145728
 > > > >
 > > > >
 > > > > 'geom_vinum' is used as LVM only, no RAIDs are configured.
 > > > >
 > > > > This server is primarily a Samba server, with the SMB shares also
 > > > > exported as NFS shares for the other *nix servers around.
 > > > >
 > > > > Because this is the most heavily loaded production server, testing
 > > > > is a bit difficult and restricted to evenings and weekends.
 > > > >
 > > > > On my two other FreeBSD machines I have no problems at all. One of
 > > > > them is an identical ProLiant server with a nearly identical
 > > > > kernel config, and it runs like a charm...
 > > > >
 > > > > Does anyone have good advice or further questions?
 > > > >
 > > > >
 > > > >
 > > > > with best regards
 > > > > Matthias Schuendehuette
 > > > >
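
The frame overflow discussed in the quoted thread (a copy that assumes a
32-byte NFSv2 file handle while the local buffer is smaller, which then
overwrites the canary) can be illustrated with a small user-space sketch in C.
This is not the FreeBSD code and not the patch mentioned above. The struct
sim_frame, the helper names, the 28-byte local size and the guard value are
all hypothetical and chosen only for illustration; the only fixed fact used
is that an NFSv2 file handle is 32 bytes on the wire.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WIRE_FH_SIZE  32u          /* NFSv2 file handle size on the wire */
#define LOCAL_FH_SIZE 28u          /* hypothetical local size, smaller than 32 */
#define CANARY_VALUE  0xdeadc0deu  /* hypothetical guard value */

/*
 * Simulated stack frame: a local buffer followed directly by the canary.
 * A real frame is laid out by the compiler; this struct only models it.
 */
struct sim_frame {
    unsigned char fh[LOCAL_FH_SIZE];
    uint32_t canary;
};

/* The simulation relies on the frame being exactly one wire handle long. */
_Static_assert(sizeof(struct sim_frame) == WIRE_FH_SIZE,
    "simulated frame must be 32 bytes");

/* Models the function prologue: clear the locals and store the canary. */
static void frame_init(struct sim_frame *f)
{
    memset(f, 0, sizeof(*f));
    f->canary = CANARY_VALUE;
}

/* Models the check made before returning: is the canary still intact? */
static int canary_intact(const struct sim_frame *f)
{
    return f->canary == CANARY_VALUE;
}

/*
 * Buggy pattern: always copies the wire size, ignoring the local size.
 * The copy goes through the frame's base address so that the simulated
 * overrun stays inside the struct and the program remains well defined.
 */
static void copy_unchecked(struct sim_frame *f, const unsigned char *wire)
{
    memcpy(f, wire, WIRE_FH_SIZE);
}

/* Bounded pattern: never copies more than the local buffer can hold. */
static size_t copy_bounded(struct sim_frame *f, const unsigned char *wire,
    size_t len)
{
    size_t n = len < sizeof(f->fh) ? len : sizeof(f->fh);

    memcpy(f->fh, wire, n);
    return n;
}
```

Calling frame_init(), then copy_unchecked() with a 32-byte buffer, leaves
canary_intact() returning 0, which is the condition under which the kernel's
__stack_chk_fail() raised the panic quoted above. With copy_bounded() the
canary survives because only the first 28 bytes are copied.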



More information about the freebsd-stable mailing list