smbfs crashes since approx. 10.1-RELEASE
John Baldwin
jhb at freebsd.org
Wed Oct 7 00:09:13 UTC 2015
On Monday, October 05, 2015 06:16:54 PM Rick Macklem wrote:
> Christian Kratzer wrote:
> > Hi,
> >
> > I run a regular rsync job that runs from cron and copies stuff that gets
> > created on a Windows smbfs share.
> >
> > Starting about 10.1-RELEASE the VM has become unstable and started panicking.
> >
> > I have narrowed the issue down to the aforementioned rsync job.
> >
> > When I move the job to a different VM, the other VM starts crashing and
> > the VM without the job becomes stable again.
> >
> > I have panics and crashinfos stored in /var/crash if anybody is interested:
> >
> > root at noc2:/var/crash # uname -a
> > FreeBSD noc2.cksoft.de 10.2-RELEASE FreeBSD 10.2-RELEASE #0 r286666: Wed
> > Aug 12 15:26:37 UTC 2015
> > root at releng1.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
> > root at noc2:/var/crash # freebsd-version -u
> > 10.2-RELEASE-p5
> > root at noc2:/var/crash # freebsd-version -k
> > 10.2-RELEASE
> > root at noc2:/var/crash #
> >
> > This is what I have in /var/crash/core.txt.0
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = 00
> > fault virtual address = 0x20
> > fault code = supervisor read data, page not present
> > instruction pointer = 0x20:0xffffffff80996c7c
> > stack pointer = 0x28:0xfffffe003d6c0ac0
> > frame pointer = 0x28:0xfffffe003d6c0af0
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags = resume, IOPL = 0
> > current process = 1349 (smbiod10)
> > trap number = 12
> > panic: page fault
> > cpuid = 0
> > KDB: stack backtrace:
> > #0 0xffffffff80984e30 at kdb_backtrace+0x60
> > #1 0xffffffff809489e6 at vpanic+0x126
> > #2 0xffffffff809488b3 at panic+0x43
> > #3 0xffffffff80d4aadb at trap_fatal+0x36b
> > #4 0xffffffff80d4addd at trap_pfault+0x2ed
> > #5 0xffffffff80d4a47a at trap+0x47a
> > #6 0xffffffff80d307f2 at calltrap+0x8
> > #7 0xffffffff8092ebe0 at __mtx_unlock_sleep+0x60
> > #8 0xffffffff8092eb69 at __mtx_unlock_flags+0x69
> > #9 0xffffffff81a1b724 at smb_iod_thread+0xb4
> > #10 0xffffffff8091244a at fork_exit+0x9a
> > #11 0xffffffff80d30d2e at fork_trampoline+0xe
> > Uptime: 2h43m55s
> > Dumping 103 out of 999 MB: (CTRL-C to abort)
> > ..16%..31%..47%..62%..78%..93%
> >
> This crash is occurring when doing an mtx_unlock(&Giant). Unfortunately, I'm not
> conversant w.r.t. this code. I've cc'd jhb@ in case he has some insight.
> If you don't get any responses, I'd suggest reposting to freebsd-current@ with
> "crashes in mtx_unlock(&Giant)" in the subject line.
>
> Btw John, the code does tsleep() in a loop before the mtx_unlock(&Giant). I do
> remember that was once allowed, but am not sure if it still is (ie a tsleep() call
> while holding Giant)?
>
> Hopefully someone who knows what is special about Giant that might cause this will
> respond.
>
> Good luck with it, rick
Calling tsleep() while holding Giant is still allowed. However, this sort of
panic usually means you unlocked a mutex you didn't hold (and that INVARIANTS
is not enabled, or you would have hit an assertion failure earlier).
I don't see anything obviously wrong in smb_iod_thread(), however.
If you have the crashdump, can you please run this in kgdb:
frame 9
p (struct mtx *)c
p *(struct mtx *)c
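
To illustrate the "unlocked a mutex you didn't hold" case, here is a minimal
userspace sketch. It is not the FreeBSD mutex code: struct toy_mtx, the
toy_* functions and the integer thread ids are all hypothetical names made up
for this example, and the sketch is single-threaded with no atomics. It only
shows the difference between an unlock that trusts the caller and one that
checks ownership first, which is roughly the kind of check an assertion-enabled
build would give you.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy lock for illustration only; NOT FreeBSD's struct mtx. */
#define TOY_UNOWNED ((uintptr_t)0)

struct toy_mtx {
    uintptr_t owner;    /* id of the owning thread, or TOY_UNOWNED */
};

/* Acquire the lock for thread `self` (single-threaded sketch, no contention). */
static void
toy_lock(struct toy_mtx *m, uintptr_t self)
{
    assert(m->owner == TOY_UNOWNED);
    m->owner = self;
}

/*
 * Unchecked unlock: trusts the caller completely. If the caller does not
 * hold the lock, the lock state is silently corrupted and the failure
 * shows up later, somewhere else.
 */
static void
toy_unlock_unchecked(struct toy_mtx *m)
{
    m->owner = TOY_UNOWNED;
}

/*
 * Checked unlock: refuses to release a lock that `self` does not own and
 * leaves the lock state untouched in that case.
 */
static bool
toy_unlock_checked(struct toy_mtx *m, uintptr_t self)
{
    if (m->owner != self)
        return false;
    m->owner = TOY_UNOWNED;
    return true;
}
```

With the checked variant, a stray unlock is reported at the call site instead
of surfacing later as an unrelated-looking fault.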
--
John Baldwin