8.1-RELEASE hangs on reboot

John Baldwin jhb at freebsd.org
Wed Dec 1 13:22:02 UTC 2010


On Tuesday, November 30, 2010 8:23:19 pm Ondřej Majerech wrote:
> Hello,
> 
> my 8.1-R system has just started hanging on reboot, specifically after
> I svn up'd my source and updated from 8.1-R-p1 to -p2.
> 
> Some kind of hang occurs on every reboot attempt. Usually it hangs at
> the "Rebooting..." message, but sometimes the thing just locks up
> before it even syncs disks. "shutdown -p now" seems to shut down the
> system successfully each time.
> 
> So I booted into single-user mode, executed "reboot" and during the
> "Syncing disks" I pressed Ctrl-Alt-Escape to break into the debugger.
> There I single-stepped with the "s" command until the thing simply
> stopped doing anything. (Even if I pressed NumLock, the LED on the
> keyboard wouldn't turn off.)
> 
> The screen content at the moment of hang is (dutifully typed over as
> the thing is dead and I don't have a serial cable):
> 
> [thread pid 12 tid 100017 ]
> Stopped at sckbdevent+0x5f: call _mtx_unlock_flags
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags: pushq %rbp
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0x1: movq %rsp,%rbp
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0x4: subq $0x20,%rsp
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0x8: movq %rbx,(%rsp)
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0xc: movq %r12,0x8(%rsp)
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0x11: movq %rdi,%rbx
> db>
> [thread pid 12 tid 100017 ]
> Stopped at _mtx_unlock_flags+0x14: movq %r13,0x10(%rsp)
> db>
> E
> 
> Including that "E" at the end.

No good ideas here, though I suspect we cleared PSL_T (the single-step
trap flag) by accident, so the machine ran on for a while past this point
before it actually hung.  The 'E' must be the start of a message on the
console.

> As I said, it's 8.1-RELEASE-p2; it's on AMD64. I'm using a custom kernel
> which only differs from GENERIC by the addition of the debugging options:
> 
> options     INVARIANTS
> options     INVARIANT_SUPPORT
> options     WITNESS
> options     DEBUG_LOCKS
> options     DEBUG_VFS_LOCKS
> options     DIAGNOSTIC
> 
> I tried rebooting with ACPI disabled, but the thing panicked on boot with:
> 
> panic: Duplicate free of item 0xffffff00025e0000 from zone
> 0xffffff00bfdcc2a0(1024)
> 
> cpuid = 0
> KDB: enter: panic
> [thread pid 0 tid 100000 ]
> Stopped at kdb_enter+0x3d: movq $0, 0x6b2d20(%rip)
> db> bt
> Tracing pid 0 tid 100000 td 0xffffffff80c63fc0
> kdb_enter() at kdb_enter+0x3d
> panic() at panic+0x17b
> uma_dbg_free() at uma_dbg_free+0x171
> uma_zfree_arg() at uma_zfree_arg+0x68
> free() at free+0xcd
> device_set_driver() at device_set_driver+0x7c
> device_attach() at device_attach+0x19b
> bus_generic_attach() at bus_generic_attach+0x1a
> pci_attach() at pci_attach+0xf1

The free() should be the one that frees the softc, but that implies the
device already had a previous driver and softc.  Maybe add some debug output
to device_set_driver() (the caller of free() in your trace) to print the
previous driver's name (and maybe the value of the pointer) before freeing
the softc.  You could then use gdb on kernel.debug with that pointer value to
figure out exactly which driver was the previous one, and check whether its
probe routine does something funky with the softc pointer.
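
To make the suggested debug output concrete, here is a rough userland sketch
of the idea, not a kernel patch.  The struct layouts, mock_device_set_driver()
and its log argument are all made up for illustration; they only loosely
follow the shape of device_set_driver() as it appears in the backtrace.  In
the real kernel the equivalent would be a printf() placed just before the
free() of the softc.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical, simplified stand-ins for the kernel's driver and device
 * structures.  The field names are illustrative, not the real layout.
 */
struct driver {
	const char *name;	/* driver name to report */
	size_t	    size;	/* size of the softc to allocate */
};

struct device {
	struct driver *driver;	/* currently assigned driver, or NULL */
	void	      *softc;	/* per-device state owned by that driver */
};

/*
 * Assign a new driver to a device.  Before the old softc is freed, log
 * the previous driver's name and both pointer values, so that a later
 * duplicate free can be traced back to the driver that owned the softc.
 * Returns 0 on success, -1 if the new softc cannot be allocated.
 */
static int
mock_device_set_driver(struct device *dev, struct driver *drv, FILE *log)
{
	assert(dev != NULL);

	if (dev->driver == drv)
		return (0);

	if (dev->softc != NULL) {
		/* The suggested debug output. */
		fprintf(log,
		    "set_driver: freeing softc %p, previous driver %s (%p)\n",
		    dev->softc,
		    dev->driver != NULL ? dev->driver->name : "(none)",
		    (void *)dev->driver);
		free(dev->softc);
		dev->softc = NULL;
	}

	dev->driver = drv;
	if (drv != NULL && drv->size > 0) {
		dev->softc = calloc(1, drv->size);
		if (dev->softc == NULL) {
			dev->driver = NULL;
			return (-1);
		}
	}
	return (0);
}
```

With output like that on the console just before the panic, the printed
driver pointer is the value to look up in gdb against kernel.debug.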

-- 
John Baldwin

