i386/169838: spin lock held too long
Tig On
tigger at lvlworld.com
Sat Jul 14 03:40:04 UTC 2012
>Number: 169838
>Category: i386
>Synopsis: spin lock held too long
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: freebsd-i386
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sat Jul 14 03:40:03 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator: Tig On
>Release: FreeBSD 8.3
>Organization:
>Environment:
FreeBSD tiger.lilypie.com 8.3-RELEASE-p3 FreeBSD 8.3-RELEASE-p3 #10: Wed Jul 4 14:33:44 EST 2012 tigger at tiger.lilypie.com:/usr/obj/usr/src/sys/TIGER i386
>Description:
Once a week, for many years now, a back-up sh script has run over the two SCSI drives in 6 similar, but slightly different, servers.
The script is very heavy on the drives: it creates many tar files, each containing many more small files. In total, about 6.5 million small files across all servers are tar'd up by the end.
Two weeks ago, the servers were upgraded from 8.2 to 8.3. On the first back-up after the upgrade, 4 of the 6 servers went down.
This week, only one has gone down so far. The message on the console is:
spin lock 0xc0cb94b4 (smp rendezvous) held by 0xccaf78a0 (tid 100986) too long
panic: spin lock held too long
cpuid = 3
Debug options are disabled on the server, in the kernel conf:
#makeoptions DEBUG=-g
#options KDTRACE_HOOKS # Kernel DTrace hooks
#options KDB # Kernel debugger related code
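For reference, a sketch of the same three lines with the comment markers removed, which is how these options would be switched back on. This assumes the rest of the TIGER kernel configuration stays as it is and that the kernel is rebuilt and reinstalled afterwards; it has not been tried on these servers:

```
makeoptions DEBUG=-g          # Build kernel with gdb(1) debug symbols
options KDTRACE_HOOKS         # Kernel DTrace hooks
options KDB                   # Kernel debugger related code
```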
What other info can I share to help?
>How-To-Repeat:
Sadly, not a completely repeatable issue.
The first crash happened about 6 hours into the back-up. The 4th crash (last week) happened at about the 24-hour point of the back-up (which takes from 30 to 38 hours).
More than happy to try anything.
>Fix:
Wish I knew :]
>Release-Note:
>Audit-Trail:
>Unformatted: