i386/169838: spin lock held too long

Tig On tigger at lvlworld.com
Sat Jul 14 03:40:04 UTC 2012


>Number:         169838
>Category:       i386
>Synopsis:       spin lock held too long
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-i386
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Jul 14 03:40:03 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Tig On
>Release:        FreeBSD 8.3
>Organization:
>Environment:
FreeBSD tiger.lilypie.com 8.3-RELEASE-p3 FreeBSD 8.3-RELEASE-p3 #10: Wed Jul  4 14:33:44 EST 2012     tigger at tiger.lilypie.com:/usr/obj/usr/src/sys/TIGER  i386
>Description:
Once a week, for many years now, a back-up sh script has run over the two SCSI drives in 6 similar, but slightly different, servers.

The script is very heavy on the drives: it creates many tar files, each containing many more small files. In total, about 6.5 million small files across all servers are tar'd up by the end.

Two weeks ago, the servers were upgraded from 8.2 to 8.3. On the first back-up after the upgrade, 4 of the 6 servers went down.

This week, so far only one has gone down. The message on the console is:

spin lock 0xc0cb94b4 (smp rendezvous) held by 0xccaf78a0 (tid 100986) too long
panic: spin lock held too long
cpuid = 3

Debug options are disabled on the server in the kernel conf:
#makeoptions    DEBUG=-g
#options        KDTRACE_HOOKS           # Kernel DTrace hooks
#options        KDB                     # Kernel debugger related code
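
For reference, with debugging enabled the same lines would read roughly as follows. This is only a sketch of the kernel conf: the DDB line is an assumption and is not in the current config.

makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols
options         KDTRACE_HOOKS           # Kernel DTrace hooks
options         KDB                     # Kernel debugger related code
options         DDB                     # Assumption, not in the current conf: interactive debugger backend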

What other info can I share to help?
>How-To-Repeat:
Sadly, not a completely repeatable issue.

The first crash happened about 6 hours into the back-up. The 4th crash (last week) happened at about the 24 hour point in the back-up (which takes from 30 to 38 hours).

More than happy to try anything.
>Fix:
Wish I knew :]

>Release-Note:
>Audit-Trail:
>Unformatted:

