bin/121684: dump(8) frequently hangs

Michael freebsdports at bindone.de
Mon Sep 1 21:28:56 UTC 2008


Sorry to scare you, I was a little unhappy about dump hanging.
Based on the CVS repository, I wrote a little patch that combines two 
changes made by scott and jeff and works against 7.0-RELEASE (looking at 
the commit logs scared me away from trying STABLE on production right 
now). After applying this patch I could run dumps successfully on seven 
machines where it hung before on every single attempt (use is at your 
own risk, of course).

cd /usr/src/sys/kern
patch < /tmp/mysleepqueue.patch
recompile and install kernel
reboot
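
For anyone who wants the steps above spelled out, here is a rough sketch. It assumes a stock source tree in /usr/src and uses GENERIC as the kernel config name, which is only a placeholder for whatever config you actually run. The run and apply_and_rebuild helpers are names made up for this example. By default the script only prints the commands; set APPLY=1 (as root, at your own risk) to really execute them.

```shell
#!/bin/sh
# Sketch only, not a tested procedure.
# Assumptions: sources live in /usr/src, KERNCONF defaults to GENERIC,
# and the patch was saved as /tmp/mysleepqueue.patch.
KERNCONF="${KERNCONF:-GENERIC}"
PATCH="${PATCH:-/tmp/mysleepqueue.patch}"

# Print the command unless APPLY=1, in which case run it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Hypothetical helper: patch, rebuild, install, reboot, stopping
# at the first step that fails.
apply_and_rebuild() {
    run cd /usr/src/sys/kern &&
    run patch -i "$PATCH" &&
    run cd /usr/src &&
    run make buildkernel KERNCONF="$KERNCONF" &&
    run make installkernel KERNCONF="$KERNCONF" &&
    run shutdown -r now
}

apply_and_rebuild
```

Running it once without APPLY set is a cheap way to check the paths and the config name before touching the kernel.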

Since I also found a fatal bug in IPv6 (panic on ping6), it might be 
better for you to wait for 7.1; for us there is no way back now.

cheers
michael

Mike Tancsa wrote:
> At 05:07 AM 9/1/2008, Derek Kuliński wrote:
> 
>> Now I'm honestly a bit scared about it (even if it will be fixed
>> before 7.1, I'm not sure I'll hurry with the update).
> 
> There have been a number of commits to releng_7 that fixed dump issues 
> for me.  A box that used to regularly exhibit hung dump processes has 
> been working fine since April.  e.g. a kernel from
> 7.0-STABLE FreeBSD 7.0-STABLE #4: Wed Apr 30
> 
> does weekly level 0 dumps and daily differential dumps on the file 
> systems below without issue
> % df -i
> Filesystem    1K-blocks      Used     Avail Capacity   iused    ifree %iused  Mounted on
> /dev/twed0s1a   2026030    284346   1579602    15%      2937   279685    1%   /
> devfs                 1         1         0   100%         0        0  100%   /dev
> /dev/twed0s1d   5077038    575828   4095048    12%      1197   658257    0%   /tmp
> /dev/twed0s1e  20308398  11072840   7610888    59%   1065406  1572416   40%   /usr
> /dev/twed0s1f  20308398  13275050   5408678    71%     13750  2624072    1%   /var
> /dev/twed0s1g 246875258 186393906  40731332    82%   9118036 22794922   29%   /zoo
> 
> However, you should test and make sure it works for you.
> 
>         ---Mike

-------------- next part --------------
--- subr_sleepqueue.c~	2008-09-01 05:14:28.000000000 +0200
+++ subr_sleepqueue.c	2008-09-01 05:14:28.000000000 +0200
@@ -177,7 +177,7 @@
 	for (i = 0; i < SC_TABLESIZE; i++) {
 		LIST_INIT(&sleepq_chains[i].sc_queues);
 		mtx_init(&sleepq_chains[i].sc_lock, "sleepq chain", NULL,
-		    MTX_SPIN);
+		    MTX_SPIN | MTX_RECURSE);
 #ifdef SLEEPQUEUE_PROFILING
 		snprintf(chain_name, sizeof(chain_name), "%d", i);
 		chain_oid = SYSCTL_ADD_NODE(NULL, 
@@ -403,12 +403,15 @@
 		mtx_unlock(&ps->ps_mtx);
 	}
 	/*
-	 * Lock sleepq chain before unlocking proc
-	 * without this, we could lose a race.
-	 */
+	 * Lock the per-process spinlock prior to dropping the PROC_LOCK
+	 * to avoid a signal delivery race.  PROC_LOCK, PROC_SLOCK, and
+	 * thread_lock() are currently held in tdsignal().
+ 	 */
+	PROC_SLOCK(p);
 	mtx_lock_spin(&sc->sc_lock);
 	PROC_UNLOCK(p);
 	thread_lock(td);
+	PROC_SUNLOCK(p);
 	if (ret == 0) {
 		if (!(td->td_flags & TDF_INTERRUPT)) {
 			sleepq_switch(wchan);

