kern/71771: Hang during heavy load with amr raid controller (466 series / dell perc 2 SC)

Michel Gravey michel.gravey at 7ici.biz
Wed Sep 15 09:20:26 PDT 2004


>Number:         71771
>Category:       kern
>Synopsis:       Hang during heavy load with amr raid controller (466 series / dell perc 2 SC)
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Sep 15 16:20:17 GMT 2004
>Closed-Date:
>Last-Modified:
>Originator:     Michel Gravey
>Release:        4.10-RELEASE-p2
>Organization:
7ici
>Environment:
FreeBSD proliant 4.10-RELEASE-p2 FreeBSD 4.10-RELEASE-p2 #19: Thu Aug 5 21:55:46 CEST 2004     root at proliant:/usr/src/sys/compile/PROLIANT  i386
>Description:
The hang (no panic) occurs under heavy load, after 1-5 hours of make world running at -j4 (or -j6 or -j12). The system is a ProLiant 1850R (P3 SMP) with 4 drives on a hardware RAID 5 controller, an amr 466 (Dell PERC 2 SC).
The hang occurs both with a custom kernel (make.conf: COPTFLAGS= -O2 -pipe -march=pentiumpro) and with the GENERIC kernel from a fresh install.
>How-To-Repeat:
Probably very hardware specific, but running several make buildworld/installworld passes for several hours (1-5, maybe less; it varies) with SMP enabled on an amr 466 series controller should reproduce the problem. A big dbench run doesn't reproduce it (though it may not have been tried enough times).
>Fix:
Here is a patch from cognet at freebsd.org which corrects the problem:
Index: amr.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/amr/amr.c,v
retrieving revision 1.7.2.15
diff -u -p -r1.7.2.15 amr.c
--- amr.c       22 Jul 2004 16:35:18 -0000      1.7.2.15
+++ amr.c       5 Aug 2004 11:41:30 -0000
@@ -326,7 +326,7 @@ amr_startup(void *arg)
     /*
      * Start the timeout routine.
      */
-/*    sc->amr_timeout = timeout(amr_periodic, sc, hz);*/
+    sc->amr_timeout = timeout(amr_periodic, sc, hz);
 
     return;
 }
@@ -542,14 +542,16 @@ static void
 amr_periodic(void *data)
 {
     struct amr_softc   *sc = (struct amr_softc *)data;
-
+    int s;
+                            
     debug_called(2);
 
     /* XXX perform periodic status checks here */
 
     /* compensate for missed interrupts */
+    s = splbio();
     amr_done(sc);
-
+    splx(s);
     /* reschedule */
     sc->amr_timeout = timeout(amr_periodic, sc, hz);
 }
      
>Release-Note:
>Audit-Trail:
>Unformatted:

