kern/114370: [hang] 6.2 kernel with SMP options hangs when dumping core on dual cpu board

Thu May 1 22:20:05 UTC 2008

The following reply was made to PR kern/114370; it has been noted by GNATS.

From: "Dorr H. Clark" <dclark at engr.scu.edu>
To: bug-followup at FreeBSD.org
Cc: yrao at force10networks.com, smp at FreeBSD.org, bugs at FreeBSD.org,
        dclark at applmath.scu.edu
Subject: Re: kern/114370: [hang] 6.2 kernel with SMP options hangs when
 dumping core on dual cpu board
Date: Thu, 1 May 2008 14:12:14 -0700 (PDT)

 http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/114370

 We believe we have reproduced this issue on 6.3. We have test code
 that helps reproduce it, and we have a proposed fix.

 Our version of the symptom is slightly different in that we get
 a couple of numbers into the block countdown of the core dump,
 but otherwise it is the same.

 Note that this test code exists solely to explore the hypothesis
 behind the fix. It is not required to exhibit the issue, but it
 makes the issue convenient to trigger on an SMP/GENERIC kernel
 (i.e., no special config).

 We have added two commands, FIOCNCR1 and FIOCNCR2, to the ioctl system
 call, which is implemented in kern/sys_generic.c.
 This is throwaway code added only to reproduce the issue.

 <test code begins>

 #define FIOCNCR1 _IO('f', 3)
 #define FIOCNCR2 _IO('f', 4)

         case FIOCNCR1:
                 mtx_lock_spin(&sched_lock);
                 sched_bind(curthread, 0);
                 mtx_unlock_spin(&sched_lock);
                 while(ncr1) {
                         DELAY(100000);
                         yield(curthread, NULL);
                 }
                 return (0);

         case FIOCNCR2:
                 mtx_lock_spin(&sched_lock);
                 sched_bind(curthread, 1);
                 mtx_unlock_spin(&sched_lock);
                 while(ncr1) {
                         if (ncr2) {
                                 panic("force panic on CPU 1");

                         }
                         DELAY(100000);
                         yield(curthread, NULL);
                 }
                 return (0);

 <test code ends>

 Here is our explanation of the issue.

 If CPU 1 is generating a dump, it never leaves the following
 loop in ata-queue.c (ata_start(), line 213):

                 if (dumping) {
                     mtx_unlock(&ch->state_mtx);
                     mtx_unlock(&ch->queue_mtx);
                     while (!ata_interrupt(ch))
                         DELAY(10);
                     return;
                 }

 The stack trace looks like this:

 DELAY(a) at DELAY+0x92
 ata_start() at ata_start+0x313
 ata_queue_request() at ata_queue_request+0x27f
 ad_strategy() at ad_strategy+0x169
 ad_dump() at ad_dump+0xa4
 cb_dumpdata() at cb_dumpdata+0x100
 foreach_chunk() at foreach_chunk+0x23
 dumpsys() at dumpsys+0x1ec
 doadump() at doadump+0x48
 boot() at boot+0x4ea
 panic() at panic+0x1c9
 trap_fatal() at trap_fatal+0x31e
 trap_pfault() at trap_pfault+0x1d7
 trap() at trap+0x309
 calltrap() at calltrap+0x5

 Basically, a request is issued to the disk and the dumping thread polls
 for the disk I/O to complete. Interrupts are not turned off, so the
 interrupt thread for the disk controller processes the disk I/O
 completion instead. The polling thread is not aware of this and waits
 forever for ata_interrupt() to return a non-zero value. (If interrupts
 were turned off, ata_interrupt() would have returned a non-zero value.)

 The proposed patch makes ata_interrupt() return 1 if there are no running
 requests and dumping is in progress. The patch has no effect while
 dumping is not in progress. With this patch, a correct dump is
 generated (we forced a panic from the slave) and kgdb can read the dump.
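
 The race and the effect of the patch can be modelled in user space. The
 following is only a minimal sketch under stated assumptions, not kernel
 code: model_channel, model_dumping, model_interrupt_old(),
 model_interrupt_fixed(), model_poll() and model_race() are hypothetical
 names invented for illustration, and the polling loop is bounded so that
 the "hang" shows up as a give-up result instead of an endless loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the channel state discussed above. */
struct model_channel {
    const void *running;        /* non-NULL while a request is in flight */
};

/* Hypothetical stand-in for the kernel's "dumping" flag. */
bool model_dumping = true;

/* Unpatched behavior: report 0 whenever no request is running. */
int model_interrupt_old(struct model_channel *ch)
{
    if (ch->running == NULL)
        return 0;
    ch->running = NULL;         /* complete the in-flight request */
    return 1;
}

/* Proposed behavior: while dumping, "nothing running" counts as done. */
int model_interrupt_fixed(struct model_channel *ch)
{
    if (ch->running == NULL)
        return model_dumping ? 1 : 0;
    ch->running = NULL;
    return 1;
}

/*
 * Bounded model of the polling loop in ata_start().
 * Returns the number of polls needed, or -1 if it gave up.
 */
int model_poll(int (*intr)(struct model_channel *),
               struct model_channel *ch, int max_polls)
{
    for (int i = 1; i <= max_polls; i++) {
        if (intr(ch))
            return i;
    }
    return -1;
}

/*
 * Model of the race: the interrupt thread completes the request
 * first, and only then does the dumping thread start polling.
 */
int model_race(int (*intr)(struct model_channel *))
{
    static const int request = 0;
    struct model_channel ch = { &request };
    int completed = intr(&ch);  /* interrupt thread wins the race */

    assert(completed == 1);
    return model_poll(intr, &ch, 1000);
}
```

 In this model, model_race(model_interrupt_old) returns -1 because the
 poller never sees a completion, while model_race(model_interrupt_fixed)
 returns 1 as long as model_dumping is set.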

 An alternative solution may be to disable interrupts across the system,
 but this is not currently done in FreeBSD 6.3. Note this code in boot()
 in kern_shutdown.c:

         /* XXX This doesn't disable interrupts any more.  Reconsider? */
         splhigh();

         if ((howto & (RB_HALT|RB_DUMP)) == RB_DUMP && !cold && !dumping)
                 doadump();

 In the context of the current code, here is a proposed fix:

 @@ -315,16 +315,34 @@
  {
      struct ata_channel *ch = (struct ata_channel *)data;
      struct ata_request *request;

 +#if defined(FIX114370)
 +    int rv = 0;
 +#endif

      mtx_lock(&ch->state_mtx);

      do {

         /* ignore interrupt if its not for us */

 +#if defined(FIX114370)
 +       if (ch->hw.status && !ch->hw.status(ch->dev)) {
 +           if ((dumping) && (ch->running == NULL))
 +               rv = 1;
 +           break;
 +       }
 +
 +       /* do we have a running request */
 +       if (!(request = ch->running)) {
 +           if (dumping)
 +               rv = 1;
 +           break;
 +       }
 +#else

        if (ch->hw.status && !ch->hw.status(ch->dev))
            break;

         /* do we have a running request */
         if (!(request = ch->running))
             break;
 +#endif

         ATA_DEBUG_RQ(request, "interrupt");

 @@ -349,7 +367,11 @@

         }
      } while (0);
      mtx_unlock(&ch->state_mtx);

 +#if defined(FIX114370)
 +    return rv;
 +#else
      return 0;
 +#endif

  }

  /*

 If someone can explain why this is not a fix, identify ill side effects,
 or propose a better solution, please respond.

 Thanks,

 Chitti Nimmagadda
 Engineer

 Dorr H. Clark
 Advisor

 Graduate School of Engineering
 Santa Clara University
 Santa Clara, CA