kern/115801: Writing of crash dumps is unreliable

Arthur Hartwig arthur.hartwig at nokia.com
Fri Aug 24 20:40:02 PDT 2007


>Number:         115801
>Category:       kern
>Synopsis:       Writing of crash dumps is unreliable
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sat Aug 25 03:40:01 GMT 2007
>Closed-Date:
>Last-Modified:
>Originator:     Arthur Hartwig
>Release:        6.2
>Organization:
Nokia
>Environment:
>Description:
In ata_start() in ata-queue.c, if the variable dumping is non-zero, the code loops calling ata_interrupt() until it returns non-zero. Unfortunately, a hardware interrupt can occur while the code is in this loop, in which case the regular interrupt handler processes the completion. As a consequence, the loop in ata_start() does not see the completion.
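
The race described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver code: struct sim_channel, sim_interrupt() and sim_poll_for_completion() are hypothetical names invented for the illustration, and the "hardware interrupt" is simulated by calling the handler once at a chosen poll iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one channel; NOT the real struct ata_channel. */
struct sim_channel {
    bool completion_pending;   /* device has signalled a completion */
    int  completions_handled;  /* completions processed so far */
};

/*
 * Stand-in for ata_interrupt(): returns non-zero only when it actually
 * processed a pending completion.
 */
int sim_interrupt(struct sim_channel *ch)
{
    assert(ch != NULL);
    if (!ch->completion_pending)
        return 0;
    ch->completion_pending = false;
    ch->completions_handled++;
    return 1;
}

/*
 * Stand-in for the dump-time loop in ata_start(), bounded by max_polls so
 * the model terminates. If hw_irq_at matches a poll iteration, a simulated
 * hardware interrupt runs the handler first (use -1 for "never").
 * Returns 1 if the loop itself saw the completion, 0 if it never did.
 */
int sim_poll_for_completion(struct sim_channel *ch, int hw_irq_at, int max_polls)
{
    assert(ch != NULL);
    for (int i = 0; i < max_polls; i++) {
        if (i == hw_irq_at)
            (void)sim_interrupt(ch);   /* handler consumes the completion */
        if (sim_interrupt(ch))
            return 1;                  /* the poller observed it */
    }
    return 0;                          /* completion was processed elsewhere */
}
```

With hw_irq_at set to -1 the loop sees the completion on its first poll. With hw_irq_at set to 0 the completion is processed exactly once, but by the simulated interrupt, and the loop runs out its poll budget without ever getting a non-zero return.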






>How-To-Repeat:

>Fix:
Suggested fix: On initiating a dump, remove the device interrupt handler so that ata_interrupt() cannot be called as a consequence of a hardware interrupt while the code is in the aforementioned loop in ata_start(). Note that it is not sufficient to disable interrupts on the device: the device interrupt line might be shared with other devices (particularly true with recent Intel chipsets), and the ATA interrupt handler will still be called when those other devices interrupt. This fix has proved reliable on a number of systems based on Intel E7520 and P5000 chipsets.
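
Why the handler has to be removed, rather than the device merely silenced, can be sketched with a toy model of a shared interrupt line. Everything here is an assumption made for illustration: struct irq_line, irq_setup(), irq_teardown() and irq_fire() are hypothetical and only loosely mirror what bus_setup_intr() and bus_teardown_intr() do in the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 4

typedef void (*irq_handler_t)(void *arg);

/* Hypothetical shared interrupt line: every registered handler is called. */
struct irq_line {
    irq_handler_t handler[MAX_HANDLERS];
    void         *arg[MAX_HANDLERS];
};

/* Register a handler; returns its slot index, or -1 if the line is full. */
int irq_setup(struct irq_line *line, irq_handler_t fn, void *arg)
{
    assert(line != NULL && fn != NULL);
    for (int i = 0; i < MAX_HANDLERS; i++) {
        if (line->handler[i] == NULL) {
            line->handler[i] = fn;
            line->arg[i] = arg;
            return i;
        }
    }
    return -1;
}

/* Remove a handler; returns 0 on success, -1 if the slot is not in use. */
int irq_teardown(struct irq_line *line, int slot)
{
    assert(line != NULL);
    if (slot < 0 || slot >= MAX_HANDLERS || line->handler[slot] == NULL)
        return -1;
    line->handler[slot] = NULL;
    line->arg[slot] = NULL;
    return 0;
}

/*
 * Any device on the line asserting it causes all registered handlers to
 * run, regardless of which device raised the interrupt.
 */
void irq_fire(struct irq_line *line)
{
    assert(line != NULL);
    for (int i = 0; i < MAX_HANDLERS; i++) {
        if (line->handler[i] != NULL)
            line->handler[i](line->arg[i]);
    }
}

/* Example handler: counts how often it was invoked. */
void count_call(void *arg)
{
    (*(int *)arg)++;
}
```

If two handlers are registered with count_call() and the line fires, both counters advance even when only one device interrupted. After irq_teardown() on the first slot, only the second counter advances, which is the behaviour the suggested fix relies on for the ATA handler during a dump.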

In ad_dump() in ata-disk.c, apply the following change:
*** 953,963 ****
--- 953,975 ----
  ad_dump(void *arg, void *virtual, vm_offset_t physical,
        off_t offset, size_t length)
  {
      struct disk *dp = arg;
      struct bio bp;
+     static int first_call = 1;
+     int ret;

+     /* Dump I/O is polled: remove the interrupt handler on the first call. */
+     if (first_call) {
+       device_t dev = device_get_parent((device_t)(dp->d_drv1));
+       struct ata_channel *ch = device_get_softc(dev);
+
+       if ((ret = bus_teardown_intr(dev, ch->r_irq, ch->ih))) {
+               printf("Failed to deregister interrupt handler: %d\n", ret);
+       }
+       first_call--;
+     }
      /* length zero is special and really means flush buffers to media */
      if (!length) {
        struct ata_device *atadev = device_get_softc(dp->d_drv1);
        int error = 0;



>Release-Note:
>Audit-Trail:
>Unformatted:

