Serious Dell Sadness - H200, H700, and H800

Wed Mar 23 16:25:07 UTC 2011

Well, after letting it run all night, the patch appears to be working as 
expected.  Fantastic!  I'm putting the machine into production so the 
users can bang away at it in their own way; if there is any way of 
crashing it that I did not find, they will.  ;)

Neil, you mentioned that there may be a performance hit from the extra 
read operation the patch executes.  Does that mean there is an extra 
read for every single read or write operation, so that the number of 
I/Os to the disk doubles?  Or is it only one extra read per interrupt 
(forgive my ignorance, I'm not fully versed on how interrupts affect 
the bus)?  If the latter, would the performance hit be only 1-2% in 
practice?  If the former, would that mean a 50% performance hit?
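
To make the two cases concrete, here is a rough sketch of the arithmetic 
I have in mind.  The function name and the timings are made up for 
illustration; they are not measurements from my hardware, and I am only 
assuming that the extra read either costs about as much as a disk I/O 
(the first case) or is something much cheaper, such as a read of a 
controller register (the second case).

```c
#include <assert.h>

/*
 * Fraction of throughput lost when an operation that used to take
 * base_us microseconds now takes base_us + extra_us microseconds.
 * Hypothetical helper for back-of-envelope estimates only.
 */
static double
throughput_loss(double base_us, double extra_us)
{
	assert(base_us > 0.0 && extra_us >= 0.0);

	/* Old rate is 1/base, new rate is 1/(base + extra). */
	return extra_us / (base_us + extra_us);
}
```

With placeholder values of 100 microseconds per I/O, 
throughput_loss(100.0, 100.0) is 0.5 (the 50% case), while 
throughput_loss(100.0, 1.0) is just under 0.01 (the roughly 1% case).  
Which one applies depends on what the extra read really costs, which is 
what I am asking.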

On 03/22/11 17:58, Erich Weiler wrote:
> This is great news!  I've patched my kernel (8.2-PRERELEASE) and am 
> testing it now by running two concurrent looping iozone runs and also 
> rsyncing 1TB of data to my two SAS-chained MD1200s at the same time (via 
> my PERC H800 controller).  The disks are definitely busy but hanging in 
> there, but then again it's only been an hour.  If it's still going in 
> the morning and I see no TIMEOUT messages in my logs I'll call it a win. 
>  I'll let you guys know how that works for me.
> 
> Thanks Scott and Neil!
> 
> If this is blessed by whoever blesses such things, can it be pushed into 
> 8-STABLE?
> 
> On 3/22/11 11:43 AM, Neil Schelly wrote:
>> We have reached a conclusion on this issue, and a positive one at 
>> that.  Big credit here goes to Scott Long, who helped us debug it 
>> and provided a driver patch that has completely resolved the 
>> problem for us.  He gave permission for me to 
>> post/distribute this patch, and sees no reason it couldn't be made a 
>> part of the MFI driver base.  I've pasted it at the bottom of this 
>> message.
>>
>> His explanation centers around out-of-band interrupt synchronization 
>> on the PCI bus.  Interrupts associated with the completion of I/O 
>> operations from the card to the CPU are getting lost/ignored.  By 
>> issuing a dummy read operation (thus forcing a flush of data buffers), 
>> this issue is largely averted.  He strongly suspects that the 
>> controller firmware is de-asserting an interrupt prematurely, so that 
>> the OS never responds to the I/O operation and things just hang.  Once 
>> something like mfiutil is run, it reads from the device, unlocking the 
>> bus, and things continue as normal.  The patch adds an extra read 
>> operation to the interrupt handler, right after the interrupt is 
>> acknowledged, which keeps things flowing normally, albeit with a 
>> slight performance hit from the extra read.
>>
>> I am unsure if this completely eliminates the race condition, but it 
>> will at least have to happen in a much smaller window of time with 
>> this patch.  We have been unable to reproduce the problem while 
>> running this version.  From the sound of his explanation, it's also 
>> possible this problem doesn't exist except when accessing the card via 
>> PCI semantics.  If the device were operating in MSI mode (PCI 
>> Express), where interrupt handling is significantly different, this 
>> may not come up at all.
>>
>> Thanks again to Scott Long for the help.  Here's the patch:
>>
>> Index: mfi.c
>> ===================================================================
>> RCS file: /usr/ncvs/src/sys/dev/mfi/mfi.c,v
>> retrieving revision 1.54
>> diff -u -r1.54 mfi.c
>> --- mfi.c 7 Dec 2009 21:24:07 -0000 1.54
>> +++ mfi.c 13 Mar 2011 04:12:35 -0000
>> @@ -928,6 +928,12 @@
>>          if (sc->mfi_check_clear_intr(sc))
>>                  return;
>>
>> +        /*
>> +         * Do a dummy read to flush the interrupt ACK that we just performed,
>> +         * ensuring that everything is really, truly consistent.
>> +         */
>> +        (void)sc->mfi_read_fw_status(sc);
>> +
>>          pi = sc->mfi_comms->hw_pi;
>>          ci = sc->mfi_comms->hw_ci;
>>          mtx_lock(&sc->mfi_io_lock);
>>
>> -- 
>> Neil Schelly
>> Director of Uptime
>> Dynamic Network Services, Inc.
>> W: 603-296-1581
>> M: 508-410-4776
>> http://www.dyndns.com
>> _______________________________________________
>> freebsd-scsi at freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-scsi
>> To unsubscribe, send any mail to "freebsd-scsi-unsubscribe at freebsd.org"