[PATCH] Stability fixes for IPS driver for 4.x
David Sze
dsze at alumni.uwaterloo.ca
Tue Apr 12 20:57:04 PDT 2005
At 12:26 PM 12/04/2005 -0600, Scott Long wrote this to All:
>David Sze wrote:
>>At 11:31 PM 10/04/2005 -0600, Scott Long wrote this to All:
>>
>>>Making a driver PAE-ified means either teaching it to do 64-bit
>>>scatter-gather (assuming that the peripheral hardware can do this
>>>and that it's documented), or teaching the driver to correctly handle
>>>EINPROGRESS from bus_dmamap_load() along with using the proper busdma
>>>tag limits. The strategy I took with 6.x/5.x was the second one since
>>>I didn't have good IPS docs in front of me and I wanted it to follow the
>>>APIs correctly. I did test it with 8GB of memory and it performed
>>>correctly under load. I haven't taken a close enough look at your
>>>MFC patch to say for sure if it's correct or not. I'm not sure if
>>>I'll have time to take another look in the next few days, unfortunately.
>>>Is there any chance you could test 5.x/6.0 under load with PAE just to
>>>validate the assertion that it works correctly there?
>>
>>I had a chance to test 5.4-RC1 (i386) today with GENERIC, SMP, PAE, and
>>SMP-PAE kernels (the last one is just PAE with "options SMP").
>>To recap, the hardware is an IBM xSeries 346, Dual Xeon 3GHz (non-EM64T),
>>ServeRAID-7K.
>>GENERIC and SMP survived "make buildkernel", but PAE and SMP-PAE panicked
>>reproducibly doing the same. The DDB stack trace doesn't appear to be
>>anywhere near the IPS driver though, so I'm way out of my league.
>
>Darnit, hard to say if this is an existing bug in 5.4 or if it's a
>bug/corruption in ips. Can you re-run with PAE disabled?
Works fine with PAE disabled (or at least I couldn't get it to panic), both
UP and SMP kernels.
>Would you be
>willing to put the Giant lock back on top of the driver? This would
>mean modifying the call to bus_intr_config(), adding the D_GIANTNEEDED
>flag to the disk structure in disk_create(), and switching the mutex
>argument in bus_dma_tag_create() for the sg_dmatag tag.
I put Giant back in as you described (patch attached), but it still
panicked with PAE enabled, both UP and SMP kernels. The stack trace was
very similar; the fault address (0x24) and the top three stack frames were
the same as without Giant:
propagate_priority
turnstile_wait
_mtx_lock_sleep
At this point I no longer have access to the hardware; the customer wanted
his servers back. They're going into the datacenter with RELENG_4 (w/IPS
stability patch), without PAE (so the top ~900MB of his 4GB RAM is lost to
PCI-X address space).
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ips.RELENG_5_4.giant.patch
Type: application/octet-stream
Size: 4351 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20050412/9c83bda7/ips.RELENG_5_4.giant.obj