seeing data corruption with zfs trim functionality
Ajit Jain
ajit.jain at cloudbyte.com
Thu May 23 10:33:30 UTC 2013
Sure Steven,
I'll apply the patches and update ASAP.
thanks
ajit
On Thu, May 23, 2013 at 3:03 PM, Steven Hartland <killing at multiplay.co.uk> wrote:
> I've attached the two patch sets I'm looking to MFC to stable-9; one
> adds the BIO_DELETE CAM changes and the other is ZFS TRIM support.
>
> They should both apply cleanly to stable-9, if you could test with
> those on your machine and let me know.
>
> Regards
> Steve
>
> ----- Original Message ----- From: "Ajit Jain" <ajit.jain at cloudbyte.com>
>
>
> Hi Steven,
>>
>> FW version on the setup is P15.
>> I will upgrade the FW to P16, but I think my
>> best bet will be to update the code base to stable-9 since, unlike you,
>> I was seeing corruption for all three delete methods.
>>
>> thanks
>> ajit
>>
>> On Sat, May 18, 2013 at 4:15 AM, Steven Hartland <killing at multiplay.co.uk> wrote:
>>
>> ----- Original Message ----- From: "Steven Hartland" <killing at multiplay.co.uk>
>>>
>>>
>>>> After initially seeing no issues, our overnight monitoring started
>>>> moaning big time on the test box. So we checked and there was zpool
>>>> corruption as well as a missing boot loader and a corrupt GPT, so I
>>>> believe we have reproduced your issue.
>>>>
>>>> After recovering the machine I created 3 pools on 3 different disks each
>>>> running a different delete_method.
>>>>
>>>> We then re-ran the tests, which resulted in the pool running with
>>>> delete_method WS16 being so broken it had suspended IO. A reboot
>>>> resulted in it once again reporting no partition table via gpart.
>>>>
>>>> A third test run again produced a corrupt pool for WS16.
>>>>
>>>> I've conducted a preliminary review of the CAM WS16 code path along
>>>> with the SBC-3 spec, which didn't identify any obvious issues.
>>>>
>>>> Given we're both using LSI 2008 based controllers it could be a FW
>>>> issue specific to WS16, but that's just speculation atm, so I'll
>>>> continue to investigate.
>>>>
>>>> If you could re-test at your end without using WS16, to see if you can
>>>> reproduce the problem with either UNMAP or ATA_TRIM, that would be a
>>>> very useful data point.
>>>>
>>>>
>>> After much experimentation I narrowed this down to a test case of a
>>> single delete that was causing disk corruption for us (it deleted the
>>> partition table instead of data in the middle of the disk).
>>>
>>> The conclusion is that an LSI 2008 HBA with FW below P13 will eat the
>>> data on your SATA disks if you use WS16, due to the following bug:
>>> SCGCQ00230159 (DFCT) - Write same command to a SATA drive that doesn't
>>> support SCT write same may write wrong region.
>>>
>>> After updating to P16 here, the corruption issue is no longer
>>> reproducible. We would generally be running P16, but the test box was
>>> new and hadn't been updated yet.
>>>
>>> So Ajit, please check your FW version; I'm hoping to hear you're on
>>> something below P13, P12 possibly?
>>>
>>> If so then this is your issue; to fix it, simply update to P16 and the
>>> problem should be gone.
>>>
>>>
>>> Regards
>>> Steve
>>>
>>>
>>> ================================================
>>>
>>> This e.mail is private and confidential between Multiplay (UK) Ltd. and
>>> the person or entity to whom it is addressed. In the event of misdirection,
>>> the recipient is prohibited from using, copying, printing or otherwise
>>> disseminating it or any information contained in it.
>>> In the event of misdirection, illegible or incomplete transmission please
>>> telephone +44 845 868 1337
>>> or return the E.mail to postmaster at multiplay.co.uk.
>>>
>>>
>>>
>>
>
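For reference, the firmware conclusion quoted above (LSI 2008 HBA, WS16, FW below P13) can be written down as a small check. This is only a minimal sketch in Python: it assumes firmware releases are named "P" plus a phase number, as they are in this thread, and the names MIN_FIXED_PHASE, parse_phase and ws16_defect_applies are made up for illustration. It does not query the controller, and it covers only defect SCGCQ00230159, not any other source of corruption.

```python
import re

# Assumption taken from the thread: firmware below P13 is affected by
# SCGCQ00230159 when WS16 is used on SATA disks behind an LSI 2008 HBA.
MIN_FIXED_PHASE = 13


def parse_phase(label: str) -> int:
    """Turn a phase label such as "P15" into the integer 15."""
    match = re.fullmatch(r"\s*[Pp](\d+)\s*", label)
    if match is None:
        raise ValueError(f"unrecognised firmware phase label: {label!r}")
    return int(match.group(1))


def ws16_defect_applies(label: str) -> bool:
    """Return True if the phase is below the one the thread reports as fixed."""
    return parse_phase(label) < MIN_FIXED_PHASE


if __name__ == "__main__":
    for fw in ("P12", "P15", "P16"):
        state = "affected" if ws16_defect_applies(fw) else "not affected"
        print(f"{fw}: {state} by the WS16 defect")
```

Note that P15, the version reported at the top of this message, is not flagged by this check.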
More information about the freebsd-fs mailing list