AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5

Mon Oct 15 07:17:21 PDT 2007

d_elbracht wrote:
>>> we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed of
>>> 2007-10-09
>>>
>>> Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x 
>>> Opteron 2216, da3 is on a 3ware 9550-12
>>>
>>> we are seeing this error:
>>> g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
>>> on a 12 GB Hyperdrive
>>>
>>> the offset changes sometimes, but it is always 81064794xxxxxxxxx and
>>> well outside the 12 GB range.
>> Yes.
>>
>>> According to systat -vm, da3 does tps > 500 (yes, that's a lot)
>> That's not a lot :) That's actually low for a modern solid 
>> state drive.
>>
>>> This leads to an assumption, the error has to do with very high IOs 
>>> per second on a SMP machine.
>> Either that or file system errors. Does fsck run ok or does 
>> it say anything unusual?
>>
>> There are several theoretical reasons for such errors that 
>> are connected with the fact you use solid state drives, but 
>> all are tricky to diagnose if you don't have a certain 
>> repeatable test you can try. For example:
>> some SSDs optimize writes to "spread out" the IO on the 
>> chips, but some do it by looking into file system structures 
>> to determine where it's safe to relocate the write - 
>> obviously this works only with a known and supported file 
>> system. This is a really wild guess, but maybe the SSD
>> firmware has an error somewhere in this area, trying to
>> interpret UFS as if it were FAT? If you manage to get a
>> repeatable failure test, you can try formatting the drive as 
>> FAT32 and trying it on that.

Solid state drives don't behave much differently than a regular drive 
from FreeBSD's point of view.  The huge difference most people notice is 
that they perform best at their page size (or maybe what the SSD 
manufacturer might call a block size, which is not a sector size), which 
is often 128K or 256K.  IO smaller than the page size suffers a big 
penalty since most SSD devices do not have a cache onboard (although 
some do now).

>> Or maybe it's just a bad drive...

I doubt it's a bad device.

>>> The system-disk is a RAID1 on an ICP 5805. All other disks (51) are 20
>>> gstripe'd partitions.
>> 51 drives and 20 partitions?
>>
> According to the manufacturer, the drive handles any filesystem. In other
> words, it's as transparent as any harddisk would be.
> Also, as written before, we have seen the error=5 with weird offsets on an
> md (memory disk) before too.
> fsck on the disk does NOT show any error.
> 
> yes, 20 partitions on the other 51 disks (/dev/stripe/data ..datann). That's
> for hashfeed from diablo.
> 
> One basic question to ask: where does the value for offset= in g_vfs_done()
> come from?
> From the time the error shows up in syslog I believe, the error only
> happens when a file gets appended.

I wonder if (wild guess follows) there's a 32/64-bit conversion problem 
somewhere, like a 32-bit number cast to 64-bit or something.
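
One quick way to look at that guess is to split the reported offset into its 
upper and lower 32-bit halves. This is only a rough sketch in Python, not 
anything taken from the kernel; the variable names are made up for 
illustration, and the only inputs are the offset and length from the log line.

```python
# Offset and length as reported by g_vfs_done() in the log line.
offset = 81064794762854400
length = 8192

# Split the 64-bit value into two 32-bit halves (illustrative names).
high = offset >> 32
low = offset & 0xFFFFFFFF

print(f"offset = {offset:#018x}")
print(f"high   = {high:#010x}")
print(f"low    = {low:#010x} ({low} bytes)")

# Check whether the offset is aligned to the read length.
print("aligned to length:", offset % length == 0)
```

If the upper half turns out to be a small non-zero pattern while the lower 
half looks like a sane, block-aligned offset, that would at least be 
consistent with something clobbering or mis-assembling the top 32 bits. It 
doesn't prove anything by itself.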

I'd like to see a full trace to see what path it takes.  Maybe putting a 
panic in the error path would be worth doing.

Eric