AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5

d_elbracht d_elbracht at ecngs.de
Mon Oct 15 01:21:06 PDT 2007


> > we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed
> > as of 2007-10-09
> > 
> > Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x 
> > Opteron 2216, da3 is on a 3ware 9550-12
> > 
> > we are seeing this error:
> > g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
> > on a 12 GB Hyperdrive
> > 
> > the offset changes sometimes, but it is always 81064794xxxxxxxxx and
> > well outside the 12 GB range.
> 
> Yes.
> 
> > According to systat -vm, da3 does tps > 500 (yes, that's a lot)
> 
> That's not a lot :) That's actually low for a modern solid 
> state drive.
> 
> > This leads to the assumption that the error has to do with a very
> > high number of I/Os per second on an SMP machine.
> 
> Either that or file system errors. Does fsck run ok or does 
> it say anything unusual?
> 
> There are several theoretical reasons for such errors that 
> are connected with the fact you use solid state drives, but 
> all are tricky to diagnose if you don't have a certain 
> repeatable test you can try. For example:
> some SSDs optimize writes to "spread out" the IO on the 
> chips, but some do it by looking into file system structures 
> to determine where it's safe to relocate the write - 
> obviously this works only with a known and supported file 
> system. This is a really wild guess, but maybe the SSD
> firmware has an error somewhere in this area, trying to
> interpret UFS as if it were FAT? If you manage to get a
> repeatable failure test, you can try formatting the drive as
> FAT32 and trying it on that.
> 
> Or maybe it's just a bad drive...
> 
> > The system-disk is a RAID1 on an ICP 5805. All other disks (51) are 20
> > gstripe'd partitions.
> 
> 51 drives and 20 partitions?
> 
According to the manufacturer, the drive handles any filesystem. In other
words, it's as transparent as any hard disk would be.
Also, as written before, we have seen the error=5 with weird offsets on an
md (memory disk) before too.
fsck on the disk does NOT show any error.

Yes, 20 partitions on the other 51 disks (/dev/stripe/data ..datann). That's
for hashfeed from diablo.

One basic question to ask: where does the value for offset= in g_vfs_done()
come from?
Judging from the times at which the error shows up in syslog, I believe the
error only happens when a file gets appended to.
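
To make the size of the discrepancy concrete, here is a minimal Python sketch
that splits the logged offset into its upper and lower 32-bit halves and
compares it with the device size. It assumes that "12 GB" means 12 * 2^30
bytes. The helper name split_offset is made up for illustration and is not
part of any FreeBSD tool.

```python
# Assumption: the "12 GB" Hyperdrive is treated as exactly 12 GiB.
DISK_BYTES = 12 * 2**30


def split_offset(offset, disk_bytes=DISK_BYTES, block=8192):
    """Break a byte offset into 32-bit halves and compare it to the disk size.

    Hypothetical helper for illustration only.
    """
    high = offset >> 32          # upper 32 bits of the 64-bit offset
    low = offset & 0xFFFFFFFF    # lower 32 bits of the 64-bit offset
    return {
        "hex": f"{offset:#018x}",             # 16 hex digits plus "0x"
        "high32": high,
        "low32": low,
        "in_range": offset < disk_bytes,      # whole offset inside the device?
        "low32_in_range": low < disk_bytes,   # lower half alone inside it?
        "aligned": offset % block == 0,       # multiple of the read length?
    }


# The offset from the g_vfs_done() message quoted above.
info = split_offset(81064794762854400)
for key, value in info.items():
    print(f"{key:>15}: {value}")
```

For the value from the log, the upper half is nonzero (0x01200000), while the
lower half (0x57a14000, roughly 1.37 GiB) would fit on the device, and the
offset is a multiple of the 8192-byte read length. That is only an
observation about the number itself, not an answer to where it comes from.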

Dieter




More information about the freebsd-stable mailing list