Hardware RAID vs. 'soft errors'
Dirk-Willem van Gulik
dirkx at webweaving.org
Thu Apr 10 09:56:17 UTC 2014
I have two machines with AAC hardware RAID, slightly differently configured. Both give (about once every other year) an error like
g_vfs_done():aacd1p8[READ(offset=1757508976640, length=65536)]error = 5
while the hardware RAID is healthy and passes all its validations. So do the disks (we've replaced disks on those machines some 4 or
5 times, with no real impact or change; the above issue still shows up every 18 months or so). As far as I can trace this through the
kernel, these errors *really* come from the ATA in the AAC. Is that correct?
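For reference, a minimal sketch (not part of any FreeBSD tool) that decodes the g_vfs_done() line quoted above. The function name decode_g_vfs_done and the 512-byte sector size are assumptions made for illustration; the offset is presumably relative to the start of the aacd1p8 partition, not the whole array.

```python
import errno
import re

# Matches the g_vfs_done() line in the format quoted in this message.
LINE_RE = re.compile(
    r"g_vfs_done\(\):(?P<dev>[^\[]+)"
    r"\[(?P<op>\w+)\(offset=(?P<offset>\d+), length=(?P<length>\d+)\)\]"
    r"error = (?P<err>\d+)"
)


def decode_g_vfs_done(line, sector_size=512):
    """Split the log line into its fields (hypothetical helper).

    sector_size=512 is an assumption; adjust it if the array reports
    a different logical sector size.
    """
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("not a g_vfs_done() line")
    offset = int(m.group("offset"))
    length = int(m.group("length"))
    err = int(m.group("err"))
    return {
        "device": m.group("dev"),
        "operation": m.group("op"),
        "first_sector": offset // sector_size,
        "sector_count": length // sector_size,
        "errno": err,
        # Symbolic name of the errno value, e.g. 5 -> EIO.
        "errno_name": errno.errorcode.get(err, "unknown"),
    }


if __name__ == "__main__":
    sample = ("g_vfs_done():aacd1p8[READ(offset=1757508976640, "
              "length=65536)]error = 5")
    for key, value in decode_g_vfs_done(sample).items():
        print(key, value)
```

Error 5 is EIO, i.e. a generic I/O error handed up from below the filesystem. The sector range can be compared against whatever the controller logs, assuming the controller reports block addresses at all.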
The odd thing is that it happens on two machines, with slightly different AAC cards and different upgrade histories. Both are now
on 9.2-RELEASE-p3, but the issue has persisted since 7.2.
Any suggestions as to where to look? Specifically, is there a way in the AAC to intercept events/errors at an even lower level? Or
should I assume this is more in RAID card firmware territory?
Dw
More information about the freebsd-hackers
mailing list