Hardware RAID vs. 'soft errors'

Dirk-Willem van Gulik dirkx at webweaving.org
Thu Apr 10 09:56:17 UTC 2014


I have two AAC hardware RAID machines, configured slightly differently. Both give (about once every other year) an error like

	g_vfs_done():aacd1p8[READ(offset=1757508976640, length=65536)]error = 5

while the hardware RAID reports healthy and passes all its validations. So do the disks (we’ve replaced disks on those machines some 4 or
5 times with no real change - the issue above still shows up every 18 months or so). As far as I can trace this through the kernel, these
errors *really* come from the ATA in the AAC - correct?
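
For reference, the fields in that log line can be pulled apart with a few lines of Python. This is only a rough sketch: the helper name decode_gvfs_error is made up for illustration, the 512-byte sector size is an assumption (check what the array actually reports), and as far as I understand the offset is relative to the start of aacd1p8 rather than the whole array.

```python
import errno
import os
import re

# Matches the g_vfs_done() line quoted above. The pattern is derived from that
# single sample, so other kernel versions may format the line differently.
LINE_RE = re.compile(
    r"g_vfs_done\(\):(?P<dev>[^\[]+)"
    r"\[(?P<op>\w+)\(offset=(?P<offset>\d+), length=(?P<length>\d+)\)\]"
    r"\s*error = (?P<err>\d+)"
)


def decode_gvfs_error(line, sector_size=512):
    """Split a g_vfs_done() error line into its fields.

    sector_size=512 is an assumption, not something the log line states.
    """
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("not a g_vfs_done() error line")
    offset = int(m["offset"])
    length = int(m["length"])
    err = int(m["err"])
    return {
        "device": m["dev"],
        "op": m["op"],
        # Symbolic errno name and the C library's message for it
        "errno_name": errno.errorcode.get(err, "unknown"),
        "errno_text": os.strerror(err),
        # Offset and length expressed in sectors of the assumed size
        "first_sector": offset // sector_size,
        "sector_count": length // sector_size,
        "sector_aligned": offset % sector_size == 0,
    }


if __name__ == "__main__":
    sample = ("g_vfs_done():aacd1p8[READ(offset=1757508976640, "
              "length=65536)]error = 5")
    for key, value in decode_gvfs_error(sample).items():
        print(f"{key}: {value}")
```

Error 5 is EIO, i.e. a generic I/O error handed up from below the filesystem, which by itself does not say which layer produced it.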

The odd thing is that it happens on two machines with slightly different AAC cards and different upgrade histories. Both are now on
9.2-RELEASE-p3, but the issue has persisted through every release since 7.2.

Any suggestions as to where to look? And specifically - is there a way in the AAC driver to intercept events/errors at an even lower level? Or
should I assume this is more in RAID card firmware territory?

Dw

