FreeBSD 8.0-BETA4 IBM ServerRaid 8k issues

Ivan Voras ivoras at freebsd.org
Thu Sep 10 14:51:49 UTC 2009


George Mamalakis wrote:
> Hello everybody,
> 
> Yesterday I installed FreeBSD 8.0-BETA4 on an IBM 3650, having a 

3650 M1 or M2?

> ServerRaid 8k adapter, and 6 sata disks on raid-6. The raid-6 volume was 
> "synchronizing" for a day, so this syncing process was happening while I 
> was installing fbsd on the server. During the installation I was 

I can only give you some general information.

> lock order reversal:
> 1st 0xffffff807c133540 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2559
> 2nd 0xffffff0003deb200 dirhash (dirhash) @ 
> /usr/src/sys/ufs/ufs/ufs_dirhash.c:285

You received this message because you are running a debug kernel; a 
normal kernel with debugging disabled would not have shown it.

Since you did receive it and the system managed to recover from it, 
the issue is likely harmless. It is probably related to the timeouts 
that follow.

> aac0: COMMAND 0xffffff80003e08a0 TIMEOUT AFTER 40 SECONDS
> aac0: COMMAND 0xffffff80003d5070 TIMEOUT AFTER 40 SECONDS
> aac0: COMMAND 0xffffff80003e0d00 TIMEOUT AFTER 40 SECONDS
> aac0: COMMAND 0xffffff80003d9440 TIMEOUT AFTER 40 SECONDS
> ....
> 
> ...and kept on like that, for many many lines, with decreasing timeouts. 

It looks like the controller was too busy rebuilding to take any new 
requests. It is possible you have filled the controller's write cache 
and that is why the lag happened at this point. You can easily test this 
theory.
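
One way such a test could look (this is only a sketch of my own, not a 
ServeRAID-specific tool): time small synchronous writes on the array, 
once while the controller is resyncing and once when it is idle, and 
compare the numbers. The function names, block size and write count 
below are arbitrary choices made for illustration. If latency stays low 
for short bursts but climbs sharply under sustained writes during a 
resync, that would be consistent with a full write cache, though it 
would not prove it.

```python
import os
import tempfile
import time


def probe_write_latency(path, block_size=64 * 1024, count=50):
    """Write `count` blocks to `path`, calling fsync after each one.

    Returns the per-write latencies in seconds. The fsync asks the OS to
    push each block to the storage device before the next write starts.
    """
    block = b"\0" * block_size
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(count):
            start = time.monotonic()
            os.write(fd, block)
            os.fsync(fd)
            latencies.append(time.monotonic() - start)
    finally:
        os.close(fd)
    return latencies


def summarize(latencies):
    """Return the minimum, median and maximum of a list of latencies."""
    ordered = sorted(latencies)
    return {
        "min": ordered[0],
        "median": ordered[len(ordered) // 2],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Create the scratch file in the current directory, so run this
    # from a directory that lives on the RAID volume under test.
    with tempfile.NamedTemporaryFile(dir=".", delete=False) as tmp:
        target = tmp.name
    try:
        print("fsync write latency (s):", summarize(probe_write_latency(target)))
    finally:
        os.remove(target)
```

Raising `count` or `block_size` makes the run long enough to exhaust a 
cache, so comparing a short run against a long one is the interesting 
part.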

> Once the syncing process stopped, everything came back to normal (not 
> that I have stress-tested the machine, to be honest...). But since it 
> happened once, during this specific procedure, then maybe it could also 
> happen when the raid controller is reconstructing its volumes; and this 
> would be very annoying, as far as the server's efficiency (and/or maybe 
> stability) is concerned.

Yes, you are right. But if the controller is the issue here, there is 
not much you can do about it. If it has a "priority" setting between 
normal usage and rebuilding/resyncing you might alter it to favour 
normal usage.

The initial build of a RAID 5/6 array is also a special case, as it 
touches all drives and, in some implementations (I don't know 
specifically about ServeRAID), rewrites every drive in its entirety.

If you determine rebuild/resync is problematic, you might consider using 
a RAID mode that doesn't require it to be so extensive, like RAID 10 
with 4+ drives, or software RAID with ZFS.
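
The trade-off with RAID 10 is usable space. As a rough sketch, assuming 
equal-sized drives, the usual double-parity layout for RAID 6 and 
two-way mirrors for RAID 10 (the function name and the minimum drive 
counts are my own illustrative choices, not ServeRAID limits):

```python
def usable_drives(level, n):
    """Return how many drives' worth of space remains usable.

    Assumes n equal-sized drives. RAID 6 spends two drives' worth of
    space on parity; RAID 10 (two-way mirrors) spends half of them.
    """
    if level == "raid6":
        if n < 4:
            raise ValueError("this sketch assumes at least 4 drives for RAID 6")
        return n - 2
    if level == "raid10":
        if n < 4 or n % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return n // 2
    raise ValueError("unknown RAID level: %r" % (level,))


if __name__ == "__main__":
    for level in ("raid6", "raid10"):
        print(level, "with 6 drives ->", usable_drives(level, 6), "drives usable")
```

With the 6 disks in this machine that is 4 drives' worth of space for 
RAID 6 against 3 for RAID 10.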



More information about the freebsd-stable mailing list