Odd RAID Performance Issue

Stephen Sanders ssanders at softhammer.net
Mon Feb 13 15:04:31 UTC 2012


We have an application that logs data to one very large RAID 6 array
and updates/accesses a database on another, smaller RAID 5 array.

Both arrays are connected to the same PCIe 3ware RAID controller.  The
system has two six-core 3 GHz processors and 24 GB of RAM.  The system
is running FreeBSD 8.1.

The averaged read/write rate to the database is 2 MB/s, while the
averaged write rate to the data logging array is 300 MB/s.  Writes to
the logging array are somewhat bursty.

The problem we're encountering is that the disk subsystem appears to
'pause' periodically.  It looks as if this is a result of disk read/write
operations on the database array taking a very long time to complete
(up to 8 seconds).

When the disk read operation takes such a long time, it appears that the
system starts to run out of memory due to bio block buffering.  Most
processes end up in either getblk() or waithighrunning().
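
As a rough sanity check on the memory pressure, the sketch below
estimates how much logging data could pile up during one stall.  It
assumes the writers keep producing at the averaged 300 MB/s for the
whole 8 seconds and that nothing throttles them, which is an upper
bound rather than a measurement; bursts would push it higher, and the
buffer cache limits may block writers well before this point.

```python
def backlog_mb(write_rate_mb_s, stall_s):
    """Data (in MB) that accumulates if writers keep producing during a stall."""
    return write_rate_mb_s * stall_s


# Figures taken from this report; the "unthrottled writers" model is an assumption.
LOGGING_RATE_MB_S = 300   # averaged write rate to the logging array
WORST_STALL_S = 8         # longest operation time observed
RAM_MB = 24 * 1024        # 24 GB of RAM

backlog = backlog_mb(LOGGING_RATE_MB_S, WORST_STALL_S)
fraction = backlog / RAM_MB
print(f"backlog: {backlog} MB ({fraction:.1%} of RAM)")
```

Under those assumptions a single worst-case stall leaves roughly
2.4 GB of unwritten data outstanding.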

We've instrumented g_vfs_strategy() and bufdone_finish() using DTrace.
The indication from this effort is that a number of reads and writes are
taking 4-8 seconds.
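
For anyone wanting to do the same kind of analysis, here is a minimal
sketch of how start/done timestamps can be paired up to find the slow
operations.  The trace format ("start" or "done", a buffer address, a
timestamp in nanoseconds) and the helper names are hypothetical, not
the output of our actual DTrace script, and the sample lines are
synthetic.

```python
def pair_latencies(lines):
    """Return (buf_id, latency_seconds) for every matched start/done pair."""
    started = {}   # buffer address -> start timestamp in ns
    results = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed format
        event, buf_id, ts_ns = parts[0], parts[1], int(parts[2])
        if event == "start":
            started[buf_id] = ts_ns
        elif event == "done" and buf_id in started:
            results.append((buf_id, (ts_ns - started.pop(buf_id)) / 1e9))
    return results


def slow_ops(latencies, threshold_s=1.0):
    """Keep only the operations that took at least threshold_s seconds."""
    return [(buf_id, secs) for buf_id, secs in latencies if secs >= threshold_s]


# Synthetic sample in the assumed format: one fast operation, one slow one.
sample = [
    "start 0xa 1000000000",
    "start 0xb 1000500000",
    "done 0xb 1003500000",
    "done 0xa 7000000000",
]
latencies = pair_latencies(sample)
slow = slow_ops(latencies)
for buf_id, secs in slow:
    print(f"{buf_id}: {secs:.3f} s")
```

Keying on the buffer address assumes a given buffer has at most one
operation in flight at a time, which should be checked before relying
on the pairing.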

So far, it looks as if the disk driver and hardware are OK, as read/write
operations at that level appear to complete in the millisecond range.  We
believe that our instrumentation is pointing to something between the
VFS layer and CAM as the culprit.

We get the same result on FreeBSD 8.2 but have not yet tried FreeBSD 9.

This scenario is not limited to a single system and is occurring on a
couple of systems.

Does this sound familiar to anyone out there?

Thanks


More information about the freebsd-hackers mailing list