Hast locking up under 9.2

Pete French petefrench at ingresso.co.uk
Thu Nov 21 11:57:06 UTC 2013


I have had to (hopefully temporarily) disable hast on
our systems as under 9.2 I am finding that it locks up under
high disc load. This has only started being a problem since we moved
from 8-STABLE to 9-STABLE; there was no locking up before.

I have a zpool on top of the hast devices - I did have two hast devices,
but the problem still occurs with a single device. The symptoms
are that I see the "dirty" count on the master side stick at 2.0 megs
and not change, the number of writes does not change, and if I run a "sync"
command at the command line it never returns - there is no disc activity
on either the primary or the secondary side. If I leave it like this it will
eventually freeze the whole machine, but usually if I see this happening I
reboot the stuck machine.

This only happens under high levels of disc activity (in this case converting
a MySQL table from MyISAM to InnoDB, which causes a few gig of copies). However,
it is not simply high disc activity, as I can resilver the ZFS pool quite happily
without problems.

Frustratingly, I have a similar setup on a test pair of machines, but I cannot
reproduce the problem there.

I don't have any useful debugging output unfortunately, and I do
realise that "it locks up" is unhelpful! The only things
I see in the syslog are statements like this:

Nov 14 13:51:59 <daemon.err> serpentine-active hastd[1258]: [serp1] (primary) Worker process killed (pid=1520, signal=6).
Nov 14 13:51:59 <daemon.err> serpentine-passive hastd[14307]: [serp1] (secondary) Worker process exited ungracefully (pid=14638, exitcode=75).
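
For anyone sifting through a longer syslog for messages like these, here is a
minimal sketch of a parser. It assumes every line has exactly the layout of the
two samples above; the function name parse_hastd_line and the regular
expression are illustrative and not part of hast. For reference, signal 6 is
SIGABRT and exit code 75 is EX_TEMPFAIL in sysexits.h.

```python
import re

# Pattern derived only from the two sample lines above (assumption):
# other hastd messages may use a different layout and will not match.
HASTD_RE = re.compile(
    r"hastd\[(?P<hastd_pid>\d+)\]: \[(?P<resource>[^\]]+)\] "
    r"\((?P<role>primary|secondary)\) Worker process "
    r"(?P<event>killed|exited ungracefully) "
    r"\(pid=(?P<pid>\d+), (?P<kind>signal|exitcode)=(?P<value>\d+)\)"
)

# Only the two codes seen in the samples are named here.
SIGNAL_NAMES = {6: "SIGABRT"}
EXIT_NAMES = {75: "EX_TEMPFAIL"}  # from sysexits.h


def parse_hastd_line(line):
    """Return a dict describing a hastd worker failure, or None if no match."""
    m = HASTD_RE.search(line)
    if m is None:
        return None
    value = int(m.group("value"))
    names = SIGNAL_NAMES if m.group("kind") == "signal" else EXIT_NAMES
    return {
        "resource": m.group("resource"),
        "role": m.group("role"),
        "worker_pid": int(m.group("pid")),
        "kind": m.group("kind"),
        "value": value,
        "meaning": names.get(value, "unknown"),
    }
```

Feeding each syslog line through parse_hastd_line and keeping the non-None
results gives a per-resource list of worker failures on each side, which makes
it easier to line up the primary and secondary events by timestamp.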

That's about all the info I have - currently I have taken hast out of the stack
and am trying to cobble something together manually using
iscsi, but I would prefer to go back to hast if possible. Has anyone seen
anything similar, or have any suggestions?

thanks,

-pete.


More information about the freebsd-stable mailing list