ARC size constantly shrinks, then ZFS slows down extremely

Attila Nagy bra at fsn.hu
Fri Oct 2 07:59:10 UTC 2009


On 09/29/09 12:45, Attila Nagy wrote:
> I'm using FreeBSD 8 (previously 7) on a machine with a lot of disks 
> and 32 GB RAM. With 7.x it ran very well for about 50 days, but 
> then suddenly every operation slowed down.
> gstat showed that the disks were working a lot more than usual, and 
> the zpool/zfs was pretty much unusable.
>
> I then rebooted the machine into FreeBSD 8 in the hope that the new 
> ZFS fixes would correct this issue (50 days have not passed since 
> then, so I don't know yet) and started to monitor ZFS's statistics.
>
> It seems that after a reboot, the ARC size starts to grow, then 
> something flips the switch and it changes to shrinking, instead of 
> maintaining the size.
>
> Please see the pictures here: 
> http://people.fsn.hu/~bra/freebsd/20090929-zfs-arcsize/
>
> Before the 27th, the machine ran FreeBSD 7; after that date it runs 8.
>
> As you can see, no user process took the memory, so I don't know why 
> the ARC size grows first and then starts to decrease.
>
> Could it be that the ARC size decreases by such a large amount that it 
> effectively disappears, and this causes the IO activity to go up and 
> kill the machine?
I've upgraded another machine from an older 8-CURRENT to 8-STABLE. It 
has low memory (1 GB) and it's i386.
The above symptoms can be triggered very easily: if I do an IMAP search 
on a lot of mailboxes (which I do regularly), it takes about 10 minutes 
for the IMAP server to become completely inaccessible.
The machine itself runs fine, but every operation on the ZFS pool takes 
ages. According to gstat there is only very minimal disk activity. The 
machine can't even be rebooted, at least not within ten minutes (reboot, 
wait 10 minutes, nearly nothing happens; reboot -qn makes the machine 
disappear from the net, but it doesn't restart).

Backing out this change from the 8-STABLE kernel:
http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c?r1=191901&r2=191902

makes it survive about half an hour of IMAP searching. Of course only 
time will tell whether this helps in the long run, but so far 10 out of 
10 tries succeeded in killing the machine with this method...

Based on this, I would say that this change makes things worse on both 
low-memory i386 (1 GB RAM) and "there's plenty of RAM" (32 GB) amd64 
servers.
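
For anyone who wants to check for the same grow-then-shrink pattern on 
their own machine, here is a minimal sketch in Python. It is not what 
produced the graphs linked above. It assumes the ARC size is exposed 
through the kstat.zfs.misc.arcstats.size sysctl, and the function names, 
the 50% drop threshold and the 60-second interval are all made up for 
illustration.

```python
import subprocess
import time

# Assumption: the ARC size in bytes is readable from this sysctl.
ARC_SIZE_OID = "kstat.zfs.misc.arcstats.size"


def read_arc_size():
    """Return the current ARC size in bytes by calling sysctl -n."""
    out = subprocess.check_output(["sysctl", "-n", ARC_SIZE_OID])
    return int(out.strip())


def find_peak_then_shrink(samples, min_drop=0.5):
    """Detect a grow-then-shrink pattern in ARC size samples.

    samples is a list of (timestamp, size_in_bytes) tuples in time order.
    Returns the index of the peak sample if the latest sample has fallen
    to at most (1 - min_drop) of that peak, otherwise None.
    min_drop is an arbitrary illustrative threshold.
    """
    if not samples:
        return None
    peak_idx = max(range(len(samples)), key=lambda i: samples[i][1])
    peak = samples[peak_idx][1]
    last = samples[-1][1]
    shrunk = peak > 0 and last <= peak * (1 - min_drop)
    if peak_idx < len(samples) - 1 and shrunk:
        return peak_idx
    return None


def monitor(interval=60, rounds=10):
    """Poll the ARC size a fixed number of times and report a shrink."""
    samples = []
    for _ in range(rounds):
        samples.append((time.time(), read_arc_size()))
        peak_idx = find_peak_then_shrink(samples)
        if peak_idx is not None:
            print("ARC shrank from", samples[peak_idx][1],
                  "to", samples[-1][1], "bytes")
        time.sleep(interval)
    return samples
```

Calling monitor() on the affected box while running the IMAP search 
would log whether the ARC has collapsed by the time the pool becomes 
unresponsive. find_peak_then_shrink() can also be run on samples 
collected some other way.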

