FreeBSD 9.1 and swap on zfs

Guido Falsi mad at madpilot.net
Mon Apr 8 13:58:58 UTC 2013


On 04/08/13 15:08, Kai Gallasch wrote:
> Hi.
>
> When running a ZFS on root FreeBSD install..
>
> is it for FreeBSD 9.1 (ZFS v28) still not advisable to use a vdev as swapspace? - like:
>
> # zfs create -V 8G \
> 	-o org.freebsd:swap=on \
> 	-o sync=disabled \
> 	-o primarycache=none \
> 	-o secondarycache=none rpool/swap
>
> # swapon /dev/zvol/rpool/swap
>
> Often voiced fears for swapping on zfs are, that at the moment the server starts to swap, ZFS will start to compete for memory with the short-on-memory server itself (reason for swapping) and the system will lock up shortly after.
>
> Seems a lot of ZFS-on-root single disk setups make use of an extra swap partition on the root-disk.
>
> People booting off a mirrored zpool often seem to have a swap partition on both disks forming the zpool and use those as devices for a gmirror. They then use the gmirror device as swap.
>
> Which approach is the recommended one?
> Swapping to ZFS *or* a swap partition / gmirror on top of two partitions?
>

I can share my experience, which is not definitive but I hope can help.

I have various machines with ZFS on root, some with swap on a ZVOL and
some with swap on separate partitions (none of them mirror the swap
with gmirror, though).

There is a race condition between the ZFS ARC and the VM system under
very low memory conditions. When it triggers, the machine just starves.
I've seen this happen when running buildworld -j without enough RAM,
and also on machines running ports tinderbox or poudriere. It does not
happen with a separate swap partition: in that case the machine swaps
happily and just slows down, as you would expect when swapping a lot.

I have noticed that setting the following properties on the ZFS ZVOL can 
help some:

checksum              off
compression           off	(it's the default usually)
primarycache          metadata	(maybe none would be even better)
secondarycache        none
sync                  disabled	(in case of a system reset swap data is
				not valuable anyway)


(I'm also not really sure if setting secondarycache has any purpose when 
primarycache is metadata or none)

This will not completely solve the problem, though.
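
As a rough sketch, the properties listed above could be applied with
zfs set along these lines. The dataset name rpool/swap is taken from
the question; the helper function and the RUN variable are made up for
illustration. By default the script only prints the commands.

```sh
#!/bin/sh
# Dry run by default: the commands are only echoed.
# Run with RUN="" in the environment to actually execute them (as root).
tune_swap_zvol() {
    zvol=$1
    for prop in checksum=off compression=off primarycache=metadata \
                secondarycache=none sync=disabled; do
        # ${RUN-echo} expands to "echo" unless RUN is set (even to empty).
        ${RUN-echo} zfs set "$prop" "$zvol"
    done
}

# rpool/swap is an assumption; use the name of your own swap ZVOL.
tune_swap_zvol rpool/swap
```

Afterwards, "zfs get checksum,compression,primarycache,secondarycache,sync
rpool/swap" should show the values actually in effect.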

I'm also quite sure that limiting the ARC so it cannot take all
available memory mitigates the problem, but the basic race condition
remains.
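
One way to limit the ARC is the vfs.zfs.arc_max tunable in
/boot/loader.conf. The sketch below only prints a candidate line. The
helper name is made up, and the 50% default is an arbitrary value for
illustration, not a recommendation; the right cap depends on the
workload.

```sh
#!/bin/sh
# Print a loader.conf line capping the ARC at a percentage of physical RAM.
# Arguments: physical memory in bytes, optional percentage (default 50,
# an arbitrary illustrative value).
arc_max_line() {
    physmem_bytes=$1
    percent=${2:-50}
    echo "vfs.zfs.arc_max=\"$(( physmem_bytes * percent / 100 ))\""
}

# On FreeBSD, hw.physmem reports physical memory in bytes.
# Elsewhere the sysctl call fails and nothing is printed.
if physmem=$(sysctl -n hw.physmem 2>/dev/null); then
    arc_max_line "$physmem"
fi
```

The printed line would go into /boot/loader.conf and take effect at the
next boot.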

Also, and this is just an impression with no data to back it up, ZVOL
swap looks somewhat slower to me than a separate partition at bringing
swapped data back into RAM. But again, I don't know how to test this
properly.

I have never used mirrored swap because I don't think swap data is
valuable, but I understand that having a machine die just because it
lost half its swap can be a problem. So if you need the machine to run
rock solid through a disk failure, mirroring swap could be a really
good idea.
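
For illustration only (as said, I don't run this setup myself), a
gmirror-backed swap on two GPT disks could be set up roughly like this.
The disk names ada0/ada1, the labels swap0/swap1, the mirror name
"swap" and the 8G size are all assumptions, and the helper only prints
the commands rather than running them.

```sh
#!/bin/sh
# Print (not execute) the commands for a gmirror-backed swap device.
# Arguments: first disk, second disk, optional partition size (default 8G).
mirrored_swap_plan() {
    disk0=$1
    disk1=$2
    size=${3:-8G}
    # One swap partition per disk, each with a GPT label.
    echo "gpart add -t freebsd-swap -s $size -l swap0 $disk0"
    echo "gpart add -t freebsd-swap -s $size -l swap1 $disk1"
    # Load the GEOM mirror class, then build the mirror from both labels.
    echo "gmirror load"
    echo "gmirror label swap gpt/swap0 gpt/swap1"
    # The mirror shows up as /dev/mirror/swap.
    echo "swapon /dev/mirror/swap"
}

mirrored_swap_plan ada0 ada1
```

To make this survive a reboot you would also want
geom_mirror_load="YES" in /boot/loader.conf and an /etc/fstab line such
as "/dev/mirror/swap none swap sw 0 0".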

My suggestion is:

If you want stability and don't have specific disk layout constraints,
create a separate swap partition.

If you can afford having to hard reset the machine now and then when
high load sends it into a lockup, or you really can't make separate
partitions, just go for ZVOLs.

Just my 2 cents!

-- 
Guido Falsi <mad at madpilot.net>


More information about the freebsd-fs mailing list