Rel.10.3 zfs GEOM removal and memory leak
Peter
pmc at citylink.dinoex.sub.org
Sat Dec 3 01:13:27 UTC 2016
Question: how to get ZFS l2arc working on FBSD 10.3 (RELENG or STABLE)?
Problem using 10.3 RELENG:
The first time ZFS is invoked after boot, it deletes all device nodes
of the drive carrying the l2arc. ZFS itself accesses its slices via a
"diskid/" name, but all other access becomes impossible. In particular,
a swapspace on the same drive (NOT under ZFS) fails to activate:
> NAME                           STATE     READ WRITE CKSUM
> gr                             ONLINE       0     0     0
>   raidz1-0                     ONLINE       0     0     0
>     da0s2                      ONLINE       0     0     0
>     da1s2                      ONLINE       0     0     0
>     da2s2                      ONLINE       0     0     0
> cache
>   diskid/DISK-162020405512s1e  ONLINE       0     0     0
Here "diskid/DISK-162020405512s1e" corresponds to ada3s1e, and trying
to open a swapspace on ada3s1b now fails, because that device is no
longer present in /dev:
> root@edge:~ # gpart show ada3
> gpart: No such geom: ada3.
If we now remove the l2arc via
"zpool remove gr diskid/DISK-162020405512s1e"
then the device nodes reappear, and we can activate the swapspace.
Afterwards we can add the l2arc again, and it is then shown correctly
as "ada3s1e" - but at the next boot the problem appears again.
This problem does not exist in 10.3 STABLE, but instead there is:
Problem using 10.3 STABLE:
There seems to be a memory leak: the ARC grows above its limits, while
the space used is not accounted for in any of [MFU MRU Anon Header
Other L2Hdr].
After some time MFU+MRU shrink to the bare minimum, and the system is
fully occupied with arc_reclaim.
The behaviour seems to be triggered by writing to the l2arc. (*)
Any advice on how to proceed (or which supported version might work
better)?
(*) Addendum:
I tried to understand the phenomenon, and found this in arcstats:
(metadata_size + data_size) + hdr_size + l2_hdr_size + other_size = size
and
metadata_size + data_size = mfu_size + mru_size + anon_size + X
X is the memory leak: it never shrinks, it does not disappear when all
l2arc devices are removed, and while the l2arc is being written it
grows continually (but not linearly) until the system is quite stuck
and l2arc writes cease.
Further investigation shows that the growth of X is synchronous with
the growth of the kstat.zfs.misc.arcstats.l2_free_on_write figure.
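
For anyone who wants to watch X on their own machine, the two
identities above can be evaluated with a small script. This is only a
sketch: the function names are made up for illustration, and it assumes
that "sysctl kstat.zfs.misc.arcstats" prints one "name: value" line per
counter and that the counters named in the formulas are present.

```python
import subprocess

PREFIX = "kstat.zfs.misc.arcstats."


def parse_arcstats(text):
    """Parse 'name: value' lines into {short_name: int}.

    Assumes the line layout printed by 'sysctl kstat.zfs.misc.arcstats';
    lines that do not match are skipped.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep or not name.startswith(PREFIX):
            continue
        try:
            stats[name[len(PREFIX):]] = int(value.strip())
        except ValueError:
            continue  # non-numeric value, ignore
    return stats


def size_residual(stats):
    """size minus the sum of its components; 0 if the first identity holds."""
    parts = (stats["metadata_size"] + stats["data_size"] + stats["hdr_size"]
             + stats["l2_hdr_size"] + stats["other_size"])
    return stats["size"] - parts


def unaccounted(stats):
    """X = (metadata_size + data_size) - (mfu_size + mru_size + anon_size)."""
    buffers = stats["metadata_size"] + stats["data_size"]
    listed = stats["mfu_size"] + stats["mru_size"] + stats["anon_size"]
    return buffers - listed


def read_arcstats():
    """Hypothetical helper: fetch the live counters via sysctl(8)."""
    out = subprocess.run(["sysctl", "kstat.zfs.misc.arcstats"],
                         capture_output=True, text=True, check=True).stdout
    return parse_arcstats(out)
```

Calling read_arcstats() periodically and printing unaccounted(stats)
next to stats["l2_free_on_write"] should show whether the two figures
grow together as described.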