Regarding regular zfs

Joar Jegleim joar.jegleim at gmail.com
Mon Apr 8 08:29:54 UTC 2013


[...]"Are you deleting old snapshots after the newer snapshots have been
sent?"[...]
Yes, the script deletes old snapshots. The slave will usually hold 2
snapshots (the first being the initial snapshot received via zfs send from
the master, the second being the latest snapshot received from the master).
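
A minimal sketch of that retention rule, not the actual script: it assumes
the snapshot names are already ordered oldest to newest (for example as
printed by `zfs list -H -t snapshot -o name -s creation`) and only prints the
destroy commands rather than running them. The dataset and snapshot names
are hypothetical.

```python
def snapshots_to_destroy(snapshots):
    """Return the snapshots to delete, keeping the first and the last.

    `snapshots` is assumed to be ordered oldest to newest, so the first
    entry is the initial full snapshot and the last is the latest one.
    """
    if len(snapshots) <= 2:
        return []
    return snapshots[1:-1]


# Hypothetical snapshot names, oldest first.
names = [
    "tank/data@initial",
    "tank/data@0801",
    "tank/data@0802",
    "tank/data@0803",
]

for name in snapshots_to_destroy(names):
    # Printed only; a real script would have to run the command itself.
    print("zfs destroy " + name)
```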

[...]"Can you clarify which machine you mean by server in the last line
above.
I presume you mean the slave machine running "zfs recv".

If you monitor the "server" with "vmstat -v 1", "gstat -a" and "zfs-mon -a"
(the latter is part of ports/sysutils/zfs-stats) during the "freeze",
what do you see?  Are the disks saturated or idle?  Are the "cache" or
"free" values close to zero?" [...]
By the last line, "Everything on the server halts / hangs completely.",
I mean the 'slave' (the receiving end).
I'll check how the cache is doing, but as I wrote in my previous reply, the
'slave' server is completely unresponsive; nothing works at all for 5-15
seconds. When the server is responsive again (I can ssh in and so on), I
can't find anything in dmesg or any log hinting that anything went 'wrong'.

"There was a bug in interface between ZFS ARC and FreeBSD VM that resulted
in ARC starvation.  This was fixed between 8.2 and 8.3/9.0."
Ah, OK.

"Do you have atime enabled or disabled?  What happens when you don't run
rsync at the same time?

Are you able to break into DDB?"
atime is disabled. When I don't run rsync the server seems OK. I've tried to
detect any hang (by ssh'ing into the server and issuing various commands
such as top, ls and so on) while not rsync'ing, and there might have been a
really minor 'glitch', but it was hardly noticeable at all, and nothing
compared to those 5-15 seconds when the backup server is doing the rsync
(from the live volume, not a snapshot).
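
Checking for hangs by hand could be replaced with something like the sketch
below, which is an assumption about how one might measure the stalls, not
something that was run on the server. It sleeps in short intervals and
reports any wakeup that arrives much later than expected; the function names
and thresholds are hypothetical.

```python
import time


def find_stalls(timestamps, interval, threshold):
    """Return (start, gap) pairs where two consecutive samples are more
    than interval + threshold seconds apart."""
    stalls = []
    for earlier, later in zip(timestamps, timestamps[1:]):
        gap = later - earlier
        if gap > interval + threshold:
            stalls.append((earlier, gap))
    return stalls


def sample(duration, interval=0.5):
    """Record monotonic timestamps roughly every `interval` seconds."""
    stamps = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        stamps.append(time.monotonic())
        time.sleep(interval)
    return stamps


if __name__ == "__main__":
    # Short demo run; a real check would sample across a whole zfs receive.
    stamps = sample(duration=1.0)
    for start, gap in find_stalls(stamps, interval=0.5, threshold=1.0):
        print("stall of %.1f s starting at t=%.1f" % (gap, start))
```

Run on the slave during a receive, a 5-15 second freeze should show up as
one large gap, since the monotonic clock keeps advancing while the process
itself is not scheduled.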

I could try DDB, but I'll have to get back to you on that. I haven't
debugged the FreeBSD kernel before and the system is in production, so I
would have to be cautious. I might be able to try that out this week.

[...]Apart from the rsync whilst receiving, everything sounds OK.  It's
possible that the rsync whilst receiving is triggering a bug.[...]
I sort of think so too, at least since the whole OS is unresponsive / hangs
for anywhere from 5 to 15 seconds.

-- 
----------------------
Joar Jegleim
Homepage: http://cosmicb.no
Linkedin: http://no.linkedin.com/in/joarjegleim
fb: http://www.facebook.com/joar.jegleim
AKA: CosmicB @Freenode

----------------------

On 5 April 2013 23:12, Peter Jeremy <peter at rulingia.com> wrote:

> On 2013-Apr-05 12:17:27 +0200, Joar Jegleim <joar.jegleim at gmail.com>
> wrote:
> >I've got this script that initially zfs send's a whole zfs volume, and
> >for every send after that only sends the diff . So after the initial zfs
> >send, the diff's usually take less than a minute to send over.
>
> Are you deleting old snapshots after the newer snapshots have been sent?
>
> >I've had increasing problems on the 'slave', it seem to grind to a
> >halt for anything between 5-20 seconds after every zfs receive .
> >Everything on the server halts / hangs completely.
>
> Can you clarify which machine you mean by server in the last line above.
> I presume you mean the slave machine running "zfs recv".
>
> If you monitor the "server" with "vmstat -v 1", "gstat -a" and "zfs-mon -a"
> (the latter is part of ports/sysutils/zfs-stats) during the "freeze",
> what do you see?  Are the disks saturated or idle?  Are the "cache" or
> "free" values close to zero?
>
> ># 16GB arc_max ( server got 30GB of ram, but had a couple 'freeze'
> >situations, suspect zfs.arc ate too much memory)
>
> There was a bug in interface between ZFS ARC and FreeBSD VM that resulted
> in ARC starvation.  This was fixed between 8.2 and 8.3/9.0.
>
> >I suspect it may have something to do with the zfs volume being sent
> >is mount'ed on the slave, and I'm also doing the backups from the
> >slave, which means a lot of the time the backup server is rsyncing the
> >zfs volume being updated.
>
> Do you have atime enabled or disabled?  What happens when you don't run
> rsync at the same time?
>
> Are you able to break into DDB?
>
> >In my setup have I taken the use case for zfs send / receive too far
> >(?) as in, it's not meant for this kind of syncing and this often, so
> >there's actually nothing 'wrong'.
>
> Apart from the rsync whilst receiving, everything sounds OK.  It's
> possible that the rsync whilst receiving is triggering a bug.
>
> --
> Peter Jeremy
>

