Re: speeding up zfs send | recv (update)

From: Steven Hartland <killing_at_multiplay.co.uk>
Date: Thu, 20 May 2021 10:25:30 UTC
What is your pool structure / disk types?
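
For example, the output of something like:

    zpool status -v
    zpool list -v

plus "camcontrol devlist" would show the vdev layout and the underlying disks.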

On Mon, 17 May 2021 at 16:58, mike tancsa <mike@sentex.net> wrote:

> On 5/13/2021 11:37 AM, Alan Somers wrote:
> > On Thu, May 13, 2021 at 8:45 AM mike tancsa <mike@sentex.net
> > <mailto:mike@sentex.net>> wrote:
> >
> >     For offsite storage, I have been doing a zfs send across a 10G
> >     link and
> >     noticed something I don't understand with respect to speed.  I have a
> >
> >
> > Is this a high latency link?  ZFS send streams can be bursty.  Piping
> > the stream through mbuffer helps with that.  Just google "zfs send
> > mbuffer" for some examples.  And be aware that your speed may be
> > limited by the sender.  Especially if those small files are randomly
> > spread across the platter, your sending server's disks may be the
> > limiting factor.  Use gstat to check.
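> > The usual pattern is something like this (host, port and dataset names
> > are placeholders):
> >
> >     zfs send pool/ds@snap | mbuffer -s 128k -m 1G -O rcv-host:9090
> >
> > and on the receiving box:
> >
> >     mbuffer -s 128k -m 1G -I 9090 | zfs recv -F backup/ds
> >
> > so the buffer can absorb the bursts on both ends.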
> > -Alan
>
>
> Just a quick follow up.  I was doing some tests with just mbuffer,
> mbuffer and ssh, and just ssh (aes128-gcm), with both a compressed and a
> non-compressed stream (zfs send -c vs. zfs send).  Generally, I didn't
> find much of a difference.  I was testing on a production server that is
> generally uniformly busy, so it won't be 100% reliable, but I think it's
> close enough, as there is not much variance in the background load nor
> in the results.
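>
> The pipelines were roughly of this shape, with and without -c on the
> send side (host and dataset names are placeholders):
>
>     # mbuffer + ssh
>     zfs send pool/ds@snap | mbuffer -s 128k -m 1G | \
>         ssh rcv-host "zfs recv -F backup/ds"
>
>     # ssh only, forcing the aes128-gcm cipher
>     zfs send pool/ds@snap | \
>         ssh -c aes128-gcm@openssh.com rcv-host "zfs recv -F backup/ds"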
>
> I tried this both with datasets that were backups of mail spools, so
> LOTS of little files and big directories, as well as with datasets
> containing a few big files.
>
> On the mail spool just via mbuffer (no ssh involved at all)
>
> zfs send
> summary:  514 GiByte in 1h 09min 35.9sec - average of  126 MiB/s
> zfs send -c
> summary:  418 GiByte in 1h 05min 58.5sec - average of  108 MiB/s
>
> And the same dataset, sending just through OpenSSH, took 1h:06m (zfs
> send) and 1h:01m (zfs send -c).
>
>
> On the large dataset (large VMDK files), it was a similar pattern. I did
> find one interesting thing when I was testing with a smaller dataset of
> just 12G.  As the server has 65G of RAM, 29G of it allocated to ARC,
> sending a zfs stream with -c made a giant difference. I guess there is
> some efficiency in sending something that's already compressed in ARC?
> Or maybe it's just all cache effect.
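>
> (If I'm reading the arcstats right, the compressed vs. uncompressed
> footprint in ARC can be checked with something like:
>
>     sysctl kstat.zfs.misc.arcstats.compressed_size \
>         kstat.zfs.misc.arcstats.uncompressed_size
>
> but I haven't dug into that any further.)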
>
> Testing with a dataset of about 1TB of referenced data, using mbuffer
> with and without ssh, and just ssh:
>
> zfs send with mbuffer and ssh
> summary:  772 GiByte in 51min 06.2sec - average of  258 MiB/s
> zfs send -c with mbuffer and ssh
> summary:  772 GiByte in 1h 22min 09.3sec - average of  160 MiB/s
>
> And the same dataset just with ssh -- zfs send 53min and zfs send -c 55min
>
> and just mbuffer (no ssh)
>
> summary:  772 GiByte in 56min 45.7sec - average of  232 MiB/s (zfs send -c)
> summary: 1224 GiByte in 53min 20.4sec - average of  392 MiB/s (zfs send)
>
> This seems to imply the disk is the bottleneck. mbuffer doesn't seem to
> make much of a difference either way.  Straight up ssh looks to be fine
> / best.
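>
> (Per Alan's suggestion, watching the sender with something like
>
>     gstat -p -I 1s
>
> while the stream is running should show how busy the source disks are.)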
>
> Next step is to allocate a pair of SSDs as special allocation class
> vdevs to see if it starts to make a difference for all that metadata. I
> guess I will have to send/resend the datasets to make sure they make
> full use of the special vdevs.
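>
> Probably something along these lines, with placeholder device names and
> an arbitrary small-block cutoff:
>
>     zpool add tank special mirror ada4 ada5
>     zfs set special_small_blocks=16K tank/dataset
>
> since metadata should land on the special vdev automatically, and
> special_small_blocks only matters if I also want small data blocks on
> the SSDs.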
>
>     ----Mike
>
>