zfs send/receive: is this slow?
Jeremy Chadwick
freebsd at jdc.parodius.com
Mon Oct 4 18:10:34 UTC 2010
On Mon, Oct 04, 2010 at 01:31:07PM -0400, Dan Langille wrote:
>
> On Mon, October 4, 2010 3:27 am, Martin Matuska wrote:
> > Try using zfs receive with the -v flag (gives you some stats at the end):
> > # zfs send storage/bacula@transfer | zfs receive -v
> > storage/compressed/bacula
> >
> > And use the following sysctl (you may set that in /boot/loader.conf, too):
> > # sysctl vfs.zfs.txg.write_limit_override=805306368
> >
> > I have good results with the 768MB writelimit on systems with at least
> > 8GB RAM. With 4GB ram, you might want to try to set the TXG write limit
> > to a lower threshold (e.g. 256MB):
> > # sysctl vfs.zfs.txg.write_limit_override=268435456
> >
> > You can experiment with that setting to get the best results on your
> > system. A value of 0 means using calculated default (which is very high).
>
> I will experiment with the above. In the meantime:
>
> > During the operation you can observe what your disks actually do:
> > a) via ZFS pool I/O statistics:
> > # zpool iostat -v 1
> > b) via GEOM:
> > # gstat -a
>
> The following output was produced while the original copy was underway.
>
> $ sudo gstat -a -b -I 20s
> dT: 20.002s w: 20.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w  %busy Name
>    7    452    387  24801    9.5     64   2128    7.1   79.4 ada0
>    7    452    387  24801    9.5     64   2128    7.2   79.4 ada0p1
>    4    492    427  24655    6.7     64   2128    6.6   63.0 ada1
>    4    494    428  24691    6.9     65   2127    6.6   66.9 ada2
>    8    379    313  24798   13.5     65   2127    7.5   78.6 ada3
>    5    372    306  24774   14.2     64   2127    7.5   77.6 ada4
>   10    355    291  24741   15.9     63   2127    7.4   79.6 ada5
>    4    380    316  24807   13.2     64   2128    7.7   77.0 ada6
>    7    452    387  24801    9.5     64   2128    7.4   79.7 gpt/disk06-live
>    4    492    427  24655    6.7     64   2128    6.7   63.1 ada1p1
>    4    494    428  24691    6.9     65   2127    6.6   66.9 ada2p1
>    8    379    313  24798   13.5     65   2127    7.6   78.6 ada3p1
>    5    372    306  24774   14.2     64   2127    7.6   77.6 ada4p1
>   10    355    291  24741   15.9     63   2127    7.5   79.6 ada5p1
>    4    380    316  24807   13.2     64   2128    7.8   77.0 ada6p1
>    4    492    427  24655    6.8     64   2128    6.9   63.4 gpt/disk01-live
>    4    494    428  24691    6.9     65   2127    6.8   67.2 gpt/disk02-live
>    8    379    313  24798   13.5     65   2127    7.7   78.8 gpt/disk03-live
>    5    372    306  24774   14.2     64   2127    7.8   77.8 gpt/disk04-live
>   10    355    291  24741   15.9     63   2127    7.7   79.8 gpt/disk05-live
>    4    380    316  24807   13.2     64   2128    8.0   77.2 gpt/disk07-live
>
> $ zpool iostat 10
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> storage     8.08T  4.60T    364    161  41.7M  7.94M
> storage     8.08T  4.60T    926    133   112M  5.91M
> storage     8.08T  4.60T    738    164  89.0M  9.75M
> storage     8.08T  4.60T  1.18K    179   146M  8.10M
> storage     8.08T  4.60T  1.09K    193   135M  9.94M
> storage     8.08T  4.60T   1010    185   122M  8.68M
> storage     8.08T  4.60T  1.06K    184   131M  9.65M
> storage     8.08T  4.60T    867    178   105M  11.8M
> storage     8.08T  4.60T  1.06K    198   131M  12.0M
> storage     8.08T  4.60T  1.06K    185   131M  12.4M
>
> Yesterday's write bandwidth was more like 80-90M. It's down, a lot.
>
> I'll look closer this evening.
>
>
> >
> > mm
> >
> > On 4. 10. 2010 4:06, Artem Belevich wrote:
> >> On Sun, Oct 3, 2010 at 6:11 PM, Dan Langille <dan at langille.org> wrote:
> >>> I'm rerunning my test after I had a drive go offline[1]. But I'm not
> >>> getting anything like the previous test:
> >>>
> >>> time zfs send storage/bacula@transfer | mbuffer | zfs receive
> >>> storage/compressed/bacula-buffer
> >>>
> >>> $ zpool iostat 10 10
> >>>                capacity     operations    bandwidth
> >>> pool         used  avail   read  write   read  write
> >>> ----------  -----  -----  -----  -----  -----  -----
> >>> storage     6.83T  5.86T      8     31  1.00M  2.11M
> >>> storage     6.83T  5.86T    207    481  25.7M  17.8M
> >>
> >> It may be worth checking individual disk activity using gstat -f 'da.$'
> >>
> >> Some time back I had one drive that was noticeably slower than the
> >> rest of the drives in RAID-Z2 vdev and was holding everything back.
> >> SMART looked OK, there were no obvious errors and yet performance was
> >> much worse than what I'd expect. gstat clearly showed that one drive
> >> was almost constantly busy with much lower number of reads and writes
> >> per second than its peers.
> >>
> >> Perhaps previously fast transfer rates were due to caching effects.
> >> I.e. if all metadata already made it into ARC, subsequent "zfs send"
> >> commands would avoid a lot of random seeks and would show much better
> >> throughput.
> >>
> >> --Artem
Please read all of the following items before responding in-line. Some
are purely informational, for the benefit of others reading the thread.
1) I don't see any indication in the thread of what "zpool status" looks
like on this machine. I'd like to assume it's the same machine as in the
other thread you started, but I can't reliably make that assumption.
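If it is, something like this would tell us (assuming "storage" is the
pool in question, per your iostat output):
# zpool status -v storage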
2) All we know about the controller/disks is from your original dmesg.
I've excerpted it below, including only the parts relevant to disks
ada[0-6]:
siis0: <SiI3124 SATA controller> port 0xdc00-0xdc0f mem 0xfbeffc00-0xfbeffc7f,0xfbef0000-0xfbef7fff irq 17 at device 4.0 on pci6
siisch0: <SIIS channel> at channel 0 on siis0
siisch1: <SIIS channel> at channel 1 on siis0
siisch2: <SIIS channel> at channel 2 on siis0
siisch3: <SIIS channel> at channel 3 on siis0
siis1: <SiI3124 SATA controller> port 0xbc00-0xbc0f mem 0xfbcffc00-0xfbcffc7f,0xfbcf0000-0xfbcf7fff irq 19 at device 4.0 on pci3
siisch4: <SIIS channel> at channel 0 on siis1
siisch5: <SIIS channel> at channel 1 on siis1
siisch6: <SIIS channel> at channel 2 on siis1
siisch7: <SIIS channel> at channel 3 on siis1
ada0 at siisch0 bus 0 scbus0 target 0 lun 0
ada0: <Hitachi HDS722020ALA330 JKAOA28A> ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada1 at siisch2 bus 0 scbus2 target 0 lun 0
ada1: <Hitachi HDS722020ALA330 JKAOA28A> ATA-8 SATA 2.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada2 at siisch3 bus 0 scbus3 target 0 lun 0
ada2: <Hitachi HDS722020ALA330 JKAOA28A> ATA-8 SATA 2.x device
ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada3 at siisch4 bus 0 scbus4 target 0 lun 0
ada3: <Hitachi HDS722020ALA330 JKAOA28A> ATA-8 SATA 2.x device
ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada4 at siisch5 bus 0 scbus5 target 0 lun 0
ada4: <Hitachi HDS722020ALA330 JKAOA28A> ATA-8 SATA 2.x device
ada4: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada4: Command Queueing enabled
ada4: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada5 at siisch6 bus 0 scbus6 target 0 lun 0
ada5: <Hitachi HDS722020ALA330 JKAOA28A> ATA-8 SATA 2.x device
ada5: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada5: Command Queueing enabled
ada5: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
ada6 at siisch7 bus 0 scbus7 target 0 lun 0
ada6: <Hitachi HDS722020ALA330 JKAOA28A> ATA-8 SATA 2.x device
ada6: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada6: Command Queueing enabled
ada6: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
3) There are no other devices on the machine which share IRQ 17 or 19,
but that aside, please provide "vmstat -i" output while the transfer is
happening. I doubt this is the problem though.
"pciconf -lvc" would also be useful (include only the siis0 and siis1
devices), just to rule out any oddities there (I doubt any).
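For example (the grep is just a rough filter to trim pciconf's output
down to the siis entries):
# vmstat -i
# pciconf -lvc | grep -A8 '^siis'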
4) Have you done "sysctl kstat.zfs.misc.arcstats" during the transfer to
see which counters get incremented? If I remember right, the ARC being
"starved" or under high utilisation can cause slow I/O, especially in
the case where you were seeing decent I/O in the past but not now. I
think kstat.zfs.misc.arcstats.memory_throttle_count monitors this, but
it'd be best to watch everything.
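A crude way to watch that particular counter while the transfer runs
(the interval is arbitrary):
# while :; do sysctl kstat.zfs.misc.arcstats.memory_throttle_count; sleep 5; done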
5) The machine has 4GB of RAM; that said, what /boot/loader.conf tunings
are in place on this system?
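For reference, 4GB systems of this vintage often carry something along
these lines; the values below are purely illustrative, not a
recommendation:
vm.kmem_size="4G"
vfs.zfs.arc_max="2G"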
6) The machine where "sudo gstat -a -b -I 20s" was run shows both reads
and writes happening at the same time (approximately 24MBytes/sec read
and 2MBytes/sec write, on each individual device). Worth noting.
7) Are the abysmal performance numbers here with or without mbuffer?
Your numbers without mbuffer were around 40MBytes/sec read and
40MBytes/sec write (per device), while with mbuffer (once tuned) they
were around 110MBytes/sec read and 80MBytes/sec write (per device).
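For reference, a tuned mbuffer pipeline usually looks something like
this (the buffer and block sizes are illustrative; substitute whatever
you settled on previously):
# zfs send storage/bacula@transfer | mbuffer -s 128k -m 1G | zfs receive storage/compressed/bacula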
8) ZFS compression is in use on the filesystems on this system.
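Worth confirming on the receiving filesystem, e.g.:
# zfs get compression,compressratio storage/compressed/bacula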
9) Since you're testing zfs send/receive, don't you need to show what's
happening on *both* systems (the sender and the recipient)?
Footnote: Please use -f '^ada[0-9]+$' with gstat in the future, to
include only the adaXX devices. The adaXXp1 and gpt entries aren't of
interest.
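i.e., something like:
$ sudo gstat -a -b -I 20s -f '^ada[0-9]+$'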
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |