11-STABLE vs 11.0-RELENG test
Alex Tutubalin
lexa at lexa.ru
Tue Dec 6 16:20:09 UTC 2016
Followup, same box as in 1st message, but with different HBA and 5 HDD:
Summary:
1) read speed depends on 'on-disk' data, not by code used while read.
2) FreeBSD 11.0-releng (svn up today) creates 'fast' big files (420+
Mb/s read speed)
3) FreeBSD 11-STABLE creates 'slow' big files. (~200 Mb/s)
4) zfs send slow-dataset | zfs recv (under 11.0-releng) creates
'intermediate' files (320Mb/s)
5) file copy slow-file ... (under 11.0-releng) creates fast copy
(400+Mb/s)
So, ZFS write code in 11-STABLE looks broken.
Here detailed log of my tests:
Box: Intel i5-2400, 16Gb of RAM, Adaptec 5805, 5x4Tb drives (HGST NAS
drives) in JBOD, merged in RAIDZ2, enough free space:
$ zpool list
NAME SIZE ALLOC FREE EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
zdata 18,1T 5,11T 13,0T - 9% 28% 1.00x ONLINE -
tunables in loader.conf:
vm.kmem_size="15G"
vfs.zfs.arc_max="14G"
Boot 11.0-releng:
$ uname -a
FreeBSD iscsi.lexa.ru 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #7
r309605: Tue Dec 6 12:21:10 MSK 2016
lexa at iscsi.lexa.ru:/usr/obj/usr/src/sys/GENERIC amd64
create separate dataset for test:
# zfs create -o recordsize=1m -o mountpoint=/test-11releng zdata/11releng
write 100gb file:
# dd if=/dev/zero of=/test-11releng/100g-created-on-11releng bs=1m
count=100k
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 257.686751 secs (416684917 bytes/sec)
reboot (still 11.0-releng) and read test file again:
Cold read:
$ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 250.526067 secs (428594850 bytes/sec)
read again to see how cached metadata helps (if any):
$ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 243.992662 secs (440071358 bytes/sec)
Not much difference between cold and cached metadata, OK.
Now boot 11.0-STABLE on same box:
$ uname -a
FreeBSD iscsi.lexa.ru 11.0-STABLE FreeBSD 11.0-STABLE #0 r309616: Tue
Dec 6 16:15:48 MSK 2016
lexa at iscsi.lexa.ru:/usr/obj/usr/src11/sys/GENERIC amd64
Cold read of same data file:
$ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 250.640950 secs (428398402 bytes/sec)
Still OK: data created on 11.0-releng reads at full speed under 11-stable
Create 11-stable own dataset:
# zfs create -o recordsize=1m -o mountpoint=/test-11stable zdata/11stable
And write similar 100g file:
# dd if=/dev/zero of=/test-11stable/100g-created-on-11stable bs=1m
count=100k
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 246.409501 secs (435755041 bytes/sec)
Reboot (again into 11-stable) and read back created file
$ dd if=/test-11stable/100g-created-on-11stable of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 524.488506 secs (204721707 bytes/sec)
Hmm, about 1/2 of expected speed.
Let's read fist test file, created on 11-releng:
$ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 250.591102 secs (428483620 bytes/sec)
Speed is still here.
So, problem is related to on-disk data written by 11-stable.
Let's try to recover 'bad data' into good one (assuming we're downgraded
to 11-releng and want to get speed back)
Boot into 11-releng:
$ uname -a
FreeBSD iscsi.lexa.ru 11.0-RELEASE-p3 FreeBSD 11.0-RELEASE-p3 #7
r309605: Tue Dec 6 12:21:10 MSK 2016
lexa at iscsi.lexa.ru:/usr/obj/usr/src/sys/GENERIC amd64
Try zfs send|recv:
# zfs snapshot zdata/11stable at 1
# zfs send zdata/11stable at 1 | zfs recv
zdata/11stable-sendrecv-on11releng # slow because same dataset used
# zfs destroy zdata/11stable-sendrecv-on11releng at 1
# zfs destroy zdata/11stable at 1
# zfs set mountpoint=/test-11stable-sendrecv-on11releng
zdata/11stable-sendrecv-on11releng
Reboot and test read speed:
$ dd if=/test-11stable-sendrecv-on11releng/100g-created-on-11stable
of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 336.193935 secs (319381676 bytes/sec)
INTERMEDIATE speed :( Better than 'bad datafile', worse than 'good
datafile'
Try file copy to another dataset:
# zfs create -o recordsize=1m -o
mountpoint=/test-11stable-copy-on11releng zdata/11stable-copy-on11releng
# cd /test-11stable
# tar cf - . | (cd /test-11stable-copy-on11releng/; tar xvf - )
Reboot again and test speed another time:
$ dd if=/test-11stable-copy-on11releng/100g-created-on-11stable
of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 261.484845 secs (410632526 bytes/sec)
OK, per-file copy solves the problem (as expected).
Finally, let's check that all testfiles still reads at the expected speed
'good' one, created on 11.0-releng:
]$ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 249.435476 secs (430468770 bytes/sec)
'bad' one, created on 11-stable:
$ dd if=/test-11stable/100g-created-on-11stable of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 525.558477 secs (204304920 bytes/sec)
and 'intermediate', created by zfs send bad-dataset | zfs recv:
$ dd if=/test-11stable-sendrecv-on11releng/100g-created-on-11stable
of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 336.570705 secs (319024148 bytes/sec)
Boot into 11-stable again and test read speed too:
$ uname -a
FreeBSD iscsi.lexa.ru 11.0-STABLE FreeBSD 11.0-STABLE #0 r309616: Tue
Dec 6 16:15:48 MSK 2016
lexa at iscsi.lexa.ru:/usr/obj/usr/src11/sys/GENERIC amd64
Good:
]$ dd if=/test-11releng/100g-created-on-11releng of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 256.322824 secs (418902151 bytes/sec)
Bad:
$ dd if=/test-11stable/100g-created-on-11stable of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 529.918422 secs (202623985 bytes/sec)
Copied by tar (from bad to some intermediate state):
$ dd if=/test-11stable-copy-on11releng/100g-created-on-11stable
of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 260.234944 secs (412604783 bytes/sec)
Copied by zfs send | zfs recv:
$ dd if=/test-11stable-sendrecv-on11releng/100g-created-on-11stable
of=/dev/null bs=1m
102400+0 records in
102400+0 records out
107374182400 bytes transferred in 336.240919 secs (319337047 bytes/sec)
(virtually) same speed as under 11.0-releng, so read speed is limited by
on-disk structures.
--
Alex Tutubalin
Web: http://blog.lexa.ru
mailto:lexa at lexa.ru
More information about the freebsd-fs
mailing list