HAST performance overheads?

Laurence Gill laurencesgill at googlemail.com
Fri Jan 25 12:17:59 UTC 2013


Hi all,


I realise the next question is going to sound pretty vague...

What sort of performance overhead should we expect when using HAST?

I'm seeing approximately 1/10th of the sequential write performance when
writing through the HAST layer.  It's "stacked" like so:
"hard disk --> hast layer --> filesystem"

I've run tests with several setups, using various numbers of disks and
both UFS and ZFS - all are slower when I introduce HAST.

An example of this is using 6 disks of the following spec:
 - # dmesg | grep ^da0
     da0 at mps0 bus 0 scbus0 target 11 lun 0
     da0: <TOSHIBA MK1001TRKB DCA8> Fixed Direct Access SCSI-6 device 
     da0: 600.000MB/s transfers
     da0: Command Queueing enabled
     da0: 953869MB (1953525168 512 byte sectors: 255H 63S/T 121601C)


If I create ZFS raidz2 on these...

 - # zpool create pool raidz2 da0 da1 da2 da3 da4 da5


Then run a dd test, a sample output is...

 - # dd if=/dev/zero of=test.dat bs=1M count=1024
     1073741824 bytes transferred in 7.689634 secs (139634974 bytes/sec)

 - # dd if=/dev/zero of=test.dat bs=16k count=65535
     1073725440 bytes transferred in 1.909157 secs (562408130 bytes/sec)



This is much faster than running HAST.  I would expect an overhead, but
not this much.  For example:

 - # hastctl create disk0/disk1/disk2/disk3/disk4/disk5
 - # hastctl role primary all
 - # zpool create pool raidz2 disk0 disk1 disk2 disk3 disk4 disk5
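
The hastctl create line above is shorthand; each of the six resources is
defined in hast.conf like the disk0 sketch earlier (one per da device),
so the initialisation is roughly equivalent to:

 - # for n in 0 1 2 3 4 5; do hastctl create disk${n}; done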


Run a dd test, and the speed is...

 - # dd if=/dev/zero of=test.dat bs=1M count=1024
     1073741824 bytes transferred in 40.908153 secs (26247624 bytes/sec)

 - # dd if=/dev/zero of=test.dat bs=16k count=65535
     1073725440 bytes transferred in 42.017997 secs (25553942 bytes/sec)


Note that no secondary server is set up, as that degrades the speed
even further, so I have removed it for this testing.

We can see better speeds than this (up to approx. 30 MB/s) with
metaflush switched off.  The async replication mode actually seems to
degrade the speed too, which was quite unexpected.
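
For completeness, both of those knobs are plain hast.conf settings; if I
read hast.conf(5) right they can be set globally (or per resource),
e.g. roughly:

     replication async
     metaflush off

at the top of /etc/hast.conf.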


There is one exception where the speed seems quite good: configuring
ZFS-on-HAST-on-ZVOL-on-ZFS :)

 - # zpool create pool raidz2 da0 da1 da2 da3 da4 da5
 - # zfs create -s -V 3T pool/hastpool
 - # hastctl create zhast
 - # zpool create zhast /dev/hast/zhast
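
The zhast resource here points at the zvol rather than at a raw disk,
i.e. the hast.conf entry is roughly this (node names/addresses again
just placeholders):

     resource zhast {
             on nodeA {
                     local /dev/zvol/pool/hastpool
                     remote 10.0.0.2
             }
             on nodeB {
                     local /dev/zvol/pool/hastpool
                     remote 10.0.0.1
             }
     }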


However, this setup seems quite insane to me, but we do get better
performance (metaflush off):

 - # dd if=/dev/zero of=test.dat bs=1M count=1024
     1073741824 bytes transferred in 14.057880 secs (76380067 bytes/sec)
 - # dd if=/dev/zero of=test.dat bs=16k count=65535
     1073725440 bytes transferred in 10.341796 secs (103823884 bytes/sec)


I guess the question for me really is: why the big difference between
having just 1 HAST provider and using 6?  Would we see this performance
gain if we were to concatenate the disks together or use a hardware RAID
controller?  If so, why?  Is there a massive performance overhead in
using several providers?
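
To make the concatenation idea concrete, what I have in mind is
something like the following (untested sketch - it assumes a "big"
resource in hast.conf whose local provider is /dev/concat/big):

 - # gconcat label -v big da0 da1 da2 da3 da4 da5
 - # hastctl create big
 - # hastctl role primary big
 - # zpool create pool /dev/hast/big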




Thanks in advance for reading.


Regards
Laurence


-- 
Laurence Gill

t: 01843 590 784
f: 08721 157 665
skype: laurencegg
e: laurencesgill at googlemail.com
PGP on Key Servers

