tuning zfs for large file reads

Mike Tancsa mike at sentex.net
Wed Jul 28 13:48:30 UTC 2010


Every once in a while I need to go through some rather large
netflow/argus logs (each file about 1G, about 90 files in total)
stored on a ZFS array, and I am finding that reads from the files run
at best about 50MB/s and often much worse. The individual underlying
disks should be able to read much faster than that, I would think. Is
there anything I can do to tune large file read performance a bit
more? I do have compression on, so I am not sure how much that is
impacting things. I do plan to add more storage soon, so perhaps
another 4 spindles might help to spread the load?

e.g.

# iostat -c 100 ada0 ada1 ada2 ada3
        tty            ada0             ada1             ada2             ada3            cpu
 tin  tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0    24 34.51  70  2.37  38.64  56  2.10  38.60  65  2.45  38.10  56  2.09   5  0  4  0 91
   0   279 28.11 397 10.89  29.65 251  7.26  36.14 285 10.06  30.03 250  7.33  17  0 10  1 72
   0    93 27.66 361  9.74  30.23 201  5.93  35.60 275  9.56  30.31 223  6.61  12  0  8  0 79
   0    94 23.27 345  7.83  32.51 187  5.94  40.10 250  9.79  29.80 173  5.04  14  0 13  0 73
   0    94 28.59 326  9.10  29.10 234  6.64  37.00 235  8.48  28.53 198  5.52  12  0 25  0 63
   0    94 21.23 420  8.72  23.88 237  5.52  33.71 285  9.39  26.91 239  6.28  38  0 18  1 43
   0    94 27.24 352  9.37  24.60 238  5.71  33.94 275  9.12  26.99 235  6.19  23  0 19  0 59
   0    93 30.95 223  6.75  35.46 160  5.55  37.01 172  6.22  31.46 147  4.50   7  0  7  0 85
   0    93 23.55 349  8.02  30.52 194  5.77  33.86 276  9.13  28.82 216  6.09  16  0  9  0 75
   0    94 33.06 350 11.29  44.79 167  7.32  45.95 230 10.30  39.10 137  5.23  14  0 12  0 75
   0    93 25.32 254  6.28  29.80 147  4.27  31.92 225  7.02  27.21 175  4.65   8  0  8  0 84
   0    93 26.92 380  9.99  28.80 247  6.95  34.63 299 10.11  27.18 273  7.24  15  0 11  0 74
   0    93 25.77 287  7.23  22.90 179  4.01  31.63 241  7.45  22.71 202  4.48  13  0 14  0 73

gstat shows the disks at 100% busy while doing the reads.
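
For comparison, I was going to baseline one of the member disks raw versus a
read of the same sort of data through ZFS, roughly like this (the file name is
just an example, and the second read only means much if the file isn't already
sitting in the ARC):

# raw sequential read straight off one spindle, no ZFS in the path
dd if=/dev/ada0 of=/dev/null bs=1m count=4096
# a large read through the filesystem for comparison
dd if=/zbackup1/argus/argus.log.1 of=/dev/null bs=1m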


# zfs get all
NAME      PROPERTY              VALUE                  SOURCE
zbackup1  type                  filesystem             -
zbackup1  creation              Tue Apr 29 22:48 2008  -
zbackup1  used                  1.91T                  -
zbackup1  available             781G                   -
zbackup1  referenced            1.91T                  -
zbackup1  compressratio         1.43x                  -
zbackup1  mounted               yes                    -
zbackup1  quota                 none                   default
zbackup1  reservation           none                   default
zbackup1  recordsize            128K                   default
zbackup1  mountpoint            /zbackup1              default
zbackup1  sharenfs              off                    default
zbackup1  checksum              on                     default
zbackup1  compression           on                     local
zbackup1  atime                 off                    local
zbackup1  devices               on                     default
zbackup1  exec                  on                     default
zbackup1  setuid                on                     default
zbackup1  readonly              off                    default
zbackup1  jailed                off                    default
zbackup1  snapdir               hidden                 default
zbackup1  aclmode               groupmask              default
zbackup1  aclinherit            restricted             default
zbackup1  canmount              on                     default
zbackup1  shareiscsi            off                    default
zbackup1  xattr                 off                    temporary
zbackup1  copies                1                      default
zbackup1  version               3                      -
zbackup1  utf8only              off                    -
zbackup1  normalization         none                   -
zbackup1  casesensitivity       sensitive              -
zbackup1  vscan                 off                    default
zbackup1  nbmand                off                    default
zbackup1  sharesmb              off                    default
zbackup1  refquota              none                   default
zbackup1  refreservation        none                   default
zbackup1  primarycache          all                    default
zbackup1  secondarycache        all                    default
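
To get a feel for how much the decompression (compression=on, so lzjb here) is
costing on reads, I was going to try a throwaway dataset with compression off,
copy one of the log files into it and time a read back, something along these
lines (dataset and file names are just placeholders):

# scratch dataset with no compression
zfs create -o compression=off zbackup1/nocomp
cp /zbackup1/argus/argus.log.1 /zbackup1/nocomp/
# read it back; the 1G file is far bigger than the ~200MB the ARC is
# currently sitting at, so caching shouldn't skew this too badly
dd if=/zbackup1/nocomp/argus.log.1 of=/dev/null bs=1m
zfs destroy zbackup1/nocomp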

# zpool status
   pool: zbackup1
  state: ONLINE
  scrub: none requested
config:

         NAME        STATE     READ WRITE CKSUM
         zbackup1    ONLINE       0     0     0
           raidz1    ONLINE       0     0     0
             ada3    ONLINE       0     0     0
             ada1    ONLINE       0     0     0
             ada2    ONLINE       0     0     0
             ada0    ONLINE       0     0     0

errors: No known data errors

ada0: <ST31000340AS SD1A> ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada1 at ahcich3 bus 0 scbus7 target 0 lun 0
ada1: <ST31000340AS SD15> ATA-8 SATA 2.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada2 at ahcich4 bus 0 scbus8 target 0 lun 0
ada2: <ST31000333AS SD35> ATA-8 SATA 2.x device
ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada3 at ahcich5 bus 0 scbus9 target 0 lun 0
ada3: <ST31000528AS CC35> ATA-8 SATA 2.x device
ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)

It's an Intel ICH10 based chipset, running RELENG_8 from July 13th, amd64,
with 8GB of RAM.

vfs.zfs.l2c_only_size: 0
vfs.zfs.mfu_ghost_data_lsize: 8519680
vfs.zfs.mfu_ghost_metadata_lsize: 95119872
vfs.zfs.mfu_ghost_size: 103639552
vfs.zfs.mfu_data_lsize: 28704768
vfs.zfs.mfu_metadata_lsize: 33280
vfs.zfs.mfu_size: 60260864
vfs.zfs.mru_ghost_data_lsize: 5898240
vfs.zfs.mru_ghost_metadata_lsize: 106185728
vfs.zfs.mru_ghost_size: 112083968
vfs.zfs.mru_data_lsize: 33816576
vfs.zfs.mru_metadata_lsize: 32736768
vfs.zfs.mru_size: 77434368
vfs.zfs.anon_data_lsize: 0
vfs.zfs.anon_metadata_lsize: 0
vfs.zfs.anon_size: 2245120
vfs.zfs.l2arc_norw: 1
vfs.zfs.l2arc_feed_again: 1
vfs.zfs.l2arc_noprefetch: 0
vfs.zfs.l2arc_feed_min_ms: 200
vfs.zfs.l2arc_feed_secs: 1
vfs.zfs.l2arc_headroom: 2
vfs.zfs.l2arc_write_boost: 8388608
vfs.zfs.l2arc_write_max: 8388608
vfs.zfs.arc_meta_limit: 432819840
vfs.zfs.arc_meta_used: 151634344
vfs.zfs.mdcomp_disable: 0
vfs.zfs.arc_min: 216409920
vfs.zfs.arc_max: 1731279360
vfs.zfs.zfetch.array_rd_sz: 1048576
vfs.zfs.zfetch.block_cap: 256
vfs.zfs.zfetch.min_sec_reap: 2
vfs.zfs.zfetch.max_streams: 8
vfs.zfs.prefetch_disable: 0
vfs.zfs.check_hostid: 1
vfs.zfs.recover: 0
vfs.zfs.txg.write_limit_override: 0
vfs.zfs.txg.synctime: 5
vfs.zfs.txg.timeout: 30
vfs.zfs.scrub_limit: 10
vfs.zfs.vdev.cache.bshift: 16
vfs.zfs.vdev.cache.size: 10485760
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.vdev.aggregation_limit: 131072
vfs.zfs.vdev.ramp_rate: 2
vfs.zfs.vdev.time_shift: 6
vfs.zfs.vdev.min_pending: 4
vfs.zfs.vdev.max_pending: 35
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.zio.use_uma: 0
vfs.zfs.version.zpl: 3
vfs.zfs.version.vdev_boot: 1
vfs.zfs.version.spa: 14
vfs.zfs.version.dmu_backup_stream: 1
vfs.zfs.version.dmu_backup_header: 2
vfs.zfs.version.acl: 1
vfs.zfs.debug: 0
vfs.zfs.super_owner: 0
kstat.zfs.misc.zfetchstats.hits: 3276257635
kstat.zfs.misc.zfetchstats.misses: 358569869
kstat.zfs.misc.zfetchstats.colinear_hits: 101066
kstat.zfs.misc.zfetchstats.colinear_misses: 358468803
kstat.zfs.misc.zfetchstats.stride_hits: 3232647654
kstat.zfs.misc.zfetchstats.stride_misses: 32579337
kstat.zfs.misc.zfetchstats.reclaim_successes: 3639515
kstat.zfs.misc.zfetchstats.reclaim_failures: 354829288
kstat.zfs.misc.zfetchstats.streams_resets: 547669
kstat.zfs.misc.zfetchstats.streams_noresets: 43427240
kstat.zfs.misc.zfetchstats.bogus_streams: 0
kstat.zfs.misc.arcstats.hits: 1628471537
kstat.zfs.misc.arcstats.misses: 157059604
kstat.zfs.misc.arcstats.demand_data_hits: 56872510
kstat.zfs.misc.arcstats.demand_data_misses: 7357670
kstat.zfs.misc.arcstats.demand_metadata_hits: 811953750
kstat.zfs.misc.arcstats.demand_metadata_misses: 92175332
kstat.zfs.misc.arcstats.prefetch_data_hits: 21867244
kstat.zfs.misc.arcstats.prefetch_data_misses: 35428145
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 737778033
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 22098457
kstat.zfs.misc.arcstats.mru_hits: 255443336
kstat.zfs.misc.arcstats.mru_ghost_hits: 37264902
kstat.zfs.misc.arcstats.mfu_hits: 614687158
kstat.zfs.misc.arcstats.mfu_ghost_hits: 28488139
kstat.zfs.misc.arcstats.allocated: 178693054
kstat.zfs.misc.arcstats.deleted: 100291619
kstat.zfs.misc.arcstats.stolen: 32463801
kstat.zfs.misc.arcstats.recycle_miss: 112803117
kstat.zfs.misc.arcstats.mutex_miss: 595987
kstat.zfs.misc.arcstats.evict_skip: 43698367555
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_l2_eligible: 4878982700544
kstat.zfs.misc.arcstats.evict_l2_ineligible: 1884938647552
kstat.zfs.misc.arcstats.hash_elements: 47690
kstat.zfs.misc.arcstats.hash_elements_max: 211948
kstat.zfs.misc.arcstats.hash_collisions: 23698456
kstat.zfs.misc.arcstats.hash_chains: 8682
kstat.zfs.misc.arcstats.hash_chain_max: 30
kstat.zfs.misc.arcstats.p: 202873036
kstat.zfs.misc.arcstats.c: 216409920
kstat.zfs.misc.arcstats.c_min: 216409920
kstat.zfs.misc.arcstats.c_max: 1731279360
kstat.zfs.misc.arcstats.size: 216359384
kstat.zfs.misc.arcstats.hdr_size: 10358144
kstat.zfs.misc.arcstats.data_size: 139796480
kstat.zfs.misc.arcstats.other_size: 66204760
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_read_bytes: 0
kstat.zfs.misc.arcstats.l2_write_bytes: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 21505931
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
kstat.zfs.misc.arcstats.l2_write_in_l2: 0
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 26447966
kstat.zfs.misc.arcstats.l2_write_full: 0
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
kstat.zfs.misc.arcstats.l2_write_pios: 0
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0
kstat.zfs.misc.vdev_cache_stats.delegations: 14308501
kstat.zfs.misc.vdev_cache_stats.hits: 145613677
kstat.zfs.misc.vdev_cache_stats.misses: 22833085
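
For what it's worth, these are the knobs I was thinking of experimenting with
in /boot/loader.conf -- the values below are just guesses on my part, and I am
not sure which of them are boot-time only versus runtime-settable on RELENG_8:

# let the ARC grow past its current ~1.6G cap on this 8G box
vfs.zfs.arc_max="4294967296"
# deeper prefetch streams
vfs.zfs.zfetch.block_cap="512"
# shallower per-vdev queue
vfs.zfs.vdev.max_pending="10"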

Some stats during a read:

# sysctl -a kstat.zfs.misc.arcstats;sleep 5;sysctl kstat.zfs.misc.arcstats
kstat.zfs.misc.arcstats.hits: 1628766598
kstat.zfs.misc.arcstats.misses: 157095912
kstat.zfs.misc.arcstats.demand_data_hits: 57123742
kstat.zfs.misc.arcstats.demand_data_misses: 7361404
kstat.zfs.misc.arcstats.demand_metadata_hits: 811993421
kstat.zfs.misc.arcstats.demand_metadata_misses: 92177322
kstat.zfs.misc.arcstats.prefetch_data_hits: 21870856
kstat.zfs.misc.arcstats.prefetch_data_misses: 35458138
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 737778579
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 22099048
kstat.zfs.misc.arcstats.mru_hits: 255507575
kstat.zfs.misc.arcstats.mru_ghost_hits: 37268564
kstat.zfs.misc.arcstats.mfu_hits: 614913822
kstat.zfs.misc.arcstats.mfu_ghost_hits: 28489531
kstat.zfs.misc.arcstats.allocated: 178729987
kstat.zfs.misc.arcstats.deleted: 100325316
kstat.zfs.misc.arcstats.stolen: 32480168
kstat.zfs.misc.arcstats.recycle_miss: 112813638
kstat.zfs.misc.arcstats.mutex_miss: 596031
kstat.zfs.misc.arcstats.evict_skip: 43705615686
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_l2_eligible: 4882493084160
kstat.zfs.misc.arcstats.evict_l2_ineligible: 1885466736640
kstat.zfs.misc.arcstats.hash_elements: 45389
kstat.zfs.misc.arcstats.hash_elements_max: 211948
kstat.zfs.misc.arcstats.hash_collisions: 23703581
kstat.zfs.misc.arcstats.hash_chains: 7986
kstat.zfs.misc.arcstats.hash_chain_max: 30
kstat.zfs.misc.arcstats.p: 202884300
kstat.zfs.misc.arcstats.c: 216409920
kstat.zfs.misc.arcstats.c_min: 216409920
kstat.zfs.misc.arcstats.c_max: 1731279360
kstat.zfs.misc.arcstats.size: 217163440
kstat.zfs.misc.arcstats.hdr_size: 9940256
kstat.zfs.misc.arcstats.data_size: 135882240
kstat.zfs.misc.arcstats.other_size: 71340944
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_read_bytes: 0
kstat.zfs.misc.arcstats.l2_write_bytes: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 21731398
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
kstat.zfs.misc.arcstats.l2_write_in_l2: 0
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 26451995
kstat.zfs.misc.arcstats.l2_write_full: 0
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
kstat.zfs.misc.arcstats.l2_write_pios: 0
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0

kstat.zfs.misc.arcstats.hits: 1628769047
kstat.zfs.misc.arcstats.misses: 157097445
kstat.zfs.misc.arcstats.demand_data_hits: 57124342
kstat.zfs.misc.arcstats.demand_data_misses: 7361439
kstat.zfs.misc.arcstats.demand_metadata_hits: 811995060
kstat.zfs.misc.arcstats.demand_metadata_misses: 92177359
kstat.zfs.misc.arcstats.prefetch_data_hits: 21871065
kstat.zfs.misc.arcstats.prefetch_data_misses: 35459599
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 737778580
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 22099048
kstat.zfs.misc.arcstats.mru_hits: 255508558
kstat.zfs.misc.arcstats.mru_ghost_hits: 37269451
kstat.zfs.misc.arcstats.mfu_hits: 614915078
kstat.zfs.misc.arcstats.mfu_ghost_hits: 28489553
kstat.zfs.misc.arcstats.allocated: 178731542
kstat.zfs.misc.arcstats.deleted: 100325916
kstat.zfs.misc.arcstats.stolen: 32481046
kstat.zfs.misc.arcstats.recycle_miss: 112813939
kstat.zfs.misc.arcstats.mutex_miss: 596033
kstat.zfs.misc.arcstats.evict_skip: 43705791814
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_l2_eligible: 4882575294464
kstat.zfs.misc.arcstats.evict_l2_ineligible: 1885578672128
kstat.zfs.misc.arcstats.hash_elements: 45419
kstat.zfs.misc.arcstats.hash_elements_max: 211948
kstat.zfs.misc.arcstats.hash_collisions: 23703681
kstat.zfs.misc.arcstats.hash_chains: 7991
kstat.zfs.misc.arcstats.hash_chain_max: 30
kstat.zfs.misc.arcstats.p: 202884300
kstat.zfs.misc.arcstats.c: 216409920
kstat.zfs.misc.arcstats.c_min: 216409920
kstat.zfs.misc.arcstats.c_max: 1731279360
kstat.zfs.misc.arcstats.size: 216364904
kstat.zfs.misc.arcstats.hdr_size: 9946424
kstat.zfs.misc.arcstats.data_size: 135031808
kstat.zfs.misc.arcstats.other_size: 71386672
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_read_bytes: 0
kstat.zfs.misc.arcstats.l2_write_bytes: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 21731407
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
kstat.zfs.misc.arcstats.l2_write_in_l2: 0
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 26452849
kstat.zfs.misc.arcstats.l2_write_full: 0
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
kstat.zfs.misc.arcstats.l2_write_pios: 0
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0
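
As a rough back-of-envelope on those two snapshots (taken about 5 seconds
apart), nearly all of the data is coming in through the prefetcher rather than
demand reads, and assuming one full 128K record per miss that works out to
something in the 35-40MB/s range, roughly in line with what I am seeing:

# prefetch_data_misses delta: 35459599 - 35458138 = 1461
echo $(( (35459599 - 35458138) * 128 / 5 ))   # ~37400 KB/s pulled in by prefetch
# demand_data_misses delta: 7361439 - 7361404 = 35
echo $(( (7361439 - 7361404) * 128 / 5 ))     # ~900 KB/s of demand reads hitting disk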



--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            mike at sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike



More information about the freebsd-performance mailing list