Irregular disk IO and poor performance (possibly after reading a lot of data from pool)
Steven Hartland
killing@multiplay.co.uk
Mon Dec 1 17:28:29 UTC 2014
What disks?
On 01/12/2014 13:21, Dmitriy Makarov wrote:
> We have a big ZFS pool (16TiB) with 36 disks that are grouped into 18 mirror devices.
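For illustration only: the post does not name the 18 mirrors, but assuming the conventional layout (consecutive disks paired in creation order), the implied vdev membership can be listed like this. On the real system, `zpool status` is the authoritative source.

```shell
# Hypothetical pairing of the 36 disks into 18 two-way mirrors, assuming
# the pool was created with consecutive devices (da0+da1, da2+da3, ...).
for i in $(seq 0 2 34); do
    echo "mirror-$((i / 2)): da${i} da$((i + 1))"
done
```

Consistent with this assumed pairing, adjacent devices in the samples below show matching kw/s figures (e.g. da0 and da1 both at 6748.3).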
>
> This weekend we were maintaining data on the pool.
> For two days straight, 16 processes were busy reading files (to calculate checksums and the like).
>
> Starting from Monday morning, a few hours after the maintenance was terminated,
> we started to observe abnormal ZFS behaviour, accompanied by
> very poor pool performance (many processes were blocked in zio->i).
>
> But the strangest thing is how IO is distributed between the mirror devices.
> Normally, our 'iostat -x 1' looks like
>
> device r/s w/s kr/s kw/s qlen svc_t %b
> md0 0.0 5.9 0.0 0.0 0 0.0 0
> da0 28.7 178.2 799.6 6748.3 1 3.8 58
> da1 23.8 180.2 617.9 6748.3 1 3.4 56
> da2 44.6 168.3 681.3 6733.9 1 5.2 72
> da3 38.6 164.4 650.6 6240.3 1 4.9 65
> da4 29.7 176.3 471.3 5935.3 0 4.1 58
> da5 27.7 180.2 546.1 6391.3 1 3.9 57
> da6 27.7 238.6 555.0 6714.6 0 3.7 68
> da7 28.7 239.6 656.0 6714.6 0 3.3 58
> da8 26.7 318.8 738.7 8304.4 0 2.5 54
> da9 27.7 315.9 725.3 7769.7 0 3.0 77
> da10 23.8 268.3 510.0 7663.7 0 2.6 56
> da11 32.7 276.3 905.5 7697.9 0 3.4 70
> da12 24.8 293.1 559.0 6222.0 2 2.3 53
> da13 27.7 285.2 279.7 6058.1 1 2.9 62
> da14 29.7 226.8 374.3 5733.3 0 3.2 57
> da15 32.7 220.8 532.2 5538.7 1 3.3 65
> da16 30.7 165.4 638.2 4537.6 1 3.8 51
> da17 39.6 173.3 819.9 4884.2 1 3.2 46
> da18 28.7 221.8 765.4 5659.1 1 2.6 42
> da19 30.7 214.9 464.4 5417.4 0 4.6 78
> da20 32.7 177.2 725.3 4732.7 1 4.0 63
> da21 29.7 177.2 448.6 4722.8 0 5.3 66
> da22 19.8 153.5 398.6 4168.3 0 2.5 35
> da23 16.8 151.5 291.1 4243.6 1 2.9 39
> da24 26.7 186.2 547.1 5018.4 1 4.4 68
> da25 30.7 190.1 709.0 5096.6 1 5.0 71
> da26 28.7 222.8 690.7 5251.1 0 3.0 55
> da27 21.8 213.9 572.3 5248.6 0 2.8 49
> da28 34.7 177.2 1096.2 5027.8 1 4.9 65
> da29 36.6 175.3 1172.9 5012.0 2 4.9 63
> da30 22.8 197.1 462.9 5906.6 0 2.8 51
> da31 25.7 204.0 445.6 6138.3 0 3.4 62
> da32 31.7 170.3 557.0 5600.6 1 4.6 58
> da33 33.7 161.4 698.1 5509.5 1 4.8 60
> da34 28.7 269.3 473.8 6661.6 1 5.2 77
> da35 27.7 268.3 424.3 6440.8 0 5.6 75
>
>
> kw/s is always distributed pretty much evenly.
> Now it looks mostly like this:
>
> device r/s w/s kr/s kw/s qlen svc_t %b
> md0 0.0 18.8 0.0 0.0 0 0.0 0
> da0 35.7 0.0 1070.9 0.0 0 13.3 37
> da1 38.7 0.0 1227.0 0.0 0 12.7 40
> da2 25.8 0.0 920.2 0.0 0 12.0 26
> da3 26.8 0.0 778.0 0.0 0 10.9 23
> da4 22.8 0.0 792.4 0.0 0 14.4 25
> da5 26.8 0.0 1050.5 0.0 0 13.4 27
> da6 32.7 0.0 1359.3 0.0 0 17.0 41
> da7 23.8 229.9 870.7 17318.1 0 3.0 55
> da8 58.5 0.0 1813.7 0.0 1 12.9 56
> da9 63.4 0.0 1615.0 0.0 0 12.4 61
> da10 48.6 0.0 1448.0 0.0 0 16.7 55
> da11 49.6 0.0 1148.2 0.0 1 16.7 60
> da12 47.6 0.0 1508.4 0.0 0 14.8 46
> da13 47.6 0.0 1417.7 0.0 0 17.9 55
> da14 44.6 0.0 1997.5 0.0 1 15.6 49
> da15 48.6 0.0 2061.4 0.0 1 14.2 47
> da16 44.6 0.0 1587.7 0.0 1 16.9 51
> da17 45.6 0.0 1326.1 0.0 2 15.7 55
> da18 50.5 0.0 1433.6 0.0 2 16.7 57
> da19 57.5 0.0 2415.8 0.0 3 20.4 70
> da20 52.5 222.0 2097.1 10613.0 5 12.8 100
> da21 52.5 256.7 1967.8 11498.5 3 10.6 100
> da22 37.7 433.1 1342.4 12880.1 4 5.5 99
> da23 42.6 359.8 2304.3 13073.8 5 7.2 101
> da24 33.7 0.0 1256.7 0.0 1 15.4 40
> da25 26.8 0.0 853.8 0.0 2 15.1 32
> da26 23.8 0.0 343.9 0.0 1 12.4 28
> da27 26.8 0.0 400.4 0.0 0 12.4 31
> da28 15.9 0.0 575.3 0.0 1 11.4 17
> da29 20.8 0.0 750.7 0.0 0 14.4 24
> da30 37.7 0.0 952.4 0.0 0 12.6 37
> da31 29.7 0.0 777.0 0.0 0 13.6 37
> da32 54.5 121.9 1824.6 6514.4 7 27.7 100
> da33 56.5 116.9 2017.3 6213.6 6 29.7 99
> da34 42.6 0.0 1303.3 0.0 1 14.9 43
> da35 45.6 0.0 1400.9 0.0 2 14.8 45
>
> Some devices have 0.0 kw/s for long periods of time,
> then others do, and so on.
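As an aside, a quick way to pull the idle writers out of such a sample is a one-line awk filter. This is only a sketch; it assumes the eight-column `iostat -x` layout shown above, with kw/s in the fifth column. The sample input here is two lines taken from the output above.

```shell
# Print the devices that received no writes in a captured 'iostat -x'
# sample; kw/s is the 5th column in the output format shown above.
printf '%s\n' \
    'da0  35.7   0.0  1070.9     0.0    0  13.3  37' \
    'da7  23.8 229.9   870.7 17318.1    0   3.0  55' |
awk '$1 != "device" && $5 == 0 { print $1 }'
```

Run against a full 36-line sample, this would list every mirror half that ZFS is currently not allocating writes to.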
> Here some more results:
>
> device r/s w/s kr/s kw/s qlen svc_t %b
> md0 0.0 37.9 0.0 0.0 0 0.0 0
> da0 58.9 173.7 1983.5 4585.3 3 11.2 87
> da1 49.9 162.7 1656.2 4548.4 3 14.0 95
> da2 40.9 187.6 1476.5 3466.6 1 4.8 58
> da3 42.9 188.6 1646.7 3466.6 0 5.3 64
> da4 54.9 33.9 2222.6 1778.4 1 13.3 63
> da5 53.9 37.9 2429.6 1778.4 2 12.9 68
> da6 42.9 33.9 1445.1 444.6 0 10.3 45
> da7 40.9 28.9 2045.9 444.6 0 12.3 43
> da8 53.9 0.0 959.6 0.0 1 22.7 62
> da9 29.9 0.0 665.2 0.0 1 52.1 64
> da10 52.9 83.8 1845.3 2084.8 2 8.2 64
> da11 44.9 103.8 1654.2 4895.2 1 8.8 71
> da12 50.9 60.9 1273.0 2078.3 1 10.3 69
> da13 39.9 57.9 940.1 2078.3 0 15.4 75
> da14 45.9 72.9 977.0 3178.6 0 8.5 63
> da15 48.9 72.9 1000.5 3178.6 0 9.6 72
> da16 42.9 74.9 1187.6 2118.8 1 6.7 51
> da17 48.9 82.8 1651.7 3013.0 0 5.7 52
> da18 67.9 78.8 2735.5 2456.1 0 11.5 75
> da19 52.9 79.8 2436.6 2456.1 0 13.1 82
> da20 48.9 91.8 2623.8 1682.6 1 7.2 60
> da21 52.9 92.8 1893.2 1682.6 0 7.1 61
> da22 67.9 20.0 2518.0 701.1 0 13.5 79
> da23 68.9 23.0 3331.8 701.1 1 13.6 77
> da24 45.9 17.0 2148.7 369.8 1 11.6 47
> da25 36.9 18.0 1747.5 369.8 1 12.6 46
> da26 46.9 1.0 1873.3 0.5 0 21.3 55
> da27 38.9 1.0 1395.7 0.5 0 34.6 58
> da28 34.9 9.0 1523.5 53.9 0 14.1 39
> da29 26.9 10.0 1124.8 53.9 1 13.8 28
> da30 44.9 0.0 1887.2 0.0 0 18.8 50
> da31 47.9 0.0 2273.0 0.0 0 20.2 49
> da32 65.9 90.8 2221.6 1730.5 3 9.7 77
> da33 79.8 90.8 3304.9 1730.5 1 9.9 88
> da34 75.8 134.7 3638.7 3938.1 2 10.2 90
> da35 49.9 209.6 1792.4 5756.0 2 8.1 85
>
>
> device r/s w/s kr/s kw/s qlen svc_t %b
> md0 0.0 19.0 0.0 0.0 0 0.0 0
> da0 38.0 194.8 1416.1 1175.8 1 10.6 100
> da1 40.0 190.8 1424.6 1072.9 2 10.4 100
> da2 37.0 0.0 1562.4 0.0 0 14.9 40
> da3 31.0 0.0 1169.8 0.0 0 14.0 33
> da4 44.0 0.0 2632.4 0.0 0 18.0 45
> da5 41.0 0.0 1944.6 0.0 0 19.0 45
> da6 38.0 0.0 1786.2 0.0 1 18.4 44
> da7 45.0 0.0 2275.7 0.0 0 16.0 48
> da8 80.9 0.0 4151.3 0.0 2 24.1 85
> da9 83.9 0.0 3256.2 0.0 3 21.2 83
> da10 61.9 0.0 3657.3 0.0 1 18.9 65
> da11 53.9 0.0 2532.5 0.0 1 18.7 56
> da12 54.9 0.0 2650.8 0.0 0 18.9 60
> da13 48.0 0.0 1975.5 0.0 0 19.6 53
> da14 43.0 0.0 1802.7 0.0 2 14.1 43
> da15 49.0 0.0 2455.5 0.0 0 14.0 48
> da16 45.0 0.0 1521.5 0.0 1 16.0 50
> da17 45.0 0.0 1650.8 0.0 4 13.7 47
> da18 48.0 0.0 1618.9 0.0 1 15.0 54
> da19 47.0 0.0 1982.0 0.0 0 16.5 55
> da20 52.9 0.0 2186.3 0.0 0 19.8 65
> da21 61.9 0.0 3020.5 0.0 0 16.3 61
> da22 70.9 0.0 3309.7 0.0 1 15.5 67
> da23 67.9 0.0 2742.3 0.0 2 16.5 73
> da24 38.0 0.0 1426.1 0.0 1 15.5 40
> da25 41.0 0.0 1905.6 0.0 1 14.0 39
> da26 43.0 0.0 2371.1 0.0 0 14.2 40
> da27 46.0 0.0 2178.3 0.0 0 15.2 45
> da28 44.0 0.0 2092.9 0.0 0 12.4 43
> da29 41.0 0.0 1442.1 0.0 1 13.4 37
> da30 42.0 37.0 1171.3 645.9 1 17.5 62
> da31 27.0 67.9 713.8 290.7 0 16.7 64
> da32 47.0 0.0 1043.5 0.0 0 13.3 43
> da33 50.0 0.0 1741.3 0.0 1 15.7 57
> da34 42.0 0.0 1119.9 0.0 0 18.2 55
> da35 45.0 0.0 1071.4 0.0 0 15.7 55
>
>
> The first thing we did was try a reboot.
> It took the system more than 5 minutes to import the pool (normally it takes a fraction of a second).
> Needless to say, the reboot did not help a bit.
>
> What can we do about this problem?
>
>
> System info:
> FreeBSD 11.0-CURRENT #5 r260625
>
> zpool get all disk1
> NAME PROPERTY VALUE SOURCE
> disk1 size 16,3T -
> disk1 capacity 59% -
> disk1 altroot - default
> disk1 health ONLINE -
> disk1 guid 4909337477172007488 default
> disk1 version - default
> disk1 bootfs - default
> disk1 delegation on default
> disk1 autoreplace off default
> disk1 cachefile - default
> disk1 failmode wait default
> disk1 listsnapshots off default
> disk1 autoexpand off default
> disk1 dedupditto 0 default
> disk1 dedupratio 1.00x -
> disk1 free 6,56T -
> disk1 allocated 9,76T -
> disk1 readonly off -
> disk1 comment - default
> disk1 expandsize 0 -
> disk1 freeing 0 default
> disk1 feature@async_destroy enabled local
> disk1 feature@empty_bpobj active local
> disk1 feature@lz4_compress active local
> disk1 feature@multi_vdev_crash_dump enabled local
> disk1 feature@spacemap_histogram active local
> disk1 feature@enabled_txg active local
> disk1 feature@hole_birth active local
> disk1 feature@extensible_dataset enabled local
> disk1 feature@bookmarks enabled local
>
>
>
> zfs get all disk1
> NAME PROPERTY VALUE SOURCE
> disk1 type filesystem -
> disk1 creation Wed Sep 18 11:47 2013 -
> disk1 used 9,75T -
> disk1 available 6,30T -
> disk1 referenced 9,74T -
> disk1 compressratio 1.63x -
> disk1 mounted yes -
> disk1 quota none default
> disk1 reservation none default
> disk1 recordsize 128K default
> disk1 mountpoint /......... local
> disk1 sharenfs off default
> disk1 checksum on default
> disk1 compression lz4 local
> disk1 atime off local
> disk1 devices on default
> disk1 exec off local
> disk1 setuid off local
> disk1 readonly off default
> disk1 jailed off default
> disk1 snapdir hidden default
> disk1 aclmode discard default
> disk1 aclinherit restricted default
> disk1 canmount on default
> disk1 xattr off temporary
> disk1 copies 1 default
> disk1 version 5 -
> disk1 utf8only off -
> disk1 normalization none -
> disk1 casesensitivity sensitive -
> disk1 vscan off default
> disk1 nbmand off default
> disk1 sharesmb off default
> disk1 refquota none default
> disk1 refreservation none default
> disk1 primarycache all default
> disk1 secondarycache none local
> disk1 usedbysnapshots 0 -
> disk1 usedbydataset 9,74T -
> disk1 usedbychildren 9,71G -
> disk1 usedbyrefreservation 0 -
> disk1 logbias latency default
> disk1 dedup off default
> disk1 mlslabel -
> disk1 sync standard local
> disk1 refcompressratio 1.63x -
> disk1 written 9,74T -
> disk1 logicalused 15,8T -
> disk1 logicalreferenced 15,8T -
>
>
> This is very severe. Thanks.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"