kern/113885: [gmirror] [patch] improved gmirror balance algorithm

Will Andrews will at firepipe.net
Tue Dec 2 14:20:04 PST 2008


The following reply was made to PR kern/113885; it has been noted by GNATS.

From: "Will Andrews" <will at firepipe.net>
To: bug-followup at freebsd.org
Cc: "Mykola Zubach" <zuborg at advancedhosters.com>
Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm
Date: Tue, 2 Dec 2008 14:50:29 -0700

 ------=_Part_29247_27764781.1228254629354
 Content-Type: multipart/alternative; 
 	boundary="----=_Part_29248_27670565.1228254629354"
 
 ------=_Part_29248_27670565.1228254629354
 Content-Type: text/plain; charset=ISO-8859-1
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline
 
 I have attached what I believe is a better version of your patch.  It:
 
 1) Fixes the type ambiguity of the new use_delay/best_use_delay and
 dist/best_dist variables, to match the variables used in their calculations;
 they should be uint64_t and off_t, respectively.
 2) Uses bit shifts instead of multiplication/division in the use delay and
 distance calculations.  The precision loss should be acceptable in this
 situation.
 3) Cleans up the style of the code; add more commenting, better comments.
 4) Gets rid of the g_mirror_disk.d_delay variable since it is no longer
 used, along with the function the original patch short-circuited.
 
 In my testing, with 16 simultaneous processes performing the same test at
 the same time, by throughput, random reads/writes improved by about 35% (low
 variance), while sequential reads/writes improved by 100-400% (high
 variance).  IOs also increased proportionally.  Testing was done using
 "rawio -a -p 16 /dev/mirror/testa", where the test mirror composed of two
 160GB Seagate SATA disks and the system is a dual Opteron 246 with 1.5GB of
 RAM with no other load, running 8.0-CURRENT as of 12/1/2008.  CPU usage
 impact vs. old load algorithm appears negligible as well.
 
 Regards,
 --Will.
 
 ------=_Part_29248_27670565.1228254629354
 Content-Type: text/html; charset=ISO-8859-1
 Content-Transfer-Encoding: 7bit
 Content-Disposition: inline
 
 I have attached what I believe is a better version of your patch.&nbsp; It:<br><br>1) Fixes the type ambiguity of the new use_delay/best_use_delay and dist/best_dist variables, to match the variables used in their calculations; they should be uint64_t and off_t, respectively.<br>
 
 2) Uses bit shifts instead of multiplication/division in the use delay and distance calculations.&nbsp; The precision loss should be acceptable in this situation.<br>3) Cleans up the style of the code; add more commenting, better comments.<br>
 
 4) Gets rid of the g_mirror_disk.d_delay variable since it is no longer used, along with the function the original patch short-circuited.<br><br>In my testing, with 16 simultaneous processes performing the same test at the same time, by throughput, random reads/writes improved by about 35% (low variance), while sequential reads/writes improved by 100-400% (high variance).&nbsp; IOs also increased proportionally.&nbsp; Testing was done using &quot;rawio -a -p 16 /dev/mirror/testa&quot;, where the test mirro r composed of two 160GB Seagate SATA disks and the system is a dual Opteron 246 with 1.5GB of RAM with no other load, running 8.0-CURRENT as of 12/1/2008.&nbsp; CPU usage impact vs. old load algorithm appears negligible as well.<br>
 <br>Regards,<br>--Will.<br>
 
 ------=_Part_29248_27670565.1228254629354--
 
 ------=_Part_29247_27764781.1228254629354
 Content-Type: application/octet-stream; name=g_mirror_113885.diff
 Content-Transfer-Encoding: base64
 X-Attachment-Id: f_fo92vwa00
 Content-Disposition: attachment; filename=g_mirror_113885.diff
 
 SW5kZXg6IGdfbWlycm9yLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
 PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gZ19taXJyb3IuYwkocmV2aXNpb24gMTg1
 NTY3KQorKysgZ19taXJyb3IuYwkod29ya2luZyBjb3B5KQpAQCAtMjUsNyArMjUsNyBAQAogICov
 CiAKICNpbmNsdWRlIDxzeXMvY2RlZnMuaD4KLV9fRkJTRElEKCIkRnJlZUJTRCQiKTsKK19fRkJT
 RElEKCIkRnJlZUJTRDogc3JjL3N5cy9nZW9tL21pcnJvci9nX21pcnJvci5jLHYgMS45NCAyMDA3
 LzEwLzIwIDIzOjIzOjE5IGp1bGlhbiBFeHAgJCIpOwogCiAjaW5jbHVkZSA8c3lzL3BhcmFtLmg+
 CiAjaW5jbHVkZSA8c3lzL3N5c3RtLmg+CkBAIC00NSw3ICs0NSw2IEBACiAjaW5jbHVkZSA8c3lz
 L3NjaGVkLmg+CiAjaW5jbHVkZSA8Z2VvbS9taXJyb3IvZ19taXJyb3IuaD4KIAotCiBzdGF0aWMg
 TUFMTE9DX0RFRklORShNX01JUlJPUiwgIm1pcnJvcl9kYXRhIiwgIkdFT01fTUlSUk9SIERhdGEi
 KTsKIAogU1lTQ1RMX0RFQ0woX2tlcm5fZ2VvbSk7CkBAIC03MSw3ICs3MCwxMiBAQAogVFVOQUJM
 RV9JTlQoImtlcm4uZ2VvbS5taXJyb3Iuc3luY19yZXF1ZXN0cyIsICZnX21pcnJvcl9zeW5jcmVx
 cyk7CiBTWVNDVExfVUlOVChfa2Vybl9nZW9tX21pcnJvciwgT0lEX0FVVE8sIHN5bmNfcmVxdWVz
 dHMsIENUTEZMQUdfUkRUVU4sCiAgICAgJmdfbWlycm9yX3N5bmNyZXFzLCAwLCAiUGFyYWxsZWwg
 c3luY2hyb25pemF0aW9uIEkvTyByZXF1ZXN0cy4iKTsKK3N0YXRpYyB1X2ludCBnX21pcnJvcl9w
 bHVzZGVsYXkgPSA2MDAwMDsKK1RVTkFCTEVfSU5UKCJrZXJuLmdlb20ubWlycm9yLnBsdXNkZWxh
 eSIsICZnX21pcnJvcl9wbHVzZGVsYXkpOworU1lTQ1RMX1VJTlQoX2tlcm5fZ2VvbV9taXJyb3Is
 IE9JRF9BVVRPLCBwbHVzZGVsYXksIENUTEZMQUdfUlcsCisgICAgJmdfbWlycm9yX3BsdXNkZWxh
 eSwgMCwgIkFkZGl0aW9uYWwgbG9hZCBkZWxheSBpbiAxLzY1NTM2dGhzIG9mIGEgc2Vjb25kLiIp
 OwogCisKICNkZWZpbmUJTVNMRUVQKGlkZW50LCBtdHgsIHByaW9yaXR5LCB3bWVzZywgdGltZW91
 dCkJZG8gewkJXAogCUdfTUlSUk9SX0RFQlVHKDQsICIlczogU2xlZXBpbmcgJXAuIiwgX19mdW5j
 X18sIChpZGVudCkpOwlcCiAJbXNsZWVwKChpZGVudCksIChtdHgpLCAocHJpb3JpdHkpLCAod21l
 c2cpLCAodGltZW91dCkpOwkJXApAQCAtNDUxLDggKzQ1NSw3IEBACiAJZGlzay0+ZF9pZCA9IG1k
 LT5tZF9kaWQ7CiAJZGlzay0+ZF9zdGF0ZSA9IEdfTUlSUk9SX0RJU0tfU1RBVEVfTk9ORTsKIAlk
 aXNrLT5kX3ByaW9yaXR5ID0gbWQtPm1kX3ByaW9yaXR5OwotCWRpc2stPmRfZGVsYXkuc2VjID0g
 MDsKLQlkaXNrLT5kX2RlbGF5LmZyYWMgPSAwOworCWRpc2stPmxhc3Rfb2Zmc2V0ID0gMDsKIAli
 aW51cHRpbWUoJmRpc2stPmRfbGFzdF91c2VkKTsKIAlkaXNrLT5kX2ZsYWdzID0gbWQtPm1kX2Rm
 bGFnczsKIAlpZiAobWQtPm1kX3Byb3ZpZGVyWzBdICE9ICdcMCcpCkBAIC04NjMsMTYgKzg2Niw2
 IEBACiB9CiAKIHN0YXRpYyB2b2lkCi1nX21pcnJvcl91cGRhdGVfZGVsYXkoc3RydWN0IGdfbWly
 cm9yX2Rpc2sgKmRpc2ssIHN0cnVjdCBiaW8gKmJwKQotewotCi0JaWYgKGRpc2stPmRfc29mdGMt
 PnNjX2JhbGFuY2UgIT0gR19NSVJST1JfQkFMQU5DRV9MT0FEKQotCQlyZXR1cm47Ci0JYmludXB0
 aW1lKCZkaXNrLT5kX2RlbGF5KTsKLQliaW50aW1lX3N1YigmZGlzay0+ZF9kZWxheSwgJmJwLT5i
 aW9fdDApOwotfQotCi1zdGF0aWMgdm9pZAogZ19taXJyb3JfZG9uZShzdHJ1Y3QgYmlvICpicCkK
 IHsKIAlzdHJ1Y3QgZ19taXJyb3Jfc29mdGMgKnNjOwpAQCAtOTA0LDggKzg5Nyw2IEBACiAJCWdf
 dG9wb2xvZ3lfbG9jaygpOwogCQlnX21pcnJvcl9raWxsX2NvbnN1bWVyKHNjLCBicC0+YmlvX2Zy
 b20pOwogCQlnX3RvcG9sb2d5X3VubG9jaygpOwotCX0gZWxzZSB7Ci0JCWdfbWlycm9yX3VwZGF0
 ZV9kZWxheShkaXNrLCBicCk7CiAJfQogCiAJcGJwLT5iaW9faW5iZWQrKzsKQEAgLTE0NzIsMjUg
 KzE0NjMsNDUgQEAKIAlzdHJ1Y3QgZ19jb25zdW1lciAqY3A7CiAJc3RydWN0IGJpbyAqY2JwOwog
 CXN0cnVjdCBiaW50aW1lIGN1cnRpbWU7CisJb2ZmX3QgIGJpb19vZmZzZXQgPSBicC0+YmlvX29m
 ZnNldDsKKwlvZmZfdCAgYmVzdF9kaXN0ID0gLTEsIGRpc3Q7CisJdWludDY0X3QgYmVzdF91c2Vf
 ZGVsYXkgPSAwLCB1c2VfZGVsYXkgPSAwOwogCi0JYmludXB0aW1lKCZjdXJ0aW1lKTsKKwlnZXRi
 aW51cHRpbWUoJmN1cnRpbWUpOwogCS8qCi0JICogRmluZCBhIGRpc2sgd2hpY2ggdGhlIHNtYWxs
 ZXN0IGxvYWQuCisJICogRmluZCB0aGUgZGlzayB3aGljaCBoYXMgdGhlIHNtYWxsZXN0IHJhdGlv
 IG9mIGRpc3RhbmNlIHRvIHVzZQorCSAqIGRlbGF5LCBpLmUuIGl0cyBoZWFkIGxvb2tzIGNsb3Nl
 c3QgdG8gYmlvX29mZnNldCBhbmQgaXQgd2FzIHVzZWQKKwkgKiBsZWFzdCByZWNlbnRseS4KIAkg
 Ki8KIAlkaXNrID0gTlVMTDsKIAlMSVNUX0ZPUkVBQ0goZHAsICZzYy0+c2NfZGlza3MsIGRfbmV4
 dCkgewogCQlpZiAoZHAtPmRfc3RhdGUgIT0gR19NSVJST1JfRElTS19TVEFURV9BQ1RJVkUpCiAJ
 CQljb250aW51ZTsKLQkJLyogSWYgZGlzayB3YXNuJ3QgdXNlZCBmb3IgbW9yZSB0aGFuIDIgc2Vj
 LCB1c2UgaXQuICovCi0JCWlmIChjdXJ0aW1lLnNlYyAtIGRwLT5kX2xhc3RfdXNlZC5zZWMgPj0g
 MikgeworCisJCWRpc3QgPSBkcC0+bGFzdF9vZmZzZXQgLSBiaW9fb2Zmc2V0OworCQlpZiAoZGlz
 dCA8IDApCisJCQlkaXN0ID0gLWRpc3Q7CisKKwkJLyoKKwkJICogQ2FsY3VsYXRlIHRoZSB1c2Ug
 ZGVsYXkgYXMgZm9sbG93czogQWRkIHRoZSBzeXNjdGwKKwkJICogY29uZmlndXJlZCBkZWxheSwg
 dGhlbiBjb252ZXJ0IHRoZSBiaW50aW1lIHN0cnVjdHVyZQorCQkgKiBpbiB0ZXJtcyBvZiAxLzY1
 NTM2dGhzIG9mIGEgc2Vjb25kIGJlZm9yZSBhZGRpbmcgaXRzCisJCSAqIGNvbXBvbmVudHMuICBT
 byBtdWx0aXBseSBzZWNvbmRzIGRpZmZlcmVuY2UgYnkgNjU1MzYKKwkJICogYW5kIGRyb3AgYWxs
 IGJ1dCB0aGUgMTYgbW9zdCBzaWduaWZpY2FudCBiaXRzIGluIHRoZQorCQkgKiBmcmFjdGlvbiwg
 c2luY2UgdGhleSdyZSBhbGwgZ3JlYXRlciB0aGFuIDEvNjU1MzYuCisJCSAqLworCQl1c2VfZGVs
 YXkgPSBnX21pcnJvcl9wbHVzZGVsYXk7CisJCXVzZV9kZWxheSArPSAoKGN1cnRpbWUuc2VjIC0g
 ZHAtPmRfbGFzdF91c2VkLnNlYykgPDwgMTYpOworCQl1c2VfZGVsYXkgKz0gKChjdXJ0aW1lLmZy
 YWMgLSBkcC0+ZF9sYXN0X3VzZWQuZnJhYykgPj4gNDgpOworCisJCWlmIChiZXN0X2Rpc3QgPT0g
 LTEgfHwKKwkJICAgIGRpc3QgKiBiZXN0X3VzZV9kZWxheSA8IGJlc3RfZGlzdCAqIHVzZV9kZWxh
 eSkgewogCQkJZGlzayA9IGRwOwotCQkJYnJlYWs7CisJCQliZXN0X2Rpc3QgPSBkaXN0OworCQkJ
 YmVzdF91c2VfZGVsYXkgPSB1c2VfZGVsYXk7CiAJCX0KLQkJaWYgKGRpc2sgPT0gTlVMTCB8fAot
 CQkgICAgYmludGltZV9jbXAoJmRwLT5kX2RlbGF5LCAmZGlzay0+ZF9kZWxheSkgPCAwKSB7Ci0J
 CQlkaXNrID0gZHA7Ci0JCX0KIAl9CisKIAlLQVNTRVJUKGRpc2sgIT0gTlVMTCwgKCJOVUxMIGRp
 c2sgZm9yICVzLiIsIHNjLT5zY19uYW1lKSk7CiAJY2JwID0gZ19jbG9uZV9iaW8oYnApOwogCWlm
 IChjYnAgPT0gTlVMTCkgewpAQCAtMTUwNSw3ICsxNTE2LDggQEAKIAljcCA9IGRpc2stPmRfY29u
 c3VtZXI7CiAJY2JwLT5iaW9fZG9uZSA9IGdfbWlycm9yX2RvbmU7CiAJY2JwLT5iaW9fdG8gPSBj
 cC0+cHJvdmlkZXI7Ci0JYmludXB0aW1lKCZkaXNrLT5kX2xhc3RfdXNlZCk7CisJZGlzay0+ZF9s
 YXN0X3VzZWQgPSBjdXJ0aW1lOworCWRpc2stPmxhc3Rfb2Zmc2V0ID0gYmlvX29mZnNldDsKIAlH
 X01JUlJPUl9MT0dSRVEoMywgY2JwLCAiU2VuZGluZyByZXF1ZXN0LiIpOwogCUtBU1NFUlQoY3At
 PmFjciA+PSAxICYmIGNwLT5hY3cgPj0gMSAmJiBjcC0+YWNlID49IDEsCiAJICAgICgiQ29uc3Vt
 ZXIgJXMgbm90IG9wZW5lZCAociVkdyVkZSVkKS4iLCBjcC0+cHJvdmlkZXItPm5hbWUsIGNwLT5h
 Y3IsCkBAIC0xNjU5LDYgKzE2NzEsNyBAQAogCQkJCWdfaW9fZGVsaXZlcihicCwgYnAtPmJpb19l
 cnJvcik7CiAJCQkJcmV0dXJuOwogCQkJfQorCQkJZGlzay0+bGFzdF9vZmZzZXQgPSBicC0+Ymlv
 X29mZnNldDsKIAkJCWJpb3FfaW5zZXJ0X3RhaWwoJnF1ZXVlLCBjYnApOwogCQkJY2JwLT5iaW9f
 ZG9uZSA9IGdfbWlycm9yX2RvbmU7CiAJCQljcCA9IGRpc2stPmRfY29uc3VtZXI7CkluZGV4OiBn
 X21pcnJvci5oCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09
 PT09PT09PT09PT09PT09PT09PT0KLS0tIGdfbWlycm9yLmgJKHJldmlzaW9uIDE4NTU2NykKKysr
 IGdfbWlycm9yLmgJKHdvcmtpbmcgY29weSkKQEAgLTIzLDcgKzIzLDcgQEAKICAqIE9VVCBPRiBU
 SEUgVVNFIE9GIFRISVMgU09GVFdBUkUsIEVWRU4gSUYgQURWSVNFRCBPRiBUSEUgUE9TU0lCSUxJ
 VFkgT0YKICAqIFNVQ0ggREFNQUdFLgogICoKLSAqICRGcmVlQlNEJAorICogJEZyZWVCU0Q6IHNy
 Yy9zeXMvZ2VvbS9taXJyb3IvZ19taXJyb3IuaCx2IDEuMjQgMjAwNi8xMS8wMSAyMjo1MTo0OSBw
 amQgRXhwICQKICAqLwogCiAjaWZuZGVmCV9HX01JUlJPUl9IXwpAQCAtMTMzLDcgKzEzMyw3IEBA
 CiAJc3RydWN0IGdfbWlycm9yX3NvZnRjCSpkX3NvZnRjOyAvKiBCYWNrLXBvaW50ZXIgdG8gc29m
 dGMuICovCiAJaW50CQkgZF9zdGF0ZTsJLyogRGlzayBzdGF0ZS4gKi8KIAl1X2ludAkJIGRfcHJp
 b3JpdHk7CS8qIERpc2sgcHJpb3JpdHkuICovCi0Jc3RydWN0IGJpbnRpbWUJIGRfZGVsYXk7CS8q
 IERpc2sgZGVsYXkuICovCisJb2ZmX3QJCSBsYXN0X29mZnNldDsJLyogTEJBIG9mIGxhc3Qgb3Bl
 cmF0aW9uLiAqLwogCXN0cnVjdCBiaW50aW1lCSBkX2xhc3RfdXNlZDsJLyogV2hlbiBkaXNrIHdh
 cyBsYXN0IHVzZWQuICovCiAJdWludDY0X3QJIGRfZmxhZ3M7CS8qIEFkZGl0aW9uYWwgZmxhZ3Mu
 ICovCiAJdV9pbnQJCSBkX2dlbmlkOwkvKiBEaXNrJ3MgZ2VuZXJhdGlvbiBJRC4gKi8K
 ------=_Part_29247_27764781.1228254629354--


More information about the freebsd-geom mailing list