From ivoras at freebsd.org Mon Dec 1 02:12:40 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Mon Dec 1 02:12:46 2008 Subject: (trivial) patch to add provider name to printed warning In-Reply-To: <20081130233255.GA27667@keira.kiwi-computer.com> References: <20081130225805.GA27328@keira.kiwi-computer.com> <20081130233255.GA27667@keira.kiwi-computer.com> Message-ID: Rick C. Petty wrote: > On Sun, Nov 30, 2008 at 04:58:05PM -0600, Rick C. Petty wrote: >> Could someone with a commit bit (perhaps phk@) look at the following >> trivial patch? It can't hurt anything, since the provider name cannot >> be NULL here. Thanks, > > D'oh, the attachment was filtered. Here it is inline... I can take it. > > --- src/sys/geom/geom_bsd.c.orig 2007-12-17 19:24:27.000000000 -0600 > +++ src/sys/geom/geom_bsd.c 2008-11-30 03:09:04.000000000 -0600 > @@ -136,7 +136,8 @@ > } > > if (rawoffset != 0 && (off_t)rawoffset != ms->mbroffset) > - printf("WARNING: Expected rawoffset %jd, found %jd\n", > + printf("WARNING: %s expected rawoffset %jd, found %jd\n", > + gp->name, > (intmax_t)ms->mbroffset/dl.d_secsize, > (intmax_t)rawoffset/dl.d_secsize); > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 252 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20081201/4c7000c1/signature.pgp From ivoras at freebsd.org Mon Dec 1 02:27:36 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Mon Dec 1 02:27:43 2008 Subject: about pluggable disk scheduler In-Reply-To: <10210814530.20081130163737@gmail.com> References: <10210814530.20081130163737@gmail.com> Message-ID: Dmitry wrote: > Hello freebsd-geom, > > I'm a student, interested in developing pluggable disk scheduler > idea. I want to know if some work is being processed and if there is > need to do this work. 
> Also I have interest in participating in
> Summer of Code 2009, so I would like to contact with possible
> mentor for further discussion.

Hi,

You should first find out what happened to the old project to implement
a pluggable disk scheduler idea for a SoC. There's some information
here: http://wiki.freebsd.org/Hybrid . It has apparently been developed,
up to a point, but has never been integrated into FreeBSD sources, and
is thus abandoned now. You should:

a) Contact the old author for more information
b) Find out why the project hasn't been accepted / merged into the
   official FreeBSD source tree
c) Decide if the problems that the previous project faced are something
   you can solve (or have been solved by others in the meantime) so your
   project results end up in FreeBSD.

From bugmaster at FreeBSD.org  Mon Dec  1 03:06:56 2008
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Dec  1 03:08:03 2008
Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org
Message-ID: <200812011106.mB1B6tph052545@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker Resp.
Description -------------------------------------------------------------------------------- o kern/129245 geom [geom] gcache is more suitable for suffix based provid o kern/128529 geom [gjournal] root FS on GEOM Journal cannot boot when jo o kern/128398 geom [PATCH] glabel(8): teach geom_label to recognise gpt l f kern/128276 geom [gmirror] machine lock up when gmirror module is used o kern/126902 geom [geom] [geom_label] Kernel panic during install boot o kern/124973 geom [gjournal] [patch] boot order affects geom_journal con o kern/124969 geom gvinum(8): gvinum raid5 plex does not detect missing s o kern/124294 geom [geom] gmirror(8) have inappropriate logic when workin o kern/124130 geom [gmirror][usb] gmirror fails to start usb devices that o kern/123962 geom [panic] [gjournal] gjournal (455Gb data, 8Gb journal), o kern/123630 geom [patch] [gmirror] gmirror doesnt allow the original dr o kern/123122 geom [geom] GEOM / gjournal kernel lock f kern/122415 geom [geom] UFS labels are being constantly created and rem o kern/122067 geom [geom] [panic] Geom crashed during boot o kern/121559 geom [patch] [geom] geom label class allows to create inacc o kern/121364 geom [gmirror] Removing all providers create a "zombie" mir o kern/120231 geom [geom] GEOM_CONCAT error adding second drive o kern/120044 geom [msdosfs] [geom] incorrect MSDOSFS label fries adminis o kern/120021 geom [geom] [panic] net-p2p/qbittorrent crashes system when o kern/119743 geom [geom] geom label for cds is keeped after dismount and f kern/115547 geom [geom] [patch] [request] let GEOM Eli get password fro o kern/114532 geom [geom] GEOM_MIRROR shows up in kldstat even if compile o kern/113957 geom [gmirror] gmirror is intermittently reporting a degrad o kern/113885 geom [gmirror] [patch] improved gmirror balance algorithm o kern/113837 geom [geom] unable to access 1024 sector size storage o kern/113419 geom [geom] geom fox multipathing not failing back p bin/110705 geom gmirror(8) control utility 
does not exit with correct o kern/107707 geom [geom] [patch] [request] add new class geom_xbox360 to o kern/104389 geom [geom] [patch] sys/geom/geom_dump.c doesn't encode XML o kern/98034 geom [geom] dereference of NULL pointer in acd_geom_detach o kern/94632 geom [geom] Kernel output resets input while GELI asks for o kern/90582 geom [geom] [panic] Restore cause panic string (ffs_blkfree o bin/90093 geom fdisk(8) incapable of altering in-core geometry a kern/89660 geom [vinum] [patch] [panic] due to g_malloc returning null o kern/89546 geom [geom] GEOM error s kern/89102 geom [geom] [panic] panic when forced unmount FS from unplu o kern/87544 geom [gbde] mmaping large files on a gbde filesystem deadlo o kern/84556 geom [geom] GBDE-encrypted swap causes panic at shutdown o kern/79251 geom [2TB] newfs fails on 2.6TB gbde device o kern/79035 geom [vinum] gvinum unable to create a striped set of mirro o bin/78131 geom gbde(8) "destroy" not working. s kern/73177 geom kldload geom_* causes panic due to memory exhaustion 42 problems total. From ivoras at freebsd.org Mon Dec 1 07:03:54 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Mon Dec 1 07:04:01 2008 Subject: (trivial) patch to add provider name to printed warning In-Reply-To: References: <20081130225805.GA27328@keira.kiwi-computer.com> <20081130233255.GA27667@keira.kiwi-computer.com> Message-ID: Ivan Voras wrote: > Rick C. Petty wrote: >> On Sun, Nov 30, 2008 at 04:58:05PM -0600, Rick C. Petty wrote: >>> Could someone with a commit bit (perhaps phk@) look at the following >>> trivial patch? It can't hurt anything, since the provider name cannot >>> be NULL here. Thanks, >> D'oh, the attachment was filtered. Here it is inline... > > I can take it. 
http://svn.freebsd.org/changeset/base/185518 > >> --- src/sys/geom/geom_bsd.c.orig 2007-12-17 19:24:27.000000000 -0600 >> +++ src/sys/geom/geom_bsd.c 2008-11-30 03:09:04.000000000 -0600 >> @@ -136,7 +136,8 @@ >> } >> >> if (rawoffset != 0 && (off_t)rawoffset != ms->mbroffset) >> - printf("WARNING: Expected rawoffset %jd, found %jd\n", >> + printf("WARNING: %s expected rawoffset %jd, found %jd\n", >> + gp->name, >> (intmax_t)ms->mbroffset/dl.d_secsize, >> (intmax_t)rawoffset/dl.d_secsize); >> > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 252 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20081201/05c60a55/signature.pgp From rajkumars at gmail.com Tue Dec 2 05:06:45 2008 From: rajkumars at gmail.com (Rajkumar S) Date: Tue Dec 2 05:06:51 2008 Subject: g_vfs_done():ufs/root1[READ(offset=106196992, length=4096)]error = 6 Message-ID: <64de5c8b0812020437t30236d52p6e748d508b7b7b7@mail.gmail.com> Hi, I am working on a nanobsd derived system for updating an embedded pfSense image. The disk is partitioned into 4 partitions with 2 similar "code" partitions. One of the two code partition is live at any moment. To update the partition image is written to the other partition and a command like boot0cfg -s 2 -v ad2 to boot to the new partition. Instead of using device names I am using bsdlabel and refer the disks using the label in fdisk. 
Current partitions are as follows: nanoimg:~# fdisk ad2 ******* Working on device /dev/ad2 ******* parameters extracted from in-core disklabel are: cylinders=1999 heads=16 sectors/track=63 (1008 blks/cyl) Figures below won't work with BIOS for partitions not in cyl 1 parameters to be used for BIOS calculations are: cylinders=1999 heads=16 sectors/track=63 (1008 blks/cyl) Media sector size is 512 Warning: BIOS sector numbering starts with sector 1 Information from DOS bootblock is: The data for partition 1 is: sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD) start 32, size 239584 (116 Meg), flag 80 (active) beg: cyl 0/ head 1/ sector 1; end: cyl 467/ head 15/ sector 32 The data for partition 2 is: sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD) start 239648, size 239584 (116 Meg), flag 0 beg: cyl 468/ head 1/ sector 1; end: cyl 935/ head 15/ sector 32 The data for partition 3 is: sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD) start 479232, size 2048 (1 Meg), flag 0 beg: cyl 936/ head 0/ sector 1; end: cyl 939/ head 15/ sector 32 The data for partition 4 is: sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD) start 481280, size 20480 (10 Meg), flag 0 beg: cyl 940/ head 0/ sector 1; end: cyl 979/ head 15/ sector 32 dmesg shows the following when booting: ad2: 983MB at ata1-master PIO4 GEOM: ad2: partition 4 does not start on a track boundary. GEOM: ad2: partition 4 does not end on a track boundary. GEOM: ad2: partition 3 does not start on a track boundary. GEOM: ad2: partition 3 does not end on a track boundary. GEOM: ad2: partition 2 does not start on a track boundary. GEOM: ad2: partition 2 does not end on a track boundary. GEOM: ad2: partition 1 does not start on a track boundary. GEOM: ad2: partition 1 does not end on a track boundary. GEOM_LABEL: Label for provider ad2s3 is ufs/cfg. GEOM_LABEL: Label for provider ad2s4 is ufs/cf. GEOM_LABEL: Label for provider ad2s1a is ufs/root0. GEOM_LABEL: Label for provider ad2s2a is ufs/root1. 
Trying to mount root from ufs:/dev/ufs/root0 Fstab is: /dev/ufs/root0 / ufs ro 1 1 /dev/ufs/cfg /cfg ufs rw,noauto 2 2 /dev/ufs/cf /cf ufs ro 1 1 I can switch booting to ufs/root0 or ufs/root1 using a command like mettlenano:~# sysctl kern.geom.debugflags=16 kern.geom.debugflags: 0 -> 16 and mettlenano:~# boot0cfg -s 1 -v ad2 # flag start chs type end chs offset size 1 0x00 0: 1: 1 0xa5 467: 15:32 32 239584 2 0x80 468: 1: 1 0xa5 935: 15:32 239648 239584 3 0x00 936: 0: 1 0xa5 939: 15:32 479232 2048 4 0x00 940: 0: 1 0xa5 979: 15:32 481280 20480 version=1.0 drive=0x80 mask=0x3 ticks=182 options=packet,update,nosetdrv default_selection=F1 (Slice 1) But after executing this command I get the following messages in my dmesg. GEOM_LABEL: Label ufs/cf removed. GEOM_LABEL: Label ufs/cfg removed. GEOM_LABEL: Label ufs/root1 removed. GEOM_LABEL: Label ufs/root0 removed. g_vfs_done():ufs/root1[READ(offset=106196992, length=4096)]error = 6 g_vfs_done():ufs/root1[READ(offset=106196992, length=4096)]error = 6 g_vfs_done():ufs/root1[READ(offset=106201088, length=4096)]error = 6 I have no idea why such messages are appearing. Also some commands like reboot does not work. mettlenano:~# reboot /sbin/reboot: Device not configured. mettlenano:~# less /usr/bin/less: Device not configured. But some other commands work. Any one with any idea about what could be wrong here? raj From k0802647 at telus.net Tue Dec 2 13:44:55 2008 From: k0802647 at telus.net (Carl) Date: Tue Dec 2 13:45:02 2008 Subject: gmirror, identical priority values Message-ID: <49359F9D.4090200@telus.net> What are the consequences of creating a mirror provider using gmirror with more than one consumer having the *same* priority value. This is an easy mistake to make and it can't easily be fixed because gmirror provides no way to change priority after the fact. For what it's worth, I'm using the round-robin balancing algorithm, but as I understand it, priority isn't just used for the balancing algorithm. 
Note how even documentation in FreeBSD Diary makes this mistake: http://www.freebsddiary.org/gmirror.php And it's easy to find other folks in the mailing lists making this mistake too. Carl / K0802647 From will at firepipe.net Tue Dec 2 14:20:04 2008 From: will at firepipe.net (Will Andrews) Date: Tue Dec 2 14:20:11 2008 Subject: kern/113885: [gmirror] [patch] improved gmirror balance algorithm Message-ID: <200812022220.mB2MK37K085195@freefall.freebsd.org> The following reply was made to PR kern/113885; it has been noted by GNATS. From: "Will Andrews" To: bug-followup@freebsd.org Cc: "Mykola Zubach" Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm Date: Tue, 2 Dec 2008 14:50:29 -0700 ------=_Part_29247_27764781.1228254629354 Content-Type: multipart/alternative; boundary="----=_Part_29248_27670565.1228254629354" ------=_Part_29248_27670565.1228254629354 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline I have attached what I believe is a better version of your patch. It: 1) Fixes the type ambiguity of the new use_delay/best_use_delay and dist/best_dist variables, to match the variables used in their calculations; they should be uint64_t and off_t, respectively. 2) Uses bit shifts instead of multiplication/division in the use delay and distance calculations. The precision loss should be acceptable in this situation. 3) Cleans up the style of the code; add more commenting, better comments. 4) Gets rid of the g_mirror_disk.d_delay variable since it is no longer used, along with the function the original patch short-circuited. In my testing, with 16 simultaneous processes performing the same test at the same time, by throughput, random reads/writes improved by about 35% (low variance), while sequential reads/writes improved by 100-400% (high variance). IOs also increased proportionally. 
Testing was done using "rawio -a -p 16 /dev/mirror/testa", where the test mirror was composed of two 160GB Seagate SATA disks and the system was a dual Opteron 246 with 1.5GB of RAM and no other load, running 8.0-CURRENT as of 12/1/2008. The CPU usage impact compared to the old load algorithm appears negligible as well.

Regards,
--Will.
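
The selection rule described in this message (prefer the disk whose last request was closest to the new offset and which has been idle longest, with the elapsed time converted to 1/65536ths of a second by bit shifts) can be sketched in userland C as follows. This is a minimal illustration under stated assumptions, not the code from the attached diff: the struct and function names are hypothetical, `plusdelay` is assumed to be a constant bias added to every delay, and the ratio is compared in floating point for readability, which kernel code would avoid.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userland stand-in for the kernel's struct bintime. */
struct bt_sketch {
	int64_t  sec;	/* whole seconds */
	uint64_t frac;	/* fraction of a second, in units of 2^-64 s */
};

/* Hypothetical per-disk state: where and when the disk was last used. */
struct disk_sketch {
	const char	*name;
	int64_t		 last_offset;	/* offset of the last request */
	struct bt_sketch last_used;	/* time of the last request */
};

/*
 * Elapsed time between two timestamps in 1/65536ths of a second.
 * Seconds are shifted left by 16; only the top 16 bits of the 64-bit
 * fraction are kept, which is where the precision loss comes from.
 * Assumes "now" is not earlier than "last".
 */
static uint64_t
elapsed_ticks(struct bt_sketch now, struct bt_sketch last)
{
	int64_t sec = now.sec - last.sec;
	uint64_t frac = now.frac - last.frac;	/* wraps modulo 2^64 */

	if (now.frac < last.frac)
		sec--;				/* borrow from the seconds */
	return (((uint64_t)sec << 16) + (frac >> 48));
}

/*
 * Pick the disk with the smallest distance-to-delay ratio: a short
 * seek distance and a long idle time both lower the score.
 * Returns NULL only when n is 0.
 */
static const struct disk_sketch *
pick_disk(const struct disk_sketch *disks, size_t n, int64_t offset,
    struct bt_sketch now, uint64_t plusdelay)
{
	const struct disk_sketch *best = NULL;
	double best_score = 0.0;

	for (size_t i = 0; i < n; i++) {
		int64_t dist = disks[i].last_offset - offset;
		uint64_t delay = plusdelay +
		    elapsed_ticks(now, disks[i].last_used);
		double score;

		if (dist < 0)
			dist = -dist;
		if (delay == 0)
			delay = 1;	/* avoid dividing by zero */
		score = (double)dist / (double)delay;
		if (best == NULL || score < best_score) {
			best = &disks[i];
			best_score = score;
		}
	}
	return (best);
}
```

With two disks last used at the same time, the one whose last offset is nearer to the request wins; with equal distances, the one that has been idle longer wins.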
------=_Part_29248_27670565.1228254629354-- ------=_Part_29247_27764781.1228254629354 Content-Type: application/octet-stream; name=g_mirror_113885.diff Content-Transfer-Encoding: base64 X-Attachment-Id: f_fo92vwa00 Content-Disposition: attachment; filename=g_mirror_113885.diff SW5kZXg6IGdfbWlycm9yLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gZ19taXJyb3IuYwkocmV2aXNpb24gMTg1 NTY3KQorKysgZ19taXJyb3IuYwkod29ya2luZyBjb3B5KQpAQCAtMjUsNyArMjUsNyBAQAogICov CiAKICNpbmNsdWRlIDxzeXMvY2RlZnMuaD4KLV9fRkJTRElEKCIkRnJlZUJTRCQiKTsKK19fRkJT RElEKCIkRnJlZUJTRDogc3JjL3N5cy9nZW9tL21pcnJvci9nX21pcnJvci5jLHYgMS45NCAyMDA3 LzEwLzIwIDIzOjIzOjE5IGp1bGlhbiBFeHAgJCIpOwogCiAjaW5jbHVkZSA8c3lzL3BhcmFtLmg+ CiAjaW5jbHVkZSA8c3lzL3N5c3RtLmg+CkBAIC00NSw3ICs0NSw2IEBACiAjaW5jbHVkZSA8c3lz L3NjaGVkLmg+CiAjaW5jbHVkZSA8Z2VvbS9taXJyb3IvZ19taXJyb3IuaD4KIAotCiBzdGF0aWMg TUFMTE9DX0RFRklORShNX01JUlJPUiwgIm1pcnJvcl9kYXRhIiwgIkdFT01fTUlSUk9SIERhdGEi KTsKIAogU1lTQ1RMX0RFQ0woX2tlcm5fZ2VvbSk7CkBAIC03MSw3ICs3MCwxMiBAQAogVFVOQUJM RV9JTlQoImtlcm4uZ2VvbS5taXJyb3Iuc3luY19yZXF1ZXN0cyIsICZnX21pcnJvcl9zeW5jcmVx cyk7CiBTWVNDVExfVUlOVChfa2Vybl9nZW9tX21pcnJvciwgT0lEX0FVVE8sIHN5bmNfcmVxdWVz dHMsIENUTEZMQUdfUkRUVU4sCiAgICAgJmdfbWlycm9yX3N5bmNyZXFzLCAwLCAiUGFyYWxsZWwg c3luY2hyb25pemF0aW9uIEkvTyByZXF1ZXN0cy4iKTsKK3N0YXRpYyB1X2ludCBnX21pcnJvcl9w bHVzZGVsYXkgPSA2MDAwMDsKK1RVTkFCTEVfSU5UKCJrZXJuLmdlb20ubWlycm9yLnBsdXNkZWxh eSIsICZnX21pcnJvcl9wbHVzZGVsYXkpOworU1lTQ1RMX1VJTlQoX2tlcm5fZ2VvbV9taXJyb3Is IE9JRF9BVVRPLCBwbHVzZGVsYXksIENUTEZMQUdfUlcsCisgICAgJmdfbWlycm9yX3BsdXNkZWxh eSwgMCwgIkFkZGl0aW9uYWwgbG9hZCBkZWxheSBpbiAxLzY1NTM2dGhzIG9mIGEgc2Vjb25kLiIp OwogCisKICNkZWZpbmUJTVNMRUVQKGlkZW50LCBtdHgsIHByaW9yaXR5LCB3bWVzZywgdGltZW91 dCkJZG8gewkJXAogCUdfTUlSUk9SX0RFQlVHKDQsICIlczogU2xlZXBpbmcgJXAuIiwgX19mdW5j X18sIChpZGVudCkpOwlcCiAJbXNsZWVwKChpZGVudCksIChtdHgpLCAocHJpb3JpdHkpLCAod21l c2cpLCAodGltZW91dCkpOwkJXApAQCAtNDUxLDggKzQ1NSw3IEBACiAJZGlzay0+ZF9pZCA9IG1k 
LT5tZF9kaWQ7CiAJZGlzay0+ZF9zdGF0ZSA9IEdfTUlSUk9SX0RJU0tfU1RBVEVfTk9ORTsKIAlk aXNrLT5kX3ByaW9yaXR5ID0gbWQtPm1kX3ByaW9yaXR5OwotCWRpc2stPmRfZGVsYXkuc2VjID0g MDsKLQlkaXNrLT5kX2RlbGF5LmZyYWMgPSAwOworCWRpc2stPmxhc3Rfb2Zmc2V0ID0gMDsKIAli aW51cHRpbWUoJmRpc2stPmRfbGFzdF91c2VkKTsKIAlkaXNrLT5kX2ZsYWdzID0gbWQtPm1kX2Rm bGFnczsKIAlpZiAobWQtPm1kX3Byb3ZpZGVyWzBdICE9ICdcMCcpCkBAIC04NjMsMTYgKzg2Niw2 IEBACiB9CiAKIHN0YXRpYyB2b2lkCi1nX21pcnJvcl91cGRhdGVfZGVsYXkoc3RydWN0IGdfbWly cm9yX2Rpc2sgKmRpc2ssIHN0cnVjdCBiaW8gKmJwKQotewotCi0JaWYgKGRpc2stPmRfc29mdGMt PnNjX2JhbGFuY2UgIT0gR19NSVJST1JfQkFMQU5DRV9MT0FEKQotCQlyZXR1cm47Ci0JYmludXB0 aW1lKCZkaXNrLT5kX2RlbGF5KTsKLQliaW50aW1lX3N1YigmZGlzay0+ZF9kZWxheSwgJmJwLT5i aW9fdDApOwotfQotCi1zdGF0aWMgdm9pZAogZ19taXJyb3JfZG9uZShzdHJ1Y3QgYmlvICpicCkK IHsKIAlzdHJ1Y3QgZ19taXJyb3Jfc29mdGMgKnNjOwpAQCAtOTA0LDggKzg5Nyw2IEBACiAJCWdf dG9wb2xvZ3lfbG9jaygpOwogCQlnX21pcnJvcl9raWxsX2NvbnN1bWVyKHNjLCBicC0+YmlvX2Zy b20pOwogCQlnX3RvcG9sb2d5X3VubG9jaygpOwotCX0gZWxzZSB7Ci0JCWdfbWlycm9yX3VwZGF0 ZV9kZWxheShkaXNrLCBicCk7CiAJfQogCiAJcGJwLT5iaW9faW5iZWQrKzsKQEAgLTE0NzIsMjUg KzE0NjMsNDUgQEAKIAlzdHJ1Y3QgZ19jb25zdW1lciAqY3A7CiAJc3RydWN0IGJpbyAqY2JwOwog CXN0cnVjdCBiaW50aW1lIGN1cnRpbWU7CisJb2ZmX3QgIGJpb19vZmZzZXQgPSBicC0+YmlvX29m ZnNldDsKKwlvZmZfdCAgYmVzdF9kaXN0ID0gLTEsIGRpc3Q7CisJdWludDY0X3QgYmVzdF91c2Vf ZGVsYXkgPSAwLCB1c2VfZGVsYXkgPSAwOwogCi0JYmludXB0aW1lKCZjdXJ0aW1lKTsKKwlnZXRi aW51cHRpbWUoJmN1cnRpbWUpOwogCS8qCi0JICogRmluZCBhIGRpc2sgd2hpY2ggdGhlIHNtYWxs ZXN0IGxvYWQuCisJICogRmluZCB0aGUgZGlzayB3aGljaCBoYXMgdGhlIHNtYWxsZXN0IHJhdGlv IG9mIGRpc3RhbmNlIHRvIHVzZQorCSAqIGRlbGF5LCBpLmUuIGl0cyBoZWFkIGxvb2tzIGNsb3Nl c3QgdG8gYmlvX29mZnNldCBhbmQgaXQgd2FzIHVzZWQKKwkgKiBsZWFzdCByZWNlbnRseS4KIAkg Ki8KIAlkaXNrID0gTlVMTDsKIAlMSVNUX0ZPUkVBQ0goZHAsICZzYy0+c2NfZGlza3MsIGRfbmV4 dCkgewogCQlpZiAoZHAtPmRfc3RhdGUgIT0gR19NSVJST1JfRElTS19TVEFURV9BQ1RJVkUpCiAJ CQljb250aW51ZTsKLQkJLyogSWYgZGlzayB3YXNuJ3QgdXNlZCBmb3IgbW9yZSB0aGFuIDIgc2Vj 
LCB1c2UgaXQuICovCi0JCWlmIChjdXJ0aW1lLnNlYyAtIGRwLT5kX2xhc3RfdXNlZC5zZWMgPj0g MikgeworCisJCWRpc3QgPSBkcC0+bGFzdF9vZmZzZXQgLSBiaW9fb2Zmc2V0OworCQlpZiAoZGlz dCA8IDApCisJCQlkaXN0ID0gLWRpc3Q7CisKKwkJLyoKKwkJICogQ2FsY3VsYXRlIHRoZSB1c2Ug ZGVsYXkgYXMgZm9sbG93czogQWRkIHRoZSBzeXNjdGwKKwkJICogY29uZmlndXJlZCBkZWxheSwg dGhlbiBjb252ZXJ0IHRoZSBiaW50aW1lIHN0cnVjdHVyZQorCQkgKiBpbiB0ZXJtcyBvZiAxLzY1 NTM2dGhzIG9mIGEgc2Vjb25kIGJlZm9yZSBhZGRpbmcgaXRzCisJCSAqIGNvbXBvbmVudHMuICBT byBtdWx0aXBseSBzZWNvbmRzIGRpZmZlcmVuY2UgYnkgNjU1MzYKKwkJICogYW5kIGRyb3AgYWxs IGJ1dCB0aGUgMTYgbW9zdCBzaWduaWZpY2FudCBiaXRzIGluIHRoZQorCQkgKiBmcmFjdGlvbiwg c2luY2UgdGhleSdyZSBhbGwgZ3JlYXRlciB0aGFuIDEvNjU1MzYuCisJCSAqLworCQl1c2VfZGVs YXkgPSBnX21pcnJvcl9wbHVzZGVsYXk7CisJCXVzZV9kZWxheSArPSAoKGN1cnRpbWUuc2VjIC0g ZHAtPmRfbGFzdF91c2VkLnNlYykgPDwgMTYpOworCQl1c2VfZGVsYXkgKz0gKChjdXJ0aW1lLmZy YWMgLSBkcC0+ZF9sYXN0X3VzZWQuZnJhYykgPj4gNDgpOworCisJCWlmIChiZXN0X2Rpc3QgPT0g LTEgfHwKKwkJICAgIGRpc3QgKiBiZXN0X3VzZV9kZWxheSA8IGJlc3RfZGlzdCAqIHVzZV9kZWxh eSkgewogCQkJZGlzayA9IGRwOwotCQkJYnJlYWs7CisJCQliZXN0X2Rpc3QgPSBkaXN0OworCQkJ YmVzdF91c2VfZGVsYXkgPSB1c2VfZGVsYXk7CiAJCX0KLQkJaWYgKGRpc2sgPT0gTlVMTCB8fAot CQkgICAgYmludGltZV9jbXAoJmRwLT5kX2RlbGF5LCAmZGlzay0+ZF9kZWxheSkgPCAwKSB7Ci0J CQlkaXNrID0gZHA7Ci0JCX0KIAl9CisKIAlLQVNTRVJUKGRpc2sgIT0gTlVMTCwgKCJOVUxMIGRp c2sgZm9yICVzLiIsIHNjLT5zY19uYW1lKSk7CiAJY2JwID0gZ19jbG9uZV9iaW8oYnApOwogCWlm IChjYnAgPT0gTlVMTCkgewpAQCAtMTUwNSw3ICsxNTE2LDggQEAKIAljcCA9IGRpc2stPmRfY29u c3VtZXI7CiAJY2JwLT5iaW9fZG9uZSA9IGdfbWlycm9yX2RvbmU7CiAJY2JwLT5iaW9fdG8gPSBj cC0+cHJvdmlkZXI7Ci0JYmludXB0aW1lKCZkaXNrLT5kX2xhc3RfdXNlZCk7CisJZGlzay0+ZF9s YXN0X3VzZWQgPSBjdXJ0aW1lOworCWRpc2stPmxhc3Rfb2Zmc2V0ID0gYmlvX29mZnNldDsKIAlH X01JUlJPUl9MT0dSRVEoMywgY2JwLCAiU2VuZGluZyByZXF1ZXN0LiIpOwogCUtBU1NFUlQoY3At PmFjciA+PSAxICYmIGNwLT5hY3cgPj0gMSAmJiBjcC0+YWNlID49IDEsCiAJICAgICgiQ29uc3Vt ZXIgJXMgbm90IG9wZW5lZCAociVkdyVkZSVkKS4iLCBjcC0+cHJvdmlkZXItPm5hbWUsIGNwLT5h 
Y3IsCkBAIC0xNjU5LDYgKzE2NzEsNyBAQAogCQkJCWdfaW9fZGVsaXZlcihicCwgYnAtPmJpb19l cnJvcik7CiAJCQkJcmV0dXJuOwogCQkJfQorCQkJZGlzay0+bGFzdF9vZmZzZXQgPSBicC0+Ymlv X29mZnNldDsKIAkJCWJpb3FfaW5zZXJ0X3RhaWwoJnF1ZXVlLCBjYnApOwogCQkJY2JwLT5iaW9f ZG9uZSA9IGdfbWlycm9yX2RvbmU7CiAJCQljcCA9IGRpc2stPmRfY29uc3VtZXI7CkluZGV4OiBn X21pcnJvci5oCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT0KLS0tIGdfbWlycm9yLmgJKHJldmlzaW9uIDE4NTU2NykKKysr IGdfbWlycm9yLmgJKHdvcmtpbmcgY29weSkKQEAgLTIzLDcgKzIzLDcgQEAKICAqIE9VVCBPRiBU SEUgVVNFIE9GIFRISVMgU09GVFdBUkUsIEVWRU4gSUYgQURWSVNFRCBPRiBUSEUgUE9TU0lCSUxJ VFkgT0YKICAqIFNVQ0ggREFNQUdFLgogICoKLSAqICRGcmVlQlNEJAorICogJEZyZWVCU0Q6IHNy Yy9zeXMvZ2VvbS9taXJyb3IvZ19taXJyb3IuaCx2IDEuMjQgMjAwNi8xMS8wMSAyMjo1MTo0OSBw amQgRXhwICQKICAqLwogCiAjaWZuZGVmCV9HX01JUlJPUl9IXwpAQCAtMTMzLDcgKzEzMyw3IEBA CiAJc3RydWN0IGdfbWlycm9yX3NvZnRjCSpkX3NvZnRjOyAvKiBCYWNrLXBvaW50ZXIgdG8gc29m dGMuICovCiAJaW50CQkgZF9zdGF0ZTsJLyogRGlzayBzdGF0ZS4gKi8KIAl1X2ludAkJIGRfcHJp b3JpdHk7CS8qIERpc2sgcHJpb3JpdHkuICovCi0Jc3RydWN0IGJpbnRpbWUJIGRfZGVsYXk7CS8q IERpc2sgZGVsYXkuICovCisJb2ZmX3QJCSBsYXN0X29mZnNldDsJLyogTEJBIG9mIGxhc3Qgb3Bl cmF0aW9uLiAqLwogCXN0cnVjdCBiaW50aW1lCSBkX2xhc3RfdXNlZDsJLyogV2hlbiBkaXNrIHdh cyBsYXN0IHVzZWQuICovCiAJdWludDY0X3QJIGRfZmxhZ3M7CS8qIEFkZGl0aW9uYWwgZmxhZ3Mu ICovCiAJdV9pbnQJCSBkX2dlbmlkOwkvKiBEaXNrJ3MgZ2VuZXJhdGlvbiBJRC4gKi8K ------=_Part_29247_27764781.1228254629354-- From vadim_nuclight at mail.ru Wed Dec 3 03:30:27 2008 From: vadim_nuclight at mail.ru (Vadim Goncharov) Date: Wed Dec 3 03:30:34 2008 Subject: System freeze with gvinum References: Message-ID: Hi Hilko Meyer! On Sat, 29 Nov 2008 23:47:59 +0100; Hilko Meyer wrote about 'System freeze with gvinum': > Every time I tried to run newfs on one of the volumes it stucked and the > complete system freezed, so I cannot provide a coredump. The system runs > 6.4-RELEASE that was compiled today. What can I do to debug this problem? You can try to call panic manually. 
Change your keymap first, e.g.:

vadim@hostel:~>grep key /etc/rc.conf
keymap="ru.koi8-r.vg"
keyrate="fast"
vadim@hostel:~>grep panic ru.koi8-r.vg.kbd
  001   esc    esc    nop    nop    155    155    debug  panic   O
  092   nscr   pscr   debug  debug  nop    nop    nop    panic   O
  104   slock  saver  slock  saver  susp   nop    susp   panic   O
  129   esc    esc    nop    nop    155    155    debug  panic   O
  220   nscr   pscr   debug  debug  nop    nop    nop    panic   O
  232   slock  saver  slock  saver  susp   nop    susp   panic   O

Then do 'sysctl machdep.enable_panic_key=1' (maybe add this to
/etc/sysctl.conf). Then, if your dump device is configured, the keymap is
applied and the sysctl is enabled, you can press Alt-Shift-Ctrl-Esc on the
console, and the machine will panic with a coredump.

Of course it is better to drop into the kernel debugger with Alt-Ctrl-Esc,
but that requires a debugger compiled into the kernel and you knowing what
to do with it :)

--
WBR, Vadim Goncharov. ICQ#166852181
mailto:vadim_nuclight@mail.ru
[Moderator of RU.ANTI-ECOLOGY][FreeBSD][http://antigreen.org][LJ:/nuclight]

From r.c.ladan at gmail.com  Wed Dec  3 14:00:04 2008
From: r.c.ladan at gmail.com (Rene Ladan)
Date: Wed Dec  3 14:00:11 2008
Subject: kern/107707: [geom] [patch] [request] add new class geom_xbox360 to slice up xbox360 media
Message-ID: <200812032200.mB3M037k086499@freefall.freebsd.org>

The following reply was made to PR kern/107707; it has been noted by GNATS.

From: Rene Ladan
To: bug-followup@FreeBSD.org
Cc:
Subject: Re: kern/107707: [geom] [patch] [request] add new class geom_xbox360 to slice up xbox360 media
Date: Wed, 03 Dec 2008 22:52:57 +0100

 I've moved the patch to perforce, available in //depot/user/rene/xtaf/...
Rene From Hilko.Meyer at gmx.de Wed Dec 3 17:35:13 2008 From: Hilko.Meyer at gmx.de (Hilko Meyer) Date: Wed Dec 3 17:35:19 2008 Subject: System freeze with gvinum In-Reply-To: References: <20081130153558.GA2120@nobby.lan> Message-ID: Hilko Meyer schrieb: >Ulf Lilleengen schrieb: >>On Sat, Nov 29, 2008 at 11:47:59PM +0100, Hilko Meyer wrote: >>> Involved hardware: >>> atapci0: >>> ad4: 476940MB at ata2-master UDMA33 >>> ad5: 476940MB at ata2-slave UDMA33 >>> ad6: 476940MB at ata3-master UDMA33 >>> >>> BTW: That are SATA-disks. Why they are reported as UDMA33? >>Seems weird. Maybe there are some bios settings turning of AHCI-mode? > >Ah, I think I know were to look for that. I'll try tomorrow. Done, now it looks like that: | atapci1: port 0xf80-0xf87,0xf00-0xf03,0xe80-0xe87,0xe00-0xe03,0xb800-0xb80f mem 0xfcffc000-0xfcffdfff irq 20 at device 10.0 on pci0 | atapci1: AHCI Version 01.10 controller with 4 ports detected | ad4: 476940MB at ata2-master SATA300 | ad6: 476940MB at ata3-master SATA300 | ad8: 476940MB at ata4-master SATA300 Just for the record: Board is a K9N Neo V3 with nforce 560 chipset. Changing this BIOS-setting does the trick: Integrated Periphals -> On-Chip ATA Devices: RAID mode [AHCI] "RAID mode" is a verry clever name for that setting... thanks, Hilko From Hilko.Meyer at gmx.de Wed Dec 3 18:02:45 2008 From: Hilko.Meyer at gmx.de (Hilko Meyer) Date: Wed Dec 3 18:02:51 2008 Subject: System freeze with gvinum In-Reply-To: <20081201021720.GA1949@carrot.studby.ntnu.no> References: <20081130153558.GA2120@nobby.lan> <20081130222445.GA1528@carrot.studby.ntnu.no> <20081201021720.GA1949@carrot.studby.ntnu.no> Message-ID: Ulf Lilleengen schrieb: >On man, des 01, 2008 at 12:32:22am +0100, Hilko Meyer wrote: >> Is gvinum in 7.1RC and 7.x the same? We considered to update to 7.1 >> before it's released anyway, because we need nfe(4). And wanted to try >> gvinum and zfs there. >Yes, they are the same. 
>> >> But we can test a patch against 6.4 before the big update if you want. >> >It's really up to you. If you're going to upgrade anyway, it will at least >save me from a little bit of work :) Unfortunately I have some other work for you. After changing the BIOS-setting to AHCI, I tried gvinum with 6.4 again. And strangely enough it worked. No freeze with newfs and I could copy several GB to the volumes, but after a reboot gvinum list looks like that: | D sata3 State: up /dev/ad10 A: 9/476939 MB (0%) | D sata2 State: up /dev/ad8 A: 9/476939 MB (0%) | D sata1 State: up /dev/ad4 A: 9/476939 MB (0%) | | 2 volumes: | V homes_raid5 State: down Plexes: 1 Size: 465 GB | V dump_raid5 State: down Plexes: 1 Size: 465 GB | | 2 plexes: | P homes_raid5.p0 R5 State: down Subdisks: 3 Size: 465 GB | P dump_raid5.p0 R5 State: down Subdisks: 3 Size: 465 GB | | 6 subdisks: | S homes_raid5.p0.s0 State: stale D: sata1 Size: 232 GB | S homes_raid5.p0.s1 State: stale D: sata2 Size: 232 GB | S homes_raid5.p0.s2 State: stale D: sata3 Size: 232 GB | S dump_raid5.p0.s0 State: stale D: sata1 Size: 232 GB | S dump_raid5.p0.s1 State: stale D: sata2 Size: 232 GB | S dump_raid5.p0.s2 State: stale D: sata3 Size: 232 GB Then we updated to FreeBSD 7.1-PRERELEASE, but nothing changed. After a reboot the volumes are down. In dmesg I found g_vfs_done():gvinum/dump_raid5[READ(offset=65536, length=8192)]error = 6 but I think, that occurred during a try to mount a volume. 
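
The number after "error =" in a g_vfs_done() line is an errno value, so it can be decoded with the C library. The following is a minimal sketch assuming a POSIX libc; `describe_error` is a hypothetical helper and not part of GEOM. On FreeBSD, 6 is ENXIO and strerror() returns "Device not configured"; other systems word the message differently.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an errno-style code the way the kernel message prints it,
 * followed by the C library's text for that code.
 * The caller supplies the output buffer; the same buffer is returned.
 */
static const char *
describe_error(int error, char *buf, size_t len)
{
	snprintf(buf, len, "error = %d (%s)", error, strerror(error));
	return (buf);
}
```

Calling `describe_error(6, buf, sizeof(buf))` on a FreeBSD system would show that the failed reads report ENXIO, which is the error a consumer gets when the underlying provider is no longer available.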
bye, Hilko From lulf at stud.ntnu.no Wed Dec 3 23:34:18 2008 From: lulf at stud.ntnu.no (Ulf Lilleengen) Date: Wed Dec 3 23:34:25 2008 Subject: System freeze with gvinum In-Reply-To: References: <20081130153558.GA2120@nobby.lan> <20081130222445.GA1528@carrot.studby.ntnu.no> <20081201021720.GA1949@carrot.studby.ntnu.no> Message-ID: <20081204063410.GA1465@nobby.lan> On Thu, Dec 04, 2008 at 03:02:39AM +0100, Hilko Meyer wrote: > Ulf Lilleengen schrieb: > >On man, des 01, 2008 at 12:32:22am +0100, Hilko Meyer wrote: > >> Is gvinum in 7.1RC and 7.x the same? We considered to update to 7.1 > >> before it's released anyway, because we need nfe(4). And wanted to try > >> gvinum and zfs there. > >Yes, they are the same. > >> > >> But we can test a patch against 6.4 before the big update if you want. > >> > >It's really up to you. If you're going to upgrade anyway, it will at least > >save me from a little bit of work :) > > Unfortunately I have some other work for you. After changing the > BIOS-setting to AHCI, I tried gvinum with 6.4 again. And strangely > enough it worked. 
> No freeze with newfs and I could copy several GB to
> the volumes, but after a reboot gvinum list looks like that:
>
> | D sata3 State: up /dev/ad10 A: 9/476939 MB (0%)
> | D sata2 State: up /dev/ad8 A: 9/476939 MB (0%)
> | D sata1 State: up /dev/ad4 A: 9/476939 MB (0%)
> |
> | 2 volumes:
> | V homes_raid5 State: down Plexes: 1 Size: 465 GB
> | V dump_raid5 State: down Plexes: 1 Size: 465 GB
> |
> | 2 plexes:
> | P homes_raid5.p0 R5 State: down Subdisks: 3 Size: 465 GB
> | P dump_raid5.p0 R5 State: down Subdisks: 3 Size: 465 GB
> |
> | 6 subdisks:
> | S homes_raid5.p0.s0 State: stale D: sata1 Size: 232 GB
> | S homes_raid5.p0.s1 State: stale D: sata2 Size: 232 GB
> | S homes_raid5.p0.s2 State: stale D: sata3 Size: 232 GB
> | S dump_raid5.p0.s0 State: stale D: sata1 Size: 232 GB
> | S dump_raid5.p0.s1 State: stale D: sata2 Size: 232 GB
> | S dump_raid5.p0.s2 State: stale D: sata3 Size: 232 GB
>
> Then we updated to FreeBSD 7.1-PRERELEASE, but nothing changed. After a
> reboot the volumes are down. In dmesg I found
> g_vfs_done():gvinum/dump_raid5[READ(offset=65536, length=8192)]error = 6
> but I think, that occurred during a try to mount a volume.
>

Well, this can happen if there were errors reading from or writing to the
volumes previously. When volumes are in the down state, it is not possible
to use them. You have a few options:

If you currently have any data on the volumes, and would like to recover
without reinitializing the volumes, you can try to force the subdisk states
to up by doing:

1. 'gvinum setstate -f up ' on all subdisks. The plexes should then go into
   the up state as all the subdisks are up.
2. Do fsck on the volumes to ensure that they are ok. If so, you are ready
   to go again. Note that you might have to pass -t ufs to fsck, as vinum
   volumes previously set their own disklabels and other weird stuff.
If you don't have any valuable data yet, you can run 'gvinum start ' on all
volumes, which should reinitialize the plexes, or you can just recreate the
entire config. Recreating the entire config might also work if you have
data, but I'd try the tip above first.

In any case, I can't guarantee that any of these methods will work, but
forcing the state of the subdisks should do the trick. Preferably, you can
try the method on the subdisks of one of the volumes first and see if it
works.

--
Ulf Lilleengen

From pjd at FreeBSD.org  Thu Dec  4 11:51:19 2008
From: pjd at FreeBSD.org (pjd@FreeBSD.org)
Date: Thu Dec  4 11:51:29 2008
Subject: kern/113885: [gmirror] [patch] improved gmirror balance algorithm
Message-ID: <200812041951.mB4JpIoL015725@freefall.freebsd.org>

Synopsis: [gmirror] [patch] improved gmirror balance algorithm

Responsible-Changed-From-To: freebsd-geom->pjd
Responsible-Changed-By: pjd
Responsible-Changed-When: Thu 4 Dec 19:51:01 2008 UTC
Responsible-Changed-Why:
I'll handle this one.

http://www.freebsd.org/cgi/query-pr.cgi?pr=113885

From marius at nuenneri.ch  Thu Dec  4 12:41:19 2008
From: marius at nuenneri.ch (Marius Nünnerich)
Date: Thu Dec  4 12:41:26 2008
Subject: DTrace probes for geom_kern, geom_io and geom_event
Message-ID:

Hi,

I wrote a bunch of DTrace probes for the core geom files mentioned in
the subject. The patch for current is available at
http://nuenneri.ch/freebsd/geom_probes.patch

Anyone interested in testing them?
Just apply the patch, add options KDTRACE_HOOKS to your kernel and build it like this:
# make WITH_CTF=1 KERNCONF=YOURKERNEL buildkernel installkernel

After reboot you can
# kldload dtraceall
and see the new probes with
# dtrace -lP geom

A sample script:
#!/usr/sbin/dtrace -s
#pragma D option quiet

geom:::
{
    @geom[execname, probemod, probefunc, probename] = count();
    @geom_all[execname, probemod, probefunc, probename] = count();
}

tick-10sec
{
    normalize(@geom, 10);
    printa("%@8u %@8u %12s %s:%s:%s\n", @geom_all, @geom);
    printf("\n");
    clear(@geom);
}

This is hand copied. You can chmod 755 and run it.

I'm not sure how to handle the opt_kdtrace.h case in geom.h, see patch line 842.
Any comments on the patch?

@phk: Are you interested in committing this when there are no complaints? Are you interested in more probes?

Kind regards
Marius

From marius at nuenneri.ch Thu Dec 4 12:41:52 2008
From: marius at nuenneri.ch (Marius Nünnerich)
Date: Thu Dec 4 12:41:58 2008
Subject: Trivial(?) reorganization of topology lock in geom_event
Message-ID:

Hi,

while working on the DTrace probes for geom I noticed that g_topology_lock() is called 20 times per second from the g_event thread, even though the thread only runs 10 times per second when idle. Maybe it is possible to change the locking like in this patch? I also changed the position of one unlocking of g_eventlock.

Patch (relative to src/sys): http://nuenneri.ch/freebsd/geom_tl.patch

As a side note: Do all the msleep calls have to have a timeout? Why is this so?

Kind regards
Marius

From charlie at clamothe.com Fri Dec 5 04:58:40 2008
From: charlie at clamothe.com (Charlie La Mothe)
Date: Fri Dec 5 04:58:46 2008
Subject: gmirror insert error: Synchronization request failed (error=1)
Message-ID: <6AEFCB5D-BF7F-49EE-9EDC-E5CD63920508@clamothe.com>

I have three SATA disks: ad6, ad7, and ad8.

I just installed a fresh copy of FreeBSD 7.0 amd64 (minimal) on ad6s1a.
I setup a gmirror with: gmirror -vb round-robin gm0 /dev/ad6s1a. I edited my fstab, and told my loader.conf to load geom_mirror. The machine boots up fine, however I run into an error when I try to insert an additional component: # gmirror insert gm0 /dev/ad7s1d # tail /var/log/messages Dec 5 04:27:51 monopoly kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad7s1d. Dec 5 04:27:51 monopoly kernel: GEOM_MIRROR: Synchronization request failed (error=1). ad7s1d[WRITE(offset=0, length=131072)] Dec 5 04:27:51 monopoly kernel: GEOM_MIRROR: Device gm0: provider ad7s1d disconnected. Dec 5 04:27:51 monopoly kernel: GEOM_MIRROR: Device gm0: rebuilding provider ad7s1d stopped. # gmirror status Name Status Components mirror/gm0 DEGRADED ad6s1a # gmirror list Geom name: gm0 State: DEGRADED Components: 2 Balance: round-robin Slice: 4096 Flags: NONE GenID: 4 SyncID: 1 ID: 1745622490 Providers: 1. Name: mirror/gm0 Mediasize: 5368708608 (5.0G) Sectorsize: 512 Mode: r1w1e1 Consumers: 1. Name: ad6s1a Mediasize: 5368709120 (5.0G) Sectorsize: 512 Mode: r1w1e1 State: ACTIVE Priority: 0 Flags: NONE GenID: 4 SyncID: 1 ID: 3963284114 I haven't been able to find any explanation for this error. The same error occurs when I attempt to insert the ad8s1d component. All of the hard drives diagnose to be working. BSD label Reference: Each disk has one slice. Summary: 5GB UFS boot partition 5GB Swap ~950GB unused partition (for zfs / raidz). 
# bsdlabel /dev/ad6s1 # /dev/ad6s1: 8 partitions: # size offset fstype [fsize bsize bps/cpg] a: 10485760 0 4.2BSD 2048 16384 28528 b: 10485760 10485760 swap c: 1953520002 0 unused 0 0 # "raw" part, don't edit d: 1932548482 20971520 4.2BSD 0 0 0 # bsdlabel /dev/ad7s1 # /dev/ad7s1: 8 partitions: # size offset fstype [fsize bsize bps/cpg] b: 10485760 10485760 swap c: 1953520002 0 unused 0 0 # "raw" part, don't edit d: 10485760 0 4.2BSD 0 0 0 e: 1932548482 20971520 4.2BSD 0 0 0 # bsdlabel /dev/ad8s1 # /dev/ad8s1: 8 partitions: # size offset fstype [fsize bsize bps/cpg] b: 10485760 10485760 swap c: 1953520002 0 unused 0 0 # "raw" part, don't edit d: 10485760 0 4.2BSD 0 0 0 e: 1932548482 20971520 4.2BSD 0 0 0 From avg at icyb.net.ua Fri Dec 5 05:11:28 2008 From: avg at icyb.net.ua (Andriy Gapon) Date: Fri Dec 5 05:11:40 2008 Subject: partition covering the whole slice [repost] Message-ID: <4939287C.3020208@icyb.net.ua> [Repost: I originally cc-ed gnome instead of geom; Sorry.] I have a disk with two slices and each slices has a single real partition covering the whole slice, sector-to-sector. I don't remember how I managed to configure the disk this way, is this even possible? :-) $ gpart show => 63 781422705 ad12 MBR (373G) 63 209712447 1 freebsd [active] (100G) 209712510 571705155 2 freebsd [active] (273G) 781417665 5103 - free - (2.5M) => 0 209712447 ufs/extbackup BSD (100G) 0 209712447 1 freebsd-ufs (100G) => 0 209712447 ad12s1 BSD (100G) 0 209712447 1 freebsd-ufs (100G) => 0 571705155 ufs/extstuff BSD (273G) 0 571705155 1 freebsd-ufs (273G) => 0 571705155 ad12s2 BSD (273G) 0 571705155 1 freebsd-ufs (273G) You can immediately spot another oddity - I never used glabel on this disk, but I did use tunefs -L to label the UFS filesystems within the partitions. Now it seems that the label of filesystems is also somehow recognized as a label for the whole slice. E.g. "ufs/extbackup" is exatcly the same as "ad12s1". Weird. 
Here's some additional data: $ ls -1 /dev/ad12* /dev/ad12 /dev/ad12s1 /dev/ad12s1a /dev/ad12s2 /dev/ad12s2a Looks usual. $ ls -1 /dev/ufs/ extbackup extbackupa extstuff extstuffa So there is one "normal" label for each filesystem and the second label for it as a filesystem in partition "a" of a labeled slice. There is nothing in /dev/label though. And a bit more: $ file -s /dev/ad12s1 /dev/ad12s1: Unix Fast File system [v2] (little-endian) last mounted on /automnt/ufs/extbackupa, volume name extbackup, last written at Tue Dec 2 17:47:21 2008, clean flag 1, readonly flag 0, number of blocks 13107027, number of data blocks 13002290, number of cylinder groups 35, block size 65536, fragment size 8192, average file size 16384, average number of files in dir 64, pending blocks to free 0, pending inodes to free 0, system-wide uuid 0, minimum percentage of free blocks 8, TIME optimization $ file -s /dev/ad12s1a /dev/ad12s1a: Unix Fast File system [v2] (little-endian) last mounted on /automnt/ufs/extbackupa, volume name extbackup, last written at Tue Dec 2 17:47:21 2008, clean flag 1, readonly flag 0, number of blocks 13107027, number of data blocks 13002290, number of cylinder groups 35, block size 65536, fragment size 8192, average file size 16384, average number of files in dir 64, pending blocks to free 0, pending inodes to free 0, system-wide uuid 0, minimum percentage of free blocks 8, TIME optimization So it looks like start of ad12s1 is the same as ad12s1a. On some better configured disks I see: $ file -s /dev/ad6s1 /dev/ad6s1: x86 boot sector; partition 4: ID=0xa5, active, starthead 0, startsector 0, 50000 sectors Ultimately I would like to fix this so that I don't see labels on the slices. 
-- Andriy Gapon From pjd at FreeBSD.org Fri Dec 5 06:48:14 2008 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Fri Dec 5 06:48:22 2008 Subject: gmirror insert error: Synchronization request failed (error=1) In-Reply-To: <6AEFCB5D-BF7F-49EE-9EDC-E5CD63920508@clamothe.com> References: <6AEFCB5D-BF7F-49EE-9EDC-E5CD63920508@clamothe.com> Message-ID: <20081205144806.GA3284@garage.freebsd.pl> On Fri, Dec 05, 2008 at 04:33:38AM -0800, Charlie La Mothe wrote: > I have three SATA disks: ad6, ad7, and ad8. > > I just installed a fresh copy of FreeBSD 7.0 amd64 (minimal) on ad6s1a. > > I setup a gmirror with: gmirror -vb round-robin gm0 /dev/ad6s1a. I > edited my fstab, and told my loader.conf to load geom_mirror. > > The machine boots up fine, however I run into an error when I try to > insert an additional component: > > # gmirror insert gm0 /dev/ad7s1d > # tail /var/log/messages > Dec 5 04:27:51 monopoly kernel: GEOM_MIRROR: Device gm0: rebuilding > provider ad7s1d. > Dec 5 04:27:51 monopoly kernel: GEOM_MIRROR: Synchronization request > failed (error=1). ad7s1d[WRITE(offset=0, length=131072)] > Dec 5 04:27:51 monopoly kernel: GEOM_MIRROR: Device gm0: provider > ad7s1d disconnected. > Dec 5 04:27:51 monopoly kernel: GEOM_MIRROR: Device gm0: rebuilding > provider ad7s1d stopped. > > # gmirror status > Name Status Components > mirror/gm0 DEGRADED ad6s1a > > # gmirror list > Geom name: gm0 > State: DEGRADED > Components: 2 > Balance: round-robin > Slice: 4096 > Flags: NONE > GenID: 4 > SyncID: 1 > ID: 1745622490 > Providers: > 1. Name: mirror/gm0 > Mediasize: 5368708608 (5.0G) > Sectorsize: 512 > Mode: r1w1e1 > Consumers: > 1. Name: ad6s1a > Mediasize: 5368709120 (5.0G) > Sectorsize: 512 > Mode: r1w1e1 > State: ACTIVE > Priority: 0 > Flags: NONE > GenID: 4 > SyncID: 1 > ID: 3963284114 > > > I haven't been able to find any explanation for this error. The same > error occurs when I attempt to insert the ad8s1d component. > > All of the hard drives diagnose to be working. 
> > > BSD label Reference: > > Each disk has one slice. > > Summary: > 5GB UFS boot partition > 5GB Swap > ~950GB unused partition (for zfs / raidz). > > # bsdlabel /dev/ad6s1 > > # /dev/ad6s1: > 8 partitions: > # size offset fstype [fsize bsize bps/cpg] > a: 10485760 0 4.2BSD 2048 16384 28528 > b: 10485760 10485760 swap > c: 1953520002 0 unused 0 0 # "raw" > part, don't edit > d: 1932548482 20971520 4.2BSD 0 0 0 > > # bsdlabel /dev/ad7s1 > > # /dev/ad7s1: > 8 partitions: > # size offset fstype [fsize bsize bps/cpg] > b: 10485760 10485760 swap > c: 1953520002 0 unused 0 0 # "raw" > part, don't edit > d: 10485760 0 4.2BSD 0 0 0 This is your problem. No partition should start at offset 0 and yet sysinstall creates those. > e: 1932548482 20971520 4.2BSD 0 0 0 > > > # bsdlabel /dev/ad8s1 > > # /dev/ad8s1: > 8 partitions: > # size offset fstype [fsize bsize bps/cpg] > b: 10485760 10485760 swap > c: 1953520002 0 unused 0 0 # "raw" > part, don't edit > d: 10485760 0 4.2BSD 0 0 0 > e: 1932548482 20971520 4.2BSD 0 0 0 > > _______________________________________________ > freebsd-geom@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-geom > To unsubscribe, send any mail to "freebsd-geom-unsubscribe@freebsd.org" -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20081205/917d5c10/attachment.pgp From Hilko.Meyer at gmx.de Fri Dec 5 09:16:12 2008 From: Hilko.Meyer at gmx.de (Hilko Meyer) Date: Fri Dec 5 09:16:19 2008 Subject: System freeze with gvinum In-Reply-To: <20081204063410.GA1465@nobby.lan> References: <20081130153558.GA2120@nobby.lan> <20081130222445.GA1528@carrot.studby.ntnu.no> <20081201021720.GA1949@carrot.studby.ntnu.no> <20081204063410.GA1465@nobby.lan> Message-ID: <6snij41mh5vtm92ch1d045upgjj6atbkn1@mail.gmx.net> Ulf Lilleengen schrieb: >On Thu, Dec 04, 2008 at 03:02:39AM +0100, Hilko Meyer wrote: >> Unfortunately I have some other work for you. After changing the >> BIOS-setting to AHCI, I tried gvinum with 6.4 again. And strangely >> enough it worked. No freeze with newfs and I could copy several GB to >> the volumes, but after a reboot gvinum list looks like that: >> >> | D sata3 State: up /dev/ad10 A: 9/476939 MB (0%) >> | D sata2 State: up /dev/ad8 A: 9/476939 MB (0%) >> | D sata1 State: up /dev/ad4 A: 9/476939 MB (0%) >> | >> | 2 volumes: >> | V homes_raid5 State: down Plexes: 1 Size: 465 GB >> | V dump_raid5 State: down Plexes: 1 Size: 465 GB >> | >> | 2 plexes: >> | P homes_raid5.p0 R5 State: down Subdisks: 3 Size: 465 GB >> | P dump_raid5.p0 R5 State: down Subdisks: 3 Size: 465 GB >> | >> | 6 subdisks: >> | S homes_raid5.p0.s0 State: stale D: sata1 Size: 232 GB >> | S homes_raid5.p0.s1 State: stale D: sata2 Size: 232 GB >> | S homes_raid5.p0.s2 State: stale D: sata3 Size: 232 GB >> | S dump_raid5.p0.s0 State: stale D: sata1 Size: 232 GB >> | S dump_raid5.p0.s1 State: stale D: sata2 Size: 232 GB >> | S dump_raid5.p0.s2 State: stale D: sata3 Size: 232 GB >> >> Then we updated to FreeBSD 7.1-PRERELEASE, but nothing changed. After a >> reboot the volumes are down. 
In dmesg I found >> g_vfs_done():gvinum/dump_raid5[READ(offset=65536, length=8192)]error = 6 >> but I think, that occurred during a try to mount a volume. >> >Well, this can happen if there was errors reading/writing to volumes >previously. When volumes are in the down state, it is not possible to use >them. You have a few options: > >If currently have any data on the volumes, and would like to recover without >reinitializing the volumes, you can try and force the subdisk states to up by >doing: > >1. 'gvinum setstate -f up ' on all subdisk. The plexes should then >go into the upstate as all the subdisks are up. >2. Do fsck on the volumes to ensure that they are ok. If so, you are ready to >go again. Note that you might have to pass -t ufs to fsck as vinum volumes >previously have set their own disklabels and other weird stuff. That didn't helped. After a reboot were the subdisks stale again. >If you don't have any valuable data yet, you can run 'gvinum start ' >on all volumes, which should reinitialize the plexes, That worked, All up after a reboot. Took nine hours per volume... In dmesg I found | GEOM_VINUM: subdisk 'homes_raid5.p0.s2' init: finished successfully | GEOM_VINUM: subdisk 'homes_raid5.p0.s0' init: finished successfully | GEOM_VINUM: plex homes_raid5.p0 state change: down -> up | GEOM_VINUM: g_access failed on drive sata2, errno 1 | GEOM_VINUM: subdisk 'homes_raid5.p0.s1' init: finished successfully Do I have to worry about "g_access failed on drive sata2, errno 1"? >or you can just recreate the entire config. Recreating the entire config >might also work if you have data, but I'd try the tip above first. I've tried that before writing the last mail, but didn't mentioned that. Has not worked. 
thanks for your help, Hilko From pjd at FreeBSD.org Fri Dec 5 12:45:24 2008 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Fri Dec 5 12:45:32 2008 Subject: gmirror insert error: Synchronization request failed (error=1) In-Reply-To: <8C75F72F-A68C-4E42-97F6-FA4BD4B2F57A@clamothe.com> References: <6AEFCB5D-BF7F-49EE-9EDC-E5CD63920508@clamothe.com> <20081205144806.GA3284@garage.freebsd.pl> <8C75F72F-A68C-4E42-97F6-FA4BD4B2F57A@clamothe.com> Message-ID: <20081205204515.GA2303@garage.freebsd.pl> On Fri, Dec 05, 2008 at 12:39:10PM -0800, Charlie La Mothe wrote: > What offset should I use, then? Does this only apply to the first > labels in each slice, or should there be an offset between each label? The first 16 sectors are where bsdlabel keeps its metadata. bsdlabel(8) correctly skips those, but sysinstall does not, which is lame on our (FreeBSD) side, I know. -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20081205/7df2b687/attachment.pgp From trasz at FreeBSD.org Sat Dec 6 02:46:42 2008 From: trasz at FreeBSD.org (trasz@FreeBSD.org) Date: Sat Dec 6 02:46:48 2008 Subject: kern/128529: [gjournal] root FS on GEOM Journal cannot boot when journal isn't marked clean/consistent Message-ID: <200812061046.mB6AkfDK031569@freefall.freebsd.org> Synopsis: [gjournal] root FS on GEOM Journal cannot boot when journal isn't marked clean/consistent Responsible-Changed-From-To: freebsd-geom->trasz Responsible-Changed-By: trasz Responsible-Changed-When: Sat Dec 6 10:46:35 UTC 2008 Responsible-Changed-Why: http://www.freebsd.org/cgi/query-pr.cgi?pr=128529 From phk at phk.freebsd.dk Sat Dec 6 13:00:12 2008 From: phk at phk.freebsd.dk (Poul-Henning Kamp) Date: Sat Dec 6 13:00:18 2008 Subject: Trivial(?) reorganization of topology lock in geom_event In-Reply-To: Your message of "Thu, 04 Dec 2008 21:41:49 +0100." Message-ID: <31114.1228597207@critter.freebsd.dk> In message , "=?ISO- 8859-1?Q?Marius_N=FCnnerich?=" writes: >Hi, > >while working on the DTrace probes for geom I noticed that >g_topology_lock() is called 20 times per second from the g_event >thread, even though the thread only runs 10 times per second when >idle. Maybe it is possible to change the locking like in this patch? I >also changed the position of one unlocking of g_eventlock. In theory the timeout is not necessary, it was added as a stopgap because there were synchronisation issues long time ago. Try dropping the timeout and see if you can provoke problems, if not, kill it. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
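[Archive note on the "gmirror insert error" thread above.] The check pjd describes, that no partition should start inside the area at the beginning of a slice where the bsdlabel lives, can be sketched in Python as below. This is only a rough sketch under stated assumptions: the 16-sector threshold is the figure from pjd's message (it is questioned later in the thread), the function names are hypothetical, and nothing here is part of bsdlabel(8) or any other FreeBSD tool. The sample input is the ad7s1 label posted in the thread.

```python
# Assumption: reserved area size in sectors, taken from pjd's message,
# not read from the kernel or from bsdlabel(8).
RESERVED_SECTORS = 16


def parse_bsdlabel(text):
    """Return {letter: (size, offset, fstype)} parsed from `bsdlabel` output."""
    parts = {}
    for line in text.splitlines():
        line = line.strip()
        # Partition lines look like "d: 10485760 0 4.2BSD 0 0 0".
        if len(line) < 2 or line[1] != ":" or not line[0].isalpha():
            continue
        fields = line[2:].split()
        if len(fields) < 3:
            continue
        parts[line[0]] = (int(fields[0]), int(fields[1]), fields[2])
    return parts


def partitions_over_label(parts, reserved=RESERVED_SECTORS):
    """List partitions (other than the raw 'c' partition) starting below `reserved`."""
    return sorted(
        letter
        for letter, (_size, offset, _fstype) in parts.items()
        if letter != "c" and offset < reserved
    )


AD7S1 = """# /dev/ad7s1:
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  b: 10485760 10485760      swap
  c: 1953520002        0    unused        0     0   # "raw" part, don't edit
  d: 10485760        0    4.2BSD        0     0     0
  e: 1932548482 20971520    4.2BSD        0     0     0
"""

if __name__ == "__main__":
    print(partitions_over_label(parse_bsdlabel(AD7S1)))  # prints ['d']
```

For the ad7s1 label this flags only 'd', the partition that failed in "gmirror insert gm0 /dev/ad7s1d". Whether a given offset is actually safe depends on the points debated in the thread, so the threshold is left as a parameter.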
From marius at nuenneri.ch Sun Dec 7 16:20:56 2008 From: marius at nuenneri.ch (=?ISO-8859-1?Q?Marius_N=FCnnerich?=) Date: Sun Dec 7 16:21:03 2008 Subject: Trivial(?) reorganization of topology lock in geom_event In-Reply-To: <31114.1228597207@critter.freebsd.dk> References: <31114.1228597207@critter.freebsd.dk> Message-ID: On Sat, Dec 6, 2008 at 10:00 PM, Poul-Henning Kamp wrote: > In message , "=?ISO- > 8859-1?Q?Marius_N=FCnnerich?=" writes: >>Hi, >> >>while working on the DTrace probes for geom I noticed that >>g_topology_lock() is called 20 times per second from the g_event >>thread, even though the thread only runs 10 times per second when >>idle. Maybe it is possible to change the locking like in this patch? I >>also changed the position of one unlocking of g_eventlock. > > In theory the timeout is not necessary, it was added as a stopgap > because there were synchronisation issues long time ago. > > Try dropping the timeout and see if you can provoke problems, > if not, kill it. Did so, please take a look at this patch: http://nuenneri.ch/freebsd/geom_tl2.patch I am running a version of this with the DTrace probes included, I hope the patch is complete. I did a few buildkernels and some of the geom regression tests, so far no problems. I changed the position of the loop to better match how it's like in the up and down threads. What do you think of it? Thanks Marius From bugmaster at FreeBSD.org Mon Dec 8 03:06:56 2008 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Dec 8 03:07:55 2008 Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org Message-ID: <200812081106.mB8B6tnc014258@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description -------------------------------------------------------------------------------- o kern/129245 geom [geom] gcache is more suitable for suffix based provid o kern/128398 geom [PATCH] glabel(8): teach geom_label to recognise gpt l f kern/128276 geom [gmirror] machine lock up when gmirror module is used o kern/126902 geom [geom] [geom_label] Kernel panic during install boot o kern/124973 geom [gjournal] [patch] boot order affects geom_journal con o kern/124969 geom gvinum(8): gvinum raid5 plex does not detect missing s o kern/124294 geom [geom] gmirror(8) have inappropriate logic when workin o kern/124130 geom [gmirror][usb] gmirror fails to start usb devices that o kern/123962 geom [panic] [gjournal] gjournal (455Gb data, 8Gb journal), o kern/123630 geom [patch] [gmirror] gmirror doesnt allow the original dr o kern/123122 geom [geom] GEOM / gjournal kernel lock f kern/122415 geom [geom] UFS labels are being constantly created and rem o kern/122067 geom [geom] [panic] Geom crashed during boot o kern/121559 geom [patch] [geom] geom label class allows to create inacc o kern/121364 geom [gmirror] Removing all providers create a "zombie" mir o kern/120231 geom [geom] GEOM_CONCAT error adding second drive o kern/120044 geom [msdosfs] [geom] incorrect MSDOSFS label fries adminis o kern/120021 geom [geom] [panic] net-p2p/qbittorrent crashes system when o kern/119743 geom [geom] geom label for cds is keeped after dismount and f kern/115547 geom [geom] [patch] [request] let GEOM Eli get password fro o kern/114532 geom [geom] GEOM_MIRROR shows up in kldstat even if compile o kern/113957 geom [gmirror] gmirror is intermittently reporting a degrad o kern/113837 geom [geom] unable to access 1024 sector size storage o kern/113419 geom [geom] geom fox multipathing not failing back p bin/110705 geom gmirror(8) control utility does not exit with correct o kern/107707 geom [geom] [patch] [request] add new class geom_xbox360 to o kern/104389 geom [geom] [patch] 
sys/geom/geom_dump.c doesn't encode XML o kern/98034 geom [geom] dereference of NULL pointer in acd_geom_detach o kern/94632 geom [geom] Kernel output resets input while GELI asks for o kern/90582 geom [geom] [panic] Restore cause panic string (ffs_blkfree o bin/90093 geom fdisk(8) incapable of altering in-core geometry a kern/89660 geom [vinum] [patch] [panic] due to g_malloc returning null o kern/89546 geom [geom] GEOM error s kern/89102 geom [geom] [panic] panic when forced unmount FS from unplu o kern/87544 geom [gbde] mmaping large files on a gbde filesystem deadlo o kern/84556 geom [geom] GBDE-encrypted swap causes panic at shutdown o kern/79251 geom [2TB] newfs fails on 2.6TB gbde device o kern/79035 geom [vinum] gvinum unable to create a striped set of mirro o bin/78131 geom gbde(8) "destroy" not working. s kern/73177 geom kldload geom_* causes panic due to memory exhaustion 40 problems total. From vadim_nuclight at mail.ru Mon Dec 8 04:20:58 2008 From: vadim_nuclight at mail.ru (Vadim Goncharov) Date: Mon Dec 8 04:21:05 2008 Subject: gmirror insert error: Synchronization request failed (error=1) References: <6AEFCB5D-BF7F-49EE-9EDC-E5CD63920508@clamothe.com> <20081205144806.GA3284@garage.freebsd.pl> <8C75F72F-A68C-4E42-97F6-FA4BD4B2F57A@clamothe.com> <20081205204515.GA2303@garage.freebsd.pl> Message-ID: Hi Pawel Jakub Dawidek! On Fri, 5 Dec 2008 21:45:15 +0100; Pawel Jakub Dawidek wrote about 'Re: gmirror insert error: Synchronization request failed (error=1)': >> What offset should I use, then? Does this only apply to the first =20 >> labels in each slice, or should there be an offset between each label? > First 16 sectors is where bsdlabel keeps its metadata. bsdlabel(8) > correctly skips those, but not sysinstall, which is lame on our > (FreeBSD) side, I know. What? bsdlabel occupies only two sectors, in fact, only one, because first sector (0) is occupied by boot1, if any. 
Both UFS and swap are taught not to use the first 8K of space, so that it can contain boot2. In fact, the only problem that could arise from an offset-0 UFS partition is that glabel(8) will incorrectly detect /dev/ufs/label on the slice itself, not the partition, but I always have swap ('b' partition) starting from offset 0 and never had any problems (my 'a' UFSes start from 16 when there is no swap). So what metadata do you really mean? -- WBR, Vadim Goncharov. ICQ#166852181 mailto:vadim_nuclight@mail.ru [Moderator of RU.ANTI-ECOLOGY][FreeBSD][http://antigreen.org][LJ:/nuclight] From vadim_nuclight at mail.ru Mon Dec 8 04:30:22 2008 From: vadim_nuclight at mail.ru (Vadim Goncharov) Date: Mon Dec 8 04:30:28 2008 Subject: partition covering the whole slice [repost] References: <4939287C.3020208@icyb.net.ua> Message-ID: Hi Andriy Gapon! On Fri, 05 Dec 2008 15:11:24 +0200; Andriy Gapon wrote about 'partition covering the whole slice [repost]': > I have a disk with two slices and each slices has a single real > partition covering the whole slice, sector-to-sector. > I don't remember how I managed to configure the disk this way, is this > even possible? :-) > You can immediately spot another oddity - I never used glabel on this > disk, but I did use tunefs -L to label the UFS filesystems within the > partitions. > Now it seems that the label of filesystems is also somehow recognized as > a label for the whole slice. E.g. "ufs/extbackup" is exatcly the same as > "ad12s1". Weird. > Here's some additional data: > $ ls -1 /dev/ad12* > /dev/ad12 > /dev/ad12s1 > /dev/ad12s1a > /dev/ad12s2 > /dev/ad12s2a > Looks usual. > $ ls -1 /dev/ufs/ > extbackup > extbackupa > extstuff > extstuffa > So there is one "normal" label for each filesystem and the second label > for it as a filesystem in partition "a" of a labeled slice. > There is nothing in /dev/label though. > Ultimately I would like to fix this so that I don't see labels on the > slices. Yes, of course.
You should not combine glabel(8) labels under /dev/ufs (set via tunefs) with a bsdlabel partition starting at offset 0. This is because glabel can't tell whether it is looking at the slice or the partition: with offset 0 the superblock is at the same position in both. You can try to erase the bsdlabel completely from the slice (if this is not your boot partition) and use the filesystem directly on the slice. This will not affect mounting, as you're already using labels. The other way requires shrinking and then moving the partition on the disk and editing the disklabel, which is better done with newfs(8). -- WBR, Vadim Goncharov. ICQ#166852181 mailto:vadim_nuclight@mail.ru [Moderator of RU.ANTI-ECOLOGY][FreeBSD][http://antigreen.org][LJ:/nuclight] From marius at nuenneri.ch Wed Dec 10 14:15:45 2008 From: marius at nuenneri.ch (Marius Nünnerich) Date: Wed Dec 10 14:16:23 2008 Subject: DTrace probes for geom_kern, geom_io and geom_event In-Reply-To: References: Message-ID: current CC'ed, maybe there are some people interested in DTrace that don't read geom. On Thu, Dec 4, 2008 at 9:41 PM, Marius Nünnerich wrote: > Hi, > > I wrote a bunch of DTrace probes for the core geom files mentioned in > the subject. The patch for current is available at > http://nuenneri.ch/freebsd/geom_probes.patch > > Anyone interested in testing them? Just apply the patch, add options > KDTRACE_HOOKS to your kernel and build it like this: > # make WITH_CTF=1 KERNCONF=YOURKERNEL buildkernel installkernel > > After reboot you can > # kldload dtraceall > and see the new probes with > # dtrace -lP geom > > A sample script: > #!/usr/sbin/dtrace -s > #pragma D option quiet > > geom::: > { > @geom[execname, probemod, probefunc, probename] = count(); > @geom_all[execname, probemod, probefunc, probename] = count(); > } > > tick-10sec > { > normalize(@geom, 10); > printa("%@8u %@8u %12s %s:%s:%s\n", @geom_all, @geom); > printf("\n"); > clear(@geom); > } > > This is hand copied. You can chmod 755 and run it.
> > I'm not sure how to handle the opt_kdtrace.h case in geom.h, see patch line 842. > Any comments on the patch? > > @phk: Are you interested in committing this when there are no > complaints? Are you interested in more probes? After some tips from Alexander Leidinger I updated the patch, new version here: http://nuenneri.ch/freebsd/geom_probes2.patch There are some questions I'd like to discuss: 1. I wrote the SDT_PROBE_DEFINEs right before the function definition, I know this violates the usual style as that stuff would normally belong to the top of the file. I think in this case it would be worthwhile to break with this tradition 2. Should I use the full function name for the probes (with the g_ prefix) even though it's defined under the provider geom 3. Should there be a probe for every switch case in g_io_check? I think this won't work with the fall-through that is used right now 4. Alexander proposed to change the module name kern to core. I'm not sure about this as kern refers to the filename, like io and event do 5. I'm thinking about defining a G_TRACE macro for SDT_PROBE(geom, ...) 6. Does anybody know of a way to probe functions with varargs properly? Like g_trace 7. What about g_bioq_(un)lock functions, I just added one probe for it, I do not really see a point in adding entry and return probes (they are there with FBT anyway). This is consistent with g_topology_lock etc. What about making macros of the two functions like g_topology_lock 8. What about adding macros for (un)locking other locks like g_eventlock, so one could add probes and trace them 9. Writing hundreds of entry and return probes is boring, especially as there is the FBT provider. Maybe it's possible to give the FBT probes better names like geom:io:g_io_schedule_down:entry instead of fbt:kernel:g_io_schedule:entry. Every FBT probe has a provider of fbt und module of kernel right now. One also has to define the argument types which I think FBT figures out by itself. 
If this would work we could concentrate on adding SDT probes to interesting places inside of functions or macros Bye Marius From oxy at field.hu Thu Dec 11 04:08:08 2008 From: oxy at field.hu (oxy) Date: Thu Dec 11 04:08:42 2008 Subject: Encrypting raid5 volume with geli Message-ID: <4940FF0F.2020404@field.hu> Is there any method to use encrypted raid5 volumes? i created the raid5 volume with gvinum, works fine, but when i try to encrypt it: geli init -P -K /root/enc.key /dev/gvinum/raid5 it says: Cannot store Metadata....Operation not permitted any ideas? thank you! From Alexander at Leidinger.net Thu Dec 11 05:11:54 2008 From: Alexander at Leidinger.net (Alexander Leidinger) Date: Thu Dec 11 05:12:00 2008 Subject: DTrace probes for geom_kern, geom_io and geom_event In-Reply-To: References: Message-ID: <20081211135438.52433nmj45ia112c@webmail.leidinger.net> Quoting Marius N?nnerich (from Wed, 10 Dec 2008 23:15:43 +0100): > After some tips from Alexander Leidinger I updated the patch, new > version here: > http://nuenneri.ch/freebsd/geom_probes2.patch Again: I just reviewed the patch, so I don't have the complete context of the functions, just what I see in the patch (-> high level dtrace review, not geom specific probe review). Still inconsistent locking probes. Lock is fired without the lock held, unlock is fired with the lock held. Both should IMHO be fired either with the lock held or without the lock held, but not with the current mix. g_new_bio/g_io_schedule_up: the return probe has the name "entry" in your patch. A msleep probe could give some more info (sometimes there are even zero arguments, but there are 3-5 things which could be interesting to know). Similar for tsleep (the time should be provided in the probe arguments too). I don't think we need "loop" probes. Given that g_trace is a debugging function and that dtrace is superior, I don't think you need to instrument g_trace with dtrace probes. 
g_topology_lock/unlock should provide the lock in the probe arguments. Again, see above, either both probes firing with the lock, or without the lock, but not mixed as it is. > There are some questions I'd like to discuss: > 2. Should I use the full function name for the probes (with the g_ > prefix) even though it's defined under the provider geom IMHO yes, it's more easy for the person grepping around, as "bioq" can be found in a lot of unrelated places, but "g_bioq_init" only in places where you want to know about. > 3. Should there be a probe for every switch case in g_io_check? I > think this won't work with the fall-through that is used right now IMHO at least in every code block which is doing something sensible. Dtrace is not expensive, having a lot of probes does not hurt (except maybe in a critical path). You could fire an read_not_permitted probe, or a write_not_permitted probe or whatever. This can be done additionally to the return probe. This way it's very easy to see if there's a permission problem, without the need to write errno checks in dtrace. If you have a lot of returns but only a handful of permission errors, it's better to have some specific probes which can be fired. Keep in mind dtrace is designed to be used to debug problems on production systems. > 4. Alexander proposed to change the module name kern to core. I'm not > sure about this as kern refers to the filename, like io and event do - core for kern - core_io for io (maybe) - core_event for event (maybe) This way you can use gmirror, graid3, ... later as module names and people/sysadmins without much GEOM knowledge don't have a problem to see distinguish with real GEOM core stuff and stuff in GEOM providers. > 7. What about g_bioq_(un)lock functions, I just added one probe for > it, I do not really see a point in adding entry and return probes > (they are there with FBT anyway). This is consistent with > g_topology_lock etc. 
What about making macros of the two functions
> like g_topology_lock

Regarding FBT: the advantage of the geom dtrace-provider is that you can ask it for everything in geom, while with the fbt dtrace-provider you need to know the naming conventions in the kernel. So while fbt does give you access to the probes, a sysadmin who does not know much about GEOM can get a meaningful overview if entry and return probes are available in the geom dtrace-provider.

A lot of places in the kernel do not have a naming convention like GEOM, so when we handle them (e.g. the linuxulator), we need to add entry/return probes so that sysadmins without knowledge about kernel internals can search for solutions to their problems. We should be consistent in our probe naming, else it's not easy to use dtrace.

> 9. Writing hundreds of entry and return probes is boring, especially
> as there is the FBT provider. Maybe it's possible to give the FBT
> probes better names like geom:io:g_io_schedule_down:entry instead of
> fbt:kernel:g_io_schedule:entry. Every FBT probe has a provider of fbt
> and module of kernel right now. One also has to define the argument
> types which I think FBT figures out by itself. If this would work we
> could concentrate on adding SDT probes to interesting places inside of
> functions or macros

Both ways have good and bad parts. One argument against this is staying synchronized with vendor code. Another one is the complexity of handling this (currently the fbt part is automatic; I don't see a way to bring related stuff that is spread over several places into the same namespace without introducing hints in different places that tell what belongs where).

HTH,
Alexander.

--
"They shot him five times. But he's tough."
-- Santino Corleone, "Chapter 2", page 79 http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From rick-freebsd2008 at kiwi-computer.com Thu Dec 11 12:57:01 2008 From: rick-freebsd2008 at kiwi-computer.com (Rick C. Petty) Date: Thu Dec 11 12:57:06 2008 Subject: Encrypting raid5 volume with geli In-Reply-To: <4940FF0F.2020404@field.hu> References: <4940FF0F.2020404@field.hu> Message-ID: <20081211205659.GA72478@keira.kiwi-computer.com> On Thu, Dec 11, 2008 at 12:52:47PM +0100, oxy wrote: > Is there any method to use encrypted raid5 volumes? > i created the raid5 volume with gvinum, works fine, but when i try to > encrypt it: > geli init -P -K /root/enc.key /dev/gvinum/raid5 > it says: > Cannot store Metadata....Operation not permitted > any ideas? > thank you! What's the output of "gvinum l"? -- Rick C. Petty From oxy at field.hu Thu Dec 11 14:39:23 2008 From: oxy at field.hu (oxy) Date: Thu Dec 11 14:39:31 2008 Subject: Encrypting raid5 volume with geli In-Reply-To: <20081211205659.GA72478@keira.kiwi-computer.com> References: <4940FF0F.2020404@field.hu> <20081211205659.GA72478@keira.kiwi-computer.com> Message-ID: <49419691.4020403@field.hu> here it is: [root@test /]# gvinum l 4 drives: D disk_2 State: up /dev/ad9 A: 0/238475 MB (0%) D disk_1 State: up /dev/ad8 A: 0/238475 MB (0%) D disk_4 State: up /dev/ad5 A: 0/238475 MB (0%) D disk_3 State: up /dev/ad4 A: 0/238475 MB (0%) 1 volume: V raid5 State: down Plexes: 1 Size: 698 GB 1 plex: P raid5.p0 R5 State: down Subdisks: 4 Size: 698 GB 4 subdisks: S raid5.p0.s0 State: stale D: disk_1 Size: 232 GB S raid5.p0.s1 State: stale D: disk_2 Size: 232 GB S raid5.p0.s2 State: stale D: disk_3 Size: 232 GB S raid5.p0.s3 State: stale D: disk_4 Size: 232 GB [root@test /]# geli init -P -K /root/raid5.key /dev/gvinum/raid5 geli: Cannot store metadata on /dev/gvinum/raid5: Device not configured. Rick C. 
Petty wrote:
> On Thu, Dec 11, 2008 at 12:52:47PM +0100, oxy wrote:
>
>> Is there any method to use encrypted raid5 volumes?
>> i created the raid5 volume with gvinum, works fine, but when i try to
>> encrypt it:
>> geli init -P -K /root/enc.key /dev/gvinum/raid5
>> it says:
>> Cannot store Metadata....Operation not permitted
>> any ideas?
>> thank you!
>>
>
> What's the output of "gvinum l"?
>
> -- Rick C. Petty
> _______________________________________________
> freebsd-geom@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-geom
> To unsubscribe, send any mail to "freebsd-geom-unsubscribe@freebsd.org"
>

From rick-freebsd2008 at kiwi-computer.com Thu Dec 11 20:01:38 2008
From: rick-freebsd2008 at kiwi-computer.com (Rick C. Petty)
Date: Thu Dec 11 20:01:44 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To: <49419680.4010003@field.hu>
References: <4940FF0F.2020404@field.hu> <20081211205659.GA72478@keira.kiwi-computer.com> <49419680.4010003@field.hu>
Message-ID: <20081212040137.GA76422@keira.kiwi-computer.com>

On Thu, Dec 11, 2008 at 11:38:56PM +0100, oxy wrote:
> here it is:
>
> [root@test /]# gvinum l
> 4 drives:
> D disk_2                State: up       /dev/ad9        A: 0/238475 MB (0%)
> D disk_1                State: up       /dev/ad8        A: 0/238475 MB (0%)
> D disk_4                State: up       /dev/ad5        A: 0/238475 MB (0%)
> D disk_3                State: up       /dev/ad4        A: 0/238475 MB (0%)
>
> 1 volume:
> V raid5                 State: down     Plexes: 1       Size: 698 GB
>
> 1 plex:
> P raid5.p0           R5 State: down     Subdisks: 4     Size: 698 GB
>
> 4 subdisks:
> S raid5.p0.s0           State: stale    D: disk_1       Size: 232 GB
> S raid5.p0.s1           State: stale    D: disk_2       Size: 232 GB
> S raid5.p0.s2           State: stale    D: disk_3       Size: 232 GB
> S raid5.p0.s3           State: stale    D: disk_4       Size: 232 GB
>
> [root@test /]# geli init -P -K /root/raid5.key /dev/gvinum/raid5
> geli: Cannot store metadata on /dev/gvinum/raid5: Device not configured.

The error message is quite accurate-- the raid5 volume is down because the plex is stale.
You need to run a "gvinum start raid5.p0" and let it complete before the volume will be "up". This operation will sync the four subdisks and write out the parity info.

There are a set of patches that lulf@ has which I believe put the volume in "up" state initially instead of "down", but maybe it only works for mirrors. The code in current and RELENG_7 does initially put the volume in "down" state.

-- Rick C. Petty

From ulf.lilleengen at gmail.com Fri Dec 12 00:36:41 2008
From: ulf.lilleengen at gmail.com (Ulf Lilleengen)
Date: Fri Dec 12 00:36:47 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To: <20081212040137.GA76422@keira.kiwi-computer.com>
References: <4940FF0F.2020404@field.hu> <20081211205659.GA72478@keira.kiwi-computer.com> <49419680.4010003@field.hu> <20081212040137.GA76422@keira.kiwi-computer.com>
Message-ID: <20081212083708.GA1455@carrot.studby.ntnu.no>

On Thu, Dec 11, 2008 at 10:01:37pm -0600, Rick C. Petty wrote:
*snip*
> There are a set of patches that lulf@ has which I believe put the volume in
> "up" state initially instead of "down", but maybe it only works for
> mirrors. The code in current and RELENG_7 does initially put the volume in
> "down" state.
>
Yes, it only works for mirrors, since I thought it doesn't really matter if a mirror is properly initialized, since the user needs to put data into the mirror for it to be useful anyway. The same goes for RAID-5 I guess, but I was not sure if it might trigger some weird behaviour since parity would not match if reading the volume. I will test out a small modification I made, which removes the need to run 'gvinum start' on the raid5 plexes.
-- Ulf Lilleengen

From oxy at field.hu Fri Dec 12 02:03:40 2008
From: oxy at field.hu (oxy)
Date: Fri Dec 12 02:03:47 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To: <20081212083708.GA1455@carrot.studby.ntnu.no>
References: <4940FF0F.2020404@field.hu> <20081211205659.GA72478@keira.kiwi-computer.com> <49419680.4010003@field.hu> <20081212040137.GA76422@keira.kiwi-computer.com> <20081212083708.GA1455@carrot.studby.ntnu.no>
Message-ID: <494236F2.5040803@field.hu>

You mean that after every reboot i have to start the raid5 manually and re-sync?

Ulf Lilleengen wrote:
> On Thu, Dec 11, 2008 at 10:01:37pm -0600, Rick C. Petty wrote:
> *snip*
>
>> There are a set of patches that lulf@ has which I believe put the volume in
>> "up" state initially instead of "down", but maybe it only works for
>> mirrors. The code in current and RELENG_7 does initially put the volume in
>> "down" state.
>>
>>
> Yes, it only works for mirrors, since I thought it doesn't really matter if a
> mirror is properly initialized, since the user needs to put data into the
> mirror for it to be useful anyway. The same goes for RAID-5 I guess, but I
> was not sure if it might trigger some weird behaviour since parity would not
> match if reading the volume. I will test out a small modification I made,
> which removes the need to run 'gvinum start' on the raid5 plexes.
>
>

From ivoras at freebsd.org Fri Dec 12 02:07:55 2008
From: ivoras at freebsd.org (Ivan Voras)
Date: Fri Dec 12 02:08:01 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To: <20081212083708.GA1455@carrot.studby.ntnu.no>
References: <4940FF0F.2020404@field.hu> <20081211205659.GA72478@keira.kiwi-computer.com> <49419680.4010003@field.hu> <20081212040137.GA76422@keira.kiwi-computer.com> <20081212083708.GA1455@carrot.studby.ntnu.no>
Message-ID:

Ulf Lilleengen wrote:
> On Thu, Dec 11, 2008 at 10:01:37pm -0600, Rick C.
Petty wrote: > *snip* >> There are a set of patches that lulf@ has which I believe put the volume in >> "up" state initially instead of "down", but maybe it only works for >> mirrors. The code in current and RELENG_7 does initially put the volume in >> "down" state. >> > Yes, it only works for mirrors, since I thought it doesn't really matter if a > mirror is properly initialized, since the user need to put data into the > mirror for it to be useful anyway. The same goes for RAID-5 I guess, but I > was not sure if it might trigger some weird behaviour since parity would not > match if reading the volume. I will test out a small modification I made, > which removes the need to run 'gvinum start' on the raid5 plexes. It doesn't have to be "weird" behaviour, depending on whether gvinum checks parity on reads (does it?). If it does, it will only have to ignore checksum errors in this case. I suppose people will want to run utilities like diskinfo -vt on the volume with invalid parities so it's not a theoretical scenario :) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 252 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20081212/2808d169/signature.pgp From ulf.lilleengen at gmail.com Fri Dec 12 03:58:28 2008 From: ulf.lilleengen at gmail.com (Ulf Lilleengen) Date: Fri Dec 12 03:58:43 2008 Subject: Encrypting raid5 volume with geli In-Reply-To: <494236F2.5040803@field.hu> References: <4940FF0F.2020404@field.hu> <20081211205659.GA72478@keira.kiwi-computer.com> <49419680.4010003@field.hu> <20081212040137.GA76422@keira.kiwi-computer.com> <20081212083708.GA1455@carrot.studby.ntnu.no> <494236F2.5040803@field.hu> Message-ID: <20081212125905.GA39875@carrot.studby.ntnu.no> On Fri, Dec 12, 2008 at 11:03:30AM +0100, oxy wrote: > You mean that after every reboot i have to start the raid5 manually and > re-sync? 
> No, just initially after creating it, and if one of the disks fails, in which case you will probably put in a new one and re-sync it. -- Ulf Lilleengen From ulf.lilleengen at gmail.com Fri Dec 12 04:08:27 2008 From: ulf.lilleengen at gmail.com (Ulf Lilleengen) Date: Fri Dec 12 04:08:34 2008 Subject: Encrypting raid5 volume with geli In-Reply-To: References: <4940FF0F.2020404@field.hu> <20081211205659.GA72478@keira.kiwi-computer.com> <49419680.4010003@field.hu> <20081212040137.GA76422@keira.kiwi-computer.com> <20081212083708.GA1455@carrot.studby.ntnu.no> Message-ID: <20081212130848.GB39875@carrot.studby.ntnu.no> On Fri, Dec 12, 2008 at 11:07:40AM +0100, Ivan Voras wrote: > Ulf Lilleengen wrote: > > On tor, des 11, 2008 at 10:01:37pm -0600, Rick C. Petty wrote: > > *snip* > >> There are a set of patches that lulf@ has which I believe put the volume in > >> "up" state initially instead of "down", but maybe it only works for > >> mirrors. The code in current and RELENG_7 does initially put the volume in > >> "down" state. > >> > > Yes, it only works for mirrors, since I thought it doesn't really matter if a > > mirror is properly initialized, since the user need to put data into the > > mirror for it to be useful anyway. The same goes for RAID-5 I guess, but I > > was not sure if it might trigger some weird behaviour since parity would not > > match if reading the volume. I will test out a small modification I made, > > which removes the need to run 'gvinum start' on the raid5 plexes. > > It doesn't have to be "weird" behaviour, depending on whether gvinum > checks parity on reads (does it?). If it does, it will only have to > ignore checksum errors in this case. It does check parity on reads. But I think it doesn't matter, since no sane data has been written in that block anyway. 
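
For readers following the parity discussion, here is a minimal sketch of what "checking parity on reads" amounts to for a single RAID-5 stripe. It is not gvinum's code: the function names `raid5_compute_parity` and `raid5_parity_ok` are hypothetical, and it assumes plain XOR parity over equally sized in-memory blocks. On a plex that was never synced, the stored parity is whatever happened to be on disk, so a check like this would be expected to fail until the stripe is written or initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from gvinum): XOR `ndata` data blocks of
 * `len` bytes each into `parity`, which is what a full-stripe write
 * or an initial sync would store on the parity subdisk.
 */
void
raid5_compute_parity(const uint8_t *const *data, size_t ndata, size_t len,
    uint8_t *parity)
{
	assert(ndata > 0 && len > 0);
	memset(parity, 0, len);
	for (size_t d = 0; d < ndata; d++)
		for (size_t i = 0; i < len; i++)
			parity[i] ^= data[d][i];
}

/*
 * Hypothetical helper: return 1 if the stored parity block matches the
 * XOR of the data blocks, 0 otherwise (a "parity error" on read).
 */
int
raid5_parity_ok(const uint8_t *const *data, size_t ndata, size_t len,
    const uint8_t *stored)
{
	assert(ndata > 0 && len > 0);
	for (size_t i = 0; i < len; i++) {
		uint8_t x = 0;

		for (size_t d = 0; d < ndata; d++)
			x ^= data[d][i];
		if (x != stored[i])
			return (0);
	}
	return (1);
}
```

Calling `raid5_parity_ok()` on a stripe whose parity came from `raid5_compute_parity()` returns 1; flipping any byte of the stored parity makes it return 0, which is the situation an uninitialized stripe would present to a parity-checking read.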
But as you say, one way to handle it is to ignore the checksums if the data is known to not be initialized, but then wouldn't one have to keep track of which blocks have a valid parity and which who does not? > I suppose people will want to run utilities like diskinfo -vt on the > volume with invalid parities so it's not a theoretical scenario :) > I guess, but I then one can just initialize the volume anyway. -- Ulf Lilleengen From ivoras at freebsd.org Fri Dec 12 04:29:39 2008 From: ivoras at freebsd.org (Ivan Voras) Date: Fri Dec 12 04:29:46 2008 Subject: Encrypting raid5 volume with geli In-Reply-To: <20081212130848.GB39875@carrot.studby.ntnu.no> References: <4940FF0F.2020404@field.hu> <20081211205659.GA72478@keira.kiwi-computer.com> <49419680.4010003@field.hu> <20081212040137.GA76422@keira.kiwi-computer.com> <20081212083708.GA1455@carrot.studby.ntnu.no> <20081212130848.GB39875@carrot.studby.ntnu.no> Message-ID: Ulf Lilleengen wrote: > On Fri, Dec 12, 2008 at 11:07:40AM +0100, Ivan Voras wrote: >> Ulf Lilleengen wrote: >>> On tor, des 11, 2008 at 10:01:37pm -0600, Rick C. Petty wrote: >>> *snip* >>>> There are a set of patches that lulf@ has which I believe put the volume in >>>> "up" state initially instead of "down", but maybe it only works for >>>> mirrors. The code in current and RELENG_7 does initially put the volume in >>>> "down" state. >>>> >>> Yes, it only works for mirrors, since I thought it doesn't really matter if a >>> mirror is properly initialized, since the user need to put data into the >>> mirror for it to be useful anyway. The same goes for RAID-5 I guess, but I >>> was not sure if it might trigger some weird behaviour since parity would not >>> match if reading the volume. I will test out a small modification I made, >>> which removes the need to run 'gvinum start' on the raid5 plexes. >> It doesn't have to be "weird" behaviour, depending on whether gvinum >> checks parity on reads (does it?). 
If it does, it will only have to >> ignore checksum errors in this case. > It does check parity on reads. But I think it doesn't matter, since no sane data > has been written in that block anyway. > > But as you say, one way to handle it is to ignore the checksums if the data > is known to not be initialized, but then wouldn't one have to keep track of > which blocks have a valid parity and which who does not? > >> I suppose people will want to run utilities like diskinfo -vt on the >> volume with invalid parities so it's not a theoretical scenario :) >> > I guess, but I then one can just initialize the volume anyway. In the interest of simplicity, maybe a single, well documented flag that says "don't check checksums on reads" will do best. It will also probably help read performance. Of course, it should be off by default and its implications explained in the man page :) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 252 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20081212/8fb2f51f/signature.pgp From rick-freebsd2008 at kiwi-computer.com Fri Dec 12 07:50:27 2008 From: rick-freebsd2008 at kiwi-computer.com (Rick C. Petty) Date: Fri Dec 12 07:50:38 2008 Subject: Encrypting raid5 volume with geli In-Reply-To: <20081212130848.GB39875@carrot.studby.ntnu.no> References: <4940FF0F.2020404@field.hu> <20081211205659.GA72478@keira.kiwi-computer.com> <49419680.4010003@field.hu> <20081212040137.GA76422@keira.kiwi-computer.com> <20081212083708.GA1455@carrot.studby.ntnu.no> <20081212130848.GB39875@carrot.studby.ntnu.no> Message-ID: <20081212155023.GA82667@keira.kiwi-computer.com> On Fri, Dec 12, 2008 at 02:08:49PM +0100, Ulf Lilleengen wrote: > > > > It doesn't have to be "weird" behaviour, depending on whether gvinum > > checks parity on reads (does it?). If it does, it will only have to > > ignore checksum errors in this case. 
> It does check parity on reads. But I think it doesn't matter, since no sane data > has been written in that block anyway. > > But as you say, one way to handle it is to ignore the checksums if the data > is known to not be initialized, but then wouldn't one have to keep track of > which blocks have a valid parity and which who does not? IMO, trying to read a block that hasn't been initialized is perfectly acceptable as an "error". I would just mark the volume as up. Chances are a sane admin would be writing to the blocks before reading the same blocks (except in the disk test scenario, in which case why are they bothering with a raid5?). If a read happens on a block that hasn't been written to, it is a parity error. The real question is what happens when parity errors happen. I guess I suggest allowing you to force the plex up (via setstate) if you are pretty confident reads will only happen after writes (which is the case after newfs-ing the volume). In either case, always mark the volume as up.. the plex can be in a degraded state, meaning the parity has failed but reads can still happen. It sounds perfect to me; the states reflect the actual state of things. -- Rick C. Petty From mikej at paymentallianceintl.com Fri Dec 12 10:06:23 2008 From: mikej at paymentallianceintl.com (Michael Jung) Date: Fri Dec 12 10:06:34 2008 Subject: Encrypting raid5 volume with geli In-Reply-To: <20081212155023.GA82667@keira.kiwi-computer.com> Message-ID: FreeBSD charon.confluentasp.com 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #2: Thu Sep 4 12:06:08 EDT 2008 In the interest of this thread I tried to duplicate the problem. 
I created:

10 drives:
D d9                    State: up       /dev/da9        A: 0/17366 MB (0%)
D d8                    State: up       /dev/da8        A: 0/17366 MB (0%)
D d7                    State: up       /dev/da7        A: 0/17366 MB (0%)
D d6                    State: up       /dev/da6        A: 0/17366 MB (0%)
D d5                    State: up       /dev/da5        A: 0/17366 MB (0%)
D d4                    State: up       /dev/da4        A: 0/17366 MB (0%)
D d3                    State: up       /dev/da3        A: 0/17366 MB (0%)
D d2                    State: up       /dev/da2        A: 0/17366 MB (0%)
D d1                    State: up       /dev/da1        A: 0/17366 MB (0%)
D d0                    State: up       /dev/da0        A: 0/17366 MB (0%)

1 volume:
V test                  State: up       Plexes: 1       Size: 152 GB

1 plex:
P test.p0            R5 State: up       Subdisks: 10    Size: 152 GB

10 subdisks:
S test.p0.s9            State: up       D: d9           Size: 16 GB
S test.p0.s8            State: up       D: d8           Size: 16 GB
S test.p0.s7            State: up       D: d7           Size: 16 GB
S test.p0.s6            State: up       D: d6           Size: 16 GB
S test.p0.s5            State: up       D: d5           Size: 16 GB
S test.p0.s4            State: up       D: d4           Size: 16 GB
S test.p0.s3            State: up       D: d3           Size: 16 GB
S test.p0.s2            State: up       D: d2           Size: 16 GB
S test.p0.s1            State: up       D: d1           Size: 16 GB
S test.p0.s0            State: up       D: d0           Size: 16 GB

Which I can newfs and mount

(root@charon) /etc# mount /dev/gvinum/test /mnt
(root@charon) /etc# df -h
Filesystem                 Size    Used   Avail Capacity  Mounted on
/dev/ad4s1a                357G    119G    209G    36%    /
devfs                      1.0K    1.0K      0B   100%    /dev
172.0.255.28:/data/unix    1.3T    643G    559G    54%    /nas1
/dev/gvinum/test           148G    4.0K    136G     0%    /mnt

But with /dev/gvinum/test unmounted if I try:

(root@charon) /etc# geli init -P -K /root/test.key /dev/gvinum/test
geli: Cannot store metadata on /dev/gvinum/test: Operation not permitted.
(root@charon) /etc#

My random file was created like

dd if=/dev/random of=/root/test.key bs=64 count=1

I use GELI at home with no trouble, although not with a gvinum volume.

--mikej

CONFIDENTIALITY NOTE: This message is intended only for the use of the individual or entity to whom it is addressed and may contain information that is privileged, confidential, and exempt from disclosure under applicable law.
If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this transmission in error, please notify us by telephone at (502) 212-4001 or notify us at PAI , Dept. 99, 11857 Commonwealth Drive, Louisville, KY 40299. Thank you.

From ulf.lilleengen at gmail.com Sat Dec 13 06:17:05 2008
From: ulf.lilleengen at gmail.com (Ulf Lilleengen)
Date: Sat Dec 13 06:17:12 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To: <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>
References: <20081212155023.GA82667@keira.kiwi-computer.com> <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>
Message-ID: <917871cf0812130617s2c321612m1497bc2de8aa8501@mail.gmail.com>

On Sat, Dec 13, 2008 at 2:59 PM, Ulf Lilleengen wrote:
>
>
> On Fri, Dec 12, 2008 at 5:00 PM, Michael Jung <
> mikej@paymentallianceintl.com> wrote:
>
>> FreeBSD charon.confluentasp.com 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE
>> #2: Thu Sep 4 12:06:08 EDT 2008
>>
> *snip*
>
>>
> I hope to commit the attached change in the near future.
>
> --
> Ulf Lilleengen
>
Done, rev 186038

-- Ulf Lilleengen

From ulf.lilleengen at gmail.com Sat Dec 13 06:30:22 2008
From: ulf.lilleengen at gmail.com (Ulf Lilleengen)
Date: Sat Dec 13 06:30:29 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To:
References: <20081212155023.GA82667@keira.kiwi-computer.com>
Message-ID: <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>

On Fri, Dec 12, 2008 at 5:00 PM, Michael Jung wrote:
> FreeBSD charon.confluentasp.com 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE
> #2: Thu Sep 4 12:06:08 EDT 2008
>
> In the interest of this thread I tried to duplicate the problem. I
> created:
>
> *snip*
>
> I use GELI at home with no trouble, although not with a gvinum volume.
>

Hello,

When I tried this myself, I also got the EPERM error in return. I thought this was very strange. I went through the gvinum code today, and put debugging prints everywhere, but everything looked fine, and it was only raid5 volumes that failed. Then I saw that the EPERM error came from the underlying providers of geom (more specifically from the read requests to the parity stripes etc), so I was starting to suspect that it was not a gvinum error. But still, I was able to write/read from the disks from outside of gvinum!

Then, I discovered in geom userland code that it opens the disk where metadata should be written in write only mode. Then I discovered the reason: gvinum tries to write to the stripe in question, but has to read back the parity data from one of the other stripes. But, they are opened O_WRONLY, so the request fails. I tried opening the device as O_RDWR, and everything is fine.

Phew :) You can bet I was frustrated.

I hope to commit the attached change in the near future.

-- Ulf Lilleengen
-------------- next part --------------
A non-text attachment was scrubbed...
Name: geomfix.diff
Type: application/octet-stream
Size: 316 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20081213/15cbd4da/geomfix.obj

From oxy at field.hu Sat Dec 13 12:22:36 2008
From: oxy at field.hu (oxy@field.hu)
Date: Sat Dec 13 12:22:43 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To: <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>
References: <20081212155023.GA82667@keira.kiwi-computer.com> <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>
Message-ID: <3934.79.122.6.53.1229199747.squirrel@webmail.field.hu>

as i read it seems that it's useless for sync the raid after initing, i have no chance to encrypt it with geli, am i right?
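
The diagnosis in Ulf's message above (a small RAID-5 write needs to read old data and parity, so a write-only open is not enough) can be sketched in userland C. This is only an analogy under stated assumptions, not the geom or gvinum code: `raid5_rmw_parity` and `read_errno_with_flags` are hypothetical names, the "device" is a scratch file under /tmp, and a plain POSIX descriptor reports the failure as EBADF, whereas in the thread GEOM's access checks surfaced it as EPERM.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read-modify-write parity update for one byte of a RAID-5 stripe:
 * the new parity depends on the OLD data and OLD parity, so both
 * have to be read before the write can complete.
 */
uint8_t
raid5_rmw_parity(uint8_t old_parity, uint8_t old_data, uint8_t new_data)
{
	return (old_parity ^ old_data ^ new_data);
}

/*
 * Hypothetical probe: create a one-byte scratch file, reopen it with
 * `flags`, and attempt the read half of a read-modify-write.
 * Returns 0 if the read succeeded, otherwise an errno value.
 */
int
read_errno_with_flags(int flags)
{
	char path[] = "/tmp/raid5_rmw.XXXXXX";
	char byte = 'x';
	ssize_t n;
	int fd, err = 0;

	fd = mkstemp(path);
	if (fd < 0)
		return (errno);
	n = write(fd, &byte, 1);
	if (n < 0)
		err = errno;
	else if (n != 1)
		err = EIO;
	close(fd);

	if (err == 0) {
		fd = open(path, flags);
		if (fd < 0) {
			err = errno;
		} else {
			n = read(fd, &byte, 1);
			if (n < 0)
				err = errno;	/* EBADF for O_WRONLY */
			else if (n != 1)
				err = EIO;
			close(fd);
		}
	}
	unlink(path);
	return (err);
}
```

With these helpers, `read_errno_with_flags(O_RDWR)` succeeds while `read_errno_with_flags(O_WRONLY)` fails, mirroring why switching the metadata open from O_WRONLY to O_RDWR made the parity read-back possible.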
On Szo, December 13, 2008 14:59, Ulf Lilleengen wrote:
> On Fri, Dec 12, 2008 at 5:00 PM, Michael Jung wrote:
>> *snip*
> *snip*

From rick-freebsd2008 at kiwi-computer.com Sat Dec 13 13:28:36 2008
From: rick-freebsd2008 at kiwi-computer.com (Rick C.
Petty) Date: Sat Dec 13 13:28:42 2008 Subject: Encrypting raid5 volume with geli In-Reply-To: <3934.79.122.6.53.1229199747.squirrel@webmail.field.hu> References: <20081212155023.GA82667@keira.kiwi-computer.com> <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com> <3934.79.122.6.53.1229199747.squirrel@webmail.field.hu> Message-ID: <20081213212835.GA99136@keira.kiwi-computer.com> On Sat, Dec 13, 2008 at 09:22:27PM +0100, oxy@field.hu wrote: > as i read it seems that it's useless for sync the raid after initing, i Not sure what you mean here. RAID5 sync after creation is pretty typical. I think it many cases it's unnecessary, but the current gvinum code does perform a sync-after-create for raid5. > have no chance to encrypt it with geli, am i right? With the patch lulf@ just committed, you should be able to geli a raid5 volume under gvinum. You may have to wait for it to be MFC'd if you're not using HEAD (FreeBSD-CURRENT). You could also apply his patch to your source code and rebuild from /usr/src/sbin/gvinum/ and it should work for you. -- Rick C. Petty From ulf.lilleengen at gmail.com Sat Dec 13 16:17:24 2008 From: ulf.lilleengen at gmail.com (Ulf Lilleengen) Date: Sat Dec 13 16:17:31 2008 Subject: Encrypting raid5 volume with geli In-Reply-To: <3934.79.122.6.53.1229199747.squirrel@webmail.field.hu> References: <20081212155023.GA82667@keira.kiwi-computer.com> <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com> <3934.79.122.6.53.1229199747.squirrel@webmail.field.hu> Message-ID: <917871cf0812131617m4dff5295xe1eeb6d83a568fe@mail.gmail.com> On Sat, Dec 13, 2008 at 9:22 PM, wrote: > > as i read it seems that it's useless for sync the raid after initing, i > have no chance to encrypt it with geli, am i right? As Rick says, you can try to apply the patch itself. You should just have to rebuild and reinstall src/sbin/geom for it to take effect. If not, I'm planning to MFC it pretty soon anyway. 
-- Ulf Lilleengen

From mikej at paymentallianceintl.com Sat Dec 13 18:07:29 2008
From: mikej at paymentallianceintl.com (Michael Jung)
Date: Sat Dec 13 18:07:36 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To: <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>
References: <20081212155023.GA82667@keira.kiwi-computer.com>
    <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>
Message-ID:

From: Ulf Lilleengen [mailto:ulf.lilleengen@gmail.com]
Sent: Saturday, December 13, 2008 8:59 AM
To: Michael Jung
Cc: freebsd-geom@freebsd.org
Subject: Re: Encrypting raid5 volume with geli

On Fri, Dec 12, 2008 at 5:00 PM, Michael Jung wrote:

FreeBSD charon.confluentasp.com 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #2:
Thu Sep 4 12:06:08 EDT 2008

In the interest of this thread I tried to duplicate the problem. I created:

10 drives:
D d9            State: up    /dev/da9    A: 0/17366 MB (0%)
D d8            State: up    /dev/da8    A: 0/17366 MB (0%)
D d7            State: up    /dev/da7    A: 0/17366 MB (0%)
D d6            State: up    /dev/da6    A: 0/17366 MB (0%)
D d5            State: up    /dev/da5    A: 0/17366 MB (0%)
D d4            State: up    /dev/da4    A: 0/17366 MB (0%)
D d3            State: up    /dev/da3    A: 0/17366 MB (0%)
D d2            State: up    /dev/da2    A: 0/17366 MB (0%)
D d1            State: up    /dev/da1    A: 0/17366 MB (0%)
D d0            State: up    /dev/da0    A: 0/17366 MB (0%)

1 volume:
V test          State: up    Plexes: 1      Size: 152 GB

1 plex:
P test.p0    R5 State: up    Subdisks: 10   Size: 152 GB

10 subdisks:
S test.p0.s9    State: up    D: d9    Size: 16 GB
S test.p0.s8    State: up    D: d8    Size: 16 GB
S test.p0.s7    State: up    D: d7    Size: 16 GB
S test.p0.s6    State: up    D: d6    Size: 16 GB
S test.p0.s5    State: up    D: d5    Size: 16 GB
S test.p0.s4    State: up    D: d4    Size: 16 GB
S test.p0.s3    State: up    D: d3    Size: 16 GB
S test.p0.s2    State: up    D: d2    Size: 16 GB
S test.p0.s1    State: up    D: d1    Size: 16 GB
S test.p0.s0    State: up    D: d0    Size: 16 GB

Which I can newfs and mount

(root@charon) /etc# mount /dev/gvinum/test /mnt
(root@charon) /etc# df -h
Filesystem                 Size    Used   Avail Capacity  Mounted on
/dev/ad4s1a                357G    119G    209G    36%    /
devfs                      1.0K    1.0K      0B   100%    /dev
172.0.255.28:/data/unix    1.3T    643G    559G    54%    /nas1
/dev/gvinum/test           148G    4.0K    136G     0%    /mnt

But with /dev/gvinum/test unmounted if I try:

(root@charon) /etc# geli init -P -K /root/test.key /dev/gvinum/test
geli: Cannot store metadata on /dev/gvinum/test: Operation not permitted.
(root@charon) /etc#

My random file was created like

dd if=/dev/random of=/root/test.key bs=64 count=1

I use GELI at home with no trouble, although not with a gvinum volume.

Hello,

When I tried this myself, I also got the EPERM error in return. I thought this
was very strange. I went through the gvinum code today, and put debugging
prints everywhere, but everything looked fine, and it was only raid5 volumes
that failed. Then I saw that the EPERM error came from the underlying
providers of geom (more specifically from the read requests to the parity
stripes etc), so I was starting to suspect that it was not a gvinum error. But
still, I was able to write/read from the disks from outside of gvinum!

Then, I discovered in geom userland code that it opens the disk where
metadata should be written in write only mode. Then I discovered the reason:
gvinum tries to write to the stripe in question, but has to read back the
parity data from one of the other stripes. But, they are opened O_WRONLY, so
the request fails. I tried opening the device as O_RDWR, and everything is
fine.

Phew :) You can bet I was frustrated.

I hope to commit the attached change in the near future.

-- Ulf Lilleengen

I+++++++++++++++++++++++++++++++++

7.1-PRERELEASE #0: Sat Dec 13 15:09:38 EST 2008

I just cvsup and applied your patch, now:

(root@charon) /etc# geli init -P -K /root/test.key /dev/gvinum/test
(root@charon) /etc# geli attach -p -k /root/test.key /dev/gvinum/test
(root@charon) /etc# newfs /dev/gvinum/test.eli
/dev/gvinum/test.eli: 121564.2MB (248963480 sectors) block size 16384, fragment size 2048
        using 662 cylinder groups of 183.77MB, 11761 blks, 23552 inodes.
super-block backups (for fsck -b #) at:
 160, 376512, 752864, 1129216, 1505568, 1881920, 2258272,...........
(root@charon) /etc# mount /dev/gvinum/test.eli /mnt
(root@charon) /etc# df -h /mnt
Filesystem              Size    Used   Avail Capacity  Mounted on
/dev/gvinum/test.eli    115G    4.0K    106G     0%    /mnt
(root@charon) /etc#

I exercise it some but patch looks good!

--mikej

CONFIDENTIALITY NOTE: This message is intended only for the use of the
individual or entity to whom it is addressed and may contain information that
is privileged, confidential, and exempt from disclosure under applicable law.
If the reader of this message is not the intended recipient, you are hereby
notified that any dissemination, distribution or copying of this communication
is strictly prohibited. If you have received this transmission in error,
please notify us by telephone at (502) 212-4001 or notify us at PAI , Dept. 99,
11857 Commonwealth Drive, Louisville, KY 40299. Thank you.

From oxy at field.hu Sun Dec 14 05:42:16 2008
From: oxy at field.hu (oxy)
Date: Sun Dec 14 05:42:23 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To: <20081213212835.GA99136@keira.kiwi-computer.com>
References: <20081212155023.GA82667@keira.kiwi-computer.com>
    <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>
    <3934.79.122.6.53.1229199747.squirrel@webmail.field.hu>
    <20081213212835.GA99136@keira.kiwi-computer.com>
Message-ID: <49450D2B.9000304@field.hu>

I am using 7.0-RELEASE
can you guys give me a link where can I download the patch?
thank you

Rick C. Petty wrote:
> On Sat, Dec 13, 2008 at 09:22:27PM +0100, oxy@field.hu wrote:
>
>> as i read it seems that it's useless for sync the raid after initing, i
>>
>
> Not sure what you mean here. RAID5 sync after creation is pretty typical.
> I think in many cases it's unnecessary, but the current gvinum code does
> perform a sync-after-create for raid5.
>
>
>> have no chance to encrypt it with geli, am i right?
>>
>
> With the patch lulf@ just committed, you should be able to geli a raid5
> volume under gvinum. You may have to wait for it to be MFC'd if you're not
> using HEAD (FreeBSD-CURRENT). You could also apply his patch to your
> source code and rebuild from /usr/src/sbin/gvinum/ and it should work for
> you.
>
> -- Rick C. Petty
> _______________________________________________
> freebsd-geom@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-geom
> To unsubscribe, send any mail to "freebsd-geom-unsubscribe@freebsd.org"
>

From ulf.lilleengen at gmail.com Sun Dec 14 10:36:52 2008
From: ulf.lilleengen at gmail.com (Ulf Lilleengen)
Date: Sun Dec 14 10:36:58 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To: <49450D2B.9000304@field.hu>
References: <20081212155023.GA82667@keira.kiwi-computer.com>
    <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>
    <3934.79.122.6.53.1229199747.squirrel@webmail.field.hu>
    <20081213212835.GA99136@keira.kiwi-computer.com>
    <49450D2B.9000304@field.hu>
Message-ID: <20081214170659.GA12437@nobby>

On Sun, Dec 14, 2008 at 02:42:03PM +0100, oxy wrote:
> I am using 7.0-RELEASE
> can you guys give me a link where can I download the patch?
> thank you

Here: http://people.freebsd.org/~lulf/readwritefix.diff

-- Ulf Lilleengen

From jmg at funkthat.com Sun Dec 14 12:17:29 2008
From: jmg at funkthat.com (John-Mark Gurney)
Date: Sun Dec 14 12:17:35 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To: <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>
References: <20081212155023.GA82667@keira.kiwi-computer.com>
    <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>
Message-ID: <20081214195649.GK34842@funkthat.com>

Ulf Lilleengen wrote this message on Sat, Dec 13, 2008 at 14:59 +0100:
> Then, I discovered in geom userland code that it opens the disk where
> metadata should be written in write only mode. Then I discovered the reason:
> gvinum tries to write to the stripe in question, but has to read back the
> parity data from one of the other stripes. But, they are opened O_WRONLY, so
> the request fails. I tried opening the device as O_RDWR, and everything is
> fine.

Isn't this a bug in gvinum that it lets a disk be opened in O_WRONLY,
when it needs read permissions? Shouldn't it add the read permission
to its provider open, and let the underlying fd still be O_WRONLY (so
the OS will prevent any reads) or it should be documented in the
gvinum man page that raid5 volumes cannot be opened in O_WRONLY mode..

-- John-Mark Gurney                Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."

From ulf.lilleengen at gmail.com Sun Dec 14 14:59:15 2008
From: ulf.lilleengen at gmail.com (Ulf Lilleengen)
Date: Sun Dec 14 14:59:21 2008
Subject: Encrypting raid5 volume with geli
In-Reply-To: <20081214195649.GK34842@funkthat.com>
References: <20081212155023.GA82667@keira.kiwi-computer.com>
    <917871cf0812130559r6d423688q57287dd765d6edf4@mail.gmail.com>
    <20081214195649.GK34842@funkthat.com>
Message-ID: <20081214215913.GA3723@nobby>

On Sun, Dec 14, 2008 at 11:56:49AM -0800, John-Mark Gurney wrote:
> Ulf Lilleengen wrote this message on Sat, Dec 13, 2008 at 14:59 +0100:
> > Then, I discovered in geom userland code that it opens the disk where
> > metadata should be written in write only mode. Then I discovered the reason:
> > gvinum tries to write to the stripe in question, but has to read back the
> > parity data from one of the other stripes. But, they are opened O_WRONLY, so
> > the request fails. I tried opening the device as O_RDWR, and everything is
> > fine.
>
> Isn't this a bug in gvinum that it lets a disk be opened in O_WRONLY,
> when it needs read permissions? Shouldn't it add the read permission
> to its provider open, and let the underlying fd still be O_WRONLY (so
> the OS will prevent any reads) or it should be documented in the
> gvinum man page that raid5 volumes cannot be opened in O_WRONLY mode..
>

Yes, I agree. Michael, could you try the attached patch? It should fix the
issue within gvinum itself. The previous change will have to be reverted too.

-- Ulf Lilleengen

From linimon at FreeBSD.org Sun Dec 14 20:50:38 2008
From: linimon at FreeBSD.org (linimon@FreeBSD.org)
Date: Sun Dec 14 20:50:49 2008
Subject: kern/129645: gjournal(8): GEOM_JOURNAL causes system to fail to bood
    due to a GEOM Timeout problem if the Journals and Data are on storage
    provided by separate device drivers.
Message-ID: <200812150450.mBF4ocSe074784@freefall.freebsd.org>

Old Synopsis: GEOM_JOURNAL causes system to fail to bood due to a GEOM Timeout problem if the Journals and Data are on storage provided by separate device drivers.
New Synopsis: gjournal(8): GEOM_JOURNAL causes system to fail to bood due to a GEOM Timeout problem if the Journals and Data are on storage provided by separate device drivers.

Responsible-Changed-From-To: freebsd-bugs->freebsd-geom
Responsible-Changed-By: linimon
Responsible-Changed-When: Mon Dec 15 04:49:42 UTC 2008
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=129645

From avg at icyb.net.ua Mon Dec 15 02:56:25 2008
From: avg at icyb.net.ua (Andriy Gapon)
Date: Mon Dec 15 02:56:36 2008
Subject: partition covering the whole slice [repost]
In-Reply-To:
References: <4939287C.3020208@icyb.net.ua>
Message-ID: <494637A3.9080807@icyb.net.ua>

on 08/12/2008 14:30 Vadim Goncharov said the following:
> Yes, of course. You should not intermix using glabel(8) utilizing /dev/ufs
> (via tunefs) and bsdlabel partition starting from offset 0. This is because
> glabel can't distinguish is that slice or partition - with offset 0 superblock
> will be at the same position.
>
> You can try to erase bsdlabel completely (if this is not your boot partition)
> from the slice and use filesystem directly from the slice. This will not affect
> mount as you're already using labels.
>
> The other way will require shrinking-then-moving partition on the disk and
> editing disklabel, better done with newfs(8).

Vadim,

thanks a lot for the explanation and the advice.
I used gpart destroy to remove bsdlabel and now I have filesystems
covering the whole slices. This works well.

-- Andriy Gapon

From bugmaster at FreeBSD.org Mon Dec 15 03:06:52 2008
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Dec 15 03:08:02 2008
Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org
Message-ID: <200812151106.mBFB6pFm004327@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.  Description
--------------------------------------------------------------------------------
o kern/129645  geom   gjournal(8): GEOM_JOURNAL causes system to fail to boo
o kern/129245  geom   [geom] gcache is more suitable for suffix based provid
o kern/128398  geom   [PATCH] glabel(8): teach geom_label to recognise gpt l
f kern/128276  geom   [gmirror] machine lock up when gmirror module is used
o kern/126902  geom   [geom] [geom_label] Kernel panic during install boot
o kern/124973  geom   [gjournal] [patch] boot order affects geom_journal con
o kern/124969  geom   gvinum(8): gvinum raid5 plex does not detect missing s
o kern/124294  geom   [geom] gmirror(8) have inappropriate logic when workin
o kern/124130  geom   [gmirror][usb] gmirror fails to start usb devices that
o kern/123962  geom   [panic] [gjournal] gjournal (455Gb data, 8Gb journal),
o kern/123630  geom   [patch] [gmirror] gmirror doesnt allow the original dr
o kern/123122  geom   [geom] GEOM / gjournal kernel lock
f kern/122415  geom   [geom] UFS labels are being constantly created and rem
o kern/122067  geom   [geom] [panic] Geom crashed during boot
o kern/121559  geom   [patch] [geom] geom label class allows to create inacc
o kern/121364  geom   [gmirror] Removing all providers create a "zombie" mir
o kern/120231  geom   [geom] GEOM_CONCAT error adding second drive
o kern/120044  geom   [msdosfs] [geom] incorrect MSDOSFS label fries adminis
o kern/120021  geom   [geom] [panic] net-p2p/qbittorrent crashes system when
o kern/119743  geom   [geom] geom label for cds is keeped after dismount and
f kern/115547  geom   [geom] [patch] [request] let GEOM Eli get password fro
o kern/114532  geom   [geom] GEOM_MIRROR shows up in kldstat even if compile
o kern/113957  geom   [gmirror] gmirror is intermittently reporting a degrad
o kern/113837  geom   [geom] unable to access 1024 sector size storage
o kern/113419  geom   [geom] geom fox multipathing not failing back
p bin/110705   geom   gmirror(8) control utility does not exit with correct
o kern/107707  geom   [geom] [patch] [request] add new class geom_xbox360 to
o kern/104389  geom   [geom] [patch] sys/geom/geom_dump.c doesn't encode XML
o kern/98034   geom   [geom] dereference of NULL pointer in acd_geom_detach
o kern/94632   geom   [geom] Kernel output resets input while GELI asks for
o kern/90582   geom   [geom] [panic] Restore cause panic string (ffs_blkfree
o bin/90093    geom   fdisk(8) incapable of altering in-core geometry
a kern/89660   geom   [vinum] [patch] [panic] due to g_malloc returning null
o kern/89546   geom   [geom] GEOM error
s kern/89102   geom   [geom] [panic] panic when forced unmount FS from unplu
o kern/87544   geom   [gbde] mmaping large files on a gbde filesystem deadlo
o kern/84556   geom   [geom] GBDE-encrypted swap causes panic at shutdown
o kern/79251   geom   [2TB] newfs fails on 2.6TB gbde device
o kern/79035   geom   [vinum] gvinum unable to create a striped set of mirro
o bin/78131    geom   gbde(8) "destroy" not working.
s kern/73177   geom   kldload geom_* causes panic due to memory exhaustion

41 problems total.

From avg at icyb.net.ua Mon Dec 15 03:51:29 2008
From: avg at icyb.net.ua (Andriy Gapon)
Date: Mon Dec 15 03:51:35 2008
Subject: gpart: mark slice active
Message-ID: <494644BE.8000307@icyb.net.ua>

How do I mark a slice active/inactive with gpart?

-- Andriy Gapon

From vadim_nuclight at mail.ru Mon Dec 15 04:36:03 2008
From: vadim_nuclight at mail.ru (Vadim Goncharov)
Date: Mon Dec 15 04:36:10 2008
Subject: partition covering the whole slice [repost]
References: <4939287C.3020208@icyb.net.ua> <494637A3.9080807@icyb.net.ua>
Message-ID:

Hi Andriy Gapon!

On Mon, 15 Dec 2008 12:55:31 +0200; Andriy Gapon wrote about 'Re: partition covering the whole slice [repost]':

>> Yes, of course. You should not intermix using glabel(8) utilizing /dev/ufs
>> (via tunefs) and bsdlabel partition starting from offset 0. This is because
>> glabel can't distinguish is that slice or partition - with offset 0 superblock
>> will be at the same position.

Also, that's why bsdlabel(8) creates partition 'a' with offset 16 by default.
>> You can try to erase bsdlabel completely (if this is not your boot partition)
>> from the slice and use filesystem directly from the slice. This will not affect
>> mount as you're already using labels.
>>
>> The other way will require shrinking-then-moving partition on the disk and
>> editing disklabel, better done with newfs(8).

> thanks a lot for the explanation and the advice.
> I used gpart destroy to remove bsdlabel and now I have filesystems
> covering the whole slices. This works well.

Also, check that your fsck doesn't complain about that setup (can't
automatically determine how to check it).

-- WBR, Vadim Goncharov. ICQ#166852181
mailto:vadim_nuclight@mail.ru
[Moderator of RU.ANTI-ECOLOGY][FreeBSD][http://antigreen.org][LJ:/nuclight]

From xcllnt at mac.com Mon Dec 15 15:55:07 2008
From: xcllnt at mac.com (Marcel Moolenaar)
Date: Mon Dec 15 15:55:13 2008
Subject: gpart: mark slice active
In-Reply-To: <494644BE.8000307@icyb.net.ua>
References: <494644BE.8000307@icyb.net.ua>
Message-ID: <03FF0C63-26EA-4A1D-8FDE-A699A3D8DBD9@mac.com>

On Dec 15, 2008, at 3:51 AM, Andriy Gapon wrote:

>
> How do I mark a slice active/inactive with gpart?

gpart set -a active -i

FYI,

-- Marcel Moolenaar
xcllnt@mac.com

From gfritz at gmail.com Mon Dec 15 19:09:24 2008
From: gfritz at gmail.com (Geoff Fritz)
Date: Mon Dec 15 19:09:32 2008
Subject: GEOM_JOURNAL: Flush cache of concat/foo: error=19
Message-ID: <20081216024921.GC55072@dev.null>

I've been experimenting with a bunch of GEOM services lately, and I've got
a really ugly patchwork of disk space that might be better off done with ZFS.
Without going into insane detail, here's how the layers stack up:

drives --> gmirror --> gconcat --> geli -> gjournal

(The goal was to utilize a bunch of misc drives, have all of the used space
have redundancy, and only need to provide a single password when the system
boots.)
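
[Editor's sketch, not part of the original post.] The number in the subject line, "error=19", can be looked up as an errno value. The short Python snippet below does that lookup on whatever system it runs on. It assumes that the value printed by the kernel message is a plain errno code, which the post itself does not confirm; the variable names are made up for illustration.

```python
import errno
import os

# Assumption: "error=19" in the kernel message is a standard errno value.
code = 19

# errno.errorcode maps numeric codes to their symbolic names on this platform.
name = errno.errorcode.get(code, "unknown")

# os.strerror returns the local C library's message text for the code;
# the exact wording differs between operating systems.
message = os.strerror(code)

print(f"error={code} -> {name}: {message}")
```

On the common Unix-like systems 19 is ENODEV, so the output at least narrows the search to code paths that return that value.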

So I've got this /dev/concat/foo.eli.journal device mounted as my root
partition (I boot via USB stick, to test full-disk crypto), with options
"rw,noatime,async".

I'm seeing the error (mentioned in the subject) when I beat the tar out of
the filesystem, followed by a healthy stack of "last message repeated...".
I'm currently compiling openoffice.org, which is when I first noticed the
errors.

I've been unable to locate the meaning of error "19" from searching the
archives or the sources (a few too many levels of function calls for me to
sort through).

What does the error mean, is it serious, and how do I prevent it?

Thanks for any pointers.

-- Geoff

From gfritz at gmail.com Mon Dec 15 20:06:37 2008
From: gfritz at gmail.com (Geoff Fritz)
Date: Mon Dec 15 20:06:43 2008
Subject: GCACHE -- what's it for?
Message-ID: <20081216040633.GA17495@dev.null>

I noticed the presence of the geom_cache module recently. Very little is
available in the archives on what it's used for. Found a post by pjd@ with a
link to a tarball containing a man page:

"The gcache utility is used for setting up a clean cache in front of the
IDE controller on one disk."

(btw, where do I post to get this man page included in the release?
freebsd-doc? I'm running 7.1-RC1 and the man page is absent.)

I set up a test device with it, and noticed that the disk did a lot of
thrashing when it was being written to, more so than normal.

Is the purpose of this module to ensure that when the gcache provider says
the write was made it was in fact 100% written to the physical disk (or at
least accepted by the next layer down)?

Thanks for the info.

-- Geoff

From avg at icyb.net.ua Tue Dec 16 03:41:13 2008
From: avg at icyb.net.ua (Andriy Gapon)
Date: Tue Dec 16 03:41:20 2008
Subject: partition covering the whole slice [repost]
In-Reply-To:
References: <4939287C.3020208@icyb.net.ua> <494637A3.9080807@icyb.net.ua>
Message-ID: <494793D2.5070107@icyb.net.ua>

on 15/12/2008 14:35 Vadim Goncharov said the following:
> Hi Andriy Gapon!
>
> On Mon, 15 Dec 2008 12:55:31 +0200; Andriy Gapon wrote about 'Re: partition covering the whole slice [repost]':
>
>>> Yes, of course. You should not intermix using glabel(8) utilizing /dev/ufs
>>> (via tunefs) and bsdlabel partition starting from offset 0. This is because
>>> glabel can't distinguish is that slice or partition - with offset 0 superblock
>>> will be at the same position.
>
> Also, that's why bsdlabel(8) creates partition 'a' with offset 16 by default.

But it seems that sysinstall doesn't, unfortunately.

>>> You can try to erase bsdlabel completely (if this is not your boot partition)
>>> from the slice and use filesystem directly from the slice. This will not affect
>>> mount as you're already using labels.
>>>
>>> The other way will require shrinking-then-moving partition on the disk and
>>> editing disklabel, better done with newfs(8).
>> thanks a lot for the explanation and the advice.
>> I used gpart destroy to remove bsdlabel and now I have filesystems
>> covering the whole slices. This works well.
>
> Also, check that your fsck doesn't complain about that setup (can't
> automatically determine how to check it).

Yes, this happens indeed - whether I specify /dev/da0s1 or /dev/ufs/