From xcllnt at mac.com Wed Jul 1 03:43:28 2009
From: xcllnt at mac.com (Marcel Moolenaar)
Date: Wed Jul 1 03:43:35 2009
Subject: gmirror gm0 destroyed on shutdown; GPT corrupt
In-Reply-To: <20090630222540.GA34541@keira.kiwi-computer.com>
References: <20090625110253.GA31443@mech-cluster238.men.bris.ac.uk>
 <10FCC74D-6D46-4112-AD89-BBB4C5933957@mac.com>
 <2FFFB36F-EFA3-4D92-98A3-692BA2D6F63E@mac.com>
 <20090629210003.GA24038@keira.kiwi-computer.com>
 <704EE47D-F0C4-4C63-AA3C-3ADF92CC8379@mac.com>
 <20090630215345.GC33849@keira.kiwi-computer.com>
 <9bbcef730906301508l6f2ae344tff8f7495e870049e@mail.gmail.com>
 <20090630222540.GA34541@keira.kiwi-computer.com>
Message-ID: <06F4B172-3A59-49EA-A271-CCFC74B2B52A@mac.com>

On Jun 30, 2009, at 3:25 PM, Rick C. Petty wrote:
>
> According to wikipedia, the GPT header contains:
>   - (offset 40) First usable LBA for partitions
>   - (offset 48) Last usable LBA

These do not represent the media size. They relate to
the region of the disk that can be assigned to partitions.

-- 
Marcel Moolenaar
xcllnt@mac.com

From pjd at FreeBSD.org Wed Jul 1 13:53:37 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Wed Jul 1 13:53:43 2009
Subject: gmirror gm0 destroyed on shutdown; GPT corrupt
In-Reply-To: <704EE47D-F0C4-4C63-AA3C-3ADF92CC8379@mac.com>
References: <20090625110253.GA31443@mech-cluster238.men.bris.ac.uk>
 <10FCC74D-6D46-4112-AD89-BBB4C5933957@mac.com>
 <2FFFB36F-EFA3-4D92-98A3-692BA2D6F63E@mac.com>
 <20090629210003.GA24038@keira.kiwi-computer.com>
 <704EE47D-F0C4-4C63-AA3C-3ADF92CC8379@mac.com>
Message-ID: <20090701135338.GE4372@garage.freebsd.pl>

On Tue, Jun 30, 2009 at 02:37:55PM -0700, Marcel Moolenaar wrote:
>
> On Jun 29, 2009, at 2:00 PM, Rick C. Petty wrote:
>
> >[[ Removing the double cross-post, since this is GEOM-specific ]]
> >
> >On Sat, Jun 27, 2009 at 06:20:49PM -0700, Marcel Moolenaar wrote:
> >>
> >>Using the last sector is not only flawed because it creates a race
> >>condition,
> >
> >It shouldn't create a race condition.
>
> It does.
>
> Answer the following:
>
> foo0 is a provider with 3 sectors.
> bar is a geom class that puts meta-data in the first sector.
> baz is a geom class that puts meta-data in the last sector.
>
> Both bar and baz get to taste foo0. Which one should go first?

Marcel, I don't think you expect that the entire world will agree on one
place where metadata should be stored?

A provider can contain the metadata of several independent GEOM classes,
and it is each class's responsibility to detect its providers correctly.
Even for my classes, where I store the provider size in the metadata,
there are configurations I can't cope with cleanly, like the 'c'
partition. The workaround I implemented is to store the provider name in
the metadata, but of course that is problematic if your disk name
changes.

All in all there is nothing wrong with gmirror. In your example you want
all metadata formats to be the exact same size and stored in the exact
same place...

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20090701/38fbc758/attachment.pgp

From rick-freebsd2008 at kiwi-computer.com Wed Jul 1 14:38:34 2009
From: rick-freebsd2008 at kiwi-computer.com (Rick C.
Petty)
Date: Wed Jul 1 14:38:41 2009
Subject: gmirror gm0 destroyed on shutdown; GPT corrupt
In-Reply-To: <06F4B172-3A59-49EA-A271-CCFC74B2B52A@mac.com>
References: <20090625110253.GA31443@mech-cluster238.men.bris.ac.uk>
 <10FCC74D-6D46-4112-AD89-BBB4C5933957@mac.com>
 <2FFFB36F-EFA3-4D92-98A3-692BA2D6F63E@mac.com>
 <20090629210003.GA24038@keira.kiwi-computer.com>
 <704EE47D-F0C4-4C63-AA3C-3ADF92CC8379@mac.com>
 <20090630215345.GC33849@keira.kiwi-computer.com>
 <9bbcef730906301508l6f2ae344tff8f7495e870049e@mail.gmail.com>
 <20090630222540.GA34541@keira.kiwi-computer.com>
 <06F4B172-3A59-49EA-A271-CCFC74B2B52A@mac.com>
Message-ID: <20090701143832.GA41858@keira.kiwi-computer.com>

On Tue, Jun 30, 2009 at 08:42:57PM -0700, Marcel Moolenaar wrote:
>
> On Jun 30, 2009, at 3:25 PM, Rick C. Petty wrote:
> >
> >According to wikipedia, the GPT header contains:
> >  - (offset 40) First usable LBA for partitions
> >  - (offset 48) Last usable LBA
>
> These do not represent the media size. They relate to
> the region of the disk that can be assigned to partitions.

According to wikipedia:

"The values for current and backup LBAs of the primary header should be the
second sector of the disk (1) and the last sector of the disk,
respectively."

And:

offset  contents
------  --------
    24  Current LBA (location of this header copy)
    32  Backup LBA (location of the other header copy)
    40  First usable LBA for partitions (primary partition table last LBA + 1)
    48  Last usable LBA (secondary partition table first LBA - 1)

So that the media is from relative LBA 0 (the protective MBR) to LBA N-1,
the secondary GPT header, which is described in offset 32. Offset 48
should contain LBA N-2. Therefore the media size N is the value of offset
32 minus the value of offset 24, plus 1 (for the MBR).

It seems pretty clear cut to me.

-- Rick C.
Petty

From xcllnt at mac.com Wed Jul 1 15:29:40 2009
From: xcllnt at mac.com (Marcel Moolenaar)
Date: Wed Jul 1 15:29:47 2009
Subject: gmirror gm0 destroyed on shutdown; GPT corrupt
In-Reply-To: <20090701135338.GE4372@garage.freebsd.pl>
References: <20090625110253.GA31443@mech-cluster238.men.bris.ac.uk>
 <10FCC74D-6D46-4112-AD89-BBB4C5933957@mac.com>
 <2FFFB36F-EFA3-4D92-98A3-692BA2D6F63E@mac.com>
 <20090629210003.GA24038@keira.kiwi-computer.com>
 <704EE47D-F0C4-4C63-AA3C-3ADF92CC8379@mac.com>
 <20090701135338.GE4372@garage.freebsd.pl>
Message-ID:

On Jul 1, 2009, at 6:53 AM, Pawel Jakub Dawidek wrote:

>> Answer the following:
>>
>> foo0 is a provider with 3 sectors.
>> bar is a geom class that puts meta-data in the first sector.
>> baz is a geom class that puts meta-data in the last sector.
>>
>> Both bar and baz get to taste foo0. Which one should go first?
>
> Marcel, I don't think you expect that the entire world will agree on one
> place where metadata should be stored?

No, I don't expect it. But we do need to realize that there
is a race and unless we keep track of the ordering (outside
of GEOM), we will always run into some scenarios where the
tasting results in warnings or errors...

-- 
Marcel Moolenaar
xcllnt@mac.com

From pjd at FreeBSD.org Sat Jul 4 09:15:39 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Sat Jul 4 09:16:11 2009
Subject: gmirror gm0 destroyed on shutdown; GPT corrupt
In-Reply-To:
References: <20090625110253.GA31443@mech-cluster238.men.bris.ac.uk>
 <10FCC74D-6D46-4112-AD89-BBB4C5933957@mac.com>
 <2FFFB36F-EFA3-4D92-98A3-692BA2D6F63E@mac.com>
 <20090629210003.GA24038@keira.kiwi-computer.com>
 <704EE47D-F0C4-4C63-AA3C-3ADF92CC8379@mac.com>
 <20090701135338.GE4372@garage.freebsd.pl>
Message-ID: <20090704091538.GA2891@garage.freebsd.pl>

On Wed, Jul 01, 2009 at 08:29:23AM -0700, Marcel Moolenaar wrote:
>
> On Jul 1, 2009, at 6:53 AM, Pawel Jakub Dawidek wrote:
>
> >>Answer the following:
> >>
> >>foo0 is a provider with 3 sectors.
> >>bar is a geom class that puts meta-data in the first sector.
> >>baz is a geom class that puts meta-data in the last sector.
> >>
> >>Both bar and baz get to taste foo0. Which one should go first?
> >
> >Marcel, I don't think you expect that the entire world will agree on one
> >place where metadata should be stored?
>
> No, I don't expect it. But we do need to realize that there
> is a race and unless we keep track of the ordering (outside
> of GEOM), we will always run into some scenarios where the
> tasting results in warnings or errors...

This is not really a race, and the ordering is not important either.

Let's do the following:

	# gmirror create test da0
	# gpt create /dev/mirror/test

Let's assume GPT will be given providers for tasting before MIRROR on
boot:

	da0 arrives
	GEOM: GPT->taste(da0)
	GPT: Report GPT corrupted (da0 is not the size we expect)
	GEOM: MIRROR->taste(da0)
	MIRROR: g_new_providerf(mirror/test)
	GEOM: GPT->taste(mirror/test)
	GPT: GPT is ok, configure partitions, etc.

Now let's reverse the order: MIRROR goes first, then GPT:

	da0 arrives
	GEOM: MIRROR->taste(da0)
	MIRROR: g_new_providerf(mirror/test)
	GEOM: GPT->taste(da0)
	GPT: Report GPT corrupted (da0 is not the size we expect)
	GEOM: GPT->taste(mirror/test)
	GPT: GPT is ok, configure partitions, etc.

The result is the same, because GEOM will still present da0 to GPT for
tasting even if MIRROR decides to use it.

I do agree that it is hard to cope with, especially for metadata formats
that are given and that we cannot extend. The real problem here is that
in some situations (for some metadata formats) a class cannot
auto-discover its providers reliably. GPT is not alone here. There is a
similar issue with UFS labels.

You have a 500GB disk da0, and you also have a 200GB partition da0a
starting at sector 0. You create a UFS file system on da0a:

	# newfs -L foo /dev/da0a

The LABEL class is given disk da0 for tasting. How can it tell whether
the file system was created on da0 or da0a?
What we do now is look inside the UFS metadata and get the file system
size from there. If the file system size is equal to the provider's size,
this is our provider. So in this case the file system size is 200GB and
the da0 size is 500GB, so we skip it.

This is not perfect, because one can create a UFS file system that is
smaller than the provider:

	# newfs -s 419430400 -L foo /dev/da0

We created a 200GB file system on the 500GB da0. Now the LABEL class will
incorrectly skip da0 during tasting, because of the size mismatch.

The problem is similar to GPT: neither can work reliably in
auto-discovery mode.

It is also problematic that a provider can have multiple consumers
attached, but the solution I use in some classes (which is really a side
effect) is to open the provider for writing, exclusively, during tasting.
Even if a MIRROR provider isn't mounted, it keeps its components open for
writing, exclusively, all the time (the main reason was to allow
synchronization). Once MIRROR opens a provider for writing, every
consumer attached to this provider receives a spoiled event (at least
those that depend on metadata).

Going back to our example: even if GPT configures partitions on da0, it
should remove them on the spoiled event once MIRROR opens this provider
for writing. In the end GPT will configure partitions on mirror/test.
This is of course not perfect, but it reduces the mess in /dev/ a bit.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20090704/753b39e1/attachment.pgp

From marius at nuenneri.ch Sun Jul 5 18:10:05 2009
From: marius at nuenneri.ch (Marius Nünnerich)
Date: Sun Jul 5 18:10:12 2009
Subject: bin/128398: [patch] glabel(8): teach geom_label to recognise gpt
 labels and uuids
Message-ID: <200907051810.n65IA4lD064091@freefall.freebsd.org>

The following reply was made to PR bin/128398; it has been noted by GNATS.

From: Marius Nünnerich
To: bug-followup@FreeBSD.org
Cc:
Subject: Re: bin/128398: [patch] glabel(8): teach geom_label to recognise gpt
 labels and uuids
Date: Sun, 5 Jul 2009 19:37:32 +0200

 Thanks Ivan, this PR can be closed now.

From bugmaster at FreeBSD.org Mon Jul 6 11:06:59 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Jul 6 11:08:13 2009
Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org
Message-ID: <200907061106.n66B6v6h010768@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.  Description
--------------------------------------------------------------------------------
o kern/135898  geom   [geom] Severe filesystem corruption - large files or l
o kern/135874  geom   [geom] [patch] geom_linux_lvm misses newer fedora defa
o kern/134922  geom   [gmirror] [panic] kernel panic when use fdisk on disk
o kern/134113  geom   [geli] Problem setting secondary GELI key
o kern/134044  geom   [geom] gmirror(8) overwrites fs with stale data from r
o kern/133931  geom   [geli] [request] intentionally wrong password to destr
o bin/132845   geom   [geom] [patch] ggated(8) does not close files opened a
o kern/132273  geom   glabel(8): [patch] failing on journaled partition
o kern/132242  geom   [gmirror] gmirror.ko fails to fully initialize
o kern/131353  geom   [geom] gjournal(8) kernel lock
o kern/131037  geom   [geli] Unable to create disklabel on .eli-Device
p docs/130548  geom   [patch] gjournal(8) man page is missing sysctls
o kern/130528  geom   gjournal fsck during boot
o kern/129674  geom   [geom] gjournal root did not mount on boot
o kern/129645  geom   gjournal(8): GEOM_JOURNAL causes system to fail to boo
o kern/129245  geom   [geom] gcache is more suitable for suffix based provid
o bin/128398   geom   [patch] glabel(8): teach geom_label to recognise gpt l
f kern/128276  geom   [gmirror] machine lock up when gmirror module is used
o kern/126902  geom   [geom] geom_label: kernel panic during install boot
o kern/124973  geom   [gjournal] [patch] boot order affects geom_journal con
o kern/124969  geom   gvinum(8): gvinum raid5 plex does not detect missing s
o kern/124294  geom   [geom] gmirror(8) have inappropriate logic when workin
o kern/124130  geom   [gmirror] [usb] gmirror fails to start usb devices tha
o kern/123962  geom   [panic] [gjournal] gjournal (455Gb data, 8Gb journal),
o kern/123630  geom   [patch] [gmirror] gmirror doesnt allow the original dr
o kern/123122  geom   [geom] GEOM / gjournal kernel lock
o kern/122738  geom   [geom] gmirror list "losts consumers" after gmirror de
f kern/122415  geom   [geom] UFS labels are being constantly created and rem
o kern/122067  geom   [geom] [panic] Geom crashed during boot
o kern/121559  geom   [patch] [geom] geom label class allows to create inacc
o kern/121481  geom   [gmirror] data rot on disk with gmirror
o kern/121364  geom   [gmirror] Removing all providers create a "zombie" mir
o kern/120231  geom   [geom] GEOM_CONCAT error adding second drive
o kern/120091  geom   [geom] [geli] [gjournal] geli does not prompt for pass
o kern/120044  geom   [msdosfs] [geom] incorrect MSDOSFS label fries adminis
o kern/120021  geom   [geom] [panic] net-p2p/qbittorrent crashes system when
o kern/119743  geom   [geom] geom label for cds is keeped after dismount and
p kern/116896  geom   [geom] [patch] Typo in a kassert in GEOM
o kern/115856  geom   [geli] ZFS thought it was degraded when it should have
o kern/115547  geom   [geom] [patch] [request] let GEOM Eli get password fro
o kern/114532  geom   [geom] GEOM_MIRROR shows up in kldstat even if compile
o kern/113957  geom   [gmirror] gmirror is intermittently reporting a degrad
o kern/113885  geom   [gmirror] [patch] improved gmirror balance algorithm
o kern/113837  geom   [geom] unable to access 1024 sector size storage
o kern/113419  geom   [geom] geom fox multipathing not failing back
p bin/110705   geom   gmirror(8) control utility does not exit with correct
o kern/107707  geom   [geom] [patch] [request] add new class geom_xbox360 to
o kern/104389  geom   [geom] [patch] sys/geom/geom_dump.c doesn't encode XML
o kern/98034   geom   [geom] dereference of NULL pointer in acd_geom_detach
o kern/94632   geom   [geom] Kernel output resets input while GELI asks for
o kern/90582   geom   [geom] [panic] Restore cause panic string (ffs_blkfree
o bin/90093    geom   fdisk(8) incapable of altering in-core geometry
a kern/89660   geom   [vinum] [patch] [panic] due to g_malloc returning null
o kern/89546   geom   [geom] GEOM error
s kern/89102   geom   [geom] [panic] panic when forced unmount FS from unplu
o kern/88601   geom   [geli] geli cause kernel panic under heavy disk usage
o kern/87544   geom   [gbde] mmaping large files on a gbde filesystem deadlo
o kern/84556   geom   [geom] [panic] GBDE-encrypted swap causes panic at shu
o bin/81779    geom   misleading error messages in geom(8) utilities.
o kern/79251   geom   [2TB] newfs fails on 2.6TB gbde device
o kern/79035   geom   [vinum] gvinum unable to create a striped set of mirro
o bin/78131    geom   gbde(8) "destroy" not working.
s kern/73177   geom   kldload geom_* causes panic due to memory exhaustion

63 problems total.

From freeaz at gmail.com Mon Jul 6 11:32:47 2009
From: freeaz at gmail.com (aZ)
Date: Mon Jul 6 11:32:54 2009
Subject: problems using geli on top of gmirror
Message-ID: <20090706132036.7ccfc623@az>

Hello,

I have the same error. :(

Bye.

From ivoras at FreeBSD.org Tue Jul 7 11:02:24 2009
From: ivoras at FreeBSD.org (ivoras@FreeBSD.org)
Date: Tue Jul 7 11:02:29 2009
Subject: bin/128398: [patch] glabel(8): teach geom_label to recognise gpt
 labels and uuids
Message-ID: <200907071102.n67B2N53005913@freefall.freebsd.org>

Synopsis: [patch] glabel(8): teach geom_label to recognise gpt labels and uuids

State-Changed-From-To: open->closed
State-Changed-By: ivoras
State-Changed-When: Tue Jul 7 11:01:36 UTC 2009
State-Changed-Why:
Patch applied.

http://www.freebsd.org/cgi/query-pr.cgi?pr=128398

From ivoras at FreeBSD.org Tue Jul 7 11:07:08 2009
From: ivoras at FreeBSD.org (ivoras@FreeBSD.org)
Date: Tue Jul 7 11:07:14 2009
Subject: kern/121481: [gmirror] data rot on disk with gmirror
Message-ID: <200907071107.n67B77p2006025@freefall.freebsd.org>

Synopsis: [gmirror] data rot on disk with gmirror

State-Changed-From-To: open->closed
State-Changed-By: ivoras
State-Changed-When: Tue Jul 7 11:05:46 UTC 2009
State-Changed-Why:
Mostly irrelevant, RAID1 does not provide checksumming / data consistency
checks that would catch bit-rot errors. (See ZFS for alternatives).

http://www.freebsd.org/cgi/query-pr.cgi?pr=121481

From ivoras at FreeBSD.org Tue Jul 7 11:12:49 2009
From: ivoras at FreeBSD.org (ivoras@FreeBSD.org)
Date: Tue Jul 7 11:12:57 2009
Subject: kern/116896: [geom] [patch] Typo in a kassert in GEOM
Message-ID: <200907071112.n67BCmeX013004@freefall.freebsd.org>

Synopsis: [geom] [patch] Typo in a kassert in GEOM

State-Changed-From-To: patched->closed
State-Changed-By: ivoras
State-Changed-When: Tue Jul 7 11:12:29 UTC 2009
State-Changed-Why:
Patched and MFC-ed

http://www.freebsd.org/cgi/query-pr.cgi?pr=116896

From ivoras at FreeBSD.org Tue Jul 7 11:14:08 2009
From: ivoras at FreeBSD.org (ivoras@FreeBSD.org)
Date: Tue Jul 7 11:14:14 2009
Subject: bin/81779: misleading error messages in geom(8) utilities.
Message-ID: <200907071114.n67BE7F4013857@freefall.freebsd.org>

Synopsis: misleading error messages in geom(8) utilities.

State-Changed-From-To: open->closed
State-Changed-By: ivoras
State-Changed-When: Tue Jul 7 11:13:54 UTC 2009
State-Changed-Why:
Patched some time ago.

http://www.freebsd.org/cgi/query-pr.cgi?pr=81779

From dan.naumov at gmail.com Tue Jul 7 23:20:09 2009
From: dan.naumov at gmail.com (Dan Naumov)
Date: Tue Jul 7 23:20:21 2009
Subject: glabel metadata protection (WAS: ZFS: drive replacement performance)
Message-ID:

>> Not to derail this discussion, but can anyone explain if the actual
>> glabel metadata is protected in any way? If I use glabel to label a
>> disk and then create a pool using /dev/label/disklabel, won't ZFS
>> eventually overwrite the glabel metadata in the last sector since the
>> disk in its entirety is given to the pool? Or is every filesystem
>> used by FreeBSD (ufs, zfs, etc) hardcoded to ignore the last few
>> sectors of any disk and/or partition and not write data to it to avoid
>> such issues?
>
> Disks labeled with glabel lose their last sector to the label. It is not
> accessible by ZFS. Disks with bsdlabel partition tables are at risk due to
> the brain dead decision to allow partitions to overlap the first sector,
> but modern designs like glabel avoid this mistake.
>
> -- Brooks

So what happens if I was to do the following (for the sake of example):

gpart create -s GPT /dev/ad1
glabel label -v disk01 /dev/ad1
gpart add -b 1 -s -t freebsd-zfs /dev/ad1

Does "gpart add" automatically somehow recognize that the last sector
of contains the glabel and automatically re-adjusts this
command to make the freebsd-zfs partition take "entiredisk minus last
sector" ? I can understand the logic of metadata being protected if I
do a: "gpart add -b 1 -s -t freebsd-zfs
/dev/label/disk01" since gpart will have to go through the actual
label first, but what actually happens if I issue a gpart directly to
the /dev/device?

- Sincerely,
Dan Naumov

From bp at barryp.org Wed Jul 8 01:07:17 2009
From: bp at barryp.org (Barry Pederson)
Date: Wed Jul 8 01:07:24 2009
Subject: glabel metadata protection (WAS: ZFS: drive replacement performance)
In-Reply-To:
References:
Message-ID: <4A53ED2D.4070309@barryp.org>

Dan Naumov wrote:

>>> If I use glabel to label a
>>> disk and then create a pool using /dev/label/disklabel, won't ZFS
>>> eventually overwrite the glabel metadata in the last sector since the
>>> disk in its entirety is given to the pool?

I would say in this case you're *not* giving the entire disk to the
pool, you're giving ZFS a geom that's one sector smaller than the disk.
ZFS never sees or can touch the glabel metadata.

> So what happens if I was to do the following (for the sake of example):
>
> gpart create -s GPT /dev/ad1
> glabel label -v disk01 /dev/ad1
> gpart add -b 1 -s -t freebsd-zfs /dev/ad1
>
> Does "gpart add" automatically somehow recognize that the last sector
> of contains the glabel and automatically re-adjusts this
> command to make the freebsd-zfs partition take "entiredisk minus last
> sector" ? I can understand the logic of metadata being protected if I
> do a: "gpart add -b 1 -s -t freebsd-zfs
> /dev/label/disk01" since gpart will have to go through the actual
> label first, but what actually happens if I issue a gpart directly to
> the /dev/device?

I'd guess bad stuff would happen here, with a conflict between what gpt
and glabel would want to do with the end of the disk.

If you wanted to use glabel with a GPT partition, I'd think you'd want to

    gpart create -s GPT /dev/ad1

    (use "gpart show" to see what space is now available for
     GPT partitions, it won't start at 1 and won't go to the very
     end of the disk)

    gpart add -b 34 -s -t freebsd-zfs /dev/ad1
    glabel label -v disk01 /dev/ad1p1

    (and then use label/disk01 in a zpool)

	Barry

From petefrench at ticketswitch.com Wed Jul 8 08:38:05 2009
From: petefrench at ticketswitch.com (Pete French)
Date: Wed Jul 8 08:38:18 2009
Subject: glabel metadata protection (WAS: ZFS: drive replacement performance)
In-Reply-To: <4A53ED2D.4070309@barryp.org>
Message-ID:

> I would say in this case you're *not* giving the entire disk to the
> pool, you're giving ZFS a geom that's one sector smaller than the disk.
> ZFS never sees or can touch the glabel metadata.

Is ZFS happy if the size of its disc changes underneath it ? I have
expanded a zpool a couple of times simply by changing the size of the
partition and rebooting the machine - it comes up with the new amount of
free space fine. Never tried it the other way though.

The reason I mention it is that someone suggested glabeling a drive in an
existing pool and using replace to swap it over. Which should be good I
guess unless the last sector was in use. ZFS spreads stuff all over the
disc as I understand it though, so that might not be a good assumption,
even on a fairly empty filesystem.

-pete.
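
The end-of-disk conflict discussed in this thread (glabel keeps its
metadata in the last sector of whatever provider it labels, and GPT keeps
its backup header in the last sector too) can be worked through with plain
sector arithmetic. What follows is only a rough sketch, assuming 512-byte
sectors and the default GPT table of 128 entries of 128 bytes each, which
is where the 34 in "gpart add -b 34" comes from. The function names are
made up for illustration, and nothing here touches a real disk or
reproduces what gpart or glabel actually execute.

```python
# Assumed geometry: 512-byte sectors and the default GPT table size.
SECTOR_SIZE = 512
GPT_ENTRIES = 128
GPT_ENTRY_SIZE = 128


def gpt_layout(total_sectors):
    """LBAs used by a GPT written to a provider of total_sectors sectors."""
    table_sectors = (GPT_ENTRIES * GPT_ENTRY_SIZE + SECTOR_SIZE - 1) // SECTOR_SIZE
    return {
        "primary_header": 1,                       # LBA 0 is the protective MBR
        "first_usable": 2 + table_sectors,         # after header and primary table
        "last_usable": total_sectors - table_sectors - 2,
        "backup_header": total_sectors - 1,        # very last sector
    }


def glabel_layout(total_sectors):
    """Sector used by glabel metadata and the size of the label provider."""
    return {
        "metadata_sector": total_sectors - 1,      # last sector of the provider
        "provider_sectors": total_sectors - 1,     # what /dev/label/NAME exposes
    }


def collide_on_raw_disk(total_sectors):
    """True if GPT and glabel, both applied to the raw disk, want the same sector."""
    gpt = gpt_layout(total_sectors)
    label = glabel_layout(total_sectors)
    return gpt["backup_header"] == label["metadata_sector"]


def gpt_inside_label(total_sectors):
    """GPT layout when the GPT is created on the label provider instead."""
    return gpt_layout(glabel_layout(total_sectors)["provider_sectors"])


if __name__ == "__main__":
    n = 625142448  # example size in sectors, chosen only for illustration
    print("GPT on raw disk:      ", gpt_layout(n))
    print("glabel on raw disk:   ", glabel_layout(n))
    print("same last sector?     ", collide_on_raw_disk(n))
    print("GPT on label provider:", gpt_inside_label(n))
```

Under these assumptions, labeling the raw disk and partitioning the raw
disk both claim sector N-1, whereas labeling a partition (ad1p1) or
partitioning the label provider keeps the two pieces of metadata in
different sectors.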

From trasz at FreeBSD.org Wed Jul 8 12:28:25 2009
From: trasz at FreeBSD.org (trasz@FreeBSD.org)
Date: Wed Jul 8 12:28:32 2009
Subject: kern/89102: [geom] [panic] panic when forced unmount FS from
 unplugged device
Message-ID: <200907081228.n68CSOnI020663@freefall.freebsd.org>

Synopsis: [geom] [panic] panic when forced unmount FS from unplugged device

Responsible-Changed-From-To: freebsd-geom->trasz
Responsible-Changed-By: trasz
Responsible-Changed-When: Wed Jul 8 12:28:24 UTC 2009
Responsible-Changed-Why:
I'll take it.

http://www.freebsd.org/cgi/query-pr.cgi?pr=89102

From linimon at FreeBSD.org Wed Jul 8 18:58:14 2009
From: linimon at FreeBSD.org (linimon@FreeBSD.org)
Date: Wed Jul 8 18:58:20 2009
Subject: kern/136467: [geom] glabel(8) destroys access to GEOM tree if volume
 label contains non ASCII characters
Message-ID: <200907081858.n68IwD8R091108@freefall.freebsd.org>

Old Synopsis: glabel destroys access to GEOM tree if volume label contains non ASCII characters
New Synopsis: [geom] glabel(8) destroys access to GEOM tree if volume label contains non ASCII characters

Responsible-Changed-From-To: freebsd-bugs->freebsd-geom
Responsible-Changed-By: linimon
Responsible-Changed-When: Wed Jul 8 18:57:58 UTC 2009
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=136467

From ivoras at freebsd.org Thu Jul 9 11:53:09 2009
From: ivoras at freebsd.org (Ivan Voras)
Date: Thu Jul 9 11:53:16 2009
Subject: glabel and real disk IDs
Message-ID:

Hi,

I've been working with glabels for a time and just remembered that ATA
(ad) drives do in fact export the drive ID, for example:

# diskinfo -v ad4
ad4
        512             # sectorsize
        320072933376    # mediasize in bytes (298G)
        625142448       # mediasize in sectors
        620181          # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        ad:9QF4H15Y     # Disk ident.

# diskinfo -v ad6
ad6
        512             # sectorsize
        320072933376    # mediasize in bytes (298G)
        625142448       # mediasize in sectors
        620181          # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        ad:9QF4EP7A     # Disk ident.

# diskinfo -v ad8
ad8
        512             # sectorsize
        320072933376    # mediasize in bytes (298G)
        625142448       # mediasize in sectors
        620181          # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        ad:9QF4H16L     # Disk ident.

I don't think it would be hard to add a label parser to gather this
information and export it as a label.

The purpose of this would be to have a unique disk ID without explicitly
setting a label (e.g. as is commonly advised for ZFS and drive swapping).

Any objections?

From marius at nuenneri.ch Thu Jul 9 13:33:59 2009
From: marius at nuenneri.ch (Marius Nünnerich)
Date: Thu Jul 9 13:34:05 2009
Subject: glabel and real disk IDs
In-Reply-To:
References:
Message-ID:

On Thu, Jul 9, 2009 at 13:52, Ivan Voras wrote:
> Hi,
>
> I've been working with glabels for a time and just remembered that ATA (ad)
> drives do in fact export the drive ID, for example:
>
> # diskinfo -v ad4
> ad4
>         512             # sectorsize
>         320072933376    # mediasize in bytes (298G)
>         625142448       # mediasize in sectors
>         620181          # Cylinders according to firmware.
>         16              # Heads according to firmware.
>         63              # Sectors according to firmware.
>         ad:9QF4H15Y     # Disk ident.
>
> # diskinfo -v ad6
> ad6
>         512             # sectorsize
>         320072933376    # mediasize in bytes (298G)
>         625142448       # mediasize in sectors
>         620181          # Cylinders according to firmware.
>         16              # Heads according to firmware.
>         63              # Sectors according to firmware.
>         ad:9QF4EP7A     # Disk ident.
>
> # diskinfo -v ad8
> ad8
>         512             # sectorsize
>         320072933376    # mediasize in bytes (298G)
>         625142448       # mediasize in sectors
>         620181          # Cylinders according to firmware.
>         16              # Heads according to firmware.
>         63              # Sectors according to firmware.
>         ad:9QF4H16L     # Disk ident.
>
> I don't think it would be hard to add a label parser to gather this
> information and export it as a label.
>
> The purpose of this would be to have a unique disk ID without explicitly
> setting a label (e.g. as is commonly advised for ZFS and drive swapping).
>
> Any objections?

I like the idea. Whether one should use it, and in which situations, is a
different matter.

From avg at icyb.net.ua Thu Jul 9 14:01:50 2009
From: avg at icyb.net.ua (Andriy Gapon)
Date: Thu Jul 9 14:01:58 2009
Subject: glabel and real disk IDs
In-Reply-To:
References:
Message-ID: <4A55F5E5.9080703@icyb.net.ua>

on 09/07/2009 14:52 Ivan Voras said the following:
> Hi,
>
> I've been working with glabels for a time and just remembered that ATA
> (ad) drives do in fact export the drive ID, for example:
[snip]
> I don't think it would be hard to add a label parser to gather this
> information and export it as a label.
>
> The purpose of this would be to have a unique disk ID without explicitly
> setting a label (e.g. as is commonly advised for ZFS and drive swapping).
>
> Any objections?

This is a very good idea in my opinion.

-- 
Andriy Gapon

From ivoras at freebsd.org Thu Jul 9 19:40:03 2009
From: ivoras at freebsd.org (Ivan Voras)
Date: Thu Jul 9 19:40:09 2009
Subject: kern/136467: [geom] glabel(8) destroys access to GEOM tree if volume
 label contains non ASCII characters
Message-ID: <200907091940.n69Je2aG046088@freefall.freebsd.org>

The following reply was made to PR kern/136467; it has been noted by GNATS.

From: Ivan Voras
To: bug-followup@freebsd.org, IZ-FreeBSD0902@hs-karlsruhe.de
Cc:
Subject: Re: kern/136467: [geom] glabel(8) destroys access to GEOM tree if
 volume label contains non ASCII characters
Date: Thu, 09 Jul 2009 13:48:35 +0200

 Would you be happy with a sysctl knob that controls whether to interpret
 all labels as UTF-8 or to strip out all 8-bit characters, which would by
 default be set to "interpret as UTF-8"?
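
The two behaviours offered in the PR reply above (interpret every label
as UTF-8, or strip all 8-bit characters) can be sketched as a single
function with a boolean switch. This is a user-space illustration only,
not kernel code and not the change that was eventually committed. The
parameter name is hypothetical, and falling back to stripping when the
bytes are not valid UTF-8 is an assumption added here; the message does
not say what should happen in that case.

```python
def sanitize_label(raw, interpret_as_utf8=True):
    """Turn raw on-disk label bytes into a name that is safe to export.

    interpret_as_utf8 plays the role of the proposed knob (hypothetical
    name). When it is False, or when the bytes are not valid UTF-8
    (an assumed fallback), every byte with the high bit set is dropped.
    """
    if interpret_as_utf8:
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            pass  # assumed fallback: treat the label as opaque 8-bit data
    # Keep only 7-bit characters.
    return bytes(b for b in raw if b < 0x80).decode("ascii")


if __name__ == "__main__":
    utf8_label = "M\u00fcll".encode("utf-8")   # valid UTF-8, two-byte u-umlaut
    latin1_label = b"M\xfcll"                  # ISO-8859-1, not valid UTF-8
    print(sanitize_label(utf8_label))
    print(sanitize_label(utf8_label, interpret_as_utf8=False))
    print(sanitize_label(latin1_label))
```

A label written in a legacy 8-bit encoding is the interesting case: with
the knob left at its proposed default it cannot be decoded, so some
fallback (here, stripping) is needed either way.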

From pjd at FreeBSD.org Thu Jul 9 20:00:58 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Thu Jul 9 20:01:30 2009
Subject: glabel and real disk IDs
In-Reply-To:
References:
Message-ID: <20090709200102.GA2438@garage.freebsd.pl>

On Thu, Jul 09, 2009 at 01:52:53PM +0200, Ivan Voras wrote:
> Hi,
>
> I've been working with glabels for a time [...]

You should stop, really. We have enough labels in /dev/ as it is.

> [...] and just remembered that ATA
> (ad) drives do in fact export the drive ID, for example:
>
> # diskinfo -v ad4
> ad4
>         512             # sectorsize
>         320072933376    # mediasize in bytes (298G)
>         625142448       # mediasize in sectors
>         620181          # Cylinders according to firmware.
>         16              # Heads according to firmware.
>         63              # Sectors according to firmware.
>         ad:9QF4H15Y     # Disk ident.
>
> # diskinfo -v ad6
> ad6
>         512             # sectorsize
>         320072933376    # mediasize in bytes (298G)
>         625142448       # mediasize in sectors
>         620181          # Cylinders according to firmware.
>         16              # Heads according to firmware.
>         63              # Sectors according to firmware.
>         ad:9QF4EP7A     # Disk ident.
>
> # diskinfo -v ad8
> ad8
>         512             # sectorsize
>         320072933376    # mediasize in bytes (298G)
>         625142448       # mediasize in sectors
>         620181          # Cylinders according to firmware.
>         16              # Heads according to firmware.
>         63              # Sectors according to firmware.
>         ad:9QF4H16L     # Disk ident.
>
> I don't think it would be hard to add a label parser to gather this
> information and export it as a label.

It was proposed in the past and the consensus was not to do it.
One of the reasons was pollution of /dev/, another one was that the way
of getting SCSI disk IDs was not perfect.

> The purpose of this would be to have a unique disk ID without explicitly
> setting a label (e.g. as is commonly advised for ZFS and drive swapping).

I guess you advise that? There is no such need when it comes to ZFS. ZFS
can find its components just fine without using their names.
Disk IDs were added for ZFS in the past, but now they serve no purpose.
I'd prefer to remove them altogether or just leave them for informational
purposes as they exist now.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20090709/37d25f0f/attachment.pgp

From xcllnt at mac.com Thu Jul 9 21:33:01 2009
From: xcllnt at mac.com (Marcel Moolenaar)
Date: Thu Jul 9 21:33:08 2009
Subject: glabel and real disk IDs
In-Reply-To: <20090709200102.GA2438@garage.freebsd.pl>
References: <20090709200102.GA2438@garage.freebsd.pl>
Message-ID:

On Jul 9, 2009, at 1:01 PM, Pawel Jakub Dawidek wrote:
>
>> The purpose of this would be to have a unique disk ID without explicitly
>> setting a label (e.g. as is commonly advised for ZFS and drive swapping).
>
> I guess you advise that? There is no such need when it comes to ZFS. ZFS
> can find its components just fine without using their names. Disk IDs
> were added for ZFS in the past, but now they serve no purpose. I'd
> prefer to remove them altogether or just leave them for informational
> purposes as they exist now.

Just as an FYI: I see ZFS getting confused when disks are shuffled
around. The confusion is the result of having device paths stored
in the ZFS label match the device name of some other vdev that is part
of the same pool. Replacing a device with itself doesn't help,
because ZFS complains that the vdev is part of an active pool in
that case. It seems that only labels will work here...

-- 
Marcel Moolenaar
xcllnt@mac.com

From ivoras at freebsd.org Thu Jul 9 21:34:06 2009
From: ivoras at freebsd.org (Ivan Voras)
Date: Thu Jul 9 21:35:04 2009
Subject: glabel and real disk IDs
In-Reply-To: <20090709200102.GA2438@garage.freebsd.pl>
References: <20090709200102.GA2438@garage.freebsd.pl>
Message-ID: <9bbcef730907091433i6417de15o1462750b90fe54a@mail.gmail.com>

I really am not doing the things I do to agitate you personally. We
can debate on technical grounds and I will back down if sufficient
technical or logical reasons are given. I will not reply to any part
of your messages that seems too emotional.

2009/7/9 Pawel Jakub Dawidek:

> One of the reasons was pollution of /dev/,

The pollution of the /dev namespace could have been lessened by using
a different policy in naming devices, as was suggested before (by me
and others). The situation now is that we passed the point of no
return when glabel as-is arrived in the GENERIC kernel. Putting a
freeze on adding new label parsers to glabel will not change anything
for the better and will not fix existing problems.

> another one was that the way
> of getting SCSI disk IDs was not perfect.

On the other hand, ATA IDs seem ok. If there are problems with them that
I don't see, I'd like to find out about them, better sooner than
later.

>> The purpose of this would be to have a unique disk ID without explicitly
>> setting a label (e.g. as is commonly advised for ZFS and drive swapping).
>
> I guess you advise that? There is no such need when it comes to ZFS. ZFS

I am not involved in ZFS development enough to advise for or against
anything, except that I notice that there apparently is a problem
somewhere in that area and that using glabel to fixate disk names is
common advice given to those who encounter it. The most recent
thread (and the direct cause of my post) is
http://permalink.gmane.org/gmane.os.freebsd.stable/63970 . I remember
there are other similar threads.
> can find his components just fine without using their names. Disk IDs > were added for ZFS in the past, but now they serve no purpose, I'd > prefer to remove them altogether or just leave them for informational > purpose as they exist now. I have seen your commit and was surprised by it. Doesn't it mean that because of it ZFS will not automatically pick up renumbered/renamed drives? Was disk ID usage removed from ZFS because it didn't work? Can you suggest a solution (or a better solution than manually labeling drives) to the "drives renumbered" problem in the above thread? Finally, I will do what I proposed unless a) there is a noticeable community or developer outcry not to do it, for whatever reason, or b) strong technical reasons are presented by anyone that would make the proposal invalid, insecure, problematic to maintain, problematic to use for general users or others. Please also note that glabel is optional and no one is forcing anyone to use it. If you have problems with others' modifications to glabel, I also respectfully propose that you take over maintenance of it. From pjd at FreeBSD.org Thu Jul 9 22:24:14 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Thu Jul 9 22:24:21 2009 Subject: glabel and real disk IDs In-Reply-To: References: <20090709200102.GA2438@garage.freebsd.pl> Message-ID: <20090709222420.GE2438@garage.freebsd.pl> On Thu, Jul 09, 2009 at 02:32:59PM -0700, Marcel Moolenaar wrote: > > On Jul 9, 2009, at 1:01 PM, Pawel Jakub Dawidek wrote: > > > > >>The purpose of this would be to have a unique disk ID without > >>explicitly > >>setting a label (e.g. as is commonly advised for ZFS and drive > >>swapping). > > > >I guess you advice that? There is no such need when it comes to ZFS. > >ZFS > >can find his components just fine without using their names. 
Disk IDs > >were added for ZFS in the past, but now they serve no purpose, I'd > >prefer to remove them altogether or just leave them for informational > >purpose as they exist now. > > Just as an FYI: > > I see ZFS getting confused when disks are shuffled around. > The confusion is the result of having device paths stored > in the ZFS label match the device name of some other vdev > that part of the same pool. > > Replacing a device with itself doesn't help, because ZFS > complains that the vdev is part of an active pool in that > case. It seems that only labels will work here... Solaris is using device names stored in ZFS label and if this is not the drive it was looking for, it is doing ID-to-path translation to find new path name. On FreeBSD on the other hand (after upgrade to v13) I gave up doing similar thing because disk IDs weren't available from all disk device drivers (I implemented it for ATA and I received no help with other drivers). Currently the idea is to just go through all GEOM providers looking for proper ZFS metadata (think of it as manual tasting), so even if device name changes, ZFS should be able to locate it. If there are still problems locating the disk, there simply might be a bug in the code of some sort. -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20090709/4f4b0c4c/attachment.pgp From xcllnt at mac.com Fri Jul 10 16:38:05 2009 From: xcllnt at mac.com (Marcel Moolenaar) Date: Fri Jul 10 16:38:10 2009 Subject: glabel and real disk IDs In-Reply-To: <20090709222420.GE2438@garage.freebsd.pl> References: <20090709200102.GA2438@garage.freebsd.pl> <20090709222420.GE2438@garage.freebsd.pl> Message-ID: <0903FECF-3D0D-430E-9E93-C6DC00CA1BC5@mac.com> On Jul 9, 2009, at 3:24 PM, Pawel Jakub Dawidek wrote: >> I see ZFS getting confused when disks are shuffled around. >> The confusion is the result of having device paths stored >> in the ZFS label match the device name of some other vdev >> that part of the same pool. >> >> Replacing a device with itself doesn't help, because ZFS >> complains that the vdev is part of an active pool in that >> case. It seems that only labels will work here... > > Solaris is using device names stored in ZFS label and if this is not > the > drive it was looking for, it is doing ID-to-path translation to find > new > path name. On FreeBSD on the other hand (after upgrade to v13) I > gave up > doing similar thing because disk IDs weren't available from all disk > device drivers (I implemented it for ATA and I received no help with > other drivers). Currently the idea is to just go through all GEOM > providers looking for proper ZFS metadata (think of it as manual > tasting), so even if device name changes, ZFS should be able to locate > it. If there are still problems locating the disk, there simply > might be > a bug in the code of some sort. Disks are found correctly, it's just that ZFS' internal state is messed up. It uses both the device special file name and the stored vdev path and as such can end up with multiple VDEVs of the same name. As such, some VDEVs are marked as corrupted/faulted. I can reproduce it if you're interested. 
FYI,

--
Marcel Moolenaar xcllnt@mac.com

From jeff+freebsd at wagsky.com Fri Jul 10 19:03:18 2009
From: jeff+freebsd at wagsky.com (Jeff Kletsky)
Date: Fri Jul 10 19:03:25 2009
Subject: 7.x and 8.0 gpt and gpart GPT PMBR prevents Intel boot
In-Reply-To: <4A578E3F.8050305@wagsky.com>
References: <4A578E3F.8050305@wagsky.com>
Message-ID: <4A579076.5070008@wagsky.com>

It appears to me that the PMBR being used both by gpt and gpart is causing boot problems with current Intel hardware. There appear to be two issues that can cause difficulties:

* The "start" of the PMBR is 0xffffff
* The PMBR is not marked "bootable" by gpt or gpart

In particular, it has been confirmed that it is not possible to format a GPT drive that will boot on an Intel D945GCLF2 with current BIOS (and this may be the cause of the observed "freezes" at boot).

Both these issues have been previously documented:

These two PMBR excerpts illustrate the changes required:

This is the "stock" PMBR generated by gpart for a 500 GB SATA drive. It does *not* allow the BIOS to boot. (FreeBSD 7.2-RELEASE-p2)

000001b0: 9090 9090 9090 9090 0000 0000 0000 00ff ................
000001c0: ffff eeff ffff 0100 0000 2e60 383a 0000 ...........`8:..
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001f0: 0000 0000 0000 0000 0000 0000 0000 55aa ..............U.

This is a hand-modified version of the PMBR installed on the same drive that does permit the BIOS to boot.

000001b0: 9090 9090 9090 9090 0000 0000 0000 8001 ................
000001c0: 0100 eeff ffff 0100 0000 2e60 383a 0000 ...........`8:..
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
000001f0: 0000 0000 0000 0000 0000 0000 0000 55aa ..............U.

Note that the "bootable" flag has been set (0x80) and the start of the partition has been set to the beginning of the disk (0x010100).
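To make the two dumps easier to compare, here is a minimal userland sketch that decodes the 16-byte partition entry stored at offset 0x1be. It is only an illustration under the usual MBR entry layout; the names pmbr_entry, le32_read, pmbr_decode and pmbr_is_bootable are made up for this example and are not FreeBSD APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustration-only view of one 16-byte MBR partition entry. */
struct pmbr_entry {
	uint8_t  flag;		/* 0x80 = marked bootable, 0x00 = not */
	uint8_t  start_chs[3];	/* start head, sector, cylinder bytes as stored */
	uint8_t  type;		/* 0xee = GPT protective partition */
	uint8_t  end_chs[3];	/* end head, sector, cylinder bytes as stored */
	uint32_t lba_start;	/* first sector of the partition */
	uint32_t lba_count;	/* partition size in sectors */
};

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t
le32_read(const uint8_t *p)
{
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Decode the 16 raw bytes of a partition entry (e.g. MBR offset 0x1be). */
static struct pmbr_entry
pmbr_decode(const uint8_t *raw)
{
	struct pmbr_entry e;
	size_t i;

	assert(raw != NULL);
	e.flag = raw[0];
	for (i = 0; i < 3; i++) {
		e.start_chs[i] = raw[1 + i];
		e.end_chs[i] = raw[5 + i];
	}
	e.type = raw[4];
	e.lba_start = le32_read(raw + 8);
	e.lba_count = le32_read(raw + 12);
	return (e);
}

/* Non-zero if the entry carries the 0x80 "bootable" flag. */
static int
pmbr_is_bootable(const struct pmbr_entry *e)
{
	return (e->flag == 0x80);
}
```

Fed the bytes from the stock dump, this yields flag 0x00, start CHS bytes ff ff ff, type 0xee, LBA start 1 and a size of 0x3a38602e sectors. The hand-modified dump differs only in the flag (0x80) and the start CHS bytes (01 01 00).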
Without the 0x80 flag also set, the D945GCLF2 BIOS does not identify any bootable drives.

The Intel specification for EFI boot does indicate (pp 374-375) that 0xffffff is to be used when the start of the partition can't be represented using CHS notation. However, this PMBR refers to the entire disk, arguably to prevent legacy systems from overwriting it (Protective Master Boot Record), but apparently also "reputable" BIOS manufacturers are producing current systems that key off the information as well.

The PR indicates patches for the CHS issue for gpt. For gpart, the code is in /usr/src/sys/geom/part/g_part_gpt.c and, I believe, should be modified to read

    le16enc(table->mbr + DOSMAGICOFFSET, DOSMAGIC);
    table->mbr[DOSPARTOFF + 1] = 0x01; /* shd */
    table->mbr[DOSPARTOFF + 2] = 0x01; /* ssect */
    table->mbr[DOSPARTOFF + 3] = 0x00; /* scyl */
    table->mbr[DOSPARTOFF + 4] = 0xee; /* typ */
    table->mbr[DOSPARTOFF + 5] = 0xff; /* ehd */
    table->mbr[DOSPARTOFF + 6] = 0xff; /* esect */
    table->mbr[DOSPARTOFF + 7] = 0xff; /* ecyl */
    le32enc(table->mbr + DOSPARTOFF + 8, 1); /* start */
    le32enc(table->mbr + DOSPARTOFF + 12, MIN(last, 0xffffffffLL));

which will make it consistent with MBR content for the first partition on Windows machines (protecting GPT drives from inadvertent modification on non-GPT-aware platforms), as well as the patch suggested in PR 115406. (Code still unmodified in HEAD v1.16)

As to marking the partition bootable, there is a question of where the most appropriate place might be and if a flag to override should be made available. In my opinion, it should be so marked when a GPT boot segment is added to the drive as well as when either drive-level or partition-level boot code is written, with a command-line option to not change the "bootable" flag in the PMBR for these conditions.

In my opinion, these issues should be considered for inclusion in the 8.0-RELEASE -- at the very least, the 0xffffff issue, as it cannot easily be resolved from the command line.
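For anyone who wants to experiment outside the kernel, the proposed entry can be sketched in userland roughly as follows. This is only an illustration: le32_write and pmbr_fill_entry are hypothetical names, the bootable argument models the optional 0x80 flag discussed above, and none of it is the actual g_part_gpt.c code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in little-endian byte order. */
static void
le32_write(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}

/*
 * Fill one 16-byte protective MBR entry using the byte values from the
 * proposal: start CHS bytes 01 01 00, type 0xee, end CHS bytes ff ff ff,
 * start LBA 1, and a size clamped to 32 bits.
 * "last_lba" is the last sector of the disk; "bootable" selects 0x80.
 */
static void
pmbr_fill_entry(uint8_t *entry, uint64_t last_lba, int bootable)
{
	assert(entry != NULL);
	memset(entry, 0, 16);
	entry[0] = bootable ? 0x80 : 0x00;	/* flag */
	entry[1] = 0x01;			/* shd */
	entry[2] = 0x01;			/* ssect */
	entry[3] = 0x00;			/* scyl */
	entry[4] = 0xee;			/* typ */
	entry[5] = 0xff;			/* ehd */
	entry[6] = 0xff;			/* esect */
	entry[7] = 0xff;			/* ecyl */
	le32_write(entry + 8, 1);		/* start */
	le32_write(entry + 12, last_lba > 0xffffffffULL ?
	    0xffffffffU : (uint32_t)last_lba);	/* size, clamped */
}
```

Called with last_lba = 0x3a38602e and bootable set, this produces the same 16 bytes that sit at offset 0x1be in the hand-modified dump.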
Thanks, Jeff From pjd at FreeBSD.org Fri Jul 10 19:39:02 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Fri Jul 10 19:39:09 2009 Subject: glabel and real disk IDs In-Reply-To: <0903FECF-3D0D-430E-9E93-C6DC00CA1BC5@mac.com> References: <20090709200102.GA2438@garage.freebsd.pl> <20090709222420.GE2438@garage.freebsd.pl> <0903FECF-3D0D-430E-9E93-C6DC00CA1BC5@mac.com> Message-ID: <20090710193905.GA1463@garage.freebsd.pl> On Fri, Jul 10, 2009 at 09:37:30AM -0700, Marcel Moolenaar wrote: > > On Jul 9, 2009, at 3:24 PM, Pawel Jakub Dawidek wrote: > >>I see ZFS getting confused when disks are shuffled around. > >>The confusion is the result of having device paths stored > >>in the ZFS label match the device name of some other vdev > >>that part of the same pool. > >> > >>Replacing a device with itself doesn't help, because ZFS > >>complains that the vdev is part of an active pool in that > >>case. It seems that only labels will work here... > > > >Solaris is using device names stored in ZFS label and if this is not > >the > >drive it was looking for, it is doing ID-to-path translation to find > >new > >path name. On FreeBSD on the other hand (after upgrade to v13) I > >gave up > >doing similar thing because disk IDs weren't available from all disk > >device drivers (I implemented it for ATA and I received no help with > >other drivers). Currently the idea is to just go through all GEOM > >providers looking for proper ZFS metadata (think of it as manual > >tasting), so even if device name changes, ZFS should be able to locate > >it. If there are still problems locating the disk, there simply > >might be > >a bug in the code of some sort. > > Disks are found correctly, it's just that ZFS' internal > state is messed up. It uses both the device special file > name and the stored vdev path and as such can end up with > multiple VDEVs of the same name. As such, some VDEVs are > marked as corrupted/faulted. > > I can reproduce it if you're interested. 
I am, but I'm leaving tomorrow and I'll be out of e-mail probably for the next two weeks. -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20090710/2b7eb428/attachment.pgp From Bernard.Steiner at lahmeyer.de Sun Jul 12 20:30:06 2009 From: Bernard.Steiner at lahmeyer.de (Steiner, Bernard) Date: Sun Jul 12 20:30:16 2009 Subject: kern/121481: [gmirror] data rot on disk with gmirror Message-ID: <200907122030.n6CKU5cP017101@freefall.freebsd.org> The following reply was made to PR kern/121481; it has been noted by GNATS. From: "Steiner, Bernard" To: Cc: Subject: Re: kern/121481: [gmirror] data rot on disk with gmirror Date: Sun, 12 Jul 2009 22:11:36 +0200 This is a multi-part message in MIME format. ------_=_NextPart_001_01CA032D.322E1433 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable That was most unhelpful. The very reason I asked for data consistency checks was because graid3 at least seems to have -w for checking as opposed to gmirror. ZFS might be nice, but (I quote)... WARNING: ZFS is considered to be an experimental feature in FreeBSD. Time for me to move to a serious operating system, I guess. Bernard -- i.A. Dipl.-Inform. Bernard Steiner Netzwerk- und Systemadministrator Phone: +49 6101 55 1280, Fax: +49 6101 55 1623 Lahmeyer International GmbH Friedberger Strasse 173, 61118 Bad Vilbel, Deutschland/Germany Geschaeftsfuehrer/Managing Directors: Dr. 
Henning Nothdurft (Vorsitzender/President), Burkhard Neumann Firmensitz/Registered office: Bad Vilbel Registergericht/Registry court: Frankfurt am Main HRB 80852 Internet: http://www.lahmeyer.de/ Disclaimer: http://www.lahmeyer.de/disclaimer/ ------_=_NextPart_001_01CA032D.322E1433 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Re: kern/121481: [gmirror] data rot on disk with gmirror


------_=_NextPart_001_01CA032D.322E1433-- From dan.naumov at gmail.com Mon Jul 13 08:30:03 2009 From: dan.naumov at gmail.com (Dan Naumov) Date: Mon Jul 13 08:30:09 2009 Subject: kern/121481: [gmirror] data rot on disk with gmirror Message-ID: <200907130830.n6D8U2gf016730@freefall.freebsd.org> The following reply was made to PR kern/121481; it has been noted by GNATS. From: Dan Naumov To: bug-followup@FreeBSD.org, zdbs@lif.de Cc: Subject: Re: kern/121481: [gmirror] data rot on disk with gmirror Date: Mon, 13 Jul 2009 11:23:32 +0300 Bernard, while I understand your frustration, you are barking up the wrong tree. RAID offers protection against very specific kinds of disk failure and does not offer any kind of protection against bit rot. I want to emphasize that this is not a FreeBSD issue, but a RAID issue in general, and you will run into the exact same limitations if you try RAID on Linux or Windows or hardware RAID from any hardware vendor. Another example of a fault that a RAID mirror will NOT protect you against, or even warn you about, is your disk/RAID controller going berserk and writing garbage to the mirror or one of its member disks. If you are happy with just getting a warning when file(s) somewhere are silently getting corrupted, this can easily be implemented with existing tools: there are plenty of checksumming utilities you can use to checksum your datasets, and you could set up a cronjob to have the utility check your files against a known hash database and list all the files (if any) that have changed, mailing you the output. When properly configured, this can also help with intrusion detection, as it can help detect all new or changed files on the system :) However, if you require not only a warning, but also automatic recovery and healing from such corruption, your only option is ZFS, and if you have evaluated the state of ZFS in FreeBSD and concluded that it's not mature enough for your needs, then your only other option is Solaris. 
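To illustrate the checksum comparison mentioned above, here is a rough sketch of its core step: hash the current contents and compare against a previously recorded value. FNV-1a is used purely as a short stand-in and is not cryptographic; for real use an existing tool such as sha256(1) or mtree(8) would be the better choice. The names fnv1a64 and blob_changed are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a over a byte buffer (non-cryptographic stand-in hash). */
static uint64_t
fnv1a64(const unsigned char *buf, size_t len)
{
	uint64_t h = 0xcbf29ce484222325ULL;	/* FNV offset basis */
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 0x100000001b3ULL;		/* FNV prime */
	}
	return (h);
}

/*
 * Return non-zero if the buffer no longer matches the hash that was
 * recorded when the data was known to be good.
 */
static int
blob_changed(uint64_t recorded, const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	return (fnv1a64(buf, len) != recorded);
}
```

A cron job built on this idea would record one hash per file in a manifest, re-hash the files on each run, and mail the list of paths for which the comparison reports a change. Note that this only detects changes; it cannot tell rot from a legitimate edit, and it repairs nothing.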
- Sincerely, Dan Naumov From Bernard.Steiner at lahmeyer.de Mon Jul 13 09:00:07 2009 From: Bernard.Steiner at lahmeyer.de (Steiner, Bernard) Date: Mon Jul 13 09:00:14 2009 Subject: kern/121481: [gmirror] data rot on disk with gmirror Message-ID: <200907130900.n6D906TE039950@freefall.freebsd.org> The following reply was made to PR kern/121481; it has been noted by GNATS. From: "Steiner, Bernard" To: "Dan Naumov" , Cc: Subject: RE: kern/121481: [gmirror] data rot on disk with gmirror Date: Mon, 13 Jul 2009 10:50:02 +0200 Dan, > RAID offers protection against very specific kinds of disk > failure and does not offer any kind of protection against bit > rot. I want to emphasize that this is not a FreeBSD issue, > but a RAID issue in general and you will run into exact same > limitations if you try raid on Linux or Windows or hardware I was asking for -w to be implemented by gmirror, and/or graid6 (double parity) be implemented (also with -w or even -w2 ;-) > raid from any hardware vendor. For another example of a fault > that RAID mirror will NOT protect you or even warn you > against, is your disk/raid controller going berserk and > writing garbage to the mirror or one of it's member disks. ACK. This is exactly why I want a check on the data read. > [checksumming utilities] Please explain how to do that on both sides of a gmirror. AFAIK, gmirror can be configured in the following ways: (1) always read from "primary" disk => cannot check secondary (2) round robin or load => read cannot be reliably reproduced Correct me if I am wrong, but this does not seem like a solution to my problem. > [ZFS / Solaris] I think I *like* ZFS (raidz2) and probably go with that. Solaris' future is uncertain in the light of SUN's future... I think maybe I'll wait a while till the warning is edited out of ZFS in FreeBSD and give it another shot. Bernard -- i.A. Dipl.-Inform. 
Bernard Steiner Netzwerk- und Systemadministrator Phone: +49 6101 55 1280, Fax: +49 6101 55 1623 Lahmeyer International GmbH Friedberger Strasse 173, 61118 Bad Vilbel, Deutschland/Germany Geschaeftsfuehrer/Managing Directors: Dr. Henning Nothdurft (Vorsitzender/President), Burkhard Neumann Firmensitz/Registered office: Bad Vilbel Registergericht/Registry court: Frankfurt am Main HRB 80852 Internet: http://www.lahmeyer.de/ Disclaimer: http://www.lahmeyer.de/disclaimer/ From dan.naumov at gmail.com Mon Jul 13 09:20:03 2009 From: dan.naumov at gmail.com (Dan Naumov) Date: Mon Jul 13 09:20:10 2009 Subject: kern/121481: [gmirror] data rot on disk with gmirror Message-ID: <200907130920.n6D9K3bL055032@freefall.freebsd.org> The following reply was made to PR kern/121481; it has been noted by GNATS. From: Dan Naumov To: "Steiner, Bernard" Cc: bug-followup@freebsd.org Subject: Re: kern/121481: [gmirror] data rot on disk with gmirror Date: Mon, 13 Jul 2009 12:16:33 +0300 On Mon, Jul 13, 2009 at 11:50 AM, Steiner, Bernard wrote: >> [checksumming utilities] > > Please explain how to do that on both sides of a gmirror. > AFAIK, gmirror can be configured in the following ways: > (1) always read from "primary" disk => cannot check secondary > (2) round robin or load => read cannot be reliably reproduced > > Correct me if I am wrong, but this does not seem like a solution > to my problem. 
You have several options: Option 1 (this has the benefit of working with all balance algorithms): Take disc2 offline, run checksum check (so that checks are done against disc1) Take disc2 online, take disc1 offline, run checksum check Option 2 (for "prefer" algorithm): Assuming disk1 is the promoted disk, run checksum check Promote disk2, run checksum check Promote disk1 to return to original state - Sincerely, Dan Naumov From juli at clockworksquid.com Mon Jul 13 09:41:27 2009 From: juli at clockworksquid.com (Juli Mallett) Date: Mon Jul 13 09:41:34 2009 Subject: Is anything being done to un-break partition names? In-Reply-To: <5BE6CEC9-8F49-473B-A3E4-2702680A8836@mac.com> References: <20090605051203.GD1705@garage.freebsd.pl> <46FB00ED-62DC-4924-A84A-8C34B26DA22E@mac.com> <20090605200512.GA2313@garage.freebsd.pl> <5BE6CEC9-8F49-473B-A3E4-2702680A8836@mac.com> Message-ID: On Fri, Jun 5, 2009 at 13:38, Marcel Moolenaar wrote: > That introduces significant breakages in normal setups. > The priority should not be changed. It's resulting in > the right behaviour and any exceptions to that (i.e. > we need the wrong behaviour) should be coded explicitly. > > The closest we can get is by having the BSD disklabel > code check if there are valid MBR partitions defined > and if yes, back-off. This covers exactly the problem > case and doesn't introduce false negatives in other > scenarios. Are there any efforts outstanding to do this for 8.0? Or did it get done and I missed it? > But: we should fix sysinstall as well. Either we should > finally rip out DD or we should have it create proper > DD images... Perhaps randi@ will take a look; she's been working on nearby parts of sysinstall for a while. Adding her to CC since I don't know if she reads geom@. 
From bugmaster at FreeBSD.org Mon Jul 13 11:06:55 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Jul 13 11:08:19 2009 Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org Message-ID: <200907131106.n6DB6sSQ040601@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/136467  geom       [geom] glabel(8) destroys access to GEOM tree if volum
o kern/135898  geom       [geom] Severe filesystem corruption - large files or l
o kern/135874  geom       [geom] [patch] geom_linux_lvm misses newer fedora defa
o kern/134922  geom       [gmirror] [panic] kernel panic when use fdisk on disk
o kern/134113  geom       [geli] Problem setting secondary GELI key
o kern/134044  geom       [geom] gmirror(8) overwrites fs with stale data from r
o kern/133931  geom       [geli] [request] intentionally wrong password to destr
o bin/132845   geom       [geom] [patch] ggated(8) does not close files opened a
o kern/132273  geom       glabel(8): [patch] failing on journaled partition
o kern/132242  geom       [gmirror] gmirror.ko fails to fully initialize
o kern/131353  geom       [geom] gjournal(8) kernel lock
o kern/131037  geom       [geli] Unable to create disklabel on .eli-Device
p docs/130548  geom       [patch] gjournal(8) man page is missing sysctls
o kern/130528  geom       gjournal fsck during boot
o kern/129674  geom       [geom] gjournal root did not mount on boot
o kern/129645  geom       gjournal(8): GEOM_JOURNAL causes system to fail to boo
o kern/129245  geom       [geom] gcache is more suitable for suffix based provid
f kern/128276  geom       [gmirror] machine lock up when gmirror module is used
o kern/126902  geom       [geom] geom_label: kernel panic during install boot
o kern/124973  geom       [gjournal] [patch] boot order affects geom_journal con
o kern/124969  geom       gvinum(8): gvinum raid5 plex does not detect missing s
o kern/124294  geom       [geom] gmirror(8) have inappropriate logic when workin
o kern/124130  geom       [gmirror] [usb] gmirror fails to start usb devices tha
o kern/123962  geom       [panic] [gjournal] gjournal (455Gb data, 8Gb journal),
o kern/123630  geom       [patch] [gmirror] gmirror doesnt allow the original dr
o kern/123122  geom       [geom] GEOM / gjournal kernel lock
o kern/122738  geom       [geom] gmirror list "losts consumers" after gmirror de
f kern/122415  geom       [geom] UFS labels are being constantly created and rem
o kern/122067  geom       [geom] [panic] Geom crashed during boot
o kern/121559  geom       [patch] [geom] geom label class allows to create inacc
o kern/121364  geom       [gmirror] Removing all providers create a "zombie" mir
o kern/120231  geom       [geom] GEOM_CONCAT error adding second drive
o kern/120091  geom       [geom] [geli] [gjournal] geli does not prompt for pass
o kern/120044  geom       [msdosfs] [geom] incorrect MSDOSFS label fries adminis
o kern/120021  geom       [geom] [panic] net-p2p/qbittorrent crashes system when
o kern/119743  geom       [geom] geom label for cds is keeped after dismount and
o kern/115856  geom       [geli] ZFS thought it was degraded when it should have
o kern/115547  geom       [geom] [patch] [request] let GEOM Eli get password fro
o kern/114532  geom       [geom] GEOM_MIRROR shows up in kldstat even if compile
o kern/113957  geom       [gmirror] gmirror is intermittently reporting a degrad
o kern/113885  geom       [gmirror] [patch] improved gmirror balance algorithm
o kern/113837  geom       [geom] unable to access 1024 sector size storage
o kern/113419  geom       [geom] geom fox multipathing not failing back
p bin/110705   geom       gmirror(8) control utility does not exit with correct
o kern/107707  geom       [geom] [patch] [request] add new class geom_xbox360 to
o kern/104389  geom       [geom] [patch] sys/geom/geom_dump.c doesn't encode XML
o kern/98034   geom       [geom] dereference of NULL pointer in acd_geom_detach
o kern/94632   geom       [geom] Kernel output resets input while GELI asks for
o kern/90582   geom       [geom] [panic] Restore cause panic string (ffs_blkfree
o bin/90093    geom       fdisk(8) incapable of altering in-core geometry
a kern/89660   geom       [vinum] [patch] [panic] due to g_malloc returning null
o kern/89546   geom       [geom] GEOM error
o kern/88601   geom       [geli] geli cause kernel panic under heavy disk usage
o kern/87544   geom       [gbde] mmaping large files on a gbde filesystem deadlo
o kern/84556   geom       [geom] [panic] GBDE-encrypted swap causes panic at shu
o kern/79251   geom       [2TB] newfs fails on 2.6TB gbde device
o kern/79035   geom       [vinum] gvinum unable to create a striped set of mirro
o bin/78131    geom       gbde(8) "destroy" not working.
s kern/73177   geom       kldload geom_* causes panic due to memory exhaustion

59 problems total.

From emikulic at gmail.com Thu Jul 16 08:20:11 2009 From: emikulic at gmail.com (Emil Mikulic) Date: Thu Jul 16 08:20:17 2009 Subject: kern/113885: [gmirror] [patch] improved gmirror balance algorithm Message-ID: <200907160820.n6G8KAGe072164@freefall.freebsd.org> The following reply was made to PR kern/113885; it has been noted by GNATS. From: Emil Mikulic To: bug-followup@FreeBSD.org Cc: will@firepipe.net Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm Date: Thu, 16 Jul 2009 17:56:19 +1000 Will Andrews' patch is *fantastic* With this patch and gmirror set to "load" style balancing, I can run two long sequential reads in parallel and get practically linear scaling on a two-disk mirror. Could someone please commit this? --Emil From ivoras at freebsd.org Thu Jul 16 11:40:15 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Thu Jul 16 11:40:22 2009 Subject: kern/113885: [gmirror] [patch] improved gmirror balance algorithm In-Reply-To: <200907160820.n6G8KAGe072164@freefall.freebsd.org> References: <200907160820.n6G8KAGe072164@freefall.freebsd.org> Message-ID: Emil Mikulic wrote: > The following reply was made to PR kern/113885; it has been noted by GNATS.
> > From: Emil Mikulic > To: bug-followup@FreeBSD.org > Cc: will@firepipe.net > Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance > algorithm > Date: Thu, 16 Jul 2009 17:56:19 +1000 > > Will Andrews' patch is *fantastic* > > With this patch and gmirror set to "load" style balancing, I can run two > long sequential reads in parallel and get practically linear scaling on > a two-disk mirror. > > Could someone please commit this? > > --Emil Can you please do some testing (of the style you just did but also diskinfo -vt and possibly random reads) on both patch candidates: http://www.freebsd.org/cgi/query-pr.cgi?pr=113885 and http://sobomax.sippysoft.com/~sobomax/geom_mirror.diff Unless there are significant differences in favour of the second version, I'm inclined to commit the version in the PR (unless problems and obstructions are indicated, of course). From dfilter at FreeBSD.ORG Sun Jul 19 13:00:23 2009 From: dfilter at FreeBSD.ORG (dfilter service) Date: Sun Jul 19 13:00:30 2009 Subject: kern/135874: commit references a PR Message-ID: <200907191300.n6JD0FL4035902@freefall.freebsd.org> The following reply was made to PR kern/135874; it has been noted by GNATS. From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/135874: commit references a PR Date: Sun, 19 Jul 2009 12:57:22 +0000 (UTC) Author: lulf Date: Sun Jul 19 12:57:10 2009 New Revision: 195759 URL: http://svn.freebsd.org/changeset/base/195759 Log: MFC r194924: - Apply the same naming rules of LVM names as done in the LVM code itself. 
PR: kern/135874 Modified: stable/7/sys/ (props changed) stable/7/sys/contrib/pf/ (props changed) stable/7/sys/geom/linux_lvm/g_linux_lvm.c Modified: stable/7/sys/geom/linux_lvm/g_linux_lvm.c ============================================================================== --- stable/7/sys/geom/linux_lvm/g_linux_lvm.c Sat Jul 18 21:50:53 2009 (r195758) +++ stable/7/sys/geom/linux_lvm/g_linux_lvm.c Sun Jul 19 12:57:10 2009 (r195759) @@ -826,14 +826,6 @@ llvm_md_decode(const u_char *data, struc return (0); } -#define GRAB_NAME(tok, name, len) \ - len = 0; \ - while (tok[len] && (isalpha(tok[len]) || isdigit(tok[len])) && \ - len < G_LLVM_NAMELEN - 1) \ - len++; \ - bcopy(tok, name, len); \ - name[len] = '\0'; - #define GRAB_INT(key, tok1, tok2, v) \ if (tok1 && tok2 && strncmp(tok1, key, sizeof(key)) == 0) { \ v = strtol(tok2, &tok1, 10); \ @@ -864,6 +856,27 @@ llvm_md_decode(const u_char *data, struc break; \ } +static size_t +llvm_grab_name(char *name, const char *tok) +{ + size_t len; + + len = 0; + if (tok == NULL) + return (0); + if (tok[0] == '-') + return (0); + if (strcmp(tok, ".") == 0 || strcmp(tok, "..") == 0) + return (0); + while (tok[len] && (isalpha(tok[len]) || isdigit(tok[len]) || + tok[len] == '.' 
|| tok[len] == '_' || tok[len] == '-' || + tok[len] == '+') && len < G_LLVM_NAMELEN - 1) + len++; + bcopy(tok, name, len); + name[len] = '\0'; + return (len); +} + static int llvm_textconf_decode(u_char *data, int buflen, struct g_llvm_metadata *md) { @@ -872,7 +885,7 @@ llvm_textconf_decode(u_char *data, int b char *tok, *v; char name[G_LLVM_NAMELEN]; char uuid[G_LLVM_UUIDLEN]; - int len; + size_t len; if (buf == NULL || *buf == '\0') return (EINVAL); @@ -880,7 +893,7 @@ llvm_textconf_decode(u_char *data, int b tok = strsep(&buf, "\n"); if (tok == NULL) return (EINVAL); - GRAB_NAME(tok, name, len); + len = llvm_grab_name(name, tok); if (len == 0) return (EINVAL); @@ -970,7 +983,7 @@ llvm_textconf_decode_pv(char **buf, char { struct g_llvm_pv *pv; char *v; - int len; + size_t len; if (*buf == NULL || **buf == '\0') return (EINVAL); @@ -983,7 +996,7 @@ llvm_textconf_decode_pv(char **buf, char len = 0; if (tok == NULL) goto bad; - GRAB_NAME(tok, pv->pv_name, len); + len = llvm_grab_name(pv->pv_name, tok); if (len == 0) goto bad; @@ -1024,7 +1037,7 @@ llvm_textconf_decode_lv(char **buf, char struct g_llvm_lv *lv; struct g_llvm_segment *sg; char *v; - int len; + size_t len; if (*buf == NULL || **buf == '\0') return (EINVAL); @@ -1036,10 +1049,9 @@ llvm_textconf_decode_lv(char **buf, char lv->lv_vg = vg; LIST_INIT(&lv->lv_segs); - len = 0; if (tok == NULL) goto bad; - GRAB_NAME(tok, lv->lv_name, len); + len = llvm_grab_name(lv->lv_name, tok); if (len == 0) goto bad; @@ -1162,7 +1174,6 @@ bad: free(sg, M_GLLVM); return (-1); } -#undef GRAB_NAME #undef GRAB_INT #undef GRAB_STR #undef SPLIT _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From bugmaster at FreeBSD.org Mon Jul 20 11:06:56 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Jul 20 11:08:16 2009 Subject: Current problem reports assigned 
to freebsd-geom@FreeBSD.org Message-ID: <200907201106.n6KB6sFY002276@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/136467 geom [geom] glabel(8) destroys access to GEOM tree if volum o kern/135898 geom [geom] Severe filesystem corruption - large files or l o kern/135874 geom [geom] [patch] geom_linux_lvm misses newer fedora defa o kern/134922 geom [gmirror] [panic] kernel panic when use fdisk on disk o kern/134113 geom [geli] Problem setting secondary GELI key o kern/134044 geom [geom] gmirror(8) overwrites fs with stale data from r o kern/133931 geom [geli] [request] intentionally wrong password to destr o bin/132845 geom [geom] [patch] ggated(8) does not close files opened a o kern/132273 geom glabel(8): [patch] failing on journaled partition o kern/132242 geom [gmirror] gmirror.ko fails to fully initialize o kern/131353 geom [geom] gjournal(8) kernel lock o kern/131037 geom [geli] Unable to create disklabel on .eli-Device p docs/130548 geom [patch] gjournal(8) man page is missing sysctls o kern/130528 geom gjournal fsck during boot o kern/129674 geom [geom] gjournal root did not mount on boot o kern/129645 geom gjournal(8): GEOM_JOURNAL causes system to fail to boo o kern/129245 geom [geom] gcache is more suitable for suffix based provid f kern/128276 geom [gmirror] machine lock up when gmirror module is used o kern/126902 geom [geom] geom_label: kernel panic during install boot o kern/124973 geom [gjournal] [patch] boot order affects geom_journal con o kern/124969 geom gvinum(8): gvinum raid5 plex does not detect missing s o kern/124294 geom [geom] gmirror(8) have inappropriate logic when workin 
o kern/124130 geom [gmirror] [usb] gmirror fails to start usb devices tha o kern/123962 geom [panic] [gjournal] gjournal (455Gb data, 8Gb journal), o kern/123630 geom [patch] [gmirror] gmirror doesnt allow the original dr o kern/123122 geom [geom] GEOM / gjournal kernel lock o kern/122738 geom [geom] gmirror list "losts consumers" after gmirror de f kern/122415 geom [geom] UFS labels are being constantly created and rem o kern/122067 geom [geom] [panic] Geom crashed during boot o kern/121559 geom [patch] [geom] geom label class allows to create inacc o kern/121364 geom [gmirror] Removing all providers create a "zombie" mir o kern/120231 geom [geom] GEOM_CONCAT error adding second drive o kern/120091 geom [geom] [geli] [gjournal] geli does not prompt for pass o kern/120044 geom [msdosfs] [geom] incorrect MSDOSFS label fries adminis o kern/120021 geom [geom] [panic] net-p2p/qbittorrent crashes system when o kern/119743 geom [geom] geom label for cds is keeped after dismount and o kern/115856 geom [geli] ZFS thought it was degraded when it should have o kern/115547 geom [geom] [patch] [request] let GEOM Eli get password fro o kern/114532 geom [geom] GEOM_MIRROR shows up in kldstat even if compile o kern/113957 geom [gmirror] gmirror is intermittently reporting a degrad o kern/113885 geom [gmirror] [patch] improved gmirror balance algorithm o kern/113837 geom [geom] unable to access 1024 sector size storage o kern/113419 geom [geom] geom fox multipathing not failing back p bin/110705 geom gmirror(8) control utility does not exit with correct o kern/107707 geom [geom] [patch] [request] add new class geom_xbox360 to o kern/104389 geom [geom] [patch] sys/geom/geom_dump.c doesn't encode XML o kern/98034 geom [geom] dereference of NULL pointer in acd_geom_detach o kern/94632 geom [geom] Kernel output resets input while GELI asks for o kern/90582 geom [geom] [panic] Restore cause panic string (ffs_blkfree o bin/90093 geom fdisk(8) incapable of altering in-core geometry a 
kern/89660 geom [vinum] [patch] [panic] due to g_malloc returning null o kern/89546 geom [geom] GEOM error o kern/88601 geom [geli] geli cause kernel panic under heavy disk usage o kern/87544 geom [gbde] mmaping large files on a gbde filesystem deadlo o kern/84556 geom [geom] [panic] GBDE-encrypted swap causes panic at shu o kern/79251 geom [2TB] newfs fails on 2.6TB gbde device o kern/79035 geom [vinum] gvinum unable to create a striped set of mirro o bin/78131 geom gbde(8) "destroy" not working. s kern/73177 geom kldload geom_* causes panic due to memory exhaustion 59 problems total. From freebsdpr at satin.sensation.net.au Tue Jul 21 07:50:19 2009 From: freebsdpr at satin.sensation.net.au (freebsdpr) Date: Tue Jul 21 07:50:25 2009 Subject: kern/113885: [gmirror] [patch] improved gmirror balance algorithm Message-ID: <200907210750.n6L7o7MC000583@freefall.freebsd.org> The following reply was made to PR kern/113885; it has been noted by GNATS. From: freebsdpr To: bug-followup@FreeBSD.org Cc: freebsdpr Subject: Re: kern/113885: [gmirror] [patch] improved gmirror balance algorithm Date: Tue, 21 Jul 2009 17:45:37 +1000 (EST) I was also surprised to discover that gmirror, regardless of the algorithm used, does not seem to offer either random or sequential read performance any better than a single drive. I have a new SATA backplane which shows individual drive activity indicators - with these you can easily see that the "load" algorithm seems to be selecting (and staying on) only a single drive at a time, for anywhere between 0.1 - 1 seconds. Some simple testing confirmed that there's no discernable read performance benefit between 1 or >1 drives - so much for my 4 drive RAID1 idea! In comparison, a 5 drive graid3 array offers a sequential read speed of nearly 4 times a single drive... with read verify ON. ---- Onto the "load" patch above - it doesn't seem to work for me. 
I thought it may have been because I had 4 drives in the array, but even after dropping back to 2 it still only reads from a *single* drive. Any ideas? I'm using 7.1R-amd64. Geom name: db0 State: COMPLETE Components: 2 Balance: load <--- *** From naeem_jarral at yahoo.com Tue Jul 21 19:30:56 2009 From: naeem_jarral at yahoo.com (Naeem Afzal) Date: Tue Jul 21 19:31:02 2009 Subject: taking account of bio_resid when b_maxsize is greater than bio_length Message-ID: <937396.78922.qm@web81102.mail.mud.yahoo.com> Hi, I am writing a disk driver which can only read maximum of 0x1000 bytes at a time, so d_maxsize = 0x1000. If the application tries to read more than that say 0x3000 and current bio_offset pointer is set to say 0xe00, driver can write only 0x200 (to stay within 0x1000 window), that means it will set the bio_resid=0xe00 as the request from OS will be 0x1000 (d_maxsize). Now subsequent OS request shuold be at 0x1000, but it was returning 0x1e00 as it does not count for bio_resid and assumes that maximum write happened. Could someone explain if my assumption is correct that it needs to take care of bio_resid? regards naeem (FreeBSD 7.1) /usr/src/sys/geom/geom_disk.c static void g_disk_start(struct bio *bp) { struct bio *bp2, *bp3; struct disk *dp; int error; off_t off; .... /* fall-through */ case BIO_READ: case BIO_WRITE: off = 0; bp3 = NULL; bp2 = g_clone_bio(bp); if (bp2 == NULL) { error = ENOMEM; break; } do { bp2->bio_offset += off; bp2->bio_length -= off; bp2->bio_data += off; if (bp2->bio_length > dp->d_maxsize) { /* * XXX: If we have a stripesize we should really * use it here. */ bp2->bio_length = dp->d_maxsize; off += dp->d_maxsize; /* * To avoid a race, we need to grab the next bio * before we schedule this one. See "notes". 
                 */
                bp3 = g_clone_bio(bp);
                if (bp3 == NULL)
                    bp->bio_error = ENOMEM;
            }
            bp2->bio_done = g_disk_done;
            bp2->bio_pblkno = bp2->bio_offset / dp->d_sectorsize;
            bp2->bio_bcount = bp2->bio_length;
            bp2->bio_disk = dp;
            devstat_start_transaction_bio(dp->d_devstat, bp2);
            g_disk_lock_giant(dp);
            dp->d_strategy(bp2);
            g_disk_unlock_giant(dp);
#if 1 // I believe this line is needed to account for bio_resid??
            off -= bp2->bio_resid;
#endif

From lulf at FreeBSD.org Wed Jul 22 18:05:34 2009
From: lulf at FreeBSD.org (lulf@FreeBSD.org)
Date: Wed Jul 22 18:05:46 2009
Subject: kern/135874: [geom] [patch] geom_linux_lvm misses newer fedora defaults
Message-ID: <200907221805.n6MI5Xmv090458@freefall.freebsd.org>

Synopsis: [geom] [patch] geom_linux_lvm misses newer fedora defaults

State-Changed-From-To: open->closed
State-Changed-By: lulf
State-Changed-When: Wed Jul 22 18:05:09 UTC 2009
State-Changed-Why:
- Fixed in both HEAD and RELENG_7.

http://www.freebsd.org/cgi/query-pr.cgi?pr=135874

From IZ-FreeBSD0902-nospam at hs-karlsruhe.de Thu Jul 23 09:50:03 2009
From: IZ-FreeBSD0902-nospam at hs-karlsruhe.de (Ralf Wenk)
Date: Thu Jul 23 09:50:09 2009
Subject: kern/136467: [geom] glabel(8) destroys access to GEOM tree if volume label contains non ASCII characters
Message-ID: <200907230950.n6N9o2Pd062827@freefall.freebsd.org>

The following reply was made to PR kern/136467; it has been noted by GNATS.

From: Ralf Wenk
To: bug-followup@FreeBSD.org
Cc:
Subject: Re: kern/136467: [geom] glabel(8) destroys access to GEOM tree if volume label contains non ASCII characters
Date: Thu, 23 Jul 2009 10:24:44 +0200

> Would you be happy with a sysctl knob that controls whether to interpret
> all labels as UTF-8 or to strip out all 8-bit characters, which would by
> default be set to "interpret as UTF-8"?

Yes, that would be a nice solution.
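The second option in the quoted question, stripping every 8-bit character from a label, could look roughly like the C sketch below. This is only an illustration under stated assumptions: the function name label_strip_8bit and its interface are hypothetical and are not taken from glabel or any other GEOM code, and no sysctl knob is modelled here. ASCII characters that are unsafe in XML output pass through unchanged, so this covers only the 8-bit part of the problem.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the FreeBSD tree: copy the label in
 * src to dst, dropping every byte that has the high bit set.
 * dstlen is the size of dst including room for the terminating NUL.
 * Returns the number of bytes stored, not counting the NUL.
 */
size_t
label_strip_8bit(char *dst, size_t dstlen, const char *src)
{
	size_t n = 0;

	if (dstlen == 0)
		return (0);
	for (; *src != '\0' && n < dstlen - 1; src++) {
		/* Keep plain 7-bit ASCII only. */
		if (((unsigned char)*src & 0x80) == 0)
			dst[n++] = *src;
	}
	dst[n] = '\0';
	return (n);
}
```

A caller would pass the raw on-disk label as src and use the cleaned copy when naming the provider. Whether dropping bytes or replacing them with a placeholder is preferable is a design choice the thread leaves open.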
From jh at saunalahti.fi Thu Jul 23 11:30:04 2009 From: jh at saunalahti.fi (Jaakko Heinonen) Date: Thu Jul 23 11:30:49 2009 Subject: kern/136467: [geom] glabel(8) destroys access to GEOM tree if volume label contains non ASCII characters Message-ID: <200907231130.n6NBU3dB045066@freefall.freebsd.org> The following reply was made to PR kern/136467; it has been noted by GNATS. From: Jaakko Heinonen To: Ivan Voras Cc: bug-followup@freebsd.org, IZ-FreeBSD0902@hs-karlsruhe.de Subject: Re: kern/136467: [geom] glabel(8) destroys access to GEOM tree if volume label contains non ASCII characters Date: Thu, 23 Jul 2009 14:26:16 +0300 Hi, This PR is a duplicate of kern/104389 and kern/120044. On 2009-07-09, Ivan Voras wrote: > Would you be happy with a sysctl knob that controls whether to interpret > all labels as UTF-8 or to strip out all 8-bit characters, which would by > default be set to "interpret as UTF-8"? You may already know this but it's not enough to strip 8-bit characters but all characters unsafe for XML output need to be handled. There are more details in kern/104389. -- Jaakko From bugmaster at FreeBSD.org Mon Jul 27 11:06:54 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Jul 27 11:08:21 2009 Subject: Current problem reports assigned to freebsd-geom@FreeBSD.org Message-ID: <200907271106.n6RB6r37018936@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description -------------------------------------------------------------------------------- o kern/136467 geom [geom] glabel(8) destroys access to GEOM tree if volum o kern/135898 geom [geom] Severe filesystem corruption - large files or l o kern/134922 geom [gmirror] [panic] kernel panic when use fdisk on disk o kern/134113 geom [geli] Problem setting secondary GELI key o kern/134044 geom [geom] gmirror(8) overwrites fs with stale data from r o kern/133931 geom [geli] [request] intentionally wrong password to destr o bin/132845 geom [geom] [patch] ggated(8) does not close files opened a o kern/132273 geom glabel(8): [patch] failing on journaled partition o kern/132242 geom [gmirror] gmirror.ko fails to fully initialize o kern/131353 geom [geom] gjournal(8) kernel lock o kern/131037 geom [geli] Unable to create disklabel on .eli-Device p docs/130548 geom [patch] gjournal(8) man page is missing sysctls o kern/130528 geom gjournal fsck during boot o kern/129674 geom [geom] gjournal root did not mount on boot o kern/129645 geom gjournal(8): GEOM_JOURNAL causes system to fail to boo o kern/129245 geom [geom] gcache is more suitable for suffix based provid f kern/128276 geom [gmirror] machine lock up when gmirror module is used o kern/126902 geom [geom] geom_label: kernel panic during install boot o kern/124973 geom [gjournal] [patch] boot order affects geom_journal con o kern/124969 geom gvinum(8): gvinum raid5 plex does not detect missing s o kern/124294 geom [geom] gmirror(8) have inappropriate logic when workin o kern/124130 geom [gmirror] [usb] gmirror fails to start usb devices tha o kern/123962 geom [panic] [gjournal] gjournal (455Gb data, 8Gb journal), o kern/123630 geom [patch] [gmirror] gmirror doesnt allow the original dr o kern/123122 geom [geom] GEOM / gjournal kernel lock o kern/122738 geom [geom] gmirror list "losts consumers" after gmirror de f kern/122415 geom [geom] UFS labels are being constantly created and rem o kern/122067 geom [geom] [panic] Geom 
crashed during boot o kern/121559 geom [patch] [geom] geom label class allows to create inacc o kern/121364 geom [gmirror] Removing all providers create a "zombie" mir o kern/120231 geom [geom] GEOM_CONCAT error adding second drive o kern/120091 geom [geom] [geli] [gjournal] geli does not prompt for pass o kern/120044 geom [msdosfs] [geom] incorrect MSDOSFS label fries adminis o kern/120021 geom [geom] [panic] net-p2p/qbittorrent crashes system when o kern/119743 geom [geom] geom label for cds is keeped after dismount and o kern/115856 geom [geli] ZFS thought it was degraded when it should have o kern/115547 geom [geom] [patch] [request] let GEOM Eli get password fro o kern/114532 geom [geom] GEOM_MIRROR shows up in kldstat even if compile o kern/113957 geom [gmirror] gmirror is intermittently reporting a degrad o kern/113885 geom [gmirror] [patch] improved gmirror balance algorithm o kern/113837 geom [geom] unable to access 1024 sector size storage o kern/113419 geom [geom] geom fox multipathing not failing back p bin/110705 geom gmirror(8) control utility does not exit with correct o kern/107707 geom [geom] [patch] [request] add new class geom_xbox360 to o kern/104389 geom [geom] [patch] sys/geom/geom_dump.c doesn't encode XML o kern/98034 geom [geom] dereference of NULL pointer in acd_geom_detach o kern/94632 geom [geom] Kernel output resets input while GELI asks for o kern/90582 geom [geom] [panic] Restore cause panic string (ffs_blkfree o bin/90093 geom fdisk(8) incapable of altering in-core geometry a kern/89660 geom [vinum] [patch] [panic] due to g_malloc returning null o kern/89546 geom [geom] GEOM error o kern/88601 geom [geli] geli cause kernel panic under heavy disk usage o kern/87544 geom [gbde] mmaping large files on a gbde filesystem deadlo o kern/84556 geom [geom] [panic] GBDE-encrypted swap causes panic at shu o kern/79251 geom [2TB] newfs fails on 2.6TB gbde device o kern/79035 geom [vinum] gvinum unable to create a striped set of mirro o bin/78131 
geom gbde(8) "destroy" not working. s kern/73177 geom kldload geom_* causes panic due to memory exhaustion 58 problems total. From acc at hexadecagram.org Fri Jul 31 06:23:09 2009 From: acc at hexadecagram.org (Anthony Chavez) Date: Fri Jul 31 06:23:25 2009 Subject: Re-starting a gjournal provider In-Reply-To: <20090729140436.GG1586@garage.freebsd.pl> References: <4A62E0CE.1000508@hexadecagram.org> <20090729140436.GG1586@garage.freebsd.pl> Message-ID: <4A7289B9.2060907@hexadecagram.org> Thanks very much for responding, Pawel. I'm moving this discussion to freebsd-geom, which is where I probably should have posted in the first place. Lack of sleep and coffee on Sunday morning were partly to blame, I'm sure. ;-) Pawel Jakub Dawidek wrote: > On Sun, Jul 19, 2009 at 03:01:02AM -0600, Anthony Chavez wrote: >> Hello freebsd-fs, >> >> I'm trying to get gjournal working on a "removable" hard disk. I use >> the term loosely, because I'm using a very simple eSATA enclosure: an >> AMS Venus DS5 [1]. >> >> If I swap out disks, atacontrol cap ad0 seems sufficient enough to >> detect the new drive: the reported device model, serial number, firmware >> revision, and CHS values change as one would expect. >> >> My interpretation of [2] section 5.3 and gjournal(8) is that the >> following sequence of commands should ensure me that all write buffers >> have been flushed and bring the system to a point where it is safe to >> remove a disk. >> >> sync; sync; sync >> gjournal sync >> umount /dev/ad0s1.journal >> gjournal stop ad0s1.journal > > You should first unmount and then call 'gjournal sync'. Thank you for clarifying that. You mention this again later on in your response, and I respond below. >> However, once they are executed, /dev/ad0s1.journal disappears and when >> I swap out the disk it doesn't come back. 
The only way I've found to >> bring it back is atacontrol detach ata0; atacontrol attach ata0, which >> doesn't seem like a wise thing to do if I have another device on the >> same channel. > > It doesn't come back because something (ATA layer?) doesn't properly > remove ad0 provider. When you remove the disk, /dev/ad0 should disappear > and reappear once you insert it again. > > You can still do this trick after you insert the disk again so the GEOM > can schedule retaste: > > # true > /dev/ad0 Thank you for informing me of that trick. I tried using it after "gjournal stop" but unfortunately, nothing changed. My terminology might have been a bit off in my initial post (gjournal is still a bit new to me). So I will attempt to clarify a bit more. Here is an example of a typical session (the only difference this time being "gjournal sync" following umount as you prescribed). % sudo atacontrol info ata0 Master: ad0 SATA revision 1.x Slave: no device present % ls /dev/ad0* /dev/ad0 /dev/ad0s1 /dev/ad0s1.journal % mount | grep ad0s1 /dev/ad0s1.journal on /mnt/ad0s1 (ufs, local, gjournal) % ( subsh> set -o errexit subsh> sync subsh> sync subsh> sync subsh> sudo umount /dev/ad0s1.journal subsh> gjournal sync subsh> sudo gjournal stop ad0s1.journal subsh> ) % ls /dev/ad0* /dev/ad0 /dev/ad0s1 % sudo true \> /dev/ad0; ls /dev/ad0* /dev/ad0 /dev/ad0s1 % sudo true \> /dev/ad0s1; ls /dev/ad0* /dev/ad0 /dev/ad0s1 % sudo atacontrol detach ata0 && sudo atacontrol attach ata0 Master: ad0 SATA revision 1.x Slave: no device present % ls /dev/ad0* /dev/ad0 /dev/ad0s1 /dev/ad0s1.journal Here are the points to note. 1) When I physically remove a drive from the enclosure, /dev/ad0 does not disappear. /dev/ad0 *always* exists until I "atacontrol detach." Even when the device is powered off, /dev/ad0 continues to exist. 2) /dev/ad0s1.journal disappears when I "gjournal stop." 
/dev/ad0s1.journal is the device that, AFAIK, will only come back after "atacontrol detach ata0; atacontrol attach ata0". 3) When I swap drives, "atacontrol cap ad0" will produce a report for the newly-inserted drive. If I attempt to "atacontrol info ata0" before issuing that command, it continues to display the drive model and firmware revision from the drive that was previously inserted. However, "atacontrol cap" does not appear to provoke the return of /dev/ad0s1.journal. >> My question is, do I need to issue gjournal stop before I swap disks? >> And if so, is there any way that I can avoid the atacontrol >> detach/attach cycle that would need to take place before any mount is >> attempted so that /dev/ad0s1.journal appears (if in the drive inserted >> at the time does in fact utilize gjournal; I may want to experiment with >> having disks with either gjournal or soft updates)? This paragraph (above) and the one that that proceeded it in my original post contains the following 2 questions that remain unanswered (I've added another which was implied previously at best). 1) Is "atacontrol detach ata0 && atacontrol attach ata0" in fact a safe operation to perform in any circumstance? My better judgment has me thinking that the answer to this question is almost certainly "no." However, I am hypothesizing that it would safe enough if all devices on ata0 are properly unmounted first, but if I can avoid that, I will. It feels clumsy and seems to defeat the purpose of hot-swapping. 2) Is it *necessary* to "gjournal stop" before hot-swapping? In such a scenario, I would opt to simply "umount; gjournal sync," swap disks, and then "atacontrol cap ad0; mount" (or even just "mount"). It seems quite likely, however, that all drives that undergo this treatment would be *required* to have gjournal labels since /dev/ad0s1.journal would never disappear (although I've yet to actually test that). 
3) If the answer to question 2 is "yes," then how can I handle the case of inserting a drive that does *not* have a gjournal label? >> And while I'm on the subject, are the (gjournal) syncs commands >> preceeding umount absolutely necessary in the case of removable media? > > 'gjournal sync' should follow unmount, not the other way around. And its > better to do it, but 'gjournal stop' should do the same. If that is indeed the case then the article I referenced as [2], "Implementing UFS Journaling on a Desktop PC," should be updated to reflect that ordering (section 5.3 prescribes a "umount" followed by "gjournal sync"). I'm submitting a PR that addresses this. In any case, the question I was asking here is actually twofold: 1) Is it really necessary to perform 3 "sync" commands before "umount"? Line 94 of src/sbin/umount/umount.c,v 1.45.20.1 has me thinking that the answer is "no," since it calls sync() itself, albeit only once. I got the idea for executing "sync" three times from /etc/rc.suspend. 2) Is it necessary to "gjournal sync" if I'm going to "gjournal stop" anyway? (You answered this one already.) Thank you for the assistance. -- Anthony Chavez http://hexadecagram.org/ mailto:acc@hexadecagram.org xmpp:acc@hexadecagram.org -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20090731/a76e024a/signature.pgp From pjd at FreeBSD.org Fri Jul 31 06:49:32 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Fri Jul 31 06:49:45 2009 Subject: Re-starting a gjournal provider In-Reply-To: <4A7289B9.2060907@hexadecagram.org> References: <4A62E0CE.1000508@hexadecagram.org> <20090729140436.GG1586@garage.freebsd.pl> <4A7289B9.2060907@hexadecagram.org> Message-ID: <20090731064948.GG1584@garage.freebsd.pl> On Fri, Jul 31, 2009 at 12:05:45AM -0600, Anthony Chavez wrote: > > It doesn't come back because something (ATA layer?) doesn't properly > > remove ad0 provider. When you remove the disk, /dev/ad0 should disappear > > and reappear once you insert it again. > > > > You can still do this trick after you insert the disk again so the GEOM > > can schedule retaste: > > > > # true > /dev/ad0 > > Thank you for informing me of that trick. I tried using it after > "gjournal stop" but unfortunately, nothing changed. This is because it should be /dev/ad0s1 and not /dev/ad0. Try with /dev/ad0s1. > Here are the points to note. > > 1) When I physically remove a drive from the enclosure, /dev/ad0 does > not disappear. /dev/ad0 *always* exists until I "atacontrol detach." > Even when the device is powered off, /dev/ad0 continues to exist. This might be three things: 1. Your enclosure/controller doesn't report back about disk being removed. 2. Your enclosure does report back, but ATA ignores such report. This will be a bug in ATA. 3. Your controller doesn't support hot-swap or it supports warm-swap, which means you have to detach it by hand before removing it. > 2) /dev/ad0s1.journal disappears when I "gjournal stop." > /dev/ad0s1.journal is the device that, AFAIK, will only come back after > "atacontrol detach ata0; atacontrol attach ata0". It should also get back after 'true > /dev/ad0s1'. 
What this command do is to open provider for writing (it doesn't write anything). In GEOM it will trigger spoil event and then, once command completes, it will trigger retaste event. This mean that GEOM will inform gjournal to check /dev/ad0s1 again and this will allow gjournal to find its metadata and create /dev/ad0s1.journal once again. One more test would be in place. If you could try the command below before removing disk and after inserting different disk: # diskinfo -v /dev/ad0 If it shows exactly the same in two cases, it means that it is not aware that disk was replaced and detach/attach cycle is needed. > 1) Is "atacontrol detach ata0 && atacontrol attach ata0" in fact a safe > operation to perform in any circumstance? > > My better judgment has me thinking that the answer to this question is > almost certainly "no." However, I am hypothesizing that it would safe > enough if all devices on ata0 are properly unmounted first, but if I can > avoid that, I will. It feels clumsy and seems to defeat the purpose of > hot-swapping. It should be safe, but there were plenty of bugs related to disappearing disk from under mount file system, etc. If nothing is mounted you should be fine (if there are no ATA bugs in this area). But for full hot-swap the disk controller should discover disk being removed and ATA code should remove it from /dev/. > 2) Is it *necessary* to "gjournal stop" before hot-swapping? > > In such a scenario, I would opt to simply "umount; gjournal sync," swap > disks, and then "atacontrol cap ad0; mount" (or even just "mount"). It > seems quite likely, however, that all drives that undergo this treatment > would be *required* to have gjournal labels since /dev/ad0s1.journal > would never disappear (although I've yet to actually test that). I'd go with 'umount; gjournal stop' and drop 'gjournal sync'. Controler should inform ATA that disk is gone. ATA should inform GEOM that ad0 is gone. 
If that were the case, a simple 'umount; gjournal sync' would be enough. But because it isn't the case, you have to stop gjournal and detach ad0.

> 3) If the answer to question 2 is "yes," then how can I handle the case
> of inserting a drive that does *not* have a gjournal label?

There's nothing special here. Let's see how the diskinfo test goes first.

> 1) Is it really necessary to perform 3 "sync" commands before "umount"?
>
> Line 94 of src/sbin/umount/umount.c,v 1.45.20.1 has me thinking that the
> answer is "no," since it calls sync() itself, albeit only once. I got
> the idea for executing "sync" three times from /etc/rc.suspend.

The idea is that unmount should take care of syncing data. There should be no need for even one sync. It is called "just in case".

> 2) Is it necessary to "gjournal sync" if I'm going to "gjournal stop"
> anyway? (You answered this one already.)

No, stop should be sufficient.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-geom/attachments/20090731/415d0c4d/attachment.pgp

From paul at gromit.dlib.vt.edu Fri Jul 31 14:17:12 2009
From: paul at gromit.dlib.vt.edu (Paul Mather)
Date: Fri Jul 31 14:17:19 2009
Subject: ZFS ignores some labels, now pool is corrupted.
Message-ID: <431FC16E-A25C-4BC3-A283-B1DAF2E3E46E@gromit.dlib.vt.edu>

I recently repurposed a motley assortment of hardware that used to be a JBOD ad hoc backup mirror to use FreeBSD 7-STABLE and ZFS. When I say motley I mean motley: it has four internal SATA 1 TB drives and three external Maxtor OneTouch 1 TB USB drives. I aggregated together all of these drives as a single raidz1 using ZFS.
Following a recent suggestion on here, before creating the raidz1 vdev I labelled each drive as "driveN" using glabel, e.g., "glabel label drive1 /dev/ad4". (I figured this would be important especially for the external USB drives, which might get plugged into different USB ports and thus probed in a different order to the one when the pool was created and hence shuffle device names.) When creating the pool, I used "zpool create backups raidz label/drive1 label/drive2 ...". That all worked for a week or so until today when I rebooted. One of the USB drives was not probed during boot and so was flagged as "REMOVED" when doing a "zpool status.": pool: backups state: DEGRADED scrub: none requested config: NAME STATE READ WRITE CKSUM backups DEGRADED 0 0 0 raidz1 DEGRADED 0 0 0 label/drive1 ONLINE 0 0 0 label/drive2 ONLINE 0 0 0 label/drive3 ONLINE 0 0 0 label/drive4 ONLINE 0 0 0 label/drive5 REMOVED 0 0 0 label/drive6 ONLINE 0 0 0 label/drive7 ONLINE 0 0 0 errors: No known data errors I unplugged and plugged in the REMOVED drive's cable to get it to probe. Eventually, the system appeared to recognise the drive and resilver: pool: backups state: ONLINE scrub: resilver completed after 0h0m with 0 errors on Fri Jul 31 07:54:22 2009 config: NAME STATE READ WRITE CKSUM backups ONLINE 0 0 0 raidz1 ONLINE 0 0 0 label/drive1 ONLINE 0 0 0 11.5K resilvered label/drive2 ONLINE 0 0 0 11K resilvered label/drive3 ONLINE 0 0 0 12K resilvered label/drive4 ONLINE 0 0 0 11.5K resilvered label/drive5 ONLINE 0 0 0 17.5K resilvered label/drive6 ONLINE 0 0 0 13K resilvered label/drive7 ONLINE 0 0 0 11.5K resilvered errors: No known data errors I rebooted again, but, once more, the drive did not probe during boot, so I had to force it to probe by unplugging and plugging in its USB cable. 
This time, however, the drive was mis-identified in the pool as "da2"
instead of "label/drive5" and, in fact, /dev/label/drive5 was missing:

  pool: backups
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Fri Jul 31 07:59:43 2009
config:

        NAME              STATE     READ WRITE CKSUM
        backups           ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            label/drive1  ONLINE       0     0     0  8.50K resilvered
            label/drive2  ONLINE       0     0     0  10K resilvered
            label/drive3  ONLINE       0     0     0  9K resilvered
            label/drive4  ONLINE       0     0     0  10K resilvered
            da2           ONLINE       0     0     0  11.5K resilvered
            label/drive6  ONLINE       0     0     0  7.50K resilvered
            label/drive7  ONLINE       0     0     0  8.50K resilvered

errors: No known data errors

$ ls /dev/label
drive1 drive2 drive3 drive4 drive6 drive7

For some reason, the label was not being detected properly.

When I rebooted, things went from bad to worse. I now have two "da2"
devices showing up in my raidz vdev and this time my label/drive7 has
disappeared. This seems to have thrown ZFS for a loop and my vdev is
corrupted:

  pool: backups
 state: DEGRADED
status: One or more devices could not be used because the label is missing
        or invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        backups           DEGRADED     0     0     0
          raidz1          DEGRADED     0     0     0
            label/drive1  ONLINE       0     0     0
            label/drive2  ONLINE       0     0     0
            label/drive3  ONLINE       0     0     0
            label/drive4  ONLINE       0     0     0
            da2           FAULTED      0     0     0  corrupted data
            label/drive6  ONLINE       0     0     0
            da2           ONLINE       0     0     0

errors: No known data errors

$ ls /dev/label
drive1 drive2 drive3 drive4 drive5 drive6

When I boot up in single-user mode all of my original "driveN" labels
(1-7) show up. However, right now, with ZFS active, label/drive7 refuses
to appear.

Is there a problem with ZFS and labels? Does anyone have any suggestions
for how to repair this pool?
I'm presuming I can't do a "zpool replace backups da2 /dev/label/drive5"
to repair the faulted drive because I now have two "da2" devices in my
vdev.

As a sort of related question, is there a better way to create a pool out
of these devices yet still maximise the amount of storage (allowing for
some redundancy)? For example, would it be better to do something like
this:

    zpool create backups raidz label/sata1 label/sata2 label/sata3 label/sata4 \
        raidz label/usb1 label/usb2 label/usb3

(where "sataN" are the internal SATA drives and "usbN" are the external
USB drives) to place the internal and external drives into separate vdevs
(albeit losing an extra drive of storage space to parity)? (Would that
improve I/O speeds? I'm guessing it should.) Or, is it just storing up
trouble to try and mix these USB devices into the pool as I am now and I'd
be best off trying to lobby for an eSATA enclosure if I want to use
external drives?

Cheers,

Paul.

From wjw at digiware.nl Fri Jul 31 14:52:37 2009
From: wjw at digiware.nl (Willem Jan Withagen)
Date: Fri Jul 31 14:52:42 2009
Subject: Gmirror rebuilding
Message-ID: <4A7305A9.3080506@digiware.nl>

Hi,

I lost one of my disks in a gmirror, so I inserted a fresh one.
And thus far things went rather smoothly, it started rebuilding
automagically.

That's good, but what isn't is:

Jul 31 16:43:15 www kernel: ad2: FAILURE - READ_DMA status=51 error=40 LBA=16344448
Jul 31 16:43:15 www kernel: GEOM_MIRROR: Synchronization request failed (error=5). mirror/mirror[READ(offset=8368291840, length=131072)]
Jul 31 16:43:40 www kernel: ad2: FAILURE - READ_DMA status=51 error=40 LBA=16910976
Jul 31 16:43:40 www kernel: GEOM_MIRROR: Synchronization request failed (error=5). mirror/mirror[READ(offset=8658354176, length=131072)]

and ad2 is the original disk. So somewhere I'm left with corrupt files.

And what's worse, once this happens, geom_mirror does not continue with
the remainder of the disk...
It claims it is, but there is no activity at all on the disks.

So what to do????

Hard way out would be to make a backup, reinstall the basic system with
another/second fresh harddisk, and recover the backup.
But that is a lot of work.

Why doesn't geom_mirror continue with the remainder of the disk?

--WjW

From lists at jnielsen.net Fri Jul 31 15:39:20 2009
From: lists at jnielsen.net (John Nielsen)
Date: Fri Jul 31 15:39:27 2009
Subject: Gmirror rebuilding
In-Reply-To: <4A7305A9.3080506@digiware.nl>
References: <4A7305A9.3080506@digiware.nl>
Message-ID: <200907311118.33490.lists@jnielsen.net>

On Friday 31 July 2009 10:54:33 Willem Jan Withagen wrote:
> I lost one of my disks in a gmirror, so I inserted a fresh one.
> And thus far things went rather smoothly, it started rebuilding
> automagically.
>
> That's good, but what isn't is:
>
> Jul 31 16:43:15 www kernel: ad2: FAILURE - READ_DMA
> status=51 error=40 LBA=16344448
> Jul 31 16:43:15 www kernel: GEOM_MIRROR: Synchronization request failed
> (error=5). mirror/mirror[READ(offset=8368291840, length=131072)]
> Jul 31 16:43:40 www kernel: ad2: FAILURE - READ_DMA
> status=51 error=40 LBA=16910976
> Jul 31 16:43:40 www kernel: GEOM_MIRROR: Synchronization request failed
> (error=5). mirror/mirror[READ(offset=8658354176, length=131072)]
>
> and ad2 is the original disk. So somewhere I'm left with corrupt files.
>
> And what's worse, once this happens, geom_mirror does not continue with the
> remainder of the disk...
> It claims it is, but there is no activity at all on the disks.
>
> So what to do????
>
> Hard way out would be to make a backup, reinstall the basic system with
> another/second fresh harddisk, and recover the backup.
> But that is a lot of work.

If you have a backup already then restoring from it would be the best
option. If you don't then the "hard" way above is a good second (assuming
you can read enough of your remaining disk to get your backup tool to
cooperate).
You should make a backup in any case, but if you want to try to avoid
reinstalling you could do some dd trickery.

Remove the new disk from the mirror.
Create a new mirror containing only the new disk.
Run dd if=/dev/mirror/ of=/dev/mirror/ conv=noerror,sync

This will take a long time with the default block size (512 bytes, one
sector), but the plus side is that you only lose the data from the sectors
that cannot be read. Depending on the extent of the damage to the original
disk and your level of desperation, you may want to make note of which
sectors fail to copy on the first run and try to copy them again
(dd if=... of=... skip=NN seek=NN count=1). See also sysutils/ddrescue,
sysutils/recoverdm and similar (I haven't used any of them).

If you get a (mostly) viable clone, run fsck -f on it, assess the damage,
update your fstab(s), reboot and add a second new disk to your new mirror.

> Why doesn't geom_mirror continue with the remainder of the disk?

I'll leave this question for someone else, but I suspect the behavior is
intentional. As you say, you now have corruption on your volume so the
best recourse is to restore from a known good backup.

JN
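
For anyone who wants to try the dd steps above without touching a real disk first, here is a minimal sketch. It is only an illustration: the two scratch files (SRC and DST, hypothetical names) stand in for the old and new mirror providers, and NN=5 is an arbitrary sector picked for the retry. The sector arithmetic at the top assumes that offsets on the mirror provider map one-to-one onto 512-byte sectors of the underlying disk, which the thread does not confirm.

```shell
#!/bin/sh
# Sketch only: scratch files stand in for the old and new mirror devices.

BS=512                      # one sector, dd's default block size

# Convert the byte offset from the first GEOM_MIRROR log line into a
# sector range (assumes mirror offsets map 1:1 onto disk sectors).
OFFSET=8368291840
LENGTH=131072
FIRST=$((OFFSET / BS))
LAST=$((FIRST + LENGTH / BS - 1))
echo "failed request covers sectors $FIRST to $LAST"

SRC=$(mktemp /tmp/ddsrc.XXXXXX)   # hypothetical stand-in for the old disk
DST=$(mktemp /tmp/dddst.XXXXXX)   # hypothetical stand-in for the new mirror

# Build an 8-sector source with arbitrary content.
dd if=/dev/urandom of="$SRC" bs="$BS" count=8 2>/dev/null

# First pass: noerror keeps dd going after a read error, and sync pads
# each short input block with NULs so later sectors keep their offsets.
dd if="$SRC" of="$DST" bs="$BS" conv=noerror,sync 2>/dev/null

# Retry a single sector in place. notrunc stops dd from truncating the
# scratch file after the copied sector; it makes no difference on a device.
NN=5
dd if="$SRC" of="$DST" bs="$BS" skip="$NN" seek="$NN" count=1 \
    conv=noerror,sync,notrunc 2>/dev/null

if cmp -s "$SRC" "$DST"; then STATUS="match"; else STATUS="differ"; fi
echo "clone: $STATUS"

rm -f "$SRC" "$DST"
```

With healthy scratch files both passes succeed and the copies compare equal. On a failing disk, the sectors dd could not read would come out as zeros in the clone, which is why fsck is still needed afterwards.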