vinum raid5 subdisks keep changing length?

Tony Frank tfrank at optushome.com.au
Tue Feb 17 04:01:41 PST 2004


On Tue, Feb 17, 2004 at 05:53:06PM +1030, Greg 'groggy' Lehey wrote:
> On Tuesday, 17 February 2004 at 11:39:26 +1100, Tony Frank wrote:
> > On Tue, Feb 17, 2004 at 09:51:30AM +1030, Greg 'groggy' Lehey wrote:
> >> On Monday, 16 February 2004 at 22:04:44 +1100, Tony Frank wrote:

[... snip ...]

> OK, I tried almost exactly the same thing.  My disks are fractionally
> smaller than yours, so I took a different integral number of stripes.
> I didn't get any messages, and the config now looks like:
> 
> volume data
> plex name data.p0 org raid5 984s vol data
> sd name data.p0.s0 drive drive1 plex data.p0 len 8374824s driveoffset 265s plexoffset 0s
> sd name data.p0.s1 drive drive2 plex data.p0 len 8374824s driveoffset 265s plexoffset 984s
> sd name data.p0.s2 drive drive3 plex data.p0 len 8374824s driveoffset 265s plexoffset 1968s
> sd name data.p0.s3 drive drive4 plex data.p0 len 8374824s driveoffset 265s plexoffset 2952s
> sd name data.p0.s4 drive drive5 plex data.p0 len 8374824s driveoffset 265s plexoffset 3936s
> 
> I've stopped and started vinum a couple of times, and all works well.
> I then removed the objects and tried again with subdisks 4 sectors
> longer.  Vinum gives the message:
> 
> vinum: removing 16 blocks of partial stripe at the end of data.p0
> 
> printconfig is then identical with the previous version.

With no obvious known problems and this being a test system I have 
wiped the disks and started over.

My near exact steps:

Rebooted from 4.9-RELEASE CD1
Went to 'fixit' mode with live filesystem CD2

"Wiped" the disks:
dd if=/dev/zero of=/dev/da0 bs=512 count=32
fdisk -BI /dev/da0
dd if=/dev/zero of=/dev/da0s1 bs=512 count=32
disklabel -w -B da0s1 auto
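
A minimal dry-run sketch of the wipe steps above, repeated per disk. It only prints the commands rather than running them. The disk list is an assumption (the steps above show da0 only), and wipe_cmds is a hypothetical helper name.

```shell
#!/bin/sh
# Dry run: print the wipe commands for each disk instead of executing them.
# DISKS is an assumed list; only da0 is shown explicitly in the steps above.
DISKS="da0 da1 da2 da3"

# wipe_cmds DISK: print the four commands used above for one disk.
wipe_cmds() {
    disk=$1
    echo "dd if=/dev/zero of=/dev/${disk} bs=512 count=32"
    echo "fdisk -BI /dev/${disk}"
    echo "dd if=/dev/zero of=/dev/${disk}s1 bs=512 count=32"
    echo "disklabel -w -B ${disk}s1 auto"
}

for d in $DISKS; do
    wipe_cmds "$d"
done
```

Reviewing the printed commands before running any of them by hand is the point of the dry run, since each one destroys data on the named disk.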

First thing I notice is that the geometry for my SCSI disks differs by 1 cylinder
between da0 and da0s1.
I configured all the vinum partitions as da0s1h etc.

Anyway, then I went on to a 'standard' installation.
I selected ad0 (space), q -no changes (already had a partition from above steps)
Selected "BootMgr"
Repeated for each disk (ad0, ad2, da0-da3)

Deleted all the prelisted partitions & filesystems and performed an 'auto' based
on ad2 (leaves ad0, da0-da3 unused)

Selected "Minimal" installation type
Sourced from CD

Post install configured fxp0 for DHCP,
Accepted NFS client, selected no for all other options
Set timezone to Australia Victoria
Added an extra user "tony" with additional group wheel.
No ports, no packages, no other options.
Rebooted
System booted from ad2 with 4.9-RELEASE GENERIC kernel

Ran disklabel -e da0s1 and added an 'h' partition of:
#   h:  *        *  vinum
Repeated for ad0s1, da0s1-da3s1.

ad0 is size 16498692, da0-da3 is size 8803557

The SCSI disks being smaller, I use their size for the stripe calculations.

For stripe of 984s:
 (8803557 - 256) / 984 = 8946 (rounded down to a whole number)

 8946 * 984 = 8802864

 8802864 + 256 = 8803120 which is less than total drive size so it should fit.
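
The same arithmetic as a small shell sketch. sd_len is a hypothetical helper, not part of vinum; it relies on shell integer division truncating, which gives the round-down step.

```shell
#!/bin/sh
# sd_len DRIVE_SECTORS RESERVED_SECTORS STRIPE_SECTORS
# Print the largest subdisk length (in sectors) that is a whole number
# of stripes and still fits on the drive after the reserved area.
sd_len() {
    echo $(( ($1 - $2) / $3 * $3 ))
}

len=$(sd_len 8803557 256 984)
echo "len ${len}s"
echo "end of subdisk at sector $(( len + 256 ))"
```

Calling sd_len 8803557 265 984 (the driveoffset used in the config) yields the same length, since both reserved sizes leave room for 8946 whole stripes.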

I built a vinum config file, test-config, with just a raid5 volume and no vinum root:

### start test-config
drive vinumdrive0 device /dev/ad0s1h
drive vinumdrive1 device /dev/da0s1h
drive vinumdrive2 device /dev/da1s1h
drive vinumdrive3 device /dev/da2s1h
drive vinumdrive4 device /dev/da3s1h

volume data
 plex org raid5 984s
  sd drive vinumdrive0 len 8802864s driveoffset 265s
  sd drive vinumdrive1 len 8802864s driveoffset 265s
  sd drive vinumdrive2 len 8802864s driveoffset 265s
  sd drive vinumdrive3 len 8802864s driveoffset 265s
  sd drive vinumdrive4 len 8802864s driveoffset 265s
### end test-config
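
A config like this could also be generated rather than typed by hand. The sketch below is only an illustration: gen_config is a hypothetical helper, and it assumes the same naming (vinumdriveN), volume name, and 265s driveoffset as the file above.

```shell
#!/bin/sh
# gen_config LEN STRIPE DEVICE...
# Print a vinum config with one drive and one raid5 subdisk per device.
gen_config() {
    len=$1
    stripe=$2
    shift 2
    n=0
    for dev in "$@"; do
        echo "drive vinumdrive${n} device ${dev}"
        n=$((n + 1))
    done
    echo
    echo "volume data"
    echo " plex org raid5 ${stripe}s"
    i=0
    while [ "$i" -lt "$n" ]; do
        echo "  sd drive vinumdrive${i} len ${len}s driveoffset 265s"
        i=$((i + 1))
    done
}

gen_config 8802864 984 /dev/ad0s1h /dev/da0s1h /dev/da1s1h /dev/da2s1h /dev/da3s1h
```

Redirecting the output to a file gives something that can be passed to vinum create in the same way as test-config.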

Create the configuration:

raider# vinum create test-config
5 drives:
D vinumdrive0           State: up       Device /dev/ad0s1h      Avail: 3757/8056 MB (46%)
D vinumdrive1           State: up       Device /dev/da0s1h      Avail: 0/4298 MB (0%)
D vinumdrive2           State: up       Device /dev/da1s1h      Avail: 0/4298 MB (0%)
D vinumdrive3           State: up       Device /dev/da2s1h      Avail: 0/4298 MB (0%)
D vinumdrive4           State: up       Device /dev/da3s1h      Avail: 0/4298 MB (0%)

1 volumes:
V data                  State: down     Plexes:       1 Size:         16 GB

1 plexes:
P data.p0            R5 State: init     Subdisks:     5 Size:         16 GB

5 subdisks:
S data.p0.s0            State: empty    PO:        0  B Size:       4298 MB
S data.p0.s1            State: empty    PO:      492 kB Size:       4298 MB
S data.p0.s2            State: empty    PO:      984 kB Size:       4298 MB
S data.p0.s3            State: empty    PO:     1476 kB Size:       4298 MB
S data.p0.s4            State: empty    PO:     1968 kB Size:       4298 MB
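
The PO column is in kB while the config works in sectors, so the values look halved. A quick sketch of the conversion, assuming 512-byte sectors (po_kb is a hypothetical helper):

```shell
#!/bin/sh
# po_kb INDEX STRIPE_SECTORS
# Print the plex offset of subdisk INDEX in kB, assuming 512-byte
# sectors (2 sectors per kB).
po_kb() {
    echo $(( $1 * $2 / 2 ))
}

i=0
while [ "$i" -lt 5 ]; do
    echo "data.p0.s${i}: $(( i * 984 ))s = $(po_kb "$i" 984) kB"
    i=$((i + 1))
done
```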

Initialise the plex:

raider# vinum init data.p0
raider# vinum[215]: initializing subdisk /dev/vinum/sd/data.p0.s1
vinum[216]: initializing subdisk /dev/vinum/sd/data.p0.s2
vinum[217]: initializing subdisk /dev/vinum/sd/data.p0.s3
vinum[218]: initializing subdisk /dev/vinum/sd/data.p0.s4
vinum[214]: initializing subdisk /dev/vinum/sd/data.p0.s0

While waiting for my SCSI drives to write 4G worth of zeros
I had a bit of a review of /usr/src/sbin/vinum/commands.c

In particular, initsd does a few things that look
a little strange to me.
SSize is checked and initsize is set up early on.
Later SSize is checked again, and in the end SSize is used instead
of initsize.  A patch showing how I suspect it should work is attached.

While I was poking, my disks initialised:

subdisk /dev/vinum/sd/data.p0.s0 initialized
subdisk /dev/vinum/sd/data.p0.s4 initialized
subdisk /dev/vinum/sd/data.p0.s3 initialized
subdisk /dev/vinum/sd/data.p0.s2 initialized
subdisk /dev/vinum/sd/data.p0.s1 initialized

raider# vinum list
5 drives:
D vinumdrive0           State: up       Device /dev/ad0s1h      Avail: 3757/8056 MB (46%)
D vinumdrive1           State: up       Device /dev/da0s1h      Avail: 0/4298 MB (0%)
D vinumdrive2           State: up       Device /dev/da1s1h      Avail: 0/4298 MB (0%)
D vinumdrive3           State: up       Device /dev/da2s1h      Avail: 0/4298 MB (0%)
D vinumdrive4           State: up       Device /dev/da3s1h      Avail: 0/4298 MB (0%)

1 volumes:
V data                  State: up       Plexes:       1 Size:         16 GB

1 plexes:
P data.p0            R5 State: up       Subdisks:     5 Size:         16 GB

5 subdisks:
S data.p0.s0            State: up       PO:        0  B Size:       4298 MB
S data.p0.s1            State: up       PO:      492 kB Size:       4298 MB
S data.p0.s2            State: up       PO:      984 kB Size:       4298 MB
S data.p0.s3            State: up       PO:     1476 kB Size:       4298 MB
S data.p0.s4            State: up       PO:     1968 kB Size:       4298 MB
raider# newfs -v /dev/vinum/data
Warning: Block size and bytes per inode restrict cylinders per group to 22.
Warning: 1856 sector(s) in last cylinder unallocated
/dev/vinum/data:        35211456 sectors in 8597 cylinders of 1 tracks, 4096 sectors
        17193.1MB in 391 cyl groups (22 c/g, 44.00MB/g, 10944 i/g)
super-block backups (for fsck -b #) at:
 32, 90144, 180256, 270368, 360480, 450592, 540704, 630816, 720928, 811040, 901152, 991264, 1081376, 1171488, 1261600, 1351712, 1441824, 1531936, 1622048,
 1712160, 1802272, 1892384, 1982496, 2072608, 2162720, 2252832, 2342944, 2433056, 2523168, 2613280, 2703392, 2793504, 2883616, 2973728, 3063840, 3153952,
 3244064, 3334176, 3424288, 3514400, 3604512, 3694624, 3784736, 3874848, 3964960, 4055072, 4145184, 4235296, 4325408, 4415520, 4505632, 4595744, 4685856,
 4775968, 4866080, 4956192, 5046304, 5136416, 5226528, 5316640, 5406752, 5496864, 5586976, 5677088, 5767200, 5857312, 5947424, 6037536, 6127648, 6217760,
 6307872, 6397984, 6488096, 6578208, 6668320, 6758432, 6848544, 6938656, 7028768, 7118880, 7208992, 7299104, 7389216, 7479328, 7569440, 7659552, 7749664,
[... snip ...]
 33792032, 33882144, 33972256, 34062368, 34152480, 34242592, 34332704, 34422816, 34512928, 34603040, 34693152, 34783264, 34873376, 34963488, 35053600,
 35143712
raider# tunefs -n enable /dev/vinum/data
tunefs: soft updates set
raider# vinum printconfig built-config
raider# diff test-config built-config
0a1
> # Vinum configuration of raider.home.local, saved at Tue Feb 17 22:18:40 2004
6d6
<
8,14c8,13
<  plex org raid5 984s
<   sd drive vinumdrive0 len 8802864s driveoffset 265s
<   sd drive vinumdrive1 len 8802864s driveoffset 265s
<   sd drive vinumdrive2 len 8802864s driveoffset 265s
<   sd drive vinumdrive3 len 8802864s driveoffset 265s
<   sd drive vinumdrive4 len 8802864s driveoffset 265s
<
---
> plex name data.p0 org raid5 984s vol data
> sd name data.p0.s0 drive vinumdrive0 plex data.p0 len 8802864s driveoffset 265s plexoffset 0s
> sd name data.p0.s1 drive vinumdrive1 plex data.p0 len 8802864s driveoffset 265s plexoffset 984s
> sd name data.p0.s2 drive vinumdrive2 plex data.p0 len 8802864s driveoffset 265s plexoffset 1968s
> sd name data.p0.s3 drive vinumdrive3 plex data.p0 len 8802864s driveoffset 265s plexoffset 2952s
> sd name data.p0.s4 drive vinumdrive4 plex data.p0 len 8802864s driveoffset 265s plexoffset 3936s
raider#

All looks fine up to this point.
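
As a cross-check on the newfs figures, the sector count it reports should equal four subdisks' worth of data, since a RAID-5 plex gives up one subdisk's worth of space to parity. A small sketch (raid5_sectors is a hypothetical helper; 512-byte sectors assumed):

```shell
#!/bin/sh
# raid5_sectors SUBDISKS LEN
# Print the usable sectors of a RAID-5 plex with SUBDISKS subdisks of
# LEN sectors each; one subdisk's worth of space holds parity.
raid5_sectors() {
    echo $(( ($1 - 1) * $2 ))
}

total=$(raid5_sectors 5 8802864)
echo "usable sectors: ${total}"
# 2048 sectors of 512 bytes make 1 MB; integer division truncates.
echo "usable MB:      $(( total / 2048 ))"
```

The sector count matches the 35211456 sectors newfs printed, and the truncated MB figure agrees with its 17193.1MB.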

In fact /var/log/messages has all the vinum messages also and vinum_history shows what I have done.

I then added the data volume to fstab:

raider# cat /etc/fstab
# See the fstab(5) manual page for important information on automatic mounts
# of network filesystems before modifying this file.
#
# Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/ad2s1b             none            swap    sw              0       0
/dev/ad2s1a             /               ufs     rw              1       1
/dev/ad2s1f             /tmp            ufs     rw              2       2
/dev/ad2s1g             /usr            ufs     rw              2       2
/dev/ad2s1e             /var            ufs     rw              2       2
/dev/vinum/data         /data           ufs     rw              2       2
/dev/acd0c              /cdrom          cd9660  ro,noauto       0       0
proc                    /proc           procfs  rw              0       0

I also added vinum_load="YES" to /boot/loader.conf:

raider# cat /boot/loader.conf
# -- sysinstall generated deltas -- #
userconfig_script_load="YES"
vinum_load="YES"
raider#

Did a few more tests - vinum stop / vinum start etc - still no problems.

And reboot:

raider# shutdown -r now

/kernel and /modules/vinum.ko are loaded early on and system boots.
And immediately ends up in single user mode as I left out some important
bits from loader.conf like:
vinum.drives="/dev/ad0s1 /dev/da0s1 /dev/da1s1 /dev/da2s1 /dev/da3s1"

Fixed that and rebooted again.

This time everything starts normally.
No errors reported by vinum; however, the vinum startup messages
no longer appear in /var/log/messages, although they do still
display on the console.

vinum list shows everything as expected wrt devices & avail.
vinum printconfig output matches (except for the timestamp) that taken before the reboot.
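
One way to make that comparison mechanical is to drop the comment line carrying the timestamp before comparing. This is only a sketch: same_config and the file names are hypothetical, and the two sample files stand in for real vinum printconfig output.

```shell
#!/bin/sh
# same_config FILE1 FILE2
# Succeed if the two saved configurations are identical once comment
# lines (including the "saved at" timestamp line) are dropped.
same_config() {
    a=$(grep -v '^#' "$1")
    b=$(grep -v '^#' "$2")
    [ "$a" = "$b" ]
}

# Demo with two stand-in files that differ only in the comment line.
tmp=${TMPDIR:-/tmp}/vinumcfg.$$
mkdir -p "$tmp"
printf '# saved before reboot\nvolume data\n' > "$tmp/before"
printf '# saved after reboot\nvolume data\n' > "$tmp/after"
if same_config "$tmp/before" "$tmp/after"; then
    echo "configs match"
else
    echo "configs differ"
fi
rm -rf "$tmp"
```

On the real system the two inputs would come from running vinum printconfig with a file name before and after the reboot.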

Took vinum out of /boot/loader.conf and included start_vinum="YES" in /etc/rc.conf.
Rebooted, all appears ok still.
messages & dmesg show no record of vinum, but kldstat -v shows it, and I can see
vinum messages on the console.  The only one of 'interest' is:

vinum: /dev is mounted read-only, not rebuilding /dev/vinum

Tried rebooting several more times, including some vinum stop/start
sequences etc.
When I entered vinum stop, the vinum messages were logged to /var/log/messages.
Likewise when I entered vinum start the messages were also logged.
It seems that only the messages during boot are not captured.
This is the same whether I have vinum loaded by loader.conf or through rc.conf.

Despite this logging oddity everything else appears to work just fine using 4.9-RELEASE.

I am currently building world based on RELENG_4 cvsup from this evening.
Will try it all again with the new kernel & world probably tomorrow.

Main thing that is different is that I no longer have a vinum root environment.
I might try to rebuild that while I wait for world to build.
Vinum root needs a bit more planning from the start due to disk offsets and the like...

> > Please advise if I can help - I have plenty of free time this week
> > and am willing to get my hands dirty.
> Hmm.  Unfortunately, I don't have much time for the rest of the week.
> Does this mean you're not coming to the AUUG security symposium in
> Canberra?

No, perhaps I should keep myself more abreast of 'local' events.
AUUG hmm..  will have to ask about it at the next vicfug gathering.

> There's obviously something going on here which isn't immediately
> obvious.  Take a look at
> http://www.vinumvm.org/vinum/how-to-debug.html and send me the
> information I ask for there, and maybe we can track it down.

IMHO I did supply basically everything in the original email.
No core or panics to debug, the rest of the config & setup was listed.

Will report tomorrow on the outcome of further testing.
Scenarios:
data raid5 with vinum root on 4.9-RELEASE
plain data raid5 with RELENG_4
data raid5 with vinum root on RELENG_4

Thanks for your time, patch attached.

Tony


-------------- next part --------------
--- /usr/src/sbin/vinum/commands.c      Tue Jun 24 23:31:55 2003
+++ commands.c  Tue Feb 17 22:05:05 2004
@@ -37,7 +37,7 @@
  * advised of the possibility of such damage.
  *
  * $Id: commands.c,v 1.14 2000/11/14 20:01:23 grog Exp grog $
  * $FreeBSD: src/sbin/vinum/commands.c,v 1.31.2.6 2003/06/06 05:13:29 grog Exp $
  */

 #include <ctype.h>
@@ -465,9 +465,6 @@
     message->verify = vflag;                               /* verify what we write? */
     message->force = 1;                                            /* insist */
     ioctl(superdev, VINUM_SETSTATE, message);
-    if ((SSize > 0)                                        /* specified a size for init */
-    &&(SSize < 512))
-       SSize <<= DEV_BSHIFT;
     if (reply.error) {
        fprintf(stderr,
            "Can't initialize %s: %s (%d)\n",
@@ -483,7 +480,7 @@
            message->type = sd_object;                      /* and type of object */
            message->state = object_up;
            message->verify = vflag;                        /* verify what we write? */
-           message->blocksize = SSize;
+           message->blocksize = initsize;
            ioctl(superdev, VINUM_SETSTATE, message);
        }
        while (reply.error == EAGAIN);                      /* until we're done */


