OCE and GPT

Lister lister at kawashti.org
Thu Apr 22 15:22:04 UTC 2010


Hello all,

I'd like to make a few clarifications first:
1. All my systems are AMD64 and either 7.1-REL or 8.0-REL.
2. The GPT on the 5TB RAID5 I want to expand is on 7.1, which has both gpt and gpart; I didn't know about gpart when I built that RAID. Partition 3 is the last one, 3.6TB and 87% full, and is the one desperately needing expansion.
3. The GPT on the 4TB RAID is on 8.0. I just built it a few days ago for a project (not my own). All nine of its partitions are still empty (just newfs'd), so I can use it temporarily for the experiment.

Certainly, I'll share the results of the expansion experiment with you. It'll just be a day or so before I get there, as it evidently calls for a good deal of preparation.  Fortunately, I wrote a verbose Bash script to automate the process of creating a GPT, so I don't have to re-read everything when I need to do it again a few months down the line. It handles everything from deletion and destruction to creation, newfs, tunefs, /etc/fstab updates, mounting, and a 'df' summary display. To customize it, only a few 
variables need to be changed.  If anyone thinks this might come in handy someday, let me know and I'll post it.
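For a flavor of it, here's a heavily trimmed sketch of what such a script does. The device name, partition count and sizes below are placeholders, not my actual values, and it only echoes the commands until DRYRUN is switched off:

```shell
#!/bin/sh
# Trimmed sketch of a GPT (re)creation wrapper -- device and sizes are
# placeholders, and nothing runs for real while DRYRUN=1.
DISK="da0"
DRYRUN=1

run() {
    # Echo instead of execute while dry-running; flip DRYRUN to 0
    # only after checking every command against 'gpart show'.
    if [ "$DRYRUN" -eq 1 ]; then
        echo "WOULD RUN: $*"
    else
        "$@"
    fi
}

# Tear down the old layout (partitions first, then the scheme)
for i in 3 2 1; do
    run gpart delete -i "$i" "$DISK"
done
run gpart destroy "$DISK"

# Recreate the scheme and partitions (sizes are illustrative)
run gpart create -s gpt "$DISK"
run gpart add -t freebsd-ufs -s 800G "$DISK"
run gpart add -t freebsd-ufs -s 200G "$DISK"
run gpart add -t freebsd-ufs "$DISK"          # p3: rest of the disk

# Filesystems, mount, summary
for p in 1 2 3; do
    run newfs -U "/dev/${DISK}p${p}"
done
run mount -a
run df -h
```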

Now regarding the hexdump commands, I used them on the 8.0 system for a reference visual comparison. First, here's the output of gpart on that system:
/ :633: gpart show da0
=>        34  7812415421  da0  GPT  (3.6T)
          34    41943040    1  freebsd-ufs  (20G)
    41943074   188743680    2  freebsd-ufs  (90G)
   230686754    62914560    3  freebsd-ufs  (30G)
   293601314  4294967296    4  freebsd-ufs  (2.0T)
  4588568610   838860800    5  freebsd-ufs  (400G)
  5427429410  1111490560    6  freebsd-ufs  (530G)
  6538919970   209715200    7  freebsd-ufs  (100G)
  6748635170   629145600    8  freebsd-ufs  (300G)
  7377780770   434634685    9  freebsd-ufs  (207G)

Here are the commands. Note that I used Bash notation here for easy immediate recognition. I tested every one of them before submitting this message. They do exactly the same as the ones I originally used, which included absolute lengths (in decimal) and offsets (in decimal, hexadecimal, and 'b'-suffixed block variants).  I've thoroughly tested all combinations and they all produce the same result (which is nothing at all.)

# First 34 sectors of /dev/da0
hd -n $((512*34)) /dev/da0
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 01  |................|
000001c0  01 00 ee ff ff ff 01 00  00 00 ff ff ff ff 00 00  |................|
000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  45 46 49 20 50 41 52 54  00 00 01 00 5c 00 00 00  |EFI PART....\...|
00000210  54 69 12 d7 00 00 00 00  01 00 00 00 00 00 00 00  |Ti..............|
00000220  ff ff a7 d1 01 00 00 00  22 00 00 00 00 00 00 00  |........".......|
00000230  de ff a7 d1 01 00 00 00  80 b7 4f 66 87 4c df 11  |..........Of.L..|
00000240  97 12 00 e0 81 b3 63 76  02 00 00 00 00 00 00 00  |......cv........|
00000250  80 00 00 00 80 00 00 00  cd c5 e0 ec 00 00 00 00  |................|
00000260  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
[... edited for brevity]
# 34 sectors after the last partition
    hd -n $((512*34)) -s $((7377780770+434634685)) /dev/da0
-> frozen
# First 1 sector after the last partition
    hd -n 512 -s $((7377780770+434634685)) /dev/da0
-> frozen
# Last 1 sector of last partition
    hd -n 512 -s $((7377780770+434634685-1)) /dev/da0
-> frozen
# First 1 sector of last partition
    hd -n 512 -s 7377780770 /dev/da0
-> frozen
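As an alternative probe, dd with a sector-based skip should read the same region, since dd counts skip in bs-sized blocks. The total-sector figure below is just what the gpart output above implies (34 + 7812415421 usable + 33 for the secondary metadata), so treat it as an assumption rather than a diskinfo reading; the command is printed here instead of executed:

```shell
# Alternative probe with dd: skip is counted in bs-sized blocks, so
# sector numbers can be used directly.  TOTAL is derived from the
# 'gpart show' output above (34 + 7812415421 + 33) -- an assumption,
# not an actual diskinfo reading.  Printed rather than executed.
DISK=/dev/da0
TOTAL=7812415488
echo "dd if=$DISK bs=512 skip=$((TOTAL - 34)) count=34 | hd"
```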

This is interesting: since 7.1-REL has both gpt and gpart, I ran both on my 5TB array. Here's the output:
/ :1134: gpt show da0
       start        size  index  contents
           0           1         PMBR
           1           1         Pri GPT header
           2          32         Pri GPT table
          34  1677721600      1  GPT part - FreeBSD UFS/UFS2
  1677721634   419430400      2  GPT part - FreeBSD UFS/UFS2
  2097152034  7668367293      3  GPT part - FreeBSD UFS/UFS2
  9765519327          32         Sec GPT table
  9765519359           1         Sec GPT header

/ :1135: gpart show da0
=>        34  9765519293  da0  GPT  (4.5T)
          34  1677721600    1  freebsd-ufs  (800G)
  1677721634   419430400    2  freebsd-ufs  (200G)
  2097152034  7668367293    3  freebsd-ufs  (3.6T)

Obviously the output of gpt is more detailed, and it does reference the secondary GPT.  It incidentally led me to learn that the secondary metadata is one sector shorter than the primary, on account of the absent PMBR.
This also leads me to suggest that the implementers of gpart adopt the verbosity of gpt.  Would you concur?
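For the record, here is the sequence I intend to test, condensed from Andriy's steps quoted below. The start and size figures are the ones from my 'gpart show da0' output above; the p3 size is left open, since it depends on what OCE adds. The block only prints the plan, as these commands rewrite partition metadata:

```shell
# The planned destroy/recreate/grow sequence for the 5TB array.
# Printed only -- these commands rewrite partition metadata.
cat <<'EOF'
gpart show da0 > /root/gpt-layout.before   # keep several copies!
gpart delete -i 3 da0
gpart delete -i 2 da0
gpart delete -i 1 da0
gpart destroy da0
# ... run the 3Ware OCE expansion here; verify the space lands at the end ...
gpart create -s gpt da0
gpart add -t freebsd-ufs -b 34 -s 1677721600 da0   # p1, 800G, as before
gpart add -t freebsd-ufs -s 419430400 da0          # p2, 200G, as before
gpart add -t freebsd-ufs da0                       # p3, old data + new space
growfs /dev/da0p3                                  # stretch the filesystem
EOF
```

Per Andriy's caveat, whether growfs copes with a filesystem this large remains to be seen; that's exactly what the dummy run on the 4TB array is for.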

Kind regards,
Hatem Kawashti
----- Original Message ----- 
From: "Andriy Gapon" <avg at icyb.net.ua>
To: "Lister" <lister at kawashti.org>
Cc: <freebsd-geom at freebsd.org>
Sent: Thursday, April 22, 2010 11:58
Subject: Re: OCE and GPT


> on 21/04/2010 23:49 Lister said the following:
>> Hello All,
>>
>> I'd like to first thank Andrey Elsukov and Andriy Gapon for their
>> valuable contribution and very quick reply.
>> Given that the patch is not yet ready as I understand it, I'll go with
>> the alternate method of destroying and recreating the GPT. To that end I
>> have yet to ask 3 more questions:
>> 1. How do I make sure I have a valid secondary GPT? Neither gpt nor
>> gpart tell anything about it. Can I assume that if 'gpart show da0'
>> shows a proper layout and no error messages that the 2ry is valid?
>
> I think that should be sufficient.
>
>>    I tried to make a quick visual comparison on another system
>> (8.0-RELEASE this time) with a 4TB RAID5 that I just setup yesterday,
>> using gpart this time because I had to.  I used hexdump for the purpose,
>> dumping the first 34 sectors of /dev/da0, and on another ssh shell, THE
>> 34 sectors beyond the last partition.
>> hexdump of the second got nothing, it seemed to have frozen but would
>> break normally on CTRL+C. I've never seen the likes of this before.
>> In an attempt to troubleshoot, I narrowed the selection to only ONE
>> sector…same result. Then the last sector of the last partition…same
>> thing. Even dump of the first sector of the last partition exhibited
>> same behavior. The partition is viable, though.  I copied a 4.4GB file
>> to it over ssh without a problem and the data rate was consistent with
>> expectations.
>> I know this is a side issue, but is hexdump/hd known to have problems
>> with large devices, or perhaps 32/64-bit issues?
>> I forgot to mention that all my systems are AMD64.
>
> Can you provide the actual commands you used?
> Not to doubt your skills, but just to be sure.
> BTW, you can discover disk size with diskinfo tool, subtract 34 from that and
> use dd on that.
>
>> 2. Now assuming OCE adds the new space at the tail (which I have yet to
>> verify before proceeding), will 'growfs' serve the purpose of extending
>> newfs' work?
>>    Its man page doesn't reference gpt or gpart, but rather bsdlabel and
>> fdisk; something suggestive of the contrary.
>
> Theoretically growfs should work with filesystem data within a partition and
> should be agnostic to partition type.
> Practically, I am not sure.
> Also, there _could_ be issues with very large FS sizes.
>
> In your case it would be great if you could experiment with dummy data on a
> different system.  I.e. create something similar to what you have now, then grow
> it the way you want and see how it works out.
>
> Don't forget to share the results with us :)
>
>> 3. Does it make a difference if I use gpt or gpart to recreate the GPT,
>> given that I'd initially created it with gpt?
>
> I think that it's better to use gpart because gpt was deprecated.
> But I am not sure what version of FreeBSD you use, that may be important.
>
>> Note. My root fs and everything else beyond the library is on another
>> RAID1 (on the Motherboard).
>
> That's good, gives you more freedom in actions.
>
>> ----- Original Message ----- From: "Andriy Gapon" <avg at icyb.net.ua>
>> To: "Lister" <lister at kawashti.org>
>> Cc: <freebsd-geom at freebsd.org>
>> Sent: Wednesday, April 21, 2010 14:07
>> Subject: Re: OCE and GPT
>>
>>
>>> on 21/04/2010 12:21 Lister said the following:
>>>> Hi All,
>>>>
>>>> I have a 5TB RAID5 (/dev/da0) on a 3Ware controller supporting OCE.  I
>>>> partitioned it into p1, p2 & p3 using gpt on FreeBSD-7.1-RELEASE.
>>>> P3 is 3.5TB and is the one I need to expand by adding another 1TB drive
>>>> to the RAID. It is now 87% full.
>>>>
>>>> Both gpt and gpart don't allow resizing a partition.
>>>> Of course, backing up the RAID to another is not an option.
>>>>
>>>> I'm in a rather desperate situation and I'm willing to do whatever it
>>>> takes. If there's no current software solution, I'm willing to use a hex
>>>> editor to edit the disk directly if someone could advise me of the
>>>> layout of GPT as created by gpt- and gpart if different.  I used to do
>>>> this on MBR disks at times of necessity.
>>>
>>> If you make any mistake and lose your data, then don't blame me.
>>> Before trying what I suggest wait for a few days in case someone
>>> points out a
>>> mistake or suggests a better way.
>>>
>>> 1. Get current layout e.g. with 'gpart show'
>>> 2. Print (several copies of) it and don't lose it
>>> 3. Boot using Live CD (if da0 is your boot disk)
>>> 4. Undo the whole GPT layout using 'gpart delete' and 'gpart destroy'
>>> 5. Expand RAID (I hope OCE means that the new space will be added at
>>> the end)
>>> 6. Re-create the same layout but using the new size for p3
>>>
>>> Some notes:
>>> 1. Deleting/destroying/adding/creating partitions and scheme does not
>>> touch your
>>> data/filesystems; it operates only on sectors belonging to GPT metadata.
>>> 2. There are two copies of GPT metadata, one at the start of a disk,
>>> the other at
>>> the end; they both must be valid and provide the same information.
>>> -- 
>>> Andriy Gapon
>>> _______________________________________________
>>> freebsd-geom at freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-geom
>>> To unsubscribe, send any mail to "freebsd-geom-unsubscribe at freebsd.org"
>>
>
>
> -- 
> Andriy Gapon
> _______________________________________________
> freebsd-geom at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-geom
> To unsubscribe, send any mail to "freebsd-geom-unsubscribe at freebsd.org" 


