GPT partitions on whole disk gmirror - questions about the metadata issue

Matthias Petermann matthias at petermann-it.de
Thu Nov 22 04:46:38 UTC 2018


Hello John,

Thank you for your answer and your thoughts on the subject. I'm 
completely with you as far as ZFS is concerned. I am glad that we have 
the choice in FreeBSD, and I prefer to make my decision according to the 
expected workload and the potential of the available hardware. In the 
meantime, I have run a few experiments to improve my understanding of 
gmirror metadata. I attach the log here, including my conclusions. From 
my naive point of view, the problems of GPT on a whole-disk gmirror are 
not that dramatic. I would be very happy to receive comments.

Kind regards,
Matthias


0) Preface

I am going to set up a gmirror using two file-backed memory disks as 
components. Once the gmirror is active, I create a GPT partitioning 
scheme on it, format a partition with UFS and read / write some data. 
After this I perform some re-partitioning steps to verify that the 
gmirror metadata in the last sector of each component stays intact. If 
the kernel outputs something in dmesg while doing this, I record the log 
entries with their timestamps.

1) Setup components

root@l-mpe-fbsd:/home/admin # truncate -s 512m vol1.img
root@l-mpe-fbsd:/home/admin # truncate -s 512m vol2.img
root@l-mpe-fbsd:/home/admin # mdconfig -a -t vnode -f vol1.img
md0
root@l-mpe-fbsd:/home/admin # mdconfig -a -t vnode -f vol2.img
md1

2) Build gmirror of components md0 and md1

root@l-mpe-fbsd:/home/admin # gmirror label test /dev/md0 /dev/md1
Nov 22 04:52:24 l-mpe-fbsd kernel: GEOM_MIRROR: Device mirror/test launched (2/2).


3) Create GPT partitioning scheme on test mirror

root@l-mpe-fbsd:/home/admin # gpart create -s gpt /dev/mirror/test
mirror/test created


4) Check last sectors of components

1FFFFBF0 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 45 46 
49 20 │ 50 41 52 54 │ 00 00 01 00  ................EFI PART....
1FFFFC0C 5C 00 00 00 │ 57 8E 30 76 │ 00 00 00 00 │ FE FF 0F 00 │ 00 00 
00 00 │ 01 00 00 00 │ 00 00 00 00  \...W.0v..................
1FFFFC28 28 00 00 00 │ 00 00 00 00 │ D7 FF 0F 00 │ 00 00 00 00 │ 4E C8 
72 1E │ 0A EE E8 11 │ A8 29 F0 DE  (.............Nr...)
1FFFFC44 F1 DD 9D 59 │ DE FF 0F 00 │ 00 00 00 00 │ 80 00 00 00 │ 80 00 
00 00 │ 86 D2 54 AB │ 00 00 00 00  .Y...............T....
1FFFFC60 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFC7C 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFC98 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFCB4 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFCD0 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFCEC 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD08 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD24 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD40 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD5C 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD78 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD94 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFDB0 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFDCC 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFDE8 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 47 45 4F 4D  ........................GEOM
1FFFFE04 3A 3A 4D 49 │ 52 52 4F 52 │ 00 FF FF FF │ 04 00 00 00 │ 74 65 
73 74 │ 00 00 00 00 │ 00 00 00 00  ::MIRROR.....test........  .
1FFFFE20 00 00 00 00 │ B0 49 4B F7 │ 73 07 3C A2 │ 02 00 00 00 │ 00 01 
00 00 │ 00 00 00 10 │ 00 00 02 00  ....IKs.<................  .
1FFFFE3C FE FF 1F 00 │ 00 00 00 00 │ 02 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 40  .........................@
1FFFFE58 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 20 00  .......................... .
1FFFFE74 00 00 00 58 │ 3C 77 29 C0 │ 9B 60 4D B4 │ 68 4E CF F9 │ CD FD 
5D 00 │ 00 00 00 00 │ 00 00 00 00  ...X<w).`MhN].........
1FFFFE90 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFEAC 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFEC8 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFEE4 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF00 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF1C 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF38 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF54 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF70 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF8C 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFFA8 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFFC4 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFFE0 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................

Looks good: the last block holds the gmirror label, and the block 
before it holds the secondary GPT header.
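To double-check this reading of the dump, the secondary GPT header can be decoded field by field. The following is only a sketch: the 92 bytes are transcribed by hand from the dump above (offset 0x1FFFFC00), the field layout is the standard GPT header layout from the UEFI specification, and the variable names are my own.

```python
import struct

# First 92 bytes of the block at offset 0x1FFFFC00, transcribed from the dump.
GPT_HEADER_HEX = (
    "4546492050415254"                  # signature "EFI PART"
    "00000100"                          # revision 1.0
    "5C000000"                          # header size (92 bytes)
    "578E3076"                          # CRC32 of the header (not checked here)
    "00000000"                          # reserved
    "FEFF0F0000000000"                  # LBA of this header
    "0100000000000000"                  # LBA of the other (primary) header
    "2800000000000000"                  # first usable LBA
    "D7FF0F0000000000"                  # last usable LBA
    "4EC8721E0AEEE811A829F0DEF1DD9D59"  # disk GUID
    "DEFF0F0000000000"                  # start LBA of the partition entry array
    "80000000"                          # number of partition entries
    "80000000"                          # size of one partition entry
    "86D254AB"                          # CRC32 of the partition entry array
)

fields = struct.unpack("<8sIIIIQQQQ16sQIII", bytes.fromhex(GPT_HEADER_HEX))
(signature, _revision, header_size, _crc, _reserved,
 my_lba, other_lba, first_usable, last_usable) = fields[:9]

SECTOR = 512
COMPONENT_SECTORS = 512 * 1024 * 1024 // SECTOR  # from "truncate -s 512m"

print(signature, header_size)
print("header LBA:", my_lba, "-> byte offset 0x%X" % (my_lba * SECTOR))
print("last LBA of a component:", COMPONENT_SECTORS - 1)
print("usable sectors:", last_usable - first_usable + 1)
```

The header records its own position as LBA 1048574, which is the second-to-last sector of a 1048576-sector component, and the usable range works out to 1048496 sectors, the same number newfs reports in step 6.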

5) Create GPT partition

root@l-mpe-fbsd:/home/admin # gpart add -t freebsd-ufs -l test /dev/mirror/test
mirror/testp1 added


6) Create filesystem

root@l-mpe-fbsd:/home/admin # newfs -j /dev/gpt/test
/dev/gpt/test: 512.0MB (1048496 sectors) block size 32768, fragment size 4096
         using 4 cylinder groups of 128.00MB, 4096 blks, 16384 inodes.
         with soft updates
super-block backups (for fsck_ffs -b #) at:
  192, 262336, 524480, 786624
Using inode 4 in cg 0 for 4194304 byte journal
newfs: soft updates journaling set
root@l-mpe-fbsd:/home/admin #


7) Mount filesystem, test and unmount

root@l-mpe-fbsd:/home/admin # mount /dev/gpt/test /mnt/
root@l-mpe-fbsd:/home/admin # echo "test" > /mnt/test.txt
root@l-mpe-fbsd:/home/admin # cat /mnt/test.txt
test
root@l-mpe-fbsd:/home/admin # umount /mnt/


8) Remove GPT partitioning scheme

root@l-mpe-fbsd:/home/admin # gpart destroy -F /dev/mirror/test
mirror/test destroyed


9) Check last sectors of components

1FFFFBF0 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFC0C 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFC28 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFC44 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFC60 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFC7C 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFC98 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFCB4 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFCD0 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFCEC 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD08 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD24 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD40 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD5C 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD78 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFD94 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFDB0 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFDCC 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFDE8 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 47 45 4F 4D  ........................GEOM
1FFFFE04 3A 3A 4D 49 │ 52 52 4F 52 │ 00 00 00 00 │ 04 00 00 00 │ 74 65 
73 74 │ 00 01 00 00 │ 00 00 00 00  ::MIRROR........test........
1FFFFE20 A0 07 5F E9 │ B0 49 4B F7 │ 73 07 3C A2 │ 02 00 00 00 │ 00 01 
00 00 │ 00 00 00 10 │ 00 00 02 00  ._IKs.<................
1FFFFE3C FE FF 1F 00 │ 00 00 00 00 │ 02 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 40  .........................@
1FFFFE58 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 20 00  .......................... .
1FFFFE74 00 00 00 BE │ 2B CF 02 21 │ 60 DB DE FB │ AB FC D2 03 │ 26 F6 
64 00 │ 00 00 00 00 │ 00 00 00 00  ...+.!`.&d.........
1FFFFE90 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFEAC 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFEC8 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFEE4 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF00 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF1C 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF38 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF54 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF70 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFF8C 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFFA8 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFFC4 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................
1FFFFFE0 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 
00 00 │ 00 00 00 00 │ 00 00 00 00  ............................

Looks good: the gmirror label in the last block is still intact, and 
the secondary GPT header has been removed correctly.
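The check I did by eye in steps 4 and 9 could also be scripted. The following is a rough sketch, not something I ran as part of the experiment: the helper names are hypothetical, and the demo uses a tiny synthetic image rather than real data. Pointing it at a backing file such as vol1.img assumes that the file reflects what the memory disk has written.

```python
import os
import tempfile

SECTOR = 512
KNOWN = {b"EFI PART": "GPT header", b"GEOM::MIRROR": "gmirror metadata"}

def classify(block):
    """Name a sector by its leading signature, or call it zeros / other."""
    for magic, name in KNOWN.items():
        if block.startswith(magic):
            return name
    return "zeros" if not any(block) else "other"

def tail_signatures(path, sector=SECTOR):
    """Classify the second-to-last and the last sector of a regular file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(size - 2 * sector)
        tail = f.read(2 * sector)
    return [classify(tail[:sector]), classify(tail[sector:])]

def make_image(with_gpt, sectors=8):
    """Write a tiny synthetic image (NOT real data) that mimics the layout."""
    blocks = [bytes(SECTOR)] * sectors
    blocks[-1] = b"GEOM::MIRROR".ljust(SECTOR, b"\0")
    if with_gpt:
        blocks[-2] = b"EFI PART".ljust(SECTOR, b"\0")
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"".join(blocks))
    return path

# Mimic the state after step 3 (GPT present) and after step 8 (GPT destroyed).
for with_gpt in (True, False):
    path = make_image(with_gpt)
    print(with_gpt, tail_signatures(path))
    os.remove(path)
```

The two synthetic cases correspond to the two dumps: gmirror metadata in the last sector both times, with the sector before it holding either a GPT header or only zeros.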

10) Re-create GPT partitioning scheme, partition and format

root@l-mpe-fbsd:/home/admin # gpart create -s gpt /dev/mirror/test
mirror/test created
root@l-mpe-fbsd:/home/admin # gpart add -t freebsd-ufs -l test /dev/mirror/test
mirror/testp1 added
root@l-mpe-fbsd:/home/admin # newfs -j /dev/gpt/test
/dev/gpt/test: 512.0MB (1048496 sectors) block size 32768, fragment size 4096
         using 4 cylinder groups of 128.00MB, 4096 blks, 16384 inodes.
         with soft updates
super-block backups (for fsck_ffs -b #) at:
  192, 262336, 524480, 786624
Using inode 4 in cg 0 for 4194304 byte journal
newfs: soft updates journaling set
root@l-mpe-fbsd:/home/admin #


11) Stop mirror

root@l-mpe-fbsd:/home/admin # gmirror stop test
Nov 22 05:06:44 l-mpe-fbsd kernel: GEOM_MIRROR: Device test: provider destroyed.
Nov 22 05:06:44 l-mpe-fbsd kernel: GEOM_MIRROR: Device test destroyed.
Nov 22 05:06:44 l-mpe-fbsd kernel: GEOM: md1: the secondary GPT header is not in the last LBA.
Nov 22 05:06:44 l-mpe-fbsd kernel: GEOM: md0: the secondary GPT header is not in the last LBA.
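The kernel message fits the sizes that gmirror itself records. As a sketch, the start of the gmirror metadata block from the dump in step 4 (offset 0x1FFFFE00) can be decoded as follows. Please treat the field order as an assumption: it follows my reading of struct g_mirror_metadata in FreeBSD's sys/geom/mirror/g_mirror.h and has not been confirmed by anyone in this thread. The bytes are transcribed by hand from the dump.

```python
import struct

# First 71 bytes of the last block, transcribed from the dump in step 4.
# ASSUMPTION: field order as in struct g_mirror_metadata (g_mirror.h).
MD_HEX = (
    "47454F4D3A3A4D4952524F5200FFFFFF"  # magic "GEOM::MIRROR" plus padding
    "04000000"                          # metadata version
    "74657374000000000000000000000000"  # mirror name "test"
    "B0494BF7"                          # mirror id
    "73073CA2"                          # disk id
    "02"                                # number of disks
    "00000000"                          # generation id
    "01000000"                          # synchronization id
    "00"                                # priority
    "00100000"                          # slice size
    "02"                                # balance algorithm
    "00FEFF1F00000000"                  # size of the mirror provider in bytes
    "00020000"                          # sector size
)

(magic, version, name, _mid, _did, ndisks, _genid, _syncid,
 _priority, _slice, _balance, mediasize, sectorsize) = struct.unpack(
    "<16sI16sIIBIIBIBQI", bytes.fromhex(MD_HEX))

component_size = 512 * 1024 * 1024  # from "truncate -s 512m"
mirror_last_lba = mediasize // sectorsize - 1
component_last_lba = component_size // sectorsize - 1

print(name.rstrip(b"\0"), "version", version, "disks", ndisks)
print("mirror size:", mediasize, "component size:", component_size)
print("last LBA of mirror/test:", mirror_last_lba)
print("last LBA of md0 / md1:  ", component_last_lba)
```

If this decoding is right, mirror/test is exactly one sector smaller than its components. gpart puts the secondary GPT header in the last LBA of the mirror (1048574), which is the second-to-last LBA of md0 and md1, so once the mirror is stopped the kernel finds the header one sector before the end of each component.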

12) Conclusion

From the observations I conclude:

* As long as no disk previously partitioned with GPT is added to or 
converted into a gmirror, there is no danger of overwriting metadata.

* As long as the gmirror is active, both the primary and the secondary 
GPT header are correctly recognized and managed by the kernel.

* If the gmirror is stopped, the kernel sees the components as single 
disks again. Their secondary GPT header is then not found in the last 
LBA because it sits one block too early. The kernel reports this (as 
seen in step 11).

From an extended experiment on real hardware (two physical disks with 
the root filesystem instead of memory disks) I conclude:

* When geom_mirror is loaded via loader.conf, the gmirror becomes active 
early enough that the kernel does not notice the inconsistency (the 
position of the secondary GPT header).

What I would like to know:

* What are the possible consequences if the UEFI firmware does not find 
the secondary GPT header where it expects it? Is it conceivable that a 
system refuses to boot because of this?

* What potential problems have I missed?


On 21.11.2018 at 20:42, freebsd at johnea.net wrote:
> 
> Hello Matthias,
> 
> I'm very interested in this question as well, but I'm afraid I can't offer authoritative answers 8-(
> 
> There is a similar subject discussed in this blog post:
> http://blog.frankleonhardt.com/2017/zfs-is-not-always-the-answer-bring-back-gmirror/
> 
> Mostly in the final comment at the bottom.
> 
> Warren Block has offered very helpful advice regarding freebsd storage in general.
> 
> Specifically these two howto's are related:
> http://www.wonkity.com/~wblock/docs/html/disksetup.html
> http://www.wonkity.com/~wblock/docs/html/gmirror.html
> 
> But this conflict in the last sector remains a problem.
> 
> With the broader industry scope of GPT, versus the FreeBSD-specific nature of gmirror, it is obviously gmirror that will need to change if these two systems are to be made cooperative.
> 
> To my eye, ZFS seems heavy handed, and loaded with many features that aren't critical for my applications. (personally I liken it to btrfs) I hope the conflict of whole disk gmirror containing GPT partitions can be resolved.
> 
> This GPT/gmirror conflict has been around for years now, but as the prevalence of drives > 2TB becomes almost universal, the old process of using MBR for whole disk gmirrors is becoming increasingly ineffective.
> 
> I hope someone knowledgeable in the internals of gmirror can speak to your question...
> 
> johnea
> 
> 
> On 11/20/18 8:47 PM, Matthias Petermann wrote:
>> Hello,
>>
>> in section 18.3 of the FreeBSD handbook[1] there is a warning regarding
>> using whole disc mirroring with gmirror together with GPT:
>>
>> "gmirror(8) stores one block of metadata at the end of the disk. Because
>> GPT partition schemes also store metadata at the end of the disk,
>> mirroring entire GPT disks with gmirror(8) is not recommended. MBR
>> partitioning is used here because it only stores a partition table at
>> the start of the disk and does not conflict with the mirror metadata."
>>
>> Why is it that gmirror does not represent the mirrored device for the
>> levels above it in a way that does not allow access to the last block
>> containing the metadata? Would it be enough to simply mimic a smaller
>> device, one block less than the underlying providers?
>>
>> In the case mentioned above, the workaround is to first partition both
>> providers using GPT and then form gmirrors from two GPT partitions each.
>> In this case, the metadata problem should not be critical because
>> gmirror exists within the partition itself. Here is my further question:
>> how does the file system (UFS) ensure that e.g. newfs does not overwrite
>> the last block of a gmirror in this setting?
>>
>> Best regards,
>> Matthias
>>
>> [1] https://www.freebsd.org/doc/handbook/geom-mirror.html
>>

-- 
Matthias Petermann <matthias at petermann-it.de> | www.petermann-it.de
GnuPG: 0x5C3E6D75 | 5930 86EF 7965 2BBA 6572  C3D7 7B1D A3C3 5C3E 6D75

