newfs returns cg 0: bad magic number
Konstantin Belousov
kostikbel at gmail.com
Fri Jul 7 06:24:02 UTC 2017
On Fri, Jul 07, 2017 at 12:12:49AM +0200, Willem Jan Withagen wrote:
> On 5-7-2017 08:55, Bruce Evans wrote:
> > On Wed, 5 Jul 2017, Konstantin Belousov wrote:
> >
> >> On Wed, Jul 05, 2017 at 02:00:43AM +0200, Willem Jan Withagen wrote:
> >>> Hi,
> >>>
> >>> I'm able to create a Ceph RBD backed ggate disk, in /dev/ggate0.
> >>> It looks like I can:
> >>> run dd on it
> >>> gpart the disk
> >>> create a zpool on it
> >>>
> >>> But when I try to create a UFS file system on it, newfs complains
> >>> right off the bat.
> >>>
> >>> # sudo newfs -E /dev/ggate0p1
> >>> /dev/ggate0p1: 1022.0MB (2093056 sectors) block size 32768, fragment size 4096
> >>> using 4 cylinder groups of 255.53MB, 8177 blks, 32768 inodes.
> >>> Erasing sectors [128...2093055]
> >>> super-block backups (for fsck_ffs -b #) at:
> >>> 192, 523520, 1046848, 1570176
> >>> cg 0: bad magic number
> >>>
> >>> Googling suggests that this is an on-and-off problem with new devices, but
> >>> there is no generic suggestion on how to debug this.
> >>>
> >>> Any/all suggestions are welcome,
> >> Typically this error means that the drive returns wrong data, not the
> >> bytes that were written to it and expected to be read.
> >
> > This might be for writing to a nonexistent sector. Checking for write
> > errors was broken by libufs, so some write errors are only sometimes
> > detected as a side effect of reading back garbage.
> >
> > I use the following quick fix (the patch also fixes some style bugs).
> >
> > X Index: mkfs.c
> > X ===================================================================
> > X RCS file: /home/ncvs/src/sbin/newfs/mkfs.c,v
> > X retrieving revision 1.85
> > X diff -u -1 -r1.85 mkfs.c
> > X --- mkfs.c 9 Apr 2004 19:58:33 -0000 1.85
> > X +++ mkfs.c 7 Apr 2005 23:51:56 -0000
> > X @@ -437,16 +441,19 @@
> > X if (!Nflag && Oflag != 1) {
> > X - i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, SBLOCKSIZE);
> > X + i = bread(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy,
> > X + SBLOCKSIZE);
> > X if (i == -1)
> > X - err(1, "can't read old UFS1 superblock: %s", disk.d_error);
> > X -
> > X + err(1, "can't read old UFS1 superblock: %s",
> > X + disk.d_error);
> > X if (fsdummy.fs_magic == FS_UFS1_MAGIC) {
> > X fsdummy.fs_magic = 0;
> > X - bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy, SBLOCKSIZE);
> > X + bwrite(&disk, SBLOCK_UFS1 / disk.d_bsize, chdummy,
> > X + SBLOCKSIZE);
> > X for (i = 0; i < fsdummy.fs_ncg; i++)
> > X - bwrite(&disk, fsbtodb(&fsdummy, cgsblock(&fsdummy, i)),
> > X - chdummy, SBLOCKSIZE);
> > X + bwrite(&disk,
> > X + fsbtodb(&fsdummy, cgsblock(&fsdummy, i)),
> > X + chdummy, SBLOCKSIZE);
> > X }
> > X }
> > X - if (!Nflag)
> > X - sbwrite(&disk, 0);
> > X + if (!Nflag && sbwrite(&disk, 0) != 0)
> > X + err(1, "sbwrite: %s", disk.d_error);
> > X if (Eflag == 1) {
> > X @@ -518,4 +525,4 @@
> > X }
> > X - if (!Nflag)
> > X - sbwrite(&disk, 0);
> > X + if (!Nflag && sbwrite(&disk, 0) != 0)
> > X + err(1, "sbwrite: %s", disk.d_error);
> > X for (i = 0; i < sblock.fs_cssize; i += sblock.fs_bsize)
> >
> > libufs broke the error handling for the most important writes -- to
> > the superblock. Error handling is still done almost correctly in
> > wtfs(), and most writes are still done using wtfs() which is now
> > just a wrapper which adds error handling to libufs's bwrite(3), but
> > writes to superblock are (were) now done internally by libufs's
> > sbwrite(3) which (like most of libufs) is too hard to use.
> >
> > Note that -current needs a slightly different fix. Part of libufs
> > being too hard to use is that it is a library so it can't just exit
> > for errors. It returns errors in the string disk.d_error and the
> > fix uses that for newfs, unlike for most other calls to sbwrite(3).
> > However, newfs no longer uses sbwrite(3). It uses a wrapper
> > do_sbwrite() which reduces to pwrite(2). The wrapper doesn't set
> > d_error, so it is incompatible with sbwrite(3).
> >
> > This is an example of libufs being even harder to use than might first
> > appear. The version with the do_sbwrite() wrapper fixes a previous
> > version which replaced bwrite(3) instead of wrapping it. bwrite()
> > in the application conflicted with bwrite(3) in libufs, since libufs
> > is not designed to have its internals replaced by inconsistent parts
> > like that. Apparently, a special case is only needed for superblock
> > writes, and do_sbwrite() does that, and since libufs doesn't call any
> > sbwrite() function internally there is no need to replace sbwrite(3);
> > sbwrite(3) is just useless for its main application. All that the
> > bwrite(3) and sbwrite(3) library functions do is handle the block
> > size implicitly in a way that makes them harder to use than just
> > multiplying by the block size like wtfs() used to do and do_sbwrite()
> > now does.
>
> This is where the trouble originates:
> /usr/srcs/11/src/lib/libufs/sblock.c:148
> /*
> * Write superblock summary information.
> */
> blks = howmany(fs->fs_cssize, fs->fs_fsize);
> space = (uint8_t *)disk->d_sbcsum;
> for (i = 0; i < blks; i += fs->fs_frag) {
>
> But:
>
> (gdb) p disk->d_sbcsum
> $19 = (struct csum *) 0x0
>
> and this pointer is later on used to write:
> for (i = 0; i < blks; i += fs->fs_frag) {
> size = fs->fs_bsize;
> if (i + fs->fs_frag > blks)
> size = (blks - i) * fs->fs_fsize;
> if (bwrite(disk, fsbtodb(fs, fs->fs_csaddr + i), space, size)
> == -1) {
> ERROR(disk, "Failed to write sb summary information");
> return (-1);
> }
> space += size;
> }
>
> But bwrite() returns an error because the underlying pwrite() tries to write
> 4096 bytes from a null pointer, which it rejects.
>
> Now the question is: why isn't d_sbcsum filled in?
> Note that the disk is filled with random data.
>
> I've been looking for quite some time, but I just don't get it.
> Where should the superblock come from if a whole disk is being used?
> (So there is no MBR or gpart table written: dangerously dedicated.)
Indeed, I am not sure what you are reporting there. newfs(8) does not use the
sbwrite() function from libufs. Set a breakpoint on sbwrite() and capture
the backtrace of the call.