newfs returns cg 0: bad magic number

Sat Jul 8 00:13:01 UTC 2017

On Jul 7, 2017 3:16 PM, "Willem Jan Withagen" <wjw at digiware.nl> wrote:

On 7-7-2017 16:23, Bruce Evans wrote:
> On Fri, 7 Jul 2017, Konstantin Belousov wrote:

I reverted everything I had changed, and now have the following change,
which diagnoses the errors, against 11.1RC1:

for mkfs.c:
532,533c533,534
<       if (!Nflag)
<               do_sbwrite(&disk);
---
>       if (!Nflag && do_sbwrite(&disk) == -1)
>             err(1, "do_sbwrite(%d): ", __LINE__ );
601c602,603
<               do_sbwrite(&disk);
---
>               if (do_sbwrite(&disk) == -1)
>                     err(1, "do_sbwrite(%d): ", __LINE__ );

But that brings me back to the original issue:
cg 0: bad magic number
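
The diff above only makes a failing do_sbwrite() visible. The same idea for
a plain pwrite(2) can be sketched roughly as below. checked_pwrite() is a
hypothetical helper written for illustration, not something from mkfs.c or
libufs, and treating a short write as an error is an assumption of this
sketch:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from mkfs.c): write exactly len bytes at off
 * and report the caller's line number on failure, as the diff does with
 * __LINE__.  Returns 0 on success, -1 on error or short write.
 */
static int
checked_pwrite(int fd, const void *buf, size_t len, off_t off, int line)
{
	ssize_t n;

	assert(buf != NULL);
	n = pwrite(fd, buf, len, off);
	if (n == -1) {
		fprintf(stderr, "pwrite (line %d): %s\n", line,
		    strerror(errno));
		return (-1);
	}
	if ((size_t)n != len) {
		/* Assumption: a short write counts as a failure here. */
		fprintf(stderr, "pwrite (line %d): short write %zd of %zu\n",
		    line, n, len);
		return (-1);
	}
	return (0);
}
```

A caller would pass __LINE__ as the last argument, so the message points at
the call site in the same way the err() calls in the diff do.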

> For newfs, there are a lot of silly complications involving the block
> size:
> - the initialization of disk.d_bsize to sectorsize (from the firmware
>   or label) or even the -S option doesn't seem to be good for anything.
>   Actually, the comment explains this.  It says that "Our blocks = sector
>   size".  This is not what libufs wants.  It is what was convenient for
>   newfs before libufs.  It sometimes works for libufs too, but is not
>   documented to work (no method for initializing disk.d_bsize is
>   documented).
> - oops, the wrapper do_sbwrite() is not to handle complications for the
>   block size.  It is to add the partition offset.  For most i/o's newfs
>   tells libufs the full offset.  Superblock i/o's are special because the
>   offsets are passed implicitly and the implicit values don't contain the
>   offset.

# newfs /dev/ggate0
/dev/ggate0: 64.0MB (131072 sectors) block size 32768, fragment size 4096
        using 4 cylinder groups of 16.03MB, 513 blks, 2176 inodes.
super-block backups (for fsck_ffs -b #) at:
 192, 33024, 65856, 98688
cg 0: bad magic number

But the geom access pattern is rather awkward:
1) ggate0[WRITE(offset=67108352, length=512)]
2) ggate0[READ(offset=8192, length=8192)]
3) ggate0[WRITE(offset=65536, length=8192)]
4) ggate0[WRITE(offset=98304, length=131072)]
5) ggate0[WRITE(offset=16908288, length=131072)]
6) ggate0[WRITE(offset=33718272, length=131072)]
7) ggate0[WRITE(offset=50528256, length=131072)]
8) ggate0[READ(offset=131072, length=4096)]
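
The start offsets of WRITE 4 through 7 line up with the super-block backup
locations that newfs printed, if those are taken as 512-byte sectors (64.0MB
in 131072 sectors gives 512 bytes per sector). A small sketch of that
arithmetic; sector_to_byte() and backups_match_writes() are names made up
for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512	/* assumed: 64.0MB / 131072 sectors */

/* Convert a sector number, as printed by newfs, to a byte offset. */
static uint64_t
sector_to_byte(uint64_t sector)
{
	return (sector * SECTOR_SIZE);
}

/* Return 1 if every backup sector maps to the observed WRITE offset. */
static int
backups_match_writes(void)
{
	/* "super-block backups ... at" from the newfs output. */
	static const uint64_t backup[] = { 192, 33024, 65856, 98688 };
	/* Start offsets of WRITE 4..7 from the ggate trace. */
	static const uint64_t write_off[] = {
		98304, 16908288, 33718272, 50528256
	};
	size_t i;

	assert(sizeof(backup) == sizeof(write_off));
	for (i = 0; i < sizeof(backup) / sizeof(backup[0]); i++)
		if (sector_to_byte(backup[i]) != write_off[i])
			return (0);
	return (1);
}
```

By the same arithmetic, WRITE-1 (offset 67108352) is the last sector of the
device, and READ-8 (offset 131072) falls inside the range covered by WRITE-4
(98304 up to 229376).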

WRITE-4 is where initcg writes the first cylinder group, so cg_magic
should be set there.

READ-8 is the bread that actually fails, because the data it returns does
not contain CG_MAGIC:
mkfs.c:1002:alloc():  bread(&disk, part_ofs + fsbtodb(&sblock,
       cgtod(&sblock, 0)), (char *)&acg, sblock.fs_cgsize)
This data was written in WRITE-4.

The full disk.d_cg seems to be empty, both in the acg block and on disk:
# hexdump -s 128k /dev/ggate0
0020000 0000 0000 0000 0000 0000 0000 0000 0000
*
0024880 0000 0000 0255 0009 0000 0000 0003 0000

The second cg, however, does seem to have a correct CG_MAGIC in the
second 32-bit variable (0x90255), on the dump line at 0024880.
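
hexdump without options prints 16-bit words in host byte order, so on a
little-endian machine the words "0255 0009" are the bytes 55 02 09 00, which
read as one 32-bit value give 0x00090255. A sketch of that decoding follows.
The constant has the same value as CG_MAGIC in ufs/ffs/fs.h; the helper
names are made up, and the little-endian layout is an assumption about the
machine the dump was taken on:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CG_MAGIC_VALUE 0x090255	/* same value as CG_MAGIC in ufs/ffs/fs.h */

/* Read a little-endian 32-bit value from buf at byte offset off. */
static uint32_t
le32_at(const unsigned char *buf, size_t off)
{
	assert(buf != NULL);
	return ((uint32_t)buf[off] |
	    (uint32_t)buf[off + 1] << 8 |
	    (uint32_t)buf[off + 2] << 16 |
	    (uint32_t)buf[off + 3] << 24);
}

/* Return 1 if the second 32-bit field of the block is the cg magic. */
static int
has_cg_magic(const unsigned char *block)
{
	return (le32_at(block, 4) == CG_MAGIC_VALUE);
}
```

Fed the 16 bytes of the 0024880 line, has_cg_magic() reports a match; fed an
all-zero block like the one at 0020000, it does not.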

So I'm assuming that it did not get written. Things get even more
awkward: once in a while newfs does complete without complaints, which
makes me believe that some use of uninitialized data could play a role.

At work we are seeing corruption where we newfs, create a bunch of
directories, and reboot. On the fsck, it complains that the two
superblocks don't match, and 15 of the 256 directories we created return
EBADF when listed. It looks like some of the writes maybe didn't reach the
disk. We are still chasing it down. This is with current from maybe 4-8
weeks ago...

Warner

--WjW