ZFS on root booting broken somewhere after r270020
Kimmo Paasiala
kpaasial at icloud.com
Thu Sep 11 00:55:30 UTC 2014
> On 11.9.2014, at 3.52, Steven Hartland <killing at multiplay.co.uk> wrote:
>
>
> ----- Original Message ----- From: "Kimmo Paasiala" <kpaasial at icloud.com>
> To: "Steven Hartland" <killing at multiplay.co.uk>
> Cc: <freebsd-stable at freebsd.org>
> Sent: Thursday, September 11, 2014 1:04 AM
> Subject: Re: ZFS on root booting broken somewhere after r270020
>
>
>
>> On 11.9.2014, at 2.41, Steven Hartland <killing at multiplay.co.uk> wrote:
>>
>>
>> ----- Original Message ----- From: "Steven Hartland" <killing at multiplay.co.uk>
>> To: "Kimmo Paasiala" <kpaasial at icloud.com>
>> Cc: <freebsd-stable at freebsd.org>
>> Sent: Wednesday, September 10, 2014 11:36 PM
>> Subject: Re: ZFS on root booting broken somewhere after r270020
>>
>>
>>>
>>> ----- Original Message ----- From: "Kimmo Paasiala" <kpaasial at icloud.com>
>>> To: "Steven Hartland" <killing at multiplay.co.uk>
>>> Cc: <freebsd-stable at freebsd.org>
>>> Sent: Wednesday, September 10, 2014 8:26 PM
>>> Subject: Re: ZFS on root booting broken somewhere after r270020
>>>
>>>
>>>>
>>>>> On 9.9.2014, at 19.03, Kimmo Paasiala <kpaasial at icloud.com> wrote:
>>>>>
>>>>>
>>>>>> On 9.9.2014, at 18.53, Steven Hartland <killing at multiplay.co.uk> wrote:
>>>>>>
>>>>>> ----- Original Message ----- From: "Kimmo Paasiala" <kpaasial at icloud.com>
>>>>>>> Hi it’s me again. Something that was committed in stable/10 after r271213 up to
>>>>>>> and including r271288 broke ZFS on Root booting in exactly the same way again.
>>>>>>> I know the problem is no longer related to extra kernel modules loaded in
>>>>>>> /boot/loader.conf because I’m loading only the required zfs.ko and opensolaris.ko
>>>>>>> modules. Also, the new vt(4) console that I’m using is not the culprit because the
>>>>>>> same thing happens with kern.vty set to “sc”.
>>>>>>
>>>>>> I've just updated my stable/10 box to r271316 and no problems booting from a ZFS root.
>>>>>>
>>>>>> So first things first what error are you seeing?
>>>>>>
>>>>>> Next what is your:
>>>>>> * Hardware
>>>>>> * Pool layout
>>>>>>
>>>>>> Regards
>>>>>> Steve
>>>>>
>>>>> The error is the same as before:
>>>>>
>>>>> • Mounting from zfs:rdnzltank/ROOT/default failed with error 5.
>>>>>
>>>>> Followed by the mountroot prompt and I get only these devices to choose from, no sign of the ZFS pool:
>>>>>
>>>>> • mountroot>
>>>>> • List of GEOM managed disk devices:
>>>>> • gpt/fb10disk1 gpt/fb10swap1 diskid/DISK-S13UJDWS301624p3 diskid/DISK-S13UJDWS301624p2 diskid/DISK-S13UJDWS301624p1 ada0p3 ada0p2 ada0p1 diskid/DISK-S13UJDWS301624 ada0
>>>>>
>>>>> Hardware is a Gigabyte GA-D510UD Mini-ITX motherboard:
>>>>>
>>>>> http://www.gigabyte.com/products/product-page.aspx?pid=3343#ov
>>>>>
>>>>> 4GBs of RAM. One 750GB Samsung HD753LJ 3.5” SATA HD on the Intel SATA controller.
>>>>>
>>>>> Pool layout:
>>>>>
>>>>> pool: rdnzltank
>>>>> state: ONLINE
>>>>> scan: scrub repaired 0 in 1h7m with 0 errors on Wed Aug 20 09:27:48 2014
>>>>> config:
>>>>>
>>>>> NAME STATE READ WRITE CKSUM
>>>>> rdnzltank ONLINE 0 0 0
>>>>> gpt/fb10disk1 ONLINE 0 0 0
>>>>>
>>>>> errors: No known data errors
>>>>>
>>>>> Output of ‘gpart show’:
>>>>>
>>>>> freebsd10 ~ % gpart show
>>>>> => 34 1465146988 ada0 GPT (699G)
>>>>> 34 2014 - free - (1.0M)
>>>>> 2048 1024 1 freebsd-boot (512K)
>>>>> 3072 1024 - free - (512K)
>>>>> 4096 16777216 2 freebsd-swap (8.0G)
>>>>> 16781312 1448365710 3 freebsd-zfs (691G)
>>>>>
>>>>>
>>>>> HTH,
>>>>>
>>>>> -Kimmo
>>>>
>>>>
>>>> More information. This version still works:
>>>>
>>>> FreeBSD freebsd10.rdnzl.info 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #0 r271237: Wed Sep 10 11:00:15 EEST 2014 root at buildstable10amd64.rdnzl.info:/usr/obj/usr/src/sys/GENERIC amd64
>>>>
>>>> The next higher version r271238 breaks booting for me. The commit in question is this one:
>>>>
>>>> http://svnweb.freebsd.org/base?view=revision&sortby=rev&sortdir=down&revision=271238
>>>
>>> Investigating, had no reports of issues while this has been in head.
>>
>> I've just installed a stable/10 kernel, specifically:
>> 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #11 r271316M
>>
>> and booted fine from a mirrored root without issue:
>> config:
>>
>> NAME STATE READ WRITE CKSUM
>> tank ONLINE 0 0 0
>> mirror-0 ONLINE 0 0 0
>> ada0p3 ONLINE 0 0 0
>> ada2p3 ONLINE 0 0 0
>>
>> gpart show ada0 ada2
>> => 34 250069613 ada0 GPT (119G)
>> 34 128 1 freebsd-boot (64K)
>> 162 8388608 2 freebsd-swap (4.0G)
>> 8388770 241680877 3 freebsd-zfs (115G)
>>
>> => 40 586072288 ada2 GPT (279G)
>> 40 128 1 freebsd-boot (64K)
>> 168 8388608 2 freebsd-swap (4.0G)
>> 8388776 577683552 3 freebsd-zfs (275G)
>>
>> I then detached the second disk so the machine had just:
>> config:
>>
>> NAME STATE READ WRITE CKSUM
>> tank ONLINE 0 0 0
>> ada0p3 ONLINE 0 0 0
>>
>> Rebooted and again all fine no issues
>>
>> I've also got a raidz1 box on the same kernel; it too is fine.
>>
>> => 34 500118125 ada0 GPT (238G)
>> 34 128 1 freebsd-boot (64K)
>> 162 500117997 2 freebsd-zfs (238G)
>> ...
>>
>> So it seems like there's something odd about your environment, especially
>> given you've had a similar issue before.
>>
>> So the questions:
>> 1. What does zpool get all report?
>> 2. What does /boot/loader.conf have in it?
>> 3. What does zdb -C rdnzltank report?
>> 4. What does /etc/rc.conf have in it?
>>
>> Regards
>> Steve
>
> Here goes:
> snip...
>
> The next is now with the second disk being resilvered, gpt/fb10disk2 is the new disk:
>
> MOS Configuration:
> version: 5000
> name: 'rdnzltank'
> state: 0
> txg: 1634460
> pool_guid: 5382786142589818227
> hostid: 852094392
> hostname: 'freebsd10.rdnzl.info'
> vdev_children: 1
> vdev_tree:
> type: 'root'
> id: 0
> guid: 5382786142589818227
> children[0]:
> type: 'mirror'
> id: 0
> guid: 6268049119730836293
> whole_disk: 0
> metaslab_array: 34
> metaslab_shift: 32
> ashift: 9
> asize: 741558452224
> is_log: 0
> create_txg: 4
> children[0]:
> type: 'disk'
> id: 0
> guid: 1732695434302750511
> path: '/dev/gpt/fb10disk1'
> phys_path: '/dev/gpt/fb10disk1'
> whole_disk: 1
> DTL: 98
> create_txg: 4
> children[1]:
> type: 'disk'
> id: 1
> guid: 15812067837864729710
> path: '/dev/gpt/fb10disk2'
> phys_path: '/dev/gpt/fb10disk2'
> whole_disk: 1
> DTL: 526
> create_txg: 4
> resilver_txg: 1634424
> features_for_read:
> com.delphix:hole_birth
> com.delphix:embedded_data
>
> Ok this could show your problem ^^
>
> In a previous post you said
>>>>> pool: rdnzltank
>>>>> state: ONLINE
>>>>> scan: scrub repaired 0 in 1h7m with 0 errors on Wed Aug 20 09:27:48 2014
>>>>> config:
>>>>>
>>>>> NAME STATE READ WRITE CKSUM
>>>>> rdnzltank ONLINE 0 0 0
>>>>> gpt/fb10disk1 ONLINE 0 0 0
>
> But zdb thinks your pool is a mirror which I believe indicates that your pool's real
> config is out of sync with the cache file.
>
> Now this shouldn't cause an issue as it should just try all devices in order until it
> succeeds but there may be an issue there somewhere.
>
> Could you:-
> 1. backup your cache file
> cp /boot/zfs/zpool.cache /boot/zfs/zpool.cache.old
> 2. regenerate your cache file
> zpool set cachefile=/boot/zfs/zpool.cache rdnzltank
> 3. rerun the zdb command and let us know the output
> zdb -C rdnzltank
> I'm hoping that it should show:
> ...
> vdev_tree:
> type: 'root'
> id: 0
> guid: 5382786142589818227
> children[0]:
> type: 'disk'
> ..
> 4. If it does show type 'disk' try rebooting with the new kernel.
>
> Regards
> Steve
Yes, as I said I’m right now trying to see if the pool would work as a mirror with the newer kernel that somehow broke booting for me. I’m in the middle of a resilver that keeps restarting for some odd reason every 15 minutes…
-Kimmo
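
For reference, the check in steps 3 and 4 above (does zdb report the top-level vdev as 'disk' or as 'mirror'?) could be scripted roughly as follows. This is only a sketch and not something posted in the thread: the helper name top_vdev_type is made up for illustration, and it assumes that the first "type:" line in the zdb -C output belongs to the root vdev and the second one to its children[0], which matches the output quoted above. The pool name rdnzltank is taken from the zpool status output earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): reads `zdb -C <pool>` output on
# stdin and prints the type of the first top-level vdev, e.g. "disk" or
# "mirror".
#
# Assumption: the first "type:" line is the root vdev and the second one is
# children[0] of the root, as in the output quoted in this thread.
top_vdev_type() {
    awk '$1 == "type:" { if (++n == 2) { print $2; exit } }' | tr -d "'"
}

# Intended use on the affected machine (commented out here, since it needs
# root privileges and a real pool):
#
#   cp /boot/zfs/zpool.cache /boot/zfs/zpool.cache.old      # step 1: backup
#   zpool set cachefile=/boot/zfs/zpool.cache rdnzltank     # step 2: regenerate
#   zdb -C rdnzltank | top_vdev_type                        # step 3: inspect
```

If this prints "disk", the cached config agrees with a single-disk pool and step 4 (rebooting with the new kernel) applies; if it prints "mirror", the config still describes a mirror, as in the zdb output above.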