ZFS on root booting broken somewhere after r270020

Kimmo Paasiala kpaasial at icloud.com
Thu Sep 11 00:55:30 UTC 2014


> On 11.9.2014, at 3.52, Steven Hartland <killing at multiplay.co.uk> wrote:
> 
> 
> ----- Original Message ----- From: "Kimmo Paasiala" <kpaasial at icloud.com>
> To: "Steven Hartland" <killing at multiplay.co.uk>
> Cc: <freebsd-stable at freebsd.org>
> Sent: Thursday, September 11, 2014 1:04 AM
> Subject: Re: ZFS on root booting broken somewhere after r270020
> 
> 
> 
>> On 11.9.2014, at 2.41, Steven Hartland <killing at multiplay.co.uk> wrote:
>> 
>> 
>> ----- Original Message ----- From: "Steven Hartland" <killing at multiplay.co.uk>
>> To: "Kimmo Paasiala" <kpaasial at icloud.com>
>> Cc: <freebsd-stable at freebsd.org>
>> Sent: Wednesday, September 10, 2014 11:36 PM
>> Subject: Re: ZFS on root booting broken somewhere after r270020
>> 
>> 
>>> 
>>> ----- Original Message ----- From: "Kimmo Paasiala" <kpaasial at icloud.com>
>>> To: "Steven Hartland" <killing at multiplay.co.uk>
>>> Cc: <freebsd-stable at freebsd.org>
>>> Sent: Wednesday, September 10, 2014 8:26 PM
>>> Subject: Re: ZFS on root booting broken somewhere after r270020
>>> 
>>> 
>>>> 
>>>>> On 9.9.2014, at 19.03, Kimmo Paasiala <kpaasial at icloud.com> wrote:
>>>>> 
>>>>> 
>>>>>> On 9.9.2014, at 18.53, Steven Hartland <killing at multiplay.co.uk> wrote:
>>>>>> 
>>>>>> ----- Original Message ----- From: "Kimmo Paasiala" <kpaasial at icloud.com>
>>>>>>> Hi it’s me again. Something that was committed in stable/10 after r271213 up to
>>>>>>> and including r271288 broke ZFS on Root booting in exactly the same way again.
>>>>>>> I know the problem is no longer related to extra kernel modules loaded in
>>>>>>> /boot/loader.conf because I’m loading only the required zfs.ko and opensolaris.ko
>>>>>>> modules. Also, the new vt(4) console that I’m using is not the culprit because the
>>>>>>> same thing happens with kern.vty set to “sc”.
>>>>>> 
>>>>>> I've just updated my stable/10 box to r271316 and no problems booting from a ZFS root.
>>>>>> 
>>>>>> So first things first what error are you seeing?
>>>>>> 
>>>>>> Next, what is your:
>>>>>> * Hardware
>>>>>> * Pool layout
>>>>>> 
>>>>>> Regards
>>>>>> Steve
>>>>> 
>>>>> The error is the same as before:
>>>>> 
>>>>> Mounting from zfs:rdnzltank/ROOT/default failed with error 5.
>>>>> 
>>>>> Followed by the mountroot prompt and I get only these devices to choose from, no sign of the ZFS pool:
>>>>> 
>>>>> mountroot>
>>>>> List of GEOM managed disk devices:
>>>>>   gpt/fb10disk1 gpt/fb10swap1 diskid/DISK-S13UJDWS301624p3 diskid/DISK-S13UJDWS301624p2 diskid/DISK-S13UJDWS301624p1 ada0p3 ada0p2 ada0p1 diskid/DISK-S13UJDWS301624 ada0
>>>>> 
>>>>> Hardware is a Gigabyte GA-D510UD Mini-ITX motherboard:
>>>>> 
>>>>> http://www.gigabyte.com/products/product-page.aspx?pid=3343#ov
>>>>> 
>>>>> 4GB of RAM and one 750GB Samsung HD753LJ 3.5” SATA HD on the Intel SATA controller.
>>>>> 
>>>>> Pool layout:
>>>>> 
>>>>> pool: rdnzltank
>>>>> state: ONLINE
>>>>> scan: scrub repaired 0 in 1h7m with 0 errors on Wed Aug 20 09:27:48 2014
>>>>> config:
>>>>> 
>>>>>      NAME             STATE     READ WRITE CKSUM
>>>>>      rdnzltank        ONLINE       0     0     0
>>>>>        gpt/fb10disk1  ONLINE       0     0     0
>>>>> 
>>>>> errors: No known data errors
>>>>> 
>>>>> Output of ‘gpart show’:
>>>>> 
>>>>> freebsd10 ~ % gpart show
>>>>> =>        34  1465146988  ada0  GPT  (699G)
>>>>>        34        2014        - free -  (1.0M)
>>>>>      2048        1024     1  freebsd-boot  (512K)
>>>>>      3072        1024        - free -  (512K)
>>>>>      4096    16777216     2  freebsd-swap  (8.0G)
>>>>>  16781312  1448365710     3  freebsd-zfs  (691G)
>>>>> 
>>>>> 
>>>>> HTH,
>>>>> 
>>>>> -Kimmo
>>>> 
>>>> 
>>>> More information. This version still works:
>>>> 
>>>> FreeBSD freebsd10.rdnzl.info 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #0 r271237: Wed Sep 10 11:00:15 EEST 2014 root at buildstable10amd64.rdnzl.info:/usr/obj/usr/src/sys/GENERIC  amd64
>>>> 
>>>> The next revision, r271238, breaks booting for me. The commit in question is this one:
>>>> 
>>>> http://svnweb.freebsd.org/base?view=revision&sortby=rev&sortdir=down&revision=271238
>>> 
>>> Investigating; I've had no reports of issues while this has been in head.
>> 
>> I've just installed a stable/10 kernel, specifically:
>> 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #11 r271316M
>> 
>> and booted fine from a mirrored root without issue:
>> config:
>> 
>>      NAME        STATE     READ WRITE CKSUM
>>      tank        ONLINE       0     0     0
>>        mirror-0  ONLINE       0     0     0
>>          ada0p3  ONLINE       0     0     0
>>          ada2p3  ONLINE       0     0     0
>> 
>> gpart show ada0 ada2
>> =>       34  250069613  ada0  GPT  (119G)
>>       34        128     1  freebsd-boot  (64K)
>>      162    8388608     2  freebsd-swap  (4.0G)
>>  8388770  241680877     3  freebsd-zfs  (115G)
>> 
>> =>       40  586072288  ada2  GPT  (279G)
>>       40        128     1  freebsd-boot  (64K)
>>      168    8388608     2  freebsd-swap  (4.0G)
>>  8388776  577683552     3  freebsd-zfs  (275G)
>> 
>> I then detached the second disk so the machine had just:
>> config:
>> 
>>      NAME        STATE     READ WRITE CKSUM
>>      tank        ONLINE       0     0     0
>>        ada0p3    ONLINE       0     0     0
>> 
>> Rebooted and again all fine, no issues.
>> 
>> I've also got a raidz1 box on the same kernel; it too is fine.
>> 
>> =>       34  500118125  ada0  GPT  (238G)
>>       34        128     1  freebsd-boot  (64K)
>>      162  500117997     2  freebsd-zfs  (238G)
>> ...
>> 
>> So it seems like there's something odd about your environment, especially
>> given you've had a similar issue before.
>> 
>> So the questions:
>> 1. What does zpool get all report?
>> 2. What does /boot/loader.conf have in it?
>> 3. What does zdb -C rdnzltank report?
>> 4. What does /etc/rc.conf have in it?
>> 
>>  Regards
>>  Steve
> 
> Here goes:
> snip...
> 
> The next is now with the second disk being resilvered, gpt/fb10disk2 is the new disk:
> 
> MOS Configuration:
>       version: 5000
>       name: 'rdnzltank'
>       state: 0
>       txg: 1634460
>       pool_guid: 5382786142589818227
>       hostid: 852094392
>       hostname: 'freebsd10.rdnzl.info'
>       vdev_children: 1
>       vdev_tree:
>           type: 'root'
>           id: 0
>           guid: 5382786142589818227
>           children[0]:
>               type: 'mirror'
>               id: 0
>               guid: 6268049119730836293
>               whole_disk: 0
>               metaslab_array: 34
>               metaslab_shift: 32
>               ashift: 9
>               asize: 741558452224
>               is_log: 0
>               create_txg: 4
>               children[0]:
>                   type: 'disk'
>                   id: 0
>                   guid: 1732695434302750511
>                   path: '/dev/gpt/fb10disk1'
>                   phys_path: '/dev/gpt/fb10disk1'
>                   whole_disk: 1
>                   DTL: 98
>                   create_txg: 4
>               children[1]:
>                   type: 'disk'
>                   id: 1
>                   guid: 15812067837864729710
>                   path: '/dev/gpt/fb10disk2'
>                   phys_path: '/dev/gpt/fb10disk2'
>                   whole_disk: 1
>                   DTL: 526
>                   create_txg: 4
>                   resilver_txg: 1634424
>       features_for_read:
>           com.delphix:hole_birth
>           com.delphix:embedded_data
> 
> Ok this could show your problem ^^
> 
> In a previous post you said:
>>>>> pool: rdnzltank
>>>>> state: ONLINE
>>>>> scan: scrub repaired 0 in 1h7m with 0 errors on Wed Aug 20 09:27:48 2014
>>>>> config:
>>>>> 
>>>>>      NAME             STATE     READ WRITE CKSUM
>>>>>      rdnzltank        ONLINE       0     0     0
>>>>>        gpt/fb10disk1  ONLINE       0     0     0
> 
> But zdb thinks your pool is a mirror, which I believe indicates that your pool's real
> config is out of sync with the cache file.
> 
> Now this shouldn't cause an issue, as the kernel should just try all devices in order
> until one succeeds, but there may be an issue there somewhere.
> 
> Could you:
> 1. backup your cache file
>   cp /boot/zfs/zpool.cache /boot/zfs/zpool.cache.old
> 2. regenerate your cache file
>   zpool set cachefile=/boot/zfs/zpool.cache rdnzltank
> 3. rerun the zdb command and let us know the output
>   zdb -C rdnzltank
>   I'm hoping that it should show:
>   ...
>       vdev_tree:
>           type: 'root'
>           id: 0
>           guid: 5382786142589818227
>           children[0]:
>               type: 'disk'
>   ..
> 4. If it does show type 'disk' try rebooting with the new kernel.
> 
>   Regards
>   Steve 

Yes, as I said, I’m currently trying to see whether the pool works as a mirror with the newer kernel that broke booting for me. I’m in the middle of a resilver that, for some odd reason, keeps restarting every 15 minutes…

-Kimmo
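For anyone following along, the recovery procedure Steve suggests above can be sketched as a small script. The pool name (rdnzltank) and cache path are taken from this thread; the script only prints each command rather than executing it, so drop the leading echo to run the steps for real on the affected machine.

```shell
#!/bin/sh
# Sketch of the cache-file regeneration steps suggested above.
# Pool name and cache path are taken from this thread; adjust as needed.
POOL=rdnzltank
CACHE=/boot/zfs/zpool.cache

# 1. Back up the existing cache file.
echo "cp ${CACHE} ${CACHE}.old"

# 2. Regenerate the cache file from the pool's live configuration.
echo "zpool set cachefile=${CACHE} ${POOL}"

# 3. Dump the cached config so the vdev_tree can be compared
#    against what zpool status reports.
echo "zdb -C ${POOL}"
```

If the regenerated config then shows type: 'disk' (rather than 'mirror') under children[0], reboot with the new kernel as in step 4.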


