16.0E ExpandSize? -- New Server

Steven Hartland killing at multiplay.co.uk
Tue Jan 31 21:17:24 UTC 2017


Ok, so that confirms it. Try the attached patch (only a new kernel is 
needed) on a read-only import of the pool and see if that fixes it.
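
(For reference, a read-only import of the pool would be something along
the lines of:

    zpool import -o readonly=on zroot

with the pool name taken from the output below.)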

     Regards
     Steve

On 31/01/2017 21:00, Larry Rosenman wrote:
>
> borg-new /home/ler $ sudo ./vdev-stats.d
> Password:
> vdev_path: n/a, vdev_max_asize: 0, vdev_asize: 0, vdev_min_asize: 0
> vdev_path: n/a, vdev_max_asize: 11947471798272, vdev_asize: 11947478089728, vdev_min_asize: 11888469475328
> vdev_path: /dev/mfid4p4, vdev_max_asize: 1991245299712, vdev_asize: 1991245299712, vdev_min_asize: 1981411579221
> vdev_path: /dev/mfid0p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> vdev_path: /dev/mfid1p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> vdev_path: /dev/mfid3p4, vdev_max_asize: 1991247921152, vdev_asize: 1991247921152, vdev_min_asize: 1981411579221
> vdev_path: /dev/mfid2p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> vdev_path: /dev/mfid5p4, vdev_max_asize: 1991246348288, vdev_asize: 1991246348288, vdev_min_asize: 1981411579221
> ^C
>
> borg-new /home/ler $
>
>
> borg-new /home/ler $ sudo zpool list -v
> Password:
> NAME         SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> zroot       10.8T  94.3G  10.7T     16.0E     0%     0%  1.00x  ONLINE  -
>   raidz1    10.8T  94.3G  10.7T     16.0E     0%     0%
>     mfid4p4     -      -      -         -      -      -
>     mfid0p4     -      -      -         -      -      -
>     mfid1p4     -      -      -         -      -      -
>     mfid3p4     -      -      -         -      -      -
>     mfid2p4     -      -      -         -      -      -
>     mfid5p4     -      -      -         -      -      -
> borg-new /home/ler $
>
>
> On 01/31/2017 2:37 pm, Steven Hartland wrote:
>
>> In that case, based on your zpool history, I suspect that the original 
>> mfid4p4 was the same size as mfid0p4 (1991246348288) but it's been 
>> replaced with a drive that is slightly smaller (1991245299712).
>>
>> This smaller size results in a max_asize of 1991245299712 * 6 instead 
>> of the original 1991246348288 * 6.
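>>
>> A quick standalone check of that arithmetic against your dtrace output
>> (just a sketch compiled outside the kernel; sizes in bytes):
>>
>>     #include <stdio.h>
>>     #include <stdint.h>
>>
>>     int
>>     main(void)
>>     {
>>         uint64_t new_child = 1991245299712ULL; /* replacement mfid4p4 */
>>         uint64_t old_child = 1991246348288ULL; /* mfid0p4 and the rest */
>>
>>         /* raidz1: size = smallest child size * number of children */
>>         printf("max_asize: %ju\n", (uintmax_t)(new_child * 6));
>>         printf("old asize: %ju\n", (uintmax_t)(old_child * 6));
>>         /* per-child delta: 1048576 bytes (1 MiB), matching the
>>          * 2048-sector difference on mfid4p4 in the gpart output */
>>         printf("delta: %ju\n", (uintmax_t)((old_child - new_child) * 6));
>>         return (0);
>>     }
>>
>> That prints max_asize 11947471798272 and asize 11947478089728 -- the
>> two values from the dtrace output -- with a total delta of 6291456.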
>>
>> Now, given that min_asize (the value used to check whether a device's 
>> size is acceptable) is rounded to the nearest metaslab, I believe 
>> that replace would be allowed:
>> https://github.com/freebsd/freebsd/blob/master/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c#L4947
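>>
>> A minimal sketch of that acceptance check, modelled on the spa.c check
>> linked above and using the vdev_min_asize from your dtrace output:
>>
>>     #include <stdio.h>
>>     #include <stdint.h>
>>
>>     int
>>     main(void)
>>     {
>>         uint64_t min_asize = 1981411579221ULL; /* child vdev_min_asize */
>>         uint64_t new_asize = 1991245299712ULL; /* replacement mfid4p4 */
>>
>>         /* the new device only has to clear min_asize, not match the
>>          * original child size, so the slightly smaller disk passes */
>>         if (new_asize < min_asize)
>>             printf("replace rejected\n");
>>         else
>>             printf("replace allowed\n"); /* this branch is taken */
>>         return (0);
>>     }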
>>
>> Now, the problem is that on open the calculated asize is only updated 
>> if the vdev is expanding:
>> https://github.com/freebsd/freebsd/blob/master/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c#L1424
>>
>> The updated dtrace file outputs vdev_min_asize, which should confirm 
>> my suspicion about why the replace was allowed.
>>
>> On 31/01/2017 19:05, Larry Rosenman wrote:
>>>
>>> I've replaced some disks due to failure, and some of the partition 
>>> sizes are different.
>>>
>>>
>>> autoexpand is off:
>>>
>>> borg-new /home/ler $ zpool get all zroot
>>> NAME   PROPERTY                       VALUE                 SOURCE
>>> zroot  size                           10.8T                 -
>>> zroot  capacity                       0%                    -
>>> zroot  altroot                        -                     default
>>> zroot  health                         ONLINE                -
>>> zroot  guid                           11945658884309024932  default
>>> zroot  version                        -                     default
>>> zroot  bootfs                         zroot/ROOT/default    local
>>> zroot  delegation                     on                    default
>>> zroot  autoreplace                    off                   default
>>> zroot  cachefile                      -                     default
>>> zroot  failmode                       wait                  default
>>> zroot  listsnapshots                  off                   default
>>> zroot  autoexpand                     off                   default
>>> zroot  dedupditto                     0                     default
>>> zroot  dedupratio                     1.00x                 -
>>> zroot  free                           10.7T                 -
>>> zroot  allocated                      94.3G                 -
>>> zroot  readonly                       off                   -
>>> zroot  comment                        -                     default
>>> zroot  expandsize                     16.0E                 -
>>> zroot  freeing                        0                     default
>>> zroot  fragmentation                  0%                    -
>>> zroot  leaked                         0                     default
>>> zroot  feature@async_destroy          enabled               local
>>> zroot  feature@empty_bpobj            active                local
>>> zroot  feature@lz4_compress           active                local
>>> zroot  feature@multi_vdev_crash_dump  enabled               local
>>> zroot  feature@spacemap_histogram     active                local
>>> zroot  feature@enabled_txg            active                local
>>> zroot  feature@hole_birth             active                local
>>> zroot  feature@extensible_dataset     enabled               local
>>> zroot  feature@embedded_data          active                local
>>> zroot  feature@bookmarks              enabled               local
>>> zroot  feature@filesystem_limits      enabled               local
>>> zroot  feature@large_blocks           enabled               local
>>> zroot  feature@sha512                 enabled               local
>>> zroot  feature@skein                  enabled               local
>>> borg-new /home/ler $
>>>
>>>
>>> borg-new /home/ler $ gpart show
>>> =>          40  3905945520  mfid0  GPT  (1.8T)
>>>             40        1600      1  efi  (800K)
>>>           1640        1024      2  freebsd-boot  (512K)
>>>           2664        1432         - free -  (716K)
>>>           4096    16777216      3  freebsd-swap  (8.0G)
>>>       16781312  3889162240      4  freebsd-zfs  (1.8T)
>>>     3905943552        2008         - free -  (1.0M)
>>>
>>> =>          40  3905945520  mfid1  GPT  (1.8T)
>>>             40        1600      1  efi  (800K)
>>>           1640        1024      2  freebsd-boot  (512K)
>>>           2664        1432         - free -  (716K)
>>>           4096    16777216      3  freebsd-swap  (8.0G)
>>>       16781312  3889162240      4  freebsd-zfs  (1.8T)
>>>     3905943552        2008         - free -  (1.0M)
>>>
>>> =>          40  3905945520  mfid2  GPT  (1.8T)
>>>             40        1600      1  efi  (800K)
>>>           1640        1024      2  freebsd-boot  (512K)
>>>           2664        1432         - free -  (716K)
>>>           4096    16777216      3  freebsd-swap  (8.0G)
>>>       16781312  3889162240      4  freebsd-zfs  (1.8T)
>>>     3905943552        2008         - free -  (1.0M)
>>>
>>> =>          40  3905945520  mfid3  GPT  (1.8T)
>>>             40        1600      1  efi  (800K)
>>>           1640        1024      2  freebsd-boot  (512K)
>>>           2664    16777216      3  freebsd-swap  (8.0G)
>>>       16779880  3889165680      4  freebsd-zfs  (1.8T)
>>>
>>> =>          40  3905945520  mfid5  GPT  (1.8T)
>>>             40        1600      1  efi  (800K)
>>>           1640        1024      2  freebsd-boot  (512K)
>>>           2664        1432         - free -  (716K)
>>>           4096    16777216      3  freebsd-swap  (8.0G)
>>>       16781312  3889162240      4  freebsd-zfs  (1.8T)
>>>     3905943552        2008         - free -  (1.0M)
>>>
>>> =>          40  3905945520  mfid4  GPT  (1.8T)
>>>             40        1600      1  efi  (800K)
>>>           1640        1024      2  freebsd-boot  (512K)
>>>           2664        1432         - free -  (716K)
>>>           4096    16777216      3  freebsd-swap  (8.0G)
>>>       16781312  3889160192      4  freebsd-zfs  (1.8T)
>>>     3905941504        4056         - free -  (2.0M)
>>>
>>> borg-new /home/ler $
>>>
>>>
>>> This system was built last week, and I **CAN** rebuild it if 
>>> necessary, but I didn't do anything strange (or so I thought :) )
>>>
>>>
>>>
>>>
>>> On 01/31/2017 12:30 pm, Steven Hartland wrote:
>>>
>>>     Your issue is that the reported vdev_asize > vdev_max_asize:
>>>     vdev_max_asize: 11947471798272
>>>     vdev_asize:     11947478089728
>>>
>>>     max_asize is smaller than asize by 6291456.
>>>
>>>     For raidz1, both asize and max_asize should be the smallest
>>>     disk's size * the number of disks, so:
>>>     1991245299712 * 6 = 11947471798272
>>>
>>>     So your max_asize looks right, but asize looks too big.
>>>
>>>     Expand Size is calculated by:
>>>         if (vd->vdev_aux == NULL && tvd != NULL &&
>>>             vd->vdev_max_asize != 0) {
>>>                 vs->vs_esize = P2ALIGN(vd->vdev_max_asize - vd->vdev_asize,
>>>                     1ULL << tvd->vdev_ms_shift);
>>>         }
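>>>
>>>     Since asize is larger than max_asize here, that unsigned
>>>     subtraction wraps around 2^64 and lands just below it, which
>>>     zpool displays as 16.0E. A standalone sketch with your numbers
>>>     (ms_shift is a made-up placeholder; it only affects the
>>>     downward rounding):
>>>
>>>         #include <stdio.h>
>>>         #include <stdint.h>
>>>
>>>         /* as in sys/sysmacros.h: round x down to a power-of-2 boundary */
>>>         #define P2ALIGN(x, align)   ((x) & -(align))
>>>
>>>         int
>>>         main(void)
>>>         {
>>>             uint64_t max_asize = 11947471798272ULL;
>>>             uint64_t asize     = 11947478089728ULL; /* 6291456 larger */
>>>             uint64_t ms_shift  = 34;                /* placeholder */
>>>
>>>             /* max_asize - asize is negative, so it wraps to ~2^64 */
>>>             uint64_t esize = P2ALIGN(max_asize - asize,
>>>                 1ULL << ms_shift);
>>>             printf("esize: %ju\n", (uintmax_t)esize); /* ~16.0E */
>>>             return (0);
>>>         }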
>>>
>>>     So the question is: why is asize too big?
>>>
>>>     Given that you seem to have somewhat varying disk sizes, do you
>>>     have autoexpand turned on?
>>>
>>>     On 31/01/2017 17:39, Larry Rosenman wrote:
>>>
>>>         vdev_path: n/a, vdev_max_asize: 11947471798272, vdev_asize:
>>>         11947478089728
>>>
>>>
>>> -- 
>>> Larry Rosenman                http://people.freebsd.org/~ler
>>> Phone: +1 214-642-9640                 E-Mail: ler at FreeBSD.org
>>> US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281
>
>
> -- 
> Larry Rosenman                http://people.freebsd.org/~ler
> Phone: +1 214-642-9640                 E-Mail: ler at FreeBSD.org
> US Mail: 17716 Limpia Crk, Round Rock, TX 78664-7281

-------------- next part --------------
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c	(revision 313003)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c	(working copy)
@@ -1377,7 +1377,7 @@
 	vd->vdev_psize = psize;
 
 	/*
-	 * Make sure the allocatable size hasn't shrunk.
+	 * Make sure the allocatable size hasn't shrunk too much.
 	 */
 	if (asize < vd->vdev_min_asize) {
 		vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN,
@@ -1420,10 +1420,19 @@
 	 * If all children are healthy and the asize has increased,
 	 * then we've experienced dynamic LUN growth.  If automatic
 	 * expansion is enabled then use the additional space.
+	 *
+	 * Otherwise, if asize has shrunk, reduce vdev_asize so that
+	 * calculations based on max_asize and asize, e.g. esize, are
+	 * always valid. This is safe as we've already validated that
+	 * asize is not less than min_asize.
 	 */
-	if (vd->vdev_state == VDEV_STATE_HEALTHY && asize > vd->vdev_asize &&
-	    (vd->vdev_expanding || spa->spa_autoexpand))
-		vd->vdev_asize = asize;
+	if (vd->vdev_state == VDEV_STATE_HEALTHY) {
+		if (asize > vd->vdev_asize &&
+		    (vd->vdev_expanding || spa->spa_autoexpand))
+			vd->vdev_asize = asize;
+		else if (asize < vd->vdev_asize)
+			vd->vdev_asize = asize;
+	}
 
 	vdev_set_min_asize(vd);
 

