What is the minimum free space ... (Full Report)

Peter pmc at citylink.dinoex.sub.org
Fri May 17 01:13:43 UTC 2019


Alright, now I was able to reproduce the failure under test
conditions. In addition, I was able to produce two deadlocks,
a bunch of GPT errors, and a ZFS assertion failure.

Here we go:

ABSTRACT
========
The original idea was to check whether ZFS can grow a raid5
(raidz). (This is technically easy, but not all volume managers
are willing to do it, so I decided to give it a try.) To that end
I created three partitions on some spare disk space, with free
space interleaved between them, then created a ZFS raidz pool on
them, enlarged the partitions, and tried to autoexpand the ZFS
pool. While the procedure appeared to work in theory, I got
strange results in practice, leading even to kernel crashes.

I have now conducted five test cases, with transcripts:
1. Attempt with a USB stick on a core-i consumer machine.
   The chosen procedure now led to a deadlock long before
   the pool was even expanded.
2. Attempt with a spinning drive on the core-i machine.
   The chosen procedure was fully successful.
3. Same as 2., but without exporting/importing the pool before
   the expansion. This procedure led to a deadlock.
4. Attempt with a spinning drive on an ancient Pentium server
   machine, with the exact same procedure as in my earlier report.
   The procedure did not lead to a kernel crash this time; instead
   it produced GPT errors and a ZFS assertion failure, which may
   provide further insight.
5. The same procedure as in 4., now on the core-i machine.
   The procedure led to an immediate reboot once, and to a
   "Fatal trap 12: page fault" the other times.

In both cases 1. and 3. it was no longer possible to execute
"sync", so the death of the system was only a matter of time.

See the transcripts of the test cases below, and some commentary
at the end.


Summary of the procedures:
==========================
Testcase 1:
  * create partitions
  * create pool
  * export pool -> hangs

Testcase 2:
  * create partitions
  * create pool
  * export pool
  * import pool
  * enlarge partitions
  * export pool
  * import pool
  * set autoexpand=on
  * online -e  -> success!

Testcase 3:
  * create partitions
  * create pool
  * enlarge partitions
  * set autoexpand=on
  * online -e  -> hangs

Testcase 4:
  * create partitions
  * create pool
  * enlarge partitions
  * set autoexpand=on
  * export pool
  * -> cannot import anymore
  * shrink back partitions
  * -> ZFS assertion failed
  * destroy pool and partitions
  * -> devices still present!

Testcase 5:
  * create partitions
  * create pool
  * enlarge partitions
  * set autoexpand=on
  * export pool
  * -> cannot import anymore
  * shrink back partition -> crash!
  * shrink back partition -> crash!
  * shrink back partition -> crash!
  * import successful.


--------------- TESTCASE 1 BEGIN ------------------------
Script started on Thu May 16 20:47:21 2019
root at disp:~ # camcontrol devlist
<KINGSTON SA400S37240G S1Z40102>   at scbus1 target 0 lun 0 (pass0,ada0)
<Hitachi HDS5C1050CLA382 JC2OA50E>  at scbus2 target 0 lun 0 (pass1,ada1)
<TSSTcorp CDDVDW SH-224DB SB01>    at scbus3 target 0 lun 0 (pass2,cd0)
<AHCI SGPIO Enclosure 1.00 0001>   at scbus4 target 0 lun 0 (pass3,ses0)
<General UDisk 5.00>               at scbus5 target 0 lun 0 (da0,pass4)
<Kingston DataTraveler 3.0 PMAP>   at scbus6 target 0 lun 0 (da1,pass5)
root at disp:~ # gpart create -s GPT da1
da1 created
root at disp:~ # gpart add -t freebsd-zfs -s 1G da1
da1p1 added
root at disp:~ # gpart add -t freebsd-zfs -s 1G -b 4194344 da1
da1p2 added
root at disp:~ # gpart add -t freebsd-zfs -s 1G -b 8388648 da1
da1p3 added
root at disp:~ # gpart show da1
=>      40  60555184  da1  GPT  (29G)
        40   2097152    1  freebsd-zfs  (1.0G)
   2097192   2097152       - free -  (1.0G)
   4194344   2097152    2  freebsd-zfs  (1.0G)
   6291496   2097152       - free -  (1.0G)
   8388648   2097152    3  freebsd-zfs  (1.0G)
  10485800  50069424       - free -  (24G)
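
## The -b offsets follow from the 512-byte sector size: 1 GiB is
## 2097152 sectors, so partition 2 starts at 40 + 2*2097152 = 4194344
## and partition 3 at 40 + 4*2097152 = 8388648, leaving a 1 GiB
## free-space gap behind each partition for later growth.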

root at disp:~ # zpool create -f testraid raidz da1p1 da1p2 da1p3
root at disp:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G   552K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
root at disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        testraid    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            da1p1   ONLINE       0     0     0
            da1p2   ONLINE       0     0     0
            da1p3   ONLINE       0     0     0

errors: No known data errors
root at disp:~ # zpool export testraid

## At this point the command hung forever and we were dead in the
## water, with no disk I/O happening.

The output of "ps" shows a "D"eadlock on wait channel "connec":
   0 5640 4647   0  20  0    7760   3784 connec   D+   10    0:00.01 zpool

No "df", no "sync", no "reboot".
--------------- TESTCASE 1  END  ------------------------

--------------- TESTCASE 2 BEGIN ------------------------
Script started on Thu May 16 21:26:14 2019
root at disp:~ # camcontrol devlist
<KINGSTON SA400S37240G S1Z40102>   at scbus1 target 0 lun 0 (pass0,ada0)
<Hitachi HDS5C1050CLA382 JC2OA50E>  at scbus2 target 0 lun 0 (pass1,ada1)
<WDC WD5000AAKS-00A7B2 01.03B01>   at scbus3 target 0 lun 0 (pass2,ada2)
<TSSTcorp CDDVDW SH-224DB SB01>    at scbus4 target 0 lun 0 (pass3,cd0)
<AHCI SGPIO Enclosure 1.00 0001>   at scbus5 target 0 lun 0 (pass4,ses0)
<General UDisk 5.00>               at scbus6 target 0 lun 0 (da0,pass5)
root at disp:~ # gpart show ada2
gpart: No such geom: ada2.
root at disp:~ # gpart create -s GPT ada2
ada2 created
root at disp:~ # gpart add -t freebsd-zfs -s 1G ada2
ada2p1 added
root at disp:~ # gpart add -t freebsd-zfs -s 1G -b 4194344 ada2
ada2p2 added
root at disp:~ # gpart add -t freebsd-zfs -s 1G -b 8388648 ada2
ada2p3 added
root at disp:~ # gpart show ada2
=>       40  976773088  ada2  GPT  (466G)
         40    2097152     1  freebsd-zfs  (1.0G)
    2097192    2097152        - free -  (1.0G)
    4194344    2097152     2  freebsd-zfs  (1.0G)
    6291496    2097152        - free -  (1.0G)
    8388648    2097152     3  freebsd-zfs  (1.0G)
   10485800  966287328        - free -  (461G)

root at disp:~ # zpool create testraid raidz ada2p1 ada2p2 ada2p3
root at disp:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G   696K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
root at disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	testraid    ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada2p1  ONLINE       0     0     0
	    ada2p2  ONLINE       0     0     0
	    ada2p3  ONLINE       0     0     0

errors: No known data errors
root at disp:~ # zpool export testraid
root at disp:~ # zpool import
   pool: testraid
     id: 3885773658779285422
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

	testraid    ONLINE
	  raidz1-0  ONLINE
	    ada2p1  ONLINE
	    ada2p2  ONLINE
	    ada2p3  ONLINE
root at disp:~ # zpool import testraid
root at disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	testraid    ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada2p1  ONLINE       0     0     0
	    ada2p2  ONLINE       0     0     0
	    ada2p3  ONLINE       0     0     0

errors: No known data errors
root at disp:~ # gpart resize -s 4194304 -i 1 ada2
ada2p1 resized
root at disp:~ # gpart resize -s 4194304 -i 2 ada2
ada2p2 resized
root at disp:~ # gpart resize -s 4194304 -i 3 ada2
ada2p3 resized
root at disp:~ # gpart show ada2
=>       40  976773088  ada2  GPT  (466G)
         40    4194304     1  freebsd-zfs  (2.0G)
    4194344    4194304     2  freebsd-zfs  (2.0G)
    8388648    4194304     3  freebsd-zfs  (2.0G)
   12582952  964190176        - free -  (460G)

root at disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	testraid    ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada2p1  ONLINE       0     0     0
	    ada2p2  ONLINE       0     0     0
	    ada2p3  ONLINE       0     0     0

errors: No known data errors
root at disp:~ # zpool export testraid
root at disp:~ # zpool import
   pool: testraid
     id: 3885773658779285422
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

	testraid    ONLINE
	  raidz1-0  ONLINE
	    ada2p1  ONLINE
	    ada2p2  ONLINE
	    ada2p3  ONLINE
root at disp:~ # zpool import testraid
root at disp:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G  1.02M  2.75G        -        3G     0%     0%  1.00x  ONLINE  -
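
## Note the EXPANDSZ column: 3G, i.e. 1 GiB of newly available space
## on each of the three resized partitions.
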
root at disp:~ # zpool set autoexpand=on testraid
root at disp:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G  1.32M  2.75G        -        3G     0%     0%  1.00x  ONLINE  -
root at disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	testraid    ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada2p1  ONLINE       0     0     0
	    ada2p2  ONLINE       0     0     0
	    ada2p3  ONLINE       0     0     0

errors: No known data errors
root at disp:~ # zpool online -e testraid ada2p1
root at disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	testraid    ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada2p1  ONLINE       0     0     0
	    ada2p2  ONLINE       0     0     0
	    ada2p3  ONLINE       0     0     0

errors: No known data errors
root at disp:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  5.75G  1.20M  5.75G        -         -     0%     0%  1.00x  ONLINE  -
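
## SIZE has grown from 2.75G to 5.75G and EXPANDSZ is cleared; with
## autoexpand=on, onlining a single device apparently triggered the
## expansion of the whole raidz.
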
root at disp:~ # 
root at disp:~ # zpool destroy testraid
root at disp:~ # gpart destroy -F ada2
ada2 destroyed
--------------- TESTCASE 2  END  ------------------------
--------------- TESTCASE 3 BEGIN ------------------------

## continuation of script from testcase 2 ##

root at disp:~ # gpart create -s GPT ada2
ada2 created
root at disp:~ # gpart add -t freebsd-zfs -s 1G ada2
ada2p1 added
root at disp:~ # gpart add -t freebsd-zfs -s 1G -b 4194344 ada2
ada2p2 added
root at disp:~ # gpart add -t freebsd-zfs -s 1G -b 8388648 ada2
ada2p3 added
root at disp:~ # zpool create testraid raidz ada2p1 ada2p2 ada2p3
root at disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	testraid    ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada2p1  ONLINE       0     0     0
	    ada2p2  ONLINE       0     0     0
	    ada2p3  ONLINE       0     0     0

errors: No known data errors
root at disp:~ # gpart resize -s 4194304 -i 1 ada2
ada2p1 resized
root at disp:~ # gpart resize -s 4194304 -i 2 ada2
ada2p2 resized
root at disp:~ # gpart resize -s 4194304 -i 3 ada2
ada2p3 resized
root at disp:~ # zpool set autoexpand=yes testraid
root at disp:~ # zpool online -e testraid ada2p1

## At this point the command hung forever and we were dead in the
## water, with no disk I/O happening.

The output of "ps" shows a "D"eadlock on wait channel "tx->tx_s":
   0 4433 1224   0  20  0   7760   3764 tx->tx_s D+    9   0:00.01 zpool

No "df", no "sync", no "reboot".
--------------- TESTCASE 3  END  ------------------------

--------------- TESTCASE 4 BEGIN ------------------------

## DON'T DARE TO COMMENT ON THE AGE OF THIS HARDWARE!
## (it is a precious antique)
##

Script started on Thu May 16 22:07:04 2019
root at edge:~ # camcontrol devlist
<Maxtor 33073H3 YAH814Y0>          at scbus0 target 0 lun 0 (ada0,pass0)
<ASUS CD-S340 3.40>                at scbus1 target 0 lun 0 (pass1,cd0)
<QUANTUM ATLAS10K3_36_WLS 020W>    at scbus2 target 0 lun 0 (da0,pass2)
<IBM IC35L018UWDY10-0 S25F>        at scbus2 target 2 lun 0 (da1,pass3)
<IBM IC35L018UWDY10-0 S25F>        at scbus2 target 4 lun 0 (da2,pass4)
<SanDisk SDSSDA120G Z22000RL>      at scbus4 target 0 lun 0 (ada1,pass5)
<ST3000DM008-2DM166 CC26>          at scbus5 target 0 lun 0 (ada2,pass6)
<KINGSTON SA400S37120G SBFK71E0>   at scbus7 target 0 lun 0 (ada3,pass7)
<Hitachi HDS5C1010CLA382 JC4OA3MA>  at scbus8 target 0 lun 0 (ada4,pass8)
<Kingston DataTraveler G2 1.00>    at scbus10 target 0 lun 0 (da3,pass9)
root at edge:~ # gpart add -t freebsd-zfs -s 1G ada2
ada2p5 added
root at edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
          40   209715200     1  freebsd  (100G)
   209715240  1687971896     2  freebsd-zfs  (805G)
  1897687136  1924953472     4  freebsd-zfs  (918G)
  3822640608    55838024     3  freebsd-zfs  (27G)
  3878478632     2097152     5  freebsd-zfs  (1.0G)
  3880575784  1979957344        - free -  (944G)

root at edge:~ # gpart add -t freebsd-zfs -s 1G -b 3882672936 ada2
ada2p6 added
root at edge:~ # gpart add -t freebsd-zfs -s 1G -b 3886867240 ada2
ada2p7 added
root at edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
          40   209715200     1  freebsd  (100G)
   209715240  1687971896     2  freebsd-zfs  (805G)
  1897687136  1924953472     4  freebsd-zfs  (918G)
  3822640608    55838024     3  freebsd-zfs  (27G)
  3878478632     2097152     5  freebsd-zfs  (1.0G)
  3880575784     2097152        - free -  (1.0G)
  3882672936     2097152     6  freebsd-zfs  (1.0G)
  3884770088     2097152        - free -  (1.0G)
  3886867240     2097152     7  freebsd-zfs  (1.0G)
  3888964392  1971568736        - free -  (940G)

root at edge:~ # zpool create testraid raidz ada2p5 ada2p6 ada2p7
root at edge:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G   656K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
root at edge:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	testraid    ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada2p5  ONLINE       0     0     0
	    ada2p6  ONLINE       0     0     0
	    ada2p7  ONLINE       0     0     0

errors: No known data errors
root at edge:~ # gpart resize -s 4194304 -i 5 ada2
ada2p5 resized
root at edge:~ # gpart resize -s 4194304 -i 6 ada2
ada2p6 resized
root at edge:~ # gpart resize -s 4194304 -i 7 ada2
ada2p7 resized
root at edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
          40   209715200     1  freebsd  (100G)
   209715240  1687971896     2  freebsd-zfs  (805G)
  1897687136  1924953472     4  freebsd-zfs  (918G)
  3822640608    55838024     3  freebsd-zfs  (27G)
  3878478632     4194304     5  freebsd-zfs  (2.0G)
  3882672936     4194304     6  freebsd-zfs  (2.0G)
  3886867240     4194304     7  freebsd-zfs  (2.0G)
  3891061544  1969471584        - free -  (939G)

root at edge:~ # zpool set autoexpand=on testraid
root at edge:~ # zpool list testraid
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
testraid  2.75G   848K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
root at edge:~ # zpool export testraid
root at edge:~ # zpool import
   pool: testraid
     id: 12608152059619624422
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
	devices and try again.
   see: http://illumos.org/msg/ZFS-8000-3C
 config:

	testraid                  UNAVAIL  insufficient replicas
	  raidz1-0                UNAVAIL  insufficient replicas
	    13692722988113028666  UNAVAIL  cannot open
	    10312580954503443965  UNAVAIL  cannot open
	    16943054157341459289  UNAVAIL  cannot open

root at edge:~ # zpool list
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
backup   918G   461G   456G        -         -    12%    50%  1.00x  ONLINE  -
bm       800G   581G   219G        -         -    13%    72%  1.00x  ONLINE  -
gr      50.5G  17.2G  33.3G        -         -    15%    34%  1.00x  ONLINE  -
im      26.5G  11.5G  15.0G        -         -    47%    43%  1.00x  ONLINE  -
root at edge:~ # sync
root at edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
          40   209715200     1  freebsd  (100G)
   209715240  1687971896     2  freebsd-zfs  (805G)
  1897687136  1924953472     4  freebsd-zfs  (918G)
  3822640608    55838024     3  freebsd-zfs  (27G)
  3878478632     4194304     5  freebsd-zfs  (2.0G)
  3882672936     4194304     6  freebsd-zfs  (2.0G)
  3886867240     4194304     7  freebsd-zfs  (2.0G)
  3891061544  1969471584        - free -  (939G)

root at edge:~ # gpart resize -s 2097152 -i 5 ada2
ada2p5 resized
root at edge:~ # gpart resize -s 2097152 -i 6 ada2
ada2p6 resized
root at edge:~ # gpart resize -s 2097152 -i 7 ada2
ada2p7 resized
root at edge:~ # gpart show ada2
=>        40  5860533088  ada2  GPT  (2.7T)
          40   209715200     1  freebsd  (100G)
   209715240  1687971896     2  freebsd-zfs  (805G)
  1897687136  1924953472     4  freebsd-zfs  (918G)
  3822640608    55838024     3  freebsd-zfs  (27G)
  3878478632     2097152     5  freebsd-zfs  (1.0G)
  3880575784     2097152        - free -  (1.0G)
  3882672936     2097152     6  freebsd-zfs  (1.0G)
  3884770088     2097152        - free -  (1.0G)
  3886867240     2097152     7  freebsd-zfs  (1.0G)
  3888964392  1971568736        - free -  (940G)

root at edge:~ # zpool import
Assertion failed: (avl_find() succeeded inside avl_add()), file /usr/src/sys/cddl/contrib/opensolaris/common/avl/avl.c, line 649.
Abort (core dumped)
root at edge:~ # zpool import
Assertion failed: (avl_find() succeeded inside avl_add()), file /usr/src/sys/cddl/contrib/opensolaris/common/avl/avl.c, line 649.
Abort (core dumped)
root at edge:~ # gpart delete -i 7 ada2
ada2p7 deleted
root at edge:~ # gpart delete -i 6 ada2
ada2p6 deleted
root at edge:~ # gpart delete -i 5 ada2
ada2p5 deleted
root at edge:~ # zpool import
root at edge:~ # ls -la /dev/gptid/
total 1
dr-xr-xr-x  2 root  wheel      512 May 16 20:54 .
dr-xr-xr-x  9 root  wheel      512 May 16 20:54 ..
crw-r-----  1 root  operator  0xed May 16 22:22 4ea3d975-7816-11e9-a104-00e01836f13c
crw-r-----  1 root  operator  0xef May 16 22:22 934b476f-7816-11e9-a104-00e01836f13c
crw-r-----  1 root  operator  0x94 May 16 20:54 ac0d2fe4-4b3b-11e9-8d45-00e01836f13c
crw-r-----  1 root  operator  0xf1 May 16 22:22 b71d41a1-7816-11e9-a104-00e01836f13c

##
## Ignore the one timestamped 20:54! That one belongs to the regular
## installation (and I have not yet found a way to get rid of it).
##

root at edge:~ # 
root at edge:~ # exit
exit

Script done on Thu May 16 23:09:43 2019

##
## During this procedure the following errors appeared in the syslog:

May 16 22:22:31 <kern.crit> edge kernel: g_access(944): provider gptid/4ea3d975-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:31 <kern.crit> edge kernel: g_access(944): provider gptid/4ea3d975-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:31 <kern.crit> edge kernel: g_dev_taste: make_dev_p() failed (gp->name=gptid/4ea3d975-7816-11e9-a104-00e01836f13c, error=17)
May 16 22:22:48 <kern.crit> edge kernel: g_access(944): provider gptid/934b476f-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:48 <kern.crit> edge kernel: g_access(944): provider gptid/934b476f-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:48 <kern.crit> edge kernel: g_dev_taste: make_dev_p() failed (gp->name=gptid/934b476f-7816-11e9-a104-00e01836f13c, error=17)
May 16 22:22:55 <kern.crit> edge kernel: g_access(944): provider gptid/b71d41a1-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:55 <kern.crit> edge kernel: g_access(944): provider gptid/b71d41a1-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:22:55 <kern.crit> edge kernel: g_dev_taste: make_dev_p() failed (gp->name=gptid/b71d41a1-7816-11e9-a104-00e01836f13c, error=17)
May 16 22:23:21 <kern.info> edge kernel: pid 11779 (zpool), uid 0: exited on signal 6 (core dumped)
May 16 22:27:35 <kern.info> edge kernel: pid 12342 (zpool), uid 0: exited on signal 6 (core dumped)
May 16 22:42:36 <kern.crit> edge kernel: g_access(944): provider gptid/934b476f-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:42:36 <kern.crit> edge kernel: g_access(944): provider gptid/b71d41a1-7816-11e9-a104-00e01836f13c has error 6 set
May 16 22:42:36 <kern.crit> edge kernel: g_access(944): provider gptid/4ea3d975-7816-11e9-a104-00e01836f13c has error 6 set
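
## For reference: errno 6 is ENXIO ("Device not configured") and errno 17
## is EEXIST ("File exists"). make_dev_p() failing with EEXIST means a
## device node of that gptid name already existed, which fits the
## duplicate-node suspicion in the commentary below.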

##
##
--------------- TESTCASE 4  END  ------------------------

--------------- TESTCASE 5 BEGIN ------------------------
Script started on Thu May 16 22:45:58 2019
root at disp:~ # camcontrol devlist
<KINGSTON SA400S37240G S1Z40102>   at scbus1 target 0 lun 0 (pass0,ada0)
<Hitachi HDS5C1050CLA382 JC2OA50E>  at scbus2 target 0 lun 0 (pass1,ada1)
<WDC WD5000AAKS-00A7B2 01.03B01>   at scbus3 target 0 lun 0 (pass2,ada2)
<TSSTcorp CDDVDW SH-224DB SB01>    at scbus4 target 0 lun 0 (pass3,cd0)
<AHCI SGPIO Enclosure 1.00 0001>   at scbus5 target 0 lun 0 (pass4,ses0)
<General UDisk 5.00>               at scbus6 target 0 lun 0 (da0,pass5)
root at disp:~ # gpart create -s GPT ada2
ada2 created
root at disp:~ # gpart add -t freebsd-zfs -s 1G ada2
ada2p1 added
root at disp:~ # gpart add -t freebsd-zfs -s 1G -b 4194344 ada2
ada2p2 added
root at disp:~ # gpart add -t freebsd-zfs -s 1G -b 8388648 ada2
ada2p3 added
root at disp:~ # gpart show ada2
=>       40  976773088  ada2  GPT  (466G)
         40    2097152     1  freebsd-zfs  (1.0G)
    2097192    2097152        - free -  (1.0G)
    4194344    2097152     2  freebsd-zfs  (1.0G)
    6291496    2097152        - free -  (1.0G)
    8388648    2097152     3  freebsd-zfs  (1.0G)
   10485800  966287328        - free -  (461G)

root at disp:~ # zpool create -f testraid raidz ada2p1 ada2p2 ada2p3
root at disp:~ # zpool list
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
build       78G  34.0G  44.0G        -         -    35%    43%  1.00x  ONLINE  -
media      464G   418G  46.5G        -         -    27%    89%  1.00x  ONLINE  -
testraid  2.75G   552K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
zdesk     39.5G  11.9G  27.6G        -         -    16%    30%  1.00x  ONLINE  -
root at disp:~ # gpart resize -s 4194304 -i 1 ada2
ada2p1 resized
root at disp:~ # gpart resize -s 4194304 -i 2 ada2
ada2p2 resized
root at disp:~ # gpart resize -s 4194304 -i 3 ada2
ada2p3 resized
root at disp:~ # zpool set autoexpand=on testraid
root at disp:~ # zpool export testraid
root at disp:~ # zpool import
   pool: testraid
     id: 9285999494183920856
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
	devices and try again.
   see: http://illumos.org/msg/ZFS-8000-3C
 config:

	testraid                  UNAVAIL  insufficient replicas
	  raidz1-0                UNAVAIL  insufficient replicas
	    5467198674063294812   UNAVAIL  cannot open
	    16413066309772469567  UNAVAIL  cannot open
	    10976529604851394099  UNAVAIL  cannot open
root at disp:~ #

## Here the typescript ends due to the system crash.

## The next command entered was
## # gpart resize -s 2097152 -i 2 ada2
##
## The system showed some messages for a quarter of a second and then
## rebooted. Nevertheless, the resize itself had succeeded, as was
## visible after the reboot.
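##
## Presumably the updated partition table had already been written to
## disk before the panic fired; the backtrace below shows the crash in
## g_dev_orphan(), i.e. while GEOM was withdrawing the old provider,
## after the resize proper was done.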

## Resizing partition #2 gave this output (transcribed from a photo):

@$ gpart resize -s 2097152 -i 2 ada2


Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address  = 0x8
fault code             = supervisor read data, page not present
instruction pointer    = 0x20:0xffffffff8076103e
stack pointer          = 0x28:0xfffffe02299209e0
frame pointer          = 0x28:0xfffffe02299209f0
code segment           = base 0x0, limit 0xfffff, type 0x1b
                       = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags       = interrupt enabled, resume, IOPL = 0
current process        = 13 (g_event)
[ thread pid 13 tid 100025 ]
Stopped at      g_dev_orphan+0x2e:     cmpb    $0,0x8(%r14)
db>
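
## The faulting instruction reads a byte at offset 0x8 of %r14, and the
## fault virtual address is 0x8, so this looks like a NULL-pointer
## dereference inside g_dev_orphan().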

##
## Resizing of partition #3 then delivered the same page fault.
##
## After this, the pool could be imported again:


Script started on Thu May 16 23:28:14 2019
root at disp:~ # gpart show ada2
=>       40  976773088  ada2  GPT  (466G)
         40    2097152     1  freebsd-zfs  (1.0G)
    2097192    2097152        - free -  (1.0G)
    4194344    2097152     2  freebsd-zfs  (1.0G)
    6291496    2097152        - free -  (1.0G)
    8388648    2097152     3  freebsd-zfs  (1.0G)
   10485800  966287328        - free -  (461G)

root at disp:~ # zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
build    78G  34.0G  44.0G        -         -    35%    43%  1.00x  ONLINE  -
media   464G   418G  46.5G        -         -    27%    89%  1.00x  ONLINE  -
zdesk  39.5G  11.9G  27.6G        -         -    16%    30%  1.00x  ONLINE  -
root at disp:~ # zpool import
   pool: testraid
     id: 9285999494183920856
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

	testraid    ONLINE
	  raidz1-0  ONLINE
	    ada2p1  ONLINE
	    ada2p2  ONLINE
	    ada2p3  ONLINE
root at disp:~ # zpool import testraid
root at disp:~ # zpool list
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
build       78G  34.0G  44.0G        -         -    35%    43%  1.00x  ONLINE  -
media      464G   418G  46.5G        -         -    27%    89%  1.00x  ONLINE  -
testraid  2.75G   968K  2.75G        -         -     0%     0%  1.00x  ONLINE  -
zdesk     39.5G  11.9G  27.6G        -         -    16%    30%  1.00x  ONLINE  -
root at disp:~ # zpool status testraid
  pool: testraid
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	testraid    ONLINE       0     0     0
	  raidz1-0  ONLINE       0     0     0
	    ada2p1  ONLINE       0     0     0
	    ada2p2  ONLINE       0     0     0
	    ada2p3  ONLINE       0     0     0

errors: No known data errors
root at disp:~ # exit

Script done on Thu May 16 23:29:29 2019
--------------- TESTCASE 5  END  ------------------------

COMMENTARY
----------

Case 1:
The USB stick showed seek times of 10,000 ms and more during the
operations and became hot. It seems the device is rather unfit for
such an operation and/or not capable of handling FLUSH commands
properly.

Case 4:
It appears that the device nodes are STILL PRESENT in /dev/gptid
at the end of the procedure, even AFTER they had been destroyed.

Might it be that they somehow get created in duplicate, and that
this is the reason for the failures?

The main difference in the -successful- testcase 2 seems to be
that an export+import is done AFTER resizing the partitions and
BEFORE trying to grow the pool. This way ZFS can properly adjust
the "expandsize" attribute before actually doing the grow.
So the issue seems mainly to be a matter of (not) following the
proper procedure (which, to my knowledge, is not documented
anywhere).
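
For reference, here is the sequence of the successful testcase 2,
condensed into one block (a sketch only, with the device names of my
test setup; the decisive step is the export/import between the resize
and the "online -e"):

  gpart create -s GPT ada2
  gpart add -t freebsd-zfs -s 1G ada2              # p1
  gpart add -t freebsd-zfs -s 1G -b 4194344 ada2   # p2, 1 GiB gap before it
  gpart add -t freebsd-zfs -s 1G -b 8388648 ada2   # p3, 1 GiB gap before it
  zpool create testraid raidz ada2p1 ada2p2 ada2p3
  gpart resize -s 4194304 -i 1 ada2                # grow each partition to 2 GiB
  gpart resize -s 4194304 -i 2 ada2
  gpart resize -s 4194304 -i 3 ada2
  zpool export testraid                            # let ZFS re-read the new sizes
  zpool import testraid                            #   -> EXPANDSZ shows 3G
  zpool set autoexpand=on testraid
  zpool online -e testraid ada2p1                  # pool grows to 5.75G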

cheerio,
PMc

