[Bug 270340] zpool import does not return

From: <bugzilla-noreply_at_freebsd.org>
Date: Mon, 20 Mar 2023 14:58:30 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=270340

--- Comment #7 from Peter Much <pmc@citylink.dinoex.sub.org> ---
(In reply to Graham Perrin from comment #6)
Graham, thank you; these are wonderful questions, and they inspired me to
investigate further.

> If you create a directory for altroot, and import with option -R, then are symptoms reproducible?

It doesn't get that far.
I have now collected a truss log. zpool import scans everything it finds in
the deepest corners of geom, issues *four* aio requests to each of those
devices, properly receives all the answers except three, and then stalls.

The three missing devices are:
da6
da6.eli
da6.elip1
And these are also the three threads that look slightly different in the
procstat -kk output. Also interesting (three devices times four requests):
vfs.aio.num_unmapped_aio: 12
vfs.aio.num_queue_count: 12

da6 is the disk that takes the longest time to spinup. So I tried some more:

# camcontrol tur da6
Unit is ready
# zpool import
no pools available to import

# camcontrol stop da6
Unit stopped successfully
# zpool import  # starts disk, and:
no pools available to import

So it seems not to be directly tied to da6 alone. But:

# for i in `seq 0 6`; do camcontrol stop da$i; done
Unit stopped successfully
Unit stopped successfully
Unit stopped successfully
Unit stopped successfully
Unit stopped successfully
Unit stopped successfully
Unit stopped successfully
# for i in `seq 0 6`; do camcontrol tur da$i; done
Unit is not ready
Unit is not ready
Unit is not ready
Unit is not ready
Unit is not ready
Unit is not ready
Unit is not ready

# zpool import  # starts all disks, and
<hangs forever>


>Knowing more about the hardware (disks and connectivity) might help. 

You don't want to know that. ;) This is basically a retirement facility for
no-longer-loved hardware (I can't throw away things that are still in working
order).

<QUANTUM ATLAS10K3_36_WLS 020W>    at scbus0 target 0 lun 0 (pass0,da0)
<IBM IC35L018UWDY10-0 S25F>        at scbus0 target 2 lun 0 (pass1,da1)
<IBM IC35L018UWDY10-0 S25F>        at scbus0 target 4 lun 0 (pass2,da2)
<SEAGATE ST373207LC 0005>          at scbus0 target 8 lun 0 (pass3,da3)
<SEAGATE ST373207LC 0005>          at scbus0 target 10 lun 0 (pass4,da4)
<SEAGATE ST373207LC 0005>          at scbus0 target 12 lun 0 (pass5,da5)
<IBM DCAS-34330 S65A>              at scbus1 target 0 lun 0 (pass6,da6)
<WDC WD5000AAKS-00A7B2 01.03B01>   at scbus2 target 0 lun 0 (pass7,ada0)
<KINGSTON SA400S37240G S3E00100>   at scbus3 target 0 lun 0 (pass8,ada1)
<ST3000DM008-2DM166 CC26>          at scbus4 target 0 lun 0 (pass9,ada2)
<SPCC Solid State Disk V0303B0>    at scbus5 target 0 lun 0 (pass10,ada3)
<AHCI SGPIO Enclosure 2.00 0001>   at scbus6 target 0 lun 0 (ses0,pass11)
<Hitachi HDS5C1010CLA382 JC4OA3MA>  at scbus7 target 0 lun 0 (pass12,ada4)
<Hitachi HDS5C1010CLA382 JC4OA3MA>  at scbus8 target 0 lun 0 (pass13,ada5)
<TS480GSSD220S VD0R3A06>           at scbus9 target 0 lun 0 (pass14,ada6)
<HGST HUS726040ALA610 A5GNT907>    at scbus10 target 0 lun 0 (pass15,ada7)
<ST2000VM003-1ET164 SC13>          at scbus11 target 0 lun 0 (pass16,ada8)
<AHCI SGPIO Enclosure 2.00 0001>   at scbus13 target 0 lun 0 (ses1,pass17)
<SanDisk Cruzer Facet 1.26>        at scbus14 target 0 lun 0 (pass18,da7)
<HP SSD S 700 250GB 0301>          at scbus15 target 0 lun 0 (pass19,da8)

SCSI-SE via mpt, SCSI-LVD via ahd, SATA via the onboard Wellsburg, USB via the
onboard Wellsburg, USB3-to-SATA via some Chinese no-name adapter.

> Can you describe the modifications?

None necessary, as I know now.

Conclusion: since this is apparently a problem with the interaction between
disk spin-up and aio, I can work around it by spinning the disks up manually
before running zpool import.
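
The workaround can be sketched as a small script. This is only a sketch for
this particular machine: the device list (da0..da6) and the settle delay are
assumptions that would need adjusting elsewhere.

```shell
#!/bin/sh
# Spin up all SCSI disks before importing, so zpool import's aio
# requests don't race against drives that are still spinning up.
for i in `seq 0 6`; do
    camcontrol start da$i     # sends a SCSI START UNIT to da0..da6
done
sleep 10                      # settle delay (assumed; da6 spins up slowly here)
zpool import
```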

-- 
You are receiving this mail because:
You are the assignee for the bug.