[SOLVED] Re: "zpool attach" problem

Scott Bennett bennett at sdf.org
Sat Nov 21 22:33:39 UTC 2020

Hi David,
     Thanks for your reply.  I was about to respond to my own message to say that the
issue has been resolved, but I saw your reply first.  However, I respond below to
your comments and questions, as well as stating what the problem turned out to be.

     On Fri, 20 Nov 2020 21:16:06 -0800 David Christensen <dpchrist at holgerdanske.com>
wrote:

>On 2020-11-19 22:59, Scott Bennett via freebsd-questions wrote:
>>       I had a pool with two two-way mirrors as the top-level vdevs.  I needed
>> to shift some of those partitions by a short distance on the drives, so I
>> detached and deleted and rebuilt them one at a time until I hit a snag.  Here
>> is the situation.
>> Script started on Fri Nov 20 00:40:36 2020
>> hellas#	gpart show -l ada2 da0 da1 da2
>> =>        40  5860533088  ada2  GPT  (2.7T)
>>            40  4294967296     1  WD-WMC130F2V1RN  (2.0T)
>>    4294967336    31457496        - free -  (15G)
>>    4326424832   125829120    11  zmisc mirror-0 1  (60G)
>>    4452253952   209715200    15  bw2-0  (100G)
>>    4661969152  1198563976        - free -  (572G)
>> =>        34  3907029101  da0  GPT  (1.8T)
>>            34          14       - free -  (7.0K)
>>            48  3749709824    1  WD  WCC4MH1P7LYS  (1.7T)
>>    3749709872    73400320    5  bw1-0  (35G)
>>    3823110192        2000       - free -  (1.0M)
>>    3823112192    83886080    8  zmisc mirror-1 1  (40G)
>>    3906998272       30863       - free -  (15M)
>> =>        34  3907029100  da1  GPT  (1.8T)
>>            34          14       - free -  (7.0K)
>>            48  3749709824    1  Seagate NA5KYLVM  (1.7T)
>>    3749709872          16       - free -  (8.0K)
>>    3749709888    73400320    5  bw1-1  (35G)
>>    3823110208        1984       - free -  (992K)
>>    3823112192    83886080    8  zmisc mirror-1 0  (40G)
>>    3906998272       30862       - free -  (15M)
>> =>        40  3907029088  da2  GPT  (1.8T)
>>            40           8       - free -  (4.0K)
>>            48  3749709824    1  WD-WCC6N7KD2YAK  (1.7T)
>>    3749709872          16       - free -  (8.0K)
>>    3749709888    31457280    5  bw0-0  (15G)
>>    3781167168        1984       - free -  (992K)
>>    3781169152   125829120    8  zmisc mirror-0 0  (60G)
>>    3906998272       30856       - free -  (15M)
>> hellas#	zpool status zmisc
>>    pool: zmisc
>>   state: ONLINE
>>    scan: resilvered 25.8G in 0 days 00:16:07 with 0 errors on Fri Nov 20 00:10:19 2020
>> config:
>> 	zmisc       ONLINE       0     0     0
>> 	  ada2p11   ONLINE       0     0     0
>> 	  mirror-1  ONLINE       0     0     0
>> 	    da0p8   ONLINE       0     0     0
>> 	    da1p8   ONLINE       0     0     0
>> errors: No known data errors
>> hellas#	zpool attach zmisc ada2p11 da2p8
>> cannot attach da2p8 to ada2p11: no such pool or dataset
>> hellas#	exit
>> exit
>> Script done on Fri Nov 20 00:42:33 2020
>>       Would somebody please tell me what I am doing wrong here?  Many thanks in
>> advance to whoever can help.
>It looks like you added the slice ada2p11 to zmisc, rather than the 
>mirror ada2p11 da2p8.  If so, these commands could fix things:
     No, ada2p11 is what was left after detaching a partition from
mirror-0 of that pool.
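For concreteness, the shuffle went roughly like this (a sketch, not a
transcript; the gpart offsets and flags are placeholders, and the labels are
the ones visible in the "gpart show -l" output above):

```shell
# Detach one side of mirror-0, leaving ada2p11 as a plain vdev.
zpool detach zmisc da2p8
# Delete and recreate the partition at its shifted position
# (-b/-s start and size values omitted here).
gpart delete -i 8 da2
gpart add -t freebsd-zfs -i 8 -l "zmisc mirror-0 0" da2
# Re-form the mirror -- this is the step that failed.
zpool attach zmisc ada2p11 da2p8
```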
>     # zpool remove zmisc ada2p11
>     # zpool add zmisc mirror ada2p11 da2p8
     I thought about doing that, but the allocated portion of mirror-0
was too much to fit into the free space in mirror-1.  Also, even if
mirror-1 could have held that much, that kind of monkeying around ends
up creating a situation of horribly unbalanced allocation, and so I
would have hesitated at least a day or three to see if I could find a
better way, and it was a good thing that I stopped and went to bed when
I was done posting my message.  (See confession further below.)
>But, I am confused by your storage architecture.  Why one internal "3 
>TB" drive and three external "2 TB" drives?  What is the 2.0T internal 
>slice for?  What are the three 1.7 GiB external slices for?  What are 

     Long story.  Sigh.  About eight or nine years ago I began using
ZFS under 9.something i386.  (Currently the machine is running
11.4-STABLE amd64.)  At first it was all experimental while I learned
enough to begin to feel some confidence in using it.  Once I had purchased
six 1.8 TB drives, I created my largest pool called rz7A and quickly moved
my backups and archives into it, and AFAIK I have not lost a single byte
due to hardware errors, power failures, or anything else since then.  (I
likely need to think up a better name for it, but that is way down on
the list of my worries for now.)  It comprised six 1.7 TB partitions on
the six 1.8 TB (actually closer to 1.9 TB, but FreeBSD truncates, rather
than rounds) drives in a raidz2.  That left a bit of room for other things
I intended to do that would take up much less space.  It also meant that
those 1.7 TB partitions could be exactly the same in terms of space and
not differ among them due to slight differences in the real storage
capacities of drives of different make{,r}s and models.  In the
intervening time there have been many drive failures (mostly Seagates, but
a few aged-out WD drives, too).  About a year ago, a drive failed, and I
replaced it with a WD Black 1.8 TB drive, which continues to function.
     Then in January or February two drives failed in rapid succession.
At that time, I found two 2.7 TB enterprise drives as replacements, and
they were priced much lower apiece than the drive I had bought a month or
two earlier.  While allocating the partitions on them, I allocated 2 TB
on each as the replacements for the 1.7 TB partitions that were on the
failed drives.  This past summer one of the new enterprise drives failed.
It turned out that the reason they had been available so cheaply was that
they had been leftover stock of a now discontinued line, so basically
they were sold at a closeout price.  Getting a replacement for the failed
enterprise drive under warranty turned out to be a nightmare.  First,
the manufacturer said they didn't have a drive of that capacity in the
new line, and they wanted to know if I would accept a "4 TB" drive as a
replacement, which I naturally approved.  When no drive appeared after two
weeks, I called and discovered they had left the apartment number off of
the address, even though I had had the agent repeat the address back to
me on the phone.  The parcel service had returned the drive to them as
undeliverable.  The manufacturer then turned around and *gave my drive to
somebody else*, which I believe legally constitutes theft and sale of
stolen property, but I did not pursue that.  They said they would send
another, but that didn't appear either.  I called and was told that it had 
been held up until they could confirm the shipping address *again*, which
I then did.  When the 3.6 TB replacement arrived, it was *not* an enterprise
drive.  I called again and asked what was going on and was told that they
substituted a non-enterprise drive because they didn't have a "4 TB"
enterprise drive available.  I then gave them a pretty bad time about
leaving my array at risk in a degraded state for so long by their not living
up to their warranty, as well as having given a drive that belonged to me
away to somebody else.  They kept putting me off, requiring me to speak with
one person after another in their company, usually on separate phone calls on
different days, and to ship the non-enterprise drive back to
them, but eventually someone arranged for an enterprise drive (of their
current line of enterprise drives) to be shipped from their Canadian
inventory with an expected additional delay due to having to pass customs
and exacerbated by the COVID-19 situation.  The drive arrived after one week.
Total time until I had a replacement under warranty was nearly *two months*
on a failed *enterprise* drive.  I know I am not a high-volume customer like
Netflix or Amazon, but really(!) that seems unreasonable.
     So that is the story in a nutshell of how my ever-changing configuration
has evolved and why some of the unallocated space on the drives appears where
it does.
     As the 1.8 TB drives give up, I intend to replace them with larger-
capacity drives and expand the single top-level vdev in that pool, such
that each component will have a 2 TB capacity, rather than its current
1.7 TB capacity.  If disk capacities continue to increase with prices
decreasing fast enough compared to the remaining lifetimes of the 1.8
TB drives, I may expand the components still further.  The two enterprise
drives already have the spare space to expand their components quite
substantially more than the present 2 TB each.
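In shell terms, the growth path for each member would look something like the
following (a sketch only; daXp1 and daYp1 are placeholder names, and the
expansion of the raidz2 only takes effect once every member has been
enlarged):

```shell
# Let the pool grow automatically once all members are bigger.
zpool set autoexpand=on rz7A
# Resilver a failed 1.7 TB member onto a larger partition on a new drive.
zpool replace rz7A daXp1 daYp1
# If the extra capacity does not appear on its own after the last
# replacement, force the expansion of a member explicitly:
zpool online -e rz7A daYp1
```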

>the bw?-? slices for, and why are they different sizes?  Why are the 
>zmisc slices different sizes?  What about ada0 and ada1?  And, do you 

     The bw?-? partitions are the components of three gmirror(8) mirrors,
which gconcat(8) joins into a single buildwork device, as their status
output shows:

       Name    Status  Components
mirror/bw0  COMPLETE  da3p5 (ACTIVE)
                      da2p5 (ACTIVE)
mirror/bw1  COMPLETE  da0p5 (ACTIVE)
                      da1p5 (ACTIVE)
mirror/bw2  COMPLETE  ada2p15 (ACTIVE)
                      ada3p5 (ACTIVE)
            Name  Status  Components
concat/buildwork      UP  mirror/bw1
                          mirror/bw0
                          mirror/bw2

(N.B. the components of buildwork are listed out of sequence here.  They
are configured as bw0, bw1, bw2.)
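For reference, a layering like the one in the status output above could have
been built with commands along these lines (a hedged reconstruction; the
original creation flags and the filesystem choice are not shown in this
thread):

```shell
# Three two-way gmirrors over the bw?-? partitions ...
gmirror label bw0 da3p5 da2p5
gmirror label bw1 da0p5 da1p5
gmirror label bw2 ada2p15 ada3p5
# ... concatenated into one buildwork device.
gconcat label buildwork mirror/bw0 mirror/bw1 mirror/bw2
# Assuming a UFS filesystem goes on top:
newfs -U /dev/concat/buildwork
```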

>have spaces in your GPT labels?
     The motherboard in the tower has six SATA ports.  Two are for optical
drives, and four are for HDDs/SSDs.  There is also an eSATA controller that
I used for one of the external drives for a while, but something failed,
and now I can't use the drive that way, so it is on a USB 3.0 port.  The
machine is very old and has no native USB 3.0 support, but I added two PCIe
cards for USB 3.0, one with four ports and one with two ports.  The external
drives are currently connected with two per controller, and the four-port
card also has a seven-port USB 3.0 external hub plugged into it that rarely
sees any use (mostly just flash drives).
     ada0 and ada1 are the much smaller boot drives and are not involved in
what happened.
     ada2 and ada3 are the two drives internal to my ancient tower that have
components of the large raidz2, and da0 through da3 contain the rest of the
six components.
     The GPT label fields in the "gpart show -l" output in my earlier
message have no unprintable characters in them, so they are exactly as
displayed, spaces included.
     Now, on to my confession.  The problem was that I had reinserted the
wrong partition into bw0 due to a typo; i.e., I had typed da2p8 instead of
da2p5, so da2p8 was not available. :-( (It would be nice if GEOM and ZFS
error messages were more intelligibly worded, but if wishes were horses ...)
Once I saw what the problem was, it was trivially easy and quick to fix.
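In other words, the fix amounted to something like this (a sketch of the
obvious commands, not a transcript):

```shell
# Pull the mistyped partition back out of gmirror bw0 ...
gmirror remove bw0 da2p8
# ... insert the partition that was intended all along ...
gmirror insert bw0 da2p5
# ... and retry the attach, which now succeeds and re-forms mirror-0.
zpool attach zmisc ada2p11 da2p8
```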
     Again, thank you much for your reply.  I wish I had gotten the trouble
shot sooner (sleep can only be postponed for so long) and posted a followup
sooner (ditto) in order to have saved you the bother, but it's nice to know
that someone usually does try to help when someone asks for help on these
lists.

                                  Scott Bennett, Comm. ASMELG, CFIAG
* Internet:   bennett at sdf.org   *xor*   bennett at freeshell.org  *
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *

More information about the freebsd-questions mailing list