From nobody Sat Mar 01 22:25:04 2025
From: Warner Losh <imp@bsdimp.com>
Date: Sat, 1 Mar 2025 15:25:04 -0700
Subject: Re: PCI topology-based hints
To: Ravi Pokala <rpokala@freebsd.org>
Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
List-Id: Technical discussions relating to FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-hackers
References: <60B4BAAA-E333-4219-99BE-D6C1B198E0BD@freebsd.org> <94A47C59-46D7-40E3-B680-8364558BF623@panasas.com>
In-Reply-To: <94A47C59-46D7-40E3-B680-8364558BF623@panasas.com>

On Sat, Mar 1, 2025 at 3:19 PM Ravi Pokala <rpokala@freebsd.org> wrote:

> > Yes. You can use what's already there, though it may not be documented,
> > or is at the very least underdocumented. You can wire devices to the
> > UEFI path, which is guaranteed to be unique and avoids all these problems.
> >
> > hint.nvme.77.at="UEFI:PcieRoot(2)/Pci(0x1,0x1)/Pci(0x0,0x0)"
> >
> > Which is on pcie root complex 2, then follow device 1 function 1 on that
> > bus to device 0 function 0 on the second zero. `devctl getpath UEFI nvme0`
> > will do all the heavy lifting for you. TaDa! No bus numbers.
> >
> > I added this several years ago to solve exactly this problem, or what
> > happens when you lose a riser card, etc.
> >
> > Warner
>
> Sweet! Thanks Warner, that's exactly what I'm looking for. :-)
>
> You're right that it's under-documented. I think it should be relatively
> easy to find a list of buses which support wiring; I think this search
> should find them:
>
> | grep -Erl 'DEVMETHOD.*hint' /usr/src/sys

No. This is kinda independent of buses, but I think only PCI supports them
now. I'd look for DEVMETHOD.*get_device_path, though, since that's required
to make this work. I think I only implemented PCI though.

> And then make sure that the bus's manpage describes the hinting mechanism,
> and add cross-refs between the bus's manpage and device.hints.5
>
> If that sounds right, I'll see if I can find some time to do that in the
> near future

Yes. I think that's right. I have this review
https://reviews.freebsd.org/D49195 that I just whipped up (history suggests
my writing will need a lot of help).

Warner

> Thanks again!
>
> -Ravi (rpokala@)
>
> From: Warner Losh <imp@bsdimp.com>
> Date: Saturday, March 1, 2025 at 13:23
> To: Ravi Pokala <rpokala@freebsd.org>
> Cc: "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
> Subject: Re: PCI topology-based hints
>
> On Fri, Feb 28, 2025 at 12:43 AM Ravi Pokala <rpokala@freebsd.org> wrote:
>
> > Hi folks,
> >
> > Setting up device attachment hints based on PCI address is easy; it's
> > right there in the manual (pci.4):
> >
> > | DEVICE WIRING
> > |      You can wire the device unit at a given location with device.hints.
> > |      Entries of the form hints.<name>.<unit>.at="pci<B>:<S>:<F>" or
> > |      hints.<name>.<unit>.at="pci<D>:<B>:<S>:<F>" will force the driver name to
> > |      probe and attach at unit unit for any PCI device found to match the
> > |      specification, where:
> > | ...
> > |    Examples
> > |      Given the following lines in /boot/device.hints:
> > |      hint.nvme.3.at="pci6:0:0" hint.igb.8.at="pci14:0:0" If there is a device
> > |      that supports igb(4) at PCI bus 14 slot 0 function 0, then it will be
> > |      assigned igb8 for probe and attach.  Likewise, if there is an nvme(4)
> >
> > That's all well and good in a world without pluggable and hot-swappable
> > devices, but things get trickier when devices can appear and disappear.
> >
> > We have systems which have multiple U.2 bays, which take NVMe PCIe
> > devices. Across multiple reboots, the <D, B, S, F> address assigned to the
> > device in each of those bays was consistent. Great! We set up wiring hints
> > for those devices, and confirmed that the wiring worked when devices were
> > swapped ...
> >
> > ... until we added a NIC into the hot-swap OCP slot and rebooted.
> >
> > While things continued to work before the reboot, upon reboot, many
> > addresses changed. It looks like the slot into which the NIC was
> > installed is on the same segment of the bus as the U.2 bays. When that
> > segment was enumerated, the addresses got shuffled to include the NIC.
> >
> > So, we can't necessarily rely on the PCI <D, B, S, F> address. But the
> > PCIe topology is consistent, even when devices are added and removed --
> > it's the physical wiring between the root complex, bridges, devices, and
> > expansion slots.
> >
> > The `lspci' utility -- ubiquitous on Linux, and available via the
> > "sysutils/pciutils" port on FreeBSD -- can show the topology. For example,
> > consider three NVMe devices, reported by `pciconf', and by `lspci's tree
> > view (device details redacted):
> >
> > | % pciconf -l | tr '@' ' ' | sort -V -k2 | grep nvme
> > | nvme2 pci0:65:0:0: ...
> > | nvme0 pci0:133:0:0: ...
> > | nvme1 pci0:137:0:0: ...
> > | %
> > | % lspci -vt | grep -C2 -E '^..-|NVMe'
> > | -+-[0000:00]-+-00.0  Root Complex
> > |  |           +-00.2  ...
> > |  |           +-00.3  ...
> > | --
> > |  |           +-18.6  ...
> > |  |           \-18.7  ...
> > |  +-[0000:40]-+-00.0  Root Complex
> > |  |           +-00.2  ...
> > |  |           +-00.3  ...
> > |  |           +-01.0  ...
> > |  |           +-01.1-[41]----00.0  ${VENDOR} NVMe
> > |  |           +-01.3-[42-43]--
> > |  |           +-01.4-[44-45]--
> > | --
> > |  |           |            \-00.1  ...
> > |  |           \-07.2  ...
> > |  +-[0000:80]-+-00.0  Root Complex
> > |  |           +-00.2  ...
> > |  |           +-00.3  ...
> > | --
> > |  |           +-03.0  ...
> > |  |           +-03.1-[83-84]--
> > |  |           +-03.2-[85-86]----00.0  ${VENDOR} NVMe
> > |  |           +-03.3-[87-88]--
> > |  |           +-03.4-[89-8a]----00.0  ${VENDOR} NVMe
> > |  |           +-04.0  ...
> > |  |           +-05.0  ...
> > | --
> > |  |           |            \-00.1  ...
> > |  |           \-07.2  ...
> > |  \-[0000:c0]-+-00.0  Root Complex
> > |              +-00.2  ...
> > |              +-00.3  ...
> >
> > The first set of xdigits, "[0000:n0]", are a "domain" and "bus", which are
> > only shown for the Root Complex devices. The second set of xdigits,
> > "xy.z", are either an endpoint's "slot" and "function", or else a bridge
> > device's (address?) and (slot?). If there is a bridge, there is a set of
> > xdigits in brackets next to each (slot?), which becomes the "bus" of the
> > attached endpoint, and then "xy.z", which is the endpoint's "slot" and
> > "function".
> >
> > Thus, we can see from the tree that the NVMe devices are "0000:41:00.0",
> > "0000:85:00.0", and "0000:89:00.0". (Which, if you convert to decimal, is
> > the same as reported by `pciconf': "pci0:65:0:0", "pci0:133:0:0",
> > "pci0:137:0:0".) It is also apparent that the latter two devices are
> > connected to the same bridge, which in turn is connected to a different
> > root complex than the first device.
> >
> > The problem is, depending on what devices are connected to a given root
> > complex, the "bus" component which is associated with a bridge slot can
> > change. In the example above, with the current population of devices in
> > the "0000:80" portion of the tree, the "bus" components associated with
> > bridge "03" are "83", "85", "87", and "89". But add another device to
> > "0000:80" and reboot, and the addresses associated with bridge "03" become
> > "84", "86", "88", and "8a".
> >
> > The question is this: How do I indicate that I would like a certain
> > device unit to be wired to a specific bridge device address and slot --
> > which cannot change -- rather than to a specific <D, B, S, F>, where the
> > "B" component can change?
> >
> > Any thoughts?
>
> Yes. You can use what's already there, though it may not be documented,
> or is at the very least underdocumented. You can wire devices to the
> UEFI path, which is guaranteed to be unique and avoids all these problems.
>
> hint.nvme.77.at="UEFI:PcieRoot(2)/Pci(0x1,0x1)/Pci(0x0,0x0)"
>
> Which is on pcie root complex 2, then follow device 1 function 1 on that
> bus to device 0 function 0 on the second zero. `devctl getpath UEFI nvme0`
> will do all the heavy lifting for you. TaDa! No bus numbers.
>
> I added this several years ago to solve exactly this problem, or what
> happens when you lose a riser card, etc.
>
> Warner
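
For anyone following along, here is a small sh sketch of the two
mechanical steps discussed in this thread: converting an lspci-style hex
address to the decimal form that pciconf prints, and formatting a
device.hints line that wires a unit to a UEFI path. The function names
(lspci_to_pciconf, uefi_hint) are hypothetical helpers made up for
illustration; they are not part of FreeBSD, and this is only a sketch.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical, not FreeBSD tools.

# Convert an lspci-style address (hex "DDDD:BB:SS.F") into the decimal
# "pciD:B:S:F" form that pciconf prints.
lspci_to_pciconf() {
    addr=$1
    dom=${addr%%:*}     # domain,   e.g. "0000"
    rest=${addr#*:}     # remainder, e.g. "85:00.0"
    bus=${rest%%:*}     # bus,      e.g. "85"
    rest=${rest#*:}     # remainder, e.g. "00.0"
    slot=${rest%%.*}    # slot,     e.g. "00"
    func=${rest#*.}     # function, e.g. "0"
    # printf reads 0x-prefixed arguments as hexadecimal constants.
    printf 'pci%d:%d:%d:%d\n' "0x$dom" "0x$bus" "0x$slot" "0x$func"
}

# Format a device.hints line wiring <driver><unit> to a UEFI device path.
# Arguments: driver name, unit number, UEFI path without the "UEFI:" prefix.
uefi_hint() {
    printf 'hint.%s.%s.at="UEFI:%s"\n' "$1" "$2" "$3"
}
```

With the addresses from the thread, `lspci_to_pciconf 0000:85:00.0`
prints pci0:133:0:0. On a FreeBSD system, something like
`uefi_hint nvme 77 "$(devctl getpath UEFI nvme0)"` might then produce a
line for /boot/device.hints, assuming devctl prints the bare path with
no "UEFI:" prefix; check its actual output before relying on that.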