svn commit: r253708 - head/sys/dev/ipmi

John Baldwin jhb at freebsd.org
Mon Jul 29 20:28:12 UTC 2013


On Monday, July 29, 2013 3:59:12 pm Sean Bruno wrote:
> On Mon, 2013-07-29 at 10:54 -0400, John Baldwin wrote:
> > On Saturday, July 27, 2013 12:32:34 pm Sean Bruno wrote:
> > > Author: sbruno
> > > Date: Sat Jul 27 16:32:34 2013
> > > New Revision: 253708
> > > URL: http://svnweb.freebsd.org/changeset/base/253708
> > > 
> > > Log:
> > >   At some point after stable/7 the ACPI and ISA interfaces to the IPMI controller
> > >   no longer have the parent in the device tree.  This causes the identify
> > >   function in ipmi_isa.c to attempt to probe and poke at the ISA IPMI interface
> > 
> > They never had a common parent, even in 6.x and 7.x.
> > 
> The identify function in isa_ipmi.c shows that there is already an
> ipmi(4) device attached (ACPI) version and aborts on 7.x.  in 9.x and
> higher (not testing on 8.x) the identify function does not see an
> attached ipmi interface and attempts to create /dev/ipmi1

The identify function in 7.x has no such check:

static void
ipmi_isa_identify(driver_t *driver, device_t parent)
{
	struct ipmi_get_info info;
	uint32_t devid;

	if (ipmi_smbios_identify(&info) && info.iface_type != SSIF_MODE &&
	    device_find_child(parent, "ipmi", -1) == NULL) {
		/*
		 * XXX: Hack alert.  On some broken systems, the IPMI
		 * interface is described via SMBIOS, but the actual
		 * IO resource is in a PCI device BAR, so we have to let
		 * the PCI device attach ipmi instead.  In that case don't
		 * create an isa ipmi device.  For now we hardcode the list
		 * of bus, device, function tuples.
		 */
		devid = pci_cfgregread(0, 4, 2, PCIR_DEVVENDOR, 4);
		if (devid != 0xffffffff &&
		    ipmi_pci_match(devid & 0xffff, devid >> 16) != NULL)
			return;
		BUS_ADD_CHILD(parent, 0, "ipmi", -1);
	}
}

The only check in 7 was the one you just moved in ipmi_isa_attach():

static int
ipmi_isa_attach(device_t dev)
{
	struct ipmi_softc *sc = device_get_softc(dev);
	struct ipmi_get_info info;
	const char *mode;
	int count, error, i, type;

	/*
	 * Pull info out of the SMBIOS table.  If that doesn't work, use
	 * hints to enumerate a device.
	 */
	if (!ipmi_smbios_identify(&info) &&
	    !ipmi_hint_identify(dev, &info))
		return (ENXIO);

	/*
	 * Give other drivers precedence.  Unfortunately, this doesn't
	 * work if we have an SMBIOS table that duplicates a PCI device
	 * that's later on the bus than the PCI-ISA bridge.
	 */
	if (ipmi_attached)
		return (EBUSY);
	...
}

> Am I just confused on the bus relationship here?
> 
> We've gone over this a couple of times in different emails on different
> lists.  I've just never sat down and walked through the code.  If you
> see a better way to keep ipmi(4) from erroneously attaching to the ISA
> interface, let me know.

I haven't seen any mention of this problem before.  I've seen threads about
the watchdog issue (trying to set watchdog on shutdown which wants to use
threads, etc.), but not this.

Also, can you provide the console messages without this patch?  The previous
check in ipmi_isa_attach() is long before we touch the BMC or ever get
around to creating /dev/ipmi1.  (Just because you see ipmi1: in dmesg doesn't
mean it's created /dev/ipmi1.)

> > >   Move the check for ipmi_attached out of the ipmi_isa_attach function and into
> > >   the ipmi_isa_identify function.  Remove the check of the device tree for
> > >   ipmi devices attached.
> > >   
> > >   This probing appears to make Broadcom management firmware on Dell machines
> > >   crash and emit NMI EISA warnings at various times requiring power cycles
> > >   of the machines to restore.
> > 
> > This makes no sense.  All you are doing is skipping ipmi_smbios_identify()
> > which just looks at the SMBIOS table in RAM.  It doesn't try to probe the
> > BMC at all (no register accesses, etc.).  If just reading a table in memory
> > causes side effects, then running dmidecode in userland should be hosing your
> > machines as well.
> > 
> 
> Probably right.  I'm not exactly sure what is making the Broadcom
> firmware fall over and die.  But when I see the console emitting "NMI
> EISA" error messages, this ususally requires a full reboot as the
> network interface has crashed.  Hopefully, I can dig up more "fact" soon
> as testing continues.

I'd rather be sure this is the right fix, and if it isn't I'd prefer to
revert this as I don't think it is actually fixing anything.

-- 
John Baldwin


More information about the svn-src-head mailing list