[Bug 243225] "mpr0: Out of chain frames" boot hang after clang 9.0.1 import (probably timing, not compiler related)
bugzilla-noreply at freebsd.org
Thu Jan 9 19:00:54 UTC 2020
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243225
Bug ID: 243225
Summary: "mpr0: Out of chain frames" boot hang after clang
9.0.1 import (probably timing, not compiler related)
Product: Base System
Version: 12.0-STABLE
Hardware: Any
OS: Any
Status: New
Severity: Affects Some People
Priority: ---
Component: kern
Assignee: bugs at FreeBSD.org
Reporter: terry-freebsd at glaver.org
I updated my test system from r356239 to r356557 (which crosses the clang 9.0.1
import) and started receiving "mpr0: Out of chain frames" at boot time, which
causes a boot hang with the mpr0 controller being reset and reinitialized, and
the error happening again. This happens before the device (tape drive) is
detected, and happens regardless of whether anything is connected to the mpr
controller.
I had this before (many months ago) on this system and worked with Dell
service, replacing boards / cables / tape drive, etc. The solution at that
point was to put the controller into a different slot, which apparently hid
whatever timing problem is causing the boot hang. That's why I say in the PR
title that I don't think it is a clang 9.0.1 problem (incorrect code
generation). Presumably clang 9 generates faster (hopefully) or slower code
that is triggering the problem.
Escaping to the boot loader, killing some time, and then saying "boot" without
changing anything will sometimes let the system boot normally, which again
points to a possible timing problem.
The boot messages from r356239 are:
mpr0: <Avago Technologies (LSI) SAS3008> port 0x8000-0x80ff mem 0xc9100000-0xc910ffff,0xc8000000-0xc80fffff irq 64 at device 0.0 on pci17
mpr0: Firmware: 16.00.08.00, Driver: 23.00.00.00-fbsd
mpr0: IOCCapabilities: 7a85c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,MSIXIndex,HostDisc,FastPath,RDPQArray>
mpr0: Found device <c01<SspTarg,Direct>,End Device> <6.0Gbps> handle<0x0009> enclosureHandle<0x0001> slot 7
mpr0: At enclosure level 0 and connector name (1 )
sa0 at mpr0 bus 0 scbus14 target 7 lun 0
In r356557, only the first 3 of those lines appear, followed by:
mpr0: Out of chain frames, consider increasing hw.mpr.max_chains
And then, eventually, by:
mpr0: Calling Reinit from mpr_wait_command, timeout=60, elapsed=60
mpr0: Reinitializing controller
At that point we're in a perpetual loop of reinit / timeout.
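For reference, the tunable named in the "Out of chain frames" message is a
loader tunable described in mpr(4). A minimal sketch of raising it, assuming an
illustrative value of 4096 (a hypothetical number, not a recommendation, and not
something that has been tried on this system):

```
# /boot/loader.conf
# Assumption: 4096 is only an example value; see mpr(4) for the tunable.
hw.mpr.max_chains="4096"
```

The same variable can be set for a single boot from the loader prompt with
"set hw.mpr.max_chains=4096" followed by "boot". Since the hang looks
timing-related, a larger value may only mask the problem rather than fix it.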
I can make the problem system available via remote console access (Dell iDRAC
8) or can try any suggestions for debugging this further myself.