sparc64/141918: [ehci] ehci_interrupt: unrecoverable error, controller halted (sparc64)

Marius Strobl marius at alchemy.franken.de
Tue Apr 3 15:10:04 UTC 2012


The following reply was made to PR sparc64/141918; it has been noted by GNATS.

From: Marius Strobl <marius at alchemy.franken.de>
To: Manuel Tobias Schiller <mala at hinterbergen.de>
Cc: bug-followup at FreeBSD.org
Subject: Re: sparc64/141918: [ehci] ehci_interrupt: unrecoverable error, controller halted (sparc64)
Date: Tue, 3 Apr 2012 17:00:43 +0200

 On Tue, Apr 03, 2012 at 10:37:14AM +0200, Manuel Tobias Schiller wrote:
 > On Mon, 2 Apr 2012 10:43:14 +0200
 > Manuel Tobias Schiller <mala at hinterbergen.de> wrote:
 > 
 > > On Mon, 2 Apr 2012 01:00:56 +0200
 > > Manuel Tobias Schiller <mala at hinterbergen.de> wrote:
 > > 
 > > > On Sun, 1 Apr 2012 12:41:24 +0200
 > > > Marius Strobl <marius at alchemy.franken.de> wrote:
 > > > 
 > > > > Well, the individual patches shouldn't make things worse except for
 > > > > the second one causing more memory to be used so I'd suggest to
 > > > > combine them. If in the end things actually work we still can check
 > > > > what changes are needed for that.
 > > > > Looking at the Linux USB code, the FreeBSD one doesn't seem to honor
 > > > > some DMA constraints and at least for the alignment it's actually
 > > > > hard to follow what value eventually is used. One thing that stands
 > > > > out is that for EHCI, the boundary is 4096. This is most easily
 > > > > fixed by defining USB_PAGE_SIZE to 4096 in sys/dev/usb/usb_busdma.h.
 > > > > 
 > > > > Marius
 > > > 
 > > > Ok, the second patch on its own doesn't appear to work either, so I'm
 > > > trying the combination of patches now. By the way: defining
 > > > USB_PAGE_SIZE to 4096 in sys/dev/usb/usb_busdma.h is a bad idea - the
 > > > kernel panics with a backtrace pointing into the mmu-related code.
 > > > Probably has to do with sparc64 mmu only supporting 8k pages, so I'm
 > > > not terribly surprised... Ok, I'm waiting for the next make
 > > > buildkernel to finish, and I'll let you know what comes out.
 > > > 
 > > > Manuel
 > > 
 > > Ok, I also tested a kernel with both patches, and the issue persists. Do
 > > you have something else to try?
 > > 
 > > Manuel
 > >
 > 
 > Hi Marius,
 > 
 > I did a bit of code reading (/usr/src/sys/dev/usb/controller/ehci.c near
 > line 1494), and I realised that the "unrecoverable error" message should
 > only be triggered if the EHCI status register has the EHCI_STS_HCH bit
 > set - according to the status word dump in my log, it is not set (just
 > after the "unrecoverable error" message). The register dump re-reads the
 > status register from the hardware. Could it be that some controllers have
 > a glitch or something on that particular bit, and we had better re-read the
 > status register before we conclude that the controller "really wanted to
 > set that bit"?
 
 You mean EHCI_STS_HSE? This is expected: ehci_interrupt() clears the
 pending interrupt status bits before dumping the register content:
 EOWRITE4(sc, EHCI_USBSTS, status);      /* acknowledge */
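
 To illustrate why the dump no longer shows the bit, here is a minimal
 sketch of write-1-to-clear behaviour. It is not the driver code: the
 struct fake_ehci type and the fake_read_sts(), fake_write_sts() and
 ack_then_dump() helpers are hypothetical stand-ins for the hardware
 register and its accessors. The bit values follow the EHCI
 specification's USBSTS layout.

```c
#include <stdint.h>

/* Bit values as laid out for USBSTS in the EHCI specification. */
#define STS_HSE      0x00000010u  /* Host System Error (write 1 to clear) */
#define STS_HCH      0x00001000u  /* HCHalted (read-only) */
#define STS_W1C_MASK 0x0000003fu  /* interrupt status bits, write-1-to-clear */

/* Hypothetical stand-in for the hardware register; not a FreeBSD type. */
struct fake_ehci {
	uint32_t usbsts;
};

static uint32_t
fake_read_sts(const struct fake_ehci *hc)
{
	return (hc->usbsts);
}

/* Writing a 1 to a W1C bit clears it; read-only bits ignore the write. */
static void
fake_write_sts(struct fake_ehci *hc, uint32_t val)
{
	hc->usbsts &= ~(val & STS_W1C_MASK);
}

/*
 * Mimics the order of operations described above: latch the status,
 * acknowledge it, then re-read the register as a dump routine would.
 * Returns the latched value; *dumped receives the re-read value.
 */
static uint32_t
ack_then_dump(struct fake_ehci *hc, uint32_t *dumped)
{
	uint32_t status = fake_read_sts(hc) & STS_W1C_MASK;

	fake_write_sts(hc, status);	/* acknowledge */
	*dumped = fake_read_sts(hc);	/* what the register dump sees */
	return (status);
}
```

 With STS_HSE set beforehand, ack_then_dump() returns a value that still
 has HSE set, while the re-read value has it cleared, unless the
 (simulated) controller raises it again in between.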
 
 > I can also see that the bit is set in the original bug report. I don't
 > know if that machine is just faster (and the bit has not had the time to
 > clear yet), or if we're talking about two different problems here...
 
 Probably, the other controller just sets it again after the bit is
 cleared.
 
 Marius
 

