sparc64/141918: [ehci] ehci_interrupt: unrecoverable error,
controller halted (sparc64)
Marius Strobl
marius at alchemy.franken.de
Tue Apr 3 15:10:04 UTC 2012
The following reply was made to PR sparc64/141918; it has been noted by GNATS.
From: Marius Strobl <marius at alchemy.franken.de>
To: Manuel Tobias Schiller <mala at hinterbergen.de>
Cc: bug-followup at FreeBSD.org
Subject: Re: sparc64/141918: [ehci] ehci_interrupt: unrecoverable error, controller halted (sparc64)
Date: Tue, 3 Apr 2012 17:00:43 +0200
On Tue, Apr 03, 2012 at 10:37:14AM +0200, Manuel Tobias Schiller wrote:
> On Mon, 2 Apr 2012 10:43:14 +0200
> Manuel Tobias Schiller <mala at hinterbergen.de> wrote:
>
> > On Mon, 2 Apr 2012 01:00:56 +0200
> > Manuel Tobias Schiller <mala at hinterbergen.de> wrote:
> >
> > > On Sun, 1 Apr 2012 12:41:24 +0200
> > > Marius Strobl <marius at alchemy.franken.de> wrote:
> > >
> > > > Well, the individual patches shouldn't make things worse except for
> > > > the second one causing more memory to be used so I'd suggest to
> > > > combine them. If in the end things actually work we still can check
> > > > what changes are needed for that.
> > > > Looking at the Linux USB code, the FreeBSD one doesn't seem to honor
> > > > some DMA constraints and at least for the alignment it's actually
> > > > hard to follow what value eventually is used. One thing that stands
> > > > out is that for EHCI, the boundary is 4096. This is most easily
> > > > fixed by defining USB_PAGE_SIZE to 4096 in sys/dev/usb/usb_busdma.h.
> > > >
> > > > Marius
> > >
> > > Ok, the second patch on its own doesn't appear to work either, so I'm
> > > trying the combination of patches now. By the way: defining
> > > USB_PAGE_SIZE to 4096 in sys/dev/usb/usb_busdma.h is a bad idea - the
> > > kernel panics with a backtrace pointing into the mmu-related code.
> > > Probably has to do with sparc64 mmu only supporting 8k pages, so I'm
> > > not terribly surprised... Ok, I'm waiting for the next make
> > > buildkernel to finish, and I'll let you know what comes out.
> > >
> > > Manuel
> >
> > Ok, I also tested a kernel with both patches, and the issue persists. Do
> > you have something else to try?
> >
> > Manuel
> >
>
> Hi Marius,
>
> I did a bit of code reading (/usr/src/sys/dev/usb/controller/ehci.c near
> line 1494), and I realised that the "unrecoverable error" message should
> only be triggered if the EHCI status register has the EHCI_STS_HCH bit
> set - according to the status word dump in my log, it is not set (just
> after the "unrecoverable error" message). The register dump re-reads the
> status register from the hardware. Could it be that some controllers have
> a glitch or something on that particular bit, and we better re-read the
> status register before we conclude that the controller "really wanted to
> set that bit"?

You mean EHCI_STS_HSE? This is expected: ehci_interrupt() clears the
pending interrupt status bits before dumping the register content:

	EOWRITE4(sc, EHCI_USBSTS, status); /* acknowledge */

> I can also see that the bit is set in the original bug report. I don't
> know if that machine is just faster (and the bit has not had the time to
> clear yet), or if we're talking about two different problems here...
Probably, the other controller just sets it again after the bit is
cleared.
Marius
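
The ordering described above (read the status, acknowledge it, then dump
the register) can be illustrated with a small stand-alone simulation. This
is only a sketch and not the driver code: struct fake_ehci, the fake_*
helpers and the error_persists flag are hypothetical names made up for the
illustration, and only the write-1-to-clear behaviour of the HSE bit is
modelled. The two bit values are the USBSTS positions from the EHCI
specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* USBSTS bit positions as given in the EHCI specification. */
#define EHCI_STS_HSE 0x00000010u /* host system error, write 1 to clear */
#define EHCI_STS_HCH 0x00001000u /* controller halted, read-only */

/* Hypothetical stand-in for the controller; not taken from ehci.c. */
struct fake_ehci {
	uint32_t usbsts;         /* simulated status register */
	int      error_persists; /* nonzero: hardware re-asserts HSE after ack */
};

static uint32_t
fake_read_sts(const struct fake_ehci *hc)
{
	return (hc->usbsts);
}

/* Writing a 1 to HSE clears it; other bits are left alone in this model. */
static void
fake_write_sts(struct fake_ehci *hc, uint32_t val)
{
	hc->usbsts &= ~(val & EHCI_STS_HSE);
	if (hc->error_persists)
		hc->usbsts |= EHCI_STS_HSE; /* controller sets it again */
}

/*
 * Same order as in the discussion: read, acknowledge, then re-read for
 * the dump. Returns the status the handler acted on and stores what a
 * later register dump would show in *dumped.
 */
static uint32_t
fake_interrupt(struct fake_ehci *hc, uint32_t *dumped)
{
	uint32_t status;

	assert(hc != NULL && dumped != NULL);
	status = fake_read_sts(hc);   /* handler sees HSE here */
	fake_write_sts(hc, status);   /* acknowledge */
	*dumped = fake_read_sts(hc);  /* what the dump prints */
	return (status);
}
```

With error_persists set to 0, the value returned has EHCI_STS_HSE set while
the dumped value does not, which would match the log from this report. With
error_persists set to 1, the dump still shows the bit, which would match the
original report if that controller simply raises the error again.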