Thinkpad R60 hangs when booting recent 8.4-STABLE
Don Lewis
truckman at FreeBSD.org
Fri May 2 18:48:13 UTC 2014
On 2 May, John Baldwin wrote:
> On Friday, May 02, 2014 5:39:19 am Don Lewis wrote:
>> On 1 May, John Baldwin wrote:
>> > On Thursday, May 01, 2014 1:55:41 am Don Lewis wrote:
>> >> On 30 Apr, John Baldwin wrote:
>> >>
>> >> > Are you up for doing some printf sleuthing? There are two odd things that I
>> >> > see so far:
>> >> >
>> >> > 1) the base address of 0. The question here is if pci_add_map() in
>> >> > sys/dev/pci/pci.c decides to set start to 0 explicitly, or if it happens
>> >> > further up the callchain (should be bus_alloc_resource calls in
>> >> > sys/dev/acpica/acpi_pcib_acpi.c, sys/x86/x86/nexus.c and then in the
>> >> > rman code itself in sys/kern/subr_rman.c)
>> >> >
>> >> > 2) The 'reserved' printfs during boot probe. Those come from a printf in
>> >> > pci_alloc_resource() in sys/dev/pci/pci.c. However, that should not be called
>> >> > until a driver attaches to a device and calls bus_alloc_resource(). It should
>> >> > not be called from pci_add_child() as it seems to be now.
>> >>
>> >> The call graph for the four earlier ones that you previously pointed
>> >> out (not hostb0) is:
>> >> pci_add_child()
>> >> pci_add_resources()
>> >> *_early_takeover()
>> >> [I suspect]
>> >> bus_alloc_resource_any()
>> >> pci_alloc_resource()
>> >>
>> >> These are the three system uhci controllers and the system ehci
>> >> controller, which apparently pass this test:
>> >> pci_get_class(dev) == PCIC_SERIALBUS &&
>> >> pci_get_subclass(dev) == PCIS_SERIALBUS_USB
>> >
>> > Oh, ok. That is fine, and that explains why it was selective in the past (and
>> > only for I/O resources). That just leaves 1) then. It would be especially good
>> > to know what pci_add_map() does when it sees this BAR during the bus probe.
>>
>> In either case after reading the BAR, base=0xd0000000,
>> testval=0xf0000008. and the following gets printed:
>> map[10]: type Prefetchable Memory, range 32, base 0xd0000000, size 28, enabled
>> Then we call resource_list_add(..., d0000000, dfffffff, 10000000)
>>
>> When we don't pass the size in flags (w/o r262226),
>> resource_list_alloc() eventually calls nexus_alloc_resource() with the
>> expected start, end, and count values. After returning from
>> pci_add_map, we write 0xd0000000 to the BAR.
>>
>> If we *do* pass the size in flags (with r262226), then
>> resource_list_alloc() never calls nexus_alloc_resource(), but it does
>> return a non-NULL value. We then write 0 to the BAR.
>>
>> Digging into resource_list_alloc(), I see start, count, and end having
>> the expected values before the second call to BUS_ALLOC_RESOURCE()
>> but afterwards rman_get_start(rle->res) is 0 and rman_get_end(rle->res)
>> is 0xfffffff. We don't call nexus_alloc_resource(). Setting flags to 0
>> for this specific call to BUS_ALLOC_RESOURCE() based on the value of
>> start fixes the problem.
>>
>> The culprit appears to be the call to rman_reserve_resource() in
>> acpi_alloc_resource(). With r262226 it "succeeds", but the returned
>> resource has the wrong start and end addresses. Without r262226,
>> rman_reserve_resource returns NULL and we call BUS_ALLOC_RESOURCE(),
>> which calls nexus_alloc_resource(), which returns the expected range.
>>
>> I enabled debug.rman_debug and saw the following:
>>
>> rman_reserve_resource_bound: <ACPI I/O memory addresses> request:
>> [0xd0000000, 0xdfffffff], length 0x10000000, flags 28736, device (null)
>> considering [0xfec00000, 0xffffffff]
>> truncated region: [0, 0xdfffffff]; size 0xe0000000 (requested
>> 0x10000000)
>> candidate region [0, 0xdfffffff], size 0xe0000000
>> allocating at the end
>>
>> This is with subr_rman.c from HEAD.
>>
>> This expression will overflow:
>> if (s->r_start + count - 1 > end) {
>> I don't understand the point of this test. Is it an optimization to
>> exit the loop early based on an assumption about the ordering of
>> elements in the list?
>
> Yes. The elements in an rman list are supposed to be sorted.
>
>> This expression will also overflow:
>> rstart = (rstart + amask) & ~amask;
>> which is where I think the start address is getting set to 0.
>>
>>
>> After reverting subr_rman.c and applying the following patch, I see:
>> considering [0xfec00000, 0xffffffff]
>> no unshared regions found
>> Then nexus_alloc_resource() gets called and I see
>> considering [0x3ff60000, 0xfebfffff]
>> and then all the right stuff happens.
>>
>>
>> Index: sys/kern/subr_rman.c
>> ===================================================================
>> --- sys/kern/subr_rman.c (revision 262226)
>> +++ sys/kern/subr_rman.c (working copy)
>> @@ -468,11 +468,9 @@
>> */
>> for (s = r; s; s = TAILQ_NEXT(s, r_link)) {
>> DPRINTF(("considering [%#lx, %#lx]\n", s->r_start, s->r_end));
>> - if (s->r_start + count - 1 > end) {
>> - DPRINTF(("s->r_start (%#lx) + count - 1> end (%#lx)\n",
>> - s->r_start, end));
>> - break;
>> - }
>> + if (s->r_end < start + count - 1 ||
>> + s->r_start > end - count + 1)
>> + continue;
>> if (s->r_flags & RF_ALLOCATED) {
>> DPRINTF(("region is allocated\n"));
>> continue;
>>
>>
>
> I think this looks good.
>
>> I have no idea why 9.2-STABLE was able to boot on this machine ...
>
> acpi_alloc_resource() works differently in 9.x and later. It only
> tries the internal rmans if nexus_alloc_resource() fails. In 8.x it
> tries the internal rmans first. That change was made in r216674.
Yeah, I enabled rman debugging and booted a 9.2 snapshot CD. I only saw
one call to rman_reserve_resource_bound(), which looked like it was from
nexus_alloc_resource().
Here's a more paranoid patch, though I don't like the code duplication
between the two loops:
Index: sys/kern/subr_rman.c
===================================================================
--- sys/kern/subr_rman.c (revision 262226)
+++ sys/kern/subr_rman.c (working copy)
@@ -461,6 +461,11 @@
}
amask = (1ul << RF_ALIGNMENT(flags)) - 1;
+ if (start + amask < start) {
+ DPRINTF(("start+amask wrapped around\n"));
+ goto out;
+ }
+
/* If bound is 0, bmask will also be 0 */
bmask = ~(bound - 1);
/*
@@ -468,11 +473,10 @@
*/
for (s = r; s; s = TAILQ_NEXT(s, r_link)) {
DPRINTF(("considering [%#lx, %#lx]\n", s->r_start, s->r_end));
- if (s->r_start + count - 1 > end) {
- DPRINTF(("s->r_start (%#lx) + count - 1> end (%#lx)\n",
- s->r_start, end));
- break;
- }
+ if (s->r_end < start + count - 1 ||
+ s->r_start > end - count + 1 ||
+ s->r_start + amask < s->r_start)
+ continue;
if (s->r_flags & RF_ALLOCATED) {
DPRINTF(("region is allocated\n"));
continue;
@@ -584,8 +588,10 @@
goto out;
for (s = r; s; s = TAILQ_NEXT(s, r_link)) {
- if (s->r_start > end)
- break;
+ if (s->r_end < start + count - 1 ||
+ s->r_start > end - count + 1 ||
+ s->r_start + amask < s->r_start)
+ continue;
if ((s->r_flags & flags) != flags)
continue;
rstart = ulmax(s->r_start, start);
More information about the freebsd-stable
mailing list