GPE handler livelock [fixed?]

Yousif Hassan yousif at alumni.jmu.edu
Thu Jan 10 05:01:16 PST 2008


Nate, everyone:

Well guess what?  The patch appears to WORK!!!!

I tested it first with a normal startup with ACPI enabled, and it didn't 
freeze.
Not satisfied, I turned on verbose booting to watch the thermal change 
messages,
and in fact, for the first time, 7.0 successfully changed AC states and TZ 
zones
on this laptop.

Still not satisfied, I began some tasks that generate lots of heat (say,
compiling the kernel), and watched as the temperature rose.  No freeze.

So, I'd like to give a huge thank-you to Nate and the gang, and cautiously,
I'd like to say congratulations, too.

I'm happy to run any additional tests you need to confirm.

Is this fix too late for 7.0 or is it a candidate for an RC?  I'd definitely 
be
willing to test any new release w/ this fix to make sure it works 
out-of-the-box.

--Yousif

----- Original Message ----- 
From: "Nate Lawson" <nate at root.org>
To: "Alexey Starikovskiy" <astarikovskiy at suse.de>
Cc: "Moore, Robert" <robert.moore at intel.com>; "Yousif Hassan" 
<yousif at alumni.jmu.edu>; <freebsd-acpi at FreeBSD.org>
Sent: Monday, January 07, 2008 6:47 PM
Subject: Re: GPE handler livelock


> This makes sense.  For the _L00 method in question, it runs 2 notifies
> before it completes.  So in our queue, that would be:
>
> T1: GPE:_L00
>    [_LOO method runs, GPE removed]
>    [_L00 queues 2 more Notifies]
> T2: Notify, Notify
>    [_L00 completes]
>
> Right now, there is nothing in our code to synchronize _L00's completion
> with completion of the Notifys.  In fact, _L00 finishes before the
> Notifys run.  Also, if another GPE comes in while this GPE is running,
> it will be queued in front of the Notifys and so would execute first.
>
> So I think with just your evgpe.c change, we will defer re-enabling the
> GPE until after the queued Notifys complete since the re-enable task
> will be queued at the same priority.
>
> The only remaining concern I have is if another GPE comes in before one
> or both Notifys run, it will be inserted at the head of the queue.  So
> as a hack, I'm setting the priority of these to be equal.
>
> Yousif, please try the attached patch and don't load a custom ASL.
>
> Thanks,
> Nate
>
> Alexey Starikovskiy wrote:
>> Nate,
>>
>> There are no debugger events in Linux, and all other deferred execution
>> happens
>> in kacpid (GPE, EC and GL included). Notify events used to be executed
>> on this same
>> queue until we noticed deadlocks with HP machines. All events are not
>> prioritized in
>> any way -- it is a simple FIFO. To avoid deadlock, we moved notify
>> events to separate queue,
>> but it had a drawback of enabling level GPE too early, thus I inserted a
>> reschedule call to each completion on first queue, giving the notify
>> queue chance to complete.
>> Later, I was reminded that this approach is not bulletproof, so enabling
>> of the level events was
>> moved to notify queue as well. As it happens after all notify events for
>> the gpe event were called,
>> (but, probably, not executed), enabling of GPE will be deferred until
>> these notify events have chance to
>> complete.
>>
>> So, essentially, we had no priority for any event, but now notify event
>> could preempt execution of
>> any other event, and level GPE event does a flush of notify queue.
>>
>> Hope this helps,
>> Alex.
>>
>> Nate Lawson wrote:
>>> Alex,
>>>
>>> I had one question about your approach.  It maintains two single-thread
>>> task queues (kacpid and kacpi_notify).  It inserts each type of event on
>>> its own queue.  So there is no strict ordering of handling notifies in
>>> priority to other acpi tasks unless you're assuming something about the
>>> linux task priority model.  Do you have any expectation that notify
>>> tasks run before other tasks (perhaps by a special priority assigned to
>>> the kacpi_notify work queue)?
>>>
>>> In FreeBSD, we have a single task queue.  However, we prioritize events
>>> in the queue in the following order (highest to lowest priority):
>>>
>>> * GPE
>>> * EC/global lock
>>> * Notify
>>> * Debugger
>>>
>>> If an event is inserted on the queue with a higher priority and a
>>> previous event has not started executing yet, this priority determines
>>> the order of insertion.  Thus if GPEs keep arriving, the Notify won't be
>>> executed until they're done.
>>>
>>> Thanks,
>>> Nate
>>>
>>> Alexey Starikovskiy wrote:
>>>> Here is the patch...
>>>> Alexey Starikovskiy wrote:
>>>>> I proposed this patch some time ago, it moves _Lxx enabling to the end
>>>>> of Notify queue, thus all notifies must complete before event becomes
>>>>> enabled again.
>>>>> Hope it is readable to non-Linux people...
>>>>>
>>>>> Regards,
>>>>> Alex.
>>>>>
>>>>> Moore, Robert wrote:
>>>>>> No changes that I know of before 20070508.
>>>>>>
>>>>>> You'll need to figure out why you are getting another GPE before the
>>>>>> _Lxx method completes. There was something like this on Linux with
>>>>>> an HP
>>>>>> machine, perhaps Alexey can help.
>>>>>>
>>>>>> As I recall, there was something nasty happening where the TZ trip
>>>>>> points had to be reset before the Notify() handler completed, but 
>>>>>> this
>>>>>> ended up causing another GPE, etc. etc.
>>>>>>
>>>>>> Bob
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Nate Lawson [mailto:nate at root.org]
>>>>>>> Sent: Monday, January 07, 2008 10:09 AM
>>>>>>> To: Moore, Robert
>>>>>>> Cc: Yousif Hassan; freebsd-acpi at FreeBSD.org
>>>>>>> Subject: Re: GPE handler livelock
>>>>>>>
>>>>>>> Bob, thanks for the reply.  That's exactly what my investigation is
>>>>>>> showing also.  It appears we're still on 20070320 so I'm not sure 
>>>>>>> why
>>>>>>> this would affect us though.  Perhaps a similar change was already
>>>>>>> present?  In any case, we should see if an import fixes this.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Nate
>>>>>>>
>>>>>>> Moore, Robert wrote:
>>>>>>>> This sounds suspiciously like the changes we made to the Notify()
>>>>>>>> handling last year. We attempted to make the notify handler run
>>>>>>>> synchronously with the caller to Notify(), but this created more
>>>>>>>> problems than it solved. We ended up returning the behavior of
>>>>>>>> Notify
>>>>>>>> handlers to be asynchronous:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 19 October 2007. Summary of changes for version 20071019:
>>>>>>>>
>>>>>>>> 1) ACPI CA Core Subsystem:
>>>>>>>>
>>>>>>>> Reverted a change to Notify handling that was introduced in version
>>>>>>>> 20070508. This version changed the Notify handling from 
>>>>>>>> asynchronous
>>>>>> to
>>>>>>>> fully synchronous (Device driver Notify handling with respect to 
>>>>>>>> the
>>>>>>>> Notify
>>>>>>>> ASL operator). It was found that this change caused more problems
>>>>>> than
>>>>>>>> it
>>>>>>>> solved and was removed by most users.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: owner-freebsd-acpi at freebsd.org [mailto:owner-freebsd-
>>>>>>>>> acpi at freebsd.org] On Behalf Of Yousif Hassan
>>>>>>>>> Sent: Sunday, January 06, 2008 12:18 PM
>>>>>>>>> To: Nate Lawson
>>>>>>>>> Cc: freebsd-acpi at FreeBSD.org
>>>>>>>>> Subject: Re: GPE handler livelock
>>>>>>>>>
>>>>>>>>> Nate wrote:
>>>>>>>>>> Thanks for digging into this.  I reviewed this and am trying to
>>>>>>>> figure
>>>>>>>>>> out why the _L00 handler never completes.  It keeps getting
>>>>>> preempted
>>>>>>>> by
>>>>>>>>>> the next one.  To help track this down, try removing these two
>>>>>> lines
>>>>>>>>>> from the _L00 method and recompile your ASL:
>>>>>>>>>>
>>>>>>>>>>    Acquire (\_TZ.C173, 0xFFFF)
>>>>>>>>>>    ...
>>>>>>>>>>    Release (\_TZ.C173)
>>>>>>>>>>
>>>>>>>>>> For others who have this problem, instructions on how to 
>>>>>>>>>> recompile
>>>>>>>> and
>>>>>>>>>> load your custom ASL can be found here (11.16.4 and 5):
>>>>>>>>>>
>>>>>> http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/acpi-debug.htm
>>>>>>
>>>>>>>> l
>>>
>
>


--------------------------------------------------------------------------------


> Index: sys/dev/acpica/Osd/OsdSchedule.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/dev/acpica/Osd/OsdSchedule.c,v
> retrieving revision 1.39
> diff -u -r1.39 OsdSchedule.c
> --- sys/dev/acpica/Osd/OsdSchedule.c 22 Mar 2007 18:16:41 -0000 1.39
> +++ sys/dev/acpica/Osd/OsdSchedule.c 7 Jan 2008 23:46:47 -0000
> @@ -114,7 +114,7 @@
>  pri = 5;
>  break;
>     case OSL_NOTIFY_HANDLER:
> - pri = 3;
> + pri = 10; // XXX testing
>  break;
>     case OSL_DEBUGGER_THREAD:
>  pri = 0;
> Index: sys/contrib/dev/acpica/evgpe.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/contrib/dev/acpica/evgpe.c,v
> retrieving revision 1.1.1.12
> diff -u -r1.1.1.12 evgpe.c
> --- sys/contrib/dev/acpica/evgpe.c 22 Mar 2007 17:23:30 -0000 1.1.1.12
> +++ sys/contrib/dev/acpica/evgpe.c 7 Jan 2008 21:17:28 -0000
> @@ -123,6 +123,10 @@
>
> /* Local prototypes */
>
> +static void
> +AcpiEvAsynchEnableGpe (
> +    void                    *Context);
> +
> static void ACPI_SYSTEM_XFACE
> AcpiEvAsynchExecuteGpeMethod (
>     void                    *Context);
> @@ -684,14 +688,26 @@
>         }
>     }
>
> -    if ((LocalGpeEventInfo.Flags & ACPI_GPE_XRUPT_TYPE_MASK) ==
> +    /* Defer enabling of GPE until all notify handlers are done */
> +    AcpiOsExecute(OSL_NOTIFY_HANDLER, AcpiEvAsynchEnableGpe, 
> GpeEventInfo);
> +    return_VOID;
> +}
> +
> +static void
> +AcpiEvAsynchEnableGpe (
> +    void                    *Context)
> +{
> +    ACPI_GPE_EVENT_INFO     *GpeEventInfo = (void *) Context;
> +    ACPI_STATUS             Status;
> +
> +    if ((GpeEventInfo->Flags & ACPI_GPE_XRUPT_TYPE_MASK) ==
>             ACPI_GPE_LEVEL_TRIGGERED)
>     {
>         /*
>          * GPE is level-triggered, we clear the GPE status bit after
>          * handling the event.
>          */
> -        Status = AcpiHwClearGpe (&LocalGpeEventInfo);
> +        Status = AcpiHwClearGpe (GpeEventInfo);
>         if (ACPI_FAILURE (Status))
>         {
>             return_VOID;
> @@ -700,7 +716,7 @@
>
>     /* Enable this GPE */
>
> -    (void) AcpiHwWriteGpeEnableReg (&LocalGpeEventInfo);
> +    (void) AcpiHwWriteGpeEnableReg (GpeEventInfo);
>     return_VOID;
> }
>
> 



More information about the freebsd-acpi mailing list