INTRNG (Was: svn commit: r301453....)

Thu Jul 28 15:33:19 UTC 2016

Dne 27.07.2016 v 19:19 Nathan Whitehorn napsal(a):
>
>
> On 07/27/16 09:27, Michal Meloun wrote:
>
>
> [snip]
>>> The concept is *extremely* necessary, which is why both Linux and
>>> FreeBSD decided to use it independently There is no way to handle
>>> parent buses with a single rman and devices on multiple PICs with
>>> overlapping interrupt ranges without them; neither is there a way to
>>> decode arbitrary-length interrupt specifiers or to handle things like
>>> MSIs. Please see the list of cases at
>>> https://wiki.freebsd.org/Complicated_Interrupts and in my earlier
>>> email for some examples of things that you just can't represent with
>>> this new system, as far as I can tell.
>>>
>>>>     Also, both variants needs attached PIC at bus_alloc_resource()
>>>> time,
>>>> so timing wasn't been changed.
>>> They absolutely do not, as I have explained repeatedly. Due to parent
>>> devices with interrupts handled by their bus children, this is a hard
>>> requirement of any workable system. This is not a theoretical issue; I
>>> have lots of hardware like this.
>> Nathan,  I'm talking  about pre /post  r301453 state. Current INTRNG it
>> has never supported. But yes,  I'm fully understand why you want it. See
>> [2].
>
> Current "INTRNG", at least on PowerPC, has supported this for 10
> years. If it doesn't on ARM, it's because of some trivial
> implementation bug.
>>
>>>> - it implements new BUS_MAP_INTR(). As I understand it,  this is
>>>> problematic for you, and I'm ready to change it. But I need more
>>>> details
>>>> than "it's fundamentally broken".
>>> Please explain (a) what cases it handles that the existing code and
>>> does, and (b) how you would resolve each of the cases on the wiki page
>>> I sent.
>>>
>>> The general issue is that it traces the newbus hierarchy, when
>>> interrupts often do not, and so breaks when you have links to
>>> as-yet-unattached parts of the hierarchy. It also relies on the
>>> assigned "IRQ" numbers being nonoverlapping to avoid rman errors, but
>>> that's a relatively minor thing.
>> Well, from my point of view,  only  "get interrupt data from
>> OFW/GPIO/.."  part of bus_alloc_resource() follows bus hierarchy,  real
>> execution not changed.
>> In any case,  I see many possible variants how we can modify current
>> implementation. For rest, see [2].
>
> But that's an important part, for three reasons:
> 1. The PIC may not exist when bus_alloc_resource() is called
> 2. The bus parents have no useful information to contribute to this
> since the hierarchy doesn't go that way
See [2]

> 3. If you try to use the interrupt pin as the resource ID, which
> r301453 does, you end up with rman conflicts if, for example, a bus
> has two children with interrupts on identically numbered pins on two
> different interrupt controllers.
No, r301453 doesn't do this.
Index to global interrupt source table (irq_sources) is used as resource
ID (and r301453 not changed it). 

>
>>
>>>>   
>>>>> There are some other differences, of varying degrees of importance,
>>>>> but that's the really fundamental one. I haven't seen any cases where
>>>>> r301453 provides functionality absent in the already existing system,
>>>>> but there seem to be a large number of cases (see the first email I
>>>>> sent to freebsd-arch or
>>>>> https://wiki.freebsd.org/Complicated_Interrupts) that the API in
>>>>> r301453 cannot accommodate and that are needed to support a
>>>>> variety of
>>>>> hardware.
>>>> Which single feature has been removed by r301453?
>>> Nothing has been removed by it, because the normal code is intact.
>>> However, the new code is less functional than the old code, so cannot
>>> replace it. In particular, it is architecturally incapable of working
>>> with the kinds of device trees found on PowerPC systems (see the wiki
>>> page). That means we would have to keep both indefinitely, which is a
>>> significant maintenance burden to no gain whatsoever.
>>> -Nathan
>> Firstly,  please,  ignore dependency problem. It will be explained later
>> in [2].
>
> Except that that isn't a workable solution (see the end)
>
>>
>> [1]
>> The original (your) INTRNG assume several things:
>>
>> - value of interrupt parent xref  together with byte contents of
>> 'interrupts' property forms some sort of  'key' (aka virtual IRQ)
>> - byte contents of  'interrupts' property cannot be parsed in any way
>> inside INTRNG core.
>> - data forming this key are sufficient for subsequent mapping (and/or
>> configuring) to real IRQ.
>> - there must  exist bidirectional unique relation between virtual and
>> real IRQs.  So one key (virtual IRQ)  can be mappable to one and just
>> one
>>     real IRQ.  And only one virtual IRQ  can be mapped to any given real
>> IRQ.
>> - we have cache that stores all observed 'keys' and associated real IRQ
>> ) the they already mapped.
>>
>> Unfortunately,  necessity of unique mapping is very problematic.
>> We cannot handle situation, where shared interrupt is declared by
>> different but compatible 'interrupts' properties.
>
> There isn't actually a requirement of bidirectional uniqueness, as I
> explained the last time you brought this up. One "virtual" IRQ does
> need to map to exactly one interrupt specifier, but the reverse is not
> the case. The PIC driver is absolutely free to map and dispatch
> multiple virtual IRQs from the same shared interrupt pin, with no more
> overhead than you usually get from shared interrupt lines.

Oki, lets me to expand this a little bit more.

The pre r301453 INTRNG[ARM] does this:
- ofw_bus_intr_to_rl() reads and pre-parses  'interrupts' property 
into  'key'.
- cache is searched for duplicates and new entry is created.
- position of this entry in cache  is used as resource ID.
- given resource ID is stored to RL  and is later used in
bus_alloc_resource()
I'm right ?

In the above model, individual cache entries may be shared are
referenced by multiple entities (PIC, RL, ..).  Due to this, life cycle
of cache entries is not well defined and we probably must use some sort
of reference counting to control it. 
But, as you wrote,  INTRNG code is "absolutely free to map and dispatch
multiple virtual IRQs from the same shared interrupt pin" (allow me to
expand this also to INTRNG code itself) .
In attempt to simply  situation abound cache, in first step, we removed
duplicates check from above model.  This also removes unpleasantly
sharing of entries.
This step converts cache to some sort of duplicate of resource list
entries. Each existing RLE have exactly one corresponding entry in cache.
Also,  the cache is not needed for INTRNG core. At this point, its only
transfers data from ofw_bus_intr_to_rl() to consumers PIC.

So, if we are able to attach 'key' to RLE,  then we can remove the whole
cache machinery from INTRNG.
Do you agree?

>
>>
>> For example:
>> -------------------------
>>       foo1: bar at 12345678 {
>>           interrupt-parent = <&pic1>;
>>           interrupts = <1 IRQ_TYPE_LEVEL_HIGH>;
>>      }
>>
>>       foo2: bar at 1234567C {
>>           interrupt-parent = <&pic1>;
>>           interrupts = <1 IRQ_TYPE_NONE>;
>>      }
>>
>> Is this valid (from DT point of view) configuration?  I think that yes.
>
> No, it's malformed and a violation of the standard.
Can you give me pointer, please? I'm not able to find anything about
this... 

>
>> Is this frequent configuration?  I'm sure,  isn't.
>> Is this possible configuration?  I'm afraid that yes.
>
> People do in fact do deranged non-conformant things with device trees
> sometimes, unfortunately, so it is indeed worth worrying about.
>
>> --------------------------
>>
>> Next example are GPIO interrupts. These are encoded differently,  so two
>> 'keys' may point to to same real IRQ.
>> After that we decided to change our approach and utilize standard
>> resources and resource list entries to deliveryconfiguration data from
>> various sources (OFW/GPIO/...)  to consumers.
>
> This is a non-issue. You make the GPIO controller a PIC, it handles
> interrupts however you like. If you got a weird device tree that uses
> two encodings (one for "interrupts" properties with GPIOs and one for
> "GPIO" properties on which you need to take interrupts but that
> weren't added to the "interrupts" list for mysterious reasons), you
> introduce a helper function for the GPIO case.
>
> Are there any actual problems with the pre-existing interrupt mapping
> code? I have not seen any so far.
>
>> [2]
>> Each OFW device has a lot of dependencies. It must enable  clocks, power
>> regulators..., it must be taken from reset. All these action must be
>> done before we can access single register.
>> Most of these action can be done  only with attached clock/power/reset
>> controller.
>> Within this, interrupts are not special.
>
> They are actually quite special in that the kernel enables them late
> and so you can defer setup. They are also special in that, for that
> reason, it is possible to design systems with circular dependency
> graphs with interrupts. It is not possible to do this with, for
> example, clocks: if I need to apply a clock to the clock controller
> with itself, the system is just not bootable and such a system will
> not be built or shipped. Interrupts need only happen later, after
> device setup, and so such systems *are* designed and shipped. The same
> is true for power.
The G5 problem is standard  'cross type' circular dependency.  You must
(at BUS_PASS_BUS)  initialize memory address window of PCI bridge and
start bus enumeration.
Rest of PCI bridge initialization (including all IRQ stuff) must be
postponed to any later pass. Of course, all interrupt controllers must
be attached at BUS_PASS_INTERRUPT.
Or I missed something?

>> Moreover, 'cross type' circular
>> dependencies are not that rare.
>> I want to say:
>> Resource dependencies are must solved (and resolved) at different level
>> than is INTRNG.
>> Blind resource allocation is not universal solution because given
>> resource may/must be accessible immediately after allocation (in many
>> cases).
>
> Absolutely: for GPIOs, clocks, power, etc. you want a system that
> looks different than the interrupt virtualization system. Probably
> with extra resource types and maybe with some API similar to r301453.
> But interrupts have different, and more complex, requirements and
> cannot be mapped this way.
>
>>
>> Unfortunately,  and at time time, I known only one really working
>> solution:
>> staggered driver initialization combined with multipass bus
>> initialization.
>
> That does not solve the interrupt problem. If devices have interrupts
> on their own children, no amount of multipass initialization can
> possibly break the loop. Multipass would work for other kinds of
> resources (clocks, power, etc.) and is a perfect match for those
> resource types. A dynamic multipass (where a driver can return EAGAIN
> or something from attach if its resources don't exist yet) would work
> great.
>
> *But* in the specific, special case of interrupts, it is not a
> workable model. You could imagine some change to initialization that
> gives devices a late attach and an early attach method and does
> interrupts in the late part, but that is hugely invasive change that
> would need to touch every single driver in the tree -- to solve a
> problem that is already solved by interrupt virtualization.
> -Nathan
>
>> I'm sorry, I'm not able to express all this accurately and clearly, but
>> I've really tried...
>> Michal
>>
>>
>> _______________________________________________
>> freebsd-arch at freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-arch
>> To unsubscribe, send any mail to "freebsd-arch-unsubscribe at freebsd.org"
>>
>