INTRNG (Was: svn commit: r301453....)

Michal Meloun mmel at FreeBSD.org
Tue Aug 2 13:25:18 UTC 2016


I'm sorry for delayed response. We have serious trouble @work so I
cannot spend to much time on FreeBSB for next 2 or 3 days.

Dne 02.08.2016 v 6:00 Nathan Whitehorn napsal(a):
>
>
> On 07/31/16 11:43, Nathan Whitehorn wrote:
>>
>>
>> On 07/31/16 08:40, Michal Meloun wrote:
>>> Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a):
>>> [snip]
>>>>> The current API is certainly a bit of a hack and I would be
>>>>> pleased to
>>>>>> see it go; but the replacement needs to be more functional and more
>>>>>> robust. As far as I can tell, I literally *cannot* move PowerPC over
>>>>>> to this architecture.
>>>>> Yes, at this time, I agree. If you can accept enhancement of
>>>>> (owf_)bus_map_intr() then we can move to next, significantly less
>>>>> hackish, state.
>>>> OK, sure. I wrote some code this afternoon (attached) to sketch this.
>>>> It does not compile or run or anything useful, but it should
>>>> illustrate the important parts of what I think is an API that should
>>>> work for everyone. I'm happy to finish it if it looks reasonable -- I
>>>> may in any case just to replace PowerPC's increasingly teetering
>>>> intr_machdep.c.
>>>>
>>>> The OF part is essentially unchanged from how it currently works:
>>>> there's a method that you pass the interrupt specifier and interrupt
>>>> parent to, and it spits back an IRQ that means that combination of
>>>> things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its current
>>>> API, can be implemented in terms of it in one line. Whenever the
>>>> relevant PIC attaches, it is told to use the given IRQ to refer to
>>>> that opaque bytestring.
>>>>
>>>> I've added a set of equivalent functions for ACPI that take the
>>>> contents of an ACPI interrupt specifier and do the same things. It
>>>> tries to have the IRQ be human-meaningful, if possible, by matching
>>>> the ACPI interrupt number. Having separate functions seemed a little
>>>> cleaner than exposing the enums and unions as part of the KPI, but I'm
>>>> not wedded to it -- this is just an example.
>>>>
>>>> PICs register (and deregister) themselves with either an OF phandle or
>>>> an ACPI resource string or (god help us) both as needed. The PICs have
>>>> corresponding methods for interpreting various kinds of interrupt
>>>> specifiers.
>>>>
>>>> Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to
>>>> succeed before the PICs are attached is gated on an #ifdef. That can
>>>> probably be off by default on ARM. A PowerPC version of this code
>>>> would have to keep it on until various bus drivers are fixed.
>>>>
>>>> Here's the general flow:
>>>> - Parent bus enumerates children, maps IRQs as needed, adds to
>>>> resource list. The struct resources involved aren't special (just a
>>>> single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to
>>>> implement (and already are implemented, in general, for OF/FDT
>>>> drivers).
>>>> - bus_alloc_resource() does nothing special
>>>> - bus_setup_intr() calls into the PIC and does what is needed, as in
>>>> r301453 (to within that #ifdef)
>>>>
>>>> This should keep all the best pieces of everything:
>>>> - ACPI and OF are supported together, and easy to extend for other
>>>> types of mapping technologies without breaking either KBI or KPI (just
>>>> add new functions)
>>>> - No APIs change for existing Open Firmware code
>>>> - The support code can live entirely in machine-dependent directories,
>>>> with no code changes at all required in kern/ or dev/ofw/
>>>> - bus_setup_intr() and friends behave as in r301453 and succeed or
>>>> fail for real
>>>> - PCI code is not an issue
>>>> - No disconnected interrupt state maintained in mapping table (unless
>>>> the transitional #ifdef is on) and the mapping table content is only
>>>> larger than r301453 by having a copy of the original interrupt
>>>> specifier.
>>>>
>>>> If and when we manage to fix the kernel APIs to allow non-scalar
>>>> interrupt specifiers, this should also be easy to gradually
>>>> transmogrify to support that case since there exists a 1:1 mapping of
>>>> scalar IRQs to rich data and the control flow would be identical:
>>>> interrupt specifiers are read at enumeration time and a resulting
>>>> handle -- struct resource or scalar IRQ -- used afterward to refer
>>>> to it.
>>>>
>>> Nice, nice, i think that we are very close at this point.
>>> The problem with the above workflow is that maps IRQ function is called
>>> at parent bus enumeration time, generally at BUS_PASS_BUS pass. And at
>>> this time, PICs are not exist.
>>> Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a mapping
>>> table,  it doesn't provide this functionality.
>>> But yes, i understand that mapping table is important and necessary
>>> for you.
>>>
>>> So, if i can extend our  concept a little bit, then:
>>> We can add new table (map_data_tbl for example)  that holds  a copy of
>>> the original interrupt specifier, index to irq_sources table and
>>> probably some flags.
>>> And we can modify the above workflow to:
>>> - Parent bus enumerates children, allocates entries from map_data_tbl,
>>> adds to resource list. The struct resources involved aren't special
>>> (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are
>>> trivial to implement (and already are implemented, in general, for
>>> OF/FDT drivers). Index to map_data_tbl is used as resource ID.
>>> - bus_alloc_resource()  sets 'ALLOCATED' in map_data_tbl flags field,
>>> *maps IRQs* and stores result in 'index' field.
>>> - bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, calls
>>> into the PIC and does what is needed, as in r301453. (to within that
>>> #ifdef)
>>>
>>> And, in PPC case, newly attached PIC scans whole map_data_tbl table,
>>> finds his entries and makes what needs.
>>>
>>> Does that make sense?
>>> I hope that postponing of map IRQ call is not a problem for PPC,
>>> everything else looks easy.
>>> Moreover,  this additional level of indirection also solves all my
>>> 'hypothetical corner cases' with N:1 mappings (between original
>>> interrupt specifier and real interrupt number).
>>
>> Yes, I think we are converging.
>>
>> My qualm about bus_alloc_resource() is twofold:
>> 1. Traditionally, with interrupts, bus_alloc_resource() has no side
>> effects and I'm not sure it propagates cleanly down the tree in all
>> cases.
>> 2. There are a few bus APIs (bus_config_intr() comes to mind) that
>> use raw IRQ IDs and, so far as I know, can be, and sometimes are
>> called before bus_alloc_resource(). If the PIC doesn't know about the
>> IRQ yet when bus_config_intr() (etc.) is called, things will just break.
>>
>> So you would need to make sure that any bus method handling a
>> resource ID causes it to be mapped on the PIC at that time. It's OK
>> if that happens in bus_alloc_resource() so long as bus_config_intr(),
>> bus_setup_intr(), etc. cause that to happen if it hasn't happened yet
>> -- I don't care when these calls are made to the PIC driver so long
>> as *what* calls will be made is set at enumeration time.
>>
>> I am also totally fine with having, on ARM, bus_config_intr(),
>> bus_setup_intr() etc. fail if the PIC hasn't attached yet (On
>> PowerPC, we can't do that yet, but after this conversation, I regard
>> that as a bug and would fix that later), as well as delaying setup on
>> the PIC to the first time any bus driver actually tries to *use* the
>> interrupt (alloc_resource(), setup_intr(), config_intr(), whatever)
>> rather than when the interrupt is originally allocated.
>>
I understand. And yes, bus_alloc_resource()  isn't propagated cleanly
down the tree in all cases. That's why we have  'INTRNG' hook in
subr_bus.c, which  is suboptimal.

About bus_config_intr(). The only consumer of  bus_config_intr() is ACPI
code, in little hackish manner,  as workaround for problem which is
fully solved by INTRNG.
Also, the semantic of bus_config_intr() is not clear, from INTRNG point
of view.
So i have tendency to ignore it :)

>> ------------ The following is a large parenthesis -------------------
>>
>> One other possible route here that would also work well would be to
>> make ofwbus.c MD (it's a trivial piece of code anyway, so we don't
>> gain a lot by sharing it) and implement ofw_bus_map_intr() locally as
>> an ofwbus bus method. Then you could have the mapping table stored in
>> ofwbus's softc -- the API was designed for this initially. You would
>> need MD extensions for doing PIC registration there (which is fine),
>> but that segregates all the OFW-specific information into
>> OFW-specific code and would let the bus methods and the OFW interrupt
>> mapping table interact cleanly in the same place. This still
>> preserves the pre-r301453 API, makes PowerPC work, and maybe
>> address's skra@'s concern about extensibility and letting core
>> interrupt code know about FDT (or ACPI). I'd be happy to mock this up
>> as well if you think it's a good approach.
>>
> [snip]
>>
>> This has the following features:
>> - Existing OFW API and semantics unchanged
>> - As such, PowerPC, PCI, etc. work fine with no changes
>> - Details encapsulated in MD code, so individual platforms can
>> implement this however they like
>> - arm/arm/intr.c (or whatever) only needs a method to allocate a
>> fresh interrupt, with no state, and anoter to set the device_t for an
>> interrupt sometime later.
>> - The internal table in the platform interrupt code has no knowledge
>> of any mappings whatsoever except having the appropriate device_t for
>> the PIC stored with the interrupt vector.
>> - Device tree bits handled purely in device tree code
>> - No action need be taken on any mapping until the interrupt is
>> actually allocated/set up, like r301453
>> - Easy to add more mapping mechanisms (e.g. ACPI) by having similar
>> enumeration-mechanism-specific code in the root bus for that mapping
>> mechanism.
>>
>> -------------- End parenthesis -------------------------------
>
> Here's an implementation of the parenthesis I wrote on an airplane
> this afternoon. It should be complete, though has not been tested. The
> code is short and simple (+70 lines in ofwbus.c). This preserves the
> pre-r301453 API and semantics relative to drivers, which means PowerPC
> and PCI work out of the box, while keeping the semantics relative to
> the interrupt layer of r301453 (PIC methods only called on resource
> allocation, no allocatable IRQs on unattached PICs, encapsulation of
> OFW-specific code in OFW-specific bits of the tree). It turns out
> those two things are compatible, somewhat to my surprise, and that
> makes the result very clean. I like this approach and would be happy
> to move forward with it. There are five functions of interest:
>
> 1. OFW_BUS_MAP_INTR(). This has the semantics and API it has now: you
> pass an interrupt specifier and parent, you get back an IRQ. No
> changes. This is the core of the normal OFW interrupt API.
>
> 2. OFW_BUS_REGISTER_PIC(device_t pic, phandle_t phandle). This is a
> new function that PIC drivers are supposed to use to register control
> of an interrupt domain. This replaces machine-specific code like
> powerpc_register_pic() to allow the PIC table to be in a bus parent
> rather than in the interrupt core.
>
> 3. PIC_MAP_OFW_INTR(device_t pic, int irq, interrupt specifier). This
> is a new function that PIC drivers that know how to handle device-tree
> interrupt descriptors implement (analogous to various existing ones
> that vary by platform). It tells the PIC that the given abstract IRQ
> means the given opaque interrupt specifier.
>
> 4. arm_allocate_irq(int suggested). This allocates a new IRQ number
> not (yet) attached to a PIC, as in r301453. I've added a parameter
> that lets you pass a suggested number to try in case it is possible to
> make it match an interrupt pin or something for human-readability.
>
> 5. arm_set_pic_for_irq(int irq, device_t). This tells the MD interrupt
> layer to associate a given PIC device_t with an IRQ. That is all the
> information the MD layer ever has about the IRQ mapping.
>
> Functions #1 and #2 are now implemented completely in ofwbus.c: there
> are no callouts anywhere else and the interrupt mapping table is
> maintained entirely internally to ofwbus (in its softc). In order to
> implement ACPI, or some other strategy, you would implement analogs to
> functions #1 and #2 that live somewhere in the bus hierarchy that is
> guaranteed to be above all devices that might want that type of
> interrupt (e.g. in acpi0), and some analog to #3 that PIC drivers
> implementing the mapping scheme would provide.
>
> Since the system interrupt code has no knowledge at all of interrupt
> mapping, of any type, in this scheme, adding new mapping types is
> trivial and can be done on a driver-by-driver basis if necessary
> without changing KPI and without any other part of the system even
> being aware. For example, GPIOs can use a completely different
> mechanism if they want and can do setup purely in the GPIO controller
> driver. You could have a method GPIO_GET_INTERRUPT_FOR_PIN() on a GPIO
> controller in which the GPIO controller allocates a generic IRQ,
> assigns through some internal table just in the GPIO driver, and
> returns to it to a consumer in some other device driver -- without a
> GPIO mapping type, new bus functions, or modifications to the platform
> interrupt code.
>
> The control flow goes like this:
> - Bus driver enumerates children, parses interrupts properties, calls
> OFW_BUS_MAP_INTR() to get IRQs for them (as pre-r301453), adds to
> interrupt list.
> - ofwbus receives the OFW_BUS_MAP_INTR() call, allocates a blank
> disconnected IRQ from the MD interrupt layer, and stores the mapping
> from the new IRQ to the given interrupt specifier and phandle in an
> internal table in ofwbus's softc.
>   NB: Nothing else happens here, like post-r301453. Changing this does
> not change any semantics of the API pre-r301453, which means it
> remains fully compatible with PCI and PowerPC. Also, like
> post-r301453, there is no involvement of nexus.
> - PICs attach and call OFW_BUS_REGISTER_PIC(). ofwbus receives these
> messages and adds a (device_t, phandle_t) mapping to a second internal
> table. Note that the interrupt layer does not need to handle PIC
> registration anymore at all (except for the root PIC).
> - Bus child eventually calls a function that tries to set the
> interrupt up (e.g. bus_setup_intr()). That propagates up the bus
> hierarchy, eventually getting to ofwbus. ofwbus notes the IRQ number,
> looks it up in the table, looks up the appropriate PIC from the PIC
> table, then:
>   A) calls arm_set_pic_for_irq(irq, pic_device_t) -- this is the
> interrupt layer's only interaction with the mapping code. All it deals
> with is device_ts and abstract IRQ numbers.
>   B) calls PIC_MAP_OFW_INTR(pic, irq_number, interrupt-cells,
> interrupt-specifier) to tell the PIC that the interrupt layer's IRQ
> irq_number means the given specifier
>   C) finally, passes the call onto nexus, which will do whatever would
> normally happen (unmasking the interrupt, setting handlers, etc.) in
> terms only of the abstract IRQ and the device_t assigned by ofwbus.
>
> You would implement ACPI just by doing a s/OFW/ACPI/g
> search-and-replace above -- since the interrupt layer doesn't know
> about OFW or ACPI or anything else, there is no need to touch it. This
> seems clean, simple, compartmentalized, preserves the existing API,
> and should work on all of our various hardware. PowerPC can't quite
> work with it yet without some multipass foo, but, since the API is
> preserved, that transition can happen gradually without KPI changes.
> For the same reason that it is API-preserving, I think this code is
> also MFC-able.
> -Nathan

I think that we are slightly diverges in this place:
 - please note that PIC API (in kern/pic_if.m) is different from the one
that PPC uses.
 - we must be able to store configuration data (original interrupt
specifier) in all cases, not only for OFW. That's reason why I have
proposed
to create 'mapping table'.

Anyway, i think that we both talking about +/-  same solution, only i
want to move it from OFW specific level implemented at OFW bus level to
bus/interrupt specifier format independent level implemented  in
subr_intr.c. 
Most of your control flow can be implemented at general level as is, or
already exist [intr_map_irq(),  intr_pic_register()] .
Also, impact for current PPC code is, by me, minimal.
I can sketch my idea in more details, if you found it acceptable.

Again, I'm sorry for delayed and very brief response.
Michal



More information about the freebsd-arm mailing list