From nobody Mon Nov 28 20:27:35 2022 X-Original-To: freebsd-arch@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4NLcTq13yyz4hqnf for ; Mon, 28 Nov 2022 20:27:43 +0000 (UTC) (envelope-from ehem@m5p.com) Received: from mailhost.m5p.com (mailhost.m5p.com [74.104.188.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "m5p.com", Issuer "R3" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 4NLcTp40RZz3Hd2 for ; Mon, 28 Nov 2022 20:27:42 +0000 (UTC) (envelope-from ehem@m5p.com) Authentication-Results: mx1.freebsd.org; dkim=none; spf=pass (mx1.freebsd.org: domain of ehem@m5p.com designates 74.104.188.4 as permitted sender) smtp.mailfrom=ehem@m5p.com; dmarc=none Received: from m5p.com (mailhost.m5p.com [IPv6:2001:470:1f07:15ff:0:0:0:f7]) by mailhost.m5p.com (8.16.1/8.15.2) with ESMTPS id 2ASKRZ73044608 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO) for ; Mon, 28 Nov 2022 15:27:40 -0500 (EST) (envelope-from ehem@m5p.com) Received: (from ehem@localhost) by m5p.com (8.16.1/8.15.2/Submit) id 2ASKRZG6044607 for freebsd-arch@freebsd.org; Mon, 28 Nov 2022 12:27:35 -0800 (PST) (envelope-from ehem) Date: Mon, 28 Nov 2022 12:27:35 -0800 From: Elliott Mitchell To: freebsd-arch@freebsd.org Subject: Interrupts, Interrupted, Part II Message-ID: List-Id: Discussion related to FreeBSD architecture List-Archive: https://lists.freebsd.org/archives/freebsd-arch List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-arch@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=0.4 required=10.0 tests=KHOP_HELO_FCRDNS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on mattapan.m5p.com X-Spamd-Result: default: False [-3.28 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-1.000]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.98)[-0.976]; R_SPF_ALLOW(-0.20)[+a:c]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[m5p.com]; MLMMJ_DEST(0.00)[freebsd-arch@freebsd.org]; ARC_NA(0.00)[]; R_DKIM_NA(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; MID_RHS_MATCH_FROMTLD(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; ASN(0.00)[asn:701, ipnet:74.104.0.0/16, country:US]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; RCVD_TLS_LAST(0.00)[]; PREVIOUSLY_DELIVERED(0.00)[freebsd-arch@freebsd.org]; RCPT_COUNT_ONE(0.00)[1]; TO_DN_NONE(0.00)[]; TAGGED_FROM(0.00)[freebsd] X-Rspamd-Queue-Id: 4NLcTp40RZz3Hd2 X-Spamd-Bar: --- X-ThisMailContainsUnwantedMimeParts: N As part of trying to get a project into FreeBSD yet another issue showed up. There has been a push to get this functioning on top of INTRNG. This should be doable, but a problem arose. Due to this being a very dynamic PIC (hotplug PICs may be difficult in hardware, they're easy in software) it seems best to register interrupts during allocation and then fully release them during release. Problem is the call sequence: intr_isrc_deregister() -> (for non-IPI interrupts) isrc_release_counters() -> panic("%s: not implemented", __func__) The inverse of isrc_release_counters() is isrc_setup_counters(). These two are for setup/teardown of the "intrnames" and "intrcnt" arrays. Initially I was thinking "intrnames" and "intrcnt" were likely to be deeply intertwined into the FreeBSD kernel, they've been around since the mid-1990s. Yet upon examination, only two places use these two global variables. First, they are used by watchdog_fire() for reporting interrupts in case it triggers. Second, the functions sysctl_intrnames() and sysctl_intrcnt() use them for their sysctls (which are then used by `vmstat -i`). That is a *really* short list of uses. With the default parameters the "intrnames" array will tend to be roughly ~2MB and "intrcnt" would be roughly 128KB. That isn't huge, but an average system is only going to use perhaps 15% of that. The only benefit I see of having "intrnames" and "intrcnt" is they make those 2 sections of source code simpler. Yet by consuming noticeable amounts of memory, they make other sections slower. If they saw heavy usage this might be worthwhile, but I doubt they see enough for that to justify their existence. In light of this I've created D36610 for their removal. Portions of my initial implementation are likely to be rejected, but I like the idea. Notably this has a net removal of almost 100 lines of source code per architecture. THAT seems the characteristic of something whose time has passed. One spot likely to be rejected is the watchdog_fire() replacement implementation. I had been thinking the architecture interrupt tables were a good source of replacement data, but I'm now doubtful of this. In particular the functionality proposed in D36609 doesn't seem quite right. I now suspect it may be better to add a function to sys/kern/kern_intr.c for reporting this, but I'm unsure what to name it. The other spot is the "hw.intrnames" and "hw.intrcnt" sysctl()s really need work. Much of removed duplicated source code from the architectures appears to be directed towards maintaining compatibility with the existing sysctl()s. The difficult reimplementation also suggests these need a careful reimplementation. Their behavior hints at their long history, but their behavior is also rather poor for real use. I suggest there really should instead be a "hw.interrupts" sysctl() which provides the interrupt name, counter and stray count as triplets. This would ensure consistency between the two, avoiding the potential for an interrupt being added or removed between reading "hw.intrnames" and "hw.intrcnt". The format of the "hw.intrnames" speaks to its history. At some point pre-2003 it\0was\0a\0NUL\0separated\0list\0of\0strings. When interrupt names became changeable packing the strings became a problem. As such they were padded with spaces which allowed the old approach of finding the next interrupt name using `strlen()` to continue working, even though the entries were actually fixed-length. Problem is, since 2003 `vmstat -i` has never been updated to work with the current situation. As such, trying to figure out the whys behind everything has been difficult. Then there was at least one lurking bug just to make things more interesting. -- (\___(\___(\______ --=> 8-) EHM <=-- ______/)___/)___/) \BS ( | ehem+sigmsg@m5p.com PGP 87145445 | ) / \_CS\ | _____ -O #include O- _____ | / _/ 8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445