Re: [RFC] An idea for general kernel post-processing automation in FreeBSD

From: Jessica Clarke <jrtc27_at_freebsd.org>
Date: Wed, 24 May 2023 15:20:02 UTC
On 24 May 2023, at 10:47, Hans Petter Selasky <hps@selasky.org> wrote:
> 
> On 5/24/23 08:28, Emmanuel Vadot wrote:
>>>> +1.
> 
> Hi,
> 
>>>> Also, the concern over the runtime for the potential worst-case for qsort
>>>> seems_way_  overblown compared to the actual unsorted arrays one will get
>>>> out of the linker.  The only possible downside of qsort that matters in
>>>> this case is the stack usage.
> I think it is fair to say there are two levels of questions:
> 
> The first question is:
> Runtime sorting vs Compile time sorting
> 
> The second question is:
> Mathematically correct or System/Political correct sorting
> 
> Level 1:
> 
> 1.1) Runtime sorting puts the least requirements on the toolchain. Data is concated in an unsorted binary object section. But in order to read and debug this section, special tools are needed.
> 
> 1.2) Compiletime sorting puts more requirements on the toolchain. Referring Jessica Clarke, this is a "gross hack". On the other hand, debugging and visibility makes this desirable.

I’ve been deliberately staying out of my thread, but this is blatantly false. I said that in response to *one specific part of your proposed implementation*, which is completely different to saying it about the general concept. The full quote is:

>> If you are worried multiple DEFINE_MUTEX(lock) will result in multiple global symbols having the same "lock" name, all you need to do is pass through the ${.TARGET} variable from "make" as a define, stripping a few invalid characters, and macro-concat that to the locally generated global variable name, and you are all good.
> 
> You could. But that’s a pretty gross hack IMO, and depending on how you do it may still not be unique. Not to mention it’s going to further bloat the symbol tables with these long names.


So this is some pretty horrendous misquoting. Do not do that again. I am extremely unimpressed.

Jess

> Level 2:
> 
> Should an appropriate sorting function be used or not?
> 
> Typically there are like 1K of entries to sort, but speaking about the indexer, there are far less sorting keys for the 1K of entries.
> 
> 2.1) libkern's qsort() aka https://en.wikipedia.org/wiki/Quicksort
> 
> Using Quicksort is basically like first randomizing the data, and then sorting it again. It is not optimal in any ways with regards to CPU time. But system wise you could save code by using the same sorting algorithm over and over again.
> 
> 2.2) mergesort()
> 
> Using mergesort() has a predictable executing time, bound to log2(N) * N, where N is the number of elements to sort. The code in question already allocates a new array to store the output of the sort, and a kernel mergesort() routine could easily use the output array as temporary storage.
> 
> 2.3) radixsort()
> 
> There are less 64-bit keys than sysinits, and I see a radix sort algorithm would complete in less than log2(N) * N time. Radix sort is an in-place sorting algorithm, and requires little code. It works similarly to QuickSort, but doesn't go around a guess pivots. It already knows them, because there is a sorting key.
> 
> 2.4) insertionsort()
> 
> When arrays are already sorted, the execution time is O(N), else O(N²)
> 
> 2.5) bubblesort()
> 
> Sorting time is more or less constant O(N²).
> 
> 
> 
> 
> In case compile time sorting is selected, there is no need for runtime sorting. FreeBSD controls the build of all kernel modules, in-tree and out-of-the tree. We go by source and not binaries, and that is our strength. This can be an ABI contract thing for SYSINITs and kernel modules.
> 
> 
> In case runtime sorting is selected, I see a radix sorter as the less CPU and stack intensive algorithm. This would be the mathematically correct choice. qsort() as-of-today is the politically correct choice. Less code, but more CPU and stack.
> 
>>>> 
>>> IMO sorting at runtime is a minor feature by itself, too. I mentioned
>>> in one of the reviews somewhere, we have in the past have had bugs due
>>> to poor specification of sysinit order within a subsystem -- I'm
>>> recalling a specific one that wasn't even that long ago, maybe three
>>> or four years, that manu@ had hit because he did or didn't include
>>> some device in his config that ended up shifting link order of other
>>> objects and reversing two sysinits into an order that wasn't actually
>>> functional.
>>  Yes this problem was a pain to debug (I mean just to understand the
>> actual problem), and iirc it was never solved, I only had the issue on
>> one machine that I don't use anymore and the problem went away and
>> re-apears from time to time when sysinits where added.
>>> We can still add some bits to test that (preferably a
>>> tunable to reverse order within a subsystem rather than having to
>>> re-link the kernel) even with sorting at link-time, but it sure does
>>> look a lot less fragile if we have to sort it anyways and we just
>>> reverse one criteria.
> 
> I think if you could just diff two plaintext files, listing all the constructors in the correct order, that would be of great help when problems arise in this area. You cannot always rely on having a console during bringup.
> 
> 
> At the mention of manu@, I remember how difficult it was to bringup the graphics driver code and kernel modules which are common among FreeBSD desktop users. What do you do, if all you have is a black screen, and you need to debug something?
> 
> 
> 
> Does it make sense if I enumerate and collect the different views on this issue on a WikiPage at FreeBSD.org, so it doesn't get lost for the future?
> 
> 
> 
> --HPS