Unbreaking ports with n64 MIPS.

Sat Mar 17 08:23:42 UTC 2012

On Fri, Mar 16, 2012 at 23:05, Warner Losh <imp at bsdimp.com> wrote:
> The argument for adding the aliases is transition from older releases.

Indeed.  What do you think about Makefile.inc1 giving a helpful error
if TARGET_ARCH is set to mips(n32|64)?eb, so that the failure is
somewhat more guided (i.e. the build doesn't just break at some point
due to a wrong TARGET_CPUARCH) but without the baggage of an alias?
Is an alias that much bigger a win?
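
To make the idea concrete, here is a minimal sketch of the matching
logic.  The real check would be make logic in Makefile.inc1; this is
Python only for illustration.  The function name, the message wording
and the suggested replacement (the same name with the "eb" suffix
dropped) are my assumptions, not anything that exists in the tree.

```python
import re

# Pattern from the message above: mips(n32|64)?eb
LEGACY_ARCH = re.compile(r"^mips(n32|64)?eb$")


def check_target_arch(target_arch):
    """Return an error message for a legacy name, or None if it is fine."""
    if LEGACY_ARCH.match(target_arch):
        # Assumption: the replacement is the old name without "eb".
        suggestion = target_arch[:-2]
        return ("TARGET_ARCH=%s is no longer supported; use TARGET_ARCH=%s"
                % (target_arch, suggestion))
    return None
```

The point is that the build stops up front with a hint, rather than
carrying on with a TARGET_ARCH that nothing downstream understands.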

> This is a bigger discussion.  Several issues:
>
> (1) Multilib.  If we had multilib, then we could build one or more of {o32, n32, n64}.  Then the ABI decision would be what to build for the entire system.  SGI used n64 for everything.  Other systems have a default ABI that we build.

SGI used n32 plenty :)  I still have several IRIX systems that in SGI
parlance were "32-bit systems" because they were lowish-end, but they
were mips3-based (the R4K and R4400 went into systems like this) and
after IRIX 6.2 (IIRC) were using n32 and not o32.

> (2) What's the default ABI that tools produce?  Is this tied to MACHINE_ARCH?  We spent a lot of time making sure that we have the right default tools so we build everything correctly.

The default has to be right not just for MACHINE_ARCH but also for
CPUTYPE/TARGET_CPUTYPE.  I've complained about the binutils
shortcoming that necessitates this before, but I'm still not happy
about it.  Perhaps our MACHINE_ARCH should be more like an ISA if we
have a more mature notion of ABI, and then much of the need for
TARGET_CPUTYPE goes away.

> (3) Do we support building other ABIs as part of the build system?  We had that before, but TARGET_BIG_ENDIAN removal killed that.  There are pros and cons of adding support here.  Multiple ABIs do junk up a lot of places in very machine-specific ways.  Lots of places need tweaking if we go back to this.

Did it kill that?  -mabi={32,n32,64} works with my n64 base system.
-mabi=o64 fails wrongly, but o64 has never been quite right with our
binutils.

> MACHINE_ABI is what we need.  But do we really need it?  If we want to support building different ABIs for the same MACHINE_ARCH, then we'll need some way to persist this so we can be self-hosting.  Right now the 'make this the default ABI' method for gcc/binutils persists this information and makes things work nicely.  sysctl likely is the way to go here.

I think we are trying to persist too many parameters, really.  ISA and
ABI (including endianness) are really what we have to persist, but
we're doing it piecemeal in slightly-contradictory ways in several
places, and are talking about adding more.  It sucks.  I'm also not
sure how we solve it well.  I think moving FreeBSD to a triple-like
model in which it's all in one place and easy to parse out would be
nice.  ISA, ISA variant, ABI, Endianness.  Kernels inherently have
each of those, too, but can support variations of at least the last
two in userland, and so the possibility of persisting these things
through a sysctl is, I think, problematic.  If I have an n64 kernel and
an o32 world, why should self-hosting (without overrides) mean I end up
with an o32 kernel?

> I'm sure that this has decayed into dust.  I tried to get gcc to generate -msoft-float on x86, and it just didn't work.  Today, I think we burn this into the default settings of the toolchain we use to bootstrap the system.  We can have a knob for it, but it is purely a userland concept: there's no floating point in the kernel to speak of.  MACHINE_FLOAT={hard,soft} might not be a bad idea, with the value exported via sysctl.  Not sure if make needs to grow support for this and MACHINE_ABI, or if it would suffice for the necessary Makefiles and/or .mk files to query the sysctl value.

I'd argue that floatness is part of the ISA-variant field above.
mips64r2-octeon-n64-big has soft float, for example.
mips64r2-softfloat-n64-big does, too.  If one compiles for a specific
CPU family as the ISA variant, then floating point is usually
consistent.  Otherwise, are there other variations on the ISA that one
cares about other than floating point?  Perhaps soon: hypervisor.  I
want this all in one place, though.  Old BSD/Mach-style plus-and-minus
config strings?  mips:mips64r2+hypervisor-fp:n64:big?
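
As a rough sketch of how easy such a string would be to parse out,
assuming the field order ISA:variant:ABI:endianness and +feature /
-feature toggles on the variant, exactly as in the example string.
The Target type and parse_target name are made up for illustration;
nothing like this exists today.

```python
import re
from collections import namedtuple

# Hypothetical record for a colon-separated target string.
Target = namedtuple("Target", "isa variant enabled disabled abi endian")


def parse_target(spec):
    """Parse e.g. 'mips:mips64r2+hypervisor-fp:n64:big' into a Target."""
    isa, variant_field, abi, endian = spec.split(":")
    # A base variant name followed by zero or more +name / -name toggles.
    m = re.match(r"^([^+-]+)((?:[+-][^+-]+)*)$", variant_field)
    if m is None:
        raise ValueError("bad ISA variant field: %r" % variant_field)
    toggles = re.findall(r"([+-])([^+-]+)", m.group(2))
    enabled = tuple(name for sign, name in toggles if sign == "+")
    disabled = tuple(name for sign, name in toggles if sign == "-")
    return Target(isa, m.group(1), enabled, disabled, abi, endian)
```

Everything a Makefile or the toolchain needs then comes out of one
string in one place, instead of being inferred from several variables.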

>> We need to be thinking about superpages.  This is non-trivial even
>> though MIPS is just about ideal for superpages.  For one thing, it'd
>> be really nice if we did not split TLB entries as we currently do, so
>> the default PAGE_SIZE would be 8K, and then we wouldn't have to deal
>> with TLB behavior where superpages are involved.  Does the TLB always
>> use the nearest match?  How does it impact performance to have two TLB
>> entries covering the same range of addresses?  It depends on how the
>> hardware implements TLB lookups, yes?  Wouldn't it be nice to not have
>> to split the TLB?  Wouldn't it?  I know I bring this up a lot, but it
>> seems like it really would make superpages just slightly less ugly.  I
>> mean, you do tlbp and you find that your VA is covered by the TLB, but
>> the entry it's in is split, and your VA isn't covered by a superpage,
>> but the one in the TLB is, so you have to add a more specific entry,
>> and suddenly all of your functions using the TLB have gotten
>> non-trivially complex.
>
> Doesn't cache aliasing occur when you have multiple TLB entries pointing to the same physical page, which is a MIPSy no-no?

I don't mean to suggest doing that.  I mean that TLB Lo0 and TLB Lo1
point to successive physical pages coming from a single PTE.  So you
have 8K pages in the VM system which are automatically translated into
two 4K pages in a single TLB entry.  Otherwise, when you have
superpages, and you have a 256MB region followed by a 4K page, the
superpage TLB entry is going to have a valid Lo0 and an invalid Lo1
and then you have to have a separate 4K-page TLB entry for the VA of
the 4K page.
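
A small sketch of the unsplit arrangement, under the assumption that
the VM system hands out 8K-aligned 8K physical pages and the hardware
page is 4K.  The function and constant names are invented for the
example; this is arithmetic only, not the pmap code.

```python
HW_PAGE_SHIFT = 12                  # 4K hardware page (half a TLB entry)
HW_PAGE_SIZE = 1 << HW_PAGE_SHIFT
VM_PAGE_SIZE = 2 * HW_PAGE_SIZE     # 8K page as seen by the VM system


def tlb_pair_from_pte(va, pa):
    """Return (entry_va, lo0_pa, lo1_pa) for the 8K page containing va.

    pa is the physical address of the 8K page, taken from a single PTE.
    Lo0 maps the even 4K half and Lo1 the odd 4K half.
    """
    if pa % VM_PAGE_SIZE != 0:
        # Assumption: 8K VM pages are 8K-aligned in physical memory.
        raise ValueError("physical page must be 8K-aligned")
    entry_va = va & ~(VM_PAGE_SIZE - 1)
    return entry_va, pa, pa + HW_PAGE_SIZE
```

With this, Lo0 and Lo1 always come from the same PTE, so no TLB entry
is ever half one mapping and half another.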

So at least for superpages, you can never share the TLB entry
(and just have TLB Lo0 and Lo1 point to successive physical
superpages; if the VM system were aware of this page-splitting, you
could use a 512MB superpage with two non-contiguous 256MB regions, but
I don't think anyone wants to try to make that work.)  You can still
share TLB entries for 4K pages, but at that point I'd rather not,
y'know?

Also, which superpage sizes do we support?  For quick lookups,
remembering that we have software page tables, we'd probably only want
to support those that align with the levels of our page tables, yes?
So that you just check the low bits of the address when walking the
page tables and you can quickly tell that you've hit a superpage and
don't actually need to load the next level.  That sucks, because MIPS
supports much better granularity, but otherwise TLB refills are a
nightmare.  If we need to double the size of the kernel stack or the
PCPU or some other wired region, we can use a smaller superpage, but
do we have a good way to handle things in the page table?
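
Here is one reading of the "check the low bits" idea as a toy
two-level walk: a directory entry is either the address of a leaf
table or a superpage PTE tagged in its low bit.  The tag value, the
shifts and the dict-based "memory" are all assumptions made for the
sketch, not our actual page table layout.

```python
SUPERPAGE_TAG = 0x1    # hypothetical: low bit set marks a superpage entry


def walk(directory, leaf_tables, va, dir_shift=22, page_shift=12):
    """Return ('superpage', pte), ('page', pte) or None for va."""
    entry = directory.get(va >> dir_shift)
    if entry is None:
        return None
    if entry & SUPERPAGE_TAG:
        # Hit a superpage at this level; no need to load the next level.
        return ("superpage", entry & ~SUPERPAGE_TAG)
    leaf = leaf_tables[entry]
    index = (va >> page_shift) & ((1 << (dir_shift - page_shift)) - 1)
    pte = leaf.get(index)
    return None if pte is None else ("page", pte)
```

The refill path stays a straight-line walk with one extra test per
level, which is why only level-aligned sizes come for free.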

With 64-bit PTEs we have a lot more software-usable bits, so we can
just copy the PTE into all the PTE slots covered by the superpage
mapping, and that opens up most of our superpage sizes, but at the
cost of bigger page tables.  If you do it that way, it's easy to
design and easy to implement, modulo the need to ensure that your
superpages are actually twice the size of half the TLB entry :)
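
The copy-the-PTE approach can be sketched like this, with the page
table modelled as a dict keyed by virtual page number and the PTE
treated as an opaque value.  The names and the 4K base page are
assumptions for the example only.

```python
PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT


def enter_superpage(page_table, va, pte, size):
    """Store the same PTE in every base-page slot that size covers.

    Returns the number of slots written.
    """
    if size < PAGE_SIZE or size & (size - 1):
        raise ValueError("size must be a power of two >= the base page")
    if va % size != 0:
        raise ValueError("va must be aligned to the superpage size")
    first = va >> PAGE_SHIFT
    count = size >> PAGE_SHIFT
    for vpn in range(first, first + count):
        page_table[vpn] = pte
    return count
```

A refill for any address in the range then finds the superpage PTE in
the ordinary slot, at the cost of filling in every slot it covers.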

> I haven't thought about this in ages.  I believe that it is complex to design, but relatively simple to implement.  I did some preliminary looking into this a couple of years ago, but never made it out of the early explorer stage for want of time.

Implementing superpages is easy.  Not sharing TLB entries is easy.
Flipping the switch is not.  When I last did this with FreeBSD as-is,
extant binaries simply broke, since our image activator couldn't
handle semipage-aligned executables.