tlb, tsb & ...stuff

Mon Apr 14 19:24:00 PDT 2003

[ ... ]
> 
> So these are mainly capacity and not conflict misses? The obvious way to
> improve this would be to increase the size and eliminate the TAILQ, unless
> the TAILQ linking removal would cause massive changes. But entry number
> wise we are probably at the minimal range right now.
> 

Yes.  The main problem is that the tsb is small, and the way things work
right now it's not feasible to make it much larger because it's mapped into
contiguous KVA.  Having the TAILQ in the ttes buys us a lot because it
avoids allocating a separate structure for tracking aliases for the same
page; on other architectures this is done with pv entries.  It's also used
to maintain consistency in the L1 data cache between mappings with different
virtual colors, since the L1 cache is virtually indexed.  We need to maintain
consistency between all kernel and userland mappings, and kernel mappings
normally don't have a pv entry allocated for them, so having this allocated
externally would be undesirable (having to allocate something in code paths
that don't normally expect it).  This has helped a lot with stability due
to not allowing illegal aliases into the data cache.  NetBSD and OpenBSD
rely on flushing the cache at strategic places where illegal aliases would
matter if they had been created, but this is hard to get right and you end
up flushing the data cache a lot.
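
To make the alias tracking concrete, here is a rough userland sketch of the
idea, not the actual pmap code.  The struct layouts, the names (page_enter,
va_color) and the constants are made up for illustration; it assumes 8K
pages and a 16K direct-mapped L1 data cache, which gives two virtual colors.
It only reports whether a new mapping would create an illegal alias; what
to do about it is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Illustrative names and constants only; not the real pmap structures. */
#define PAGE_SHIFT    13  /* assumes 8K base pages */
#define DCACHE_COLORS 2   /* assumes a 16K direct-mapped, virtually indexed L1 */

struct tte {
	uint64_t         tte_va;    /* virtual address of this mapping */
	TAILQ_ENTRY(tte) tte_link;  /* chains all mappings of one physical page */
};

struct page {
	TAILQ_HEAD(, tte) mappings; /* alias list lives in the ttes themselves */
};

/* Virtual color: the page-number bits that also index the L1 data cache. */
static unsigned
va_color(uint64_t va)
{
	return ((unsigned)(va >> PAGE_SHIFT) & (DCACHE_COLORS - 1));
}

static void
page_init(struct page *pg)
{
	TAILQ_INIT(&pg->mappings);
}

/*
 * Link a new mapping onto the page.  Returns false if its color differs
 * from any existing mapping of the same page (an illegal alias).
 */
static bool
page_enter(struct page *pg, struct tte *tp, uint64_t va)
{
	struct tte *other;
	bool cacheable = true;

	TAILQ_FOREACH(other, &pg->mappings, tte_link) {
		if (va_color(other->tte_va) != va_color(va))
			cacheable = false;
	}
	tp->tte_va = va;
	TAILQ_INSERT_TAIL(&pg->mappings, tp, tte_link);
	return (cacheable);
}

static void
alias_example(void)
{
	struct page pg;
	struct tte a, b, c;

	page_init(&pg);
	assert(page_enter(&pg, &a, 0x10000));   /* first mapping, color 0 */
	assert(page_enter(&pg, &b, 0x14000));   /* same color: legal alias */
	assert(!page_enter(&pg, &c, 0x12000));  /* color 1: illegal alias */
}
```

Because the link is embedded in the tte, entering a mapping never has to
allocate anything, which is the property that matters for kernel mappings.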

[ ... ]
> >
> > With 3 levels this can support a 32 gigabyte virtual address space (fully
> > resident), but doesn't penalize processes with sparse address spaces due
> > to not requiring intermediate "page table pages" in all cases.  Basically
> > like traditional page tables but backwards :).
> >
> 
> This would have rather bad locality of access though, no? The conflict
> side of the thing is resolvable (to an extent) using it as a sum or xor
> addressed cache. I guess I should whip up some code to show what I mean.
> Missing in L2 can delay you for say 100 cycles. Also, this scheme causes
> the pages to accumulate with basically no eviction of 'visit once' pages.

Yes, you're right.  The current scheme has ok cache locality and the
eviction properties are nice.  How does an xor addressed cache work?
I would be interested to see a simple example.  The hash function needs
to be as simple as possible because it needs to fit in 1 or 2 instructions
in the tlb fault handlers.  So '&' is attractive but may not have great
properties.
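
For comparison, here is a small sketch of what I take an xor-folded index
to mean next to the plain '&' mask; this is my reading, not necessarily the
scheme being proposed.  The tsb size (512 entries) and the function names
are hypothetical, and 8K pages are assumed.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 13                      /* assumes 8K base pages */
#define TSB_BITS   9                       /* hypothetical: 512-entry tsb */
#define TSB_MASK   ((1u << TSB_BITS) - 1)

/* Plain mask: uses only the low bits of the virtual page number. */
static unsigned
tsb_index_and(uint64_t va)
{
	return ((unsigned)(va >> PAGE_SHIFT) & TSB_MASK);
}

/* XOR fold: mixes the next TSB_BITS of the page number into the index. */
static unsigned
tsb_index_xor(uint64_t va)
{
	uint64_t vpn = va >> PAGE_SHIFT;

	return ((unsigned)(vpn ^ (vpn >> TSB_BITS)) & TSB_MASK);
}

static void
hash_example(void)
{
	/* Two addresses that differ only above the index bits. */
	uint64_t a = 0x0;
	uint64_t b = (uint64_t)1 << (PAGE_SHIFT + TSB_BITS);

	assert(tsb_index_and(a) == tsb_index_and(b));  /* conflict */
	assert(tsb_index_xor(a) != tsb_index_xor(b));  /* spread apart */
}
```

The fold costs an extra shift and an xor on top of the mask, so it would
eat into the 1 or 2 instruction budget in the fault handlers.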

[ ... ]
> 
> Another way to overcome this would be to allocate the different "ways" of
> the TSB separately, so that the area need not be contiguous.

This has got me thinking.  The main problem is how to map the tsb into
contiguous virtual address space so it can be indexed easily by the tlb
fault handlers, ideally base + offset, where offset is a hash of the
virtual address, with no indirection.  But the only requirement for
contiguity is to make the tlb fault handlers fast.  When it's accessed by
the kernel for pmap processing it doesn't much matter because there are
other slow things going on; adding indirection there won't make much difference.
Just mapping it into the kernel address space is easy because faults on
the tsb inside of the user tlb fault handler are handled uniformly as normal
kernel tlb faults.  If they were handled specially it might be possible
to use a multi level structure that's more transparent.
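
As a rough picture of the separately allocated ways, here is a userland
sketch where each way is its own allocation and a lookup goes through one
pointer per way.  The geometry (4 ways of 128 entries), the TD_V valid bit
and all function names are assumptions for illustration; this is the kind
of indirection that is fine for pmap processing but costs an extra load in
a fault handler.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TSB_WAYS    4                    /* hypothetical geometry */
#define WAY_ENTRIES 128
#define TD_V        ((uint64_t)1 << 63)  /* valid bit, for this sketch */

struct tte {
	uint64_t tte_vpn;
	uint64_t tte_data;
};

struct tsb {
	struct tte *way[TSB_WAYS];       /* each way is its own allocation */
};

static void
tsb_free(struct tsb *t)
{
	for (int w = 0; w < TSB_WAYS; w++) {
		free(t->way[w]);
		t->way[w] = NULL;
	}
}

static int
tsb_init(struct tsb *t)
{
	for (int w = 0; w < TSB_WAYS; w++)
		t->way[w] = NULL;
	for (int w = 0; w < TSB_WAYS; w++) {
		t->way[w] = calloc(WAY_ENTRIES, sizeof(struct tte));
		if (t->way[w] == NULL) {
			tsb_free(t);
			return (-1);
		}
	}
	return (0);
}

/* One pointer load per way, then base + offset within that way. */
static struct tte *
tsb_lookup(struct tsb *t, uint64_t vpn)
{
	for (int w = 0; w < TSB_WAYS; w++) {
		struct tte *tp = &t->way[w][vpn & (WAY_ENTRIES - 1)];

		if ((tp->tte_data & TD_V) != 0 && tp->tte_vpn == vpn)
			return (tp);
	}
	return (NULL);
}

/* Reuse an existing entry, else the first free way, else evict way 0. */
static void
tsb_enter(struct tsb *t, uint64_t vpn, uint64_t data)
{
	struct tte *tp = tsb_lookup(t, vpn);

	if (tp == NULL) {
		tp = &t->way[0][vpn & (WAY_ENTRIES - 1)];
		for (int w = 0; w < TSB_WAYS; w++) {
			struct tte *cand = &t->way[w][vpn & (WAY_ENTRIES - 1)];

			if ((cand->tte_data & TD_V) == 0) {
				tp = cand;
				break;
			}
		}
	}
	tp->tte_vpn = vpn;
	tp->tte_data = data | TD_V;
}

static void
ways_example(void)
{
	struct tsb t;

	if (tsb_init(&t) != 0)
		return;
	tsb_enter(&t, 5, 0x1000);
	tsb_enter(&t, 5 + WAY_ENTRIES, 0x2000);  /* same slot, another way */
	assert(tsb_lookup(&t, 5) != NULL);
	assert(tsb_lookup(&t, 5 + WAY_ENTRIES) != NULL);
	assert(tsb_lookup(&t, 6) == NULL);
	tsb_free(&t);
}
```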

[ ... ]
> 
> I was more thinking about say using 64K pages for user stack or malloc or
> similar - it doesn't quite have the same complexity as general mapping
> scheme, in the malloc case could be requested by the user and would keep
> pressure on TLB & faults down. But sounds like this is not simple/feasible
> for now.

I'd really like to see this, but yes it is hard to do in a generalized way.
The vm system needs to be aware of the contiguity of physical pages, and
try to make reservations of large contiguous regions, expecting all the
pages to be faulted in with the same protections so the mapping can be
"upgraded" to use a larger page size.  The best work that I've seen on
this topic so far is this paper:
	http://www.cs.rice.edu/~ssiyer/r/superpages/
The Alan Cox on that paper is a FreeBSD guy, so we may see this in FreeBSD at some point.
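
The condition for "upgrading" a mapping can be sketched roughly as below.
This is only the precondition described above, not the reservation scheme
from the paper; the struct, the can_promote name and the sizes (8K base
pages, a 64K large page) are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT      13   /* assumes 8K base pages */
#define SUPERPAGE_PAGES 8    /* a 64K page covers 8 base pages */
#define SUPERPAGE_SIZE  ((uint64_t)SUPERPAGE_PAGES << PAGE_SHIFT)

/* Hypothetical per-page mapping state for one candidate region. */
struct mapping {
	uint64_t pa;     /* physical address backing this base page */
	unsigned prot;   /* protection bits */
	bool     valid;  /* has the page been faulted in? */
};

/*
 * A region can use the larger page size only if it is aligned, fully
 * faulted in, physically contiguous, and uniformly protected.
 */
static bool
can_promote(uint64_t va, const struct mapping m[SUPERPAGE_PAGES])
{
	if ((va & (SUPERPAGE_SIZE - 1)) != 0)
		return (false);
	if (!m[0].valid || (m[0].pa & (SUPERPAGE_SIZE - 1)) != 0)
		return (false);
	for (int i = 1; i < SUPERPAGE_PAGES; i++) {
		if (!m[i].valid || m[i].prot != m[0].prot ||
		    m[i].pa != m[0].pa + ((uint64_t)i << PAGE_SHIFT))
			return (false);
	}
	return (true);
}

static void
superpage_example(void)
{
	struct mapping m[SUPERPAGE_PAGES];

	for (int i = 0; i < SUPERPAGE_PAGES; i++) {
		m[i].pa = 0x100000 + ((uint64_t)i << PAGE_SHIFT);
		m[i].prot = 3;
		m[i].valid = true;
	}
	assert(can_promote(0x40000, m));   /* aligned, contiguous, uniform */
	m[5].prot = 1;
	assert(!can_promote(0x40000, m));  /* one page differs in protection */
}
```

The hard part is upstream of this check: getting the vm system to hand out
physical pages that will ever satisfy it.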

Jake