Re: cvs commit: src/sys/i386/i386 pmap.c
Jeff Roberson wrote:
> On Sat, 25 Oct 2003, Peter Wemm wrote:
> > peter 2003/10/25 11:51:41 PDT
> > FreeBSD src repository
> > Modified files:
> > sys/i386/i386 pmap.c
> > Log:
> > For the SMP case, flush the TLB at the beginning of the page zero/copy
> > routines. Otherwise we run into trouble with speculative tlb preloads
> > on SMP systems. This effectively defeats Jeff's revision 1.438
> > optimization (for his pentium4-M laptop) in the SMP case. It breaks
> > other systems, particularly athlon-MP's.
> If the page tables are NULL why does this break speculative tlb preloads?
While we're zeroing the page, CMAP2 (or friends) are non-NULL. If another
cpu accesses a nearby page and the cpu decides to speculatively preload
the nearby TLB entries, then it will cache the CMAP2 value. Meanwhile, the
originating cpu clears it again and flushes its own TLB entry. But, if we then
do a pmap_zero_page on the other cpu, it can still have the speculatively
cached tlb entry and zero the wrong page.
Poul-Henning was able to reproduce this problem in short order. The first
hack we tried was to change invlcaddr() to do a global shootdown. It solved
the crashes, presumably by purging all other cpus' copies of CMAP2 including
any speculatively loaded values. Obviously this is expensive and defeats
the point of doing local flushes only.
So, as a lighter weight solution, we tried flushing after every page table
modification, as the IA32 system programmers manual says we must, and it
too solved the problem - without the expense of extra tlb shootdowns.
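
To make the race and the fix concrete, here is a toy user-space model in C.
It is only a sketch under loud assumptions: it is not the pmap.c code, and
every name in it (cmap2_pte, tlb[], speculative_preload(), zero_page(),
scenario_ok()) is made up for illustration. It models one shared CMAP2 page
table entry, a one-entry "TLB" per cpu, and a local-only invlpg.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NCPU      2
#define NPAGES    4
#define PAGE_SIZE 16
#define PTE_NONE  (-1)

/* Hypothetical model state, not kernel data structures. */
static unsigned char phys[NPAGES][PAGE_SIZE]; /* "physical" pages */
static int  cmap2_pte = PTE_NONE;             /* shared PTE behind CMAP2 */
static int  tlb[NCPU];                        /* per-cpu cached translation */
static bool tlb_valid[NCPU];

static void model_reset(void)
{
    memset(phys, 0xAA, sizeof(phys));
    memset(tlb_valid, 0, sizeof(tlb_valid));
    cmap2_pte = PTE_NONE;
}

/* Local flush: drops only this cpu's cached entry. */
static void invlpg(int cpu)
{
    tlb_valid[cpu] = false;
}

/* Another cpu caches the live PTE without ever touching the address. */
static void speculative_preload(int cpu)
{
    if (cmap2_pte != PTE_NONE) {
        tlb[cpu] = cmap2_pte;
        tlb_valid[cpu] = true;
    }
}

/* Use the cached entry if there is one, else walk the "page table". */
static int translate(int cpu)
{
    if (!tlb_valid[cpu]) {
        tlb[cpu] = cmap2_pte;
        tlb_valid[cpu] = true;
    }
    assert(tlb[cpu] >= 0 && tlb[cpu] < NPAGES);
    return tlb[cpu];
}

/* spec_cpu >= 0 means that cpu preloads while the mapping is live. */
static void zero_page(int cpu, int page, bool flush_first, int spec_cpu)
{
    cmap2_pte = page;                 /* install the mapping */
    if (flush_first)
        invlpg(cpu);                  /* flush after modifying the PTE */
    if (spec_cpu >= 0)
        speculative_preload(spec_cpu);
    memset(phys[translate(cpu)], 0, PAGE_SIZE);
    cmap2_pte = PTE_NONE;             /* clear it again */
    invlpg(cpu);                      /* local flush only, no shootdown */
}

static bool page_is(int page, unsigned char v)
{
    for (int i = 0; i < PAGE_SIZE; i++)
        if (phys[page][i] != v)
            return false;
    return true;
}

/* True if cpu 1 zeroed page 2 and left the reused page 1 alone. */
bool scenario_ok(bool flush_first)
{
    model_reset();
    zero_page(0, 1, flush_first, 1);   /* cpu 1 caches CMAP2 -> page 1 */
    memset(phys[1], 0x55, PAGE_SIZE);  /* page 1 is reused for live data */
    zero_page(1, 2, flush_first, -1);  /* cpu 1 wants to zero page 2 */
    return page_is(2, 0) && page_is(1, 0x55);
}
```

In this model scenario_ok(false) fails because cpu 1 still holds the stale
entry and wipes page 1, while scenario_ok(true) passes because the flush at
the start forces a fresh walk of the entry. Real hardware is far less
deterministic, of course; the point is only the ordering.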
Perhaps we should change back to using the switchin purge and flush at the
beginning as an alternative to two flushes. The expense of invlpg seems to
be unique to the pentium-4's. athlon's run at about 100 clock cycles (80 on
Peter Wemm - peter_at_wemm.org; peter_at_FreeBSD.org; peter_at_yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5
Received on Sat Oct 25 2003 - 16:07:12 UTC