How to prevent other CPU from accessing a set of pages before calling pmap_remove_all function

Kostik Belousov kostikbel at gmail.com
Thu Sep 10 12:08:51 UTC 2009


On Wed, Sep 09, 2009 at 11:57:24PM -0700, MingyanGuo wrote:
> On Wed, Sep 9, 2009 at 11:26 PM, MingyanGuo <guomingyan at gmail.com> wrote:
> 
> > Hi all,
> >
> > I find that function pmap_remove_all for arch amd64 leaves a time
> > window between reading & clearing the PTE flags (accessed flag and dirty
> > flag) and invalidating its TLB entry on other CPUs. After some discussion
> > with Li Xin (cc'ed), I think all the processes that are using the PTE being
> > removed should be blocked before calling pmap_remove_all, or another CPU may
> > dirty the page without setting the dirty flag before the TLB entry is
> > flushed. But I cannot find how to block them before calling the function. I
> > read the function vm_pageout_scan in file vm/vm_pageout.c but cannot find the
> > exact method it uses.  Or did I just misunderstand the semantics of function
> > pmap_remove_all?
> >
> > Thanks in advance.
> >
> > Regards,
> > MingyanGuo
> >
> 
> Sorry for the noise. I understand the logic now. There is no time window
> problem between reading & clearing the PTE and invalidating it on other CPUs,
> even if another CPU is using the PTE.  I misunderstood the logic.

Hmm. What would happen in the following scenario?

Assume that the page m is mapped by the vm map active on CPU1, and that
CPU1 has a cached TLB entry for some writable mapping of this page,
but neither the TLB entry nor the PTE has the dirty bit set.

Then, assume that the following sequence of events occurs:

CPU1:						CPU2:
					call pmap_remove_all(m)
					clear pte
write to the address mapped
    by m [*]
					invalidate the TLB,
					    possibly making IPI to CPU1

I assume that at the point marked [*], we can
- either lose the dirty bit, because CPU1 (atomically) sets the dirty bit
  in the already cleared pte.
  Besides not properly tracking the modification status of the page,
  this could also cause the page table page to be modified, which would
  create a non-zero page with the PG_ZERO flag set.
- or CPU1 re-reads the PTE when setting the dirty bit, and generates
  #pf since the valid bit in the PTE is zero.

The Intel documentation mentions that updates of the dirty or accessed bits
are done with a locked cycle, which definitely means that the PTE is re-read,
but I cannot find whether the valid bit is rechecked.


More information about the freebsd-hackers mailing list