Superpages on amd64 FreeBSD 7.2-STABLE

Sat Dec 12 19:50:53 UTC 2009

On Thu, Dec 10, 2009 at 8:50 AM, Bernd Walter <ticso at cicely7.cicely.de> wrote:

> On Wed, Dec 09, 2009 at 09:07:33AM -0500, John Baldwin wrote:
> > On Thursday 26 November 2009 10:14:20 am Linda Messerschmidt wrote:
> > > It's not clear to me if this might be a problem with the superpages
> > > implementation, or if squid does something particularly horrible to
> > > its memory when it forks to cause this, but I wanted to ask about it
> > > on the list in case somebody who understands it better might know
> > > what's going on. :-)
> >
> > I talked with Alan Cox some about this off-list, and there is a case that
> > can cause this behavior: if the parent squid process takes write faults
> > on a superpage before the child process has called exec(), it can result
> > in superpages being fragmented and never reassembled.  Using vfork()
> > should prevent this from happening.  It is a known issue, but it will
> > probably be some time before it is addressed.  There is lower-hanging
> > fruit in other areas in the VM that will probably be worked on first.
>
> The whole thread puzzles me,
> especially because vfork is often called a solution.
>
> Scenario A
> Parent with super page
> fork/exec
> This problem can happen because there is a race.
> The parent now has its super pages fragmented permanently!?
> The child throws away its pages because of the exec!?
>
> Scenario B
> Parent with super page
> vfork/exec
> This problem won't happen because the child has no pseudo copy of the
> parent's memory and then starts with a completely new map.
>
> Scenario C
> Parent with super page
> fork/ no exec
> The problem can happen because the child shares the same memory over
> its complete lifetime.
> The parent can get its super pages fragmented over time.
>
>
I'm not sure how you are defining "problem".  If we define "problem" as I
would, i.e., that "re-promotion can never occur", then Scenario C is not
a problem scenario, only Scenario A is.
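
For concreteness, the vfork/exec pattern of Scenario B might look like the
following minimal sketch in C.  The helper name spawn_with_vfork is made up
for illustration and is not taken from squid.  After vfork() the child
only calls execv() or _exit(), so the parent has no opportunity to take
copy-on-write faults before the exec.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `path` with `argv` using vfork/exec and
 * return the child's exit status, or -1 on error or abnormal exit.
 */
int spawn_with_vfork(const char *path, char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid == -1)
        return -1;              /* vfork failed */

    if (pid == 0) {
        /* Child: only exec or _exit is safe after vfork(). */
        execv(path, argv);
        _exit(127);             /* reached only if execv failed */
    }

    /* Parent: resumes once the child has exec'ed or exited. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass the program path and a NULL-terminated argument
vector, for example "/bin/sh" with { "sh", "-c", "exit 0", NULL }.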

The source of the problem in Scenario A is basically that we have two ways
of handling copy-on-write faults.  Before the exec() occurs, copy-on-write
faults are handled as you might intuit from the name: a new physical copy
is made.  If the entirety of the 2MB region is written to before the
exec(), then this region will be promoted to a superpage.  However, once
the exec() occurs, copy-on-write faults are "optimized".  Specifically,
the kernel recognizes that the underlying physical page is no longer
shared with the child and simply restores write access to it.  It is the
combination of these two methods that effectively blocks re-promotion,
because the underlying 4KB physical pages within a 2MB region are no
longer contiguous.
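
To make the contiguity argument concrete, here is a toy model in C.  It is
only a sketch, not kernel code: struct region, cow_copy, cow_reuse, and
region_promotable are names invented for illustration.  It also assumes
that each copied page lands at the matching offset of a fresh, aligned 2MB
run of frames, which is a simplification of how the allocator places pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGES_PER_REGION 512UL  /* 2MB region / 4KB pages on amd64 */

/* Toy model: frame[i] is the physical frame number backing the i-th
 * 4KB page of one 2MB virtual region. */
struct region {
    unsigned long frame[PAGES_PER_REGION];
};

/* Back every page from one aligned 2MB run of frames starting at base. */
void region_init(struct region *r, unsigned long base)
{
    for (size_t i = 0; i < PAGES_PER_REGION; i++)
        r->frame[i] = base + i;
}

/* Write fault before exec(): the page is still shared, so it is copied.
 * Assumption: the copy lands at the same offset in a new aligned run. */
void cow_copy(struct region *r, size_t page, unsigned long new_base)
{
    assert(page < PAGES_PER_REGION);
    r->frame[page] = new_base + page;
}

/* Write fault after exec(): the page is no longer shared, so write
 * access is restored and the original frame is kept. */
void cow_reuse(struct region *r, size_t page)
{
    assert(page < PAGES_PER_REGION);
    (void)r;                    /* frame[page] is unchanged */
}

/* Promotion needs one aligned, contiguous run of physical frames. */
bool region_promotable(const struct region *r)
{
    unsigned long base = r->frame[0];

    if (base % PAGES_PER_REGION != 0)
        return false;
    for (size_t i = 1; i < PAGES_PER_REGION; i++)
        if (r->frame[i] != base + i)
            return false;
    return true;
}
```

In this model, copying all 512 pages before the exec() leaves the region
promotable in its new run, while copying only the first few and then
reusing the rest in place leaves it split across two runs, so
region_promotable() returns false.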

In other words, once the first page within a region has been copied, you
have a choice to make: Do you perform avoidable copies, or do you abandon
the possibility of ever creating a superpage?  The former has a significant
one-time cost and the latter has a small recurring cost.  Not knowing how
much the latter will add up to, I chose the former.  However, that choice
may change in time, particularly if I find an effective heuristic for
choosing between the two options.

Anyway, please keep trying superpages with large memory applications like
this.  Reports like this help me to prioritize my efforts.

Regards,
Alan