cvs commit: src/sys/kern subr_param.c sys_pipe.c src/sys/sys
Alan L. Cox
alc at imimic.com
Thu Jul 10 12:19:28 PDT 2003
David Schultz wrote:
> On Tue, Jul 08, 2003, Alan L. Cox wrote:
> > Mike Silbersack wrote:
> > >
> > > On Tue, 8 Jul 2003, Mike Silbersack wrote:
> > >
> > > > If I spend more time working on pipes, there are a bunch of other changes
> > > > I'll be working on first.
> > > >
> > > > Mike "Silby" Silbersack
> > >
> > > Let me explain this statement a little better, as it helps explain why I
> > > didn't do per-user limits in the initial commit.
> > >
> > > As pipes are implemented now (which basically still follows John's
> > > original design), for each side of a pipe we allocate:
> > >
> > > 1 VM object which is linked to
> > > 16KB of address space from the kernel map;
> > > this is pageable, and acquires backing as it is filled.
> > >
> > > For large writes which align on page boundaries, the pages are wired into
> > > kernel memory, and shown directly to the reading end of the pipe.
> > > Naturally, this memory isn't pageable, and is even more important to
> > > limit. (However, as stated in my commit message, this _can_ be limited
> > > without any nasty side effects.)
> > When "pages are wired into kernel memory" there are two distinct actions
> > being performed: vm_page_wire() and pmap_qenter(). However, as far as I
> > know, there is no reason why the pmap_qenter() has to be performed by
> > the sender. I suspect the mapping could be delayed until just before
> > the copy and released immediately thereafter, thereby eliminating the
> > need for each pipe to have its own KVA for this purpose. In fact, I
> > believe the sf_buf allocator could be used to provide the temporary KVA.
> That would alleviate the KVA pressure, since the mapping would be
> very temporary and you could even get away with just a single
> page. However, it would still tie up the associated physical
> memory until the pipe is read, which may not be soon at all. Is
> there a reason for the memory to be wired, other than that the
> data is easier to track down while the sending process' PTEs are
> still there? I would expect that you could instead just look up
> the appropriate vm_object and lazily fault in the appropriate pages
> on the receiver's side, modulo a few details such as segfault handling.
> But perhaps I'm missing something...
It's a matter of priorities. With the growth trend in physical memory
sizes (and PAE), I see more problems due to KVA pressure than
unnecessarily wired memory. A recent and fairly visible example was
the vnode autosizing problem that had to be fixed prior to 5.1-RELEASE.