nullfs performance (Re: cvs commit: ports/Mk bsd.emacs.mk bsd.gnome.mk bsd.mail.mk bsd.openssl.mk bsd.port.mk bsd.port.subdir.mk bsd.python.mk bsd.ruby.mk bsd.scons.mk ports/Tools/scripts security-check.awk ports/databases/p5-DBD-Oracle Makefile ports/databases/p5-sqlrelay ...)

Brian Somers brian at Awfulhak.org
Thu Aug 17 20:46:10 UTC 2006


On Thu, 17 Aug 2006 12:38:48 -0400
Kris Kennaway <kris at obsecurity.org> wrote:

> On Thu, Aug 17, 2006 at 08:43:17AM -0700, Brian Somers wrote:
> > On Thu, 17 Aug 2006 05:45:01 -0400
> > Kris Kennaway <kris at obsecurity.org> wrote:
> > 
> > > On Wed, Aug 16, 2006 at 11:42:36PM -0700, Brian Somers wrote:
> > > > [-developers elided]
> > > > 
> > > > Interesting... I use nullfs as part of our build system here.
> > > > I found that its performance is appalling when pushed a
> > > > little (running two builds on one machine, each with two
> > > > nullfs mounts, two devfs and two procfs mounts gives a build
> > > > time of 4.5 hours whereas a single build will finish in 1.5
> > > > hours).  I've seen >6 hour builds on a loaded box - only
> > > > attributable to nullfs.
> > > 
> > > What version?
> > 
> > The version that ships with 6.1.
> 
> OK, that's unusual and bears further investigation then.  When I
> measured it I did not see this kind of dramatic performance loss
> (anecdotally it doesn't fit with my own experiences either: I also use
> it for parallel compilations, and I do not see anomalously low
> performance compared to machines not using nullfs).
> 
> The only thing I can think of is that the underlying filesystem is not
> mpsafe (e.g. are you using UFS quotas?), in which case it's not really
> nullfs to blame.

We're not using quotas.
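
A quick sanity check, for what it's worth (nothing cleverer than grep;
neither the fstab nor the live mount list mentions quotas):

    grep -E 'userquota|groupquota' /etc/fstab
    mount | grep 'with quotas'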

> Hmm, are you sure it's 6.1-RELEASE and not a prerelease?  I think the
> code that was supposed to make the nullfs mount mpsafe conditional on
> the mpsafety of the lower layer was broken until some time during the
> release cycle, but it was fixed before the release.

I integrated RELENG_6_1_0_RELEASE into our source tree.
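
If it helps pin down exactly what's in the tree, something like the
following would show the $FreeBSD$ IDs of the nullfs sources (purely
illustrative - this assumes a stock /usr/src checkout with CVS
keywords intact):

    ident /usr/src/sys/fs/nullfs/null_subr.c \
          /usr/src/sys/fs/nullfs/null_vfsops.c \
          /usr/src/sys/fs/nullfs/null_vnops.c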

I guess the missing detail might be that things are somewhat indirected here.

We check out code into /some/deep/directory/tree.  Then, to protect
against the 80-character path limitation, we create /tmp/bld.XXXXX/
and add a scratch -> /tmp/bld.XXXXX symlink inside
/some/deep/directory/tree.
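
In shell terms that step boils down to something like this (just a
sketch - mktemp and the scratch_dir name are illustrative, the real
script is more involved):

    cd /some/deep/directory/tree
    # short scratch area with a short path
    scratch_dir=$(mktemp -d /tmp/bld.XXXXX)
    # ...reachable via a short symlink from inside the deep tree
    ln -s "$scratch_dir" scratch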

We then do various things like:

    mount -t nullfs /some/deep/directory/tree/src scratch/build/src
    mount -t nullfs /some/deep/directory/tree/obj scratch/build/obj
    mount -t devfs devfs scratch/dev
    mount -t procfs procfs scratch/proc

and do a "OBJDIRPREFIX=/build/obj chroot scratch make -C /build/src".
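
The teardown when a build finishes is roughly the reverse (again just
a sketch, reusing the scratch_dir name from the snippet above):

    # run from /some/deep/directory/tree
    umount scratch/proc
    umount scratch/dev
    umount scratch/build/obj
    umount scratch/build/src
    rm scratch
    rm -rf "$scratch_dir"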

Oh, and erm, we've got debug.mpsafenet="0" in /boot/loader.conf - a
remnant of when we were running 5.4 and races in the socket code
were killing our application under load.
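
In case it matters, the live value matches (debug.mpsafenet is a
loader tunable, so sysctl just reports what the kernel booted with):

    sysctl debug.mpsafenet               # 0 on these boxes
    grep mpsafenet /boot/loader.conf     # debug.mpsafenet="0"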

Does the nullfs code path hit the network stack??

-- 
Brian Somers                                       <brian at Awfulhak.org>
Don't _EVER_ lose your sense of humour !            <brian at FreeBSD.org>

