Post 9.1 stable file system problems

Konstantin Belousov kostikbel at gmail.com
Sat Jan 5 15:09:31 UTC 2013


On Tue, Jan 01, 2013 at 05:58:06PM +0200, Konstantin Belousov wrote:
> On Tue, Jan 01, 2013 at 02:39:44PM +0100, Dominic Fandrey wrote:
> > On 01/01/2013 07:51, Konstantin Belousov wrote:
> > > On Tue, Jan 01, 2013 at 02:05:11AM +0100, Dominic Fandrey wrote:
> > >> On 01/01/2013 01:49, Dominic Fandrey wrote:
> > >>> On 01/01/2013 01:29, Chris Rees wrote:
> > >>>> On 1 Jan 2013 00:01, "Dominic Fandrey" <kamikaze at bsdforen.de> wrote:
> > >>>>>
> > >>>>> I have a Tinderbox that I just updated to the current RELENG_9.
> > >>>>> Following the update build times for packages have increased by a
> > >>>>> factor between 5 and 20. I.e. I have packages that used to build in
> > >>>>> 5 minutes and now take an hour.
> > >>>>>
> > >>>>> I'm suspecting the file system ever since I saw that the majority of CPU
> > >>>>> load was caused by ls when I looked at top (more than 2 minutes of CPU
> > >>>>> time were counted that moment). The majority of the time most of the CPU
> > >>>>> load is caused by bsdtar, pkg_add, qmake-qt4, etc. Without exception
> > >>>>> tools that access a lot of files.
> > >>>>>
> > >>>>> The file system on which packages are built is nullfs mounted from
> > >>>>> an async mounted UFS. I turned async off, to no avail.
> > >>>>>
> > >>>>> /usr/src/UPDATING says that there were nullfs optimisations. So I
> > >>>>> think this is where the problem originates. I might hack the tinderbox to
> > >>>>> use 'ln -s' or set it up for NFS to verify this.
> > >>>>
> > >>>> Is your kernel newer than the Jail?  The converse causes problems.
> > >>>
> > >>> I ran makeJail for all jails after updating.
> Did you rebuild your modules together with the new kernel ?
> 
> > >>>
> > >>> I also seem to have similar problems when building in the host-system.
> > >>> The unzip for openjdk-7 has just passed the 11 minutes CPU time mark.
> > >>> On my notebook it takes less than 10 seconds.
> > >>
> > >> Just set WRKOBJDIRPREFIX to a tmpfs on the Tinderbox host system
> > >> and the extract takes less than a second. Originally WRKOBJDIRPREFIX
> > >> also pointed to a nullfs mount.
> > >>
> > >> Afterwards I pointed WRKOBJDIRPREFIX to a UFS file system (without
> > >> nullfs involvement). The entire make extract took 20s.
> > >>
> > >> So still faster by at least a factor of 30 than running it on a nullfs
> > >> mount (I eventually SIGINTed, so I don't know how long it would've run).
> > > 
> > > Could you start providing some useful debugging information ?
> > 
> > That one might be interesting. It's all system time:
> > 
> > # time -lh make extract
> > ===>  License GPLv2 accepted by the user
> > ===>  Found saved configuration for openjdk-7.9.05_1
> > ===>  Extracting for openjdk-7.9.05_2
> > => SHA256 Checksum OK for openjdk-7u6-fcs-src-b24-09_aug_2012.zip.
> > => SHA256 Checksum OK for apache-ant-1.8.4-bin.zip.
> > ===>   openjdk-7.9.05_2 depends on file: /usr/local/bin/unzip - found
> > ^Ctime: command terminated abnormally
> >         4m29.30s real           3.03s user              4m22.55s sys
> >       5008  maximum resident set size
> >        135  average shared memory size
> >       2932  average unshared data size
> >        127  average unshared stack size
> >       7772  page reclaims
> >          0  page faults
> >          0  swaps
> >         19  block input operations
> >        101  block output operations
> >          0  messages sent
> >          0  messages received
> >         41  signals received
> >       1597  voluntary context switches
> >      16590  involuntary context switches
> 
> Ok, from your mount -v output, are the three nullfs mounts the only
> nullfs mounts ever used ?
> 
> Is it only unzip which demonstrates the silly behaviour ? Or does it
> happen with any program ? E.g., is ls(1) or sha1 on the nullfs mount
> also slow ?
> 
> Could you try some low-tech profiling on the slow program. For instance,
> you could run ktrace/kdump -R to see which syscalls are slow.
> 
> The most puzzling part of your report, for me, is that I also use
> nullfs-backed jails both on HEAD and stable/9, at a bigger scale, and I
> do not have an issue. I just did
> pooma32% time unzip -q /usr/local/arch/freebsd/distfiles/openjdk-7u6-fcs-src-b24-09_aug_2012.zip
> unzip -q   3.25s user 23.77s system 78% cpu 34.482 total
> over nullfs mount of
> /usr/home on /usr/sfw/local8/opt/pooma32/usr/home (nullfs, local).
> 
> Please try the following patch, which changes nullfs behaviour to be
> non-cached by default. You could turn on the caching with the 'mount -t
> nullfs -o cache from to' mounting command. I am interested if use/non-use
> of -o cache makes a difference for you.

Ping. Any update ?
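
In case it helps to narrow things down without the ports tree involved,
here is a rough sketch of a metadata-heavy test that can be run against
a tmpfs, a plain UFS directory and the nullfs mount in turn. The name
fsbench and the default file count are assumptions of mine, and the
timing only has one-second resolution, so it is meant to show the
factor 30 kind of difference, not small ones.

```shell
# Hypothetical helper: create, list and remove many small files in DIR
# and report the elapsed wall-clock time in whole seconds.
# Usage: fsbench DIR [NFILES]
fsbench() {
    if [ "$#" -lt 1 ]; then
        echo "usage: fsbench DIR [NFILES]" >&2
        return 2
    fi
    dir=$1
    n=${2:-2000}
    # Scratch directory inside DIR, so the test hits that file system.
    work=$(mktemp -d "$dir/fsbench.XXXXXX") || return 1
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        : > "$work/f$i"          # create an empty file
        i=$((i + 1))
    done
    ls -l "$work" > /dev/null    # stat every file, as ls(1) would
    rm -rf "$work"
    end=$(date +%s)
    echo "$dir: $n files in $((end - start))s"
}
```

Run it once per candidate directory, e.g. 'fsbench /tmp' and then the
same on the UFS and nullfs paths, and compare the reported times.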