Post 9.1 stable file system problems
Konstantin Belousov
kostikbel at gmail.com
Sat Jan 5 15:09:31 UTC 2013
On Tue, Jan 01, 2013 at 05:58:06PM +0200, Konstantin Belousov wrote:
> On Tue, Jan 01, 2013 at 02:39:44PM +0100, Dominic Fandrey wrote:
> > On 01/01/2013 07:51, Konstantin Belousov wrote:
> > > On Tue, Jan 01, 2013 at 02:05:11AM +0100, Dominic Fandrey wrote:
> > >> On 01/01/2013 01:49, Dominic Fandrey wrote:
> > >>> On 01/01/2013 01:29, Chris Rees wrote:
> > >>>> On 1 Jan 2013 00:01, "Dominic Fandrey" <kamikaze at bsdforen.de> wrote:
> > >>>>>
> > >>>>> I have a Tinderbox that I just updated to the current RELENG_9.
> > >>>>> Following the update, build times for packages have increased by a
> > >>>>> factor between 5 and 20. I.e. I have packages that used to build in
> > >>>>> 5 minutes and now take an hour.
> > >>>>>
> > >>>>> I'm suspecting the file system ever since I saw that the majority of CPU
> > >>>>> load was caused by ls when I looked at top (more than 2 minutes of CPU
> > >>>>> time were counted that moment). The majority of the time most of the CPU
> > >>>>> load is caused by bsdtar, pkg_add, qmake-qt4, etc. Without exception
> > >>>>> tools that access a lot of files.
> > >>>>>
> > >>>>> The file system on which packages are built is nullfs mounted from
> > >>>>> an async mounted UFS. I turned async off, to no avail.
> > >>>>>
> > >>>>> /usr/src/UPDATING says that there were nullfs optimisations. So I
> > >>>>> think this is where the problem originates. I might hack the tinderbox to
> > >>>>> use 'ln -s' or set it up for NFS to verify this.
> > >>>>
> > >>>> Is your kernel newer than the Jail? The converse causes problems.
> > >>>
> > >>> I ran makeJail for all jails after updating.
> Did you rebuild your modules together with the new kernel?
>
> > >>>
> > >>> I also seem to have similar problems when building in the host-system.
> > >>> The unzip for openjdk-7 has just passed the 11 minutes CPU time mark.
> > >>> On my notebook it takes less than 10 seconds.
> > >>
> > >> Just set WRKOBJDIRPREFIX to a tmpfs on the Tinderbox host system
> > >> and the extract takes less than a second. Originally WRKOBJDIRPREFIX
> > >> also pointed to a nullfs mount.
> > >>
> > >> Afterwards I pointed WRKOBJDIRPREFIX to a UFS file system (without
> > >> nullfs involvement). The entire make extract took 20s.
> > >>
> > >> So still faster by at least a factor of 30 than running it on a nullfs mount
> > >> (I eventually SIGINTed so I don't know how long it would've run).
> > >
> > > Start providing some useful debugging information ?
> >
> > That one might be interesting. It's all system time:
> >
> > # time -lh make extract
> > ===> License GPLv2 accepted by the user
> > ===> Found saved configuration for openjdk-7.9.05_1
> > ===> Extracting for openjdk-7.9.05_2
> > => SHA256 Checksum OK for openjdk-7u6-fcs-src-b24-09_aug_2012.zip.
> > => SHA256 Checksum OK for apache-ant-1.8.4-bin.zip.
> > ===> openjdk-7.9.05_2 depends on file: /usr/local/bin/unzip - found
> > ^Ctime: command terminated abnormally
> > 4m29.30s real 3.03s user 4m22.55s sys
> > 5008 maximum resident set size
> > 135 average shared memory size
> > 2932 average unshared data size
> > 127 average unshared stack size
> > 7772 page reclaims
> > 0 page faults
> > 0 swaps
> > 19 block input operations
> > 101 block output operations
> > 0 messages sent
> > 0 messages received
> > 41 signals received
> > 1597 voluntary context switches
> > 16590 involuntary context switches
>
> Ok, from your mount -v output, are the three nullfs mounts the only
> nullfs mounts ever used?
>
> Is it only unzip which demonstrates the silly behaviour? Or does it
> happen with any program? E.g., are ls(1) or sha1 on the nullfs mount
> also slow?
>
> Could you try some low-tech profiling on the slow program? For instance,
> you could run ktrace/kdump -R to see which syscalls are slow.
>
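For reference, the ktrace/kdump -R suggestion above could look roughly
like the sketch below. It assumes the trace is limited to system calls
with -t c, and that the third field of each kdump -R line is the relative
timestamp (time since the previous entry). The trace file name, the
archive path and the slowest helper are placeholders for illustration,
not anything taken from this thread.

```shell
# Sketch only: trace one slow command and list the trace entries that
# follow the longest gaps. File names and paths are placeholders.
trace=/tmp/unzip.ktrace

# Sort kdump -R style lines by the third field (assumed to be the
# relative timestamp, after pid and command name), largest first,
# and keep the top N lines (default 20).
slowest() {
    sort -k3,3nr | head -n "${1:-20}"
}

# Only attempt the trace where ktrace exists (e.g. on FreeBSD).
if command -v ktrace >/dev/null 2>&1; then
    # -t c restricts tracing to system call entries and returns.
    ktrace -t c -f "$trace" unzip -q /path/to/archive.zip
    kdump -R -f "$trace" | slowest 20
fi
```

A RET line with a large relative timestamp points at the system call
that returned after the long gap, which is usually the one worth
looking at first.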
> The most puzzling part of your report, for me, is that I also use
> nullfs-backed jails both on HEAD and stable/9, at a larger scale, and
> I do not have an issue. I just did
> pooma32% time unzip -q /usr/local/arch/freebsd/distfiles/openjdk-7u6-fcs-src-b24-09_aug_2012.zip
> unzip -q 3.25s user 23.77s system 78% cpu 34.482 total
> over nullfs mount of
> /usr/home on /usr/sfw/local8/opt/pooma32/usr/home (nullfs, local).
>
> Please try the following patch, which changes nullfs behaviour to be
> non-cached by default. You can turn caching on with the 'mount -t
> nullfs -o cache from to' mounting command. I am interested in whether
> using or not using -o cache makes a difference for you.
Ping. Any update?