Poudriere: rm -rf: Directory not empty

Bryan Drewery bdrewery at FreeBSD.org
Thu Apr 3 14:35:30 UTC 2014


Hi,

While using Poudriere to build packages on segregated tmpfs jails
we sometimes get the following errors:

====>> [08] Starting build of devel/qt4-qt3support
====>> [08] Starting build of graphics/qt4-opengl
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel/qt4-qt3support/work/qt-everywhere-opensource-src-4.8.5/include/Qt: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel/qt4-qt3support/work/qt-everywhere-opensource-src-4.8.5/include: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel/qt4-qt3support/work/qt-everywhere-opensource-src-4.8.5: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel/qt4-qt3support/work: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel/qt4-qt3support: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr: Directory not empty
====>> [08] Starting build of math/py-numpy
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel/qt4-qt3support/work/qt-everywhere-opensource-src-4.8.5/include/Qt: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel/qt4-qt3support/work/qt-everywhere-opensource-src-4.8.5/include: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel/qt4-qt3support/work/qt-everywhere-opensource-src-4.8.5: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel/qt4-qt3support/work: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel/qt4-qt3support: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports: Directory not empty
rm: /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr: Directory not empty

What is happening here is that the devel/qt4-qt3support build finishes
and fails to clean up after itself. Each subsequent build then tries to
clean up the previous build's tempdir and fails as well, which
eventually crashes the whole build.

This is the result of just:

   rm -rf /usr/local/poudriere/data/build/92amd64-default/ref/../08/wrkdirs/usr/ports/devel/qt4-qt3support/work

devel/qt4-qt3support runs rm -rf, fails. kill -9 -1 is run in the jail.
graphics/qt4-opengl starts, runs jail -r [kills processes], tries rm -rf, fails.
math/py-numpy starts, runs jail -r [kills processes], tries rm -rf, fails.

Another example is at the bottom of 
http://beefy1.isc.freebsd.org/bulk/83i386-default/2014-02-12_03h42m23s/logs/eclipse-3.7.1_4.log
The eclipse one involved a process crashing and a coredump as well.
I thought perhaps there was a race between writing core and
removing the directory, but I found no evidence of that either
by code inspection or testing.

As shown above, no processes should be running in the jail at this
point. Poudriere itself is not touching these directories from outside
the jail either. There are also no nullfs mounts of these files
elsewhere that might be getting touched.

What might cause this? It's very difficult to reproduce and is
reported about once every 2 months or less. Note that this is
not due to file flags. A rerun of these same ports won't hit the
issue.

So far the workaround is to umount the tmpfs and remount it, but this
is not a solution as tmpfs is optional for Poudriere. Past research
suggested the problem is not tmpfs-specific, but I am not 100%
confident of that.

This has been seen on at least 9.2-R, and 10.0-R.

I can't recreate this with simple tests though on ZFS or TMPFS.

   cd /tmp
   ( rm -rf test; mkdir test; cat /dev/random > test/foo & sleep 1; rm -rf test; kill $! )
   ( rm -rf test; mkdir test; mkfifo test/foo; cat test/foo & sleep 1; rm -rf test; kill $! )
   ( rm -rf test; mkdir test; cd test; rm -rf ../test )

In the other cases it's not clear if looping on rm -rf would work or
if it would spin forever. We have not tried it since it's so difficult
to reproduce.
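
If someone does want to try the looping approach without risking an
endless spin, a bounded retry is one option. The following is only a
rough sketch, not something Poudriere does today: the function name
rm_retry, the default of 5 attempts and the 1 second pause are all
made up for illustration. It also lists whatever is left behind after
each failed attempt, which may help show what is holding the directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Poudriere): retry "rm -rf" a bounded
# number of times instead of looping forever.
# Usage: rm_retry <directory> [max_attempts]
rm_retry() {
    dir=$1
    max=${2:-5}        # assumed default, chosen arbitrarily
    i=1
    while [ "$i" -le "$max" ]; do
        rm -rf -- "$dir" 2>/dev/null
        # Done as soon as the path no longer exists.
        if [ ! -e "$dir" ]; then
            return 0
        fi
        # Show what survived this attempt, for later diagnosis.
        echo "rm_retry: attempt $i/$max left entries in $dir:" >&2
        find "$dir" -ls >&2
        i=$((i + 1))
        sleep 1        # assumed pause between attempts
    done
    return 1
}
```

It could be called as "rm_retry /path/to/wrkdir 10 || echo still failing",
with the caller falling back to the umount/remount workaround when it
returns non-zero.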

-- 
Regards,
Bryan Drewery

