Strange ZFS problem, filesystem claims to be full when clearly not full

Torbjorn Kristoffersen torbjoern at gmail.com
Thu Sep 30 13:28:27 UTC 2010


On Thu, Sep 30, 2010 at 11:11 AM, Danny Carroll <fbsd at dannysplace.net> wrote:
>
>  On 30/09/2010 6:36 PM, Alexander Leidinger wrote:
> >
> > Quoting Jeremy Chadwick <freebsd at jdc.parodius.com> (from Wed, 29 Sep
> > 2010 15:15:49 -0700):
> >
> >> On Thu, Sep 30, 2010 at 12:11:09AM +0200, Torbjorn Kristoffersen wrote:
> >>> I'm at a complete loss here. I shut down the jail completely, and I am
> >>> watching the jail's ZFS filesystem grow as we speak.  No process is
> >>> using
> >>> it.   It only grows in "df" and "zfs list", I can't find any files
> >>> that are
> >>> growing.  I have to re-set the quota to be higher and higher to
> >>> accommodate
> >>> the space.
> >>>
> >>> On Wed, Sep 29, 2010 at 10:46 PM, Torbjorn Kristoffersen <
> >>> torbjoern at gmail.com> wrote:
> >>>
> >>> > Hi Jeremy.
> >>> >
> >>> > 1) I checked now, and found nothing extraordinary. Just processes
> >>> that have
> >>> > been running for a long while, such as screen, cron, sshd, bash,
> >>> irssi,
> >>> > syslogd, etc.
> >>> >
> >>> > 2) No compression used on this zfs filesystem (or any of the others).
> >>> >
> >>> > I completely stopped the jail now, and removed some of the
> >>> directories
> >>> > with the most data in them, but to no avail.
> >>> >
> >>> >
> >>> > On Wed, Sep 29, 2010 at 9:25 PM, Jeremy Chadwick
> >>> <freebsd at jdc.parodius.com
> >>> > > wrote:
> >>> >
> >>> >> On Wed, Sep 29, 2010 at 08:46:38PM +0200, Torbjorn Kristoffersen
> >>> wrote:
> >>> >> > I have a ZFS "tank" called tpool, the server runs a couple of
> >>> jails
> >>> >> (each
> >>> >> > with a zfs filesystem).  There is a problem with one of these
> >>> >> filesystems.
> >>> >> > First, its disk usage as shown in ``df -h'':
> >>> >> > ...
> >>> >> > tpool/rb.org      100G     95G    4.6G    95%    /jails/rb.org
> >>> >> > ...
> >>> >> >
> >>> >> > The command ``zfs list'' shows the same:
> >>> >> > ..
> >>> >> > tpool/rb.org    95.4G  4.56G  95.4G  /jails/rb.org
> >>> >> > ..
> >>> >> >
> >>> >> > However, there is a very mysterious problem somewhere.
> >>> >> > Something inside this jail is eating diskspace, but we can't
> >>> find any
> >>> >> > directory that is actually taking the diskspace. We first
> >>> suspected
> >>> >> either
> >>> >> > fetchmail or spamassassin of causing a lot of space to be used,
> >>> since
> >>> >> some
> >>> >> > of their directories were huge. (These were later deleted, and
> >>> which is
> >>> >> why
> >>> >> > you see that 4.6GB is now available, before that 0GB was
> >>> available).
> >>> >> >
> >>> >> > However, we can't find *any trace* of an actual directory or
> >>> file that
> >>> >> is
> >>> >> > taking all the space.
> >>> >> >
> >>> >> > Take this for instance:
> >>> >> >
> >>> >> > outsidejail# du -sh rb.org
> >>> >> >  43G    rb.org
> >>> >> >
> >>> >> > How can this be?  df and zfs are showing that the entire drive
> >>> is nearly
> >>> >> > full, yet I can't find any directory that is actually taking
> >>> all this
> >>> >> space.
> >>> >> >  I've carefully looked through every single directory within
> >>> the jail
> >>> >> trying
> >>> >> > to find something that's taking all that space, but to no avail.
> >>> >> >
> >>> >> > ----
> >>> >> > My system stats:
> >>> >> > # uname -a
> >>> >> > FreeBSD grim 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Mon Jul 19
> >>> 02:36:49 UTC
> >>> >> > 2010
> >>> root at mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  amd64
> >>> >> > # zpool get version tpool
> >>> >> > NAME   PROPERTY  VALUE    SOURCE
> >>> >> > tpool  version   14       default
> >>> >> > # zpool status
> >>> >> >   pool: tpool
> >>> >> >  state: ONLINE
> >>> >> >  scrub: none requested
> >>> >> > config:
> >>> >> >
> >>> >> >         NAME        STATE     READ WRITE CKSUM
> >>> >> >         tpool       ONLINE       0     0     0
> >>> >> >           mirror    ONLINE       0     0     0
> >>> >> >             ad4s1d  ONLINE       0     0     0
> >>> >> >             ad6s1d  ONLINE       0     0     0
> >>> >> >
> >>> >> > errors: No known data errors
> >>> >> >
> >>> >> > [ Note that I've also done a scrub recently ]
> >>> >>
> >>> >> 1) Have you checked using fstat to ensure that no file descriptors
> >>> >> remain open on any of your ZFS filesystems (not pools)?
> >>> >>
> >>> >> 2) Are you using compression on any of your ZFS filesystems?
> >>
> >> Andriy and Pawel,
> >>
> >> Do either of you have ideas as to what could cause the issue Torbjorn's
> >> experiencing?  I swear I remember some bug or quirk that got fixed with
> >> regards to free space on ZFS, but as has been proven time and time again
> >> my memory is horrible.  His kernel's 8.1-RELEASE dated July 19th.
> >
> > IIRC the commit you talk about was by Martin (CCed). I do not know if
> > it is (already) MFCed.
> >
> > I'm not sure the bug you talk about is related to what Torbjorn is
> > talking about. The fact that the free space is going down while the
> > jail is shutdown (and I assume jls does not show his JID anymore, so
> > all of its processes are really gone) points more to some other
> > process (outside of the jail) which is filling some (maybe already
> > deleted, so not visible anymore with du) file.
> >
>
> It certainly smells like a process still writing to a file that is unlinked.
> I wonder if it would show up with lsof.
>
> If dtrace is enabled on that machine then I think it should be easy to
> see which process is performing write operations.
>

That could very well be.  Interestingly, DTrace is not installed and
doesn't even load.  When I run "kldload dtraceall" it says:

    kldload: can't load dtraceall: Exec format error

Perhaps I should recompile the kernel on this server and build DTrace
into it.  Or perhaps I should first update to FreeBSD-STABLE, as it is
more cutting edge?

Actually, I'll first do a complete backup of this jail, remove the zfs
filesystem, then re-create it, put the files back, and see what
happens.  The unfortunate thing is that I will be ruining a chance to
find out what really happened.
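
For reference, the "process still writing to an unlinked file" theory
quoted above can be reproduced in isolation.  The following is only a
minimal sketch, not a diagnosis of this pool: the temporary directory
and the file name hidden.log are made up for illustration, and it
assumes a POSIX sh with mktemp and dd available.  It shows that data
written through a descriptor to an already-deleted file is invisible to
ls and du, and that the space is only released once the descriptor is
closed.

```sh
#!/bin/sh
# Sketch: an unlinked file that is still held open keeps growing on disk
# while being invisible to directory-based tools such as du.
# The directory and file name below are hypothetical.

dir=$(mktemp -d)

# Open file descriptor 3 for writing, then unlink the file's only name.
exec 3>"$dir/hidden.log"
rm "$dir/hidden.log"

# Write 1 MiB through the still-open descriptor. The blocks are allocated
# on the filesystem even though no directory entry points to them.
dd if=/dev/zero bs=1024 count=1024 >&3 2>/dev/null

# Count the directory entries that ls (and therefore du) can still see.
visible=$(ls -A "$dir" | wc -l | tr -d ' ')
echo "visible entries in $dir: $visible"
du -sk "$dir"

# On FreeBSD, running "fstat -f <mountpoint>" at this point should list
# the process that still holds the deleted file open.

# Closing the descriptor is what finally lets the filesystem free the space.
exec 3>&-
rmdir "$dir"
```

If the jail's filesystem behaves like this, the space should come back
as soon as the process holding the descriptor exits or is killed, with
no need to destroy and re-create the filesystem.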


More information about the freebsd-fs mailing list