panic ffs_truncate3 (maybe fuse being evil)
Rick Macklem
rmacklem at uoguelph.ca
Wed Jan 13 15:16:30 UTC 2016
Kostik wrote:
> On Sun, Jan 10, 2016 at 10:01:57AM -0500, Rick Macklem wrote:
> > Hi,
> >
> > When fooling around with GlusterFS, I can get this panic intermittently.
> > (I had a couple yesterday.) This happens on a Dec. 5, 2015 head kernel.
> >
> > panic: ffs_truncate3
> > - backtrace without the numbers (I just scribbled it off the screen)
> > ffs_truncate()
> > ufs_inactive()
> > VOP_INACTIVE_APV()
> > vinactive()
> > vputx()
> > kern_unlinkat()
> >
> > So, at a glance, it seems that either
> > bo_dirty.bv_cnt
> > or bo_clean.bv_cnt
> > is non-zero. (There is another case for the panic, but I thought it
> > was less likely?)
> >
> > So, I'm wondering if this might be another side effect of r291460,
> > since after that a new vnode isn't completely zero'd out?
> >
> > However, shouldn't bo_dirty.bv_cnt and bo_clean.bv_cnt be zero when
> > a vnode is recycled?
> > Does this make sense or do some fields of v_bufobj need to be zero'd
> > out by getnewvnode()?
> Look at the _vdrop(). When a vnode is freed to zone, it is asserted
> that bufobj queues are empty. I very much doubt that it is possible
> to leak either buffers or counters by reuse.
>
> >
> > GlusterFS is using fuse and I suspect that fuse isn't cleaning out
> > the buffers under some circumstance (I already noticed that there
> > isn't any code in its fuse_vnop_reclaim() and I vaguely recall that
> > there are conditions where VOP_INACTIVE() gets skipped, so that
> > VOP_RECLAIM()
> > has to check for anything that would have been done by VOP_INACTIVE()
> > and do it, if it isn't already done.)
> But even if fuse leaves the buffers around, is it UFS which panics for
> you ? I would rather worry about dangling pointers and use after free in
> fuse, which is a known issue with it anyway. I.e. it could be that fuse
> operates on reclaimed and reused vnode as its own.
>
> >
> > Anyhow, if others have thoughts on this (or other hunches w.r.t. what
> > could cause this panic), please let me know.
>
> The ffs_truncate3 was deterministically triggered by a bug in ffs_balloc().
> The routine allocated buffers for indirect blocks, but if the blocks could not
> be allocated, the buffers were left on the queue. See r174973; this was fixed
> a very long time ago.
>
Well, although I have r174973 in the kernel that crashes, it looks like this
bug might have been around for a while.
Here's what I've figured out so far.
1 - The crashes only occur if soft updates are disabled. This isn't surprising
if you look at ffs_truncate(), since the test for the panic isn't done
when soft updates are enabled.
Here's the snippet from ffs_truncate(), in case you are interested:
    if (DOINGSOFTDEP(vp)) {
        if (softdeptrunc == 0 && journaltrunc == 0) {
            /*
             * If a file is only partially truncated, then
             * we have to clean up the data structures
             * describing the allocation past the truncation
             * point. Finding and deallocating those structures
             * is a lot of work. Since partial truncation occurs
             * rarely, we solve the problem by syncing the file
             * so that it will have no data structures left.
             */
            if ((error = ffs_syncvnode(vp, MNT_WAIT, 0)) != 0)
                return (error);
        } else {
            flags = IO_NORMAL | (needextclean ? IO_EXT: 0);
            if (journaltrunc)
                softdep_journal_freeblocks(ip, cred, length,
                    flags);
            else
                softdep_setup_freeblocks(ip, length, flags);
            ASSERT_VOP_LOCKED(vp, "ffs_truncate1");
            if (journaltrunc == 0) {
                ip->i_flag |= IN_CHANGE | IN_UPDATE;
                error = ffs_update(vp, 0);
            }
            return (error);
        }
    }
You can see that it always returns from within this block. The only way the code
can get past it when soft updates are enabled is a "goto extclean;", which takes
you past the panic().
By adding a few printf()s, I have determined:
- bo_clean.bv_cnt == 1 when the panic occurs, and the b_lblkno of the buffer is negative.
If you look at vtruncbuf():
1725     trunclbn = (length + blksize - 1) / blksize;
1726
1727     ASSERT_VOP_LOCKED(vp, "vtruncbuf");
1728 restart:
1729     bo = &vp->v_bufobj;
1730     BO_LOCK(bo);
1731     anyfreed = 1;
1732     for (;anyfreed;) {
1733         anyfreed = 0;
1734         TAILQ_FOREACH_SAFE(bp, &bo->bo_clean.bv_hd, b_bobufs, nbp) {
1735             if (bp->b_lblkno < trunclbn)
1736                 continue;
When length == 0, trunclbn is 0, but the test at line#1735 skips the buffer
because its negative b_lblkno is less than trunclbn, so the buffer is left on
the clean list.
That is as far as I've gotten. A couple of things I need help from others on:
- Is vtruncbuf() skipping over the cases where b_lblkno < 0 a feature or a bug?
- If it is a feature, then what needs to be done in the code after the vtruncbuf()
call in ffs_truncate() to ensure the buffer is gone by the time the panic check is
done?
--> I do see a bunch of code after the vtruncbuf() call related to indirect blocks
(which I think use the negative b_lblkno values?), but I'll admit I don't understand it well
enough to know whether it expects vtruncbuf() to leave the negative block on the bv_hd list.
Obviously fixing vtruncbuf() to get rid of these negative b_lblkno entries would be easy,
but I don't know whether the current behaviour is a feature or a bug.
I did look at the commit logs, and vtruncbuf() has been like this for at least 10 years.
(I can only guess that very few run UFS without soft updates, or others would see these panics.)
I am now running with soft updates enabled to avoid the crashes, but I can easily test any
patch if someone has one to try.
Thanks for your help with this, rick
More information about the freebsd-fs mailing list