namecache: numneg > 0 but ncneg is empty

Peter Holm peter at holm.cc
Thu Dec 19 08:19:02 UTC 2013


On Thu, Dec 19, 2013 at 09:56:28AM +0200, Andriy Gapon wrote:
> on 19/12/2013 09:03 Konstantin Belousov said the following:
> > On Wed, Dec 18, 2013 at 11:17:59AM +0200, Andriy Gapon wrote:
> >>
> >> I've been running a test that exercises vfs, fs and namecache code quite a lot
> >> and I have run into the following panic:
> [snip]
> >> (kgdb) fr 8
> >> #8  0xffffffff8097c22f in cache_enter_time (dvp=0xfffffe031c7215f8,
> >> vp=0xfffffe0a684f05f8, cnp=0xffffff9de1875858, tsp=0x0, dtsp=0x0) at
> >> /usr/src/sys/kern/vfs_cache.c:902
> >> 902                     cache_zap(ncp);
> >> (kgdb) list
> >> 897                     zap = 1;
> >> 898             }
> >> 899             if (hold)
> >> 900                     vhold(dvp);
> >> 901             if (zap)
> >> 902                     cache_zap(ncp);
> >> 903             CACHE_WUNLOCK();
> >> 904     }
> >> 905
> >> 906     /*
> >> (kgdb) i loc
> >> ncp = (struct namecache *) 0x0
> >> n2 = (struct namecache *) 0xffffffff8178a740
> >> ncpp = (struct nchashhead *) 0xffffff8ccde4e9b0
> >> hash = <value optimized out>
> >> flag = 0
> >> hold = 1
> >> zap = 1
> >> len = <value optimized out>
> >>
> >> (kgdb) p numneg
> >> $4 = 437
> >> (kgdb) p ncp
> >> $7 = (struct namecache *) 0x0
> >> (kgdb) p ncneg
> >> $8 = {tqh_first = 0x0, tqh_last = 0xffffffff8178a710}
> >>
> >>
> >> I am not sure that there is a bug in namecache, but if there is one, then the
> >> only suspicious place I could find is ".." handling in cache_enter_time().
> >>
> > 
> > Do you mean that numneg accounting is wrong for the case when the
> > existing ncp is retargeted for dd? This is the only issue I see there,
> > but it looks like a real cause of the failure.
> 
> Yes, this was the case that I suspected.
> 
> > A testcase would be a lot of lookups down a long directory hierarchy,
> > and then walking back through the ".." entries.  Even if the thing does
> > not panic, the resulting length of the ncneg tailq should be strictly
> > less than numneg.
> 
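A minimal userland sketch of the testcase described above: descend a long
directory chain, then walk back up through the ".." entries. The function
name, the depth parameter and the lookups of a nonexistent name (added here
on the assumption that negative entries are what matters) are illustrative
and not taken from the thread. numneg and ncneg are kernel state that this
script cannot observe, so any inspection would still need kgdb, and whether
it reproduces the panic is untested.

```python
import os


def walk_down_and_back(root, depth):
    """Build `depth` nested directories under root, then climb back via "..".

    Returns the number of ".." steps needed to get back to root.
    """
    path = root
    for i in range(depth):
        path = os.path.join(path, "d%d" % i)
        os.mkdir(path)
        # Assumption: a lookup of a missing name is included so that a name
        # cache, if present, has a chance to record a negative entry.
        os.path.exists(os.path.join(path, "missing"))

    steps = 0
    target = os.path.realpath(root)
    while os.path.realpath(path) != target:
        path = os.path.join(path, "..")
        os.stat(path)  # resolves every component, including each ".."
        steps += 1
    return steps
```

Called on a scratch directory, for example walk_down_and_back(tmpdir, 50),
it should return the same depth it was given. The path grows by one "/.."
per step, so very large depths will run into the system path length limit.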
> Kostik,
> 
> thank you for the patch!  I will test it in my environment.
> 
> Peter,
> 
> I am curious about what ideology is behind vfs testing in stress2.  I know that
> I can just look at the code myself, but hope that asking you could be faster.
> Does stress2 exercise a certain set of scenarios?  Or does it have an element of
> randomness?
> 

The tests found in stress2/testcases do everything in a random
fashion.
The tests found in stress2/misc are, for the most part, scenarios that
have been used to find specific problems.

> The reason I am asking is that I have found fsstress (xfsstress) insufficient
> for finding all the corner cases.  I wrote a really simple script that just
> performs random operations like creating, unlinking, renaming, etc. a file or
> directory using randomly generated paths (with certain constraints).  Running a
> hundred instances of that script on the same hierarchy is surprisingly effective
> at uncovering bugs that are very hard to reproduce otherwise.
> So, I am wondering if I've just duplicated what you already had.
> 
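A rough sketch of the kind of script described above, assuming a small fixed
set of path components so that concurrent instances collide often. The actual
script was not posted; every name here (NAMES, random_path, random_op, run)
and the choice of operations are hypothetical.

```python
import os
import random

# Assumption: a tiny alphabet of components keeps the path space small, so
# many instances running on the same tree hit the same names.
NAMES = ["a", "b", "c"]


def random_path(root, rng, max_depth=3):
    """Return a random path of 1..max_depth components below root."""
    depth = rng.randint(1, max_depth)
    return os.path.join(root, *[rng.choice(NAMES) for _ in range(depth)])


def random_op(root, rng):
    """Attempt one random operation; failures are expected and ignored."""
    op = rng.choice(["mkdir", "create", "unlink", "rmdir", "rename"])
    path = random_path(root, rng)
    try:
        if op == "mkdir":
            os.mkdir(path)
        elif op == "create":
            open(path, "w").close()
        elif op == "unlink":
            os.unlink(path)
        elif op == "rmdir":
            os.rmdir(path)
        else:
            os.rename(path, random_path(root, rng))
    except OSError:
        pass  # missing parent, wrong file type, directory not empty, etc.
    return op


def run(root, iterations, seed=None):
    """Run `iterations` random operations and count how often each was tried."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(iterations):
        op = random_op(root, rng)
        counts[op] = counts.get(op, 0) + 1
    return counts
```

Starting many processes that each call run() on the same root would
approximate the hundred-instance setup. Paths never contain "..", so all
operations stay below root.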
> -- 
> Andriy Gapon

-- 
Peter

