DTrace script to trace processes entering vfs::vop_remove

dteske at FreeBSD.org dteske at FreeBSD.org
Fri Dec 5 21:46:59 UTC 2014



> -----Original Message-----
> From: Mark Johnston [mailto:markjdb at gmail.com] On Behalf Of 'Mark
> Johnston'
> Sent: Thursday, December 4, 2014 6:35 PM
> To: dteske at FreeBSD.org
> Cc: freebsd-dtrace at freebsd.org; 'Julian Elischer'
> Subject: Re: DTrace script to trace processes entering vfs::vop_remove
> 
> On Thu, Dec 04, 2014 at 12:55:40PM -0800, dteske at FreeBSD.org wrote:
> >
> >
> > > -----Original Message-----
> > > From: Mark Johnston [mailto:markjdb at gmail.com] On Behalf Of 'Mark
> > > Johnston'
> > > Sent: Thursday, December 4, 2014 10:23 AM
> > > To: dteske at FreeBSD.org
> > > Cc: freebsd-dtrace at freebsd.org; 'Julian Elischer'
> > > Subject: Re: DTrace script to trace processes entering vfs::vop_remove
> > >
> > > On Wed, Dec 03, 2014 at 06:03:45PM -0800, dteske at FreeBSD.org wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Mark Johnston [mailto:markjdb at gmail.com] On Behalf Of
> Mark
> > > > > Johnston
> > > > > Sent: Wednesday, December 3, 2014 4:45 PM
> > > > > To: dteske at FreeBSD.org
> > > > > Cc: freebsd-dtrace at freebsd.org; 'Julian Elischer'
> > > > > Subject: Re: DTrace script to trace processes entering
> vfs::vop_remove
> > > > >
> > > > > On Wed, Dec 03, 2014 at 03:19:31PM -0800, dteske at FreeBSD.org
> wrote:
> > > > > > Hi markj, list,
> > > > > >
> > > > > > I wrote a script for $work to help me find out "who on Earth
> > > > > > keeps deleting files XYZ?" from a particular storage server.
> > > > > >
> > > > > > Please find attached a copy of watch_vop_remove.d which
> > > > > > has the following sample output:
> > > > > >
> > > > > > 2014 Dec  3 11:58:52 rm[75596]: /tmp/foo
> > > > > >  -+= 72846 0.0 -bash
> > > > > >   \-+= 75589 0.0 /bin/bash /usr/home/support/bash_script
> > > > > >     \-+= 75596 0.0 rm -f /tmp/foo
> > > > > >
> > > > > > The above sample output was displayed when executing the
> following
> > > shell
> > > > > > script:
> > > > > >
> > > > > > #!/bin/bash
> > > > > > touch /tmp/foo
> > > > > > rm -f /tmp/foo
> > > > > >
> > > > > > The output format displayed for each vop_remove() call is as
> > follows:
> > > > > >
> > > > > > DATE process[PID]: PATH_TO_DELETE
> > > > > >  -+= GPID UID.GID grandparent_process [arguments (up to 3)]
> > > > > >   \-+= PPID UID.GID parent_process [arguments (up to 3)]
> > > > > >     \-+= PID UID.GID process [arguments (up to 3)]
> > > > >
> > > > > This is neat. I just had a few comments:
> > > > > - You can use walltimestamp when printing the date and time,
instead
> > of
> > > > >   timestamp + blah.
> > > >
> > > > I read that online as well, however:
> > > > walltimestamp appears to _always_ be zero.
> > >
> > > Right, it wasn't working properly on 8.0. :(
> > >
> > > gnn committed a fix for that as r238537.
> > >
> > > >
> > > >
> > > > > - It's possible to get the full argv of the current process with
> > > > >   curpsinfo->pr_psargs. It can be done for other processes too;
see
> > > > >   /usr/lib/dtrace/psinfo.d. (This might not be true depending on
the
> > > > >   FreeBSD version you're on.)
> > > >
> > > > Thanks! I'll have a look.
> > > >
> > > > > - Running this script with a make -j4 buildkernel causes dtrace to
run
> > > > >   out of dynamic variable space.
> > > > >
> > > >
> > > > Any recommendation on how to fix that?
> > > >
> > > > #pragma D option dynvarsize=what_exactly?
> > > > (16m causes a warning that it's lowering the dynamic variable
memory)
> > >
> > > It looks like a leak - once I start seeing the errors, no file
removals
> > > are logged at all. Dynamic variables need to be set to 0 once they're
> > > finished with in order to release the consumed memory.
> > >
> >
> > Thanks! Should be fixed in the latest (attached) version
> > (watch_vop_remove2.d).
> >
> > However, I read here:
> > http://wikis.oracle.com/display/DTrace/Variables
> >
> > Quote: Always assign zero to associative array elements that are no
longer
> > in use.
> >
> > And I read some more about the different variable types in DTrace:
> > http://dtrace.org/blogs/brendan/2011/11/25/dtrace-variable-types/
> >
> > It would appear that I've solved the issue by getting rid of associative
> > arrays.
> >
> > Can you give the latest (attached) a try?
> 
> Yup, it appears to fix the issue for me. Thanks!
> 

Thanks!

Success-aside, I was still noticing a lingering issue (fixed
in the attached most-recent "watch_vop_remove3.d").

On "all night" runs, watching a tight-[infinite-]loop of
"touch /tmp/foo; rm -f /tmp/foo", I was noticing this:

dtrace: error on enabled probe ID 24 (ID 36603: vfs:vop:vop_remove:entry):
invalid address (0xffffff80aa42ea00) in action #71 at DIF: offset 52
dtrace: error on enabled probe ID 24 (ID 36603: vfs:vop:vop_remove:entry):
invalid address (0xffffff80aa8aea00) in action #110 at DIF: offset 52
dtrace: error on enabled probe ID 24 (ID 36603: vfs:vop:vop_remove:entry):
invalid address (0xffffff80aa602a00) in action #93 at DIF: offset 52
dtrace: error on enabled probe ID 24 (ID 36603: vfs:vop:vop_remove:entry):
invalid address (0xffffff80aa6daa00) in action #60 at DIF: offset 52
dtrace: error on enabled probe ID 24 (ID 36603: vfs:vop:vop_remove:entry):
invalid address (0xffffff80aa7dfa00) in action #85 at DIF: offset 52
[snip 215 lines]
dtrace: error on enabled probe ID 24 (ID 36603: vfs:vop:vop_remove:entry):
invalid address (0xffffff80aa42ea00) in action #96 at DIF: offset 52
dtrace: error on enabled probe ID 24 (ID 36603: vfs:vop:vop_remove:entry):
invalid address (0xffffff80aa8aea00) in action #135
dtrace: error on enabled probe ID 24 (ID 36603: vfs:vop:vop_remove:entry):
invalid address (0xffffff80aa47fa00) in action #85 at DIF: offset 52
dtrace: error on enabled probe ID 24 (ID 36603: vfs:vop:vop_remove:entry):
invalid address (0xffffff80aa8dba00) in action #110 at DIF: offset 52
dtrace: error on enabled probe ID 24 (ID 36603: vfs:vop:vop_remove:entry):
invalid address (0xffffff80aa641a00) in action #85 at DIF: offset 52
dtrace: error on enabled probe ID 24 (ID 36603: vfs:vop:vop_remove:entry):
invalid address (0xffffff80aa3efa00) in action #110 at DIF: offset 52
[more lines snipped]

I read-up some more on the difference between variables and the costs
associated with them (specific to FreeBSD; specifically that globals may
have some locking issues and should be avoided).

I made a new version of the script that changes all the variables to the
clause-local syntax (this->), and that fixed the issue (no more spurious
errors about invalid addresses).

Please find attached latest watch_vop_remove3.d, potential candidate
for commit to /usr/share/dtrace/.
-- 
Cheers,
Devin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: watch_vop_remove3.d
Type: application/octet-stream
Size: 16290 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/freebsd-dtrace/attachments/20141205/20aa48a2/attachment.obj>


More information about the freebsd-dtrace mailing list