Re: git: 0e69c959150c - main - dtrace: Fix up %rip for invop probes on x86

From: Mark Johnston <markj_at_freebsd.org>
Date: Tue, 01 Nov 2022 13:45:12 UTC
On Tue, Nov 01, 2022 at 03:36:20PM +0200, Konstantin Belousov wrote:
> On Mon, Oct 31, 2022 at 11:12:17PM +0000, Mark Johnston wrote:
> > The branch main has been updated by markj:
> > 
> > URL: https://cgit.FreeBSD.org/src/commit/?id=0e69c959150c0ba38459e9121158016ee34b0d92
> > 
> > commit 0e69c959150c0ba38459e9121158016ee34b0d92
> > Author:     Mark Johnston <markj@FreeBSD.org>
> > AuthorDate: 2022-10-31 23:11:36 +0000
> > Commit:     Mark Johnston <markj@FreeBSD.org>
> > CommitDate: 2022-10-31 23:11:36 +0000
> > 
> >     dtrace: Fix up %rip for invop probes on x86
> >     
> >     When a breakpoint exception is raised, the saved value of %rip points to
> >     the instruction following the breakpoint.  However, when fetching the
> >     value of %rip using regs[], it's more natural to provide the address of
> >     the breakpoint itself, so modify the kinst and fbt providers accordingly.
> >     
> >     Reported by:    khng
> >     Reviewed by:    christos, khng
> >     MFC after:      2 months
> >     Differential Revision:  https://reviews.freebsd.org/D37218
> > ---
> >  sys/cddl/dev/fbt/x86/fbt_isa.c       | 8 ++++++++
> >  sys/cddl/dev/kinst/amd64/kinst_isa.c | 8 +++++++-
> >  2 files changed, 15 insertions(+), 1 deletion(-)
> > 
> > diff --git a/sys/cddl/dev/fbt/x86/fbt_isa.c b/sys/cddl/dev/fbt/x86/fbt_isa.c
> > index 05ec87ab437f..b05ae4cb2c44 100644
> > --- a/sys/cddl/dev/fbt/x86/fbt_isa.c
> > +++ b/sys/cddl/dev/fbt/x86/fbt_isa.c
> > @@ -84,6 +84,12 @@ fbt_invop(uintptr_t addr, struct trapframe *frame, uintptr_t scratch __unused)
> >  		if ((uintptr_t)fbt->fbtp_patchpoint != addr)
> >  			continue;
> >  		fbtrval = fbt->fbtp_rval;
> > +
> > +		/*
> > +		 * Report the address of the breakpoint for the benefit
> > +		 * of consumers fetching register values with regs[].
> > +		 */
> > +		frame->tf_rip--;
> >  		for (; fbt != NULL; fbt = fbt->fbtp_tracenext) {
> >  			ASSERT(fbt->fbtp_rval == fbtrval);
> >  			if (fbt->fbtp_roffset == 0) {
> > @@ -143,6 +149,8 @@ fbt_invop(uintptr_t addr, struct trapframe *frame, uintptr_t scratch __unused)
> >  				cpu->cpu_dtrace_caller = 0;
> >  			}
> >  		}
> > +		/* Advance to the instruction following the breakpoint. */
> > +		frame->tf_rip++;
> >  		return (fbtrval);
> >  	}
> >  
> > diff --git a/sys/cddl/dev/kinst/amd64/kinst_isa.c b/sys/cddl/dev/kinst/amd64/kinst_isa.c
> > index 6d8d5d521617..e47cfbbf4d4e 100644
> > --- a/sys/cddl/dev/kinst/amd64/kinst_isa.c
> > +++ b/sys/cddl/dev/kinst/amd64/kinst_isa.c
> > @@ -139,6 +139,12 @@ kinst_invop(uintptr_t addr, struct trapframe *frame, uintptr_t scratch)
> >  	if (kp == NULL)
> >  		return (0);
> >  
> > +	/*
> > +	 * Report the address of the breakpoint for the benefit of consumers
> > +	 * fetching register values with regs[].
> > +	 */
> > +	frame->tf_rip--;
> > +
> >  	DTRACE_CPUFLAG_SET(CPU_DTRACE_NOFAULT);
> >  	cpu->cpu_dtrace_caller = stack[0];
> >  	DTRACE_CPUFLAG_CLEAR(CPU_DTRACE_NOFAULT | CPU_DTRACE_BADADDR);
> > @@ -162,7 +168,7 @@ kinst_invop(uintptr_t addr, struct trapframe *frame, uintptr_t scratch)
> >  
> >  			if (kpmd->reg1 == -1 && kpmd->reg2 == -1) {
> >  				/* rip-relative */
> > -				rval = frame->tf_rip - 1 + kpmd->instlen;
> > +				rval = frame->tf_rip + kpmd->instlen;
> >  			} else {
> >  				/* indirect */
> >  				rval = kinst_regval(frame, kpmd->reg1) +
> 
> I am somewhat curious.  Is the breakpoint used there means INT3?

Right.

> If yes, then %eip++ most likely points into the middle of the overwritten
> multibyte instruction.

In this case it does not: on x86, FBT only overwrites single-byte
instructions currently, "push %rbp" and "ret", so we can assume that
%eip++ points to next instruction (and for "ret" %rip will be overridden
later anyway).  This could change, but it would require other changes to
fbt_invop().

kinst can probe multi-byte instructions, but there we do not increment
the instruction pointer this way.  After the probe fires and
kinst_invop() is finished its work, %rip will point to either 1) an
instruction trampoline containing a copy of the probed instruction
(possibly modified if it is %rip-relative) followed by a jmp to the
instruction following the probed instruction, or 2) the computed branch
target for "call" instructions.