i386/84842: i386_set_ioperm(2) timing issue
Bruce Evans
bde at zeta.org.au
Wed Aug 17 04:20:25 GMT 2005
The following reply was made to PR i386/84842; it has been noted by GNATS.
From: Bruce Evans <bde at zeta.org.au>
To: John Baldwin <jhb at freebsd.org>
Cc: freebsd-gnats-submit at freebsd.org, arundel at h3c.de
Subject: Re: i386/84842: i386_set_ioperm(2) timing issue
Date: Wed, 17 Aug 2005 14:19:23 +1000 (EST)
On Tue, 16 Aug 2005, John Baldwin wrote:
> What about replacing the setting of TDF_NEEDRESCHED() in i386_extend_pcb()
> with a call to ltr()? Actually, it takes more work than a ltr() as you have
That would be best if it works. I haven't checked if more magic in
cpu_switch() is needed.
> to update the TSS descriptor in the GDT for the current CPU before you do the
> ltr(). Maybe something like this:
There is certainly more magic...
> Index: i386/sys_machdep.c
> ===================================================================
> RCS file: /usr/cvs/src/sys/i386/i386/sys_machdep.c,v
> retrieving revision 1.102
> diff -u -r1.102 sys_machdep.c
> --- i386/sys_machdep.c 23 Jun 2005 21:56:45 -0000 1.102
> +++ i386/sys_machdep.c 16 Aug 2005 18:08:49 -0000
> @@ -267,9 +267,11 @@
> KASSERT(td->td_pcb->pcb_ext == 0, ("already have a TSS!"));
> mtx_lock_spin(&sched_lock);
> td->td_pcb->pcb_ext = ext;
> -
> - /* switch to the new TSS after syscall completes */
> - td->td_flags |= TDF_NEEDRESCHED;
> +
> + /* Switch to the new TSS. */
> + private_tss |= PCPU_GET(cpumask);
> + *PCPU_GET(tss_gdt) = ext->ext_tssd;
> + ltr(GSEL(GPROC0_SEL, SEL_KPL));
> mtx_unlock_spin(&sched_lock);
>
> return 0;
This seems OK, except that the comment in it should be moved earlier so
that the block of code for the switch includes both the lock and the
unlock. I think the setting of pcb_ext doesn't belong in this block.
The first KASSERT before this code seems to have been broken by KSE.
We assert that td->td_proc != curproc, but pcb_ext is per-thread, so
we need td != curthread, or else we clobber another thread's TSS. If
the setting of pcb_ext actually needs sched_lock, then the second
KASSERT before this code is broken too, since it reads an unlocked
pcb_ext.
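The per-thread vs. per-process distinction is easy to model in a few
lines of userspace C (the structs and names below are toy stand-ins,
not the real kernel types):

```c
/* Toy stand-ins for the kernel's struct proc/struct thread; pcb_ext
 * lives in the per-thread pcb, so the guard must compare threads. */
struct proc { int pid; };
struct thread { struct proc *td_proc; int tid; };

/* Proc-based guard (post-KSE breakage): cannot tell two threads of
 * the same process apart, so it looks like "self" for both. */
static int
guard_by_proc(struct thread *td, struct thread *cur)
{
	return (td->td_proc != cur->td_proc);
}

/* Thread-based guard: distinguishes the running thread itself. */
static int
guard_by_thread(struct thread *td, struct thread *cur)
{
	return (td != cur);
}
```

With two threads a and b in the same process, guard_by_proc(&b, &a)
is 0 (indistinguishable from self) while guard_by_thread(&b, &a) is 1,
which is the distinction the per-thread pcb_ext needs.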
The ltr() call is missing the style bug that all other callers of ltr()
in *.c have: the other callers laboriously copy the constant arg to
a local variable. This seems to have always been unnecessary. The
ltr instruction doesn't work with immediate operands, but the assembler
code has always accessed the arg as a non-immediate. ltr() used to
be implemented in support.s; then the access was to a local variable
on the stack, so there were 2 logical copies of the constant in memory,
first for the local variable and second for the copy of it on the
stack, and probably 1 copy in memory in practice (gcc should push the
constant directly onto the stack). Now ltr() is implemented in cpufunc.h,
and its constraint is missing support for memory operands, so the
constant is probably loaded directly into a register and accessed from
there whether or not it is first copied to a local variable.
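The effect can be demonstrated unprivileged with a harmless stand-in
instruction (movw here, since ltr itself is privileged and 0x28 is an
arbitrary selector value, not the real GSEL(GPROC0_SEL, SEL_KPL)); with
an "rm" constraint gcc may take the arg from a register or from memory,
and the laborious local-variable copy changes nothing:

```c
/* Stand-in for ltr(): same "rm" constraint as the fixed version, but
 * using a plain movw so it can run in userspace.  Assumes gcc-style
 * inline asm on x86/x86-64. */
static inline unsigned short
fake_ltr(unsigned short sel)
{
	unsigned short loaded;

	__asm __volatile("movw %1, %0" : "=r" (loaded) : "rm" (sel));
	return (loaded);
}

unsigned short
call_direct(void)
{
	return (fake_ltr(0x28));	/* constant arg, no copy */
}

unsigned short
call_via_local(void)
{
	unsigned short sel = 0x28;	/* the style-bug copy */

	return (fake_ltr(sel));
}
```

Both callers load the same selector; with optimization gcc generates
essentially the same code for both, which is the point: the copy to a
local variable buys nothing.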
I use the following fixes for some wrong and incomplete constraints:
%%%
Index: cpufunc.h
===================================================================
RCS file: /home/ncvs/src/sys/i386/include/cpufunc.h,v
retrieving revision 1.142
diff -u -2 -r1.142 cpufunc.h
--- cpufunc.h 7 Apr 2004 20:46:05 -0000 1.142
+++ cpufunc.h 8 Apr 2004 13:22:43 -0000
@@ -476,5 +481,5 @@
lidt(struct region_descriptor *addr)
{
- __asm __volatile("lidt (%0)" : : "r" (addr));
+ __asm __volatile("lidt %0" : : "m" (*(char *)(void *)addr)); /* XXX */
}
@@ -482,5 +487,5 @@
lldt(u_short sel)
{
- __asm __volatile("lldt %0" : : "r" (sel));
+ __asm __volatile("lldt %0" : : "rm" (sel));
}
@@ -488,5 +493,5 @@
ltr(u_short sel)
{
- __asm __volatile("ltr %0" : : "r" (sel));
+ __asm __volatile("ltr %0" : : "rm" (sel));
}
%%%
lidt() is wrong and the others are incomplete. The lidt instruction
only takes memory operands, but we trick both the compiler and the
assembler into using a pointer to the memory, with the pointer
constrained to a register. The operand should be simply (*addr), but
struct region_descriptor is normally incompletely declared here, so
I use a bogus cast to avoid dereferencing the whole struct, although
this defeats most of the point of the fix :(. I think the __volatile
declaration ensures no problems in practice. Here we only (?) need
the asm to not be moved to before the initialization of the full *addr.
We use __volatile for all asms in cpufunc.h, although it is documented
to be unnecessary for asms with no output operands like the above.
peter@ wonders if depending on this behaviour in similar "*addr"
asms is what causes mysterious warnings (broken panics) in amd64
npx^Wfpu context switching.
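The bogus-cast idiom itself can be exercised unprivileged with an
ordinary load in place of lidt (which is privileged); struct blob below
plays the role of the incompletely-declared struct region_descriptor:

```c
/* Deliberately incomplete, like struct region_descriptor as seen from
 * cpufunc.h. */
struct blob;

static inline unsigned char
peek(struct blob *addr)
{
	unsigned char b;

	/*
	 * Same idiom as the fixed lidt(): the bogus char cast gives gcc
	 * a completely-typed lvalue at *addr without dereferencing the
	 * incomplete struct, so the asm gets a true "m" operand instead
	 * of a pointer in a register.
	 */
	__asm __volatile("movb %1, %0"
	    : "=r" (b) : "m" (*(char *)(void *)addr));
	return (b);
}
```

As the text notes, only the first byte is declared to the compiler, so
the __volatile is what keeps the asm from being moved before the rest
of *addr is initialized.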
Bruce