Initial FP exception flags incorrect on amd64

Fri Jun 4 19:30:14 PDT 2004

On Wed, Jun 02, 2004 at 06:53:15PM -0700, David Schultz wrote:
> On Thu, Jun 03, 2004, Tim Robbins wrote:
> > On Tue, Jun 01, 2004 at 11:48:46PM -0700, David Schultz wrote:
> > > I discovered that new processes on amd64 have the inexact flag
> > > raised by default, at least on sledge.  However, all the sticky
> > > flags should be clear initially.
> > [...]
> > > I don't have any amd64 hardware of my own to test kernel patches
> > > on, but if I were to make a wild guess as to how to solve the
> > > problem, it would be the following patch.  I would appreciate it
> > > if someone could address the problem, or at least let me know
> > > whether my proposed fix works.
> > > 
> > > Index: sys/amd64/amd64/fpu.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/amd64/amd64/fpu.c,v
> > > retrieving revision 1.149
> > > diff -u -r1.149 fpu.c
> > > --- fpu.c	5 Apr 2004 21:25:51 -0000	1.149
> > > +++ fpu.c	2 Jun 2004 06:08:34 -0000
> > > @@ -73,6 +73,7 @@
> > >  #define	fnstsw(addr)		__asm __volatile("fnstsw %0" : "=m" (*(addr)))
> > >  #define	fxrstor(addr)		__asm("fxrstor %0" : : "m" (*(addr)))
> > >  #define	fxsave(addr)		__asm __volatile("fxsave %0" : "=m" (*(addr)))
> > > +#define	stmxcsr(addr)		__asm("stmxcsr %0" : "=m" (*(addr)))
> > >  #define	start_emulating()	__asm("smsw %%ax; orb %0,%%al; lmsw %%ax" \
> > >  				      : : "n" (CR0_TS) : "ax")
> > >  #define	stop_emulating()	__asm("clts")
> > > @@ -119,6 +120,8 @@
> > >  	fninit();
> > >  	control = __INITIAL_FPUCW__;
> > >  	fldcw(&control);
> > > +	control = __INITIAL_MXCSR__;
> > > +	stmxcsr(&control);
> > >  	fxsave(&fpu_cleanstate);
> > >  	start_emulating();
> > >  	fpu_cleanstate_ready = 1;
> > 
> > This seems to cause a panic (trap 12) on startup. Shouldn't it be ldmxcsr
> > instead? Changing that causes a different kind of panic (trap 9).
> 
> Oops, you're right.  The other problem is probably that control is
> a u_short, so garbage gets loaded into the upper 16 bits of the CSR.
> Making 'control' a u_int should fix the problem, but the following
> patch introduces a new variable to avoid relying on endianness
> for the fldcw.  Thanks, Tim!
> 
> 
> Index: sys/amd64/amd64/fpu.c
> ===================================================================
> RCS file: /cvs/src/sys/amd64/amd64/fpu.c,v
> retrieving revision 1.149
> diff -u -r1.149 fpu.c
> --- sys/amd64/amd64/fpu.c	5 Apr 2004 21:25:51 -0000	1.149
> +++ sys/amd64/amd64/fpu.c	3 Jun 2004 01:48:23 -0000
> @@ -73,6 +73,7 @@
>  #define	fnstsw(addr)		__asm __volatile("fnstsw %0" : "=m" (*(addr)))
>  #define	fxrstor(addr)		__asm("fxrstor %0" : : "m" (*(addr)))
>  #define	fxsave(addr)		__asm __volatile("fxsave %0" : "=m" (*(addr)))
> +#define	ldmxcsr(r)		__asm __volatile("ldmxcsr %0" : : "m" (r))
>  #define	start_emulating()	__asm("smsw %%ax; orb %0,%%al; lmsw %%ax" \
>  				      : : "n" (CR0_TS) : "ax")
>  #define	stop_emulating()	__asm("clts")
> @@ -111,6 +112,7 @@
>  fpuinit(void)
>  {
>  	register_t savecrit;
> +	u_int mxcsr;
>  	u_short control;
>  
>  	savecrit = intr_disable();
> @@ -119,6 +121,8 @@
>  	fninit();
>  	control = __INITIAL_FPUCW__;
>  	fldcw(&control);
> +	mxcsr = __INITIAL_MXCSR__;
> +	ldmxcsr(mxcsr);
>  	fxsave(&fpu_cleanstate);
>  	start_emulating();
>  	fpu_cleanstate_ready = 1;
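
For the archives, here is a rough userland sketch of why the u_short
version would have faulted. As far as I understand it, ldmxcsr always
reads 32 bits from memory and raises a general protection fault (trap 9
here) if any reserved bit is set; bits 16-31 are always reserved. The
struct and function names below are made up for illustration, and 0x1F80
is only my assumption for the value of __INITIAL_MXCSR__.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bits 16-31 of MXCSR are reserved and must be zero when loaded. */
#define MXCSR_RESERVED_HI	0xFFFF0000u

/* Nonzero if loading this value would set a reserved high bit. */
int
mxcsr_has_reserved_bits(uint32_t value)
{
	return ((value & MXCSR_RESERVED_HI) != 0);
}

/*
 * Hypothetical stand-in for a 16-bit 'control' variable and whatever
 * happens to sit in the two bytes next to it on the stack.
 */
struct stack_slot {
	uint16_t control;
	uint16_t neighbour;
};

/* Read 32 bits starting at the 16-bit variable, as a 32-bit load would. */
uint32_t
read_32_bits_at(const struct stack_slot *slot)
{
	uint32_t wide;

	assert(sizeof(*slot) == sizeof(wide));	/* no padding expected */
	memcpy(&wide, slot, sizeof(wide));
	return (wide);
}
```

With control = 0x1F80 and any nonzero neighbour, read_32_bits_at()
returns a value that mxcsr_has_reserved_bits() rejects, whereas a real
32-bit variable holding 0x1F80 passes.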

Apologies for the late response -- the new patch no longer causes panics
on startup, and your test program now gives the expected result (0x00).
It doesn't seem to break any applications, but I haven't run anything
numerically intensive (just X, KDE, Mozilla, and compiling).
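
The test program itself isn't reproduced in this thread, so the
following is only a guess at the kind of check involved, not the actual
program. It assumes the 0x00 above is the set of raised sticky flags,
which for SSE live in the low six bits of MXCSR (the inexact flag is
0x20). The function names are hypothetical; a real test would more
likely just call fetestexcept(FE_ALL_EXCEPT) from <fenv.h> at the top of
main().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Sticky exception flags occupy bits 0-5 of MXCSR. */
#define MXCSR_IE	0x01u	/* invalid operation */
#define MXCSR_DE	0x02u	/* denormal operand */
#define MXCSR_ZE	0x04u	/* divide by zero */
#define MXCSR_OE	0x08u	/* overflow */
#define MXCSR_UE	0x10u	/* underflow */
#define MXCSR_PE	0x20u	/* precision (inexact) */
#define MXCSR_FLAGS	0x3Fu	/* all six sticky flags */

/* Extract just the sticky flags from a full MXCSR value. */
uint32_t
mxcsr_sticky_flags(uint32_t mxcsr)
{
	return (mxcsr & MXCSR_FLAGS);
}

/* Format the sticky flags as two hex digits, e.g. "0x00" when clear. */
void
format_sticky_flags(uint32_t mxcsr, char *buf, size_t len)
{
	assert(buf != NULL && len >= 5);
	snprintf(buf, len, "0x%02x", (unsigned)mxcsr_sticky_flags(mxcsr));
}
```

A clean initial state such as 0x1F80 formats as 0x00, while the same
value with the inexact flag raised (0x1FA0) formats as 0x20.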


Tim