amd64/169927: siginfo, si_code for fpe errors when error occurs using the SSE math processor

Tue Jul 17 05:40:11 UTC 2012

>Number:         169927
>Category:       amd64
>Synopsis:       siginfo, si_code for fpe errors when error occurs using the SSE math processor
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-amd64
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          update
>Submitter-Id:   current-users
>Arrival-Date:   Tue Jul 17 05:40:10 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Ed Alley
>Release:        8.2-RELEASE amd64
>Organization:
>Environment:
System: FreeBSD epos.domos.org 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Mon Jun 25 00:07:01 PDT 2012 wea at epos.domos.org:/usr/src/sys/amd64/compile/EPOS.6 amd64
    machine is an Intel i5 x86-64

>Description:
     According to sigaction(2), setting SA_SIGINFO in sa_flags causes the
 signal handler to receive a siginfo_t structure, described in siginfo(3),
 along with the signal. For SIGFPE, the si_code member of that structure
 identifies the kind of floating-point error (divide by zero, overflow,
 and so on), and si_addr gives the address of the faulting instruction.
 This is useful when debugging a large program, because the handler
 learns both which error occurred and where it occurred.
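
 A minimal sketch of the mechanism described above, assuming a POSIX
 system. The names fpe_code_name(), install_fpe_handler() and
 run_guarded() are hypothetical helpers, not part of any system API.
 The handler records si_code and si_addr and then leaves through
 siglongjmp(), since on x86 a normal return from a SIGFPE handler
 typically restarts the faulting instruction.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Filled in by the handler so the caller can inspect them afterwards. */
volatile sig_atomic_t fpe_code;		/* si_code of the last SIGFPE */
void *volatile fpe_addr;		/* si_addr of the last SIGFPE */

static sigjmp_buf fpe_env;		/* recovery point, set by run_guarded() */
static volatile sig_atomic_t fpe_env_valid;

/* Map a SIGFPE si_code from siginfo(3) to a printable name. */
const char *
fpe_code_name(int code)
{
	switch (code) {
	case FPE_INTDIV: return ("FPE_INTDIV");
	case FPE_INTOVF: return ("FPE_INTOVF");
	case FPE_FLTDIV: return ("FPE_FLTDIV");
	case FPE_FLTOVF: return ("FPE_FLTOVF");
	case FPE_FLTUND: return ("FPE_FLTUND");
	case FPE_FLTRES: return ("FPE_FLTRES");
	case FPE_FLTINV: return ("FPE_FLTINV");
	case FPE_FLTSUB: return ("FPE_FLTSUB");
	default:         return ("unknown");
	}
}

/* Three-argument handler form selected by SA_SIGINFO. */
static void
fpe_handler(int sig, siginfo_t *info, void *context)
{
	(void)sig;
	(void)context;
	fpe_code = info->si_code;
	fpe_addr = info->si_addr;
	if (fpe_env_valid)
		siglongjmp(fpe_env, 1);
	_exit(1);	/* no recovery point: do not re-run the fault */
}

/* Install the handler; returns 0 on success, -1 on failure. */
int
install_fpe_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = fpe_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return (sigaction(SIGFPE, &sa, NULL));
}

/* Run fn(); returns 1 if a SIGFPE was caught while it ran, else 0. */
int
run_guarded(void (*fn)(void))
{
	fpe_code = 0;
	fpe_addr = NULL;
	if (sigsetjmp(fpe_env, 1) != 0) {
		fpe_env_valid = 0;
		return (1);
	}
	fpe_env_valid = 1;
	fn();
	fpe_env_valid = 0;
	return (0);
}
```

 After run_guarded() returns 1, fpe_code_name(fpe_code) names the error
 and fpe_addr holds the reported location.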

     For FPE errors raised by the x87 unit everything works fine, but when
SSE is used for floating-point calculations the si_code that is
returned is always zero. I have a fix for this, which I have included
as a patch against FreeBSD 8.2-RELEASE. I have been applying this fix
on my 64-bit machine since FreeBSD 7.x. I did not send the patch in
earlier because I assumed the problem would be fixed in a later
release. That has not happened, so here is the patch (updated for
8.2) that I have been using.

  To apply the patch, cd into /usr/src/sys/amd64 and apply it there.
It modifies two files in the amd64 subdirectory: trap.c and fpu.c.

  In trap.c a single line is replaced, as can be seen in the patch.
The line is in the user trap switch, in the T_XMMFLT case:
ucode = 0; is replaced with ucode = fputrap(); together with a check
that jumps to userout when fputrap() returns -1. fputrap() is then
called in the same way as in the T_ARITHTRAP case.

  The function fputrap() is in fpu.c, which is where the rest of the
patch operates. In fputrap() I added code to read the mxcsr status
bits. These are ORed into the x87 status word before the index into
fpetable[] is calculated.

  Following the x87 case, I clear the error flags in the mxcsr register
before returning. I have not found an instruction equivalent to fnclex
(which clears the x87 error flags) for clearing the mxcsr error flags,
so I AND them out of the in-memory copy of the mxcsr that was stored
earlier and then load that copy back into the register. Let me know if
this is useful.
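
  The same store, mask and reload sequence can be sketched in user
space with the _mm_getcsr() and _mm_setcsr() intrinsics from
<xmmintrin.h>, assuming an x86 compiler with SSE enabled. The helper
names below are hypothetical.

```c
#include <stdint.h>
#include <xmmintrin.h>

/* Return the given mxcsr value with exception flag bits 0..5 cleared. */
static uint32_t
mxcsr_without_flags(uint32_t mxcsr)
{
	return (mxcsr & ~UINT32_C(0x3f));
}

/*
 * Read the mxcsr, write it back with the exception flags cleared,
 * and return the flags that were set before the clear.
 */
uint32_t
read_and_clear_mxcsr_flags(void)
{
	uint32_t old = _mm_getcsr();		/* stmxcsr */

	_mm_setcsr(mxcsr_without_flags(old));	/* ldmxcsr */
	return (old & UINT32_C(0x3f));
}
```

  Mask, rounding and flush-to-zero bits are left untouched, since only
the low six bits are cleared.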

  With these changes in place, my kernel now handles SIMD fpe errors
(trap code 29) and returns the mxcsr decoded error in the si_code entry of the
siginfo_t structure.

>How-To-Repeat:
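
  One way to exercise the T_XMMFLT path from user space, sketched under
the assumption of an x86 compiler that provides <xmmintrin.h>. The
helper names are hypothetical. sse_divide_by_zero() unmasks the
divide-by-zero exception in the mxcsr and then performs a scalar SSE
division, so it should only be called with a SIGFPE handler installed
with SA_SIGINFO. The handler can then print si_code, which this report
describes as zero on an unpatched kernel.

```c
#include <xmmintrin.h>

#define	MXCSR_ZE		0x0004u	/* divide-by-zero flag */
#define	MXCSR_MASK_SHIFT	7	/* mask bits sit 7 above the flags */

/* Return mxcsr with the given exception flags cleared and unmasked. */
unsigned int
mxcsr_unmask(unsigned int mxcsr, unsigned int flags)
{
	flags &= 0x3fu;
	mxcsr &= ~flags;			/* drop stale flag bits */
	mxcsr &= ~(flags << MXCSR_MASK_SHIFT);	/* clear the mask bits */
	return (mxcsr);
}

/*
 * Raise an SSE divide-by-zero trap. If the trap is delivered as
 * SIGFPE, this function does not return normally.
 */
float
sse_divide_by_zero(void)
{
	volatile float num = 1.0f, den = 0.0f;
	__m128 q;

	_mm_setcsr(mxcsr_unmask(_mm_getcsr(), MXCSR_ZE));
	q = _mm_div_ss(_mm_set_ss(num), _mm_set_ss(den));	/* divss */
	return (_mm_cvtss_f32(q));
}
```

  Using _mm_div_ss() keeps the division on the SSE unit regardless of
how the compiler would otherwise translate a plain C division.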

>Fix:



Patch attached with submission follows:

diff -Naur amd64-orig/fpu.c amd64/fpu.c

--- amd64-orig/fpu.c	2012-06-24 18:59:36.000000000 -0700
+++ amd64/fpu.c	2012-07-16 22:07:19.000000000 -0700
@@ -72,7 +72,8 @@
 #define	fnstsw(addr)		__asm __volatile("fnstsw %0" : "=am" (*(addr)))
 #define	fxrstor(addr)		__asm __volatile("fxrstor %0" : : "m" (*(addr)))
 #define	fxsave(addr)		__asm __volatile("fxsave %0" : "=m" (*(addr)))
-#define	ldmxcsr(csr)		__asm __volatile("ldmxcsr %0" : : "m" (csr))
+#define	ldmxcsr(addr)		__asm __volatile("ldmxcsr %0" : : "m" (*(addr)))
+#define	stmxcsr(addr)		__asm __volatile("stmxcsr %0" : "=m" (*(addr)))
 #define	start_emulating()	__asm __volatile( \
 				    "smsw %%ax; orb %0,%%al; lmsw %%ax" \
 				    : : "n" (CR0_TS) : "ax")
@@ -87,7 +88,8 @@
 void	fnstsw(caddr_t addr);
 void	fxsave(caddr_t addr);
 void	fxrstor(caddr_t addr);
-void	ldmxcsr(u_int csr);
+void	ldmxcsr(caddr_t addr);
+void	stmxcsr(caddr_t addr);
 void	start_emulating(void);
 void	stop_emulating(void);
 
@@ -95,6 +97,7 @@
 
 #define GET_FPU_CW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_cw)
 #define GET_FPU_SW(thread) ((thread)->td_pcb->pcb_save->sv_env.en_sw)
+#define GET_MXCSR(thread)  ((thread)->td_pcb->pcb_save->sv_env.en_mxcsr)
 
 typedef u_char bool_t;
 
@@ -126,7 +129,7 @@
 	control = __INITIAL_FPUCW__;
 	fldcw(control);
 	mxcsr = __INITIAL_MXCSR__;
-	ldmxcsr(mxcsr);
+	ldmxcsr(&mxcsr);
 	if (PCPU_GET(cpuid) == 0) {
 		fxsave(&fpu_initialstate);
 		if (fpu_initialstate.sv_env.en_mxcsr_mask)
@@ -356,6 +359,7 @@
 fputrap()
 {
 	u_short control, status;
+	u_int mxcsr;
 
 	critical_enter();
 
@@ -367,13 +371,18 @@
 	if (PCPU_GET(fpcurthread) != curthread) {
 		control = GET_FPU_CW(curthread);
 		status = GET_FPU_SW(curthread);
+		mxcsr = GET_MXCSR(curthread);
+		status |= (mxcsr & 0x3f);
 	} else {
 		fnstcw(&control);
 		fnstsw(&status);
+		stmxcsr(&mxcsr);
+		status |= (mxcsr & 0x3f);
+		fnclex();		/* Clear the x87 error bits */
+		mxcsr &= ~0x3f;		/* Clear the mxcsr error bits */
+		ldmxcsr(&mxcsr);
 	}
 
-	if (PCPU_GET(fpcurthread) == curthread)
-		fnclex();
 	critical_exit();
 	return (fpetable[status & ((~control & 0x3f) | 0x40)]);
 }
diff -Naur amd64-orig/trap.c amd64/trap.c
--- amd64-orig/trap.c	2012-06-24 23:58:01.000000000 -0700
+++ amd64/trap.c	2012-07-16 21:51:56.000000000 -0700
@@ -435,7 +435,9 @@
 			break;
 
 		case T_XMMFLT:		/* SIMD floating-point exception */
-			ucode = 0; /* XXX */
+			ucode = fputrap();
+			if (ucode == -1)
+				goto userout;
 			i = SIGFPE;
 			break;
 		}


>Release-Note:
>Audit-Trail:
>Unformatted: