svn commit: r289332 - head/tools/regression/lib/msun

Thu Oct 15 11:42:57 UTC 2015

On Thu, 15 Oct 2015, Konstantin Belousov wrote:

> On Wed, Oct 14, 2015 at 08:22:12PM +0000, Garrett Cooper wrote:
>> Author: ngie
>> Date: Wed Oct 14 20:22:12 2015
>> New Revision: 289332
>> URL: https://svnweb.freebsd.org/changeset/base/289332
>>
>> Log:
>>   Fix test-fenv:test_dfl_env when run on some amd64 CPUs
>>
>>   Compare the fields that the AMD [1] and Intel [2] specs say will be
>>   set once fnstenv returns.
>>
>>   Not all amd64 capable processors zero out the env.__x87.__other field
>>   (example: AMD Opteron 6308). The AMD64/x64 specs aren't explicit on what the
>>   env.__x87.__other field will contain after fnstenv is executed, so the values
>>   in env.__x87.__other could be filled with arbitrary data depending on how the
>>   CPU-specific implementation of fnstenv.
> No Intel or AMD CPU write to __other field at all.

No, they all do.

Test on old i386 on old A64:

Initial state for fegetenv():
X 00000000  7F 12 00 00 00 00 80 1F FF FF FF FF 52 3F 67 C0
             --cw- -mxhi --sw- -mxlo --tw- -pad- ----fip----
 					  -----------------
X 00000010  08 00 05 01 48 5F 74 C0 10 00 FF FF
             -fcs- -opc- ----foff--- -fds- -pad-
 	    ----other[16]----------------------

This always fails the test:
cw:     passes
mxhi:   passes
sw:     passes
pad+tw: passes
fip:    always fails.  The test wants 0, but the value is always the kernel
         eip for the instruction that is used to avoid the security hole.
 	(The A64 CPU gives the security hole, and the kernel code is
 	pessimal so it apparently runs always on the first use of the
 	NPX by a thread.)
fcs:    always fails.  The test wants 0, but the value is always the kernel
         cs, as above.
opc:    always fails.  The value is the opcode for the kernel instruction
foff:   always fails.  The value is the data address for the kernel instruction
fds:    always fails.  The value is the kernel ds extended by padding with 1's

Modified state for fxsave:
X 0000001C  7F 12 00 38 80 00 F0 3F EE EE EE EE EE EE EE EE
X 0000002C  EE EE EE EE EE EE EE EE 80 1F 00 00 FF FF 00 00

Partial result of an fxsave instruction after fld1; fstl in the test
program to try to change the kernel data.  A byte value of EE indicates
that the byte was written by memset() and not overwritten by fxsave.
There are 16 such bytes.  These correspond to __other[16].  You are
probably thinking of fxsave not writing these, but fegetenv() doesn't
use fxsave and always writes these.  I think fxsave on all Intel CPUs
write these (freefall's Xeon does), and all AMD CPUs write these after
an error.

The format is slightly different.  All the other values seem to be the
same as for fegetenv().

Modified state for fegetenv():
X 0000005C  7F 12 00 00 00 00 80 1F FF FF FF FF 63 82 04 08
             --cw- -mxhi --sw- -mxlo --tw- -pad- ----fip----
 					  -----------------
X 0000006C  1F 00 1D 05 A0 92 04 08 2F 00 FF FF
             -fcs- -opc- ----foff--- -fds- -pad-
 	    ----other[16]----------------------

This is from the same state as the fxsave.  It has the following
modifications relative to the initial state:
- all kernel data changed to user addresses, as intended

Modified state for fnsave:
X 0000003C  7F 12 FF FF 00 00 FF FF FF FF FF FF 63 82 04 08
X 0000004C  1F 00 1D 05 A0 92 04 08 2F 00 FF FF

This is from the same state as the fxsave.  It has the same format
and values as the state saved by fegetenv() except 2 padding
fields are padded with 1's and not repurposed for mxcsr.

Modified state for fnstenv:
X 00000078  7F 12 FF FF 00 00 FF FF FF FF FF FF 63 82 04 08
X 00000088  1F 00 1D 05 A0 92 04 08 2F 00 FF FF EE EE EE EE

This is from the same state as the fxsave.  It has the same format
and values as the state saved by fnsave.

So last instruction/data values are always in fenv_t on i386.  They
are undocumented, but might be used.  On amd64 in 64-bit mode, they
don't fit in the hardware format used by fnstenv.  fenv_t extends
this format only to append mxcsr.  amd64 should have used the env
part of the better-designed 64-bit fxsave format.

Test on -current amd64 on Xeon (freefall):

early fegetenv():
X 00000000  7f 03 ff ff 00 00 ff ff  ff ff ff ff 00 00 00 00
             --cw- -pad- --sw- -pad-  --tw- -pad- -----------
X 00000010  00 00 00 00 00 00 00 00  00 00 ff ff 80 1f 00 00
             ---other[16]------------------------ ---mxcsr---

The test passes.  The 2 bytes at the end of other[20] are reserved
and not part of the last instruction/data.  They are padded normally
with 1's and not with 0's.  The default env hard-codes the assumption
that these reserved bits are 1's.

Since Xeons are not AMD CPUs, the kernel doesn't execute the code that
closes the security hole and all the last instruction/data values are
0.  These values are written.

later fxsave:
X 00000020  7f 03 00 00 00 00 1c 05  75 08 40 00 43 00 00 00
             --cw- --sw- tw 00 -opc-  ----fip---- ----fcs----
X 00000030  20 62 60 00 3b 00 00 00  80 1f 00 00 ff ff 00 00
             ----fdp---- ----fds----  ---mxcsr--- -mxcsrmask-

fxsave always stores the last instruction/data values on Intel CPUs.
They have the nonzero user values.  The format is the 64-bit format.
I now understand why it looked looked 32-bit format before -- it
has segment registers instead of 64-bit offsets, but that is apparently
from the memory model being small -- seg:offset32 is stored instead
of rip and edp if the operand size is 32 bits.  fcs == %cs.  %ds is
0 but fds == %ss.

Later fegetenv():
X 00000060  7f 03 ff ff 00 00 ff ff  ff ff ff ff 75 08 40 00
X 00000070  43 00 1c 05 20 62 60 00  3b 00 ff ff 80 1f 00 00

Later fnsave:
X 00000040  7f 03 ff ff 00 00 ff ff  ff ff ff ff 75 08 40 00
X 00000050  43 00 1c 05 20 62 60 00  3b 00 ff ff 00 00 00 00

Later fnstenv:
X 00000080  7f 03 ff ff 00 00 ff ff  ff ff ff ff 75 08 40 00
X 00000090  43 00 1c 05 20 62 60 00  3b 00 ff ff ee ee ee ee

Everything works right and the values are the same as for fxsave
except for rearrangement and padding, but only because 64-bit
offsets are not needed.

This shows that:
- it is the __other field that should be least MD, not most MD,
   since it has no reserved padding bits and just doesn't
   support 64-bit offsets
- there seem to be no hardware differences for the __other field
   as saved by fnstenv and fegetenv().  The AMD bug/optimization
   is only for fxsave.
- however, the FreeBSD fix for the AMD bug/optimization gives
   different initial values in the __other field for AMD CPUs only.
   I don't like it, but thought it was smarter than that.  Perhaps
   it is, but I tested my version.  My version is more efficient,
   and this involves doing the cleaning more unconditionally.  For
   the first use of the NPX by a thread, it would be efficient
   enough to clean using fstenv; clean; fldenv before the normal
   fxrstor.  But cleaning is needed on every NPX context switch,
   and we don't want to do fstenv; clean; fldenv on every switch.

The AMD bug/optimization breaks some debugging methods and some
error handling in applications, and the FreeBSD cleaning makes
this worse.  An application could try to use the last instruction/
data info for its error handling and use methods that preserve
it internally.  But the info is lost on any context switch.  Even
if no other thread uses the NPX, cleaning guarantees clobbering
of the info when a thread is switched back to.  It is lost even
after an exception where AMD hardware doesn't lose it.  Its loss
is visible to debuggers -- debuggers will usually see the result
of the cleaning.

Bruce