svn commit: r289332 - head/tools/regression/lib/msun

Mon Oct 19 16:50:58 UTC 2015

On Sun, Oct 18, 2015 at 04:42:04AM +1100, Bruce Evans wrote:
> On Sat, 17 Oct 2015, Konstantin Belousov wrote:
> 
> > On Thu, Oct 15, 2015 at 10:12:03PM +1100, Bruce Evans wrote:
> >> On Thu, 15 Oct 2015, Konstantin Belousov wrote:
> >>
> >>> On Wed, Oct 14, 2015 at 08:22:12PM +0000, Garrett Cooper wrote:
> >>>> Author: ngie
> >>>> Date: Wed Oct 14 20:22:12 2015
> >>>> New Revision: 289332
> >>>> URL: https://svnweb.freebsd.org/changeset/base/289332
> >>>>
> >>>> Log:
> >>>>   Fix test-fenv:test_dfl_env when run on some amd64 CPUs
> >>>>
> >>>>   Compare the fields that the AMD [1] and Intel [2] specs say will be
> >>>>   set once fnstenv returns.
> >>>>
> >>>>   Not all amd64 capable processors zero out the env.__x87.__other field
> >>>>   (example: AMD Opteron 6308). The AMD64/x64 specs aren't explicit on what the
> >>>>   env.__x87.__other field will contain after fnstenv is executed, so the values
> >>>>   in env.__x87.__other could be filled with arbitrary data depending on the
> >>>>   CPU-specific implementation of fnstenv.
> >>> No Intel or AMD CPU writes to the __other field at all.
> >>
> >> No, they all do.
> >>
> >> Test on old i386 on old A64:
> >> ...
> > No, I did not think of fegetenv() as executing the FXSAVE instruction.
> > I did know that fegetenv() is a wrapper around FNSAVE, and I was completely
> > sure that FNSAVE in the long mode, when executed by a 64bit program,
> > never writes anything into the %eip/data offset portion of the FNSAVE
> > area.  I suspect that this was motivated by unavoidable 32-bitness of
> > the store format.
> 
> fegetenv() is actually a wrapper around FNSTENV, and FNSTENV is a subset
> of FNSAVE.
Yes, it was a thinko, which in fact does not change the relevant
considerations.

> 
> > I was unable to find a reference in either Intel SDM or in AMD APM which
> > would support my statement.  The closest thing is the claim that the FOP
> > field is not filled, in the SDM.  Still, I wonder how things are really
> > arranged by hardware for FNSAVE in 64bit mode.
> 
> Let's check FOP below.
> 
> > Were your experiments below done for 32-bit programs, or for 64-bit?
> > Both FXSAVE and XSAVE area formats and rules would be irrelevant for the
> > FreeBSD ABI.
> 
> For amd64, I used the freefall default which I think is long mode, but
> there is some magic for the pointer size (small model?)
I am not sure I follow.  Freefall is the native amd64 installation, and
its userspace is 64bit.  So the CPU is in long mode both in kernel and for
the default userspace.

> 
> BTW, -m32 has been broken on freefall for years.  At least libgcc.a is
> incompatible or not installed.
What command line do you use, exactly ?  Note that cc -m32 -static is not
supported right now and probably would not be supported at all.

> 
> >> ...
> >> Modified state for fegetenv():
> >> X 0000005C  7F 12 00 00 00 00 80 1F FF FF FF FF 63 82 04 08
> >>              --cw- -mxhi --sw- -mxlo --tw- -pad- ----fip----
> >>  					  -----------------
> >> X 0000006C  1F 00 1D 05 A0 92 04 08 2F 00 FF FF
> >>              -fcs- -opc- ----foff--- -fds- -pad-
> >>  	    ----other[16]----------------------
> 
> FOP (opc) is clearly filled on i386 (32-bit mode).
Which CPU is this ?

Please look at the Intel SDM vol. 1, section 8.1.9, Fopcode Compatibility
Sub-Mode, which basically states that FOP is not filled starting with Core2.
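
For reference, the 28 bytes of the i386 dump quoted above can be decoded
mechanically. This is only a sketch: the field names and widths are taken
from the annotation lines under the dump (cw, mxhi, sw, mxlo, tw, pad, fip,
fcs, opc, foff, fds, pad), the names pad0/pad1 and the little-endian struct
format are my own assumptions, and nothing here is part of the test in
head/tools/regression/lib/msun.

```python
import struct

# Bytes copied from the quoted i386 fegetenv() dump (offsets 0x5C..0x77).
DUMP = bytes.fromhex(
    "7F 12 00 00 00 00 80 1F FF FF FF FF 63 82 04 08 "
    "1F 00 1D 05 A0 92 04 08 2F 00 FF FF"
)

# Field order follows the annotation under the dump; pad0/pad1 are
# hypothetical names for the two unnamed "-pad-" slots.
FIELDS = ("cw", "mxhi", "sw", "mxlo", "tw", "pad0",
          "fip", "fcs", "opc", "foff", "fds", "pad1")

# Little-endian, no alignment padding: 6 x u16, u32, 2 x u16, u32, 2 x u16.
values = struct.unpack("<6HI2HI2H", DUMP)
env = dict(zip(FIELDS, values))

# The dump splits MXCSR into a high and a low 16-bit half.
mxcsr = (env["mxhi"] << 16) | env["mxlo"]

for name in FIELDS:
    print(f"{name:5s} = {env[name]:#x}")
print(f"mxcsr = {mxcsr:#x}")
```

With this layout the opc slot reads back as 0x51d, i.e. nonzero, which is
the value under discussion.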

> 
> >> ...
> >> Test on -current amd64 on Xeon (freefall):
> >> ...
> >> Later fegetenv():
> >> X 00000060  7f 03 ff ff 00 00 ff ff  ff ff ff ff 75 08 40 00
> >> X 00000070  43 00 1c 05 20 62 60 00  3b 00 ff ff 80 1f 00 00
> 
> FOP is filled to 1c 05 on freefall and to 1D 05 on my old i386.  But the
> instruction is the same (fstpl).  The difference is a different encoding
> of the direct address mode.
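
The two FOP values can be split into opcode and ModRM fields to see where
they differ. A rough sketch, assuming the usual FOP format (11 bits: the low
3 bits of the first opcode byte plus the whole second byte, with the fixed
upper bits 11011 of the first byte dropped); decode_fop is a hypothetical
helper name, not something from the tree.

```python
def decode_fop(fop_bytes):
    """Split a stored x87 FOP value into the opcode byte and ModRM fields."""
    # FOP is stored little-endian; only the low 11 bits are defined.
    fop = int.from_bytes(fop_bytes, "little") & 0x7FF
    # Restore the fixed upper five bits (11011) of the first opcode byte.
    opcode = 0xD8 | (fop >> 8)
    modrm = fop & 0xFF
    return {
        "opcode": opcode,
        "mod": modrm >> 6,
        "reg": (modrm >> 3) & 7,
        "rm": modrm & 7,
    }

i386 = decode_fop(bytes.fromhex("1D05"))   # FOP bytes from the old i386 dump
amd64 = decode_fop(bytes.fromhex("1C05"))  # FOP bytes from the freefall dump

print("i386 :", i386)
print("amd64:", amd64)
```

Both decode to opcode 0xDD with reg = 3 (DD /3, the 64-bit FSTP store) and
mod = 0. Only the r/m field differs: 5 (plain disp32) in the i386 dump and
4 (SIB byte follows) in the amd64 one, which matches a different encoding of
the same direct address.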
> 
> Further testing:
> 
> Only small model seems to be supported.  I got relocation errors with
> messages about R_X86_64_32S for an array of size 4G.
Do you mean that -mcmodel=medium or large does not work ?

> 
> malloc() works to allocate arrays larger than 4G.  Writing to addresses
> above 4G in such arrays or on the stack never gave 64-bit offsets.
> It truncated the offsets to 32 bits and still printed the segment
> register.
Please show the exact transcript of the session.