svn commit: r232275 - in head/sys: amd64/include i386/include pc98/include x86/include

Bruce Evans brde at optusnet.com.au
Fri Mar 2 04:11:26 UTC 2012


On Thu, 1 Mar 2012, Tijl Coosemans wrote:

> On Wednesday 29 February 2012 06:01:36 Bruce Evans wrote:
>> ...
>> Here is what current arches have in their machine/setjmp.h:
>>
>> amd64, i386: not much
>> arm: has lots of comments and register offsets.  These are defined as
>>       _JB_REG_* so they aren't pollution, but there is no reason to
>>       export them to the application <setjmp.h> either.  The actual
>>       structs are the usual 2 arrays of ints, with the extra 1 for
>>       both the comment not matching the code, as on i386.  The extra
>>       1 is unused, or at least has no _JB_REG_* for it.
>> ia64: has lots of namespace-pollution definitions under a __BSD_VISIBLE
>>       ifdef.  The structs are arrays of long doubles!  This defeats
>>       my idea of using a MI array of register_t's.  _JBLEN could be
>>       expanded for long doubles, but __align() would be required too,
>>       and it gets messier than a separate file.
>> mips: just the usual extra 1 (now 4 instances for 32/64 doubling) and
>>      the usual comment not matching the code.
>> powerpc: like x86
>> sparc64: just the usual extra 1.  The comment is fixed by removing it.
>>
>> So the extra 1 seems to be just a ~20-year old mistake, faithfully
>> propagated to all arches except amd64 i386, with unfaithful propagation
>> just fixed for i386.
>
> If we could add the returns_twice attribute to setjmp() then the
> compiler makes sure all registers are dead before calling it and
> jmp_buf wouldn't have to be that big.

I think compilers already do stuff like that automatically.  They have
to for setjmp() to work.  Since there was no way to declare such
attributes 20 years ago, compilers had to know that setjmp() was
special and make it work when it only has a Standard C declaration
(and some magic in its implementation).
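
As a minimal sketch of what "returns twice" means for the caller (the
helper names here are made up for illustration, not taken from any
tree): only Standard C is used, and the local that must survive the
jump is declared volatile, which is what the standard asks of the
program.  GCC and Clang spell the attribute under discussion
__attribute__((returns_twice)); nothing below depends on it.

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical helper: jumps back to the setjmp() in count_returns(). */
static void
jump_back(void)
{
	longjmp(env, 1);
}

/*
 * Counts how many times setjmp() returns.  The counter is volatile so
 * that its value is well defined after the longjmp().
 */
int
count_returns(void)
{
	volatile int returns = 0;

	if (setjmp(env) == 0) {
		returns++;		/* first, direct return */
		jump_back();		/* does not return here */
	}
	returns++;			/* second return, via longjmp() */
	return (returns);
}
```

Without the volatile, the value of the counter after the second return
would be indeterminate as far as the standard is concerned, whatever
the compiler happens to do with registers.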

> Also, from ISO C: "All accessible objects have values, and all other
> components of the abstract machine [249] have state, as of the time the
> longjmp function was called"
>
> "[249] This includes, but is not limited to, the floating-point status
> flags and the state of open files."
>
> So I think storing mxcsr in jmp_buf is incorrect.

This is a well known bug in ISO C.  ISO C never even tried to support
longjmp() from signal handlers, but we do.  Supporting them requires
restoring significant parts of the FP environment.  I fixed this for
the i387 control word in FreeBSD about 20 years ago.  This was required
even to support float to integer conversions on i386.  The situation
with these has changed a bit.  It was:
- there is a default rounding mode.  C didn't support changing it.  It
   is normally round-to-nearest.  But for float to integer conversions,
   it is round-towards-zero.  To implement the latter, compilers switch
   the mode to the latter mode.  In old versions of FreeBSD, and still
   with COMPAT4 signal handlers, handling of the FP state in signal
   handlers was mostly incorrect.  Signal handlers were passed the
   current FP state, except for clobbering the exception flags for
   SIGFPE's for hardware FP exceptions.  Thus it was normal for signal
   handlers to see the rounding mode switched to round-towards-zero.
   This is not part of the abstract machine.  A normal rounding mode
   must be restored somehow.  C90 didn't support changing the rounding
   mode, so it would have been correct for C90 to hard-code the rounding
   mode at the time of main() in signal handlers if you knew what that
   was (it really should be set in crt or inherited across exec, instead
   of being hard-coded in the kernel like it is in FreeBSD).  But i387
   supports changing the rounding mode.  It is simplest to restore it
   to that at the time of the setjmp().
Now, things are even more complicated:
- signal handlers are normally passed a clean FP state.  (Since C
   barely supports signal handlers, it doesn't say anything about this).
   Now, longjmp() from a signal handler would return this clean state
   if no FP state is restored, unless the signal handler does some FP
   operations that dirty its clean state.  Returning the clean state
   has much the same effect in simple cases as restoring the state at
   the time of the setjmp(), because nothing except the compiler doing
   the float to integer conversions changes the state from its default,
   and the longjmp() takes us to a point where the compiler is not doing
   these conversions so it is correct for the normal state to be
   restored.  There is now the minor simplification that i386 with SSE
   doesn't need the mode switch for floats, and i386 with SSE2 doesn't
   need it for doubles; but i386 still needs it for long doubles.  There
   is the minor complication that the signal handler may be COMPAT4,
   in which case its FP state is not clean and the old method must be
   used -- longjmp() can hardly be expected to tell which type the
   signal handler is and adjust its behaviour to match.
- C now supports changing the rounding mode.  Its requirement that
   longjmp() not restore the previous rounding mode may be correct for
   some cases, but it is broken for longjmp() out of signal handlers:
   - suppose the signal handler gets a clean state, as in FreeBSD.  Then
     any longjmp() out of a signal handler that doesn't restore the
     rounding mode (or any other part of the FP env) resets to the clean
     state (which should be the same as the default state); this state
     may differ from the state at the time of the setjmp() and also from
     the state at the time of the signal.  This is broken.  Perhaps the
   - suppose the signal handler doesn't get a clean state.  Who knows
     what it is?  Standards don't specify this.  Even if the signal
     handler understands everything, then it will have a hard time
     cleaning up the state so that it is right at the time of the
     longjmp().  Note that it is not just SIGFPE handlers for hardware
     FP exceptions that would need to understand everything about FP
     to do the right thing.  _All_ signal handlers would need this,
     since for example a harmless SIGINT handler might be interrupting
     a FP operation that changes the FP env in ways outside of the
     abstract machine.

Next, there are the FP exception flags.  C90 doesn't support these, and
I didn't worry about these 20 years ago.  I just put the i386 FP control
word in jmp_buf, and used fninit to clean out everything else in the
FP env.  Now, C99 supports these.  These should not be changed by
longjmp().  However, for the case of longjmp() from a signal handler,
if nothing is restored, then all of them will be cleared by the
longjmp() in the usual case where the signal handler doesn't dirty
its clean state.  Worse, if the signal handler dirties its state and
doesn't do this intentionally to prepare for the longjmp(), then the
main part of the program gets its exception flags replaced by the
signal handler.  Again, it is very difficult for signal handlers to
understand FP well enough to do the right thing.  SIGFPE ones have to
understand a little more here.  They have to understand that the kernel
doesn't understand this stuff, so it has destroyed the exception flags
in the saved state after only making a lossy copy of them (in the
signal code).  Destruction of the exception flags allows the case of
returning from a signal handler to sort of work (the SIGFPE doesn't
repeat).  This problem only occurs if signals for FP exceptions are
unmasked.  Otherwise, SIGFPE never occurs for FP exceptions, but only
for integer exceptions like division by 0.

The amd64 _setjmp.S and setjmp.S (but not its sigsetjmp.S) save mxcsr
in setjmp but restore only the non-flag bits from it in longjmp; they
also lose the fninit (except in sigsetjmp.S):

% setjmp:
% 	fnstcw	64(%rcx)		/* 8; fpu cw */
% 	stmxcsr	68(%rcx)		/*    and mxcsr */
% longjmp:
% 	/* Restore the mxcsr, but leave exception flags intact. */
% 	stmxcsr	-4(%rsp)
% 	movl	68(%rdx),%eax
% 	andl	$0xffffffc0,%eax
% 	movl	-4(%rsp),%edi
% 	andl	$0x3f,%edi
% 	xorl	%eax,%edi
% 	movl	%edi,-4(%rsp)
% 	ldmxcsr -4(%rsp)
% ...
%	// lost fninit here
% 	fldcw	64(%rdx)

For longjmp() from signal handlers, leaving the exception flags intact
is worse than useless, since these are only the signal handler's
exception flags (it would be better to clear them).
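
The bit manipulation in the longjmp fragment above can be modelled in C
as follows.  This is only a sketch of the arithmetic, not code from the
tree: the function and macro names are made up, and the two arguments
stand for the mxcsr value saved in jmp_buf and the live mxcsr value at
the time of the longjmp().

```c
#include <stdint.h>

/* Bits 0-5 of mxcsr are the sticky exception flags. */
#define	MXCSR_FLAGS_MASK	0x3fu

/*
 * Hypothetical model of the merge: control bits (exception masks,
 * rounding mode, etc.) come from the saved value, exception flags
 * come from the current value.
 */
uint32_t
merge_mxcsr(uint32_t saved, uint32_t current)
{
	uint32_t control = saved & ~MXCSR_FLAGS_MASK;	/* andl $0xffffffc0 */
	uint32_t flags = current & MXCSR_FLAGS_MASK;	/* andl $0x3f */

	/* The two halves share no bits, so xor acts as or here. */
	return (control ^ flags);
}
```

For example, merge_mxcsr(0x7f81, 0x1fa0) gives 0x7fa0: the saved
round-towards-zero setting is kept, the stale saved flag in bit 0 is
dropped, and the current flag in bit 5 is carried over.  When the
"current" value belongs to a signal handler, that carried-over flag is
the handler's, which is the problem described above.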

On i386, the bugs are similar, except the mxcsr is not touched:
- fninit has been removed from _setjmp.S and setjmp.S
- sigsetjmp.S has not been touched (so it is missing mxcsr handling,
   but still does fninit).

Removing the fninits is very large breakage in the i386 case.  amd64
doesn't support COMPAT4 signal handlers.  Thus its signal handlers
start with a clean state and fninit in them only cleans up any dirt
made by the signal handler, and there is rarely even minor dirt.
But for i386 with a COMPAT4 signal handler, when a signal interrupts
an FP operation, the i387 FP stack always has something on it.
This needs to be cleaned before or by longjmp() if longjmp() is used
to quit the signal handler.  Without COMPAT4 signal handlers, the
only obvious bug is that leaving the i387 exception flags intact is
worse than useless, the same as for the sr part of mxcsr.

Other things in the FP env are less portable so they are less used so
they cause fewer problems.  Standards don't support many other things,
so the implementation can be correct for them even if the standard
requires brokenness for other parts of the env.  i387 precision control
is an example.

Bruce

