problem with user trap handlers in -CURRENT

Michiel Boland michiel at
Mon Aug 13 17:50:03 UTC 2007

Hi. First, I would like to point out that I'm not at all an expert on 
sparc64, so please excuse me if I express myself a bit awkwardly.

Now the problem.

As you may or may not know, -CURRENT on sparc64 is broken in the sense 
that you can no longer ssh into a box that has

  UsePrivilegeSeparation yes

in sshd_config.

For more details, see 

At the root of all this lies what I think is a fundamental problem with 
the way that alignment traps are handled.

Attached you will find (unless the list software decides to eat it) a 
program that demonstrates what I mean. It creates a deliberate unaligned 
access trap. The idea is that the trap handler then emulates 
the load/store in software. This is done in __unaligned_fixup in 
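
To make "emulating the load/store in software" a bit more concrete, here
is a minimal sketch of the general idea. It is not the FreeBSD code: the
function names are made up for illustration, and I assume a 32-bit
big-endian value, since sparc64 is big-endian. The point is only that the
access is split into single-byte accesses, which can never be misaligned.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the real __unaligned_fixup): assemble a
 * 32-bit big-endian value one byte at a time, so that nothing wider
 * than a byte ever touches the unaligned address.
 */
uint32_t
emulate_load32_be(const unsigned char *addr)
{
	return ((uint32_t)addr[0] << 24) |
	    ((uint32_t)addr[1] << 16) |
	    ((uint32_t)addr[2] << 8) |
	    (uint32_t)addr[3];
}

/* Matching byte-wise store, again purely illustrative. */
void
emulate_store32_be(unsigned char *addr, uint32_t value)
{
	addr[0] = (unsigned char)(value >> 24);
	addr[1] = (unsigned char)(value >> 16);
	addr[2] = (unsigned char)(value >> 8);
	addr[3] = (unsigned char)value;
}
```

Storing a value at an odd address with emulate_store32_be and reading it
back with emulate_load32_be gives the original value, provided nothing
touches those bytes in between.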

Unfortunately this emulation will not work if the trap is taken in the 
delay slot right after a return instruction, and the faulting address is 
on the stack of the procedure from which the processor just returned. This 
is because the contents of the stack are overwritten with a trap frame, at 
which point the emulation code will store an erroneous value.
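
The ordering problem can be modelled in a few lines of plain C. This is a
toy simulation, not kernel or libc code: dead_frame stands in for the
stack memory of the procedure that just returned, and I assume that the
trap frame lands on top of it and happens to contain zeroes (in reality
its contents are whatever the trap frame holds). All names are invented
for this sketch.

```c
#include <stdint.h>
#include <string.h>

#define FRAME_SIZE 64

/* Toy model of the stack memory of the procedure that just returned. */
static unsigned char dead_frame[FRAME_SIZE];

/* Byte-wise big-endian load, standing in for the software emulation. */
static uint32_t
load32_bytewise(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Returns the value the emulated load sees. If trap_frame_pushed is
 * nonzero, the simulated trap frame overwrites the dead frame before
 * the emulation gets a chance to read it.
 */
uint32_t
simulated_delay_slot_load(int trap_frame_pushed)
{
	unsigned char *unaligned = dead_frame + 1;

	/* The procedure stored 37 while its frame was still live. */
	memset(dead_frame, 0xff, sizeof(dead_frame));
	unaligned[0] = 0;
	unaligned[1] = 0;
	unaligned[2] = 0;
	unaligned[3] = 37;

	/* "return" has executed, so the frame is fair game for the trap. */
	if (trap_frame_pushed)
		memset(dead_frame, 0, sizeof(dead_frame));

	return load32_bytewise(unaligned);
}
```

In this model simulated_delay_slot_load(0) yields 37, while
simulated_delay_slot_load(1) yields 0, because the value is gone by the
time the emulation reads it.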

This is why the attached program outputs

expected 37, got 0

instead of printing nothing, which is what it should do.

(Surprise your friends: link traptest statically. It will then print a 
different value each time it is run. :)

The assembler code generated by gcc looks like

         save    %sp, -208, %sp
         add     %fp, 2019, %l0
         add     %l0, %i0, %l0
         add     %fp, 2027, %o0
         call    o3, 0
          mov    %l0, %o1
         ldsb    [%fp+2027], %g1
         cmp     %g1, 0
         add     %fp, 2019, %g1
         movne   %icc, %l0, %g1
         return  %i7+8
          ldsw   [%g1], %o0        <----- trap is taken here
                 this location will be overwritten by the trap frame

I'm not sure how to work around this. I guess one solution would be to 
tell gcc not to generate these kinds of instruction combinations.

But also I am wondering why FreeBSD attempts to emulate unaligned loads 
and stores in the first place. If I run traptest on Solaris, it crashes 
immediately with SIGBUS. I would have guessed it would do the same on 
FreeBSD. So I was a bit surprised that it ran at all.

Is it not easier to just not handle unaligned traps at all and simply let 
programs crash? Or did someone already try this in the past, and too many 
things broke after that?

Also I would assume that if you enforce that all memory access be aligned, 
and hence cut out all the (slow) emulation, you get at least a theoretical 
speed increase.
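
For what it is worth, a program that has to read an int from a possibly
unaligned address does not need the trap at all. A minimal sketch,
assuming nothing beyond standard C (the function name is made up):

```c
#include <string.h>

/*
 * Hypothetical helper: copy the bytes into a properly aligned local
 * variable instead of dereferencing a misaligned int pointer.
 */
int
read_int_unaligned(const char *src)
{
	int value;

	memcpy(&value, src, sizeof(value));
	return value;
}
```

Code written this way behaves the same whether or not the system emulates
unaligned accesses, so it would keep working if the emulation were
removed.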

-------------- next part --------------
#include <stdio.h>

#define MAGIC_NUMBER 37

int o2(int);

int main(void)
{
	int i = o2(1);

	if (i != MAGIC_NUMBER) {
		fprintf(stderr, "expected %d, got %d\n", MAGIC_NUMBER, i);
		return 1;
	}
	return 0;
}

void o3(char *, char *);

int o2(int offset)
{
	int *p;
	char c[4];
	char tmp[8];

	/*
	 * o3 will store the magic number in *(tmp + offset)
	 */
	o3(c, tmp + offset);
	/*
	 * o3 will also make c[0] nonzero, so we should always
	 * return *(tmp + offset), that is, the magic number.
	 * This construction is just a trick to make gcc
	 * emit the correct assembler statements.
	 */
	p = c[0] ? (int *) (tmp + offset) : (int *) tmp;
	return *p;
}

void o3(char *cp, char *s)
{
	int *ip = (int *) s;

	*ip = MAGIC_NUMBER;
	*cp = 1;
}
