misc/177624: Swapcontext can get compiled incorrectly
David Xu
davidxu at freebsd.org
Sun Apr 7 02:50:01 UTC 2013
The following reply was made to PR misc/177624; it has been noted by GNATS.
From: David Xu <davidxu at freebsd.org>
To: Bruce Evans <brde at optusnet.com.au>
Cc: Brian Demsky <bdemsky at uci.edu>, freebsd-bugs at FreeBSD.org,
freebsd-gnats-submit at FreeBSD.org
Subject: Re: misc/177624: Swapcontext can get compiled incorrectly
Date: Sun, 07 Apr 2013 10:41:29 +0800
On 2013/04/05 03:38, Bruce Evans wrote:
> On Thu, 4 Apr 2013, Brian Demsky wrote:
>
>> On Apr 4, 2013, at 8:16 AM, Bruce Evans wrote:
>>
>>> On Fri, 5 Apr 2013, Bruce Evans wrote:
>>>
>>>> On Thu, 4 Apr 2013, Brian Demsky wrote:
>>>>
>>>>>> Description:
>>>>> Here is the code for swap context:
>>>>> int
>>>>> swapcontext(ucontext_t *oucp, const ucontext_t *ucp)
>>>>> {
>>>>> int ret;
>>>>> if ((oucp == NULL) || (ucp == NULL)) {
>>>>> errno = EINVAL;
>>>>> return (-1);
>>>>> }
>>>>> oucp->uc_flags &= ~UCF_SWAPPED;
>>>>> ret = getcontext(oucp);
>>>>> if ((ret == 0) && !(oucp->uc_flags & UCF_SWAPPED)) {
>>>>> oucp->uc_flags |= UCF_SWAPPED;
>>>>> ret = setcontext(ucp);
>>>>> }
>>>>> return (ret);
>>>>> }
>>>>
>>>>> On the OS X port of libc in Mac OSX 10.7.5, this gets compiled as:
>>>>
>>>>> ...
>>>>> 0x00007fff901e870b <swapcontext+89>: pop %rbx
>>>>> 0x00007fff901e870c <swapcontext+90>: pop %r14
>>>>> 0x00007fff901e870e <swapcontext+92>: jmpq 0x7fff90262855
>>>>> <setcontext>
>>>>> The problem is that rbx is callee saved by compiled version of
>>>>> swapcontext and then reused before getcontext is called.
>>>>> Getcontext then stores the wrong value for rbx and setcontext later
>>>>> restores the wrong value for rbx. If the caller had any value in
>>>>> rbx, it has been trashed at this point.
>>>>
>>>> Later you wrote:
>>>>
>>>>> The analysis is a little wrong about the problem. Ultimately, the
>>>>> tail call to set context trashes the copies of bx and r14 on the
>>>>> stack�.
>>>>
>>>> The bug seems to be in setcontext(). It must preserve the callee-saved
>>>> registers, not restore them. This would happen automatically if more
>>>> were written in C. But setcontext() can't be written entirely in C,
>>>> since it must save all callee-saved registers including ones not used
>>>> and therefore not normally saved by any C function that it might be in,
>>>> and possibly also including callee-saved registers for nonstandard or
>>>> non-C ABIs. In FreeBSD, it is apparently always a syscall.
>>>
>>> This is more than a little wrong. When setcontext() succeeds, it
>>> doesn't return here. Then it acts like longjmp() and must restore all
>>> the callee-saved to whatever they were when getcontext() was called.
>>> Otherwise, it must not clobber any callee-saved registers (then it
>>> differs from longjmp(). longjmp() just can't fail).
>>>
>>> Now I don't see any bug here. If the saved state is returned to, then
>>> it is as if getcontext() returned, and the intermediately-saved %rbx
>>> is correct (we will restore the orginal %rbx if we return). If
>>> setcontext() fails, then it should preserve all callee-saved registers.
>>> In the tail-call case, we have already restored the orginal %rbx and
>>> the failing setcontext() should preserve that.
>>>
>>> Bruce
>>
>> Take at setcontext:
>>
>> (gdb) disassemble setcontext
>> Dump of assembler code for function setcontext:
>> 0x00007fff90262855 <setcontext+0>: push %rbx
>> 0x00007fff90262856 <setcontext+1>: lea 0x38(%rdi),%rbx
>> 0x00007fff9026285a <setcontext+5>: cmp 0x30(%rdi),%rbx
>> 0x00007fff9026285e <setcontext+9>: je 0x7fff90262864
>> <setcontext+15>
>> 0x00007fff90262860 <setcontext+11>: mov %rbx,0x30(%rdi)
>> 0x00007fff90262864 <setcontext+15>: mov 0x4(%rdi),%edi
>> 0x00007fff90262867 <setcontext+18>: callq 0x7fff90262998
>> <sigsetmask>
>> 0x00007fff9026286c <setcontext+23>: mov %rbx,%rdi
>> 0x00007fff9026286f <setcontext+26>: pop %rbx
>> 0x00007fff90262870 <setcontext+27>: jmpq 0x7fff90262875
>> <_setcontext>
>> End of assembler dump.
>>
>> The stack from swapcontext had rbx and r14 popped after getcontext
>> stored everything. Now we push rbx and then later call sigsetmask.
>> Those two actions guarantee that the memory locations where rbx and
>> r14 were on the stack have been overwritten. When we later return to
>> the save context, it will start up swapcontext and pop the wrong
>> values off of the stack for rbx and r14.
>
> Ah, it is not really rbx and r14, but rsp and the whole stack frame of
> swapcontext() that are mishandled. Even returning from swapcontext()
> leaves the saved rsp pointing to garbage. The stack frame could have
> had almost anything on it before it became invalid, but here it has mainly
> the saved rbx and r14 (not rbp; however, when compiled by clang on FreeBSD,
> it also has the saved rbp, and when compiled with -O0 it also has the
> local variable).
>
> Now I think swapcontext() can't be written in C, for the same reason that
> setjmp() can't be written in C -- the stack frame cannot be controlled in
> C, and if it has anything at all on it (even the return address requires
> special handling), then the stack pointer saved in the context becomes
> invalid when the function returns, or even earlier for tail calls and
> other optimizations.
This reminds me that I can not override swapcontext in libthr, I had
put a wrapper for swapcontext in libthr, I am considering to remove it
now ...
More information about the freebsd-bugs
mailing list