8.2 + apache == a LOT of sigprocmask
Daniil Cherednik
dcherednik at masterhost.ru
Thu Nov 17 12:39:43 UTC 2011
On 17.11.2011 14:18, Jeremy Chadwick wrote:
> On Thu, Nov 17, 2011 at 10:12:10AM +0200, Kostik Belousov wrote:
>> On Wed, Nov 16, 2011 at 11:59:06PM -0800, Doug Barton wrote:
>>> On 11/16/2011 23:49, Kostik Belousov wrote:
>>>> On Wed, Nov 16, 2011 at 10:46:27PM -0800, Doug Barton wrote:
>>>>> On 11/15/2011 02:09, Jeremy Chadwick wrote:
>>>>>> On Tue, Nov 15, 2011 at 11:07:45AM +0200, Kostik Belousov wrote:
>>>>>>> On Mon, Nov 14, 2011 at 12:51:35PM -0800, Doug Barton wrote:
>>>>>>>> On 11/14/2011 12:31, Doug Barton wrote:
>>>>>>>>> Trying to track down a load problem we're seeing on 8.2-RELEASE-p4 i386
>>>>>>>>> in a busy web hosting environment I came across the following post:
>>>>>>>>>
>>>>>>>>> http://lists.freebsd.org/pipermail/freebsd-questions/2011-October/234520.html
>>>>>>>>>
>>>>>>>>> That basically describes what we're seeing as well, including the
>>>>>>>>> "doesn't happen on Linux" part.
>>>>>>>>>
>>>>>>>>> Does anyone have any ideas about this?
>>>>>>>>>
>>>>>>>>> With incredibly similar stuff running on 7.x we didn't see this problem,
>>>>>>>>> so it seems to be something new in 8.
>>>>>>>> Just took a closer look at our ktrace, and actually our pattern is
>>>>>>>> slightly different from the one in that post. In ours the second argument
>>>>>>>> is null, but the third is set:
>>>>>>>>
>>>>>>>> 74195 httpd 0.000017 RET sigprocmask 0
>>>>>>>> 74195 httpd 0.000013 CALL sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
>>>>>>>> 74195 httpd 0.000009 RET sigprocmask 0
>>>>>>>> 74195 httpd 0.000013 CALL sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
>>>>>>>> 74195 httpd 0.000009 RET sigprocmask 0
>>>>>>>> 74195 httpd 0.000012 CALL sigprocmask(SIG_BLOCK,0,0xbfbf89d4)
>>>>>>>>
>>>>>>>> But repeated hundreds of times in a row.
>>>>>>> The calls cannot come from rtld, they are generated by some setjmp()
>>>>>>> invocation. If signal-safety is not needed, sigsetjmp() should be used
>>>>>>> instead.
>>>>>>>
>>>>>>> Quick grep of the apache httpd source shows a single setjmp() in their
>>>>>>> copy of pcre. No idea whether it is safe to change setjmp() into sigsetjmp(?, 0).
>>>>>> I hate cross-posting, but: adding freebsd-apache@ to the list. Some of
>>>>>> the Apache folks (not just port committers) may have some insight to
>>>>>> Kostik's findings.
>>>>> Thanks to everyone for the responses. We tried Kostik's suggestion and
>>>>> unfortunately it didn't reduce the number of sigprocmask() calls to a
>>>>> statistically significant degree.
>>>>>
>>>>> Does anyone have any other ideas on ways to debug this? We're sort of
>>>>> running out of things to test. :-/
>>>>>
>>>>> Given how important (and prevalent) the Apache + FreeBSD combination is,
>>>>> I'm kind of disturbed that we're seeing this performance problem, and if
>>>>> it's something in 8.x that's also in 9.x, it would be better to fix it
>>>>> prior to 9.0-RELEASE.
>>>> Since my guess appeared to be not useful,
>>> Well, I wouldn't say that it wasn't useful; we eliminated the obvious
>>> candidate. So, "not good news" certainly, but not unhelpful. :)
>>>
>>>> the way forward is to identify
>>>> the location of the call(s) that cause the issue. I suggest compiling
>>>> at least apache itself, libc, rtld and libthr (if used) with debugging
>>>> information. Then, attach to the running apache worker with gdb and
>> Note this part.
>>
>>>> set a breakpoint on sigprocmask. Several backtraces from the hit breakpoint
>>>> should give enough data.
>>> We tried that, and got this:
>>>
>>> Loaded symbols for /libexec/ld-elf.so.1
>>> 0x28183a5d in accept () from /lib/libc.so.7
>>> (gdb) b sigprocmask
>>> Breakpoint 1 at 0x282d8f84
>>> (gdb) c
>>> Continuing.
>>> no thread to satisfy query
>>> 0x28183a5d in accept () from /lib/libc.so.7
>>> (gdb)
>> It seems your libc has no debugging information.
>> accept() is a pure syscall wrapper; it cannot call sigprocmask.
>> If gdb had caught the PLT trampoline instead of the real accept(), we would
>> see the rtld frames. So install libc, libthr and rtld with debug.
>>
>> Also, having debug symbols for apache itself can be useful.
> I'd also like to point out that enabling debugging symbols in devel/apr1
> will be greatly needed here, not just in www/apache*.
>
> I'm wondering if maybe this is some sort of pthread "thing" going on. A
> quick grep -r sigmask of the Apache source turns up some pthread_* bits
> pertaining to worker.
>
> Is Apache built using WITH_THREADS? What about devel/apr1?
>
> I don't use worker MPM on any of our boxes, we actually use ITK MPM
> solely because of the hosting nature of what we do. I've actually never
> seen worker MPM in use on any *IX machine I've been on or administrated,
> only prefork. The Apache documentation even mentions that "if you want
> stability or compatibility, prefork is the choice", while "if you want
> scalability, worker is a better choice"[1]. These sorts of quotes often
> shock me given what year it is. :-)
>
> [1]: http://httpd.apache.org/docs/2.0/mpm.html
>
We use ITK MPM too, but we have serious performance trouble on FreeBSD.
Also, I have to say that we can't use keep-alive connections, so apache
creates a new child for each request.
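
For reference, here is a minimal sketch of the setjmp() point Kostik raised
earlier in the thread. It is not taken from the Apache or pcre sources; the
function name mask_survives_jump() is made up for illustration. It only
shows the POSIX-documented difference between sigsetjmp(env, 1), which
saves the signal mask and has siglongjmp() restore it, and
sigsetjmp(env, 0), which does not. The mask query near the end has the same
shape as the calls in the ktrace output (null second argument, non-null
third).

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if SIGUSR1 is still blocked after jumping back to the
 * sigsetjmp() point, 0 if the mask was restored, -1 on error.
 */
static int mask_survives_jump(int savemask)
{
    sigjmp_buf env;
    sigset_t block, old, cur;
    int blocked;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);

    /* Start with SIGUSR1 unblocked; remember the caller's mask. */
    if (sigprocmask(SIG_UNBLOCK, &block, &old) != 0)
        return -1;

    if (sigsetjmp(env, savemask) == 0) {
        /* Change the mask after the jump point has been recorded. */
        if (sigprocmask(SIG_BLOCK, &block, NULL) != 0)
            return -1;
        siglongjmp(env, 1);
    }

    /* Pure query: set == NULL, oset != NULL, as seen in the ktrace. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    blocked = sigismember(&cur, SIGUSR1);

    /* Put the caller's original mask back. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return blocked;
}
```

Calling mask_survives_jump(1) should report that the mask was restored,
while mask_survives_jump(0) should report that SIGUSR1 stayed blocked,
i.e. no mask was saved at the jump point. Whether skipping the save is
safe for a given caller is a separate question that this sketch does not
answer.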
--
Best regards,
Daniil Cherednik
.masterhost