ports/153715: FreeRADIUS exiting with Signal 11 on FreeBSD
Hugh Blandford
hugh at island.net.au
Thu Jan 6 00:00:21 UTC 2011
>Number: 153715
>Category: ports
>Synopsis: FreeRADIUS exiting with Signal 11 on FreeBSD
>Confidential: no
>Severity: serious
>Priority: low
>Responsible: freebsd-ports-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Thu Jan 06 00:00:20 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator: Hugh Blandford
>Release: 8.1
>Organization:
Island Internet
>Environment:
FreeBSD 8.1-RELEASE-p1 FreeBSD 8.1-RELEASE-p1 #0: Thu Oct 7 12:09:14 EST 2010 hugh at island.net.au:/usr/obj/usr/src/sys/GENERIC amd64
>Description:
A ktrace of the problem:
/var/log/radius.log:
Sat Dec 18 02:43:43 2010 : Info: Exiting normally.
/var/log/messages:
Dec 18 02:43:43 radius kernel: pid 655 (radiusd), uid 133: exited on signal
11
$ kdump | tail -20
655 radiusd CALL fstat(0xa,0xbf8fb280)
655 radiusd STRU struct stat {dev=83, ino=2378852, mode=-rw------- ,
nlink=1, uid=133, gid=133, rdev=0, atime=1292640223, stime=1292640223,
ctime=1292640223, birthtime=1292640223, size=0, blksize=16384, blocks=0,
flags=0x0 }
655 radiusd RET fstat 0
655 radiusd CALL lseek(0xa,0,SEEK_SET,0x2)
655 radiusd RET lseek 0
655 radiusd CALL write(0xa,0x28752000,0x54)
655 radiusd GIO fd 10 wrote 84 bytes
"Sat Dec 18 02:43:43 2010
Timestamp = 1292640223
Request-Authenticator = Verified
"
655 radiusd RET write 84/0x54
655 radiusd CALL close(0xa)
655 radiusd RET close 0
655 radiusd PSIG SIGSEGV SIG_DFL
655 radiusd RET _umtx_op -1 errno 4 Interrupted system call
655 radiusd RET _umtx_op -1 errno 4 Interrupted system call
655 radiusd RET _umtx_op -1 errno 4 Interrupted system call
655 radiusd RET _umtx_op -1 errno 4 Interrupted system call
gdb output:
(I have cut some out as I ran out of characters for the post. Let me know if I might have chopped something important)
Starting program: /usr/local/sbin/radiusd -f
[New LWP 100056]
[New Thread 28401140 (LWP 100056)]
[New Thread 28567ac0 (LWP 100051)]
[New Thread 28567980 (LWP 100068)]
[New Thread 28567840 (LWP 100069)]
[New Thread 28567700 (LWP 100070)]
[New Thread 285675c0 (LWP 100102)]
Program received signal SIGABRT, Aborted.
[Switching to Thread 28567ac0 (LWP 100051)]
0x2831866b in thr_kill () from /lib/libc.so.7
(gdb)
(gdb)
(gdb) where
#0 0x2831866b in thr_kill () from /lib/libc.so.7
#1 0x282c7896 in pthread_kill () from /lib/libthr.so.3
#2 0x282c53b3 in raise () from /lib/libthr.so.3
#3 0x283bac4a in abort () from /lib/libc.so.7
#4 0x0806c852 in rad_assert_fail (file=Could not find the frame base for
"rad_assert_fail".
) at util.c:365
#5 0x0806af44 in request_dequeue (request=0x28542b7c, fun=0xbf9fef8c) at
threads.c:412
#6 0x0806b18b in request_handler_thread (arg=0x28542b60) at threads.c:516
#7 0x282c175f in pthread_getprio () from /lib/libthr.so.3
#8 0x00000000 in ?? ()
(gdb) thread apply all bt full
(these were the same so I have shortened them)
Thread 7 (Thread 285675c0 (LWP 100102)):
Thread 6 (Thread 28567700 (LWP 100070)):
Thread 5 (Thread 28567840 (LWP 100069)):
Thread 4 (Thread 28567980 (LWP 100068)):
#0 0x282cc227 in __error () from /lib/libthr.so.3
No symbol table info available.
#1 0x282cbc19 in __error () from /lib/libthr.so.3
No symbol table info available.
#2 0x285484fc in ?? ()
No symbol table info available.
#3 0x0000000f in ?? ()
No symbol table info available.
#4 0x00000000 in ?? ()
No symbol table info available.
#5 0x00000000 in ?? ()
No symbol table info available.
#6 0x00000000 in ?? ()
No symbol table info available.
#7 0x282cdbf4 in ?? () from /lib/libthr.so.3
No symbol table info available.
#8 0xbf8fdf68 in ?? ()
No symbol table info available.
#9 0x282c0fcc in sem_wait () from /lib/libthr.so.3
No symbol table info available.
Previous frame identical to this frame (corrupt stack?)
Thread 3 (Thread 28567ac0 (LWP 100051)):
---Type <return> to continue, or q <return> to quit---
#0 0x2831866b in thr_kill () from /lib/libc.so.7
No symbol table info available.
#1 0x282c7896 in pthread_kill () from /lib/libthr.so.3
No symbol table info available.
#2 0x282c53b3 in raise () from /lib/libthr.so.3
No symbol table info available.
#3 0x283bac4a in abort () from /lib/libc.so.7
No symbol table info available.
#4 0x0806c852 in rad_assert_fail (file=Could not find the frame base for
"rad_assert_fail".
) at util.c:365
No locals.
#5 0x0806af44 in request_dequeue (request=0x28542b7c, fun=0xbf9fef8c) at
threads.c:412
blocked = 134809920
i = RAD_LISTEN_ACCT
start = RAD_LISTEN_ACCT
entry = (request_queue_t *) 0x0
last_complained = 0
#6 0x0806b18b in request_handler_thread (arg=0x28542b60) at threads.c:516
fun = 0x804e250 <rad_accounting>
self = (THREAD_HANDLE *) 0x28542b60
#7 0x282c175f in pthread_getprio () from /lib/libthr.so.3
No symbol table info available.
#8 0x00000000 in ?? ()
No symbol table info available.
Thread 2 (Thread 28401140 (LWP 100056)):
#0 0x283abf9f in unlink () from /lib/libc.so.7
No symbol table info available.
#1 0x0805daf9 in command_socket_free (this=0x28569100) at command.c:1839
sock = (fr_command_socket_t *) 0x285b4800
#2 0x08060053 in listen_free (head=0x8090b7c) at listen.c:2257
next = (rad_listen_t *) 0x285690a0
this = (rad_listen_t *) 0x28569100
#3 0x08061e7c in free_mainconfig () at mainconfig.c:992
cc = (cached_config_t *) 0x0
next = (cached_config_t *) 0x0
#4 0x0806859a in main (argc=2, argv=0xbfbfec74) at radiusd.c:443
rcode = 0
argval = -1
spawn_flag = 1
dont_fork = 1
flag = 0
act = {__sigaction_u = {__sa_handler = 0x80687f0 <sig_fatal>,
__sa_sigaction = 0x80687f0 <sig_fatal>}, sa_flags = 0,
sa_mask = {__bits = {0, 0, 0, 0}}}
#0 0x2831866b in thr_kill () from /lib/libc.so.7
Alan DeKok the main FreeRADIUS developer looked at the output and commented:
Code:
> #4 0x0806c852 in rad_assert_fail (file=Could not find the frame base for
> > "rad_assert_fail".
> > ) at util.c:365
> > #5 0x0806af44 in request_dequeue (request=0x28542b7c, fun=0xbf9fef8c) at
> > threads.c:412
Ugh. Something is free'ing the request when it's still queued. This
*only* can happen when the request is in the queue for more than 30s.
So... why is your server so slow? Fix that, and the problem won't be
fixed, but it *will* go away.
I can take a look at the code, but this is a very odd edge case, and
hard to track down/fix.
>How-To-Repeat:
just let the server run. It will eventually fall over. It does seem to be random.
>Fix:
not known
>Release-Note:
>Audit-Trail:
>Unformatted:
More information about the freebsd-ports-bugs
mailing list