ntpd hangs under FBSD 8
psteele at maxiscale.com
Mon Feb 22 15:44:03 UTC 2010
>Just out of curiosity, can you attach to the process via gdb and get a backtrace? This smells like a locked pthread_join I hit in my own code a few weeks ago.
I'm not using the debug version of ntpd so the backtrace isn't too useful, but here's what I get:
#0 0x0000000800d52bfc in select () from /lib/libc.so.7
#1 0x0000000000425273 in ?? ()
#2 0x000000000040540e in ?? ()
#3 0x0000000800580000 in ?? ()
#4 0x0000000000000000 in ?? ()
The trace continues for 700+ entries, but the first frame is useful enough. select() takes a timeout argument, and every time I take a backtrace the process is sitting in this same select() call, so it seems it was called with an infinite timeout. One of these ran all weekend, in fact, and it's still stuck. Curiously, the problem only happens when we run the command from code via system(). If I run the same command interactively, it never hangs:
# /usr/sbin/ntpd -g -q
ntpd: time set +28845.997063s
The same code that runs this command does not hang when we run it on a FreeBSD 7 box.
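For anyone who wants to reproduce the non-interactive case, here is a minimal sketch of launching a command through system() and decoding the result. The calling code is not shown in this thread, so the wrapper name run_command is hypothetical and this is only an assumption about what such a caller looks like. system() hands the string to the shell and does not return until the child exits, so a child that never leaves select() blocks the caller indefinitely.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical wrapper, not the code from this thread.
 * Runs cmd through the shell with system() and returns the command's
 * exit code, or -1 if it could not be started or did not exit normally
 * (for example, it was killed by a signal).
 */
int run_command(const char *cmd)
{
    int status;

    /* system(NULL) only tests for a shell, so require a real command. */
    assert(cmd != NULL);

    status = system(cmd);   /* blocks until the child has exited */
    if (status == -1)
        return -1;          /* fork or wait failed */

    if (WIFEXITED(status))
        return WEXITSTATUS(status);

    return -1;              /* abnormal termination */
}
```

A caller would use it as run_command("/usr/sbin/ntpd -g -q") and check the return value. The wrapper has no timeout of its own, so it hangs for as long as the child does.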
I think I'm going to have to build the debug version of ntpd and try to debug it. Definitely something weird going on.
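On the select() timeout: the sketch below shows the difference between a finite timeout and a NULL timeout. It is not ntpd's code. The helper names wait_readable and select_demo are made up for illustration, and I am only assuming ntpd's call has the usual shape. With a NULL timeout, select() does not return until a descriptor is ready or a signal interrupts it, which would match a process that sits in select() all weekend.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from ntpd.
 * Waits for fd to become readable. A negative timeout_ms passes a NULL
 * timeout to select(), which then blocks until the descriptor is ready
 * or a signal arrives. Returns 1 if readable, 0 on timeout, -1 on error.
 */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    struct timeval *tvp = NULL;          /* NULL means no timeout */

    /* FD_SET is undefined for descriptors outside this range. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }

    return select(fd + 1, &readfds, NULL, NULL, tvp);
}

/*
 * Small demonstration on a pipe: the empty pipe times out after 50 ms,
 * and once a byte has been written even the no-timeout call returns.
 * Stores the two results in *before and *after; returns 0 on success.
 */
int select_demo(int *before, int *after)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;

    *before = wait_readable(fds[0], 50);     /* nothing to read yet */

    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    *after = wait_readable(fds[0], -1);      /* data is already waiting */

    close(fds[0]);
    close(fds[1]);
    return 0;
}
```

Calling wait_readable(fd, -1) on a descriptor that never becomes readable would block the same way the backtrace above suggests.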