Stuck CLOSED sockets / sshd / zombies...

Konstantin Belousov kostikbel at gmail.com
Fri Apr 11 14:15:32 UTC 2014


On Fri, Apr 11, 2014 at 02:50:22PM +0100, Karl Pielorz wrote:
> 
> 
> --On 11 April 2014 16:16 +0300 Konstantin Belousov <kostikbel at gmail.com> wrote:
> 
> > On Fri, Apr 11, 2014 at 01:39:54PM +0100, Karl Pielorz wrote:
> >>
> >> Ok, rebuilt a debug world (with your rtld-elf patch), installed it -
> >> reproduced the issue, and ran up gdb on a 'urdlck' stuck sshd, and got
> >> the  trace below.
> > The trace looks reasonable.
> 
> Great :)
> 
> > I vaguely remember that you already answered this, but I want to start
> > investigating from a different angle.  Please show me the output
> > of 'ldd /usr/sbin/sshd' on your machine.  This happens on stable/10,
> > right?
> 
> "
> ldd /usr/sbin/sshd
> /usr/sbin/sshd:
>         libssh.so.5 => /usr/lib/private/libssh.so.5 (0x800860000)
>         libutil.so.9 => /lib/libutil.so.9 (0x800abb000)
>         libwrap.so.6 => /usr/lib/libwrap.so.6 (0x800ccd000)
>         libpam.so.5 => /usr/lib/libpam.so.5 (0x800ed6000)
>         libbsm.so.3 => /usr/lib/libbsm.so.3 (0x8010e2000)
>         libgssapi_krb5.so.10 => /usr/lib/libgssapi_krb5.so.10 (0x8012fc000)
>         libgssapi.so.10 => /usr/lib/libgssapi.so.10 (0x80151a000)
>         libkrb5.so.11 => /usr/lib/libkrb5.so.11 (0x801723000)
>         libhx509.so.11 => /usr/lib/libhx509.so.11 (0x801999000)
>         libasn1.so.11 => /usr/lib/libasn1.so.11 (0x801be1000)
>         libcom_err.so.5 => /usr/lib/libcom_err.so.5 (0x801e7a000)
>         libroken.so.11 => /usr/lib/libroken.so.11 (0x80207c000)
>         libwind.so.11 => /usr/lib/libwind.so.11 (0x80228d000)
>         libheimbase.so.11 => /usr/lib/libheimbase.so.11 (0x8024b5000)
>         libheimipcc.so.11 => /usr/lib/private/libheimipcc.so.11 (0x8026b9000)
>         libcrypt.so.5 => /lib/libcrypt.so.5 (0x8028bb000)
>         libcrypto.so.7 => /lib/libcrypto.so.7 (0x802adb000)
>         libz.so.6 => /lib/libz.so.6 (0x802ec6000)
>         libc.so.7 => /lib/libc.so.7 (0x8030db000)
>         libldns.so.5 => /usr/lib/private/libldns.so.5 (0x803474000)
>         libmd.so.6 => /lib/libmd.so.6 (0x8036c8000)
>         libthr.so.3 => /lib/libthr.so.3 (0x8038d8000)
> "
So my suspicion seems to be confirmed.  In the ldd output, libc appears
before libthr in the global order, so the libc sigaction() symbol is
resolved ahead of the libthr interposer.  As a result, the libthr wrapper
thr_sighandler() is not installed as the recipient of the kernel signal
on behalf of the application's signal handlers, which prevents the libthr
locks for rtld from working properly.

You can see this in your backtrace: thr_sighandler is absent from it
even though a signal handler is obviously active.
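
To make the ordering argument concrete, here is a toy sketch of first-match
symbol lookup over the global library order.  It is only a simplified model,
not how rtld is implemented, and the export sets are assumptions picked for
illustration rather than dumped from the real libraries.

```python
# Toy model: the first library in the global order that exports a symbol
# provides its definition. Export sets below are illustrative assumptions.

def resolve(symbol, load_order, exports):
    """Return the first library in load_order that exports symbol, else None."""
    for lib in load_order:
        if symbol in exports.get(lib, set()):
            return lib
    return None

EXPORTS = {
    "libc.so.7": {"sigaction", "malloc"},
    "libthr.so.3": {"sigaction", "pthread_create"},
}

# Tail of the library order from the stock sshd: libc ahead of libthr.
stock_order = ["libz.so.6", "libc.so.7", "libldns.so.5", "libmd.so.6", "libthr.so.3"]
# Tail of the order with libthr moved ahead of libc.
swapped_order = ["libz.so.6", "libthr.so.3", "libc.so.7", "libldns.so.5", "libmd.so.6"]

print(resolve("sigaction", stock_order, EXPORTS))    # libc.so.7
print(resolve("sigaction", swapped_order, EXPORTS))  # libthr.so.3
```

In this model the libthr version of sigaction() only wins when libthr is
ahead of libc in the list, which is the situation the theory above depends on.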

> 
> The box is stable/10 - quite an old stable 10 now, but afaik other people 
> have hit a similar issue on newer stable 10's - I've not updated this box, 
> as I've seen nothing to say it's "fixed" in newer versions [and it's 
> obviously been under investigation for weeks now on this machine as well, 
> long before I posted to -hackers]. I can update to a newer version (e.g. 
> today) if you want.
Better not: keep the environment as it is, so that the problem does not
disappear magically.  The ordering seems consistent enough anyway; on my
HEAD box I see the same order for the needed libraries.

> 
> > I do not see any linking with libpthread in the sshd Makefile.  Could it
> > be that libthr is loaded as a dependency of some PAM module?
> 
> Possibly - I don't know. This is stock FreeBSD #10 Stable - i.e. I've not 
> configured anything differently on SSH than what you get 'out the box'. 
> I've never done anything with PAM - so I don't know where I'd go checking 
> that kind of thing (but can if you point me in the right direction).

To confirm or refute my theory, please apply the patch below, in addition to
the previous patch, and rebuild sshd only:
# cd src/secure/usr.sbin/sshd && make clean all install
The patch changes the order of the needed libraries; for my build I got
sandy% ldd /usr/sbin/sshd
/usr/sbin/sshd:
        libssh.so.5 => /usr/lib/private/libssh.so.5 (0x800863000)
        libutil.so.9 => /lib/libutil.so.9 (0x800af0000)
...
        libz.so.6 => /lib/libz.so.6 (0x802f0d000)
        libthr.so.3 => /lib/libthr.so.3 (0x803123000)
        libc.so.7 => /lib/libc.so.7 (0x803348000)
        libldns.so.5 => /usr/lib/private/libldns.so.5 (0x8036d1000)
        libmd.so.6 => /lib/libmd.so.6 (0x803926000)
which could be enough to prevent the bug.
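
A quick way to check the order after the rebuild, as a rough sketch: the
helper names needed_order() and thr_before_libc() are made up for this
example, and the parsing assumes ldd prints lines in the "name => path
(address)" form shown above.

```python
import re

def needed_order(ldd_output):
    """Return library names in the order ldd lists them."""
    libs = []
    for line in ldd_output.splitlines():
        m = re.match(r"\s*(\S+)\s+=>", line)
        if m:
            libs.append(m.group(1))
    return libs

def thr_before_libc(ldd_output):
    """True if libthr is listed before libc, False if after, None if either is missing."""
    names = [lib.split(".so")[0] for lib in needed_order(ldd_output)]
    if "libthr" not in names or "libc" not in names:
        return None
    return names.index("libthr") < names.index("libc")

# Tail of the ldd output from the stock sshd quoted earlier.
STOCK = """/usr/sbin/sshd:
        libcrypto.so.7 => /lib/libcrypto.so.7 (0x802adb000)
        libz.so.6 => /lib/libz.so.6 (0x802ec6000)
        libc.so.7 => /lib/libc.so.7 (0x8030db000)
        libldns.so.5 => /usr/lib/private/libldns.so.5 (0x803474000)
        libmd.so.6 => /lib/libmd.so.6 (0x8036c8000)
        libthr.so.3 => /lib/libthr.so.3 (0x8038d8000)
"""

# Tail of the ldd output from the rebuilt sshd.
REBUILT = """/usr/sbin/sshd:
        libz.so.6 => /lib/libz.so.6 (0x802f0d000)
        libthr.so.3 => /lib/libthr.so.3 (0x803123000)
        libc.so.7 => /lib/libc.so.7 (0x803348000)
        libldns.so.5 => /usr/lib/private/libldns.so.5 (0x8036d1000)
        libmd.so.6 => /lib/libmd.so.6 (0x803926000)
"""

print(thr_before_libc(STOCK))    # False
print(thr_before_libc(REBUILT))  # True
```

On a live system the same function could be fed the captured output of
'ldd /usr/sbin/sshd' instead of the pasted samples.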

Please retest and report.

diff --git a/secure/usr.sbin/sshd/Makefile b/secure/usr.sbin/sshd/Makefile
index 4f730a9..5e399fa 100644
--- a/secure/usr.sbin/sshd/Makefile
+++ b/secure/usr.sbin/sshd/Makefile
@@ -54,8 +54,8 @@ LDADD+=	 -lgssapi_krb5 -lgssapi -lkrb5 -lhx509 -lasn1 \
 CFLAGS+= -DNONE_CIPHER_ENABLED
 .endif
 
-DPADD+= ${LIBCRYPT} ${LIBCRYPTO} ${LIBZ}
-LDADD+= -lcrypt -lcrypto -lz
+DPADD+= ${LIBCRYPT} ${LIBCRYPTO} ${LIBZ} ${LIBPTHREAD}
+LDADD+= -lcrypt -lcrypto -lz -lpthread
 
 .if defined(LOCALBASE)
 CFLAGS+= -DXAUTH_PATH=\"${LOCALBASE}/bin/xauth\"


More information about the freebsd-hackers mailing list