gssd mystery

Attila Bogár attila.bogar at linguamatics.com
Fri Jan 4 16:04:06 UTC 2013


Hi All,

I have an NFS server that exports file systems with Kerberos security.
The users and groups come from LDAP via the net/nss-pam-ldapd port.
gssd is linked against the latest Heimdal.
There are multiple LDAP servers for failover.

The sequence of events was as follows:
- The NFS daemon locked up
- top showed it in the gsslock state (or similar, I don't remember the exact state)
- I noticed that gssd wasn't running
- I ran /etc/rc.d/gssd start
- ... panic, reboot

Unfortunately I don't have a kernel dump, but checking the logs I see the following 3 minutes before the lockup:
[nslcd] [warning] [d802da] <passwd="someuser"> ldap_start_tls_s() failed (uri=ldap://ldap1.linguamatics.com): Can't contact LDAP server: Bad file descriptor
[nslcd] [warning] [d802da] <passwd="someuser"> failed to bind to LDAP server ldap://ldap1.linguamatics.com: Can't contact LDAP server: Bad file descriptor
[nslcd] [info] [d802da] <passwd="someuser"> connected to LDAP server ldap://ldap2.linguamatics.com
This may or may not be related, but I can't find any other occurrence of these messages going a long way back in the logs.

In any case, there seems to be a bug somewhere around gssd, because it died.
I don't know yet whether it is reproducible.

How can gssd be monitored on a production system to find out why it died?
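
The only thing I can think of so far is a small watchdog, roughly like the sketch below. It is only an illustration: the function names and the 30-second interval are my own arbitrary choices, and it assumes that the daemon shows up as "gssd" in the output of ps -axo comm. It would only record when gssd disappears, so that the time can be matched against other logs; it would not tell me why it died.

```python
import subprocess
import syslog
import time


def process_running(name, ps_output):
    """Return True if some line of the ps output is exactly the command name."""
    return any(line.strip() == name for line in ps_output.splitlines())


def list_commands():
    """Return the command names of all processes, one per line."""
    # Assumption: `ps -axo comm` prints a header followed by one command name per line.
    result = subprocess.run(
        ["ps", "-axo", "comm"], capture_output=True, text=True, check=True
    )
    return result.stdout


def watch(name="gssd", interval=30):
    """Poll the process list forever and log to syslog when `name` disappears."""
    was_running = True
    while True:
        running = process_running(name, list_commands())
        if was_running and not running:
            # Logged once per disappearance, not on every poll.
            syslog.syslog(syslog.LOG_ERR, f"{name} is no longer running")
        was_running = running
        time.sleep(interval)
```

Calling watch() from a small script started at boot would be enough to get a timestamp for the next time the daemon goes away.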

Attila

-- 
Attila Bogár
Systems Administrator
Linguamatics - Cambridge, UK
http://www.linguamatics.com/

