nsdispatch performance issue for large group files (libc)

Anthony Bourov ab at addr.com
Mon Mar 23 16:43:23 PDT 2009


Regarding performance of: lib/libc/net/nsdispatch.c
When used from: lib/libc/net/getgrent.c (called by initgroups())

I don't normally post here, but I wanted to get some feedback on a performance issue that I spotted. I run a large number of high-volume web hosting servers, and on some of them I noticed a severe decrease in Apache's performance when the /etc/group file is large (over 100,000 entries, since the group file is combined across servers).

I did a trace and found the following operation:
stat("/etc/nsswitch.conf", {st_mode=052, st_size=4503681233059861, ...}) = 0

repeating as many times as there are groups in the group file. I narrowed the problem down to where Apache calls "initgroups()" before forking each process (nothing wrong here). initgroups() goes through every entry in the group file using getgrent(), which in turn calls nsdispatch, and nsdispatch does a "stat" on "/etc/nsswitch.conf" on every single call to see if the file has changed.

This issue impacts different servers differently. On most of the SCSI servers it delays Apache startup by maybe a minute; however, on a Dell SATA RAID the "stat" call was significantly slower and caused everything to come to a halt for several minutes every time Apache starts.

In my opinion this is a very significant performance issue when working with large servers. Most programs, including Apache, will call "initgroups()" every time they fork, and if the group file is large this means as many "stat" requests on the file system as there are entries in the group file for every single fork() that the server does.

For myself, I just made it never "stat" "/etc/nsswitch.conf" after the first time, since I know that file is never modified. However, a better solution would be to somehow let nsdispatch know that it is being run in batch mode and should not keep testing whether the file has changed. This would affect both "getgrent" and "getpwent".
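
To make the batch mode idea concrete, here is a rough sketch of the logic I have in mind. It is not a patch against nsdispatch.c; struct conf_watch, its fields, and conf_needs_reload() are all hypothetical names used only for illustration:

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical state; not the actual nsdispatch.c data structures. */
struct conf_watch {
    const char   *path;          /* file to watch, e.g. /etc/nsswitch.conf */
    time_t        last_mtime;    /* mtime seen at the last real check */
    bool          checked_once;  /* has the file been checked at least once? */
    bool          batch_mode;    /* if true, skip stat() after the first check */
    unsigned long stat_calls;    /* counter, kept only for illustration */
};

/* Returns true when the caller should (re)read the configuration file. */
bool conf_needs_reload(struct conf_watch *w)
{
    struct stat sb;
    bool changed;

    /* In batch mode, trust the result of the first check. */
    if (w->batch_mode && w->checked_once)
        return false;

    w->stat_calls++;
    if (stat(w->path, &sb) != 0) {
        /* File missing or unreadable: nothing to reload. */
        w->checked_once = true;
        return false;
    }

    changed = !w->checked_once || sb.st_mtime != w->last_mtime;
    w->last_mtime = sb.st_mtime;
    w->checked_once = true;
    return changed;
}
```

With batch_mode set, a full pass over the group file costs one stat instead of one per entry. The trade-off is that a change to the file during the batch goes unnoticed until batch mode is turned off again.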



More information about the freebsd-performance mailing list