file descriptor leak in 5.2-RC
Robert Watson
rwatson at freebsd.org
Wed Dec 24 08:22:20 PST 2003
As a follow-up, it would also be interesting to know if you're using Linux
emulation or some other kernel emulation.
Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
robert at fledge.watson.org Senior Research Scientist, McAfee Research
On Wed, 24 Dec 2003, Robert Watson wrote:
>
> On Wed, 24 Dec 2003, Oliver Brandmueller wrote:
>
> > Hi.
> >
> > I just started (by accident) a new thread regarding the same topic...
>
> Hmm. So this makes multiple reports, so we definitely have a problem.
> Are you using any sort of threaded applications? If so, which threading
> packages are you using (linuxthreads, libc_r, libkse, et al.)? Do you know
> if you're making use of /dev/fd/*, or /dev/std* in scripts on your system?
> Do you have any reports of unusual process exits (via signals, etc)? If
> you look at the output of lsof or fstat while the system is actively
> running, it might be interesting to get a list of the kinds of sockets in
> use. Somewhere, presumably we're slipping a file descriptor reference,
> perhaps in a failure mode that turns up frequently in your environment.
> Helping to identify what differentiates your environment from the ones
> where this doesn't turn up may help track down the problem. The areas
> I've asked you to look at above are "interesting" file descriptor handling
> cases, and the problem might well be in one of these.
>
> Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
> robert at fledge.watson.org Senior Research Scientist, McAfee Research
>
> > On Sat, Dec 20, 2003 at 09:38:11PM +0100, Poul-Henning Kamp wrote:
> > > In message <Pine.NEB.3.96L.1031220105954.46326Q-100000 at fledge.watson.org>,
> > > Robert Watson writes:
> > >
> > > >[...] so if we actually have a leak,
> > > >fstat(8) should show a small number of files, but the sysctl
> > > >kern.openfiles should reveal a large number of files open.
> > >
> > > sysctl kern.malloc | grep "file desc" ?
> >
> > I can reproduce this behaviour without any problems.
> >
> > The machine is a mail filtering server running exim, amavisd +
> > SpamAssassin and ClamAV. I do have the machine currently in a testing
> > environment and thus can do some experimentation.
> >
> > The machine gets the whole feed of messages we usually have (but it just
> > does not deliver any mail back to the main servers after filtering). This
> > means about 3-5 mails per second going through the machine, which seems
> > enough to reproduce the effect very quickly.
> >
> > The following values (with SCHED_4BSD; SCHED_ULE gives the same) were read
> > in single-user mode after the machine had been up for about 25 minutes
> > and had done 10 minutes of mail filtering. Of course none of the daemons
> > are running anymore:
> >
> > # sysctl kern.openfiles
> > kern.openfiles: 4715
> > # lsof | wc -l
> > 35
> > # fstat | wc -l
> > 23
> > # sysctl kern.malloc | grep "file desc"
> > file desc to leader 0 0K 1K 3 32
> > file desc 102 26K 58K 15408 256
> > # ps ax
> > PID TT STAT TIME COMMAND
> > 0 ?? DLs 0:00.11 (swapper)
> > 1 ?? ILs 0:00.64 /sbin/init --
> > 2 ?? DL 0:00.11 (g_event)
> > 3 ?? DL 0:02.30 (g_up)
> > 4 ?? DL 0:01.70 (g_down)
> > 5 ?? DL 0:00.00 (taskqueue)
> > 6 ?? IL 0:00.00 (acpi_task0)
> > 7 ?? IL 0:00.00 (acpi_task1)
> > 8 ?? IL 0:00.00 (acpi_task2)
> > 9 ?? DL 0:00.00 (pagedaemon)
> > 10 ?? DL 0:00.00 (ktrace)
> > 11 ?? RL 26:37.86 (idle: cpu3)
> > 12 ?? RL 26:33.18 (idle: cpu2)
> > 13 ?? RL 25:53.23 (idle: cpu1)
> > 14 ?? RL 25:26.75 (idle: cpu0)
> > 27 ?? WL 0:00.00 (irq14: ata0)
> > 29 ?? WL 0:01.34 (irq16: uhci0)
> > 37 ?? WL 0:01.61 (irq24: twe0)
> > 61 ?? WL 0:02.00 (irq48: em0)
> > 86 ?? WL 0:01.65 (swi8: tty:sio clock)
> > 88 ?? WL 0:03.32 (swi1: net)
> > 89 ?? DL 0:00.43 (random)
> > 91 ?? WL 0:00.00 (swi7: acpitaskq)
> > 92 ?? WL 0:00.00 (swi7: task queue)
> > 94 ?? WL 0:00.00 (swi0: tty:sio)
> > 95 ?? DL 0:05.38 (pagezero)
> > 96 ?? DL 0:00.02 (bufdaemon)
> > 97 ?? DL 0:00.01 (vnlru)
> > 98 ?? DL 0:00.88 (syncer)
> > 415 ?? DL 0:00.00 (usb0)
> > 416 ?? DL 0:00.00 (usbtask)
> > 15403 d0 Ss 0:00.01 -sh (sh)
> > 15415 d0 R+ 0:00.00 ps ax
> > # uname -a
> > FreeBSD lupin 5.2-CURRENT FreeBSD 5.2-CURRENT #13: Wed Dec 24 15:31:44 CET 2003 root at lupin.eusc.inter.net:/usr/obj/usr/src/sys/MOMAIL i386
> > # uptime
> > 4:35PM up 29 mins, 1 user, load averages: 0.10, 0.75, 0.63
> >
> > There are no debugging options in the kernel and malloc.conf is linked
> > to aj since I needed to do the performance testing. The machine has to
> > go into production on Sunday; I would like to stay with FreeBSD 5 due
> > to the better SMP performance and the ability to do FS snapshots. Only
> > in the worst case would I put 4-STABLE on it. So I will give any help I
> > can to solve the issue.
> >
> > Greetinx, merry x-mas, Oliver
> >
> > --
> > | Oliver Brandmueller | Offenbacher Str. 1 | Germany D-14197 Berlin |
> > | Fon +49-172-3130856 | Fax +49-172-3145027 | WWW: http://the.addict.de/ |
> > | I am the Internet. As truly as I help God. |
> > | Commercial use of any of the addresses contained here is not permitted! |
> > _______________________________________________
> > freebsd-current at freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-current
> > To unsubscribe, send any mail to "freebsd-current-unsubscribe at freebsd.org"
> >
>
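
The check discussed in the quoted thread, comparing kern.openfiles with the
number of entries that fstat can attribute to running processes, can be
sketched as a small shell helper. This is only a rough illustration: the
function name leak_gap and the default slack of 100 are assumptions, not
anything taken from the thread, and the fstat line count includes its header
line.

```shell
#!/bin/sh
# Hypothetical helper: compare the kernel's open-file count with the number
# of entries that userland tools can attribute to live processes.
#   $1 = value of kern.openfiles
#   $2 = line count from fstat (header line included)
#   $3 = optional slack (assumed default of 100; not a value from the thread)
leak_gap() {
    openfiles=$1
    visible=$2
    slack=${3:-100}
    gap=$((openfiles - visible))
    if [ "$gap" -gt "$slack" ]; then
        echo "possible leak: $gap files not attributable to any process"
    else
        echo "ok: gap of $gap is within slack of $slack"
    fi
}

# On a FreeBSD system the inputs could be gathered like this:
#   leak_gap $(sysctl -n kern.openfiles) $(fstat | wc -l)
# With the figures reported in this thread:
leak_gap 4715 23
```

A large gap only suggests that file structures are still referenced outside
any live process; it does not say where the reference was lost.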