file descriptor leak in 5.2-RC

Oliver Brandmueller ob at e-Gitt.NET
Wed Dec 24 07:41:28 PST 2003


Hi.

I just started (by accident) a new thread regarding the same topic...

On Sat, Dec 20, 2003 at 09:38:11PM +0100, Poul-Henning Kamp wrote:
> In message <Pine.NEB.3.96L.1031220105954.46326Q-100000 at fledge.watson.org>,
> Robert Watson writes:
> 
> >[...] so if we actually have a leak,
> >fstat(8) should show a small number of files, but the sysctl
> >kern.openfiles should reveal a large number of files open. 
> 
> sysctl kern.malloc | grep "file desc" ?

I can reproduce this behaviour without any problems.

The machine is a mail filtering server running exim, amavisd + 
SpamAssassin and ClamAV. I do have the machine currently in a testing 
environment and thus can do some experimentation.

The machine gets the whole feed of messages we usually have (it just
does not deliver any mail back to the main servers after filtering). This
means about 3-5 mails per second go through the machine, which seems to
be enough to reproduce the effect very quickly.

The following values (taken with SCHED_4BSD; SCHED_ULE gives the same
results) were read in single user mode after the machine had been up for
about 25 minutes and had done 10 minutes of mail filtering. Of course
none of the daemons are running anymore:

# sysctl kern.openfiles
kern.openfiles: 4715
# lsof | wc -l
      35
# fstat | wc -l
      23
# sysctl kern.malloc | grep "file desc"
file desc to leader     0     0K      1K        3  32
    file desc   102    26K     58K    15408  256
# ps ax
  PID  TT  STAT      TIME COMMAND
    0  ??  DLs    0:00.11  (swapper)
    1  ??  ILs    0:00.64 /sbin/init --
    2  ??  DL     0:00.11  (g_event)
    3  ??  DL     0:02.30  (g_up)
    4  ??  DL     0:01.70  (g_down)
    5  ??  DL     0:00.00  (taskqueue)
    6  ??  IL     0:00.00  (acpi_task0)
    7  ??  IL     0:00.00  (acpi_task1)
    8  ??  IL     0:00.00  (acpi_task2)
    9  ??  DL     0:00.00  (pagedaemon)
   10  ??  DL     0:00.00  (ktrace)
   11  ??  RL    26:37.86  (idle: cpu3)
   12  ??  RL    26:33.18  (idle: cpu2)
   13  ??  RL    25:53.23  (idle: cpu1)
   14  ??  RL    25:26.75  (idle: cpu0)
   27  ??  WL     0:00.00  (irq14: ata0)
   29  ??  WL     0:01.34  (irq16: uhci0)
   37  ??  WL     0:01.61  (irq24: twe0)
   61  ??  WL     0:02.00  (irq48: em0)
   86  ??  WL     0:01.65  (swi8: tty:sio clock)
   88  ??  WL     0:03.32  (swi1: net)
   89  ??  DL     0:00.43  (random)
   91  ??  WL     0:00.00  (swi7: acpitaskq)
   92  ??  WL     0:00.00  (swi7: task queue)
   94  ??  WL     0:00.00  (swi0: tty:sio)
   95  ??  DL     0:05.38  (pagezero)
   96  ??  DL     0:00.02  (bufdaemon)
   97  ??  DL     0:00.01  (vnlru)
   98  ??  DL     0:00.88  (syncer)
  415  ??  DL     0:00.00  (usb0)
  416  ??  DL     0:00.00  (usbtask)
15403  d0  Ss     0:00.01 -sh (sh)
15415  d0  R+     0:00.00 ps ax
# uname -a
FreeBSD lupin 5.2-CURRENT FreeBSD 5.2-CURRENT #13: Wed Dec 24 15:31:44 CET 2003     root at lupin.eusc.inter.net:/usr/obj/usr/src/sys/MOMAIL  i386
# uptime
 4:35PM  up 29 mins, 1 user, load averages: 0.10, 0.75, 0.63
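
To put a number on the discrepancy above, here is a minimal sketch. The
function name leak_gap is made up for illustration; it only subtracts the
entries fstat can show from kern.openfiles. fstat prints one header line,
and it also lists things like each process's working directory and
executable text, so treat the result as a rough estimate rather than an
exact count of leaked files.

```shell
#!/bin/sh
# leak_gap (hypothetical helper): rough estimate of open files the kernel
# counts but that fstat cannot attribute to any process.
#   $1 = value of kern.openfiles
#   $2 = line count of the fstat output (includes one header line)
leak_gap() {
    openfiles=$1
    fstat_lines=$2
    # Subtract the header line that fstat prints before its entries.
    visible=$((fstat_lines - 1))
    echo $((openfiles - visible))
}

# With the figures from this report (4715 open files, 23 fstat lines):
leak_gap 4715 23
```

On the live machine the same function could presumably be fed directly,
e.g. leak_gap $(sysctl -n kern.openfiles) $(fstat | wc -l), sampled every
minute or so to see whether the gap keeps growing while mail is filtered.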

There are no debugging options in the kernel, and malloc.conf is linked
to aj since I needed to do the performance testing. The machine has to
go into production on Sunday; I would like to stay with FreeBSD 5 because
of the better SMP performance and the ability to take FS snapshots. Only
in the worst case would I put 4-STABLE on it, so I will give any help I
can to solve the issue.

Greetinx, merry x-mas, Oliver

-- 
| Oliver Brandmueller | Offenbacher Str. 1  | Germany       D-14197 Berlin |
| Fon +49-172-3130856 | Fax +49-172-3145027 | WWW:   http://the.addict.de/ |
|                I am the Internet. As truly as I help God.                |
|     Commercial use of any address contained herein is not permitted!     |

