kern/149323: Applications with large memory footprint failing with mmap - memory allocation errors

Robert Blayzor rblayzor at inoc.net
Thu Aug 5 14:20:12 UTC 2010


>Number:         149323
>Category:       kern
>Synopsis:       Applications with large memory footprint failing with mmap - memory allocation errors
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Aug 05 14:20:11 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Robert Blayzor
>Release:        8.1-RELEASE
>Organization:
INOC, LLC
>Environment:
FreeBSD dcc0.albyny.inoc.net 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Thu Aug  5 12:49:45 UTC 2010     root at dcc0.albyny.inoc.net:/usr/obj/mnt/pub/FreeBSD/8.1-RELEASE/sys/PE1750  i386

>Description:
System previously ran 7.3-RELEASE with no issues.  We are seeing a problem with an application which uses are rather large memory footprint.  The binary is "dccd" from the port /usr/ports/mail/dcc-dccd.

Typically the binary would run ata bout 1800-1900MB of RAM under 7.3, with no problems.  This is the only application which runs on this server other than an ntp daemon.

After upgrading to FreeBSD 8.1-RELEASE we started noticing that the dccd daemon would start failing with memory allocation errors after the process grew to about 850-900MB of RAM.  There are no FreeBSD kernel or userland errors, but dccd was logging in massive numbers to syslog:


Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x49b72000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x49b72000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x48c84000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x48c84000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x47d96000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x47d96000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x46ea8000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x46ea8000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x45fba000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x45fba000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x450cc000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x450cc000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x441de000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x441de000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x432f0000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x432f0000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x42402000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x42402000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x41514000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x41514000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x40626000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x40626000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x3e84a000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x3e84a000) ok
Aug  4 14:18:16 dcc0 dccd[742]: try #1 mmap(dcc_db 0xeee000,0x3,0x801,11,0x3d95c000): Cannot allocate memory
Aug  4 14:18:16 dcc0 dccd[742]: try #2 mmap(dcc_db,0xeee000,0x3d95c000) ok


We first noticed this problem after immediately booting into the 8.1 GENERIC kernel.  We tried compiling our custom kernel (which included PAE) and also rebuilt all ports against the new 8.1 libraries.  The result was the same.

When the above happens, we can see the following in top:

last pid: 13710;  load averages:  0.00,  0.00,  0.00                        up 0+06:54:40  23:02:38
17 processes:  1 running, 16 sleeping
CPU:  0.0% user,  0.0% nice,  0.2% system,  0.0% interrupt, 99.8% idle
Mem: 867M Active, 54M Inact, 153M Wired, 164K Cache, 112M Buf, 2685M Free
Swap: 3915M Total, 3915M Free

 PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
12179 dcc         1  44    0   885M   886M select  0   2:35  0.00% dccd


>How-To-Repeat:
Unsure at this time.  We can repeat the problem by simply running our dccd daemon and letting it build it's database up in memory until it hits the breaking point.  We did not experience this problem under 7.3.

Some of our system configurations:

[dcc0:~] cat /boot/loader.conf
kern.maxdsiz="2048M"


[dcc0:~] cat /etc/sysctl.conf 
kern.maxfiles=131072
kern.maxfilesperproc=32768
#
kern.ipc.nmbclusters=65536
kern.ipc.maxsockbuf=4194304
kern.ipc.somaxconn=2048
kern.ipc.maxsockets=65536
net.inet.udp.recvspace=131072
net.inet.udp.maxdgram=16384
net.inet.tcp.msl=7500
#
net.inet.icmp.log_redirect=1
net.inet.icmp.drop_redirect=1
net.inet.tcp.delayed_ack=0
net.inet.ip.redirect=0
net.inet6.ip6.redirect=0
net.link.ether.inet.log_arp_wrong_iface=0
kern.sugid_coredump=1
net.inet.tcp.keepidle=120000
net.inet.tcp.keepintvl=10000


Custom kernel parts:

cpu             I486_CPU
cpu             I586_CPU
cpu             I686_CPU
ident           PE1750
options         PAE
makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols
makeoptions     NO_MODULES=yes
options         IPFIREWALL
options         IPFIREWALL_VERBOSE
options         IPFIREWALL_VERBOSE_LIMIT=100

device          bge             # Broadcom BCM570xx Gigabit Ethernet

device          scbus           # SCSI bus (required for SCSI)
#device         ch              # SCSI media changers
device          da              # Direct Access (disks)
#device         sa              # Sequential Access (tape etc)
device          pass            # Passthrough device (direct SCSI access)
device          ses             # SCSI Environmental Services (and SAF-TE)

# RAID controllers interfaced to the SCSI subsystem
device          amr             # AMI MegaRAID



dmesg output:

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.1-RELEASE #0: Thu Aug  5 12:49:45 UTC 2010
    root at dcc0.albyny.inoc.net:/usr/obj/mnt/pub/FreeBSD/8.1-RELEASE/sys/PE1750 i386
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2781.67-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Family = f  Model = 2  Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4400<CNXT-ID,xTPR>
real memory  = 4294967296 (4096 MB)
avail memory = 3936641024 (3754 MB)
ACPI APIC Table: <DELL   PE1750  >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 2 package(s) x 1 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  6
ioapic0: Changing APIC ID to 8
ioapic1: Changing APIC ID to 9
ioapic2: Changing APIC ID to 10
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-15 on motherboard
ioapic1 <Version 1.1> irqs 16-31 on motherboard
ioapic2 <Version 1.1> irqs 32-47 on motherboard
kbd1 at kbdmux0
acpi0: <DELL PE1750> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
vgapci0: <VGA-compatible display> port 0xec00-0xecff mem 0xfd000000-0xfdffffff,0xfe101000-0xfe101fff at device 14.0 on pci0
atapci0: <ServerWorks CSB5 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x8b0-0x8bf at device 15.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
ata1: <ATA channel 1> on atapci0
ata1: [ITHREAD]
ohci0: <OHCI (generic) USB controller> mem 0xfe100000-0xfe100fff irq 11 at device 15.2 on pci0
ohci0: [ITHREAD]
usbus0: <OHCI (generic) USB controller> on ohci0
isab0: <PCI-ISA bridge> at device 15.3 on pci0
isa0: <ISA bus> on isab0
pcib1: <ACPI Host-PCI bridge> on acpi0
pci4: <ACPI PCI bus> on pcib1
amr0: <LSILogic MegaRAID 1.53> mem 0xf0000000-0xf7ffffff,0xfcd00000-0xfcd3ffff irq 18 at device 3.0 on pci4
amr0: Using 64-bit DMA
amr0: [ITHREAD]
amr0: delete logical drives supported by controller
amr0: <LSILogic PERC 4/Di> Firmware 412W, BIOS H406, 128MB RAM
pcib2: <ACPI Host-PCI bridge> on acpi0
pci3: <ACPI PCI bus> on pcib2
pcib3: <ACPI Host-PCI bridge> on acpi0
pci2: <ACPI PCI bus> on pcib3
bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x002002> mem 0xfcf30000-0xfcf3ffff,0xfcf20000-0xfcf2ffff irq 16 at device 0.0 on pci2
miibus0: <MII bus> on bge0
brgphy0: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge0: Ethernet address: 00:0f:1f:66:a8:03
bge0: [ITHREAD]
bge1: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x002002> mem 0xfcf10000-0xfcf1ffff,0xfcf00000-0xfcf0ffff irq 17 at device 0.1 on pci2
miibus1: <MII bus> on bge1
brgphy1: <BCM5704 10/100/1000baseTX PHY> PHY 1 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge1: Ethernet address: 00:0f:1f:66:a8:04
bge1: [ITHREAD]
pcib4: <ACPI Host-PCI bridge> on acpi0
pci1: <ACPI PCI bus> on pcib4
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
atrtc0: <AT realtime clock> port 0x70-0x7f irq 8 on acpi0
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xec000-0xeffff pnpid ORM0000 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
p4tcc0: <CPU Frequency Thermal Control> on cpu0
p4tcc1: <CPU Frequency Thermal Control> on cpu1
Timecounters tick every 1.000 msec
ipfw2 (+ipv6) initialized, divert loadable, nat loadable, rule-based forwarding disabled, default to deny, logging disabled
usbus0: 12Mbps Full Speed USB v1.0
ugen0.1: <(0x1166)> at usbus0
uhub0: <(0x1166) OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
acd0: CDROM <SAMSUNG CD-ROM SN-124/N103> at ata1-master UDMA33 
amr0: delete logical drives supported by controller
amrd0: <LSILogic MegaRAID logical drive> on amr0
amrd0: 17280MB (35389440 sectors) RAID 1 (optimal)
SMP: AP CPU #1 Launched!
uhub0: 4 ports with 4 removable, self powered
Trying to mount root from ufs:/dev/amrd0s1a

>Fix:
The only workaround we've been able to find was recommended by the DCCD author Vernon Schryver to set the built time option in the port:

--with-max-db-mem=800

That limited the database memory footprint under the breaking point.


One thing we did try, but did not work was upping the vm.max_proc_mmap value in sysctl.

vm.max_proc_mmap=65535


Which had no effect at all.




>Release-Note:
>Audit-Trail:
>Unformatted:


More information about the freebsd-bugs mailing list