Fresh 7.0 Install: Fatal Trap 12 panic when put under load

Michael Grant mgrant at grant.org
Mon Nov 24 06:59:40 PST 2008


On Thu, Sep 11, 2008 at 11:56 AM, Jeremy Chadwick <koitsu at freebsd.org> wrote:
> On Thu, Sep 11, 2008 at 12:08:47PM +0200, Michael Grant wrote:
>> On Thu, Sep 11, 2008 at 11:20 AM, Jeremy Chadwick <koitsu at freebsd.org> wrote:
>> > On Thu, Sep 11, 2008 at 10:38:36AM +0200, Michael Grant wrote:
>> >> My box crashed again:
>> >>
>> >> panic: kmem_malloc(4096): kmem_map too small: 1073741824 total allocated
>> >> cpuid = 0
>> >> Uptime: 33d11h12m58s
>> >> Dumping 3327 MB (2 chunks)
>> >>   chunk 0: 1MB (151 pages) ... ok
>> >>   chunk 1: 3327MB (851568 pages)  <---hung here
>> >>
>> >> Still no valid dump.
>> >>
>> >> There is 4gig of physical memory in the machine.
>> >>
>> >> In /boot/loader.conf, I currently have the following:
>> >>
>> >> vm.kmem_size=1G
>> >> vm.kmem_size_max=1G
>> >> vm.kmem_size_scale=2
>> >>
>> >> and in my kernel conf file I have:
>> >>
>> >> options         KVA_PAGES=512
>> >>
>> >> It stayed up for 33 days this time.  Is there anything else I can do?
>> >
>> > First and foremost: are you using ZFS on this machine?  If so, there are
>> > many tunables you can apply to try and limit this; I'm willing to bet
>> > it's ARC which is doing it.  See below.
>> >
>> > In general, it appears that you need to increase the maximum range of
>> > kmem.  The kernel attempted to utilise more than 1GB, and your limit is
>> > 1G.  My machines running RELENG_7 on amd64, with only 2GB of RAM
>> > installed, use the following tunables in loader.conf:
>> >
>> > vm.kmem_size="1536M"
>> > vm.kmem_size_max="1536M"
>> >
>> > If ZFS is in use, I recommend these as well:
>> >
>> > vfs.zfs.arc_min="16M"
>> > vfs.zfs.arc_max="64M"
>> > vfs.zfs.prefetch_disable="1"
>> >
>> > Do not increase kmem_size any larger than 1.5GB; the amount of RAM you
>> > have in the machine, with regards to RELENG_7, will not help.  This is a
>> > known limitation which has been fixed in HEAD/CURRENT (where the limit
>> > has been increased to 512GB).  See the "Kernel" section below; you'll
>> > see the applicable item.
>> >
>> > http://wiki.freebsd.org/JeremyChadwick/Commonly_reported_issues
>> >
>> > Your only solution may be to run HEAD/CURRENT.
>>
>> I am not running ZFS.  My file systems are ufs.
>>
>> This feels like some sort of memory leak in the kernel.  Giving it
>> more and more memory just seems to delay the crash.  Are you saying
>> the crash is fixed in HEAD/CURRENT?
>
> It's an intentional crash, not "the program tried to access NULL, which
> crashed the machine" crash.  The kernel wants more memory to accomplish
> a certain thing, and it's not available.  kris@ can explain this in
> better terms than I can.
>
> First and foremost, it would be good to find out what all you are
> running on this machine (process-wise).  A process could be tickling
> something in the kernel which requires a large amount of memory.  I
> can imagine something like MySQL would require this.
>
> Ideally what needs to happen is to debug the kernel or get a full map
> of kmem to find out what's using what.  I believe vmstat -m or vmstat -z
> output might help.
>
> Obviously since the machine panics, you won't be able to run those
> commands after the fact.  I would recommend you set up a cronjob that
> runs every 1-2 minutes and logs the output of both of those commands
> to a file.  When the panic happens, restart the system and look at
> the logfile to see if you can figure out if anything suddenly starts
> taking up a large amount of memory, or if it's a gradual thing
> (indicating a memory leak).
>
> If you can figure out what might be tickling the problem, you can
> ultimately figure out if increasing kmem is the right thing to do, or if
> there's a greater problem here.
>
>> I'm running 6.3 by the way.
>>
>> I have put your changes into my loader.conf, we'll see how long it
>> goes this time.  I'm not quite in position to update everything to 7.x
>> at the moment.
>
> Our production webservers run RELENG_6 and RELENG_7, and we don't
> encounter this kind of problem.  I'm not saying what you're experiencing
> is indicative of hardware issues or something like that -- I'm simply
> saying I have loaded systems which don't ever hit that condition.  So
> figuring out what's causing it in your case would be good.
>

This appears to be too high as the machine reboots immediately after the fsck:

>> > vm.kmem_size="1536M"
>> > vm.kmem_size_max="1536M"

After I returned it to 1G, it panicked again about a month later.

Here's vmstat -m and -z roughly 1 minute before it crashed (I was
logging to a file every minute via cron):

Fri Nov 21 15:15:00 EST 2008
         Type InUse MemUse HighUse Requests  Size(s)
  pfs_vncache     2     1K       -   864205  32
         GEOM   168    24K       -   416279  16,32,64,128,256,512,1024,2048,4096
       isadev    17     2K       -       17  64
   CAM periph     1     1K       -        1  128
         cdev    26     4K       -       26  128
    CAM queue     3     1K       -        3  16
    file desc   739   474K       - 284943537  16,32,64,256,512,1024,2048,4096
        sigio     3     1K       -     4802  32
         kenv   116     8K       -      118  16,32,64,4096
       kqueue   246   154K       - 17652506  256,1024
    proc-args   153    10K       - 107101480  16,32,64,128,256
       zombie     0     0K       - 99871925  128
      ithread   147    15K       -      147  16,64,128
       KTRACE   100    13K       -   265722  16,32,64,128,256,512,1024,2048,4096
       linker   178   453K       -      475  16,32,256,512,1024,2048,4096
        lockf    18     2K       - 7774966702  64
       devbuf   594  1779K       -      598  16,32,64,128,256,512,1024,2048,4096
         temp 3170780 795024K       - 684086094  16,32,64,128,256,512,1024,2048,4096
       ip6opt     1     1K       -        1  128
       ip6ndp     7     1K       -        8  64,128
       module   403    26K       -      403  64,128
     mtx_pool     1     8K       -        1
CAM dev queue     1     1K       -        1  64
         pgrp    90     6K       -   785669  64
      session    65     9K       -   681185  128
         proc     2     8K       -        2  4096
      subproc  1307  1576K       - 99873232  256,4096
         cred   268    34K       - 1054173599  128
  ata_generic     9     9K       -        9  1024
       plimit    44    11K       -  5647664  256
      uidinfo    29     2K       -   384426  32,1024
       sysctl     0     0K       -  2200402  16,32,64
    sysctloid  3411   104K       -     3411  16,32,64
    sysctltmp     0     0K       -  2662228  16,32,128
         umtx  1750   110K       -     3360  64
         SWAP     2  2189K       -        2  64
          bus  1090    46K       -     7017  16,32,64,128,1024
       bus-sc    79    28K       -     3015  16,32,64,128,256,512,1024,2048,4096
      devstat    12    25K       -       12  16,4096
 eventhandler    51     3K       -       51  32,128
      CAM SIM     1     1K       -        1  64
         kobj   257   514K       -      315  2048
      CAM XPT    10     1K       -       17  16,64,512
    ad_driver     8     1K       -        8  32
      ata_dma    10     2K       -       10  128
         rman   193    13K       -      707  16,64
         sbuf     0     0K       -  5350749  16,32,64,128,256,512,1024,2048,4096
    ar_driver     0     0K       -       34  512,2048
    taskqueue    11     1K       -       11  16,128
       Unitno    18     1K       - 160999938  16,64
     ioctlops     0     0K       - 31916658  16,32,64,128,256,512,1024
          iov     0     0K       - 323400897  16,32,64,128,256,4096
          msg     4    25K       -        4  1024,4096
          sem     4     7K       -        4  512,1024,4096
          shm   124   135K       -    65027  1024
         ttys  2337   328K       -   100279  128,1024
         ptys    21     3K       -       21  128
         accf    35     1K       -    12157  16,32
     mbextcnt    11     1K       - 87164975  16
     mbuf_tag     0     0K       - 17517357  32
       soname    73     9K       - 276614136  16,32,128
          pcb   106     6K       - 18574167  16,32,64,2048
   BIO buffer    28    56K       - 11612611  1024,2048
     vfscache     1   512K       -        1
cluster_save buffer     0     0K       -  3154212  32,64
     VFS hash     1   256K       -        1
       vnodes    11     1K       -      669  16,128
        mount   171     5K       -     7997  16,32,64,128,2048
  vnodemarker     0     0K       -  2210275  512
          BPF     6     1K       -     3103  16,64,128,256
        ifnet     7     7K       -        8  256,1024
       ifaddr    86    19K       -      105  16,32,64,128,256,512,2048
  ether_multi    22     1K       -       26  16,32,64
        clone     6    24K       -        6  4096
       arpcom     3     1K       -        3  16
           lo     1     1K       -        1  16
   acd_driver     1     2K       -        1  2048
     ppbusdev     3     1K       -        3  128
     routetbl   212    41K       -    16997  16,32,64,128,256
     in_multi     4     1K       -        5  32
  IpFw/IpAcct     1     1K       -        1  64
  ip_moptions     1     1K       -        1  128
    hostcache     1    24K       -        1
     syncache     1     8K       -        1
    in6_multi    16     1K       -       16  16,32,64
      NFS req     0     0K       - 250799856  128
 NFSV3 diroff     0     0K       -   183024  512
   NFS daemon     1     8K       -        1
     p1003.1b     1     1K       -        1  16
      pagedep     1    64K       -        1
     inodedep     1   256K       -        1
       newblk     1     1K       -        1  256
  UFS dirhash   770   175K       - 10023288  16,32,64,128,256,512,1024,2048,4096
    UFS mount    12   245K       -       15  256,2048
      UMAHash     9    42K       -       46  256,512,1024,2048,4096
      entropy  1024    64K       -     1024  64
          USB    49     5K       -       49  16,32,64,128,256
       USBdev     4     1K       -       13  16,128,512
    VM pgdata     2    65K       -        2  64
       DEVFS2   152     3K       -      203  16
     atkbddev     2     1K       -        2  32
       DEVFS3   494    62K       -      501  128
       DEVFS1   152    38K       -      154  256
   DEVFS_RULE    34     8K       -       34  32,256
        DEVFS    38     1K       -       42  16,128
     I/O APIC     4     4K       -        4  1024
      memdesc     1     4K       -        1  4096
     nexusdev     3     1K       -        3  16
    pfs_nodes    20     3K       -       20  128
       acpica  1207    66K       -    26775  16,32,64,128,256,512,1024,2048
     acpitask     0     0K       -        1  32
     PCI Link    16     2K       -       16  32,64,128
      acpisem    22     2K       -       22  64
      acpidev    58     2K       -       58  32
   raid3_data     4     2K       -  2597361  16,32,256,512
  NULLFS node   182     3K       - 1548645220  16
  NULLFS hash     1     1K       -        1  64
 NULLFS mount     5     1K       -        5  16
         vlan     2     1K       -        2  16,64
 netgraph_msg     0     0K       -     6464  64,128,256,512,1024
netgraph_node     5     2K       -     2521  256
netgraph_hook    16     2K       -      156  128
     netgraph     1     8K       -       18  512
netgraph_sock     1     1K       -     2453  64
netgraph_path     0     0K       -     6464  16,32
netgraph_iface     1     1K       -        2  64
 netgraph_ppp     1     2K       -        2  2048
 netgraph_bpf     6     2K       -      144  64,128,256,512
netgraph_ksock     0     0K       -       16  64
netgraph_mppc     0     0K       -       28  1024
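
To see which malloc type dominates a snapshot like the one above, the rows can be ranked by MemUse. The following is only a minimal sketch: the function names, the regular expression, and the choice of sample rows are illustrative assumptions, not part of vmstat itself. The sample rows are copied from the listing above, each on a single line, and the kmem_map size is the figure from the panic message.

```python
import re

# One row of `vmstat -m`: Type, InUse, MemUse (in K), HighUse, Requests, Size(s).
# Type names may contain spaces ("file desc"), and Size(s) may be missing.
ROW = re.compile(
    r"^\s*(?P<type>.+?)\s+(?P<inuse>\d+)\s+(?P<memuse>\d+)K\s+\S+\s+(?P<requests>\d+)"
    r"(?:\s+(?P<sizes>[\d,]+))?\s*$"
)


def parse_vmstat_m(text):
    """Return (type, in_use, mem_use_kb) tuples; non-matching lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((m.group("type"), int(m.group("inuse")), int(m.group("memuse"))))
    return rows


def top_consumers(rows, n=3):
    """Return the n rows with the largest MemUse."""
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]


# A few rows taken from the snapshot above.
SAMPLE = """\
         Type InUse MemUse HighUse Requests  Size(s)
    file desc   739   474K       - 284943537  16,32,64,256,512,1024,2048,4096
       devbuf   594  1779K       -      598  16,32,64,128,256,512,1024,2048,4096
         temp 3170780 795024K       - 684086094  16,32,64,128,256,512,1024,2048,4096
      subproc  1307  1576K       - 99873232  256,4096
         SWAP     2  2189K       -        2  64
     mtx_pool     1     8K       -        1
"""

KMEM_BYTES = 1073741824  # "kmem_map too small: 1073741824 total allocated"

if __name__ == "__main__":
    for name, in_use, kb in top_consumers(parse_vmstat_m(SAMPLE)):
        share = kb * 1024 / KMEM_BYTES
        print(f"{name:>12} {in_use:>9} {kb:>8}K  {share:6.1%} of kmem_map")
```

On these numbers the "temp" type holds 795024K, roughly 776 MB, or about three quarters of the 1 GB kmem_map. Running the same ranking over successive per-minute snapshots would show whether that figure climbs gradually or jumps shortly before the panic.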

ITEM                     SIZE     LIMIT      USED      FREE  REQUESTS  FAILURES

UMA Kegs:                 140,        0,       77,       19,       77,        0
UMA Zones:                480,        0,       77,        3,       77,        0
UMA Slabs:                 64,        0,     6484,     1304, 30596118,        0
UMA RCntSlabs:            104,        0,      625,      189,  1205420,        0
UMA Hash:                 128,        0,        3,       27,       12,        0
16 Bucket:                 76,        0,       39,      111,      186,        0
32 Bucket:                140,        0,       66,       74,      208,        0
64 Bucket:                268,        0,      118,       36,      459,        9
128 Bucket:               524,        0,    10974,      261,   984992,  4546474
VM OBJECT:                132,        0,    42296,    60016, 2315027163,        0
MAP:                      192,        0,        7,       13,        7,        0
KMAP ENTRY:                68,    90104,      160,     7512, 98287339,        0
MAP ENTRY:                 68,        0,    36757,    15379, 4327383373,        0
PV ENTRY:                  24,  2067410,   626121,  1238434, 52068959685,        0
DP fakepg:                 72,        0,        0,        0,        0,        0
mt_zone:                 1024,        0,      219,      237,      219,        0
16:                        16,        0,     3875,     1606, 2218944237,        0
32:                        32,        0,     2007,     3643, 157755404,        0
64:                        64,        0,     5655,     1012, 8091390625,        0
128:                      128,        0,     4065,     1245, 1507077079,        0
256:                      256,        0,  3169837,      458, 269064785,        0
512:                      512,        0,      928,     1288, 12048433,        0
1024:                    1024,        0,     2493,     1407, 405766834,        0
2048:                    2048,        0,      512,      788, 103888082,        0
4096:                    4096,        0,      399,      533, 114531797,        0
Files:                     72,        0,     1799,     2070, 2326899098,        0
TURNSTILE:                 52,        0,     1751,      373,     3361,        0
PROC:                     536,        0,      332,      641, 99872258,        0
THREAD:                   384,        0,     1114,      636, 80501077,        0
KSEGRP:                    88,        0,      994,      606,  2875793,        0
UPCALL:                    44,        0,       72,      630,  3421747,        0
SLEEPQUEUE:                32,        0,     1751,      509,     3361,        0
VMSPACE:                  296,        0,      282,      836, 99798265,        0
mbuf_packet:              256,        0,      288,      804, 1623032273,        0
mbuf:                     256,        0,       29,      649, 7723849747,        0
mbuf_cluster:            2048,    25600,     1092,      158, 41217209,        0
mbuf_jumbo_pagesize:     4096,        0,        0,        0,        0,        0
mbuf_jumbo_9k:           9216,        0,        0,        0,        0,        0
mbuf_jumbo_16k:         16384,        0,        0,        0,        0,        0
ACL UMA zone:             388,        0,        0,        0,        0,        0
g_bio:                    132,        0,        0,     1218, 4046376667,        2
ata_request:              204,        0,        0,      798, 1167883416,        2
ata_composite:            196,        0,        0,        0,        0,        0
VNODE:                    272,        0,    32878,    67432, 4286504460,        0
VNODEPOLL:                 76,        0,        2,      248,       39,        0
S VFS Cache:               68,        0,    32722,    65670, 3034960875,        0
L VFS Cache:              291,        0,      629,     2790, 66550626,        0
NAMEI:                   1024,        0,        1,      667, 9997159801,        0
DIRHASH:                 1024,        0,     1850,      434, 21697253,        0
NFSMOUNT:                 480,        0,        1,        7,        2,        0
NFSNODE:                  464,        0,        1,     3943, 221540609,        0
PIPE:                     408,        0,       24,      543, 54218876,        0
KNOTE:                     68,        0,     4132,      796, 110922846,        0
socket:                   356,    12331,      349,     1081, 47659527,        0
ipq:                       32,      904,        0,      904,  2259778,        0
udpcb:                    180,    12342,       46,      218, 14346034,        0
inpcb:                    180,    12342,      260,     1170, 17672886,        0
tcpcb:                    464,    12328,      142,      690, 17672886,        0
tcptw:                     48,     2496,      118,     1442,  8139533,        0
syncache:                 100,    15366,        2,      622, 12257432,        0
hostcache:                 76,    15400,     1220,     1130,  1125859,        0
tcpreass:                  20,     1690,        0,      845,   564503,        0
sackhole:                  20,        0,        1,      675,  2544305,        0
ripcb:                    180,    12342,        1,      153,   637466,        0
unpcb:                    144,    12339,      158,      787, 15000640,        0
rtentry:                  132,        0,       50,      182,     5723,        0
IPFW dynamic rule:        108,        0,        0,        0,        0,        0
SWAPMETA:                 276,   121576,    14525,    22393, 55820649,        0
Mountpoints:              664,        0,       15,       21,       17,        0
FFS inode:                132,        0,    32619,    58383, 2515314975,        0
FFS1 dinode:              128,        0,        0,        0,        0,        0
FFS2 dinode:              256,        0,    32619,    56586, 2515314975,        0
gr3:64k:                65536,        0,        0,      292, 10698518,   178524
gr3:16k:                16384,        0,        0,      348, 53407139,  2817786
gr3:4k:                  4096,        0,        0,      284, 53870651,     3399
gr3:64k:                65536,        0,        0,      434, 30935972,   267730
gr3:16k:                16384,        0,        0,      722, 253649141, 26383756
gr3:4k:                  4096,        0,        0,      659, 86316934,     4074
NetGraph items:            36,      546,        0,      312,   176587,        0
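
The zone listing can be checked the same way by multiplying SIZE by USED for each zone. This is a rough sketch under stated assumptions: the parser and its names are hypothetical, it assumes the six comma-separated columns shown in the header above, and it rejoins a row that was wrapped after a trailing comma, as can happen when such a log is mailed. The sample rows are taken from the listing above.

```python
def parse_vmstat_z(text):
    """Parse `vmstat -z` style rows into dicts, rejoining rows wrapped after a comma."""
    zones = []
    pending = ""
    for line in text.splitlines():
        pending = f"{pending} {line}".strip()
        # Zone names may themselves contain colons ("gr3:16k"), so split on the last one.
        name, sep, rest = pending.rpartition(":")
        fields = [f.strip() for f in rest.split(",")]
        if sep and len(fields) == 6 and all(f.isdigit() for f in fields):
            size, limit, used, free, requests, failures = map(int, fields)
            zones.append({
                "name": name.strip(),
                "size": size,
                "used": used,
                "requests": requests,
                "failures": failures,
                "bytes_used": size * used,  # memory held by items currently in use
            })
            pending = ""
        elif not (sep and fields[-1] == ""):
            # Neither a complete row nor one ending in a comma: drop it.
            pending = ""
    return zones


# Sample rows from the snapshot; three are shown wrapped, as a mailed log might be.
SAMPLE = """\
ITEM                     SIZE     LIMIT      USED      FREE  REQUESTS  FAILURES

128 Bucket:               524,        0,    10974,      261,   984992,  4546474
VM OBJECT:                132,        0,    42296,    60016,
2315027163,        0
PV ENTRY:                  24,  2067410,   626121,  1238434,
52068959685,        0
256:                      256,        0,  3169837,      458, 269064785,        0
VNODE:                    272,        0,    32878,    67432,
4286504460,        0
gr3:16k:                16384,        0,        0,      722, 253649141, 26383756
"""

if __name__ == "__main__":
    ranked = sorted(parse_vmstat_z(SAMPLE), key=lambda z: z["bytes_used"], reverse=True)
    for z in ranked:
        print(f"{z['name']:>12} {z['bytes_used'] / 2**20:10.1f} MB in use, "
              f"{z['failures']} failures")
```

With the figures above, the 256-byte zone has 3169837 items in use, which is 811478272 bytes or about 774 MB. That is close to the 3170780 allocations and 795024K reported for the "temp" malloc type, which suggests (but does not prove) that most of the "temp" memory consists of 256-byte allocations.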

