Crashing repeatedly: 6.2-RELEASE-p5 and MySQL 5.0.41

Kostik Belousov kostikbel at gmail.com
Mon Feb 4 16:19:29 UTC 2008


On Mon, Feb 04, 2008 at 12:50:32PM +0000, Primeroz lists wrote:
> Hi all,
> 
> We are experiencing repeated crashes on a Dell PowerEdge 2950 (rev 1 or 2).
> 
> The FreeBSD release is 6.2-RELEASE-p5, amd64, on 2x quad-core Xeon with 8 GB of RAM.
> 
> The MySQL version is 5.0.41, with the following configuration settings:
> 
> set-variable    = key_buffer=768M
> set-variable    = table_cache=800
> set-variable    = sort_buffer=24M
> set-variable    = myisam_sort_buffer_size=256M
> set-variable    = record_buffer=16M
> set-variable    = max_allowed_packet=10M
> set-variable    = thread_stack=128K
> set-variable    = join_buffer=512M
> set-variable    = max_heap_table_size=256M
> set-variable    = max_connections=300
> set-variable    = tmp_table_size=384M
> set-variable    = query_cache_size=402653184
> set-variable    = query_cache_limit=134217728
> set-variable    = read_rnd_buffer_size=10M
> set-variable    = ft_min_word_len=1
> pid-file        = /var/db/mysqld.pid
> tmpdir          = /var/tmp
> ft_stopword_file = ''
> set-variable    = thread_cache_size=80
> set-variable    = myisam_stats_method=nulls_equal
> 
> 
> The system is crashing repeatedly, and the graphs we collect on the box
> show intensive usage of *InnoDB*-related resources every time before a
> crash. I collected several vmcore dumps, and attached is what I've been
> able to extract.
> 
> I'm not sure how closely the *InnoDB* usage is related to the crash, but I'm
> quite sure that it is triggering the crash.
> 
> I've looked through CVS and the recent releases to see whether anything
> related to my crash has been updated lately, but I did not find anything
> specifically related. So I'm wondering whether anybody else has seen this
> kind of problem, before I proceed with a blind upgrade or any other blind
> solution.
> 
> 
> > $ sudo  kgdb /usr/obj/usr/src/sys/PE2950/kernel.debug vmcore.2
> Password:
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so:
> Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd".
> 
> Unread portion of the kernel message buffer:
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 5; apic id = 05
> fault virtual address   = 0x100166887ad
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xffffffff803fa290
> stack pointer           = 0x10:0xffffffffba0a9980
> frame pointer           = 0x10:0x2
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 1038 (mysqld)
> trap number             = 12
> panic: page fault
> cpuid = 5
> Uptime: 1d4h37m54s
> Dumping 8191 MB (3 chunks)
>   chunk 0: 1MB (156 pages) ... ok
>   chunk 1: 3327MB (851624 pages) 3311 3295 3279 3263 3247 3231 3215 3199
> 3183 3167 3151 3135 3119 3103 3087 3071 3055 3039 3023 3007 2991 2975 2959
> 2943 2927 2911 2895 2879 2863 2847 2831 2815 2799 2783 2767 2751 2735 2719
> 2703 2687 2671 2655 2639 2623 2607 2591 2575 2559 2543 2527 2511 2495 2479
> 2463 2447 2431 2415 2399 2383 2367 2351 2335 2319 2303 2287 2271 2255 2239
> 2223 2207 2191 2175 2159 2143 2127 2111 2095 2079 2063 2047 2031 2015 1999
> 1983 1967 1951 1935 1919 1903 1887 1871 1855 1839 1823 1807 1791 1775 1759
> 1743 1727 1711 1695 1679 1663 1647 1631 1615 1599 1583 1567 1551 1535 1519
> 1503 1487 1471 1455 1439 1423 1407 1391 1375 1359 1343 1327 1311 1295 1279
> 1263 1247 1231 1215 1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039
> 1023 1007 991 975 959 943 927 911 895 879 863 847 831 815 799 783 767 751
> 735 719 703 687 671 655 639 623 607 591 575 559 543 527 511 495 479 463 447
> 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143
> 127 111 95 79 63 47 31 15 ... ok
>   chunk 2: 4864MB (1245184 pages) 4849 4833 4817 4801 4785 4769 4753 4737
> 4721 4705 4689 4673 4657 4641 4625 4609 4593 4577 4561 4545 4529 4513 4497
> 4481 4465 4449 4433 4417 4401 4385 4369 4353 4337 4321 4305 4289 4273 4257
> 4241 4225 4209 4193 4177 4161 4145 4129 4113 4097 4081 4065 4049 4033 4017
> 4001 3985 3969 3953 3937 3921 3905 3889 3873 3857 3841 3825 3809 3793 3777
> 3761 3745 3729 3713 3697 3681 3665 3649 3633 3617 3601 3585 3569 3553 3537
> 3521 3505 3489 3473 3457 3441 3425 3409 3393 3377 3361 3345 3329 3313 3297
> 3281 3265 3249 3233 3217 3201 3185 3169 3153 3137 3121 3105 3089 3073 3057
> 3041 3025 3009 2993 2977 2961 2945 2929 2913 2897 2881 2865 2849 2833 2817
> 2801 2785 2769 2753 2737 2721 2705 2689 2673 2657 2641 2625 2609 2593 2577
> 2561 2545 2529 2513 2497 2481 2465 2449 2433 2417 2401 2385 2369 2353 2337
> 2321 2305 2289 2273 2257 2241 2225 2209 2193 2177 2161 2145 2129 2113 2097
> 2081 2065 2049 2033 2017 2001 1985 1969 1953 1937 1921 1905 1889 1873 1857
> 1841 1825 1809 1793 1777 1761 1745 1729 1713 1697 1681 1665 1649 1633 1617
> 1601 1585 1569 1553 1537 1521 1505 1489 1473 1457 1441 1425 1409 1393 1377
> 1361 1345 1329 1313 1297 1281 1265 1249 1233 1217 1201 1185 1169 1153 1137
> 1121 1105 1089 1073 1057 1041 1025 1009 993 977 961 945 929 913 897 881 865
> 849 833 817 801 785 769 753 737 721 705 689 673 657 641 625 609 593 577 561
> 545 529 513 497 481 465 449 433 417 401 385 369 353 337 321 305 289 273 257
> 241 225 209 193 177 161 145 129 113 97 81 65 49 33 17 1
> 
> #0  doadump () at pcpu.h:172
> 172     pcpu.h: No such file or directory.
>         in pcpu.h
> (kgdb) bt
> #0  doadump () at pcpu.h:172
> #1  0x0000000000000004 in ?? ()
> #2  0xffffffff802a7d67 in boot (howto=260) at
> /usr/src/sys/kern/kern_shutdown.c:409
> #3  0xffffffff802a8401 in panic (fmt=0xffffff0036f9c720
> "???\206C???\001???????????????%\\\001?????????\200i??????")
>     at /usr/src/sys/kern/kern_shutdown.c:565
> #4  0xffffffff80425f7f in trap_fatal (frame=0xffffff0036f9c720,
> eva=18446742981617878704)
>     at /usr/src/sys/amd64/amd64/trap.c:660
> #5  0xffffffff8042629f in trap_pfault (frame=0xffffffffba0a98d0, usermode=0)
> at /usr/src/sys/amd64/amd64/trap.c:573
> #6  0xffffffff80426553 in trap (frame=
>       {tf_rdi = 1099887576672, tf_rsi = 0, tf_rdx = 0, tf_rcx = -1173710312,
> tf_r8 = -1093564261992, tf_r9 = -1173710304, tf_rax = -1173710293, tf_rbx =
> -1093564262000, tf_rbp = 2, tf_r10 = -1098589288672, tf_r11 = 435836558,
> tf_r12 = 1099887576672, tf_r13 = -1093564262000, tf_r14 = 435835520, tf_r15
> = -1173710312, tf_trapno = 12, tf_addr = 1099887577005, tf_flags =
> -2144607018, tf_err = 0, tf_rip = -2143313264, tf_cs = 8, tf_rflags = 66118,
> tf_rsp = -1173710440, tf_ss = 16})
>     at /usr/src/sys/amd64/amd64/trap.c:352
> #7  0xffffffff8041173b in calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:168
> #8  0xffffffff803fa290 in _vm_map_unlock () at /usr/src/sys/vm/vm_map.c:443
> #9  0xffffffff803fdecc in vm_map_lookup (var_map=0xffffffffba0a9a10,
> vaddr=435835520, fault_typea=2 '\002',
>     out_entry=0xffffffffba0a9a18, object=0xffffff01627d9998,
> pindex=0xffffffffba0a9a20, out_prot=0xffffffffba0a9a2b "",
>     wired=0xffffffffba0a9a2c) at /usr/src/sys/vm/vm_map.c:3074
vm_map.c does not contain a call to vm_map_unlock() at line 3074.

Please rebuild your kernel from scratch. If that does not help, please
show the backtrace from ddb. Also, to speed up the conversation, could
you please run list *(<function>+<offset>) in kgdb for each
<function>+<offset> in the ddb output?
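
To save some typing, the list commands can be generated from the ddb
output. What follows is only a rough sketch in Python, not part of any
FreeBSD tooling. It assumes that ddb trace lines look like
"function() at function+0xNN"; the sample trace and its offsets are
made up for illustration and are not taken from this crash. The names
SAMPLE_DDB_TRACE and kgdb_list_commands are likewise hypothetical.

```python
import re

# Hypothetical ddb "trace" lines. The offsets are invented for
# illustration only; paste the real ddb output here instead.
SAMPLE_DDB_TRACE = """\
_vm_map_unlock() at _vm_map_unlock+0x10
vm_map_lookup() at vm_map_lookup+0x3c
umtx_key_get() at umtx_key_get+0x5e
"""

# Matches the "<function>+<offset>" part that follows " at ".
SYMBOL_OFFSET = re.compile(r"\bat\s+([A-Za-z_]\w*)\+(0x[0-9a-fA-F]+)")


def kgdb_list_commands(trace_text):
    """Return one kgdb 'list *(function+offset)' command per trace line."""
    commands = []
    for line in trace_text.splitlines():
        match = SYMBOL_OFFSET.search(line)
        if match:
            function, offset = match.groups()
            commands.append(f"list *({function}+{offset})")
    return commands


if __name__ == "__main__":
    for command in kgdb_list_commands(SAMPLE_DDB_TRACE):
        print(command)
```

Each printed line can then be pasted at the (kgdb) prompt. Lines that
do not contain a function+offset pair are skipped.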


> #10 0xffffffff802b845e in umtx_key_get (td=0xffffff0036f9c720,
> umtx=0x19fa5280, key=0xffffff01627d9990)
>     at /usr/src/sys/kern/kern_umtx.c:312
> #11 0xffffffff802b8578 in _do_lock (td=0xffffff0036f9c720, umtx=0x19fa5280,
> id=100582, timo=0)
>     at /usr/src/sys/kern/kern_umtx.c:362
> #12 0xffffffff802b99e9 in _umtx_op (td=0xffffff0036f9c720, uap=0x188e6) at
> /usr/src/sys/kern/kern_umtx.c:545
> #13 0xffffffff80426dd1 in syscall (frame=
>       {tf_rdi = 435835520, tf_rsi = 0, tf_rdx = 100582, tf_rcx = 0, tf_r8 =
> 0, tf_r9 = 140737452053060, tf_rax = 454, tf_rbx = 100582, tf_rbp =
> 435835520, tf_r10 = 1, tf_r11 = 582, tf_r12 = 9982128, tf_r13 = 1024, tf_r14
> = 0, tf_r15 = 0, tf_trapno = 12, tf_addr = 1387466752, tf_flags = 0, tf_err
> = 2, tf_rip = 34378206780, tf_cs = 43, tf_rflags = 582, tf_rsp =
> 140737452052808, tf_ss = 35}) at /usr/src/sys/amd64/amd64/trap.c:792
> #14 0xffffffff804118d8 in Xfast_syscall () at
> /usr/src/sys/amd64/amd64/exception.S:270
> #15 0x000000080119ce3c in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
> 
> (kgdb) up 6
> #6  0xffffffff80426553 in trap (frame=
>       {tf_rdi = 1099887576672, tf_rsi = 0, tf_rdx = 0, tf_rcx = -1173710312,
> tf_r8 = -1093564261992, tf_r9 = -1173710304, tf_rax = -1173710293, tf_rbx =
> -1093564262000, tf_rbp = 2, tf_r10 = -1098589288672, tf_r11 = 435836558,
> tf_r12 = 1099887576672, tf_r13 = -1093564262000, tf_r14 = 435835520, tf_r15
> = -1173710312, tf_trapno = 12, tf_addr = 1099887577005, tf_flags =
> -2144607018, tf_err = 0, tf_rip = -2143313264, tf_cs = 8, tf_rflags = 66118,
> tf_rsp = -1173710440, tf_ss = 16})
>     at /usr/src/sys/amd64/amd64/trap.c:352
> 352                             (void) trap_pfault(&frame, FALSE);
> 
> (kgdb) up
> #7  0xffffffff8041173b in calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:168
> 168             call    trap
> Current language:  auto; currently asm
> (kgdb) up
> #8  0xffffffff803fa290 in _vm_map_unlock () at /usr/src/sys/vm/vm_map.c:443
> 443                     _sx_xunlock(&map->lock, file, line);
> Current language:  auto; currently c
> (kgdb) up
> #9  0xffffffff803fdecc in vm_map_lookup (var_map=0xffffffffba0a9a10,
> vaddr=435835520, fault_typea=2 '\002',
>     out_entry=0xffffffffba0a9a18, object=0xffffff01627d9998,
> pindex=0xffffffffba0a9a20, out_prot=0xffffffffba0a9a2b "",
>     wired=0xffffffffba0a9a2c) at /usr/src/sys/vm/vm_map.c:3074
> 3074            vm_map_lock_read(map);
> (kgdb) list
> 3069    RetryLookup:;
> 3070            /*
> 3071             * Lookup the faulting address.
> 3072             */
> 3073
> 3074            vm_map_lock_read(map);
> 3075    #define RETURN(why) \
> 3076                    { \
> 3077                    vm_map_unlock_read(map); \
> 3078                    return (why); \
> (kgdb) p map
> $1 = 0x10016688660
> (kgdb) down
> #8  0xffffffff803fa290 in _vm_map_unlock () at /usr/src/sys/vm/vm_map.c:443
> 443                     _sx_xunlock(&map->lock, file, line);
> (kgdb) list
> 438     {
> 439
> 440             if (map->system_map)
> 441                     _mtx_unlock_flags(&map->system_mtx, 0, file, line);
> 442             else
> 443                     _sx_xunlock(&map->lock, file, line);
> 444     }
> 445
> 446     void
> 447     _vm_map_lock_read(vm_map_t map, const char *file, int line)
> 
> 
> Thanks,
> Francesco Ciocchetti
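
kgdb prints the trap frame registers as signed decimals, which makes
them hard to compare with the hex addresses in the panic message. The
following is a small Python sketch, not part of the original report,
that converts the values quoted above. The helper name to_u64 is made
up for this example; the numbers are copied from the trap frame, the
panic message and the "p map" output.

```python
MASK64 = (1 << 64) - 1


def to_u64(value):
    """Reinterpret a signed integer as an unsigned 64-bit value."""
    return value & MASK64


# Values copied from frame #6 (trap frame) in the backtrace.
tf_rip = -2143313264
tf_addr = 1099887577005
tf_rdi = 1099887576672

# Values copied from the panic message and from "p map" in frame #9.
instruction_pointer = 0xffffffff803fa290
fault_address = 0x100166887ad
map_pointer = 0x10016688660

if __name__ == "__main__":
    print(hex(to_u64(tf_rip)))         # 0xffffffff803fa290
    print(hex(tf_addr))                # 0x100166887ad
    print(hex(tf_rdi))                 # 0x10016688660
    print(hex(tf_addr - map_pointer))  # 0x14d
```

So tf_rip is the reported instruction pointer, tf_addr is the reported
fault virtual address, tf_rdi holds the same value as map, and the
faulting address lies 0x14d bytes past map. What that offset
corresponds to inside the map structure is not checked here.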
