Amd64 Unstable Areca

KillFill pneumann at gmail.com
Sat Mar 24 01:13:08 UTC 2007


Hello...

On Fri, 23-03-2007 at 11:12 +1100, Jan Mikkelsen wrote: 
> Hi,
> 
> Phillip Neumann wrote:
> > My amd64 box is not very stable.
> > Its hardware includes an Areca 1210 card, which is affected by the
> > 6.2-RELEASE erratum (crash under high load).
> > 
> > Last week or so, I saw a commit that fixed the Areca bugs, so I
> > updated the system.
> > 
> > I can still see the machine crashing under load.
> 
> How heavy is the load?  I can't make 6-STABLE crash, but I could make
> 6.2-RELEASE crash.  My guess is that you have filesystem corruption
> introduced with the earlier driver which is now causing problems even
> though the driver now works.
> 

I'm attaching a new backtrace. When the system crashed, iostat was
actually reporting a fairly light load:

   0    0 21.80 171  3.64   0.00   0  0.00  16.00   2  0.03   6  0  3  0 91
   0    0 25.00  92  2.24   0.00   0  0.00   0.00   0  0.00   7  0  6  0 87
   0    0 13.99 143  1.95   8.00   5  0.04   0.00   0  0.00   3  0  7  0 90
   0    0 18.25 331  5.89   0.00   0  0.00  16.00   1  0.02   0  0  1  1 97
   0    0  9.84 766  7.36   9.00   8  0.07  23.32  74  1.68   1  0  3  1 95
   0    0  5.05 551  2.72   9.00   8  0.07  11.31 113  1.25   4  0  1  0 94
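
The iostat rows are easier to compare once they are parsed. The following is only a sketch: the message does not include the iostat header, so the column layout is an assumption (the default FreeBSD layout of two tty columns, then KB/t, tps and MB/s for each of three disks, then five CPU columns ending in idle). The function name `parse_row` and the dictionary keys are made up for illustration.

```python
# Rows copied from the message, with the wrapped last column rejoined.
ROWS = """\
0 0 21.80 171 3.64 0.00 0 0.00 16.00 2 0.03 6 0 3 0 91
0 0 25.00 92 2.24 0.00 0 0.00 0.00 0 0.00 7 0 6 0 87
0 0 13.99 143 1.95 8.00 5 0.04 0.00 0 0.00 3 0 7 0 90
0 0 18.25 331 5.89 0.00 0 0.00 16.00 1 0.02 0 0 1 1 97
0 0 9.84 766 7.36 9.00 8 0.07 23.32 74 1.68 1 0 3 1 95
0 0 5.05 551 2.72 9.00 8 0.07 11.31 113 1.25 4 0 1 0 94
"""


def parse_row(line, ndisks=3):
    """Split one iostat row into per-disk and CPU figures.

    ASSUMPTION: columns are tin, tout, then (KB/t, tps, MB/s) per disk,
    then us, ni, sy, in, id. The header is not in the original message.
    """
    fields = line.split()
    expected = 2 + 3 * ndisks + 5
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    disks = []
    for i in range(ndisks):
        kb_t, tps, mb_s = fields[2 + 3 * i : 5 + 3 * i]
        disks.append(
            {"kb_per_xfer": float(kb_t), "tps": int(tps), "mb_per_s": float(mb_s)}
        )
    idle = int(fields[-1])
    return {"disks": disks, "cpu_idle": idle}


samples = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

for s in samples:
    total_mb = sum(d["mb_per_s"] for d in s["disks"])
    total_tps = sum(d["tps"] for d in s["disks"])
    print(f"idle={s['cpu_idle']:3d}%  tps={total_tps:4d}  MB/s={total_mb:5.2f}")
```

Under that assumed layout, the CPU is at least 87% idle in every sample, which matches the description of a light load at the time of the crash.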


When it crashed, a jailed Apache was the most active process on the
system...

The real load is caused by tinderbox, which uses the disks and one of
the two CPUs in the system.

Are you suggesting that I newfs the filesystems?

I actually used the plain 6.2 install disks to do that...

> Have you done an fsck in single user mode, not a background fsck?
>  

Yes, sometimes (after a panic) I need to run fsck in single-user mode...


> > Sometimes (under load) I see this message:
> > Interrupt storm detected on "swi2:"; throttling interrupt source
> 
> I see this too.  It seems to be benign.
>  
Okay.

> Regards,
> 
> Jan Mikkelsen

How can I get more information?

Any tips are welcome. Thanks!

-- 
KillFill <pneumann at gmail.com>
-------------- next part --------------
[root at worm /usr/obj/usr/src/sys/WORM]# kgdb kernel.debug /var/crash/vmcore.0
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:
/usr/local: bad dir ino 363554 at offset 512: mangled entry
panic: ufs_dirbad: bad dir
cpuid = 0
Uptime: 19h46m33s
Dumping 2047 MB (2 chunks)
  chunk 0: 1MB (156 pages) ... ok
  chunk 1: 2047MB (524016 pages) 2031 2015 1999 1983 1967 1951 1935 1919 1903 1887 1871 1855 1839 1823 1807 1791 1775 1759 1743 1727 1711 1695 1679 1663 1647 1631 1615 1599 1583 1567 1551 1535 1519 1503 1487 1471 1455 1439 1423 1407 1391 1375 1359 1343 1327 1311 1295 1279 1263 1247 1231 1215 1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039 1023 1007 991 975 959 943 927 911 895 879 863 847 831 815 799 783 767 751 735 719 703 687 671 655 639 623 607 591 575 559 543 527 511 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15

#0  doadump () at pcpu.h:172
172             __asm __volatile("movq %%gs:0,%0" : "=r" (td));
(kgdb) bt
#0  doadump () at pcpu.h:172
#1  0x0000000000000004 in ?? ()
#2  0xffffffff804085e7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#3  0xffffffff80408c81 in panic (fmt=0xffffff005e027000 "\b�L^")
    at /usr/src/sys/kern/kern_shutdown.c:565
#4  0xffffffff8059b230 in ufs_dirbad (ip=0x0, offset=0, how=0x0)
    at /usr/src/sys/ufs/ufs/ufs_lookup.c:599
#5  0xffffffff8059b657 in ufs_lookup (ap=0xffffffffb76867a0)
    at /usr/src/sys/ufs/ufs/ufs_lookup.c:287
#6  0xffffffff806807fa in VOP_CACHEDLOOKUP_APV (vop=0x0, a=0x0) at vnode_if.c:150
#7  0xffffffff80465bd5 in vfs_cache_lookup (ap=0x0) at vnode_if.h:82
#8  0xffffffff8068153d in VOP_LOOKUP_APV (vop=0xffffffff808d4540, a=0xffffffffb7686890)
    at vnode_if.c:99
#9  0xffffffff8046a555 in lookup (ndp=0xffffffffb7686990) at vnode_if.h:56
#10 0xffffffff8046b285 in namei (ndp=0xffffffffb7686990)
    at /usr/src/sys/kern/vfs_lookup.c:216
#11 0xffffffff8047c4b4 in kern_lstat (td=0xffffff005e027000, path=0x0, 
    pathseg=UIO_USERSPACE, sbp=0xffffffffb7686af0) at /usr/src/sys/kern/vfs_syscalls.c:2141
#12 0xffffffff8047c9a7 in lstat (td=0x0, uap=0xffffffffb7686bc0)
    at /usr/src/sys/kern/vfs_syscalls.c:2124
#13 0xffffffff8062ae91 in syscall (frame=
      {tf_rdi = 5275880, tf_rsi = 5275760, tf_rdx = 0, tf_rcx = 0, tf_r8 = -140737483074239, tf_r9 = 128, tf_rax = 190, tf_rbx = 5275648, tf_rbp = 5275760, tf_r10 = 0, tf_r11 = 0, tf_r12 = 5259264, tf_r13 = 0, tf_r14 = 5271552, tf_r15 = 0, tf_trapno = 12, tf_addr = 5275744, tf_flags = 0, tf_err = 2, tf_rip = 34367048508, tf_cs = 43, tf_rflags = 514, tf_rsp = 140737488349128, tf_ss = 35}) at /usr/src/sys/amd64/amd64/trap.c:803
#14 0xffffffff80615948 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:270
#15 0x00000008006f8b3c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) 
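
The kernel message just before the panic names the affected filesystem and inode, which is useful when deciding which filesystem to check with fsck in single-user mode. As a minimal sketch, the line can be split into its parts as follows. The regular expression is written only against the single message shown in this dump, and the names `BAD_DIR_RE` and `parse_bad_dir` are hypothetical; other kernels may format the message differently.

```python
import re

# Pattern written against the one message seen in this crash dump:
#   "/usr/local: bad dir ino 363554 at offset 512: mangled entry"
# ASSUMPTION: the mount point contains no whitespace.
BAD_DIR_RE = re.compile(
    r"^(?P<mount>\S+): bad dir ino (?P<ino>\d+) "
    r"at offset (?P<offset>\d+): (?P<reason>.+)$"
)


def parse_bad_dir(line):
    """Return mount point, inode, offset and reason, or None if no match."""
    m = BAD_DIR_RE.match(line.strip())
    if m is None:
        return None
    return {
        "mount": m.group("mount"),
        "ino": int(m.group("ino")),
        "offset": int(m.group("offset")),
        "reason": m.group("reason"),
    }


info = parse_bad_dir("/usr/local: bad dir ino 363554 at offset 512: mangled entry")
print(info)
```

Here the mount point is /usr/local and the damaged directory is inode 363554, so /usr/local is the filesystem to check first.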


More information about the freebsd-stable mailing list