kern/84389: 4.11-STABLE stuck in kernel-space?

Mark Blackman mark at exonetric.com
Mon Aug 1 16:57:12 GMT 2005


On 31 Jul 2005, at 22:55, Marc G. Fournier wrote:

>
> 'k, I don't know how much RAM you have on this machine ... but, on  
> all of ours, we have 4G, and to fix *our* vnode issue, I have to  
> build the kernel/world with:

1GB ram.

>
> CFLAGS= -O -mpentium -pipe -g -DKVA_PAGES=512
> COPTFLAGS= -O -mpentium -pipe -DKVA_PAGES=512
>
> and I have /etc/sysctl.conf set up with:
>
> kern.maxvnodes=522240
>
> Since it's dying at about the same time each day, I'm going
> to guess that it's something like the 'find' that runs for the
> various periodic "security" checks that is sucking back the vnodes
> for you ...
>
> This is all speculation, but this sounds like exactly everything I
> went through when I started to load up machines with jails :)

It appears we are in a very similar line of business. :)

Is there any data I can provide to pin this down? It didn't happen
today, FWIW.

From my reading of the core dump, it looks like it might have been
stuck on a file write in a newly configured jail. DDB froze it in a
postfix 'postdrop' process. Assuming it was actually stuck there, then
that write was the bad guy.
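
For anyone cross-checking the quoted backtrace, a few of the raw values
can be decoded with a short script. This is only a sketch: the helper
name to_u32 is made up for illustration, and it assumes gdb printed the
32-bit register values as signed integers and that the 16384-byte
blksize shown in frame #13 applies to the file being written.

```python
import errno
import os


def to_u32(value: int) -> int:
    """Reinterpret a signed 32-bit value (as gdb prints it) as unsigned."""
    return value & 0xFFFFFFFF


# error = 27 in the ffs_write frame (#15); the text is 'File too large'
# on FreeBSD and Linux, matching the message in the dofilewrite frame.
err = 27
print(errno.errorcode.get(err), "-", os.strerror(err))

# tf_eip from the trap frame (#8), printed as a signed integer.
tf_eip = -1070466719
print(hex(to_u32(tf_eip)))  # 0xc031f961, the address shown for frame #9

# ffs_truncate (#14): length=196581, blksize=16384 (assumed from #13).
length, blksize = 196581, 16384
lbn, offset = divmod(length, blksize)
print(lbn, offset)            # 11 16357, matching lbn and offset in #14
print((lbn + 1) * blksize)    # 196608, matching osize in #14
```

If those assumptions hold, the write appears to have extended the file
to the end of its twelfth block, failed with EFBIG, and was in the
middle of truncating back to the old length when DDB was entered.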

Cheers,
Mark

>
> hope this helps ...
>
>  On Sun, 31 Jul 2005, Mark Blackman wrote:
>
>
>>
>>
>>> Number:         84389
>>> Category:       kern
>>> Synopsis:       4.11-STABLE stuck in kernel-space?
>>> Confidential:   no
>>> Severity:       serious
>>> Priority:       medium
>>> Responsible:    freebsd-bugs
>>> State:          open
>>> Quarter:
>>> Keywords:
>>> Date-Required:
>>> Class:          sw-bug
>>> Submitter-Id:   current-users
>>> Arrival-Date:   Sun Jul 31 14:00:22 GMT 2005
>>> Closed-Date:
>>> Last-Modified:
>>> Originator:     Mark Blackman
>>> Release:        4.11-Stable
>>> Organization:
>>> Environment:
>>>
>> FreeBSD varadero.exonetric.net 4.11-STABLE FreeBSD 4.11-STABLE #2: Sat Apr 23 22:20:21 BST 2005     root at varadero.exonetric.net:/usr/obj/usr/src/sys/MAIN-NOSMP  i386
>>
>>> Description:
>>>
>> The system is unresponsive to any network or console userland input;
>> however, it does respond to ping, and I can break into ddb via the
>> serial console. I forced a panic from the debugger and rebooted. I
>> have a full debug kernel and the full core dump. I've pasted in a
>> full backtrace for reference and can answer any question about this
>> core dump. It looks to me like some kind of issue on a vnode file
>> write. As a curious side note, this condition has manifested itself
>> at approximately, if not exactly, the same time every day when it
>> has happened (09:20-09:25).
>> This is the first recurrence after several months of dormancy.
>>
>> The key features of this system are several jails and vnodes (~15  
>> each).
>>
>> backtrace full follows..
>>
>> #0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:487
>>        error = 0
>> #1  0xc01b940b in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:316
>>        howto = 256
>> #2  0xc01b9849 in panic (fmt=0xc0330744 "from debugger") at /usr/src/sys/kern/kern_shutdown.c:595
>>        fmt = 0xc0330744 "from debugger"
>>        bootopt = 256
>>        buf = "from debugger", '\000' <repeats 242 times>
>> #3  0xc0144739 in db_panic (addr=-1070466719, have_addr=0,  
>> count=-1, modif=0xe3711ab0 "")
>>    at /usr/src/sys/ddb/db_command.c:435
>> No locals.
>> #4  0xc01446d9 in db_command (last_cmdp=0xc0387eb8,  
>> cmd_table=0xc0387cf8, aux_cmd_tablep=0xc03aeb18)
>>    at /usr/src/sys/ddb/db_command.c:333
>>        cmd_table = (struct command *) 0x0
>>        aux_cmd_tablep = (struct command **) 0xc03aeb18
>>        cmd = (struct command *) 0xc0330728
>>        t = 0
>>        modif = "\000\032q?h\f2?\200%\000\000?\003\000\000\000\eq?? 
>> \032q?h\f2??\003\000\000?\032q?\f2??\003\000\000?\003\000\000\r\000 
>> \000\000?\0172?@?h\000\000\eq??\003\000\000\020\eq??\\\024??\f3?$\\ 
>> \024?h?>?\200\025;??W\024?\200\025;?@\017;?x\000\000\000\200\215}? 
>> \000\034q?4\eq?"
>>        addr = -1070466719
>>        count = -1
>>        have_addr = 0
>>        result = 0
>> #5  0xc014479e in db_command_loop () at /usr/src/sys/ddb/db_command.c:457
>> No locals.
>> #6  0xc01468db in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_trap.c:71
>>        bkpt = 0
>> #7  0xc02fdbae in kdb_trap (type=3, code=0, regs=0xe3711bb8) at /usr/src/sys/i386/i386/db_interface.c:158
>>        ddb_mode = 1
>> #8  0xc030db40 in trap (frame={tf_fs = -479133680, tf_es =  
>> -479133680, tf_ds = -1070792688, tf_edi = -811750548,
>>      tf_esi = -1022605312, tf_ebp = -479126508, tf_isp =  
>> -479126556, tf_ebx = -811750588, tf_edx = 1017, tf_ecx = 3,
>>      tf_eax = 0, tf_trapno = 3, tf_err = 0, tf_eip = -1070466719,  
>> tf_cs = 8, tf_eflags = 70, tf_esp = -811750588,
>>      tf_ss = 0}) at /usr/src/sys/i386/i386/trap.c:592
>>        p = (struct proc *) 0xe37d8d80
>>        sticks = 13843839841289182128
>>        i = 0
>>        ucode = 0
>>        type = 3
>>        code = 0
>>        eva = 0
>> #9  0xc031f961 in siointr1 (com=0xc30c4800) at machine/cpufunc.h:67
>>        com = (struct com_s *) 0xc30c4800
>>        line_status = 249 '?'
>>        modem_status = 63 '?'
>>        ioptr = (u_char *) 0x68c040 <Address 0x68c040 out of bounds>
>>        recv_data = 0 '\000'
>>        int_ctl = 0 '\000'
>>        int_ctl_new = 0 '\000'
>>        tc = (struct timecounter *) 0x68c040
>>        count = 0
>> #10 0xc031f89b in siointr (arg=0xc30c4800) at /usr/src/sys/isa/sio.c:1947
>> No locals.
>> #11 0xc02fee56 in Xfastintr4 ()
>> No symbol table info available.
>> #12 0xc02d16a9 in ufs_vnoperate (ap=0xe3711c88) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2376
>>        ap = (struct vop_generic_args *) 0x0
>> #13 0xc01e7b9d in vtruncbuf (vp=0xe195df80, cred=0xc46e5900,  
>> p=0xe37d8d80, length=196581, blksize=16384)
>>    at vnode_if.h:1193
>>        a = {a_desc = 0xc0382880, a_vp = 0xe195df80, a_bp =  
>> 0xcf9dab44}
>>        vp = (struct vnode *) 0x0
>>        bp = (struct buf *) 0xcf9dab44
>>        blksize = 0
>>        bp = (struct buf *) 0xcf9dab44
>>        nbp = (struct buf *) 0xcf9dab6c
>>        s = 0
>>        anyfreed = 0
>>        trunclbn = 12
>> #14 0xc02c139f in ffs_truncate (vp=0xe195df80, length=196581,  
>> flags=0, cred=0xc46e5900, p=0xe37d8d80)
>>    at /usr/src/sys/ufs/ffs/ffs_inode.c:314
>>        flags = -510271616
>>        ovp = (struct vnode *) 0xe195df80
>>        lastblock = 11
>>        oip = (struct inode *) 0xc4854300
>>        bn = 0
>>        lbn = 11
>>        lastiblock = {-1, -1, -1}
>>        indir_lbn = {0, 0, -803029019}
>>        oldblks = {92528, 92536, 92544, 92552, 92560, 92568, 92576,  
>> 92584, 92592, 92600, 92608, 92616, 0, 0, 0}
>>        newblks = {92528, 92536, 92544, 92552, 92560, 92568, 92576,  
>> 92584, 92592, 92600, 92608, 92616, 0, 0, 0}
>>        fs = (struct fs *) 0xc462c000
>>        bp = (struct buf *) 0xcf9490c8
>>        offset = 16357
>>        size = -510271616
>>        level = -510271616
>>        count = 16384
>>        nblocks = 32
>>        blocksreleased = 0
>>        i = 11
>>        aflags = 1
>>        error = 0
>>        allerror = 27
>>        osize = 196608
>> #15 0xc02ca002 in ffs_write (ap=0xe3711e64) at /usr/src/sys/ufs/ufs/ufs_readwrite.c:598
>>        vp = (struct vnode *) 0xe195df80
>>        uio = (struct uio *) 0xe3711ed4
>>        ip = (struct inode *) 0xc4854300
>>        fs = (struct fs *) 0xc462c000
>>        bp = (struct buf *) 0x0
>>        p = (struct proc *) 0xe3711ed4
>>        lbn = -479125804
>>        osize = 196581
>>        seqcount = 127
>>        blkoffset = 0
>>        error = 27
>>        extended = 1
>>        flags = 2130706433
>>        ioflag = 8323073
>>        resid = 87
>>        size = 0
>>        xfersize = 60
>>        object = 0xe188b3f4
>> #16 0xc01ef322 in vn_write (fp=0xc431f540, uio=0xe3711ed4,  
>> cred=0xc46e5900, flags=0, p=0xe37d8d80) at vnode_if.h:363
>>        a = {a_desc = 0xc0382100, a_vp = 0xe195df80, a_uio =  
>> 0xe3711ed4, a_ioflag = 8323073, a_cred = 0xc46e5900}
>>        vp = (struct vnode *) 0xe195df80
>>        uio = (struct uio *) 0xe3711ed4
>>        ioflag = 8323073
>>        cred = (struct ucred *) 0xc46e5900
>>        fp = (struct file *) 0xc431f540
>>        vp = (struct vnode *) 0xe195df80
>>        error = 8323073
>>        ioflag = 8323073
>> #17 0xc01c8899 in dofilewrite (p=0xe37d8d80, fp=0xc431f540, fd=2,  
>> buf=0x8061e40, nbyte=87, offset=-1, flags=0)
>>    at /usr/src/sys/sys/file.h:163
>>        error = -478311040
>>        fp = (struct file *) 0xc431f540
>>        cred = (struct ucred *) 0x0
>>        p = (struct proc *) 0xe37d8d80
>>        fp = (struct file *) 0xc431f540
>>        offset = 0
>>        auio = {uio_iov = 0xe3711eac, uio_iovcnt = 1, uio_offset =  
>> 196608, uio_resid = 60, uio_segflg = UIO_USERSPACE,
>>  uio_rw = UIO_WRITE, uio_procp = 0xe37d8d80}
>>        aiov = {iov_base = 0x8061e5b "ue_enter: create file maildrop/381079.82030: File too large\n", iov_len = 60}
>>        cnt = 87
>>        error = -478311040
>>        ktriov = {iov_base = 0x0, iov_len = 0}
>>        ktruio = {uio_iov = 0xc01bebb0, uio_iovcnt = -479125788,  
>> uio_offset = 1240175767120, uio_resid = -1070294618,
>>  uio_segflg = 134656264, uio_rw = UIO_READ, uio_procp = 0x0}
>>        didktr = 0
>> #18 0xc01c8752 in write (p=0xe37d8d80, uap=0xe3711f80) at /usr/src/sys/kern/sys_generic.c:329
>>        p = (struct proc *) 0xe37d8d80
>>        uap = (struct write_args *) 0xe3711f80
>>        fp = (struct file *) 0xc431f540
>>        error = -479125632
>> #19 0xc030e4a1 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds =  
>> 47, tf_edi = 2, tf_esi = 0, tf_ebp = -1077937328,
>>      tf_isp = -479125548, tf_ebx = 134614320, tf_edx = 134614320,  
>> tf_ecx = 134618688, tf_eax = 4, tf_trapno = 7,
>>      tf_err = 2, tf_eip = 672270488, tf_cs = 31, tf_eflags = 659,  
>> tf_esp = -1077937372, tf_ss = 47})
>>    at /usr/src/sys/i386/i386/trap.c:1175
>>        params = 0xbfbffb28 "\002"
>>        i = 0
>>        callp = (struct sysent *) 0xc038f9a0
>>        p = (struct proc *) 0xe37d8d80
>>        orig_tf_eflags = 659
>>        sticks = 34
>>        error = 0
>>        narg = 3
>>        args = {2, 134618688, 87, 0, 0, 0, 0, 0}
>>        have_mplock = 1
>>        code = 4
>> #20 0xc02fea85 in Xint0x80_syscall ()
>> No symbol table info available.
>> #21 0x8054550 in ?? ()
>> No symbol table info available.
>> #22 0x8054e80 in ?? ()
>> No symbol table info available.
>> #23 0x805292e in ?? ()
>> No symbol table info available.
>> #24 0x805277f in ?? ()
>> No symbol table info available.
>> #25 0x80526e3 in ?? ()
>> No symbol table info available.
>> #26 0x80524bb in ?? ()
>> No symbol table info available.
>> #27 0x804d3ba in ?? ()
>> No symbol table info available.
>> #28 0x804b07a in ?? ()
>> No symbol table info available.
>> #29 0x8049de2 in ?? ()
>> No symbol table info available.
>> #30 0x80499ea in ?? ()
>> No symbol table info available.
>>
>>
>>
>>
>>> How-To-Repeat:
>>>
>> not known
>>
>>> Fix:
>>>
>> not known
>>
>>> Release-Note:
>>> Audit-Trail:
>>> Unformatted:
>>>
>
> ----
> Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
> Email: scrappy at hub.org           Yahoo!: yscrappy              ICQ: 7615664
>


