5.3-RELEASE TODO
Kris Kennaway
kris at FreeBSD.org
Tue Sep 14 23:39:52 PDT 2004
On Thu, Sep 02, 2004 at 11:59:47AM -0400, Ken Smith wrote:
> On Thu, Jul 15, 2004 at 03:04:47PM -0700, Kris Kennaway wrote:
>
> > These are the bugs I'm currently tracking (those I can remember right
> > now, at least)
All of these issues except for the last one seem to be resolved for me
now. I haven't tested the last one (memory tuning on 4GB machines)
because I have tuned my kernel configs to avoid the problem, but I can
remove those changes and see if the problems persist.
I am now seeing a couple of other problems:
* softupdates stack overflow (previously reported; I've now hit this
on two machines). I might be able to hack around it by increasing
KSTACK_PAGES, but that doesn't help others. phk could not think of
any way to fix the unboundedness of the dependency chains, and kirk
replied saying he's on vacation.
* I had an apparent scheduler hang tonight (4BSD): the only process
that is running has a trace including sched_switch, and nothing else
apart from the idle tasks is running or runnable. I'll try to post
more details tomorrow.
* There may be a problem with swapping: I had an extremely weird
sequence of errors (binaries aborting, spurious "missing
/libexec/ld-elf.so.1") on pointyhat at around the time it started
swapping. I don't know if swapping was the cause or another symptom
of some other problem. I'll try to reproduce on another machine.
* I was able to break into KDB a few times on pointyhat to try to
diagnose this problem, but eventually it hung trying to enter KDB.
Hangs on KDB entry happen fairly frequently (perhaps only on SMP machines?).
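On the softupdates item above: the general shape of the problem is that walking a dependency chain recursively costs one stack frame per link, so an unbounded chain means unbounded stack use. The following is only a userland sketch of that idea, not softupdates code. The names struct dep, flush_recursive and flush_iterative are made up for illustration, real dependencies form a graph rather than a single chain, and a kernel fix could not simply allocate memory like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a dependency record; not the real softupdates types. */
struct dep {
	struct dep *needs;	/* must be flushed before this one, or NULL */
	int flushed;		/* sequence number assigned when flushed, 0 = not yet */
};

/* Recursive form: one stack frame per link, so stack depth follows chain length. */
static int
flush_recursive(struct dep *d, int seq)
{
	assert(seq >= 0);
	if (d == NULL || d->flushed)
		return (seq);
	seq = flush_recursive(d->needs, seq);
	d->flushed = ++seq;
	return (seq);
}

/* Iterative form: pending records are kept on a heap-allocated stack instead. */
static int
flush_iterative(struct dep *d, int seq)
{
	struct dep **stack = NULL, **tmp;
	size_t cap = 0, len = 0;

	assert(seq >= 0);
	for (; d != NULL && !d->flushed; d = d->needs) {
		if (len == cap) {
			cap = cap ? cap * 2 : 16;
			tmp = realloc(stack, cap * sizeof(*stack));
			if (tmp == NULL) {
				free(stack);
				return (-1);	/* out of memory; nothing flushed */
			}
			stack = tmp;
		}
		stack[len++] = d;
	}
	/* Flush in reverse order of discovery: deepest dependency first. */
	while (len > 0)
		stack[--len]->flushed = ++seq;
	free(stack);
	return (seq);
}
```

Both functions flush the deepest dependency first and return the last sequence number used. Only the recursive one grows the call stack with the chain, which is why raising KSTACK_PAGES only moves the limit rather than removing it.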
I think there are some other bugs I'm forgetting right now.
Kris
> > * SMP is unusable for me because of the following frequent panic
> > (actually a panic and another kernel printf interleaved). Here is the
> > untangled version:
> >
> > panic: APIC: Previous IPI is stuck
> > cpuid = 0;
> > Debugger("panic")
> >
> > pmap_lazyfix: spun for 50000000
> >
> > jhb says:
> >
> > > Seems the two CPUs are deadlocked waiting on each other. The first sent a
> > > pmap_lazyfixup IPI to the second but the second has interrupts disabled as it
> > > is trying to send an IPI as well.
> >
> > He suggested a patch, but it did not fix the problem.
>
> Was this fixed with the IPI patches done before BETA2?
>
> > * linprocfs
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = 00
> > fault virtual address = 0x8
> > fault code = supervisor read, page not present
> > instruction pointer = 0x8:0xc04e1870
> > stack pointer = 0x10:0xf11e6b50
> > frame pointer = 0x10:0xf11e6b6c
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, def32 1, gran 1
> > processor eflags = interrupt enabled, resume, IOPL = 0
> > current process = 23938 (mtree)
> > kernel: type 12 trap, code=0
> > Stopped at pfs_getattr+0x130: movl 0x8(%eax),%eax
> > db> trace
> > pfs_getattr(f11e6b78,c06fda00,cf397b2c,f11e6b98,d23e8a80) at pfs_getattr+0x130
> > vn_stat(cf397b2c,f11e6c80,d23e8a80,0,c5eb0c60) at vn_stat+0x4f
> > lstat(c5eb0c60,f11e6d14,2,2,297) at lstat+0x6a
> > syscall(2f,2f,2f,805a200,805a248) at syscall+0x217
> > Xint0x80_syscall() at Xint0x80_syscall+0x1f
> > --- syscall (190, FreeBSD ELF32, lstat), eip = 0x280ac664, esp = 0xbfbf7594, ebp = 0xbfbf7620 ---
> >
> > dosirak# addr2line -e kernel.debug 0xc04e1870
> > /usr/src/sys/i386/compile/DOSIRAK/../../../fs/pseudofs/pseudofs_vnops.c:200
> >
> > [...]
> >         if (pvd->pvd_pid != NO_PID) {
> >                 if ((proc = pfind(pvd->pvd_pid)) == NULL)
> >                         PFS_RETURN (ENOENT);
> > -->             vap->va_uid = proc->p_ucred->cr_ruid;
> >
> > rwatson has a patch that works around this particular null pointer
> > deref, but the underlying cause is not addressed.
>
> A patch to pseudofs_vnops.c was made that checks to make sure what pfind()
> returned was "usable". Did that solve this problem? Looks like that
> patch went in after you reported this because it's immediately above
> line 200 you show above.
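The kind of check described here can be sketched in userland as follows. The fault address of 0x8 and the faulting instruction movl 0x8(%eax),%eax are consistent with proc being valid but proc->p_ucred being NULL, assuming cr_ruid sits at offset 8 in the credential structure. This is a reading of the trace, not something confirmed in the thread. The mock_proc and mock_ucred types and the getattr_uid helper are hypothetical; the patch that was actually committed may test something different.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mock types for illustration only; the kernel's struct proc and struct ucred differ. */
struct mock_ucred {
	unsigned cr_ref;
	unsigned cr_uid;
	unsigned cr_ruid;	/* third 32-bit field, i.e. offset 8 in this mock */
};

struct mock_proc {
	struct mock_ucred *p_ucred;	/* may be NULL while a process is half set up */
};

/*
 * Return 0 and store the real uid, or ENOENT if the process is missing
 * or has no credentials attached yet.
 */
static int
getattr_uid(const struct mock_proc *proc, unsigned *uid_out)
{
	assert(uid_out != NULL);
	if (proc == NULL)
		return (ENOENT);	/* lookup found nothing */
	if (proc->p_ucred == NULL)
		return (ENOENT);	/* found, but not yet usable */
	*uid_out = proc->p_ucred->cr_ruid;
	return (0);
}
```

A check like this avoids the fault but, as noted in the original report, does not explain why a process without credentials was visible to the lookup in the first place.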
>
> > * ULE has lots of problems (poor performance on HTT, unable to disable
> > HTT, incorrect load average reporting on SMP machines, ...). Should
> > be turned off until an active maintainer is found.
>
> re@ is discussing this now, it looks likely we will shift to 4BSD soon.
>
> > * ---
> > Fatal trap 12: page fault while in kernel mode
> > fault virtual address = 0x104
> > fault code = supervisor read, page not present
> > instruction pointer = 0x8:0xc058a8cf
> > stack pointer = 0x10:0xdcb34cc4
> > frame pointer = 0x10:0xdcb34cec
> > code segment = base 0x0, limit 0xfffff, type 0x1b
> > = DPL 0, pres 1, def32 1, gran 1
> > processor eflags = resume, IOPL = 0
> > current process = 50 (schedcpu)
> > trap number = 12
> > panic: page fault
> >
> > syncing disks, buffers remaining... panic: mi_switch: switch in a critical section
> >
> > addr2line says the panic was in kern/sched_4bsd.c:327
> >
> >         /*
> >          * The kse slptimes are not touched in wakeup
> >          * because the thread may not HAVE a KSE.
> >          */
> >         if (ke->ke_state == KES_ONRUNQ) {
> >                 awake = 1;
> >                 ke->ke_flags &= ~KEF_DIDRUN;
> > --->    } else if ((ke->ke_state == KES_THREAD) &&
> >             (TD_IS_RUNNING(ke->ke_thread))) {
> >                 awake = 1;
> >
> > gdb -k got confused and couldn't make anything out of the backtrace.
>
> The code you quote above hasn't changed recently but a few kse related
> fixes have gone in recently if I recall correctly. Is this one still
> biting you?
>
> > * Machines with 4GB RAM do not auto-tune kernel memory parameters
> > optimally and easily panic under load, with a panic message that
> > gives no indication of what may be wrong or how to fix it.
>
> Work was done on that recently-ish, do you know off hand if that fixed
> what you were seeing?
>
> Thanks...
>
> --
> Ken Smith
> - From there to here, from here to | kensmith at cse.buffalo.edu
> there, funny things are everywhere. |
> - Theodore Geisel |
--
In God we Trust -- all others must submit an X.509 certificate.
-- Charles Forsythe <forsythe at alum.mit.edu>