Infinite loop bug in libc_r on 4.x with condition variables and signals

Daniel Eischen deischen at freebsd.org
Thu Oct 28 15:43:45 PDT 2004


On Thu, 28 Oct 2004, Daniel Eischen wrote:

> On Thu, 28 Oct 2004, John Baldwin wrote:
>
> > On Wednesday 27 October 2004 06:30 pm, Daniel Eischen wrote:
> > > On Wed, 27 Oct 2004, John Baldwin wrote:
> > > >
> > > > FWIW, we are having (I think) the same problem on 5.3 with libpthread.
> > > > The panic there is in the mutex code about an assertion failing because a
> > > > thread is on a syncq when it is not supposed to be.
> > >
> > > David and I recently fixed some races in pthread_join() and
> > > pthread_exit() in -current libpthread.  Don't know if those
> > > were responsible...
> > >
> > > Here's a test program that shows correct behavior with both
> > > libc_r and libpthread in -current.
> >
> > We've started testing on -current and are seeing several problems with
> > libpthread.  Using a UP kernel (machines have single processor with HTT)
> > seems to make it better, but we seem to be getting SIG 11's in
> > pthread_testcancel() as well as the failed lock assertions that were
> > mentioned earlier on the list in the PR.  Just running monodevelop from the
> > bsd-sharp stuff mentioned earlier can break in that one of the processes dies
> > with the assertion failure.  If you let the other processes run, then you can
> > run it again and get the window to pop up, but then clicking on any of the
> > controls results in the pthread_testcancel() crash.  FWIW, I think the reason
> > that the stack traces look weird in the PR's thread may be due to catching a
> > signal.  When we were looking at the problems with libc_r on 4.x we would get
> > some weird looking backtraces sometimes when the assertion in uthread_sig.c
> > that I added failed.  Seems that gdb doesn't handle the signal frames very
> > well.
>
> You also want to make sure you're not running out of stack space
> for your threads.
>
> Is the code trying to install signal frames on threads itself?
> That could cause the problems you are seeing.

I went back to the monodoc test case in the PR.  Running under
the debugger gives this:

(gdb) run /usr/local/lib/mono/1.0/mcs.exe -out:browser.exe ./browser.cs
./list.cs               ./elabel.cs             ./history.cs
./Contributions.cs      ./XmlNodeWriter.cs
-resource:./../monodoc.png,monodoc.png -resource:./browser.glade,browser.glade
-pkg:gtkhtml-sharp -pkg:glade-sharp -r:System.Web.Services -r:./monodoc.dll
Starting program: /usr/local/bin/mono /usr/local/lib/mono/1.0/mcs.exe
-out:browser.exe ./browser.cs                 ./list.cs
./elabel.cs             ./history.cs             ./Contributions.cs
./XmlNodeWriter.cs -resource:./../monodoc.png,monodoc.png
-resource:./browser.glade,browser.glade  -pkg:gtkhtml-sharp -pkg:glade-sharp
-r:System.Web.Services -r:./monodoc.dll
[Switching to Thread 1 (LWP 100074)]

Breakpoint 1, 0x0804862e in main ()
(gdb) cont
Continuing.
[Switching to Thread 4 (LWP 100128)]

Breakpoint 2, 0x2842c801 in __assert () from /lib/libc.so.5
(gdb) bt
#0  0x2842c801 in __assert () from /lib/libc.so.5
#1  0x2837ce4e in _lock_acquire (lck=0x8062f00, lu=0x8110e48, prio=674751930)
    at /opt/FreeBSD/src/lib/libpthread/sys/lock.c:171
#2  0x2837010b in mutex_lock_common (curthread=0x8110e00, m=0x28482434, abstime=0x0)
    at /opt/FreeBSD/src/lib/libpthread/thread/thr_mutex.c:495
#3  0x28371677 in __pthread_mutex_lock (m=0x28482434)
    at /opt/FreeBSD/src/lib/libpthread/thread/thr_mutex.c:796
#4  0x28171cc6 in WaitForSingleObjectEx (handle=0xe, timeout=500, alertable=0) at handles-private.h:97
#5  0x2816b116 in CreateProcess (appname=0xd, cmdline=0x8092ac4, process_attrs=0x0, thread_attrs=0x0,
    inherit_handles=1, create_flags=1024, new_environ=0x0, cwd=0x0, startup=0xbf8ec78c,
    process_info=0xbf8ec77c) at processes.c:427
#6  0x2813ef4f in ves_icall_System_Diagnostics_Process_Start_internal (appname=0x80f89d8,
    cmd=0x8092ab8, dirname=0x808ff30, stdin_handle=0x2837e5ba, stdout_handle=0x2837e5ba,
    stderr_handle=0x2837e5ba, process_info=0xbf8ec964) at process.c:870
#7  0x28f548ff in ?? ()
#8  0x080f89d8 in ?? ()
#9  0x08092ab8 in ?? ()
#10 0x0808ff30 in ?? ()
#11 0x00000009 in ?? ()
#12 0x0000000d in ?? ()
#13 0x0000000b in ?? ()
#14 0xbf8ec964 in ?? ()
#15 0x0812d420 in ?? ()
#16 0x0812d408 in ?? ()
#17 0x0820d300 in ?? ()
#18 0x0808ff30 in ?? ()
#19 0x08092ab8 in ?? ()
#20 0x080f89d8 in ?? ()
#21 0xbf8ec838 in ?? ()
#22 0x28f548cc in ?? ()
#23 0xbf8ec98c in ?? ()
#24 0x28f542aa in ?? ()
---Type <return> to continue, or q <return> to quit---
#25 0x080f89d8 in ?? ()
#26 0x08092ab8 in ?? ()
#27 0x0808ff30 in ?? ()
#28 0x00000009 in ?? ()
#29 0x0000000d in ?? ()
#30 0x0000000b in ?? ()
#31 0xbf8ec964 in ?? ()
#32 0x28371bfe in mutex_unlock_common (m=0xb, add_reference=134818488)
    at /opt/FreeBSD/src/lib/libpthread/thread/thr_mutex.c:984
Previous frame inner to this frame (corrupt stack?)
(gdb) info threads
  5 Thread 2 (LWP 100137)  0x2837bfd3 in kse_release () at kse_release.S:2
  4 Thread 3 (sleeping)  0x28373d0f in _thr_sched_switch_unlocked (curthread=0x8110000)
    at pthread_md.h:225
* 3 Thread 4 (LWP 100128)  0x2842c801 in __assert () from /lib/libc.so.5
  2 Thread 1 (sleeping)  0x28373d0f in _thr_sched_switch_unlocked (curthread=0x8053000)
    at pthread_md.h:225
(gdb) thread 3
[Switching to thread 3 (Thread 4 (LWP 100128))]#0  0x2842c801 in __assert () from /lib/libc.so.5
(gdb) frame 2
#2  0x2837010b in mutex_lock_common (curthread=0x8110e00, m=0x28482434, abstime=0x0)
    at /opt/FreeBSD/src/lib/libpthread/thread/thr_mutex.c:495
495                     THR_LOCK_ACQUIRE(curthread, &(*m)->m_lock);
(gdb) print curthread->uniqueid
$36 = 3
(gdb) print/x curthread->magic
$37 = 0xd09ba115
(gdb) print/x **m
$39 = {m_lock = {l_head = 0x7273752f, l_tail = 0x636f6c2f, l_type = 0x6c2f6c61, l_wait = 0x6d2f6269,
    l_wakeup = 0x726f6373}, m_type = 0x2e62696c, m_protocol = 0x7c6c6c64, m_queue = {
    tqh_first = 0x74737953, tqh_last = 0x522e6d65}, m_owner = 0x69746e75, m_flags = 0x532e656d,
  m_count = 0x61697265, m_refcount = 0x617a696c, m_prio = 0x6e6f6974, m_saved_prio = 0x6553492e,
  m_qe = {tqe_next = 0x6c616972, tqe_prev = 0x62617a69}}

The thread looks valid (its uniqueid and magic are sane), but the mutex
is trashed.  It's not a valid mutex, and its lock type (l_type) does
indeed have LCK_PRIORITY set.  libpthread never creates locks of that
type, which is why the assertion in _lock_acquire() fires.

-- 
Dan Eischen


