From rink at FreeBSD.org Sun Jul 5 14:00:11 2009
From: rink at FreeBSD.org (Rink Springer)
Date: Sun Jul 5 14:00:18 2009
Subject: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
Message-ID: <20090705134016.CB1286D41E@mx1.rink.nu>

>Number:         136345
>Category:       threads
>Synopsis:       Recursive read rwlocks in thread A cause deadlock with write lock in thread B
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-threads
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Jul 05 14:00:10 UTC 2009
>Closed-Date:
>Last-Modified:
>Originator:     Rink Springer
>Release:        FreeBSD 7.2-PRERELEASE amd64
>Organization:
>Environment:
System: FreeBSD gloom.rink.nu 7.2-PRERELEASE FreeBSD 7.2-PRERELEASE #1 r191417: Thu Apr 23 13:53:08 CEST 2009 rink@gloom.rink.nu:/usr/obj/extra0/sources/releng7/sys/GENERIC amd64

The problem is also present in HEAD as of today.
>Description:
The following program deadlocks on FreeBSD in the 'urdlck' state:

---
#include <pthread.h>
#include <stdio.h>

pthread_rwlock_t rwl_lock;

/* Reader: takes the same rwlock in read mode twice before unlocking. */
void* thread1(void* x)
{
	while(1) {
		pthread_rwlock_rdlock(&rwl_lock);
		printf("read1\n");
		pthread_rwlock_rdlock(&rwl_lock);
		printf("read2\n");
		pthread_rwlock_unlock(&rwl_lock);
		pthread_rwlock_unlock(&rwl_lock);
	}
	return NULL;
}

/* Writer: repeatedly takes and drops the write lock. */
void* thread2(void* x)
{
	while(1) {
		pthread_rwlock_wrlock(&rwl_lock);
		printf("write\n");
		pthread_rwlock_unlock(&rwl_lock);
	}
	return NULL;
}

int main()
{
	pthread_t thr_1, thr_2;

	pthread_rwlock_init(&rwl_lock, NULL);
	pthread_create(&thr_1, NULL, thread1, NULL);
	pthread_create(&thr_2, NULL, thread2, NULL);
	pthread_join(thr_1, NULL);
	pthread_join(thr_2, NULL);
	return 0;
}
---

The problem is that it acquires a read rwlock multiple times in one thread,
and tries to acquire a write rwlock in another thread.
>How-To-Repeat:
$ gcc -o locktest locktest.c -pthread
$ ./locktest
... output ...
load: 0.00 cmd: locktest 72866 [urdlck] 1.38r 0.01u 0.00s 0% 1360k

and it's deadlocked.
Note that POSIX states that 'A thread may hold multiple concurrent read locks
on rwlock (that is, successfully call the pthread_rwlock_rdlock() function n
times). If so, the application shall ensure that the thread performs matching
unlocks (that is, it calls the pthread_rwlock_unlock() function n times).',
which seems to imply that the program above shouldn't deadlock.

The program above works fine on Linux, yet it also seems to deadlock on
Solaris 8. I have yet to check more recent versions of Solaris.
>Fix:
Don't Do That[tm]; it seems the only fix is to restructure the application to
avoid this scenario, even though POSIX seems to allow it.
>Release-Note:
>Audit-Trail:
>Unformatted:

From rink at FreeBSD.org Sun Jul 5 14:10:03 2009
From: rink at FreeBSD.org (Rink Springer)
Date: Sun Jul 5 14:10:22 2009
Subject: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
Message-ID: <200907051410.n65EA3Kg081921@freefall.freebsd.org>

The following reply was made to PR threads/136345; it has been noted by GNATS.

From: Rink Springer
To: FreeBSD-gnats-submit@FreeBSD.org, freebsd-threads@FreeBSD.org
Cc:
Subject: Re: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
Date: Sun, 5 Jul 2009 16:03:32 +0200

 Did some more testing; it works fine on Solaris 9 and 10. Also, the
 deadlock doesn't appear on FreeBSD 6.1-STABLE (Sep 2006). Perhaps this
 is a libthr issue?

 --
 Rink P.W. Springer - http://rink.nu
 "Doom, gloom and despair. I like it!"
 - Tiresias

From attilio at freebsd.org Sun Jul 5 15:10:37 2009
From: attilio at freebsd.org (Attilio Rao)
Date: Sun Jul 5 15:10:43 2009
Subject: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
In-Reply-To: <20090705140332.GA8739@rink.nu>
References: <20090705134016.CB1286D41E@mx1.rink.nu> <200907051400.n65E0AWQ074917@freefall.freebsd.org> <20090705140332.GA8739@rink.nu>
Message-ID: <3bbf2fe10907050748x64bb2c7dmc0787fe6bda5d701@mail.gmail.com>

2009/7/5 Rink Springer:
> Did some more testing; it works fine on Solaris 9 and 10. Also, the
> deadlock doesn't appear on FreeBSD 6.1-STABLE (Sep 2006). Perhaps this
> is a libthr issue?

I think that rdlock_count in libthr is not updated properly.
Can you test the following patch:
http://www.freebsd.org/~attilio/libthr.diff

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

From rink at FreeBSD.org Sun Jul 5 15:40:03 2009
From: rink at FreeBSD.org (Rink Springer)
Date: Sun Jul 5 15:40:09 2009
Subject: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
Message-ID: <200907051540.n65Fe2R1052138@freefall.freebsd.org>

The following reply was made to PR threads/136345; it has been noted by GNATS.

From: Rink Springer
To: FreeBSD-gnats-submit@FreeBSD.org
Cc:
Subject: Re: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
Date: Sun, 5 Jul 2009 17:37:40 +0200

 On Sun, Jul 05, 2009 at 03:20:02PM +0000, Attilio Rao wrote:
 > I think that rdlock_count in libthr is not updated properly.
 > Can you test the following patch:
 > http://www.freebsd.org/~attilio/libthr.diff

 This seems to resolve the issue; it seems to solve the deadlock the
 attached test program has. I'll retry it with my real workload and see
 if this fixes the problem - will let you know.

 Thanks!

 --
 Rink P.W. Springer - http://rink.nu
 "Doom, gloom and despair. I like it!"
 - Tiresias

From dfilter at FreeBSD.ORG Mon Jul 6 09:40:05 2009
From: dfilter at FreeBSD.ORG (dfilter service)
Date: Mon Jul 6 09:40:11 2009
Subject: threads/136345: commit references a PR
Message-ID: <200907060940.n669e463036523@freefall.freebsd.org>

The following reply was made to PR threads/136345; it has been noted by GNATS.

From: dfilter@FreeBSD.ORG (dfilter service)
To: bug-followup@FreeBSD.org
Cc:
Subject: Re: threads/136345: commit references a PR
Date: Mon, 6 Jul 2009 09:31:15 +0000 (UTC)

 Author: attilio
 Date: Mon Jul 6 09:31:04 2009
 New Revision: 195403
 URL: http://svn.freebsd.org/changeset/base/195403

 Log:
   In the current code, rdlock_count is not correctly handled for some cases.
   The most notable is that it is not bumped in rwlock_rdlock_common() when
   the hard path (__thr_rwlock_rdlock()) returns successfully.
   This can lead to deadlocks in libthr when rwlocks recursion in read mode
   happens.
   Fix the interested parts by correctly handling rdlock_count.

   PR:           threads/136345
   Reported by:  rink
   Tested by:    rink
   Reviewed by:  jeff
   Approved by:  re (kib)
   MFC:          2 weeks

 Modified:
   head/lib/libthr/thread/thr_rtld.c
   head/lib/libthr/thread/thr_rwlock.c

 Modified: head/lib/libthr/thread/thr_rtld.c
 ==============================================================================
 --- head/lib/libthr/thread/thr_rtld.c	Mon Jul 6 09:07:35 2009	(r195402)
 +++ head/lib/libthr/thread/thr_rtld.c	Mon Jul 6 09:31:04 2009	(r195403)
 @@ -114,6 +114,7 @@ _thr_rtld_rlock_acquire(void *lock)
  	THR_CRITICAL_ENTER(curthread);
  	while (_thr_rwlock_rdlock(&l->lock, 0, NULL) != 0)
  		;
 +	curthread->rdlock_count++;
  	RESTORE_ERRNO();
  }

 @@ -148,6 +149,7 @@ _thr_rtld_lock_release(void *lock)

  	state = l->lock.rw_state;
  	if (_thr_rwlock_unlock(&l->lock) == 0) {
 +		curthread->rdlock_count--;
  		if ((state & URWLOCK_WRITE_OWNER) == 0) {
  			THR_CRITICAL_LEAVE(curthread);
  		} else {

 Modified: head/lib/libthr/thread/thr_rwlock.c
 ==============================================================================
 --- head/lib/libthr/thread/thr_rwlock.c	Mon Jul 6 09:07:35 2009	(r195402)
 +++ head/lib/libthr/thread/thr_rwlock.c	Mon Jul 6 09:31:04 2009	(r195403)
 @@ -177,10 +177,11 @@ rwlock_rdlock_common(pthread_rwlock_t *r
  		/* if interrupted, try to lock it in userland again. */
  		if (_thr_rwlock_tryrdlock(&prwlock->lock, flags) == 0) {
  			ret = 0;
 -			curthread->rdlock_count++;
  			break;
  		}
  	}
 +	if (ret == 0)
 +		curthread->rdlock_count++;
  	return (ret);
  }

 _______________________________________________
 svn-src-all@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/svn-src-all
 To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"

From bugmaster at FreeBSD.org Mon Jul 6 11:07:09 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Jul 6 11:09:47 2009
Subject: Current problem reports assigned to freebsd-threads@FreeBSD.org
Message-ID: <200907061107.n66B77b5010955@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o threa/136345 threads    Recursive read rwlocks in thread A cause deadlock with
o threa/135462 threads    [PATCH] _thread_cleanupspecific() doesn't handle delet
o threa/133734 threads    32 bit libthr failing pthread_create()
o threa/128922 threads    threads hang with xorg running
o threa/127225 threads    bug in lib/libthr/thread/thr_init.c
o threa/122923 threads    'nice' does not prevent background process from steali
o threa/121336 threads    lang/neko threading ok on UP, broken on SMP (FreeBSD 7
o threa/118715 threads    kse problem
o threa/116668 threads    can no longer use jdk15 with libthr on -stable SMP
o threa/116181 threads    /dev/io-related io access permissions are not propagat
o threa/115211 threads    pthread_atfork misbehaves in initial thread
o threa/110636 threads    [request] gdb(1): using gdb with multi thread applicat
o threa/110306 threads    apache 2.0 segmentation violation when calling gethost
o threa/103975 threads    Implicit loading/unloading of libpthread.so may crash
o threa/101323 threads    [patch] fork(2) in threaded programs broken.
s threa/100815 threads    FBSD 5.5 broke nanosleep in libc_r
s threa/94467  threads    send(), sendto() and sendmsg() are not correct in libc
s threa/84483  threads    problems with devel/nspr and -lc_r on 4.x
o threa/83914  threads    [libc] popen() doesn't work in static threaded program
o threa/80992  threads    abort() sometimes not caught by gdb depending on threa
o threa/80435  threads    panic on high loads
o threa/79887  threads    [patch] freopen() isn't thread-safe
o threa/79683  threads    svctcp_create() fails if multiple threads call at the
s threa/76694  threads    fork cause hang in dup()/close() function in child (-l
s threa/76690  threads    fork hang in child for -lc_r
o threa/75374  threads    pthread_kill() ignores SA_SIGINFO flag
o threa/75273  threads    FBSD 5.3 libpthread (KSE) bug
o threa/72953  threads    fork() unblocks blocked signals w/o PTHREAD_SCOPE_SYST
o threa/70975  threads    [sysvipc] unexpected and unreliable behaviour when usi
s threa/69020  threads    pthreads library leaks _gc_mutex
s threa/49087  threads    Signals lost in programs linked with libc_r
s threa/48856  threads    Setting SIGCHLD to SIG_IGN still leaves zombies under
s threa/40671  threads    pthread_cancel doesn't remove thread from condition qu
s threa/39922  threads    [threads] [patch] Threaded applications executed with
s threa/37676  threads    libc_r: msgsnd(), msgrcv(), pread(), pwrite() need wra
s threa/34536  threads    accept() blocks other threads
s threa/32295  threads    [libc_r] [patch] pthread(3) dont dequeue signals
s threa/30464  threads    pthread mutex attributes -- pshared
s threa/24632  threads    libc_r delicate deviation from libc in handling SIGCHL
s threa/24472  threads    libc_r does not honor SO_SNDTIMEO/SO_RCVTIMEO socket o

40 problems total.
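
Editorial sketch, relating to the r195403 commit for threads/136345 above: the
commit log says that a missed rdlock_count bump can deadlock recursive read
locks. One way to picture why such a per-thread count matters is a toy
writer-preferring lock in which a thread that already holds a read lock is
allowed past queued writers. Everything below (struct toy_rwlock, the toy_*
functions, toy_rdlock_count) is hypothetical and written for illustration
only; it is a simplified model under those assumptions, not libthr's code.

```c
#include <pthread.h>
#include <stdbool.h>

/* Toy writer-preferring rwlock; an illustration, NOT libthr's code. */
struct toy_rwlock {
	pthread_mutex_t mtx;
	pthread_cond_t cv;
	int readers;		/* read locks currently held, all threads */
	int waiting_writers;	/* writers blocked in toy_wrlock() */
	bool writer;		/* true while a writer owns the lock */
};

/* Per-thread count of read locks held; models the role of rdlock_count. */
_Thread_local int toy_rdlock_count;

void
toy_init(struct toy_rwlock *l)
{
	pthread_mutex_init(&l->mtx, NULL);
	pthread_cond_init(&l->cv, NULL);
	l->readers = 0;
	l->waiting_writers = 0;
	l->writer = false;
}

/* A reader may enter if no writer owns the lock and either no writer is
 * queued or the caller already holds a read lock (recursive acquisition). */
bool
toy_reader_may_enter(const struct toy_rwlock *l, int held_by_me)
{
	if (l->writer)
		return (false);
	return (l->waiting_writers == 0 || held_by_me > 0);
}

void
toy_rdlock(struct toy_rwlock *l)
{
	pthread_mutex_lock(&l->mtx);
	while (!toy_reader_may_enter(l, toy_rdlock_count))
		pthread_cond_wait(&l->cv, &l->mtx);
	l->readers++;
	toy_rdlock_count++;	/* must be bumped on EVERY successful path */
	pthread_mutex_unlock(&l->mtx);
}

void
toy_rdunlock(struct toy_rwlock *l)
{
	pthread_mutex_lock(&l->mtx);
	l->readers--;
	toy_rdlock_count--;
	if (l->readers == 0)
		pthread_cond_broadcast(&l->cv);
	pthread_mutex_unlock(&l->mtx);
}

void
toy_wrlock(struct toy_rwlock *l)
{
	pthread_mutex_lock(&l->mtx);
	l->waiting_writers++;
	while (l->writer || l->readers > 0)
		pthread_cond_wait(&l->cv, &l->mtx);
	l->waiting_writers--;
	l->writer = true;
	pthread_mutex_unlock(&l->mtx);
}

void
toy_wrunlock(struct toy_rwlock *l)
{
	pthread_mutex_lock(&l->mtx);
	l->writer = false;
	pthread_cond_broadcast(&l->cv);
	pthread_mutex_unlock(&l->mtx);
}
```

In this model, if toy_rdlock() forgot to increment toy_rdlock_count on some
path, the second read acquisition in the PR's thread1 would queue behind the
waiting writer, while that writer waits for the first read lock to be
released: neither thread can make progress.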
From ale at FreeBSD.org Tue Jul 14 07:54:02 2009
From: ale at FreeBSD.org (ale@FreeBSD.org)
Date: Tue Jul 14 07:54:09 2009
Subject: threads/135673: databases/mysql50-server - MySQL query lock-ups on 7.2-RELEASE amd64
Message-ID: <200907140754.n6E7s22L058864@freefall.freebsd.org>

Synopsis: databases/mysql50-server - MySQL query lock-ups on 7.2-RELEASE amd64

Responsible-Changed-From-To: ale->freebsd-threads
Responsible-Changed-By: ale
Responsible-Changed-When: Tue Jul 14 07:52:43 UTC 2009
Responsible-Changed-Why:
FreeBSD's threads problem.

http://www.freebsd.org/cgi/query-pr.cgi?pr=135673

From nick at desert.net Wed Jul 15 21:40:04 2009
From: nick at desert.net (Nick Esborn)
Date: Wed Jul 15 21:40:11 2009
Subject: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
Message-ID: <200907152140.n6FLe42l045879@freefall.freebsd.org>

The following reply was made to PR threads/136345; it has been noted by GNATS.

From: Nick Esborn
To: bug-followup@FreeBSD.org, rink@FreeBSD.org
Cc:
Subject: Re: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
Date: Wed, 15 Jul 2009 14:32:38 -0700

 This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
 --Apple-Mail-19-950902279
 Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
 Content-Transfer-Encoding: 7bit

 Even after the above patch, I still run into occasional MySQL thread
 deadlocks, which I originally described in what is now threads/135673.

 I also posted on freebsd-current a few days ago:

 http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009328.html

 I'd be happy to collect whatever data would be helpful in tracking
 down this deadlock. This only seems to happen under our production
 workload, so that might make it harder to capture meaningful debug
 data, but I'm certainly willing to try. I can also arrange for
 developer access to the system in question, if that would help
 significantly.

 -nick

 --
 nick@desert.net - all messages cryptographically signed

 --Apple-Mail-19-950902279
 content-type: application/pgp-signature; x-mac-type=70674453; name=PGP.sig
 content-description: This is a digitally signed message part
 content-disposition: inline; filename=PGP.sig
 content-transfer-encoding: 7bit

 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (Darwin)

 iEYEARECAAYFAkpeSvYACgkQw1bX5UNr2ABajACeNpj/MW4X+4zfvlWNCXnqo6D9
 EZkAoLpGHxs4RoHMd7yqyba4IPzKxsJh
 =5I4q
 -----END PGP SIGNATURE-----

 --Apple-Mail-19-950902279--

From attilio at freebsd.org Thu Jul 16 12:26:51 2009
From: attilio at freebsd.org (Attilio Rao)
Date: Thu Jul 16 12:26:58 2009
Subject: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
In-Reply-To: <200907152140.n6FLe42l045879@freefall.freebsd.org>
References: <200907152140.n6FLe42l045879@freefall.freebsd.org>
Message-ID: <3bbf2fe10907160526l2f066698qce8a5e77aee6366b@mail.gmail.com>

2009/7/15 Nick Esborn:
> The following reply was made to PR threads/136345; it has been noted by GNATS.
>
> From: Nick Esborn
> To: bug-followup@FreeBSD.org,
> rink@FreeBSD.org
> Cc:
> Subject: Re: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
> Date: Wed, 15 Jul 2009 14:32:38 -0700
>
> This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
> --Apple-Mail-19-950902279
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
> Content-Transfer-Encoding: 7bit
>
> Even after the above patch, I still run into occasional MySQL thread
> deadlocks, which I originally described in what is now threads/135673.
>
> I also posted on freebsd-current a few days ago:
>
> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009328.html
>
> I'd be happy to collect whatever data would be helpful in tracking
> down this deadlock. This only seems to happen under our production
> workload, so that might make it harder to capture meaningful debug
> data, but I'm certainly willing to try. I can also arrange for
> developer access to the system in question, if that would help
> significantly.

So did you backport this to 7 and still experience deadlocks?
I only committed the fix to HEAD, not to the STABLE branch.

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

From attilio at freebsd.org Thu Jul 16 12:48:13 2009
From: attilio at freebsd.org (Attilio Rao)
Date: Thu Jul 16 12:48:19 2009
Subject: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
In-Reply-To: <3bbf2fe10907160526l2f066698qce8a5e77aee6366b@mail.gmail.com>
References: <200907152140.n6FLe42l045879@freefall.freebsd.org> <3bbf2fe10907160526l2f066698qce8a5e77aee6366b@mail.gmail.com>
Message-ID: <3bbf2fe10907160548l74de896bka609de7a9a994899@mail.gmail.com>

2009/7/16 Attilio Rao:
> 2009/7/15 Nick Esborn:
>> The following reply was made to PR threads/136345; it has been noted by GNATS.
>>
>> From: Nick Esborn
>> To: bug-followup@FreeBSD.org,
>> rink@FreeBSD.org
>> Cc:
>> Subject: Re: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
>> Date: Wed, 15 Jul 2009 14:32:38 -0700
>>
>> This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
>> --Apple-Mail-19-950902279
>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>> Content-Transfer-Encoding: 7bit
>>
>> Even after the above patch, I still run into occasional MySQL thread
>> deadlocks, which I originally described in what is now threads/135673.
>>
>> I also posted on freebsd-current a few days ago:
>>
>> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009328.html
>>
>> I'd be happy to collect whatever data would be helpful in tracking
>> down this deadlock. This only seems to happen under our production
>> workload, so that might make it harder to capture meaningful debug
>> data, but I'm certainly willing to try. I can also arrange for
>> developer access to the system in question, if that would help
>> significantly.
>
> So did you backport this to 7 and still experience deadlocks?
> I only committed the fix to HEAD, not to the STABLE branch.

OK, I got it, you just upgraded.
Can you try the following things?
- Upgrade to the -CURRENT of today
- Recompile the kernel with the following options:
  KDB, DDB, SCHED_ULE, PREEMPTION, FULL_PREEMPTION, INVARIANT_SUPPORT,
  INVARIANTS, WITNESS
- When the deadlock takes place, break into DDB and please retrieve the
  following info:
  db> show allpcpu
  db> ps
  db> alltrace
  db> show alllock
- Save them with a serial console output or using the textdump(4) format.

(If necessary, read ddb(4) and textdump(4) before setting up the whole
system.)
This would shed light on whether the problem lives within the kernel or not.

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

From nick at desert.net Thu Jul 16 17:51:01 2009
From: nick at desert.net (Nick Esborn)
Date: Thu Jul 16 17:51:08 2009
Subject: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
In-Reply-To: <3bbf2fe10907160548l74de896bka609de7a9a994899@mail.gmail.com>
References: <200907152140.n6FLe42l045879@freefall.freebsd.org> <3bbf2fe10907160526l2f066698qce8a5e77aee6366b@mail.gmail.com> <3bbf2fe10907160548l74de896bka609de7a9a994899@mail.gmail.com>
Message-ID: <6E6A9516-6C69-4E41-803C-FE5F126F402C@desert.net>

On Jul 16, 2009, at 5:48 AM, Attilio Rao wrote:

> 2009/7/16 Attilio Rao:
>> 2009/7/15 Nick Esborn:
>>> The following reply was made to PR threads/136345; it has been
>>> noted by GNATS.
>>>
>>> From: Nick Esborn
>>> To: bug-followup@FreeBSD.org,
>>> rink@FreeBSD.org
>>> Cc:
>>> Subject: Re: threads/136345: Recursive read rwlocks in thread A
>>> cause deadlock with write lock in thread B
>>> Date: Wed, 15 Jul 2009 14:32:38 -0700
>>>
>>> This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
>>> --Apple-Mail-19-950902279
>>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>> Content-Transfer-Encoding: 7bit
>>>
>>> Even after the above patch, I still run into occasional MySQL thread
>>> deadlocks, which I originally described in what is now threads/135673.
>>>
>>> I also posted on freebsd-current a few days ago:
>>>
>>> http://lists.freebsd.org/pipermail/freebsd-current/2009-July/009328.html
>>>
>>> I'd be happy to collect whatever data would be helpful in tracking
>>> down this deadlock. This only seems to happen under our production
>>> workload, so that might make it harder to capture meaningful debug
>>> data, but I'm certainly willing to try. I can also arrange for
>>> developer access to the system in question, if that would help
>>> significantly.
>>
>> So did you backport this to 7 and still experience deadlocks?
>> I only committed the fix to HEAD, not to the STABLE branch.
>
> OK, I got it, you just upgraded.
> Can you try the following things?
> - Upgrade to the -CURRENT of today
> - Recompile the kernel with the following options:
>   KDB, DDB, SCHED_ULE, PREEMPTION, FULL_PREEMPTION, INVARIANT_SUPPORT,
>   INVARIANTS, WITNESS
> - When the deadlock takes place, break into DDB and please retrieve the
>   following info:
>   db> show allpcpu
>   db> ps
>   db> alltrace
>   db> show alllock
>
> - Save them with a serial console output or using the textdump(4)
>   format.
>
> (If necessary, read ddb(4) and textdump(4) before setting up the
> whole system.)
> This would shed light on whether the problem lives within the kernel
> or not.
>
> Thanks,
> Attilio
>
>
> --
> Peace can only be achieved by understanding - A. Einstein

I can definitely do the upgrade.

KDB, DDB, SCHED_ULE, and PREEMPTION are already turned on. I will try
FULL_PREEMPTION, INVARIANT_SUPPORT, INVARIANTS, and WITNESS, but when I
first upgraded to 8.0, this server was unable to handle its workload with
the INVARIANTS and WITNESS options turned on.

Also, it can take a while for it to become clear that the deadlock has
occurred -- usually our monitoring picks it up when replication falls
behind. So it may be 15-20 minutes after the deadlock that I am able to
run the above db commands. Of course the thread will still be deadlocked.
Hopefully that doesn't reduce the value of the data obtained.

Thanks,

-nick

--
nick@desert.net - all messages cryptographically signed

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 194 bytes
Desc: This is a digitally signed message part
Url : http://lists.freebsd.org/pipermail/freebsd-threads/attachments/20090716/512bfdc6/PGP.pgp

From attilio at freebsd.org Thu Jul 16 17:53:29 2009
From: attilio at freebsd.org (Attilio Rao)
Date: Thu Jul 16 17:54:11 2009
Subject: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
In-Reply-To: <6E6A9516-6C69-4E41-803C-FE5F126F402C@desert.net>
References: <200907152140.n6FLe42l045879@freefall.freebsd.org> <3bbf2fe10907160526l2f066698qce8a5e77aee6366b@mail.gmail.com> <3bbf2fe10907160548l74de896bka609de7a9a994899@mail.gmail.com> <6E6A9516-6C69-4E41-803C-FE5F126F402C@desert.net>
Message-ID: <3bbf2fe10907161053x3b6aa60dneb8dbd5217b4cb03@mail.gmail.com>

2009/7/16 Nick Esborn:
>
>
> KDB, DDB, SCHED_ULE, and PREEMPTION are already turned on. I will try
> FULL_PREEMPTION, INVARIANT_SUPPORT, INVARIANTS, and WITNESS, but when I
> first upgraded to 8.0, this server was unable to handle its workload with
> the INVARIANTS and WITNESS options turned on.

What do you mean by 'unable'? What was happening precisely?

> Also, it can take a while for it to become clear that the deadlock has
> occurred -- usually our monitoring picks it up when replication falls
> behind. So it may be 15-20 minutes after the deadlock that I am able to run
> the above db commands. Of course the thread will still be deadlocked.
> Hopefully that doesn't reduce the value of the data obtained.

It should be still fine.

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein

From nick at desert.net Thu Jul 16 17:56:58 2009
From: nick at desert.net (Nick Esborn)
Date: Thu Jul 16 17:57:07 2009
Subject: threads/136345: Recursive read rwlocks in thread A cause deadlock with write lock in thread B
In-Reply-To: <3bbf2fe10907161053x3b6aa60dneb8dbd5217b4cb03@mail.gmail.com>
References: <200907152140.n6FLe42l045879@freefall.freebsd.org> <3bbf2fe10907160526l2f066698qce8a5e77aee6366b@mail.gmail.com> <3bbf2fe10907160548l74de896bka609de7a9a994899@mail.gmail.com> <6E6A9516-6C69-4E41-803C-FE5F126F402C@desert.net> <3bbf2fe10907161053x3b6aa60dneb8dbd5217b4cb03@mail.gmail.com>
Message-ID:

On Jul 16, 2009, at 10:53 AM, Attilio Rao wrote:

> 2009/7/16 Nick Esborn:
>>
>>
>> KDB, DDB, SCHED_ULE, and PREEMPTION are already turned on. I will try
>> FULL_PREEMPTION, INVARIANT_SUPPORT, INVARIANTS, and WITNESS, but when I
>> first upgraded to 8.0, this server was unable to handle its workload
>> with the INVARIANTS and WITNESS options turned on.
>
> What do you mean by 'unable'? What was happening precisely?

System time would rise during periods of peak demand, and the system
would quickly fall behind on its workload of queries.

However, I have some hardware I can dedicate to this, and only run the
one MySQL data set which exhibits this problem. That should be enough
of a workload reduction to allow the server to keep up even with all
the above options turned on.

-nick

>
>> Also, it can take a while for it to become clear that the deadlock has
>> occurred -- usually our monitoring picks it up when replication falls
>> behind. So it may be 15-20 minutes after the deadlock that I am able
>> to run the above db commands. Of course the thread will still be
>> deadlocked.
>> Hopefully that doesn't reduce the value of the data obtained.
>
> It should be still fine.
>
> Thanks,
> Attilio
>
>
> --
> Peace can only be achieved by understanding - A. Einstein

--
nick@desert.net - all messages cryptographically signed

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 194 bytes
Desc: This is a digitally signed message part
Url : http://lists.freebsd.org/pipermail/freebsd-threads/attachments/20090716/0bfaf912/PGP.pgp

From bugmaster at FreeBSD.org Mon Jul 20 11:07:06 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Jul 20 11:09:51 2009
Subject: Current problem reports assigned to freebsd-threads@FreeBSD.org
Message-ID: <200907201107.n6KB75pa002463@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o threa/136345 threads    Recursive read rwlocks in thread A cause deadlock with
o threa/135673 threads    databases/mysql50-server - MySQL query lock-ups on 7.2
o threa/135462 threads    [PATCH] _thread_cleanupspecific() doesn't handle delet
o threa/133734 threads    32 bit libthr failing pthread_create()
o threa/128922 threads    threads hang with xorg running
o threa/127225 threads    bug in lib/libthr/thread/thr_init.c
o threa/122923 threads    'nice' does not prevent background process from steali
o threa/121336 threads    lang/neko threading ok on UP, broken on SMP (FreeBSD 7
o threa/118715 threads    kse problem
o threa/116668 threads    can no longer use jdk15 with libthr on -stable SMP
o threa/116181 threads    /dev/io-related io access permissions are not propagat
o threa/115211 threads    pthread_atfork misbehaves in initial thread
o threa/110636 threads    [request] gdb(1): using gdb with multi thread applicat
o threa/110306 threads    apache 2.0 segmentation violation when calling gethost
o threa/103975 threads    Implicit loading/unloading of libpthread.so may crash
o threa/101323 threads    [patch] fork(2) in threaded programs broken.
s threa/100815 threads    FBSD 5.5 broke nanosleep in libc_r
s threa/94467  threads    send(), sendto() and sendmsg() are not correct in libc
s threa/84483  threads    problems with devel/nspr and -lc_r on 4.x
o threa/83914  threads    [libc] popen() doesn't work in static threaded program
o threa/80992  threads    abort() sometimes not caught by gdb depending on threa
o threa/80435  threads    panic on high loads
o threa/79887  threads    [patch] freopen() isn't thread-safe
o threa/79683  threads    svctcp_create() fails if multiple threads call at the
s threa/76694  threads    fork cause hang in dup()/close() function in child (-l
s threa/76690  threads    fork hang in child for -lc_r
o threa/75374  threads    pthread_kill() ignores SA_SIGINFO flag
o threa/75273  threads    FBSD 5.3 libpthread (KSE) bug
o threa/72953  threads    fork() unblocks blocked signals w/o PTHREAD_SCOPE_SYST
o threa/70975  threads    [sysvipc] unexpected and unreliable behaviour when usi
s threa/69020  threads    pthreads library leaks _gc_mutex
s threa/49087  threads    Signals lost in programs linked with libc_r
s threa/48856  threads    Setting SIGCHLD to SIG_IGN still leaves zombies under
s threa/40671  threads    pthread_cancel doesn't remove thread from condition qu
s threa/39922  threads    [threads] [patch] Threaded applications executed with
s threa/37676  threads    libc_r: msgsnd(), msgrcv(), pread(), pwrite() need wra
s threa/34536  threads    accept() blocks other threads
s threa/32295  threads    [libc_r] [patch] pthread(3) dont dequeue signals
s threa/30464  threads    pthread mutex attributes -- pshared
s threa/24632  threads    libc_r delicate deviation from libc in handling SIGCHL
s threa/24472  threads    libc_r does not honor SO_SNDTIMEO/SO_RCVTIMEO socket o

41 problems total.
From bugmaster at FreeBSD.org Mon Jul 27 11:07:04 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Jul 27 11:09:55 2009
Subject: Current problem reports assigned to freebsd-threads@FreeBSD.org
Message-ID: <200907271107.n6RB73KO019123@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o threa/136345 threads    Recursive read rwlocks in thread A cause deadlock with
o threa/135673 threads    databases/mysql50-server - MySQL query lock-ups on 7.2
o threa/135462 threads    [PATCH] _thread_cleanupspecific() doesn't handle delet
o threa/133734 threads    32 bit libthr failing pthread_create()
o threa/128922 threads    threads hang with xorg running
o threa/127225 threads    bug in lib/libthr/thread/thr_init.c
o threa/122923 threads    'nice' does not prevent background process from steali
o threa/121336 threads    lang/neko threading ok on UP, broken on SMP (FreeBSD 7
o threa/118715 threads    kse problem
o threa/116668 threads    can no longer use jdk15 with libthr on -stable SMP
o threa/116181 threads    /dev/io-related io access permissions are not propagat
o threa/115211 threads    pthread_atfork misbehaves in initial thread
o threa/110636 threads    [request] gdb(1): using gdb with multi thread applicat
o threa/110306 threads    apache 2.0 segmentation violation when calling gethost
o threa/103975 threads    Implicit loading/unloading of libpthread.so may crash
o threa/101323 threads    [patch] fork(2) in threaded programs broken.
s threa/100815 threads    FBSD 5.5 broke nanosleep in libc_r
s threa/94467  threads    send(), sendto() and sendmsg() are not correct in libc
s threa/84483  threads    problems with devel/nspr and -lc_r on 4.x
o threa/83914  threads    [libc] popen() doesn't work in static threaded program
o threa/80992  threads    abort() sometimes not caught by gdb depending on threa
o threa/80435  threads    panic on high loads
o threa/79887  threads    [patch] freopen() isn't thread-safe
o threa/79683  threads    svctcp_create() fails if multiple threads call at the
s threa/76694  threads    fork cause hang in dup()/close() function in child (-l
s threa/76690  threads    fork hang in child for -lc_r
o threa/75374  threads    pthread_kill() ignores SA_SIGINFO flag
o threa/75273  threads    FBSD 5.3 libpthread (KSE) bug
o threa/72953  threads    fork() unblocks blocked signals w/o PTHREAD_SCOPE_SYST
o threa/70975  threads    [sysvipc] unexpected and unreliable behaviour when usi
s threa/69020  threads    pthreads library leaks _gc_mutex
s threa/49087  threads    Signals lost in programs linked with libc_r
s threa/48856  threads    Setting SIGCHLD to SIG_IGN still leaves zombies under
s threa/40671  threads    pthread_cancel doesn't remove thread from condition qu
s threa/39922  threads    [threads] [patch] Threaded applications executed with
s threa/37676  threads    libc_r: msgsnd(), msgrcv(), pread(), pwrite() need wra
s threa/34536  threads    accept() blocks other threads
s threa/32295  threads    [libc_r] [patch] pthread(3) dont dequeue signals
s threa/30464  threads    pthread mutex attributes -- pshared
s threa/24632  threads    libc_r delicate deviation from libc in handling SIGCHL
s threa/24472  threads    libc_r does not honor SO_SNDTIMEO/SO_RCVTIMEO socket o

41 problems total.