many thread applications are unstable on 7-current
Norikatsu Shigemura
nork at FreeBSD.org
Sun Jul 16 14:23:48 UTC 2006
I think that this is FYI, maybe. Sorry, I don't know why it causes.
I could not trace problems.
The problems are happened for about one ~ two month, but I report
following environment with SYMVER_ENABLE=YES. I don't know what
happened on 7-current without SYMBER_ENABLE.
FreeBSD nadesico.ninth-nine.com 7.0-CURRENT FreeBSD 7.0-CURRENT #87: Sun Jul 16 11:48:32 JST 2006 nork at nadesico.ninth-nine.com:/usr/obj/usr/src/sys/NADESICO i386
The applications which I troubled are following:
1. cvsync (net/cvsync)
2. openoffice.org (editors/openoffice.org-2.0)
3. firefox (www/firefox)
4. scim-launcher (textproc/scim)
It offten crash, and automatically recovery. And it is too hard
to trace:-(. So I didn't trace it.
----------------------------------------------------------
1. cvsync (net/cvsync)
# ktrace -i /usr/local/bin/cvsync -v -c FreeBSD.conf
:
Updating (collection freebsd-cvsroot-ports/rcs)
/home/ncvs/CVSROOT-ports/commitlogs/.cvsync.QCnj4R: Unknown error: 0
Updater(RCS): UPDATE: /home/ncvs/CVSROOT-ports/commitlogs/ports
Updater: RCS Error
Socket Error: recv: 2 residue 2
Receiver Error: recv
Mux(RECV) Error: not running: 0
FileScan: RCS Error
Mux(SEND) Error: not running: 0
DirScan: RCS Error
Failed
I got following kdump's result:
(snip)
4667 cvsync CALL utimes(0x831071c,0xbf5faeb0)
4667 cvsync NAMI "/home/ncvs/CVSROOT-ports/commitlogs/.cvsync.QCnj4R"
4667 cvsync RET utimes 0
4667 cvsync CALL rename(0x831071c,0x8310218)
4667 cvsync NAMI "/home/ncvs/CVSROOT-ports/commitlogs/.cvsync.QCnj4R"
4667 cvsync NAMI "/home/ncvs/CVSROOT-ports/commitlogs/ports"
4667 cvsync RET rename 0
4667 cvsync CALL unlink(0x831071c)
4667 cvsync NAMI "/home/ncvs/CVSROOT-ports/commitlogs/.cvsync.QCnj4R"
4667 cvsync RET unlink -1 errno 2 No such file or directory
4667 cvsync CALL write(0x2,0xbf5fa690,0x45)
4667 cvsync GIO fd 2 wrote 69 bytes
"/home/ncvs/CVSROOT-ports/commitlogs/.cvsync.QCnj4R: Unknown error: 0
"
4667 cvsync RET write 69/0x45
(snip)
It looks like errno cross-talking per threads.
----------------------------------------------------------
2. openoffice.org (editors/openoffice.org-2.0)
$ ktrace -i /usr/local/bin/openoffice.org
(Open new document)
X IO Error
I got following kdump's result:
(snip)
4720 soffice.bin CALL socket(PF_LOCAL|PF_UNIX,SOCK_STREAM,0)
4720 soffice.bin RET socket 36/0x24
(snip)
4720 soffice.bin CALL connect(0x24,0xbfbfceb8,0x13)
4720 soffice.bin NAMI "/tmp/.X11-unix/X0"
4720 soffice.bin RET connect 0
(snip)
4720 soffice.bin CALL fcntl(0x24,F_SETFD,FD_CLOEXEC)
4720 soffice.bin RET fcntl 0
(snip)
4720 soffice.bin CALL write(0x24,0x9014000,0x8)
4720 soffice.bin GIO fd 36 wrote 8 bytes
0x0000 1700 0200 2401 0000 |....$...|
4720 soffice.bin RET write 8
4720 soffice.bin CALL read(0x24,0xbf4f9aac,0x20)
4720 soffice.bin RET read -1 errno 35 Resource temporarily unavailable
4720 soffice.bin CALL write(0x2,0xbf4f9404,0xb)
4720 soffice.bin GIO fd 2 wrote 11 bytes
"X IO Error
"
4720 soffice.bin RET write 11/0xb
4720 soffice.bin CALL exit(0)
Sorry, I don't have any coments.
----------------------------------------------------------
3. firefox (www/firefox)
$ ktrace -i /usr/X11R6/bin/firefox http://www.google.co.jp/
I got following kdump's result:
(snip)
4862 firefox-bin CALL socket(PF_INET,SOCK_STREAM,0)
4862 firefox-bin RET socket 34/0x22
4862 firefox-bin CALL fcntl(0x22,F_GETFL,0xbf8fdb38)
4862 firefox-bin RET fcntl 2
4862 firefox-bin CALL fcntl(0x22,F_SETFL,O_RDWR|O_NONBLOCK)
4862 firefox-bin RET fcntl 0
4862 firefox-bin CALL connect(0x22,0xbf8fdc50,0x10)
4862 firefox-bin RET connect -1 errno 36 Operation now in progress
4862 firefox-bin CALL write(0x9,0x68284d2d,0x1)
4862 firefox-bin GIO fd 9 wrote 1 byte
"8"
4862 firefox-bin RET write 1
4862 firefox-bin CALL close(0x22)
4862 firefox-bin RET close 0
(snip)
4862 firefox-bin CALL socket(PF_INET,SOCK_STREAM,0)
4862 firefox-bin RET socket 35/0x23
4862 firefox-bin CALL fcntl(0x23,F_GETFL,0xbf8fdb38)
4862 firefox-bin RET fcntl 2
4862 firefox-bin CALL fcntl(0x23,F_SETFL,O_RDWR|O_NONBLOCK)
4862 firefox-bin RET fcntl 0
4862 firefox-bin CALL connect(0x23,0xbf8fdc50,0x10)
4862 firefox-bin RET connect -1 errno 36 Operation now in progress
4862 firefox-bin CALL write(0x9,0x68284d2d,0x1)
4862 firefox-bin GIO fd 9 wrote 1 byte
"8"
4862 firefox-bin RET write 1
4862 firefox-bin CALL close(0x23)
4862 firefox-bin RET close 0
(snip)
I got following result of 'tcpdump -pn not port ssh'.
23:10:32.340676 IP 219.127.74.122.54603 > 66.249.89.104.80: S 3969568294:3969568294(0) win 65535 <mss 1414,nop,wscale 1,nop,nop,timestamp 29620081 0,sackOK,eol>
23:10:32.358419 IP 66.249.89.104.80 > 219.127.74.122.54603: S 2563066654:2563066654(0) ack 3969568295 win 8190 <mss 1402>
23:10:32.358486 IP 219.127.74.122.54603 > 66.249.89.104.80: R 3969568295:3969568295(0) win 0
23:10:32.823420 IP 219.127.74.122.54748 > 63.245.209.21.80: S 3885439161:3885439161(0) win 65535 <mss 1414,nop,wscale 1,nop,nop,timestamp 29620564 0,sackOK,eol>
23:10:32.970835 IP 63.245.209.21.80 > 219.127.74.122.54748: S 2028454746:2028454746(0) ack 3885439162 win 8190 <mss 1402>
23:10:32.970930 IP 219.127.74.122.54748 > 63.245.209.21.80: R 3885439162:3885439162(0) win 0
According to tcpdump, this is two times connect(2) failure are
matched.
219.127.74.122:54603 -> 66.249.89.104:80 SYN
219.127.74.122:54603 <- 66.249.89.104:80 SYN+ACK
219.127.74.122:54603 -> 66.249.89.104:80 RSET
219.127.74.122.54748 -> 63.245.209.21.80 SYN
219.127.74.122.54748 <- 63.245.209.21.80 SYN+ACK
219.127.74.122.54748 -> 63.245.209.21.80 RSET
I know why connect(2) is falure, but I don't know why close(2).
More information about the freebsd-current
mailing list