Locked up processes after upgrade to ZFS v15
Jeremy Chadwick
freebsd at jdc.parodius.com
Tue Oct 12 15:45:45 UTC 2010
On Tue, Oct 12, 2010 at 06:26:55PM +0300, Andriy Gapon wrote:
> on 12/10/2010 18:18 Jeremy Chadwick said the following:
> > Got it -- just finished and is currently running/working. I also
> > installed ports/sysutils/DTraceToolkit and shells/ksh93 "just in case".
> >
> > testbox# dtrace -l | head
> >    ID   PROVIDER            MODULE                          FUNCTION NAME
> >     1     dtrace                                                     BEGIN
> >     2     dtrace                                                     END
> >     3     dtrace                                                     ERROR
> >     4   dtmalloc                                                 fbt malloc
> >     5   dtmalloc                                                 fbt free
> >     6   dtmalloc                                              cyclic malloc
> >     7   dtmalloc                                              cyclic free
> >     8   dtmalloc                                          zones_data malloc
> >     9   dtmalloc                                          zones_data free
> >
> > I can provide you root-level access to the box as well as serial console
> > if you'd prefer to do the debugging yourself, otherwise step me through
> > what's needed and I'll be happy to act as remote hands.
>
> Great! Let's start now :)
> I would like you to run the following script with "dtrace -s <script name>" in one
> terminal while running sendfile patched regression test (with TEST_EXTRA=100) in
> another. After sendfile program finishes, please ^C the DTrace script.
> Please show me complete output that you'll get from the DTrace script.
> Thanks!
>
> fbt::vm_fault:entry
> /execname == "sendfile"/
> {
>         self->vm_fault = 1;
> }
>
> fbt::vm_fault:return
> /execname == "sendfile"/
> {
>         self->vm_fault = 0;
> }
>
> fbt::zfs_freebsd_read:entry
> /self->vm_fault/
> {
>         self->zfs_read = 1;
> }
>
> fbt::zfs_freebsd_read:return
> /self->vm_fault/
> {
>         self->zfs_read = 0;
> }
>
> fbt::vm_page_lookup:return
> /self->zfs_read && arg1 != 0/
> {
>         @stacks[stack()] = count();
>         printf("\n");
>         printf("valid = 0x%02x\n", ((vm_page_t)arg1)->valid);
>         printf("flags = 0x%04x\n", ((vm_page_t)arg1)->flags);
>         printf("oflags = 0x%04x\n", ((vm_page_t)arg1)->oflags);
>         printf("pindex = %u\n", ((vm_page_t)arg1)->pindex);
>         printf("object = %p\n", ((vm_page_t)arg1)->object);
> }
Okay, I realised what I did wrong with the original incarnation of
your modified sendfile stuff -- the code defaults to using /tmp, which
I idiotically forgot to change to a ZFS filesystem (/tmp isn't ZFS
on the testbox). Now that I've changed it to /home, I can reproduce the
problem. Excellent!

Secondly: the testbox is running kernel/world source from October 8th.
I *have not* applied your kernel patch at this point. Just an FYI.

So here's what I get. Note that the sendfile process appears locked up,
sleeping on the zfsmrb wait channel (see the ps output below).
Terminal #1 (sendfile)
------------------------
testbox# ./sendfile
1..11
ok 1
ok 2
ok 3
ok 4
ok 5
ok 6
ok 7
ok 8
ok 9
ok 10
ok 11
mmap test
^C
Terminal #2 (DTrace script + ps output)
-----------------------------------------
testbox# ./zfs_sendfile.d
dtrace: script './zfs_sendfile.d' matched 5 probes
CPU ID FUNCTION:NAME
1 22458 vm_page_lookup:return
valid = 0x01
flags = 0x0000
oflags = 0x0001
pindex = 4
object = c614e550
^C
0xc457b43d
kernel`VOP_READ_APV+0x7a
kernel`vnode_pager_generic_getpages+0x329
kernel`vop_stdgetpages+0x29
kernel`VOP_GETPAGES_APV+0x83
kernel`vnode_pager_getpages+0x19a
kernel`vm_fault+0x1139
kernel`trap_pfault+0x173
kernel`trap+0x2cb
kernel`0xc07d29bc
1
testbox# ps -axl | grep sendfile
0 1318 1132 0 52 0 3324 1024 zfsmrb DL+ u0 0:00.01 ./sendfile
0 1333 1170 0 44 0 3444 1200 - R+ 0 0:00.00 grep sendfile
testbox# procstat -k -k 1318
PID TID COMM TDNAME KSTACK
1318 100126 sendfile - mi_switch+0x11b sleepq_switch+0xc1 sleepq_wait+0x39 _sleep+0x282 vm_page_sleep+0xd5 zfs_freebsd_read+0x2f3 VOP_READ_APV+0x7a vnode_pager_generic_getpages+0x329 vop_stdgetpages+0x29 VOP_GETPAGES_APV+0x83 vnode_pager_getpages+0x19a vm_fault+0x1139 trap_pfault+0x173 trap+0x2cb calltrap+0x6
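
For anyone else trying to reproduce this, the ps/procstat steps above can
be rolled into a small helper. This is only a rough sketch: the function
name stuck_pids is made up for illustration, and it assumes the wait
channel (MWCHAN) is the 9th column of "ps -axl" output, as it is in the
output shown above.

```shell
# Print the PID (2nd column) of every process whose wait channel
# (assumed to be the 9th column of FreeBSD "ps -axl" output) equals
# the name given as the first argument.
# Reads ps-style lines on stdin, so it also works on saved output.
# NOTE: stuck_pids is a hypothetical helper name, not an existing tool.
stuck_pids() {
    awk -v wchan="$1" '$9 == wchan { print $2 }'
}

# Possible use on a live box, reusing the procstat invocation from above:
#   ps -axl | stuck_pids zfsmrb | while read -r pid; do
#       procstat -k -k "$pid"
#   done
```

Matching on the wait channel column rather than grepping for "sendfile"
avoids picking up the grep process itself.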
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
More information about the freebsd-fs mailing list