kern/145339: [zfs] deadlock after detaching block device from raidz pool

Alex Bakhtin alex.bakhtin at gmail.com
Sat May 22 23:40:42 UTC 2010


Pawel,

	I did some additional testing. Now I'm 95 percent sure that this
deadlock was introduced by this patch. I tried both patched and unpatched
GENERIC kernels. The deadlock seems harder to reproduce on raidz1 than on
raidz2. With raidz2 I detached and reattached two disks at the same time:
the deadlock is 100% reproducible on the patched kernel, but I can't
reproduce it on the unpatched kernel.

	How to reproduce:

1. Create a raidz2 pool.
2. Detach two devices while the pool is idle.
3. Start writing to the pool (dd if=/dev/zero of=/storage/test bs=1m)
4. Run atacontrol detach/attach to get the disks back.
5. Online two disks at the same time (zpool online storage adX adY).
6. Wait some time (in my testing several seconds, less than one minute).
All disk activity stops. After that it's impossible to abort the
zpool online or dd command, and it's impossible to reboot without a
hard reset.
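
	The steps above could be scripted roughly as follows. This is only
a sketch: the pool name "storage" is from the steps above, but the disk
names (ad4 ... ad12), the ATA channel names (ata2, ata3) and the idea that
step 2 means physically unplugging the disks are assumptions for
illustration. By default it only prints the commands (DRY_RUN=1).

```sh
#!/bin/sh
# Sketch of the reproduction steps. Pool name "storage" is from the report;
# disk and channel names below are assumed placeholders, adjust before use.
POOL=${POOL:-storage}
DISKS=${DISKS:-"ad4 ad6 ad8 ad10 ad12"}   # raidz2 members (assumed names)
PULLED=${PULLED:-"ad4 ad6"}               # the two disks to pull (assumed)
CHANNELS=${CHANNELS:-"ata2 ata3"}         # their ATA channels (assumed)
DRY_RUN=${DRY_RUN:-1}                     # 1 = only print the commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# Manual step: print a note in dry-run mode, otherwise wait for Enter.
pause() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "# manual: $*"
    else
        printf '%s, then press Enter: ' "$*"
        read answer
    fi
}

repro() {
    # 1. Create the raidz2 pool.
    run zpool create "$POOL" raidz2 $DISKS
    # 2. Detach two devices while the pool is idle (assumed: unplug them).
    pause "unplug $PULLED"
    # 3. Start writing to the pool in the background.
    if [ "$DRY_RUN" = 1 ]; then
        echo "dd if=/dev/zero of=/$POOL/test bs=1m &"
    else
        dd if=/dev/zero of="/$POOL/test" bs=1m &
    fi
    # 4. Plug the disks back in and rescan their channels.
    pause "plug $PULLED back in"
    for ch in $CHANNELS; do
        run atacontrol detach "$ch"
        run atacontrol attach "$ch"
    done
    # 5. Online both disks with a single command.
    run zpool online "$POOL" $PULLED
    # 6. Disk activity should stop within about a minute on an affected kernel.
}

repro
```

Run it with DRY_RUN=0 only on a test machine: on an affected kernel the
last step ends in a hang that needs a hard reset.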

	If you need a core dump from the deadlocked kernel, please let me know.

Alex Bakhtin

-----Original Message-----
From: Alex Bakhtin [mailto:alex.bakhtin at gmail.com] 
Sent: Monday, May 17, 2010 7:37 PM
To: pjd at freebsd.org
Cc: freebsd-fs at freebsd.org; bug-followup at freebsd.org
Subject: Re: kern/145339: [zfs] deadlock after detaching block device from
raidz pool

Pawel,

   I tested your patch in the following zfs configurations (all on
5x2TB WD20EARS drives):

1. raidz1 on top of physical disks.
2. raidz1 on top of geli.
3. raidz2 on top of physical disks.

   In all three cases it seems that the problem was fixed - I can no
longer crash zfs in vdev_geom by unplugging a disk.

   Unfortunately, 3 times I got a deadlock in zfs after plugging
vdevs back in under load. It happens several seconds after the zpool
online command. I'm not 100 percent sure that the deadlocks are related
to this patch, but... I'm going to do some additional testing with
patched and unpatched kernels.

2010/5/13  <pjd at freebsd.org>:
> Synopsis: [zfs] deadlock after detaching block device from raidz pool
>
> State-Changed-From-To: open->feedback
> State-Changed-By: pjd
> State-Changed-When: Thu May 13 09:33:20 UTC 2010
> State-Changed-Why:
> Could you try this patch:
>
>        http://people.freebsd.org/~pjd/patches/vdev_geom.c.3.patch
>
> It is against the most recent HEAD. If it is rejected on 8-STABLE, just
> grab the entire vdev_geom.c from HEAD and patch that.
>
>
> Responsible-Changed-From-To: freebsd-fs->pjd
> Responsible-Changed-By: pjd
> Responsible-Changed-When: Thu May 13 09:33:20 UTC 2010
> Responsible-Changed-Why:
> I'll take this one.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=145339
>
