HEAD r252840 (illumos bug 3836) and our TRIM are incompatible, causing deadlocks

Alexander Motin mav at FreeBSD.org
Sun Aug 4 13:53:00 UTC 2013


On 04.08.2013 15:56, Steven Hartland wrote:
>> On 31.07.2013 14:59, Alexander Motin wrote:
>>> After some experiments I have come to believe that HEAD r252840 (illumos
>>> bug 3836) and our TRIM implementation are mutually incompatible. I have
>>> found a 100% repeatable scenario that causes a deadlock when these
>>> changes are applied together. All that is needed is to create a
>>> significant write load (I've used `iozone -t 16 -s 8G` on an 8-core
>>> system with 2GB RAM and 2 striped SSDs) and run `zpool clear poolname`.
>>> After that the system is effectively dead: all I/O is stuck and even
>>> zpool commands no longer function. I think the triggering event does
>>> not have to be `zpool clear`; any event/action that takes the SCL_ZIO
>>> lock for writing should do the same.
>>>
>>> r252840 (illumos bug 3836) is based on the assumption that
>>> zio_free_sync() has no lock dependencies and should complete
>>> immediately. Unfortunately, with our TRIM implementation that is not
>>> true, because ZIO_STAGE_VDEV_IO_START was added to ZIO_FREE_PIPELINE
>>> and, while not really accessing devices, it still acquires the SCL_ZIO
>>> lock for read. As a result we get the following deadlock: `zpool clear`
>>> asks for SCL_ZIO for writing and waits for all read locks to be
>>> dropped; SCL_ZIO is held for read by regular I/Os that were running and
>>> have now completed, but to drop the lock they require a free
>>> zio_write_intr thread; unfortunately, under high load all
>>> zio_write_intr threads are stuck inside the modified zio_free(), which
>>> now tries to execute zio_free_sync() directly, which with our TRIM
>>> implementation tries to obtain SCL_ZIO for read; BANG! DEADLOCK!
>>>
>>> Reverting r252840 fixes the situation for me. If somebody has ideas how
>>> to fix the situation without reverting either change -- welcome.
>>
>> This patch works around the problem:
>> http://people.freebsd.org/~mav/zfs_patches/direct_free_and_trim.patch
>>
>> It disables r252840 when ZFS TRIM is enabled (vfs.zfs.trim.enabled=1).
>> When TRIM is disabled, the patch enables direct free execution from
>> r252840 and removes the ZIO_STAGE_VDEV_IO_START and
>> ZIO_STAGE_VDEV_IO_ASSESS stages from the pipeline.
>
> I assume you removed the vdev stages when trim is disabled as an
> optimization, due to the fact that a free wouldn't result in any
> physical IO?

As I wrote above, zio_vdev_io_start() called at the top level takes
SCL_ZIO, which causes the described deadlock. That is why I needed to
block those stages, which are useless in that case anyway, right at the
entry point. I decided it would be more efficient to remove the whole
stages than to add checks inside them.
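
A minimal sketch of that decision, not the actual patch: the stage bits,
their values, the simplified base pipeline and all three function names
below are hypothetical and chosen only for illustration. It models one
rule: a free zio may be executed directly only if its pipeline contains
no stage that takes SCL_ZIO.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stage bits; the real ZIO_STAGE_* values differ. */
enum {
	STAGE_OPEN           = 1u << 0,
	STAGE_DVA_FREE       = 1u << 1,
	STAGE_VDEV_IO_START  = 1u << 2,	/* takes SCL_ZIO for read */
	STAGE_VDEV_IO_ASSESS = 1u << 3,
	STAGE_DONE           = 1u << 4
};

/* Build the (simplified) free pipeline for the current TRIM setting. */
static uint32_t
free_pipeline(bool trim_enabled)
{
	uint32_t stages = STAGE_OPEN | STAGE_DVA_FREE | STAGE_DONE;

	/* The vdev stages are only useful when a free may issue TRIM. */
	if (trim_enabled)
		stages |= STAGE_VDEV_IO_START | STAGE_VDEV_IO_ASSESS;
	return (stages);
}

/* In this model VDEV_IO_START is the only stage that takes SCL_ZIO. */
static bool
free_takes_scl_zio(uint32_t stages)
{
	return ((stages & STAGE_VDEV_IO_START) != 0);
}

/* Direct execution is safe only when the free cannot block on SCL_ZIO. */
static bool
can_free_directly(bool trim_enabled)
{
	return (!free_takes_scl_zio(free_pipeline(trim_enabled)));
}
```

With TRIM disabled, can_free_directly() returns true and the vdev stages
are absent from the pipeline; with TRIM enabled it returns false, so the
free has to be queued as before r252840.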

-- 
Alexander Motin

