CAM Target over FC and UNMAP problem

Christopher Forgeron csforgeron at gmail.com
Wed Dec 2 00:39:49 UTC 2015


Example of the difference between the machines:

vPool175 (its zpool drives are all iSCSI)

dT: 1.060s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d    o/s   ms/o   %busy Name
*   64     97      0      0    0.0      0      0    0.0     97   6643  713.4      0    0.0  109.0| da1*



pool92 (it's the zvol target for vPool175's iSCSI connection)

dT: 1.003s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d    o/s   ms/o   %busy Name
    0      0      0      0    0.0      0      0    0.0      0      0    0.0      0    0.0    0.0| cd0
    0     54     54   3447    2.8      0      0    0.0      0      0    0.0      0    0.0   14.9| da0
    0     62     62   3958    3.6      0      0    0.0      0      0    0.0      0    0.0   22.0| da1
    0     42     42   2681    3.3      0      0    0.0      0      0    0.0      0    0.0   13.9| da2
    0     44     44   2809    2.4      0      0    0.0      0      0    0.0      0    0.0   10.5| da3
    0     57     57   3638    3.8      0      0    0.0      0      0    0.0      0    0.0   21.7| da4
    0     39     39   2489    3.9      0      0    0.0      0      0    0.0      0    0.0   15.1| da5
    1    159      0      0    0.0      0      0    0.0    159  10405    6.3      0    0.0  100.2| zvol/pool92/iscsi0


There is a slight time lag between my copies, but you can see the ~100
delete ops per second on both sides.

Perhaps the queue depth is nothing, but as you can see on the pool92 side,
the physical disks are hardly doing any work.
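
(For anyone trying to reproduce the comparison: output like the above comes
from gstat's delete statistics. A minimal sketch, assuming a 1-second refresh
is enough and that filtering to the da/zvol providers is wanted - the -f
regex here is only an illustration:

gstat -d -I 1s -f 'da|zvol'

The -d flag adds the d/s, kBps and ms/d delete columns, which is where the
UNMAP/BIO_DELETE traffic shows up.)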


On Tue, Dec 1, 2015 at 8:32 PM, Christopher Forgeron <csforgeron at gmail.com>
wrote:

> Thanks for the update.
>
> Perhaps my situation is different.
>
> I have a zpool (vPool175) that is made up of iSCSI disks.
>
> Those iSCSI disks target a zvol on another machine (pool92) made of real
> disks.
>
> The performance of the UNMAP system is excellent when we're talking bulk
> UNMAPs - I can UNMAP a 5TiB zvol in 50 seconds or so. zpool create and
> destroy are fairly fast in this situation.
>
> However, once my vPool175 is handling random writes, the UNMAP performance
> is terrible.
>
> 5 minutes of random writes (averaging 1000 iops) will result in 50 MINUTES
> of UNMAPs after the test run, and it will often hang on I/O before the
> full 5 minutes of random writes are up. I feel like the UNMAP buffer/count
> is being exceeded (10,000 pending operations by default).
>
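> (To check what the pending-UNMAP/TRIM limits actually are on a given box,
> rather than guessing at exact tunable names, the relevant sysctl subtrees
> can simply be dumped - a minimal sketch:
>
> sysctl vfs.zfs.trim
> sysctl vfs.zfs.vdev | grep trim
>
> The 10,000-pending default mentioned above should show up in one of those
> lists if it exists on this release.)
>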
> Sync writes don't have this issue: 5 minutes of 800 iops sync writes will
> result in ~3 minutes of UNMAP operations after the test is finished.
>
> It's not a CTL issue, I would think - it's to do with the way ZFS needs to
> read metadata before it can write out the UNMAPs.
>
> HOWEVER:
>
> I do notice that my remote zpool that is the iSCSI initiator (vPool175)
> can keep a queue depth of 64 for the UNMAP operations, but on the iSCSI
> target machine (pool92) the queue depth for the UNMAP operations on the
> zvol is never more than 1. I've tried modifying the various vfs.zfs.vdev
> write controls, but none of them are set to a value of 1, so perhaps CTL is
> only passing a queue depth of 1 on for UNMAP operations? The zvol should be
> UNMAPing at the same queue depth as the remote machine - up to 64.
>
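> (For completeness, the write/queue-depth controls I mean can be listed in
> one shot - a minimal sketch, assuming the stock 10.x vdev queue sysctl
> names:
>
> sysctl vfs.zfs.vdev | grep -E 'min_active|max_active'
>
> That prints the async/sync read, write and trim min/max active pairs, and
> none of them are set to 1 here.)
>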
> I've tried setting trim_min_active on the zvol machine, but no luck there
> either:
>
> root at pool92:~ # sysctl vfs.zfs.vdev.trim_min_active=10
> vfs.zfs.vdev.trim_min_active: 1 -> 10
>
> The zvol queue stays at 0/1 depth during the 50 minutes of small-block
> UNMAP.
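>
> (Had it helped, the setting could have been made persistent across reboots
> with a line in /etc/sysctl.conf - a minimal sketch:
>
> echo 'vfs.zfs.vdev.trim_min_active=10' >> /etc/sysctl.conf
>
> but as it stands it makes no difference to the observed queue depth.)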
>
> In fact, in general, the queue of the zvol on the iSCSI target machine
> (pool92) seems to run at a very low depth.
>
> I feel that if we could at least get that queue depth up, we'd have a
> chance of keeping up with the remote system's UNMAP requests.
>
> I'm curious to experiment with deeper tag depths - say 4096 - to see if
> UNMAP aggregation will help out. That may be for tomorrow.
>
> On Tue, Dec 1, 2015 at 1:34 PM, Alexander Motin <mav at freebsd.org> wrote:
>
>> Not really.  But just as an idea, you may try to set the loader tunable
>> vfs.zfs.trim.enabled=0 . Aside from obviously disabling TRIM, which is a
>> no-op for a non-SSD ZFS pool anyway, it also changes the way deletes are
>> handled, possibly making them less blocking.
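>>
>> (Concretely - a minimal sketch of the above, assuming a reboot of the
>> target box is acceptable: put the line
>>
>> vfs.zfs.trim.enabled=0
>>
>> into /boot/loader.conf, reboot, and confirm afterwards with
>> "sysctl vfs.zfs.trim.enabled" that it now reads 0.)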
>>
>> I see no issue here from the CTL side -- it behaves well. It indeed does
>> not limit the size of a single UNMAP operation, which is not good, but
>> that is because it has no idea about the performance of the backing store.
>>
>> On 01.12.2015 16:39, Christopher Forgeron wrote:
>> > Did this ever progress further? I'm load testing 10.2 zvol / UNMAP, and
>> > having similar lockup issues.
>> >
>> > On Thu, Mar 5, 2015 at 1:10 PM, Emil Muratov <gpm at hotplug.ru> wrote:
>> >
>> >
>> >     I've got an issue with CTL UNMAP and zvol backends.
>> >     It seems that UNMAPs from the initiator, passed down to underlying
>> >     disks without TRIM support, cause IO blocking for the whole pool. Not
>> >     sure where to address this problem.
>> >
>> >     My setup:
>> >      - plain SATA 7.2 krpm drives attached to Adaptec aacraid SAS controller
>> >      - zfs raidz pool over plain drives, no partitioning
>> >      - zvol created with volmode=dev
>> >      - Qlogic ISP 2532 FC HBA in target mode
>> >      - FreeBSD 10.1-STABLE #1 r279593
>> >
>> >     Create a new LUN with a zvol backend
>> >
>> >     ctladm realsync off
>> >     ctladm port -o on -p 5
>> >     ctladm create -b block -o file=/dev/zvol/wd/tst1 -o unmap=on -l 0 -d wd.tst1 -S tst1
>> >
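>> >     After creating it, the LUN and the port can be checked from the
>> >     target side - a minimal sketch, assuming a stock ctladm; the exact
>> >     fields shown may differ by release:
>> >
>> >     ctladm devlist -v
>> >     ctladm port -l
>> >
>> >     devlist -v should list the LUN with its backend options (unmap=on
>> >     among them), and port -l should show the isp frontend port enabled.
>> >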
>> >     Both target and initiator hosts are connected to the FC fabric. The
>> >     initiator is a Win2012 server; actually it is a VM with an RDM LUN
>> >     passed to the guest OS. Formatting, reading and writing large amounts
>> >     of data (file copy/IOmeter) works fine - so far so good.
>> >     But as soon as I try to delete large files, all IO to the LUN blocks
>> >     and the initiator system just iowaits. gstat on the target shows the
>> >     underlying disk load bumped to 100% with the queue up to 10, but with
>> >     no writes actually in progress, only a decent amount of reads. After
>> >     a minute or so the IO unblocks for a second or two, then blocks
>> >     again, and so on until all the UNMAPs are done; it can take up to 5
>> >     minutes to delete a 10 GB file. The 'logicalused' property of the
>> >     zvol shows that the deleted space was actually released. The system
>> >     log is filled with CTL messages:
>> >
>> >
>> >     kernel: (ctl2:isp1:0:0:3): ctlfestart: aborted command 0x12aaf4 discarded
>> >     kernel: (2:5:3/3): WRITE(10). CDB: 2a 00 2f d4 74 b8 00 00 08 00
>> >     kernel: (2:5:3/3): Tag: 0x12ab24, type 1
>> >     kernel: (2:5:3/3): ctl_process_done: 96 seconds
>> >     kernel: (ctl2:isp1:0:0:3): ctlfestart: aborted command 0x12afa4 discarded
>> >     kernel: (ctl2:isp1:0:0:3): ctlfestart: aborted command 0x12afd4 discarded
>> >     kernel: ctlfedone: got XPT_IMMEDIATE_NOTIFY status 0x36 tag 0xffffffff seq 0x121104
>> >     kernel: (ctl2:isp1:0:0:3): ctlfe_done: returning task I/O tag 0xffffffff seq 0x1210d4
>> >
>> >
>> >     I've tried tweaking some sysctls, but with no success so far.
>> >
>> >     vfs.zfs.vdev.bio_flush_disable: 1
>> >     vfs.zfs.vdev.bio_delete_disable: 1
>> >     vfs.zfs.trim.enabled=0
>> >
>> >
>> >     Disabling UNMAP in CTL (-o unmap=off) resolves the issue completely,
>> >     but then there is no space reclamation for the zvol.
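>> >
>> >     (For reference, that is just the same create line with the option
>> >     flipped, after removing the old LUN first - a sketch:
>> >
>> >     ctladm create -b block -o file=/dev/zvol/wd/tst1 -o unmap=off -l 0 -d wd.tst1 -S tst1
>> >
>> >     With unmap=off the LUN should no longer advertise UNMAP support, so
>> >     the initiator stops issuing it, at the cost of space reclamation.)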
>> >
>> >     Any hints would be appreciated.
>> >
>> >
>>
>>
>> --
>> Alexander Motin
>>
>
>

