ZFS prefers iSCSI disks over local ones?

Gary Palmer gpalmer at freebsd.org
Tue Oct 3 15:18:53 UTC 2017


On Tue, Oct 03, 2017 at 05:03:18PM +0200, Ben RUBSON wrote:
> > On 03 Oct 2017, at 16:58, Steven Hartland <steven at multiplay.co.uk> wrote:
> > 
> > On 03/10/2017 15:40, Ben RUBSON wrote:
> >> Hi,
> >> 
> >> I'm starting a new thread to avoid confusion in the main one.
> >> (ZFS stalled after some mirror disks were lost)
> >> 
> >> 
> >>> On 03 Oct 2017, at 09:39, Steven Hartland wrote:
> >>> 
> >>> 
> >>>> On 03/10/2017 08:31, Ben RUBSON wrote:
> >>>> 
> >>>> 
> >>>>> On 03 Oct 2017, at 09:25, Steven Hartland wrote:
> >>>>> 
> >>>>> 
> >>>>>> On 03/10/2017 07:12, Andriy Gapon wrote:
> >>>>>> 
> >>>>>> 
> >>>>>>> On 02/10/2017 21:12, Ben RUBSON wrote:
> >>>>>>> 
> >>>>>>> Hi,
> >>>>>>> 
> >>>>>>> On a FreeBSD 11 server, the following online/healthy zpool :
> >>>>>>> 
> >>>>>>> home
> >>>>>>>  mirror-0
> >>>>>>>    label/local1
> >>>>>>>    label/local2
> >>>>>>>    label/iscsi1
> >>>>>>>    label/iscsi2
> >>>>>>>  mirror-1
> >>>>>>>    label/local3
> >>>>>>>    label/local4
> >>>>>>>    label/iscsi3
> >>>>>>>    label/iscsi4
> >>>>>>> cache
> >>>>>>>  label/local5
> >>>>>>>  label/local6
> >>>>>>> 
> >>>>>>> A sustained read throughput of 180 MB/s, 45 MB/s on each iSCSI disk
> >>>>>>> according to "zpool iostat", and nothing on the local disks (strange, but I
> >>>>>>> noticed that I/Os always go to the iSCSI disks rather than the local ones).
> >>>>>>> 
> >>>>>> Are your local disks SSD or HDD?
> >>>>>> Could it be that iSCSI disks appear to be faster than the local disks
> >>>>>> to the smart ZFS mirror code?
> >>>>>> 
> >>>>>> Steve, what do you think?
> >>>>>> 
> >>>>> Yes, that's quite possible. The mirror balancing uses the queue depth plus
> >>>>> a rotating-media bias to determine the load of each disk, so if your iSCSI
> >>>>> host is processing requests well and / or the local disks are reported as
> >>>>> rotating while the iSCSI ones are not, the mirror may well prefer reads
> >>>>> from the less loaded iSCSI devices.
> >>>>> 
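A rough sketch in C of the idea described above (this is only an illustration,
not the actual code; the names and bias values are invented, the real logic and
its tunables live in ZFS's vdev_mirror.c, vfs.zfs.vdev.mirror.* on FreeBSD):

#include <stdbool.h>

/*
 * Illustrative model of ZFS mirror read balancing: each mirror child gets
 * a "load" score from its pending I/O count plus a penalty for rotating
 * media, and the next read goes to the child with the lowest score.
 */
struct mirror_child {
        int  pending_ios;       /* current queue depth on this disk */
        bool rotating;          /* nonzero reported rotation rate */
};

int
child_load(const struct mirror_child *mc)
{
        int load = mc->pending_ios;

        if (mc->rotating)
                load += 1;      /* seek penalty for spinning disks */
        return (load);
}

/* Pick the least-loaded child for the next read. */
int
pick_read_child(const struct mirror_child *mc, int nchildren)
{
        int best = 0;

        for (int i = 1; i < nchildren; i++)
                if (child_load(&mc[i]) < child_load(&mc[best]))
                        best = i;
        return (best);
}

With all children idle (queue depth 0), the rotating penalty alone decides the
winner, which is why disks reporting rotationrate 0 end up taking all the reads.
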
> >>>> Note that the local & iSCSI disks are _exactly_ the same HDDs (same model number,
> >>>> same SAS adapter...). So the iSCSI ones should be a little bit slower due to
> >>>> network latency (even if it's very low in my case).
> >>>> 
> >>> The output from gstat -dp on a loaded machine would be interesting to see too.
> >>> 
> >> So here is the gstat -dp output:
> >> 
> >> L(q) ops/s  r/s  kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da0
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da1
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da2
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da3
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da4
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da5
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da6
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da7
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da8
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da9
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da10
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da11
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da12
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da13
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da14
> >>    1   370  370 47326  0.7   0    0  0.0   0    0  0.0 23.2| da15
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da16
> >>    0   357  357 45698  1.4   0    0  0.0   0    0  0.0 39.3| da17
> >>    0   348  348 44572  0.7   0    0  0.0   0    0  0.0 22.5| da18
> >>    0   432  432 55339  0.7   0    0  0.0   0    0  0.0 27.5| da19
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da20
> >>    0     0    0     0  0.0   0    0  0.0   0    0  0.0  0.0| da21
> >> 
> >> The 4 active drives are the iSCSI targets of the above quoted pool.
> >> 
> >> A local disk:
> >> 
> >> Geom name: da7
> >> Providers:
> >> 1. Name: da7
> >>    Mediasize: 4000787030016 (3.6T)
> >>    Sectorsize: 512
> >>    Mode: r0w0e0
> >>    descr: HGSTxxx
> >>    lunid: 5000xxx
> >>    ident: NHGDxxx
> >>    rotationrate: 7200
> >>    fwsectors: 63
> >>    fwheads: 255
> >> 
> >> An iSCSI disk:
> >> 
> >> Geom name: da19
> >> Providers:
> >> 1. Name: da19
> >>    Mediasize: 3999688294912 (3.6T)
> >>    Sectorsize: 512
> >>    Mode: r1w1e2
> >>    descr: FREEBSD CTLDISK
> >>    lunname: FREEBSD MYDEVID  12
> >>    lunid: FREEBSD MYDEVID  12
> >>    ident: iscsi4
> >>    rotationrate: 0
> >>    fwsectors: 63
> >>    fwheads: 255
> >> 
> >> Sounds like the culprit then is the rotationrate being set to 0?
> > 
> > Absolutely
> 
> Good catch then, thank you!
> 
> > and from the looks of it you're not stressing the iSCSI disks enough for them to build up high queue depths, hence the preference.
> > As load increases I would expect the local disks to start seeing activity.
> 
> Yes this is also what I see.
> 
> Is there any way, however, to also set the rotationrate to 7200 (or to a slightly greater value) for the iSCSI drives?
> I looked through ctl.conf(5) and iscsi.conf(5) but did not find anything related.
> 
> Many thanks!

Use the "option" setting in ctl.conf to change the rpm value (documented
in the OPTIONS section of ctladm(8)).
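
For example, something along these lines in /etc/ctl.conf should do it on the
target side (using the "rpm" LUN option from ctladm(8); the target name, path
and LUN number below are just placeholders for your setup):

target iqn.2012-06.com.example:mydisks {
        portal-group pg0

        lun 12 {
                path /dev/da15
                # Report a 7200 rpm rotation rate to the initiator instead
                # of the default "non-rotating" value.
                option rpm "7200"
        }
}

After reloading ctld's configuration (SIGHUP or a service restart) you may need
to log the initiator session out and back in before "geom disk list" on the
client shows the new rotationrate.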

Regards,

Gary

