ZFS + NFS poor performance after restarting from 100 day uptime
Josh Beard
josh at signalboxes.net
Fri Mar 22 20:24:49 UTC 2013
On Fri, Mar 22, 2013 at 1:07 PM, Steven Hartland <killing at multiplay.co.uk> wrote:
>
> ----- Original Message ----- From: Josh Beard
>>
>>> A snip of gstat:
>>>
>>> dT: 1.002s w: 1.000s
>>> L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
>>>
>>> ...
>>> 4 160 126 1319 31.3 34 100 0.1 100.3| da1
>>> 4 146 110 1289 33.6 36 98 0.1 97.8| da2
>>> 4 142 107 1370 36.1 35 101 0.2 101.9| da3
>>> 4 121 95 1360 35.6 26 19 0.1 95.9| da4
>>> 4 151 117 1409 34.0 34 102 0.1 100.1| da5
>>> 4 141 109 1366 35.9 32 101 0.1 97.9| da6
>>> 4 136 118 1207 24.6 18 13 0.1 87.0| da7
>>> 4 118 102 1278 32.2 16 12 0.1 89.8| da8
>>> 4 138 116 1240 33.4 22 55 0.1 100.0| da9
>>> 4 133 117 1269 27.8 16 13 0.1 86.5| da10
>>> 4 121 102 1302 53.1 19 51 0.1 100.0| da11
>>> 4 120 99 1242 40.7 21 51 0.1 99.7| da12
>>>
>>> Your ops/s are maxing out your disks. You say "only" but the ~190 ops/s
>>> is what HDs will peak at, so whatever your machine is doing is causing
>>> it to max out the available IO for your disks.
>>>
>>> If you boot back to your previous kernel does the problem go away?
>>>
>>> If so, you could look at the changes between the two kernel revisions
>>> for possible causes and, if needed, do a binary chop with kernel builds
>>> to narrow down the cause.
>>>
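The binary chop Steven suggests could be sketched roughly as below. This is only an illustration: the revision numbers are made up, and `is_bad` is a hypothetical stand-in for the manual step of building and booting a kernel at a given revision and rerunning the workload. It also assumes the regression, once introduced, is present in every later revision.

```python
def first_bad_revision(good, bad, is_bad):
    """Return the earliest revision in (good, bad] for which is_bad() is True.

    Assumes `good` is known good, `bad` is known bad, and that the
    regression stays present in every revision after it was introduced.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid      # regression is at mid or earlier
        else:
            good = mid     # regression was introduced after mid
    return bad


if __name__ == "__main__":
    # Hypothetical revision numbers and a stand-in test. In practice,
    # is_bad would mean: build and boot a kernel at that revision, rerun
    # the workload, and judge whether the disks are pegged again.
    culprit = 1042
    print(first_bad_revision(1000, 1100, lambda rev: rev >= culprit))
```

Each step halves the range, so about 100 candidate revisions need roughly seven kernel builds rather than one per revision.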
>>
>> Thanks for your response. I booted with the old kernel (9.1-RC3) and the
>> problem disappeared! We're getting 3x the performance with the previous
>> kernel compared to the 9.1-RELEASE-p1 kernel:
>>
>> Output from gstat:
>>
>> 1 362 0 0 0.0 345 20894 9.4 52.9| da1
>> 1 365 0 0 0.0 348 20893 9.4 54.1| da2
>> 1 367 0 0 0.0 350 20920 9.3 52.6| da3
>> 1 362 0 0 0.0 345 21275 9.5 54.1| da4
>> 1 363 0 0 0.0 346 21250 9.6 54.2| da5
>> 1 359 0 0 0.0 342 21352 9.5 53.8| da6
>> 1 347 0 0 0.0 330 20486 9.4 52.3| da7
>> 1 353 0 0 0.0 336 20689 9.6 52.9| da8
>> 1 355 0 0 0.0 338 20669 9.5 53.0| da9
>> 1 357 0 0 0.0 340 20770 9.5 52.5| da10
>> 1 351 0 0 0.0 334 20641 9.4 53.1| da11
>> 1 362 0 0 0.0 345 21155 9.6 54.1| da12
>>
>>
>> The kernels were compiled identically using GENERIC with no modification.
>> I'm no expert, but none of the stuff I've seen looking at svn commits
>> looks like it would have any impact on this. Any clues?
>>
>
> You're seeing a totally different profile there, Josh: all writes and no
> reads, whereas before you were seeing mainly reads and some writes.
>
> So I would ask if you're sure you're seeing the same workload, or has
> something external changed too?
>
> Might be worth rebooting back to the new kernel and seeing if you
> still see the issue ;-)
>
>
> Regards
> Steve
>
>
Steve,
You're absolutely right. I didn't catch that, but the total ops/s is
reaching quite a bit higher. Things are certainly more responsive than
they have been, for what it's worth, so it "feels right." I'm also not
seeing this thing consistently railed to 100% busy like I was before with
similar testing (that was 50 machines just pushing data with dd). I won't
be able to get a good comparison until Monday, when our students come back
(this is a file server for a public school district and used for network
homes).
Josh
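
Steve's point about the differing read/write mix can be checked mechanically by totalling the gstat columns for each sample. The following is a minimal sketch, not a gstat tool: the function names are made up for illustration, it assumes the ten-column layout shown in the header line above, and it uses only the da1 and da2 rows from each sample in this thread.

```python
def parse_gstat_row(line):
    """Parse one gstat data row; return None for headers and other lines."""
    # Drop mail quote markers and the '|' that gstat puts after %busy.
    fields = line.lstrip("> ").replace("|", " ").split()
    if len(fields) != 10:
        return None
    try:
        nums = [float(x) for x in fields[:9]]
    except ValueError:
        return None  # column header, not a data row
    return {
        "name": fields[9],
        "ops": nums[1],
        "reads": nums[2],
        "read_kbps": nums[3],
        "writes": nums[5],
        "write_kbps": nums[6],
        "busy": nums[8],
    }


def summarize(text):
    """Total the per-disk counters and average %busy over all data rows."""
    rows = [r for r in map(parse_gstat_row, text.splitlines()) if r]
    totals = {k: sum(r[k] for r in rows)
              for k in ("ops", "reads", "read_kbps", "writes", "write_kbps")}
    totals["mean_busy"] = sum(r["busy"] for r in rows) / len(rows) if rows else 0.0
    return totals


# Two rows from each sample quoted in this thread.
NEW_KERNEL = """\
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
>> 4 160 126 1319 31.3 34 100 0.1 100.3| da1
>>> 4 146 110 1289 33.6 36 98 0.1 97.8| da2
"""
OLD_KERNEL = """\
>> 1 362 0 0 0.0 345 20894 9.4 52.9| da1
>> 1 365 0 0 0.0 348 20893 9.4 54.1| da2
"""

if __name__ == "__main__":
    for label, sample in (("9.1-RELEASE-p1", NEW_KERNEL), ("9.1-RC3", OLD_KERNEL)):
        print(label, summarize(sample))
```

On these rows the first sample is dominated by small reads and the second is entirely writes, so the two samples describe different workloads rather than the same workload on two kernels.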