Zfs heavy io writing | zfskern txg_thread_enter
killing at multiplay.co.uk
Fri Feb 19 13:51:24 UTC 2016
On 19/02/2016 13:06, Michelle Sullivan wrote:
> Steven Hartland wrote:
>> On 19/02/2016 11:58, Michelle Sullivan wrote:
>>> Niccolò Corvini wrote:
>>>> Hi, first time here!
>>>> We are having a problem with a server running FreeBSD 9.1 with ZFS on a
>>> You should upgrade to a supported version first... 9.3 would probably
>>> be the best (rather than 10.x) as it's still supported and uses the same
>>> ABI (i.e. you shouldn't need to reinstall all your ports/packages - though
>>> you should because it sometimes breaks things - at least check for
>>> broken things :) .)
>>> If you're not familiar "freebsd-update -r 9.3-RELEASE upgrade" will help
>>> you do it without too many problems.
>> 9.3 is still ancient, and while "supported" it's not in active
>> development, and to be blunt no one will be interested in helping to
>> diagnose any actual issue on something so old.
> So supported is not really supported... Is that an official position?
Supported for 9.x, which is a "Legacy Release", I would say is supported
for security and other critical issues only, which is the same for
pretty much every project / product out there.
>> 10.x has a totally different ZFS IO scheduler for example, so it
>> behaves differently for most workloads.
> But the user is on 9.x not 10.x and 10.x changes a lot more than just
> the ZFS IO scheduler. If this is a production machine, then an upgrade
> to 9.3 may be easier as it would require less regression testing.... Or
> is this another case of people don't run FreeBSD in production
> environments so it doesn't matter...?
Yes, but 9.x is already legacy and becomes unsupported in December of
this year, so the process of migration to 10.x should be well on the way
by now tbh.
>>>> single SATA drive. Since a few days ago, in the morning the system is
>>>> really slow due to really heavy IO writing. We investigated and we think
>>>> it might start at night, maybe correlated to cron daily (standard),
>>>> but we are not sure. After a few hours the situation returns to normal.
>>> Yeah this sounds like something I am quite familiar with... It's the
>>> security check cronjob that runs every day... it's looking for any
>>> setuid/setgid files, new/modified files...etc... across all file systems
>> This is quite likely, so while updating to 10 may not fix the issue
>> [...] running on 9.x.
> Which means you just told the user to do something that is not likely to
> fix the issue but will give them more problems to deal with so they
> might forget the original problem in the meantime... You know this was
> the reason I was the Technical Lead for the support teams first in
> Europe then in AsiaPAC for Netscape back in the day, and why I never
> worked for Microsoft... You diagnose the problem as best as possible
> with as few changes to the system as possible at first, then if all else fails
> or you come across evidence that points to a known bug that you know is
> fixed you tell them to upgrade to the latest *supported* version
>> Be aware that 10.3-BETA2 has a known issue related to vnode memory
>> usage which can be triggered by such workloads so trying BETA3 when
>> released, which should address this would be a good idea.
> ...supported version... i.e. *NOT* a BETA release - especially if that
> beta release has other known issues that might well trigger on the very
> problem they indicated...!
> Seriously, sorry to pick on you, but "Upgrade to 10.x Beta as it might
> help" is not the *first* answer anyone should give... you might as well
> have told him to upgrade to 11.... because that has as much chance of
> fixing the problem...
Beta and soon to become RC, so no, it's nothing like 11.
If no one bothered to update and test until release, then you can't
expect a good result at the end of the day, now can you? There's a reason
why people like Netflix run code close to head. Gleb Smirnoff iirc has a
YouTube video talking about it, you should watch it.
Obviously when updating to any new version, be that on a mission-critical
system or a dev test box, the implications must be considered.
At the end of the day it doesn't change the fact that if any
investigation results in the issue requiring code changes, you're not
going to get that running 9.x, and it may already be fixed, so you'll
have wasted your time.
Yes you could spend lots of time investigating the problem to come to
that conclusion and be at a dead end or you could get on a version which
is being actively developed and hence help that process along, which is
something that's going to need to be done anyway, so why not take a step
in the right direction.
At the end of the day it's a balancing act and something that only the
user can decide given the relevant information.
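FWIW, if you want to test the cron theory before deciding anything, a
minimal sketch would be to turn off the setuid scan for a night and see
whether the morning IO storm goes away. The knob names below are from
memory and differ between branches, so treat them as assumptions and check
/etc/defaults/periodic.conf on the box itself first. They would go in
/etc/periodic.conf:

```shell
# /etc/periodic.conf (sh syntax, overrides /etc/defaults/periodic.conf)
# Assumed knob name on 9.x; confirm it exists in the defaults file first.
daily_status_security_chksetuid_enable="NO"
# Assumed knob name on 10.x and later, where the security run was split out.
security_status_chksetuid_enable="NO"
```

If the heavy IO stops, the daily security scan is the trigger; if it
doesn't, you've ruled it out at the cost of a config line you can revert.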
More information about the freebsd-fs mailing list