ZFS: arc_reclaim_thread running 100%, 8.1-RELEASE, LBOLT related
Artem Belevich
art at freebsd.org
Tue May 31 16:48:27 UTC 2011
On Tue, May 31, 2011 at 8:24 AM, Andriy Gapon <avg at freebsd.org> wrote:
>>> nsec = (hrtime_t)ts.tv_sec * NANOSEC + ts.tv_nsec;
>>
>> Yup. This would indeed overflow in ~106.75 days.
>
> Have you referred to the LBOLT above?
> gethrtime() should have several hundred years before overflowing.
hrtime_t is 64-bit. NANOSEC=1000000000.
When it's time to use LBOLT, we further multiply number of seconds by HZ:
>> #define LBOLT ((gethrtime() * hz) / NANOSEC)
In the end we want a signed 64-bit scalar to hold the number of seconds
times NANOSEC * hz, i.e. times 1e12 with HZ=1000.
0x7fffffffffffffff/1000000000000 = 9223372 # number of seconds before signed overflow
9223372/(24*60*60) --> 106 # .. or about 106 days
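
For anyone who wants to re-check the arithmetic above, here is a minimal
sketch in C. The helper names are made up for illustration and are not
part of the ZFS or opensolaris compat code; it only assumes a signed
64-bit intermediate and NANOSEC=1000000000.

```c
#include <assert.h>
#include <stdint.h>

#define NANOSEC 1000000000LL

/*
 * Hypothetical helper: seconds of uptime before (nanoseconds * hz)
 * no longer fits in a signed 64-bit integer.
 */
static int64_t
lbolt_overflow_seconds(int64_t hz)
{
	assert(hz > 0);
	return (INT64_MAX / (NANOSEC * hz));
}

/* Hypothetical helper: the same limit expressed in whole days. */
static int64_t
lbolt_overflow_days(int64_t hz)
{
	return (lbolt_overflow_seconds(hz) / (24 * 60 * 60));
}
```

With hz=1000 this gives the 9223372 seconds (106 days) shown above. With
hz=100 the limit moves out by a factor of ten, which is still well short
of what gethrtime() itself can represent.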
>> The side effect is that it limits bolt resolution to hz units. With
>> HZ=100, that will be 10ms. Whether it's good enough or too coarse I
>
> Nope, I think you did your math wrong here.
> As shown above it limits the resolution to hz ticks, i.e. hz * 1/hz seconds :)
Point taken.
>
>> have no idea. Perhaps we can compromise and update lbolt in
>> microseconds. That should give us a few hundred years until the
>> overflow.
>
> Well, we can either use the ticks variable, since we are not switching to tickless
> mode yet. But we would need to make it 64-bit to avoid early overflows.
> Or, perhaps, to be somewhat future-friendly we could do approximately what
> OpenSolaris [upstream :-)] code does:
>
> gethrtime() / (NANOSEC / hz)
>
> Or, given that NANOSEC is constant and hz is invariant, we could apply extended
> invariant division by multiplication approach to get precise and fast result
> without overflowing. But likely that's an overkill here. Though we definitely
> should pre-calculate, store and re-use (NANOSEC / hz) value just like OpenSolaris
> does it.
This should work.
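
A rough sketch of the divide-first form, to show why it avoids the
problem. This is not the actual patch: the function names are
hypothetical, hrtime_t is assumed to be a signed 64-bit nanosecond count,
and hz is passed as a parameter instead of using the kernel global.

```c
#include <assert.h>
#include <stdint.h>

#define NANOSEC 1000000000LL

typedef int64_t hrtime_t;	/* assumed: signed 64-bit nanoseconds */

/*
 * Hypothetical: convert nanoseconds to ticks by dividing by the
 * number of nanoseconds per tick. No multiplication, so the result
 * can never exceed the input. In real code (NANOSEC / hz) would be
 * computed once and stored.
 */
static int64_t
lbolt_div_first(hrtime_t now, int64_t hz)
{
	int64_t nsec_per_tick;

	assert(hz > 0 && hz <= NANOSEC);
	nsec_per_tick = NANOSEC / hz;
	return (now / nsec_per_tick);
}

/*
 * Hypothetical: report whether the multiply-first form
 * (now * hz) / NANOSEC would overflow a signed 64-bit intermediate.
 */
static int
mul_first_overflows(hrtime_t now, int64_t hz)
{
	assert(hz > 0);
	return (now > INT64_MAX / hz);
}
```

At 107 days of uptime with hz=1000, mul_first_overflows() reports an
overflow while lbolt_div_first() still returns 9244800000 ticks. The
trade-off is that NANOSEC / hz truncates when hz does not divide NANOSEC
evenly, so the tick count drifts slightly for such values of hz.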
>
>>> clock_t will still need to be typedef'ed to 64-bit to address the l2arc usage of LBOLT.
>
> Hm, I see that OpenSolaris defines clock_t to long, while we use int32_t.
> So, I think that it means two things:
> - i386 OpenSolaris (if it exists) should be affected by the problem as well
> - we should not use our native clock_t definition for ported OpenSolaris code
>
> Maybe we should fix our clock_t to be something wider at least for 64-bit
> platforms. But I am not prepared to discuss this.
>
> To summarize:
> 1. We must fix ddi_get_lbolt*() to avoid unnecessary overflow in multiplication
Agreed.
> 2. We could fix 31-bit clock_t overflow by using Solaris-compatible definition of
> it (long), but that still won't help on i386. Maybe we should bring up this issue
> to the attention of upstream ZFS developers. Universally using ddi_get_lbolt64()
> seems like a safer bet.
FYI, we've already changed clock_t for opensolaris code to int64_t in
r218169 regardless of $MACHINE.
--Artem