zfs l2arc warmup

Joar Jegleim joar.jegleim at gmail.com
Thu Mar 27 20:55:03 UTC 2014


I agree, and since the devs will take this into account for our next
release (a total rewrite), I may hold off on any further hardware
purchases until then; not sure yet.
I've almost got my current 960GB of l2arc filled up, so I'm going to
see how that affects performance.
It won't cover the whole dataset, only about 75% or so, but I reckon I
should see some improvement.
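
For the record, this is roughly what I'm using to watch the fill-up: a
minimal sketch that assumes FreeBSD's kstat.zfs.misc.arcstats sysctls
(the exact counter names may vary between releases).

# Poll L2ARC size and lifetime hit rate via FreeBSD's arcstats sysctls.
import subprocess
import time

BASE = "kstat.zfs.misc.arcstats."

def arcstat(name):
    out = subprocess.check_output(["sysctl", "-n", BASE + name])
    return int(out.decode().strip())

while True:
    size_gib = arcstat("l2_size") / 2.0 ** 30
    hits = arcstat("l2_hits")      # lifetime counters, so the hit
    misses = arcstat("l2_misses")  # rate below is cumulative
    total = hits + misses
    ratio = 100.0 * hits / total if total else 0.0
    print("l2arc: %6.1f GiB cached, %5.1f%% hit rate" % (size_gib, ratio))
    time.sleep(60)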

On 27 March 2014 15:53, Karl Denninger <karl at denninger.net> wrote:
>
> On 3/27/2014 9:26 AM, Bob Friesenhahn wrote:
>>
>> On Thu, 27 Mar 2014, Joar Jegleim wrote:
>>>
>>> Is this how 'you' warm up the l2arc, or am I missing something?
>>>
>>> The thing with this particular pool is that it serves somewhere
>>> between 20 and 30 million jpegs for a website. The front page of the
>>> site presents, on every reload, a mosaic of about 36 jpegs, and the
>>> jpegs are fetched completely at random from the pool.
>>> I don't know which jpegs will be fetched at any given time, so I'm
>>> installing about 2TB of l2arc (the pool is about 1.6TB today) and I
>>> want the whole pool to be available from the l2arc.
>>
>>
>> Your usage pattern is the opposite of what the ARC is designed for.
>> The ARC is meant to keep the most-often-accessed data in memory (or
>> retire it to L2ARC) based on observed access patterns.
>>
>> It does not seem necessary for your mosaic to be truly random across
>> 20 to 30 million jpegs.  Random selection across 1000 jpegs which are
>> cycled over time would produce a similar effect.
>>
>> The application building your web page mosaic can manage which files
>> will be included in the mosaic, and achieve the same effect as a huge
>> cache by always building the mosaic from a known subset of files.  The
>> 1000 jpegs used for the mosaics can be cycled over time from a random
>> selection, with old ones being removed.  This approach ensures that
>> in-memory caching is effective, since the same files will be requested
>> many times by many clients.
>>
>> Changing the problem from an OS-oriented one to an application-oriented
>> one (better algorithm) gives you more control and better efficiency.
>>
>> Bob
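
Something along these lines is, I think, what Bob is suggesting. A
purely illustrative Python sketch; the set size, refresh count and
path handling are my own guesses, not details from Bob's mail:

# Rotating-subset mosaic selection.  WORKING_SET_SIZE, REFRESH_COUNT
# and the .jpg directory walk are illustrative assumptions.
import os
import random

WORKING_SET_SIZE = 1000   # jpegs eligible for mosaics at any moment
REFRESH_COUNT = 50        # how many to swap out on each refresh
MOSAIC_SIZE = 36          # tiles per front-page mosaic

def all_jpegs(root):
    for dirpath, _, names in os.walk(root):
        for name in names:
            if name.lower().endswith(".jpg"):
                yield os.path.join(dirpath, name)

class MosaicPool(object):
    def __init__(self, root):
        self.corpus = list(all_jpegs(root))
        self.working = random.sample(self.corpus, WORKING_SET_SIZE)

    def refresh(self):
        # Call periodically: drop the oldest slice and pull in fresh
        # random picks, so the working set drifts across the corpus.
        # The occasional duplicate pick is harmless here.
        self.working = (self.working[REFRESH_COUNT:] +
                        random.sample(self.corpus, REFRESH_COUNT))

    def mosaic(self):
        # Every mosaic draws from the same small working set, so the
        # same ~1000 files stay hot in the ARC across all clients.
        return random.sample(self.working, MOSAIC_SIZE)

Refreshing a small slice every few minutes keeps each jpeg in rotation
long enough to be served many times straight from the ARC before it
cycles out.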
>
> That's true, but if he really does want it to be random across the
> entire collection, the other option, given the size (which is not
> outrageous) and the fact that the resource is going to be nearly
> read-only, is to put the files on SSDs and ignore the L2ARC entirely.
> These days that's not a terribly expensive answer, as in a read-mostly
> environment you're not going to run into a rewrite life-cycle problem
> on rationally-priced SSDs (e.g. Intel 3500s).
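
(Back-of-envelope, purely illustrative: drives in that class are rated
for on the order of a few hundred TB written.  Even rewriting the full
~1.6TB dataset once a week comes to roughly 80TB of writes a year, so a
read-mostly pool stays well inside that budget for years.)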
>
> Now an ARC cache miss is not all *that* material, since on SSD there
> is no seek or rotational latency penalty.
>
> HOWEVER, with that said, SSD is still expensive compared with rotating
> rust for bulk storage, and as Bob noted a pre-select middleware process
> would remove the need for an L2ARC and allow the use of a pool with
> much-smaller SSDs for the actual online retrieval function.
>
> Whether the coding time and expense is a good trade for the lower
> hardware cost, as against just doing it the "raw" way, is a fair
> question.
>
> --
> -- Karl
> karl at denninger.net
>
>



-- 
----------------------
Joar Jegleim
Homepage: http://cosmicb.no
Linkedin: http://no.linkedin.com/in/joarjegleim
fb: http://www.facebook.com/joar.jegleim
AKA: CosmicB @Freenode

----------------------

