stat speed

Eric Anderson anderson at centtech.com
Sat Feb 18 20:39:48 PST 2006


Eric Anderson wrote:
> Mark Bucciarelli wrote:
>> On Sat, Feb 18, 2006 at 11:06:57PM -0500, Mark Bucciarelli wrote:
>>  
>>> I'm curious how fast stat is.
>>>
>>> I generated a list of 200,000 file names
>>>
>>>     # find / | head -200000 > files.statspeed
>>>
>>> then ran a million iterations of randomly picking a file name and
>>> stating it (see attached program).
>>>     
>>
>> Hmmm, 200,000 files, 1,000,000 iterations.  On average, each file
>> was hit five times.  Uhh, that's not a good way to avoid caching.  Doh.
>>
>> Wow, caching is pretty amazing. I just reran the program, this time
>> using 500,000 file paths and only stat'ing 10,000 of them.
>>
>> The first run was 99,059/second, the second was 188,239.
>>
>> So I guess 100,000/second is about right on my system w/o cache.
>>   
>
> I'm also wondering whether, by using find and taking the list of
> files/directories in its default order, you might be seeing results
> that aren't really random.  What I mean is, find traverses the tree,
> probably walking directories by inode number or last-modified time
> (can't recall which).  Either way, it's possible your list consisted
> of clumps of files/dirs in the same cylinder groups, especially since
> you grabbed the first 500k files instead of picking files at random
> from the entire list of files on the filesystem.  This is all
> speculative, but if you had lots of files in one directory, those
> could be clumped in a few cylinder groups, and you might therefore
> see higher numbers than you would sampling from the entire disk
> (since the speed is probably dominated by disk seeks, I believe).
>
> What exactly are you trying to determine?

You are also timing the rand() function.  I suggest randomizing the list 
first, then stat'ing the files in that randomized order.
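
A rough sketch of what I mean, in Python rather than whatever the
attached program uses.  The function name time_stats and its parameters
are made up for illustration.  The idea is to draw the random sample
from the whole list and fix its order before the clock starts, so the
timed loop contains only the stat calls.  Interpreter overhead will
inflate the absolute numbers compared to C, so treat this as an
illustration of the ordering, not a calibrated benchmark.

```python
import os
import random
import time


def time_stats(paths, sample_size, seed=None):
    """Stat a random sample of paths; return (ok, failed, elapsed_seconds).

    Hypothetical helper: `paths` is a list of file names, e.g. the lines
    of the file produced by find.
    """
    rng = random.Random(seed)
    # Draw the sample from the entire list, in random order, before the
    # timer starts.  No random numbers are generated inside the timed loop.
    sample = rng.sample(paths, min(sample_size, len(paths)))

    ok = 0
    failed = 0
    start = time.perf_counter()
    for path in sample:
        try:
            os.stat(path)
            ok += 1
        except OSError:
            # The file vanished since the list was built, or is not accessible.
            failed += 1
    elapsed = time.perf_counter() - start
    return ok, failed, elapsed
```

To use it, read the find output into a list (one path per line), call
time_stats(paths, 10000), and divide the number of successful stats by
the elapsed time to get stats per second.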


Eric




-- 
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
Anything that works is better than anything that doesn't.
------------------------------------------------------------------------



More information about the freebsd-performance mailing list