svn commit: r292955 - head/lib/libmd

Allan Jude allanjude at freebsd.org
Sat Jan 2 20:19:42 UTC 2016


On 2016-01-02 05:07, Bruce Evans wrote:
> On Sat, 2 Jan 2016, Allan Jude wrote:
> 
>> On 2015-12-31 13:50, Allan Jude wrote:
>>> On 2015-12-31 13:32, Jonathan T. Looney wrote:
>>>> On 12/31/15, 2:15 AM, "Allan Jude" <allanjude at freebsd.org> wrote:
>>>>
>>>>> It seems these problems also slow things down, a lot:
>>>>>
>>>>> # time md5 /media/md5test/bigdata
>>>>> MD5 (/media/md5test/bigdata) = 6afad0bf5d8318093e943229be05be67
>>>>> 4.310u 3.476s 0:07.79 99.8%     20+167k 0+0io 0pf+0w
>>>>> # time env LD_PRELOAD=/usr/obj/media/svn/md5/head/tmp/lib/libmd.so
>>>>> /usr/obj/media/svn/md5/head/sbin/md5/md5 /media/md5test/bigdata
>>>>> MD5 (/media/md5test/bigdata) = 6afad0bf5d8318093e943229be05be67
>>>>> 4.133u 0.354s 0:04.49 99.7%     20+167k 1+0io 0pf+0w
>>>>>
>>>>> (file is fully cached in ZFS ARC, dd reads it at 11GB/s)
>>>>>
>>>>> Will investigate more tomorrow.
>>>>
>>>> md5 will be slower than dd due to the extra processing it needs to do to
>>>> generate the hash. I suspect that explains the difference you're seeing
>>>> between those utilities.
>>>
>>> Sorry, you missed my point here.
>>>
>>> I replaced MDXFile() with the implementation included in my earlier
>>> email. Using the newer libmd with that code, cut the time to md5 the
>>> SAME data down a lot. I need to do a more scientific test on a box that
>>> isn't doing other stuff still though.
>>>
>>> The comment about dd doing 11GB/s, was just to clarify that I wasn't
>>> reading the file from disk, which would introduce other variables.
>>
>> I found the cause of my bogus benchmark, the world on my test machine
>> was just old enough to be missing jmg@'s bufsize patch.
>>
>> Now the difference is about 1 second on a 2GB file, so ignore my
>> foolishness.
> 
> That patch is surprisingly new.
> 
> The main slowness that I complained about was for the other path in md5
> that must be used for special files.  That uses stdio so it suffers from
> stdio trusting st_blksize.  But st_blksize is rarely as small as the old
> size BUFSIZ in MDXFile.
> 
> Bruce
> 

I did some experiments on MDXFilter, adjusting the buffer size to 16 KB
and calling setvbuf() on stdin before reading from it. It improved things,
but only marginally.
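
For reference, the kind of change described above can be sketched roughly as
follows. This is a minimal illustration and not the actual patch: the helper
names use_big_buffer() and drain_stream() and the FILTER_BUFSIZE constant are
made up for this example, and a real filter would feed each chunk to the
digest update routine instead of just counting bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical name; 16 KB is the size mentioned in the experiment. */
#define FILTER_BUFSIZE (16 * 1024)

/*
 * Make `fp` fully buffered with a heap buffer of `size` bytes.
 * setvbuf() must be called before any other operation on the stream.
 * Returns the buffer (free it after closing the stream) or NULL on failure.
 */
static char *
use_big_buffer(FILE *fp, size_t size)
{
	char *buf;

	assert(fp != NULL);
	buf = malloc(size);
	if (buf == NULL)
		return (NULL);
	if (setvbuf(fp, buf, _IOFBF, size) != 0) {
		free(buf);
		return (NULL);
	}
	return (buf);
}

/*
 * Read `fp` to EOF in FILTER_BUFSIZE chunks and return the byte count.
 * Stand-in for the read loop of a filter; no hashing is done here.
 */
static size_t
drain_stream(FILE *fp)
{
	static unsigned char chunk[FILTER_BUFSIZE];
	size_t n, total = 0;

	while ((n = fread(chunk, 1, sizeof(chunk), fp)) > 0)
		total += n;
	return (total);
}
```

A caller would do use_big_buffer(stdin, FILTER_BUFSIZE) once at startup and
then run the read loop. Since stdio otherwise sizes its buffer from
st_blksize, this only helps when that value is smaller than the buffer
requested here.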

dd if=/mnt/bigzerofile bs=1m | md5

10 GB took 80 seconds for unmodified md5, and 73.5 seconds with the
bigger buffer size.

I will try to set up a flame graph of it and see if we can determine what
can be done to make it faster.

-- 
Allan Jude
