Call for testers: FPU changes

Kostik Belousov kostikbel at gmail.com
Tue Nov 16 22:19:31 UTC 2010


On Tue, Nov 16, 2010 at 05:08:30PM -0500, Mike Tancsa wrote:
> On 11/16/2010 4:43 AM, Kostik Belousov wrote:
> > On Mon, Nov 15, 2010 at 10:42:50PM -0500, Mike Tancsa wrote:
> >> On 11/15/2010 4:13 PM, Kostik Belousov wrote:
> >>>
> >>> Patch is at
> >>> http://people.freebsd.org/~kib/misc/releng_8_fpu.1.patch
> >>
> >>
> >> Hi,
> >> 	One small failure on the patch
> >>
> >> The text leading up to this was:
> >> --------------------------
> >> |Index: pc98/include/npx.h
> >> |===================================================================
> >> |--- pc98/include/npx.h (revision 215253)
> >> |+++ pc98/include/npx.h (working copy)
> >> --------------------------
> >> Patching file pc98/include/npx.h using Plan A...
> >> Hunk #1 failed at 1.
> >> 1 out of 1 hunks failed--saving rejects to pc98/include/npx.h.rej
> > This is because our patch(1) in base is somewhat old, I believe.
> > The diff was generated by svn diff from the up to date stable/8
> > checkout, and the reason for failure is expanded $FreeBSD$ tags.
> > 
> > Newer GNU patch, available in ports, handles this correctly,
> > reporting the hunks as applied with "fuzz".
> > 
> >>
> >>
> >> I tested with openssl and openvpn and all seems to work great on the VIA
> >> board and my i5 board!!  Simple test details at
> >>
> >> http://www.tancsa.com/fpu.html
> >>
> >> I will try out geli and some more extensive tests tomorrow
> >>
> >> Thanks for porting this back to RELENG_8 !
> > This is actually somewhat puzzling. Does openssl in base automatically
> > use crypto(4) ?
> 
> 
> I force it via openssl.cnf
> 
> 
> 0(achinetboot)% tail -11 /etc/ssl/openssl.cnf
> 
> openssl_conf = openssl_def
> 
> [openssl_def]
> engines = openssl_engines
> 
> [openssl_engines]
> padlock = cryptodev_engine
> 
> [cryptodev_engine]
> default_algorithms = ALL
> 0(achinetboot)%
Ah, that explains the results.

> 
> 
> The limiting factor here for ssh seems to be the 100Mb link my i5 box is
> on. Here is with and without aesni loaded
> 
> 0(achinetboot)% /usr/bin/time scp -c aes128-cbc test.bin mdtancsa at 10.255.255.1:/dev/null
> test.bin                          100%   88MB  11.0MB/s   00:08
>         8.14 real         0.44 user         0.57 sys
> 0(achinetboot)% /usr/bin/time scp -c aes128-cbc test.bin mdtancsa at 10.255.255.1:/dev/null
> test.bin                          100%   88MB  11.0MB/s   00:08
>         8.15 real         1.46 user         0.36 sys
> 0(achinetboot)%
> 
> I will move it to gigabit to get a better test shortly.
> 
> > 
> > Also, could you please redo the speed tests for aesni(4) with the
> > following patch applied over the driver sources?
> > 
> > Thank you !
> > 
> > diff --git a/sys/crypto/aesni/aesni_wrap.c b/sys/crypto/aesni/aesni_wrap.c
> > index 36c66ea..3fd397c 100644
> > --- a/sys/crypto/aesni/aesni_wrap.c
> > +++ b/sys/crypto/aesni/aesni_wrap.c
> > @@ -246,14 +246,21 @@ int
> 
> 
> 
>  patch -p2 < a
> Hmm...  Looks like a unified diff to me...
> The text leading up to this was:
> --------------------------
> |diff --git a/sys/crypto/aesni/aesni_wrap.c b/sys/crypto/aesni/aesni_wrap.c
> |index 36c66ea..3fd397c 100644
> |--- a/sys/crypto/aesni/aesni_wrap.c
> |+++ b/sys/crypto/aesni/aesni_wrap.c
> --------------------------
> Patching file crypto/aesni/aesni_wrap.c using Plan A...
> Hunk #1 succeeded at 246.
> Hunk #2 succeeded at 271.
> Hunk #3 succeeded at 324.
> Hmm...  Ignoring the trailing garbage.
> done
> 
> 
> Seems to work ok
> 
> 
> 
> 0(achinetboot)# kldload aesni
> 0(achinetboot)#  openssl speed -evp aes-128-cbc
> To get the most accurate results, try to run this
> program when this computer is idle.
> Doing aes-128-cbc for 3s on 16 size blocks: 2587085 aes-128-cbc's in 0.39s
> Doing aes-128-cbc for 3s on 64 size blocks: 2425301 aes-128-cbc's in 0.38s
> Doing aes-128-cbc for 3s on 256 size blocks: 1925353 aes-128-cbc's in 0.19s
> Doing aes-128-cbc for 3s on 1024 size blocks: 1098255 aes-128-cbc's in 0.11s
> Doing aes-128-cbc for 3s on 8192 size blocks: 152631 aes-128-cbc's in 0.05s
> OpenSSL 0.9.8n 24 Mar 2010
> built on: date not available
> options:bn(64,32) md2(int) rc4(idx,int) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
> compiler: cc
> available timing options: USE_TOD HZ=128 [sysconf value]
> timing function used: getrusage
> The 'numbers' are in 1000s of bytes per second processed.
> type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
> aes-128-cbc     105979.48k   404781.84k  2632455.13k  9955323.90k 27619906.16k
> 0(achinetboot)#
> 
> But there is a LOT of variation between runs for some reason.
> 
> I added the different runs to http://www.tancsa.com/fpu.html
> 
> 
Mike, thank you again.

Would your conclusion be that the patch increases the throughput
of aesni(4)?

I think that on small-sized blocks, when using aesni(4), the dominant
factor is the copyin/copyout of the data to and from the kernel address
space. Still, it would be interesting to compare the full output
of "openssl speed" on aesni(4) with and without the patch I posted.
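
For the comparison itself, something along the following lines might
help. It is only a rough sketch, assuming the output of each run has
been saved as text; parse_speed and compare are hypothetical helper
names, not part of openssl or any existing tool. It recomputes the
"1000s of bytes per second" figure from each "Doing ..." line
(count * block size / seconds / 1000) and gives the per-block-size ratio
between two runs. The seconds in those lines are printed rounded to two
decimals, so the recomputed values only approximate the summary table.

```python
import re

# Matches lines such as (an optional "> " quote prefix is tolerated):
#   Doing aes-128-cbc for 3s on 16 size blocks: 2587085 aes-128-cbc's in 0.39s
DOING_RE = re.compile(
    r"Doing (\S+) for \d+s on (\d+) size blocks: (\d+) \S+ in ([\d.]+)s"
)


def parse_speed(text):
    """Return {block_size: 1000s of bytes per second} from 'openssl speed' output."""
    result = {}
    for line in text.splitlines():
        m = DOING_RE.search(line)
        if m is None:
            continue
        size = int(m.group(2))    # block size in bytes
        count = int(m.group(3))   # number of blocks processed
        secs = float(m.group(4))  # time as reported (rounded to 0.01s)
        if secs > 0:
            result[size] = count * size / secs / 1000.0
    return result


def compare(before, after):
    """Return {block_size: after/before ratio} for sizes present in both runs."""
    return {s: after[s] / before[s] for s in sorted(before) if s in after}
```

Feeding it the unpatched run as "before" and the patched run as "after"
gives a ratio above 1.0 wherever the patch helped. Given the variation
between runs reported above, the ratios from a single pair of runs
should be read with caution; repeating each run several times would be
more convincing.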