bin/106734: SSE2 optimization for bzip2/libbz2

Mikhail T. mi at aldan.algebra.com
Thu Dec 14 13:53:48 PST 2006


>Number:         106734
>Category:       bin
>Synopsis:       SSE2 optimization for bzip2/libbz2
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Thu Dec 14 21:50:06 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     Mikhail T.
>Release:        FreeBSD 6.2-PRERELEASE amd64
>Organization:
Virtual Estates, Inc.
>Environment:

	Intel and AMD chips with SSE2 instructions.

>Description:
	The patch below makes bzip2's blocksort routines use SSE2-registers
	to compare 16 bytes at a time.

	On both the i386 and AMD chips I tested, the performance improvement
	ranges from 5% for already-compressed (.gz) files to 20% for
	highly compressible system logs.

	The compressed files are byte-for-byte identical with those produced
	by the original bzip2.

	The changes are ifdef-ed by __SSE2__ and rely on the intrinsics
	available in GNU, Intel's, and Microsoft's compilers.

	No changes to Makefile(s) are necessary -- when targeting an
	SSE2-capable CPU (e.g. ``-march=opteron'' or ``-march=pentium4''),
	__SSE2__ is defined by the compiler.
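
	For illustration only, the idea of comparing 16 bytes at a time under
	an __SSE2__ guard might look roughly like the sketch below. It is not
	taken from the patch: the helper name first_mismatch() and its
	interface are hypothetical, and the real blocksort code is organized
	differently. The intrinsics used (_mm_loadu_si128, _mm_cmpeq_epi8,
	_mm_movemask_epi8) are the standard SSE2 ones from <emmintrin.h>.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __SSE2__
#include <emmintrin.h>	/* SSE2 intrinsics */
#endif

/*
 * Hypothetical helper, not part of bzip2: return the index of the first
 * byte at which a[] and b[] differ within the first n bytes, or n if the
 * two ranges are equal.
 */
static size_t
first_mismatch(const unsigned char *a, const unsigned char *b, size_t n)
{
	size_t i = 0;

	assert(a != NULL && b != NULL);

#ifdef __SSE2__
	/* Compare 16 bytes per iteration using unaligned loads. */
	for (; i + 16 <= n; i += 16) {
		__m128i va = _mm_loadu_si128((const __m128i *)(a + i));
		__m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
		/* One mask bit per byte; a set bit means the bytes are equal. */
		unsigned mask =
		    (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(va, vb));

		if (mask != 0xFFFFu) {
			/* The lowest clear bit marks the first differing byte. */
			unsigned diff = ~mask & 0xFFFFu;
			size_t k = 0;

			while ((diff & 1u) == 0) {
				diff >>= 1;
				k++;
			}
			return (i + k);
		}
	}
#endif

	/* Scalar path: the tail, or everything when SSE2 is unavailable. */
	for (; i < n; i++) {
		if (a[i] != b[i])
			return (i);
	}
	return (n);
}
```

	Because the scalar loop is always compiled in, a build without
	__SSE2__ returns the same results, only more slowly; this mirrors
	the property that the output stays identical whether or not the
	optimization is enabled.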

>How-To-Repeat:

>Fix:

	The patch is available from http://aldan.algebra.com/~mi/bz/

	The patch is not FreeBSD-specific, but was developed, tested, and timed
	on FreeBSD 6.x on both i386 and amd64.

	Feedback welcome.
>Release-Note:
>Audit-Trail:
>Unformatted:

