FreeBSD 7.0 compiled code is very slow
Kailash Kailash
kailash.kailash at zscaler.com
Wed Feb 18 13:06:59 PST 2009
I am using FreeBSD 7.0 in 64-bit mode. The same code compiled on FreeBSD 7.0
runs at 50% of the speed it reaches on FreeBSD 6.2, and the 6.2 speed is what
I expect from the CPU and code architecture. Here is one real piece of code
with its binary output. On a 3.0 GHz Woodcrest processor, I am able to copy
from one cache to another at 24 Gbytes/second on 6.2 versus 12 Gbytes/second
on 7.0. Any help is appreciated.
Thanks,
Kailash
Source-code:
#define BACKWARD_MEMCOPY_WITH_TYPE(dp, sp, len, type) \
BACKWARD_MEMCOPY((type*)(dp), (const type*)(sp), \
((len)+(sizeof(type)-1))/sizeof(type))
#define BACKWARD_MEMCOPY(dp, sp, len) \
do { \
smwbits __len = len; /* must be signed value */ \
while (--__len >= 0) \
(dp)[__len] = (sp)[__len]; \
} while (0)
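For anyone who wants to try this outside the original program, here is a self-contained sketch of the two macros. The smwbits typedef is not shown in the post, so defining it as long is an assumption; the helper backward_copy_matches() is hypothetical and only there to exercise the macro. Note that the len argument of BACKWARD_MEMCOPY_WITH_TYPE is a byte count that gets rounded up to whole elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the real typedef is not shown in the post. Any signed
 * type wide enough to hold the element count should behave the same. */
typedef long smwbits;

#define BACKWARD_MEMCOPY(dp, sp, len)                          \
    do {                                                       \
        smwbits __len = (smwbits)(len); /* must be signed */   \
        while (--__len >= 0)                                   \
            (dp)[__len] = (sp)[__len];                         \
    } while (0)

/* len is in bytes; it is rounded up to a whole number of elements. */
#define BACKWARD_MEMCOPY_WITH_TYPE(dp, sp, len, type)          \
    BACKWARD_MEMCOPY((type *)(dp), (const type *)(sp),         \
                     ((len) + (sizeof(type) - 1)) / sizeof(type))

/* Hypothetical helper: copy n 64-bit words from src to dst with the
 * macro and report whether the copied region matches the source. */
int backward_copy_matches(size_t n)
{
    uint64_t src[64], dst[64];

    assert(n <= 64);
    for (size_t i = 0; i < 64; i++) {
        src[i] = i * 3 + 1;
        dst[i] = 0;
    }
    BACKWARD_MEMCOPY_WITH_TYPE(dst, src, n * sizeof(uint64_t), uint64_t);
    return memcmp(dst, src, n * sizeof(uint64_t)) == 0;
}
```

The counter has to be signed because the loop ends when --__len drops below zero; with an unsigned type the test would never become false.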
BSD 6.2 code
0x00000000004353b2 <smcli_memory_copy_performance+434>: mov 0x8(%rbp,%rdx,8),%rax
0x00000000004353b7 <smcli_memory_copy_performance+439>: mov %rax,0x48(%r12,%rdx,8)
0x00000000004353bc <smcli_memory_copy_performance+444>: dec %rdx
0x00000000004353bf <smcli_memory_copy_performance+447>: jns 0x4353b2 <smcli_memory_copy_performance+434>
BSD 7.0 code
0x000000000040cb80 <smcli_memory_copy_performance+272>: mov 0xfffffffffffffff8(%rsi),%rax
0x000000000040cb84 <smcli_memory_copy_performance+276>: mov %rax,0xfffffffffffffff8(%rcx)
0x000000000040cb88 <smcli_memory_copy_performance+280>: sub $0x8,%rsi
0x000000000040cb8c <smcli_memory_copy_performance+284>: sub $0x8,%rcx
0x000000000040cb90 <smcli_memory_copy_performance+288>: sub $0x1,%rdx
0x000000000040cb94 <smcli_memory_copy_performance+292>: jns 0x40cb80 <smcli_memory_copy_performance2+272>
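Comparing the two listings: the 6.2 loop uses four instructions per 8-byte word (two mov, dec, jns) with a single counter and scaled-index addressing, while the 7.0 loop uses six (two mov, three sub, jns) because both pointers and the counter are updated separately. Taking 1 Gbyte as 10^9 bytes, 24 Gbytes/second at 3.0 GHz is 8 bytes per cycle, or one loop iteration per cycle, and 12 Gbytes/second is one iteration every two cycles. The functions below are a hypothetical C sketch of the two loop shapes, not code from the original program. A compiler is free to emit either shape for either function, so this is only a starting point for comparing generated assembly on both systems.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Shape of the 6.2 listing: one signed counter, indexed addressing. */
void copy_indexed(uint64_t *dp, const uint64_t *sp, long n)
{
    long i = n;

    assert(n >= 0);
    while (--i >= 0)
        dp[i] = sp[i];
}

/* Shape of the 7.0 listing: source pointer, destination pointer and
 * counter are each stepped on every iteration. */
void copy_pointer_walk(uint64_t *dp, const uint64_t *sp, long n)
{
    const uint64_t *s;
    uint64_t *d;
    long i = n;

    assert(n >= 0);
    s = sp + n;   /* one past the last source word */
    d = dp + n;   /* one past the last destination word */
    while (--i >= 0) {
        --s;
        --d;
        *d = *s;
    }
}

/* Hypothetical check: both variants must produce identical copies. */
int copies_agree(void)
{
    uint64_t src[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    uint64_t a[8] = { 0 };
    uint64_t b[8] = { 0 };

    copy_indexed(a, src, 8);
    copy_pointer_walk(b, src, 8);
    return memcmp(a, src, sizeof src) == 0 &&
           memcmp(b, src, sizeof src) == 0;
}
```

Building both functions on each system with the same flags as the real program and disassembling them should show whether the difference comes from the compiler's loop transformation or from something else in the build.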