Fast vs slow syscalls (Re: Fwd: [RFC] Kernel shared variables)
Dag-Erling Smørgrav
des at des.no
Wed Jun 6 08:24:27 UTC 2012
Bruce Evans <brde at optusnet.com.au> writes:
> Dag-Erling Smørgrav <des at des.no> writes:
> > getpid(): 10,000,000 iterations in 24,400 ms
> > gettimeofday(0, 0): 10,000,000 iterations in 54,104 ms
> > raise(0): 10,000,000 iterations in 1,284,593 ms
> That's one slow system or broken units.
Broken units: these are microseconds, not milliseconds. Sorry.
> After adjusting by factors of 1000 here and there, this format is still
> hard to parse. I like the format of nsec/operation. 10 million
> operations in 24400 moroseconds seems to scale to 2.44 nsec/call (if 1
> moro = 1 micro). But that is impossibly fast, unless getpid() is
> inlined to a load of the shared variable (it may also need the load to
> be moved outside the loop). I can't see any reasonable adjustment that
> gives 24.4 nsec/call.
	#define ITERATIONS 10000000
	struct timeval start, end;
	int i;

	gettimeofday(&start, NULL);
	for (i = 0; i < ITERATIONS; ++i)
		getpid();
	gettimeofday(&end, NULL);
On Linux, gcc 4.4.6 compiles this to:
# gettimeofday(&start, NULL)
0x000000000040064b <+23>: lea -0x20(%rbp),%rax
0x000000000040064f <+27>: mov $0x0,%esi
0x0000000000400654 <+32>: mov %rax,%rdi
0x0000000000400657 <+35>: callq 0x400500 <gettimeofday at plt>
# i = 0
0x000000000040065c <+40>: movl $0x0,-0x4(%rbp)
0x0000000000400663 <+47>: jmp 0x40066e <main+58>
# getpid()
0x0000000000400665 <+49>: callq 0x400520 <getpid at plt>
# ++i
0x000000000040066a <+54>: addl $0x1,-0x4(%rbp)
# i < ITERATIONS
0x000000000040066e <+58>: cmpl $0x98967f,-0x4(%rbp)
0x0000000000400675 <+65>: jle 0x400665 <main+49>
# gettimeofday(&end, NULL)
0x0000000000400677 <+67>: lea -0x30(%rbp),%rax
0x000000000040067b <+71>: mov $0x0,%esi
0x0000000000400680 <+76>: mov %rax,%rdi
0x0000000000400683 <+79>: callq 0x400500 <gettimeofday at plt>
The code generated by gcc 4.2.1 on FreeBSD is almost identical:
# gettimeofday(&start, NULL)
0x00000000004006f7 <main+23>: lea -0x20(%rbp),%rdi
0x00000000004006fb <main+27>: mov $0x0,%esi
0x0000000000400700 <main+32>: callq 0x40057c <gettimeofday at plt>
# i = 0
0x0000000000400705 <main+37>: movl $0x0,-0x4(%rbp)
0x000000000040070c <main+44>: jmp 0x400717 <main+55>
# getpid()
0x000000000040070e <main+46>: callq 0x40059c <getpid at plt>
# ++i
0x0000000000400713 <main+51>: addl $0x1,-0x4(%rbp)
# i < ITERATIONS
0x0000000000400717 <main+55>: cmpl $0x98967f,-0x4(%rbp)
0x000000000040071e <main+62>: jle 0x40070e <main+46>
# gettimeofday(&end, NULL)
0x0000000000400720 <main+64>: lea -0x30(%rbp),%rdi
0x0000000000400724 <main+68>: mov $0x0,%esi
0x0000000000400729 <main+73>: callq 0x40057c <gettimeofday at plt>
I don't know why gcc 4.4.6 loads &start / &end into %rax before copying
it to %rdi instead of loading it directly into %rdi like 4.2.1 does. I
used the same command line (gcc -Wall -Wextra syscall.c) in both cases.
DES
--
Dag-Erling Smørgrav - des at des.no