Unexpected threading performance result
Ivan Voras
ivoras at freebsd.org
Sun Oct 7 07:52:40 PDT 2007
Hi,
For an unrelated purpose, I'm benchmarking performance of tree
algorithms in SMP environments and my preliminary run has an unexpected
result. Here's the typical output from the (small) benchmark program,
run on a dual-core Athlon64 (i386 mode):
Running benchmarks on small_nonuniform, 1000000 samples
Step 1: Running 100 loops
** Step 1 benchmark completed 100 loops in 84.44 seconds.
Step 2: Running 2 threads with 100 loops each
** Step 2 benchmark completed 100 loops in 2 threads in 167.46 seconds.
The interpretation is: running the same loop twice, in two parallel
threads doesn't result in a speedup, but it looks like the execution is
serialized. The problem is: the loops are completely independent, no
locking in their execution, and 'top' confirms that both threads in the
program are running at 100% CPU each.
I verified this behaviour on:
- 7-CURRENT, i386, ULE
- 7-CURRENT, i386, 4BSD
- 6-STABLE, amd64, 4BSD
I can't really explain this behaviour, but it might not be related to
FreeBSD - maybe I made a mistake in the program or there's a
hardware-related reason for it (maybe CPU cache trashing from the tree
traversal?). In any case, can someone shed some light on this?
The main part of the (small) program is pasted below.
47 double time_start, time_b1, time_b2;
48 int n_data, n_samples;
49 int *data, *samples;
50
51
52 void bench_loop()
53 {
54 int i;
55 struct node *nd, find;
56 for (i = 0; i < n_samples; i++) {
57 find.data = samples[i];
58 nd = RB_FIND(node_tree, &head, &find);
59 if (nd == NULL)
60 errx(1, "Sample %d was not found", find.data);
61 }
62 }
63
64 void step1()
65 {
66 int n;
67 /* step 1 - simple tree traversal */
68 printf("Step 1: Running %d loops\n", STEP1_ITER);
69 for (n = 0; n < STEP1_ITER; n++)
70 bench_loop();
71 time_b1 = gettime();
72 printf("** Step 1 benchmark completed %d loops in %0.2lf
seconds.\n", STEP1_ITER, time_b1 - time_start);
73 }
74
75 void *step2_thread(void *arg) {
76 int n;
77 for (n = 0; n < STEP2_ITER; n++)
78 bench_loop();
79 return NULL;
80 }
81
82 void step2()
83 {
84 /* step 2 - run tree traversal in parallel threads */
85 int n;
86 pthread_t threads[STEP2_THREADS];
87
88 printf("Step 2: Running %d threads with %d loops each\n",
STEP2_THREADS, STEP2_ITER);
89 for (n = 0; n < STEP2_THREADS; n++) {
90 if (pthread_create(&threads[n], NULL, step2_thread,
NULL) != 0)
91 err(1, "Cannot spawn thread");
92 }
93 for (n = 0; n < STEP2_THREADS; n++)
94 pthread_join(threads[n], NULL);
95 time_b2 = gettime();
96 printf("** Step 2 benchmark completed %d loops in %d
threads in %0.2lf seconds.\n",
97 STEP2_ITER, STEP2_THREADS, time_b2 -
time_start);
98 }
More information about the freebsd-threads
mailing list