Unexpected threading performance result
Alfred Perlstein
alfred at freebsd.org
Sun Oct 7 09:21:54 PDT 2007
* Ivan Voras <ivoras at freebsd.org> [071007 07:52] wrote:
> Hi,
>
> For an unrelated purpose, I'm benchmarking performance of tree
> algorithms in SMP environments and my preliminary run has an unexpected
> result. Here's the typical output from the (small) benchmark program,
> run on a dual-core Athlon64 (i386 mode):
>
> Running benchmarks on small_nonuniform, 1000000 samples
> Step 1: Running 100 loops
> ** Step 1 benchmark completed 100 loops in 84.44 seconds.
> Step 2: Running 2 threads with 100 loops each
> ** Step 2 benchmark completed 100 loops in 2 threads in 167.46 seconds.
>
> The interpretation is: running the same loop twice, in two parallel
> threads doesn't result in a speedup, but it looks like the execution is
> serialized. The problem is: the loops are completely independent, no
> locking in their execution, and 'top' confirms that both threads in the
> program are running at 100% CPU each.
...
1)
Could you please make this example program compile/work, it's a bit
difficult to diagnose the problem if we don't know if things like
n_samples are initialised properly.
2)
Please try to pthread_attr_setscope() to PTHREAD_SCOPE_SYSTEM
on a pthread_attr_t object to pass into pthread_create(). That
may help.
3)
What's the deal with RB_FIND? What does that do? Is that
data structure locked with a an exclusive lock?
>
> I verified this behaviour on:
>
> - 7-CURRENT, i386, ULE
> - 7-CURRENT, i386, 4BSD
> - 6-STABLE, amd64, 4BSD
>
> I can't really explain this behaviour, but it might not be related to
> FreeBSD - maybe I made a mistake in the program or there's a
> hardware-related reason for it (maybe CPU cache trashing from the tree
> traversal?). In any case, can someone shed some light on this?
>
> The main part of the (small) program is pasted below.
>
>
> 47 double time_start, time_b1, time_b2;
> 48 int n_data, n_samples;
> 49 int *data, *samples;
> 50
> 51
> 52 void bench_loop()
> 53 {
> 54 int i;
> 55 struct node *nd, find;
> 56 for (i = 0; i < n_samples; i++) {
> 57 find.data = samples[i];
> 58 nd = RB_FIND(node_tree, &head, &find);
> 59 if (nd == NULL)
> 60 errx(1, "Sample %d was not found", find.data);
> 61 }
> 62 }
> 63
> 64 void step1()
> 65 {
> 66 int n;
> 67 /* step 1 - simple tree traversal */
> 68 printf("Step 1: Running %d loops\n", STEP1_ITER);
> 69 for (n = 0; n < STEP1_ITER; n++)
> 70 bench_loop();
> 71 time_b1 = gettime();
> 72 printf("** Step 1 benchmark completed %d loops in %0.2lf
> seconds.\n", STEP1_ITER, time_b1 - time_start);
> 73 }
> 74
> 75 void *step2_thread(void *arg) {
> 76 int n;
> 77 for (n = 0; n < STEP2_ITER; n++)
> 78 bench_loop();
> 79 return NULL;
> 80 }
> 81
> 82 void step2()
> 83 {
> 84 /* step 2 - run tree traversal in parallel threads */
> 85 int n;
> 86 pthread_t threads[STEP2_THREADS];
> 87
> 88 printf("Step 2: Running %d threads with %d loops each\n",
> STEP2_THREADS, STEP2_ITER);
> 89 for (n = 0; n < STEP2_THREADS; n++) {
> 90 if (pthread_create(&threads[n], NULL, step2_thread,
> NULL) != 0)
> 91 err(1, "Cannot spawn thread");
> 92 }
> 93 for (n = 0; n < STEP2_THREADS; n++)
> 94 pthread_join(threads[n], NULL);
> 95 time_b2 = gettime();
> 96 printf("** Step 2 benchmark completed %d loops in %d
> threads in %0.2lf seconds.\n",
> 97 STEP2_ITER, STEP2_THREADS, time_b2 -
> time_start);
> 98 }
>
> _______________________________________________
> freebsd-threads at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-threads
> To unsubscribe, send any mail to "freebsd-threads-unsubscribe at freebsd.org"
--
- Alfred Perlstein
More information about the freebsd-threads
mailing list