mysql performance on 4 * dualcore opteron

Michael Vince mv at roq.com
Thu Apr 6 09:12:43 UTC 2006


I just ran a test on 6_stable (April 5th) on a Dell 2850 dual CPU 
(single core 3.60GHz) using the AMD64 build of FreeBSD and got speeds 
similar to yours.
It's interesting that Sven's 8 cores appear to give less MySQL speed 
than just having a few cores.
After enabling libthr the result jumps by about 3,500 queries/sec 
(22,062 to 25,601) on a generic SMP kernel compile. I didn't try any 
more serious tweaks.

For those who are interested in exactly how I tested, here's what I did.

portupgrade -RN -m 'BUILD_OPTIMIZED=yes WITH_PROC_SCOPE_PTH=yes' \
    /usr/ports/databases/mysql41-server
portupgrade -RN /usr/ports/benchmarks/super-smack

super-smack -d mysql /usr/local/share/super-smack/select-key.smack 10 10000
Query Barrel Report for client smacker1
connect: max=4ms  min=1ms avg= 2ms from 10 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    200000  0       0       22061.88

With the following in my /etc/libmap.conf to switch mysqld over to libthr, 
and a MySQL restart (/usr/local/etc/rc.d/mysql-server restart), the 
numbers do jump:
[/usr/local/libexec/mysqld]
libpthread.so.2         libthr.so.2
libpthread.so           libthr.so
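
A quick way to check that the mapping is actually being picked up is to 
look at what the runtime linker resolves for mysqld. This is only a 
sketch: it assumes ldd on your release reports libraries after 
libmap.conf has been applied (it goes through rtld, so I believe it 
does), and uses_libthr is just a helper name made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if ldd-style output on stdin shows
# libpthread being resolved to libthr.
uses_libthr() {
    grep -q 'libpthread\.so.*=>.*libthr\.so'
}

# Path as used in the libmap.conf section; adjust if mysqld lives elsewhere.
MYSQLD=/usr/local/libexec/mysqld

if [ -x "$MYSQLD" ] && ldd "$MYSQLD" 2>/dev/null | uses_libthr; then
    echo "mysqld resolves libpthread to libthr"
else
    echo "libthr mapping not visible for $MYSQLD"
fi
```

If the second message shows up after a restart, the section name in 
libmap.conf probably does not match the path of the binary that is 
really being run.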


super-smack -d mysql /usr/local/share/super-smack/select-key.smack 10 10000
Query Barrel Report for client smacker1
connect: max=238ms  min=0ms avg= 117ms from 10 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    200000  0       0       25601.49
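
To put a number on the jump, the q_per_s column of the two runs can be 
compared directly. A small sketch with awk follows; the qps and gain 
function names are made up for illustration, and it assumes the report 
format shown above, with q_per_s as the fifth field of the select_index 
row.

```shell
#!/bin/sh
# Print the q_per_s column (5th field) of the select_index row from
# super-smack output read on stdin.
qps() {
    awk '$1 == "select_index" { print $5 }'
}

# Print the relative gain of $2 over $1 as a percentage, one decimal place.
gain() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

# The two select_index rows from the runs above (libpthread, then libthr).
before=$(printf 'select_index    200000  0       0       22061.88\n' | qps)
after=$(printf 'select_index    200000  0       0       25601.49\n' | qps)

gain "$before" "$after"
```

For the two runs above this prints 16.0, so libthr was worth roughly 16% 
on this box.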

I have also benchmarked libthr with Apache using 'ab' and found it can 
deliver more throughput (I think it was about an extra 2,000 
requests/sec), at the cost, from what I remember, of almost doubling 
the load average reported by 'top'.
So if your machine has nothing else to do but serve data from Apache, 
libthr is worthwhile for Apache as well.

Mike

Steven Hartland wrote:
> Looking at this on a dual box here (waiting for the new MB for dual 
> dual core).
> Most of the CPU time is spent in super-smack, with only 25% in mysqld.
> Even dropping to 10 clients, a large portion is taken by the clients.
> That said there is a lot that can be gained by using the tweaks out there
> i.e. ULE + libthr + TSC + context_time.patch + cpu_acct_1.patch + 
> cpu_acct_2.patch
> Adding these jumps from a baseline:
> select_index    2000000 8       0       18624.60
> to:
> select_index    2000000 5       0       29942.10
>
> The biggest increases coming from libthr ( thanks DavidXu ) and the ULE
> scheduler.
>
> [log]
> == 4BSD + libpthread + ACPI-Fast ==
> super-smack -d mysql select-key.smack 100 10000
> Query Barrel Report for client smacker1
> connect: max=46ms  min=6ms avg= 25ms from 100 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 8       0       18624.60
>
> super-smack -d mysql select-key.smack 10 100000
> Query Barrel Report for client smacker1
> connect: max=5ms  min=0ms avg= 1ms from 10 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 0       0       23983.87
>
> == 4BSD + libthr + ACPI-Fast  ==
> super-smack -d mysql select-key.smack 100 10000
> Query Barrel Report for client smacker1
> connect: max=107ms  min=2ms avg= 45ms from 100 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 13      0       22413.39
>
> super-smack -d mysql select-key.smack 10 100000
> Query Barrel Report for client smacker1
> connect: max=2ms  min=1ms avg= 1ms from 10 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 0       0       26841.07
>
> == 4BSD + libthr + TSC ==
> super-smack -d mysql select-key.smack 100 10000
> Query Barrel Report for client smacker1
> connect: max=46ms  min=1ms avg= 21ms from 100 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 11      0       23428.03
>
> super-smack -d mysql select-key.smack 10 100000
> Query Barrel Report for client smacker1
> connect: max=2ms  min=0ms avg= 1ms from 10 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 0       0       26403.95
>
> == ULE + libthr + TSC ==
> super-smack -d mysql select-key.smack 100 10000
> Query Barrel Report for client smacker1
> connect: max=41ms  min=0ms avg= 23ms from 100 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 5       0       28581.18
>
> super-smack -d mysql select-key.smack 10 100000
> Query Barrel Report for client smacker1
> connect: max=4ms  min=0ms avg= 1ms from 10 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 0       0       30128.44
>
> == ULE + libthr + TSC + context_time.patch + cpu_acct_1.patch + cpu_acct_2.patch ==
> super-smack -d mysql select-key.smack 100 10000
> Query Barrel Report for client smacker1
> connect: max=27ms  min=0ms avg= 14ms from 100 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 5       0       29942.10
>
> super-smack -d mysql select-key.smack 10 100000
> Query Barrel Report for client smacker1
> connect: max=12ms  min=0ms avg= 4ms from 10 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 0       0       31057.52
>
> == 4BSD + libthr + TSC + context_time.patch + cpu_acct_1.patch + cpu_acct_2.patch ==
> super-smack -d mysql select-key.smack 100 10000
> Query Barrel Report for client smacker1
> connect: max=54ms  min=20ms avg= 38ms from 100 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 9       0       24144.22
>
> super-smack -d mysql select-key.smack 10 100000
> Query Barrel Report for client smacker1
> connect: max=2ms  min=0ms avg= 1ms from 10 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    2000000 0       0       27073.46
>
> ** update test **
> super-smack -d mysql update-select.smack 10 100000
> Query Barrel Report for client smacker
> connect: max=3ms  min=0ms avg= 0ms from 10 clients
> Query_type      num_queries     max_time        min_time        q_per_s
> select_index    1000000 1       0       6468.70
> update_index    1000000 0       0       6468.70
> [/log]
>
> Machine:
> Dual 244, 2Gb running FreeBSD 6.1-PRERELEASE (i386)
> Package install of mysql 4.0
> Port install of super-smack
>
> Notes:
> No detectable disk activity throughout the tests
> ULE scheduler breaks the output from top with everything showing as
> WCPU 0% in the 100 concurrency test and the numbers not adding up
> at all in 10 concurrency test or showing 0%.
> To get context_time.patch to work I needed the attached patch which
> is basically two failed chunks of: kern/kern_exit.c moved to 
> kern/kern_thread.c
>
>    Steve
> ----- Original Message ----- From: "Sven Petai" <hadara at bsd.ee>
> To: <freebsd-performance at freebsd.org>
> Sent: Tuesday, April 04, 2006 5:42 PM
> Subject: mysql performance on 4 * dualcore opteron
>
>
>> hi
>>
>> Before I begin, let me just say that I'm probably aware of most of the 
>> threads about mysql performance in various fbsd lists over the last 
>> couple of years, so please let's not concentrate on the usual points 
>> made over and over again, like how filesystems are mounted under 
>> linux, how fast time() is, or how various combinations of 
>> scheduler/threading library/compiler flags give you ~5-10% better 
>> performance. It's very unlikely that any of these reasons, or even 
>> all of them together, can explain performance differences of 2-3x.
>>
>> So now a little bit of background...
>> I usually use the MySQL benchmark called super-smack as one of the 
>> benchmarks on all new machines to get a general feeling for the 
>> server's performance.
>> I certainly agree that the default smack workloads are far too simple 
>> to say much about actual production performance, but still... better 
>> than nothing...
>>
>> In general 2.4Ghz amd64 UP box (6.1 betaX) can do about
>> 17400 q/s with select-smack+4bsd+thr combination and
>> 4300 q/s with update-smack+4bsd+thr
>>
>> on dualcore 2Ghz opteron (6.1 prerelease) the results are:
>> 20000 q/s with select-smack+4bsd+thr and
>> 4500 q/s  with update-smack+4bsd+thr
>>
>> performance for update-smack seems to be always 4XXX q/s, no matter 
>> how many CPUs the box has or what kind of RAID controller/disks are 
>> used (I have tested on about 8 rather different machines). I have no 
>> idea if IO on all the servers I have tried really maxes out at this 
>> point or if there is some bottleneck in UFS.
>> select-smack performance gains on dualcore are not quite as good as 
>> one might expect, but then again that dualcore box uses ECC memory, 
>> which is probably somewhat slower because of the checksum 
>> calculations, and synchronisation has some overhead too... Anyway, all 
>> in all I'm more or less happy with these results, even though Linux 
>> will do about twice as many selects on the same hardware.
>>
>> Today I had a chance to test 4 * 2Ghz dualcore opteron machine,  so 
>> this machine has 8 cores in total and 8G of RAM.
>>
>> Now, on that server I get:
>> 11000 q/s for select-smack+4bsd+thr combination (with KSE it's around 
>> 6000 q/s, ule+thr gives somewhere around 12000 q/s)
>> 4100 q/s for update-smack+4bsd+thr
>>
>> So the 8 core machine got an almost 2x worse result for select than 
>> the UP server.
>>
>> After some tinkering I found out that renicing mysqld to -5 will make 
>> it push out 21000 q/s (4bsd, thr), so I suspect part of the problem 
>> is in the scheduling: probably super-smack with its 100 processes 
>> otherwise just gets a lot more CPU time than mysql with its 100 
>> threads servicing them. But anyway, even this result is still only 
>> about equal in performance to what I get from the dualcore machine.
>>
>> As I ran out of good (macro)tuning ideas at this point, and wanted to 
>> make sure higher scores are indeed achievable, I tried Linux on the 
>> same hardware.
>> Here are the results for same tests on Suse enterprise linux 9 
>> (2.6.5-7.97-smp):
>> 76857 q/s for select-smack
>> 10050 q/s for update-smack
>>
>> the mysql configuration was identical to the one I used under freebsd 
>> (my-huge). This Suse uses ReiserFS, but I have no idea about what 
>> kind of FS guarantees it provides, didn't see any sync/async stuff in 
>> the mount output.
>> I also repeated the tests on an identical box that had Fedora installed 
>> (2.6.9-22-ELsmp) and used ext3.
>> select-smack results were, as expected, almost the same since it 
>> doesn't touch the FS; update was about 8000 q/s.
>>
>> I'm relatively sure that this kind of huge performance difference 
>> can't be explained by the mere speed difference of time(). I haven't 
>> yet tested phk's and Robert's timer hacks, but at some point in time I 
>> rewrote mysql's timing code to completely avoid any calls to time() 
>> by keeping an internal timestamp that was updated from the TSC 
>> register value. It was certainly very ugly and imprecise, but worked 
>> well enough since mysql uses these code paths mainly for statistics 
>> and for setting various safeguard timeouts. Even with ~90% of time() 
>> calls removed the performance still didn't get measurably better.
>> Of course it's possible that I fucked up somehow, so if someone has 
>> tested Robert's and phk's changes then it would certainly be nice to 
>> hear about your results.
>>
>> To make a long story short: does anyone have any good ideas about 
>> where the bottleneck might be and how to debug it?
>>
>> PS
>> Here's some system/test information:
>> super-smack was used with concurrency of 100 and reqs. set to 10000
>> it was running on the same machine as the mysqld and connections were 
>> done over local socket.
>>
>> timer: acpi-fast in all the cases
>> mysql: 4.1.18_2 from ports, table type is myisam
>> mysql configuration file:
>> http://bsd.ee/~hadara/debug/mysql3/2way/my.cnf
>> in general it's just my-huge.cnf from mysql distribution, with 
>> increased max_connections
>>
>> kernel config is GENERIC-SMP (no it doesn't have WITNESS enabled)
>> == 4 * dualcore opteron ==:
>> vmstat 1, during select-smack test:
>> http://bsd.ee/~hadara/debug/mysql3/8way/vmstat.txt
>> dmesg:
>> http://bsd.ee/~hadara/debug/mysql3/8way/dmesg.boot
>> sysctl -a:
>> http://bsd.ee/~hadara/debug/mysql3/8way/sysctl.txt
>>
>> == 1 * dualcore opteron ==:
>> vmstat 1, during select-smack test:
>> http://bsd.ee/~hadara/debug/mysql3/2way/vmstat.txt
>> dmesg:
>> http://bsd.ee/~hadara/debug/mysql3/2way/dmesg.boot
>> sysctl -a:
>> http://bsd.ee/~hadara/debug/mysql3/2way/sysctl.txt
>> _______________________________________________
>> freebsd-performance at freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-performance
>> To unsubscribe, send any mail to 
>> "freebsd-performance-unsubscribe at freebsd.org"


