Performance problems with pagedaemon
Victor Balada Diaz
victor at bsdes.net
Mon Jan 2 13:41:55 UTC 2012
Thanks a lot for your fast reply Jeremy!
On Mon, Jan 02, 2012 at 04:59:29AM -0800, Jeremy Chadwick wrote:
> On Mon, Jan 02, 2012 at 01:25:49PM +0100, Victor Balada Diaz wrote:
> > Hello,
> >
> > First: Happy new year to everybody!
> >
> > Last year i had a problem with pagedaemon that i reported here[1] but i had
> > no replies. The problem went away and now it's back to beat me again.
> >
> > Seems that mysql creates some bad interactions with pagedaemon and system starts
> > to get slow for disk access. Eg, accessing a man page takes a few seconds. If
> > i stop mysql and start it again, it recovers again. As this is a problem that have
> > appeared just twice (on October and now), it's not easy to reproduce and test.
> >
> > Disk I/O as per gstat is not high, not many context switches, etc. Bottleneck
> > seems to be CPU. I've reported more information on october's mail[1].
> >
> > What can i do to debug it? Seems that maybe the most interesting thing would be
> > enabling lock profiling and wait for the system to fail again, but is there anything
> > that i can do/report that would be helpful to track the problem down now that's
> > failing?
> >
> > Thanks a lot.
> > Regards.
> >
> > [1]: http://lists.freebsd.org/pipermail/freebsd-stable/2011-October/064248.html
>
> The only things I can think of are output from the following:
>
> * vmstat -z
> * sysctl -a vm
> * top (mainly for mysqld)
> * top -H -S
Attached at the end.
>
> Is this machine in question still running 8.1-RELEASE? If so, have you
> tried upgrading it to 8.2-STABLE? I believe there have been some VM
> adjustments, but I'd have to go back and examine commits.
8.1-RELEASE-p5, yes. I didn't try updating to 8.2, but i'll try upgrading
to 9.0 once it's out. Once i upgrade i think it will take a few months until i can
say if it's fixed or not.
I must say that this problem just appeared twice on all of our servers. The rest have
been rock solid with 8.1 RELEASE-p* and no problems in a long time.
>
> Taken from our mysql server. Note that the mysql storage filesystem is
> stored on ZFS as well, which has added benefits/complexities yet we see
> no issue.
We're using UFS for filesystem with RAID 10 in HW, controller is Adaptec 5405.
So far we've just tested ZFS on FreeBSD for our backup servers and don't feel
it's a good choice for our database needs as it wastes much memory and don't
get us enough benefits.
Out of curiosity: why did you prefer ZFS over UFS for big databases? The biggest
reason i can think of, is snapshots and backups, but with xtrabackup available
and in ports, it's not that useful.
vmstat -z:
ITEM SIZE LIMIT USED FREE REQUESTS FAILURES
UMA Kegs: 208, 0, 104, 15, 104, 0
UMA Zones: 576, 0, 104, 4, 104, 0
UMA Slabs: 568, 0, 3880, 495, 463422, 0
UMA RCntSlabs: 568, 0, 3006, 592, 682012, 0
UMA Hash: 256, 0, 0, 0, 3, 0
16 Bucket: 152, 0, 47, 153, 204, 0
32 Bucket: 280, 0, 130, 122, 340, 2
64 Bucket: 536, 0, 94, 95, 583, 86
128 Bucket: 1048, 0, 894, 15, 153173, 287797
VM OBJECT: 216, 0, 33316, 50834, 86207077, 0
MAP: 232, 0, 7, 25, 7, 0
KMAP ENTRY: 120, 786315, 80, 1346, 6366241, 0
MAP ENTRY: 120, 0, 1590, 2347, 212747051, 0
DP fakepg: 120, 0, 0, 124, 21, 0
SG fakepg: 120, 0, 0, 0, 0, 0
mt_zone: 2056, 0, 260, 3, 260, 0
16: 16, 0, 2336, 1696, 6042441, 0
32: 32, 0, 3333, 2424, 426544, 0
64: 64, 0, 8137, 1271, 986811990, 0
128: 128, 0, 15488, 3826, 25899478, 0
256: 256, 0, 581, 2059, 47822165, 0
512: 512, 0, 2972, 1949, 9053615, 0
1024: 1024, 0, 54, 358, 24855, 0
2048: 2048, 0, 56, 266, 41473, 0
4096: 4096, 0, 396, 2047, 3139352, 0
Files: 80, 0, 487, 2258, 54800616, 0
TURNSTILE: 136, 0, 4045, 435, 4413, 0
umtx pi: 96, 0, 0, 0, 0, 0
MAC labels: 40, 0, 0, 0, 0, 0
PROC: 1120, 0, 55, 2501, 2921033, 0
THREAD: 984, 0, 2654, 1390, 22933, 0
SLEEPQUEUE: 80, 0, 4045, 537, 4413, 0
VMSPACE: 392, 0, 35, 2125, 2921015, 0
cpuset: 72, 0, 2, 98, 2, 0
audit_record: 952, 0, 0, 0, 0, 0
mbuf_packet: 256, 0, 256, 1664, 379896799, 0
mbuf: 256, 0, 1, 1664, 1522479226, 0
mbuf_cluster: 2048, 25600, 1920, 1816, 593152, 0
mbuf_jumbo_page: 4096, 12800, 0, 1138, 257071656, 0
mbuf_jumbo_9k: 9216, 6400, 0, 0, 0, 0
mbuf_jumbo_16k: 16384, 3200, 0, 0, 0, 0
mbuf_ext_refcnt: 4, 0, 0, 0, 0, 0
g_bio: 232, 0, 8, 1208, 847789577, 0
ttyinq: 160, 0, 150, 282, 570, 0
ttyoutq: 256, 0, 80, 205, 304, 0
ata_request: 320, 0, 0, 0, 0, 0
ata_composite: 336, 0, 0, 0, 0, 0
VNODE: 472, 0, 29655, 55617, 10698733, 0
VNODEPOLL: 112, 0, 0, 0, 0, 0
S VFS Cache: 108, 0, 27906, 60204, 10211590, 0
L VFS Cache: 328, 0, 3755, 22285, 1665549, 0
NAMEI: 1024, 0, 0, 1536, 188198737, 0
DIRHASH: 1024, 0, 0, 1404, 401913, 0
NFSMOUNT: 616, 0, 0, 0, 0, 0
NFSNODE: 656, 0, 0, 0, 0, 0
pipe: 728, 0, 3, 2437, 1993959, 0
ksiginfo: 112, 0, 2546, 1810, 142136, 0
itimer: 344, 0, 1, 21, 1, 0
KNOTE: 128, 0, 0, 522, 86920, 0
socket: 680, 25602, 54, 2100, 525540, 0
ipq: 56, 819, 0, 0, 0, 0
udp_inpcb: 336, 25608, 9, 266, 141389, 0
udpcb: 16, 25704, 9, 1671, 141389, 0
tcp_inpcb: 336, 25608, 20, 3137, 152210, 0
tcpcb: 880, 25600, 20, 764, 152210, 0
tcptw: 72, 5150, 0, 3150, 10894, 0
syncache: 144, 15366, 0, 338, 151385, 0
hostcache: 136, 15372, 2, 754, 741, 0
tcpreass: 40, 1680, 0, 336, 168, 0
sackhole: 32, 0, 0, 1313, 535067, 0
sctp_ep: 1272, 25602, 0, 0, 0, 0
sctp_asoc: 2232, 40000, 0, 0, 0, 0
sctp_laddr: 48, 80064, 0, 144, 4, 0
sctp_raddr: 616, 80004, 0, 0, 0, 0
sctp_chunk: 136, 400008, 0, 0, 0, 0
sctp_readq: 104, 400032, 0, 0, 0, 0
sctp_stream_msg_out: 96, 400026, 0, 0, 0, 0
sctp_asconf: 40, 400008, 0, 0, 0, 0
sctp_asconf_ack: 48, 400032, 0, 0, 0, 0
ripcb: 336, 25608, 0, 44, 35, 0
unpcb: 240, 25600, 24, 2104, 231904, 0
rtentry: 200, 0, 11, 46, 11, 0
selfd: 56, 0, 4960, 1592, 440025550, 0
SWAPMETA: 288, 116519, 0, 0, 0, 0
ip4flow: 56, 295974, 18, 1557, 285545, 0
ip6flow: 80, 295965, 0, 0, 0, 0
Mountpoints: 752, 0, 6, 4, 6, 0
FFS inode: 168, 0, 29619, 60251, 10698085, 0
FFS1 dinode: 128, 0, 0, 0, 0, 0
FFS2 dinode: 256, 0, 29619, 59646, 10698085, 0
pfsrctrpl: 152, 10000, 0, 0, 0, 0
pfrulepl: 912, 0, 7, 5, 7, 0
pfstatepl: 392, 10000, 2, 2748, 10439, 0
pfaltqpl: 240, 0, 0, 0, 0, 0
pfpooladdrpl: 88, 0, 1, 83, 1, 0
pfrktable: 1296, 1002, 2, 37, 181, 0
pfrkentry: 216, 100008, 13, 167, 189, 0
pfrkentry2: 216, 0, 0, 0, 0, 0
pffrent: 32, 5050, 0, 0, 0, 0
pffrag: 80, 0, 0, 0, 0, 0
pffrcache: 80, 10035, 0, 0, 0, 0
pffrcent: 24, 50022, 0, 0, 0, 0
pfstatescrub: 40, 0, 0, 0, 0, 0
pfiaddrpl: 120, 0, 0, 0, 0, 0
pfospfen: 112, 0, 696, 30, 696, 0
pfosfp: 40, 0, 407, 97, 407, 0
sysctl -a vm:
vm.vmtotal:
System wide totals computed every five seconds: (values in kilobytes)
===============================================
Processes: (RUNQ: 2 Disk Wait: 0 Page Wait: 0 Sleep: 51)
Virtual Memory: (Total: 1098017100K, Active 24065448K)
Real Memory: (Total: 21157668K Active 20971144K)
Shared Virtual Memory: (Total: 27740K Active: 8632K)
Shared Real Memory: (Total: 6128K Active: 4928K)
Free Memory Pages: 315636K
vm.loadavg: { 3.26 4.55 4.71 }
vm.v_free_min: 38447
vm.v_free_target: 161771
vm.v_free_reserved: 7983
vm.v_inactive_target: 242656
vm.v_cache_min: 161771
vm.v_cache_max: 323542
vm.v_pageout_free_min: 34
vm.pageout_algorithm: 0
vm.swap_enabled: 1
vm.kmem_size_scale: 3
vm.kmem_size_max: 329853485875
vm.kmem_size_min: 0
vm.kmem_size: 8319258624
vm.nswapdev: 1
vm.dmmax: 32
vm.swap_async_max: 4
vm.overcommit: 0
vm.swap_reserved: 891061784576
vm.swap_total: 4294967296
vm.zone_count: 106
vm.swap_idle_threshold2: 10
vm.swap_idle_threshold1: 2
vm.kstacks: 2654
vm.kstack_cache_size: 128
vm.exec_map_entries: 16
vm.stats.misc.zero_page_count: 0
vm.stats.misc.cnt_prezero: 0
vm.stats.vm.v_kthreadpages: 0
vm.stats.vm.v_rforkpages: 0
vm.stats.vm.v_vforkpages: 376159858
vm.stats.vm.v_forkpages: 387513645
vm.stats.vm.v_kthreads: 21
vm.stats.vm.v_rforks: 0
vm.stats.vm.v_vforks: 1446951
vm.stats.vm.v_forks: 1474070
vm.stats.vm.v_interrupt_free_min: 2
vm.stats.vm.v_pageout_free_min: 34
vm.stats.vm.v_cache_max: 323542
vm.stats.vm.v_cache_min: 161771
vm.stats.vm.v_cache_count: 20031
vm.stats.vm.v_inactive_count: 488736
vm.stats.vm.v_inactive_target: 242656
vm.stats.vm.v_active_count: 4755474
vm.stats.vm.v_wire_count: 769895
vm.stats.vm.v_free_count: 58856
vm.stats.vm.v_free_min: 38447
vm.stats.vm.v_free_target: 161771
vm.stats.vm.v_free_reserved: 7983
vm.stats.vm.v_page_count: 6093208
vm.stats.vm.v_page_size: 4096
vm.stats.vm.v_tfree: 1826977943
vm.stats.vm.v_pfree: 432842089
vm.stats.vm.v_dfree: 0
vm.stats.vm.v_tcached: 115172741
vm.stats.vm.v_pdpages: 771431101
vm.stats.vm.v_pdwakeups: 185993
vm.stats.vm.v_reactivated: 10374109
vm.stats.vm.v_intrans: 81714
vm.stats.vm.v_vnodepgsout: 0
vm.stats.vm.v_vnodepgsin: 43701
vm.stats.vm.v_vnodeout: 0
vm.stats.vm.v_vnodein: 13429
vm.stats.vm.v_swappgsout: 0
vm.stats.vm.v_swappgsin: 0
vm.stats.vm.v_swapout: 0
vm.stats.vm.v_swapin: 0
vm.stats.vm.v_ozfod: 25884
vm.stats.vm.v_zfod: 426155747
vm.stats.vm.v_cow_optim: 77306
vm.stats.vm.v_cow_faults: 229289109
vm.stats.vm.v_vm_faults: 878863698
vm.stats.sys.v_soft: 1298923565
vm.stats.sys.v_intr: 745814299
vm.stats.sys.v_syscall: 1579413083
vm.stats.sys.v_trap: 961158308
vm.stats.sys.v_swtch: 64043380
vm.stats.object.bypasses: 931626
vm.stats.object.collapses: 3569975
vm.v_free_severe: 23215
vm.max_proc_mmap: 693271
vm.old_msync: 0
vm.msync_flush_flags: 3
vm.boot_pages: 48
vm.max_wired: 2016902
vm.pageout_lock_miss: 0
vm.disable_swapspace_pageouts: 0
vm.defer_swapspace_pageouts: 1
vm.swap_idle_enabled: 0
vm.pageout_stats_interval: 5
vm.pageout_full_stats_interval: 20
vm.pageout_stats_max: 161771
vm.max_launder: 32
vm.phys_segs:
SEGMENT 0:
start: 0x1000
end: 0x98000
free list: 0xffffffff80c68a68
SEGMENT 1:
start: 0xeaf000
end: 0x1000000
free list: 0xffffffff80c68a68
SEGMENT 2:
start: 0x1000000
end: 0xbf780000
free list: 0xffffffff80c686c0
SEGMENT 3:
start: 0x100000000
end: 0x611030000
free list: 0xffffffff80c686c0
vm.phys_free:
FREE LIST 0:
ORDER (SIZE) | NUMBER
| POOL 0 | POOL 1 | POOL 2
-- -- -- -- -- -- -- --
12 ( 16384K) | 0 | 0 | 0
11 ( 8192K) | 0 | 0 | 0
10 ( 4096K) | 0 | 0 | 0
9 ( 2048K) | 0 | 0 | 0
8 ( 1024K) | 0 | 0 | 0
7 ( 512K) | 0 | 0 | 0
6 ( 256K) | 0 | 0 | 0
5 ( 128K) | 0 | 0 | 0
4 ( 64K) | 0 | 0 | 0
3 ( 32K) | 0 | 0 | 0
2 ( 16K) | 36 | 0 | 1
1 ( 8K) | 1 | 0 | 134
0 ( 4K) | 1 | 39 | 23272
FREE LIST 1:
ORDER (SIZE) | NUMBER
| POOL 0 | POOL 1 | POOL 2
-- -- -- -- -- -- -- --
12 ( 16384K) | 0 | 0 | 0
11 ( 8192K) | 0 | 0 | 0
10 ( 4096K) | 0 | 0 | 0
9 ( 2048K) | 0 | 0 | 0
8 ( 1024K) | 1 | 0 | 0
7 ( 512K) | 0 | 0 | 0
6 ( 256K) | 2 | 0 | 0
5 ( 128K) | 1 | 0 | 0
4 ( 64K) | 2 | 0 | 0
3 ( 32K) | 2 | 0 | 0
2 ( 16K) | 0 | 0 | 0
1 ( 8K) | 1 | 0 | 0
0 ( 4K) | 2 | 0 | 0
vm.reserv.reclaimed: 0
vm.reserv.partpopq:
LEVEL SIZE NUMBER
-1: 218756K, 273
vm.reserv.freed: 338501
vm.reserv.broken: 6551
vm.idlezero_enable: 0
vm.kvm_free: 538066284544
vm.kvm_size: 549755809792
vm.pmap.pmap_collect_active: 0
vm.pmap.pmap_collect_inactive: 0
vm.pmap.pv_entry_spare: 15150
vm.pmap.pv_entry_allocs: 2698367488
vm.pmap.pv_entry_frees: 2693316934
vm.pmap.pc_chunk_tryfail: 0
vm.pmap.pc_chunk_frees: 15239956
vm.pmap.pc_chunk_allocs: 15270109
vm.pmap.pc_chunk_count: 30153
vm.pmap.pv_entry_count: 5050554
vm.pmap.pdpe.demotions: 0
vm.pmap.pde.promotions: 21739
vm.pmap.pde.p_failures: 1818560
vm.pmap.pde.mappings: 0
vm.pmap.pde.demotions: 19898
vm.pmap.shpgperproc: 200
vm.pmap.pv_entry_max: 7326008
vm.pmap.pg_ps_enabled: 1
Top:
last pid: 24777; load averages: 3.26, 4.07, 4.49 up 34+19:43:58 14:32:05
66 processes: 5 running, 61 sleeping
CPU: 0.2% user, 0.0% nice, 37.6% system, 0.0% interrupt, 62.2% idle
Mem: 18G Active, 1908M Inact, 3008M Wired, 73M Cache, 2465M Buf, 232M Free
Swap: 4096M Total, 4096M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
1059 mysql 22 44 0 21004M 19741M ucond 3 71.2H 124.51% mysqld
24775 nagios 1 103 0 5332K 1088K CPU0 0 0:01 34.28% sh
24776 nagios 1 103 0 6156K 792K CPU6 6 0:01 34.28% grep
24774 nagios 1 76 0 8264K 1552K wait 4 0:00 26.07% sh
24766 nagios 1 76 0 8264K 1552K piperd 0 0:05 22.85% sh
24764 root 1 53 0 9336K 2248K CPU4 8 0:05 17.58% top
Top -mio:
PID USERNAME VCSW IVCSW READ WRITE FAULT TOTAL PERCENT COMMAND
1059 mysql 44 7 5 68 0 73 4.55% mysqld
24810 nagios 0 0 0 0 0 0 0.00% nrpe2
24809 nagios 1 6 0 0 0 0 0.00% bsdtar
24808 nagios 0 0 0 0 0 0 0.00% sed
24807 nagios 0 22 0 0 0 0 0.00% bzip2
top -HS:
last pid: 24835; load averages: 5.10, 4.46, 4.62 up 34+19:44:48 14:32:55
155 processes: 17 running, 112 sleeping, 26 waiting
CPU: 0.0% user, 0.0% nice, 41.7% system, 0.0% interrupt, 58.2% idle
Mem: 18G Active, 1912M Inact, 3008M Wired, 75M Cache, 2465M Buf, 232M Free
Swap: 4096M Total, 4096M Free
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
8 root 76 - 0K 16K CPU11 11 254:27 100.00% pagedaemon
1059 mysql 116 0 21002M 19741M CPU0 0 59.9H 92.09% {mysqld}
11 root 171 ki31 0K 192K CPU9 9 833.4H 89.16% {idle: cpu9}
11 root 171 ki31 0K 192K RUN 11 831.5H 63.48% {idle: cpu11}
11 root 171 ki31 0K 192K CPU5 5 827.5H 61.77% {idle: cpu5}
11 root 171 ki31 0K 192K CPU8 8 832.5H 57.18% {idle: cpu8}
11 root 171 ki31 0K 192K CPU3 3 830.4H 54.49% {idle: cpu3}
11 root 171 ki31 0K 192K RUN 4 825.5H 49.76% {idle: cpu4}
11 root 171 ki31 0K 192K CPU7 7 829.1H 47.36% {idle: cpu7}
11 root 171 ki31 0K 192K RUN 1 824.7H 41.26% {idle: cpu1}
11 root 171 ki31 0K 192K CPU6 6 829.3H 38.48% {idle: cpu6}
24830 root 104 0 9336K 2080K CPU1 1 0:06 37.99% top
11 root 171 ki31 0K 192K CPU10 10 831.9H 36.87% {idle: cpu10}
1059 mysql 76 0 21002M 19741M ucond 4 77:41 29.49% {mysqld}
1059 mysql 67 0 21002M 19741M CPU4 4 241:24 25.49% {mysqld}
11 root 171 ki31 0K 192K RUN 2 816.8H 23.00% {idle: cpu2}
3 root -8 - 0K 16K - 6 84:12 14.45% g_up
1059 mysql 76 0 21002M 19741M ucond 3 115:50 10.16% {mysqld}
no loader.conf tunables.
Sysctl modifications from /etc/sysctl.conf:
kern.timecounter.hardware=HPET
vm.defer_swapspace_pageouts=1
My.cnf (with stripped comments for saving a few kbs):
[client]
port = 3306
socket = /tmp/mysql.sock
[mysqld]
group_concat_max_len=32768
datadir = /mnt/garaje/var/db/mysql/
port = 3306
socket = /tmp/mysql.sock
local_infile=0
optimizer_search_depth=4
skip-symlink
safe-user-create=1
secure_auth=1
expire_logs_days=10
skip-name-resolve
default-storage-engine=innodb
default_table_type = innodb
back_log = 50
bind-address=127.0.0.1
max_connections = 100
max_connect_errors = 10
table_cache = 4096
max_allowed_packet = 16M
binlog_cache_size = 1M
max_heap_table_size = 128M
sort_buffer_size = 8M
join_buffer_size = 8M
thread_cache_size = 8
thread_concurrency = 8
query_cache_size = 64M
query_cache_limit = 2M
ft_min_word_len = 3
thread_stack = 192K
transaction_isolation = REPEATABLE-READ
tmp_table_size = 128M
log-bin=mysql-bin
binlog_format=mixed
log_slave_updates
log_slow_queries
long_query_time = 2
log_long_format
server-id = 2
replicate-same-server-id = 0
key_buffer_size = 32M
read_buffer_size = 2M
read_rnd_buffer_size = 16M
bulk_insert_buffer_size = 64M
myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 10G
myisam_max_extra_sort_file_size = 10G
myisam_repair_threads = 1
myisam_recover
innodb_rollback_on_timeout=1
innodb_additional_mem_pool_size = 20M
innodb_buffer_pool_size = 16G
innodb_data_file_path = ibdata1:10M:autoextend
innodb_file_per_table
innodb_file_io_threads = 4
innodb_thread_concurrency = 8
innodb_flush_log_at_trx_commit = 1
innodb_log_buffer_size = 8M
innodb_log_file_size = 256M
innodb_log_files_in_group = 3
innodb_max_dirty_pages_pct = 90
innodb_flush_method=O_DIRECT
innodb_lock_wait_timeout = 60
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
[isamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M
[myisamchk]
key_buffer = 512M
sort_buffer_size = 512M
read_buffer = 8M
write_buffer = 8M
[mysqlhotcopy]
interactive-timeout
[mysqld_safe]
open-files-limit = 8192
--
La prueba más fehaciente de que existe vida inteligente en otros
planetas, es que no han intentado contactar con nosotros.
More information about the freebsd-stable
mailing list