PostgreSQL performance profiling

Sun Jun 11 18:01:10 UTC 2006

On Sun, 11 Jun 2006, Kris Kennaway wrote:

> * The postgres processes seem to change their proctitle hundreds or 
> thousands of times per second.  This is currently done via a Giant-locked 
> sysctl (kern.proc.args) so there is enormous contention for Giant.  Even 
> when this is fixed (thanks to a patch from csjp@), each of them requires a 
> syscall and syscalls ain't free.  This is not a clever thing to be doing 
> from a performance standpoint.

You might try disabling setproctitle() entirely to see what impact that has.

> * pgsql uses select() and this seems to be a major choke point.  I bet you'd 
> see fairly impressive performance gains (especially on SMP) if it was 
> modified to use kqueue instead of select.
>
> * You really want to avoid using IPv6 for transport (since it's 
> Giant-locked).  This was an issue at first since I was running against 
> localhost, which maps to ::1 by default.  We should reconsider the 
> preference for IPv6 over IPv4 until IPv6 is Giant-free - there are probably 
> many other situations where IPv6 is being secretly used "because it is 
> there" and costing performance.

FYI, for purely loopback traffic, it's probably safe to mark the IPv6 netisr 
as MPSAFE.  Pass NETISR_MPSAFE as the flags argument, in place of the trailing 
0, in the following line in ip6_input.c:

ip6_input.c:    netisr_register(NETISR_IPV6, ip6_input, &ip6intrq, 0);

If you have non-loopback traffic, however, you may put yourself at greater 
risk of a panic in the IPv6 multicast and neighbor discovery code, so this 
should be done with caution.  It might be an interesting exercise, though.

> * The sysv IPC code is still Giant-locked.  pgsql makes a lot of semop() 
> calls which grab Giant, and it also msleep()s on the Giant lock in the 
> semwait channel.

It is likely quite easy to put subsystem locks around the System V IPC 
subsystems.  I'm a bit surprised no one has done it already.  sysvshm is a bit 
trickier, but sysvsem and sysvmsg should be quite straightforward.

> * When semop() wants to wake up some sleeping processes because semaphores 
> have been released, it does a wakeup() and wakes them all up.  This means a 
> thundering herd (I see up to 11 CPUs being woken here).  Since we know 
> exactly how many resources are available, it would be better to only 
> wakeup_one() that number of times instead.

Should be easy to experiment with.
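
To make the idea concrete, here is a rough userland analogy using pthreads 
rather than the kernel's sleep channels.  It is only a sketch: struct pool, 
waiter(), release() and run_demo() are made-up names, and 
pthread_cond_signal() merely stands in for wakeup_one().  The releaser issues 
one single-waiter wakeup per freed resource instead of broadcasting to 
everyone asleep on the channel.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userland stand-in for a semaphore and its wait channel. */
struct pool {
	pthread_mutex_t lock;
	pthread_cond_t  avail_cv;  /* waiters sleep here */
	pthread_cond_t  taken_cv;  /* demo only: lets the driver see progress */
	int             available; /* resources free right now */
	int             taken;     /* resources handed out so far */
};

static void *
waiter(void *arg)
{
	struct pool *p = arg;

	pthread_mutex_lock(&p->lock);
	while (p->available == 0)          /* re-check after every wakeup */
		pthread_cond_wait(&p->avail_cv, &p->lock);
	p->available--;
	p->taken++;
	pthread_cond_signal(&p->taken_cv);
	pthread_mutex_unlock(&p->lock);
	return (NULL);
}

/* Free n resources and issue n single wakeups instead of one broadcast. */
static void
release(struct pool *p, int n)
{
	pthread_mutex_lock(&p->lock);
	p->available += n;
	for (int i = 0; i < n; i++)
		pthread_cond_signal(&p->avail_cv); /* analogous to wakeup_one() */
	pthread_mutex_unlock(&p->lock);
}

/*
 * Start nwaiters threads, free `first` resources, and return how many
 * threads obtained one.  The remainder is then freed so all threads exit.
 */
int
run_demo(int nwaiters, int first)
{
	struct pool p = { .available = 0, .taken = 0 };
	pthread_t tid[64];
	int got;

	assert(nwaiters >= 0 && nwaiters <= 64);
	assert(first >= 0 && first <= nwaiters);
	pthread_mutex_init(&p.lock, NULL);
	pthread_cond_init(&p.avail_cv, NULL);
	pthread_cond_init(&p.taken_cv, NULL);
	for (int i = 0; i < nwaiters; i++)
		pthread_create(&tid[i], NULL, waiter, &p);

	release(&p, first);
	pthread_mutex_lock(&p.lock);
	while (p.taken < first)            /* wait for the first batch */
		pthread_cond_wait(&p.taken_cv, &p.lock);
	got = p.taken;
	pthread_mutex_unlock(&p.lock);

	release(&p, nwaiters - first);     /* let the remaining threads finish */
	for (int i = 0; i < nwaiters; i++)
		pthread_join(tid[i], NULL);
	pthread_cond_destroy(&p.taken_cv);
	pthread_cond_destroy(&p.avail_cv);
	pthread_mutex_destroy(&p.lock);
	return (got);
}
```

With 8 waiters and 3 resources freed, run_demo(8, 3) returns 3: only three 
threads make progress and the other five stay asleep until the second 
release.  A broadcast would give the same final result, but every sleeper 
would wake, contend for the lock, and mostly go straight back to sleep.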

Robert N M Watson
Computer Laboratory
University of Cambridge