Postgresql locks up server - no response at all
Jeremy Chadwick
freebsd at jdc.parodius.com
Wed Aug 4 13:34:57 PDT 2004
I've seen this with our SuperMicro SuperServer 5013C-T, running mysqld.
Please note that the server is "heavily loaded" (note the quotes): the
load average usually sits around 0.50 to 1.00, with mysqld being the top
process. The server runs the latest -CURRENT builds.
Many people over in freebsd-threads mentioned this problem, and recommended
all sorts of different workarounds. I tried every one available to me,
except mucking with PREEMPTION (as I did not feel comfortable tinkering
with a random .h file on the box; seemed to be a kernel-related thing,
so I'd rather have just an "options" line for it -- I'm conditionally
lazy).
The locks are exactly as you describe: random hard locks. The box never
drops into KDB/DDB/GDB; it just hard-locks with nothing in the logs anywhere.
There's been (very recent) discussion here about lock-up problems seeming
load-related. This is starting to sound very probable for a lot of reasons.
Here's a list of all the combinations of things I've tried to *no avail*.
The solution for us was to move mysqld to a 4.x machine. Since then,
the -CURRENT box has managed to stay up for 3.5 days without any trouble:
=====
SuperMicro SuperServer 5013C-T
P4, 2.6GHz (for HTT settings, see below)
1GB ECC DDR400
For many months this machine worked fine under heavy load, SMP enabled, ACPI enabled,
APIC enabled. Sometime in early-to-mid July things became unstable; I update my
kernel/world every 1-2 weeks. The only other difference between "then and now" is
that the box runs MySQL (mysqld) 4.0.20; mysqld is not very heavily loaded (at least
in comparison to some other posters' systems I've seen...)
System can usually stay up about 48-72 hours before dying.
Initial configuration
* KERNEL: SCHED_ULE
* KERNEL: Disabled INVARIANT* and WITNESS*
* KERNEL: SMP enabled, APIC enabled
* BIOS: HTT enabled, APIC enabled, ACPI enabled
* /etc/make.conf has CPUTYPE=p4 (seems to be required for mysqld to work, else sig11)
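For reference, the initial configuration above would look roughly like this
in a kernel config file. This is only a sketch, not the actual file from the
box: reading INVARIANT* and WITNESS* as the four option names shown is an
assumption, and the BIOS settings (HTT, APIC, ACPI) have no kernel config
counterpart here.

```
# Sketch of the initial kernel config (assumed layout, not the real file)
options 	SCHED_ULE		# ULE scheduler
options 	SMP			# SMP enabled
device		apic			# I/O APIC enabled

# Debugging disabled; "INVARIANT*" and "WITNESS*" are assumed to mean these:
#options 	INVARIANTS
#options 	INVARIANT_SUPPORT
#options 	WITNESS
#options 	WITNESS_SKIPSPIN
```

CPUTYPE=p4 is not a kernel option; it lives in /etc/make.conf as listed above.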
Now the problems begin. Here are my attempted changes...
* KERNEL: SCHED_4BSD --> SCHED_ULE
  KERNEL: Enabled KDB and DDB
  !! Random locks.
* KERNEL: Enabled INVARIANT* and WITNESS*
  !! Random locks.
* LOADER: Temporary ACPI disable (via loader(8) only; BIOS still has ACPI enabled).
  Kernel panic:
    pci0: <PCI bus> on pcib0
    panic: Multiple entries for PCI IRQ 18
    cpuid = 0;
    KDB: enter: panic
    [thread 0]
    Stopped at kdb_enter+0x30: movl %ebp,%esp
* BIOS: MPS 1.4 --> 1.1
  No idea if this worked, because we did the following after reading freebsd-threads:
* BIOS: Disabled HTT
  BIOS: MPS 1.1 --> 1.4
  KERNEL: SCHED_ULE --> SCHED_4BSD
  KERNEL: Disabled INVARIANT* and WITNESS*
  !! Random locks.
Thu Jul 29 04:16 PDT
* BIOS: Disabled APIC
  KERNEL: Disabled SMP, disabled APIC
  KERNEL: Enabled INVARIANT* and WITNESS*
  NOTE: Because of the latest gcc 3.4 import, I was forced to rebuild world too.
  NOTE: Prior to now, world was built WITHOUT CPUTYPE=p4. If this matters at all...
  !! Random locks.
Sat Jul 31 13:08 PDT
* MYSQL: Recompiled 4.0.20 with WITH_PROC_SCOPE_PTH=yes.
  MYSQL: The 4.0.20 rebuild obviously now included CPUTYPE=p4.
  !! Random locks.
Sun Aug 1 03:01:09 PDT 2004
* Ended up moving the mysql server portion to a 4.x box, in an attempt to
  see if the 5.x box still hard-locks without mysqld.
Wed Aug 4 13:28:35 PDT 2004
* -CURRENT box is still alive and well.
=====
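To make the last failing setup concrete, the single-CPU debug kernel from the
Jul 29 step would correspond roughly to the fragment below. Again a sketch
under assumptions, not the real config: it assumes SCHED_4BSD and KDB/DDB
were carried over from the earlier steps, and that INVARIANT* and WITNESS*
mean the four option names shown.

```
# Sketch of the single-CPU debug kernel (assumed, not the real file)
options 	SCHED_4BSD		# assumed kept from the HTT-disable step
#options 	SMP			# disabled
#device		apic			# disabled (also disabled in BIOS)

options 	KDB			# assumed still enabled from the first step
options 	DDB
options 	INVARIANTS
options 	INVARIANT_SUPPORT
options 	WITNESS
options 	WITNESS_SKIPSPIN
```

Even with all of this enabled, the box hard-locked without ever entering the
debugger.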
Since our box exhibited lock-ups even as a pure single-CPU system (i.e. no
HTT and no SMP in the kernel), I'm starting to think, as mentioned above,
that high load is what causes them.
--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. |
On Wed, Aug 04, 2004 at 03:58:53PM -0400, Sven Willenberger wrote:
> FreeBSD 5.2.1-P8 running on dual Xeon supermicro system with vinum data
> drive and em network interfaces. I have been having a problem with the
> system simply locking up every couple days. No response from the
> keyboard, network, nothing. As if it is in some state of IRQ locking. I
> see nothing in the messages, even with DDB and DDB_UNATTENDED enabled in
> kernel. The system runs 4GB of ram with the following modifications to
> kernel:
>
> cpu I486_CPU
> cpu I586_CPU
> cpu I686_CPU
> <snip>
> options SHMMAXPGS=65536 # ********************
> options SEMMNI=40 # added for postgresql
> options SEMMNS=240 # allows for around
> options SEMUME=40 # 180 simultaneous connections
> options SEMMNU=120 # ********************
> <snip>
> # Debugging for use in -current
> options DDB #Enable the kernel debugger
> options DDB_UNATTENDED #Don't panic on DDB but log it
> #options INVARIANTS #Enable calls of extra sanity checking
> options INVARIANT_SUPPORT #Extra sanity checks of internal
> #options WITNESS #Enable checks to detect dead ..
> #options WITNESS_SKIPSPIN #Don't run witness on spinlocks
> # Deal with kmem issues
> options VM_KMEM_SIZE_SCALE="4"
> options VM_KMEM_SIZE_MAX="(512*1024*1024)"
> options KVA_PAGES=512
>
>
> /boot/loader.conf:
> vinum_load="YES"
> vinum.autostart="YES"
> #kern.maxdsiz="1073741824"
> #kern.dfldsiz="1073741824"
>
> I had experimented in loader.conf with the dsiz settings to no avail,
> still get lockups. Got lockups with and without the DDB settings. It
> would be helpful if I could see some type of error being generated, but
> nothing; the attached terminal has utterly no messages beyond normal
> system messages, everything just stops responding.
>
> After the last lockup and reboot, I sysctl machdep.hlt_logical_cpus=1 to
> see if that had any effect. Any other recommendations? adaptive_mutexes?
> Any ideas on how to actually find out what is happening?
>
> Sven
>
> _______________________________________________
> freebsd-current at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe at freebsd.org"