I/O hanging while hosting Postgres database

Dustin Wenz dustinwenz at ebureau.com
Wed Feb 6 23:20:33 UTC 2013


I'm seeing a condition on FreeBSD 9.1 (built October 24th) where I/O seems to hang on all local zpools after several hours of hosting a large-ish Postgres database. The database occupies about 14TB of a 38TB zpool with a single SSD ZIL. The OS is on a ZFS boot disk. The system has 24GB of physical memory. smartmontools reports no errors on any disks attached to the system, and IPMI reports that all temperatures, CPU voltages, and fan speeds are normal.

The database has been gradually increasing in size since it was first deployed on FreeBSD 9.1 this fall. There were no problems until last night, when the database became unresponsive. Attempts to interact with the shell would block (specifically, anything that touched the disk), and no error messages were logged to the console. I restarted the system at that time and brought the database back up. Everything seemed normal until this morning, when the database became unresponsive again. Fortunately, I was able to grab some system statistics before the shell and console went AWOL.

The only thing I found suspicious was the kmem_map numbers:

	vm.kmem_map_free: 655360
	vm.kmem_map_size: 17141383168

It's something like 0.004% free. I haven't been able to find much documentation on what to expect here, but I don't see anything like that on other database hosts that I've monitored. Is it possible for kmem_map to become exhausted without generating a kernel panic? Could it be indicative of severe memory fragmentation?
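For reference, the percentage above can be reproduced with a short Python sketch. It parses the two sysctl lines quoted earlier and assumes that kmem_map_size is the space currently in use, so the total is taken as size plus free. That interpretation is an assumption on my part; dividing by kmem_map_size alone gives practically the same figure. The helper names (parse_sysctl, kmem_free_percent) are only illustrative.

```python
# Rough check of the kmem_map free percentage from the sysctl values above.

SAMPLE = """\
vm.kmem_map_free: 655360
vm.kmem_map_size: 17141383168
"""


def parse_sysctl(text):
    """Turn 'name: value' lines (as printed by sysctl) into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = int(value.strip())
    return values


def kmem_free_percent(values):
    """Free space as a percentage of (free + size).

    Assumption: kmem_map_size counts space in use, so the total is the sum.
    """
    free = values["vm.kmem_map_free"]
    used = values["vm.kmem_map_size"]
    return 100.0 * free / (free + used)


stats = parse_sysctl(SAMPLE)
percent_free = kmem_free_percent(stats)
print(f"kmem_map free: {percent_free:.4f}%")
```

Feeding it the output of `sysctl vm.kmem_map_free vm.kmem_map_size` captured at intervals would show whether the free space shrinks steadily before a hang.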

	- .Dustin



More information about the freebsd-stable mailing list