ZFS "stalls" -- and maybe we should be talking about defaults?

Steven Hartland killing at multiplay.co.uk
Thu Mar 7 18:57:45 UTC 2013


----- Original Message ----- 
From: "Karl Denninger" <karl at denninger.net>
> Where I am right now is this:
> 
> 1. I *CANNOT* reproduce the spins on the test machine with Postgres
> stopped in any way.  Even with multiple ZFS send/recv copies going on
> and the load average north of 20 (due to all the geli threads), the
> system doesn't stall or produce any notable pauses in throughput.  Nor
> does the system RAM allocation get driven hard enough to force paging. 
> 
> This is with NO tuning hacks in /boot/loader.conf.  I/O performance is
> both stable and solid.
> 
> 2. WITH Postgres running as a connected hot spare (identical to the
> production machine), allocating ~1.5G of shared, wired memory,  running
> the same synthetic workload in (1) above I am getting SMALL versions of
> the misbehavior.  However, while system RAM allocation gets driven
> pretty hard and reaches down toward 100MB in some instances it doesn't
> get driven hard enough to allocate swap.  The "burstiness" is very
> evident in the iostat figures with spates getting into the single digit
> MB/sec range from time to time but it's not enough to drive the system
> to a full-on stall.
> 
> There's pretty-clearly a bad interaction here between Postgres wiring
> memory and the ARC, when the latter is left alone and allowed to do what
> it wants.   I'm continuing to work on replicating this on the test
> machine... just not completely there yet.

Another possibility to consider is how Postgres uses the FS. For example,
does it request sync IO in ways not present in the system without it,
causing the FS and possibly the underlying disk system to behave
differently?
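
One rough way to check this without Postgres in the picture would be to
add a sync-heavy writer to the synthetic workload. The following is only a
sketch under assumptions: the function name timed_writes, the 8 KiB block
size (chosen to match the default Postgres page size) and the file name are
all made up for illustration, and it does not reproduce Postgres's real IO
pattern. It simply times a run of writes with and without an fsync() after
each block.

```python
import os
import tempfile
import time


def timed_writes(path, n_blocks=200, block_size=8192, sync=False):
    """Write n_blocks blocks to path, optionally fsync()ing after each one.

    Returns (elapsed_seconds, bytes_written).
    """
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            written += f.write(block)
            if sync:
                # Push Python's buffer to the OS, then ask the OS to
                # commit the data to stable storage before continuing.
                f.flush()
                os.fsync(f.fileno())
    return time.perf_counter() - start, written


if __name__ == "__main__":
    # Hypothetical location: for a real test, use a directory on the
    # ZFS dataset under load rather than the system temp directory.
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "sync_probe.dat")
        for sync in (False, True):
            elapsed, nbytes = timed_writes(target, sync=sync)
            print(f"sync={sync}: {nbytes} bytes in {elapsed:.4f}s")
```

If the bursty iostat figures show up with the sync variant running next to
the send/recv load, but not with the buffered variant, that would point at
sync IO rather than wired memory alone.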

One other option to test, just to rule it out: what happens if you
use the BSD scheduler instead of ULE?

    Regards
    Steve





More information about the freebsd-stable mailing list