[EXTERNAL] Re: FreeBSD10 Stable + ZFS + PostgreSQL + SSD performance drop < 24 hours
Caza, Aaron
Aaron.Caza at ca.weatherford.com
Fri Jun 16 05:06:31 UTC 2017
> -----Original Message-----
> From: Rodney W. Grimes [mailto:freebsd-rwg at pdx.rh.CN85.dnsmgr.net]
> Sent: Wednesday, June 14, 2017 4:04 PM
> To: Caza, Aaron
> Cc: Allan Jude; freebsd-hackers at freebsd.org
> Subject: Re: [EXTERNAL] Re: FreeBSD10 Stable + ZFS + PostgreSQL + SSD performance drop < 24 hours
>
> > > -----Original Message-----
> > > From: owner-freebsd-hackers at freebsd.org
> > > [mailto:owner-freebsd-hackers at freebsd.org] On Behalf Of Allan Jude
> > > Sent: Wednesday, June 14, 2017 11:20 AM
> > > To: freebsd-hackers at freebsd.org
> > > Subject: [EXTERNAL] Re: FreeBSD10 Stable + ZFS + PostgreSQL + SSD
> > > performance drop < 24 hours
> > >
> > > On 2017-06-14 13:11, Caza, Aaron wrote:
> > > > Further to this:
> > > >
> > > > I've now tested FreeBSD 10.3-RELEASE-p19 and FreeBSD 11.0-STABLE #0 r307264M, both of which suffer the same degraded performance.
> > > >
> > > > test$ uname -a
> > > > FreeBSD xyz.com 10.3-RELEASE-p19 FreeBSD 10.3-RELEASE-p19 #0 r319904M: Tue Jun 13 12:38:29 MDT 2017 aaronc at WFT:XYZ amd64
> > > > test$ uptime
> > > > > 10:15AM up 21:09, 2 users, load averages: 1.00, 1.14, 1.30
> > > > > test$ dd if=/testdb/test of=/dev/null bs=1m
> > > > 16000+0 records in
> > > > 16000+0 records out
> > > > > 16777216000 bytes transferred in 200.379127 secs (83727363 bytes/sec)
> > > >
> > > > After reboot:
> > > > test$ dd if=/testdb/test of=/dev/null bs=1m
> > > > 16000+0 records in
> > > > 16000+0 records out
> > > > > 16777216000 bytes transferred in 23.213040 secs (722749623 bytes/sec)
> > > >
> > > > Same Intel Xeon E31240 with 8GB ram and 2x Samsung 850 Pro 256GB SSDs as before.
> > > >
> > >
> > > Can you do the same test, but grab the memory lines from top(1) before and after each of those two runs.
> > >
> > > I am guessing the ARC is being squeezed out by PostgreSQL, because you have so little RAM.
> > >
> > > --
> > > Allan Jude
> >
> > Takes a while for the degradation to kick in now that I rebooted this morning.
> >
> > Regarding the ARC being squeezed: that doesn't explain why gstat shows the drives 95-100% busy after a reboot but only ~15% busy once the degradation hits.
> >
> > In fact, ARC is being squeezed all the time because I've limited it to 50M in /boot/loader.conf:
> > vfs.zfs.arc_min="50M"
> > vfs.zfs.arc_max="50M"
>
> Would you pacify an old fart by at least having a delta of 1 between a min and max please?
> Some code may oscillate when min=max.
>
> > Note that the FreeBSD 9.0 server that I tested on also hamstrings the ARC to 50M but doesn't suffer a performance degradation hence why I hadn't bothered mentioning it before.
> >
> > To remove Postgres entirely, I won't even start it and simply use dd on the 16GB file. The server is essentially doing nothing at all.
> >
> > At this point, I'm looking at going back to FreeBSD 10.3-RELEASE-p7, since yesterday 'trafdev' reported that he doesn't see any performance drop and he's got 95 days uptime. He also mentioned the vfs.zfs.metaslab.lba_weighting_enabled=0 setting, which I also need to try.
> >
> > --
> > Aaron
>
> --
> Rod Grimes rgrimes at freebsd.org

Allan, here's the top(1) output after degradation:

last pid: 9403; load averages: 1.09, 1.48, 1.30 up 0+23:29:14 10:30:06
21 processes: 1 running, 19 sleeping, 1 zombie
CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 1528K Active, 46M Inact, 462M Wired, 4856K Cache, 624K Buf, 7376M Free
ARC: 53M Total, 5K MFU, 16M MRU, 16K Anon, 1137K Header, 36M Other

And, after reboot with performance back to normal:

last pid: 739; load averages: 0.99, 0.95, 0.58 up 0+00:08:02 10:53:21
19 processes: 1 running, 17 sleeping, 1 zombie
CPU: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 18M Active, 22M Inact, 224M Wired, 432K Buf, 7626M Free
ARC: 28M Total, 3385K MFU, 24M MRU, 16K Anon, 232K Header, 1005K Other

The above was done using FreeBSD 10.3-RELEASE-p19 (amd64) on the aforementioned Xeon server with 2x Samsung 850 Pro 256GB SSDs. PostgreSQL was never started or used; I was simply dd'ing the 16GB test file. These results were still using vfs.zfs.arc_min and vfs.zfs.arc_max of "50M", though subsequent tests will use 50M and 51M, respectively, as suggested by Rod.
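
As a rough sanity check on the size of the drop, the dd figures quoted earlier in this thread can be turned into a slowdown factor. This is only an illustrative sketch in Python: the byte count and timings are copied from the two dd runs above, while the constant and function names are made up for the example.

```python
# Figures copied from the two dd runs quoted earlier in this thread.
BYTES_READ = 16_777_216_000   # 16000 records of 1 MiB each (bs=1m)
DEGRADED_SECS = 200.379127    # run after ~21 hours of uptime
FRESH_SECS = 23.213040        # run immediately after reboot


def throughput(nbytes, secs):
    """Average read rate in bytes per second."""
    return nbytes / secs


degraded = throughput(BYTES_READ, DEGRADED_SECS)
fresh = throughput(BYTES_READ, FRESH_SECS)
slowdown = fresh / degraded   # how many times faster the fresh run was

print(f"degraded: {degraded / 1e6:.1f} MB/s")
print(f"fresh:    {fresh / 1e6:.1f} MB/s")
print(f"slowdown: {slowdown:.1f}x")
```

The rates agree with what dd itself reported (about 84 MB/s degraded against about 723 MB/s fresh), so the same 16GB read is taking between eight and nine times longer once the degradation sets in.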
Checking the FreeBSD 9.0 test server, it had vfs.zfs.arc_max="50M" but did not have vfs.zfs.arc_min set in /boot/loader.conf; consequently, arc_min was actually higher than vfs.zfs.arc_max. I've reset them to 50M and 51M and rebooted. The time for "select count(*)" in Postgres on the ~21.5 million row test table is, after 24 hours, still holding at the ~82 seconds it took after reboot.
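
To avoid repeating the min=max mistake across test servers, a small check along the following lines could be run against each /boot/loader.conf. This is a minimal sketch, assuming simple name="value" lines and K/M/G suffixes read as powers of 1024; parse_size, parse_loader_conf and arc_delta are hypothetical helper names for the example, not part of any FreeBSD tool.

```python
import re

# Assumption: size suffixes are binary multiples, as in "50M" = 50 * 1024 * 1024.
_SUFFIX = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a value such as '50M' or '1024' to a byte count."""
    m = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if m is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(m.group(1)) * _SUFFIX[m.group(2)]


def parse_loader_conf(lines):
    """Return {name: value} for name="value" lines, ignoring # comments."""
    settings = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip().strip('"')
    return settings


def arc_delta(settings):
    """Bytes between arc_max and arc_min; raises KeyError if either is unset."""
    lo = parse_size(settings["vfs.zfs.arc_min"])
    hi = parse_size(settings["vfs.zfs.arc_max"])
    return hi - lo


old = parse_loader_conf(['vfs.zfs.arc_min="50M"', 'vfs.zfs.arc_max="50M"'])
new = parse_loader_conf(['vfs.zfs.arc_min="50M"', 'vfs.zfs.arc_max="51M"'])
print(arc_delta(old), arc_delta(new))
```

A delta of zero flags the original 50M/50M configuration, and the KeyError case corresponds to the FreeBSD 9.0 server, where vfs.zfs.arc_min was not set at all.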
Additionally, on separate test servers I've checked the performance of both FreeBSD 10.3-RELEASE-p6 r303605M and the FreeBSD 11.0-STABLE snapshot dated May 10, 2017 and both exhibited the same degradation in performance.
Suggestions?
--
Aaron