From: egoitz@ramattack.net
To: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org
Subject: Re: Desperate with 870 QVO and ZFS
Date: Wed, 06 Apr 2022 13:28:45 +0200

The strangest thing is that when a machine boots, the ARC sits at around 40GB used (for instance), but later it drops to 20GB (and that is not an approximation, it is exact) on all my servers. It is as if the ARC metadata, which is roughly 17GB, were limiting the whole ARC.

With the traffic these machines handle, I would expect the ARC to be larger than it is, and the ARC is limited in loader.conf to 64GB (half of the RAM these machines have).
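For reference, the ARC size, its metadata share, and the configured limits can be checked with sysctl; a minimal sketch follows. The exact tunable names are an assumption on my part, since the legacy ZFS in FreeBSD 12 and OpenZFS 2.x in FreeBSD 13 expose slightly different names (use whichever exists on your version), and the loader.conf values shown are illustrative only, not a recommendation.

  # Current ARC size and how much of it is metadata (names may differ by ZFS version)
  sysctl kstat.zfs.misc.arcstats.size
  sysctl kstat.zfs.misc.arcstats.arc_meta_used
  sysctl kstat.zfs.misc.arcstats.arc_meta_limit

  # Configured ARC ceiling: legacy name first, OpenZFS 2.x name second
  sysctl vfs.zfs.arc_max
  sysctl vfs.zfs.arc.max

  # Illustrative /boot/loader.conf entries (values in bytes; pick the tunable
  # names that match your ZFS version -- these figures are examples only):
  # vfs.zfs.arc_max="68719476736"          # 64 GiB cap, as mentioned above
  # vfs.zfs.arc_meta_limit="34359738368"   # 32 GiB for metadata, if that is the bottleneck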
On 2022-04-06 13:15, egoitz@ramattack.net wrote:

> Good morning,
>
> I am writing this post in the hope that someone can help me.
>
> I am running some mail servers on FreeBSD and ZFS. They use Samsung 870 QVO disks (not EVO or other Samsung SSDs) as storage. They can easily have 1500 to 2000 concurrent connections. The machines have 128GB of RAM and the CPU is almost completely idle. Disk IO is normally at 30 or 40% at most.
>
> The problem I am facing is that they can be running just fine and then, suddenly, at some peak hour, IO jumps to 60 or 70% and the machine becomes extremely slow. ZFS is all defaults, except that the sync property is set to disabled and the ARC is limited to 64GB. Even so, something is odd: the ARC in use is near 20GB, and I have seen that the metadata cache in the ARC is very close to the limit that FreeBSD sets automatically based on the ARC size you configure. It seems that almost all of the ARC is used by the metadata cache. I see this effect on all my mail servers with this hardware and software configuration.
>
> I attach a zfs-stats output, but note that right now the servers are not as loaded as described above. Let me explain: I run a couple of Cyrus instances on each of these servers, one master and one replica per server. The situation described above happens when both Cyrus instances on a machine become masters, that is, when two instances on the same machine are both serving users. To avoid problems, we have now rebalanced so that each server runs one master and one replica. As you know, a replica instance does almost no IO and holds only a single connection, for replication. So the zfs-stats output reflects the current state, with roughly half the load on each server, because each one runs one master and one replica instance.
>
> As said before, when I place two masters on the same server, it may work fine all day and then, at 11:00 (for example), IO goes to 60% (it does not rise beyond that), but it looks as if the IO simply cannot be served past a certain point, some concrete IO limit (I would say around 60%).
>
> I do not really know whether the QVO technology could be the culprit here, because these are marketed as desktop disks. On the other hand, I get good performance when copying mailboxes, for instance five at a time: I can saturate a gigabit interface when copying mailboxes between servers five at a time, so they do seem to perform.
>
> Could anyone please shed some light on this issue? I do not really know what to think.
>
> Best regards,
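To narrow this down, per-device latency during the bad period usually shows whether the drives themselves are the bottleneck, which is the common suspicion with QLC drives like the 870 QVO once their SLC write cache is exhausted. Below is a minimal sketch using FreeBSD base-system tools; the pool name "zroot" is only a placeholder for whatever the pool is actually called, and exact flags may vary by FreeBSD version.

  # Per-provider queue depth, busy% and ms/write; watch L(q) and ms/w climb
  # while %busy hovers around 60 during the slow period
  gstat -p -I 1s

  # Per-vdev bandwidth and IOPS for the pool, sampled every 5 seconds
  zpool iostat -v zroot 5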
