Re: 504 gateway time-outs

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 31 Mar 2026 19:37:30 UTC

On 3/31/26 06:25, Philip Paeps wrote:
> On 2026-02-28 19:32:35 (+0800), Graham Perrin wrote:
>> On 28/02/2026 10:32, Mark Millard wrote:
>>> The following got "504 Gateway Time-out" when I tried them:
>>>
>>> <https://pkg-status.freebsd.org/beefy24/build.html?mastername=main-amd64-default&build=pdf4f957ea181_s178d0b5b8d>
>>>
>>> <https://pkg-status.freebsd.org/beefy23/build.html?mastername=150amd64-default&build=df4f957ea181>
>>
>> Both somewhat slow to load, however they do load for me.
> 
> I only just noticed this thread, sorry for resurrecting it.
> 
> I've noticed that beefy23 and beefy24 (2x EPYC 9254, 512G RAM) sometimes
> get too busy building to schedule nginx (or sshd).  They eventually
> manage to plough through.  Usually.
> 
> That causes the 504 timeouts if you're going through pkg-status.f.o.  If
> you're going directly to beefyX.chi.freebsd.org you'll just get a timeout.
> 
> They're running exactly the same poudriere.conf as the other builders. 
> I wonder if our calc_builders() function that tries to assign about 12G
> per builder isn't quite right for this particular configuration of cores
> and RAM.
> 
> I haven't had a chance to look closely.  As far as I can tell the builds
> do eventually succeed.  If the only problem is "I can't obsessively poll
> pkg-status in real time", it's not a very high priority. :)

For me the issue is mostly not being able to tell the difference between
"the overall build failed somehow, such as by the builder system
crashing" and "you just cannot observe anything right now but the
system is still building".
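
As a rough illustration of how little an outside observer can infer,
here is a minimal Python sketch. The function names, the timeout value
and the wording of the verdicts are my own assumptions, not anything
that pkg-status or poudriere provides:

```python
import socket
import urllib.error
import urllib.request


def classify(status=None, no_answer=False):
    """Map one probe result to a rough guess (hypothetical wording)."""
    if no_answer:
        # Timeout or connection failure: busy and crashed look the same.
        return "no answer: host too busy or down, cannot tell which"
    if status == 504:
        # The front-end proxy answered, the builder behind it did not.
        return "proxy is up, builder did not answer in time"
    if status is not None and 200 <= status < 300:
        return "builder is answering"
    return "unexpected response: look closer"


def probe(url, timeout=30.0):
    """Fetch url once and classify the outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify(status=resp.status)
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before its parent class URLError.
        return classify(status=err.code)
    except (urllib.error.URLError, socket.timeout, TimeoutError):
        return classify(no_answer=True)
```

Neither the 504 verdict nor the "no answer" verdict says whether the
build itself is still progressing, which is the distinction I am
missing.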

I was explicitly asked to not send in notes about potential failure
symptoms so I no longer add to the clusteradm workload in such cases.

[I do wonder if those builders are, over significant periods,
page-thrashing or doing anything else that might suggest mis-tuning to
the point that the overall builds take notably longer. I do expect load
averages generally larger than the FreeBSD count of CPUs in order to
keep overall elapsed times smaller: otherwise there is likely idle time
not put to useful work. But that wording ignores issues like the
page-thrashing that can result from too much RAM+SWAP intensive activity
running in parallel if mutual exclusion of the huge builders is not
enforced.]
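
To make the arithmetic concrete, here is a small Python sketch of a
RAM-capped builder count. I have not looked at the real calc_builders(),
so the function name pick_builders(), the "minimum of CPU count and
RAM / 12G" rule and the CPU count of 96 used below are all assumptions
for illustration only:

```python
GIB = 1024 ** 3


def pick_builders(ncpu, ram_bytes, per_builder_bytes=12 * GIB):
    """Hypothetical rule: one builder per CPU, capped by RAM per builder."""
    by_ram = ram_bytes // per_builder_bytes
    # Always allow at least one builder, even on a tiny machine.
    return max(1, min(ncpu, by_ram))


# 512 GiB of RAM at about 12 GiB per builder; ncpu=96 is an assumed value.
print(pick_builders(ncpu=96, ram_bytes=512 * GIB))  # prints 42
```

A cap like this only bounds the average: it does nothing to stop
several builders that each need far more than 12G from running at the
same time, which is the case I am wondering about.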

> 
> This is on my list.  It's just a VERY long list. :)

Yep.

> 
> Philip
> 
> 

Thanks for the notes.

-- 
===
Mark Millard
marklmi at yahoo.com