Concurrency limit warning in Postfix leads to server lock

Bill Moran wmoran at potentialtech.com
Sat Sep 15 05:51:25 PDT 2007


Robert Fitzpatrick <lists at webtent.net> wrote:
>
> I have a dilemma with one of our 5.4 server mail gateways. About 2-3 times
> a month now the server SMTP and related services stop responding. I find
> myself not able to login, just sits there after entering user name. I
> have to reset the server and the only thing I can find with an 'egrep
> (fatal|error|warn)' in the messages and maillog are these concurrency
> limit warnings minutes before the issue started...
> 
> Sep 15 07:19:02 esmtp postfix/smtpd[2789]: warning: Connection
> concurrency limit exceeded: 51 from unknown[88.238.96.247] for service
> smtp
> 
> This seems to be an attacker of some sort, I block them and the issue
> goes away, of course. I posted my issue to the Postfix list, but was
> told this should not be taking down my server and to find out why I'm
> not able to login when this happens.

It shouldn't.

> I am looking for help on where to
> look to determine this, can someone give some guidance? Some other log I
> should examine? The only thing I can spot that looks possibly out of
> place is nfsd running at 6-8% CPU. I do a backup from one other server
> to this server via nfs. I checked and all that backup was finished
> couple of hours prior to this latest issue, but the nfsd process seems
> to be taking more CPU than normal. And when I reboot, the nfs connection
> I have in /etc/fstab takes several seconds to initialize.

Definitely sounds like some networking issues.  I can't give you a
direct "answer" because the question is too vague (although I think
you described it to the best of your ability).  Instead, I'll outline
how I would go about tracking it down and solving it.

* Start with the nfs thing.  It seems to indicate a network problem,
  which will skew everything else you investigate unless you fix it
  first.  Try some large FTP transfers between those two servers (FTP
  has very little overhead, and is thus a good gauge of network
  performance). If the FTP transfer isn't getting within 20% of the
  theoretical capability of the network, then you probably have a
  network problem.  Carefully investigate speed/duplex settings, and
  whether your switching hardware is crappy or simply overloaded.
  In short, find the network problem and fix it.
* Next time it happens, make absolutely sure it's refusing login.  Under
  a DDoS or similar attack, it can take several seconds for ssh to
  complete the protocol negotiation.  If DNS is running slow, longer.
  Are you waiting until the ssh client actually times out before
  giving up?  Even then, it might connect on the second or third
  try.  Try setting ConnectTimeout to 300 in /etc/ssh/ssh_config and
  see if it connects.  I've seen network problems cause ssh to take
  45 seconds or longer to connect, and that's to be expected under
  certain network circumstances.
* Get MRTG or some other trend-gathering system running on that machine
  so you have other stats to look at when the problem happens.  This may
  point you to the source of the problem very quickly.  In general, it's
  a good idea to have one on production systems so you can see what's
  happening.  With MRTG (and similar software) you can, and should,
  graph a lot more than network usage.  Graph disk reads/writes, CPU
  usage, swap usage, and memory usage.  A system that's heavily into
  swap will respond dog-slow, and that could be your problem.
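
To make the 20% rule of thumb from the first item concrete, here is a
minimal Python sketch of the arithmetic.  The function name, the
example figures, and the assumption that the link speed is quoted in
megabits per second are all made up for illustration; plug in whatever
your FTP client reports for bytes and elapsed time.

```python
def within_expected_throughput(bytes_transferred, seconds,
                               link_mbit_per_s, tolerance=0.20):
    """Return (achieved_mbit_per_s, ok) for one bulk transfer.

    ok is True when the achieved rate is within `tolerance` (20% by
    default) of the nominal link speed.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # bytes -> bits, then bits per second -> megabits per second
    achieved = bytes_transferred * 8 / seconds / 1_000_000
    ok = achieved >= link_mbit_per_s * (1.0 - tolerance)
    return achieved, ok


if __name__ == "__main__":
    # Hypothetical figures: a 700 MB file over a nominal 100 Mbit/s link.
    for elapsed in (60, 200):
        rate, ok = within_expected_throughput(700 * 1_000_000, elapsed, 100)
        verdict = "looks fine" if ok else "suspect the network"
        print(f"{elapsed:>4} s: {rate:.1f} Mbit/s, {verdict}")
```

A transfer that falls well short of the threshold is only a hint to go
look at duplex settings and the switch, not proof of where the fault is.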

Hope these help you narrow down the problem.
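
For the ssh question, one rough way to tell "slow" from "refusing" is
to time how long the server takes to accept a TCP connection and send
its version string.  The sketch below is an assumption-laden
illustration, not a replacement for the ssh client: the function name
is made up, the 300 second default simply mirrors the ConnectTimeout
suggestion above, and it only measures up to the server's first line,
not the full key exchange or login.

```python
import socket
import time


def time_ssh_banner(host, port=22, timeout=300.0):
    """Time the TCP connect and the wait for the server's first line.

    Returns (connect_seconds, banner_seconds, banner_text).  Raises
    OSError (which includes timeouts) if the server does not answer
    within `timeout` seconds.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        connected = time.monotonic()
        sock.settimeout(timeout)
        # An SSH server normally sends its version string first.
        banner = sock.recv(256)
        done = time.monotonic()
    text = banner.decode("ascii", "replace").strip()
    return connected - start, done - connected, text
```

Calling time_ssh_banner("mail.example.com") (a placeholder host name)
from another machine while the problem is happening would show whether
the box is answering at all, and how slowly.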

-- 
Bill Moran
http://www.potentialtech.com


More information about the freebsd-questions mailing list