cron and vfork

Malcolm Kay malcolm.kay at internode.on.net
Mon Aug 23 08:37:08 PDT 2004


I'm having a problem with cron and vfork.
Here are a couple of samples from the cron log:-

Aug 22 16:13:00 central /usr/sbin/cron[89748]: (root) CMD (   /usr/local/sbin/nwmail2linux) 
Aug 22 16:15:00 central /usr/sbin/cron[89749]: (CRON) error (can't vfork) 
Aug 22 16:15:00 central /usr/sbin/cron[89750]: (CRON) error (can't vfork) 
Aug 22 16:17:00 central /usr/sbin/cron[89752]: (root) CMD (   /usr/local/sbin/nwmail2linux) 
Aug 22 16:19:00 central /usr/sbin/cron[89754]: (root) CMD (   /usr/local/sbin/nwmail2linux) 
Aug 22 16:20:00 central /usr/sbin/cron[89756]: (root) CMD (/usr/libexec/atrun) 

Aug 22 18:59:00 central /usr/sbin/cron[89952]: (root) CMD (   /usr/local/sbin/nwmail2linux) 
Aug 22 19:00:01 central /usr/sbin/cron[89953]: (CRON) error (can't vfork) 
Aug 22 19:00:01 central /usr/sbin/cron[89954]: (CRON) error (can't vfork) 
Aug 22 19:01:00 central /usr/sbin/cron[89956]: (root) CMD (   /usr/local/sbin/nwmail2linux) 

The problem started with the first error entry shown here. 
Note that this is a work machine and the problem appeared
at 4:15pm on a Sunday, when all should be quiet. That seems
to make any of the vfork failure mechanisms mentioned in
the man pages vfork(2) and fork(2) unlikely.

The presence of a problem became evident when I was 
unable to log into the machine Monday morning, either from
the console or through ssh on the LAN. At the console it puts up
a login prompt and accepts the name entry but that is all --
no password prompt and no further activity. I've had to reboot
the machine with a brutal physical reset.

The same thing happened at the quiet part of the previous 
weekend.

The cron jobs are all in /etc/crontab and the problem when it 
occurs is always at a time when 2 jobs clash:
  /usr/local/sbin/nwmail2linux and /usr/libexec/atrun
or 
  newsyslog and /usr/libexec/atrun

/usr/local/sbin/nwmail2linux is a process that picks up e-mail
for three people from a Novell system and delivers it to
three Linux machines via ssh and mail.local. This is programmed to 
activate at each odd minute in the hour.
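
Since the job fires every two minutes, a run that takes longer
than that would overlap with the next one. A minimal sketch of a
guard that skips a run while the previous one is still going is
given here in Python. It is not part of the existing setup: the
lock path and the function name run_exclusively are hypothetical,
and it assumes a Python 3 interpreter is available on the machine.

```python
import fcntl
import os
import subprocess
import sys

# Hypothetical lock file path; not taken from the original setup.
LOCK_PATH = "/var/run/nwmail2linux.lock"


def run_exclusively(command, lock_path=LOCK_PATH):
    """Run command only if nobody else holds the lock.

    Returns the command's exit status, or None if an earlier run
    still holds the lock and this run was skipped.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        try:
            # Non-blocking exclusive lock: give up at once rather than queue.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return None
        return subprocess.call(command)
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)


if __name__ == "__main__":
    status = run_exclusively(sys.argv[1:])
    sys.exit(0 if status is None else status)
```

The crontab entry would then call this wrapper with
/usr/local/sbin/nwmail2linux as its argument, so a stalled run
could delay mail but could not pile up further processes.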

The machine is running FreeBSD 4.9 with vinum disk mirroring.
Its main reason for being is to manage backup of a number of 
FreeBSD and Linux machines all of which happens in the late 
evening and early morning hours and in any case not on Sunday
night/Monday morning.
It is also used as a gateway between two physically isolated 
LANs and between IPX and TCP/IP and as mentioned above
some mail management.

The problem seems to have begun with a change to the e-mail
management. Previously the machine was used to retrieve 
mail from Novell for only two people - one delivered locally
to a conventional unix mailbox (and accessed via ssh and kmail)
and the other passed on via rsh to a HP-UX machine.

I've not found anomalies in any other log files.

The only explanation I can think of sounds rather far-fetched,
that the Linux machines somehow take a long time to wake from 
slumber in the quiet of Sunday afternoon and the e-mail jobs 
begin overlapping until too many processes exist -- but
surely it could not be that many.
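
One way to test that guess would be to record how many processes
of each kind exist while the machine is still reachable. A rough
sketch in Python follows; the function name count_commands is
hypothetical, and it assumes ps accepts "-axo command" and prints
one header line followed by one command line per process.

```python
import subprocess
from collections import Counter


def count_commands(ps_output):
    """Count processes per command name from `ps -axo command` text."""
    counts = Counter()
    lines = ps_output.splitlines()
    for line in lines[1:]:  # skip the header line
        fields = line.split()
        if fields:
            # Only the first word (the program name) is used as the key.
            counts[fields[0]] += 1
    return counts


if __name__ == "__main__":
    text = subprocess.run(
        ["ps", "-axo", "command"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Print the ten most numerous commands, most frequent first.
    for name, n in count_commands(text).most_common(10):
        print(n, name)
```

Run from cron every few minutes with the output appended to a
file, this would show whether ssh or nwmail2linux processes are
accumulating in the hours before the vfork errors begin.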

Even when not accessible through login the machine seems 
for the most part to be carrying out its intended roles.

Does anyone have any ideas?

Help would be appreciated.

Malcolm


