8.1 broken inter-jail IP communication

markham breitbach markham_breitbach at ssimicro.com
Wed Jun 15 17:47:58 UTC 2011

Good Day,

I am encountering an occasional problem under FreeBSD 8.1 where two jails on the same
server cannot reach each other after a reboot.

The jails run a mail server and an LDAP server, respectively, and each has its own IP
address.

The problem manifests itself after a reboot of the server.  After both jails have started,
the mail server is unable to communicate with the LDAP server.  From inside the mail jail,
a "host unreachable" error is returned when trying to connect to the LDAP server.

I have tried clearing the ARP cache and route cache from the host and restarting both
jails, but the problem persists.  The ARP table on the host server (outside the jails)
shows an "(incomplete)" entry for the mail server when this is happening.

When I pinged the mail server's IP address from the host server, the incomplete entry
disappeared (as expected, leaving no ARP entry for the mail server), and communication
between the two jails was restored.
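
For what it is worth, the manual workaround could be scripted roughly along these lines.
This is only a sketch, not a fix: it assumes that `arp -an` on the host prints unresolved
entries in the form `? (10.0.0.5) at (incomplete) on em0 ...`, and the function names and
the example address are made up for illustration.

```python
import re
import subprocess

# Assumed shape of an unresolved entry in `arp -an` output on the host:
#   ? (10.0.0.5) at (incomplete) on em0 expired [ethernet]
INCOMPLETE_RE = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)\s+at\s+\(incomplete\)")


def find_incomplete(arp_output):
    """Return the IPv4 addresses that have an (incomplete) ARP entry."""
    return INCOMPLETE_RE.findall(arp_output)


def ping_incomplete():
    """Ping each address with an incomplete entry once, from the host."""
    arp_output = subprocess.run(
        ["arp", "-an"], capture_output=True, text=True, check=False
    ).stdout
    for ip in find_incomplete(arp_output):
        # One echo request is what cleared the entry when done by hand.
        subprocess.run(["ping", "-c", "1", ip], check=False)
```

Calling `ping_incomplete()` on the host once both jails are up would repeat what I did by
hand; it does nothing to explain why the entry is incomplete in the first place.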

Unfortunately, I have had difficulty recreating this scenario in a test environment, and
it only pops up occasionally in the field.  While the workaround does the job, it is a
bit of a PITA, and I would like to know if this problem can be resolved.

So, I am wondering if anyone has some insights into what might be at the root of this
problem and what data would be useful to collect while it is happening to help pin down
the source of it.  Unfortunately, when service fails, I don't have a lot of time to poke
around at things, as I need to do whatever I can to get it back up as quickly as
possible, although I am continuing to try to recreate this scenario in a test environment.
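
Since time is short when this happens, one option is to grab a snapshot of the host's
state in one go before applying the workaround.  The sketch below is an assumption about
what might be worth capturing (`arp -an`, `netstat -rn`, `ifconfig -a`, `jls`), not a
list anyone has asked for; the function names are hypothetical.

```python
import subprocess
import time

# Commands to capture from the host; this selection is a guess.
COMMANDS = [
    ["arp", "-an"],
    ["netstat", "-rn"],
    ["ifconfig", "-a"],
    ["jls"],
]


def capture(cmd):
    """Run one command and return its output, or a note if it could not run."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        return result.stdout + result.stderr
    except OSError as exc:
        return "could not run: %s\n" % exc


def snapshot(commands=COMMANDS, run=capture):
    """Build one text report with a header line per command."""
    parts = ["snapshot taken " + time.strftime("%Y-%m-%d %H:%M:%S")]
    for cmd in commands:
        parts.append("### " + " ".join(cmd))
        parts.append(run(cmd))
    return "\n".join(parts)
```

Writing the result of `snapshot()` to a file before pinging the mail IP would preserve
the broken state for later comparison with a healthy one.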

Best Regards,

Markham Breitbach

