misc/145305: ipfw problems, panics, data corruption, ipv6 socket weirdness

Terrence Koeman root at mediamonks.net
Fri Apr 2 20:50:05 UTC 2010


>Number:         145305
>Category:       misc
>Synopsis:       ipfw problems, panics, data corruption, ipv6 socket weirdness
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Apr 02 20:50:05 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Terrence Koeman
>Release:        8.0-STABLE
>Organization:
MediaMonks B.V.
>Environment:
FreeBSD xxx 8.0-STABLE FreeBSD 8.0-STABLE #9: Fri Apr  2 21:20:04 CEST 2010     terrence at xxx:/usr/obj/usr/src/sys/ADINAVA-SMP  amd64
>Description:
Several things break with the current 8-STABLE:

What broke:

The machine is a mail server running CommuniGate Pro (tried 5.1, 5.2 and 5.3), accepting connections on both IPv4 and IPv6. The server uses IPv6 sockets for IPv4 addresses, like this:

CGServer 826 root   45u  IPv6 0xffffff00174c8a50      0t0  TCP [2001:610:xxx:xxx:xxx:xxx:xxx:200]:smtp (LISTEN)
CGServer 826 root   47u  IPv6 0xffffff0001faf370      0t0  TCP [::217.xxx.xxx.xxx]:smtp (LISTEN)

Directly after the upgrade I noticed that *outgoing* connections to other IPv4 mail servers no longer succeeded, and ipfw was logging packets with a strange source address:

Mar 30 06:27:34 adinava kernel: ipfw: 65530 Accept TCP 1.23.2.0:28859 65.55.92.152:25 out via bce0

Obviously 1.23.2.0 is not a local IP, so I checked lsof:

CGServer 824 root   49u  IPv6 0xffffff00174b3a50      0t0  TCP [2001:610:xxx:xxx:xxx:xxx:xxx:200]:28859->[::65.55.92.152]:smtp (SYN_SENT)

Somehow the server was trying to connect to an IPv4 address from an IPv6 source address, and the IPv6 address apparently gets truncated to fit IPv4 storage, ending up as '1.23.2.0'.
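
One reading that is consistent with the log line, offered as a guess and not as a diagnosis: the 32-bit IPv4 source may simply be the low 32 bits of the IPv6 source address. The address in the lsof output ends in ':200', i.e. bytes 0x02 0x00, which matches the trailing '.2.0'. The sketch below does that arithmetic in sh. The second-to-last group is masked as 'xxx' in this report, so the value 117 used here is purely hypothetical, picked only because 0x0117 would yield '1.23'. low32_as_ipv4 is an illustrative helper name, not an existing tool.

```sh
#!/bin/sh
# low32_as_ipv4 G7 G8
# Interpret the last two 16-bit groups of an IPv6 address (given in hex,
# without leading zeros required) as four IPv4 octets.
# Illustration only; this is not taken from the kernel source.
low32_as_ipv4() {
    hi=$((0x$1))    # second-to-last group, e.g. 117 -> 0x0117
    lo=$((0x$2))    # last group, e.g. 200 -> 0x0200
    printf '%d.%d.%d.%d\n' \
        $((hi >> 8)) $((hi & 255)) $((lo >> 8)) $((lo & 255))
}

# 117 is an ASSUMED value for the masked group; 200 is from the report.
low32_as_ipv4 117 200    # prints 1.23.2.0
```

If this reading is right, the bogus source address would change with the low bits of whichever IPv6 address the socket is bound to.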

For comparison, this is what it looks like when it works:

CGServer 105 root   94u  IPv6 0xffffff00a4ccd000      0t0  TCP [::217.195.117.200]:14532->[::65.55.92.152]:smtp  (ESTABLISHED)

At first I assumed this was a problem in the daemon, so I temporarily disabled IPv6 for CGatePro (the IPv6 address is not yet published as an MX anyway) and forced it to use IPv4 sockets. That worked until I tried to reload the ipfw rules and hit a panic:

---
Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff803e3b77
stack pointer           = 0x28:0xffffff8076d73890
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1467 (ipfw)
trap number             = 9
panic: general protection fault
cpuid = 0
Uptime: 4m24s
Cannot dump. Device not defined or unavailable.
panic: bufwrite: buffer is not busy???
cpuid = 0
Uptime: 4m24s
Cannot dump. Device not defined or unavailable.
Automatic reboot in 15 seconds - press a key on the console to abort
Automatic reboot in 15 seconds - press a key on the console to abort
ipfw: ouch!, skip past end of rules, denying packet
---

It should have dumped (device is defined) and rebooted, but it hung there.

When I rebooted it, my rules file (/etc/ipfw.rules.sh) had been truncated to zero length. Assuming this was an ipfw problem, I left the rules out temporarily (I have IPFIREWALL_DEFAULT_TO_ACCEPT).

After about 4 minutes the server hung again, this time with the screen filled with 'ipfw: ouch!, skip past end of rules, denying packet' messages, which were also logged to /var/log/messages. This seemed odd, as only the default 'allow-all' rule was present.

So I decided to recompile the kernel without ipfw and reboot (no module loaded either). Again after about 4 minutes I got the following panic:

---
dev = mfid0s1f, block = 1, fs = /var
panic: ffs_blkfree: freeing free block
cpuid = 3
Uptime: 4m34s
Cannot dump. Device not defined or unavailable.
Automatic reboot in 15 seconds - press a key on the console to abort
---

Again the server didn't reboot but just froze (no num-lock LED action either).

Reverting to 8-STABLE from 1 March 2010 makes all this go away.

dmesg & kernconf @ http://ra.phid.ae/dmesg.txt, see also: http://forums.freebsd.org/showthread.php?p=75765
>How-To-Repeat:
Even without ipfw in the kernel and without the module loaded there are problems, so I don't think this is ipfw-specific. However, I have not found a way to reproduce the problems reliably without ipfw. Running the following script a few times in a row will either cause all packets to be denied ('skip past end of rules') or panic the machine:

---
#!/bin/sh

ipfw disable firewall
ipfw -f flush
ipfw add 00001 allow any from any to any
ipfw enable firewall
ipfw show
---

When I run this script and a panic occurs, the script file is afterwards either truncated to zero length or has garbage appended to it, so there is apparently data corruption as well.
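
The "few times in a row" step could be wrapped in a small loop, sketched here. repeat_cmd is a hypothetical helper and ./repro.sh is an assumed file name for the script above. Because a panic takes the machine down, any check for file corruption has to happen after the reboot, not inside the loop.

```sh
#!/bin/sh
# repeat_cmd N CMD [ARGS...]
# Run CMD up to N times, stopping at the first non-zero exit status.
# Hypothetical helper for illustration only.
repeat_cmd() {
    n=$1
    shift
    i=1
    while [ "$i" -le "$n" ]; do
        echo "run $i of $n" >&2
        "$@" || { echo "run $i failed" >&2; return 1; }
        i=$((i + 1))
    done
    echo "completed $n runs"
}

# Usage (as root, on a test machine only; this may panic the kernel):
#   repeat_cmd 5 sh ./repro.sh
```

Keeping a copy of the script on another machine beforehand makes it possible to compare the file contents after a crash.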
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:

