Address error trap in ng_netflow on MIPS
Nathan Ward
freebsd at daork.net
Mon Apr 20 06:24:30 UTC 2015
Hi all,
I am using FreeBSD on MIPS (on Buffalo WZR-HP-AG300H hardware, which is basically just Atheros AP96 with some more flash, RAM, and a crippled u-boot). I am running FreeBSD 9.1.
I am using netgraph’s netflow module, configured to look at both ingress and egress packets ('msg netflow: setconfig { iface=0 conf=3 }'). The problem described below does not happen if I leave the default of receive-only, but that does not do what I need in my environment.
This works fine, for the most part, except when the router tries to transmit a DHCP response, which causes the following:
+ Trap cause = 4 (address error (load or I-fetch) - kernel mode)
[ thread pid 226 tid 100066 ]
Stopped at export9_add+0x1230: lw a0,0(s7)
I hooked up kgdb, and the problem is at this line of code, in sys/netgraph/netflow/ng_netflow.c:
761 if ((ip->ip_v != IPVERSION) ||
Hoping for a quick hack, I replaced all the ip->ip_v references with (((char *)ip)[0] >> 4). Nasty, sure, but it worked OK for my purposes :-)
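For reference, a minimal sketch of that kind of byte-wise access, written as standalone C rather than as a patch to ng_netflow. The helper names are hypothetical. It uses uint8_t instead of char so that the shift never operates on a negative value on platforms where char is signed.

```c
#include <stdint.h>

/*
 * Hypothetical helpers (not part of ng_netflow): read the two nibbles of
 * the first IPv4 header octet with a single-byte load, which has no
 * alignment requirement.
 */
static unsigned int
ip_version_bytewise(const void *hdr)
{
	/* High nibble of octet 0 is the IP version. */
	return (((const uint8_t *)hdr)[0] >> 4);
}

static unsigned int
ip_hlen_bytewise(const void *hdr)
{
	/* Low nibble of octet 0 is the header length in 32-bit words. */
	return (((const uint8_t *)hdr)[0] & 0x0f);
}
```

With the first octet from the dump in this message (0x45), these return 4 and 5 respectively.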
Now it’s dying when it accesses "ip->ip_src" in netflow.c, with the same sort of error (trap 4).
I am unclear exactly what this error means, but it seems to be a MIPS error. I don’t know much about CPU architectures, so, here I am.
It is curious to me that ip->ip_hl, which is the other 4 bits of the first octet in the ip header, works fine.
Other packets (received, or transmitted - because of forwarding or responding to ping/DNS requests against the local DNS cache) also work fine.
Here’s a dump of the ip struct, which all seems just fine:
(kgdb) x/10 ip
0x8420380e: 0x45000148 0x00004000 0x40112584 0x0a000001
0x8420381e: 0x0a000021 0x00430044 0x01345a50 0x02010600
0x8420382e: 0xea83f35e 0x00000000
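One thing the dump does show is that the header starts at 0x8420380e, which is a multiple of 2 but not of 4. MIPS raises an address error when "lw" is given an address that is not word-aligned, which would be consistent with the trap above. The sketch below is an illustration only, with hypothetical helper names: it checks an address for word alignment and reads the 32-bit source address (offset 12 in the IPv4 header) through memcpy, so the read does not depend on the header being word-aligned.

```c
#include <stdint.h>
#include <string.h>

/* Nonzero if addr is a multiple of 4, as a MIPS "lw" requires. */
static int
is_word_aligned(uintptr_t addr)
{
	return ((addr & 0x3u) == 0);
}

/*
 * Hypothetical helper (not from the FreeBSD tree): copy the source address
 * out of an IPv4 header that may sit at any byte offset. The result holds
 * the four bytes exactly as stored, i.e. in network byte order.
 */
static uint32_t
ip_src_unaligned(const void *hdr)
{
	uint32_t src;

	/* Offset 12 is the source address field of the IPv4 header. */
	memcpy(&src, (const uint8_t *)hdr + 12, sizeof(src));
	return (src);
}
```

Applied to the address in the dump, is_word_aligned(0x8420380e) is 0, whereas 0x84203800 (14 bytes earlier, the size of an Ethernet header) would pass the check.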
I have tried this on amd64 (VMware Fusion VM on a modern MacBook) and don’t have any problems there: the DHCP reply packets are transmitted just fine. That is why I am posting this on this list, rather than whatever list looks after netgraph.
Does anyone have pointers to anywhere I might be able to start trying to fix this?
Upgrading to FreeBSD 10.1 is an obvious first step, but I want to be sure that it will fix this first. I’ve got a handful of changes to various userland components, and although the upgrade is on my roadmap, I would rather not bring it forward unless it’s necessary.
I found https://github.com/freebsd/freebsd/commit/6cc0e8d2a0b583db5707f811d4ebfbe1ad05e628, which changes netinet/ip.h to use __aligned(2) rather than 4, which fixes what seems to be a similar issue on ARM, but it doesn’t seem to help me unfortunately.
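To illustrate what that kind of change affects, here is a small sketch using the GCC/Clang attribute syntax directly. The struct names and fields are made up for the example and are not the real struct ip. The aligned attribute sets the alignment the compiler may assume for any pointer to the type, which in turn influences whether it can use a word load to reach a field.

```c
#include <stdint.h>

/*
 * Illustrative types only (assumed layout, not netinet/ip.h): the same
 * packed 4-byte header declared with two different alignment promises.
 */
struct hdr_aligned4 {
	uint8_t		vhl;	/* version and header length nibbles */
	uint8_t		tos;
	uint16_t	len;
} __attribute__((packed, aligned(4)));

struct hdr_aligned2 {
	uint8_t		vhl;
	uint8_t		tos;
	uint16_t	len;
} __attribute__((packed, aligned(2)));

/* Alignment the compiler will assume for pointers to each type. */
enum {
	HDR_A4_ALIGN = __alignof__(struct hdr_aligned4),	/* 4 */
	HDR_A2_ALIGN = __alignof__(struct hdr_aligned2)		/* 2 */
};
```

With the 4-byte variant the compiler is entitled to assume a pointer to the struct is word-aligned; with the 2-byte variant it is not.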
--
Nathan Ward