help w/panic under heavy load - 5.4
Edwin
edwin at verolan.com
Sun Jul 24 02:38:48 GMT 2005
Max/et.al.,
replies to your message in-line below...
If I understand correctly...(albeit an overly brief understanding :))
1. ethernet packet comes in - stuck into an mbuf
2. ether_demux calls ip_fastforward passing the mbuf struct
3. mbuf struct is copied/munged into ip struct by mtod
4. ntohs is called to change ip->ip_len to host byte order
incidentally - the local ip_len should be set to ntohs(ip->ip_len)
as well, but kgdb shows ip_len = 0 - it seems like neither one of
those calls worked?
5. also - the call to set hlen to ip->ip_hl <<2 didn't work out well
either - right? since hlen = -1057417216, and i think it's
supposed to be 20 (5*4) - am I correct there as well?
6. due to ip->ip_len still being in network byte order, a little
gremlin helps us to think we have a 10240 byte packet and that we
need to fragment it...
7. in ip_fragment - ip->ip_len is still 10240 - so we assume that we
need to make several fragments - however, the mbuf is correct
(len = 40)
8. in ip_fragment - to create the 'second' fragment, we try to copy
1480 bytes @ offset 1500 out of the mbuf that only has a valid
data length of 40-bytes???
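
To make steps 5-8 concrete, here is a rough sketch in plain userland C. It is a simplified model of the fragment walk, not the code from ip_output.c. The helper names ip_header_len() and first_bad_copy() are made up for illustration, and the mtu of 1500 is an assumption (kgdb shows mtu = 0, but 1480 bytes @ offset 1500 only fits a 1500-byte mtu with a 20-byte header).

```c
#include <stdint.h>

/* Header length in bytes from the first header byte (version + IHL). */
static int ip_header_len(uint8_t first_byte)
{
    return (first_byte & 0x0f) << 2;    /* IHL counts 32-bit words */
}

/*
 * Hypothetical model of the fragment walk: step through the offsets that
 * would be copied out of the original packet and return the first offset
 * whose copy runs past the bytes actually present (avail), or -1 if every
 * copy fits.  The total fragment count is reported through *nfrags.
 */
static long first_bad_copy(long ip_len, int hlen, int mtu, long avail,
    int *nfrags)
{
    long len = (mtu - hlen) & ~7;   /* payload per fragment, multiple of 8 */
    long off = hlen + len;          /* where the second fragment starts */
    long bad = -1;

    *nfrags = 1;                    /* the original packet is fragment one */
    for (; off < ip_len; off += len, (*nfrags)++) {
        /* the last fragment carries whatever is left */
        long want = (off + len > ip_len) ? ip_len - off : len;

        if (bad < 0 && off + want > avail)
            bad = off;
    }
    return bad;
}
```

With the values from the dump, ip_header_len(0x45) gives 20 (0x45 is the 'E' kgdb prints at mh_data), and first_bad_copy(10240, 20, 1500, 40, &n) returns 1500 with n = 7. With ip_len = 40 it returns -1 with n = 1, i.e. nothing beyond the original packet is ever copied.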
Are we really looking for the cause of ip->ip_len not being in the correct
byte order at the right time, then? In that case there are two possibilities
that I see. I don't think that ntohs not working (1) is too realistic, so
I suppose we are looking for what flipped it back in the first place?
1. either ntohs didn't work for some reason, or
2. it was already in host order, and the ntohs call flipped it back to
network order
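
A small sketch of those two cases, assuming a little-endian host (this is an i386 box). swap16() and after_swaps() are illustrative names, not kernel functions; swap16() just does by hand what ntohs()/htons() do on such a CPU.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Value the host ends up with after the swap has been applied `times`
 * times to the field as first read from the wire.
 */
static uint16_t after_swaps(uint16_t as_read, int times)
{
    while (times-- > 0)
        as_read = swap16(as_read);
    return as_read;
}
```

A total length of 40 sits on the wire as bytes 00 28, which a little-endian host reads as 0x2800 = 10240. after_swaps(10240, 0) is 10240 (case 1, the swap never happened), after_swaps(10240, 1) is 40 (what we want), and after_swaps(10240, 2) is 10240 again (case 2, flipped twice). Both failure cases leave the same 10240 in ip->ip_len, so the dump by itself can't tell them apart.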
If you feel that it's a ipfw/ipfil issue - I can easily take IPFIREWALL* options
out of the kernel and build a new one - just give me about 15 minutes.
cheers. /edwin
Max Laier (max at love2party.net) wrote:
> On Saturday 23 July 2005 20:41, Edwin wrote:
> > Kernel name: D1-0722 (for reference)
> >
> > mbsd05# kgdb kernel.debug /usr/local/STORAGE/crash/vmcore.5
> > #13 0xc06933c1 in ip_fastforward (m=0xc12e6c00) at
> > /usr/src/sys/netinet/ip_fastfwd.c:572 warning: Source file is more recent
> > than executable.
>
> Let's hope that's still correct ...
>
it is - result of manual patch application and removal - just the timestamp/dates
on the file are different (verified by diff from clean source tree just now to
make sure again).
> > 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist,
> > (kgdb) l
> > 567 m->m_pkthdr.csum_flags |= CSUM_IP;
> > 568 /*
> > 569 * ip_fragment expects ip_len and ip_off in host byte
> > 570 * order but returns all packets in network byte order
> > 571 */
> > 572 if (ip_fragment(ip, &m, mtu, ifp->if_hwassist,
> > 573 (~ifp->if_hwassist & CSUM_DELAY_IP))) {
> > 574 goto drop;
> > 575 }
> > 576 KASSERT(m != NULL, ("null mbuf and no error"));
> > (kgdb) i loc
> > ip = (struct ip *) 0xc12f700e
> > m0 = (struct mbuf *) 0xc12f700e
> > ro = {ro_rt = 0xc11f8420, ro_dst = {sa_len = 16 '\020', sa_family = 2
> > '\002', sa_data = "\000\000ˬ\002\005\000\000\000\000\000\000\000"}}
> > dst = (struct sockaddr_in *) 0xc76bfc3c
> > ia = (struct in_ifaddr *) 0x0
> > ifa = (struct ifaddr *) 0x0
> > ifp = (struct ifnet *) 0xc0f91800
> > odest = {s_addr = 84060352}
> > dest = {s_addr = 84060352}
> > sum = 0
> > ip_len = 0
>
> This should not happen. ip_len is initialized from ntohs(ip->ip_len) and never
> touched again. Anyway, let's look some more ...
is it accurate to say that ip->ip_len is 10240 @ this point - but it should be 40?
at line 542 of ip_fastfwd.c 1.17.2.7...
the ip->ip_len <= mtu check should eval to true and take the first branch - but it
takes the else branch (hence the ip_fragment section) - b/c ip_len is still in
network order?
	if (ip->ip_len <= mtu ||
	    (ifp->if_hwassist & CSUM_FRAGMENT && (ip->ip_off & IP_DF) == 0)) {
		/*
		 * Restore packet header fields to original values
		 */
		ip->ip_len = htons(ip->ip_len);
		ip->ip_off = htons(ip->ip_off);
		/*
		 * Send off the packet via outgoing interface
		 */
		error = (*ifp->if_output)(ifp, m,
		    (struct sockaddr *)dst, ro.ro_rt);
	} else {
		/*
		 * Handle EMSGSIZE with icmp reply needfrag for TCP MTU discovery
		 */
		if (ip->ip_off & IP_DF) {
			ipstat.ips_cantfrag++;
			icmp_error(m, ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG,
			    0, ifp);
			goto consumed;
		} else {
			/*
			 * We have to fragement the packet
			 */
			m->m_pkthdr.csum_flags |= CSUM_IP;
			/*
			 * ip_fragment expects ip_len and ip_off in host byte
			 * order but returns all packets in network byte order
			 */
			if (ip_fragment(ip, &m, mtu, ifp->if_hwassist,
			    (~ifp->if_hwassist & CSUM_DELAY_IP))) {
				goto drop;
			}
			KASSERT(m != NULL, ("null mbuf and no error"));
			/*
>
> > error = 84060352
> > hlen = -1057417216
> > mtu = 0
> > __func__ = "ip_fastforward"
> > (kgdb) p *ip
> > $1 = {ip_hl = 5, ip_v = 4, ip_tos = 0 '\0', ip_len = 10240, ip_id = 61249,
>
> ip_len should be 40 as ip_len is supposed to be in HOST BYTE ORDER at this
> point. Feeding 10240 to ntohs() gives the correct value, so something
> obviously went wrong.
>
> Let's see how we got here:
> 355 does the byteorder flip to host byte order
> 366 pfil OUT
> 451 pfil IN
> 527 first check ip_len < if_mtu etc ...
>
> Obviously, the only thing that might mess with the byte order (unless I missed
> something along the way) is one of the pfil consumers.
>
> ***
> *** What firewall(s) are you running with?
> ***
ipfw enabled - it's a permit all (IPFIREWALL_DEFAULT_TO_ACCEPT) - output from 'ipfw show':
fb54c# ipfw show
65535 26395 1874336 allow ip from any to any
fb54c#
here is the diff of my kernel config against GENERIC:
mbsd05# diff /root/kernels/D1-0722 /root/kernels/GENERIC
21,22d20
< makeoptions DEBUG=-g
<
24c22
< #cpu I486_CPU
---
> cpu I486_CPU
26,27c24,25
< #cpu I686_CPU
< ident D1-0722
---
> cpu I686_CPU
> ident GENERIC
31,48d28
<
< options KDB
< options DDB
< options INVARIANTS
< options INVARIANT_SUPPORT
<
< options CPU_SOEKRIS
< options CPU_GEODE
<
< options HZ=1000
< options DEVICE_POLLING
<
< options IPFIREWALL
< options IPFIREWALL_VERBOSE
< options IPFIREWALL_VERBOSE_LIMIT
< options IPFIREWALL_DEFAULT_TO_ACCEPT
< options DUMMYNET
< options IPDIVERT
mbsd05#
>
> > ip_off = 0, ip_ttl = 63 '?', ip_p = 17 '\021', ip_sum = 31921, ip_src =
> > {s_addr = 67479744}, ip_dst = {s_addr = 84060352}} (kgdb) p *m
> > $2 = {m_hdr = {mh_next = 0x0, mh_nextpkt = 0x0, mh_data = 0xc12f700e "E",
> > mh_len = 40, mh_flags = 3, mh_type = 1}, M_dat = {MH = {MH_pkthdr = {rcvif
> > = 0xc0f90000, len = 40, header = 0x0, csum_flags = 769, csum_data = 0, tags
>
> 40, there you have it - no need to fragment at all!
>
> > /usr/src/sys/netinet/ip_output.c:967
> > 967 m->m_next = m_copy(m0, off, len);
> > (kgdb) l
> > 962 len = ip->ip_len - off;
> > 963 m->m_flags |= M_LASTFRAG;
> > 964 } else
> > 965 mhip->ip_off |= IP_MF;
> > 966 mhip->ip_len = htons((u_short)(len + mhlen));
> > 967 m->m_next = m_copy(m0, off, len);
> > 968 if (m->m_next == NULL) { /* copy failed */
> > 969 m_free(m);
> > 970 error = ENOBUFS; /* ??? */
> > 971 ipstat.ips_odropped++;
>
> Just to make sure, we didn't touch the original packet at this point so the
> above values are still the ones we based the (wrong) decision on.
>
> --
> /"\ Best regards, | mlaier at freebsd.org
> \ / Max Laier | ICQ #67774661
> X http://pf4freebsd.love2party.net/ | mlaier at EFnet
> / \ ASCII Ribbon Campaign | Against HTML Mail and News