Corrupt packets in Jnet (Was: Re: rtentry and rtrequest)

Mon Apr 23 14:54:35 UTC 2007

On Mon, Apr 23, 2007 at 10:24:46AM +1000, Alan Garfield wrote:
> On Sat, 2007-04-21 at 03:36 +0400, Yar Tikhiy wrote:
> 
> > > ----
> > > Disconnecting: Corrupted MAC on input.
> > > ----
> > 
> > That looks like data corruption happening when TCP segments and/or
> > IP packets become relatively large, i.e., approach or reach the mtu
> > limit.
> 
> The reply looks disturbing from the SP (note the packet size)....
> 
> ----
> IP (tos 0x0, ttl  64, id 2493, offset 0, flags [none], proto: ICMP (1),
> length: 108) 169.254.101.3 > 169.254.101.2: ICMP echo request, id 31748,
> seq 3, length 88
>         0x0000:  4500 006c 09bd 0000 4001 52d2 a9fe 6503
>         0x0010:  a9fe 6502 0800 843d 7c04 0003 462b fbe5
>         0x0020:  0001 c4b7 abcd efab cdef abcd efab cdef
>         0x0030:  abcd efab cdef abcd efab cdef abcd efab
>         0x0040:  cdef abcd efab cdef abcd efab cdef abcd
>         0x0050:  efab cdef abcd efab cdef abcd efab cdef
>         0x0060:  abcd efab cdef abcd efab cdef
> IP (tos 0x0, ttl 255, id 57441, offset 0, flags [none], proto: ICMP (1),
> length: 108) 169.254.101.2 > 169.254.101.3: ICMP echo reply, id 31748,
> seq 3, length 88
>         0x0000:  4500 006c e061 0000 ff01 bd2c a9fe 6502
>         0x0010:  a9fe 6503 0000 8c3d 7c04 0003 462b fbe5
>         0x0020:  0001 c4b7 abcd efab cdef abcd efab cdef
>         0x0030:  abcd efab cdef abcd efab cdef abcd efab
>         0x0040:  cdef abcd efab cdef abcd efab cdef abcd
>         0x0050:  efab cdef abcd efab cdef abcd efab cdef
>         0x0060:  abcd efab cdef abcd efab cdef 0000 0000
>         0x0070:  0000 0000 0000 0000 0000 0000 0000 0000
>         0x0080:  0000 0000 0000 0000 0000 0000 0000 0000
>         0x0090:  0000 0000 0000 0000 0000 0000 0000 0000
>         0x00a0:  0000 0000 0000 0000 0000 0000 0000 0000
>         0x00b0:  0000 0000 0000 0000 0000 0000 0000 0000
>         0x00c0:  0000 0000 0000 0000 0000 0000 0000 0000
>         0x00d0:  0000 0000 0000 0000 0000 0000 0000 0000
>         0x00e0:  0000 0000 0000 0000 0000 0000 0000 0000
>         0x00f0:  00
> ----

There's nothing wrong with the length: it's 241 bytes, the size of
the buffer less that of the Ethernet header.  As we concluded, Jnet
has no means for indicating the exact size of the Ethernet frame, so
the whole buffer has to be considered as such.

BPF works at the link layer, so tcpdump shows you the frame's
complete payload, not only the IP packet.  The extra 0's at the end
will be trimmed off by ip_input() based on the packet length field
in the IP header -- note that it's correctly set to 108 bytes.
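
To illustrate the trimming, here is a minimal userland sketch.  It is
not the kernel's ip_input() code, and the helper name
ip_trailing_bytes() is made up for this example.  It reads the total
length field from bytes 2..3 of the IPv4 header and reports how many
bytes of the frame payload lie beyond the packet:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IP_MIN_HDR_LEN 20	/* IPv4 header without options */

/*
 * Hypothetical helper: return the number of bytes in a frame payload
 * that lie beyond the IPv4 total length, or -1 if the payload is too
 * short to hold the packet it claims to carry.
 */
static long
ip_trailing_bytes(const uint8_t *payload, size_t payload_len)
{
	size_t ip_len;

	assert(payload != NULL);
	if (payload_len < IP_MIN_HDR_LEN)
		return (-1);
	/* The total length field is big-endian, at header bytes 2..3. */
	ip_len = ((size_t)payload[2] << 8) | payload[3];
	if (ip_len < IP_MIN_HDR_LEN || ip_len > payload_len)
		return (-1);
	return ((long)(payload_len - ip_len));
}
```

For the echo reply above, the header starts with 4500 006c, so the
total length is 0x6c = 108 and the remaining 241 - 108 = 133 bytes
are padding as far as IP is concerned.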

> So obviously it cannot deal with fragmented packets. A ping over 213
> will over flow the packet and make the ping request fragment, the other
> side simply drops it to the floor.

This conclusion doesn't seem to follow from the above observation.
If fragmented IP packets really are dropped, that could be a quirk
of the Linux side.
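
The 213-byte figure itself is consistent with the 241-byte frame
payload, as the arithmetic below shows.  The sketch assumes a
standard 14-byte Ethernet header, a 20-byte IP header without
options, an 8-byte ICMP echo header, and an interface mtu equal to
the 241-byte payload; the constant and function names are made up
for this example:

```c
#include <assert.h>
#include <stddef.h>

enum {
	ETHER_HDR_BYTES     = 14,	/* assumed: dst + src + type */
	IP_HDR_BYTES        = 20,	/* assumed: no IP options */
	ICMP_ECHO_HDR_BYTES = 8,	/* type, code, cksum, id, seq */
	FRAME_PAYLOAD_BYTES = 241	/* payload size seen in the dump */
};

/*
 * Largest ping data size that fits into one frame payload without
 * IP fragmentation, under the assumptions listed above.
 */
static size_t
max_unfragmented_ping(size_t frame_payload)
{
	assert(frame_payload >= IP_HDR_BYTES + ICMP_ECHO_HDR_BYTES);
	return (frame_payload - IP_HDR_BYTES - ICMP_ECHO_HDR_BYTES);
}
```

With a 241-byte payload this gives 241 - 20 - 8 = 213 bytes of ping
data, and 241 + 14 = 255 bytes for the whole buffer.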

> But that still doesn't make sense with SSH complaining about a corrupt
> MAC on input. I see no corruption here only dumped packets if they are
> over-sized.

Perhaps the bug is triggered when the outgoing packet consists of
multiple mbufs.  ping sends its packet to the kernel as a single
message while sshd can do smaller writes to the socket which get
coalesced into a TCP segment.  Now I can only think of the following
test: run "sshd -d" under ktrace and compare the data it writes to
the network socket with the data actually sent to Jnet via jnet_start().
A debug printf in jnet_start() will be needed to see the data at the
lowest level possible.
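
As a rough userland model of what to check, the sketch below walks a
chain of segments and copies it into one fixed-size, zero-padded
buffer, which is the data a debug printf would then dump.  struct seg
and chain_flatten() are hypothetical stand-ins, not the real mbuf
API; in the kernel, m_copydata() does this kind of copy from an mbuf
chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf: one segment of a packet. */
struct seg {
	const struct seg *next;	/* next segment, or NULL */
	const uint8_t	 *data;	/* this segment's bytes */
	size_t		  len;	/* number of bytes in data */
};

/*
 * Copy every segment of the chain into buf, back to back, and fill
 * the rest of buf with zeros.  Returns the number of data bytes
 * copied, or -1 if the chain does not fit (buf is then left
 * partially written).
 */
static long
chain_flatten(const struct seg *s, uint8_t *buf, size_t buflen)
{
	size_t off = 0;

	assert(buf != NULL);
	for (; s != NULL; s = s->next) {
		if (s->len > buflen - off)
			return (-1);
		if (s->len > 0)
			memcpy(buf + off, s->data, s->len);
		off += s->len;
	}
	memset(buf + off, 0, buflen - off);
	return ((long)off);
}
```

If the bytes produced this way differ from what sshd wrote to the
socket only when the chain has more than one segment, the chain walk
in the driver is the first suspect.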

Other possible options for collecting genuine data sent by sshd are:
- a netgraph using ng_ether and ng_eiface
- a data tap between "sshd -i" and inetd
- higher debug levels of sshd (I've never investigated them)

I'd also test whether ssh from the SP works OK with a FreeBSD host
(the same FreeBSD version as on the platform side would be best)
via the external Ethernet.

If nothing helps at all, device access timing can be the cause.  Can
the device ports be written to/read from in a loop without a delay?

> Should I pad out the packet on the platform side to be the same as the
> SP?

Your jnet_start() routine fills the tail of the buffer w/zeros
already, doesn't it?

-- 
Yar