sshd / tcp packet corruption ?
Martin Minkus
martin.minkus at punz.co.nz
Wed Jun 23 21:43:46 UTC 2010
Thanks for the reply. I actually posted a response to the original
message with more details, showing that plain TCP data sent from one box
to another is getting corrupted.
The culprit is definitely kinetic.
Furthermore, I've determined that both NICs are doing it.
kinetic:~# netstat -i
Name    Mtu Network     Address             Ipkts Ierrs Idrop   Opkts Oerrs  Coll
em0    1500 <Link#1>    00:0e:0c:6b:d6:d3  222249     0     0  190062     0     0
em0    1500 10.64.10.0  kinetic            198516     -     -  189315     -     -
nfe0   1500 <Link#2>    00:24:1d:15:11:48   17932     0     0     219     0     0
nfe0   1500 10.64.11.0  10.64.11.253        12675     -     -     217     -     -
plip0  1500 <Link#3>                            0     0     0       0     0     0
lo0   16384 <Link#4>                          592     0     0     592     0     0
lo0   16384 fe80:4::1   fe80:4::1               0     -     -       0     -     -
lo0   16384 localhost   ::1                     0     -     -       0     -     -
lo0   16384 your-net    localhost             552     -     -     592     -     -
kinetic:~#
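
For anyone who wants to check these counters mechanically, here is a minimal Python sketch. It assumes the FreeBSD `netstat -i` layout shown in this message (the last six fields of each row being Ipkts, Ierrs, Idrop, Opkts, Oerrs, Coll) and that each row is on a single, unwrapped line. The function name `interfaces_with_errors` and the second sample row with a nonzero Ierrs value are made up for illustration. Note that these counters only show errors the driver itself detected, so all zeros does not rule out corruption elsewhere.

```python
def interfaces_with_errors(netstat_output):
    """Return {name: (ierrs, idrop, oerrs, coll)} for link-level rows
    that have at least one nonzero counter."""
    flagged = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header, and anything too short to hold
        # name, mtu, network and the six counters.
        if len(fields) < 9 or fields[0] == "Name":
            continue
        name = fields[0]
        # Assumption: the last six fields are always the counters, even
        # when the Address column is empty (plip0, lo0).
        ipkts, ierrs, idrop, opkts, oerrs, coll = fields[-6:]
        # Per-address rows print "-" instead of error counts; ignore them.
        if not all(c.isdigit() for c in (ierrs, idrop, oerrs, coll)):
            continue
        counts = (int(ierrs), int(idrop), int(oerrs), int(coll))
        if any(counts):
            flagged[name] = counts
    return flagged


# Rows copied from the output in this message (unwrapped).
SAMPLE = """\
Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll
em0 1500 <Link#1> 00:0e:0c:6b:d6:d3 222249 0 0 190062 0 0
em0 1500 10.64.10.0 kinetic 198516 - - 189315 - -
nfe0 1500 <Link#2> 00:24:1d:15:11:48 17932 0 0 219 0 0
plip0 1500 <Link#3> 0 0 0 0 0 0
"""

# Hypothetical row with input errors, only to show what a hit looks like.
HYPOTHETICAL = "em0 1500 <Link#1> 00:0e:0c:6b:d6:d3 222249 7 0 190062 0 0"

if __name__ == "__main__":
    print(interfaces_with_errors(SAMPLE))
    print(interfaces_with_errors(HYPOTHETICAL))
```
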
Perhaps it is RAM, though... good point. I'll run a memtest.
Martin.
-----Original Message-----
From: Lowell Gilbert [mailto:freebsd-questions-local at be-well.ilk.org]
Sent: Thursday, 24 June 2010 09:41
To: Martin Minkus
Cc: freebsd-questions
Subject: Re: sshd / tcp packet corruption ?
Martin Minkus <martin.minkus at punz.co.nz> writes:
> It seems this issue I reported below may actually be related to some
> kind of TCP packet corruption ?
Possible. Or memory errors. Hard to say much at this point, when you
don't even know which side is actually causing the errors.
> Still same box. I've noticed my SSH connections into the box will die
> randomly, with errors.
>
> Sshd logs the following on the box itself:
>
> Jun 18 11:15:32 kinetic sshd[1406]: Received disconnect from
> 10.64.10.251: 2: Invalid packet header. This probably indicates a
> problem with key exchange or encryption.
>
You might find more useful information by getting verbose messages from
the other end.
I don't have time to check this in detail, but if I recall correctly,
that message means that the other side closed the connection based on an
apparent invalid header type in a packet that 'kinetic' received.
Random corruption isn't likely in that case, because the error is always
in the same place in the packet. Check the 'netstat -i' numbers to see
if the drivers are picking up any packet errors.
It's hard to debug network problems in ssh, though, because (obviously)
you can't tell in general whether packet data is corrupt. If you can
set up a test case with, say, UDP echo, it would be easier to see the
damage to the packets if they are, in fact, being corrupted.
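
A UDP echo test along those lines could look roughly like the Python sketch below. It is only an illustration: the function names (`diff_offsets`, `start_echo_server`, `probe`) and the default sizes and counts are invented here, not taken from any existing tool. One side runs the echo server, the other sends random payloads and compares each reply byte for byte. Keep in mind that the UDP checksum normally causes damaged datagrams to be dropped before the application sees them, so a high "lost" count can be as telling as a visibly corrupted reply. A reply that arrives after the timeout would also be compared against the wrong payload, so use a generous timeout.

```python
import os
import socket
import threading


def diff_offsets(sent, received):
    """Return the byte offsets where the echoed payload differs from the original."""
    offsets = [i for i, (a, b) in enumerate(zip(sent, received)) if a != b]
    # Treat a length mismatch as damage from the shorter length onward.
    shorter, longer = sorted((len(sent), len(received)))
    offsets.extend(range(shorter, longer))
    return offsets


def start_echo_server(host="127.0.0.1", port=0):
    """Start a UDP echo server in a daemon thread; return (socket, bound_port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))

    def serve():
        while True:
            try:
                data, addr = sock.recvfrom(65535)
            except OSError:
                return  # socket was closed
            sock.sendto(data, addr)

    threading.Thread(target=serve, daemon=True).start()
    return sock, sock.getsockname()[1]


def probe(host, port, count=100, size=1024, timeout=1.0):
    """Send `count` random payloads of `size` bytes and check each echo.

    Returns a dict with the number sent, the number lost (no reply within
    `timeout`), and a list of (sequence, differing_offsets) for bad replies.
    """
    results = {"sent": 0, "lost": 0, "corrupt": []}
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for seq in range(count):
            payload = os.urandom(size)
            s.sendto(payload, (host, port))
            results["sent"] += 1
            try:
                echoed, _ = s.recvfrom(65535)
            except socket.timeout:
                results["lost"] += 1
                continue
            bad = diff_offsets(payload, echoed)
            if bad:
                results["corrupt"].append((seq, bad))
    return results


if __name__ == "__main__":
    # Loopback self-test; across two machines, run the server on one box
    # (bound to "0.0.0.0" and a fixed port) and call probe() from the other.
    server, port = start_echo_server()
    print(probe("127.0.0.1", port, count=10, size=512))
```

If the differing offsets cluster at the same positions or follow a regular stride, that points toward a specific component rather than random noise.
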
Unfortunately, I'm so used to having sophisticated test equipment in the
lab to look at these kinds of problems that I'm probably missing what
would be obvious to someone who deals with problems "in the field."
Hope I've been somewhat helpful anyway.