sshd / tcp packet corruption ?

Martin Minkus martin.minkus at punz.co.nz
Wed Jun 23 21:43:46 UTC 2010


Thanks for the reply. I have since posted a follow-up to my original
message with more details: even raw TCP data sent from one box to
another is getting corrupted.

The culprit is definitely kinetic.

Furthermore, I've determined that both NICs are doing it.

kinetic:~# netstat -i
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
em0    1500 <Link#1>      00:0e:0c:6b:d6:d3   222249     0     0   190062     0     0
em0    1500 10.64.10.0    kinetic             198516     -     -   189315     -     -
nfe0   1500 <Link#2>      00:24:1d:15:11:48    17932     0     0      219     0     0
nfe0   1500 10.64.11.0    10.64.11.253         12675     -     -      217     -     -
plip0  1500 <Link#3>                               0     0     0        0     0     0
lo0   16384 <Link#4>                             592     0     0      592     0     0
lo0   16384 fe80:4::1     fe80:4::1                0     -     -        0     -     -
lo0   16384 localhost     ::1                      0     -     -        0     -     -
lo0   16384 your-net      localhost              552     -     -      592     -     -
kinetic:~# 
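
For anyone who wants to check these counters repeatedly rather than by eye, here is a minimal Python sketch. It assumes the ten-column `netstat -i` layout shown above; the function name `interfaces_with_errors` is made up for illustration. It reads only the link-level rows and reports any interface whose Ierrs or Oerrs is nonzero.

```python
# Two link-level rows copied from the output above.
SAMPLE = """\
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
em0    1500 <Link#1>      00:0e:0c:6b:d6:d3   222249     0     0   190062     0     0
nfe0   1500 <Link#2>      00:24:1d:15:11:48    17932     0     0      219     0     0
"""


def interfaces_with_errors(netstat_output):
    """Return {name: (ierrs, oerrs)} for link-level rows with nonzero errors.

    Assumes the column layout shown in this post; other netstat versions
    may print different columns.
    """
    flagged = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Link-level rows carry "<Link#N>" in the Network column; the header
        # and the per-address rows (which show "-" for errors) are skipped.
        if len(fields) < 9 or not fields[2].startswith("<Link#"):
            continue
        # The Address column can be blank (plip0, lo0), so take the six
        # counters from the right: Ipkts Ierrs Idrop Opkts Oerrs Coll.
        _ipkts, ierrs, _idrop, _opkts, oerrs, _coll = fields[-6:]
        if int(ierrs) or int(oerrs):
            flagged[fields[0]] = (int(ierrs), int(oerrs))
    return flagged
```

Every Ierrs and Oerrs value in the output above is 0, so this returns an empty dictionary for it. These counters cover only errors the drivers report, so an empty result says nothing about damage introduced elsewhere, such as in RAM.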

Perhaps it is RAM, though. Good point; I'll run a memtest.

Martin.

-----Original Message-----
From: Lowell Gilbert [mailto:freebsd-questions-local at be-well.ilk.org] 
Sent: Thursday, 24 June 2010 09:41
To: Martin Minkus
Cc: freebsd-questions
Subject: Re: sshd / tcp packet corruption ?

Martin Minkus <martin.minkus at punz.co.nz> writes:

> It seems this issue I reported below may actually be related to some
> kind of TCP packet corruption ?

Possible.  Or memory errors.  Hard to say much at this point, when you
don't even know which side is actually causing the errors.

> Still same box. I've noticed my SSH connections into the box will die
> randomly, with errors.
>
> sshd logs the following on the box itself:
>
> Jun 18 11:15:32 kinetic sshd[1406]: Received disconnect from
> 10.64.10.251: 2: Invalid packet header.  This probably indicates a
> problem with key exchange or encryption. 
>

You might find more useful information by getting verbose messages from
the other end.  

I don't have time to check this in detail, but if I recall correctly,
that message means that the other side closed the connection based on an
apparent invalid header type in a packet that 'kinetic' received.
Random corruption isn't likely in that case, because the error is always
in the same place in the packet.  Check the 'netstat -i' numbers to see
if the drivers are picking up any packet errors.

It's hard to debug network problems in ssh, though, because (obviously)
you can't tell in general whether packet data is corrupt.  If you can
set up a test case with, say, UDP echo, it would be easier to see the
damage to the packets if they are, in fact, being corrupted.
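
One way to build such a UDP echo test is sketched below in Python. This is only an illustration, not a tested diagnostic: the names `run_echo_server` and `probe`, the packet count, and the payload size are all assumptions. The prober sends random payloads, compares each echo byte for byte, and records the offsets that differ.

```python
import os
import socket
import threading


def run_echo_server(sock, stop):
    """Echo every datagram back to its sender until `stop` is set."""
    sock.settimeout(0.2)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(65535)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def probe(host, port, count=100, size=1024, timeout=1.0):
    """Send `count` random payloads and compare each echo byte for byte.

    Returns (sent, lost, corrupted), where `corrupted` is a list of
    (packet_number, [offsets of differing bytes]).
    """
    lost = 0
    corrupted = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for n in range(count):
            payload = os.urandom(size)
            s.sendto(payload, (host, port))
            try:
                reply, _ = s.recvfrom(65535)
            except socket.timeout:
                lost += 1
                continue
            if reply != payload:
                diffs = [i for i, (a, b) in enumerate(zip(payload, reply))
                         if a != b]
                # Offsets past the end of the shorter buffer differ too.
                diffs.extend(range(min(len(payload), len(reply)),
                                   max(len(payload), len(reply))))
                corrupted.append((n, diffs))
    return count, lost, corrupted
```

To use it, bind a UDP socket on the suspect box, pass it to `run_echo_server` together with a `threading.Event`, and call `probe` from the other box. Two caveats: a datagram that fails the UDP checksum is normally dropped by the receiving stack, so damage on the wire shows up in `lost` rather than in `corrupted`; and a late reply after a timeout will be compared against the wrong payload, so a real test would want sequence numbers.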

Unfortunately, I'm so used to having sophisticated test equipment in the
lab to look at these kinds of problems that I'm probably missing what
would be obvious to someone who deals with problems "in the field."
Hope I've been somewhat helpful anyway.