tcp failing to recover from a packet loss under 8.2-RELEASE?

Slawa Olhovchenkov slw at zxy.spb.ru
Thu Aug 11 13:54:56 UTC 2011


On Thu, Aug 11, 2011 at 11:33:37PM +1000, Lawrence Stewart wrote:

> >>> Autotuning w/o limits is a bad idea. It is a way to DoS.
> >>
> >> Depends how it is implemented. With appropriate backpressure mechanisms
> >> put in place, it could be perfectly safe. I envisage reassembly segments
> >> being at the bottom of the heap in terms of importance, so if a machine
> >> were to come under memory pressure, they would be the first thing to be
> >> reclaimed. TCP would continue to operate if they got pulled out from
> >> under the connection as the protocol doesn't consider segments held in
> >> reassembly to have been delivered, so would recover via retransmission.
> >
> > Yes, TCP would continue to operate. But we must not allow an attacker
> > to put the system under memory pressure.
> 
> Without a concrete patch to discuss, let's just agree to disagree for 
> the time being. FreeBSD does a fairly good job autoscaling and reacting 
> to pressure with the VM subsystem for example. I don't see why we
> can't 

Yes, and the VM system allows setting different memory limits per process (and now per jail).

> become good at doing it with the netstack. Manual tuning sucks and can 
> be just as dangerous if you tune things up to get performance, which 
> opens you up to the same problems.

Autoscaling with limits is good.
Automatic computation of the limits (from available resources) is also
good (the current default limits are frequently too small for modern
installations, but don't forget about embedded systems).
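To illustrate the idea of computing a limit from available resources, here is a minimal userspace sketch. All names and constants (`reass_limit_from_mem`, `SEG_ENTRY_SIZE`, the 1/64 fraction, the floor and ceiling) are hypothetical, chosen only to show the shape of such a computation; they are not FreeBSD's actual tunables.

```c
#include <stdint.h>

/* Illustrative only: derive a reassembly-segment limit from physical
 * memory, clamped so embedded systems keep a usable floor and large
 * machines do not hand an attacker an unbounded memory budget. */

#define SEG_ENTRY_SIZE  2048          /* assumed cost of one queued segment */
#define MIN_SEGMENTS    64            /* floor for small/embedded systems */
#define MAX_SEGMENTS    (256 * 1024)  /* ceiling even on huge-RAM machines */

static uint64_t
reass_limit_from_mem(uint64_t physmem_bytes)
{
	/* Spend at most 1/64 of physical memory on reassembly entries. */
	uint64_t limit = (physmem_bytes / 64) / SEG_ENTRY_SIZE;

	if (limit < MIN_SEGMENTS)
		limit = MIN_SEGMENTS;
	if (limit > MAX_SEGMENTS)
		limit = MAX_SEGMENTS;
	return limit;
}
```

With these assumed constants, a 4 MB embedded box is clamped up to the floor, while a 64 GB server is clamped down to the ceiling, which is the "autoscaling with limits" behavior argued for above.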

> >>> Maybe this trouble can be solved by preallocating a "hidden" element
> >>> in the tqe for the segment received in valid order and ready to be
> >>> sent to the application? I.e., when creating the reassembly queue for
> >>> a TCP connection we allocate a queue element (with room for the data
> >>> payload), used only when data is ready for the application. Allocate
> >>> it in the queue so as not to break the ABI (don't change struct
> >>> tcpcb).
> >>
> >> I'm not sure I quite follow what you're suggesting here, but I think
> >> Andre's proposed patch achieves the same goal and is arguably cleaner?
> >
> > Andre allocates on the stack. My idea is different (sorry for my bad English).
> >
> > 1. The application opens a socket.
> > 2. The kernel allocates the internal TCP structure for this socket.
> > 2.1 Additional step: the kernel preallocates one tseg_qent and
> > places it in t_segq. This queue entry will be used only when we receive
> > the first good segment in the right place.
> >
> > for example:
> >
> > [lost segment 100] (segment 101) (segment 102) ... (segment 100).
> >
> > Segments 101, 102, etc. are processed as usual.
> > Segment 100 is placed in the reserved, previously allocated queue entry.
> >
> > After receiving segment 100 we can send the data to the application (to
> > user space). After sending the data, the queue entry from 2.1 is not
> > freed; it is a permanently allocated entry.
> 
> Ok I understand. Why is your proposal better than stack allocated 
> though? I can think of a number of reasons why it's worse...

I am afraid of stack allocation in this place (for a list entry).
A bug in this code could cause a kernel panic.
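The proposal above can be sketched in userspace C as follows. The names `tseg_qent` and `t_segq` echo FreeBSD's reassembly code, but everything else (`struct conn`, `segment_arrives`, the unsorted list) is a simplified stand-in, not the real kernel data structures. The point it demonstrates: the in-order segment always lands in the permanently reserved entry, so that path cannot fail under memory exhaustion; only out-of-order segments need a fresh allocation, and if that fails TCP recovers via retransmission.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for FreeBSD's reassembly entry. */
struct tseg_qent {
	struct tseg_qent *tqe_next;
	uint32_t tqe_seq;
	uint32_t tqe_len;
};

struct conn {
	struct tseg_qent *t_segq;       /* out-of-order queue (unsorted) */
	struct tseg_qent  t_reserved;   /* preallocated at "socket open" */
	uint32_t          rcv_nxt;      /* next expected sequence number */
};

static void
conn_init(struct conn *c, uint32_t isn)
{
	memset(c, 0, sizeof(*c));
	c->rcv_nxt = isn;
}

/* Returns bytes that became deliverable to the application, or -1 if an
 * out-of-order segment had to be dropped for lack of memory. */
static int
segment_arrives(struct conn *c, uint32_t seq, uint32_t len)
{
	if (seq == c->rcv_nxt) {
		/* In order: use the reserved entry; this path cannot fail. */
		c->t_reserved.tqe_seq = seq;
		c->t_reserved.tqe_len = len;
		c->rcv_nxt += len;
		int delivered = (int)len;

		/* Drain queued segments that are now contiguous. */
		for (struct tseg_qent **q = &c->t_segq; *q != NULL; ) {
			if ((*q)->tqe_seq == c->rcv_nxt) {
				struct tseg_qent *e = *q;
				c->rcv_nxt += e->tqe_len;
				delivered += (int)e->tqe_len;
				*q = e->tqe_next;
				free(e);
				q = &c->t_segq; /* restart: list is unsorted */
			} else {
				q = &(*q)->tqe_next;
			}
		}
		return delivered;
	}

	/* Out of order: needs a fresh entry, which may fail under pressure. */
	struct tseg_qent *e = malloc(sizeof(*e));
	if (e == NULL)
		return -1; /* dropped; TCP recovers via retransmission */
	e->tqe_seq = seq;
	e->tqe_len = len;
	e->tqe_next = c->t_segq;
	c->t_segq = e;
	return 0;
}
```

Replaying the example from the thread, segments 101 and 102 arrive out of order and are queued via malloc; when segment 100 finally arrives it goes into the reserved entry and all three bytes become deliverable at once.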


More information about the freebsd-net mailing list