[6.x] problem with AIO, non-blocking sockets on FreeBSD and IE7 on Windows.

John-Mark Gurney gurney_j at resnet.uoregon.edu
Mon Jun 25 17:46:45 UTC 2007


Julian Elischer wrote this message on Mon, Jun 25, 2007 at 10:17 -0700:
> Bruce Evans wrote:
> >On Fri, 22 Jun 2007, Julian Elischer wrote:
> >
> >>If one has an event-driven process that accepts tcp connections, one 
> >>needs to set the non-blocking socket option and use kqueue or similar 
> >>to schedule work.
> >>
> >>This is ok for data transfers, however when it comes to the close() 
> >>call there is a problem.  The problem is in the following code in 
> >>so_close()
> >>
> >>
> >>              if (so->so_options & SO_LINGER) {
> >>                      if ((so->so_state & SS_ISDISCONNECTING) &&
> >>                          (so->so_state & SS_NBIO))
> >>                              goto drop;
> >>...
> >>drop:
> >> [ continues on to destroy socket ]
> >>
> >>
> >>because SS_NBIO is set, the socket acts as if SO_LINGER was set, with 
> >>a timeout of 0.
> >>the result of this, is the following behaviour:
> >
> >[ packets in flight get lost ]
> >
> >This seems to be the correct behaviour.  The application doesn't care
> >about its data and/or wants to close the descriptor without blocking,
> >so it doesn't turn off the blocking flag and/or wait for i/o to complete
> >(so that it can see if the i/o actually worked) before calling close().
> 
> It's not the correct behaviour if the only packet coming back is an Ack of
> the FIN (and a FIN) because in the real world, making IE7 throw an error 
> screen is not an acceptable option. This is the sort of thing
> that gets FreeBSD thrown out in favour of "anything else".
> Believe me, our customers are "NOT HAPPY" about this.
> Instead of getting an "authorization required" page along with
> the opportunity to log in, they get an error, and no opportunity
> to log in, which makes the system unusable.
> Yes, Blame Microsoft, but we are breaking the TCP spec, not them.
> We need to fix this somehow.

As bde mentioned, the bug is in the application...  Even SUSv2 says:
     When all file descriptors associated with a pipe or FIFO special file
     are closed, any data remaining in the pipe or FIFO will be discarded.

Our own close(2) says:
on the last
     close of a socket(2) associated naming information and queued data are
     discarded

So, failure of the application to ensure that all data is sent is the
application's fault...  bde alluded to a simple workaround: clearing
the non-blocking flag, which will return close to the "expected" (but
apparently broken) behavior of keeping the tcp socket around till all
remaining data has been sent...

I must note that the code you quoted has been in FreeBSD since 2.0.

> >I implemented this behaviour for tty drivers in FreeBSD.  Old BSD tty
> >drivers didn't check the nonblocking flag and didn't have a timeout,
> >so close() on tty devices tended to hang forever (normally at long
> >weekends) even for closes that should have been nonblocking.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."

