close() of active socket does not work on FreeBSD 6

John-Mark Gurney gurney_j at resnet.uoregon.edu
Thu Dec 21 20:07:40 PST 2006


Daniel Eischen wrote this message on Thu, Dec 21, 2006 at 22:35 -0500:
> On Thu, 21 Dec 2006, John-Mark Gurney wrote:
> 
> >Robert Watson wrote this message on Thu, Dec 21, 2006 at 15:22 +0000:
> >>>I think you are only interested in threads that are sleeping.. so you allow
> >>>a sleeping thread to save a pointer to the fd (or whatever) on which it
> >>>is sleeping, along with the sleep address.
> >>>
> >>>items that are not sleeping are either already returning, or are going to
> >>>sleep, in which case they can check at that time.
> >>
> >>Hence my question about select and poll: should they throw an exception
> >>state when a file descriptor is closed out from under them?  They often
> >>sleep on hundreds or thousands of file descriptors, and not just one.
> >
> >IMO, your program is buggy if you close the file descriptor before
> >everything is out of the kernel wrt the fd...  It means that your close
> >statement isn't waiting for things to be cleanly shut down, and that
> >you still have dangling reference counts to the parts of the code that
> >are in the kernel...
> >
> >I used to expect something similar w/ a kqueue-based, event-driven
> >web server, and found that I had bugs due to assuming that I could
> >close it whenever I want...  What happens if you close the fd between
> >the time select returns and you process it?  What happens if the fd
> >gets closed, and another thread (or an earlier fd that accepts
> >connections) reuses that fd?  And then your state machine isn't ready
> >to get an event since it isn't supposed to get one yet...
> >
> >The kernel isn't buggy wrt closing a fd when another thread is using
> >it, it's the program that's buggy...
> 
> I agree also, but hanging without return isn't very detectable.

It's a lot more detectable than working 99% of the time and
failing when things get corrupted due to a race.. :)

> The best thing to do is to tell the programmer that he is doing
> something stupid, and returning with an error is the way that
> it is typically done.  Solaris seems to have jumped through

As long as it's EDOOFUS...  I don't see any other error that would
be appropriate...

> some hoops to achieve this behavior, so I doubt it is without
> merit.  OTOH, I'm not going to argue that it is one of the
> more important things we should be worried about ;-)

As long as it doesn't cost much more to do it...  Hanging is just as
good an indication as returning an error...  And I'd say it's better,
as it forces the buggy software to be fixed as opposed to simply ignoring
the error, which is likely what the programmer will do...
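
To make the fd reuse hazard from the quoted text concrete, here is a
minimal sketch in C.  It assumes a single-threaded process and the POSIX
rule that open() returns the lowest-numbered descriptor not currently
open.  The helper name fd_is_reused() and the use of /dev/null are
illustrative assumptions only, not anything from the thread.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): returns 1 if the descriptor
 * number freed by close() is handed out again by the very next open(),
 * 0 if not, and -1 if open() fails.
 */
int fd_is_reused(void)
{
    /* Stands in for a socket that some other code still refers to. */
    int stale = open("/dev/null", O_RDONLY);
    if (stale < 0)
        return -1;

    /* Closed "out from under" whoever still holds the number. */
    close(stale);

    /* Stands in for an accept() or open() done elsewhere in the program. */
    int fresh = open("/dev/null", O_RDONLY);
    if (fresh < 0)
        return -1;

    /* The stale number now names a different open file. */
    int reused = (fresh == stale);
    close(fresh);
    return reused;
}
```

Under those assumptions the function returns 1: code that still holds the
old number would now be operating on an unrelated file without any error
being reported.  In a threaded program the new descriptor could just as
well be created by another thread, which is the race described above.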

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."


More information about the freebsd-arch mailing list