close() of active socket does not work on FreeBSD 6
Robert Watson
rwatson at FreeBSD.org
Wed Dec 20 08:22:14 PST 2006
On Wed, 13 Dec 2006, Daniel Eischen wrote:
> [CC trimmed]
>
> On Wed, 13 Dec 2006, David Xu wrote:
>
>> On Wednesday 13 December 2006 04:49, Daniel Eischen wrote:
>>>
>>> Well, if threads waiting on IO are interruptible by signals, can't we make
>>> a new signal that's only used by the kernel and send it to all threads
>>> waiting on IO for that descriptor? When it gets out to actually set up the
>>> signal handler, it just resumes like it is returning from an SA_RESTART
>>> signal handler (which according to another posting would reissue the IO
>>> command and get EBADF).
>>
>> Even if you have implemented close() with the interruption, another
>> thread opening a file can still reuse the file handle immediately:
>> according to the specifications, the lowest free file handle will be
>> returned. If SA_RESTART is used and the interrupted thread restarts the
>> syscall, it will be using the wrong file. I think even if we implement the
>> feature in the kernel, userland threads still have a serious race to fix.
>
> If you use a special signal that is only used for this purpose, there is no
> reason you have to try the IO operation again. You can just return EBADF.
>
> Anyway, this was just a thought/idea. I don't mean to argue against any of
> the other reasons why this isn't a good idea.
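
As an aside on the descriptor-reuse concern quoted above, here is a minimal
sketch of the hazard in C. It assumes a single-threaded process (so nothing
else can take the number in between) and uses /dev/null purely as a
convenient file to open; the function name is made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Illustration only (hypothetical helper, not from any FreeBSD source).
 * Returns 1 if the descriptor number released by close() is handed
 * straight back by the next open(), 0 if not, and -1 on error.
 */
int fd_number_is_reused(void)
{
    /* Stands in for the descriptor some thread is blocked on. */
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;

    /* Another thread closes it while the I/O is (conceptually) pending. */
    if (close(first) != 0)
        return -1;

    /* A third thread opens an unrelated file. open() must return the
     * lowest-numbered descriptor not currently open in the process. */
    int second = open("/dev/null", O_WRONLY);
    if (second < 0)
        return -1;

    /* The number just released is free, so the lowest free number
     * cannot be larger than it (single-threaded assumption). */
    assert(second <= first);

    int reused = (second == first);
    close(second);
    return reused;
}
```

If a restarted system call simply looked up the same number again, it would
now find the second, unrelated open file rather than getting EBADF.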

Whatever may be implemented to solve this issue will require a fairly serious
re-working of how we implement file descriptor reference counting in the
kernel. Do you propose similar "cancellation" of other system calls blocked
on the file descriptor, including select(), etc? Typically these system calls
interact with the underlying object associated with the file descriptor, not
the file descriptor itself, and often, they act directly on the object and
release the file descriptor before performing their operation. I think before
we can put any reasonable implementation proposal on the table, we need a
clear set of requirements:

- What is the scope of cancellation? Are we cancelling outstanding
  simultaneous I/O operations on the same fd index in the process, use of any
  fd pointing at the same open file entry in the process (i.e., all dup'd
  instances), or the same open file entry across all processes? I've been
  presuming only use of the same fd index in the same process is relevant, but
  if so, let's make sure we state that. If not, what do we mean?

- Exactly which potentially blocking operations will be cancelled as a result
  of close() of an "in use" file descriptor? read()? write()? sendfile()?
  connect()? ioctl()? select()? poll()? close()? Is the set of possible
  cancellation points equal to the existing set of interruptible sleeps?
  Notice that in our current implementation, objects are often reached using a
  file descriptor, but then separately referenced for the duration of the
  operation, with the file descriptor being released. This means that we
  currently don't maintain any useful list of threads currently interacting
  with the file descriptor, and only have a limited notion of which threads
  are interacting with the underlying object.

- What semantics are expected regarding the underlying object when an
  operation is cancelled due to simultaneous close() on the same file
  descriptor? Keep in mind that the underlying object may be referenced by
  other file descriptor indexes pointing at the same open file state (shared
  offset, etc). For example, if we cancel connect(), is it safe to say that
  what we've done is cancel the wait for connect() to complete, rather than
  the connection operation itself, which may continue and be visible on other
  file descriptor indexes referencing the same object, or to other processes
  also referencing it?
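
To make the distinction between an fd index and the open file entry behind it
concrete, here is a small sketch in C. It relies only on standard POSIX dup()
semantics; the function name and the /tmp template are illustrative
assumptions, not anything taken from the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustration only (hypothetical helper). Writes 5 bytes through one
 * descriptor, closes that descriptor, then reports the file offset as
 * seen through a dup()'d descriptor. Returns -1 on error.
 */
long offset_seen_via_dup(void)
{
    char path[] = "/tmp/fdscope.XXXXXX";   /* scratch file template */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                          /* file lives until last close */

    /* Second fd index referring to the same open file entry. */
    int alias = dup(fd);
    if (alias < 0) {
        close(fd);
        return -1;
    }

    if (write(fd, "hello", 5) != 5) {
        close(fd);
        close(alias);
        return -1;
    }

    /* Close one index only; the open file entry is still referenced. */
    close(fd);

    /* The offset is state of the shared entry, so the write made
     * through the closed index is visible here. */
    off_t pos = lseek(alias, 0, SEEK_CUR);
    close(alias);
    return (long)pos;
}
```

Closing one index leaves the shared state reachable through the other, which
is why "cancel everything on this fd index" and "cancel everything on this
open file entry" are different requirements.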

While providing Solaris-like semantics here makes some amount of sense, this
is a very tricky area, and one where we're still refining performance
behavior, reference counting behavior, etc. I don't think there will be any
easy answers, and we need to think through the semantic and performance
implications of any change very carefully before starting to implement.

Robert N M Watson
Computer Laboratory
University of Cambridge