cvs commit: src/sys/dev/io iodev.c
brde at optusnet.com.au
Mon Aug 11 16:36:46 UTC 2008
On Sat, 9 Aug 2008, John Baldwin wrote:
> On Saturday 09 August 2008 12:35:50 am Bruce Evans wrote:
>> On Fri, 8 Aug 2008, John Baldwin wrote:
>>> On Friday 08 August 2008 09:43:56 am Ed Schouten wrote:
>>>> ed 2008-08-08 13:43:56 UTC
>>>> Apart from this change, I think some fishy things may happen when
>>>> using /dev/io in multithreaded applications. I haven't tested, but
>>>> looking at the code, the flag doesn't get cleared when close() is called
>>>> from another thread, but this may not be this important.
>> Of course it isn't. The flag only gets cleared on last close. Threads
>> probably involve mainly file descriptors, and for file descriptors close()
>> doesn't even reach vn_close() until the fd reference count drops to 0.
>> Then for files, vn_close() normally doesn't reach device close until the
>> file reference count drops to 0.
>>> It should be setting D_TRACKCLOSE though so that close() reliably clears
>>> the flag even in single-threaded processes. You can still get odd
>>> behavior if
>> You mean "should _not_ be setting D_TRACKCLOSE", so that close() works
>> close() can never reliably clear the flag, since even vn_close() is not
>> reached after open()/dup()/close() or open()/fork()/close()...
>>> you explicitly open it twice in an app and then close one of the two
>>> fd's. You will no longer have IO permission even though you still have
>>> one fd open. However, if you do that I think you deserve what you asked
>>> for. :)
>> You asked for normal last-close behaviour and deserve that not being broken
>> by setting D_TRACKCLOSE.
>> D_TRACKCLOSE allows different handling of non-last closes, but this is
>> rarely wanted, and in theory it allows better determination of last
>> closes, since vfs doesn't count last closes right. However, it is
>> difficult to count last closes right, and most uses of D_TRACKCLOSE handle
>> last closes are worse than vfs:
>> - ast: uses D_TRACKCLOSE to trash (write a filemark) and/or rewind tapes
>> on non-last closes. After the trash and rewind, astclose() uses
>> count_dev() to avoid doing a full hardware close if the device is still
>> open. This involves using count_dev(), but count_dev() is just an
>> interface to the vfs count, so it miscounts in the same way as vfs. In
>> particular, it doesn't count incompleted opens. I don't know if ast's
>> locking is sufficient to prevent close() being called while a new open
>> is in progress. For most devices, it isn't.
>> - drm: it uses its own count of opens and closes, and it doesn't use
>> count_dev(). The own count has some chance of working, but needs
>> delicate locking of device open and device close to prevent them
>> running into each other, and some defense against the vfs bugs which
>> at least used to result in missing and/or extra closes.
>> - smb: like drm, except it limits its own count to 0 or 1 to give half-
>> baked exclusive access (fork() and dup(), etc, still give non-exclusive
>> access). D_TRACKCLOSE thus has no effect except for its interaction
>> with other bugs.
>> - geom: unlike most or all of the others, this may actually need
>> D_TRACKCLOSE, to implement multiple logical drivers per physical device.
>> I don't understand its details.
>> - apm: like drm
>> - bpf: like smb
> You failed to note that not using D_TRACKCLOSE doesn't actually reliable close
No, I noted that vfs doesn't count last closes right. Unreliability of
last-close is a special case of that. But using D_TRACKCLOSE mostly makes
things worse. To avoid the vfs bugs in tracked closes, it is necessary
for individual drivers to understand vfs's bugs better than vfs does.
> I set D_TRACKCLOSE on bpf because under load at work (lots of
> concurrent nmap's) bpf devices would sometimes never get fully closed, so
> you'd have an unopened bpf device that was no longer available.
It is unclear how that helps. vfs's bugs must be really large for the
last-close to go completely missing. kib fixed some of them a few months
ago (before you changed bpf I think). bpf is especially simple since it
is exclusive-accesses, so the vfs bugs should be smaller for it.
> gives much saner semantics for devices, each open() of an fd calls
> the 'd_open' method, and the last close of that fd (which could be in another
> process) calls 'd_close'.
This is insane semantics for most devices, and D_TRACKCLOSE doesn't give it.
Most devices don't care about the difference between file descriptors and
files -- they want d_close to be called only when the last file descriptor
on the device is closed. !D_TRACKCLOSE gives this, modulo the vfs bugs.
D_TRACKCLOSE doesn't give these semantics: revoke() closes everything
in 1 step and then calls d_close(), giving an N-1 correspondence between
d_open()s and d_close()s (modulo the vfs bugs giving a fuzzier
correspondence). (IIRC, at least before kib's fixes, it was possible
for revoke() to cause any of 0, 1 or 2 calls to d_close(). I think 1
call always gets as far as devfs_close() but this is only guaranteed
to reaach d_close() with D_TRACKCLOSE.) This is not a bug, but all
drivers that depend on the correspondence being 1-1 are broken. Drivers
that enforce "exclusive" access like bpf have N=1, so they probably
work modulo the vfs bugs, and they have a good chance of not being
affected by the vfs bug that gives an extra devfs_close() and thus an
> /dev/io in particular is quite bogus since even though a child process
> inherits the file descriptor, it doesn't inherit the behavior of
> having /dev/io open.
Not just child processes. The same process disinherits the behaviour on
exec though it normally keeps the fd open.
More information about the cvs-all