Race in NFS lookup can result in stale namecache entries
Kostik Belousov
kostikbel at gmail.com
Thu Jan 19 14:06:25 UTC 2012
On Wed, Jan 18, 2012 at 05:07:21PM -0500, John Baldwin wrote:
...
> What I concluded is that it would really be far simpler and more
> obvious if the cached timestamps were stored in the namecache entry
> directly rather than having multiple name cache entries validated by
> shared state in the nfsnode. This does mean allowing the name cache
> to hold some filesystem-specific state. However, I felt this was much
> cleaner than adding a lot more complexity to nfs_lookup(). Also, this
> turns out to be fairly non-invasive to implement since nfs_lookup()
> calls cache_lookup() directly, but other filesystems only call it
> indirectly via vfs_cache_lookup(). I considered letting filesystems
> store a void * cookie in the name cache entry and having them provide
> a destructor, etc. However, that would require extra allocations for
> NFS lookups. Instead, I just adjusted the name cache API to
> explicitly allow the filesystem to store a single timestamp in a name
> cache entry by adding a new 'cache_enter_time()' that accepts a struct
> timespec that is copied into the entry. 'cache_enter_time()' also
> saves the current value of 'ticks' in the entry. 'cache_lookup()' is
> modified to add two new arguments used to return the timespec and
> ticks value used for a namecache entry when a hit in the cache occurs.
>
> One wrinkle with this is that the name cache does not create actual
> entries for ".", and thus it would not store any timestamps for those
> lookups. To fix this I changed the NFS client to explicitly fast-path
> lookups of "." by always returning the current directory as set up by
> cache_lookup() and never bothering to do a LOOKUP or check for stale
> attributes in that case.
>
> The current patch against 8 is at
> http://www.FreeBSD.org/~jhb/patches/nfs_lookup.patch
...
So now you unconditionally add 8*2+4 bytes to each namecache entry on amd64.
The current size of the invariant part of struct namecache on amd64 is 72 bytes,
so the addition of 20 bytes looks slightly excessive. I am not sure about the
typical distribution of the namecache nc_name length, so it is not obvious
whether the change affects memory usage significantly.
A flag could be added to nc_flags to indicate the presence of the timestamp.
The timestamps would be conditionally placed after nc_nlen; we could probably
use a union to ease the access. Then, the direct dereferences of
nc_name would need to be converted to some inline function.
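A rough sketch of that layout, only to illustrate the idea. The structures
below are simplified stand-ins, not the real struct namecache, and the flag
name NCF_TS_SKETCH, the two struct names, and the helper functions are all
hypothetical. It uses two structs with a common prefix instead of a union,
and userland allocation instead of the kernel allocator.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical flag bit: set when the entry carries a timestamp. */
#define NCF_TS_SKETCH 0x01

/* Entry without a timestamp: the name directly follows nc_nlen. */
struct nc_plain {
	unsigned char nc_flags;
	unsigned char nc_nlen;
	char nc_name[];
};

/* Entry with a timestamp: the time fields sit between nc_nlen and the name. */
struct nc_ts {
	unsigned char nc_flags;
	unsigned char nc_nlen;
	struct timespec nc_time;
	int nc_ticks;
	char nc_name[];
};

/* Bytes needed for an entry holding a name of the given length plus NUL. */
static inline size_t
nc_alloc_size(size_t nlen, int with_ts)
{
	size_t hdr = with_ts ? offsetof(struct nc_ts, nc_name) :
	    offsetof(struct nc_plain, nc_name);

	return (hdr + nlen + 1);
}

/* Accessor that replaces direct dereferences of nc_name. */
static inline char *
nc_get_name(void *ncp)
{
	/* Both layouts start with nc_flags, so the flag can be read first. */
	if (((struct nc_plain *)ncp)->nc_flags & NCF_TS_SKETCH)
		return (((struct nc_ts *)ncp)->nc_name);
	return (((struct nc_plain *)ncp)->nc_name);
}

/* Allocate an entry in either layout; returns NULL on failure. */
static inline void *
nc_alloc(const char *name, int with_ts)
{
	size_t nlen;
	void *ncp;

	assert(name != NULL);
	nlen = strlen(name);
	if (nlen > UCHAR_MAX)
		return (NULL);
	ncp = calloc(1, nc_alloc_size(nlen, with_ts));
	if (ncp == NULL)
		return (NULL);
	((struct nc_plain *)ncp)->nc_flags = with_ts ? NCF_TS_SKETCH : 0;
	((struct nc_plain *)ncp)->nc_nlen = (unsigned char)nlen;
	memcpy(nc_get_name(ncp), name, nlen + 1);
	return (ncp);
}
```

In this sketch only entries created with the flag pay for the timestamp
fields, at the cost of one flag test on each access to the name.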
I can do this after your patch is committed, if you consider the memory
usage savings worth it.