Approval request of some additions to src/sys/kern/vfs_subr.c
and src/sys/sys/vnode.h
Daichi GOTO
daichi at freebsd.org
Mon Apr 28 17:26:49 UTC 2008
Kostik Belousov wrote:
> On Mon, Apr 28, 2008 at 11:36:55PM +0900, Daichi GOTO wrote:
>> Thanks for your response and explanation :)
>>
>> Kostik Belousov wrote:
>>> On Mon, Apr 28, 2008 at 06:37:09PM +0900, Daichi GOTO wrote:
>>>> Kostik Belousov wrote:
>>>>> On Fri, Apr 25, 2008 at 07:21:20PM +0900, Daichi GOTO wrote:
>>>>>> Hi Konstantin :)
>>>>>>
>>>>>> To fix a unionfs issue of
>>>>>> http://www.freebsd.org/cgi/query-pr.cgi?pr=109377,
>>>>>> we need to add new functions
>>>>>>
>>>>>> void vkernrele(struct vnode *vp);
>>>>>> void vkernref(struct vnode *vp);
>>>>>>
>>>>>> and one value
>>>>>>
>>>>>> int v_kernusecount; /* i ref count of kernel */
>>>>>>
>>>>>> to src/sys/sys/vnode.h and src/sys/kern/vfs_subr.c.
>>>>>>
>>>>>> Unionfs panics when a lower fs layer is forcibly unmounted with
>>>>>> "umount -f". To avoid this issue, we have added a "v_kernusecount"
>>>>>> field, meaning "the number of references the kernel holds on this
>>>>>> vnode". vkernrele() and vkernref() are the functions that manage
>>>>>> "v_kernusecount".
>>>>>>
>>>>>> Please review them and give us your approval or some comments!
>>>>> There is already the vnode reference count. From your description, I
>>>>> cannot understand how the kernusecount would prevent the panic when
>>>>> forced unmount is performed. Could you, please, show the actual code?
>>>>> The PR mentioned does not contain any patch.
>>>> Our patch avoids the kernel panic by making the "umount -f" operation
>>>> fail with EBUSY.
>>>>
>>>> In the current implementation (without our patch), "umount -f" tries to
>>>> release vnodes regardless of their reference counts. As a result,
>>>> unionfs and nullfs access invalid vnodes, which leads to a kernel
>>>> panic. To prevent this, we need some mechanism that refuses the
>>>> unmount in invalid cases (e.g., a filesystem within a unionfs stack
>>>> must be unmounted in the correct order), and we think the current
>>>> vnode reference count is not enough to realize that.
>>>>
>>>> If you have any idea how to achieve the same result with the current
>>>> vnode reference count, would you please tell us? :)
>>>>
>>>>> The problem you described is common for kernel code, and the right
>>>>> way to handle it, for now, is to keep the refcount _and_ check for
>>>>> the forced reclaim.
>>> Your patch in essence disables the forced unmount. I would object against
>>> such decision.
>> Oooooo.... OK. We understand.
>>
>>> Even if taking this direction, I believe a cleaner solution would be
>>> to introduce a counter that disables the (forced) unmount into the
>>> struct mount, instead of the struct vnode. With the counter in the
>>> vnode, the unmount -f behaviour is non-deterministic and dependent on
>>> the presence of the cached vnodes of the upper layer. The mount counter
>>> would be incremented by the unionfs cover mount. But, as I said above,
>>> this looks like a wrong solution.
>>>
>>> The right way to handle the forced reclaim with the current VFS is to
>>> add the explicit checks for the reclaimed vnodes where it is needed. The
>>> vnode cannot be reclaimed while the vnode lock is held. When obtaining
>>> the vnode lock, the reclamation can be detected. For instance, a
>>> vget() without LK_RETRY should be checked for ENOENT.
>> Ultimately, we want to check whether a vnode has been released in
>> places where unionfs cannot know about it. If we can make that check,
>> our patch is not needed to solve this issue.
>>
>> Would you please tell us how to check whether a target vnode has been
>> released before accessing it?
>
> The basic rules of our VFS are:
> 1. You _must_ hold the vnode unless the vnode is locked. Hold count
> prevents the vnode memory from being reused and guarantees the
> validity of the counters, v_vnlock, v_mount and vop (but please note
> that validity != stability). E.g., v_mount may be NULLed and vop
> become the deadfs_vop due to reclamation.
> 2. The vnode lock is held when the vnode is vgone(9)'ed. In other
> words, if you have a pointer to a non-reclaimed vnode that
> is locked, the vnode cannot be reclaimed until the lock is released.
> 3. The verbs that lock a vnode (vget() and vn_lock(9)) have two modes
> of operation.
> - If you specify LK_RETRY in the lock flags, you will get the
> vnode locked even if it has been reclaimed.
> - If you do not specify LK_RETRY, you will get ENOENT for a
> reclaimed vnode.
> [See the #1 for the reason why you must have a vnode held while
> calling vget() or vn_lock()].
> 4. The reclaimed vnode has the VI_DOOMED flag set; you must hold the
> vnode interlock to check the content of v_iflag. Most filesystems,
> as opposed to the VFS, use the other technique to detect the reclaimed
> vnode, if needed. They clear the v_data in the vop_reclaim, and
> verification of the (v_data != NULL) is enough to check for reclamation.
>
> A very good example of the practical usage of the rules above is the
> nullfs routines null_reclaim(), null_lock() and null_nodeget().
Thanks for your explanation! We'll research this and try to find a new
solution for this issue :)
>>> You said that that nullfs is vulnerable to the problem. Could you,
>>> please, point me to the corresponding stack trace ? At least, the nullfs
>>> vop_lock() seems to carefully check the possible problems.
--
Daichi GOTO, http://people.freebsd.org/~daichi
More information about the freebsd-fs mailing list