svn commit: r268087 - head/sys/kern

Konstantin Belousov kostikbel at gmail.com
Tue Jul 1 12:31:04 UTC 2014


On Tue, Jul 01, 2014 at 01:56:12PM +0200, Mateusz Guzik wrote:
> On Tue, Jul 01, 2014 at 02:42:45PM +0300, Konstantin Belousov wrote:
> > Old code did the malloc(M_WAITOK) call in crget() before the text vnode
> > was locked.  After your change, the crdup() is called with the vnode locked.
> > Witness would not tell you that anything is wrong there, but the new
> > code is worse than the previous structure, even if malloc() was sometimes
> > done when not needed.
> > 
> > To satisfy the memory request from malloc(), pagedaemon or laundry may
> > need to lock the vnode, which creates a circular dependency.  Pagedaemon
> > locks vnodes with timeout, which just means that it would not be able
> > to clean pages while execve() is stuck in malloc(M_WAITOK), while
> > laundry takes the vnode lock without timeout, hanging until the malloc
> > request is satisfied.
> > 
> > The rule is, do not allocate memory while vnodes are locked.  It is not
> > always followed, but it makes no sense to change existing correct code
> > to break the pattern.
> 
> Right, my bad. This was intended to be a minor cleanup, I'm happy to
> revert if you want.
> 
> Note that current code relocks the vnode already, so there should be no
> harm doing the same in 'else' case. (Although LK_RETRY looks somewhat
> fishy in here.)
> 
> That said I propose the following:
> diff --git a/sys/kern/kern_exec.c b/sys/kern/kern_exec.c
> index cce687b..9b3a99d 100644
> --- a/sys/kern/kern_exec.c
> +++ b/sys/kern/kern_exec.c
> @@ -716,11 +716,11 @@ interpret:
>  		VOP_UNLOCK(imgp->vp, 0);
>  		setugidsafety(td);
>  		error = fdcheckstd(td);
> -		vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
>  		if (error != 0)
>  			goto done1;
>  		newcred = crdup(oldcred);
>  		euip = uifind(attr.va_uid);
> +		vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
>  		PROC_LOCK(p);
>  		/*
>  		 * Set the new credentials.
This is definitely fine.
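
To illustrate the ordering in this hunk (allocate with no lock held, take
the lock afterwards), here is a minimal userland sketch.  Everything in it
is a hypothetical stand-in, not kernel code: struct image plays the role of
the vnode, a pthread mutex plays the role of the vnode lock, and cred_dup()
plays the role of crdup().  The alloc_seen_lock_held flag exists only so
the sketch can check that no allocation happened under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userland stand-ins; these are not the kernel types. */
struct cred {
    int uid;
    int gid;
};

struct image {
    pthread_mutex_t lock;   /* stands in for the vnode lock */
    int locked;             /* bookkeeping for this sketch only */
};

/* Set to 1 if an allocation is ever attempted with the lock held. */
static int alloc_seen_lock_held;

static void image_lock(struct image *img)
{
    pthread_mutex_lock(&img->lock);
    img->locked = 1;
}

static void image_unlock(struct image *img)
{
    img->locked = 0;
    pthread_mutex_unlock(&img->lock);
}

/* Stand-in for crdup(): a possibly blocking allocation plus a copy. */
static struct cred *cred_dup(const struct image *img, const struct cred *old)
{
    struct cred *newcred;

    if (img->locked)
        alloc_seen_lock_held = 1;
    newcred = malloc(sizeof(*newcred));
    if (newcred != NULL)
        *newcred = *old;
    return newcred;
}

/*
 * Called with the lock held and returns with it held again, but the
 * allocation itself runs with the lock dropped.
 */
static struct cred *dup_cred_unlocked(struct image *img, const struct cred *old)
{
    struct cred *newcred;

    assert(img->locked);
    image_unlock(img);
    newcred = cred_dup(img, old);
    image_lock(img);
    return newcred;
}
```

Anything read from the object before the unlock has to be treated as
possibly stale once the lock is retaken; the sketch does not model that.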

> @@ -764,7 +764,9 @@ interpret:
>  		if (oldcred->cr_svuid != oldcred->cr_uid ||
>  		    oldcred->cr_svgid != oldcred->cr_gid) {
>  			PROC_UNLOCK(p);
> +			VOP_UNLOCK(imgp->vp, 0);
>  			newcred = crdup(oldcred);
> +			vn_lock(imgp->vp, LK_SHARED | LK_RETRY);
>  			PROC_LOCK(p);
>  			change_svuid(newcred, newcred->cr_uid);
>  			change_svgid(newcred, newcred->cr_gid);
Use of LK_RETRY is fine as long as errors from VOPs which actually
access the vnode are checked.  It means that a reclaimed vnode would
be detected later.

In fact, could the vnode unlock be moved much earlier, in particular
to avoid the same unlock/lock in the pmc hook call?  The only use
for the vnode after the VREF() is done, as far as I can see, is to check
for MNT_NOSUID.  Can we test this earlier and cache the result?
I do not think that the possible race with the flag changing under us
matters.
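
A rough userland sketch of the "test earlier and cache the result" idea
follows.  All names are hypothetical stand-ins: struct sk_vnode and
struct sk_mount are not the kernel structures, SK_MNT_NOSUID is a made-up
flag value for the sketch, and clearing the locked field stands in for
VOP_UNLOCK().  It only shows the shape of the change, not where it would
go in kern_exec.c.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_MNT_NOSUID 0x08u     /* hypothetical flag value for this sketch */

struct sk_mount {
    unsigned flags;
};

struct sk_vnode {
    struct sk_mount *mount;
    bool locked;                /* bookkeeping for this sketch only */
};

/*
 * Read the mount flag once while the vnode is still locked, then drop
 * the lock.  Later code uses the returned snapshot and does not touch
 * the vnode again.
 */
static bool snapshot_nosuid(struct sk_vnode *vp)
{
    bool nosuid;

    assert(vp->locked);
    nosuid = (vp->mount->flags & SK_MNT_NOSUID) != 0;
    vp->locked = false;         /* stands in for VOP_UNLOCK() */
    return nosuid;
}

/* The set-id decision is made from the cached value only. */
static bool may_change_ids(bool nosuid, bool file_is_setid)
{
    return file_is_setid && !nosuid;
}
```

If the flag changes after the snapshot is taken, the decision still uses
the old value; that is the race mentioned above, which the sketch accepts
by design.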

> @@ -841,6 +843,7 @@ interpret:
>  
>  	SDT_PROBE(proc, kernel, , exec__success, args->fname, 0, 0, 0, 0);
>  
> +	VOP_UNLOCK(imgp->vp, 0);
>  done1:
>  	/*
>  	 * Free any resources malloc'd earlier that we didn't use.
This change is fine but unrelated.  There is no harm in calling free()
while holding the vnode lock.

> @@ -849,7 +852,6 @@ done1:
>  		uifree(euip);
>  	if (newcred != NULL)
>  		crfree(oldcred);
> -	VOP_UNLOCK(imgp->vp, 0);
>  
>  	/*
>  	 * Handle deferred decrement of ref counts.
> -- 
> Mateusz Guzik <mjguzik gmail.com>