[saturnero@freesbie.org: Weird behaviour of mount_unionfs with
executables]
Suleiman Souhlal
refugee at astraldream.net
Wed Jul 6 20:10:01 GMT 2005
Hi,
On Jul 4, 2005, at 2:52 PM, Bruce Evans wrote:
> On Mon, 4 Jul 2005, Dario Freni wrote:
>
>
>> On Sun, Jul 03, 2005 at 08:43:01PM -0400, Stephan Uphoff wrote:
>>
>>> YES - and it looks like just specifying noatime for the union mount
>>> should fix the copy problem for FreeSBIE.
>>>
>>
>> Correct. Using -o noatime when mounting the unionfs resolves all
>> above
>> problems. Thank you very much.
>>
>
> Hmm. This shouldn't work, but helps avoid the copy due to a layering
> violation. In kern_execve(), the test of MNT_NOATIME is supposed to
> be just an optimization to avoid making null VOP_SETATTR(), but it
> only looks at the mount flag for the top layer so it doesn't work
> right if the mount flag is different in lower layers. This gives the
> bizarre behaviour that the noatime option for unionsfs "works" for
> exec() but not for read().
>
> I could only see the duplication problem for mount_unionfs -b. There
> are lots of bugs in that case:
> = no copy is made for the modification caused by read(), despite
> mount_unionfs(8) claiming that:
>
> Any other operation which would ultimately require
> modification to the
> ^^^
> lower layer fails with EROFS.
>
> - the error for failure is often not EROFS (but this may be a side
> effect).
> - exec() makes a bad copy. It fails to copy the access times and
> doesn't
> even copy the permissions; in particular, it loses the x bits so the
> file becomes non-executable.
> - touch(1) makes a bad copy like exec() and then fails to actually
> touch
> the file except for the clobbered times in the non-copy being
> current.
> (touch tries utimes(2) twice, first with the current times and
> then with
> a null times pointer. Both calls fail with EACCES. I think a
> bad copy
> is made on the first call ...). Subsequent touch(1)'s work normally
> (... touch tries utimes(2) only once, only with the current
> times, and
> since this works there seems to be a race for the earlier calls
> to fail
> (I tested as root, so there should be no permissions problems
> with either
> call, and since the middle utimes(path, NULL) call failed the bug
> can't
> be just a stale vp after making the copy in the first call).)
>
> The comment in union_setattr() only claims to make a copy to handle
> truncation, but a copy seems to be made when any attribute is set:
>
> % static int
> % union_setattr(ap)
> % ...
> % /*
> % * Handle case of truncating lower object to zero size
> % * by creating a zero length upper object. This is to
> % * handle the case of open with O_TRUNC and O_CREAT.
> % */
> % if (un->un_uppervp == NULLVP && (un->un_lowervp->v_type ==
> VREG)) {
> % error = union_copyup(un, (ap->a_vap->va_size != 0),
> % ap->a_cred, ap->a_td);
> % if (error)
> % return (error);
> % }
>
> Copying on any change to an attribute seems right, but it doesn't
> actually handle implicit changes that don't go through setattr --
> mainly setting of atimes on read(). Copying on read() wouldn't be
> right.
>
> % % /*
> % * Try to set attributes in upper layer,
> % * otherwise return read-only filesystem error.
> % */
> % error = EROFS;
> % if ((uppervp = union_lock_upper(un, td)) != NULLVP) {
> % error = VOP_SETATTR(un->un_uppervp, ap->a_vap,
> % ap->a_cred, ap->a_td);
> % if ((error == 0) && (ap->a_vap->va_size != VNOVAL))
> % union_newsize(ap->a_vp, ap->a_vap->va_size, VNOVAL);
> % union_unlock_upper(uppervp, td);
> % }
> % return (error);
> % }
>
> The second clause in the comment doesn't match the code here -- the
> error
> is very rarely EROFS since it is usually the error returned by
> VOP_SETATTR()
> and we have avoided getting this far in most cases where VOP_SETATTR()
> would return EROFS. The comment was closer to matching the code in
> rev.1.1
> where decideding EROFS was left to lower layers.
>
> Bruce
The vnode locking is also wrong for several cases:
- According sys/kern/vnode_if.src we should return from VOP_OPEN()
with vp locked, but union_open() unlocks it before returning it.
- Same thing in union_access().
- We are not making sure we are calling VOP_CLOSE() with the vnode
exclusively locked in union_close().
- We don't lock the upper and lower vnodes when doing VOP_GETATTR()
on them in union_getattr().
- We don't lock othervp when doing VOP_STRATEGY() in union_strategy().
I think there are also other problems, but haven't had time to look
for them yet.
More information about the freebsd-fs
mailing list