svn commit: r243311 - in head/sys: fs/ext2fs fs/msdosfs fs/nfsclient fs/nullfs fs/unionfs gnu/fs/reiserfs nfsclient ufs/ffs

Konstantin Belousov kostikbel at gmail.com
Tue Nov 20 14:11:47 UTC 2012


On Tue, Nov 20, 2012 at 01:27:07PM +0000, Attilio Rao wrote:
> On 11/20/12, Bruce Evans <brde at optusnet.com.au> wrote:
> > On Tue, 20 Nov 2012, Attilio Rao wrote:
> >
> >> On 11/20/12, Attilio Rao <attilio at freebsd.org> wrote:
> >>> On 11/20/12, Bruce Evans <brde at optusnet.com.au> wrote:
> >>>> On Tue, 20 Nov 2012, Attilio Rao wrote:
> >>>>
> >>>>> On Tue, Nov 20, 2012 at 6:20 AM, Bruce Evans <brde at optusnet.com.au>
> >>>>> wrote:
> >>>>>> On Mon, 19 Nov 2012, Attilio Rao wrote:
> >>>>>>
> >>>>>>> Log:
> >>>>>>>  r16312 is not any longer real since many years (likely since when
> >>>>>>> VFS
> >>>>>>>  received granular locking) but the comment present in UFS has been
> >>>>>>>  copied all over other filesystems code incorrectly for several
> >>>>>>> times.
> >>>>>>>
> >>>>>>>  Removes comments that makes no sense now.
> >>>>>>
> >>>>>>
> >>>>>> It still made sense (except for bitrot in the function name), but
> >>>>>> might
> >>>>>> not
> >>>>>> be true).  The code made sense with it.  Now the code makes no sense.
> >>>>>>
> >>>>>>
> >>>>>>> Modified: head/sys/ufs/ffs/ffs_vfsops.c
> >>>>>>>
> >>>>>>> ==============================================================================
> >>>>>>> --- head/sys/ufs/ffs/ffs_vfsops.c       Mon Nov 19 21:58:14 2012
> >>>>>>> (r243310)
> >>>>>>> +++ head/sys/ufs/ffs/ffs_vfsops.c       Mon Nov 19 22:43:45 2012
> >>>>>>> (r243311)
> >>>>>>> @@ -1676,14 +1676,6 @@ ffs_vgetf(mp, ino, flags, vpp, ffs_flags
> >>>>>>>         ump = VFSTOUFS(mp);
> >>>>>>>         dev = ump->um_dev;
> >>>>>>>         fs = ump->um_fs;
> >>>>>>> -
> >>>>>>> -       /*
> >>>>>>> -        * If this malloc() is performed after the getnewvnode()
> >>>>>>
> >>>>>>
> >>>>>> This malloc() didn't match the code, which uses uma_zalloc().  Old
> >>>>>> versions used MALLOC() in both the comment and the code.  ffs's
> >>>>>> comment
> >>>>>> was updated to say malloc() when the code was changed to use
> >>>>>> malloc(),
> >>>>>> then rotted when the code was changed to use uma_zalloc().  In some
> >>>>>> other file systems, the comment still said MALLOC().
> >>>>>>
> >>>>>>
> >>>>>>> -        * it might block, leaving a vnode with a NULL v_data to be
> >>>>>>> -        * found by ffs_sync() if a sync happens to fire right then,
> >>>>>>> -        * which will cause a panic because ffs_sync() blindly
> >>>>>>> -        * dereferences vp->v_data (as well it should).
> >>>>>>> -        */
> >>>>>>>         ip = uma_zalloc(uma_inode, M_WAITOK | M_ZERO);
> >>>>>>>
> >>>>>>>         /* Allocate a new vnode/inode. */
> >>>>>>>
> >>>>>>
> >>>>>> The code makes no sense now.  The comment explains why ip is
> >>>>>> allocated
> >>>>>> before vp, instead of in the natural, opposite order like it used to
> >>>>>> be.  Allocating things in an unnatural  order requires extra code to
> >>>>>> free ip when the allocation of vp fails.
> >>>>>
> >>>>> "Used to be" is very arguably. The code has been like its current form
> >>>>> many more years than the opposite (16 against 3 I think).
> >>>>> And the code makes perfectly sense if you know the history. So I don't
> >>>>> agree with you.
> >>>>
> >>>> But it shouldn't be necessary to know the history of the code to
> >>>> understand it.  The code only makes sense if its comment is not
> >>>> removed,
> >>>> or if you know the history of the code so that you can restore the
> >>>> removed comment.  However, if the comment makes no sense as you claim,
> >>>> then the code that it it describes makes no sense.
> >>>
> >>> The "code that makes no sense" is basically the justification to have
> >>> the allocation before the getnewvnode(). It makes no sense because the
> >>> order makes no sense (you can allocate before or after getnewvnode(),
> >>> you won't have v_data corruption as the comment claims).
> >>>
> >>> Hence the code makes no sense.
> >>
> >> Herm, s/code/comment.
> >
> > I think you are right that the comment makes no sense.  A preemptible
> > kernel may be preempted without it calling malloc() where it may block.
> > Thus ffs_fsync() and anything else that looks at the inode must be
> > doing something to avoid dereferencing v_data if the vnode is not fully
> > constructed.  This seems to be done by iterating over vnodes using
> > MNT_VNODE_FOREACH_ACTIVE*() and not making incomplete vnodes active.
> > ffs_fsync() still just blindly dereferences the inode.
> >
> > I think I am right that the code makes no sense.  It is ordered like
> > it is because placing the allocation of ip before the allocation of
> > vp used to be enough to prevent v_data being dereferenced.  This makes
> > no sense when it isn't enough.
> 
> In the past, before VFS got locking and kernel was single-threaded,
> the comment and code arranged in this way were sensitive and
> effective.
> As now this is not true anymore, there is no strict relationship
> between the getnewvnode() and sleeping points.
> It is important to remove stale comments because they confuse people,
> the porters and as you can see the code/comment has been cut&paste
> quite a bit around.
> 

Putting the issue of the comment making no sense at all aside, which is
true but tangential to what I want to note.

Although malloc(9) is not ordered with the vnode locks in the lock
order, it is still good practice to not perform allocations under the
vnode lock. The reason is that pagedaemon might need to clean and write
pages, in particular, belonging to the vnodes. Generally, if you own
vnode lock and do something which might kick pagedaemon, you are creating
troubles for pagedaemon, which would attempt to lock the vnode with
timeout, spending the precious time waiting for lock it cannot obtain.

This is less an issue for the new vnode instantiation locations, because
vnode cannot have resident pages yet. But it is good to follow the same
pattern anywhere.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 834 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/svn-src-all/attachments/20121120/db1ef51f/attachment.sig>


More information about the svn-src-all mailing list