svn commit: r300423 - in head/sys: fs/ext2fs ufs/ffs
Pedro Giffuni
pfg at FreeBSD.org
Tue May 24 14:47:28 UTC 2016
On 05/24/16 07:20, Bruce Evans wrote:
> On Sun, 22 May 2016, Kevin Lo wrote:
>
>> Log:
>> arc4random() returns 0 to (2**32)−1, use an alternative to initialize
>> i_gen if it's zero rather than a divide by 2.
>>
>> With inputs from delphij, mckusick, rmacklem
>>
>> Reviewed by: mckusick
>> ...
>> Modified: head/sys/fs/ext2fs/ext2_alloc.c
>> ==============================================================================
>>
>> --- head/sys/fs/ext2fs/ext2_alloc.c Sun May 22 14:13:20 2016
>> (r300422)
>> +++ head/sys/fs/ext2fs/ext2_alloc.c Sun May 22 14:31:20 2016
>> (r300423)
>> @@ -408,7 +408,8 @@ ext2_valloc(struct vnode *pvp, int mode,
>> /*
>> * Set up a new generation number for this inode.
>> */
>> - ip->i_gen = arc4random();
>> + while (ip->i_gen == 0 || ++ip->i_gen == 0)
>> + ip->i_gen = arc4random();
>
> This is a correct implementation of ffs's intended method, but ffs's
> intended method is wrong (see below for its wrongness).
>
> Correctness depends on i_gen having type uint32_t in ext2fs. This
> makes the code +ip_i_gen undead, so i_gen is re-randomized occasionally
> (averaged over all inodes, once for every ~2 billionth reuse of an
> inode, which is practically never. Bugs in ffs prevent it being done
> even that often there. So the re-randomization is almost useless. I
> think it is slighty worse than useless, since it may give the same
> i_gen immediately, while always incrementing (but skipping 0) always
> gives a new i_gen.
>
> ext2fs might not need this at all. For ffs, then special case of
> i_gen == 0 must be handled because we still pretend to support file
> systems created by newfs versions almost 25 years old. newfs didn't
> initialize di_gen back then, so all inodes started with di_gen == 0.
> ext2fs is less that 25 years old, so it has less history to support.
>
I agree ext2fs, and likely UFS2, can handle the case of di_gen == 0.
The additional question is where do we actually need to do randomization
in ext2fs? UFS does the randomization when formatting
so ffs_vget doesn't do randomization at all. arc4random is AFAICT,
only required for very old UFS1.
I lost interest in ext2fs and can't be bothered to look in linux ;).
>>
>> vfs_timestamp(&ts);
>> ip->i_birthtime = ts.tv_sec;
>>
>> Modified: head/sys/fs/ext2fs/ext2_vfsops.c
>> ==============================================================================
>>
>> --- head/sys/fs/ext2fs/ext2_vfsops.c Sun May 22 14:13:20 2016
>> (r300422)
>> +++ head/sys/fs/ext2fs/ext2_vfsops.c Sun May 22 14:31:20 2016
>> (r300423)
>> @@ -998,7 +998,8 @@ ext2_vget(struct mount *mp, ino_t ino, i
>> * already have one. This should only happen on old filesystems.
>> */
>> if (ip->i_gen == 0) {
>> - ip->i_gen = random() + 1;
>> + while (ip->i_gen == 0)
>> + ip->i_gen = arc4random();
>
> This is correct, but might be unnecessary (see above). "on old
> filesystems"
> was copied from ffs where it meant "on file systems created by versions
> of newfs that didn't initialize di_gen". Such file systems are restricted
> to ffs1. But i_gen stil occurs due to bugs in newfs -- it is missing this
> loop, so it sometimes sets i_gen to the random value of 1, and we
> can't/don't
> tell the difference between this and unitialized.
>
> I think ext2fs is more like ffs2 in a relevant way here. Both have the
> feature of speeding up newfs by only writing a few inodes. Thus most
> inode allocations occur in the kernel, and it hardly matters if newfs
> initialized i_gen for the few inodes that it initialized. extfs and
> ext2fs also have an inode non-clearing feature on unlink that might
> be relevant. I forget if ffs2 has this. So the code should be simplified
> by never expecting newfs to initialize i_gen to nonzero. It already
> almost does this, except in the comment. For this, it is necessary to
> skip i_gen == 0 (mod 2**32).
>
>> if ((vp->v_mount->mnt_flag & MNT_RDONLY) == 0)
>> ip->i_flag |= IN_MODIFIED;
>> }
>>
>> Modified: head/sys/ufs/ffs/ffs_alloc.c
>> ==============================================================================
>>
>> --- head/sys/ufs/ffs/ffs_alloc.c Sun May 22 14:13:20 2016 (r300422)
>> +++ head/sys/ufs/ffs/ffs_alloc.c Sun May 22 14:31:20 2016 (r300423)
>> @@ -1102,8 +1102,8 @@ dup_alloc:
>> /*
>> * Set up a new generation number for this inode.
>> */
>> - if (ip->i_gen == 0 || ++ip->i_gen == 0)
>> - ip->i_gen = arc4random() / 2 + 1;
>> + while (ip->i_gen == 0 || ++ip->i_gen == 0)
>> + ip->i_gen = arc4random();
>
> This is broken due to an old type error. In ffs, i_gen has type uint64_t,
> so it is physically impossible for ++ip->i_gen to wrap to 0.
> (i_gen is initialized from di_gen, and that has type uint32_t, so
> i_gen is initially <= UINT32_MAX and it would take 584 years to wrap
> with the modest inode recycling period of 1 nanosecond.)
> So the above is an obfuscated way of writing:
>
> if (ip->i_gen == 0) {
> /*
> * This value means uninitialized (or a bug). Init now.
> * The loop is to not have the usual bug here.
> */
> do
> ip->i_gen = arc4random();
> while (ip->i_gen == 0);
> } else
> ip->i_gen++;
>
> Now it is clear that i_gen can grow far above UINT32_MAX. But usually
> it doesn't. Growth above UINT32_MAX gets truncated when the vnode is
> recycled. Overflow occurs with i_gen is stored to di_gen, and growth
> resumes at a small truncated value.
>
> The type error gives truncation on most uses of i_gen: di_gen, ufid_gen
> and ueh_i_gen are all 32 bits. va_gen is 32-bits on 32-bit arches but
> is 64 bits on 64-bit arches. Various bugs result. The bugs are mostly
> features. It is not very useful to re-randomize on reaching the 32-bit
> boundary. The bugfeature normally avoids this. If i_gen were not
> truncated to 32 bits when the vnode is recycled (or on unmount), and
> if it were consistently truncated (that means, truncate it to 32 bits
> in va_gen on 64-bit arches), then the bugfeature would work perfectly.
> The top 32 bits in i_gen would then be unused except to record history
> for a few trillion years for most inodes.
>
>> DIP_SET(ip, i_gen, ip->i_gen);
>> if (fs->fs_magic == FS_UFS2_MAGIC) {
>> vfs_timestamp(&ts);
>> @@ -2080,7 +2080,8 @@ gotit:
>> bzero(ibp->b_data, (int)fs->fs_bsize);
>> dp2 = (struct ufs2_dinode *)(ibp->b_data);
>> for (i = 0; i < INOPB(fs); i++) {
>> - dp2->di_gen = arc4random() / 2 + 1;
>> + while (dp2->di_gen == 0)
>> + dp2->di_gen = arc4random();
>
> This seems to be correct. It is only for the ffs2 case, and di_gen was
> initialized to 0 by the bzero(), but the while loop is easier to read
> that the more optimal do-while loop that I wrote above.
>
>> dp2++;
>> }
>> /*
>>
>> Modified: head/sys/ufs/ffs/ffs_vfsops.c
>> ==============================================================================
>>
>> --- head/sys/ufs/ffs/ffs_vfsops.c Sun May 22 14:13:20 2016
>> (r300422)
>> +++ head/sys/ufs/ffs/ffs_vfsops.c Sun May 22 14:31:20 2016
>> (r300423)
>> @@ -1768,7 +1768,8 @@ ffs_vgetf(mp, ino, flags, vpp, ffs_flags
>> * already have one. This should only happen on old filesystems.
>> */
>> if (ip->i_gen == 0) {
>> - ip->i_gen = arc4random() / 2 + 1;
>> + while (ip->i_gen == 0)
>> + ip->i_gen = arc4random();
>
> This also seems to be correct. Now the compiler can easily optimize the
> while loop to a do-while loop, since a previous check for i_gen == 0 is
> visible.
>
> "should" in the comment is correct, except this should never happen
> now since file systems that were old when it was written now shouldn't
> exist. However, this does happen now, mostly for new file systems,
> due to a bug in newfs: newfs is missing this while loop, so it sometimes
> initializes di_gen to 0. Then we can't/don't tell if di_gen was
> initialized to a random value, so we-re-randomize it.
>
>> if ((vp->v_mount->mnt_flag & MNT_RDONLY) == 0) {
>> ip->i_flag |= IN_MODIFIED;
>> DIP_SET(ip, i_gen, ip->i_gen);
>
> All versions of newfs seem to have had buggy initialization:
> - they didn't initialize di_gen (except to 0) until 1997
> - the first 1997 version used /dev/urandom to initialize "long ret;".
> Assignment of this to "int32_t di_gen;" overflowed on 64-bit arches
> (alpha?). This gave a negative value which gave further overflows
> or suprising sign extensions mainly when assigned to "u_long va_gen'";
> most other types were consistently signed.
> - the next 1997 version used random(). This fixed the generation of
> negative values. 0 was still generated
> - the 2003 version used arc4random(). This gave negative values again.
> 0 was still generated. This is essentially the current version. The
> current version uses a wrapper function like the first 1997 version.
> - newfs doesn't seem to have ever had the version that did *random() / 2
> + 1
> like the kernel did. It thus escaped having the overflow/sign extension
> bugs that the kernel had. Dividing by 2 was apparently supposed to avoid
> these bugs. It worked with random() since random() returns at most
> INT32_MAX. But arc4random() returns at mist UINT32_MAX. Dividing this
> by 2 gives (u_int)INT32_MAX and adding 1 gives a value that overflows
> when assigned to int32_t. This was later fixed by changing lots of
> int32_t to uint32_t.
>
> I'm not sure about the security aspects of randomizing i_gen. The re-
> randomization is so accidental and infrequent that security must be
> unimportant. But I think a unique generation (over all inodes on all
> file systems) would be better than any randomness. FreeBSD-1 was closer
> to having that -- i_gen was initialized to the global nextgennumber++ if
> it was zero; it was not incremented for inode use. The globabl would
> have to be 64 bits now. Ensuring uniqueness is not easy since it means
> that you have to check inode numbers on not-very-trusted newly mounted
> file systems against all inode numbers already in use. Perhaps it works
> to always use a new set for every new mount (ignore the ones on disk).
>
> Bruce
Yes, I have always wondered: who is detecting if we have a repeated
generation number, the end user? And then if we get a repeated
generation number and try to fix it, how do we know that a new
random value we generate to replace the duplicate is unused?
I am rather afraid to hear the answers :-/.
Pedro.
More information about the svn-src-head
mailing list