random(4) plugin infrastructure for multiple RNG in a modular fashion

Bruce Evans brde at optusnet.com.au
Thu Aug 8 23:17:21 UTC 2013


On Thu, 8 Aug 2013, Mark R V Murray wrote:

> I still want to get back something like the original get_cyclecount(); simple and quick. I don't care what it's called, but it doesn't need to be the massive thing that the current get_cyclecount() has grown to be on x86. rdtsc(), I think it was.

The simple and quick version cannot exist, and never did.  The original
i386 version was:

1.50         (markm    21-Nov-00): /*
1.50         (markm    21-Nov-00):  * Return contents of in-cpu fast counter as a sort of "bogo-time"
1.50         (markm    21-Nov-00):  * for non-critical timing.
1.50         (markm    21-Nov-00):  */
1.50         (markm    21-Nov-00): static __inline u_int64_t
1.50         (markm    21-Nov-00): get_cyclecount(void)
1.50         (markm    21-Nov-00): {
1.50         (markm    21-Nov-00): #if defined(I386_CPU) || defined(I486_CPU)
1.50         (markm    21-Nov-00): 	struct timespec tv;
1.50         (markm    21-Nov-00): 
1.50         (markm    21-Nov-00): 	if ((cpu_feature & CPUID_TSC) == 0) {
1.50         (markm    21-Nov-00): 		nanotime(&tv);
1.50         (markm    21-Nov-00): 		return (tv.tv_sec * (u_int64_t)1000000000 + tv.tv_nsec);
1.50         (markm    21-Nov-00): 	}
1.50         (markm    21-Nov-00): #endif
1.50         (markm    21-Nov-00): 	return (rdtsc());
1.50         (markm    21-Nov-00): }

This is not so simple, and is unquick if there is no TSC.  If I386_CPU or
I486_CPU is configured, then it is suboptimal even if there is a TSC.
Other arches are even further from always having a TSC.  The simple and
quick version would always return 0 or a kernel global like time.tv_nsec
if there is no TSC and no other readable frequently changing timer or
noise source that can be read almost as fast as memory.  It wouldn't
guarantee any entropy.
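
A minimal sketch of what such a never-slow version could look like.  The
names fast_counter and coarse_nsec are hypothetical, not kernel symbols:
fast_counter is assumed to be non-NULL only when a cheap hardware counter
exists, and coarse_nsec stands in for a global like time.tv_nsec.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*counter_fn)(void);

/* Hypothetical: set once at boot, only if a cheap hardware counter exists. */
static counter_fn fast_counter = NULL;

/* Hypothetical stand-in for a kernel global like time.tv_nsec. */
static volatile uint64_t coarse_nsec = 0;

/*
 * The fallback path costs one memory read, so it is always quick, but
 * the value it returns changes rarely, so no entropy is guaranteed.
 */
static inline uint64_t
bogo_cyclecount(void)
{
	if (fast_counter != NULL)
		return (fast_counter());
	return (coarse_nsec);
}
```

With no counter installed, every call between updates of coarse_nsec
returns the same value, which is the "no entropy" case described above.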

The current version is only slightly unsimpler and unquicker:
- on amd64, it is still just inline rdtsc().
On other arches, the nanotime() in it was first improved to binuptime().
This also gave more noise in the extra low bits, and mixing of the bits
made it less abusable as a timer.  The latter has been broken on some
arches.
- on arm, the bits are still mixed by ((sec << 56) | (frac >> 8)) (8 bits
   of sec and 56 bits of frac).  I don't like losing some low bits (it is
   better to xor things), but the result is fairly unusable as a timer
   and perhaps there is nothing useful in the low bits on arm (it takes
   a very high frequency clock like a TSC and/or delicate ntpd adjustments
   that aren't very noisy to put anything there).  The 8-bit seconds count
   isn't too good when KTR abuses get_cyclecount().
- on i386, get_cyclecount() is still inline, but the inline just calls
   the function pointer cpu_tick().  If there is a TSC, then cpu_tick
   points to an un-inline rdtsc() and the result is a slightly pessimized
   version of the above if I386_CPU or I486_CPU is configured and a
   more pessimized version of the above if neither is configured.
   Otherwise, the result is the accumulated tick count of the currently
   active timecounter.  This is much better for noise in get_cyclecount()
   and much worse for its primary purpose of timing than is binuptime()
   with bits mixed to form a timer.  The active timecounter can change,
   and then the frequency and offset of its ticker changes.  Its primary
   use is for process times, and there is some recalibration for this,
   but this is incomplete and buggy.  But for get_cyclecount(), the noise
   is a feature.  The noise from this is bad when KTR abuses get_cyclecount().
   Otherwise, this is better for get_cyclecount() than the old binuptime()
   method.
- on ia64, get_cyclecount is #define'd as another function.  The declaration
   and definition of the other function are even more obscure.  They are
   generated by a macro.  Standard namespace pollution in sys/systm.h is
   depended on to join the definitions.
- mips is like ia64 except the obfuscation chain is shorter.  <machine/cpu.h>
   provides its own namespace pollution, so sys/systm.h and its pollution
   aren't depended on...
- on powerpc, get_cyclecount() reads a counter using inline asm.  It
   spells the 2 32-bit components of the counter as essentially
   time._upper and time._lower, so it isn't clear if they are actually
   times to begin with.
- sparc64 uses inline asm to read some register which is hopefully a counter.
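
The bit mixing in the arm item can be sketched in userland C as below.
struct bt is a hypothetical stand-in for the kernel's struct bintime, and
mix_xor is only one possible reading of "better to xor things", not
anything taken from the tree.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct bintime. */
struct bt {
	int64_t  sec;	/* whole seconds */
	uint64_t frac;	/* binary fraction of a second, units of 2^-64 s */
};

/* The mixing quoted in the arm item: 8 bits of sec, top 56 bits of frac. */
static inline uint64_t
mix_shift(const struct bt *b)
{
	return (((uint64_t)b->sec << 56) | (b->frac >> 8));
}

/* One possible xor-based variant (an assumption): keeps all bits of frac. */
static inline uint64_t
mix_xor(const struct bt *b)
{
	return (((uint64_t)b->sec << 56) ^ b->frac);
}
```

With mix_shift(), sec = 0 and sec = 256 give the same result for the same
frac, so the value wraps every 256 seconds.  That is harmless for noise but
is what hurts when the result is used as a timestamp.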

So, get_cyclecount() is actually simple and quick (except for macros
hiding the simplicity) on all arches except arm and old i386.  But it
is very MD, so it takes a lot of code with different simplicity to
support it for all arches.  Still better than #ifdefing it wherever it
is used.

Bruce

