i386/73921: sysctlbyname for machdep.tsc_freq doesn't handle speeds > 2^31 Hz properly

Bruce Evans bde at zeta.org.au
Sun Nov 14 01:50:36 PST 2004


The following reply was made to PR i386/73921; it has been noted by GNATS.

From: Bruce Evans <bde at zeta.org.au>
To: Ted Schundler <tschundler at scu.edu>
Cc: freebsd-gnats-submit at freebsd.org, freebsd-i386 at freebsd.org
Subject: Re: i386/73921: sysctlbyname for machdep.tsc_freq doesn't handle
 speeds > 2^31 Hz properly
Date: Sun, 14 Nov 2004 20:49:55 +1100 (EST)

 On Sun, 14 Nov 2004, Ted Schundler wrote:
 
 > >Description:
 > In an app that needs the CPU speed, I'm calling sysctlbyname("machdep.tsc_freq",.... with an 64-bit long long (so when 5GHz CPUs come around, it should still work). However I'm getting very large numbers are a result. It seems that internally it is getting the value as a 32 bit integer and converting that to 64 bits by extending the sign. It shouldn't be 32 bits to begin with imho, but even if it is, it should at least be treated as unsigned.
 
 Please limit line lengths to considerably less than 446 columns.
 
 > >How-To-Repeat:
 > On a sufficiently fast system (in my case 3.1Ghz):
 >
 >   unsigned long long sysctl_data;
 >   bl=8;
 >   sysctlbyname("machdep.tsc_freq", &sysctl_data,&bl,NULL,0);
 >   printf("cpuspeed: %llu\n",sysctl_data);
 >
 > result:
 > cpuspeed: 13817017134246676192
 >
 > should be:
 > spuspeed: 3192014560
 
 tsc_freq was converted to 64 bits in the kernel to support frequencies
 >= 2^32, but the associated changes for the sysctl are mostly wrong.
 The sysctl only properly handles the lower 32 bits of tsc_freq.  In
 particular, it returns garbage in the top bits.  Your output shows the
 garbage, but this is partly a bug in test program (you should check
 final value in `bl'; other bugs make it 4 instead of the expected 8,
 so you shouldn't use the upper 4 bytes.  This is how sysctl(1) avoids
 seeing the bug).
 
 I use the following incomplete fixes and tests:
 
 %%%
 Index: tsc.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/i386/i386/tsc.c,v
 retrieving revision 1.204
 diff -u -2 -r1.204 tsc.c
 --- tsc.c	21 Oct 2003 18:28:34 -0000	1.204
 +++ tsc.c	14 Nov 2004 06:20:12 -0000
 @@ -131,10 +131,12 @@
  sysctl_machdep_tsc_freq(SYSCTL_HANDLER_ARGS)
  {
 +	u_int freq;
  	int error;
 -	uint64_t freq;
 
  	if (tsc_timecounter.tc_frequency == 0)
  		return (EOPNOTSUPP);
  	freq = tsc_freq;
 +	if (freq != tsc_freq)
 +		return (EOVERFLOW);
  	error = sysctl_handle_int(oidp, &freq, sizeof(freq), req);
  	if (error == 0 && req->newptr != NULL) {
 @@ -145,5 +147,5 @@
  }
 
 -SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq, CTLTYPE_QUAD | CTLFLAG_RW,
 +SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq, CTLTYPE_UINT | CTLFLAG_RW,
      0, sizeof(u_int), sysctl_machdep_tsc_freq, "IU", "");
 
 @@ -153,2 +155,38 @@
  	return (rdtsc());
  }
 +
 +uint64_t tsc_freq1 = 0x0000000100000002;
 +
 +static int
 +sysctl_machdep_tsc_freq1(SYSCTL_HANDLER_ARGS)
 +{
 +	uint64_t freq;
 +	int error;
 +
 +	freq = tsc_freq1;
 +	error = sysctl_handle_opaque(oidp, &freq, sizeof(freq), req);
 +	if (error == 0 && req->newptr != NULL)
 +		tsc_freq1 = freq;
 +	return (error);
 +}
 +
 +SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq1, CTLTYPE_QUAD | CTLFLAG_RW,
 +    0, sizeof(uint64_t), sysctl_machdep_tsc_freq1, "QU", "");
 +
 +uint64_t tsc_freq2 = 0x0000000100000002;
 +
 +static int
 +sysctl_machdep_tsc_freq2(SYSCTL_HANDLER_ARGS)
 +{
 +	uint64_t freq;
 +	int error;
 +
 +	freq = tsc_freq2;
 +	error = sysctl_handle_opaque(oidp, &freq, sizeof(freq), req);
 +	if (error == 0 && req->newptr != NULL)
 +		tsc_freq2 = freq;
 +	return (error);
 +}
 +
 +SYSCTL_PROC(_machdep, OID_AUTO, tsc_freq2, CTLTYPE_QUAD | CTLFLAG_RW,
 +    0, sizeof(uint64_t), sysctl_machdep_tsc_freq2, "IU", "");
 %%%
 
 Details:
 - in sysctl_machdep_tsc_freq() and its SYSCTL_PROC(), everything is
   reduced to u_int's so that other bugs aren't activated.  Support
   for frequencies >= 2^32 remains mostly broken.  It works in the kernel,
   but such frequencies cannot be set or read from userland.
 - sysctl_machdep_tsc_freq1() is progression test.  It shows how to give
   correct support for 63-bit (sic) tsc_freq's.  It depends on unbreaking
   support for printing quad_t's in sysctl(1) using the "Q" format
   specifier.
 - sysctl_machdep_tsc_freq2() shows the best that can be done without
   fixing sysctl(1).  sysctl(1) in -current does useless things for the
   "Q" format specifier; the best that it can do is print a 64-bit quad_t
   as 2 32-bit ints.
 
 Sample output:
 
 %%%
 machdep.tsc_freq: 2222929820
 machdep.tsc_freq1: 4294967298    # requires support for "Q" format specifier
 machdep.tsc_freq2: 2 1
 %%%
 
 There are many bugs in error handling that inhibit detection of of the main
 bug:
 - `void *'s in the SYSCTL_* macros defeat compile-time type checking
 - missing runtime checking in sysctl_handle_int() and friends.
   sysctl_handle_int() silently ignores its length arg.  It blindly
   handles sizeof(int) bytes instead of the requested length.
 - breakage of documented error handling.  From sysctl(3):
 
 %      [EINVAL]           A non-null newp is given and its specified length in
 %                         newlen is too large or too small.
 
   The size parameter in the SYSCTL_PROC() for machdep.tsc_freq has
   apparently been hacked to make the number of bytes returned by the
   sysctl 4 instead of 8.  This masks the bug for printing the value in
   sysctl(1) provided the value fits in 4 bytes.  However, it makes input
   inconsistent with output.  For input, sysctl(1) uses the CTLBYTE
   parameter and ignores the size parameter.  It tries to write 8
   bytes to the kernel, but sysctl_handle_int() only writes 4 bytes.
   This error is silently ignored.  Note that unlike for write(2),
   short writes cannot be reported to userland except as an error,
   since the sysctl(3) interface only returns a count for reads; thus
   short writes for sysctl(3) must cause an error as documented.
 
 I have only fixed the error handling for short writes:
 
 %%%
 Index: kern_sysctl.c
 ===================================================================
 RCS file: /home/ncvs/src/sys/kern/kern_sysctl.c,v
 retrieving revision 1.157
 diff -u -2 -r1.157 kern_sysctl.c
 --- kern_sysctl.c	11 Jun 2004 02:20:37 -0000	1.157
 +++ kern_sysctl.c	11 Jun 2004 07:58:23 -0000
 @@ -957,6 +949,13 @@
  	if (!req->newptr)
  		return 0;
 -	if (req->newlen - req->newidx < l)
 +	if (req->newlen - req->newidx != l) {
 +		if (req->newlen - req->newidx > l) {
 +			printf(
 +		"sysctl_new_kernel -- short write: %zu - %zu > %zu\n",
 +			    req->newlen, req->newidx, l);
 +			Debugger("sysctl_new_kernel: short write");
 +		}
  		return (EINVAL);
 +	}
  	bcopy((char *)req->newptr + req->newidx, p, l);
  	req->newidx += l;
 @@ -1075,6 +1073,13 @@
  	if (!req->newptr)
  		return 0;
 -	if (req->newlen - req->newidx < l)
 +	if (req->newlen - req->newidx != l) {
 +		if (req->newlen - req->newidx > l) {
 +			printf(
 +		"sysctl_new_user -- short write: %zu - %zu > %zu\n",
 +			    req->newlen, req->newidx, l);
 +			Debugger("sysctl_new_user: short write");
 +		}
  		return (EINVAL);
 +	}
  	error = copyin((char *)req->newptr + req->newidx, p, l);
  	req->newidx += l;
 %%%
 
 Bruce


More information about the freebsd-i386 mailing list