svn commit: r308329 - head/usr.bin/ministat

Colin Percival cperciva at FreeBSD.org
Sat Nov 5 06:33:40 UTC 2016


Author: cperciva
Date: Sat Nov  5 06:33:39 2016
New Revision: 308329
URL: https://svnweb.freebsd.org/changeset/base/308329

Log:
  Reduce the bogosity of ministat's % difference calculations.
  
  The previous calculation used an approximation which was only valid in
  cases where the means being compared were similar; this resulted in very
  odd claims being made, e.g. that 0 +/- 0 is a difference of -100% +/- 1%
  from 100 +/- 1.
  
  The new calculation scales sample standard deviations by the means, and
  yields approximately correct percentage difference bounds provided that
  the reference population is bounded away from zero.  (If the values
  being compared are not sufficiently bounded away from zero, the
  distribution of ratios becomes much harder to calculate, and is not
  likely to be useful anyway.)
  
  Note that when ministat is used for its intended purpose of determining
  whether two samples are statistically different, this change is unlikely
  to have any noticeable effect; in such cases the means will be similar
  enough that the correction applied here will be minimal.

Modified:
  head/usr.bin/ministat/ministat.c

Modified: head/usr.bin/ministat/ministat.c
==============================================================================
--- head/usr.bin/ministat/ministat.c	Sat Nov  5 04:40:58 2016	(r308328)
+++ head/usr.bin/ministat/ministat.c	Sat Nov  5 06:33:39 2016	(r308329)
@@ -232,6 +232,7 @@ static void
 Relative(struct dataset *ds, struct dataset *rs, int confidx)
 {
 	double spool, s, d, e, t;
+	double re;
 	int i;
 
 	i = ds->n + rs->n - 2;
@@ -246,11 +247,16 @@ Relative(struct dataset *ds, struct data
 	d = Avg(ds) - Avg(rs);
 	e = t * s;
 
+	re = (ds->n - 1) * Var(ds) + (rs->n - 1) * Var(rs) *
+	    (Avg(ds) * Avg(ds)) / (Avg(rs) * Avg(rs));
+	re *= (ds->n + rs->n) / (ds->n * rs->n * (ds->n + rs->n - 2.0));
+	re = t * sqrt(re);
+
 	if (fabs(d) > e) {
 	
 		printf("Difference at %.1f%% confidence\n", studentpct[confidx]);
 		printf("	%g +/- %g\n", d, e);
-		printf("	%g%% +/- %g%%\n", d * 100 / Avg(rs), e * 100 / Avg(rs));
+		printf("	%g%% +/- %g%%\n", d * 100 / Avg(rs), re * 100 / Avg(rs));
 		printf("	(Student's t, pooled s = %g)\n", spool);
 	} else {
 		printf("No difference proven at %.1f%% confidence\n",

