bin/141340: netstat(1): wrong netstat -w 1 output

Efstratios Karatzas gpf.kira at gmail.com
Wed Jan 20 23:00:22 UTC 2010


The following reply was made to PR bin/141340; it has been noted by GNATS.

From: Efstratios Karatzas <gpf.kira at gmail.com>
To: bug-followup at freebsd.org
Cc: mitya at cabletv.dp.ua
Subject: Re: bin/141340: netstat(1): wrong netstat -w 1 output
Date: Thu, 21 Jan 2010 00:51:33 +0200

 I believe I got this one figured out.
 
 I studied netstat's code, and the actual loop takes place inside the
 function sidewaysintpr() in netstat/if.c, so go to that function if
 you plan on reading the rest of this report.
 
 The reason you get bogus output is that the per-interface statistics
 counters are never reset to 0 after they are first initialized.
 This is the struct the function uses to hold the statistics that
 are printed:
 
 struct	iftot {
 	SLIST_ENTRY(iftot) chain;
 	char	ift_name[IFNAMSIZ];	/* interface name */
 	u_long	ift_ip;			/* input packets */
 	u_long	ift_ie;			/* input errors */
 	u_long	ift_id;			/* input drops */
 	u_long	ift_op;			/* output packets */
 	u_long	ift_oe;			/* output errors */
 	u_long	ift_co;			/* collisions */
 	u_int	ift_dr;			/* drops */
 	u_long	ift_ib;			/* input bytes */
 	u_long	ift_ob;			/* output bytes */
 };
 
 In each iteration we recompute (sum up) the counters in the variable "sum",
 and what we actually print is the difference between the current
 statistics and those saved in "total" during the previous loop, i.e. what
 accumulated during the interval (defined by the -w arg). Take a look at
 this:
 
 show_stat("lu", 10, sum->ift_ib - total->ift_ib, 1);
 
 Since the counter for each statistic only goes one way (up), eventually
 it is going to overflow. The counters inside "sum" overflow slightly
 earlier than those in "total", so once in a while
 sum->ift_ib - total->ift_ib
 will yield a "negative number", because sum->ift_ib has overflowed and
 wrapped around to near 0, becoming a small number in comparison to
 total->ift_ib. In the next loop total will have wrapped around 0 as
 well, so no more crazy ultra long output.
 
 The result of this subtraction is passed as a u_long, so it underflows
 and turns into an enormously large value. Keep in mind that on 64-bit
 archs a u_long is actually a really long number!
 The max value for a u_long there is 2^64 - 1, about 1.84467441 * 10^19,
 close to what we see in the PR above. The PR originator runs the amd64
 version.
 
 Possible fixes:
 a) Implement an extra argument for netstat that re-initializes the
 statistics kept for each interface. Of course it would have to be run
 by root.
 b) Reboot the system.
 
 In conclusion, I believe the state of the PR should change to "suspended"
 until someone submits a patch. I'm sure I won't for at least 3 more weeks,
 until the university exam period is over.
 
 Cheers,
 
 -- 
 
 Efstratios "GPF" Karatzas

