bin/141340: netstat(1): wrong netstat -w 1 output
Efstratios Karatzas
gpf.kira at gmail.com
Wed Jan 20 23:00:22 UTC 2010
The following reply was made to PR bin/141340; it has been noted by GNATS.
From: Efstratios Karatzas <gpf.kira at gmail.com>
To: bug-followup at freebsd.org
Cc: mitya at cabletv.dp.ua
Subject: Re: bin/141340: netstat(1): wrong netstat -w 1 output
Date: Thu, 21 Jan 2010 00:51:33 +0200
I believe I got this one figured out.
I studied netstat's code, and the actual loop takes place inside the
function sidewaysintpr() in netstat/if.c, so go to that function
if you plan on reading the rest of this report.
The reason you get bogus output is that the counters of various statistics
that are kept for each interface are never again initialized to 0.
This is the struct being used by the function to hold the various
statistics that are printed.
struct iftot {
	SLIST_ENTRY(iftot) chain;
	char	ift_name[IFNAMSIZ];	/* interface name */
	u_long	ift_ip;			/* input packets */
	u_long	ift_ie;			/* input errors */
	u_long	ift_id;			/* input drops */
	u_long	ift_op;			/* output packets */
	u_long	ift_oe;			/* output errors */
	u_long	ift_co;			/* collisions */
	u_int	ift_dr;			/* drops */
	u_long	ift_ib;			/* input bytes */
	u_long	ift_ob;			/* output bytes */
};
In each iteration we recompute (sum up) the counters of the variable "sum",
and what we actually print is the difference between the current statistics
gathered during the interval (defined by the -w arg) and the
statistics of the previous loop. Take a look at this:
show_stat("lu", 10, sum->ift_ib - total->ift_ib, 1);
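To make the pattern concrete, here is a minimal sketch of the per-interval
computation described above. It is not the netstat source: "struct sample"
and interval_in_bytes() are hypothetical names, uint64_t stands in for a
u_long on amd64, and rolling "total" forward after each interval is my
assumption about how the loop keeps the previous sample.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct iftot: one counter only. */
struct sample {
	uint64_t in_bytes;	/* plays the role of ift_ib */
};

/*
 * Return the bytes seen during the last interval, computed the same way
 * as the show_stat() argument (sum - total), then remember the current
 * sample so the next call measures the next interval.
 */
uint64_t
interval_in_bytes(const struct sample *sum, struct sample *total)
{
	uint64_t delta = sum->in_bytes - total->in_bytes;

	*total = *sum;	/* assumed: previous sample is replaced each loop */
	return (delta);
}
```

As long as the new sample is not smaller than the previous one, the
result is the plain difference, e.g. 1500 - 1000 gives 500.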
Since the counters for each statistic go only one way (up), eventually
we are going to overflow. The counters inside "sum" are going to overflow
slightly faster than "total's" counters so once in a while
sum->ift_ib - total->ift_ib
will yield a "negative number", because sum->ift_ib overflows and
wraps around to near 0, becoming a small number in comparison to
total->ift_ib. In the next loop total will overflow as well and wrap
around 0 itself, so no more crazy ultra long output.
So the result of this subtraction, which is passed as a u_long, underflows
and generates an enormously large value. Keep in mind that on 64-bit
archs a u_long is actually a really long number!
The max value for a u_long there is around 2^64 = 1.84467441 * 10^19,
close to what we see in the PR above. The PR originator has the amd64 version.
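The arithmetic can be checked with a small sketch. The function name
counter_delta() and the sample values are made up for illustration, and
uint64_t is used as a stand-in for a 64-bit u_long; only the unsigned
subtraction itself mirrors the expression quoted earlier.

```c
#include <stdint.h>

/*
 * Unsigned difference between two counter samples. The result can never
 * be negative: if current < previous, it wraps modulo 2^64 instead.
 */
uint64_t
counter_delta(uint64_t current, uint64_t previous)
{
	return (current - previous);
}
```

With hypothetical samples previous = 1000 and current = 400, the
mathematical difference is -600, but counter_delta(400, 1000) returns
2^64 - 600 = 18446744073709551016, i.e. roughly 1.8 * 10^19.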
Possible Fix:
a) Implement an extra argument for netstat that re-initializes the
statistics kept for each interface; of course, it would have to be run by root.
b) Reboot the system.
In conclusion, I believe the state of the PR should change to "suspended" until
someone submits a patch. I'm sure I won't for at least 3 more weeks,
until the university exam period is over.
Cheers,
--
Efstratios "GPF" Karatzas