bin/30360: vmstat(8) returns impossible data
Sergey Kandaurov
pluknet at gmail.com
Sat Jan 1 18:30:16 UTC 2011
The following reply was made to PR bin/30360; it has been noted by GNATS.
From: Sergey Kandaurov <pluknet at gmail.com>
To: bug-followup at FreeBSD.org
Cc:
Subject: Re: bin/30360: vmstat(8) returns impossible data
Date: Sat, 1 Jan 2011 21:23:24 +0300
That's an integer overflow bug, which I think isn't easy to fix, because
a fix would break the cp_time ABI.
cp_time is (roughly) an array[CPUSTATES] of longs.
The long type is 4 bytes on i386 and 8 bytes on amd64.
That's why I don't see this bug on amd64 boxes.
Sometimes the bug does not manifest in sysctl kern.cp_time on i386, but
generally it does.
That's because the exported cp_time[] format (used by /sbin/sysctl) is
different ("UL"): the signed values are printed as unsigned, which
extends the usable range (for a while).
In this example the bug manifests for `id' even with /sbin/sysctl on
i386 (uptime 597 days):
# sysctl kern.cp_time
kern.cp_time: 4021277307 75175092 2025746497 49748493 2746074583
# vmstat
procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr da0 da1 in sy cs us sy id
1 5 0 93720 458992 14 0 0 3 53 1 0 0 37 1 5 -61 633 -472
Both boxes, hub and freefall, reported by arundel@ are i386.
In the following example /sbin/sysctl abuses the "UL" format, but that
doesn't work for vmstat, which uses libdevstat, which in turn properly
treats cp_time[] as signed long.
# sysctl kern.cp_time
kern.cp_time: 795491304 5844771 246148418 43709451 2752874123
# ./test
printf("%lu\n", l): 2752874123
printf("%ld\n", l): -1542093173 [compare]
# ./vmstat
procs memory page disk faults cpu
r b w avm fre flt re pi po fr sr aa0 in sy cs us sy id
3 3 0 5776M 172M 173 39 22 5 617 444 0 743 193 60
cpustats(): before 'total += cur.cp_time[state]': cp_time[]: 795758944
cpustats(): before 'total += cur.cp_time[state]': total: 0.000000
cpustats(): after 'total += cur.cp_time[state]': cp_time[]: 795758944
cpustats(): after 'total += cur.cp_time[state]': total: 795758944.000000
cpustats(): before 'total += cur.cp_time[state]': cp_time[]: 5844771
cpustats(): before 'total += cur.cp_time[state]': total: 795758944.000000
cpustats(): after 'total += cur.cp_time[state]': cp_time[]: 5844771
cpustats(): after 'total += cur.cp_time[state]': total: 801603715.000000
cpustats(): before 'total += cur.cp_time[state]': cp_time[]: 246218512
cpustats(): before 'total += cur.cp_time[state]': total: 801603715.000000
cpustats(): after 'total += cur.cp_time[state]': cp_time[]: 246218512
cpustats(): after 'total += cur.cp_time[state]': total: 1047822227.000000
cpustats(): before 'total += cur.cp_time[state]': cp_time[]: 43723365
cpustats(): before 'total += cur.cp_time[state]': total: 1047822227.000000
cpustats(): after 'total += cur.cp_time[state]': cp_time[]: 43723365
cpustats(): after 'total += cur.cp_time[state]': total: 1091545592.000000
cpustats(): before 'total += cur.cp_time[state]': cp_time[]: -1541158615 [compare]
cpustats(): before 'total += cur.cp_time[state]': total: 1091545592.000000
cpustats(): after 'total += cur.cp_time[state]': cp_time[]: -1541158615
cpustats(): after 'total += cur.cp_time[state]': total: -449613023.000000
-178 -64 343
^^1 ^^2 ^^3
(1) and (2) are negative because each is a positive count multiplied by a
negative factor derived from the cp_time total;
(3) is positive because it is the negative cp_time[CP_IDLE] multiplied by
that same negative factor.
After summation, total has the wrong sign and the wrong value, hence the
out-of-range percentage values.
--
wbr,
pluknet