CPU time accounting broken on 8-STABLE machine after a few hours of uptime

Don Lewis truckman at FreeBSD.org
Wed Sep 29 05:31:40 UTC 2010


On 28 Sep, Don Lewis wrote:

> Looking at the timestamps of things and comparing to my logs, I
> discovered that the last instance of ntp instability happened when I was
> running "make index" in /usr/ports.  I tried it again with entertaining
> results.  After a while, the machine became unresponsive.  I was logged
> in over ssh and it stopped echoing keystrokes.  In parallel I was
> running a script that echoed the date, the results of "vmstat -i", and
> the results of "ntpq -c pe".  The latter showed jitter and offset going
> insane.  Eventually "make index" finished and the machine was responsive
> again, but the time was way off and ntpd croaked because the necessary
> time correction was too large.  Nothing else anomalous showed up in the
> logs.  Hmn, about half an hour after ntpd died I started my CPU time
> accounting test and two minutes into that test I got a spew of calcru
> messages ...

I tried this experiment again using a kernel with WITNESS and
DEBUG_VFS_LOCKS compiled in, and pinging this machine from another.
Things look normal for a while, then the ping times get huge for a while
and then recover.

64 bytes from 192.168.101.3: icmp_seq=1169 ttl=64 time=0.135 ms
64 bytes from 192.168.101.3: icmp_seq=1170 ttl=64 time=0.141 ms
64 bytes from 192.168.101.3: icmp_seq=1171 ttl=64 time=0.130 ms
64 bytes from 192.168.101.3: icmp_seq=1172 ttl=64 time=0.131 ms
64 bytes from 192.168.101.3: icmp_seq=1173 ttl=64 time=0.128 ms
64 bytes from 192.168.101.3: icmp_seq=1174 ttl=64 time=38232.140 ms
64 bytes from 192.168.101.3: icmp_seq=1175 ttl=64 time=37231.309 ms
64 bytes from 192.168.101.3: icmp_seq=1176 ttl=64 time=36230.470 ms
64 bytes from 192.168.101.3: icmp_seq=1177 ttl=64 time=35229.632 ms
64 bytes from 192.168.101.3: icmp_seq=1178 ttl=64 time=34228.791 ms
64 bytes from 192.168.101.3: icmp_seq=1179 ttl=64 time=33227.953 ms
64 bytes from 192.168.101.3: icmp_seq=1180 ttl=64 time=32227.091 ms
64 bytes from 192.168.101.3: icmp_seq=1181 ttl=64 time=31226.262 ms
64 bytes from 192.168.101.3: icmp_seq=1182 ttl=64 time=30225.425 ms
64 bytes from 192.168.101.3: icmp_seq=1183 ttl=64 time=29224.597 ms
64 bytes from 192.168.101.3: icmp_seq=1184 ttl=64 time=28223.757 ms
64 bytes from 192.168.101.3: icmp_seq=1185 ttl=64 time=27222.918 ms
64 bytes from 192.168.101.3: icmp_seq=1186 ttl=64 time=26222.086 ms
64 bytes from 192.168.101.3: icmp_seq=1187 ttl=64 time=25221.164 ms
64 bytes from 192.168.101.3: icmp_seq=1188 ttl=64 time=24220.407 ms
64 bytes from 192.168.101.3: icmp_seq=1189 ttl=64 time=23219.575 ms
64 bytes from 192.168.101.3: icmp_seq=1190 ttl=64 time=22218.737 ms
64 bytes from 192.168.101.3: icmp_seq=1191 ttl=64 time=21217.905 ms
64 bytes from 192.168.101.3: icmp_seq=1192 ttl=64 time=20217.066 ms
64 bytes from 192.168.101.3: icmp_seq=1193 ttl=64 time=19216.228 ms
64 bytes from 192.168.101.3: icmp_seq=1194 ttl=64 time=18215.333 ms
64 bytes from 192.168.101.3: icmp_seq=1195 ttl=64 time=17214.503 ms
64 bytes from 192.168.101.3: icmp_seq=1196 ttl=64 time=16213.720 ms
64 bytes from 192.168.101.3: icmp_seq=1197 ttl=64 time=15210.912 ms
64 bytes from 192.168.101.3: icmp_seq=1198 ttl=64 time=14210.044 ms
64 bytes from 192.168.101.3: icmp_seq=1199 ttl=64 time=13209.194 ms
64 bytes from 192.168.101.3: icmp_seq=1200 ttl=64 time=12208.376 ms
64 bytes from 192.168.101.3: icmp_seq=1201 ttl=64 time=11207.536 ms
64 bytes from 192.168.101.3: icmp_seq=1202 ttl=64 time=10206.694 ms
64 bytes from 192.168.101.3: icmp_seq=1203 ttl=64 time=9205.816 ms
64 bytes from 192.168.101.3: icmp_seq=1204 ttl=64 time=8205.014 ms
64 bytes from 192.168.101.3: icmp_seq=1205 ttl=64 time=7204.186 ms
64 bytes from 192.168.101.3: icmp_seq=1206 ttl=64 time=6203.294 ms
64 bytes from 192.168.101.3: icmp_seq=1207 ttl=64 time=5202.510 ms
64 bytes from 192.168.101.3: icmp_seq=1208 ttl=64 time=4201.677 ms
64 bytes from 192.168.101.3: icmp_seq=1209 ttl=64 time=3200.851 ms
64 bytes from 192.168.101.3: icmp_seq=1210 ttl=64 time=2200.013 ms
64 bytes from 192.168.101.3: icmp_seq=1211 ttl=64 time=1199.100 ms
64 bytes from 192.168.101.3: icmp_seq=1212 ttl=64 time=198.331 ms
64 bytes from 192.168.101.3: icmp_seq=1213 ttl=64 time=0.129 ms
64 bytes from 192.168.101.3: icmp_seq=1214 ttl=64 time=58223.470 ms
64 bytes from 192.168.101.3: icmp_seq=1215 ttl=64 time=57222.637 ms
64 bytes from 192.168.101.3: icmp_seq=1216 ttl=64 time=56221.800 ms
64 bytes from 192.168.101.3: icmp_seq=1217 ttl=64 time=55220.960 ms
64 bytes from 192.168.101.3: icmp_seq=1218 ttl=64 time=54220.116 ms
64 bytes from 192.168.101.3: icmp_seq=1219 ttl=64 time=53219.282 ms
64 bytes from 192.168.101.3: icmp_seq=1220 ttl=64 time=52218.444 ms
64 bytes from 192.168.101.3: icmp_seq=1221 ttl=64 time=51217.618 ms
64 bytes from 192.168.101.3: icmp_seq=1222 ttl=64 time=50216.778 ms
64 bytes from 192.168.101.3: icmp_seq=1223 ttl=64 time=49215.932 ms
64 bytes from 192.168.101.3: icmp_seq=1224 ttl=64 time=48215.095 ms
64 bytes from 192.168.101.3: icmp_seq=1225 ttl=64 time=47214.262 ms
64 bytes from 192.168.101.3: icmp_seq=1226 ttl=64 time=46213.440 ms
64 bytes from 192.168.101.3: icmp_seq=1227 ttl=64 time=45212.623 ms
64 bytes from 192.168.101.3: icmp_seq=1228 ttl=64 time=44211.783 ms
64 bytes from 192.168.101.3: icmp_seq=1229 ttl=64 time=43210.903 ms
64 bytes from 192.168.101.3: icmp_seq=1230 ttl=64 time=42210.111 ms
64 bytes from 192.168.101.3: icmp_seq=1231 ttl=64 time=41209.274 ms
64 bytes from 192.168.101.3: icmp_seq=1232 ttl=64 time=40208.448 ms
64 bytes from 192.168.101.3: icmp_seq=1233 ttl=64 time=39207.608 ms
64 bytes from 192.168.101.3: icmp_seq=1234 ttl=64 time=38206.774 ms
64 bytes from 192.168.101.3: icmp_seq=1235 ttl=64 time=37205.842 ms
64 bytes from 192.168.101.3: icmp_seq=1236 ttl=64 time=36205.104 ms
64 bytes from 192.168.101.3: icmp_seq=1237 ttl=64 time=35204.270 ms
64 bytes from 192.168.101.3: icmp_seq=1238 ttl=64 time=34203.433 ms
64 bytes from 192.168.101.3: icmp_seq=1239 ttl=64 time=33202.603 ms
64 bytes from 192.168.101.3: icmp_seq=1240 ttl=64 time=32201.764 ms
64 bytes from 192.168.101.3: icmp_seq=1241 ttl=64 time=31200.924 ms
64 bytes from 192.168.101.3: icmp_seq=1242 ttl=64 time=30200.082 ms
64 bytes from 192.168.101.3: icmp_seq=1243 ttl=64 time=29198.883 ms
64 bytes from 192.168.101.3: icmp_seq=1244 ttl=64 time=28198.414 ms
64 bytes from 192.168.101.3: icmp_seq=1245 ttl=64 time=27197.434 ms
64 bytes from 192.168.101.3: icmp_seq=1246 ttl=64 time=26196.738 ms
64 bytes from 192.168.101.3: icmp_seq=1247 ttl=64 time=25195.912 ms
64 bytes from 192.168.101.3: icmp_seq=1248 ttl=64 time=24195.074 ms
64 bytes from 192.168.101.3: icmp_seq=1249 ttl=64 time=23194.231 ms
64 bytes from 192.168.101.3: icmp_seq=1250 ttl=64 time=22193.407 ms
64 bytes from 192.168.101.3: icmp_seq=1251 ttl=64 time=21192.565 ms
64 bytes from 192.168.101.3: icmp_seq=1252 ttl=64 time=20191.725 ms
64 bytes from 192.168.101.3: icmp_seq=1253 ttl=64 time=19190.852 ms
64 bytes from 192.168.101.3: icmp_seq=1254 ttl=64 time=18190.060 ms
64 bytes from 192.168.101.3: icmp_seq=1255 ttl=64 time=17189.220 ms
64 bytes from 192.168.101.3: icmp_seq=1256 ttl=64 time=16188.381 ms
64 bytes from 192.168.101.3: icmp_seq=1257 ttl=64 time=15183.118 ms
64 bytes from 192.168.101.3: icmp_seq=1258 ttl=64 time=14182.711 ms
64 bytes from 192.168.101.3: icmp_seq=1259 ttl=64 time=13181.876 ms
64 bytes from 192.168.101.3: icmp_seq=1260 ttl=64 time=12181.034 ms
64 bytes from 192.168.101.3: icmp_seq=1261 ttl=64 time=11180.192 ms
64 bytes from 192.168.101.3: icmp_seq=1262 ttl=64 time=10179.357 ms
64 bytes from 192.168.101.3: icmp_seq=1263 ttl=64 time=9178.522 ms
64 bytes from 192.168.101.3: icmp_seq=1264 ttl=64 time=8177.692 ms
64 bytes from 192.168.101.3: icmp_seq=1265 ttl=64 time=7176.850 ms
64 bytes from 192.168.101.3: icmp_seq=1266 ttl=64 time=6176.026 ms
64 bytes from 192.168.101.3: icmp_seq=1267 ttl=64 time=5175.185 ms
64 bytes from 192.168.101.3: icmp_seq=1268 ttl=64 time=4174.355 ms
64 bytes from 192.168.101.3: icmp_seq=1269 ttl=64 time=3173.479 ms
64 bytes from 192.168.101.3: icmp_seq=1270 ttl=64 time=2172.658 ms
64 bytes from 192.168.101.3: icmp_seq=1271 ttl=64 time=1171.835 ms
64 bytes from 192.168.101.3: icmp_seq=1272 ttl=64 time=170.971 ms
64 bytes from 192.168.101.3: icmp_seq=1273 ttl=64 time=0.138 ms
64 bytes from 192.168.101.3: icmp_seq=1274 ttl=64 time=0.162 ms
64 bytes from 192.168.101.3: icmp_seq=1275 ttl=64 time=0.133 ms
64 bytes from 192.168.101.3: icmp_seq=1276 ttl=64 time=0.140 ms
64 bytes from 192.168.101.3: icmp_seq=1277 ttl=64 time=0.138 ms
64 bytes from 192.168.101.3: icmp_seq=1278 ttl=64 time=0.132 ms
64 bytes from 192.168.101.3: icmp_seq=1279 ttl=64 time=0.132 ms
64 bytes from 192.168.101.3: icmp_seq=1280 ttl=64 time=0.132 ms
64 bytes from 192.168.101.3: icmp_seq=1281 ttl=64 time=0.129 ms

At that point the machine silently rebooted inspite of being compiled
with KDB and DDB and not KDB_UNATTENDED.  This silent reboot is
reproduceable.




More information about the freebsd-stable mailing list