Livelock with GENERIC HEAD from Feb 19 13:36 UTC
Peter Holm
peter at holm.cc
Tue Feb 22 15:48:24 GMT 2005
With GENERIC HEAD from Feb 19 13:36 UTC + mpsafe_vfs = 1 I got
a new livelock:
http://www.holm.cc/stress/log/cons118.html
This time I think I have a clue as to what the problem is. One of
the stress test programs (swap) works like this pseudocode:
    c = malloc(size);
    page = getpagesize();
    while (done_testing == 0) {
        i = 0;
        /* touch one byte in every page of the buffer */
        while (i < size && done_testing == 0) {
            c[i] = 0;
            i += page;
        }
    }
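
For anyone who wants to reproduce the access pattern outside the stress suite, the loop above could be fleshed out roughly as follows. This is only a sketch and not the actual swap program: the names touch_pages, run_swap_test and stop_testing are hypothetical, and stopping the test with alarm() after a fixed number of seconds is an assumption, since the post does not say how done_testing gets set.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Set asynchronously by the signal handler to end the test. */
static volatile sig_atomic_t done_testing = 0;

/* Hypothetical SIGALRM handler: just raise the stop flag. */
static void stop_testing(int sig)
{
    (void)sig;
    done_testing = 1;
}

/*
 * Make one pass over the buffer, writing one byte per page so that
 * every page gets dirtied. The pointer is volatile so the compiler
 * cannot drop the stores. Returns the number of pages written.
 */
static size_t touch_pages(volatile char *c, size_t size, size_t page)
{
    size_t touched = 0;

    assert(page > 0);
    for (size_t i = 0; i < size && !done_testing; i += page) {
        c[i] = 0;
        touched++;
    }
    return touched;
}

/*
 * Hypothetical driver: dirty a buffer of `size` bytes over and over
 * until an alarm fires after `seconds` seconds. Returns the number of
 * passes started, or -1 if the allocation fails.
 */
static long run_swap_test(size_t size, unsigned int seconds)
{
    size_t page = (size_t)getpagesize();
    volatile char *c = malloc(size);
    long passes = 0;

    if (c == NULL)
        return -1;
    done_testing = 0;
    signal(SIGALRM, stop_testing);
    alarm(seconds);
    while (!done_testing) {
        touch_pages(c, size, page);
        passes++;
    }
    free((void *)c);
    return passes;
}
```

A caller would pick a size larger than physical memory to force paging, e.g. run_swap_test(size, 60), and start two or more copies to mimic the two [swap] processes. The process never blocks voluntarily, so it stays runnable for the whole test.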
Could it be that two incarnations of this program can monopolize
the run queue?
$ sort -n +4 < /var/crash/ps.186 | grep " R"
 UID   PID  PPID CPU PRI NI   VSZ RSS MWCHAN STAT TT       TIME COMMAND
   0     5     0   0   8  0     0   0 -      RL   ??    0:00.00 [thread tas
   0 68391 68390   8  97  0   320 120 -      RE   ??    0:00.01 [atrun]
1001 68342 68326 295 131  0 17628   0 -      R+   #C: 192:43.57 [swap]
1001 68354 68326 295 131  0 13268   0 -      R+   #C: 192:42.13 [swap]
1001 68331 68325 288 132  0  1224   0 -      R+   #C:   0:00.02 [creat]
1001 68332 68325 288 132  0  1224   0 -      R+   #C:   0:00.02 [creat]
1001 68333 68325 288 132  0  1224   0 -      R+   #C:   0:00.02 [creat]
1001 68334 68325 288 132  0  1224   0 -      R+   #C:   0:00.03 [creat]
1001 68335 68325 288 132  0  1224   0 -      R+   #C:   0:00.10 [creat]
1001 68336 68325 288 132  0  1224   0 -      R+   #C:   0:00.07 [creat]
1001 68361 68328 290 132  0  1232   0 -      R+   #C:   0:00.42 [tcp]
1001 68362 68329 288 132  0  1252   0 -      R+   #C:   0:00.06 [udp]
1001 68363 68329 288 132  0  1252   0 -      R+   #C:   0:00.04 [udp]
1001 68368 68360 288 132  0  1320   0 -      R+   #C:   0:00.05 [tcp]
1001 68369 68361 290 132  0  1320   0 -      R+   #C:   0:00.56 [tcp]
1001 68387 68338 288 132  0  1656   0 -      R+   #C:   0:00.02 [sh]
1001 68388 68340 288 132  0  1664   0 -      R+   #C:   0:00.02 [sh]
1001 68389 68388 288 132  0     0   0 -      RE+  #C:   0:00.02 [swapinfo]
1001 68390 68388 288 132  0  1204   0 -      R+   #C:   0:00.01 [tail]
   0    11     0 262 171  0     0   0 -      RL   ?? 345:02.29 [idle: cpu0
At a later freeze today a "kill 1 <swap pid>" from kdb unfroze
the box.
--
Peter Holm