hung poudriere bulk recovery
Russell L. Carter
rcarter at pinyon.org
Fri Oct 23 16:41:46 UTC 2015
Greetings,
Recently my nightly cron poudriere builds have been occasionally
hanging. For instance, here's last night's, with apparently no
progress for over 10 hours:
root@terpsichore> poudriere status
SET PORTS   JAIL            BUILD                STATUS         QUEUE BUILT FAIL SKIP IGNORE REMAIN TIME     LOGS
-   default 10-stable-amd64 2015-10-22_22h30m08s parallel_build   488    34    0    0      0    454 10:45:56 /ssd1/poudriere/data/logs/bulk/10-stable-amd64-default/2015-10-22_22h30m08s
root@terpsichore>
htop now shows no significant activity from any of the 3 configured
builders, although the bulk processes are still present:
root@terpsichore> ps xa | grep poud
72482  -  Is   0:00.01 /bin/sh /root/poudriere/run-poudriere-bulk
73202  -  S    0:04.24 sh -e /usr/local/share/poudriere/bulk.sh -f /root/poudriere/ports -j 10-stable-amd64
73347  -  S    1:55.38 sh -e /usr/local/share/poudriere/bulk.sh -f /root/poudriere/ports -j 10-stable-amd64
73352  -  I    0:00.08 sh -e /usr/local/share/poudriere/bulk.sh -f /root/poudriere/ports -j 10-stable-amd64
 6119  1  S+   0:00.00 grep poud
root@terpsichore>
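For what it's worth, one way to tell a stalled run from a merely slow
one might be to check whether anything under the LOGS directory shown
above has been written recently. This is only a minimal sketch: it
assumes that an active build keeps touching files under that directory,
and recent_activity is a hypothetical helper of my own, not part of
poudriere.

```shell
#!/bin/sh
# Hypothetical helper (not part of poudriere): report whether any file
# under a log directory was modified within the last N minutes.
# Assumption: an active build keeps writing files under that directory.
recent_activity() {
    logdir=$1
    minutes=${2:-60}    # look-back window in minutes, default 60

    # find -mmin -N matches files modified less than N minutes ago;
    # head stops after the first hit, which is all we need.
    if [ -n "$(find "$logdir" -type f -mmin "-$minutes" 2>/dev/null | head -n 1)" ]; then
        echo active
    else
        echo stalled
    fi
}
```

It would be called with the log directory from the status output and a
window, for example:
recent_activity /ssd1/poudriere/data/logs/bulk/10-stable-amd64-default/2015-10-22_22h30m08s 60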
If I reboot, so that the tmp zfs filesystems are unmounted, and then
manually rerun the exact script that cron launched for the hung
instance, poudriere has (so far) run to completion.
I'm not sure how to debug this. In the interim, I'm very curious how I
can stop the hung bulk run and then either restart it, or clean up the
various mounted zfs filesystems and manually restart from the beginning
without rebooting. From studying the man page, the Right Way to do this
is not at all clear, so any pointers here would be appreciated.
I'm leaving the system untouched for now so that I can try out any
suggestions for cleanup and restart.
Thanks,
Russell