hung poudriere bulk recovery

Russell L. Carter rcarter at pinyon.org
Fri Oct 30 23:36:49 UTC 2015


Got a hung poudriere run last night:

On 10/28/15 15:10, Bryan Drewery wrote:
> On 10/23/2015 9:34 AM, Russell L. Carter wrote:
>>
>> Greetings,
>>
>> Recently my nightly cron poudriere builds have been occasionally
>> hanging.  For instance, here's last night's, with apparently no
>> progress for over 10 hours:
>>
>> root@terpsichore> poudriere status
>> SET PORTS   JAIL            BUILD                STATUS         QUEUE BUILT FAIL SKIP IGNORE REMAIN TIME     LOGS
>> -   default 10-stable-amd64 2015-10-22_22h30m08s parallel_build   488    34    0    0      0    454 10:45:56 /ssd1/poudriere/data/logs/bulk/10-stable-amd64-default/2015-10-22_22h30m08s
>> root@terpsichore>
>>
>
> Also check 'poudriere status -b' to see per-builder status. A builder
> may actually be doing something. Poudriere will time out builds after a
> long time. I forget the default but it may be up to 24 hours.
>

root@terpsichore> date
Fri Oct 30 15:19:55 MST 2015
root@terpsichore> poudriere status -b
[10-stable-amd64-default] [2015-10-29_22h30m07s] [parallel_build:] Queued: 129 Built: 34  Failed: 0   Skipped: 0   Ignored: 0   Tobuild: 95   Time: 16:49:55
         [01]: x11-toolkits/gtk30               build_port_done (16:40:58)
         [02]: graphics/ImageMagick             build_port_done (16:43:16)
         [03]: www/webkit-gtk2                  build_port_done (16:43:58)
====>> Logs: /ssd1/poudriere/data/logs/bulk/10-stable-amd64-default/2015-10-29_22h30m07s
root@terpsichore>
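
For anyone wanting to spot this from cron, here is a rough sketch of a filter for the per-builder lines of 'poudriere status -b'. It assumes the output format shown above (which may differ between poudriere versions) and that the value in parentheses is how long the builder has been on that port. The function name and the 8 hour default are my own choices, not anything poudriere provides.

```shell
#!/bin/sh
# Sketch only: reads 'poudriere status -b' output on stdin and prints
# builders whose parenthesized time is at least LIMIT hours.
# stuck_builders is a hypothetical helper name; 8 is an arbitrary default.
stuck_builders() {
    limit_hours="${1:-8}"
    awk -v limit="$limit_hours" '
        # Builder lines look like: [01]: category/port  phase (HH:MM:SS)
        /^[ \t]*\[[0-9]+\]:/ {
            t = $NF
            gsub(/[()]/, "", t)        # strip the parentheses
            split(t, hms, ":")         # hms[1] holds the hours
            if (hms[1] + 0 >= limit) print $1, $2, $3, t
        }'
}
```

Usage would be something like 'poudriere status -b | stuck_builders 8'. With the output above, all three builders would be reported.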

> Please record 'procstat -kka' before rebooting in case this is some kind
> of deadlock.

Done. I ran it right after the 'poudriere status -b' above:

http://rcarter.esturion.net/procstat-kka.txt
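
In case it is useful to others, a minimal sketch of how the capture could be scripted so it is not forgotten before a reboot. The function name, the output file naming, and the PROCSTAT_CMD override (only there so the function can be exercised on a machine without procstat) are my own assumptions.

```shell
#!/bin/sh
# Sketch only: save 'procstat -kka' output to a timestamped file and
# print the file name. capture_stacks and PROCSTAT_CMD are hypothetical
# names introduced for this example.
capture_stacks() {
    out_dir="$1"
    # UTC timestamp in the same style poudriere uses for build names
    stamp="$(date -u +%Y-%m-%d_%Hh%Mm%Ss)"
    out_file="$out_dir/procstat-kka-$stamp.txt"
    # -kk: kernel stacks with offsets, -a: all processes
    "${PROCSTAT_CMD:-procstat}" -kka > "$out_file" 2>&1 || return 1
    echo "$out_file"
}
```

For example, 'capture_stacks /var/tmp' as root would leave the stacks in /var/tmp and print the path.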

>>
>> I'm not sure how to debug this, but in the interim, I'm very curious
>> how I can stop the hung bulk run, and either restart it, or clean up
>> the various mounted zfs filesystems and manually restart from the
>> beginning w/o rebooting.  Studying the man page, it's not clear at all
>> the Right Way to do this, so any pointers here would be appreciated.
>
> Kill -TERM the main poudriere process. It will clean up children.
> Beyond that you can 'poudriere jail -j NAME -p TREE -z SET -k' to clean
> up any mounts leftover from a previous build.

A bit of trial and error led me to this solution, except that I needed
neither '-p' nor '-z' on the 'poudriere jail -k' command, so I'm good.
In the case above, I ran the exact same bulk script manually and the
poudriere bulk build ran to completion.
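
For the archives, the sequence can be sketched as below. This is only a sketch: the helper names and the DRY_RUN switch are mine, the PID of the main poudriere process has to be looked up by hand (e.g. with ps), and whether '-p' or '-z' are needed presumably depends on how the bulk run was started.

```shell
#!/bin/sh
# Sketch only: stop a hung bulk run, then clean up leftover mounts.
# run, recover and DRY_RUN are hypothetical names for this example.
DRY_RUN="${DRY_RUN:-1}"    # default to printing commands, not running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# recover MAIN_PID JAIL_NAME
recover() {
    main_pid="$1"
    jail_name="$2"
    if [ -z "$main_pid" ] || [ -z "$jail_name" ]; then
        echo "usage: recover MAIN_PID JAIL_NAME" >&2
        return 2
    fi
    # TERM the main poudriere process; it cleans up its children
    run kill -TERM "$main_pid"
    # then stop the jail and clean up mounts left from the previous build
    run poudriere jail -j "$jail_name" -k
}
```

With DRY_RUN=1, 'recover 12345 10-stable-amd64' only prints the two commands; set DRY_RUN=0 to actually run them.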

Thanks!
Russell

