Re: poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder
- Reply: Mark Millard : "Re: poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder"
- In reply to: Mateusz Guzik : "Re: poudriere bulk with ZFS and USE_TMPFS=no on main [14-ALPHA2 based]: extensive vlruwk for cpdup's on new builders after pkg builds in first builder"
Date: Thu, 24 Aug 2023 07:22:00 UTC
On Aug 23, 2023, at 22:54, Mateusz Guzik <mjguzik@gmail.com> wrote:
> On 8/24/23, Mark Millard <marklmi@yahoo.com> wrote:
>> On Aug 23, 2023, at 15:10, Mateusz Guzik <mjguzik@gmail.com> wrote:
>>
>>> On 8/23/23, Mark Millard <marklmi@yahoo.com> wrote:
>>>> [Forked off the ZFS deadlock 14 discussion, per feedback.]
>>>> . . .
>>>
>>> This is a known problem, but it is unclear if you should be running
>>> into it in this setup.
>>
>> The change fixed the issue, so I do run into the issue
>> for this setup. See below.
>>
>>> Can you try again but this time *revert*
>>> 138a5dafba312ff39ce0eefdbe34de95519e600d, like so:
>>> git revert 138a5dafba312ff39ce0eefdbe34de95519e600d
>>>
>>> may want to switch to a different branch first, for example: git
>>> checkout -b vfstesting
>>
>> # git -C /usr/main-src/ diff sys/kern/vfs_subr.c
>> diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
>> index 0f3f00abfd4a..5dff556ac258 100644
>> --- a/sys/kern/vfs_subr.c
>> +++ b/sys/kern/vfs_subr.c
>> @@ -3528,25 +3528,17 @@ vdbatch_process(struct vdbatch *vd)
>>  	MPASS(curthread->td_pinned > 0);
>>  	MPASS(vd->index == VDBATCH_SIZE);
>>
>> +	mtx_lock(&vnode_list_mtx);
>>  	critical_enter();
>> -	if (mtx_trylock(&vnode_list_mtx)) {
>> -		for (i = 0; i < VDBATCH_SIZE; i++) {
>> -			vp = vd->tab[i];
>> -			vd->tab[i] = NULL;
>> -			TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
>> -			TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
>> -			MPASS(vp->v_dbatchcpu != NOCPU);
>> -			vp->v_dbatchcpu = NOCPU;
>> -		}
>> -		mtx_unlock(&vnode_list_mtx);
>> -	} else {
>> -		for (i = 0; i < VDBATCH_SIZE; i++) {
>> -			vp = vd->tab[i];
>> -			vd->tab[i] = NULL;
>> -			MPASS(vp->v_dbatchcpu != NOCPU);
>> -			vp->v_dbatchcpu = NOCPU;
>> -		}
>> +	for (i = 0; i < VDBATCH_SIZE; i++) {
>> +		vp = vd->tab[i];
>> +		TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
>> +		TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
>> +		MPASS(vp->v_dbatchcpu != NOCPU);
>> +		vp->v_dbatchcpu = NOCPU;
>>  	}
>> +	mtx_unlock(&vnode_list_mtx);
>> +	bzero(vd->tab, sizeof(vd->tab));
>>  	vd->index = 0;
>>  	critical_exit();
>>  }
>>
>> Still with:
>>
>> # grep USE_TMPFS= /usr/local/etc/poudriere.conf
>> # EXAMPLE: USE_TMPFS="wrkdir data"
>> #USE_TMPFS=all
>> #USE_TMPFS="data"
>> USE_TMPFS=no
>>
>>
>> That allowed the other builders to eventually reach "Builder started"
>> and later activity, with "[00:05:50] [27] [00:02:29] Builder started"
>> being the first non-[01] builder to do so. I observed no vlruwk
>> states in top:
>>
>> . . .
>>
>> Now testing for the zfs deadlock issue should be possible for
>> this setup.
>>
>
> Thanks for testing, I wrote a fix:
>
> https://people.freebsd.org/~mjg/vfs-recycle-fix.diff
>
> Applies to *stock* kernel (as in without the revert).

I'm going to leave the deadlock test running while I
sleep tonight, so it is going to be a while before
I get to testing this. $ work will likely happen first
as well. (No deadlock observed yet, by the way: 6+ hrs
and 3000+ ports built so far.)

I can easily restore sys/kern/vfs_subr.c and then
do normal 14.0-ALPHA2-ish based patching, so that is
not a problem. Thanks.
===
Mark Millard
marklmi at yahoo.com
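
For readers following the quoted vfs_subr.c diff: below is a minimal
userspace sketch of the two batch-processing shapes it contrasts. The
removed (-) lines only requeue the batch when a trylock on the list
mutex succeeds and otherwise drop the batch without touching the list;
the added (+) lines always take the mutex and requeue. This is not the
kernel code. It assumes pthreads and <sys/queue.h> in place of mtx(9)
and the vnode list, and every name in it (struct item, struct batch,
BATCH_SIZE, global_list, list_mtx, the function names) is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <sys/queue.h>

#define BATCH_SIZE 4	/* hypothetical; stands in for VDBATCH_SIZE */

/* Hypothetical stand-in for a vnode; only the list linkage matters. */
struct item {
	int id;
	TAILQ_ENTRY(item) link;
};

TAILQ_HEAD(item_list, item);

/* Userspace analogues of vnode_list and vnode_list_mtx. */
struct item_list global_list = TAILQ_HEAD_INITIALIZER(global_list);
pthread_mutex_t list_mtx = PTHREAD_MUTEX_INITIALIZER;

/* A full batch of items waiting to be moved to the list tail. */
struct batch {
	struct item *tab[BATCH_SIZE];
	int index;
};

/* Move every batched item to the tail; caller must hold list_mtx. */
void
requeue_locked(struct batch *b)
{
	for (int i = 0; i < BATCH_SIZE; i++) {
		struct item *it = b->tab[i];

		TAILQ_REMOVE(&global_list, it, link);
		TAILQ_INSERT_TAIL(&global_list, it, link);
	}
}

/*
 * Shape of the removed (-) lines: requeue only if the lock is free.
 * Returns 1 if the items were requeued, 0 if the batch was dropped
 * because the lock was busy. The batch is emptied either way.
 */
int
batch_process_trylock(struct batch *b)
{
	int requeued = 0;

	assert(b->index == BATCH_SIZE);
	if (pthread_mutex_trylock(&list_mtx) == 0) {
		requeue_locked(b);
		pthread_mutex_unlock(&list_mtx);
		requeued = 1;
	}
	memset(b->tab, 0, sizeof(b->tab));
	b->index = 0;
	return (requeued);
}

/* Shape of the added (+) lines: wait for the lock, always requeue. */
void
batch_process_blocking(struct batch *b)
{
	assert(b->index == BATCH_SIZE);
	pthread_mutex_lock(&list_mtx);
	requeue_locked(b);
	pthread_mutex_unlock(&list_mtx);
	memset(b->tab, 0, sizeof(b->tab));
	b->index = 0;
}
```

With a full batch, calling batch_process_trylock() while list_mtx is
held elsewhere leaves the list order untouched and returns 0, whereas
batch_process_blocking() always moves the batched items to the tail.
The sketch only shows that behavioral difference; it makes no claim
about why either shape leads to the vlruwk waits discussed above.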