Re: port binary dumping core on recent head in poudriere [tmpfs corruptions involving blocks of zeros that should not be all zeros]
Date: Tue, 26 Nov 2024 17:29:23 UTC
I think @kib has found the source of the problem. I've attached an attempt to fix it.

On 11/26/24 09:52, Mark Millard wrote:
> On Nov 26, 2024, at 05:38, Konstantin Belousov <kib@freebsd.org> wrote:
>
>> On Tue, Nov 26, 2024 at 01:58:19PM +0100, Dimitry Andric wrote:
>>> On 26 Nov 2024, at 13:32, Dimitry Andric <dim@FreeBSD.org> wrote:
>>>> On 26 Nov 2024, at 11:19, Dag-Erling Smørgrav <des@FreeBSD.org> wrote:
>>>>> Mark Millard <marklmi@yahoo.com> writes:
>>>>>> From inside a bulk -i where I did a manual make command
>>>>>> after it built and installed libsass.so.1.0.0 . The
>>>>>> manual make produced a /wrkdirs/ :
>>>>>> [...]
>>>>>> So the original creation looks okay. But . . .
>>>>>> [...]
>>>>>> So: The later, staged copy is a bad copy. Both are in the
>>>>>> tmpfs. So copying to the staging area makes a corrupted
>>>>>> copy inside the same tmpfs. After that, further copies of
>>>>>> staging's bad copy can be expected to be messed up.
>>>>> This and the fact that it happens on 14 and 15 but not on 13 strongly
>>>>> suggests an issue with `copy_file_range(2)`, since `install(1)` in 14 and
>>>>> 15 (but not in 13) now uses `copy_file_range(2)` if at all possible.
>>>>>
>>>>> My educated guess is that hole detection doesn't work reliably for files
>>>>> that have had holes filled while memory-mapped, so `copy_file_range(2)`
>>>>> thinks there is a hole where there isn't one and skips some of the data
>>>>> when `install(1)` uses it to copy the library from `${WRKSRC}` to
>>>>> `${STAGEDIR}`. This may or may not be specific to tmpfs.
>>>>>
>>>>> You may want to try applying the attached patch to your FreeBSD 14 and
>>>>> 15 jails. It prevents `cp(1)` and `install(1)` from trying to use
>>>>> `copy_file_range(2)`.
>>>> Yes, tmpfs is indeed the culprit (or at least involved). I have had USE_TMPFS=localbase in my poudriere.conf for a long time, since otherwise my build machine would run out of memory very quickly, so I didn't encounter any issues.
>>>>
>>>> Now I changed it to USE_TMPFS=yes, rebuilt only textproc/libsass and textproc/sassc, and then after reinstalling those packages:
>>>>
>>>> $ /usr/local/bin/sassc
>>>> Segmentation fault
>>> And after applying Dag-Erling's patch to disable copy_file_range for cp and install, it works correctly again.
>> So indeed there might be an issue in tmpfs seeking for data. Could you try
>> the following?
>>
>> commit f4b848946a131dab260b44eab2cfabceb82bee0c
>> Author: Konstantin Belousov <kib@FreeBSD.org>
>> Date:   Tue Nov 26 15:34:56 2024 +0200
>>
>>     tmpfs: do not skip pages searching for data
>>
>>     If the iterator finds invalid page at the requested pindex in
>>     swap_pager_seek_data(), the current code only looks at the swap blocks
>>     to search for data. This is not correct, valid pages may appear at the
>>     higher indexes still.
>>
>> diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c
>> index db925f4ae7f6..390b2c10d680 100644
>> --- a/sys/vm/swap_pager.c
>> +++ b/sys/vm/swap_pager.c
>> @@ -2503,12 +2503,9 @@ swap_pager_seek_data(vm_object_t object, vm_pindex_t pindex)
>>  	VM_OBJECT_ASSERT_RLOCKED(object);
>>  	vm_page_iter_init(&pages, object);
>>  	m = vm_page_iter_lookup_ge(&pages, pindex);
>> -	if (m != NULL) {
>> -		if (!vm_page_any_valid(m))
>> -			m = NULL;
>> -		else if (pages.index == pindex)
>> -			return (pages.index);
>> -	}
>> +	if (m != NULL && pages.index == pindex)
>> +		return (pages.index);
>> +
>>  	swblk_iter_init_only(&blks, object);
>>  	swap_index = swap_pager_iter_find_least(&blks, pindex);
>>  	if (swap_index == pindex)
> Not sufficient, unfortunately . . .
>
> I patched what I've been running and rebooted into:
>
> # uname -apKU
> FreeBSD 7950X3D-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT #152 main-n273696-43e045c1733d-dirty: Tue Nov 26 07:21:27 PST 2024 root@7950X3D-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1500027 1500027
>
> Note: 43e045c1733d is from 2024-Nov-18 .
>
> I then built libsass :
>
> [00:00:02] [01] [00:00:00] Building textproc/libsass | libsass-3.6.6
> [00:00:20] [01] [00:00:18] Finished textproc/libsass | libsass-3.6.6: Success ending TMPFS: 3.42 GiB
>
> I then installed it, resulting in:
>
> # pkg info libsass
> libsass-3.6.6
> Name           : libsass
> Version        : 3.6.6
> Installed on   : Tue Nov 26 07:33:15 2024 PST
> Origin         : textproc/libsass
> Architecture   : FreeBSD:15:amd64
> Prefix         : /usr/local
> Categories     : textproc
> Licenses       : MIT
> Maintainer     : nivit@FreeBSD.org
> WWW            : https://sass-lang.com/libsass
> Comment        : C/C++ implementation of a Sass compiler
> Shared Libs provided:
> 	libsass.so.1
> Annotations    :
> 	FreeBSD_version: 1500027
> 	build_timestamp: 2024-11-26T15:32:33+0000
> 	built_by       : poudriere-git-3.4.99.20240811
> . . .
>
> libsass.so.1.0.0 still has .got.plt starting with (this time):
>
>  2bed60 00000000 00000000 00000000 00000000  ................
>  2bed70 00000000 00000000 00000000 00000000  ................
>  2bed80 00000000 00000000 00000000 00000000  ................
>  2bed90 00000000 00000000 00000000 00000000  ................
> . . .
>  2bffc0 00000000 00000000 00000000 00000000  ................
>  2bffd0 00000000 00000000 00000000 00000000  ................
>  2bffe0 00000000 00000000 00000000 00000000  ................
>  2bfff0 00000000 00000000 00000000 00000000  ................
>  2c0000 96cb2a00 00000000 a6cb2a00 00000000  ..*.......*.....
>  2c0010 b6cb2a00 00000000 c6cb2a00 00000000  ..*.......*.....
>  2c0020 d6cb2a00 00000000 e6cb2a00 00000000  ..*.......*.....
>  2c0030 f6cb2a00 00000000 06cc2a00 00000000  ..*.......*.....
> . . .
>
> And still results in:
>
> # sassc
> Segmentation fault (core dumped)
>
>
>
> ===
> Mark Millard
> marklmi at yahoo.com
>
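
For anyone wanting to locate the damage without reading hex dumps: in the dump quoted above, the zeros stop exactly at 0x2c0000, which is a 4 KiB page boundary. That fits whole pages being skipped during the copy. Below is a rough sketch, not something from this thread, that compares a known-good file with a suspect copy page by page and reports ranges where the copy is all zeros but the original is not. The function name find_zeroed_pages and the 4096-byte page size are my own assumptions for illustration; it only reads the two files and says nothing about why the pages were lost.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on amd64


def find_zeroed_pages(good_path, bad_path, page_size=PAGE_SIZE):
    """Return merged (start, end) byte ranges where bad_path contains an
    all-zero page while the same page in good_path has non-zero data.

    Hypothetical helper for illustration; not part of any FreeBSD tool.
    """
    ranges = []
    with open(good_path, "rb") as good, open(bad_path, "rb") as bad:
        offset = 0
        while True:
            g = good.read(page_size)
            b = bad.read(page_size)
            if not g and not b:
                break  # both files exhausted
            # Flag the page only if the copy is entirely zero bytes
            # and the original holds at least one non-zero byte.
            if b and not any(b) and any(g):
                end = offset + len(b)
                if ranges and ranges[-1][1] == offset:
                    # Extend the previous range when pages are adjacent.
                    ranges[-1] = (ranges[-1][0], end)
                else:
                    ranges.append((offset, end))
            offset += page_size
    return ranges
```

Called with the path of the library in the work directory as good_path and the staged copy as bad_path, it returns a list of byte ranges; an empty list means no page of the copy was zeroed relative to the original. A copy that is merely shorter than the original is not reported, since truncation would be a different failure from the one described here.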