Re: port binary dumping core on recent head in poudriere [tmpfs corruptions involving blocks of zeros that should not be all zeros]

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 26 Nov 2024 15:52:47 UTC
On Nov 26, 2024, at 05:38, Konstantin Belousov <kib@freebsd.org> wrote:

> On Tue, Nov 26, 2024 at 01:58:19PM +0100, Dimitry Andric wrote:
>> On 26 Nov 2024, at 13:32, Dimitry Andric <dim@FreeBSD.org> wrote:
>>> 
>>> On 26 Nov 2024, at 11:19, Dag-Erling Smørgrav <des@FreeBSD.org> wrote:
>>>> 
>>>> Mark Millard <marklmi@yahoo.com> writes:
>>>>> From inside a bulk -i where I did a manual make command
>>>>> after it built and installed libsass.so.1.0.0 . The
>>>>> manual make produced a /wrkdirs/ :
>>>>> [...]
>>>>> So the original creation looks okay. But . . .
>>>>> [...]
>>>>> So: The later, staged copy is a bad copy. Both are in the
>>>>> tmpfs. So copying to the staging area makes a corrupted
>>>>> copy inside the same tmpfs. After that, further copies of
>>>>> staging's bad copy can be expected to be messed up.
>>>> 
>>>> This and the fact that it happens on 14 and 15 but not on 13 strongly
>>>> suggests an issue with `copy_file_range(2)`, since `install(1)` in 14 and
>>>> 15 (but not in 13) now uses `copy_file_range(2)` if at all possible.
>>>> 
>>>> My educated guess is that hole detection doesn't work reliably for files
>>>> that have had holes filled while memory-mapped, so `copy_file_range(2)`
>>>> thinks there is a hole where there isn't one and skips some of the data
>>>> when `install(1)` uses it to copy the library from `${WRKSRC}` to
>>>> `${STAGEDIR}`.  This may or may not be specific to tmpfs.
>>>> 
>>>> You may want to try applying the attached patch to your FreeBSD 14 and
>>>> 15 jails.  It prevents `cp(1)` and `install(1)` from trying to use
>>>> `copy_file_range(2)`.
>>> 
>>> Yes, tmpfs is indeed the culprit (or at least involved). I have had USE_TMPFS=localbase in my poudriere.conf for a long time, since otherwise my build machine would run out of memory very quickly, so I didn't encounter any issues.
>>> 
>>> Now I changed it to USE_TMPFS=yes, rebuilt only textproc/libsass and textproc/sassc, and then after reinstalling those packages:
>>> 
>>> $ /usr/local/bin/sassc
>>> Segmentation fault
>> 
>> And after applying Dag-Erling's patch to disable copy_file_range for cp and install, it works correctly again.
> 
> So indeed there might be an issue in tmpfs seeking for data.  Could you try
> the following?
> 
> commit f4b848946a131dab260b44eab2cfabceb82bee0c
> Author: Konstantin Belousov <kib@FreeBSD.org>
> Date:   Tue Nov 26 15:34:56 2024 +0200
> 
>    tmpfs: do not skip pages searching for data
> 
>    If the iterator finds invalid page at the requested pindex in
>    swap_pager_seek_data(), the current code only looks at the swap blocks
>    to search for data.  This is not correct, valid pages may appear at the
>    higher indexes still.
> 
> diff --git a/sys/vm/swap_pager.c b/sys/vm/swap_pager.c
> index db925f4ae7f6..390b2c10d680 100644
> --- a/sys/vm/swap_pager.c
> +++ b/sys/vm/swap_pager.c
> @@ -2503,12 +2503,9 @@ swap_pager_seek_data(vm_object_t object, vm_pindex_t pindex)
> 	VM_OBJECT_ASSERT_RLOCKED(object);
> 	vm_page_iter_init(&pages, object);
> 	m = vm_page_iter_lookup_ge(&pages, pindex);
> -	if (m != NULL) {
> -		if (!vm_page_any_valid(m))
> -			m = NULL;
> -		else if (pages.index == pindex)
> -			return (pages.index);
> -	}
> +	if (m != NULL && pages.index == pindex)
> +		return (pages.index);
> +
> 	swblk_iter_init_only(&blks, object);
> 	swap_index = swap_pager_iter_find_least(&blks, pindex);
> 	if (swap_index == pindex)

The patch is not sufficient, unfortunately . . .

I patched what I've been running and rebooted into:

# uname -apKU
FreeBSD 7950X3D-ZFS 15.0-CURRENT FreeBSD 15.0-CURRENT #152 main-n273696-43e045c1733d-dirty: Tue Nov 26 07:21:27 PST 2024     root@7950X3D-ZFS:/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/sys/GENERIC-NODBG amd64 amd64 1500027 1500027

Note: 43e045c1733d is from 2024-Nov-18 .

I then built libsass :

[00:00:02] [01] [00:00:00] Building   textproc/libsass | libsass-3.6.6
[00:00:20] [01] [00:00:18] Finished   textproc/libsass | libsass-3.6.6: Success ending TMPFS: 3.42 GiB

I then installed it, resulting in:

# pkg info libsass
libsass-3.6.6
Name           : libsass
Version        : 3.6.6
Installed on   : Tue Nov 26 07:33:15 2024 PST
Origin         : textproc/libsass
Architecture   : FreeBSD:15:amd64
Prefix         : /usr/local
Categories     : textproc
Licenses       : MIT
Maintainer     : nivit@FreeBSD.org
WWW            : https://sass-lang.com/libsass
Comment        : C/C++ implementation of a Sass compiler
Shared Libs provided:
libsass.so.1
Annotations    :
FreeBSD_version: 1500027
build_timestamp: 2024-11-26T15:32:33+0000
built_by       : poudriere-git-3.4.99.20240811
. . .

The installed libsass.so.1.0.0 still has .got.plt starting with a block of zeros (this time):

 2bed60 00000000 00000000 00000000 00000000  ................
 2bed70 00000000 00000000 00000000 00000000  ................
 2bed80 00000000 00000000 00000000 00000000  ................
 2bed90 00000000 00000000 00000000 00000000  ................
. . .
 2bffc0 00000000 00000000 00000000 00000000  ................
 2bffd0 00000000 00000000 00000000 00000000  ................
 2bffe0 00000000 00000000 00000000 00000000  ................
 2bfff0 00000000 00000000 00000000 00000000  ................
 2c0000 96cb2a00 00000000 a6cb2a00 00000000  ..*.......*.....
 2c0010 b6cb2a00 00000000 c6cb2a00 00000000  ..*.......*.....
 2c0020 d6cb2a00 00000000 e6cb2a00 00000000  ..*.......*.....
 2c0030 f6cb2a00 00000000 06cc2a00 00000000  ..*.......*.....
. . .

And still results in:

# sassc
Segmentation fault (core dumped)
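
For anyone wanting to test the hole-detection hypothesis directly on the good copy, here is a minimal sketch, not something run in this thread. It walks the regions that lseek(2) reports as holes via SEEK_DATA/SEEK_HOLE and counts bytes in them that read back as non-zero. The function names and the 4096-byte buffer are illustrative assumptions. It also assumes that plain reads of the source file return the correct data (consistent with the original copy looking okay) and that copy_file_range(2) relies on the same hole information that lseek(2) reports.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes SEEK_DATA/SEEK_HOLE on glibc; not needed on FreeBSD */
#endif
#include <sys/types.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from this thread): count the bytes that lie
 * inside regions reported as holes but read back as non-zero.
 * Returns the count (0 when hole reporting is consistent), or -1 on error.
 */
long long
nonzero_bytes_in_reported_holes(int fd)
{
	unsigned char buf[4096];
	long long bad = 0;
	off_t end, data, hole, pos = 0;

	end = lseek(fd, 0, SEEK_END);
	if (end == -1)
		return (-1);
	while (pos < end) {
		/* Next offset >= pos that the filesystem reports as data. */
		data = lseek(fd, pos, SEEK_DATA);
		if (data == -1) {
			if (errno != ENXIO)
				return (-1);
			data = end;	/* ENXIO: reported hole runs to EOF */
		}
		/* [pos, data) is a reported hole, so it should read as zeros. */
		while (pos < data) {
			size_t want = sizeof(buf);
			ssize_t i, n;

			if ((off_t)want > data - pos)
				want = (size_t)(data - pos);
			n = pread(fd, buf, want, pos);
			if (n <= 0)
				return (-1);
			for (i = 0; i < n; i++)
				if (buf[i] != 0)
					bad++;
			pos += n;
		}
		if (data >= end)
			break;
		/* Skip the data region by jumping to the next reported hole. */
		hole = lseek(fd, data, SEEK_HOLE);
		if (hole == -1 || hole <= data)
			return (-1);
		pos = hole;
	}
	return (bad);
}

/* Hypothetical convenience wrapper: run the check on a path. */
long long
check_path(const char *path)
{
	long long bad;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd == -1)
		return (-1);
	bad = nonzero_bytes_in_reported_holes(fd);
	close(fd);
	return (bad);
}
```

Calling check_path() on the work-directory copy of libsass.so.1.0.0 should return 0 if hole reporting is consistent. A positive result would point at the SEEK_DATA/SEEK_HOLE answers for that file rather than at install(1) itself.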



===
Mark Millard
marklmi at yahoo.com