Re: 14-CURRENT: www/nextcloud: php occ/web access : Segmentation fault

From: Bakul Shah <bakul_at_iitbombay.org>
Date: Wed, 30 Jun 2021 18:33:02 UTC
On Jun 28, 2021, at 1:39 PM, O. Hartmann <ohartmann@walstatt.org> wrote:
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
> 
> 
> Hello,
> 
> we ran into serious trouble here with a www/nextcloud installation on a recent 14-CURRENT
> (FreeBSD 14.0-CURRENT #23 main-n247612-e6dd0e2e8d4: Mon Jun 28 18:08:20 CEST 2021 amd64).
> Ports tree is up to date and ports are built via traditional "make". Port www/nextcloud, all
> mod_ ports and even every php-* port (php74 is installed and default) has been recompiled
> within the last two weeks via "portmaster -f".
> 
> The problem first occurred a couple of weeks ago, when access from the web via chromium and/or
> firefox suddenly reported "Secure Connection Failed". I checked the Apache
> 2.4 server's certificate (self-signed, never an issue so far), but there seems to be no
> issue with it.
> It got very strange when I tried to perform an upgrade and/or check via
> 
> cd /usr/local/www/nextcloud
> su -m -c "/usr/local/bin/php ./occ upgrade"
> 
> Whenever I access occ, I receive a "Segmentation fault".
> Then I checked the server's error log and found, for each access of the nextcloud instance, an
> entry like
> 
> [Tue Jun 01 06:04:40.667026 2021] [core:notice] [pid 81123:tid 34374492160] AH00052: child pid
> 24598 exit signal Segmentation fault (11)
> 
> Well, I'm out of ideas. It seems nextcloud, php or apache, or all in combination, have a
> serious problem that is hard to track down on 14-CURRENT (another instance running on 12.2-RELENG
> doesn't have any issues).
> 
> Can someone give me a hint on how to track down this nasty error?

Some general ideas:

You can use ktrace and tcpdump to capture in some detail what is going on.
ktrace can tell you if there was a failing open near the crash (often due to
the "all the world is linux" syndrome). There should be a core dump connected
to the segfault - make sure coredumpsize limit is not set to 0. You may be
able to get a stacktrace from it. You can then compile the offending program
with -g and capture file:line with gdb. You can then pore over the source
code and add some checks or traps of your own (if you are guessing what the
bug may be). Also look at /var/log/messages for anything unusual. ["Unusual"
may become more apparent as you gain debugging experience.]
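
A rough sketch of the ktrace and core dump steps above, as sh functions.
The php path and the occ invocation come from the original report. The
function names, the trace file /tmp/occ.ktrace, and the core file name
php.core (the usual default, if kern.corefile is unchanged) are assumptions
for illustration. gdb is assumed to be installed from ports. The awk filter
assumes kdump's default output layout and will mis-parse paths or command
names that contain spaces.

```shell
#!/bin/sh

# Read kdump output on stdin and print the path and errno of every
# failed open/openat. The path comes from the most recent NAMI record.
failed_opens() {
    awk '
        $3 == "NAMI" { name = $4 }
        $3 == "RET" && ($4 == "open" || $4 == "openat") && $5 == "-1" {
            print name, "errno", $7
        }
    '
}

# Trace the crashing command (children included via -i), then list the
# opens that failed. /tmp/occ.ktrace is a hypothetical file name.
trace_occ() (
    cd /usr/local/www/nextcloud || exit 1
    ktrace -i -f /tmp/occ.ktrace /usr/local/bin/php ./occ upgrade
    kdump -f /tmp/occ.ktrace | failed_opens
)

# Re-run the command with core dumps enabled and print a backtrace.
# In csh/tcsh the equivalent limit is "limit coredumpsize unlimited".
backtrace_occ() (
    cd /usr/local/www/nextcloud || exit 1
    ulimit -c unlimited
    /usr/local/bin/php ./occ upgrade
    gdb -batch -ex bt /usr/local/bin/php php.core
)
```

Sourcing the file only defines the functions. On the affected machine, run
trace_occ or backtrace_occ as the same user that normally runs occ. The
backtrace only shows file:line once php (or the faulting extension) has been
rebuilt with -g.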

If you have a bug report, adding such relevant details may be of help to others.

Note that if you switch to a different version of *some* s/w piece, the
problem may disappear. That may be fine if you are under time pressure and
just want to work around the problem but in general it is better to try to
catch the bug without changing its environment.