Re: USB-serial adapter suggestions needed

From: Mark Millard <marklmi_at_yahoo.com>
Date: Wed, 10 Jan 2024 19:21:39 UTC

On Jan 10, 2024, at 10:16, bob prohaska <fbsd@www.zefox.net> wrote:

> On Tue, Jan 09, 2024 at 05:03:42PM -0800, Mark Millard wrote:
>> On Jan 9, 2024, at 14:47, bob prohaska <fbsd@www.zefox.net> wrote:
>> 
> [transcript of ssh-tip disconnect omitted]
>> 
>> Interesting.
>> 
>> www.zefox.org is the aarch64 that is not configured in config.txt
>> in the normal aarch64 manner. As I've requested before, please test
>> using a config.txt that instead has:
>> 
>> QUOTE
>> [all]
>> arm_64bit=1
>> dtparam=audio=on,i2c_arm=on,spi=on
>> dtoverlay=mmc
>> dtoverlay=disable-bt
>> device_tree_address=0x4000
>> kernel=u-boot.bin
>> 
>> [pi4]
>> hdmi_safe=1
>> armstub=armstub8-gic.bin
>> 
>> # Local addition:
>> [all]
>> force_mac_address=b8:27:eb:71:46:4f
>> END QUOTE
>> 
>> Please do not use a configuration based in part on armv7 FreeBSD
>> config.txt material any more for the testing activity: Just use
>> the FreeBSD normal aarch64 configuration with the force_mac_address
>> related material added at the end.
>> 
>> I want to know if this also fails when powerd is not in
>> use anywhere.
>> 
> 
> I'd like to let the present OS build/install cycle complete.
> Then I'll replace config.txt on www.zefox.org and reboot.
> That should be done in another day or two. The remote console
> disconnect reported above hasn't happened again, all consoles
> have stayed connected and responsive.
> 
> 
>> [I assume that the "The Pi4 workstation" is the "pi4 RasPiOS
>> workstation". True? Presuming yes: Is the RasPiOS the 64 bit
>> variation? (Just my curiosity.)]
>> 
> Yes. Uname -a reports 
> Linux raspberrypi 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr  3 17:24:16 
> BST 2023 aarch64 GNU/Linux
> 
>> Do you run the buildworld on www.zefox.org and such via the
>> tip session on pelorus.zefox.org ? Via an ssh session from the
>> "pi4 RasPiOS workstation" to www.zefox.org ? More generally:
>> 
>> A) What runs (if anything) via the tip session started from
>>    pelorus.zefox.org ?
>> 
>> B) What runs (if anything) via the ssh session connected to
>>    www.zefox.org ?
>> 
> 
> In general the tip session is used only for observation or
> troubleshooting. Ssh connections are used for other activity, 
> including OS build/install cycles, poudriere, etc. They are
> usually placed in the background, writing to log files so that
> accidental disconnects from the workstation don't stop them.

Are you using:

NAME
     nohup – invoke a utility immune to hangups

SYNOPSIS
     nohup [--] utility [arguments]

DESCRIPTION
     The nohup utility invokes utility with its arguments and at this time
     sets the signal SIGHUP to be ignored.  If the standard output is a
     terminal, the standard output is appended to the file nohup.out in the
     current directory.  If standard error is a terminal, it is directed to
     the same place as the standard output.

     Some shells may provide a builtin nohup command which is similar or
     identical to this utility.  Consult the builtin(1) manual page.

?
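
For reference, a minimal sketch of the sort of thing I mean, assuming
a POSIX sh. The function name run_detached and the example log path
are hypothetical, just for illustration, not anything from your setup:

```shell
#!/bin/sh
# Hypothetical helper: start a command under nohup, in the background,
# with stdin detached and stdout/stderr sent to a log file.
# Prints the PID of the background job.
run_detached() {
    log=$1
    shift
    # nohup sets SIGHUP to be ignored for the command it starts.
    # Since stdout is redirected, nothing goes to nohup.out.
    nohup "$@" > "$log" 2>&1 < /dev/null &
    echo $!
}

# Illustrative use (path and job count are assumptions):
# run_detached /tmp/buildworld.log make -C /usr/src -j4 buildworld
```

If something like this is already in use, then a SIGHUP from a dropped
ssh session should not stop the build itself, which would help separate
the build from the disconnect question.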

>> A useful test would be to not have the tip command running
>> on polaris.zefox.org and to just use the ssh to www.zefox.org
>> instead to start the buildworld/buildkernel. So: No use
>> of the serial connection when the buildworld is started or
>> during the build(s). Using tip before that but quitting tip
>> before starting to load the RPi4B would be okay for this type
>> of test. The question would be if the:
>> 
>> client_loop: send disconnect: Broken pipe
>> 
>> still happens.
>> 
>> (I'm not claiming that recovery if it fails would be nice. But
>> finding out if it fails looks to be important.)
>> 
>> The contrasting useful test would be to start the buildworld
>> from the tip session on polaris.zefox.org and to not have any
>> ssh session to www.zefox.org . The question would be if a
>> failure of some kind still happens. (The tip session does not
>> have a pipe in use as far as I know so the detail for
>> identifying failure would likely be different.)
>> 
> 
> Normal practice is to leave the tip sessions displaying the 
> console host's login prompt. So long as the console login is 
> responsive I can assume that host isn't hung.
> 
>> Another question would be: do both such tests fail? Just one
>> (which)? None? So trying both tests eventually would be
>> important.
> 
> In general, ssh sessions behave completely independently. 
> Ssh connections to tip sessions commonly fail but no other 
> ssh connection to that terminal server is disturbed visibly.
>> 
>> It is important to have only one of the 2 types of connections
>> in use during the buildworld/buildkernel and such activity for
>> this type of test --and only the one instance of which ever
>> type the active test is for.
>> 
>> 
> 
> Apologies if I didn't answer your question; I'm missing the gist.

I want only one possible source of hangups/failures in each test, so
that there is no question about which one (network vs. serial) led to
the failure if one happens.

That only the ssh sessions that in turn run tip fail suggests that the
tip session gets the initial problem and then things propagate. I
want more than a suggestion. For example: do direct tip runs that
are not in any ssh session still get some form of failure? For
another: with no tip use, just ssh, do failures still happen? Do both
ways still get failures?

Yes, the implication is that some experiments that do not have
your normal structure are involved and there may be risk of not
being able to use a tip session as a responsiveness test during
such an experiment. I'm not suggesting any such thing for normal
operation once such experiments are finished.

> It remains unclear where the disconnects to tip originate.

That is part of what I'm requesting exploration of via
different techniques than past attempts that did not provide
the information.

> If the tip
> session is stopped by typing ~~. from the originating ssh instance I'm 
> returned to the shell on the terminal server. Ssh isn't disturbed. If 
> I type ~. the ssh session terminates and I'm back to the workstation's 
> shell. Would it be informative to start a tip session, then ssh in 
> separately and try to kill tip?

A question is if SIGHUP is happening. If it is, then the kill that would
simulate the issue would be via sending SIGHUP. But this may be only one
of however many alternatives there may be. I prefer to explore what
is actually happening rather than attempt simulations based on guesses
at what is happening.
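
One low-effort way to collect evidence, as a sketch only: run the
command through a small wrapper that reports how it ended. The name
report_exit is hypothetical. It relies on the usual shell convention
that an exit status above 128 means termination by a signal, and on
"kill -l status" naming that signal. If the command catches the signal
and exits on its own (I have not checked what tip does), the status
will not show it.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, then report whether it exited
# normally or was terminated by a signal, and pass the status along.
report_exit() {
    "$@"
    status=$?
    if [ "$status" -gt 128 ]; then
        # kill -l with an exit status prints the signal name (e.g. HUP).
        echo "terminated by signal $(kill -l "$status")"
    else
        echo "exited with status $status"
    fi
    return "$status"
}

# Illustrative use (the tip arguments are whatever is normally used):
# report_exit tip <system-name> >> /tmp/tip-exit.log
```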

> I'd expect the ssh part of the link
> to remain up. If not, would it be significant? 
> 
> Occasionally warnings like
> Jan 10 00:23:30 ns1 sshd[925]: error: beginning MaxStartups throttling
> show up in console messages. Might those be relevant in some way?  

Hmm. Interesting. Looking around I see notation like:

MaxStartups 10:30:100

where (mostly copy/pasted wording from an example, other than detailed formatting):

10: concurrent unauthenticated sessions before it begins rejecting some subsequent connections
30: The percent of subsequent connections that are rejected [but see below]
100: At this many concurrent unauthenticated sessions, sshd rejects all subsequent connections

Looking, "man sshd_config" reports:

     MaxStartups
             Specifies the maximum number of concurrent unauthenticated
             connections to the SSH daemon.  Additional connections will be
             dropped until authentication succeeds or the LoginGraceTime
             expires for a connection.  The default is 10:30:100.

             Alternatively, random early drop can be enabled by specifying the
             three colon separated values start:rate:full (e.g. "10:30:60").
             sshd(8) will refuse connection attempts with a probability of
             rate/100 (30%) if there are currently start (10) unauthenticated
             connections.  The probability increases linearly and all
             connection attempts are refused if the number of unauthenticated
             connections reaches full (60).
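
To make the arithmetic concrete, here is a small sketch of my reading
of that text: 0% below start, rate% at start, rising linearly to 100%
at full. The function name maxstartups_drop_pct is hypothetical and the
integer rounding is mine, so treat the numbers as approximate rather
than as what sshd computes exactly.

```shell
#!/bin/sh
# Hypothetical helper: estimate the refusal percentage for a
# start:rate:full setting at n unauthenticated connections,
# using linear interpolation between rate% (at start) and 100% (at full).
# Usage: maxstartups_drop_pct start rate full n
maxstartups_drop_pct() {
    start=$1 rate=$2 full=$3 n=$4
    if [ "$n" -lt "$start" ]; then
        echo 0
    elif [ "$n" -ge "$full" ]; then
        echo 100
    else
        echo $(( rate + (100 - rate) * (n - start) / (full - start) ))
    fi
}

# With the default 10:30:100:
maxstartups_drop_pct 10 30 100 10    # prints 30
maxstartups_drop_pct 10 30 100 55    # prints 65
```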


It does suggest that testing isolated from the source(s) of
unauthenticated sessions could be worthwhile in case handling
the load from such sessions when already heavily loaded with
buildworld/buildkernel or the like leads to other problems (and
denial of service consequences?).
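
Before going that far, it may be enough to check if the throttling
messages line up in time with the disconnects. A sketch, assuming
syslog-style lines like the one you quoted; the function name
throttle_times is hypothetical and the log file location is an
assumption about your configuration:

```shell
#!/bin/sh
# Hypothetical helper: print the timestamp (month, day, time) of each
# "MaxStartups throttling" message found in a syslog-style file.
throttle_times() {
    awk '/MaxStartups throttling/ { print $1, $2, $3 }' "$1"
}

# Illustrative use (log path is an assumption):
# throttle_times /var/log/messages
```

Comparing that list against the times of the "Broken pipe" reports
would say if the two are even candidates for being related.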

I do not expect that this issue is all that likely but
expectations are not evidence of their own accuracy/inaccuracy.

===
Mark Millard
marklmi at yahoo.com