(ipfw) Re: HELP! fetch: stuck forever OR error: RPC failed: curl 56 recv failure: Operation timed out
- Reply: Juraj Lutter : "Re: (ipfw) Re: HELP! fetch: stuck forever OR error: RPC failed: curl 56 recv failure: Operation timed out"
- In reply to: FreeBSD User : "Re: HELP! fetch: stuck forever OR error: RPC failed: curl 56 recv failure: Operation timed out"
Date: Sun, 08 Dec 2024 19:30:36 UTC
Hi,
I can reproduce your error.
Today I updated my RPI4 from a build of Oct 23 to one of Dec 6, and the problem shows up there as well.
After about 2 hours scp exits with:
client_loop: send disconnect: Broken pipe
scp: Connection closed
Working:
FreeBSD rpi4 15.0-CURRENT FreeBSD 15.0-CURRENT #4 main-d2e7bb630b8-dirty: Wed Oct 23 00:55:12 CEST 2024 ronald@rpi4:/data/ronald/freebsd/obj/data/ronald/freebsd/src/main/arm64.aarch64/sys/GENERIC-NODEBUG arm64
Broken:
FreeBSD rpi4 15.0-CURRENT FreeBSD 15.0-CURRENT #5 main-839fb85336a-dirty: Sat Dec 7 22:33:27 CET 2024 ronald@rpi4:/data/ronald/freebsd/obj/data/ronald/freebsd/src/main/arm64.aarch64/sys/GENERIC-NODEBUG arm64
A cron job that does an scp to another server stopped working. When I go back to the previous BE (boot environment) it works fine again.
Running "ipfw disable firewall" also makes the scp work.
Scp also seems to work fine if I replace the stateful firewall rules with a stateless "pass all from any to any".
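
A minimal sketch of that stateful-versus-stateless comparison, written as two shell functions that only print ipfw rule lines rather than apply them. The rule numbers and the exact stateful rules are assumptions for illustration (loosely modelled on what a "workstation"-style ruleset does), not the ruleset actually loaded on the rpi4; only the "pass all from any to any" rule is taken from the test above.

```shell
#!/bin/sh
# Sketch only: print two alternative ipfw rule lists for an A/B test.
# Nothing here touches the running firewall.

# Stateful variant (hypothetical rule numbers): outgoing connections
# create dynamic rules via keep-state, replies match at check-state.
stateful_rules() {
    cat <<'EOF'
add 100 check-state
add 200 pass tcp from me to any setup keep-state
add 300 pass udp from me to any keep-state
add 65000 deny ip from any to any
EOF
}

# Stateless variant: the single rule used in the test above.
stateless_rules() {
    cat <<'EOF'
add 100 pass all from any to any
EOF
}

# Show both lists so they can be reviewed before use.
echo "# stateful"
stateful_rules
echo "# stateless"
stateless_rules
```

Each printed line is an argument list for ipfw(8), so on a test box a list could be loaded with something like `stateless_rules | while read -r rule; do ipfw -q $rule; done` after flushing the old rules. Flushing a ruleset over a remote session can lock you out, so this is best tried from the console.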
Regards,
Ronald.
On 06-12-2024 at 21:09, FreeBSD User wrote:
> On Fri, 6 Dec 2024 19:40:02 +0100 (CET)
> Ronald Klop <ronald-lists@klop.ws> wrote:
>
>> Might be useful to share your ipfw config.
>
> Sorry, my posting must have been confusing (having in mind a "deny any" rule and then
> disabling the FW ...).
>
> Well, the IPFW setup itself is explained quickly - I use almost the vanilla rc.conf-issued
> IPFW (settings: firewall_type="workstation", firewall_logif="YES",
> firewall_myservices="22/tcp", firewall_allowservices="any"). The hosts in question have the
> following kernel configuration, I provide the option tags that might be of interest or, if
> not, just for the record, as they are not part of GENERIC, see below.
>
> Also, I'll provide some sysctl setting performed via /etc/sysctl.conf.local, see below.
>
> The configuration and settings have been mostly unchanged over a couple of months for now and
> did not induce trouble so far.
>
> As far as time and my limited skills allowed, I disabled and re-enabled the MAC_ and
> NETGRAPH_ options piece by piece - without any success so far. My "measurement" is fetching
> emails via claws-mail (all TLS): claws-mail reports "corrupted/broken stream", has
> authentication issues and is de facto unusable - it doesn't refresh IMAP-based email fetches
> and doesn't even quit without a hard kill.
> Another "indicator" is the time taken to "git pull" of ZFS filesystems: cloning and pulling
> take unusually long (/usr/src is UFS/FFS, /usr/ports on a ZFS pool and since the problem
> occurred, it makes a mutual difference).
>
> While git pull or clone is stuck and claws-mail is endlessly fetching/authenticating
> emails and never responding in a usable manner, performing
>
> "ipfw disable firewall"
>
> makes the system suddenly work again as usual and expected.
>
> As reported - the problem spreads across all of my CURRENT hosts as I'm going to update them
> towards a recent CURRENT (they all share similar static kernel configs as described here). Most
> of the boxes do not show the weird reluctant behaviour when pulling via git, but weren't
> capable of cloning, bailing out with the timeout reported earlier.
>
> I use only one CURRENT box as my personal desktop, so none of the other (server) CURRENT
> boxes shows the email problem as described.
>
> And, for the record: I haven't commented out the "options IPFIREWALL" yet in the kernel
> config ...
>
>
> Kind regards
>
> oh
>
> [ KERNEL config different from vanilla GENERIC ]
>
> options RATELIMIT
> options ZFS
> options TCPHPTS
> options MROUTING
> options IPSEC
> options SCTP
>
> options MAC_BSDEXTENDED
> options MAC_PORTACL
> options MAC_IPACL
> options MAC_NTPD
> #options MAC_DO
>
> options NETGRAPH
> options NETGRAPH_IPFW
> options NETGRAPH_ETHER
> options NETGRAPH_EIFACE
> options NETGRAPH_VLAN
> #options NETGRAPH_NAT
> options NETGRAPH_DEVICE
> #options NETGRAPH_PPPOE
> options NETGRAPH_SOCKET
> options NETGRAPH_KSOCKET
> options NETGRAPH_NETFLOW
> #options NETGRAPH_CAR
>
> # IPFW firewall
> options IPFIREWALL
> options IPFIREWALL_VERBOSE
> options DUMMYNET # traffic shaper
>
> options BPF_JITTER # adds support for BPF just-in-time compiler.
>
> # Pseudo devices not in GENERIC.
> device enc # IPsec device
> device stf # 6to4 IPv6 over IPv4 encapsulation
> device carp # Common address redundancy protocol
> device lagg # Link aggregation
> device gre # GRE Tunnel
> device epair # A pair of virtual back-to-back connected Ethernet interfaces
> device if_bridge # bridge device
> device vxlan # Virtual eXtensible LAN interface
>
>
> For the MAC_ modules: the corresponding OIDs (sysctl) are disabled wherever a MAC module
> would influence the initial behaviour if left unconfigured, for instance
> (/etc/sysctl.conf.local)
>
> [ /etc/sysctl.conf.local ]
> security.mac.bsdextended.enabled=0
> security.mac.mls.enabled=0
> security.mac.portacl.enabled=0
> security.mac.do.enabled=0
> security.mac.ipacl.ipv6=0
> security.mac.ipacl.ipv4=0
> #
> net.bpf.optimize_writers=1
> #
> net.inet.ip.fw.verbose=1
> #net.inet.ip.fw.verbose_limit=10
> net.inet.ip.fw.dyn_keep_states=1
>
>
>
>
>
>>
>> From: FreeBSD User <freebsd@walstatt-de.de>
>> Date: 6 December 2024 03:47
>> To: freebsd-current@freebsd.org, freebsd-ipfw@freebsd.org
>> Subject: Re: HELP! fetch: stuck forever OR error: RPC failed: curl 56 recv failure:
>> Operation timed out
>>
>>>
>>>
>>> On Thu, 5 Dec 2024 17:33:54 +0100
>>> FreeBSD User wrote:
>>>
>>> I found the culprit!
>>>
>>> Disabling IPFW ("ipfw disable firewall") turns the system back to normal!
>>>
>>> For the record: on recent CURRENT, since approx. Nov. 30 and/or December 1st CURRENT seems
>>> to corrupt network connections.
>>>
>>> IPFW is compiled statically into the kernel.
>>>
>>> The problem sketched below can be reproduced in a more or less obvious manner on recent
>>> CURRENT: git pull/git clone of a regular FreeBSD source repo or ports via git+https takes
>>> either a long time (up to several minutes to initiate the pull) or, in some worse
>>> cases here, the box runs into
>>> error: RPC failed; curl 56 Recv failure: Operation timed out
>>>
>>> claws-mail complains about "corrupted/broken stream", fetching emails takes Aeons -
>>> forever, the client does not come back even after several hours.
>>>
>>>> On Thu, 5 Dec 2024 16:55:00 +0100
>>>> Daniel Tameling wrote:
>>>>
>>>>> On Thu, Dec 05, 2024 at 11:51:03AM +0100, FreeBSD User wrote:
>>>>>> On Wed, 04 Dec 2024 17:20:39 +0000
>>>>>> "Dave Cottlehuber" wrote:
>>>>>>
>>>>>> Thank you very much for responding!
>>>>>>
>>>>>>> On Tue, 3 Dec 2024, at 19:46, FreeBSD User wrote:
>>>>>>>> On most recent CURRENT (on some boxes of ours, not all) fetch/git seem
>>>>>>>> to be stuck
>>>>>>>> forever fetching tarballs from ports, fetching Emails via claws-mail
>>>>>>>> (TLS), opening
>>>>>>>> websites via librewolf and firefox or pulling repositories via git.
>>>>>>>>
>>>>>>>> CURRENT: FreeBSD 15.0-CURRENT #1 main-n273978-b5a8abe9502e: Mon Dec 2
>>>>>>>> 23:11:07 CET 2024
>>>>>>>> amd64
>>>>>>>>
>>>>>>>> When performing "git pull" under /usr/ports, I received after roughly 5-7 minutes:
>>>>>>>>
>>>>>>>> error: RPC failed; curl 56 Recv failure: Operation timed out
>>>>>>>
>>>>>>> Generally it would be worth seeing if the HTTP(S) layers are doing the right thing
>>>>>>> or not, and then working down from there, to tcpdump / wireshark and then if
>>>>>>> necessary into kernel itself.
>>>>>>
>>>>>> My skills are limited when it comes to packet analysis using tcpdump/wireshark (and
>>>>>> theory, of course). I tried because I had "a feeling" that my older Intel-based NIC
>>>>>> could have some checksum issues like in the past (I saw e1000 driver updates recently
>>>>>> flowing into FreeBSD CURRENT).
>>>>>>>
>>>>>>> If fetch fails reliably in ports distfile fetching, then isolate a suitable
>>>>>>> tarball, and try it again in curl, with tcpdump already prepared to capture
>>>>>>> traffic to the remote host.
>>>>>>>
>>>>>>> tcpdump -w /tmp/curl.pcap -i ... host ...
>>>>>>>
>>>>>>> env SSLKEYLOGFILE=/tmp/ssl.keys curl -vsSLo /dev/null --trace /tmp/curl.log https://what.ev/er
>>>>>>>
>>>>>>> I would guess that between the two something useful should pop up.
>>>>>>>
>>>>>>> I like opening the pcap in wireshark, it often has angry red and black highlighted
>>>>>>> lines already giving me a hint.
>>>>>>>
>>>>>>> The SSLKEYLOGFILE can be imported into wireshark, and allows decrypting the TLS
>>>>>>> traffic as well in case there are issues further in. Very handy,
>>>>>>> see https://everything.curl.dev/usingcurl/tls/sslkeylogfile.html for how to do that.
>>>>>>>
>>>>>>> If your issues only occur with git pull, it's also curl inside and supports similar
>>>>>>> debugging. Ferreting
>>>>>>> through https://stackoverflow.com/questions/6178401/how-can-i-debug-git-git-shell-related-problems/56094711#56094711 should get you similar info.
>>>>>>>
>>>>>>> A+
>>>>>>> Dave
>>>>>>>
>>>>>>
>>>>>> Thanks for the hints and precious tips! I'll dig deeper into the matter.
>>>>>>
>>>>>> In the meantime, I updated some other machines that had been running an older CURRENT
>>>>>> (approx. two weeks old) to the most recent one - and face similar but not
>>>>>> identical problems!
>>>>>> Updating existing FreeBSD repositories, like src.git and ports.git, shows no problems
>>>>>> except that it takes longer to complete than expected.
>>>>>> Cloning a repo is impossible; after 10 or 15 minutes I receive a timeout.
>>>>>>
>>>>>> On a CURRENT box that was recently updated and worked flawlessly before (CURRENT now: FreeBSD
>>>>>> 15.0-CURRENT #5 main-n274014-b2bde8a6d39: Wed Dec 4 22:22:22 CET 2024 amd64),
>>>>>> cloning attempts for 14.2-RELENG end up in this mess:
>>>>>>
>>>>>> # git clone --branch releng/14.2 https://git.freebsd.org/src.git 14.2-RELENG/src/
>>>>>> Cloning into '14.2-RELENG/src'...
>>>>>> error: RPC failed; curl 56 Recv failure: Operation timed out
>>>>>> fatal: expected 'packfile'
>>>>>>
>>>>>> This is nasty. The host now in question has an i350-based dual-port NIC. The host's
>>>>>> kernel is very similar to that of the box on which I first reported the issue; both have
>>>>>> customized kernels (in most cases, I compile several modules like ZFS and
>>>>>> several NETGRAPH modules statically into the kernel - a habit inherited from a small
>>>>>> FBSD project I configured (I wouldn't say developed) which does not allow loadable
>>>>>> kernel modules due to regulations).
>>>>>>
>>>>>> I hoped others would stumble over this tripwire in recent CURRENT sources, since the
>>>>>> phenomenon and its distribution over a bunch of CURRENT boxes with different OS states
>>>>>> seemingly show different behaviour.
>>>>>>
>>>>>> And for the record: I also build my ports via poudriere and mostly via make. I also
>>>>>> rebuilt in a two day's marathon all packages via "make -f" - for librewolf, curl and
>>>>>> so on to ensure having latest sources/packages.
>>>>>>
>>>>>> (I repeat myself here again, sorry, it's for the record).
>>>>>>
>>>>>> Will report in on further development and "investigations"
>>>>>>
>>>>>> Kind regards and thanks,
>>>>>>
>>>>>> oh
>>>>>>
>>>>>>
>>>>>
>>>>> This is a shot in the dark, but is this a virtual machine? VirtualBox 7.1.0 had some
>>>>> networking issues that got fixed later.
>>>>
>>>> No, pure Hardware and FreeBSD ...
>>>>
>>>>>
>>>>> Otherwise I would start with ping and traceroute to figure out if they show this issue
>>>>> and where it occurs.
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> O. Hartmann
>>>
>>>
>>>
>>>
>>>
>
>
>