From: Rick Macklem
Date: Thu, 20 Feb 2025 18:39:03 -0800
Subject: Re: RFC: mount_nfs failure due to dns not running yet
To: Steve Rikli
Cc: FreeBSD CURRENT, Gleb Smirnoff
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Thu, Feb 20, 2025 at 4:28 PM Steve Rikli wrote:
>
> On Wed, Feb 19, 2025 at 02:40:15PM
> -0800, Rick Macklem wrote:
> >
> > The subject line basically describes the problem glebius@
> > ran into. When doing an NFS mount in /etc/fstab, it failed
> > since the DNS service was not yet working and, as such,
> > the DNS lookup of the server fqdn failed, causing the mount
> > to fail. Note that this behaviour has existed for decades.
> >
> > He feels this is a bug and that mount_nfs(8) should retry
> > getaddrinfo(3) calls until success, instead of failing the
> > mount when the first attempt fails.
> > The problem with just retrying getaddrinfo(3) is that it
> > could retry forever for simple failures like a typo in the
> > server fqdn.
> > I can see several ways this can be handled and would
> > like feedback from others w.r.t. these alternatives.
> >
> > 1) Simply document this case and encourage use of
> > host names in /etc/hosts for NFS servers along with
> > specifying use of file before dns in nsswitch.conf.
> > Doing this results in the mounts working whether or
> > not DNS is working.
> >
> > 2) Call it a bug and patch mount_nfs(8) to retry getaddrinfo(3)
> > until it succeeds. (I feel this would be a POLA violation,
> > given that the current behaviour has existed for decades
> > and for simple cases where the fqdn will never resolve
> > the behaviour would be to hang at the mount attempt
> > during boot unless "bg" is specified for the /etc/fstab entry.)
> >
> > 3) Add a new NFS mount option "retrydns=", which would enable
> > retries of getaddrinfo(3). This would avoid any POLA violation and
> > would allow for a convenient way to document the behaviour in
> > "man mount_nfs".
> >
> > 4) ???
> >
> > So, what do you think is the preferred change?
>
> I don't think I would change mount_nfs code behavior for this.
>
> That is, requiring services and daemons etc.
> to workaround missing,
> misconfigured, slow, or misbehaving nameservice (whether it's DNS,
> /etc/hosts, NIS, whatever) seems like more complexity, possibly not
> effective, and maybe not focused on the right thing.
>
> Now, without meaning to be presumptuous, it may be worth re-examining
> the startup sequence, e.g. to make sure NFS mounts are tried after the
> known dependencies can reasonably be expected to have started, including
> the network, plus local_unbound or bind (if used), possibly others.
>
> After a quick look, I don't see an obvious problem with the sequence,
> but more knowledgeable eyes than mine are welcome. I don't quite follow
> some of the output from rcorder and service -r.
>
> > ps: I looked and the return value from getaddrinfo(3) does not
> > appear to be useful to discern the case of "DNS service not
> > running yet". (I think it replies EAI_FAIL for this case.)
>
> In that area, I'll note FreeBSD rc.d has a "NETWORKING" dependency for
> PROVIDE and REQUIRE, and it's included in scripts like nfsclient,
> mountcritremote et al. However there seems to be no similar dependency
> for something like "NAMESERVICE" (generic, as opposed to "named"
> specifically), and I'm not sure how that might be implemented, even
> assuming it could be useful in a situation like this.
>
> I.e. there are many things to potentially check for "can the system
> resolve hostnames yet", and not all of them involve running a local
> instance of named, unbound, etc.
>
> In general, if I were running into problems with nameservice not being
> available by the time NFS mounts happen, I think I'd start by looking
> into possible nameservice issues, then check out some mechanisms other
> folks have mentioned (fstab IP addresses or late option, rc.conf
> netwait_enable, etc.) rather than coding workarounds into NFS itself.

Well, the patch I have created (it took about 15min) only changes
behaviour if a new "retrydns" option is used.
As such, I think it might be useful for some, but doesn't change
things unless someone uses it.

I agree with you that I don't think the rc scripts have a way to
REQUIRE that DNS is working. (I, personally, always put the fqdn for
NFS servers in /etc/hosts and make sure "files" is first in
nsswitch.conf, but others argue that is not feasible for some
deployments. Using IP numbers works for AUTH_SYS, but not for
Kerberized mounts.)

Note that there is already "retrycnt", which specifies how many times
to retry the mount, but that retry loop doesn't include the
getaddrinfo(3) calls.
--> Personally, I do not like always doing retries since I often type
    mount commands manually and I'm a terrible typist, so I often
    mistype the server's name.

This reply was mostly a followup on all the good comments and not
just yours.

Thanks everyone for your comments,
rick
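
For anyone wondering what an opt-in retry of getaddrinfo(3) might look
like, here is a rough sketch in C. It is not the patch mentioned above.
The function name resolve_with_retry and the parameters max_retries and
delay_secs are made up for illustration. It retries on any error, since
the return value does not seem to distinguish "DNS not up yet" from a
mistyped name, and it caps the number of attempts so that a typo
eventually fails instead of hanging forever.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not the actual mount_nfs code).
 *
 * Calls getaddrinfo(3) at most 1 + max_retries times, sleeping
 * delay_secs seconds between attempts. Any non-zero return from
 * getaddrinfo(3) is treated as retryable.
 *
 * Returns 0 and fills *res on success (caller must freeaddrinfo(3) it).
 * Otherwise returns the error code from the last attempt.
 * If attempts is non-NULL, it receives the number of calls made.
 */
int
resolve_with_retry(const char *host, const char *service,
    const struct addrinfo *hints, int max_retries, unsigned int delay_secs,
    struct addrinfo **res, int *attempts)
{
	int error;
	int tries = 0;

	assert(res != NULL);

	for (;;) {
		error = getaddrinfo(host, service, hints, res);
		tries++;
		/* Stop on success or once the retry budget is used up. */
		if (error == 0 || tries > max_retries)
			break;
		if (delay_secs > 0)
			sleep(delay_secs);
	}
	if (attempts != NULL)
		*attempts = tries;
	return (error);
}
```

With max_retries set to 0 this behaves like a single getaddrinfo(3)
call, which would correspond to the current behaviour when the option
is not given.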