From nobody Fri Dec 02 01:32:26 2022
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
Sender: owner-freebsd-current@freebsd.org
MIME-Version: 1.0
References: <82103A1E-9D39-47B0-9520-205583C8B680@lysator.liu.se> <2980bcbd22f884962d358808f9440d77@bsdforge.com>
In-Reply-To: <2980bcbd22f884962d358808f9440d77@bsdforge.com>
From: Rick Macklem <rick.macklem@gmail.com>
Date: Thu, 1 Dec 2022 17:32:26 -0800
Subject: Re: RFC: nfsd in a vnet jail
To: Chris
Cc: Peter Eriksson, FreeBSD CURRENT, "Bjoern A. Zeeb", Alan Somers
Content-Type: multipart/alternative; boundary="0000000000000debab05eece505c"

--0000000000000debab05eece505c
Content-Type: text/plain; charset="UTF-8"

On Thu, Dec 1, 2022 at 8:23 AM Chris <bsd-lists@bsdforge.com> wrote:

> On 2022-11-29 16:21, Rick Macklem wrote:
> > On Sun, Nov 27, 2022 at 10:04 AM Peter Eriksson <pen@lysator.liu.se> wrote:
> >
> >> Keep the global variables as defaults that apply to all nfsds and allow
> >> (at least some subset) to be overridden inside the net jails if some things
> >> need to be changed from the defaults?
> >>
> > This is pretty much a reply to one of the posts selected at random,
> > but I thought that better than starting a new email thread.
> >
> > bz@ and asomers@ have both asked about running mountd within a vnet prison
> > (one via offlist email and the other on phabricator).
> >
> > I think it is worth discussing here...
> > mountd (rightly or wrongly) does two distinctly different things:
> > 1 - It pushes the exports into the kernel via nmount() so they
> >     can be hung off of the "struct mount" for a file system's
> >     mount point.
> >     --> This can only work for file system mount points and can
> >         only be done once for any given file system mount point.
> >
> >     At this time, I have it done once globally outside of the prisons.
> >     The alternative I can see is doing it within each prison, but I
> >     think that would require that each prison have its own file system(s).
> >     (i.e., the prison's root would always be a file system mount point.)
> >
> > 2 - It handles RPC Mount protocol requests from NFSv3 clients.  This one
> >     is NFSv3 specific, which is why I have done this NFSv4 only at
> >     this time.  To do this, it must be able to register with rpcbind,
> >     and I have no idea if running rpcbind in a vnet jail is practical.
> >
> > Enforcing the use of separate file systems for each jail also makes
> > things safer, since the exports are enforced by the kernel. Without
> > this, a malicious NFSv4 client could "guess" a file handle for a file
> > outside the jail and gain access to that file. Put another way, without
> > a separate file system, there is no way to stop a malicious client from
> > finding files above the Root file handle. (Normal clients will use
> > PutRootFH and LookupParent and these won't be able to go above the top
> > of the jail.)
> >
> > So, what do others think of enforcing the requirement that each jail
> > have its own file systems for this?
>
> I don't care for any of it. It looks like additional overhead with the
> addition of potential security risks. All for a very limited (and as yet
> unknown) use case.
>
I am thinking that if/when this goes into main, it would be
under a new kernel build option called something like
NFSD_VIMAGE. I think that would avoid the overhead/security
risks for those that do not need/want it.

rick

>
> --chris
> >
> > rick
> >
> >
> >> - Peter
> >>
> >>
> >> On Fri, Nov 25, 2022, 4:24 PM Rick Macklem <rick.macklem@gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> bz@ has encouraged me to fiddle with the nfsd
> >>> so that it works in a vnet jail.
> >>> I have now basically done so, specifically for
> >>> NFSv4, since NFSv3 presents various issues.
> >>>
> >>> What I have not yet done is put global variables
> >>> in the vnet. This needs to be done so that the nfsd
> >>> can be run in multiple jail instances and/or in and
> >>> outside of a jail.
> >>> The problem is that there are 100s of global variables.
> >>>
> >>> I can see two approaches:
> >>> 1 - Move them all into the vnet jail. This would imply
> >>>     that all the sysctls need to somehow be changed,
> >>>     which would seem to be a POLA violation.
> >>>     It also implies a lot of stuff in the vnet.
> >>> 2 - Just move the global variables that will always
> >>>     differ from one nfsd to another (this would make
> >>>     the sysctls global and apply to all nfsds).
> >>>     This will keep the number of globals in the vnet
> >>>     smaller.
> >>>
> >>> I am currently leaning towards #2, but what do others
> >>> think?
> >>>
> >>> rick
> >>> ps: Personally, I don't know what use there is of
> >>>     running the nfsd inside a vnet jail, but bz@ has
> >>>     some use case.
> >>>
> >>
> >>
>

--0000000000000debab05eece505c
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--0000000000000debab05eece505c--
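
For readers following the thread, here is a minimal sketch of the idea Peter Eriksson raises above: global values act as defaults for every nfsd, and an individual jail may override a subset. This is only a userland Python model of the lookup rule, not FreeBSD kernel code, and it is not what the nfsd actually does. The names GLOBAL_DEFAULTS, JailNfsd, max_threads and lease_seconds are hypothetical and do not correspond to real nfsd variables or sysctls.

```python
from dataclasses import dataclass, field

# Hypothetical tunables shared by every nfsd instance (illustration only;
# these are not real nfsd sysctl names).
GLOBAL_DEFAULTS = {"max_threads": 64, "lease_seconds": 120}


@dataclass
class JailNfsd:
    """Models one nfsd instance, either on the host or in a vnet jail."""
    name: str
    # Only the settings this instance wants to differ from the defaults.
    overrides: dict = field(default_factory=dict)

    def tunable(self, key):
        # A per-jail override wins; otherwise fall back to the global value.
        if key in self.overrides:
            return self.overrides[key]
        return GLOBAL_DEFAULTS[key]


host = JailNfsd("host")
jail_a = JailNfsd("jail_a", overrides={"max_threads": 8})

print(host.tunable("max_threads"), jail_a.tunable("max_threads"))
```

In this model, changing a global default still reaches every instance that has not overridden it, which is roughly the behaviour approach #2 in the thread keeps for the sysctls.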