amd64/170351: [patch] amd64: 64-bit process can't always get
mqiao at juniper.net
Thu Aug 9 20:20:03 UTC 2012
The following reply was made to PR amd64/170351; it has been noted by GNATS.
From: Ming Qiao <mqiao at juniper.net>
To: Konstantin Belousov <konstantin.belousov at zoral.com.ua>
Cc: Erin MacNeil <emacneil at juniper.net>, "freebsd-gnats-submit at freebsd.org"
<freebsd-gnats-submit at freebsd.org>
Subject: RE: amd64/170351: [patch] amd64: 64-bit process can't always get
Date: Thu, 9 Aug 2012 16:16:38 -0400
Thanks for the explanation. I'll prepare a fix and send it to you for review
when it's ready.
From: Konstantin Belousov [mailto:konstantin.belousov at zoral.com.ua]
Sent: Wednesday, August 08, 2012 6:07 PM
To: Ming Qiao
Cc: Erin MacNeil; freebsd-gnats-submit at freebsd.org
Subject: Re: amd64/170351: [patch] amd64: 64-bit process can't always get unlimited rlimit
Do not strip public lists from the discussion. There is nothing private.
On Tue, Aug 07, 2012 at 05:52:07PM -0400, Ming Qiao wrote:
> Hi Konstantin,
> Thanks for your quick response. Actually I'm not very clear about the
> second approach you mentioned. Some questions here:
> 1) Could you please elaborate on the idea of "tracking rlimits set to ABI
> infinity"? If I understand correctly, you are referring to a model where a
> process can have its rlimit set multiple times by different ABIs? But what
> does it mean exactly? Could you give a simple example here?
> 2) What do you mean by "per-struct rlimit"? Do you mean each memory
> segment as a struct, such as datasize, stacksize, etc.?
I mean that in addition to the existing array of pl_rlimit in struct plimit, you also create a bitmap array of the same size. A set bit in this new array would indicate that the corresponding limit was set (either implicitly, or explicitly by usermode) to infinity. The bit has its meaning regardless of the actual numeric value written into the pl_rlimit, either by syscall or by [...]
Then, 64bit sysent should also grow sv_fixup for resource limits, and set it accordingly for host ABI if the array indicates that the resource is logically 'i[...]
For completeness, I should note that the bit is cleared if a syscall sets the resource to a non-infinite value. Per-struct rlimit means that there is a bit for each resource.
Is it clear now?
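The bitmap idea described above can be modelled in a few lines of user-space C. This is only an illustrative sketch, not FreeBSD kernel code: `struct model_plimit`, `model_setrlimit()`, `model_getrlimit()`, `MODEL_NLIMITS` and `MODEL_INFINITY` are hypothetical names, and the per-ABI maximum is passed in as a plain parameter instead of coming from sysent.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of the proposal; not kernel code. */
#define MODEL_NLIMITS  4            /* stand-in for the number of resources */
#define MODEL_INFINITY UINT64_MAX   /* stand-in for RLIM_INFINITY */

struct model_plimit {
    uint64_t cur[MODEL_NLIMITS];    /* numeric values, like pl_rlimit */
    uint32_t inf_bits;              /* bit i set: limit i is logically infinite */
};

/*
 * Set limit `which` on behalf of an ABI whose largest usable value is
 * abi_max. The stored number is still clamped for that ABI, but the
 * bit remembers that infinity was requested.
 */
static void
model_setrlimit(struct model_plimit *pl, int which, uint64_t value,
    uint64_t abi_max)
{
    assert(which >= 0 && which < MODEL_NLIMITS);
    if (value == MODEL_INFINITY) {
        pl->inf_bits |= (1u << which);
        pl->cur[which] = abi_max;
    } else {
        /* A non-infinite value clears the bit. */
        pl->inf_bits &= ~(1u << which);
        pl->cur[which] = value < abi_max ? value : abi_max;
    }
}

/*
 * Read limit `which` as seen by an ABI with maximum abi_max. If the bit
 * is set, the stored number is ignored and the ABI's own maximum is used.
 */
static uint64_t
model_getrlimit(const struct model_plimit *pl, int which, uint64_t abi_max)
{
    assert(which >= 0 && which < MODEL_NLIMITS);
    if (pl->inf_bits & (1u << which))
        return abi_max;
    return pl->cur[which] < abi_max ? pl->cur[which] : abi_max;
}
```

In this model, a 32-bit caller that asks for infinity stores its own 3GB maximum but sets the bit, so a later 64-bit reader with a 32GB maximum sees 32GB instead of the clamped 3GB.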
> -----Original Message-----
> From: Konstantin Belousov [mailto:kostikbel at gmail.com]
> Sent: Friday, August 03, 2012 1:39 PM
> To: Ming Qiao
> Cc: freebsd-gnats-submit at freebsd.org
> Subject: Re: amd64/170351: [patch] amd64: 64-bit process can't always
> get unlimited rlimit
> On Fri, Aug 03, 2012 at 03:35:20PM +0000, Ming Qiao wrote:
> > >Number: 170351
> > >Category: amd64
> > >Synopsis: [patch] amd64: 64-bit process can't always get unlimited rlimit
> > >Confidential: no
> > >Severity: non-critical
> > >Priority: low
> > >Responsible: freebsd-amd64
> > >State: open
> > >Quarter:
> > >Keywords:
> > >Date-Required:
> > >Class: sw-bug
> > >Submitter-Id: current-users
> > >Arrival-Date: Fri Aug 03 15:40:08 UTC 2012
> > >Closed-Date:
> > >Last-Modified:
> > >Originator: Ming Qiao
> > >Release: FreeBSD 9.0-RC2
> > >Organization:
> > Juniper Networks
> > >Environment:
> > FreeBSD neys 9.0-RC2 FreeBSD 9.0-RC2 #0: Thu Jul 26 01:27:46 UTC
> > 2012 root at neys:/usr/obj/usr/src/sys/GENERIC amd64
> > >Description:
> > On the amd64 platform, if a 32-bit process ever manually sets its
> > rlimit, none of its 64-bit children or descendants will be able to get
> > the full 64-bit rlimit anymore, even if they explicitly set the limit t[...]
> > Note that for the sake of simplicity, only the datasize limit is
> > referred to in this report. But the same logic applies to all other
> > memory segments (i.e. stacksize, etc.).
> > Take the following scenario as an example:
> > 1) Let's say we have a 32-bit process p1 whose hard limit is set to
> > 500MB by calling setrlimit().
> > 2) p1 then execs another 32-bit process p2.
> > 3) p2 sets its hard limit to unlimited by calling setrlimit().
> > 4) p2 execs a 64-bit process p3.
> > 5) Check the hard limit of p3: we can see that it only has 3GB (value of
> > ia32_maxdsiz) instead of 32GB, which is the global kernel limit (value of
> > maxdsiz) for a 64-bit process.
> > The root cause is that in step 3, p2 didn't actually set its limit
> > to the correct value when calling setrlimit(). Instead the limit is
> > set to ia32_maxdsiz since ia32_fixlimit() is called in kern_proc_setrlimit().
> > >How-To-Repeat:
> > There are 3 test programs attached in this report: 32_p1.c, 32_p2.c,
> > and 64_p3.c. They can be used to reproduce the problem.
> > 1) Compile 32_p1.c and 32_p2.c into 32-bit binaries. Compile 64_p3.c
> > into a 64-bit binary.
> > 2) Put all 3 binaries into the same directory on a machine running
> > the amd64 version of FreeBSD.
> > 3) Run 32_p1, which will exec 32_p2 and 64_p3. The output of 64_p3
> > will show its limit is capped at ia32_maxdsiz.
> > >Fix:
> > The proposed fix is to change kern_proc_setrlimit() so that
> > sv_fixlimit() will not be called if the caller wants to set the new
> > limit to RLIM_INFINITY.
> > Please refer to the attached diff file for the proposed fix.
> The 'fix' is wrong and does not address the issue.
> Instead, it uses some arbitrary properties of the scenario you considered and adapts kernel code to suit your scenario. You deny the correction of the infinity limit, I do not see how it can be right.
> The problem you described is architectural. By design, Unix resource limits cannot be increased after they were decreased, except by root.
> In your scenario, the limits were decreased by the mere fact of running the 32bit process, which has lower 'infinity' limits than 64bit processes.
> That said, I see two possible solutions.
> First is to manually set compat.ia32.max* sysctls to 0. Then you get the desired behaviour for 64bit processes execed from 32bit, it seems.
> It does not require a code change. Since you are fine with denying the fix for infinity, this setting gives the same effect as the patch.
> Second approach (which is essentially a correction to your approach from fix.diff) is to track the fact that corresponding rlimits are set to 'ABI infinity', in some per-struct rlimit flag. Then, get/setrlimit should first test the 'ABI infinity' flag and behave as if rlimit is set to infinity for current bitness even if the actual value of rlimit is not infinity. Flag is set when rlimit is set to infinity by current ABI.
> The second approach would provide a 'correct' fix, but it is not a trivial amount of work for a very rare situation (execing a 64bit process from 32bit), and the current behaviour of inheriting 32bit limits may be argued as right.
> If you want, feel free to develop such a patch, I will review and commit it, but I do not want to spend effort on developing it myself ATM.
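For anyone wanting to observe the inherited datasize limit discussed in this thread, a minimal check could look like the sketch below. It is not the attached 64_p3.c, which is not reproduced here; `describe_limit()` and `read_data_limit()` are hypothetical helper names, and only the standard getrlimit() call with RLIMIT_DATA is assumed.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: render a limit value as text, either "unlimited"
 * or a size in megabytes. Returns the snprintf() result.
 */
static int
describe_limit(rlim_t value, char *buf, size_t buflen)
{
    if (value == RLIM_INFINITY)
        return snprintf(buf, buflen, "unlimited");
    return snprintf(buf, buflen, "%llu MB",
        (unsigned long long)(value >> 20));
}

/*
 * Hypothetical helper: fetch the datasize limits of the calling process.
 * Returns 0 on success and -1 on failure, like getrlimit() itself.
 */
static int
read_data_limit(struct rlimit *rl)
{
    return getrlimit(RLIMIT_DATA, rl);
}
```

Calling read_data_limit() from a 64-bit binary that was execed by a 32-bit parent, and printing rl.rlim_max through describe_limit(), would show whether the hard limit is still capped at the 32-bit value.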