btoc()/ctob() (was Re: svn commit: r349791 - head/sys/vm)

Mike Karels mike at karels.net
Sun Jul 7 13:04:49 UTC 2019


------- Blind-Carbon-Copy

To: arch at freebsd.org
cc: brde at optusnet.com.au, alc at feeebsd.org
From: Mike Karels <mike at karels.net>
Reply-to: mike at karels.net
Subject: btoc()/ctob() (was Re: svn commit: r349791 - head/sys/vm)
In-reply-to: Your message of Sun, 07 Jul 2019 03:48:54 +1000.
             <20190707023441.B2047 at besplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <60189.1562504486.1 at mail.karels.net>
Content-Transfer-Encoding: quoted-printable
Date: Sun, 07 Jul 2019 08:04:41 -0500

[I moved my reply/new question to arch@, and bcc'd src-committers@ et al,
where the interaction started.  Excuse the top post, but the previous
message is included just for reference; it need not be included in any
replies.  Followups to arch@, please.]

[Sorry if this topic has been covered before; just let me know.  Neither
my freebsd.org or Google searches located anything, although I didn't try
all that hard.]

I am astounded to hear that the btoc() macro still lives on, along with
ctob() (bytes to clicks, and clicks to bytes).  They don't seem to have
suitable MI replacements (at least not in <sys/param.h>).  For those who
don't know/remember: these macros were used to convert physical addresses
to values for the segment base registers (PARs) on the PDP-11, where there
was no actual "page size" analogous to newer hardware.  (A click, the 'c'
in btoc/ctob, was 64 bytes, the unit of scaling to keep physical addresses
in the hardware registers fitting in 16 bits.)  Since then, the macros hav=
e
been subverted to operate in pages, starting with the VAX in 32/V, rather
than coming up with a new, more appropriate name.  I am wondering whether
it is finally time to deprecate PDP-11-based nomenclature.  Clicks have
not been a "thing" for many years; but pages are nearly universal (modulo
superpages, etc).

I see that many architectures have architecture-dependent xxx_btop() and
xxx_ptob() macros, which are similar but annoyingly different in details
like casts.  Unlike btoc(), the xxx_btop() versions do not round.  I wonde=
r,
though, whether it is time to hoist *_btop/*_ptob into a new architecture-
independent replacement for btoc/ctob.  btop() and ptob() come to mind,
although I don't know what Linux (etc) does along these lines.

I see just over 100 uses of btoc()/ctob() overall, but of course we would
need to keep deprecated versions for a while.

Thoughts on changing the KPI here?  We'd have to agree on details like
rounding and casting, but my first reaction would be no to both.

		Mike

> Date: Sun, 7 Jul 2019 03:48:54 +1000 (EST)
> From: Bruce Evans <brde at optusnet.com.au>
> X-X-Sender: bde at besplex.bde.org
> To: Doug Moore <dougm at freebsd.org>
> cc: src-committers at freebsd.org, svn-src-all at freebsd.org,
>         svn-src-head at freebsd.org
> Subject: Re: svn commit: r349791 - head/sys/vm
> In-Reply-To: <201907061555.x66FtGsg025314 at repo.freebsd.org>
> Message-ID: <20190707023441.B2047 at besplex.bde.org>
> References: <201907061555.x66FtGsg025314 at repo.freebsd.org>
> MIME-Version: 1.0
> Content-Type: TEXT/PLAIN; charset=3DUS-ASCII; format=3Dflowed
> X-Optus-CM-Score: 0
> X-Optus-CM-Analysis: v=3D2.2 cv=3DP6RKvmIu c=3D1 sm=3D1 tr=3D0 cx=3Da_id=
p_d
> 	a=3DPalzARQSbocsUSjMRkwAPg=3D=3D:117 a=3DPalzARQSbocsUSjMRkwAPg=3D=3D:1=
7
> 	a=3DjpOVt7BSZ2e4Z31A5e1TngXxSK0=3D:19 a=3Dkj9zAlcOel0A:10
> 	a=3Dk1N1_3MYVckZGDfFxwYA:9 a=3DCjuIK1q_8ugA:10
> Precedence: bulk
> X-Loop: FreeBSD.org
> Sender: owner-src-committers at freebsd.org
> List-Id: FreeBSD mail <src-committers.freebsd.org>
> X-Rspamd-Queue-Id: 158A46ECA9
> X-Spamd-Bar: ------
> Authentication-Results: mx1.freebsd.org
> X-Spamd-Result: default: False [-6.98 / 15.00];
> 	 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0];
> 	 NEURAL_HAM_SHORT(-0.98)[-0.982,0];
> 	 REPLY(-4.00)[];
> 	 NEURAL_HAM_LONG(-1.00)[-1.000,0]
> X-UID: 364675
> Status: RO
> X-Keywords: $NotJunk NonJunk                                        =

> Content-Length: 5846

> On Sat, 6 Jul 2019, Doug Moore wrote:

> > Log:
> >  Fix style(9) violations involving division by PAGE_SIZE.

> It is style violation to even use an explicit division by PAGE_SIZE
> instead of the btoc() conversion macro (*).

> > Modified: head/sys/vm/swap_pager.c
> > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D
> > --- head/sys/vm/swap_pager.c	Sat Jul  6 15:34:23 2019	(r349790)
> > +++ head/sys/vm/swap_pager.c	Sat Jul  6 15:55:16 2019	(r349791)
> > @@ -523,7 +523,7 @@ swap_pager_swap_init(void)
> > 	 * but it isn't very efficient).
> > 	 *
> > 	 * The nsw_cluster_max is constrained by the bp->b_pages[]
> > -	 * array (MAXPHYS/PAGE_SIZE) and our locally defined
> > +	 * array MAXPHYS / PAGE_SIZE and our locally defined
> > 	 * MAX_PAGEOUT_CLUSTER.   Also be aware that swap ops are
> > 	 * constrained by the swap device interleave stripe size.
> > 	 *

> The macro is less readable in comments.

> > @@ -538,7 +538,7 @@ swap_pager_swap_init(void)
> > 	 * have one NFS swap device due to the command/ack latency over NFS.
> > 	 * So it all works out pretty well.
> > 	 */
> > -	nsw_cluster_max =3D min((MAXPHYS/PAGE_SIZE), MAX_PAGEOUT_CLUSTER);
> > +	nsw_cluster_max =3D min(MAXPHYS / PAGE_SIZE, MAX_PAGEOUT_CLUSTER);
> >
> > 	nsw_wcount_async =3D 4;
> > 	nsw_wcount_async_max =3D nsw_wcount_async;
> >
> > Modified: head/sys/vm/vnode_pager.c
> > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D
> > --- head/sys/vm/vnode_pager.c	Sat Jul  6 15:34:23 2019	(r349790)
> > +++ head/sys/vm/vnode_pager.c	Sat Jul  6 15:55:16 2019	(r349791)
> > @@ -544,8 +544,8 @@ vnode_pager_addr(struct vnode *vp, vm_ooffset_t ad=
dres
> > 			*rtaddress +=3D voffset / DEV_BSIZE;
> > ...

> Using explicit division by DEV_BSIZE instead of the btodb() conversion
> macro is another style bug.

> (*) The macros use shifts while the divisions use division.  The hard-co=
ded
> versions could use shifts too, so there is another set of style bugs fro=
m
> only using shifts sometimes.  Shifts are faster if the type of the divid=
end
> is signed.

> Oops.  The macro also rounds up.  So hard-coding the division is not jus=
t
> a style bug.  It is wrong unless the dividend is a multiple of the page
> size.  This shouldn't be assumed for values like MAXBSIZE.

> There are many more style bugs involving btoc():
> - 'c' in it means 'click', which might mean a virtual page size while
>    PAGE_SIZE might mean the physical page size.  Or vice versa.  dyson
>    retired from BSD not long after I asked him to clean this up 20+
>    years ago.
> - btoc() is the only clearly MI macro for this.  The better-named macros
>    amd64_ptob() and i386_ptob() are of course MD.  amd64_ptob() is never
>    used in sys/amd64.  i386_ptob() is used 8 times in sys/i386 (all in
>    pmap.c), and 1 of these uses is in a macro which is used 22 times.
>    These uses give another set of style bugs.  They are just obfuscation=
s
>    if clicks are the same as pages, and probably incomplete otherwise.
>    However, it is correct for MD code to use physical pages unless it is
>    implementing virtual pages.

>    These macros don't round up, so they are not equivalent to btoc() eve=
n
>    if the click size is PAGE_SIZE.
> - there is also the better-named macro atop().  This doesn't round up.
>    I think it is supposed to be MI.  It is replicated <machine/param.h>
>    for all arches.  'a' in it means 'address', which is less general tha=
n
>    'b' for 'byte, so it is a worse name than btop().
> - the macro with the best name btop() doesn't exist.

> In the opposite direction, there are ctob(), {amd64,i386}_ptob(),
> ptoa(), and direct multiplications by PAGE_SIZE and direct shifts by
> PAGE_SHIFT.  {amd64,i386}_ptob() is not used even on i386.  This directi=
on
> is trickier since the (npages << PAGE_SHIFT) overflows if npages has the
> correct type (int).  The caller should convert npages to something like
> vm_offset_t before using the macro and the need for this is less obvious
> than for a direct expression.  This is of course undocumented except by
> the source code which shows:
> - ctob(), ptoa() and i386_ptob() need conversion in the caller, but
>    amd64_ptob() converts to unsigned long in the macro.  Since the last
>    macro is unused, the caller should convert in all used cases.  Covers=
ion
>    is most important on 64-bit arches.  On i386, overflow occurs at 2G
>    starting with int npages but u_int npages works up to 4G which is
>    enough for i386 addresses but not for amd64 addresses or i386-PAE
>    byte counts.

> I once sprinkled conversions in the macros.  btodb() still uses my old
> code for this, where the code is sophisticated so as to avoid using
> the long long abomination, but which became mostly nonsense when daddr_t
> was expanded from 32 bits to 64.  Since this macro shifts right,
> conversion is not really necessary, and conversion to daddr_t gives
> much the same pessimations as conversion to the abomination.  dbtodb()
> simply converts to daddr_t.  I forget if I wrote that.  btoc() converts
> to vm_offset_t, but since it shifts right conversion is not really
> necessary.

> Conversions in macros are a wrong way to do this.  jake taught me this
> in connection with sparc64 i386-PAE work.  Conversion in the caller
> allows the caller to control the type of the result and to not pessimize
> by expanding the type.  I think jake intentionally left out conversions
> in the sparc64 macros but didn't change the x86 macros much.  Other arch=
es
> use a random mixture.  E.g., ptoa() converts to unsigned long on amd64,
> arm64, riscv and sparc64 (so jake didn't change this?), to unsigned
> on arm, and doesn't convert on i386, mips or powerpc.  Always converting
> to vm_offset_t would be reasonable, but gives namespace problems.  It
> is stupid that the MI <sys/param.h> uses vm_offset_t for btoc() where
> conversion is not really needed, while <machine/param.h> is more careful
> about namespace problems so it avoids using even u_long for ptoa().

> Bruce

------- End of Blind-Carbon-Copy


More information about the svn-src-head mailing list