Duplicate OPT_ entries in gcc/options.h
Jung-uk Kim
jkim at FreeBSD.org
Wed Jun 8 21:54:28 UTC 2016
On 06/ 8/16 05:15 PM, Dimitry Andric wrote:
> On 08 Jun 2016, at 21:11, Gerald Pfeifer <gerald at pfeifer.com> wrote:
>>
>> I got a user report, and could reproduce it, that when building
>> GCC (lang/gcc, but also current HEAD, so probably pretty much
>> any version) on FreeBSD 11 with LANG = en_US.UTF-8 we get
>> conflicting entries in $BUILDDIR/gcc/options.h such as
>>
>> OPT_d = 135, /* -d */
>> OPT_D = 136, /* -D */
>> OPT_d = 137, /* -d */
>> OPT_D = 138, /* -D */
>> OPT_d = 141, /* -d */
>> OPT_D = 142, /* -D */
>> OPT_d = 143, /* -d */
>>
>> Using LANG = en_US (without UTF-8), everything works fine.
>>
>> Any ideas what might be going on here? (This is done via
>> AWK scripts from what I can tell, does this trigger any
>> ideas?)
>
> It is definitely something caused by our awk in base, in any case.
> First opt-gather.awk is run to generate a flat list of all options:
>
> /usr/bin/awk -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opt-gather.awk /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/ada/gcc-interface/lang.opt /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/fortran/lang.opt /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/go/lang.opt /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/java/lang.opt /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/lto/lang.opt /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/c-family/c.opt /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/common.opt /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/fused-madd.opt /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/i386/i386.opt /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/rpath.opt /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/config/freebsd.opt > tmp-optionlist
>
> Then opt-functions.awk is run to process optionlist into options.h:
>
> /usr/bin/awk -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opt-functions.awk -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opt-read.awk -f /usr/ports/lang/gcc/work/gcc-4.8.5/gcc/opth-gen.awk < optionlist > options.h
>
> If I run the first step using LANG=C, or without any LANG setting, both
> optionlist and options.h are as expected. If I run the first step using
> LANG=en_US.UTF-8, the optionlist is sorted differently, for example the
> "good" optionlist has the uppercase d options first, and much later the
> lowercase d options:
>
> D^\C ObjC C++ ObjC++ Joined Separate MissingArgError(macro name missing after %qs)^\-D<macro>[=<val>] Define a <macro> with <val> as its value. If just <macro> is given, <val> is taken to be 1
> D^\Driver Joined Separate
> D^\Fortran Joined Separate
> ... much later in the file, after all options starting with an uppercase letter ...
> d^\C ObjC C++ ObjC++ Joined
> d^\Common Joined^\-d<letters> Enable dumps from specific passes of the compiler
> d^\Fortran Joined
> d^\Java Separate SeparateAlias Alias(foutput-class-dir=)
>
> The "bad" optionlist has the upper and lower case d options sorted
> together:
>
> d^\C ObjC C++ ObjC++ Joined
> D^\C ObjC C++ ObjC++ Joined Separate MissingArgError(macro name missing after %qs)^\-D<macro>[=<val>] Define a <macro> with <val> as its value. If just <macro> is given, <val> is taken to be 1
> d^\Common Joined^\-d<letters> Enable dumps from specific passes of the compiler
> D^\Driver Joined Separate
> defsym=^\Driver JoinedOrMissing
> defsym^\Driver Separate
> d^\Fortran Joined
> D^\Fortran Joined Separate
> d^\Java Separate SeparateAlias Alias(foutput-class-dir=)
>
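The repeated OPT_d/OPT_D values in the report are consistent with the later scripts merging only adjacent records that share a name, so once d and D are interleaved, each run becomes its own entry. That is an assumption, not something verified against opt-read.awk. A minimal sketch of the effect, using made-up helper names and only the option names from the lists above:

```shell
# collapse_adjacent: hypothetical helper, not part of the GCC scripts.
# Reads one option name per line and prints a name only when it differs
# from the previous line, i.e. it merges adjacent duplicates only.
# LC_ALL=C keeps the inequality test a plain byte comparison.
collapse_adjacent() {
    LC_ALL=C awk 'NR == 1 || $0 != prev { print } { prev = $0 }'
}

# Option names in the order of the "bad" optionlist.
bad_order() {
    printf '%s\n' d D d D defsym= defsym d D d
}

# Option names in the order of the "good" optionlist excerpt
# (defsym entries left out, since the excerpt does not show them).
good_order() {
    printf '%s\n' D D D d d d d
}

bad_order | collapse_adjacent | tr '\n' ' '; echo
good_order | collapse_adjacent | tr '\n' ' '; echo
```

With the "bad" order nothing collapses and nine names survive, in the same d, D, d, D, ..., d, D, d pattern as the values 135 to 143 in the report. With the "good" order the list collapses to one D and one d.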
> Note that GNU awk does *not* produce a different optionlist file when
> used with either LANG=C or LANG=en_US.UTF-8.
>
> opt-gather.awk's sorting function looks like this:
>
> function sort(ARRAY, ELEMENTS)
> {
>     for (i = 2; i <= ELEMENTS; ++i) {
>         for (j = i; ARRAY[j-1] > ARRAY[j]; --j) {
>             temp = ARRAY[j]
>             ARRAY[j] = ARRAY[j-1]
>             ARRAY[j-1] = temp
>         }
>     }
>     return
> }
>
> So I am assuming that the ARRAY[j-1] > ARRAY[j] comparison works
> differently in our awk, depending on the LANG settings. No idea when
> that changed, though, if it changed at all...
This behaviour has been known for a very long time:
https://svnweb.freebsd.org/changeset/base/173731
and it is not our fault:
https://www.gnu.org/software/gawk/manual/html_node/POSIX-String-Comparison.html
GNU awk produces the same output as our awk when run with the "--posix" option.
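For anyone who wants to check the comparison locally, here is a minimal sketch. The helper name is made up for illustration, and it forces LC_ALL=C so that awk's "<" compares plain byte values:

```shell
# byte_less: hypothetical helper. Succeeds (exit status 0) if awk's "<"
# orders $1 before $2 when the comparison is forced to the C locale.
byte_less() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { exit !(a < b) }'
}

byte_less D d && echo "C locale: D sorts before d"
byte_less d D || echo "C locale: d does not sort before D"
```

Dropping LC_ALL=C and running under en_US.UTF-8 should flip the result on an awk that follows the POSIX collation rule, which would match the "bad" optionlist. I have not written that down as expected output, since it depends on the awk and the installed locale. Forcing LC_ALL=C for the opt-gather.awk step should sidestep the problem, in line with Dimitry's observation that LANG=C gives the expected files.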
FYI...
Jung-uk Kim