Bad performance on alpha? (make buildworld)

Chuck Swiger cswiger at mac.com
Wed Feb 25 13:06:07 PST 2004


Petri Helenius wrote:
> Talking about different instruction sets and compiler scheduling 
> options: would it be considered a good idea to introduce a sysctl which 
> would contain the maximum mcpu= value for the currently running 
> architecture? This way one could provide multiple executables and 
> a startup script, in the fashion of:
> prog.i386
> prog.pentium2
> prog.pentium3
> prog.pentium4
> prog.athlon-mp
> etc...

The idea you've suggested is interesting, although the difference in code 
generation between a P2 and a P4, or among the AMD chips, is fairly minimal 
for most code.  The obvious exception is code which tries to take advantage 
of CPU features like MMX, SSE, and 3DNow!  In other words, your suggestion 
wouldn't help grep or the kernel very much, but could be fairly useful for 
multimedia apps.
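
As a rough sketch of the selection step such a launcher would need: the 
sysctl itself does not exist, so the cpu string below is simply assumed to 
come from it, and pick_variant() and variant_path() are hypothetical names 
invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Variants a package might ship, taken from the list in the proposal. */
static const char *const variants[] = {
    "i386", "pentium2", "pentium3", "pentium4", "athlon-mp",
};

/*
 * Return the shipped variant whose name equals cpu (the value the
 * proposed, currently nonexistent, sysctl is assumed to report).
 * Unknown or missing names fall back to the generic "i386" build.
 */
const char *pick_variant(const char *cpu)
{
    size_t i;

    if (cpu != NULL) {
        for (i = 0; i < sizeof(variants) / sizeof(variants[0]); i++) {
            if (strcmp(cpu, variants[i]) == 0)
                return variants[i];
        }
    }
    return "i386";
}

/*
 * Write "<prog>.<variant>" into buf.  Returns 0 on success, or -1 if
 * the result would not fit in len bytes.
 */
int variant_path(char *buf, size_t len, const char *prog, const char *cpu)
{
    int n = snprintf(buf, len, "%s.%s", prog, pick_variant(cpu));

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A wrapper would then exec the path that variant_path() produces.  Exact 
string matching is the simplest possible policy; a real launcher would 
probably also want to know which variants are compatible with each other.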

There's also a very good implementation of support for multiple architectures 
within a single binary: the Mach-O executable format (used instead of ELF), 
which can hold "fat binaries", or "MAB"s (multi-architecture binaries). 
Mach-O is the format used by NEXTSTEP and Mac OS X.  Typically, adding a new 
architecture adds only about 15% to the size of a particular executable, 
although that can vary quite widely.

From /usr/include/mach-o/arch.h:

/* The NXArchInfo structs contain the architecture's symbolic name
  * (such as "ppc"), its CPU type and CPU subtype as defined in
  * mach/machine.h, the byte order for the architecture, and a
  * describing string (such as "PowerPC").
  * There will both be entries for specific CPUs (such as ppc604e) as
  * well as generic "family" entries (such as ppc).
  */
typedef struct {
     const char *name;
     cpu_type_t cputype;
     cpu_subtype_t cpusubtype;
     enum NXByteOrder byteorder;
     const char *description;
} NXArchInfo;

#if __cplusplus
extern "C" {
#endif /* __cplusplus */

/* NXGetAllArchInfos() returns a pointer to an array of all known
  * NXArchInfo structures.  The last NXArchInfo is marked by a NULL name.
  */
extern const NXArchInfo *NXGetAllArchInfos(void);

/* NXGetLocalArchInfo() returns the NXArchInfo for the local host, or NULL
  * if none is known.
  */
extern const NXArchInfo *NXGetLocalArchInfo(void);

/* NXGetArchInfoFromName() and NXGetArchInfoFromCpuType() return the
  * NXArchInfo from the architecture's name or cputype/cpusubtype
  * combination.  A cpusubtype of CPU_SUBTYPE_MULTIPLE can be used
  * to request the most general NXArchInfo known for the given cputype.
  * NULL is returned if no matching NXArchInfo can be found.
  */
extern const NXArchInfo *NXGetArchInfoFromName(const char *name);
extern const NXArchInfo *NXGetArchInfoFromCpuType(cpu_type_t cputype,
                                                   cpu_subtype_t cpusubtype);

/* NXFindBestFatArch() is passed a cputype and cpusubtype and a set of
  * fat_arch structs and selects the best one that matches (if any) and returns
  * a pointer to that fat_arch struct (or NULL).  The fat_arch structs must be
  * in the host byte order and correct such that the fat_archs really points to
  * enough memory for nfat_arch structs.  It is possible that this routine could
  * fail if new cputypes or cpusubtypes are added and an old version of this
  * routine is used.  But if there is an exact match between the cputype and
  * cpusubtype and one of the fat_arch structs this routine will always succeed.
  */
extern struct fat_arch *NXFindBestFatArch(cpu_type_t cputype,
                                           cpu_subtype_t cpusubtype,
                                           struct fat_arch *fat_archs,
                                           unsigned long nfat_archs);
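
For illustration, the kind of selection the comment above describes can be 
approximated as follows.  This is only a sketch: struct arch_slice and 
find_slice() are made-up stand-ins so that it builds without the Mach 
headers, and the fall-back rule (first slice with the same cputype) is an 
assumption, probably much cruder than what the real routine does.

```c
#include <stddef.h>

/* Simplified stand-in for illustration; NOT the real struct fat_arch. */
struct arch_slice {
    int cputype;
    int cpusubtype;
    unsigned long offset;   /* where this slice would start in the file */
};

/*
 * Prefer a slice whose cputype and cpusubtype both match.  Failing
 * that, return the first slice with the same cputype.  Returns NULL
 * when no slice is for this cputype at all.
 */
const struct arch_slice *find_slice(int cputype, int cpusubtype,
                                    const struct arch_slice *slices,
                                    size_t nslices)
{
    const struct arch_slice *fallback = NULL;
    size_t i;

    for (i = 0; i < nslices; i++) {
        if (slices[i].cputype != cputype)
            continue;
        if (slices[i].cpusubtype == cpusubtype)
            return &slices[i];          /* exact match always wins */
        if (fallback == NULL)
            fallback = &slices[i];      /* remember first same-type slice */
    }
    return fallback;
}
```

As in the header comment, an exact cputype/cpusubtype match always 
succeeds here; only the fall-back behaviour is guesswork.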


[ ... ]

	----------

/usr/include/mach/machine.h supports the following CPUTYPEs:

/*
  *      Machine types known by all.
  */

#define CPU_TYPE_ANY            ((cpu_type_t) -1)

#define CPU_TYPE_VAX            ((cpu_type_t) 1)
/* skip                         ((cpu_type_t) 2)        */
/* skip                         ((cpu_type_t) 3)        */
/* skip                         ((cpu_type_t) 4)        */
/* skip                         ((cpu_type_t) 5)        */
#define CPU_TYPE_MC680x0        ((cpu_type_t) 6)
#define CPU_TYPE_I386           ((cpu_type_t) 7)
/* skip CPU_TYPE_MIPS           ((cpu_type_t) 8)        */
/* skip                         ((cpu_type_t) 9)        */
#define CPU_TYPE_MC98000        ((cpu_type_t) 10)
#define CPU_TYPE_HPPA           ((cpu_type_t) 11)
/* skip CPU_TYPE_ARM            ((cpu_type_t) 12)       */
#define CPU_TYPE_MC88000        ((cpu_type_t) 13)
#define CPU_TYPE_SPARC          ((cpu_type_t) 14)
#define CPU_TYPE_I860           ((cpu_type_t) 15)
/* skip CPU_TYPE_ALPHA          ((cpu_type_t) 16)       */
/* skip                         ((cpu_type_t) 17)       */
#define CPU_TYPE_POWERPC        ((cpu_type_t) 18)

...which appear to be a proper superset of the platforms FreeBSD supports. 
Since the full CPU_SUBTYPE list runs to ~200 lines, here for reference are 
just the x86 variants Mach-O knows about:

/*
  *      I386 subtypes.
  */

#define CPU_SUBTYPE_I386_ALL    ((cpu_subtype_t) 3)
#define CPU_SUBTYPE_386         ((cpu_subtype_t) 3)
#define CPU_SUBTYPE_486         ((cpu_subtype_t) 4)
#define CPU_SUBTYPE_486SX       ((cpu_subtype_t) 4 + 128)
#define CPU_SUBTYPE_586         ((cpu_subtype_t) 5)
#define CPU_SUBTYPE_INTEL(f, m) ((cpu_subtype_t) (f) + ((m) << 4))
#define CPU_SUBTYPE_PENT        CPU_SUBTYPE_INTEL(5, 0)
#define CPU_SUBTYPE_PENTPRO     CPU_SUBTYPE_INTEL(6, 1)
#define CPU_SUBTYPE_PENTII_M3   CPU_SUBTYPE_INTEL(6, 3)
#define CPU_SUBTYPE_PENTII_M5   CPU_SUBTYPE_INTEL(6, 5)

#define CPU_SUBTYPE_INTEL_FAMILY(x)     ((x) & 15)
#define CPU_SUBTYPE_INTEL_FAMILY_MAX    15

#define CPU_SUBTYPE_INTEL_MODEL(x)      ((x) >> 4)
#define CPU_SUBTYPE_INTEL_MODEL_ALL     0
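
To make the encoding concrete: the family sits in the low four bits and 
the model in the bits above it, so CPU_SUBTYPE_INTEL(6, 3) is 
6 + (3 << 4) = 54.  The sketch below copies the macros from the excerpt; 
the typedef is an assumed stand-in so that it builds without 
mach/machine.h, and same_family() is a hypothetical helper added for this 
example.

```c
/* Assumption: plain int stands in for the real cpu_subtype_t. */
typedef int cpu_subtype_t;

/* Copied from the machine.h excerpt above. */
#define CPU_SUBTYPE_INTEL(f, m) ((cpu_subtype_t) (f) + ((m) << 4))
#define CPU_SUBTYPE_PENT        CPU_SUBTYPE_INTEL(5, 0)
#define CPU_SUBTYPE_PENTPRO     CPU_SUBTYPE_INTEL(6, 1)
#define CPU_SUBTYPE_PENTII_M3   CPU_SUBTYPE_INTEL(6, 3)

#define CPU_SUBTYPE_INTEL_FAMILY(x)     ((x) & 15)
#define CPU_SUBTYPE_INTEL_MODEL(x)      ((x) >> 4)

/*
 * Hypothetical helper: nonzero when two subtypes belong to the same
 * Intel family, whatever their model numbers are.
 */
int same_family(cpu_subtype_t a, cpu_subtype_t b)
{
    return CPU_SUBTYPE_INTEL_FAMILY(a) == CPU_SUBTYPE_INTEL_FAMILY(b);
}
```

With these definitions, CPU_SUBTYPE_INTEL_FAMILY(CPU_SUBTYPE_PENTII_M3) 
is 6 and CPU_SUBTYPE_INTEL_MODEL(CPU_SUBTYPE_PENTII_M3) is 3, and 
same_family() treats the Pentium Pro and Pentium II as one family.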


-- 
-Chuck



More information about the freebsd-performance mailing list