Bad performance on alpha? (make buildworld)
Chuck Swiger
cswiger at mac.com
Wed Feb 25 13:06:07 PST 2004
Petri Helenius wrote:
> Talking about different instruction sets and compiler scheduling
> options: would it be considered a good idea to introduce a sysctl which
> would contain the maximum mcpu= value for the currently running
> architecture? This way one could provide multiple executables and
> a startup script, in the fashion of:
> prog.i386
> prog.pentium2
> prog.pentium3
> prog.pentium4
> prog.athlon-mp
> etc...
The idea you've suggested is interesting, although the difference in code
generation between a P2 and a P4, or among the AMD chips, is fairly minimal
for most code. The obvious exception is code which tries to take advantage
of CPU features like 3DNow!, MMX, and SSE. In other words, your suggestion
wouldn't help grep or the kernel very much, but could be fairly useful for
multimedia apps.
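To make the startup-script idea concrete, here is a minimal sketch in C of the selection step. It assumes the CPU name has already been obtained somehow (the proposed sysctl does not exist, so none is queried here). The fallback chain, its ordering, and the name pick_variant() are hypothetical, and AMD chips such as athlon-mp would need a chain of their own.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fallback chain, most specific first. The names mirror the
 * prog.* suffixes from the quoted message; the ordering is an assumption. */
static const char *const variant_chain[] = {
    "pentium4", "pentium3", "pentium2", "i386", NULL
};

/* Return the most specific variant that is no newer than `cpu` and is
 * present in `available` (a NULL-terminated list), or NULL if `cpu` is
 * not in the chain or nothing suitable is installed. */
static const char *pick_variant(const char *cpu, const char *const *available)
{
    size_t i, j;
    int reached = 0;

    for (i = 0; variant_chain[i] != NULL; i++) {
        if (strcmp(variant_chain[i], cpu) == 0)
            reached = 1;    /* this entry and all later ones can run on `cpu` */
        if (!reached)
            continue;
        for (j = 0; available[j] != NULL; j++)
            if (strcmp(available[j], variant_chain[i]) == 0)
                return variant_chain[i];
    }
    return NULL;
}
```

A launcher would append the returned suffix to "prog." and exec the result. With only prog.i386 and prog.pentium2 installed, a pentium4 host would end up running prog.pentium2.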
There's also a very good implementation for supporting multiple architectures
within a single binary: the Mach-O executable format (used instead of ELF),
which can hold "fat binaries", or "MAB"s (multi-architecture binaries).
Mach-O is the format used by NEXTSTEP and Mac OS X. Typically, adding a new
architecture adds only about 15% to the size of a particular executable,
although that can vary quite widely.
From /usr/include/mach-o/arch.h:
/* The NXArchInfo structs contain the architectures symbolic name
* (such as "ppc"), its CPU type and CPU subtype as defined in
* mach/machine.h, the byte order for the architecture, and a
* describing string (such as "PowerPC").
* There will both be entries for specific CPUs (such as ppc604e) as
* well as generic "family" entries (such as ppc).
*/
typedef struct {
    const char *name;
    cpu_type_t cputype;
    cpu_subtype_t cpusubtype;
    enum NXByteOrder byteorder;
    const char *description;
} NXArchInfo;
#if __cplusplus
extern "C" {
#endif /* __cplusplus */
/* NXGetAllArchInfos() returns a pointer to an array of all known
* NXArchInfo structures. The last NXArchInfo is marked by a NULL name.
*/
extern const NXArchInfo *NXGetAllArchInfos(void);
/* NXGetLocalArchInfo() returns the NXArchInfo for the local host, or NULL
* if none is known.
*/
extern const NXArchInfo *NXGetLocalArchInfo(void);
/* NXGetArchInfoFromName() and NXGetArchInfoFromCpuType() return the
* NXArchInfo from the architecture's name or cputype/cpusubtype
* combination. A cpusubtype of CPU_SUBTYPE_MULTIPLE can be used
* to request the most general NXArchInfo known for the given cputype.
* NULL is returned if no matching NXArchInfo can be found.
*/
extern const NXArchInfo *NXGetArchInfoFromName(const char *name);
extern const NXArchInfo *NXGetArchInfoFromCpuType(cpu_type_t cputype,
                                                  cpu_subtype_t cpusubtype);
/* NXFindBestFatArch() is passed a cputype and cpusubtype and a set of
* fat_arch structs and selects the best one that matches (if any) and returns
* a pointer to that fat_arch struct (or NULL). The fat_arch structs must be
* in the host byte order and correct such that the fat_archs really points to
* enough memory for nfat_arch structs. It is possible that this routine could
* fail if new cputypes or cpusubtypes are added and an old version of this
* routine is used. But if there is an exact match between the cputype and
* cpusubtype and one of the fat_arch structs this routine will always succeed.
*/
extern struct fat_arch *NXFindBestFatArch(cpu_type_t cputype,
                                          cpu_subtype_t cpusubtype,
                                          struct fat_arch *fat_archs,
                                          unsigned long nfat_archs);
[ ... ]
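The "exact match, else something more general" behaviour described for NXFindBestFatArch() can be sketched roughly as below. This is a simplified stand-in, not Apple's implementation. The demo_* types and function are hypothetical and defined locally so the example compiles without the Mach headers, and the generic subtype is passed in explicitly rather than derived from the cputype.

```c
#include <stddef.h>

/* Stand-in types for illustration only; the real definitions live in
 * <mach/machine.h> and <mach-o/fat.h>. */
typedef int demo_cpu_type_t;
typedef int demo_cpu_subtype_t;

struct demo_fat_arch {
    demo_cpu_type_t    cputype;
    demo_cpu_subtype_t cpusubtype;
    unsigned long      offset;   /* where this architecture's image starts */
};

/* Simplified selection: prefer an exact cputype/cpusubtype match, otherwise
 * accept the first entry of the same cputype whose subtype equals
 * `generic_subtype`. Returns NULL if neither is found. */
static struct demo_fat_arch *demo_find_best_arch(demo_cpu_type_t cputype,
    demo_cpu_subtype_t cpusubtype, demo_cpu_subtype_t generic_subtype,
    struct demo_fat_arch *archs, unsigned long narchs)
{
    struct demo_fat_arch *fallback = NULL;
    unsigned long i;

    for (i = 0; i < narchs; i++) {
        if (archs[i].cputype != cputype)
            continue;
        if (archs[i].cpusubtype == cpusubtype)
            return &archs[i];                 /* exact match wins outright */
        if (fallback == NULL && archs[i].cpusubtype == generic_subtype)
            fallback = &archs[i];             /* remember the generic slice */
    }
    return fallback;
}
```

The point of the fallback is that a binary carrying only a generic i386 slice still runs on a newer CPU, while a binary that also carries a tuned slice gets the tuned one where it applies.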
----------
/usr/include/mach/machine.h supports the following CPUTYPEs:
/*
* Machine types known by all.
*/
#define CPU_TYPE_ANY ((cpu_type_t) -1)
#define CPU_TYPE_VAX ((cpu_type_t) 1)
/* skip ((cpu_type_t) 2) */
/* skip ((cpu_type_t) 3) */
/* skip ((cpu_type_t) 4) */
/* skip ((cpu_type_t) 5) */
#define CPU_TYPE_MC680x0 ((cpu_type_t) 6)
#define CPU_TYPE_I386 ((cpu_type_t) 7)
/* skip CPU_TYPE_MIPS ((cpu_type_t) 8) */
/* skip ((cpu_type_t) 9) */
#define CPU_TYPE_MC98000 ((cpu_type_t) 10)
#define CPU_TYPE_HPPA ((cpu_type_t) 11)
/* skip CPU_TYPE_ARM ((cpu_type_t) 12) */
#define CPU_TYPE_MC88000 ((cpu_type_t) 13)
#define CPU_TYPE_SPARC ((cpu_type_t) 14)
#define CPU_TYPE_I860 ((cpu_type_t) 15)
/* skip CPU_TYPE_ALPHA ((cpu_type_t) 16) */
/* skip ((cpu_type_t) 17) */
#define CPU_TYPE_POWERPC ((cpu_type_t) 18)
...which appear to be a proper superset of the platforms FreeBSD supports.
For the sake of reference, since the CPU_SUBTYPE list is ~200 lines, here are
the x86 variants Mach-O knows about:
/*
* I386 subtypes.
*/
#define CPU_SUBTYPE_I386_ALL ((cpu_subtype_t) 3)
#define CPU_SUBTYPE_386 ((cpu_subtype_t) 3)
#define CPU_SUBTYPE_486 ((cpu_subtype_t) 4)
#define CPU_SUBTYPE_486SX ((cpu_subtype_t) 4 + 128)
#define CPU_SUBTYPE_586 ((cpu_subtype_t) 5)
#define CPU_SUBTYPE_INTEL(f, m) ((cpu_subtype_t) (f) + ((m) << 4))
#define CPU_SUBTYPE_PENT CPU_SUBTYPE_INTEL(5, 0)
#define CPU_SUBTYPE_PENTPRO CPU_SUBTYPE_INTEL(6, 1)
#define CPU_SUBTYPE_PENTII_M3 CPU_SUBTYPE_INTEL(6, 3)
#define CPU_SUBTYPE_PENTII_M5 CPU_SUBTYPE_INTEL(6, 5)
#define CPU_SUBTYPE_INTEL_FAMILY(x) ((x) & 15)
#define CPU_SUBTYPE_INTEL_FAMILY_MAX 15
#define CPU_SUBTYPE_INTEL_MODEL(x) ((x) >> 4)
#define CPU_SUBTYPE_INTEL_MODEL_ALL 0
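To spell out how those macros pack the values: the family goes in the low four bits and the model in the bits above, so CPU_SUBTYPE_PENTII_M3 works out to 6 + (3 << 4) = 54. The sketch below copies the relevant macros from the listing. The typedef is an assumption (a plain int standing in for the Mach type) and decode_intel_subtype() is a hypothetical helper added for illustration.

```c
typedef int cpu_subtype_t;   /* assumption: stand-in for the Mach typedef */

#define CPU_SUBTYPE_INTEL(f, m)        ((cpu_subtype_t) (f) + ((m) << 4))
#define CPU_SUBTYPE_INTEL_FAMILY(x)    ((x) & 15)
#define CPU_SUBTYPE_INTEL_MODEL(x)     ((x) >> 4)
#define CPU_SUBTYPE_PENTII_M3          CPU_SUBTYPE_INTEL(6, 3)

/* Split a packed subtype back into its family and model numbers. */
static void decode_intel_subtype(cpu_subtype_t subtype, int *family, int *model)
{
    *family = CPU_SUBTYPE_INTEL_FAMILY(subtype);   /* low 4 bits */
    *model  = CPU_SUBTYPE_INTEL_MODEL(subtype);    /* remaining high bits */
}
```

Decoding CPU_SUBTYPE_PENTII_M3 this way gives family 6, model 3, which is how a loader could compare an executable's subtype against the family and model of the running CPU.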
--
-Chuck