Baseline:
C: cc -arch ev6 -fast -O4 ONESTEP
Fortran: f90 -arch ev6 -fast -O5 ONESTEP
Peak:
All use -g3 -arch ev6 -non_shared ONESTEP
Individual benchmark tuning:
168.wupwise: kf77 -fast -O4 -pipeline -unroll 2 +PFB
171.swim: f90 -fast -O5
172.mgrid: kf77 -O5 -transform_loops -tune ev6 -unroll 8
173.applu: f90 -fast -O5 +PFB
177.mesa: cc -fast -O4 +CFB +IFB -split_threshold .90 -noporder
178.galgel: f90 -fast -O5
179.art: kcc -fast -O4 -unroll 10 -ckapargs='-arl=4 -ur=4' +PFB
183.equake: cc -fast -xtaso_short -assume restricted_pointers -all -ldensemalloc -none +PFB
187.facerec: f90 -fast -O4 +PFB
188.ammp: cc -fast -O4 -xtaso_short -assume restricted_pointers
189.lucas: kf90 -O5 -fkapargs='-ur=1' +PFB
191.fma3d: kf90 -O4 -transform_loops +PFB
200.sixtrack: f90 -fast -O5 -assume accuracy_sensitive -notransform_loops +PFB
301.apsi: kf90 -O5 -transform_loops -unroll 8 -fkapargs='-ur=1' +PFB
Most benchmarks are built using one or more types of
profile-driven feedback. The types used are designated
by abbreviations in the notes:
+CFB: Code generation is optimized by the compiler, using
feedback from a training run. These commands are
run before the first compile (in phase "fdo_pre0"):
mkdir /tmp/pp
rm -f /tmp/pp/${baseexe}*
and these flags are added to the first and second compiles:
PASS1_CFLAGS = -prof_gen_noopt -prof_dir /tmp/pp
PASS2_CFLAGS = -prof_use -prof_dir /tmp/pp
(Peak builds use /tmp/pp above; base builds use /tmp/pb.)
+IFB: Icache usage is improved by the post-link-time optimizer
Spike, using feedback from a training run. These commands
are used (in phase "fdo_postN"):
mv ${baseexe} oldexe
spike oldexe -feedback oldexe -o ${baseexe}
+PFB: Prefetches are improved by the post-link-time optimizer
Spike, using feedback from a training run. These
commands are used (in phase "fdo_post_makeN"):
rm -f *Counts*
mv ${baseexe} oldexe
pixie -stats dstride oldexe 1>pixie.out 2>pixie.err
mv oldexe.pixie ${baseexe}
A training run is carried out (in phase "fdo_runN"), and
then this command is run (in phase "fdo_postN"):
spike oldexe -fb oldexe -stride_prefetch -o ${baseexe}
When Spike is used for both Icache and Prefetch improvements,
only one spike command is actually issued, with the Icache
options followed by the Prefetch options.
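The three feedback recipes above can be laid out as a dry-run sketch. The script below only prints the commands and flag settings listed in these notes, in order, prefixed with the phase named for each; it does not invoke the compiler, pixie, or Spike. The function name fdo_plan, its arguments, and the "phase: command" output layout are illustrative assumptions and not part of the SPEC tools. The executable name "equake" in the sample call is likewise only an example value for ${baseexe}. The combined Icache-plus-Prefetch Spike command is not shown, because the notes do not spell out its exact form.

```shell
#!/bin/sh
# Dry-run sketch: prints the steps from the notes; it executes none of them.
# fdo_plan and its arguments are hypothetical names used for illustration.
#
# Usage: fdo_plan <CFB|IFB|PFB> <baseexe> [peak|base]
fdo_plan() {
    kind=$1
    exe=$2
    tuning=${3:-peak}

    # Per the notes: peak builds use /tmp/pp, base builds use /tmp/pb.
    if [ "$tuning" = "base" ]; then
        dir=/tmp/pb
    else
        dir=/tmp/pp
    fi

    case $kind in
    CFB)
        # Compiler feedback: prepare the profile directory, then
        # instrument on pass 1 and use the profile on pass 2.
        echo "fdo_pre0: mkdir $dir"
        echo "fdo_pre0: rm -f $dir/${exe}*"
        echo "PASS1_CFLAGS = -prof_gen_noopt -prof_dir $dir"
        echo "PASS2_CFLAGS = -prof_use -prof_dir $dir"
        ;;
    IFB)
        # Icache feedback via Spike after the training run.
        echo "fdo_postN: mv $exe oldexe"
        echo "fdo_postN: spike oldexe -feedback oldexe -o $exe"
        ;;
    PFB)
        # Prefetch feedback: instrument with pixie, train, then run Spike.
        echo "fdo_post_makeN: rm -f *Counts*"
        echo "fdo_post_makeN: mv $exe oldexe"
        echo "fdo_post_makeN: pixie -stats dstride oldexe 1>pixie.out 2>pixie.err"
        echo "fdo_post_makeN: mv oldexe.pixie $exe"
        echo "fdo_runN: (training run)"
        echo "fdo_postN: spike oldexe -fb oldexe -stride_prefetch -o $exe"
        ;;
    *)
        echo "fdo_plan: unknown feedback type: $kind" >&2
        return 1
        ;;
    esac
}

# Example: list the +PFB steps for a peak build of an executable named equake.
fdo_plan PFB equake
```

Printing rather than executing keeps the sketch usable on systems without the Tru64 toolchain, and makes the ordering of the phases easy to compare against the notes.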
Portability: galgel: -fixed
Process limits are set to maximum using the csh "unlimit" command.
Spike and the Program Analysis Tools are part of the Developers'
Tool Kit Supplement (http://www.tru64unix.compaq.com/dtk/). The
features used in this SPEC submission will be available at the web
site as a beta kit in August 2001, and as a production release in
October 2001.