CFP2000 Result: Hewlett-Packard Company AlphaServer DS15/1000

Benchmark	Reference Time	Base Runtime	Base Ratio	Runtime	Ratio
168.wupwise	1600	251	637	118	1362
171.swim	3100	274	1133	274	1133
172.mgrid	1800	381	472	277	650
173.applu	2100	259	811	228	919
177.mesa	1400	183	765	151	927
178.galgel	2900	211	1375	178	1628
179.art	2600	305	851	228	1138
183.equake	1300	399	326	156	832
187.facerec	1900	264	720	236	804
188.ammp	2200	447	492	380	579
189.lucas	2000	236	848	206	970
191.fma3d	2100	343	613	296	708
200.sixtrack	1100	289	381	257	428
301.apsi	2600	312	834	297	875
SPECfp_base2000	682
	SPECfp2000	876

Benchmark

Reference
Time

Base
Runtime

Base
Ratio

Runtime

Ratio

168.wupwise

1600

251

637

118

1362

171.swim

3100

274

1133

274

1133

172.mgrid

1800

381

472

277

650

173.applu

2100

259

811

228

919

177.mesa

1400

183

765

151

927

178.galgel

2900

211

1375

178

1628

179.art

2600

305

851

228

1138

183.equake

1300

399

326

156

832

187.facerec

1900

264

720

236

804

188.ammp

2200

447

492

380

579

189.lucas

2000

236

848

206

970

191.fma3d

2100

343

613

296

708

200.sixtrack

1100

289

381

257

428

301.apsi

2600

312

834

297

875

SPECfp_base2000

682

SPECfp2000

876

Hardware

Hardware Vendor:

Hewlett-Packard Company

Model Name:

AlphaServer DS15/1000

CPU:

Alpha 21264C

CPU MHz:

1000

FPU:

Integrated

CPU(s) enabled:

1 core, 1 chip, 1 core/chip

CPU(s) orderable:

Parallel:

Primary Cache:

64KB(I)+64KB(D) on chip

Secondary Cache:

2MB

L3 Cache:

None

Other Cache:

None

Memory:

2GB; 512MB RIMMs

Disk Subsystem:

36GB Ultra 160 10KRPM

Other Hardware:

None

Software

Operating System:

Tru64 UNIX V5.1B (Rev. 2650)
+IPK

Compiler:

Compaq C V6.5-011-48C5K
Program Analysis Tools V2.0
Spike V5.2 (510 USG)
HP Fortran V5.5A-3548-48D88
HP Fortran 77 V5.5A-3548-48D88
KAP Fortran V4.3 000607
KAP Fortran 77 V4.1 980926
KAP C V4.1 000607

File System:

UFS

System State:

Multi-user

Notes / Tuning Information
Baseline C: cc -arch ev6 -fast -O4 ONESTEP Fortran: f90 -arch ev6 -fast -O5 ONESTEP Peak: All use -g3 -arch ev6 -non_shared ONESTEP except these (which use only the tunings shown below): 173.applu 188.ammp 191.fma3d Individual benchmark tuning: 168.wupwise: kf77 -call_shared -inline all -tune ev67 -unroll 12 -automatic -align commons -arch ev67 -fkapargs=' -aggressive=c -fuse -fuselevel=1 -so=2 -r=1 -o=1 -interleave -ur=6 -ur2=060 ' +PFB 171.swim: same as base 172.mgrid: kf90 -call_shared -arch generic -O5 -inline manual -nopipeline -transform_loops -unroll 9 -automatic -fkapargs='-aggressive=a -fuse -interleave -ur=2 -ur3=5 -cachesize=128,16000 ' +PFB 173.applu: kf90 -O5 -transform_loops -fkapargs=' -o=0 -nointerleave -ur=14 -ur2=260 -ur3=18' +PFB 177.mesa: kcc -fast -O4 +CFB +IFB 178.galgel: f90 -O5 -fast -unroll 5 -automatic 179.art: kcc -assume whole_program -ldensemalloc -call_shared -assume restricted_pointers -unroll 16 -inline none -ckapargs=' -fuse -fuselevel=1 -ur=3' +PFB 183.equake: cc -call_shared -arch generic -fast -O4 -ldensemalloc -assume restricted_pointers -inline speed -unroll 13 -xtaso_short +PFB 187.facerec: f90 -O4 -nopipeline -inline all -non_shared -speculate all -unroll 7 -automatic -assume accuracy_sensitive -math_library fast +IFB 188.ammp: cc -arch host -O4 -ifo -assume nomath_errno -assume trusted_short_alignment -fp_reorder -readonly_strings -ldensemalloc -xtaso_short -assume restricted_pointers -unroll 9 -inline speed +CFB +IFB +PFB 189.lucas: kf90 -O5 -fkapargs='-ur=1' +PFB 191.fma3d: kf90 -O4 -transform_loops -fkapargs='-cachesize=128,16000 ' +PFB 200.sixtrack: f90 -fast -O5 -assume accuracy_sensitive -notransform_loops +PFB 301.apsi: kf90 -O5 -inline none -call_shared -speculate all -align commons -fkapargs=' -aggressive=ab -tune=ev5 -fuse -ur=1 -ur2=60 -ur3=20 -cachesize=128,16000' Most benchmarks are built using one or more types of profile-driven feedback. The types used are designated by abbreviations in the notes: +CFB: Code generation is optimized by the compiler, using feedback from a training run. These commands are done before the first compile (in phase "fdo_pre0"): mkdir /tmp/pp rm -f /tmp/pp/${baseexe}* and these flags are added to the first and second compiles: PASS1_CFLAGS = -prof_gen_noopt -prof_dir /tmp/pp PASS2_CFLAGS = -prof_use_feedback -prof_dir /tmp/pp (Peak builds use /tmp/pp above; base builds use /tmp/pb.) +IFB: Icache usage is improved by the post-link-time optimizer Spike, using feedback from a training run. These commands are used (in phase "fdo_postN"): mv ${baseexe} oldexe spike oldexe -feedback oldexe -o ${baseexe} +PFB: Prefetches are improved by the post-link-time optimizer Spike, using feedback from a training run. These commands are used (in phase "fdo_post_makeN"): rm -f Counts mv ${baseexe} oldexe pixie -stats dstride oldexe 1>pixie.out 2>pixie.err mv oldexe.pixie ${baseexe} A training run is carried out (in phase "fdo_runN"), and then this command (in phase "fdo_postN"): spike oldexe -fb oldexe -stride_prefetch -o ${baseexe} When Spike is used for both Icache and Prefetch improvements, only one spike command is actually issued, with the Icache options followed by the Prefetch options. vm: vm_bigpg_enabled = 1 vm_bigpg_thresh = 6 vm_swap_eager = 0 ubc_maxpercent = 50 proc: max_per_proc_address_space = 34359738368 max_per_proc_data_size = 34359738368 max_per_proc_stack_size = 34359738368 max_proc_per_user = 2048 max_threads_per_user = 4096 maxusers = 2048 per_proc_address_space = 34359738368 per_proc_data_size = 34359738368 per_proc_stack_size = 34359738368 Portability: galgel: -fixed Information on UNIX V5.1B Patches can be found at http://ftp1.service.digital.com/public/unix/v5.1b/ Processes were bound to CPUs using "runon".

Notes / Tuning Information

 Baseline   C: cc  -arch ev6 -fast -O4 ONESTEP 
      Fortran: f90 -arch ev6 -fast -O5 ONESTEP 

 
 Peak:
   All use -g3 -arch ev6 -non_shared ONESTEP 
   except these (which use only the tunings shown below):
      173.applu 188.ammp 191.fma3d
   Individual benchmark tuning:
   168.wupwise: kf77 -call_shared -inline all -tune ev67 
                -unroll 12 -automatic -align commons -arch ev67
                -fkapargs=' -aggressive=c -fuse
                -fuselevel=1 -so=2 -r=1 -o=1 -interleave
                -ur=6 -ur2=060 ' +PFB
       171.swim: same as base
      172.mgrid: kf90 -call_shared -arch generic -O5 -inline
                 manual -nopipeline -transform_loops -unroll 9 -automatic
                 -fkapargs='-aggressive=a -fuse -interleave
                 -ur=2 -ur3=5 -cachesize=128,16000 ' +PFB
     173.applu: kf90  -O5 -transform_loops 
                -fkapargs=' -o=0 -nointerleave -ur=14
                -ur2=260 -ur3=18' +PFB
      177.mesa: kcc -fast -O4 +CFB +IFB 
    178.galgel: f90 -O5 -fast -unroll 5 -automatic
       179.art: kcc  -assume whole_program -ldensemalloc 
                -call_shared -assume restricted_pointers 
                -unroll 16 -inline none -ckapargs=' 
                -fuse -fuselevel=1 -ur=3' +PFB
    183.equake: cc -call_shared -arch generic -fast -O4
                -ldensemalloc -assume restricted_pointers
                -inline speed -unroll 13 -xtaso_short +PFB
   187.facerec: f90 -O4 -nopipeline -inline all 
                -non_shared -speculate all -unroll 7
                -automatic -assume accuracy_sensitive 
                -math_library fast +IFB 
      188.ammp: cc -arch host -O4 -ifo -assume nomath_errno 
                -assume trusted_short_alignment -fp_reorder 
                -readonly_strings -ldensemalloc -xtaso_short 
                -assume restricted_pointers -unroll 9 
                -inline speed +CFB +IFB +PFB
     189.lucas: kf90 -O5 -fkapargs='-ur=1' +PFB 
     191.fma3d: kf90 -O4 -transform_loops -fkapargs='-cachesize=128,16000 ' +PFB
  200.sixtrack: f90 -fast -O5 -assume accuracy_sensitive 
                -notransform_loops +PFB
      301.apsi: kf90 -O5 -inline none -call_shared -speculate all 
                -align commons -fkapargs=' -aggressive=ab 
                -tune=ev5 -fuse -ur=1 -ur2=60 -ur3=20 
                -cachesize=128,16000'

 Most benchmarks are built using one or more types of 
 profile-driven feedback.  The types used are designated
 by abbreviations in the notes:

 +CFB: Code generation is optimized by the compiler, using 
       feedback from a training run.  These commands are
       done before the first compile (in phase "fdo_pre0"):

            mkdir /tmp/pp
            rm -f /tmp/pp/${baseexe}*

       and these flags are added to the first and second compiles:

            PASS1_CFLAGS = -prof_gen_noopt -prof_dir /tmp/pp
            PASS2_CFLAGS = -prof_use_feedback -prof_dir /tmp/pp
 
      (Peak builds use /tmp/pp above; base builds use /tmp/pb.)

 +IFB: Icache usage is improved by the post-link-time optimizer 
       Spike, using feedback from a training run.  These commands
       are used (in phase "fdo_postN"):  

            mv ${baseexe} oldexe
            spike oldexe -feedback oldexe -o ${baseexe}

 +PFB: Prefetches are improved by the post-link-time optimizer 
       Spike, using feedback from a training run.  These
       commands are used (in phase "fdo_post_makeN"):

            rm -f *Counts*
            mv ${baseexe} oldexe
            pixie -stats dstride oldexe 1>pixie.out 2>pixie.err
            mv oldexe.pixie ${baseexe}

       A training run is carried out (in phase "fdo_runN"), and 
       then this command (in phase "fdo_postN"):

            spike oldexe -fb oldexe -stride_prefetch -o ${baseexe}

 When Spike is used for both Icache and Prefetch improvements, 
 only one spike command is actually issued, with the Icache 
 options followed by the Prefetch options.

 vm:
         vm_bigpg_enabled = 1
         vm_bigpg_thresh = 6
         vm_swap_eager = 0
         ubc_maxpercent = 50
 
 proc:
         max_per_proc_address_space = 34359738368
         max_per_proc_data_size = 34359738368
         max_per_proc_stack_size = 34359738368
         max_proc_per_user = 2048
         max_threads_per_user = 4096
         maxusers = 2048
         per_proc_address_space = 34359738368
         per_proc_data_size = 34359738368
         per_proc_stack_size = 34359738368
 
 
 Portability: galgel: -fixed
  
 Information on UNIX V5.1B Patches can be found at
 http://ftp1.service.digital.com/public/unix/v5.1b/
  
 Processes were bound to CPUs using "runon".

First published at SPEC.org on 10-Aug-2004

Generated on Tue Aug 10 15:57:04 2004 by SPEC CPU2000 HTML formatter v1.01