SPEC CPU Benchmark Description

549.fotonik3d_r

Ulf Andersson ulfa [at] pdc [dot] kth [dot] se

Computational Electromagnetics (CEM)

Fotonik3D computes the transmission coefficient of a photonic waveguide using the finite-difference time-domain (FDTD) method for the Maxwell equations. UPML for dielectric materials is used to terminate the computational domain.

The core of the FDTD method is second-order accurate central-difference approximations of the Faraday's and Ampere's laws. These central-differences are employed on a staggered Cartesian grid resulting in an explicit finite-difference method. The FDTD method is also referred to as the Yee scheme. It is the standard time-domain method within CEM.

The code consists of three steps, initialization, time-stepping and wrap-up. More than 99% of the time is spent in the time-stepping. Each time step is identical to all the others. The majority of the time is spent in five routines:

- Updating the electric fields in dielectric materials
- Updating the magnetic fields in dielectric materials
- Computing the Discrete Fourier Transform (DFT) of electric and magnetic fields in power planes
- Updating the electric fields in UPML
- Updating the magnetic fields in UPML

The FDTD-updates of the electric field during the time-stepping is done in the module material_mod while the FDTD-updates of the magnetic fields are done in the module update_mod.

The excitation of the code is a 2D (x and z) cross section of the computational domain. A precomputed Single TE mode is read from file and multiplied with a time dependent pulse:

Ex(:,y_index,:) = Ex(:,y_index,:) + pulse(t)*Single_TE_mode(:,:)

The computation of the excitation takes very little time.

The module power_mod performs the computation of the power flow.

During the initialization, two files containing a list of twinkles are read. A twinkles is one side of an FDTD-cell. These lists defines the two power planes through which we will compute the power flow. The input files also defines for which frequencies the power flow shall be computed.

During the time-stepping a DFT is computed for the perpendicular components (x and z) of the interpolated electric and magnetic fields at the midpoint of each twinkle in the power plane.

After time-stepping the power flow, i.e., Poynting's vector, is computed for each twinkle. Then the contribution from all the twinkles are summed for each frequency. This is written to an output file for both power planes. The transmission coefficient for each frequency can then be computed by a post-processing program. (In the real application we need to take more time steps in order to compute the power flow accurately.)

All input files are ASCII files.

**yee.dat** is the main input file. Inputs in
this file must come in a specific order. For full details, see the
source to 'init.F90'. Among the fields are:

**nx,ny,nz**: The size of the computational domain, nx by ny by nz cells**N_t**: the number of time-steps. Run time is directly affected by the number of steps chosen.**OBC**: Outer Boundary Condition. Positive numbers request use of the Uniaxial PML method with that number of cells. Higher numbers cause more memory usage.

**power1.dat** and **power2.dat** define the
two power planes were the power flow shall be computed:

- power1.dat is for the incident field
- power2.dat is for the transmitted field.

These files define the frequencies for which to compute the power flow. For details on these two files, see the source file 'power.F90'. Among the fields are:

**Filename**points to the geometrical description of the power planes (for example,**trans_W3PC.def**, and**incident_W3PC.def**.)**Freq_no**controls the number of frequencies to calculate, and directly affects the size all arrays allocated in routine 'power_init'. It should be set to the same value in both power1.dat and power2.dat.

**OBJ.dat** describes the photonic waveguide. This file starts
with a short list of values for relative permittivity (epsilon_r). It
then assigns one of these epsilon_r-values to each electric component
of every cell.

**PSI.dat** contains the definition of the single TE mode used
for the Plane Source excitation. This file defines the location (its
y-value) of the Plane Source and contains a pointer to a file,
**TEwaveguide.m**, containing the description of the single TE
mode. This TE mode has been precomputed by another code. It contains
(nx+1)*(nz+1) values.

SPEC provides 4 workloads: test, train, refrate, and refspeed, with these characteristics:

- The Freq_no increases across the 4 workloads, from a low of 301 for test to a high of 4001 for refspeed.
- The first three use a problem size (nx, ny, nz) of 120 x 470 x 120; the refspeed workload uses a size of 240 x 940 x 240.
- The OBC is 4 for test and train, 8 for refrate, and 12 for refspeed.
- The number of steps (N_t) is chosen to match SPEC's requirements:
- Test steps are only a few, because test is intended simply to verify that a working executable has been built.
- Train steps are about 10% of the number of refrate steps.
- Refrate and refspeed steps are set to take the desired amount of time (on a particular system used during development of the benchmark suite).

The output ASCII-file, 'pscyee.out', contains the power values for each frequency for the two power planes. It also contains the values of Ec which can be used for normalization. The values are validated by comparing them to a SPEC-provided set of expected outputs.

Various progress information is written to standard output, which may be useful when debugging, especially if the benchmark is run directly from the command line. When run under the control of the SPEC tools, standard out is captured to <benchmark_name>.log. This file is not validated.

Fortran 95 + OpenMP

No portability issues known.

It is perhaps worth mentioning that some calculations generate 'subnormal' numbers (wikipedia) which may cause slower operation than normal numbers on some hardware platforms. On such platforms, performance may be improved if "flush to zero on underflow" (FTZ) is enabled. During SPEC's testing of Fotonik3d, the output validated correctly whether or not FTZ was enabled.

Ulf Andersson, Min Qiu, and Ziyang Zhang, Parallel Power Computation for Photonic Crystal Devices, Methods and Applications of Analysis, 07/2006; 13(2):149-156. DOI: 10.4310/MAA.2006.v13.n2.a3, PDF available from: www.researchgate.net/publication/228405514

Torleif Martin, Broadband Electromagnetic Scattering and Shielding Analysis using the Finite Difference Time Domain Method, Linköping 2001, ISBN 91-7219-914-8.

S.D. Gedney (1996), An anisotropic perfectly matched layer absorbing media for the truncation of FDTD latices, IEEE Transactions on Antennas and Propagation, Vol. 44 (12): pp. 1630-1639. Bibcode: 1996ITAP...44.1630G. DOI:10.1109/8.546249.

Taflove (ed.), Advances in Computational Electrodynamics, Sect. 5.4-5.9, 1998

A. Taflove and S. C. Hagness, Computational Electrodynamics: The Finite-Difference Time-Domain Method, 3rd ed., Norwood, MA: Artech House, 2005.

A. Taflove, A. Oskooi, and S. G. Johnson, eds., Advances in FDTD Computational Electrodynamics: Photonics and Nanotechnology. Norwood, MA: Artech House, 2013.