SPEC CPU2017 Platform Settings for GIGA-BYTE

Operating System Tuning Parameters

kernel.randomize_va_space (ASLR)

This setting can be used to select the type of process address space randomization. Defaults differ based on whether the architecture supports ASLR, whether the kernel was built with the CONFIG_COMPAT_BRK option or not, or the kernel boot options used.
Possible settings:

0: Turn process address space randomization off.
1: Randomize addresses of mmap base, stack, and VDSO pages.
2: Additionally randomize the heap. This is the default when the kernel is built with CONFIG_COMPAT_BRK disabled.

Disabling ASLR can make process execution more deterministic and runtimes more consistent. For more information see the randomize_va_space entry in the Linux sysctl documentation.
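As a minimal sketch of how the current setting might be inspected, the Python snippet below reads /proc/sys/kernel/randomize_va_space and maps the value to the descriptions listed above. The helper names describe_aslr and read_aslr are illustrative only, and the snippet assumes a Linux system where that procfs path is readable.

```python
from pathlib import Path

# Meanings taken from the list of possible settings above.
ASLR_MODES = {
    0: "off",
    1: "mmap base, stack, and VDSO pages randomized",
    2: "mmap base, stack, VDSO pages, and heap randomized",
}

def describe_aslr(value: int) -> str:
    """Return a short description for a randomize_va_space value."""
    return ASLR_MODES.get(value, "unknown setting")

def read_aslr(path: str = "/proc/sys/kernel/randomize_va_space"):
    """Read the current setting; return None if the file is unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    current = read_aslr()
    if current is None:
        print("randomize_va_space is not readable on this system")
    else:
        print(f"randomize_va_space = {current}: {describe_aslr(current)}")
```

Changing the value normally requires root privileges, for example with sysctl -w kernel.randomize_va_space=0.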

Transparent Hugepages (THP)

THP is an abstraction layer that automates most aspects of creating, managing, and using huge pages. It is designed to hide much of the complexity of using huge pages from system administrators and developers. Huge pages increase the memory page size from 4 kilobytes to 2 megabytes, which provides significant performance advantages on systems with highly contended resources and large-memory workloads. If memory utilization is too high, or memory is so badly fragmented that huge pages cannot be allocated, the kernel assigns smaller 4 KB pages instead. Most recent Linux OS releases have THP enabled by default.
THP usage is controlled by the sysfs setting /sys/kernel/mm/transparent_hugepage/enabled. Possible values:

never: entirely disable THP usage.
madvise: enable THP usage only inside regions marked MADV_HUGEPAGE using madvise(2).
always: enable THP usage system-wide. This is the default.

THP creation is controlled by the sysfs setting /sys/kernel/mm/transparent_hugepage/defrag. Possible values:

never: if no THP are available to satisfy a request, do not attempt to make any.
defer: an allocation requesting THP when none are available gets normal pages, and THP creation is requested in the background.
defer+madvise: acts like "always", but only for allocations in regions marked MADV_HUGEPAGE using madvise(2); for all other regions it acts like "defer".
madvise: acts like "always", but only for allocations in regions marked MADV_HUGEPAGE using madvise(2). This is the default.
always: an allocation requesting THP when none are available will stall until some are made.

An application that always requests THP can often benefit from waiting until the huge pages for an allocation can be assembled.
For more information see the Linux transparent hugepage documentation.
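The kernel reports each of the two sysfs settings above as a list of choices with the active one in square brackets, for example "always [madvise] never". The Python sketch below parses that format. The helper names parse_selected and read_thp_setting are illustrative, and the snippet assumes a Linux system that exposes /sys/kernel/mm/transparent_hugepage.

```python
from pathlib import Path

THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")

def parse_selected(text: str):
    """Return (selected, options) from text such as 'always [madvise] never'."""
    options = []
    selected = None
    for token in text.split():
        # The active choice is wrapped in square brackets.
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            selected = token
        options.append(token)
    return selected, options

def read_thp_setting(name: str):
    """Read 'enabled' or 'defrag'; return None if the file is unavailable."""
    try:
        return parse_selected((THP_DIR / name).read_text())
    except OSError:
        return None

if __name__ == "__main__":
    for name in ("enabled", "defrag"):
        result = read_thp_setting(name)
        if result is None:
            print(f"{name}: not available")
        else:
            print(f"{name}: {result[0]} (choices: {', '.join(result[1])})")
```

A new value can be selected by writing one of the listed choices to the same file as root.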

Firmware / BIOS / Microcode Settings

Determinism Slider: (Default = Power)

Selects the determinism mode for the CPU:

Power: Maximizes performance within the power limits defined by cTDP and PPT.
Performance: Provides predictable performance across all processors of the same type.

cTDP Control: (Default = 280)

Configures the maximum power that the CPU will consume, up to the Package Power Limit (PPT). Valid values vary by CPU model. If a value outside the valid range is set, the CPU automatically adjusts it so that it falls within the valid range. When cTDP is increased, additional power is consumed only up to the PPT, which may be less than the cTDP setting.

Model	Minimum cTDP (W)	Maximum cTDP (W)
EPYC 7763	225	280
EPYC 7713	225	240
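The range adjustment described above can be illustrated with a short Python sketch. It assumes that an out-of-range request is adjusted to the nearest bound of the valid range; the text only states that the value is brought into range, so this is a simplification. The function name effective_ctdp is hypothetical.

```python
# Valid cTDP ranges in watts, copied from the table above.
CTDP_RANGE_W = {
    "EPYC 7763": (225, 280),
    "EPYC 7713": (225, 240),
}

def effective_ctdp(model: str, requested_w: int) -> int:
    """Clamp a requested cTDP to the model's valid range.

    Assumption: the adjustment moves the value to the nearest bound.
    """
    low, high = CTDP_RANGE_W[model]
    return max(low, min(requested_w, high))

if __name__ == "__main__":
    print(effective_ctdp("EPYC 7713", 280))  # 240
    print(effective_ctdp("EPYC 7763", 200))  # 225
    print(effective_ctdp("EPYC 7763", 250))  # 250
```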

Package Power Limit (PPT) Control: (Default = 280)

Specifies the maximum power that each CPU package may consume in the system. The actual power limit is the maximum of the Package Power Limit and cTDP.

Model	Minimum PPT (W)	Maximum PPT (W)
EPYC 7763	225	280
EPYC 7713	225	240

NUMA nodes per socket: (Default = NPS4)

Specifies the number of desired NUMA nodes per populated socket in the system:

NPS1: Each physical processor is a NUMA node, and memory accesses are interleaved across all memory channels directly connected to the physical processor.
NPS2: Each physical processor is two NUMA nodes, and memory accesses are interleaved across 4 memory channels.
NPS4: Each physical processor is four NUMA nodes, and memory accesses are interleaved across 2 memory channels.
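To make the relationship between the NPS modes concrete, the following Python sketch computes the total number of NUMA nodes and the number of memory channels interleaved per node. It assumes 8 memory channels per socket, which is inferred from the NPS2 and NPS4 descriptions above rather than stated directly; the function name numa_layout is illustrative.

```python
# Assumption: 8 memory channels per socket, inferred from the
# NPS2 (4 channels per node) and NPS4 (2 channels per node) descriptions.
CHANNELS_PER_SOCKET = 8

NODES_PER_SOCKET = {"NPS1": 1, "NPS2": 2, "NPS4": 4}

def numa_layout(nps: str, sockets: int):
    """Return (total NUMA nodes, memory channels interleaved per node)."""
    nodes = NODES_PER_SOCKET[nps]
    return sockets * nodes, CHANNELS_PER_SOCKET // nodes

if __name__ == "__main__":
    for mode in ("NPS1", "NPS2", "NPS4"):
        total, channels = numa_layout(mode, sockets=2)
        print(f"{mode}: {total} NUMA nodes, {channels} channels per node")
```

On a running Linux system, the resulting node count can be compared with the output of numactl --hardware or lscpu.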

SMT Mode: (Default = Enabled)

Can be used to disable simultaneous multithreading (SMT). To re-enable SMT, a POWER CYCLE is needed after selecting the 'Auto' option. WARNING - S3 is NOT SUPPORTED on systems where SMT is disabled.

IOMMU: (Default = Enabled)

Enable: Enables the I/O Memory Management Unit (IOMMU), which extends the AMD64 system architecture by adding support for address translation and system memory access protection on DMA transfers from peripheral devices.

4-link xGMI max speed: (Default = 16Gbps)

xGMI (Global Memory Interface) is the Socket SP3 processor socket-to-socket interconnect, composed of four x16 links. Each x16 link consists of 16 lanes, and each lane consists of two unidirectional differential signals. Because xGMI is the interconnection between processor sockets, these xGMI settings are not applicable to 1S platforms. NUMA-unaware workloads may need maximum xGMI bandwidth and speed, while compute-efficient platforms may need to minimize xGMI power. If power consumption is too high, the xGMI speed can be lowered, the lane width can be reduced from x16 to x8 (or x2), or an xGMI link can be disabled.
The default value for this option on Milan platforms is "Auto", which corresponds to "16Gbps". On platforms that support higher speeds, it can be raised to increase performance on workloads that benefit from higher cross-socket bandwidth, at the cost of some additional power consumption.
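As a rough illustration of how speed, lane width, and link count combine, the Python sketch below multiplies them into a raw per-direction signaling rate. It assumes the configured speed applies per lane and ignores encoding and protocol overhead, so the result is an upper bound rather than a measured bandwidth. The function name xgmi_raw_gbps is hypothetical.

```python
def xgmi_raw_gbps(speed_gbps: int, lanes_per_link: int = 16, links: int = 4) -> int:
    """Raw per-direction signaling rate across all xGMI links, in Gbps.

    Assumptions: speed_gbps is the per-lane rate, and encoding and
    protocol overhead are ignored.
    """
    return speed_gbps * lanes_per_link * links

if __name__ == "__main__":
    print(xgmi_raw_gbps(16))                    # 1024 (four x16 links at 16 Gbps)
    print(xgmi_raw_gbps(16, lanes_per_link=8))  # 512  (lane width reduced to x8)
    print(xgmi_raw_gbps(16, links=3))           # 768  (one link disabled)
```

Halving the lane width or disabling a link reduces the raw rate proportionally, which is the trade-off against power that the setting description refers to.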