SPEC CPU2017 Platform Settings for Supermicro Systems

Operating System Tuning Parameters

kernel.randomize_va_space (ASLR)

This setting can be used to select the type of process address space randomization. Defaults differ based on whether the architecture supports ASLR, whether the kernel was built with the CONFIG_COMPAT_BRK option or not, or the kernel boot options used.
Possible settings:

0: Turn process address space randomization off.
1: Randomize addresses of mmap base, stack, and VDSO pages.
2: Additionally randomize the heap. This is the default when the kernel is built without CONFIG_COMPAT_BRK.

Disabling ASLR can make process execution more deterministic and runtimes more consistent.
For more information see the randomize_va_space entry in the Linux sysctl documentation.
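
As a minimal sketch, the current setting can be read from procfs and mapped to the values listed above. The helper names describe_aslr and read_aslr are hypothetical and exist only for illustration; /proc/sys/kernel/randomize_va_space is the procfs file behind the kernel.randomize_va_space sysctl.

```python
from pathlib import Path
from typing import Optional

# Meanings taken from the list of possible settings above.
ASLR_MODES = {
    0: "off",
    1: "mmap base, stack, and VDSO pages randomized",
    2: "mmap base, stack, VDSO pages, and heap randomized",
}


def describe_aslr(raw: str) -> str:
    """Map the text read from randomize_va_space to a description."""
    value = int(raw.strip())
    return ASLR_MODES.get(value, f"unknown value {value}")


def read_aslr(path: str = "/proc/sys/kernel/randomize_va_space") -> Optional[str]:
    """Describe the current setting, or return None if the file is absent."""
    p = Path(path)
    if not p.exists():
        return None
    return describe_aslr(p.read_text())


if __name__ == "__main__":
    print(read_aslr())
```

Recording the value before and after a benchmark run is one way to confirm that the setting did not change underneath the measurement.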

Transparent Hugepages (THP)

THP is an abstraction layer that automates most aspects of creating, managing, and using huge pages. It is designed to hide much of the complexity of using huge pages from system administrators and developers. Huge pages increase the memory page size from 4 kilobytes to 2 megabytes, which can provide significant performance advantages on systems with highly contended resources and large memory workloads. If memory utilization is too high, or memory is so fragmented that huge pages cannot be allocated, the kernel assigns smaller 4k pages instead. Most recent Linux OS releases have THP enabled by default.
THP usage is controlled by the sysfs setting /sys/kernel/mm/transparent_hugepage/enabled.
Possible values:

never: entirely disable THP usage.
madvise: enable THP usage only inside regions marked MADV_HUGEPAGE using madvise(2).
always: enable THP usage system-wide. This is the default.

THP creation is controlled by the sysfs setting /sys/kernel/mm/transparent_hugepage/defrag.
Possible values:

never: if no THP are available to satisfy a request, do not attempt to make any.
defer: an allocation requesting THP when none are available gets normal pages, and THP creation is requested in the background.
defer+madvise: acts like "always", but only for allocations in regions marked MADV_HUGEPAGE using madvise(2); for all other regions it acts like "defer".
madvise: acts like "always", but only for allocations in regions marked MADV_HUGEPAGE using madvise(2). This is the default.
always: an allocation requesting THP when none are available will stall until some are made.

An application that requests THP frequently can often benefit from the "always" setting, under which an allocation waits until the huge pages can be assembled.
For more information see the Linux transparent hugepage documentation.
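
Both sysfs files list every possible value on one line and mark the active one with square brackets, for example "always [madvise] never". The following sketch extracts the active value from each file. The function names active_choice and thp_settings are hypothetical; the two paths are the ones named above.

```python
import re
from pathlib import Path
from typing import Dict, Optional

THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def active_choice(text: str) -> Optional[str]:
    """Return the bracketed (active) value from a sysfs choice line."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def thp_settings(base: Path = THP_DIR) -> Dict[str, Optional[str]]:
    """Read the active 'enabled' and 'defrag' values; None if a file is absent."""
    result = {}
    for name in ("enabled", "defrag"):
        f = base / name
        result[name] = active_choice(f.read_text()) if f.exists() else None
    return result


if __name__ == "__main__":
    print(thp_settings())
```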

dirty_ratio

This is the percentage of total available memory that can be filled with dirty data before a process generating writes must itself start writing the modifications to disk. It can be set through a command like "sysctl -w vm.dirty_ratio=8".
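
To make the percentage concrete, the sketch below converts a dirty_ratio value into a byte threshold. It treats "available memory" as a single number supplied by the caller, which is a simplification: the kernel derives its own figure. The function name and the 100 GiB machine are hypothetical.

```python
GIB = 2**30


def dirty_threshold_bytes(available_bytes: int, dirty_ratio: int) -> int:
    """Bytes of dirty data allowed at the given vm.dirty_ratio percentage."""
    if not 0 <= dirty_ratio <= 100:
        raise ValueError("dirty_ratio is a percentage between 0 and 100")
    return available_bytes * dirty_ratio // 100


# Hypothetical machine: 100 GiB of available memory, vm.dirty_ratio=8.
print(dirty_threshold_bytes(100 * GIB, 8) / GIB)  # 8.0
```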

swappiness

This control defines how aggressively the kernel swaps memory pages. Increasing the value causes the kernel to swap more frequently. The default value is 60. A value of 1 tells the kernel to swap processes to disk only if absolutely necessary. This can be set through a command like "sysctl -w vm.swappiness=1".

zone_reclaim_mode

zone_reclaim_mode selects a more or less aggressive approach to reclaiming memory when a zone runs out of memory. It controls whether memory is reclaimed on the local NUMA node or taken from other nodes. To tell the kernel to free local node memory rather than grabbing free memory from remote nodes, set it through a command like "sysctl -w vm.zone_reclaim_mode=1".

drop_caches

Writing to this setting causes the kernel to drop clean caches, as well as reclaimable slab objects like dentries and inodes. Once dropped, their memory becomes free. Use "sysctl -w vm.drop_caches=3" to free both slab objects and pagecache.
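
The value written to drop_caches is a bit mask: per the Linux sysctl documentation, 1 frees the pagecache and 2 frees reclaimable slab objects, so 3 frees both. The sketch below decodes a value. The function name is hypothetical, and any higher bits are ignored here.

```python
PAGECACHE = 1  # bit 0: free pagecache
SLAB = 2       # bit 1: free reclaimable slab objects (dentries, inodes)


def drop_caches_targets(value: int) -> list:
    """List what a given vm.drop_caches value asks the kernel to free."""
    targets = []
    if value & PAGECACHE:
        targets.append("pagecache")
    if value & SLAB:
        targets.append("slab objects")
    return targets


print(drop_caches_targets(3))  # ['pagecache', 'slab objects']
```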

CPUFreq scaling governor:

Governors are pre-configured, in-kernel power schemes for the CPU that allow the clock speed of the CPUs to be changed on the fly. On Linux systems, the governor can be set for all CPUs through the cpupower utility with the following command:

"cpupower -c all frequency-set -g governor"

Below are governors in the Linux kernel:

performance: Run the CPU at the maximum frequency.
powersave: Run the CPU at the minimum frequency.
userspace: Run the CPU at user specified frequencies.
ondemand: Scales the frequency dynamically according to current load. Jumps to the highest frequency and then possibly backs off as the idle time increases.
conservative: Scales the frequency dynamically according to current load. Scales the frequency more gradually than ondemand.
schedutil: Scheduler-driven CPU frequency selection.
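
As a sketch, the cpupower command shown above can be assembled with the governor name checked against the list. The function names are hypothetical. The sysfs file scaling_available_governors is the standard cpufreq file that lists the governors a given system actually offers, which may be fewer than the six named here.

```python
from pathlib import Path

# Governors named in the list above.
KNOWN_GOVERNORS = (
    "performance", "powersave", "userspace",
    "ondemand", "conservative", "schedutil",
)


def cpupower_command(governor: str) -> list:
    """Build the argument list that sets one governor on all CPUs."""
    if governor not in KNOWN_GOVERNORS:
        raise ValueError(f"unknown governor: {governor!r}")
    return ["cpupower", "-c", "all", "frequency-set", "-g", governor]


def available_governors(cpu: int = 0) -> list:
    """Governors offered for one CPU; empty if cpufreq is not exposed."""
    path = Path(
        f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_available_governors"
    )
    return path.read_text().split() if path.exists() else []


print(" ".join(cpupower_command("performance")))
# cpupower -c all frequency-set -g performance
```

The argument list can be passed to subprocess.run as is; running it requires root privileges.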

tuned-adm:

A command-line interface for switching between the tuning profiles available in supported Linux distributions. The distribution-provided profiles are located in /usr/lib/tuned and the user-defined profiles in /etc/tuned. To set a profile, issue the command "tuned-adm profile (profile_name)".
Below are details about some relevant profiles:

throughput-performance: For typical throughput performance tuning. Disables power saving mechanisms and enables sysctl settings that improve the throughput performance of disk and network I/O. CPU governor is set to performance and CPU energy performance bias is set to performance. Disk readahead values are increased.
latency-performance: For low latency performance tuning. Disables power saving mechanisms. CPU governor is set to performance and locked to the low C states. CPU energy performance bias is set to performance.
balanced: The default profile, which balances power saving and performance. It enables the CPU and disk plugins of tuned, makes the conservative governor active, and sets the CPU energy performance bias to normal. It also enables power saving on audio and graphics cards.
powersave: Maximal power saving for the whole system. It sets the CPU governor to ondemand and the energy performance bias to powersave. It also enables power saving on USB, SATA, audio and graphics cards.
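
The profile descriptions above can be condensed into a small lookup table, which makes it easy to check which governor the active profile is expected to select. This sketch assumes that "tuned-adm active" prints a line of the form "Current active profile: (profile_name)". The names PROFILE_GOVERNOR and parse_active_profile are hypothetical.

```python
from typing import Optional

# Governor selected by each profile, as described in the list above.
PROFILE_GOVERNOR = {
    "throughput-performance": "performance",
    "latency-performance": "performance",
    "balanced": "conservative",
    "powersave": "ondemand",
}

PREFIX = "Current active profile:"  # assumed output format of "tuned-adm active"


def parse_active_profile(output: str) -> Optional[str]:
    """Extract the profile name from the output of 'tuned-adm active'."""
    for line in output.splitlines():
        if line.startswith(PREFIX):
            return line[len(PREFIX):].strip() or None
    return None


sample = "Current active profile: throughput-performance\n"
profile = parse_active_profile(sample)
print(profile, PROFILE_GOVERNOR.get(profile))
# throughput-performance performance
```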

Firmware / BIOS / Microcode Settings

ANC mode:

Ampere NUMA Control (ANC) specifies the number of desired NUMA (Non-Uniform Memory Access) nodes per chip:

monolithic: Each physical processor chip is a NUMA node (default)
hemisphere: Each physical processor chip is two NUMA nodes
quadrant: Each physical processor chip is four NUMA nodes

Dividing the chip into separate nodes (hemisphere or quadrant) may improve latency to the last level cache and main memory, which may benefit overall performance for NUMA-aware operating systems and workloads.
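
The effect of the ANC mode on the NUMA topology is simple arithmetic, sketched below. The function name and the example machine (2 chips with 128 cores each) are hypothetical, and the sketch assumes the cores of a chip are split evenly across its NUMA nodes.

```python
# NUMA nodes per physical chip for each ANC mode, from the list above.
NODES_PER_CHIP = {"monolithic": 1, "hemisphere": 2, "quadrant": 4}


def numa_layout(anc_mode: str, chips: int, cores_per_chip: int) -> dict:
    """Total NUMA nodes and cores per node, assuming an even split of cores."""
    nodes_per_chip = NODES_PER_CHIP[anc_mode]
    if cores_per_chip % nodes_per_chip:
        raise ValueError("cores do not divide evenly across the nodes")
    return {
        "numa_nodes": chips * nodes_per_chip,
        "cores_per_node": cores_per_chip // nodes_per_chip,
    }


# Hypothetical two-chip system with 128 cores per chip in quadrant mode.
print(numa_layout("quadrant", chips=2, cores_per_chip=128))
# {'numa_nodes': 8, 'cores_per_node': 32}
```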

Enable ACPI Auto Configuration:

Automatically configure ACPI related settings with CPPC and LPI enabled.

Enabled: Automatically configure ACPI related settings. (default)
Disabled: Disable ACPI auto configuration and allow the user to manually change CPPC and LPI settings.

Enable CPPC:

CPPC (Collaborative Processor Performance Control), defined in the ACPI specification, describes a mechanism for the OS to manage the performance of a logical processor on a contiguous and abstract performance scale. CPPC exposes a set of registers to describe the abstract performance scale, to request performance levels, and to measure per-CPU delivered performance.

Enabled: Allow firmware to communicate with the OS using CPPC. (default)
Disabled: Prevent firmware from communicating with the OS using CPPC.

Enable LPI:

LPI (Low Power Idle) keeps the system partially running. In low power idle mode, the system can stay up to date whenever a suitable network is available and can also wake when real-time action is required.

Enabled: Enable the system to enter Low Power Idle (LPI) mode. (default)
Disabled: Disable Low Power Idle (LPI) mode.