SPEC CPU2017 Platform Settings for Supermicro Systems
- CPUFreq scaling governor:
-
Governors are in-kernel, pre-configured power schemes for the CPU that allow the clock speed of the CPUs to be changed on the fly. On Linux systems, the governor can be set for all CPUs through the cpupower utility with the following command, replacing "governor" with one of the governor names:
- "cpupower -c all frequency-set -g governor"
Below are governors in the Linux kernel:
- performance: Run the CPU at the maximum frequency.
- powersave: Run the CPU at the minimum frequency.
- userspace: Run the CPU at user specified frequencies.
- ondemand: Scales the frequency dynamically according to current load. Jumps to the highest frequency and then possibly backs off as the idle time increases.
- conservative: Scales the frequency dynamically according to current load. Scales the frequency more gradually than ondemand.
- schedutil: Scheduler-driven CPU frequency selection.
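To confirm which governor is actually in effect, the per-CPU value can be read back from sysfs. The following is a minimal sketch, assuming the kernel exposes the usual cpufreq files under /sys/devices/system/cpu; the function name show_governors and its optional path argument are illustrative and not part of any tool.

```sh
#!/bin/sh
# Print "cpuN <governor>" for every CPU that exposes a cpufreq policy.
# The optional first argument overrides the sysfs CPU root (assumption:
# the default below is where the running kernel publishes cpufreq data).
show_governors() {
    root="${1:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue      # glob did not match, or file unreadable
        cpu="${f%/cpufreq/scaling_governor}"
        printf '%s %s\n' "${cpu##*/}" "$(cat "$f")"
    done
}
```

Calling show_governors with no argument after running the cpupower command should list the same governor for every CPU; an empty result suggests that no cpufreq driver is loaded.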
- tuned-adm:
-
A command-line interface for switching between the tuning profiles available in supported Linux distributions. The distribution-provided profiles are located in /usr/lib/tuned and the user-defined profiles in /etc/tuned. To set a profile, issue the command "tuned-adm profile (profile_name)".
Below are details about some relevant profiles:
- throughput-performance: For typical throughput performance tuning. Disables power saving mechanisms and enables sysctl settings that improve the throughput performance of disk and network I/O. CPU governor is set to performance and CPU energy performance bias is set to performance. Disk readahead values are increased.
- latency-performance: For low latency performance tuning. Disables power saving mechanisms. The CPU governor is set to performance and the CPU is locked to the low C-states. The CPU energy performance bias is set to performance.
- balanced: The default profile, which provides balanced power saving and performance. It enables the CPU and disk plugins of tuned, makes the conservative governor active, and sets the CPU energy performance bias to normal. It also enables power saving on audio and graphics cards.
- powersave: Maximal power saving for the whole system. It sets the CPU governor to ondemand and the energy performance bias to powersave. It also enables power saving on USB, SATA, audio and graphics cards.
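The two profile locations named above can be inspected directly. The sketch below assumes that each profile is a directory containing a tuned.conf file; the function name list_tuned_profiles and its two optional arguments are hypothetical helpers for illustration.

```sh
#!/bin/sh
# List profile names found under the distribution and user directories.
# A directory counts as a profile only if it contains a tuned.conf file.
list_tuned_profiles() {
    dist_dir="${1:-/usr/lib/tuned}"
    user_dir="${2:-/etc/tuned}"
    for d in "$dist_dir"/*/ "$user_dir"/*/; do
        [ -f "${d}tuned.conf" ] || continue   # skip non-profile directories
        name="${d%/}"
        printf '%s\n' "${name##*/}"
    done | sort -u                            # a user profile may shadow a distribution one
}
```

On a system with tuned installed, "tuned-adm list" and "tuned-adm active" report the available and the currently active profile, and are the more direct way to check the result of "tuned-adm profile".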
- Hyper-Threading [ALL]: (Default="Enable")
-
Set to Enabled for Windows and Linux (operating systems optimized for Hyper-Threading Technology) and to Disabled for operating systems that are not optimized for Hyper-Threading Technology. When Disabled, only one thread per enabled core is active.
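On Linux, the effect of this setting can be checked from the running system. This is a minimal sketch, assuming a kernel recent enough to provide /sys/devices/system/cpu/smt/active; the function name smt_status is illustrative.

```sh
#!/bin/sh
# Report whether simultaneous multithreading is active, based on the
# kernel's smt/active file (1 = active, 0 = not active).
smt_status() {
    f="${1:-/sys/devices/system/cpu/smt/active}"
    if [ ! -r "$f" ]; then
        echo unknown                 # older kernel or file not present
        return 0
    fi
    case "$(cat "$f")" in
        1) echo enabled ;;
        0) echo disabled ;;
        *) echo unknown ;;
    esac
}
```

The "Thread(s) per core" line printed by lscpu gives the same information where the sysfs file is not available.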
- VMX: (Default = "Enable")
-
When enabled, a VMM can utilize the additional hardware capabilities provided by Vanderpool Technology.
- LLC Prefetch: (Default = "Enable")
-
The LLC prefetcher is an additional prefetch mechanism on top of the existing prefetchers that prefetch data into the core Data Cache Unit (DCU) and Mid-Level Cache (MLC). Enabling LLC prefetch gives the core prefetcher the ability to prefetch data directly into the LLC without necessarily filling into the MLC.
- Power Technology: (Default = "Custom")
-
Switches processor power management features. If "Custom" is selected, the user can define the values of all power management setup items.
- Power Performance Tuning: (Default = "OS Controls EPB")
-
Allows the OS or BIOS to control the Energy Performance Bias.
Available options are:
- OS Controls EPB: The Energy Performance Bias setting is controlled by the OS.
- BIOS Controls EPB: The Energy Performance Bias setting is controlled by the ENERGY_PERF_BIAS_CFG mode item in the BIOS.
- ENERGY_PERF_BIAS_CFG mode (Energy Performance Bias Setting): (Default = "Balanced Performance")
-
This BIOS option allows for processor performance and power optimization.
Available options are:
- Extreme Performance: This mode will raise system performance to its highest potential. With Extreme Performance enabled, power consumption will increase as the processor frequency is maximized. In other words, system performance is gained at the cost of system power efficiency, depending on the workload.
- Maximum Performance: Delivers more performance, at higher power consumption, than Performance mode.
- Performance: High performance with less need for power saving.
- Balanced Performance (Default Setting): Provides optimal performance efficiency.
- Balanced Power: Provides optimal power efficiency.
- Power: High power saving with less need for performance.
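When the OS controls the Energy Performance Bias, Linux exposes the per-CPU hint as a number from 0 to 15 in sysfs. The sketch below reads it back, assuming the kernel provides the energy_perf_bias file; the names it prints are the ones the kernel accepts for the values 0, 4, 6, 8 and 15. The source does not state how the BIOS option names map onto these numbers, so no such mapping is attempted. The function names epb_name and show_epb are illustrative.

```sh
#!/bin/sh
# Translate a numeric EPB value into the kernel's symbolic name.
# Values without a symbolic name are printed as "raw:<value>".
epb_name() {
    case "$1" in
        0)  echo performance ;;
        4)  echo balance-performance ;;
        6)  echo normal ;;
        8)  echo balance-power ;;
        15) echo power ;;
        *)  echo "raw:$1" ;;
    esac
}

# Print "cpuN <name>" for every CPU that exposes energy_perf_bias.
show_epb() {
    root="${1:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/power/energy_perf_bias; do
        [ -r "$f" ] || continue      # file absent on CPUs without EPB support
        cpu="${f%/power/energy_perf_bias}"
        printf '%s %s\n' "${cpu##*/}" "$(epb_name "$(cat "$f")")"
    done
}
```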
- CPU C6 Report: (Default = "Auto")
-
Controls whether the BIOS reports the CPU C6 state (ACPI C3) to the operating system. During the CPU C6 state, the power to all cache is turned off.
Available options are:
- Enable: The BIOS reports the CPU C6 state (ACPI C3) to the operating system.
- Disable: The BIOS does not report the CPU C6 state (ACPI C3) to the operating system.
- Auto: The BIOS decides automatically whether to report the CPU C6 state (ACPI C3) to the operating system, depending on the Power Technology setting.
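The idle states that the operating system ends up with can be listed from the cpuidle sysfs tree. This is a minimal sketch, assuming a Linux kernel with cpuidle enabled; the state names shown depend on the idle driver in use, so the output is only an indication of what the BIOS reported. The function name list_idle_states is illustrative.

```sh
#!/bin/sh
# Print "stateN <name>" for each idle state known for one CPU.
# The optional argument overrides the cpuidle directory (default: cpu0).
list_idle_states() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    for s in "$dir"/state[0-9]*; do
        [ -r "$s/name" ] || continue     # no cpuidle states exposed
        printf '%s %s\n' "${s##*/}" "$(cat "$s/name")"
    done
}
```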
- Enhanced Halt State (C1E): (Default = "Enable")
-
Power saving feature where, when enabled, idle processor cores will halt.
- Hardware P-states: (Default = "Disable")
-
The Hardware P-State setting allows the user to select between OS and hardware-controlled P-states. Selecting Native Mode allows the OS to choose a P-state. Selecting Out of Band Mode allows the hardware to autonomously choose a P-state without OS guidance. Selecting Native Mode with No Legacy Support functions as Native Mode with no support for older hardware.
- SNC (Sub NUMA): (Default = "Disable")
-
Sub-NUMA Clusters (SNC) is a feature that provides localization benefits similar to those of Cluster-On-Die (COD), without some of COD's downsides. SNC breaks up the LLC into disjoint clusters based on address range, with each cluster bound to a subset of the memory controllers in the system. SNC improves average latency to the LLC.
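With SNC enabled, each socket is normally presented to the operating system as more than one NUMA node, so comparing the node count with the socket count is a quick way to see whether the setting took effect. The sketch below counts the nodes, assuming the Linux sysfs layout under /sys/devices/system/node; the function name count_numa_nodes is illustrative.

```sh
#!/bin/sh
# Count the NUMA nodes the kernel exposes.
# The optional argument overrides the sysfs node directory.
count_numa_nodes() {
    root="${1:-/sys/devices/system/node}"
    n=0
    for d in "$root"/node[0-9]*; do
        [ -d "$d" ] || continue          # glob did not match
        n=$((n + 1))
    done
    echo "$n"
}
```

The "numactl --hardware" command reports the same node count together with the CPUs and memory that belong to each node.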
- XPT Prefetch: (Default = "Auto")
-
This feature allows an LLC read request to be speculatively duplicated and sent concurrently to the appropriate MC (Memory Controller). These speculative MC reads are sent when an LLC miss is likely based on recent LLC history. If an LLC miss does occur, the MC read is already in flight so the requested data will be returned more quickly.
- KTI Prefetch: (Default = "Auto")
-
When this feature is set to Enable, the KTI prefetcher will preload the L1 cache with data deemed relevant to allow the memory read to start earlier on a DDR bus in an effort to reduce latency. Available options are "Auto", "Disable" and "Enable".
- Local/Remote Threshold: (Default = "Auto")
-
This feature allows the user to set the threshold for the Interrupt Request (IRQ) signal, which handles hardware interruptions. There are 5 options: "Disable", "Auto", "Low", "Medium", and "High". This BIOS option changes the threshold number of requests in remote/local-to-remote request queues to cause the throttling.
- Stale AtoS: (Default = "Auto")
-
The in-memory directory has three states: I, A, and S. The I (Invalid) state means the data is clean and does not exist in any other socket's cache. The A (SnoopAll) state means the data may exist in another socket in exclusive or modified state. The S (Shared) state means the data is clean and may be shared across the caches of one or more sockets.
When doing a read to memory, if the directory line is in the A state, all the other sockets must be snooped because another socket may have the line in modified state. If this is the case, the snoop will return the modified data. However, it may be the case that a line is read in A state and all the snoops come back as misses. This can happen if another socket read the line earlier and then silently dropped it from its cache without modifying it.
Available options are:
- Enable: In the situation where a line in A state returns only snoop misses, the line will transition to S state. That way, subsequent reads to the line will encounter it in S state and not have to snoop, saving latency and snoop bandwidth. Stale AtoS may be beneficial in a workload where there are many cross-socket reads.
- Disable: Disabling this option allows the feature to process memory directories as described above.
- Auto: Enables Stale AtoS when an AEP DIMM is installed in the system and disables Stale AtoS when no AEP DIMM is installed.
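The single transition that this option adds can be written as a toy model. The sketch below covers only the case described above (a read finds the line in A state and every snoop misses); all other cases leave the state unchanged because the text does not describe them. The function name next_dir_state and its argument values are hypothetical and do not correspond to any hardware interface.

```sh
#!/bin/sh
# Toy model of the directory state after a memory read.
#   $1: current directory state (I, A or S)
#   $2: snoop outcome ("miss" if all snoops missed, otherwise "hit")
#   $3: Stale AtoS setting ("enable" or "disable")
next_dir_state() {
    state="$1"; snoop="$2"; stale_atos="$3"
    if [ "$state" = "A" ] && [ "$snoop" = "miss" ] && [ "$stale_atos" = "enable" ]; then
        echo S        # stale A line is downgraded, so later reads skip the snoop
    else
        echo "$state" # not modeled here: state is left as it was
    fi
}
```

For example, "next_dir_state A miss enable" prints S, while the same call with "disable" prints A, which is why later reads of that line still have to snoop when the option is off.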
- LLC Dead Line Alloc: (Default = "Enable")
-
In the Skylake-SP non-inclusive cache scheme, MLC evictions are filled into the LLC. When lines are evicted from the MLC, the core can flag them as "dead" (i.e., not likely to be read again). The LLC has the option to drop dead lines and not fill them in the LLC. If the LLC Dead Line Alloc feature is disabled, dead lines will always be dropped and will never fill into the LLC. This can help save space in the LLC and prevent the LLC from evicting useful data. However, if the LLC Dead Line Alloc feature is enabled, the LLC can opportunistically fill dead lines into the LLC if there is free space available. Available options are "Auto", "Enable" and "Disable".
- Enforce POR: (Default = "POR")
-
Setting this option to POR enforces Plan of Record restrictions for DDR4 frequency and voltage programming, so memory speeds are capped at Intel guidelines. Disabling it allows user selection of additional supported memory speeds. Available options are "POR" and "Disable".
- Memory Frequency: (Default = "Auto")
-
Set the maximum memory frequency for onboard memory modules. Available options are "Auto", "2133", "2200", "2400", "2600", "2666", "2800", "2933", "3000", "3200".
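The speed that the memory actually runs at can be checked from the SMBIOS data reported by dmidecode. The sketch below summarizes that output read from standard input; it assumes the field is labeled "Configured Memory Speed" or, in older dmidecode versions, "Configured Clock Speed". The function name configured_mem_speeds is illustrative.

```sh
#!/bin/sh
# Summarize configured DIMM speeds from "dmidecode -t memory" output on stdin.
# Prints one line per distinct speed: "<count> x <speed>".
configured_mem_speeds() {
    awk -F': *' '
        /Configured (Memory|Clock) Speed:/ && $2 != "Unknown" { count[$2]++ }
        END { for (s in count) printf "%d x %s\n", count[s], s }
    ' | sort
}
```

Typical usage, run as root, is "dmidecode -t memory | configured_mem_speeds". Empty slots report "Unknown" and are skipped.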
- ADDDC Sparing: (Default = "Enabled")
-
Adaptive Double Device Data Correction (ADDDC) Sparing detects when the predetermined threshold for correctable errors has been reached and copies the contents of the failing DIMM to spare memory. The failing DIMM or memory rank is then disabled.
Available options are:
- Enabled: Enable the ADDDC Sparing feature.
- Disabled: Disable the ADDDC Sparing feature.
- Patrol Scrub: (Default = "Enable")
-
Enable or disable the ability to proactively search the system memory, repairing correctable errors.
- DCU IP Prefetcher: (Default = "Enable")
-
This L1-cache prefetcher looks for sequential load history and attempts on this basis to determine the next data to be expected and, if necessary, to prefetch this data from the L2 cache or the main memory into the L1 cache.
- DCU Streamer Prefetcher: (Default = "Enable")
-
This prefetcher is an L1 data cache prefetcher that detects multiple loads from the same cache line within a time limit and then prefetches the next line from the L2 cache or the main memory into the L1 cache, on the assumption that the next cache line will also be needed.
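On some Intel processors the state of the DCU prefetchers can be read back from MSR 0x1A4, where a set bit means the corresponding prefetcher is disabled. The sketch below decodes such a value using the bit layout Intel has published for earlier Xeon generations (bit 0: L2 hardware prefetcher, bit 1: L2 adjacent cache line prefetcher, bit 2: DCU streamer prefetcher, bit 3: DCU IP prefetcher). Whether this layout applies to the processor under test is an assumption that should be checked against the processor documentation. The function name decode_prefetch_msr is illustrative.

```sh
#!/bin/sh
# Decode a prefetcher-control MSR value given in hexadecimal without a
# "0x" prefix (the form printed by "rdmsr 0x1a4"). A set bit = disabled.
decode_prefetch_msr() {
    val=$((0x$1))
    i=0
    for name in "L2 hardware prefetcher" "L2 adjacent cache line prefetcher" \
                "DCU streamer prefetcher" "DCU IP prefetcher"; do
        if [ $(( (val >> i) & 1 )) -eq 1 ]; then
            echo "$name: disabled"
        else
            echo "$name: enabled"
        fi
        i=$((i + 1))
    done
}
```

With the msr kernel module loaded and msr-tools installed, a value can be obtained as root with "rdmsr 0x1a4" and passed to the function as its only argument.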