Platform Settings for Dell PowerEdge Servers
kernel.randomize_va_space (ASLR)
-
This setting can be used to select the type of process address space
randomization. Defaults differ based on whether the architecture supports
ASLR, whether the kernel was built with the CONFIG_COMPAT_BRK
option or not, or the kernel boot options used.
Possible settings:
- 0: Turn process address space randomization off.
- 1: Randomize addresses of mmap base, stack, and VDSO pages.
- 2: Additionally randomize the heap. This is the default when CONFIG_COMPAT_BRK is disabled.
Disabling ASLR can make process execution more deterministic and runtimes more consistent.
For more information see the randomize_va_space entry in the
Linux sysctl documentation.
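As a minimal sketch, the current setting can be read from procfs and mapped to the meanings listed above. The path /proc/sys/kernel/randomize_va_space is the usual procfs location for this sysctl; the helper names describe_aslr and read_aslr are illustrative only and are not part of any tool.

```python
from pathlib import Path

# Meanings of each value, as listed in the settings above.
ASLR_MODES = {
    0: "off",
    1: "mmap base, stack, and VDSO pages randomized",
    2: "mmap base, stack, VDSO pages, and heap randomized",
}


def describe_aslr(raw: str) -> str:
    """Map the text read from the sysctl file to a short description."""
    value = int(raw.strip())
    return ASLR_MODES.get(value, f"unknown value {value}")


def read_aslr(path: str = "/proc/sys/kernel/randomize_va_space") -> str:
    """Read the live setting (Linux only; hypothetical helper)."""
    return describe_aslr(Path(path).read_text())


if __name__ == "__main__":
    try:
        print(read_aslr())
    except OSError as exc:
        print(f"could not read ASLR setting: {exc}")
```

Changing the value requires root, for example with sysctl -w kernel.randomize_va_space=0.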
Transparent Hugepages (THP)
-
THP is an abstraction layer that automates most aspects of creating, managing,
and using huge pages. It is designed to hide much of the complexity in using
huge pages from system administrators and developers. Huge pages
increase the memory page size from 4 kilobytes to 2 megabytes. This provides
significant performance advantages on systems with highly contended resources
and large memory workloads. If memory utilization is too high or memory is so badly
fragmented that huge pages cannot be allocated, the kernel will assign
smaller 4 KB pages instead. Most recent Linux OS releases have THP enabled by default.
THP usage is controlled by the sysfs setting /sys/kernel/mm/transparent_hugepage/enabled.
Possible values:
- never: entirely disable THP usage.
- madvise: enable THP usage only inside regions marked MADV_HUGEPAGE using madvise(2).
- always: enable THP usage system-wide. This is the default.
THP creation is controlled by the sysfs setting /sys/kernel/mm/transparent_hugepage/defrag.
Possible values:
- never: if no THP are available to satisfy a request, do not attempt to make any.
- defer: an allocation requesting THP when none are available gets normal pages while requesting THP creation in the background.
- defer+madvise: acts like "always", but only for allocations in regions marked MADV_HUGEPAGE using madvise(2); for all other regions it's like "defer".
- madvise: acts like "always", but only for allocations in regions marked MADV_HUGEPAGE using madvise(2). This is the default.
- always: an allocation requesting THP when none are available will stall until some are made.
An application that always requests THP can often benefit from waiting for an allocation until those huge pages can be assembled.
For more information see the Linux transparent hugepage documentation.
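Both sysfs files list every possible value on one line and mark the active one with square brackets, for example "always [madvise] never". The following is a minimal sketch of reading that format; the function names are illustrative assumptions.

```python
from pathlib import Path

THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def active_choice(text: str) -> str:
    """Return the bracketed token, e.g. 'madvise' from 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active choice marked in {text!r}")


def thp_settings(base: Path = THP_DIR) -> dict:
    """Read the active 'enabled' and 'defrag' values (hypothetical helper)."""
    return {
        name: active_choice((base / name).read_text())
        for name in ("enabled", "defrag")
    }


if __name__ == "__main__":
    try:
        print(thp_settings())
    except OSError as exc:
        print(f"THP sysfs files not readable: {exc}")
```

Writing one of the listed values to the same file as root changes the setting.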
drop_caches
-
sysctl is used to change kernel parameters at run-time.
- sysctl -w vm.drop_caches=3: frees the page cache as well as reclaimable slab objects (dentries and inodes).
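The same effect can be had by writing to /proc/sys/vm/drop_caches directly. The sketch below assumes it is run as root on Linux; the function name and its parameters are illustrative. Because only clean cache objects can be freed, it calls sync first.

```python
import os


def drop_caches(path: str = "/proc/sys/vm/drop_caches", mode: int = 3) -> None:
    """Ask the kernel to drop clean caches (hypothetical helper).

    mode 1 frees the page cache, mode 2 frees reclaimable slab objects
    (dentries and inodes), and mode 3 frees both.
    """
    if mode not in (1, 2, 3):
        raise ValueError(f"mode must be 1, 2, or 3, got {mode}")
    os.sync()  # flush dirty pages so that more of the cache becomes freeable
    with open(path, "w") as handle:
        handle.write(f"{mode}\n")


if __name__ == "__main__":
    try:
        drop_caches()
    except OSError as exc:
        print(f"could not drop caches (root required): {exc}")
```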
tuned-adm
-
This command line utility allows you to switch between user definable tuning profiles.
Several predefined profiles are already included. You can even create your own profile,
either by copying one of the existing ones or by writing a completely new one.
The distribution provided profiles are stored in subdirectories below /usr/lib/tuned
and the user defined profiles in subdirectories below /etc/tuned. If there are profiles
with the same name in both places, user defined profiles have precedence.
Profiles Used:
- throughput-performance: Broadly applicable tuning that provides excellent performance across a variety of common server workloads.
- latency-performance: Optimize for deterministic performance at the cost of increased power consumption.
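The precedence rule described above can be sketched as a simple lookup: check the user directory first, then the distribution directory. This only illustrates the rule and is not how tuned itself is implemented; resolve_profile is a hypothetical helper, and it assumes each profile directory contains a tuned.conf file.

```python
from pathlib import Path


def resolve_profile(name, user_dir="/etc/tuned", system_dir="/usr/lib/tuned"):
    """Return the directory that would supply profile `name`, or None.

    The user directory is checked first, so a user-defined profile
    shadows a distribution profile with the same name.
    """
    for base in (user_dir, system_dir):
        candidate = Path(base) / name
        if (candidate / "tuned.conf").is_file():
            return candidate
    return None


if __name__ == "__main__":
    for profile in ("throughput-performance", "latency-performance"):
        print(profile, "->", resolve_profile(profile))
```

A profile is selected with tuned-adm profile followed by the profile name, and tuned-adm active reports the one currently in use.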
DRAM Refresh Delay
-
Default: Minimum
- Minimum: Minimizing the delay time ensures that the memory controller runs the REFRESH command at regular intervals.
- Performance: By enabling the CPU memory controller to delay running the REFRESH command, performance can be improved for some workloads.
Memory Interleaving
-
Default: Auto
Memory interleaving is supported if a symmetric memory configuration is installed.
When set to Disabled, the system supports Non-Uniform Memory Access (NUMA) (asymmetric) memory configurations.
Operating Systems that are NUMA-aware understand the distribution of memory in a particular system and can
intelligently allocate memory in an optimal manner. Operating Systems that are not NUMA-aware could allocate
memory to a processor that is not local, resulting in a loss of performance. Die and Socket interleaving should
only be enabled for Operating Systems that are not NUMA-aware.
DIMM Self Healing on Uncorrectable Memory Error
-
Default: Enabled
Controls whether Post Package Repair (PPR) is performed on an uncorrectable memory error.
Disabling this feature may improve memory performance for some workloads.
Logical Processor
-
Default: Enabled
Each processor core supports up to two logical processors. When set to Enabled, the BIOS
reports all logical processors. When set to Disabled, the BIOS only reports one
logical processor per core. Generally, higher processor count results in increased
performance for most multi-threaded workloads and the recommendation is to keep this enabled.
However, there are some floating point/scientific workloads, including HPC workloads, where
disabling this feature may result in higher performance.
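On Linux, one way to check whether the OS sees more than one logical processor per core is the sysfs file /sys/devices/system/cpu/cpu0/topology/thread_siblings_list, which holds a CPU list such as "0,64" or "0-1". The sketch below parses that list format; the function names are illustrative assumptions.

```python
def parse_cpu_list(text: str) -> list:
    """Expand a sysfs CPU list such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def smt_enabled(siblings_text: str) -> bool:
    """True when a core reports more than one logical processor."""
    return len(parse_cpu_list(siblings_text)) > 1


if __name__ == "__main__":
    path = "/sys/devices/system/cpu/cpu0/topology/thread_siblings_list"
    try:
        with open(path) as handle:
            print("logical processors per core > 1:", smt_enabled(handle.read()))
    except OSError as exc:
        print(f"topology information not readable: {exc}")
```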
Virtualization Technology
-
Default: Enabled
When set to Enabled, the BIOS will enable processor Virtualization features and provide the virtualization
support to the Operating System (OS) through the DMAR table. In general, only virtualized environments
such as VMware(r) ESX(tm), Microsoft Hyper-V(r), Red Hat(r) KVM, and other virtualized operating systems
will take advantage of these features. Disabling this feature is not known to significantly alter the
performance or power characteristics of the system, so leaving this option Enabled is advised for most cases.
NUMA Nodes per Socket
-
Default: 1
Allows configuration of the memory NUMA domains per socket. The configuration can consist of one whole domain (NPS1),
two domains (NPS2) or four domains (NPS4).
In the case of two-socket platforms, an additional NPS profile is available to have whole system memory be
mapped as a single NUMA domain (NPS0).
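The arithmetic behind these profiles is small enough to sketch. The function below is illustrative only: it counts memory NUMA domains from the NPS value and the socket count, and it ignores any extra domains created by other options such as L3 Cache as NUMA Domain.

```python
def numa_domains(nps: int, sockets: int) -> int:
    """Expected number of memory NUMA domains for an NPS profile.

    NPS1, NPS2, and NPS4 give that many domains per socket. NPS0 maps
    the memory of a two-socket system as one domain.
    """
    if sockets < 1:
        raise ValueError("sockets must be at least 1")
    if nps == 0:
        if sockets != 2:
            raise ValueError("NPS0 is described for two-socket platforms only")
        return 1
    if nps not in (1, 2, 4):
        raise ValueError(f"unsupported NPS value {nps}")
    return nps * sockets


if __name__ == "__main__":
    for nps in (0, 1, 2, 4):
        print(f"NPS{nps} on two sockets: {numa_domains(nps, 2)} domain(s)")
```

The result can be compared with the node count reported by the OS, for example by numactl --hardware.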
L3 Cache as NUMA Domain
-
Default: Disabled
This field specifies that each CCX within the processor will be declared as a NUMA domain.
ACPI CST C2 Latency
-
Default: 800
Enter a value from 18 to 1000 microseconds (decimal).
Larger C2 latency values will reduce the number of C2 transitions and reduce C2 residency.
Fewer transitions can help when performance is sensitive to the latency of C2 entry and exit.
Higher residency can improve performance by allowing higher frequency boost and reducing idle core power.
With Linux kernel 6.0 or later, the C2 transition cost is significantly reduced.
The best value will be dependent on kernel version, use case, and workload.
System Profile
-
Default: Performance Per Watt (OS)
When set to a mode other than Custom, BIOS will set each option accordingly. When set to Custom, each option
setting can be changed.
CPU Power Management
-
Default: OS DBPM
Allows selection of CPU power management methodology.
- Maximum Performance: typically selected for performance-centric workloads where it is
acceptable to consume additional power to achieve the highest possible performance for the computing environment.
This mode drives processor frequency to the maximum across all cores (although idled cores can still be
frequency reduced by C-state enforcement through BIOS or OS mechanisms if enabled). This mode also offers
the lowest latency of the CPU Power Management Mode options, so is always preferred for
latency-sensitive environments.
- OS DBPM: another performance-per-watt option that relies on the operating system to dynamically control
individual core frequency. Both Windows and Linux can take advantage of this mode to reduce the frequency of idle
or underutilized cores in order to save power.
C-States
-
Default: Enabled
C-States allow the processor to enter lower power states when idle.
When set to Enabled (OS Controlled) or when set to Autonomous (if Hardware control is supported), the processor
can operate in all available Power States to save power, but may increase memory latency and frequency jitter.
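On Linux, the idle states that the OS can actually use are exposed by the cpuidle subsystem under /sys/devices/system/cpu/cpuN/cpuidle/stateM, where the name and latency files give each state's name and exit latency in microseconds. The following is a minimal sketch that lists them for one CPU; idle_states is a hypothetical helper.

```python
from pathlib import Path


def idle_states(cpuidle_dir="/sys/devices/system/cpu/cpu0/cpuidle"):
    """Return [(name, exit_latency_us), ...] ordered by state index."""
    base = Path(cpuidle_dir)
    # Directory names look like state0, state1, ...; sort numerically.
    state_dirs = sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:]))
    states = []
    for state in state_dirs:
        name = (state / "name").read_text().strip()
        latency_us = int((state / "latency").read_text())
        states.append((name, latency_us))
    return states


if __name__ == "__main__":
    try:
        for name, latency_us in idle_states():
            print(f"{name}: exit latency {latency_us} us")
    except OSError as exc:
        print(f"cpuidle information not readable: {exc}")
```

An empty list suggests that no idle states are exposed to the OS for that CPU, which would be consistent with C-States being disabled.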
Memory Patrol Scrub
-
Default: Standard
Patrol Scrubbing searches the memory for errors and repairs correctable errors to prevent
the accumulation of memory errors.
- Disabled: no patrol scrubbing will occur.
- Standard: the entire memory array will be scrubbed once in a 24-hour period.
- Extended: the entire memory array will be scrubbed more frequently to further increase system reliability.
PCI ASPM L1 Link Power Management
-
Default: Enabled
When enabled, PCIe Active State Power Management (ASPM) can reduce overall system power a bit while slightly reducing
system performance. NOTE: Some devices may not perform properly (they may hang or cause the system to hang) when ASPM is
enabled. For this reason L1 will only be enabled for validated, qualified cards.
Periodic Directory Rinse (PDR) Tuning
-
Default: Auto
Controls PDR settings that may impact the workload and processor performance.
- Auto: Same as Blended.
- Periodic (RefClock Based Floss Only): Rate based Directory Rinse.
- Blended (Cache Load Based Floss with Background RefClock Based Floss): Demand based Directory Rinse.
Determinism Control
-
Default: Auto
Set to Manual to enable Determinism Slider Control. Read-only unless System Profile is set to Custom.
- Auto: Use default performance determinism settings.
- Manual: Specify custom power/performance determinism.
Determinism Slider
-
Default: Performance Determinism
Controls whether BIOS will enable determinism to control performance. Read-only unless System Profile is set to Custom and Determinism Control is set to Manual.
- Performance: Workload performance is the same regardless of variations in the environment and silicon.
- Power: Maximizes workload performance to part-specific power limits, thereby tapping the additional performance headroom based on the silicon. Maximum performance can be obtained by setting the TDP and Package Power Limit (PPL) to the maximum TDP value supported by the CPU.
Optimizer Mode
-
Default: Disabled
Allows for automatic tuning that maximizes the processor's performance based on system configuration and thermal environment. Requires the system to be configured in Power Determinism Mode.
- Enabled: Enables the feature.
- Disabled: Turns off the feature.
CPU Interconnect Bus Link Power Management
-
Default: Enabled
When Enabled, CPU interconnect bus link power management can reduce overall system power a
bit while slightly reducing system performance.
Algorithm Performance Boost Disable (ApbDis)
-
Default: Disabled
- Enabled: a specific hard-fused Data Fabric (SoC) P-state is forced for optimizing workloads
sensitive to latency or throughput. (For higher performance)
- Disabled: P-states will be automatically managed by the Application Power Management,
allowing the processor to provide maximum performance while remaining within a specified
power-delivery and thermal envelope. (For power savings)
Adaptive Allocation (AA)
-
Default: Auto
- Auto: Same as Disabled
- Enabled: Dynamically alters cache replacement and allocation policy based on application behaviors.
- Disabled: Uses a fixed L2 replacement/allocation policy, which may benefit highly-optimized, cache-aware codes.
Fan Speed Offset
-
Default: Off
Configuring this option provides additional cooling to the server. If hardware is added (for example, new PCIe cards),
additional cooling may be required.
A fan speed offset causes fan speeds to increase (by the offset % value) over baseline fan speeds calculated
by the Thermal Control algorithm.