# CPU Power and C-State configuration
- [CPU Power and C-State configuration](#cpu-power-and-c-state-configuration)
  - [Introduction](#introduction)
    - [C1 vs C1E](#c1-vs-c1e)
  - [How To Activate](#how-to-activate)
    - [Via Script](#via-script)
      - [Sidenotes](#sidenotes)
        - [cpupower.sh](#cpupowersh)
  - [How To Check the actual status](#how-to-check-the-actual-status)
  - [Full output examples](#full-output-examples)
    - [cpupower frequency-info](#cpupower-frequency-info)
    - [cpupower idle-info](#cpupower-idle-info)
    - [Related tools](#related-tools)
      - [turbostat](#turbostat)

## Introduction
During an investigation of performance-related issues, it turned out that two settings bring considerable performance benefits:
- setting the CPU frequency governor to *performance* (default: *powersave*)
- limiting the CPU to C1 or C1E as its deepest C-State, depending on the power savings needed, the cooling situation and/or the CPU temperature headroom

These findings come mainly from GlusterFS-based installations; so far we have not seen any benefit or drawback in MinIO setups.
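
For a quick experiment on a single host, both settings can be applied by hand with *cpupower*. The following is a minimal sketch and not part of the shipped files: the helper names `run` and `apply_cpu_settings` and the `DRY_RUN` variable are assumptions made for illustration, and the state indices have to be taken from `cpupower idle-info` on the target machine.

```shell
#!/bin/sh
# Sketch: apply governor and C-State limits by hand (run as root).
# DRY_RUN=1 only prints the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Arguments: indices of the idle states that should be disabled.
apply_cpu_settings() {
    # Switch all CPUs to the performance governor.
    run cpupower frequency-set -g performance
    # Re-enable every idle state first, then disable the unwanted ones.
    run cpupower idle-set -E
    for idx in "$@"; do
        run cpupower idle-set -d "$idx"
    done
}
```

On a machine whose idle states are POLL, C1, C1E and C6 (indices 0 to 3), `apply_cpu_settings 2 3` would leave only POLL and C1 enabled. Settings applied this way are lost at the next reboot.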
### C1 vs C1E
- C1E still allows CPU frequency scaling, whereas C1 runs at 100% all the time, so C1E still gives some power savings over C1.
- The Intel Xeon Silver series CPUs provide a bigger temperature headroom, which makes them better candidates for using C1.
- The Intel Xeon Gold series CPUs do not provide as much temperature headroom and may run into thermal throttling, depending on the cooling environment.
- If temperature issues arise at C1, a countermeasure is to configure the fans in iLO:
  - check *dmesg* for temperature or throttling messages
  - iLO: *Power & Thermal → Temperatures* shows temperatures and thresholds
  - *ipmitool sensor* also reports temperatures and thresholds
  - iLO: *Power & Thermal → Fans → Thermal Configuration: Enhanced CPU Cooling* provides better CPU cooling
## How To Activate
These settings can be activated or changed either at runtime or via a GRUB option. Setting them at runtime using scripts is more flexible and does not need a reboot.
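
The GRUB route is not described further here. As a hedged sketch: on systems using the *intel_idle* driver, the kernel parameter `intel_idle.max_cstate=<n>` limits the deepest idle state, and it is usually added to `GRUB_CMDLINE_LINUX` in */etc/default/grub*. Which states a given value admits depends on the CPU model's state table, so verify the result with `cpupower idle-info` after the reboot. The helper name `kernel_param` below is an assumption for illustration; it only reads the kernel command line.

```shell
#!/bin/sh
# Hypothetical line in /etc/default/grub (keep your existing parameters):
#   GRUB_CMDLINE_LINUX="<existing parameters> intel_idle.max_cstate=1"

# Print the value of kernel parameter $1 as found in the command line
# file $2 (default: /proc/cmdline). Exit status 1 if it is not set.
kernel_param() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | sed -n "s/^$1=//p" | grep .
}
```

After a reboot, `kernel_param intel_idle.max_cstate` prints the active value, or returns a non-zero status if the parameter was not passed to the kernel.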
### Via Script
To make these settings persistent across reboots and to allow changing them on a running system, a systemd service runs the script at boot; restarting the service applies changed settings.
Copy the following three files to the system where you want to activate the CPU power settings, e.g. to */home/l3support/cpupower*:
- cpupower
  - configuration file for the *cpupower.sh* script
- cpupower.sh
  - script that uses the *cpupower* command to set the governor and C-States
- cpupower.service
  - systemd service that starts *cpupower.sh* at boot

Once the files are available on that server, the *l3support* user can distribute them to the servers with Ansible and copy them into the correct locations.
```
ansible -b storage_nodes -m copy -a "src=cpupower dest=/etc/default/cpupower"
ansible -b storage_nodes -m copy -a "src=cpupower.service dest=/etc/systemd/system/cpupower.service"
ansible -b storage_nodes -m copy -a "src=cpupower.sh dest=/usr/local/sbin/cpupower.sh mode=0755"
```
Then activate the systemd service:
```
ansible -b storage_nodes -m shell -a "systemctl daemon-reload"
ansible -b storage_nodes -m shell -a "systemctl enable --now cpupower.service"
```
#### Sidenotes
##### cpupower.sh
The script can limit the C-States either by naming the deepest allowed C-State (MAX), which is preferred, or by latency. Both options are configured in */etc/default/cpupower*.
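
To illustrate the two modes, here is a minimal sketch of how such a script could be structured. It is not the shipped *cpupower.sh*: the variable names `GOVERNOR`, `MAX_CSTATE`, `MAX_LATENCY` and `CONFIG` and the function names are assumptions, so check the real configuration file for the actual keys. The sketch resolves the C-State name to an index via the cpuidle sysfs directory, because `cpupower idle-set -d` expects a state index.

```shell
#!/bin/sh
# Sketch only: variable and function names are hypothetical.
# Example content of /etc/default/cpupower (plain shell assignments):
#   GOVERNOR=performance
#   MAX_CSTATE=C1        # preferred: deepest C-State that stays enabled
#   MAX_LATENCY=         # alternative: latency limit in microseconds
CONFIG=${CONFIG:-/etc/default/cpupower}
if [ -r "$CONFIG" ]; then
    . "$CONFIG"
fi

# Print the indices of all idle states deeper than the one named $1.
# $2 is the cpuidle sysfs directory of one CPU (overridable for testing).
states_to_disable() {
    max_name=$1
    dir=${2:-/sys/devices/system/cpu/cpu0/cpuidle}
    found=0
    idx=0
    while [ -r "$dir/state$idx/name" ]; do
        name=$(cat "$dir/state$idx/name")
        if [ "$found" = "1" ]; then
            echo "$idx"
        elif [ "$name" = "$max_name" ]; then
            found=1
        fi
        idx=$((idx + 1))
    done
    [ "$found" = "1" ]   # non-zero status if the name was not found
}

apply_settings() {
    cpupower frequency-set -g "${GOVERNOR:-performance}"
    cpupower idle-set -E                  # start with all states enabled
    if [ -n "${MAX_CSTATE:-}" ]; then
        for idx in $(states_to_disable "$MAX_CSTATE"); do
            cpupower idle-set -d "$idx"
        done
    elif [ -n "${MAX_LATENCY:-}" ]; then
        # Disables every state with an equal or higher latency.
        cpupower idle-set -D "$MAX_LATENCY"
    fi
}
```

A real script would call `apply_settings` as its last step. Naming the state is less error-prone than a latency threshold, since latency values differ between CPU models.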
## How To Check the actual status
To check the actual status of the CPU power settings, query it directly with *cpupower*:

`ansible -b storage_nodes -m shell -a "cpupower frequency-info | grep governor; cpupower idle-info | egrep '^C|POLL'"`

The following example output shows the desired status:
- the governor is set to *performance*
- only the C-States POLL (C0) and C1 are enabled; all others are disabled
```
myserver| CHANGED | rc=0 >>
available cpufreq governors: performance powersave
The governor "performance" may decide which speed to use
CPUidle driver: intel_idle
CPUidle governor: menu
POLL:
C1:
C1E (DISABLED) :
C6 (DISABLED) :
```
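
When many nodes have to be checked, reading this output by eye gets tedious. The following sketch turns the check into a pass/fail test. The function names `enabled_states` and `check_states` are assumptions for illustration, and the parsing relies on the `(DISABLED)` marker appearing exactly as in the output above.

```shell
#!/bin/sh
# Print the names of all idle states not marked "(DISABLED)".
# Reads `cpupower idle-info` output on stdin.
enabled_states() {
    grep -E '^(POLL|C[0-9])' | grep -v 'DISABLED' | sed 's/[ :].*$//'
}

# Compare the enabled states with the expected list given in $1,
# e.g. "POLL C1". Returns non-zero and prints a message on mismatch.
check_states() {
    actual=$(enabled_states | paste -sd ' ' -)
    if [ "$actual" != "$1" ]; then
        echo "enabled: $actual, expected: $1" >&2
        return 1
    fi
}
```

Usage on a node would be `cpupower idle-info | check_states "POLL C1"`.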
## Full output examples
### cpupower frequency-info
Some CPUs do not allow setting the governor, which is fine. In that case, one can try setting the *Power Regulator* to *Static High Performance Mode* in iLO and test the performance.
```
# cpupower frequency-info
analyzing CPU 0:
driver: intel_pstate
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: Cannot determine or is not supported.
hardware limits: 800 MHz - 3.60 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 800 MHz and 3.60 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 3.40 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: yes
```
### cpupower idle-info
```
# cpupower idle-info
CPUidle driver: intel_idle
CPUidle governor: menu
analyzing CPU 0:
Number of idle states: 4
Available idle states: POLL C1 C1E C6
POLL:
Flags/Description: CPUIDLE CORE POLL IDLE
Latency: 0
Usage: 69491797
Duration: 6844908294
C1:
Flags/Description: MWAIT 0x00
Latency: 1
Usage: 472669351
Duration: 381352246947
C1E (DISABLED) :
Flags/Description: MWAIT 0x01
Latency: 4
Usage: 1957301007
Duration: 899230612315
C6 (DISABLED) :
Flags/Description: MWAIT 0x20
Latency: 170
Usage: 878016383
Duration: 1773269582945
```
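
The *Usage* and *Duration* counters can be combined into an average residency per idle-state entry, which helps to judge how long the CPU typically stays in a state. The sketch below assumes that *Duration* is reported in microseconds (as the kernel's cpuidle `time` counter is) and that the output has the layout shown above; the function name `avg_residency` is hypothetical.

```shell
#!/bin/sh
# Print "<state> <average microseconds per entry>" for each idle state.
# Reads `cpupower idle-info` output on stdin.
avg_residency() {
    awk '
        # State header lines such as "POLL:", "C1:" or "C1E (DISABLED) :"
        $1 ~ /^(POLL|C[0-9]+[A-Z]*):?$/ { state = $1; sub(/:$/, "", state); next }
        $1 == "Usage:"    { usage = $2 }
        $1 == "Duration:" { if (usage > 0) printf "%s %d\n", state, $2 / usage }
    '
}
```

Usage: `cpupower idle-info | avg_residency`. Note that the counters accumulate since boot, so states disabled later still show their earlier history.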
### Related tools
#### turbostat
With *turbostat* one can gather CPU metrics such as the time spent in the different C-States, power consumption and temperatures. The output is very wide, so use a maximized terminal.
```
Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IPC IRQ SMI POLL C1 C1E C6 POLL% C1% C1E% C6% CPU%c1 CPU%c6 CoreTmp PkgTmp Pkg%pc2 Pkg%pc6 PkgWatt RAMWatt PKG_% RAM_%
- - 213 7.89 2700 2195 0.49 95450 0 45223 95525 0 0 4.02 92.43 0.00 0.00 92.11 0.00 45 45 0.00 0.00 34.25 0.00 0.00 0.00
0 0 197 7.30 2700 2195 0.50 3890 0 1813 4514 0 0 3.48 93.00 0.00 0.00 92.70 0.00 45 45 0.00 0.00 34.25 0.00 0.00 0.00
0 10 242 9.00 2700 2195 0.48 5433 0 2741 5167 0 0 4.90 91.33 0.00 0.00 91.00
1 1 246 9.14 2700 2195 0.42 6491 0 3014 5437 0 0 5.30 91.29 0.00 0.00 90.86 0.00 45
1 11 264 9.80 2700 2195 0.38 6865 0 3240 5883 0 0 5.88 90.62 0.00 0.00 90.20
2 2 194 7.21 2700 2195 0.50 3874 0 1540 4683 0 0 3.15 93.12 0.00 0.00 92.79 0.00 43
2 12 235 8.71 2700 2195 0.43 5153 0 2900 4718 0 0 4.95 91.60 0.00 0.00 91.29
3 3 196 7.29 2700 2195 0.52 4027 0 1987 4194 0 0 3.56 92.99 0.00 0.00 92.71 0.00 43
3 13 218 8.11 2700 2195 0.43 5521 0 2528 5168 0 0 4.46 92.24 0.00 0.00 91.89
4 4 183 6.80 2700 2195 0.56 3568 0 1512 4035 0 0 2.89 93.50 0.00 0.00 93.20 0.00 44
4 14 209 7.77 2700 2195 0.49 4807 0 2311 4751 0 0 3.99 92.54 0.00 0.00 92.23
8 5 188 6.96 2700 2195 0.50 4443 0 1524 5291 0 0 2.96 93.40 0.00 0.00 93.04 0.00 44
8 15 232 8.63 2700 2195 0.50 5317 0 2811 4410 0 0 4.61 91.69 0.00 0.00 91.37
9 6 226 8.41 2700 2195 0.49 4864 0 2735 4246 0 0 4.58 91.90 0.00 0.00 91.59 0.00 45
9 16 190 7.07 2700 2195 0.55 4149 0 1669 4992 0 0 3.23 93.24 0.00 0.00 92.93
10 7 220 8.15 2700 2195 0.43 4415 0 2429 4617 0 0 4.47 92.17 0.00 0.00 91.85 0.00 44
10 17 187 6.95 2700 2195 0.50 4063 0 1738 4735 0 0 3.24 93.38 0.00 0.00 93.05
11 8 178 6.59 2700 2195 0.54 4269 0 1444 4982 0 0 2.71 93.73 0.00 0.00 93.41 0.00 43
11 18 252 9.34 2700 2195 0.59 5751 0 3163 4867 0 0 5.12 90.96 0.00 0.00 90.66
12 9 241 8.96 2700 2195 0.44 5364 0 3254 4389 0 0 5.15 91.34 0.00 0.00 91.04 0.00 43
12 19 154 5.70 2700 2195 0.72 3186 0 870 4446 0 0 1.75 94.60 0.00 0.00 94.30
```
Helpful options are:
- *-l* and *-H* to list and to hide columns
- *-S* to get a summary of the processor instead of single cores
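
For scripted checks, a single value can be picked out of the wide output by column name. The following is a small sketch; the function name `turbostat_column` is an assumption, and it expects the header row as the first line of its input, followed by the summary row.

```shell
#!/bin/sh
# Print the value of column $1 from the first data row of turbostat
# output on stdin. The column is located by its header name.
turbostat_column() {
    awk -v col="$1" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
        NR == 2 && idx { print $idx }
    '
}
```

On turbostat versions that support the long options `--quiet` and `--num_iterations`, a call could look like `turbostat --quiet --Summary --num_iterations 1 | turbostat_column C6%`. With C1 as the deepest enabled state, a value other than zero here would suggest that the limit is not in effect.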