From cd26d18df86ddf99ea61576b02b74f0dd0815c2c Mon Sep 17 00:00:00 2001
From: Thomas Gebert
Date: Tue, 8 Oct 2024 11:06:18 +0200
Subject: [PATCH] Add readme for cpupower

---
 cpupower/readme.md | 171 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 171 insertions(+)
 create mode 100644 cpupower/readme.md

diff --git a/cpupower/readme.md b/cpupower/readme.md
new file mode 100644
index 0000000..3a2986c
--- /dev/null
+++ b/cpupower/readme.md
@@ -0,0 +1,171 @@
+# CPU Power and C-State configuration
+
+
+## Introduction
+During an investigation of performance-related issues, it turned out that
+
+- setting the governor to performance (default: powersave)
+- limiting the CPU to C1 or C1E as its deepest C-State, depending on the power savings needed, the cooling situation and/or the CPU temperature headroom
+
+bring huge performance benefits.
+
+This mainly applies to GlusterFS-based installations; so far we have not observed any benefit or drawback in MinIO setups.
+
+### C1 vs C1E
+- C1E still allows CPU frequency scaling, whereas C1 would run at 100% all the time, so C1E still gives some power savings over C1.
+- The Intel Xeon Silver series CPUs provide a bigger temperature headroom, which makes them better candidates for using C1.
+- The Intel Xeon Gold series CPUs do not provide as much temperature headroom and may run into thermal throttling, depending on the cooling environment.
+- If temperature issues arise at C1, a countermeasure would be to configure the fans in iLO.
+- Look in *dmesg* for temperature-related messages.
+- iLO: *Power & Thermal → Temperatures* shows temperatures and thresholds.
+- *ipmitool sensor* also reports temperatures and thresholds.
+- *Power & Thermal → Fans → Thermal Configuration: Enhanced CPU Cooling* in iLO provides better CPU cooling.
+
+## How To Activate
+These settings can be activated or changed either at runtime or via a GRUB option. Setting them at runtime using scripts is more flexible and does not need a reboot.
+
+### Via Script
+To make these settings persistent across reboots and to activate or change them on a running system, a systemd service is configured to run the script at boot; it can be restarted to apply changed settings.
+
+Copy the following three files to the system on which you want to activate the CPU power settings, e.g. to */home/l3support/cpupower*:
+
+- cpupower
+    - configuration file for the *cpupower.sh* script
+- cpupower.sh
+    - script that uses the *cpupower* command to set the governor and C-States
+- cpupower.service
+    - systemd service that starts *cpupower.sh* at boot
+
+Once the files are available on the server, one can use the *l3support* user to distribute the files to the servers and copy them into the correct locations.
+
+```
+ansible -b storage_nodes -m copy -a "src=cpupower dest=/etc/default/cpupower"
+ansible -b storage_nodes -m copy -a "src=cpupower.service dest=/etc/systemd/system/cpupower.service"
+ansible -b storage_nodes -m copy -a "src=cpupower.sh dest=/usr/local/sbin/cpupower.sh mode=0755"
+```
+
+Then activate the systemd service:
+
+```
+ansible -b storage_nodes -m shell -a "systemctl daemon-reload"
+ansible -b storage_nodes -m shell -a "systemctl enable --now cpupower.service"
+```
+
+#### Sidenotes
+
+##### cpupower.sh
+The script uses *cpupower idle-set --disable-by-latency* for convenience, since this option disables all C-States with a latency equal to or higher than the specified value at once. In contrast, *--disable* takes an idle state number and disables only that single C-State, leaving deeper ones enabled. E.g. (based on the example from [cpupower idle-info](#cpupowerandc-stateconfiguration-cpupowe), where C1E is idle state number 2, counting POLL as 0) *cpupower idle-set --disable 2* would disable the C1E state but leave the C6 state active.
+
+## How To Check the actual status
+To check the current status of the CPU power settings, it is recommended to query it directly with *cpupower*:
+
+    ansible -b storage_nodes -m shell -a "cpupower frequency-info | grep governor; cpupower idle-info | grep -E '^C|POLL'"
+
+Here is an example output of the desired status. One can see that
+
+- the governor is set to performance
+- only the C-States POLL (C0) and C1 are enabled; all others are disabled
+
+```
+myserver| CHANGED | rc=0 >>
+  available cpufreq governors: performance powersave
+                  The governor "performance" may decide which speed to use
+CPUidle driver: intel_idle
+CPUidle governor: menu
+POLL:
+C1:
+C1E (DISABLED) :
+C6 (DISABLED) :
+```
+
+## Full output examples
+
+### cpupower frequency-info
+Some CPUs do not allow setting the governor, which is OK. In that case, one can try setting the *Power Regulator* to *Static High Performance Mode* in iLO and test the performance.
+
+
+```
+# cpupower frequency-info
+analyzing CPU 0:
+  driver: intel_pstate
+  CPUs which run at the same hardware frequency: 0
+  CPUs which need to have their frequency coordinated by software: 0
+  maximum transition latency: Cannot determine or is not supported.
+  hardware limits: 800 MHz - 3.60 GHz
+  available cpufreq governors: performance powersave
+  current policy: frequency should be within 800 MHz and 3.60 GHz.
+                  The governor "performance" may decide which speed to use
+                  within this range.
+  current CPU frequency: Unable to call hardware
+  current CPU frequency: 3.40 GHz (asserted by call to kernel)
+  boost state support:
+    Supported: yes
+    Active: yes
+```
+
+### cpupower idle-info
+
+```
+# cpupower idle-info
+CPUidle driver: intel_idle
+CPUidle governor: menu
+analyzing CPU 0:
+
+Number of idle states: 4
+Available idle states: POLL C1 C1E C6
+POLL:
+Flags/Description: CPUIDLE CORE POLL IDLE
+Latency: 0
+Usage: 69491797
+Duration: 6844908294
+C1:
+Flags/Description: MWAIT 0x00
+Latency: 1
+Usage: 472669351
+Duration: 381352246947
+C1E (DISABLED) :
+Flags/Description: MWAIT 0x01
+Latency: 4
+Usage: 1957301007
+Duration: 899230612315
+C6 (DISABLED) :
+Flags/Description: MWAIT 0x20
+Latency: 170
+Usage: 878016383
+Duration: 1773269582945
+```
+
+### Related tools
+
+#### turbostat
+With *turbostat* one can gather CPU metrics such as the time spent in the different C-States, power consumption, temperatures, etc. The output is very wide, so use a maximized terminal.
+
+```
+Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IPC IRQ SMI POLL C1 C1E C6 POLL% C1% C1E% C6% CPU%c1 CPU%c6 CoreTmp PkgTmp Pkg%pc2 Pkg%pc6 PkgWatt RAMWatt PKG_% RAM_%
+- - 213 7.89 2700 2195 0.49 95450 0 45223 95525 0 0 4.02 92.43 0.00 0.00 92.11 0.00 45 45 0.00 0.00 34.25 0.00 0.00 0.00
+0 0 197 7.30 2700 2195 0.50 3890 0 1813 4514 0 0 3.48 93.00 0.00 0.00 92.70 0.00 45 45 0.00 0.00 34.25 0.00 0.00 0.00
+0 10 242 9.00 2700 2195 0.48 5433 0 2741 5167 0 0 4.90 91.33 0.00 0.00 91.00
+1 1 246 9.14 2700 2195 0.42 6491 0 3014 5437 0 0 5.30 91.29 0.00 0.00 90.86 0.00 45
+1 11 264 9.80 2700 2195 0.38 6865 0 3240 5883 0 0 5.88 90.62 0.00 0.00 90.20
+2 2 194 7.21 2700 2195 0.50 3874 0 1540 4683 0 0 3.15 93.12 0.00 0.00 92.79 0.00 43
+2 12 235 8.71 2700 2195 0.43 5153 0 2900 4718 0 0 4.95 91.60 0.00 0.00 91.29
+3 3 196 7.29 2700 2195 0.52 4027 0 1987 4194 0 0 3.56 92.99 0.00 0.00 92.71 0.00 43
+3 13 218 8.11 2700 2195 0.43 5521 0 2528 5168 0 0 4.46 92.24 0.00 0.00 91.89
+4 4 183 6.80 2700 2195 0.56 3568 0 1512 4035 0 0 2.89 93.50 0.00 0.00 93.20 0.00 44
+4 14 209 7.77 2700 2195 0.49 4807 0 2311 4751 0 0 3.99 92.54 0.00 0.00 92.23
+8 5 188 6.96 2700 2195 0.50 4443 0 1524 5291 0 0 2.96 93.40 0.00 0.00 93.04 0.00 44
+8 15 232 8.63 2700 2195 0.50 5317 0 2811 4410 0 0 4.61 91.69 0.00 0.00 91.37
+9 6 226 8.41 2700 2195 0.49 4864 0 2735 4246 0 0 4.58 91.90 0.00 0.00 91.59 0.00 45
+9 16 190 7.07 2700 2195 0.55 4149 0 1669 4992 0 0 3.23 93.24 0.00 0.00 92.93
+10 7 220 8.15 2700 2195 0.43 4415 0 2429 4617 0 0 4.47 92.17 0.00 0.00 91.85 0.00 44
+10 17 187 6.95 2700 2195 0.50 4063 0 1738 4735 0 0 3.24 93.38 0.00 0.00 93.05
+11 8 178 6.59 2700 2195 0.54 4269 0 1444 4982 0 0 2.71 93.73 0.00 0.00 93.41 0.00 43
+11 18 252 9.34 2700 2195 0.59 5751 0 3163 4867 0 0 5.12 90.96 0.00 0.00 90.66
+12 9 241 8.96 2700 2195 0.44 5364 0 3254 4389 0 0 5.15 91.34 0.00 0.00 91.04 0.00 43
+12 19 154 5.70 2700 2195 0.72 3186 0 870 4446 0 0 1.75 94.60 0.00 0.00 94.30
+```
+
+Helpful options are:
+
+- *-l* and *-H* to list and to hide columns
+- *-S* to print a one-line system summary instead of one row per CPU
\ No newline at end of file
-- 
2.39.5