Feature #15054

Use the performance CPU scaling governor on lizard

Added by intrigeri 2017-12-14 06:15:06. Updated 2017-12-22 07:32:54.

Status:
Resolved
Priority:
Normal
Assignee:
intrigeri
Category:
Infrastructure
Target version:
Start date:
2017-12-14
Due date:
% Done:

100%

Feature Branch:
Type of work:
Sysadmin
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

We’re currently using the powersave governor, which likely hinders performance: reacting to load changes and switching frequencies both take time.


Subtasks


History

#1 Updated by intrigeri 2017-12-16 11:56:27

x86_energy_perf_policy can be used to configure “how aggressively the hardware enters and exits CPU idle states (C-states) and Processor Performance States (P-states)”, e.g. I would do x86_energy_perf_policy performance. It requires loading the msr kernel module. And cpupower can be used to set the cpufreq governor, e.g. I would do cpupower frequency-set --governor performance.
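
Concretely, the simpler approach would boil down to something like this (just a sketch; tool names are the ones shipped with the kernel tools in Debian):

  # make the MSR interface available, x86_energy_perf_policy needs it
  modprobe msr
  # bias the hardware energy/performance policy towards performance
  x86_energy_perf_policy performance
  # switch the cpufreq scaling governor to performance on all cores
  cpupower frequency-set --governor performance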

Another way to handle this would be to install tuned and benefit from the profiles maintained by the Red Hat performance team: I trust they have plenty of good engineers whose job is to come up with the right settings for various workloads, so why shouldn’t we piggy-back on their work? See e.g. the profile for running KVM guests, which includes the throughput-performance profile; the latter does exactly what I’m suggesting above wrt. governor and energy/perf policy. The only problem I see with this approach is that it’s more invasive (tuned also manages a bunch of sysctl:s, kernel VM settings, etc.) and it increases complexity (we need to remember that tuned is there whenever we want to manually tweak anything it might already be managing), so I won’t set this up without discussing it with my team-mates first.
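
For the record, the tuned route would presumably look like this (profile names are the upstream ones; untested on our systems):

  apt install tuned
  # virtual-host includes throughput-performance, i.e. the performance
  # governor + energy/perf policy, plus KVM-host-specific tweaks
  tuned-adm profile virtual-host
  # check which profile is currently applied
  tuned-adm active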

For now I’ll try out the simpler approach on sib (my local instance of our Jenkins setup) and we’ll see. But sib runs only one isobuilder + one isotester, and has quite different hardware, so I won’t be able to draw conclusions there that apply equally to a virtualization host that runs 24+ VMs. Still, it’ll be interesting to see if I can already measure any performance impact on sib.

#2 Updated by intrigeri 2017-12-16 12:59:53

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10

Applied on sib: I notice no statistically significant performance impact (at least it does not get worse), so I applied it on lizard too; let’s see.
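
For the record, a quick way to check that the change actually took effect (standard sysfs cpufreq paths, nothing lizard-specific):

  # should print "performance" for every core
  cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | sort | uniq -c
  # or, via cpupower
  cpupower frequency-info --policy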

#3 Updated by intrigeri 2017-12-18 09:00:33

  • % Done changed from 10 to 20

For the curious, ISO build jobs:

  • between 2017-12-06T00:00:00 and 2017-12-15T00:00:00: 51.5 min (mean), 51.3 min (median)
  • since this change (2017-12-16T13:00:00) until now: 50 min (mean), 49.7 min (median)
  • standard deviation (computed on seconds) = 59 s in both cases

… i.e. a 2-3% improvement, which is not statistically significant, and I have less than 48h of data anyway.
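
For clarity, these percentages are simply (old - new) / old:

  awk 'BEGIN { printf "mean: %.1f%%  median: %.1f%%\n", (51.5-50)/51.5*100, (51.3-49.7)/51.3*100 }'
  # → mean: 2.9%  median: 3.1%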

Now, I don’t expect CPU throughput-bound workloads like ISO build jobs to benefit the most from this change: they tend to be CPU-intensive, so I would expect the cores assigned to the corresponding QEMU processes to stay at a high frequency even with the powersave governor. In contrast, spikier workloads should benefit more: their response latency should decrease. But we have no metrics that measure the performance of such services directly, so I’ll come back to this in a few days and look at the Munin graphs to see whether the relevant low-level system metrics show any impact.
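
If we wanted to sanity-check that assumption, one could watch per-core frequencies while a build runs, e.g. (a sketch, nothing lizard-specific):

  # snapshot of the current frequency of every core, in kHz
  grep . /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq
  # or a live per-core view, refreshed every 5 seconds
  # (turbostat comes with the kernel's power tools)
  turbostat -i 5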

#4 Updated by intrigeri 2017-12-22 07:32:54

  • Subject changed from Consider using the performance CPU scaling governor on lizard to Use the performance CPU scaling governor on lizard
  • Status changed from In Progress to Resolved
  • % Done changed from 20 to 100

intrigeri wrote:
> For the curious, ISO build jobs:
> * between 2017-12-06T00:00:00 and 2017-12-15T00:00:00: 51.5 min (mean), 51.3 min (median)
> * since this change (2017-12-16T13:00:00) until now: 50 min (mean), 49.7 min (median)
> * standard deviation (computed on seconds) = 59 s in both cases
>
> … i.e. a 2-3% improvement, which is not statistically significant, and I have less than 48h of data anyway.

This trend continued: almost a 4% improvement (in both mean and median) since this change. So unless I notice regressions below, I’ll call this done.

> In contrast, spikier workloads should benefit more: their response latency should decrease. But we have no metrics that measure the performance of such services directly, so I’ll come back to this in a few days and look at the Munin graphs to see whether the relevant low-level system metrics show any impact.

Actually, on top of the fact that we don’t collect relevant performance data, I’ve realized that Munin’s data resolution is probably not fine-grained enough to measure such things. Anyway:

  • IO latency: nothing noticeable
  • IO throughput: nothing noticeable (perhaps a tiny bit less spiky, which would not be surprising, but it’s hard to tell with so little data)
  • VMstat running processes, CPU usage, load average: a bit less spiky

So nothing terribly relevant on the Munin front, not really surprising as explained above.

So I’ve made this the default for all our bare metal systems and I’m calling this done.
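
For future reference, one simple way to make the governor persistent across reboots on Debian is via the cpufrequtils package, which reads /etc/default/cpufrequtils at boot (just a sketch; the mechanism we actually deploy on our systems may differ):

  # /etc/default/cpufrequtils
  GOVERNOR="performance"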