Feature #8877

Consider using hugepages for the TailsToaster VM

Added by intrigeri 2015-02-07 11:02:25. Updated 2015-07-01 11:42:03.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Test suite
Target version:
Start date:
2015-02-07
Due date:
% Done:

100%

Feature Branch:
Type of work:
Test
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

This should improve the test suite performance quite a bit. Given that some tests configure that VM with 8 GiB of RAM, we would need to pre-allocate 8 GiB of hugepages, and then it would be impossible to run the test suite on a system that has less memory. I don’t think this is a blocker, as relying on swap to run these tests on such a system would likely make them time out or be too slow to be practically useful anyway.
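
For scale: with the usual 2 MiB huge page size, pre-allocating 8 GiB on the host running the test suite would mean something like this (an illustrative sketch, not a tested recipe):

  sysctl vm.nr_hugepages=4096   # 4096 x 2 MiB = 8 GiB, pinned and not swappable
  grep -E 'HugePages_(Total|Free)|Hugepagesize' /proc/meminfo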


Subtasks


History

#1 Updated by intrigeri 2015-02-07 11:03:30

  • Assignee set to anonym
  • QA Check set to Info Needed

anonym, what are your thoughts on this topic?

#2 Updated by anonym 2015-03-13 02:30:38

  • Assignee changed from anonym to intrigeri

It sounds like a nice optimization and thanks to our new configuration system, it could be made into an option (so we inject the needed stuff into the XML before creating a domain) for those with crap loads of ram, like lizard. :)

Hmm, I have 2 GiB of 2 MiB hugepages enabled, but no hugetlbfs mounted and hugetlbfs_mount unset in my /etc/libvirt/qemu.conf. However, I just had a look at grep Huge /proc/meminfo, and AnonHugePages shows that huge pages are definitely being used transparently for my test suite VMs. So… do we get this for free?
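
For reference, this is roughly what I’m looking at (standard paths; the values obviously depend on the machine):

  # statically pre-allocated (hugetlbfs) pages vs. transparently used ones:
  grep Huge /proc/meminfo
  # AnonHugePages grows while the TailsToaster VM runs if THP kicks in;
  # HugePages_Total/HugePages_Free only count the pre-allocated pool.
  # current transparent hugepage policy:
  cat /sys/kernel/mm/transparent_hugepage/enabled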

#3 Updated by intrigeri 2015-03-15 11:24:53

> It sounds like a nice optimization and thanks to our new configuration system, it could be made into an option (so we inject the needed stuff into the XML before creating a domain) for those with crap loads of ram, like lizard. :)

We might want a boolean option:

  • either we allocate as many huge pages as are needed for most tests (1310720 KiB on feature/jessie), and then the few tests that require more memory must be started in a VM that doesn’t have huge pages as its memory backing
  • or we allocate as many huge pages as are needed even for the tests that need more memory (8 GiB); this will only be a useful optimization if the performance gain we get from running the memory erasure tests with huge pages is bigger than the hit we take from wasting about 6.5 GiB of RAM (which implies more data flushed to disk and thus more I/O costs) in all other tests. I doubt that even our testing VMs on lizard would take advantage of that.

=> IMO we should allocate only as many huge pages as are required to run the vast majority of tests, and simply run the more memory-hungry tests with non-hugepages memory backing settings (granted, for those tests we still waste the 1.3 GiB of RAM that’s allocated to huge pages and left unused). And then, no configuration setting is needed :)

> Hmm, I have 2 GiB of 2 MiB hugepages enabled, but no hugetlbfs mounted and hugetlbfs_mount unset in my /etc/libvirt/qemu.conf. However, I just had a look at grep Huge /proc/meminfo, and AnonHugePages shows that huge pages are definitely being used transparently for my test suite VMs. So… do we get this for free?

I’ve no experience with transparent hugepages. It seems to have lots of tweakability: see Documentation/vm/transhuge.txt in the Linux source tree. I’ve no idea if transparent hugepages are as efficient as hugetlbfs in our specific context, and only benchmarking will tell us so. Such benchmarking should IMO be done on the full automated test suite, and ideally on several consecutive runs thereof without rebooting the testing system/VM, as it may be that memory fragmentation increases along the way, and thus increasingly prevents transparent huge pages from being allocated as the test suite goes.
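
To give an idea of the tweakability I mean, these are the main knobs (see transhuge.txt for the authoritative list; paths are the standard sysfs ones):

  cat /sys/kernel/mm/transparent_hugepage/enabled   # always | madvise | never
  cat /sys/kernel/mm/transparent_hugepage/defrag
  # khugepaged is what collapses fragmented memory back into huge pages,
  # which is exactly the fragmentation concern above:
  ls /sys/kernel/mm/transparent_hugepage/khugepaged/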

#4 Updated by intrigeri 2015-03-15 11:25:30

  • Assignee changed from intrigeri to anonym

#5 Updated by anonym 2015-03-17 10:56:00

  • Assignee changed from anonym to intrigeri

intrigeri wrote:
> I’ve no experience with transparent hugepages. It seems to have lots of tweakability: see Documentation/vm/transhuge.txt in the Linux source tree. I’ve no idea if transparent hugepages are as efficient as hugetlbfs in our specific context […]

From my reading of the documentation there should be no performance difference between transparent hugepages and hugetlbfs — after all, the performance gain comes from 2 MiB pages being used, so there will be fewer page faults and TLB misses (i.e. fewer trips into the kernel).

> and only benchmarking will tell us so. Such benchmarking should IMO be done on the full automated test suite, and ideally on several consecutive runs thereof without rebooting the testing system/VM, as it may be that memory fragmentation increases along the way, and thus increasingly prevents transparent huge pages from being allocated as the test suite goes.

Anyway, I did a preliminary test of just one scenario, encryption.feature (since it doesn’t have much randomness involved), run (non-nested) three times in a row for each of the following cases, with these results (sorted):

  • no hugepages: 2m47.302s, 2m49.662s, 2m53.001s
  • hugepages through hugetlbfs: 2m49.332s, 2m50.678s, 2m53.611s
  • transparent hugepages: 2m48.809s, 2m49.102s, 2m52.810s

Note: to make the cache/buffer situation as similar as possible I ran a “zeroth” test (excluded from the results) before each case, and even dropped caches/buffers before each scenario run, as sketched below. I also monitored /proc/meminfo to make sure that hugepages were used (or not) as expected for each scenario.
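
Dropping caches/buffers was done the usual way, something like:

  sync && echo 3 > /proc/sys/vm/drop_caches   # as root; 3 = page cache + dentries and inodes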

So there’s no clear pattern showing that hugepages have any benefit at all, which is disappointing. I suspect that my reading email in parallel on the same computer had a larger influence => the variations we see are noise. In particular, there’s no speedup in booting Tails, which is when most memory is manipulated, and which was what I had hoped for.

Perhaps enough is different in a nested-VM context (where booting Tails is comparatively much slower — after that it’s pretty similar) to warrant new tests, but I must say I’m a bit discouraged by the above results.

What do you think?

#6 Updated by intrigeri 2015-03-17 11:19:22

> So there’s no clear pattern that hugepages has any benefit at all,

Good to know it is this way on bare metal.

> Perhaps enough is different in a nested-VM context (where booting Tails is comparatively much slower — after that it’s pretty similar) to warrant new tests, but I must say I’m a bit discouraged by the above results.

> What do you think?

Back when we initially merged the test suite, I measured pretty good improvements with hugepages. This was in a nested VM context, which is the one that matters, e.g. on lizard. So yes, IMO this warrants new tests in that context.

#7 Updated by intrigeri 2015-03-17 11:24:18

  • Assignee changed from intrigeri to anonym

#8 Updated by intrigeri 2015-03-21 18:28:27

intrigeri wrote:
> Back when we initially merged the test suite, I measured pretty good improvements with hugepages. This was in a nested VM context, which is the one that matters, e.g. on lizard. So yes, IMO this warrants new tests in that context.

Actually, I could do these tests if you share your experimental code with me.

#9 Updated by intrigeri 2015-06-03 08:49:49

Ping?

#10 Updated by anonym 2015-06-03 12:28:16

  • Assignee changed from anonym to intrigeri
  • QA Check changed from Info Needed to Dev Needed

Sorry for the delay! :S

So this is what I did for each test (if I remember correctly; otherwise there are just minor differences that I’m sure you’ll figure out quickly):

> * no hugepages: …

Do not mount any hugetlbfs fs at all.

> * transparent hugepages: …

Mount a hugetlbfs fs. That’s all.

> * hugepages through hugetlbfs: …

Mount a hugetlbfs fs, set hugetlbfs_mount appropriately in /etc/libvirt/qemu.conf, and apply:

--- a/features/domains/default.xml
+++ b/features/domains/default.xml
@@ -2,6 +2,9 @@
   <name>TailsToaster</name>
   <memory unit='KiB'>1310720</memory>
   <currentMemory unit='KiB'>1310720</currentMemory>
+  <memoryBacking>
+    <hugepages/>
+  </memoryBacking>
   <vcpu>1</vcpu>
   <os>
     <type arch='x86_64' machine='pc-0.15'>hvm</type>
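
The hugetlbfs part itself looks roughly like this (the mount point is just an example, use whatever matches your setup):

  sysctl vm.nr_hugepages=1024                   # e.g. 1024 x 2 MiB = 2 GiB
  mkdir -p /dev/hugepages
  mount -t hugetlbfs hugetlbfs /dev/hugepages
  # then in /etc/libvirt/qemu.conf:
  #   hugetlbfs_mount = "/dev/hugepages"
  # and restart libvirtd so it picks up the change.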

#11 Updated by intrigeri 2015-06-03 14:33:42

All these tests were run on a laptop with an i7-4600U CPU (which is supposed to have the current coolest features wrt. nested virtualization). Caches were dropped before each scenario run. The test scenario is encryption.feature.

nested virtualization, level 1 VM is using AnonHugePages

  • transparent hugepages (no change to documented test environment setup): 3m40.171s, 3m42.151s, 3m51.040s
  • statically pre-allocated hugepages (sysctl vm.nr_hugepages=642, check that HugePages_Total in /proc/meminfo is now 642, patch features/domains/default.xml to have memoryBacking = hugepages): 3m45.203s, 3m39.621s, 3m45.037s

nested virtualization, level 1 VM is using statically pre-allocated hugepages

  • transparent hugepages (no change to documented test environment setup): 3m43.289s, 3m43.796s, 3m56.097s
  • statically pre-allocated hugepages (sysctl vm.nr_hugepages=642, check that HugePages_Total in /proc/meminfo is now 642, patch features/domains/default.xml to have memoryBacking = hugepages): 3m45.067s, 3m47.043s, 3m39.744s
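
For reference, the static pre-allocation steps boil down to the following (642 is presumably 1310720 KiB / 2048 KiB = 640 pages, plus a couple of pages of headroom):

  sysctl vm.nr_hugepages=642
  grep HugePages_Total /proc/meminfo   # should now report 642
  # plus the <memoryBacking><hugepages/></memoryBacking> patch to
  # features/domains/default.xml quoted in the previous note.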

#12 Updated by intrigeri 2015-06-03 14:36:45

  • Status changed from Confirmed to In Progress
  • Target version set to Tails_1.4.1
  • % Done changed from 0 to 50
  • QA Check deleted (Dev Needed)
  • Type of work changed from Discuss to Test

Next step: test on isotester1.lizard (encryption.feature again, and also usb_install.feature since it might be that this one benefits more from hugepages). If no measurable benefit there, reject this ticket. It might be that hugepages are useful with nested virt. on pre-Haswell CPUs, but if so, people who are doing that (and I don’t know of anyone who is) should benchmark and report back.

#13 Updated by intrigeri 2015-06-11 19:37:00

  • Target version changed from Tails_1.4.1 to Tails_1.5

#14 Updated by intrigeri 2015-07-01 11:42:03

  • Status changed from In Progress to Resolved
  • Assignee deleted (intrigeri)
  • Target version changed from Tails_1.5 to Tails_1.4.1
  • % Done changed from 50 to 100

intrigeri wrote:
> Next step: test on isotester1.lizard (encryption.feature again, and also usb_install.feature since it might be that this one benefits more from hugepages). If no measurable benefit there, reject this ticket.

No measurable benefit there, giving up.