Bug #17429

Stretch and Buster VMs running on a sid host can't run our test suite

Added by intrigeri 2020-01-12 12:44:30. Updated 2020-02-05 20:23:52.

Status:
Confirmed
Priority:
Normal
Assignee:
Category:
Test suite
Target version:
Start date:
Due date:
% Done:

0%

Feature Branch:
Type of work:
Code
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

Both anonym & I have seen this problem. The QEMU process that’s supposed to run Tails (a.k.a. TailsToaster) dies, complaining about CPU flags.

My current understanding is that:

  • This affects sid virtualization hosts, i.e. most of our core developers.
  • This does not affect Stretch virtualization hosts, i.e. our CI in its current state (which is unlikely to change in the next few months).
  • It’s unknown if Buster virtualization hosts are affected.

Subtasks


Related issues

Blocks Tails - Feature #16209: Core work: Foundations Team Confirmed

History

#1 Updated by intrigeri 2020-01-12 12:44:47

#2 Updated by anonym 2020-02-03 13:58:37

  • related to Bug #17457: Add Buster support to the automated test suite added

#3 Updated by anonym 2020-02-03 14:03:01

  • related to deleted (Bug #17457: Add Buster support to the automated test suite)

#4 Updated by anonym 2020-02-05 14:58:39

As of Linux 5.4.0 on my sid host, I started seeing my three VMs (Stretch, Buster, sid) crashing their kernels on boot. By messing with the CPU flags I got them booting again (but I wasn’t careful enough to check which flag was the problem). FWIW this configuration works for me on sid/Buster:

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Skylake-Client-IBRS</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='vmx'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='clflushopt'/>
    <feature policy='require' name='umip'/>
    <feature policy='require' name='md-clear'/>
    <feature policy='require' name='stibp'/>
    <feature policy='require' name='arch-capabilities'/>
    <feature policy='require' name='ssbd'/>
    <feature policy='require' name='xsaves'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='require' name='ibpb'/>
    <feature policy='require' name='amd-ssbd'/>
    <feature policy='require' name='skip-l1dfl-vmentry'/>
    <feature policy='disable' name='hle'/>
    <feature policy='disable' name='rtm'/>
  </cpu>

But so far I have been unable to find a configuration that allows me to run the test suite on Stretch; the complaints I get indicate that I should disable spec-ctrl, but then the kernel crashes on boot like above (so I guess that was the problematic flag). Switching TailsToaster to host-passthrough fixes the issues, though.
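For reference, a minimal sketch of what the host-passthrough variant of the `<cpu>` element could look like in a libvirt domain definition. This assumes the TailsToaster domain XML accepts a plain `<cpu>` element in place of the custom one above; the actual test suite domain file is not shown in this ticket, so treat the placement as an assumption:

```xml
  <!-- Sketch only: expose the host CPU to the guest as-is,
       instead of listing a model and individual feature flags. -->
  <cpu mode='host-passthrough'/>
```

With this mode libvirt does not apply per-feature `require`/`disable` policies from a named model, which is why the long flag list above becomes unnecessary.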

It seems to me we should switch from host-model to host-passthrough for TailsToaster: sure, this will mean that we get somewhat different virtual hardware depending on the real hardware, which is bad for reproducibility, but since we have to mess with CPU flags anyway to get host-model working, we are already in that crappy situation.

Thoughts?

#5 Updated by intrigeri 2020-02-05 20:23:52

> It seems to me we should switch from host-model to host-passthrough for TailsToaster:

+1 (that’s what worked for me last time I faced this problem) but to build more confidence into this option, I’d like to test that it still works for me on my current sid.

> sure, this will mean that we get somewhat different virtual hardware depending on real hardware which is bad for reproducibility

host-model also has this problem, no? It’s not as if we’re currently forcing a specific vCPU model independently of the bare metal host CPU.