Bug #17199

Regression with some AMD GPU + Intel HD 4000 graphics (Ivy Bridge)

Added by goupille 2019-10-29 12:15:53 . Updated 2020-03-26 19:43:42 .

Status:
Confirmed
Priority:
Normal
Assignee:
numbat
Category:
Hardware support
Target version:
Start date:
Due date:
% Done:

0%

Feature Branch:
bugfix/17199-17228-udevadm-trigger-vs-dual-gpu+force-all-tests
Type of work:
Research
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

A user reported that Tails 4.0, started on a Dell z14 5423 with Intel graphics, never gets past the Greeter: clicking the Start Tails button brings the Greeter back instead of starting Tails. The issue did not occur with Tails 3.16, and the user was able to start the same Tails 4.0 device on a Dell E6440 without errors.

I asked the user to try the various workarounds for Intel graphics.


Files


Subtasks


Related issues

Related to Tails - Bug #17228: Tails 4.0 greeter loads a second time when admin password added Confirmed

History

#1 Updated by intrigeri 2019-10-31 11:33:19

  • Category set to Hardware support
  • Assignee changed from intrigeri to goupille

Thanks. Please reassign to me once you’ve collected this info. Also, to work on this we’ll probably need the output of tails-debugging-info (run from a text console, redirected to a file, copied to a USB stick, then shared with us somehow).

#3 Updated by intrigeri 2019-11-01 09:46:41

To debug this, I think we’ll need the output of tails-debugging-info (run as root in a text console), otherwise I’m afraid there’s no way to tell what’s going on :/

#4 Updated by op_mb 2019-11-01 21:28:13

I’ve attached a screenshot from a system with Intel HD Graphics 4000.
I can’t reproduce the issue.

#5 Updated by intrigeri 2019-11-02 05:56:45

  • Subject changed from regression with intel HD 4000 graphics to Regression with some Intel HD 4000 graphics (Ivy Bridge)

#6 Updated by goupille 2019-11-02 18:00:14

adding

 xorg-driver=intel


fixes the issue for that user (I still asked for the debugging info)

#7 Updated by goupille 2019-11-05 11:31:28

  • Assignee changed from goupille to intrigeri

I just resent you the debugging info you requested (the subject is tagged with Bug #17199)

#8 Updated by intrigeri 2019-11-10 16:40:20

Hi @goupille!

goupille wrote:
> adding xorg-driver=intel fixes the issue for that user

Interestingly, after they reported that, you asked them to try xorg-driver=modesetting. The OP obliged, so the debugging info we got is about the latter, not about xorg-driver=intel. It turns out the modesetting driver also makes Tails work :) Maybe you made a typo when asking them for the debugging info?

Anyway, this was super useful:

  • The kernel is 5.3.2-1~exp1 so that’s probably Tails 4.0
  • That system has two GPUs, with vga_switcheroo doing its thing:
    • “00:02.0 VGA compatible controller [0300]: Intel Corporation 3rd Gen Core processor Graphics Controller [8086:0166] (rev 09)”
    • “02:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Thames [Radeon HD 7550M/7570M/7650M] [1002:6841]”
  • With xorg-driver=modesetting, the Intel GPU is used by X.Org, because it’s the default drm device; the AMD GPU is ignored because there’s no support for multiple graphics cards there
  • Without extra boot options:
    • The radeon X.Org driver is used for the AMD GPU, and the modesetting X.Org driver is used for the Intel one.
    • The Greeter loads just fine so there’s actually no X.Org hardware support issue here.
    • X.Org segfaults while logging in, during MAC spoofing, triggered by (II) config/udev: Adding drm device (/dev/dri/card1)

So I’m now almost convinced that this has nothing to do with GPU drivers, but rather, MAC spoofing breaks X.Org in some cases (possibly systems with multiple GPUs). I’ll ask the OP to test with MAC spoofing disabled.
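To check whether another report matches this hardware profile, the two lspci lines quoted above can be parsed for VGA controllers. This is a minimal Python sketch; the function name and the regular expression are illustrative assumptions, not part of tails-debugging-info.

```python
import re

# Matches `lspci -nn` lines for VGA controllers, for example:
# "00:02.0 VGA compatible controller [0300]: Intel ... [8086:0166] (rev 09)"
# The pattern is an assumption based on the two lines quoted in this ticket.
VGA_LINE = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]) "
    r"VGA compatible controller \[0300\]: "
    r"(?P<name>.+) \[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
)


def find_gpus(lspci_output):
    """Return (slot, vendor_id, device_id) for each VGA controller line."""
    gpus = []
    for line in lspci_output.splitlines():
        match = VGA_LINE.match(line.strip())
        if match:
            gpus.append((match["slot"], match["vendor"], match["device"]))
    return gpus


# The two controller lines from the debugging info discussed above.
SAMPLE = """\
00:02.0 VGA compatible controller [0300]: Intel Corporation 3rd Gen Core processor Graphics Controller [8086:0166] (rev 09)
02:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Thames [Radeon HD 7550M/7570M/7650M] [1002:6841]
"""

gpus = find_gpus(SAMPLE)
is_dual_gpu = len(gpus) > 1
```

With the sample above, `is_dual_gpu` is true, which is the situation where the segfault was observed.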

#9 Updated by intrigeri 2019-11-10 16:44:37

  • Subject changed from Regression with some Intel HD 4000 graphics (Ivy Bridge) to Regression with some AMD GPU + Intel HD 4000 graphics (Ivy Bridge)
  • Status changed from New to Confirmed

#10 Updated by intrigeri 2019-11-11 07:58:28

If my hunch is correct and the problem is caused by MAC spoofing running udevadm trigger, one option could be to exclude graphics devices from the trigger action, e.g. with --subsystem-nomatch=SUBSYSTEM or one of the related options.
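As a rough illustration of that option, here is a Python sketch that builds such a command line. `--subsystem-nomatch` is the udevadm trigger option named above; excluding the `drm` subsystem (the one that owns /dev/dri/card*) and the helper names are assumptions made for illustration, not what the Tails code does.

```python
import subprocess


def build_trigger_command(excluded_subsystems=("drm",)):
    """Build a `udevadm trigger` command that skips the given subsystems."""
    command = ["udevadm", "trigger"]
    for subsystem in excluded_subsystems:
        command.append(f"--subsystem-nomatch={subsystem}")
    return command


def trigger_without_gpus(dry_run=True):
    """Print the command, or run it (needs root) when dry_run is False."""
    command = build_trigger_command()
    if dry_run:
        print(" ".join(command))
    else:
        subprocess.run(command, check=True)
    return command
```

Calling `trigger_without_gpus()` with the default `dry_run=True` only prints the command, so the exclusion list can be reviewed before anything is run on a real system.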

#11 Updated by intrigeri 2019-11-14 09:52:56

Disabling MAC spoofing does not fix the problem, but I was slightly confused when I thought it would. What I believe could be breaking things here is not spoofing the MAC address per se, but the fact that our MAC spoofing implementation, which is based on blocking/unblocking network devices, triggers udevadm actions regardless of whether MAC spoofing is enabled.

The logs I got this time suggest that systemd-logind is confused about the state of the FDs (for /dev/dri/card{0,1}) that it passes to X.Org: when amnesia’s X.Org starts, it gets passed paused FDs, which makes it abort. I suspect that’s because at least one of /dev/dri/card{0,1} seems to briefly disappear and come back again when we run udevadm trigger.

So one thing I’d like to try is to avoid triggering udev events for GPU devices.
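One way to check whether a device node really disappears briefly would be to snapshot /dev/dri/card* repeatedly while udevadm trigger runs, then compare consecutive snapshots. This is a minimal Python sketch; the function names and the polling approach are assumptions, not something that was used in the investigation above.

```python
import glob


def snapshot_drm_nodes(pattern="/dev/dri/card*"):
    """Return the set of DRM card nodes that exist right now."""
    return set(glob.glob(pattern))


def node_changes(snapshots):
    """Return (index, removed, added) for each consecutive pair that differs."""
    changes = []
    for index in range(1, len(snapshots)):
        previous, current = snapshots[index - 1], snapshots[index]
        removed = previous - current
        added = current - previous
        if removed or added:
            changes.append((index, sorted(removed), sorted(added)))
    return changes
```

Polling can miss a very short disappearance, so an empty result would not prove that the nodes stayed in place.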

#12 Updated by goupille 2019-11-14 13:56:16

I don’t know if it is related, but we received an anonymous report (Bug report: 20c6669cc73aa96b4bcbf982a55b3d57) from a user with two GPUs (Intel Corporation 4th Gen Core Processor Integrated Graphics Controller [8086:0416] and [AMD/ATI] Mars [Radeon HD 8670A/8670M/8750M] [1002:6600]), saying that setting an administrator password in the Greeter prevents Tails from fully starting (it keeps coming back to the Greeter when clicking on the “Start Tails” button).

When no administrator password is set, Tails starts without issues (hence the logs).

#13 Updated by intrigeri 2019-11-22 11:44:35

> I don’t know if it is related but we received an anonymous report (Bug report: 20c6669cc73aa96b4bcbf982a55b3d57) from a user with two gpu (Intel Corporation 4th Gen Core Processor Integrated Graphics Controller [8086:0416] and [AMD/ATI] Mars [Radeon HD 8670A/8670M/8750M] [1002:6600]), saying that setting an administrator password in the Greeter prevents Tails to fully start (it keeps coming back to the Greeter when clicking on the “start tails” button)

> when no administrator password is set, Tails starts without issues (hence the logs)

Indeed, this could be related: setting an admin password could increase the chances that we win the systemd-logind vs. MAC spoofing vs. X.Org race condition I had a hunch about above.

So far we have had reports of this class of problems affecting 2 users. If there are more, IMO this should become FT work.

#14 Updated by intrigeri 2019-11-22 11:44:36

  • related to Bug #17228: Tails 4.0 greeter loads a second time when admin password added added

#15 Updated by intrigeri 2019-11-22 11:46:55

  • Assignee deleted (intrigeri)

Bug #17228 has more info that is consistent with my current best theory.

#16 Updated by intrigeri 2020-03-26 11:53:40

  • Status changed from Confirmed to In Progress

Applied in changeset commit:tails|9919ae5e4f33114bb61fd0a74450aa7872f2d5c2.

#17 Updated by intrigeri 2020-03-26 12:01:11

  • Status changed from In Progress to Confirmed

I’ve documented the xorg-driver=modesetting workaround.

Next steps:

  1. Prepare a branch where we exclude graphics cards from udevadm trigger: Bug #17199#note-10
  2. Ask affected users to test a nightly build from that branch

#18 Updated by intrigeri 2020-03-26 12:42:56

  • Feature Branch set to bugfix/17199-17228-udevadm-trigger-vs-dual-gpu+force-all-tests

#19 Updated by intrigeri 2020-03-26 19:43:42

  • Assignee set to numbat

Dear help desk,

I have a branch that might fix the “return to Greeter when clicking the Start Tails button” problem that we’ve seen reported so far on a couple of Intel+AMD dual-GPU systems (here + Bug #17228).

Could you please ask affected users to try https://nightly.tails.boum.org/build_Tails_ISO_bugfix-17199-17228-udevadm-trigger-vs-dual-gpu-force-all-tests/lastSuccessful/archive/build-artifacts/ ?

(My current hunch is that this problem is caused by a bug specific to Tails — as opposed to a more general Linux hardware support issue — that could potentially affect a broader range of hardware. That’s why I think it could be worth spending a little bit of time on this one.)