Bug #12219

Regressions on Stretch with modesetting driver for Intel graphics

Added by intrigeri 2017-02-10 12:58:58. Updated 2017-05-16 17:52:30.

Status:
Resolved
Priority:
Elevated
Assignee:
intrigeri
Category:
Hardware support
Target version:
Start date:
2017-02-10
Due date:
% Done:

100%

Feature Branch:
feature/stretch+intel-video
Type of work:
Code
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

On 3.0~beta1 we switched Intel graphics from the intel X.Org driver to the modesetting one, following Debian’s lead. I’ve heard about crashes (ThinkPad X201). https://bugs.debian.org/837451 says that reverting to the intel driver fixes such crashes. https://bugs.freedesktop.org/show_bug.cgi?id=98742 says that the real fix is in the kernel, but meanwhile I think we should put the intel driver back.


Subtasks


Related issues

Related to Tails - Feature #14991: Remove /usr/share/live/config/xserver-xorg/intel.ids Resolved 2017-11-18

History

#1 Updated by intrigeri 2017-03-08 07:56:04

  • Description updated

#2 Updated by intrigeri 2017-03-11 07:45:14

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10

Tails 3.0~beta1 and 3.0~beta2: on a system with an Intel 915GAV motherboard (Intel GMA900 onboard graphics), the Greeter doesn’t show up. It works fine with feature/stretch+intel-video.

Now, with so few data points, I find it hard to guess whether re-introducing the Intel X.Org driver would fix more regressions than it would add. I’ll send a call for testing to tails-testers@b.o, but in general this doesn’t give that many more data points.

So, dear help desk: as usual, please forward me+anonym any bug report about 3.0~betaN, and I’m particularly interested in such issues with Intel video adapters.

#4 Updated by bertagaz 2017-03-17 10:40:39

  • Assignee changed from intrigeri to bertagaz

On it!

#5 Updated by bertagaz 2017-03-17 11:17:51

OP reported that with the newest release of xserver-xorg-core, the bug disappeared. [1]

There have been new xserver-xorg-core uploads recently, after the last snapshot freeze of the feature/stretch branch. So we should probably bump our APT snapshot to a newer version and try the new modesetting driver.

I’ll update the Debian bug, ask UWE to confirm the working version, and ask lelutin to try it. Meanwhile I’ll bump the APT snapshot, or nag whoever is supposed to do it, so that we can test it.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=98742#c14

#6 Updated by bertagaz 2017-03-17 14:14:07

APT snapshots have been bumped, and since then (specifically https://nightly.tails.boum.org/build_Tails_ISO_feature-stretch/builds/328/archive/build-artifacts/) Tails ships a newer modesetting driver, which may solve the issue.

I asked several people for more tests and information; we’ll see if it does. Otherwise we’ll have to fall back to re-introducing the intel driver.

#7 Updated by sajolida 2017-03-17 19:42:19

I’m running 17aca13c (beta3). I don’t know how to reproduce this so I’ll wait and see :)

#8 Updated by intrigeri 2017-03-18 13:25:04

  • Assignee changed from bertagaz to intrigeri

tl;dr for Help Desk: please point anyone affected while running 3.0~beta2 to:

Versions of xserver-xorg-core:

  • 3.0~beta2: 2:1.19.0-3
  • current feature/stretch: 2:1.19.2-1
  • current Stretch: 2:1.19.2-1
  • current sid: 2:1.19.3-1; according to the upstream changelog, it fixes a few bugs on top of .2 (nothing specific to the modesetting driver, but they still might be relevant); FWIW it was uploaded by a Debian release manager, so I expect it’s meant to be shipped in Stretch; but so far it has been exposed to testers for only 2 days in sid.

So we have three options for 3.0~beta3:

  1. ship only the modesetting driver:
    1. version 2:1.19.2-1 (i.e. do nothing and hope that upgrading to this .2 version + other Stretch changes we have already are enough to fix the regressions we care about)
    2. version 2:1.19.3-1 (i.e. hope that the .3 version fixes more regressions than it introduces)
  2. reintroduce the Intel driver

We need to decide today, so we can’t wait for LeLutin’s tests and have to rely only on what we already know.

I don’t like option 1.2 much because it implies upgrading all other binary packages built from src:xorg-server, and it’s simply a bit too young in unstable for me to feel comfortable forcing this upgrade.

I don’t like option 2 much because it will mostly hide the problem, prevent us from helping to get it fixed in Debian Stretch, and it’s not obvious that it’ll fix more regressions than it introduces.

So at this point, for 3.0~beta3 we’ll go with option 1.1. The next beta or RC should then give us option 1.2 even if we do nothing special about it. Depending on the feedback we get on 3.0~beta3 and on the Debian bug bertagaz pinged, we can decide at that point whether we prefer option 2 or not.

#9 Updated by intrigeri 2017-03-18 18:27:25

sajolida has reproduced the crash with option 1.1, so I’ll look into 1.2 now.

#10 Updated by intrigeri 2017-03-18 19:06:26

intrigeri wrote:
> sajolida has reproduced the crash with option 1.1, so I’ll look into 1.2 now.

Implemented option 1.2 in bugfix/12219-xorg-server-1.19.3-1.

#11 Updated by intrigeri 2017-03-19 10:12:30

intrigeri wrote:
> Implemented option 1.2 in bugfix/12219-xorg-server-1.19.3-1.

That branch upgrades only 4 packages: xserver-common, xserver-xorg-core, xvfb, and xwayland. I’ve looked at reverse-build-deps, there’s no ABI break, and all in all it doesn’t seem scary to upgrade these packages. I’ll merge the branch for option 1.2 at the last minute before freezing 3.0~beta3, if nobody tells me they could reproduce the bug with option 1.2 in the meantime.

#12 Updated by intrigeri 2017-03-19 11:18:23

  • Priority changed from Elevated to High

#13 Updated by intrigeri 2017-03-19 11:18:35

  • Feature Branch changed from feature/stretch+intel-video to feature/stretch+intel-video, bugfix/12219-xorg-server-1.19.3-1

#14 Updated by intrigeri 2017-04-02 08:59:37

  • Assignee changed from intrigeri to sajolida
  • QA Check set to Info Needed
  • Feature Branch changed from feature/stretch+intel-video, bugfix/12219-xorg-server-1.19.3-1 to feature/stretch+intel-video

intrigeri wrote:
> intrigeri wrote:
> > Implemented option 1.2 in bugfix/12219-xorg-server-1.19.3-1.
>
> That branch upgrades only 4 packages: xserver-common, xserver-xorg-core, xvfb, and xwayland. I’ve looked at reverse-build-deps, there’s no ABI break, and all in all it doesn’t seem scary to upgrade these packages. I’ll merge the branch for option 1.2 at the last minute before freezing 3.0~beta3, if nobody tells me they could reproduce the bug with option 1.2 in the meantime.

This was done, but since then sajolida has seen crashes on 3.0~beta3 (xorg-server 1.19.3-1). I’ve pinged the upstream bug report.

sajolida, could you please build an ISO from the feature/stretch+intel-video branch and see if it fixes the problem for you (using it for as long as you know is needed to reproduce the bug)? If it does, and no progress is made upstream, then I’ll merge this branch in time for 3.0~beta4.

#15 Updated by gagz 2017-04-03 16:30:10

Hi,

I have an x201 (Intel Core i7 M620) and experience crashes with Tails 3.0~beta3.
I have been pointed to this image, which I installed and tested, and experienced apparently the same issue: for no obvious reason, the screen turns black and the computer powers off a few seconds later.
I tried sending logs to another computer (`journalctl -f | nc `), but it didn’t give anything interesting, unfortunately.

Well, I don’t know what to do now, but I will gladly test other images if pointed to them.

Thanks a lot!
gagz

#16 Updated by intrigeri 2017-04-03 17:31:16

> I have been pointed to this image, which I installed and tested,
> and experienced apparently the same issue,

Thanks for testing! Crap, that’s bad news! The change brought by this ISO image was my main hope.

Now, I wonder if installing the Intel driver is enough to have it used, so can you please try the same ISO image and pass xorg-driver=intel on the kernel command line, check the Journal to make sure the intel driver is used (and not the modesetting one), and then report back whether you experience the same issue again?
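
For reference, here is a minimal Python sketch of how one could confirm that the parameter actually reached the kernel command line. The helper name `requested_xorg_driver` and the sample command line are made up for illustration; only the `xorg-driver=intel` parameter comes from this ticket. It only shows what was requested, not which driver X.Org ended up loading, so it does not replace checking the Journal.

```python
from typing import Optional


def requested_xorg_driver(cmdline: str) -> Optional[str]:
    """Return the value of the last xorg-driver=... parameter, or None.

    `cmdline` is the kernel command line as one string, e.g. the content
    of /proc/cmdline on a running Linux system.
    """
    driver = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Only accept a well-formed "xorg-driver=<something>" token.
        if key == "xorg-driver" and sep and value:
            driver = value
    return driver


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    sample = "boot=live quiet splash xorg-driver=intel"
    print(requested_xorg_driver(sample))
```

On a real system one would pass the content of `/proc/cmdline` instead of the sample string.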

#17 Updated by gagz 2017-04-03 22:00:03

Thank you intrigeri, it seems that with xorg-driver=intel, everything seems to work fine. I had no crash after 3 hours.

I will try to get a trustable ISO of that branch and continue testing.

Thank you!

#18 Updated by intrigeri 2017-04-04 06:20:08

  • Assignee changed from sajolida to intrigeri

gagz wrote:
> Thank you intrigeri, it seems that with xorg-driver=intel, everything seems to work fine. I had no crash after 3 hours.

Good news! Note that sajolida sometimes sees this bug several times in 1 hour, and sometimes never in 2-3 days, so:

> I will try to get a trustable ISO of that branch and continue testing.

Yes, please!

If you confirm that forcing the intel X.Org driver fixes the problem for you, then we’ll have to add a live-config hook that does exactly that for affected graphics adapters. I’ll need the part of lspci -v that’s about the graphics adapter, for all affected systems. I guess the list will be incomplete in the beginning, and we’ll need to add to it as we get new regression reports. Tentative list of potentially relevant bug reports:

  • [Tails-testers] Testing Tails 3.0 Unable to boot into tails: asked more tests + output of lspci -v
  • Bug report 194bd (00:02.0 VGA compatible controller: Intel Corporation 82915G/GV/910GL Integrated Graphics Controller (rev 04)): asked more tests
  • [Tails-testers] tails 3.0~beta3 report: asked more tests + output of lspci -v

#19 Updated by intrigeri 2017-04-04 19:36:20

  • Assignee changed from intrigeri to gagz

#20 Updated by sajolida 2017-04-13 09:00:38

I don’t have a build machine and I don’t plan on having one (I spent too much time and frustration on this before Jenkins was here). I also don’t want to run ISO images from Jenkins on my production system. So I’m sorry but I can’t test branch feature/stretch+intel-video since the bug is very random and can take several hours to reproduce.

#21 Updated by intrigeri 2017-04-14 07:15:08

Fair enough, I’ll wait a bit for gagz’ input then.

I’ll try to force the intel X.Org driver on all relevant graphics adapters in 3.0~beta4 (Apr 19); worst case, it’ll be in the beta/RC planned around May 20.
So, sajolida, if you want yours on that list, please give me the output of lspci -v. Thanks :)

#22 Updated by gagz 2017-04-14 14:13:10

  • Assignee changed from gagz to intrigeri

Hi,

So I have been using `tails-amd64-feature_stretch+intel-video-3.0~beta4` for some days, and I have given `xorg-driver=intel` to the kernel at every boot.
It just worked flawlessly !

Thank you!

Here is the output of lspci -v:

00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 02) (prog-if 00 [VGA controller])
    Subsystem: Lenovo Device 215a
    Flags: bus master, fast devsel, latency 0, IRQ 28
    Memory at f2000000 (64-bit, non-prefetchable) [size=4M]
    Memory at d0000000 (64-bit, prefetchable) [size=256M]
    I/O ports at 1800 [size=8]
    Expansion ROM at <unassigned> [disabled]
    Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
    Capabilities: [d0] Power Management version 2
    Capabilities: [a4] PCI Advanced Features
    Kernel driver in use: i915

#23 Updated by intrigeri 2017-04-14 16:03:13

> So I have been using `tails-amd64-feature_stretch+intel-video-3.0~beta4` for some days, and I have given `xorg-driver=intel` to the kernel at every boot.
> It just worked flawlessly !

Cool, thanks a lot!

#24 Updated by intrigeri 2017-04-15 08:05:22

  • QA Check changed from Info Needed to Dev Needed

#25 Updated by intrigeri 2017-04-15 09:21:19

  • Assignee changed from intrigeri to gagz
  • QA Check changed from Dev Needed to Info Needed

gagz, sajolida: actually, I also need the output of lspci -mn (because the output of lspci -v doesn’t include the numeric IDs I need, and there’s no bijection between the human readable names and these numerical IDs).

I need this info by the end of the weekend to include it in 3.0~beta4.
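
To illustrate which fields matter, here is a rough Python sketch that pulls the numeric IDs out of one line of `lspci -mn` output. The names `PciDevice` and `parse_lspci_mn_line` are hypothetical, and the sketch assumes the usual machine-readable layout: slot, class, vendor, device, optional `-rXX`/`-pXX` flags, then subsystem vendor and subsystem device.

```python
import shlex
from typing import NamedTuple


class PciDevice(NamedTuple):
    slot: str
    device_class: str
    vendor: str
    device: str
    subsys_vendor: str
    subsys_device: str


def parse_lspci_mn_line(line: str) -> PciDevice:
    """Parse one line of `lspci -mn` output into its numeric fields."""
    # shlex handles the double-quoted fields of the machine-readable format.
    tokens = shlex.split(line)
    # Drop flag-like fields such as -r02 (revision) or -p00 (prog-if).
    fields = [t for t in tokens if not t.startswith("-")]
    if len(fields) < 4:
        raise ValueError(f"unexpected lspci -mn line: {line!r}")
    # Subsystem IDs may be missing; pad with empty strings in that case.
    fields += [""] * (6 - len(fields))
    return PciDevice(*fields[:6])


if __name__ == "__main__":
    # Sample line in the format lspci -mn prints for a graphics adapter.
    dev = parse_lspci_mn_line('00:02.0 "0300" "8086" "0046" -r02 "17aa" "215a"')
    print(f"{dev.vendor}:{dev.device}")
```

The vendor:device pair is what identifies the adapter model; the subsystem pair identifies the specific board or laptop vendor.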

#26 Updated by gagz 2017-04-17 11:47:36

  • Assignee changed from gagz to intrigeri

Hoi,
Here is the output of lspci -nm | grep 00:02.0 :

> 00:02.0 "0300" "8086" "0046" -r02 "17aa" "215a"

Have a good day!

#27 Updated by intrigeri 2017-04-17 11:51:59

> Here is the output of lspci -nm | grep 00:02.0 :

>> 00:02.0 "0300" "8086" "0046" -r02 "17aa" "215a"

OK, I had it on my list already (thanks to the Debian bug report).
So it’ll be in 3.0~beta4 :)

#28 Updated by intrigeri 2017-04-17 17:14:53

  • % Done changed from 10 to 20
  • QA Check deleted (Info Needed)

OK, all the IDs of graphics adapters where failures have been reported have been added as special cases. I’ll mention this in the release notes so affected users can report the info I need to add their own graphics adapters to the list.
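
The special-casing logic can be sketched roughly as follows, in Python for readability. This is not the actual hook: `FORCE_INTEL_IDS`, `display_ids` and `choose_xorg_driver` are hypothetical names, and the sample list contains only the one ID quoted in this ticket (8086:0046). The sketch assumes `lspci -mn` output as input and treats PCI class 0300 (VGA compatible controller) as the graphics adapter.

```python
import shlex
from typing import Iterator, Optional, Set, Tuple

# Hypothetical sample list of (vendor, device) IDs for which the intel
# X.Org driver should be forced; the real list is maintained elsewhere.
FORCE_INTEL_IDS: Set[Tuple[str, str]] = {("8086", "0046")}


def display_ids(lspci_mn_output: str) -> Iterator[Tuple[str, str]]:
    """Yield (vendor, device) for each VGA-class (0300) device listed."""
    for line in lspci_mn_output.splitlines():
        # Ignore flag-like fields such as -r02 (revision).
        fields = [t for t in shlex.split(line) if not t.startswith("-")]
        if len(fields) >= 4 and fields[1] == "0300":
            yield fields[2].lower(), fields[3].lower()


def choose_xorg_driver(
    lspci_mn_output: str,
    force_intel_ids: Set[Tuple[str, str]] = FORCE_INTEL_IDS,
) -> Optional[str]:
    """Return "intel" for a listed adapter, else None (keep the default)."""
    for ids in display_ids(lspci_mn_output):
        if ids in force_intel_ids:
            return "intel"
    return None


if __name__ == "__main__":
    listed = '00:02.0 "0300" "8086" "0046" -r02 "17aa" "215a"'
    print(choose_xorg_driver(listed))
```

Returning None for unlisted adapters keeps the default driver selection untouched, so the special cases stay limited to hardware with a reported failure.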

#29 Updated by intrigeri 2017-04-18 15:01:12

  • Priority changed from High to Elevated

The urgent part was done, and what’s left will depend mostly on testers reporting issues to us.

#30 Updated by Anonymous 2017-04-25 11:02:13

A user reported that this fix did not work for this hardware:


00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b) (prog-if 00 [VGA controller])
    Subsystem: Hewlett-Packard Company Haswell-ULT Integrated Graphics Controller
    Flags: bus master, fast devsel, latency 0, IRQ 46
    Memory at d0000000 (64-bit, non-prefetchable) [size=4M]
    Memory at c0000000 (64-bit, prefetchable) [size=256M]
    I/O ports at 4000 [size=64]
    [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
    Capabilities: <access denied>
    Kernel driver in use: i915
    Kernel modules: i915

00:02.0 "0300" "8086" "0a16" -r0b "103c" "198f"

See https://mailman.boum.org/pipermail/tails-testers/2017-April/000776.html for more details.

#32 Updated by intrigeri 2017-04-29 11:11:42

> A user reported that this fix did not work for this hardware:
> 00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT
> Integrated Graphics Controller (rev 0b) (prog-if 00 [VGA controller])

Thanks! As said there, I think it’s a different problem, and I’ll handle it on a dedicated ticket.

#33 Updated by intrigeri 2017-04-29 11:11:59

> Another user reported this works: https://mailman.boum.org/pipermail/tails-testers/2017-April/000775.html

Thanks! Added to the list in commit:bd36e7ca73263360b96cbed7c6d00185764f3bcd.

#34 Updated by intrigeri 2017-05-16 17:52:30

  • Status changed from In Progress to Resolved
  • % Done changed from 20 to 100

Applied in changeset commit:021dda47aef9d3e0324352b6e3e37ac49ebf982d.

#35 Updated by intrigeri 2017-11-18 21:27:11

  • related to Feature #14991: Remove /usr/share/live/config/xserver-xorg/intel.ids added