Bug #16281
Update the test suite for Buster
100%
Description
The test suite won’t pass directly for the feature/buster branch, as some reference images will need updating etc. Let’s keep track of those updates with this bug report.
Subtasks
Bug #16287: buster vs. ssh.feature: unable to connect to SFTP server | Resolved | 0 |
Bug #16314: IO errors on buster when restoring snapshot from previous run with --keep-snapshots | Rejected | 10 |
Bug #16316: Black screen/apparent machine crash in test suite due to snapshot mechanism? | Resolved | 100 |
Bug #16317: persistence.feature fails due to NetworkManager test | Resolved | 0 |
Bug #16319: Regression in ssh.feature with buster | Resolved | 0 |
Bug #16335: Gobby 0.5→0.6 changes in buster: test suite update needed | Resolved | 0 |
Bug #16340: Seahorse/buster: No import button, no more icons | Resolved | 0 |
Bug #16341: Seahorse/buster: No more close button | Resolved | 100 |
Bug #16616: Re-enable and adjust test for desktop icons | Resolved | 100 |
Bug #16621: Fetching OpenPGP keys using Seahorse fails. | Resolved | 100 |
Bug #16623: Fix tests for mat with mat2 | Resolved | 100 |
Bug #16817: Emergency shutdown automated tests on Buster often fail to notice that memory wipe was completed | Resolved | intrigeri | 0 |
Bug #16819: Failing scenario: Recovering in offline mode after Additional Software previously failed to upgrade and then succeed to upgrade when online | Resolved | intrigeri | 100 |
Bug #16820: UEFI test is broken for Tails based on buster | Resolved | intrigeri | 100 |
Related issues
Related to Tails - Bug #16969: "Electrum starts" test step is broken on Buster | In Progress | ||
Blocks Tails - Feature #16209: Core work: Foundations Team | Confirmed | ||
Blocked by Tails - |
Resolved | ||
Blocked by Tails - |
Resolved | 2016-12-27 |
History
#1 Updated by CyrilBrulebois 2019-01-05 10:37:11
- Status changed from New to In Progress
Applied in changeset commit:tails|1bdac672b17eeee5fbb2f7bed46a41af175b3c81.
#2 Updated by CyrilBrulebois 2019-01-05 10:38:58
With these initial commits, at least thunderbird.feature is OK:
5 scenarios (5 passed)
42 steps (42 passed)
#3 Updated by intrigeri 2019-01-05 10:39:30
- blocks Feature #13241: Core work: Test suite maintenance added
#4 Updated by intrigeri 2019-03-20 14:50:23
- blocks Feature #16209: Core work: Foundations Team added
#5 Updated by intrigeri 2019-03-20 14:50:28
- blocked by deleted (Feature #13241: Core work: Test suite maintenance)
#6 Updated by intrigeri 2019-04-03 06:09:44
- Priority changed from Elevated to High
(This is currently the main blocker for us to prioritize coding work on Buster, let’s express this via priority.)
#7 Updated by hefee 2019-04-05 15:04:19
- Assignee deleted (CyrilBrulebois)
Link to testsuite: https://pad.netzguerilla.net/p/fghhjkl%C3%B6
#8 Updated by anonym 2019-05-23 08:58:28
- Assignee set to anonym
#9 Updated by intrigeri 2019-06-18 13:18:41
After a coordination meeting with anonym & hefee, at this point:
- We can tell testers that all safety guarantees covered by non-fragile scenarios (that need to be listed somewhere) are provided by feature/buster. Additional Software may be buggy in some corner cases our test suite exercises, but it shouldn’t be dangerous to use as far as we can tell.
- Wrt. fragile scenarios, they all passed at least once except these ones, that we’ll investigate:
  - features/electrum.feature:15 (Bug #16821) → real bug, now fixed
  - features/additional_software_packages.feature:69 (Bug #16819) → test suite bug
  - features/usb_install.feature:87 (Bug #16820) → works on bare metal so it’s a test suite bug
  - features/totem.feature:50 → works when tested manually so not blocking the beta; broken on stable too (Bug #10442#note-46)
#10 Updated by intrigeri 2019-06-18 16:48:31
I’ve updated my above comment with info from the pad and tl;dr: all safety guarantees exercised by our test suite are provided by current feature/buster. Every test scenario passed either in the test suite or by testing manually. So yeah, we’re good to go and can now publish a beta with security support, that Tails contributors and enthusiasts can use in production!
#11 Updated by intrigeri 2019-06-18 18:15:28
Now that we’re done with the blockers for beta1 (Bug #16822), I think we should plan the next steps here. I think an XMPP meeting will work best. hefee, anonym: does anyone want to coordinate the scheduling process?
#12 Updated by intrigeri 2019-06-19 08:27:39
intrigeri wrote:
> I think we should plan the next steps here.
A fine goal for such a meeting would be to define what needs to be done so we can call this ticket done and close it: at this point, it’s not clear to me what task/goal this ticket is tracking. And then, have a plan to make it happen :)
#13 Updated by intrigeri 2019-06-19 17:40:23
Woohoo, in https://jenkins.tails.boum.org/view/RM/job/test_Tails_ISO_feature-buster/225/, only one (“Browsing the web using the Tor Browser ǂ Watching a WebM video”) scenario failed! \o/
#14 Updated by intrigeri 2019-07-05 20:55:34
- Subject changed from Update the test suite for buster to Update the test suite for Buster
intrigeri wrote:
> Now that we’re done with the blockers for beta1 (Bug #16822), I think we should plan the next steps here. I think an XMPP meeting will work best. hefee, anonym: does anyone want to coordinate the scheduling process?
I’ll give it a try: are you folks available for this meeting some day between July 16 and July 19, between 13:00 UTC and 15:00 UTC, or between 17:00 UTC and 20:00 UTC?
Worst case, I think that having two of us around would be enough to make a plan and unblock those of us who’ll resume work on this to know where to focus their time.
#15 Updated by hefee 2019-07-08 23:47:51
intrigeri wrote:
> I’ll give it a try: are you folks available for this meeting some day between July 16 and July 19, between 13:00 UTC and 15:00 UTC, or between 17:00 UTC and 20:00 UTC?
> Worst case, I think that having two of us around would be enough to make a plan and unblock those of us who’ll resume work on this to know where to focus their time.
intrigeri, anonym: I can’t integrate this into my plans for the upcoming week. But I need to know in advance, as I’m often offline for a day… Other than that, an agenda for the meeting would be nice in advance too.
#16 Updated by anonym 2019-07-10 12:14:06
hefee wrote:
> intrigeri wrote:
>
> > I’ll give it a try: are you folks available for this meeting some day between July 16 and July 19, between 13:00 UTC and 15:00 UTC, or between 17:00 UTC and 20:00 UTC?
> > Worst case, I think that having two of us around would be enough to make a plan and unblock those of us who’ll resume work on this to know where to focus their time.
I’m not available July 16, maybe not available 17th, but definitely available 18th and 19th. I prefer the 13:00 UTC option over 17:00 UTC.
So, what about July 18, 13:00 UTC?
> intrigeri, anonym: I can’t integrate this into my plans for the upcoming week. But I need to know in advance, as I’m often offline for a day…
Ok, I think me and intrigeri can handle it, then.
> Other than that, an agenda for the meeting would be nice in advance too.
I think it’s simply “try to figure out the next steps for buster”. Attempting to flesh out a meeting agenda would be to start doing that work already! :)
#17 Updated by intrigeri 2019-07-11 19:17:36
> So, what about July 18, 13:00 UTC?
Deal!
>> Other than that, an agenda for the meeting would be nice in advance too.
> I think it’s simply “try to figure out the next steps for buster”. Attempting to flesh out a meeting agenda would be to start doing that work already! :)
This, plus I’ve already spelled out some goals for the meeting earlier here.
#18 Updated by anonym 2019-07-18 12:23:42
I collected stats for all runs of feature-buster and stable so far in July (i.e. until the 18th). Note that I didn’t check any stats for +force-all-tests runs.
feature-buster
- 12 features/additional_software_packages.feature
  - 9 Scenario: I am warned I can not use Additional Software when I start Tails from a DVD and install a package
  - 2 Scenario: My Additional Software list is configurable through a GUI or through notifications when I install or remove packages with APT or Synaptic
  - 1 Scenario: Recovering in offline mode after Additional Software previously failed to upgrade and then succeed to upgrade when online
- 2 features/torified_browsing.feature
  - 2 Scenario: Watching a WebM video
- 1 features/persistence.feature
  - 1 Scenario: Deleting a Tails persistent partition
- 1 features/veracrypt.feature
  - 1 Scenario: Use Unlock VeraCrypt Volumes to unlock a basic VeraCrypt file container
- 1 features/mac_spoofing.feature
  - 1 Scenario: MAC address spoofing fails and macchanger returns false
- 1 features/emergency_shutdown.feature
  - 1 Scenario: Tails erases memory on DVD boot medium removal: aufs read-write branch
- 1 features/root_access_control.feature
  - 1 Scenario: If an administrative password is set in Tails Greeter the live user should be able to run arbitrary commands with administrative privileges.
- 1 features/tor_stream_isolation.feature
  - 1 Scenario: tails-security-check is using the Tails-specific SocksPort
- 1 features/usb_install.feature
  - 1 Scenario: Writing a Tails isohybrid to a USB drive and booting it, then installing Tails on top of it using Tails Installer, and it still boots
- 2 features/untrusted_partitions.feature
  - 1 Scenario: Tails booting from a DVD does not use live systems stored on hard drives
  - 1 Scenario: Booting Tails does not automount untrusted ext2 partitions
Total: 23 failures (in 13 scenarios, 10 features) over 24 runs
stable
- 10 features/additional_software_packages.feature
  - 2 Scenario: I am notified when Additional Software fails to install a package
  - 1 Scenario: I set up Additional Software when installing a package without persistent partition and the package is installed next time I start Tails
  - 2 Scenario: The Additional Software dpkg hook notices when persistence is locked down while installing a package
  - 1 Scenario: My Additional Software list is configurable through a GUI or through notifications when I install or remove packages with APT or Synaptic
  - 4 Scenario: Recovering in offline mode after Additional Software previously failed to upgrade and then succeed to upgrade when online
- 3 features/electrum.feature
  - 3 Scenario: Using a persistent Electrum configuration
- 3 features/totem.feature
  - 3 Scenario: Watching a WebM video over HTTPS
- 3 features/veracrypt.feature
  - 1 Scenario: Use Unlock VeraCrypt Volumes to unlock a hidden VeraCrypt file container
  - 1 Scenario: Use GNOME Disks to unlock a USB drive that has a basic VeraCrypt volume with a keyfile
  - 1 Scenario: Use GNOME Disks to unlock a basic VeraCrypt file container with a keyfile
- 1 features/dhcp.feature
  - 1 Scenario: Getting a DHCP lease with a manually configured NetworkManager connection
- 1 features/emergency_shutdown.feature
  - 1 Scenario: Tails erases memory on DVD boot medium removal: vfat
- 1 features/evince.feature
  - 1 Scenario: I can view and print a PDF file stored in /usr/share
- 1 features/mac_spoofing.feature
  - 1 Scenario: The MAC address is not leaked when booting Tails - 1 [1875]
- 1 features/persistence.feature
  - 1 Scenario: Dotfiles persistence
- 1 features/tor_stream_isolation.feature
  - 1 Scenario: tails-security-check is using the Tails-specific SocksPort
- 1 features/torified_browsing.feature
  - 1 Scenario: The Tor Browser should not have any plugins enabled
- 1 features/torified_gnupg.feature
  - 1 Scenario: Syncing OpenPGP keys using Seahorse started from the OpenPGP Applet should work and be done over Tor.
- 1 features/untrusted_partitions.feature
  - 1 Scenario: Tails will not enable disk swap
Total: 28 failures (in 19 scenarios, 13 features) over 28 runs
Comparison
While the number of failures per run is essentially identical for both branches, I think feature-buster still looks better because its failures are much less spread out over different scenarios/features. Still, stable is apparently running quite poorly recently, so part of why feature-buster looks good is that stable looks bad. Also, the overlap of failing scenarios isn’t great; these are the overlapping failing scenarios:
- My Additional Software list is configurable through a GUI or through notifications when I install or remove packages with APT or Synaptic
- Recovering in offline mode after Additional Software previously failed to upgrade and then succeed to upgrade when online
- tails-security-check is using the Tails-specific SocksPort
So all other failing scenarios are completely different between the two branches, which doesn’t look great.
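For reference, the comparison above boils down to a few ratios. Here is a minimal Python sketch that recomputes them from the two "Total:" lines; the dictionary layout and function names are only illustrative and are not part of any script used for this ticket:

```python
# Totals copied from the two "Total:" lines above.
totals = {
    "feature-buster": {"failures": 23, "scenarios": 13, "features": 10, "runs": 24},
    "stable": {"failures": 28, "scenarios": 19, "features": 13, "runs": 28},
}

# Number of scenarios that fail on both branches (the three listed above).
OVERLAP = 3


def failures_per_run(t):
    """Average number of failed scenarios per test suite run."""
    return t["failures"] / t["runs"]


def overlap_share(t, overlap=OVERLAP):
    """Fraction of a branch's failing scenarios that also fail on the other branch."""
    return overlap / t["scenarios"]


for branch, t in totals.items():
    print(
        f"{branch}: {failures_per_run(t):.2f} failures/run, "
        f"{t['scenarios']} failing scenarios in {t['features']} features, "
        f"{overlap_share(t):.0%} shared with the other branch"
    )
```

The failures-per-run figures come out close to each other, while the share of failing scenarios common to both branches stays small, which is what the comparison above is pointing at.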
#19 Updated by anonym 2019-07-19 13:16:08
anonym wrote:
> Note that I didn’t check any stats for +force-all-tests runs.
Here’s an extremely quick’n’dirty diff just to see which scenarios are failing (in all of July) for feature-16792-update-chutney-force-all-tests (used as the “stable” baseline) vs feature-buster-force-all-tests:
--- feature-16792-update-chutney-force-all-tests
+++ feature-buster-force-all-tests
@@ -1,28 +1,26 @@
features/additional_software_packages.feature - Scenario: I am notified when Additional Software fails to install a package
features/additional_software_packages.feature - Scenario: I am warned I can not use Additional Software when I start Tails from a DVD and install a package
-features/additional_software_packages.feature - Scenario: I set up Additional Software when installing a package without persistent partition and the package is installed next time I start Tails
features/additional_software_packages.feature - Scenario: My Additional Software list is configurable through a GUI or through notifications when I install or remove packages with APT or Synaptic
features/additional_software_packages.feature - Scenario: Recovering in offline mode after Additional Software previously failed to upgrade and then succeed to upgrade when online
-features/additional_software_packages.feature - Scenario: The Additional Software dpkg hook notices when persistence is locked down while installing a package
features/electrum.feature - Scenario: Using a persistent Electrum configuration
-features/encryption.feature - Scenario: Signing and verification using OpenPGP Applet
-features/encryption.feature - Scenario: Symmetric encryption and decryption using OpenPGP Applet
-features/localization.feature - Scenario: The Report an Error launcher will open the support documentation in supported non-English locales
-features/time_syncing.feature - Scenario: Clock is one day in the future in bridge mode
-features/time_syncing.feature - Scenario: Clock with host's time
-features/time_syncing.feature - Scenario: Clock with host's time in bridge mode
-features/tor_bridges.feature - Scenario: Using bridges
-features/tor_bridges.feature - Scenario: Using obfs4 pluggable transports
+features/evince.feature - Scenario: I can view and print a PDF file stored in non-persistent /home/amnesia
+features/evince.feature - Scenario: I can view and print a PDF file stored in persistent /home/amnesia/Persistent
+features/persistence.feature - Scenario: Deleting a Tails persistent partition
features/torified_browsing.feature - Scenario: Downloading files with the Tor Browser
features/torified_browsing.feature - Scenario: I can view a file stored in "~/Tor Browser" but not in ~/.gnupg
-features/torified_browsing.feature - Scenario: The Tor Browser should not have any plugins enabled
+features/torified_browsing.feature - Scenario: Playing an Ogg audio track
+features/torified_browsing.feature - Scenario: The Tor Browser directory is usable
features/torified_browsing.feature - Scenario: The Tor Browser's "New identity" feature works as expected
features/torified_browsing.feature - Scenario: Watching a WebM video
-features/torified_git.feature - Scenario: Cloning git repository over SSH
-features/torified_gnupg.feature - Scenario: Fetching OpenPGP keys using GnuPG should work and be done over Tor.
features/torified_gnupg.feature - Scenario: Fetching OpenPGP keys using Seahorse should work and be done over Tor.
features/torified_gnupg.feature - Scenario: Fetching OpenPGP keys using Seahorse via the OpenPGP Applet should work and be done over Tor.
features/torified_gnupg.feature - Scenario: Syncing OpenPGP keys using Seahorse should work and be done over Tor.
features/torified_gnupg.feature - Scenario: Syncing OpenPGP keys using Seahorse started from the OpenPGP Applet should work and be done over Tor.
-features/tor_stream_isolation.feature - Scenario: SSH is using the default SocksPort
+features/tor_stream_isolation.feature - Scenario: The Tor Browser is using the web browser-specific SocksPort
features/totem.feature - Scenario: Watching a WebM video over HTTPS
+features/unsafe_browser.feature - Scenario: Starting the Unsafe Browser works as it should.
+features/usb_install.feature - Scenario: Booting Tails from a USB drive in UEFI mode
+features/usb_upgrade.feature - Scenario: Booting Tails from a USB drive upgraded from DVD with persistence enabled
+features/usb_upgrade.feature - Scenario: Booting Tails from a USB drive upgraded from USB with persistence enabled
+features/usb_upgrade.feature - Scenario: Creating a persistent partition with the old Tails USB installation
+features/usb_upgrade.feature - Scenario: Writing files to a read/write-enabled persistent partition with the old Tails USB installation
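A diff like the one above can be produced from two sorted lists of "feature - scenario" lines. The following is only a sketch of that idea using Python's standard `difflib`; it is not the command that was actually used, the helper names are made up, and the sample data is a small subset of the lines above:

```python
import difflib


def failing_scenarios_diff(baseline, candidate, baseline_name, candidate_name):
    """Unified diff between two collections of failing "feature - scenario" lines."""
    return list(
        difflib.unified_diff(
            sorted(baseline),
            sorted(candidate),
            fromfile=baseline_name,
            tofile=candidate_name,
            lineterm="",
        )
    )


def only_in_candidate(diff_lines):
    """Lines starting with "+", i.e. scenarios failing only on the candidate branch."""
    return [
        line[1:]
        for line in diff_lines
        if line.startswith("+") and not line.startswith("+++")
    ]


# Small subset of the real data, for illustration only.
baseline = [
    "features/electrum.feature - Scenario: Using a persistent Electrum configuration",
    "features/tor_bridges.feature - Scenario: Using bridges",
    "features/totem.feature - Scenario: Watching a WebM video over HTTPS",
]
buster = [
    "features/electrum.feature - Scenario: Using a persistent Electrum configuration",
    "features/persistence.feature - Scenario: Deleting a Tails persistent partition",
    "features/totem.feature - Scenario: Watching a WebM video over HTTPS",
]

diff = failing_scenarios_diff(
    baseline,
    buster,
    "feature-16792-update-chutney-force-all-tests",
    "feature-buster-force-all-tests",
)
new_on_buster = only_in_candidate(diff)
print("\n".join(diff))
```

Filtering on the "+" prefix gives the scenarios that fail only on the Buster branch.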
#20 Updated by anonym 2019-07-19 14:10:14
intrigeri and I had a meeting and here are our conclusions.
Let’s focus on the scenarios that occasionally fail on Buster, that we’re not used to seeing as fragile (be it tagged as such or not) on Stretch. That is:
- primarily, those whose line starts with “+” in the previous comment
- after that, those that fail much more often on Buster than on Stretch
While investigating this, let’s particularly pay attention to robustness issues that have potential to break essentially any test, which decreases the usefulness of the data we’re collecting; for example, “virt-viewer failed to start”, GNOME Shell not starting properly, or the Overview not showing up when we try to start an app.
@hefee, is this something you would like to work on in July and/or August, perhaps as a nice distraction from more stressful translation platform work? Or would it instead be Yet Another Thing To Do and a cause of additional stress?
“Worst” case, I can resume working on it in August, which is perfectly fine.
FWIW, here are the falingteststats.py (❤❤❤!!! thanks @hefee!!!) parameters used for comment 18:
RANGE = range(1871,1898)
jenkins = Jenkins("https://jenkins.tails.boum.org/job/test_Tails_ISO_stable", AUTH)
RANGE = range(245,268)
jenkins = Jenkins("https://jenkins.tails.boum.org/job/test_Tails_ISO_feature-buster", AUTH)
and comment 19:
RANGE = range(17,32)
jenkins = Jenkins("https://jenkins.tails.boum.org/job/test_Tails_ISO_feature-16792-update-chutney-force-all-tests", AUTH)
RANGE = range(64,76)
jenkins = Jenkins("https://jenkins.tails.boum.org/job/test_Tails_ISO_feature-buster-force-all-tests", AUTH)
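When reading these parameters, keep in mind that Python's `range()` excludes its end value. The sketch below shows which build numbers each range spans and how per-feature and per-scenario counts like those in comment 18 can be tallied. The `tally()` helper and its input format are hypothetical; how the actual script uses `RANGE` and fetches results from Jenkins is not shown in this ticket:

```python
from collections import Counter

# Build-number ranges quoted above; range() excludes its end value.
RANGES = {
    "stable": range(1871, 1898),
    "feature-buster": range(245, 268),
    "feature-16792-update-chutney-force-all-tests": range(17, 32),
    "feature-buster-force-all-tests": range(64, 76),
}


def describe(r):
    """Human-readable summary of the builds covered by a range."""
    return f"builds {r[0]}..{r[-1]} ({len(r)} builds)"


def tally(failures):
    """Count failures per feature and per (feature, scenario) pair.

    `failures` is a hypothetical list of (feature, scenario) tuples,
    one entry per failed scenario per run.
    """
    per_feature = Counter(feature for feature, _ in failures)
    per_scenario = Counter(failures)
    return per_feature, per_scenario


for job, r in RANGES.items():
    print(f"{job}: {describe(r)}")

# Made-up sample input, only to show the shape of the output.
sample = [
    ("features/torified_browsing.feature", "Watching a WebM video"),
    ("features/torified_browsing.feature", "Watching a WebM video"),
    ("features/persistence.feature", "Deleting a Tails persistent partition"),
]
per_feature, per_scenario = tally(sample)
```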
#21 Updated by intrigeri 2019-08-09 08:45:56
- Feature Branch set to test/16281-misc+force-all-tests
Pushed something that might help wrt. the “Additional Software documentation from the notification” test failure. I have no big hopes but in any case, this seems the right thing to do.
#22 Updated by intrigeri 2019-08-09 08:52:05
- blocked by Bug #16941: devel branch FTBFS since torbrowser-launcher 0.3.2-1 was uploaded to sid added
#23 Updated by anonym 2019-08-09 09:14:38
intrigeri wrote:
> Pushed something that might help wrt. the “Additional Software documentation from the notification” test failure. I have no big hopes but in any case, this seems the right thing to do.
If you are referring to commit:7c2f44948cb1843c86bd083266e3497dd99520b9, it actually changes nothing. try_for() will fail iff an exception is thrown in its block, or if the block returns something false-ish (IIRC only false and nil too). child() will raise an exception on failure, or return an object on success, which makes it identical to child?() in try_for()’s context.
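To illustrate the point, here is a rough Python analogy of the semantics described above. It is not the test suite's Ruby implementation: `try_for()` here only models "retry until the block neither raises nor returns false/nil", and `FakeApp`, `child()` and `has_child()` (standing in for `child?()`, assumed to return a boolean) are made-up stand-ins:

```python
import time


def try_for(timeout, block, delay=0.01):
    """Call block() until it returns something other than False/None.

    An exception raised by block() is treated like a false-ish result.
    Once `timeout` seconds have elapsed, give up with TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = block()
            if result is not None and result is not False:
                return result
        except Exception:
            pass  # same outcome as a false-ish return value: retry
        if time.monotonic() >= deadline:
            raise TimeoutError("try_for() timed out")
        time.sleep(delay)


class FakeApp:
    """Stand-in for a GUI tree whose widget only shows up on the third lookup."""

    def __init__(self):
        self.lookups = 0

    def child(self):
        # Raises on failure, returns an object on success.
        self.lookups += 1
        if self.lookups < 3:
            raise LookupError("widget not found")
        return object()

    def has_child(self):
        # Boolean variant: True on success, False on failure.
        try:
            self.child()
            return True
        except LookupError:
            return False


app_a, app_b = FakeApp(), FakeApp()
try_for(1, app_a.child)      # retries while child() raises
try_for(1, app_b.has_child)  # retries while has_child() returns False
```

Both calls succeed after the same number of lookups, which is why swapping one variant for the other inside such a retry loop changes nothing.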
The problem with this scenario is actually that the system under test runs out of memory and ends up effectively frozen due to how Linux over-commits memory. I first bumped RAM with 512 MB but it wasn’t enough, so just to get a quick pass I doubled it to 4 GB and then it passed, but presumably less is required.
I guess 2 GB of RAM is a bit low when the tmpfs overlay for / is filled up by downloading the APT lists etc. but I am surprised it is this severe. Could it be compounded by some part of a-s-p (that is still running at that stage?) being very memory hungry?
Otherwise, if we have to (significantly) bump RAM it would be annoying to have to do so for our default machine. We could consider making this scenario configure more RAM and boot from scratch, without snapshots, instead.
#24 Updated by intrigeri 2019-08-09 10:09:07
> If you are referring to commit:7c2f44948cb1843c86bd083266e3497dd99520b9, it actually changes nothing.
OK, thanks for the explanation! I was somewhat misled by commit:403eb3fc31b68391c4bd553b48002d74f62fa31f.
> The problem with this scenario is actually that the system under testing runs out of memory […]
Great you’ve already found the culprit!
> I guess 2 GB of RAM is a bit low when the tmpfs overlay for / is filled up by downloading the APT lists etc. but I am surprised it is this severe. Could it be compounded by some part of a-s-p (that is still running at that stage?) being very memory hungry?
I’ve reproduced this situation by hand in a VM with 2 GB of RAM and:
- The a-s-p processes use about 70MB of RES memory at that point. This could be optimized a bit (e.g. by freeing apt_cache once it’s not needed anymore) but that’s not going to be a game changer so let’s not bother.
- Three processes eat more RAM:
  - amnesia’s GNOME Shell (130MB) → I doubt we can do anything about it.
  - the greeter’s GNOME Shell (125MB) → that’s a bug: Bug #12092.
  - OpenPGP Applet (73MB) → we can probably optimize this a bit but again, this is not going to be a game changer.
- At that point there’s only a couple hundred MB of free memory left. That is, apart from the aforementioned memory-hungry processes, we use lots of memory elsewhere. There’s probably some stuff we could avoid running and save a few dozen MB but I suspect this won’t make a huge difference in the end.
- Then Tor Browser eats up to 292MB (firefox.real) + 38MB (Web content process) and indeed gets stuck. IIRC this memory consumption increased a lot when we upgraded to Firefox 60. I doubt we can do anything about it ourselves :/
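To put these numbers side by side, here is a small Python sketch that just sums the figures listed above (all values in MB, copied from the list; the variable names are illustrative):

```python
# RES memory (MB) of the processes measured before Tor Browser starts.
before_browser = {
    "a-s-p": 70,
    "amnesia's GNOME Shell": 130,
    "greeter's GNOME Shell": 125,
    "OpenPGP Applet": 73,
}

# Peak memory (MB) used by Tor Browser once it is started.
tor_browser = {
    "firefox.real": 292,
    "Web content process": 38,
}

listed_total = sum(before_browser.values())
browser_total = sum(tor_browser.values())
greeter = before_browser["greeter's GNOME Shell"]

print(f"Listed processes before Tor Browser: {listed_total} MB")
print(f"Tor Browser at its peak: {browser_total} MB")
print(f"Greeter's GNOME Shell alone: {greeter} MB")
```

Tor Browser's peak exceeds the couple hundred MB that are still free at that point, and the greeter's GNOME Shell is the one large item in the list that is tracked as a bug.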
So, well, newer software eats more memory and at some point, 2GB won’t be enough even for relatively basic usage. On Feature #5502 I temporarily concluded that we did not have to bump the memory requirements for 4.0; as we see here, it turns out I was wrong: there is a problem we need to fix. But if we bumped the memory requirements, we would have to do extra work as indicated on that ticket. It seems cheaper (and nicer to users with low amounts of memory) to instead fix Bug #12092, that has a plausibly easy solution documented in its last comment.
@anonym, what do you think? If you agree, let’s set Bug #12092’s target version to 4.0, with elevated priority (release blocker), and mark it as blocking this very ticket.
#25 Updated by anonym 2019-08-09 12:08:40
I’ll answer your comment #24 later (was just leaving for today).
anonym wrote:
> We could consider making this scenario configure more RAM and boot from scratch, without snapshots, instead.
I pushed this (apparently a 512 MB bump is enough) to the branch, works locally for me.
#26 Updated by intrigeri 2019-08-11 09:58:40
- blocked by Bug #16822: Release 4.0~beta1 added
#27 Updated by intrigeri 2019-08-11 10:00:08
> Blocked by Bug #16822: Release 4.0~beta1 added
For some reason I’ve based this branch on testing, which makes it FTBFS until the release process of 4.0~beta1 is completed and devel is merged into this branch.
#28 Updated by intrigeri 2019-08-14 06:00:01
(Sent 24h ago but Redmine DDoS mode breaks our Redmine email interaction support.)
anonym wrote:
> I’ll answer your comment #24 later (was just leaving for today).
OK, I’m eager to read your reply :)
> anonym wrote:
>> We could consider making this scenario configure more RAM and boot from scratch, without snapshots, instead.
> I pushed this (apparently a 512 MB bump is enough) to the branch, works locally for me.
This worked on Jenkins too.
So I’m in favour of merging test/16281-misc+force-all-tests into devel, which will bring us one step closer to having more valuable output from our CI (among the other two major offenders, Bug #10442 has a branch ready to be reviewed, and on the -ci ML I’ve proposed to remove the last one that’s currently totally useless).
#29 Updated by intrigeri 2019-08-14 06:49:25
- related to Bug #16969: "Electrum starts" test step is broken on Buster added
#30 Updated by anonym 2019-08-15 12:53:09
- Assignee changed from anonym to intrigeri
intrigeri wrote:
> > If you are referring to commit:7c2f44948cb1843c86bd083266e3497dd99520b9, it actually changes nothing.
>
> OK, thanks for the explanation! I was somewhat misled by commit:403eb3fc31b68391c4bd553b48002d74f62fa31f.
Ah, understandable.
> > The problem with this scenario is actually that the system under testing runs out of memory […]
>
[…]
>
> So, well, newer software eats more memory and at some point, 2GB won’t be enough even for relatively basic usage. On Feature #5502 I temporarily concluded that we did not have to bump the memory requirements for 4.0; as we see here, it turns out I was wrong: there is a problem we need to fix. But if we bumped the memory requirements, we would have to do extra work as indicated on that ticket. It seems cheaper (and nicer to users with low amounts of memory) to instead fix Bug #12092, that has a plausibly easy solution documented in its last comment.
>
> @anonym, what do you think? If you agree, let’s set Bug #12092’s target version to 4.0, with elevated priority (release blocker), and mark it as blocking this very ticket.
I think your proposal is fine, but I have one worry: since the problematic scenario manages to trigger a low-memory situation by just using 70 MB more memory than the many scenarios that start the Tor Browser, I would expect these other scenarios to fail occasionally, making test suite results worse.
So I’m tempted to propose that we drop my commit:895387ece8cec86831cd617ee08dbdfc49cdc144 and bump TailsToaster to 2.5 GB memory. And then we might be able to revert the bump if Bug #12092 turns out to be good enough. Unless it is a problem for Jenkins, of course.
What do you think?
#31 Updated by intrigeri 2019-08-15 17:42:43
- Assignee changed from intrigeri to anonym
(Meta: please don’t reassign to me merely to ask me one question, unless you want me to “own” this ticket: QA Check = Info Needed is dead :)
> since the problematic scenario manages to trigger a low-memory situation by just using 70 MB more memory than the many scenarios that start the Tor Browser, I would expect these other scenarios to fail occasionally, making test suite results worse.
FWIW, I’ve not noticed that so far. I’m not too afraid about other scenarios failing occasionally due to this problem: I believe memory usage is rather deterministic in our test suite on Jenkins. But I can imagine it could happen at some point, indeed.
> So I’m tempted to propose that we drop my commit:895387ece8cec86831cd617ee08dbdfc49cdc144 and bump TailsToaster to 2.5 GB memory. And then we might be able to revert the bump if Bug #12092 turns out to be good enough. Unless it is a problem for Jenkins, of course.
I don’t know if that’s a problem for Jenkins (it could be) but that’s the least of my concerns at the moment. My main worry is elsewhere: I care about our test suite exercising basic functionality (e.g. it does not do multi-tasking) using the hardware requirements we document, i.e. 2GB, that are supposed to be sufficient for such basic usage. Otherwise, we’ll have a hard time noticing when basic usage does not fit anymore in said requirements. This discussion is a good example: we would not have noticed the problem (and the need to fix it for actual users, not only for our test suite) if we had been running TailsToaster with 2.5GB of RAM already.
So, updated proposal:
- Set Bug #12092’s target version to 4.0, with elevated priority (release blocker), and mark it as blocking this very ticket.
- Merge the topic branch so the offending scenario gives us more relevant information.
- Fix Bug #12092 ASAP. Meanwhile, if we notice that other scenarios fail due to lack of memory, bump TailsToaster to 2.5GB temporarily, and file a 4.0 release blocker ticket about reverting that as soon as Bug #12092 is done.
How does this sound?
If we’re not on the same page yet, let’s discuss this tomorrow on XMPP :)
#32 Updated by anonym 2019-08-16 07:57:55
- blocked by Bug #12092: The Greeter keeps eating lots of memory after logging in added
#33 Updated by anonym 2019-08-16 08:03:31
I like your proposal, and have implemented it!
#34 Updated by anonym 2019-08-16 08:06:01
- Feature Branch deleted (test/16281-misc+force-all-tests)
#35 Updated by intrigeri 2019-08-16 11:28:52
Most (if not all) upgrade tests started failing on Jenkins at https://jenkins.tails.boum.org/view/RM/job/test_Tails_ISO_devel/1811/ due to lack of disk space. The difference with the last good run is 8f375132a38b2427df1b5dabfb5a2ddd08555337..931874d35999be7072eeab3f27f2bfd528f61412. I suspect that adding the Bullseye APT sources made some snapshots a bit bigger. If my hunch is right, then Bug #12092 should fix that problem too.
#36 Updated by intrigeri 2019-08-24 07:44:06
> Most (if not all) upgrade tests started failing on Jenkins at https://jenkins.tails.boum.org/view/RM/job/test_Tails_ISO_devel/1811/ due to lack of disk space. The difference with the last good run is 8f375132a38b2427df1b5dabfb5a2ddd08555337..931874d35999be7072eeab3f27f2bfd528f61412. I suspect that adding the Bullseye APT sources made some snapshots a bit bigger. If my hunch is right, then Bug #12092 should fix that problem too.
It does.
#37 Updated by intrigeri 2019-08-30 12:40:21
- Assignee changed from anonym to intrigeri
I’m analyzing Jenkins results from the last two months and will:
- Remove obsolete fragile tags.
- Add fragile tags where they’re missing.
- Update priority of test suite tickets accordingly.
#38 Updated by intrigeri 2019-08-30 13:21:47
- Assignee deleted (intrigeri)
intrigeri wrote:
> I’m analyzing Jenkins results from the last two months and will:
>
> * Remove obsolete fragile tags.
> * Add fragile tags where they’re missing.
Done, will submit a MR. Another similar session (using the json-analysis script) done 1-2 months after the 4.0 release will give us more useful info: it will only have Buster results.
> * Update priority of test suite tickets accordingly.
Done.
#39 Updated by intrigeri 2019-08-31 18:51:03
intrigeri wrote:
> Most (if not all) upgrade tests started failing on Jenkins at https://jenkins.tails.boum.org/view/RM/job/test_Tails_ISO_devel/1811/ due to lack of disk space. The difference with the last good run is 8f375132a38b2427df1b5dabfb5a2ddd08555337..931874d35999be7072eeab3f27f2bfd528f61412. I suspect that adding the Bullseye APT sources made some snapshots a bit bigger. If my hunch is right, then Bug #12092 should fix that problem too.
I’ve grown the disk space for /tmp/TailsToaster on Jenkins isotesters a bit, but that was not enough. And then, even the branch for Bug #12092, that initially fixed the problem, ended up being affected as well. I have no idea why.
#40 Updated by intrigeri 2019-08-31 18:54:01
… so I’ve grown /tmp/TailsToaster on lizard’s isotesters again. Let’s hope this does not break things because there’s not enough memory left for other operations. Worst case, I’ll give these VMs more RAM.
#41 Updated by intrigeri 2019-09-01 09:42:18
Woohoo, full test suite runs 1847 and 1848 passed on Jenkins! I think we’re close to the point when we can conclude that the test suite on devel (Buster) is at least as robust as on stable (Stretch).
#42 Updated by intrigeri 2019-09-01 09:49:07
- blocks deleted (Bug #16941: devel branch FTBFS since torbrowser-launcher 0.3.2-1 was uploaded to sid)
#43 Updated by intrigeri 2019-09-01 18:40:56
- Status changed from In Progress to Resolved
intrigeri wrote:
> I think we’re close to the point when we can conclude that the test suite on devel (Buster) is at least as robust as on stable (Stretch).
I’ve kept analyzing test suite results on Jenkins and I’m now convinced that the full test suite on devel (Buster) is more robust than on stable (Stretch). It also has more up-to-date @fragile tags so test runs that skip fragile tests should be more robust as well (I did not verify this but I don’t think we need this ticket to monitor this).
I’ll keep monitoring test suite results on Jenkins, filing tickets for robustness issues, tagging stuff as fragile, and submitting branches to fix low-hanging fruits.