Bug #15146

Make memory erasure feature compatible with overlayfs

Added by intrigeri 2018-01-03 17:45:31. Updated 2020-05-10 19:17:33.

Status:
Resolved
Priority:
High
Assignee:
intrigeri
Category:
Target version:
Start date:
2018-01-03
Due date:
% Done:

100%

Feature Branch:
bugfix/15146-overlayfs-memory-erasure+force-all-tests
Type of work:
Code
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

The “Erasure of the overlayfs read-write branch on shutdown” test scenario fails. But interestingly, “Tails erases memory on DVD boot medium removal: overlayfs read-write branch” passes. It looks like live-boot behaves slightly differently on overlayfs than on aufs, which seems to explain why the mountpoints are not visible and thus can’t be unmounted and cleaned.

We need to update the design doc (wiki/src/contribute/design/memory_erasure.mdwn) to document the overlayfs implementation instead of the aufs one.


Files


Subtasks


Related issues

Related to Tails - Feature #8415: Migrate from aufs to overlayfs Resolved 2014-12-18
Blocks Tails - Feature #16209: Core work: Foundations Team Confirmed

History

#1 Updated by intrigeri 2018-01-03 17:45:43

  • Target version set to 2018

#2 Updated by intrigeri 2018-01-03 17:52:22

anonym, you committed to handle the parent ticket this year (and I’m supposed to be the reviewer) but right now I fancy playing a bit with this ticket. If it becomes less fun I’ll reassign to you.

#3 Updated by intrigeri 2018-06-02 14:23:55

  • related to Bug #15477: Consider upgrading to current live-boot added

#4 Updated by intrigeri 2018-06-02 14:24:07

  • related to deleted (Bug #15477: Consider upgrading to current live-boot)

#5 Updated by intrigeri 2018-06-02 14:24:13

  • blocked by Bug #15477: Consider upgrading to current live-boot added

#6 Updated by intrigeri 2018-06-02 14:25:27

I don’t think it’s worth spending time on this before Bug #15477 is done: perhaps the revamping of how mountpoints are managed by live-boot will fix this problem, perhaps it’ll make it harder to solve, but in any case better do the overlayfs-specific work only once, after that other migration is done.

#7 Updated by intrigeri 2018-09-12 06:46:19

  • Target version changed from 2018 to Tails_3.11

#8 Updated by intrigeri 2018-09-12 06:46:29

#9 Updated by intrigeri 2018-11-05 14:45:46

  • Target version changed from Tails_3.11 to Tails_3.12

#10 Updated by intrigeri 2018-11-06 15:04:45

  • Target version changed from Tails_3.12 to Tails_3.13

#11 Updated by intrigeri 2018-12-02 21:53:11

#12 Updated by intrigeri 2018-12-02 21:53:58

  • blocked by deleted (Feature #15506: Core work 2018Q4: Foundations Team)

#13 Updated by intrigeri 2019-01-25 16:32:58

  • Target version changed from Tails_3.13 to 2019

#14 Updated by intrigeri 2019-02-06 14:08:18

#15 Updated by intrigeri 2019-02-06 14:08:21

  • blocked by deleted (Feature #15507: Core work 2019Q1: Foundations Team)

#16 Updated by intrigeri 2019-04-05 16:07:59

  • Assignee deleted (intrigeri)

#17 Updated by intrigeri 2019-11-23 17:14:06

  • Description updated

#18 Updated by intrigeri 2019-11-23 18:59:10

  • File aufs.png added
  • File overlayfs.png added
  • Status changed from Confirmed to In Progress
  • Assignee set to intrigeri
  • Feature Branch set to feature/8415-overlayfs+force-all-tests

#19 Updated by intrigeri 2019-11-24 07:08:16

To test all this, I (see the sketch after this list):

  • on the kernel command line, replace quiet with debug nosplash
  • run sudo touch /run/initramfs/tails_shutdown_debugging within Tails before shutting down
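
Concretely, that debugging setup boils down to something like this (a rough sketch: the flag file path is the one mentioned above, the rest is just one way of doing it):

# at the boot menu, edit the kernel command line and
# replace "quiet" with "debug nosplash", then boot; once in the session:
sudo touch /run/initramfs/tails_shutdown_debugging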

On the topic branch, the tmpfs where the read-write upper dir of the overlayfs lives is exposed as /lib/live/mount/overlay; it is successfully unmounted on shutdown before we switch back to the initramfs; same for the read-only branch (SquashFS). Then, the overlayfs itself is successfully unmounted in the initrd (initramfs-pre-shutdown-hook), so from my reading of the logs everything looks good, modulo the fact that some actions we take in the initrd are not necessary anymore (as some stuff has already been unmounted) and fail. But still, the known pattern we write in “Scenario: Erasure of the overlayfs read-write branch on shutdown” is not cleaned up.

I’m not sure what to do at this point. Hunches:

  • It could be that some of the seemingly successful unmount operations are done in a lazy way, and the tmpfs we want to clean up is still seen as internally mounted by the kernel. It could be interesting to run lsof or similar at various points of the shutdown process.
  • It could be that deleting the content of the tmpfs is necessary to clean the memory (we did rm -rf /mnt/live/overlay/* for aufs, but it’s a no-op here because the tmpfs has already been unmounted, so there’s nothing visible to delete there anymore at that point). We could try to do that in a systemd unit that stops just before “Unmounting /lib/live/mount/overlay”. I’ll try that; see the sketch below.
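
A minimal sketch of what such a unit could look like (illustration only, not the actual Tails unit: the rm -rf paths and the unit name are the ones discussed later in this ticket, the ordering directive is an assumption):

# hypothetical unit, written out from a shell just for illustration
cat > /lib/systemd/system/tails-remove-overlayfs-dirs.service <<'EOF'
[Unit]
Description=Remove the overlayfs upper/work dirs before their tmpfs is unmounted
# at shutdown, units are stopped in reverse ordering, so After= on the mount
# unit means our ExecStop= runs before "Unmounting /lib/live/mount/overlay"
After=lib-live-mount-overlay.mount

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
ExecStop=/bin/rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work

[Install]
WantedBy=multi-user.target
EOF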

#20 Updated by intrigeri 2019-11-24 08:37:30

  • Assignee deleted (intrigeri)
  • Feature Branch changed from wip/bugfix/15146-overlayfs-memory-erasure to bugfix/15146-overlayfs-memory-erasure

Looks like I got a PoC fix that works! I’m not sure if I should polish this or switch to Feature #15281.

#21 Updated by intrigeri 2019-11-24 08:38:04

Also, if we go the way my PoC branch does, we need to bring back some tests, see commit:f3ec8583001a9a90861eede18ac5f330a560cad2.

#22 Updated by intrigeri 2019-11-24 08:38:18

  • blocks deleted (Bug #15477: Consider upgrading to current live-boot)

#23 Updated by intrigeri 2019-11-24 19:30:05

  • Status changed from In Progress to Needs Validation
  • Assignee set to intrigeri

The affected scenario now passes on the topic branch! Next steps: check that the reintroduced AppArmor-related test steps still make sense in overlayfs-world, verify that they pass, and finally merge into Feature #8415.

#24 Updated by intrigeri 2019-11-25 08:27:46

  • Status changed from Needs Validation to In Progress
  • Assignee deleted (intrigeri)

> Next steps: check that the reintroduced AppArmor-related test steps still make sense in overlayfs-world,

They needed adjustments on the test suite side. They also showed that our AppArmor configuration needed to be adapted to the changes brought by commit:3cdeadfeadc28d93aed5356c5780b97dac75dc19. I did both, let’s see what Jenkins thinks.

Also, unrelated to AppArmor, lots of the commands we have in initramfs-pre-shutdown-hook are now obsolete and thus fail loudly. IMO we should clean up this script: otherwise, next time we debug a problem in this area, we may get confused by all the error messages unneeded and failing operations trigger ⇒ back to “In Progress”.

#25 Updated by intrigeri 2019-11-28 08:38:52

intrigeri wrote:
> > Next steps: check that the reintroduced AppArmor-related test steps still make sense in overlayfs-world,
>
> They needed adjustments on the test suite side. They also showed that our AppArmor configuration needed to be adapted to the changes brought by commit:3cdeadfeadc28d93aed5356c5780b97dac75dc19. I did both, let’s see what Jenkins thinks.

This now looks good on Jenkins so I’ve merged this branch into Feature #8415, to make it easier to analyze test suite results there. But I’m leaving this ticket in progress as more work is needed on this front IMO:

> Also, unrelated to AppArmor, lots of the commands we have in initramfs-pre-shutdown-hook are now obsolete and thus fail loudly. IMO we should clean up this script: otherwise, next time we debug a problem in this area, we may get confused by all the error messages unneeded and failing operations trigger ⇒ back to “In Progress”.

#26 Updated by intrigeri 2019-11-28 08:39:19

  • Feature Branch changed from bugfix/15146-overlayfs-memory-erasure to feature/8415-overlayfs+force-all-tests

#27 Updated by intrigeri 2019-12-01 11:08:05

  • Target version changed from 2019 to Tails_4.5

#28 Updated by intrigeri 2019-12-01 11:21:27

  • Priority changed from Normal to High

#29 Updated by segfault 2020-02-21 10:46:04

  • Assignee set to segfault

#30 Updated by intrigeri 2020-02-21 11:16:52

  • Description updated

#31 Updated by intrigeri 2020-02-22 06:57:37

I’ve seen the “Tails erases memory on DVD boot medium removal: overlayfs read-write branch” scenario fail on the feature/6560-secure-boot+force-all-tests branch:

9.078% of the free memory still has the pattern, but less than 0.800% was expected.
<false> is not true. (Test::Unit::AssertionFailedError)
./features/step_definitions/erase_memory.rb:181:in `/^I find very few patterns in the guest's memory$/'
features/emergency_shutdown.feature:24:in `Then I find very few patterns in the guest's memory'

Attaching video & screenshot.

Note that this scenario is marked as fragile due to Bug #13462, which is a different problem: it’s about FindFailed: can not find MemoryWipeCompleted.png, while in the case at hand, the “Happy dumping!” message is displayed and the failure very much looks like we did not erase all memory. Could it be that tails-remove-overlayfs-dirs.service sometimes can’t do its job if the boot medium was removed?

#32 Updated by segfault 2020-02-22 11:20:27

intrigeri wrote:
> > Also, unrelated to AppArmor, lots of the commands we have in initramfs-pre-shutdown-hook are now obsolete and thus fail loudly. IMO we should clean up this script: otherwise, next time we debug a problem in this area, we may get confused by all the error messages unneeded and failing operations trigger ⇒ back to “In Progress”.

To find out which parts of initramfs-pre-shutdown-hook are still needed with overlayfs, on a separate branch, I deleted everything except for the sleep for dumping. The memory erasure test passed:

https://jenkins.tails.boum.org/job/test_Tails_ISO_bugfix-15146-overlayfs-memory-erasure-no-pre-shutdown-hook/2/cucumber-html-reports/report-feature_3_2622904403.html
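
For reference, the stripped-down hook is then essentially just the debugging pause (a sketch: the flag file is the one used earlier in this ticket, the sleep duration is made up):

#!/bin/sh
# minimal initramfs-pre-shutdown-hook: keep only the pause that lets the test
# suite dump the guest's memory before the machine powers off
if [ -e /run/initramfs/tails_shutdown_debugging ]; then
    sleep 120
fi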

#33 Updated by segfault 2020-02-22 11:53:48

intrigeri wrote:
> I’ve seen the “Tails erases memory on DVD boot medium removal: overlayfs read-write branch” scenario fail on the feature/6560-secure-boot+force-all-tests branch:
>
> […]
>
> Attaching video & screenshot.
>
> Note that this scenario is marked as fragile due to Bug #13462, which is a different problem: it’s about FindFailed: can not find MemoryWipeCompleted.png, while in the case at hand, the “Happy dumping!” message is displayed and the failure very much looks like we did not erase all memory.

This did not happen on Jenkins yet (in the 5 test suite runs of the branch).

> Could it be that tails-remove-overlayfs-dirs.service sometimes can’t do its job is the boot medium was removed?

Yes, I think that would be the case if, for the execution of the service, some file that wasn’t cached must be loaded from the filesystem. Could be that systemd needs to load something. Could even be /bin/rm, which we execute in the script, but I doubt that, because /bin/rm is so ubiquitous that I would expect it to be cached.

IMO the right design would be to copy to a tmpfs all files needed for the memory erasure to work.
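
A rough sketch of that design, with a made-up mount point and an illustrative file list (only /bin/rm and udev-watchdog-wrapper are taken from this discussion):

# hypothetical: stage on a dedicated tmpfs everything the memory erasure
# code path needs, so nothing has to be read from the boot medium anymore
mkdir -p /run/tails-memory-erasure
mount -t tmpfs -o size=32M,mode=0755 tmpfs /run/tails-memory-erasure
# executables would also need their shared libraries (or static builds)
cp -a /bin/rm /usr/local/lib/udev-watchdog-wrapper /run/tails-memory-erasure/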

#34 Updated by segfault 2020-02-22 11:57:31

> IMO the right design would be to copy to a tmpfs all files needed for the memory erasure to work.

Or we could just do it in the initramfs.

#35 Updated by intrigeri 2020-02-22 12:31:03

>> IMO the right design would be to copy to a tmpfs all files needed for the memory erasure to work.

> Or we could just do it in the initramfs.

Indeed.

Note that we try to lock in memory (/etc/memlockd.cfg) all those files, so if the initramfs option does not work, it may be that the hard part is not to ensure the files are in memory (that’s a mostly solved problem already), but rather to figure out which files are needed. Of course, the simpler the implementation is wrt. “listing the needed files”, the easier it gets; the systemd-based one is not exactly simple in this respect.
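
For context, memlockd’s configuration is essentially a list of files to lock; a sketch of what the relevant entries could look like (illustrative, not Tails’ actual list; as I understand it, a leading + marks an executable whose shared-library dependencies get locked too):

# append illustrative entries to memlockd's configuration
cat >> /etc/memlockd.cfg <<'EOF'
/lib/systemd/system/tails-remove-overlayfs-dirs.service
+/bin/rm
+/usr/local/lib/udev-watchdog-wrapper
EOF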

#36 Updated by intrigeri 2020-02-22 12:43:00

segfault wrote:
> intrigeri wrote:
>> I’ve seen the “Tails erases memory on DVD boot medium removal: overlayfs read-write branch” scenario fail on the feature/6560-secure-boot+force-all-tests branch: […]
>
> This did not happen on Jenkins yet (in the 5 test suite runs of the branch).

That’s good to know. Note that my local Jenkins is much faster than lizard, so for example, when exposed to racy code, it tends to fail in different places/ways. I remember we did our best to make tails-remove-overlayfs-dirs.service non-racy, but who knows. I’m glad you’re trying to eliminate it.

#37 Updated by intrigeri 2020-02-23 06:58:44

  • Target version changed from Tails_4.5 to Tails_4.4

(As per timeline discussion we had yesterday. This is about finishing the work, and possibly merging it into devel, not about merging it into stable yet: the current plan is to release this in 4.5, not 4.4.)

#38 Updated by segfault 2020-03-05 22:48:25

> Or we could just do it in the initramfs.

Turns out we can’t delete the tmpfs contents in the initramfs, because for some reason that’s not clear to me, these tmpfs’s are not mounted anymore when our initramfs-pre-shutdown-hook is executed. That can also be seen in the screenshot and video you posted above.

#39 Updated by segfault 2020-03-05 23:15:45

I’m now testing if we can remove the tmpfs contents in a script in /lib/systemd/system-shutdown/ instead of the systemd service. I pushed a commit for that to bugfix/15146-overlayfs-memory-erasure.
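
For reference, a minimal sketch of such a script (illustration only; systemd-shutdown runs the executables in /lib/systemd/system-shutdown/ very late, passing the shutdown verb as their single argument):

#!/bin/sh
# hypothetical /lib/systemd/system-shutdown/tails
# $1 is "poweroff", "reboot", "halt" or "kexec"
rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work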

> This did not happen on Jenkins yet (in the 5 test suite runs of the branch).

I still can’t find a case where the memory erasure test failed on Jenkins on the Feature #8415 branch, the Feature #6560 branch, or bugfix/15146-overlayfs-memory-erasure. So if it still works in Jenkins with the shutdown script instead of the service, could you try to reproduce the issue on your local Jenkins with bugfix/15146-overlayfs-memory-erasure?

#40 Updated by intrigeri 2020-03-06 07:52:38

> I still can’t find a case where the memory erasure test failed on Jenkins on the Feature #8415 branch, the Feature #6560 branch, or bugfix/15146-overlayfs-memory-erasure. So if it still works in Jenkins with the shutdown script instead of the service, could you try to reproduce the issue on your local Jenkins with bugfix/15146-overlayfs-memory-erasure?

Sure, will do! I expect I’ll be able to report results by the end of the week.

#41 Updated by intrigeri 2020-03-06 08:17:23

Hi segfault,

> I still can’t find a case where the memory erasure test failed on Jenkins on the Feature #8415 branch, the Feature #6560 branch, or bugfix/15146-overlayfs-memory-erasure.

Note that the latter (bugfix/15146-overlayfs-memory-erasure) does not run the test that I’ve reported as failing, because it’s in features/emergency_shutdown.feature, which is tagged fragile.

So I’ll add the +force-all-tests suffix to bugfix/15146-overlayfs-memory-erasure locally so my local Jenkins runs that test.

#42 Updated by intrigeri 2020-03-07 07:06:10

I’ve run features/erase_memory.feature and features/emergency_shutdown.feature 3 times on my local Jenkins, including fragile tests, from the bugfix/15146-overlayfs-memory-erasure branch at commit:7edfb08488cfc35c2a0493b6fce984a4914ec0f7. Each of these 3 runs had exactly one failing scenario:

Scenario: Erasure of the overlayfs read-write branch on shutdown         # features/erase_memory.feature:64
Given I have started Tails from DVD without network and logged in      # features/step_definitions/snapshots.rb:170
And I prepare Tails for memory erasure tests                           # features/step_definitions/erase_memory.rb:60
When I fill a 128 MiB file with a known pattern on the root filesystem # features/step_definitions/erase_memory.rb:194
 # ensure the pattern is in memory due to tmpfs, not to disk cache
And I drop all kernel caches                                           # features/step_definitions/erase_memory.rb:211
Then patterns cover at least 128 MiB in the guest's memory             # features/step_definitions/erase_memory.rb:154
Pattern coverage: 100.000% (128 MiB out of 128 MiB reference memory)
When I trigger shutdown                                                # features/step_definitions/erase_memory.rb:215
And I wait 20 seconds                                                  # features/step_definitions/common_steps.rb:861
Slept for 20 seconds
Then I find very few patterns in the guest's memory                    # features/step_definitions/erase_memory.rb:178
Pattern coverage: 9.169% (128 MiB out of 1396 MiB reference memory)
9.169% of the free memory still has the pattern, but less than 0.800% was expected.
<false> is not true. (Test::Unit::AssertionFailedError)
./features/step_definitions/erase_memory.rb:181:in `/^I find very few patterns in the guest's memory$/'
features/erase_memory.feature:73:in `Then I find very few patterns in the guest's memory'

#43 Updated by segfault 2020-03-07 15:29:31

  • Feature Branch changed from feature/8415-overlayfs+force-all-tests to bugfix/15146-overlayfs-memory-erasure+force-all-tests

> Note that the latter (bugfix/15146-overlayfs-memory-erasure) does not run the test that I’ve reported as failing, because it’s in features/emergency_shutdown.feature, which is tagged fragile.

Ack. I renamed the branch to bugfix/15146-overlayfs-memory-erasure+force-all-tests.

> I’ve run features/erase_memory.feature and features/emergency_shutdown.feature 3 times on my local Jenkins, including fragile tests, from the bugfix/15146-overlayfs-memory-erasure branch at 7edfb08488cfc35c2a0493b6fce984a4914ec0f7. Each of these 3 runs had exactly one failing scenario:

The same scenario fails on (our shared) Jenkins. I suspect that’s because, according to a comment in the file, config/chroot_local-includes/lib/systemd/system-shutdown/tails is not run " by the other instance of systemd-shutdown that’s run (as /shutdown) after returning to the initramfs during shutdown". I don’t really understand what that means though (I don’t know anything about two instances of systemd-shutdown, and why deleting files on a filesystem in one of those instances wouldn’t delete them for the other instance). Anyway, since it did work in most cases with tails-remove-overlayfs-dirs.service, I’m now trying to use that again, but also locking it in memory via /etc/memlockd.cfg.

#44 Updated by intrigeri 2020-03-07 18:12:29

> I suspect that’s because, according to a comment in the file, config/chroot_local-includes/lib/systemd/system-shutdown/tails is not run " by the other instance of systemd-shutdown that’s run (as /shutdown) after returning to the initramfs during shutdown". I don’t really understand what that means though (I don’t know anything about two instances of systemd-shutdown, and why deleting files on a filesystem in one of those instances wouldn’t delete them for the other instance).

I see that https://tails.boum.org/contribute/design/memory_erasure/ is a bit lacking in this respect: it does not say what systemd itself does, hence your confusion, I think.

The missing info is in https://www.freedesktop.org/wiki/Software/systemd/InitrdInterface/, in the “If the executable /run/initramfs/shutdown exists systemd will use it to jump back into the initrd on shutdown” bullet point. For the details, when I (re-re-)implemented this feature, I had to read the systemd source code.

One important thing, IIRC, is that “jump back into the initrd” is done via the equivalent of chroot.
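
Assuming a running Tails session, the interface described above can be observed like this (just an illustration of the mechanism, not something our scripts need to do):

# if this executable exists, systemd-shutdown will chroot back into
# /run/initramfs at the very end of shutdown and exec ./shutdown there
test -x /run/initramfs/shutdown && echo "will jump back into the initramfs"
ls /run/initramfs   # the unpacked initramfs we return into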

> Anyway, since it did work in most cases with tails-remove-overlayfs-dirs.service, I’m now trying to use that again, but also locking it in memory via /etc/memlockd.cfg.

OK!

#45 Updated by segfault 2020-03-07 22:54:41

>> Anyway, since it did work in most cases with tails-remove-overlayfs-dirs.service, I’m now trying to use that again, but also locking it in memory via /etc/memlockd.cfg.
>
> OK!

The memory erasure tests passed on Jenkins, so it would be nice if you could test that branch on your local Jenkins.

#46 Updated by intrigeri 2020-03-08 06:57:13

> The memory erasure tests passed on Jenkins,

Good news :)

> so it would be nice if you could test that branch on your local Jenkins.

Sure, I’m on it!

Nitpicking: my understanding is that using the + prefix for /lib/systemd/system/tails-remove-overlayfs-dirs.service in memlockd.cfg is not useful/necessary here; IMO it can be a little confusing, letting the reader believe that memlockd will be clever enough to lock all dependencies of the service in memory, which, as I understand it, is not the case.

#47 Updated by segfault 2020-03-08 08:38:36

> Nitpicking: my understanding is that using the + prefix for /lib/systemd/system/tails-remove-overlayfs-dirs.service in memlockd.cfg is not useful/necessary here; IMO it can be a little confusing, letting the reader believe that memlockd will be clever enough to lock all dependencies of the service in memory, which, as I understand it, is not the case.

Right. I pushed a fixup commit.

#48 Updated by intrigeri 2020-03-08 09:10:23

>> so it would be nice if you could test that branch on your local Jenkins.
>
> Sure, I’m on it!

I’ll report more complete stress-testing results later, but I can tell you that I’ve already seen “Scenario: Tails erases memory on DVD boot medium removal: overlayfs read-write branch” fail here. I’m attaching the corresponding screenshot and video, in the hope it might help understand what’s going on.

I now have a big doubt wrt. the tails-remove-overlayfs-dirs.service -based implementation: config/chroot_local-includes/usr/local/lib/udev-watchdog-wrapper runs systemctl --force poweroff, i.e. “shutdown of all running services is skipped”. So, if I understand this correctly, we cannot count on the ExecStop= command of tails-remove-overlayfs-dirs.service being run during emergency shutdown. This would explain why, in some cases, this service does not do its job on shutdown. But this does not explain why the overlayfs read-write branch is cleaned up most of the time, so perhaps I’m totally confused.

Either way, I’m wondering: maybe it would be more robust to move the /bin/rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work command from /lib/systemd/system/tails-remove-overlayfs-dirs.service to /lib/systemd/system-shutdown/tails? IIRC (without checking), executables under /lib/systemd/system-shutdown/ are run by systemd even when --force was passed. At the very least, this would simplify things and avoid the need to reason about the exact behavior of systemctl --force poweroff.

#49 Updated by intrigeri 2020-03-08 09:46:17

> I’ll report more complete stress-testing results later

I’ve run features/emergency_shutdown.feature 5 times here and I saw the failure only once.

I see that when it fails, once we’ve gone back to the initramfs, this is still mounted:

udev on /oldroot/dev type devtmpfs […]
tmpfs on /oldroot/run type tmpfs […]
overlay on /oldroot type overlay (rw,noatime,lowerdir=//filesystem.squashfs/,upperdir=/live/overlay//rw,workdir=/live/overlay//work)

And then /oldroot cannot be unmounted: “Device or resource busy”. I suspect that as long as the overlayfs is mounted:

  • The overlayfs kernel module has a copy of the upperdir/workdir data in memory, even if we rm -rf’d it on disk (I would not be surprised: AFAIK, manually fiddling with the on-disk upperdir/workdir data is not exactly supported :)
  • Or its backing upperdir and workdir are still around, even if /live/overlay was successfully (but perhaps lazily) umounted. In most cases this probably does not matter because we’ve rm -rf’ed their content, but when that fails — like here apparently — it does matter.

So I would see value in ensuring /oldroot is unmounted, either on top of, or instead of, trying to make /bin/rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work more robust.

Could it be because of the /oldroot/dev and /oldroot/run mountpoints? The “Move /oldroot/* mountpoints out of the way” comment suggests that they might cause trouble. Note that sd-umount tried to do that itself earlier, but that fails with “Device or resource busy” as well. So I would suggest we mount --move them outside of /oldroot: if we can’t unmount them, fine, but let’s at least ensure that won’t prevent us from unmounting /oldroot itself?
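
A rough sketch of that idea, as it might look in the initramfs hook (the target directories are made up; whether the initramfs mount supports --move is part of the busybox question raised later in this ticket):

# hypothetical initramfs snippet: get the API filesystems out of the way so
# they cannot keep /oldroot busy, then unmount the overlay root itself
mkdir -p /moved-dev /moved-run
mount --move /oldroot/dev /moved-dev || true
mount --move /oldroot/run /moved-run || true
umount /oldroot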

#50 Updated by segfault 2020-03-08 17:36:56

intrigeri wrote:
> >> so it would be nice if you could test that branch on your local Jenkins.
> >
> > Sure, I’m on it!
>
> I’ll report more complete stress-testing results later, but I can tell you that I’ve already seen “Scenario: Tails erases memory on DVD boot medium removal: overlayfs read-write branch” fail here. I’m attaching the corresponding screenshot and video, in the hope it might help understand what’s going on.
>
> I now have a big doubt wrt. the tails-remove-overlayfs-dirs.service -based implementation: config/chroot_local-includes/usr/local/lib/udev-watchdog-wrapper runs systemctl --force poweroff, i.e. “shutdown of all running services is skipped”. So, if I understand this correctly, we cannot count on the ExecStop= command of tails-remove-overlayfs-dirs.service being run during emergency shutdown. This would explain why, in some cases, this service does not do its job on shutdown. But this does not explain why the overlayfs read-write branch is cleaned up most of the time, so perhaps I’m totally confused.

Yes, that confuses me too, because without this service, erase_memory.feature fails every time (see https://jenkins.tails.boum.org/job/test_Tails_ISO_bugfix-15146-overlayfs-memory-erasure-no-ovlerlayfs-dirs-removal/). But if systemctl --force poweroff skips ExecStop commands, the service shouldn’t do anything, so emergency_shutdown.feature should also fail every time.

> Either way, I’m wondering: maybe it would be more robust to move the /bin/rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work command from /lib/systemd/system/tails-remove-overlayfs-dirs.service to /lib/systemd/system-shutdown/tails? IIRC (without checking), executables under /lib/systemd/system-shutdown/ are run by systemd even when --force was passed. At the very least, this would simplify things and avoid the need to reason about the exact behavior of systemctl --force poweroff.

That is exactly what I tried in 7edfb08488cfc35c2a0493b6fce984a4914ec0f7, which caused the emergency shutdown test to fail every time, see Bug #15146#note-42 and following.

> I’ve run features/emergency_shutdown.feature 5 times here and I saw the failure only once.
>
> I see that when it fails, once we’ve gone back to the initramfs, this is still mounted:
>
> […]
>
> And then /oldroot cannot be unmounted: “Device or resource busy”. I suspect that as long as the overlayfs is mounted:
>
> * The overlayfs kernel module has a copy of the upperdir/workdir data in memory, even if we rm -rf’d it on disk (I would not be surprised: AFAIK, manually fiddling with the on-disk upperdir/workdir data is not exactly supported :)
> * Or its backing upperdir and workdir are still around, even if /live/overlay was successfully (but perhaps lazily) umounted. In most cases this probably does not matter because we’ve rm -rf’ed their content, but when that fails — like here apparently — it does matter.
>
> So I would see value in ensuring /oldroot is unmounted, either on top of, or instead of, trying to make /bin/rm -rf /lib/live/mount/overlay/rw /lib/live/mount/overlay/work more robust.
>
> Could it be because of the /oldroot/dev and /oldroot/run mountpoints? The “Move /oldroot/* mountpoints out of the way” comment suggests that they might cause trouble. Note that sd-umount tried to do that itself earlier, but that fails with “Device or resource busy” as well. So I would suggest we mount --move them outside of /oldroot: if we can’t unmount them, fine, but let’s at least ensure that won’t prevent us from unmounting /oldroot itself?

It’s worth a try. I pushed a commit.

#51 Updated by intrigeri 2020-03-10 18:09:41

Hi,

>> Could it be because of the /oldroot/dev and /oldroot/run mountpoints? The “Move /oldroot/* mountpoints out of the way” comment suggests that they might cause trouble. Note that sd-umount tried to do that itself earlier, but that fails with “Device or resource busy” as well. So I would suggest we mount --move them outside of /oldroot: if we can’t unmount them, fine, but let’s at least ensure that won’t prevent us from unmounting /oldroot itself?
>
> It’s worth a try. I pushed a commit.

With this commit, I’ve run the relevant test 13 times on the affected machine and never saw it fail. For good measure, these 13 times include a few runs of the entire features/erase_memory.feature and features/emergency_shutdown.feature. During most of these runs, this was the only thing the machine was doing; for 2 runs I was running another test suite instance on another VM on the same hardware, to broaden the span of conditions I was exercising the thing in.

So either the race condition is still there but the odds of losing it are super low, or commit:450244d94129f445990390911d54ba94223decd8 did the trick. I’m personally satisfied to leave it at that.

If that passes on Jenkins too, I think we’re good here (unless there was anything left, like design doc update? I don’t remember).

#52 Updated by CyrilBrulebois 2020-03-12 09:55:43

  • Target version changed from Tails_4.4 to Tails_4.5

#53 Updated by intrigeri 2020-03-16 08:17:27

  • Assignee changed from segfault to intrigeri

I’ve:

  • made the branch build again on Jenkins, by merging feature/8415-overlayfs+force-all-tests into it
  • updated the design doc

If the relevant tests pass on Jenkins, I’ll merge this into feature/8415-overlayfs+force-all-tests and will close this issue as resolved. Then, this can be reviewed as part of the Feature #8415 + Feature #6560 batch.

#54 Updated by intrigeri 2020-03-17 13:03:12

intrigeri wrote:
> If the relevant tests pass on Jenkins […]

Unfortunately, in 3 runs on Jenkins, “Scenario Tails erases memory on DVD boot medium removal: overlayfs read-write branch” failed once: https://jenkins.tails.boum.org/view/RM/job/test_Tails_ISO_bugfix-15146-overlayfs-memory-erasure-force-all-tests/5/cucumber-html-reports/report-feature_8_3124862301.html.
I’m attaching the relevant build artifacts, in the hope it helps debugging. I see that /oldroot could not be unmounted.

At this point I’m not sure we should block on this for 4.5~rc1, or even for 4.5. @segfault, what do you think?

#55 Updated by intrigeri 2020-03-18 16:48:04

intrigeri wrote:
> At this point I’m not sure we should block on this for 4.5~rc1, or even for 4.5. @segfault, what do you think?

I’ve thought a little bit about this. I’m quite torn because:

  • Our “erase memory on shutdown” feature has historically been pretty fragile. AFAICT, the only robust version of it is the current one (with aufs), to which we switched in Tails 3.0. We’ve never been super clear with users about whether this was a “works 90+ % of the time” thing or a “works 100% of the time, guaranteed” thing. So one may argue that going back to a somewhat fragile status in 4.5 is basically going back to business as usual, from a long-term perspective (and possibly we could adjust a bit our claims about this feature, if we don’t manage to fix the overlayfs-based implementation really soon).
  • By nature of this security feature, depending on whether it’s 100% reliable or not, informed users may make different security decisions. Given it’s been reliable since almost 3 years, one may argue that any regression will put some users at risk.
  • I’m not aware of any real-world situation in which this feature has made a practical difference. Granted, people who benefited from it may not want to tell us about it, but still, we’ve had this feature for many years now.

I’m now leaning towards “accept the regression in 4.5~rc1, don’t block on this to merge Feature #8415 and friends, then treat it as high priority, and set a deadline in N months when we’ll adjust our doc if it’s not fixed yet”.

#56 Updated by intrigeri 2020-03-19 09:09:32

  • Status changed from In Progress to Needs Validation
  • Assignee changed from intrigeri to anonym

@anonym, I’d like your opinion too on the last 2 comments.

In passing, note that systemd v245 improves a bit the “unmount stuff on shutdown” code, so there’s a little bit of hope that upgrading systemd will help.

#57 Updated by intrigeri 2020-03-19 09:20:59

  • Feature Branch changed from bugfix/15146-overlayfs-memory-erasure+force-all-tests to feature/8415-overlayfs+force-all-tests

I’ve merged the current implementation into feature/8415-overlayfs+force-all-tests, and in turn into feature/6560-secure-boot+force-all-tests, because that’s the best we have at the moment.

#58 Updated by intrigeri 2020-03-20 08:54:58

  • Feature Branch changed from feature/8415-overlayfs+force-all-tests to feature/6560-secure-boot+force-all-tests, https://salsa.debian.org/tails-team/tails/-/merge_requests/44/

#59 Updated by anonym 2020-03-20 09:11:59

intrigeri wrote:
> intrigeri wrote:
> > If the relevant tests pass on Jenkins […]
>
> Unfortunately, in 3 runs on Jenkins, “Scenario Tails erases memory on DVD boot medium removal: overlayfs read-write branch” failed once: https://jenkins.tails.boum.org/view/RM/job/test_Tails_ISO_bugfix-15146-overlayfs-memory-erasure-force-all-tests/5/cucumber-html-reports/report-feature_8_3124862301.html.

Since the branch was merged, these jobs have now been deleted. Oh well, seems the screenshot has everything needed.

> I’m attaching the relevant build artifacts, in the hope it helps debugging. I see that /oldroot could not be unmounted.

Oof, this looks like an ugly one! Some ideas:

  • Normally when mount fails you get the useful info in the journal, so perhaps you can add its tail to the debug logging to get more info about the problem?
  • I’m not at all sure about this one, but IIRC pivot_root can be a more effective way to change the root when it comes to unmounting problematic mountpoints.
  • Since we are in the initramfs, I guess we are using busybox’s mount? The error message seems to confirm this (I can find the string “mounting %s on %s failed” in busybox, but not in util-linux). Perhaps it is buggy, as (in my experience) busybox alternatives sometimes are. What if we used the real mount?
  • What about filling the tmpfs with zeros if we detect this error? Could be a cheap (?) workaround until we find something better.

> intrigeri wrote:
> > At this point I’m not sure we should block on this for 4.5~rc1, or even for 4.5. segfault, what do you think?
>
> I’ve thought a little bit about this. I’m quite torn because:
>
> * Our “erase memory on shutdown” feature has historically been pretty fragile. AFAICT, the only robust version of it is the current one (with aufs), to which we switched in Tails 3.0. We’ve never been super clear with users about whether this was a “works 90+ % of the time” thing or a “works 100% of the time, guaranteed” thing. So one may argue that going back to a somewhat fragile status in 4.5 is basically going back to business as usual, from a long-term perspective (and possibly we could adjust a bit our claims about this feature, if we don’t manage to fix the overlayfs-based implementation really soon).
> * By nature of this security feature, depending on whether it’s 100% reliable or not, informed users may make different security decisions. Given it’s been reliable since almost 3 years, one may argue that any regression will put some users at risk.
> * I’m not aware of any real-world situation in which this feature has made a practical difference. Granted, people who benefited from it may not want to tell us about it, but still, we’ve had this feature for many years now.
>
> I’m now leaning towards “accept the regression in 4.5~rc1, don’t block on this to merge Feature #8415 and friends, then treat it as high priority, and set a deadline in N months when we’ll adjust our doc if it’s not fixed yet”.

I 100% agree with this. Also, this problem looks solvable to me, so I still have pretty good feelings about this sub-optimal move.

#60 Updated by anonym 2020-03-20 09:32:50

  • Status changed from Needs Validation to In Progress
  • Assignee changed from anonym to intrigeri

#61 Updated by intrigeri 2020-03-20 09:44:40

Hi,

anonym wrote:
> Oof, this looks like an ugly one! Some ideas:
>
> * Normally when mount fails you get the useful info in the journal, so perhaps you can add its tail to the debug logging to get more info about the problem?

I’m not sure if, nor how, it could work, because:

  • At this stage of the emergency shutdown process, I doubt our test suite will manage to talk to the remote shell so the best we can hope for is to convince our shutdown scripts/hooks to do so.
  • At this late stage of the shutdown process, IIRC systemd has stopped journald and has redirected all output to the console.
  • AFAIK the extra info (apart from the stdout/stderr that we already have) we can hope for is kernel errors, but we have them already in the video.

I’d love to be wrong.

IIRC, adding debug to the kernel command line makes the whole “return to the initramfs and shutdown” systemd thing much more verbose and IIRC the extra info is visible on the console. That would probably be my next step to debug this further.

> * I’m not at all sure about this one, but IIRC pivot_root can be a more effective way to change the root when it comes to unmounting problematic mountpoints.

Interesting! I think current systemd’s switch_root() tries to do that when returning to the initrd, but falls back to overmounting root. I did not check what the version from Buster does. It could be worth looking deeper into this.

> * Since we are in the initramfs, I guess we are using busybox’s mount? The error message seems to confirm this (I can find the string “mounting %s on %s failed” in busybox, but not in util-linux). Perhaps it is buggy, as (in my experience) busybox alternatives sometimes are. What if we used the real mount?

Worth trying :)

> * What about filling the tmpfs with zeros if we detect this error? Could be a cheap (?) workaround until we find something better.

Do we have access to the relevant tmpfs at this stage? It seems to me that we don’t.

Finally, my own hunch wrt. this occasional failure is: it’s a race related to how the kernel lazily unmounts stuff vs. when/how systemd switches to the initramfs. Probably incorrect but oh well.

>> I’m now leaning towards “accept the regression in 4.5~rc1, don’t block on this to merge Feature #8415 and friends, then treat it as high priority, and set a deadline in N months when we’ll adjust our doc if it’s not fixed yet”.
>
> I 100% agree with this. Also, this problem looks solvable to me, so I still have pretty good feelings about this sub-optimal move.

OK, I’ll unparent this ticket!

#62 Updated by intrigeri 2020-03-20 09:45:16

  • Assignee deleted (intrigeri)
  • Parent task deleted (Feature #8415)

#63 Updated by intrigeri 2020-03-20 09:45:54

#64 Updated by intrigeri 2020-03-26 10:51:58

Hi anonym & segfault,

anonym wrote:
> intrigeri wrote:
>> I’m now leaning towards “accept the regression in 4.5~rc1, don’t block on this to merge Feature #8415 and friends, then treat it as high priority, and set a deadline in N months when we’ll adjust our doc if it’s not fixed yet”.
>
> I 100% agree with this. Also, this problem looks solvable to me, so I still have pretty good feelings about this sub-optimal move.

Does one of you have the spoons + availability to work on this in time for 4.5? I think that means having something merged by April 5.

#65 Updated by segfault 2020-03-29 15:09:10

  • Feature Branch changed from feature/6560-secure-boot+force-all-tests, https://salsa.debian.org/tails-team/tails/-/merge_requests/44/ to bugfix/15146-overlayfs-memory-erasure+force-all-tests

I recreated the old feature branch and pushed two commits to test whether they fix the problem:

  • Added a sync at the beginning of the script in the hope that it waits until lazy unmounts are finished
  • Using /bin/mount as proposed by anonym

Now waiting for Jenkins test results

#66 Updated by segfault 2020-04-03 14:11:55

  • Status changed from In Progress to Needs Validation
  • Assignee set to intrigeri

The relevant tests haven’t failed in the 8 test runs we have until now. Doesn’t mean that there is no race anymore but maybe we should just merge it in time for 4.5?

#67 Updated by intrigeri 2020-04-04 05:37:45

Hi,

> The relevant tests haven’t failed in the 8 test runs we have until now. Doesn’t mean that there is no race anymore but maybe we should just merge it in time for 4.5?

Thanks!
Looking at the code, I tend to agree: I can’t imagine how these changes could make things worse.
I’ll stress-test this a bit today on my local Jenkins, which exposed the problem more readily, and I’ll merge unless I spot a regression here!

#68 Updated by intrigeri 2020-04-04 06:06:10

Note: I’ve set the base branch to testing and merged current testing into this branch, to ensure we’re actually testing what would land into 4.5 if we merge this branch.

#69 Updated by intrigeri 2020-04-04 07:54:19

As our baseline, among 13 runs on the testing branch: this failure happened twice.
FWIW: it never failed since March 28.

#70 Updated by intrigeri 2020-04-04 11:15:22

  • Status changed from Needs Validation to In Progress
  • Assignee changed from intrigeri to segfault

I’ve run the relevant scenario 10-15 times on each of lizard and my local Jenkins, and unfortunately, it failed 10-15% of the time on both systems ⇒ no measurable improvement AFAICT.

BTW, in the initramfs environment, are /bin/{mount,umount} really the “real” ones? I’m wondering if they could be symlinks to the busybox executable, in which case commit:228469faca4d2d1847623fa6d2dc6a6f9dada4cd would be a no-op.
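
One way to check (a sketch; the initrd path is a guess, adjust to the actual image):

# from a built system, list what the initramfs actually ships
lsinitramfs /boot/initrd.img | grep -E 'bin/(mount|umount)|busybox'
# from within the initramfs environment itself: if mount is merely a symlink
# to busybox, this will show it
readlink -f /bin/mount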

Finally, about that same commit: why the addition to memlockd.cfg? The /bin/{mount,umount} used in the script are those from the initramfs, not those from the root filesystem, so I don’t understand how it’s related to memlockd.

#71 Updated by segfault 2020-04-06 14:32:57

  • Assignee deleted (segfault)

intrigeri wrote:
> I’ve run the relevant scenario 10-15 times on each of lizard and my local Jenkins, and unfortunately, it failed 10-15% of the time on both systems ⇒ no measurable improvement AFAICT.

Too bad :(

> BTW, in the initramfs environment, are /bin/{mount,umount} really the “real” ones? I’m wondering if they could be symlinks to the busybox executable, in which case commit:228469faca4d2d1847623fa6d2dc6a6f9dada4cd would be a no-op.

I don’t know if that’s the case.

> Finally, about that same commit: why the addition to memlockd.cfg? The /bin/{mount,umount} used in the script are those from the initramfs, not those from the root filesystem, so I don’t understand how it’s related to memlockd.

You’re right, it’s not related to memlockd; I must have been confused.

#72 Updated by anonym 2020-04-06 14:39:13

  • Assignee set to anonym
  • Target version changed from Tails_4.5 to Tails_4.6

I’ll give this a try during the 4.6 cycle.

#73 Updated by anonym 2020-04-16 12:11:29

So I have tested moving to util-linux’ mount/umount, but the errors persist, just with different messages as you can see in the attached image.

However, while I could reproduce the problem before switching mount/umount, I haven’t been able to reproduce it since the switch, and that is with 30+ runs. I’m mildly optimistic, so I have pushed this + made it so it will only run the affected scenario, and it will run 50 times so we get something statistically significant. So I’m quite curious how Jenkins run #28 (which should become the first run with my changes) will turn out!

#74 Updated by anonym 2020-04-17 09:54:50

tl;dr: moving to util-linux’s mount/umount seems to (mostly) solve the problem!

What I did was:

  • move to util-linux’s mount/umount (what segfault did in commit:228469faca4d2d1847623fa6d2dc6a6f9dada4cd was not enough)
  • speed up ‘I fill a .. MiB file …’ step by 1000%
  • remove all scenarios but the problematic one, repeating it 50 times

We now have 200 runs of the Tails erases memory on DVD boot medium removal: overlayfs read-write branch scenario on Jenkins, and only one failed, and it failed in the way we have seen (128 MiB of patterns detected). That one failure is sort of a bummer as it proves the problem isn’t completely solved, but with a failure rate of 0.5% I guess it starts to look like something acceptable, perhaps comparable to the occurrence rate of hardware/BIOS/blah-related crashes during shutdown which would have the same effect? Any thoughts on this?

After seeing the first 50 runs finish with no errors I worried that the speed up I did might have been involved somehow (e.g. a difference in how the data was cached + some kernel bug leaving that cache uncleared, so that the real fix was the change to the way we write the file), so I reverted it, and in the next 150 runs there was only 1 error. So I don’t think the speed up “hid” the problem, and that switching mount/umount is the actual fix. :) (Also I see that the run with the speed up ran in a third of the time of the runs without, which is nice! But it won’t matter much in a full run with all features restored.)

As the next step I have restored all the memory erasing scenarios and repeated them 10 times, just to make sure this didn’t mess up any of the other scenarios. After that, if all looks good, I want to restore all features/scenarios and do several full runs — I want to rule out that the approach of repeating these scenarios/features in the same run makes a difference compared to full runs. By that time I will also rewrite this branch so it doesn’t use the hacky “debugging mechanism” I introduced to fix mount/umount. :)

intrigeri, if you get your local Jenkins (which seems to expose the problem more often) up and running again soon, I’d like you to run this branch from its current HEAD (commit:3918c329d1d0c8b3f492889654c674aa47e66b4f), and also from HEAD^ (commit:87459f66fd3aa6091918a6c9d8cab785595657d3) for the stress testing of that single scenario that has been most problematic.

BTW, I think we should move the scenarios in emergency_shutdown.feature that are about memory erasure into features/erase_memory.feature, and just keep the one scenario that specifically tests the emergency shutdown feature in there. That makes more sense, right?

#75 Updated by intrigeri 2020-04-17 16:41:17

> That one failure is sort of a bummer as it proves the problem isn’t completely solved, but with a failure rate of 0.5% I guess it starts to look like something acceptable, perhaps comparable to the occurrence rate of hardware/BIOS/blah-related crashes during shutdown which would have the same effect? Any thoughts on this?

I tend to agree. I’d also like to note that our automated test suite does not tell us much about the failure rate of such problems (that looks like a race condition) on real-world hardware.

I would be curious to have an idea of the failure rate on a branch that hasn’t this tentative fix. But it’s probably not worth blocking on this.

> intrigeri, if you will get your local Jenkins (which seems to expose the problem more often) up and running again soon

I don’t know when zen and I will find time to work on this, so “soon” may not happen.
To be clear, given how confident you are in your fix, what matters is “in time for 4.6”, right?

> BTW, I think we should move the scenarios in emergency_shutdown.feature that are about memory erasure into features/erase_memory.feature, and just keep the one scenario that specifically tests the emergency shutdown feature in there. That makes more sense, right?

The way I see it, the place where the scenarios live is consistent with the features’ definition:

Feature: System memory erasure on shutdown
As a Tails user
when I shutdown Tails
I want the system memory to be free from sensitive data.

Feature: Emergency shutdown
As a Tails user
when I unplug my Tails device to trigger emergency shutdown
I want the system memory to be free from sensitive data.

i.e. the core documented purpose of the Emergency shutdown feature is to ensure we erase memory in that case.
This makes quite some sense to me, as it’s what really matters wrt. emergency shutdown.

#76 Updated by anonym 2020-04-17 19:33:15

  • % Done changed from 0 to 30

Ok, so we have 3 successful runs of my latest iteration, i.e. over 3 runs we have seen erase_memory.feature and emergency_shutdown.feature succeed 3*10 = 30 times in total. Looks good! I greedily started 3 more because jenkins was chilling (no weekend for the machines! >:)). If those succeed, I’ll proceed with the next steps to make an actually mergable branch.

intrigeri wrote:
> > That one failure is sort of a bummer as it proves the problem isn’t completely solved, but with a failure rate of 0.5% I guess it starts to look like something acceptable, perhaps comparable to the occurrence rate of hardware/BIOS/blah-related crashes during shutdown which would have the same effect? Any thoughts on this?
>
> I tend to agree. I’d also like to note that our automated test suite does not tell us much about the failure rate of such problems (that looks like a race condition) on real-world hardware.

Yes, and WhisperBack logs won’t help either. :/

> I would be curious to have an idea of the failure rate on a branch that hasn’t this tentative fix. But it’s probably not worth blocking on this.

I could probably do some ssh isotester@lizard find | grep | blah to get a ballpark figure, if you think it’s worth investing 20 minutes into this. I’m on the fence, leaning toward “no, whatever”. You?

> > intrigeri, if you will get your local Jenkins (which seems to expose the problem more often) up and running again soon
>
> I don’t know when zen and I will find time to work on this, so “soon” may not happen.
> To be clear, given how confident you are in your fix, what matters is “in time for 4.6”, right?

I think testing on your hardware would be “nice” given the history, but not a blocker; IMHO it’s a win that it improves the situation on Jenkins (and my hardware, FWIW) and that is enough to warrant a review’n’merge. If your hardware still has problems once you get it running => new ticket (or revive this one, whatever).

> > BTW, I think we should move the scenarios in emergency_shutdown.feature that are about memory erasure into features/erase_memory.feature, and just keep the one scenario that specifically tests the emergency shutdown feature in there. That makes more sense, right?
>
> The way I see it, the place where the scenarios live is consistent with the features’ definition:
>
> […]
>
> i.e. the core documented purpose of the Emergency shutdown feature is to ensure we erase memory in that case.
> This makes quite some to me as it what’s really matters wrt. emergency shutdown.

Ok, this makes sense! Thanks! :)

#77 Updated by anonym 2020-04-18 12:44:48

  • % Done changed from 30 to 40

anonym wrote:
> Ok, so we have 3 successful runs of my latest iteration, i.e. over 3 runs we have seen erase_memory.feature and emergency_shutdown.feature succeed 3*10 = 30 times in total. Looks good! I greedily started 3 more because jenkins was chilling (no weekend for the machines! >:)). If those succeed, I’ll proceed with the next steps to make an actually mergable branch.

There has now been 10 runs so we’re up in 100 runs of all memory erasure related scenarios, and there was a single failure (same scenario) so it still looks good.

I have rewritten the branch into a mergable state, let’s see what jenkins thinks!

#78 Updated by anonym 2020-04-18 12:46:22

Oh, the old branch is still available as bugfix/15146-overlayfs-memory-erasure-debugging.

#79 Updated by anonym 2020-04-22 11:20:19

  • Status changed from In Progress to Needs Validation
  • Assignee deleted (anonym)
  • % Done changed from 40 to 50

On Jenkins I’ve seen 10 full runs of the complete test suite without any issues related to memory erasure (runs 53-63, ignoring 61 which was a Jenkins fail). Combined with the stress testing, it seems there are no regressions, only improvements, so let’s merge this, ey? In other words, let’s not block on intrigeri testing it on his monster hardware.

#80 Updated by anonym 2020-04-27 09:35:37

3 more good runs on Jenkins (up to run #66). :)

#81 Updated by CyrilBrulebois 2020-05-06 04:28:53

  • Target version changed from Tails_4.6 to Tails_4.7

#82 Updated by intrigeri 2020-05-10 18:28:00

  • Assignee set to intrigeri

#83 Updated by intrigeri 2020-05-10 19:17:33

  • Status changed from Needs Validation to Resolved
  • % Done changed from 50 to 100

Applied in changeset commit:tails|7a30bc0b5ec656ed0b4406d770b17384e189c0aa.