Feature #17165

Stop taking time-based APT snapshots of Stretch

Added by intrigeri 2019-10-19 06:10:40. Updated 2019-12-13 09:43:16.

Status: Resolved
Priority: Normal
Assignee:
Category: Infrastructure
Target version:
Start date:
Due date:
% Done: 0%
Feature Branch:
Type of work: Sysadmin
Blueprint:
Starter:
Affected tool:
Deliverable for:

Description

Once we don’t use them anymore, we should drop these snapshots and stop updating them, to free tons of disk space on apt.lizard.

Doc: https://tails.boum.org/contribute/APT_repository/time-based_snapshots/#stop-tracking-a-distribution


Subtasks


Related issues

Blocks Tails - Feature #13242: Core work: Sysadmin (Maintain our already existing services) Confirmed 2017-06-29
Blocked by Tails - Bug #16868: Upgrade Vagrant box to Buster Resolved

History

#1 Updated by intrigeri 2019-10-19 06:10:52

  • blocks Feature #13242: Core work: Sysadmin (Maintain our already existing services) added

#2 Updated by intrigeri 2019-10-19 06:10:59

  • blocked by Bug #16868: Upgrade Vagrant box to Buster added

#3 Updated by intrigeri 2019-10-23 15:17:00

  • Target version set to Tails_4.1

#4 Updated by CyrilBrulebois 2019-10-24 13:00:40

While I can understand the desire and need to free up some space, I’m not sure I find it reasonable to drop all snapshots, including those referenced in the basebox config needed to rebuild 4.0:

+ pwd
+ sudo LC_ALL=C ARCHITECTURE=amd64 DISTRIBUTION=stretch DEBIAN_SERIAL=2019100904 DEBIAN_SECURITY_SERIAL=2019100904 TAILS_SERIAL=2019100904 vmdebootstrap --arch amd64 --distribution stretch --image tails-builder-amd64-stretch-20191010-fd86f34e5a.qcow2 --convert-qcow2 --enable-dhcp --grub --hostname vagrant-stretch --log-level debug --mirror http://time-based.snapshots.deb.tails.boum.org/debian/2019100904 --debootstrapopts keyring=/tmp/tmp.dVxSxb544S/pubring.kbx --owner kibi --kernel-package linux-image-amd64 --root-password=vagrant --size 20G --sudo --user vagrant/vagrant --customize /home/kibi/work/clients/tails/tails.git/vagrant/definitions/tails-builder/customize.sh --verbose
[sudo] password for kibi: 
Sorry, try again.
[sudo] password for kibi: 
Creating disk image
Creating partitions
Creating filesystem ext4
Mounting /dev/mapper/loop0p1 on /tmp/tmpYomomf
Debootstrapping stretch [amd64]
EEEK! Something bad happened...
command failed: ['debootstrap', '--arch=amd64', '--include=linux-image-amd64,acpid,sudo,grub-pc', '--keyring=/tmp/tmp.dVxSxb544S/pubring.kbx', 'stretch', '/tmp/tmpYomomf', 'http://time-based.snapshots.deb.tails.boum.org/debian/2019100904']
I: Retrieving InRelease 
I: Checking Release signature
I: Valid Release signature (key id 221F9A3C6FA3E09E182E060BC7988EA7A358D82E)
I: Retrieving Packages 
I: Validating Packages 
I: Resolving dependencies of required packages...
I: Resolving dependencies of base packages...
I: Checking component main on http://time-based.snapshots.deb.tails.boum.org/debian/2019100904...
I: Retrieving acpid 1:2.0.28-1+b1
I: Validating acpid 1:2.0.28-1+b1
I: Retrieving adduser 3.115
W: Couldn't download package adduser (ver 3.115 arch all) at http://time-based.snapshots.deb.tails.boum.org/debian/2019100904/pool/main/a/adduser/adduser_3.115_all.deb
I: Retrieving apt 1.4.9
W: Couldn't download package apt (ver 1.4.9 arch amd64) at http://time-based.snapshots.deb.tails.boum.org/debian/2019100904/pool/main/a/apt/apt_1.4.9_amd64.deb
[…]

I thought being able to reproduce image builds had grown into an important property of our project. I’m rather disappointed to see that disappear in the blink of an eye, especially for this major release, just a few days after the release has been published.

#5 Updated by intrigeri 2019-10-24 17:04:31

  • Status changed from Confirmed to In Progress
  • Assignee set to intrigeri

@zen, I’m done (for better and for worse) with the bits we had planned that I would do earlier today.
Please hold on wrt. the next steps, as they’re about deleting data that we’ll need if we decide to recover other stuff; see below if you want the details.

@CyrilBrulebois,

> While I can understand the desire and need to free up some space, I’m not sure I find it reasonable to drop all snapshots, including those referenced in the basebox config needed to rebuild 4.0:
> […]
> I thought being able to reproduce image builds had grown into an important property of our project. I’m rather disappointed to see that disappear in the blink of an eye, especially for this major release, just a few days after the release has been published.

Thank you for raising these concerns.

I’m not happy either about the consequences you’re describing: I agree that the resulting situation is not great.
In particular, yes, the timing sucks: usually it’s possible to reproduce Tails releases for at least a month or two after the release.

All I can say is that it was not easy to decide which trade-off to make, factoring in what we already know about 4.0 reproducibility, my own doubts wrt. whether anyone ever tries to independently reproduce our builds (if we don’t fix this situation, I’m curious whether anyone will complain), longer-term storage management (it’s harder to shrink a ~1TB storage volume than to grow one), implementation complexity/cost of the various options, and available time for sysadmins on short notice (the situation was not urgent yet but it soon would have been, and after today I won’t be available for a while to help my sysadmin team-mates deal with a part of our infra they’re not familiar with).

I won’t argue that we picked the best trade-off we could have because I was not sure back then, and given your reaction, I’m even more in doubt now.
Regardless, I take responsibility for the choice we made (zen’s perception of the big picture may not have been clear enough for him to feel he could provide an informed opinion).

Your reaction makes me wish I had rushed this less, and instead asked more people for input, so we could better balance the pros/cons of each option. I regret this.

Given the APT indices for the Vagrant box used to build 4.0 are still in place (somewhat luckily, we had no time to finish during our scheduled session earlier today), we could fix rebuilding 4.0, for example:

  1. build a list of the needed packages and which APT repo they came from
  2. reinject these packages into the pool + add references to them in the reprepro database
  3. create a ticket with all the info needed to remove them again at some point in the future, unless the references created above already ensure this will happen automatically at an appropriate time
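
For illustration, step 1 could be sketched roughly as follows. This is only a sketch, not the tooling that was actually used: it assumes the list of missing packages is derived from the `W: Couldn't download package …` warnings in a debootstrap log like the one quoted in #4, and that snapshot URLs follow the `<repo>/<serial>/pool/…` layout seen there. The function name `missing_packages` and the returned field names are hypothetical.

```python
import re
from urllib.parse import urlparse

# Matches debootstrap warnings of the form quoted earlier in this ticket:
# W: Couldn't download package adduser (ver 3.115 arch all) at http://.../pool/...
MISSING_RE = re.compile(
    r"^W: Couldn't download package (?P<name>\S+) "
    r"\(ver (?P<version>\S+) arch (?P<arch>\S+)\) at (?P<url>\S+)$"
)


def missing_packages(log_text):
    """Return one dict per package that debootstrap failed to download.

    Assumption: the URL path looks like /<repo>/<serial>/pool/..., as in
    the log above; other layouts are skipped rather than guessed at.
    """
    found = []
    for line in log_text.splitlines():
        match = MISSING_RE.match(line.strip())
        if match is None:
            continue
        parts = urlparse(match.group("url")).path.strip("/").split("/")
        if len(parts) < 4 or parts[2] != "pool":
            continue  # unexpected layout: do not guess
        found.append({
            "name": match.group("name"),
            "version": match.group("version"),
            "arch": match.group("arch"),
            "repo": parts[0],                 # which APT repo it came from
            "serial": parts[1],               # snapshot serial
            "pool_path": "/".join(parts[2:]), # path relative to the snapshot root
        })
    return found
```

The pool path is taken from the URL rather than rebuilt from the package name, since the pool directory depends on the source package and cannot be derived reliably from the binary package name alone.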

I estimate that realistically, this could take something between 1h and 3h of work.
Do you think repairing this use case is worth me putting this time there, as opposed to somewhere else in Tails land?
Or is this something you would like to work on, e.g. by preparing scripts that will automate some of these steps?

#6 Updated by intrigeri 2019-10-31 18:06:16

@CyrilBrulebois, hi!

intrigeri wrote:
> Given the APT indices for the Vagrant box used to build 4.0 are still in place (somewhat luckily, we had no time to finish during our scheduled session earlier today), we could fix rebuilding 4.0, for example: […]

> I estimate that realistically, this could take something between 1h and 3h of work.
> Do you think repairing this use case is worth me putting this time there, as opposed to somewhere else in Tails land?
> Or is this something you would like to work on, e.g. by preparing scripts that will automate some of these steps?

FTR: I know you’re busy elsewhere this week, so I’ll wait another week in the hope it allows you to follow up here :)

#7 Updated by CyrilBrulebois 2019-11-14 09:31:10

Right, I’ve been busy with other topics.

Having the indices around is indeed good news, and reimporting packages looks good to me.

Just to let others know: I’ve just provided intrigeri with a bundle of (hopefully) all the needed packages for the debootstrap step. Once that’s been reimported, I should be getting more errors with extra packages installed and/or upgraded, and I’d expect another round to possibly be sufficient…

At least that’s what I thought initially. But vagrant/definitions/tails-builder/postinstall.sh has a bunch of different `apt-get` calls, and we might need up to 6 different imports… Unless I manage to divert the traffic to a box of mine, where I could copy all indices and incrementally import packages; that would allow me to get a big set of packages to import at once.

Let’s see how the “merge the packages for debootstrap” step goes first? I’d expect it to be probably saner for me to try and do the DNS hijack dance on my own than having both intrigeri and me play ping-pong every few (dozen) minutes…

#8 Updated by CyrilBrulebois 2019-11-14 20:29:55

Update after today’s work session:

* A tarball was sent to intrigeri with everything I needed to successfully rebuild the basebox. This was done by diverting the time-based. traffic to a box of mine with nothing to start with, adding indices, then packages incrementally.
* Once the basebox was created, I stumbled upon Bug #16607 very early in the build process, but the chmod workaround made this issue go away.
* Then, I was hitting `I: Updating debian-security APT source...`, which failed because I didn’t have any trace file for debian-security (or any other repository). Adding the current one doesn’t work, given this serial (from November) has no matching snapshot for stretch (because of the clean-up). Faking the reference snapshot (from October) there fixed this particular issue → Archive serial: 2019100904
* Finally, I had apt-snapshots-serials prepare-build failing because of other missing trace files. Using Archive serial: foo for torproject/project/trace/torproject and Archive serial: 2019100904 for both debian and debian-security led to a successful build, matching the published images.
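
To make the trace file workaround above more concrete, here is a minimal sketch. It assumes that the only part of a trace file that matters for this purpose is its `Archive serial:` line, which is all that is quoted above; real trace files may carry more fields. The helper names `read_archive_serial` and `fake_trace` are hypothetical and not part of the Tails tooling.

```python
import re

# Only the "Archive serial:" line is considered here (assumption, see above).
SERIAL_RE = re.compile(r"^Archive serial:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def read_archive_serial(trace_text):
    """Return the serial found in a trace file's text, or None if absent."""
    match = SERIAL_RE.search(trace_text)
    return match.group(1) if match else None


def fake_trace(serial):
    """Render a minimal stand-in trace file that pins the given serial."""
    return f"Archive serial: {serial}\n"


# Serials reported above as leading to a successful build.
pinned = {
    "debian": "2019100904",
    "debian-security": "2019100904",
    "torproject": "foo",
}
fake_traces = {repo: fake_trace(serial) for repo, serial in pinned.items()}
```

Where each generated file has to be placed on the diverted mirror is not shown here; the only path given above is torproject/project/trace/torproject.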

#9 Updated by intrigeri 2019-11-16 09:46:42

Here’s a summary of where we landed after today’s session.

Here is what it would take to allow independent build reproducibility checks of the 4.0 images:

  • import debs from kibi’s tarballs into our time-based snapshots
  • generate and run corresponding reprepro _addreferences commands
  • start taking (again) snapshots of stretch in the debian-security repo: setup-tails-builder updates the basebox to the “latest” snapshot of debian-security
  • the reproducer must pass an appropriate APT_SNAPSHOTS_SERIALS environment variable

I’d like to sleep on it and I’m fine with us deciding after a day or 3, with a fresh mind, whether I do these next steps or not.
(Keeping in mind that the longer we wait, the closer we are to 4.1, and the lower the chances that someone wants to reproduce 4.0.)

In any case, I’ve added a warning in the corresponding doc to ensure I/we don’t make the same mistake again.

#10 Updated by intrigeri 2019-11-29 10:46:22

  • Target version changed from Tails_4.1 to Tails_4.2

intrigeri wrote:
> I’d like to sleep on it and I’m fine with us deciding after a day or 3, with a fresh mind, whether I do these next steps or not.
> (Keeping in mind that the longer we wait, the closer we are to 4.1, and the lower the chances that someone wants to reproduce 4.0.)

With 4.1 being scheduled in 4 days, let’s drop the ball. Sorry again :/

So next step is to finish cleaning up Stretch leftovers.

#11 Updated by intrigeri 2019-12-13 09:43:16

  • Status changed from In Progress to Resolved
  • Assignee deleted (intrigeri)

intrigeri wrote:
> With 4.1 being scheduled in 4 days, let’s drop the ball. Sorry again :/
>
> So next step is to finish cleaning up Stretch leftovers.

Done, and improved the doc a little bit: it did not cover one corner case.