Feature #12002

Estimate hardware cost of reproducible builds in Jenkins

Added by bertagaz 2016-11-28 08:00:34. Updated 2017-07-05 10:37:13.

Status:
Resolved
Priority:
High
Assignee:
Category:
Infrastructure
Target version:
Start date:
2016-11-28
Due date:
% Done:

100%

Feature Branch:
Type of work:
Research
Starter:
Affected tool:
Deliverable for:
289

Description

Adding reproducible builds in Jenkins will certainly require more hardware resources (disk space and RAM at least). Before deploying it for real, we need to estimate how much and take action if necessary.


Subtasks


Related issues

Related to Tails - Feature #12576: Have Jenkins use basebox:clean_old instead of basebox:clean_all Resolved 2017-05-22
Related to Tails - Bug #12595: Not enough space in /var/lib/jenkins on isobuilders Resolved 2017-05-25
Related to Tails - Bug #12599: /var/lib/libvirt/images gets filled on isobuilders Resolved 2017-05-25
Related to Tails - Bug #12725: Sort out the apt-snapshots-disk partition situation on apt.lizard Resolved 2017-06-16
Related to Tails - Bug #13177: Sort out the bitcoin-disk partition situation on lizard Resolved 2017-06-27
Blocks Tails - Feature #11806: Update server storage planning needs for at least 2017 Resolved 2016-09-19

History

#1 Updated by bertagaz 2017-04-05 13:33:51

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10

Let’s assume that a basebox is around 300M (higher than it actually is, but safer this way). Let’s also assume that each year there is a period when we host 2 baseboxes (when we update it), plus maybe one more when we need to change the build system (as a safety margin in the estimation). Does that seem sound?

Based on that, a first partial estimation would be:

* 3 x 300M -> 1G
* 20G per isobuilder for the basebox build process -> 80G

So let’s say 100G in total for the basebox hosting/building.
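
To make that arithmetic explicit, here is a minimal sketch in Python (every value is one of the deliberately rounded-up assumptions above, not a measurement):

    # Minimal sketch of the basebox storage estimate; every value is a
    # rounded-up assumption from this comment.
    basebox_size_gib = 0.3      # ~300M per basebox
    baseboxes_per_year = 3      # 2 around each update + 1 safety margin
    isobuilders = 4
    build_scratch_gib = 20      # scratch space per isobuilder for a basebox build

    hosting = baseboxes_per_year * basebox_size_gib    # ~0.9G, rounded up to 1G
    building = isobuilders * build_scratch_gib         # 80G
    print(f"~{hosting:.1f}G hosting + {building}G building -> ~100G with margin")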

Now we have to consider that we need to keep the same number of partial APT snapshots as baseboxes. For now we don’t really have a way to estimate how much disk space that means, as we have no mechanism in the basebox build system to produce a list of the Debian packages needed for that. I’ll file a ticket about that. In the meantime, I’ll do a basebox build with an empty APT cache so that we get an idea.

#2 Updated by intrigeri 2017-04-05 17:19:53

> Let’s also assume that each year there is a period when we host 2 baseboxes (when we update it), plus maybe one more when we need to change the build system (as a safety margin in the estimation). Does that seem sound?

This seems to implicitly assume at least:

  • that we’re going to host baseboxes in some central place (unclear given Feature #12409)
  • that we’re going to bother garbage collecting old baseboxes: IIRC last time we discussed this, our conclusion was that it was not worth the effort

I suggest you write down your assumptions explicitly, so that we can ensure we update the estimates if/when these assumptions change; more generally, it’ll help us understand where our conclusions come from.

> Now we have to consider that we need to keep the same number of partial APT snapshots as baseboxes. For now we don’t really have a way to estimate how much disk space that means, as we have no mechanism in the basebox build system to produce a list of the Debian packages needed for that. I’ll file a ticket about that. In the meantime, I’ll do a basebox build with an empty APT cache so that we get an idea.

It seems that when you write “partial APT snapshots”, you mean “tagged APT snapshots” (given you’re discussing the list of needed Debian packages). Now I’m lost: do we have plans to actually generate such snapshots? I see no such thing on the blueprint, and I don’t recall any discussion about it. The blueprint says we’re going to use frozen (but not tagged) APT snapshots, and that we’ll keep them around for 6 months. So if I’m remembering + reading things right, generating a list of packages installed when creating a basebox won’t teach us anything useful. Please clarify if I got some of this wrong.

#3 Updated by bertagaz 2017-04-06 08:48:05

  • Assignee changed from bertagaz to intrigeri
  • QA Check set to Info Needed

intrigeri wrote:
> This seems to implicitly assume at least:
>
> * that we’re going to host baseboxes in some central place (unclear given Feature #12409)

Right, so that’s:

* 4 x 3 x 300M -> 4G
* 4 x 20G -> 80G

-> still roughly around 100G

> * that we’re going to bother garbage collecting old baseboxes: IIRC last time we discussed this, our conclusion was that it was not worth the effort

Well, in Feature #12409#note-15, we’re discussing a rake basebox:clean_old option, about which you ask why we can’t run it on every build. Unless I’m confused, that’s precisely its role.

Not that I absolutely want it to be implemented, but I’m wondering: once we have that, our beloved sysadmins won’t have to regularly grow the libvirt partition of the isobuilders -> less maintenance for us. It’s also useful for people building Tails not to get their libvirt partition bloated with old baseboxes. So unless that’s very costly, I see some advantages to having that.

> I suggest you write down your assumptions explicitly, so that we can ensure we update the estimates if/when these assumptions change; more generally, it’ll help us understand where our conclusions come from.

> It seems that when you write “partial APT snapshots”, you mean “tagged APT snapshots” (given you’re discussing the list of needed Debian packages).

Yes.

> Now I’m lost: do we have plans to actually generate such snapshots? I see no such thing on the blueprint, and I don’t recall any discussion about it. The blueprint says we’re going to use frozen (but not tagged) APT snapshots, and that we’ll keep them around for 6 months. So if I’m remembering + reading things right, generating a list of packages installed when creating a basebox won’t teach us anything useful. Please clarify if I got some of this wrong.

There hasn’t been such a formal discussion. I just remember a reaction of yours when we quickly realized we forgot to count the APT snapshot, and that a frozen one is quite huge in terms of disk space. So I’ve been bold and thought that maybe we’ll want a tagged snapshot (hence the ticket I mention about generating a manifest of the list of Debian packages). Now I don’t especially want it, so if you think that’s overkill, I’m fine.

#4 Updated by intrigeri 2017-04-06 10:45:33

  • Assignee changed from intrigeri to bertagaz

Hi!

> intrigeri wrote:
>> * that we’re going to bother garbage collecting old baseboxes: IIRC last time we discussed this, our conclusion was that it was not worth the effort

> Well, in Feature #12409#note-15, we’re discussing a rake basebox:clean_old option, about which you ask why we can’t run it on every build. Unless I’m confused, that’s precisely its role.

This second comment of mine was based on your previous implicit assumption that “we’re going to host baseboxes in some central place”: if we did that, then your numbers work only if we garbage collect baseboxes stored in that central place, which is not what rake basebox:clean_old is about, so it doesn’t come for free. But you implicitly dropped this assumption in the comment of yours I’m replying to (and I agree), so my comment is now obsolete :)

> Not that I absolutely want it to be implemented, but I’m wondering: once we have that, our beloved sysadmins won’t have to regularly grow the libvirt partition of the isobuilders -> less maintenance for us. It’s also useful for people building Tails not to get their libvirt partition bloated with old baseboxes. So unless that’s very costly, I see some advantages to having that.

Absolutely, as agreed on Feature #12409 already.

>> Now I’m lost: do we have plans to actually generate such snapshots? I see no such thing on the blueprint, and I don’t recall any discussion about it. The blueprint says we’re going to use frozen (but not tagged) APT snapshots, and that we’ll keep them around for 6 months. So if I’m remembering + reading things right, generating a list of packages installed when creating a basebox won’t teach us anything useful. Please clarify if I got some of this wrong.

> There hasn’t been such a formal discussion. I just remember a reaction of yours when we quickly realized we forgot to count the APT snapshot, and that a frozen one is quite huge in terms of disk space.

This does ring a bell, indeed :)

> So I’ve been bold and thought that maybe we’ll want a tagged snapshot (hence the ticket I mention about generating a manifest of the list of Debian packages). Now I don’t especially want it, so if you think that’s overkill, I’m fine.

Well, it’s not really a matter of what I want or not, and sadly my guessing skills are limited :)

IMO this discussion is premature as we have no data that shows us that there’s a real problem to solve, especially given the solution to this potential problem can be quite hard to get right, and can have a number of drawbacks.

To get you started: assuming the only frozen APT sources we need to build/upgrade a basebox are Debian stable + backports, then keeping a 6-months-old time-based snapshot costs us (in terms of storage) the total size of packages that were in Debian 6 months ago, and are not anymore, because we store the common ones and the newer ones already anyway. I see at least two ways to get this data:

  • simply try: bump by N months Valid-Until for the oldest time-based snapshot of the debian repository we have, and monitor the disk space that’s saved when this snapshot is automatically garbage collected
    • pros: gives us accurate data, modulo the fact that Debian is currently frozen, so there’s less upload churn than usual
    • cons: we need to wait a bit (the oldest snapshot we currently have is only 3 months old), and to be very reactive when it’s time to gather the data (or add some cronjob to store disk usage information somewhere); thankfully we don’t need this data so urgently, and I suspect that you can be plenty busy with your other reproducible builds tasks until the N months have passed
  • do some clever computation, either using reprepro dumpreferences (same as above, gives us data that’s as relevant as the age of our oldest snapshot), or using actual APT indices (e.g. from snapshot.d.o, so we can immediately tell what it would cost us today to store a 6-months-old snapshot, and we can even tell the same for some time in the past outside of a Debian freeze); in both cases some relatively simple scripting is required (see the sketch below)
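
As a rough illustration of the second option, here is a minimal sketch in Python (the index file names are hypothetical; the idea is to fetch two uncompressed Packages indices, e.g. from snapshot.d.o, and sum the sizes of packages present in the old one but gone from the new one):

    #!/usr/bin/env python3
    # Minimal sketch: estimate the extra storage cost of keeping an old
    # snapshot, i.e. the total size of packages that were in the old
    # Packages index but are no longer in the new one.

    def parse_packages(path):
        """Map pool filename -> size (bytes) for one uncompressed Packages index."""
        entries, filename, size = {}, None, None
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.startswith("Filename: "):
                    filename = line.split(": ", 1)[1].strip()
                elif line.startswith("Size: "):
                    size = int(line.split(": ", 1)[1])
                elif not line.strip():  # blank line ends a stanza
                    if filename and size is not None:
                        entries[filename] = size
                    filename, size = None, None
        if filename and size is not None:  # last stanza, if no trailing blank line
            entries[filename] = size
        return entries

    old = parse_packages("Packages-6-months-ago")  # hypothetical file names
    new = parse_packages("Packages-today")
    extra = sum(size for fn, size in old.items() if fn not in new)
    print(f"extra storage for the old snapshot: {extra / 2**30:.1f} GiB")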

Once we have this data, we can decide whether it’s worth doing tagged snapshots or not: doing it would require spending quite some additional developers’ time, and would either add some complexity to basebox building & upgrading (having different sources depending on whether it uses a tagged or time-based set of snapshots), or add some painful limitations and lack of flexibility (if we only support tagged snapshots). I suspect that storage (time-based snapshots) will be cheaper than developers’ time (tagged snapshots), but as I said it’s only a guess and not relevant.

Now that the expectations have been clarified, I’m going to shut up (unless you need more info from me) and let you do your job :)

#5 Updated by intrigeri 2017-04-18 15:30:27

  • Description updated
  • QA Check deleted (Info Needed)
  • Deliverable for set to 289

(Added RAM to the list of things to evaluate, as you raised this on another ticket.)

#6 Updated by intrigeri 2017-05-24 07:00:45

  • blocked by Bug #12574: isobuilders system_disks check keeps switching between OK and WARNING since the switch to Vagrant added

#7 Updated by intrigeri 2017-05-24 07:01:01

  • blocked by Feature #12576: Have Jenkins use basebox:clean_old instead of basebox:clean_all added

#8 Updated by intrigeri 2017-05-24 07:04:05

  • blocks Feature #11806: Update server storage planning needs for at least 2017 added

#9 Updated by intrigeri 2017-05-25 06:59:22

  • blocked by Bug #12595: Not enough space in /var/lib/jenkins on isobuilders added

#10 Updated by intrigeri 2017-05-27 09:00:05

  • blocks deleted (Feature #12576: Have Jenkins use basebox:clean_old instead of basebox:clean_all)

#11 Updated by intrigeri 2017-05-27 09:00:41

  • blocked by Feature #12576: Have Jenkins use basebox:clean_old instead of basebox:clean_all added

#12 Updated by intrigeri 2017-06-01 06:39:56

  • blocks deleted (Bug #12595: Not enough space in /var/lib/jenkins on isobuilders)

#13 Updated by intrigeri 2017-06-01 06:41:22

  • blocks deleted (Feature #12576: Have Jenkins use basebox:clean_old instead of basebox:clean_all)

#14 Updated by intrigeri 2017-06-01 06:41:44

  • related to Feature #12576: Have Jenkins use basebox:clean_old instead of basebox:clean_all added

#15 Updated by intrigeri 2017-06-01 06:41:58

  • related to Bug #12595: Not enough space in /var/lib/jenkins on isobuilders added

#16 Updated by intrigeri 2017-06-01 06:42:12

  • related to Bug #12599: /var/lib/libvirt/images gets filled on isobuilders added

#17 Updated by intrigeri 2017-06-01 06:43:46

(Relaxed relationship with related tickets: the research leading to the numbers we need might happen here, or on other tickets that are about specific issues. If the former, then this ticket would block the others; if the latter, then it’s the opposite. So let’s stick to “Related to” for now.)

#18 Updated by intrigeri 2017-06-01 06:44:35

  • Target version changed from 2017 to Tails_3.2

(Our deadline is before the end of the year.)

#19 Updated by intrigeri 2017-06-01 09:07:02

Wrt. the storage space needed for APT snapshots used by the build boxes, I think the problem has been greatly simplified now that updating the basebox is part of our release process, and we use the same snapshots as in the ISO.

#20 Updated by bertagaz 2017-06-01 14:43:20

  • Target version changed from Tails_3.2 to Tails_3.0

#21 Updated by bertagaz 2017-06-04 13:45:47

  • Assignee changed from bertagaz to intrigeri
  • Target version changed from Tails_3.0 to Tails_3.2
  • QA Check set to Info Needed

I’ve made another estimation, now that things have settled a bit. There are still some things to discuss/evaluate. I’ve grown the numbers a bit compared to what we currently have, to get some headroom. Please have a first look at the numbers that are already known, check whether they make sense to you, and share any input you have on the remaining open questions. (A small totals sketch closes this comment.)

Disk space

root

Was: 4G

Add 500M to have some margin for system upgrades. This won’t grow much in the future: only for the Buster upgrade, which may be quite far away if we use Stretch LTS.

-> 5G (+1G * 4)

/var/lib/jenkins

Was: 6G

  • 13 baseboxes: 13 * 1.5G ~=> 20G
  • artifacts: 5G
  • 1 basebox build: 25G

-> 50G (+44G * 4)

/var/lib/libvirt/images

Was: none

  • 1 basebox: 1.5G
  • 1 snapshot: 2G

-> 5G (+5G * 4)

Artifacts

Reproducible builds will add one ISO each time a build fails. Difficult to
guess how often it will happen. Let’s consider 50% of the time worst case?

-> Count number of base branches artifacts we keep now and multiply.

time-based APT snapshots

Probably one or two to keep occasionally, when we update the basebox in the middle of the 4 months. Size unknown for now; would need evaluation.

Memory

Current: 14.5G * 4

We did not bump it a lot in the past; the biggest bump was due to Vagrant itself, but it should stabilize in the future. Maybe count 20G to have some margin still?

CPUs

We use 4 per isobuilder.

Whether this will grow depends on the question below.

Will we need more isobuilders?

With the reproducible builds we may get a slower turnaround, meaning we may
want to add more isobuilders in the future if it gets too slow.

-> Need to evaluate how much it may delay the output we get, meaning
having a look at the number of base branch builds in the past, or maybe
assuming we’ll want N more?

1 more isobuilder requires:

  • 4 CPUs
  • 14.5G RAM
  • 60G HDD

#22 Updated by bertagaz 2017-06-05 12:20:03

I’ve adapted the above numbers with what was decided on Bug #12574#note-6 regarding the isobuilders’ root partition.

#23 Updated by bertagaz 2017-06-05 12:27:44

bertagaz wrote:
> h3. Artifacts
>
> Reproducible builds will add one ISO each time a build fails. Difficult to
> guess how often it will happen. Let’s consider 50% of the time worst case?
>
> -> Count number of base branches artifacts we keep now and multiply.
>
> h2. Will we need more isobuilders?
>
> -> Need to evaluate how much it may delay the output we get, meaning
> having a look at the number of base branch builds in the past, or maybe
> assuming we’ll want N more?

So here are the numbers of base branch builds (stable + devel + testing + feature/stretch) in the past:

month  : base_branches / total
---
2015-02: 86  / 263 
2015-03: 134 / 358 
2015-04: 92  / 351 
2015-05: 147 / 268 
2015-06: 91  / 248 
2015-07: 50  / 101 
2015-08: 92  / 195 
2015-09: 87  / 470 
2015-10: 106 / 935 
2015-11: 105 / 611 
2015-12: 117 / 809
2016-01: 154 / 757
2016-02: 126 / 603
2016-03: 89  / 839
2016-04: 85  / 779
2016-05: 143 / 1055
2016-06: 146 / 1022
2016-07: 126 / 1178
2016-08: 128 / 658
2016-09: 141 / 404
2016-10: 147 / 575
2016-11: 145 / 482
2016-12: 151 / 476
2017-01: 172 / 625
2017-02: 159 / 590
2017-03: 239 / 674
2017-04: 217 / 638
2017-05: 161 / 478

Some datapoints:

  • Tails 2.0 was released at the end of January 2016
  • reproducible build jobs (at least for feature/stretch) were added at the end of November 2016
  • porting to Stretch started in August 2016
  • the high peaks from May to July 2016 were mostly due to a lot of work happening on the test suite side: tagging a lot of tests as fragile and fixing a lot of them.
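
For whoever wants to play with these numbers, a minimal sketch computing the base-branch share per month (values copied from the table above; only the last few months shown):

    # Minimal sketch: what fraction of all builds ran on base branches?
    data = {  # month: (base_branch_builds, total_builds), from the table above
        "2017-03": (239, 674),
        "2017-04": (217, 638),
        "2017-05": (161, 478),
    }
    for month, (base, total) in sorted(data.items()):
        print(f"{month}: {base / total:.0%} of builds were on base branches")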

#24 Updated by intrigeri 2017-06-14 19:12:42

> h3. /var/lib/jenkins

> artifacts 5G

What’s this about?

> h3. Artifacts

> Reproducible builds will add one ISO each time a build fails.

I’ll assume below you instead mean “each time a build is not reproducible”. Correct? Otherwise, please clarify as I don’t get it.

> Difficult to guess how often it will happen. Let’s consider 50% of the time worst case?

I seriously hope that we won’t be breaking reproducible builds that commonly. Let’s say 20%?

> -> Count number of base branches artifacts we keep now and multiply.

This seems to be based on the implicit assumption that we’re going to do reproducibility testing on all active branches: I agree we should, because we should not merge branches that break reproducibility (once our ISO is reproducible); but I don’t remember any ticket about this. Please create one if we have none, so we base our hardware cost estimates on actual plans :)

Once this is clarified, I agree with the proposed way of calculating this.
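
A minimal sketch of that calculation, once we have the inputs (every number below is a placeholder, except the 20% worst-case guess above):

    # Minimal sketch of the proposed artifact storage estimate; the two
    # "hypothetical" inputs must be replaced with real counts/sizes.
    kept_base_branch_artifacts = 40   # hypothetical: artifacts we keep today
    non_repro_rate = 0.20             # the 20% worst-case guess above
    iso_size_gib = 1.2                # hypothetical: size of one extra ISO
    extra = kept_base_branch_artifacts * non_repro_rate * iso_size_gib
    print(f"extra artifact storage: ~{extra:.0f}G")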

> h3. time-based APT snapshots

> Probably one or two to keep occasionally, when we update the basebox in the middle of the 4 months. Size unknown for now; would need evaluation.

Err, we currently keep the snapshots used for major releases for 6 months, not 4:

6. Make it so the time-based APT repository snapshots are kept around
long enough, by bumping their `Valid-Until` to 6 months from now:
[[APT_repository/time-based_snapshots#bump-expiration-date-for-all-snapshots]]

Please check if that’s still relevant, or if $someone forgot to update release_process.mdwn when we decided to keep baseboxes for 4 months only.

> h2. Memory

> Current: 14.5G * 4

> We did not bump it a lot in the past; the biggest bump was due to Vagrant itself,

Here, we need to account for the additional memory we already had to allocate when switching to Vagrant: it’s been taken on our spare reserve and has already forced us to do some trade-offs, so it doesn’t come for free.

> but it should stabilize in the future.

I wonder what makes you think so. Every time we work on porting Tails to a new version of Debian, we have to bump the amount of RAM needed for in-memory builds. I suggest you look at this growth rate and take it into account.
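
A minimal sketch of what looking at this growth rate could mean (the historical values are placeholders; only the method is suggested):

    # Minimal sketch: linear extrapolation of per-isobuilder RAM needs.
    ram_gib = {2015: 12.0, 2016: 13.0, 2017: 14.5}  # hypothetical history
    years = sorted(ram_gib)
    rate = (ram_gib[years[-1]] - ram_gib[years[0]]) / (years[-1] - years[0])
    for horizon in (1, 2, 3):
        projected = ram_gib[years[-1]] + rate * horizon
        print(f"{years[-1] + horizon}: ~{projected:.1f}G per isobuilder")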

> h2. Will we need more isobuilders?

> With the reproducible builds we may get a slower turnaround,

I guess this again implicitly relies on the assumption that we’re going to do reproducibility testing on all active branches. Correct?

With this assumption in mind, well, “may” seems half-assed: we’re simply gonna build twice as many ISO images, so it’ll definitely make the feedback loop longer.

> -> Need to evaluate how much it may delay the output we get, meaning having a look at the number of base branch builds in the past, or maybe assuming we’ll want N more?

Do you mean “active” branches instead of “base” branches?

Anyway, I think we can assume that building twice as many ISO images without affecting the developer/RM experience too much roughly requires doubling the throughput of our CI for ISO builds. This can be done in various ways, by making builds faster and/or having more isobuilders. I propose you simply add this info to the blueprint for Bug #11680, and we don’t bother trying to find a more precise estimate here about the needed throughput increase. OK?

#25 Updated by intrigeri 2017-06-14 19:15:07

  • Assignee changed from intrigeri to bertagaz
  • QA Check changed from Info Needed to Dev Needed

> So here are the numbers of base branch builds (stable + devel + testing + feature/stretch) in the past:

I’m sorry I don’t understand how it’s relevant here. Can you please clarify?

This being said, great job, congrats!

#26 Updated by intrigeri 2017-06-14 19:18:32

Also, it would be nice if this was done during the 3.1 cycle, if you can: this ticket is blocking Feature #11806, which has become somewhat urgent now that we’ve allocated space here and there… including to handle the unplanned needs of the reproducible builds system. I didn’t look closely, but I think we won’t have enough space anymore to grow storage volumes to match our needs in a couple months, and I would dislike having to delete more isotesters/isobuilders again.

#27 Updated by bertagaz 2017-06-15 09:39:40

  • blocked by Feature #12715: Decide what builds we will try to reproduce in Jenkins added

#28 Updated by bertagaz 2017-06-15 09:53:01

intrigeri wrote:
> > h3. /var/lib/jenkins
>
> > artifacts 5G
>
> What’s this about?

We need disk space in this partition to retrieve the two ISOs.

Now I agree that the disk space we have for building the basebox should be enough, without adding more for the artifacts. So let’s forget this.

> > h3. Artifacts
>
> > Reproducible builds will add one ISO each time a build fails.
>
> I’ll assume below you instead mean “each time a build is not reproducible”. Correct? Otherwise, please clarify as I don’t get it.

Yes, that’s what I meant.

>
> > Difficult to guess how often it will happen. Let’s consider 50% of the time worst case?
>
> I seriously hope that we won’t be breaking reproducible builds that commonly. Let’s say 20%?

As you wish. I don’t know how to estimate this right now.

> > -> Count number of base branches artifacts we keep now and multiply.
>
> This seems to be based on the implicit assumption that we’re going to do reproducibility testing on all active branches: I agree we should, because we should not merge branches that break reproducibility (once our ISO is reproducible); but I don’t remember any ticket about this. Please create one if we have none, so we base our hardware cost estimates on actual plans :)

Ok, in my mind, the assumption was: we try to reproduce base branches only, as we’ve discussed setting up reproducible jobs for these branches only for now. I’ve created Feature #12715 to discuss that and decide something.

> Once this is clarified, I agree with the proposed way of calculating this.
>
> > h3. time-based APT snapshots
>
> > Probably one or two to keep occasionally, when we update the basebox in the middle of the 4 months. Size unknown for now; would need evaluation.
>
> Err, we currently keep the snapshots used for major releases for 6 months, not 4:
>
> […]
>
> Please check if that’s still relevant, or if $someone forgot to update release_process.mdwn when we decided to keep baseboxes for 4 months only.

Right, I think the release process is not up to date: we’ve decided since then to bump these snapshots at every release, so we thought 4 months were enough.

> > h2. Memory
>
> > Current: 14.5G * 4
>
> > We did not bump it a lot in the past; the biggest bump was due to Vagrant itself,
>
> Here, we need to account for the additional memory we already had to allocate when switching to Vagrant: it’s been taken on our spare reserve and has already forced us to do some trade-offs, so it doesn’t come for free.
>
> > but it should stabilize in the future.
>
> I wonder what makes you think so. Every time we work on porting Tails to a new version of Debian, we have to bump the amount of RAM needed for in-memory builds. I suggest you look at this growth rate and take it into account.

Ack.

> > h2. Will we need more isobuilders?
>
> > With the reproducible builds we may get a slower turnaround,
>
> I guess this again implicitly relies on the assumption that we’re going to do reproducibility testing on all active branches. Correct?

Nope, I was assuming we’ll reproduce base branches only.

> Anyway, I think we can assume that building twice as many ISO images without affecting the developer/RM experience too much roughly requires doubling the throughput of our CI for ISO builds. This can be done in various ways, by making builds faster and/or having more isobuilders. I propose you simply add this info to the blueprint for Bug #11680, and we don’t bother trying to find a more precise estimate here about the needed throughput increase. OK?

Let’s see how Feature #12715 goes.

> > So here are the numbers of base branch builds (stable + devel + testing + feature/stretch) in the past:
> I’m sorry I don’t understand how it’s relevant here. Can you please clarify?

That’s because I was considering reproducing base branches only.

> Also, it would be nice if this was done during the 3.1 cycle, if you can: this ticket is blocking Feature #11806, which has become somewhat urgent now that we’ve allocated space here and there… including to handle the unplanned needs of the reproducible builds system. I didn’t look closely, but I think we won’t have enough space anymore to grow storage volumes to match our needs in a couple months, and I would dislike having to delete more isotesters/isobuilders again.

That was my intent.

#29 Updated by intrigeri 2017-06-19 11:37:34

bertagaz wrote:
> intrigeri wrote:
>> > h3. /var/lib/jenkins
>>
>> > artifacts 5G
>>
>> What’s this about?

> We need disk space in this partition to retrieve the two ISOs.

OK.

> Now I agree that the disk space we have for building the basebox should be enough, without adding more for the artifacts. So let’s forget this.

ACK.

Note: please move the current state of your thoughts to a blueprint, as having to read the entire ticket history + apply each incremental change to understand what’s the current proposal has already become too painful. But we can/should still discuss changes here :)

>> > h3. Artifacts
>>
>> > Reproducible builds will add one ISO each time a build fails.
>>
>> I’ll assume below you instead mean “each time a build is not reproducible”. Correct? Otherwise, please clarify as I don’t get it.

> Yes, that’s what I meant.

OK, good. Please update this in the proposal once it’s been moved to a blueprint.

>> > Difficult to guess how often it will happen. Let’s consider 50% of the time worst case?
>>
>> I seriously hope that we won’t be breaking reproducible builds that commonly. Let’s say 20%?

> As you wish. I don’t know how to estimate this right now.

If we wanted to do any kind of serious estimate, we could start by looking at the history of the Feature #5630 branch builds and see how often we’ve broken it by merging other branches. But I doubt it’s worth the hassle at this point.

>> > h3. time-based APT snapshots
>>
>> > Probably one or two to keep occasionally, when we update the basebox in the middle of the 4 months. Size unknown for now; would need evaluation.
>>
>> Err, we currently keep the snapshots used for major releases for 6 months, not 4:
>>
>> […]
>>
>> Please check if that’s still relevant, or if $someone forgot to update release_process.mdwn when we decided to keep baseboxes for 4 months only.

> Right, I think the release process is not up to date: we’ve decided since then to bump these snapshots at every release, so we thought 4 months were enough.

Then please ensure this is fixed => ticket

>> > h2. Memory
>>
>> > Current: 14.5G * 4
>>
>> > We did not bump it a lot in the past; the biggest bump was due to Vagrant itself,
>>
>> Here, we need to account for the additional memory we already had to allocate when switching to Vagrant: it’s been taken on our spare reserve and has already forced us to do some trade-offs, so it doesn’t come for free.
>>
>> > but it should stabilize in the future.
>>
>> I wonder what makes you think so. Every time we work on porting Tails to a new version of Debian, we have to bump the amount of RAM needed for in-memory builds. I suggest you look at this growth rate and take it into account.

> Ack.

I’ll let you update this proposal on the blueprint then :)

>> Also, it would be nice if this was done during the 3.1 cycle, if you can: this ticket is blocking Feature #11806, which has become somewhat urgent now that we’ve allocated space here and there… including to handle the unplanned needs of the reproducible builds system. I didn’t look closely, but I think we won’t have enough space anymore to grow storage volumes to match our needs in a couple months, and I would dislike having to delete more isotesters/isobuilders again.

> That was my intent.

Great!

#30 Updated by intrigeri 2017-06-25 11:43:05

  • related to Bug #12725: Sort out the apt-snapshots-disk partition situation on apt.lizard added

#31 Updated by intrigeri 2017-06-25 11:56:30

  • Priority changed from Normal to High
  • Target version changed from Tails_3.2 to Tails_3.1

We’re short on disk space on several partitions, and can’t grow them as much as planned (Feature #11806) due to the Vagrant thing having been deployed without taking the big picture of storage into account. So at least the storage aspect of this ticket has become urgent: the faster it’s done, the earlier we can do Feature #11806 and purchase the storage we need. Feel free to split the storage/memory/CPU aspects into dedicated subtasks if you want to prioritize memory/CPU lower than storage.

#32 Updated by intrigeri 2017-06-27 09:48:04

  • related to Bug #13177: Sort out the bitcoin-disk partition situation on lizard added

#33 Updated by bertagaz 2017-07-01 13:35:14

  • % Done changed from 10 to 20
  • Blueprint set to https://tails.boum.org/blueprint/reproducible_builds/hardware/

intrigeri wrote:
> We’re short on disk space on several partitions, and can’t grow them as much as planned (Feature #11806) due to the Vagrant thing having been deployed without taking the big picture of storage into account. So at least the storage aspect of this ticket has become urgent: the faster it’s done, the earlier we can do Feature #11806 and purchase the storage we need. Feel free to split the storage/memory/CPU aspects into dedicated subtasks if you want to prioritize memory/CPU lower than storage.

I’ve added everything to a blueprint, with updates reflecting our discussions. On the disk side we’re almost done: the remaining estimate to do is the APT snapshots’ size. There’s also an update and a pending question related to the memory part of the estimate.

#34 Updated by intrigeri 2017-07-04 09:44:40

I’ve taken a look and improved the blueprint a bit, please have a look at my changes.

#35 Updated by intrigeri 2017-07-04 09:58:21

  • blocks deleted (Bug #12574: isobuilders system_disks check keeps switching between OK and WARNING since the switch to Vagrant)

#36 Updated by bertagaz 2017-07-04 12:04:34

intrigeri wrote:
> I’ve taken a look and improved the blueprint a bit, please have a look at my changes.

Looks good. So the only remaining item is the APT snapshots. I wonder how realistic it is to settle on 40G for one time-based snapshot, given that’s what the 2.12 one used (as shown in Bug #12725)? If we do, then, as we stated on the blueprint, that would mean 4 * 40G that we’d have to keep during two release cycles.

#37 Updated by intrigeri 2017-07-05 08:41:38

> So the only remaining item is the APT snapshots. I wonder how realistic it is to settle on 40G for one time-based snapshot, given that’s what the 2.12 one used (as
> shown in Bug #12725)?

Yes, but please apply to this number the ratio I’ve computed on Feature #12111, and apply a 1.3 ratio on top of that to account for the growth of the Debian archive.
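
For the record, a minimal sketch of that computation (the Feature #12111 ratio below is a placeholder; plug in the actual value from that ticket):

    # Minimal sketch of the per-snapshot size estimate.
    snapshot_2_12_gib = 40   # what the 2.12 time-based snapshot used (Bug #12725)
    ratio_12111 = 0.5        # hypothetical: the ratio computed on Feature #12111
    archive_growth = 1.3     # account for the growth of the Debian archive
    per_snapshot = snapshot_2_12_gib * ratio_12111 * archive_growth
    print(f"one extra time-based snapshot: ~{per_snapshot:.0f}G")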

> If we do, then, as we stated on the blueprint, that would mean 4 * 40G that we’d have to keep during two release cycles.

The blueprint says 3 extra snapshots, not 4, so perhaps I’m confused or not looking at the right place?
Once this is clarified, this sounds good to me (I’m too lazy to find the reasoning behind this “3” number, which is not on the blueprint, but I’ll trust you have copied the result correctly).

#38 Updated by bertagaz 2017-07-05 09:23:34

  • Assignee changed from bertagaz to intrigeri
  • % Done changed from 20 to 50
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:
> Yes, but please apply to this number the ratio I’ve computed on Feature #12111, and apply a 1.3 ratio on top of that to account for the growth of the Debian archive.

Ack.

> The blueprint says 3 extra snapshots, not 4, so perhaps I’m confused or not looking at the right place?

Nope, I’ve made the mistake of adding the one we already keep, but that’s not necessary as it’s already taken into account. So you’re right, 3 it is.

> Once this is clarified, this sounds good to me (I’m too lazy to find the reasoning behind this “3” number, which is not on the blueprint, but I’ll trust you have copied the result correctly).

Updated the blueprint. I guess we’re good here then, and we can go on with Feature #11806.

#39 Updated by intrigeri 2017-07-05 10:12:47

  • blocks deleted (Feature #12715: Decide what builds we will try to reproduce in Jenkins)

#40 Updated by intrigeri 2017-07-05 10:13:41

Deleted the “blocked by Feature #12715” relationship (as the current estimates are about the “worst” case situation) so we can close this ticket.

#41 Updated by intrigeri 2017-07-05 10:16:07

  • Status changed from In Progress to Resolved
  • QA Check changed from Ready for QA to Pass

>> The blueprint says 3 extra snapshots, not 4, so perhaps I’m confused or not looking at the right place?

> Nope, I’ve made the mistake of adding the one we already keep, but that’s not necessary as it’s already taken into account. So you’re right, 3 it is.

OK, good.

>> Once this is clarified, this sounds good to me (I’m too lazy to find the reasoning
>> behind this “3” number, which is not on the blueprint, but I’ll trust you have
>> copied the result correctly).

> Updated the blueprint. I guess we’re good here then,

Well, no: there was still the snapshots lifetime thing left to handle first, otherwise these estimates are pretty much out of sync with the real world. I see no related ticket created since I asked you to do so (in order to avoid forgetting this bit), so I assume this disappeared from your radar. Tracking this takes me more time / mental space / energy than handling it myself, so I went ahead: please review commit:945027e0d908a3120af873cd3f817ff1c10c5a31.

> and we can go on with Feature #11806.

Yes :)

#42 Updated by intrigeri 2017-07-05 10:19:43

  • Assignee deleted (intrigeri)
  • % Done changed from 50 to 100

#43 Updated by bertagaz 2017-07-05 10:37:13

intrigeri wrote:
> Well, no: there was still the snapshots lifetime thing left to handle first, otherwise these estimates are pretty much out of sync with the real world. I see no related ticket created since I asked you to do so (in order to avoid forgetting this bit), so I assume this disappeared from your radar. Tracking this takes me more time / mental space / energy than handling it myself, so I went ahead: please review commit:945027e0d908a3120af873cd3f817ff1c10c5a31.

Oooch, yes I forgot that. Looks good to me, thanks for the fix.