Bug #8912

Nightly built artifacts for disabled jobs are not removed

Added by intrigeri 2015-02-17 19:34:35 . Updated 2015-08-26 06:09:57 .

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Continuous Integration
Target version:
Start date:
2015-02-17
Due date:
% Done:

100%

Feature Branch:
feature/8912-automatic_disabled_jobs_artifacts_deletion
Type of work:
Sysadmin
Blueprint:

Starter:
0
Affected tool:
Deliverable for:
266

Description

Once we build all active branches, we’ll have lots of stuff on nightly.t.b.o for branches that we don’t build anymore. We can’t allow this to grow endlessly, so artifacts need to be removed once the corresponding branch isn’t built automatically anymore.


Subtasks


Related issues

Related to Tails - Feature #7107: Refactor Puppet code for Jenkins from www.lizard Resolved 2014-04-17
Related to Tails - Feature #9350: Jenkins workspaces for disabled jobs are not removed from ISO builders Resolved 2015-05-06
Related to Tails - Feature #9616: Refactor Puppet bits for the nightly builds NFS Confirmed 2015-06-18
Related to Tails - Bug #10123: Autobuild ISOs cleanup loses all history for topic branches post-release Resolved 2015-08-30
Blocks Tails - Feature #8658: Deploy the "build all active branches" system Resolved 2015-01-09

History

#1 Updated by intrigeri 2015-02-17 19:35:20

  • Assignee set to bertagaz

Assigning to bertagaz to start with, I’ll let you decide with DrWhax who does what exactly.

#2 Updated by bertagaz 2015-04-18 07:44:32

  • % Done changed from 0 to 10

Ok so, we need to run a script, probably with cron, that would delete these directories, after having retrieve the list of Jenkins build_ISO_* jobs.

The canonical place where it would make sense to run it would be on the NFS server, www.lizard.

We have several ways/places to retrieve the current list of jobs:

  • in our jenkins-jobs git repo
  • using the jenkins api

Not sure which one is the best, but the later might be a bit cleaner, compared to using another script that will clone another git repo on another place. We will probably have to deal with the python-jenkinsapi lib at some point anyway.

This script should be deployed by puppet, so our puppet manifests are the place where we should work on it. Probably in a dedicated tails module manifest, so that people who deploy our setup can include it on any node that host their build artifacts, independently from the nightlies web server and NFS part of our setup.

A first iteration of this script should probably take care of targeting only the build_ISO jobs, yet keep in mind that we might want to reuse such a script for another type of artifacts.

#3 Updated by intrigeri 2015-04-26 03:59:12

> The canonical place where it would make sense to run it would be on the NFS server, www.lizard.

Agreed.

> We have several ways/places to retrieve the current list of jobs:

> * in our jenkins-jobs git repo
> * using the jenkins api

> Not sure which one is the best, but the later might be a bit cleaner, compared to using another script that will clone another git repo on another place.

I hope we don’t have to fallback to cloning the jenkins-jobs repo, parsing its YAML, and filtering the ISO build jobs: with templates, projects etc.: if we went that way, I fear we would be rewriting too much of j-j-b, and it’s not clear if we can sanely reuse j-j-b’s functionality (I don’t think it has a stable API aside of the command-line tool).

So I’m totally in favour of using the Jenkins remote API instead. This will require giving read-only access to the jobs configuration via the remote Jenkins API. Is that level of ACL granularity available? I suspect it’ll require to first enable Jenkins internal authentication system, but we’ll have to do that anyway in order to have email notification working the way we want IIRC.

> We will probably have to deal with the python-jenkinsapi lib at some point anyway.

Note that besides python-jenkinsapi, there’s also python-jenkins.

> This script should be deployed by puppet, so our puppet manifests are the place where we should work on it. Probably in a dedicated tails module manifest, so that people who deploy our setup can include it on any node that host their build artifacts,

Agreed.

> independently from the nightlies web server and NFS part of our setup.

I think I’d like to see a tails::jenkins::artifacts_store class, that for now would handle storing and cleaning up artifacts. It would also optionally serve artifacts over HTTP, taking over what we meant to put in a tails::jenkins::nightlies::webserver class (Feature #7107).

Later, tails::jenkins::artifacts_store would also handle serving stuff over NFS to the Jenkins slaves: as I see it, an artifacts store isn’t very useful if it has no way to receive new artifacts (but of course the NFS bits could be opt-out for those who want to split things up in a less fine-grained way locally).

Thoughts?

> A first iteration of this script should probably take care of targeting only the build_ISO jobs, yet keep in mind that we might want to reuse such a script for another type of artifacts.

Right.

#4 Updated by intrigeri 2015-04-26 03:59:24

  • related to Feature #7107: Refactor Puppet code for Jenkins from www.lizard added

#5 Updated by intrigeri 2015-05-06 10:05:51

  • related to Feature #9350: Jenkins workspaces for disabled jobs are not removed from ISO builders added

#6 Updated by intrigeri 2015-05-09 02:00:50

  • Target version changed from Tails_1.4 to Tails_1.4.1

Due date: 05/31/2015

Postponing.

#7 Updated by intrigeri 2015-05-15 14:42:24

  • blocks Feature #8658: Deploy the "build all active branches" system added

#8 Updated by intrigeri 2015-05-29 11:37:57

  • blocks #8668 added

#9 Updated by bertagaz 2015-06-15 07:42:10

  • Status changed from Confirmed to In Progress
  • Assignee changed from bertagaz to intrigeri
  • QA Check set to Ready for QA
  • Feature Branch set to feature/8912-automatic_disabled_jobs_artifacts_deletion

Pushed a branch that implement it with a simple tails::jenkins::artifacts_store manifest and a python script started with cron. Is is ready to get a first round of reviewing. :)

#10 Updated by intrigeri 2015-06-18 06:28:28

  • Assignee changed from intrigeri to bertagaz
  • % Done changed from 10 to 20
  • QA Check changed from Ready for QA to Dev Needed

OK, that’s a pretty nice start. I’m not entirely happy with it, though: here are a few comments.

About the script:

  • Please don’t rely on global variables in removable_artifacts: pass it all the info it needs as arguments. We actually don’t need a temporary variable, and we can directly pass jenkins.Jenkins(args.jenkins_master_url) to it, by the way.
  • In for artifacts in removable_artifacts, the loop variable should be singular, not plural. Actually, IMO it should be called job_artifacts_dir to express more clearly what it’s about. And accordingly removable_artifacts should be called something like removable_artifacts_dirs.
  • It seems that removable_artifacts should only return directories, and filter out anything else. The current implementation will delete unrelated stuff if we want to add e.g. a README or something, some day.
  • The current implementation relies heavily (and silently) on the assumption that all artifact directories are called the same as the job. This should be explicitly written somewhere. And as a safety measure, in case we add other kinds of jobs later, that don’t satisfy this assumption, IMO the script should only work on directories called build_Tails_ISO_*, and ignore anything else. What do you think?

And about the class:

  • It looks like the class doesn’t handle toggling $remove_old_artifacts off if it was previously toggled on. Please make it clean up behind it in this case.
  • Please single-quote arguments passed to the command run by cron. It won’t be perfect, but at least will protect against some dangerous mistakes.
  • Maybe $artifacts_store_user could have a sensible default value?

#11 Updated by bertagaz 2015-06-19 02:53:23

  • Assignee changed from bertagaz to intrigeri
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:
> OK, that’s a pretty nice start. I’m not entirely happy with it, though: here are a few comments.
>
> About the script:
>
> * Please don’t rely on global variables in removable_artifacts: pass it all the info it needs as arguments. We actually don’t need a temporary variable, and we can directly pass jenkins.Jenkins(args.jenkins_master_url) to it, by the way

Right, done.

> * In for artifacts in removable_artifacts, the loop variable should be singular, not plural. Actually, IMO it should be called job_artifacts_dir to express more clearly what it’s about. And accordingly removable_artifacts should be called something like removable_artifacts_dirs.

Thx, I spent some time trying to find sensible names, but wasn’t convinced by what I came up with. Done.

> * It seems that removable_artifacts should only return directories, and filter out anything else. The current implementation will delete unrelated stuff if we want to add e.g. a README or something, some day.

Done.

> * The current implementation relies heavily (and silently) on the assumption that all artifact directories are called the same as the job. This should be explicitly written somewhere. And as a safety measure, in case we add other kinds of jobs later, that don’t satisfy this assumption, IMO the script should only work on directories called build_Tails_ISO_*, and ignore anything else. What do you think?

Well, I think I’d prefer to consider we’ll try to keep this job <-> artifacts directory relationship in the future. This way this script can be easily deployed against other artifacts, and it will help in future developments if we have that kind of “standards”.
It might not be possible though, but then we’ll adapt the script accordingly when the need raises. So maybe let’s just keep it that way for now?

> And about the class:
>
> * It looks like the class doesn’t handle toggling $remove_old_artifacts off if it was previously toggled on. Please make it clean up behind it in this case.

Right, done.

> * Please single-quote arguments passed to the command run by cron. It won’t be perfect, but at least will protect against some dangerous mistakes.

Oops, fixed.

> * Maybe $artifacts_store_user could have a sensible default value?

Hmm, to me this value is very much dependent of our NFS sharing bits, so that’s why I didn’t put a default yet. Once we’ll have refactored this NFS thingie, I think it will be easier to get this sensible default. So I believe we should leave it that way for now.

#12 Updated by intrigeri 2015-06-19 06:05:09

  • Assignee changed from intrigeri to bertagaz
  • % Done changed from 20 to 70
  • QA Check changed from Ready for QA to Dev Needed

Yay! Everything I’m not quoting nor replying to is good. I’ve pushed some minor style nitpicking changes on top.

>> * The current implementation relies heavily (and silently) on the assumption that all artifact directories are called the same as the job. This should be explicitly written somewhere. And as a safety measure, in case we add other kinds of jobs later, that don’t satisfy this assumption, IMO the script should only work on directories called build_Tails_ISO_*, and ignore anything else. What do you think?

> Well, I think I’d prefer to consider we’ll try to keep this job <-> artifacts directory relationship in the future. This way this script can be easily deployed against other artifacts, and it will help in future developments if we have that kind of “standards”. It might not be possible though, but then we’ll adapt the script accordingly when the need raises. So maybe let’s just keep it that way for now?

Fair enough.

>> * It looks like the class doesn’t handle toggling $remove_old_artifacts off if it was previously toggled on. Please make it clean up behind it in this case.

> Right, done.

It seems that there was some misunderstanding. You added a way to disable tails::jenkins::artifacts_store entirely, which is generally good practice; fine.

However, the use case I was speaking of is still not dealt with: imagine one enables this class with $remove_old_artifacts set to true, and then decides to turn this behaviour off by passing remove_old_artifacts => true to the class; then, the current code will actually not turn old artifacts removal off. See what I mean?

That’s the only blocker left, I think :)

>> * Maybe $artifacts_store_user could have a sensible default value?

> Hmm, to me this value is very much dependent of our NFS sharing bits, so that’s why I didn’t put a default yet. Once we’ll have refactored this NFS thingie, I think it will be easier to get this sensible default. So I believe we should leave it that way for now.

OK.

#13 Updated by bertagaz 2015-06-19 06:15:45

  • related to Feature #9616: Refactor Puppet bits for the nightly builds NFS added

#14 Updated by bertagaz 2015-06-19 08:04:01

  • Assignee changed from bertagaz to intrigeri
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:
>
> It seems that there was some misunderstanding. You added a way to disable tails::jenkins::artifacts_store entirely, which is generally good practice; fine.
>
> However, the use case I was speaking of is still not dealt with: imagine one enables this class with $remove_old_artifacts set to true, and then decides to turn this behaviour off by passing remove_old_artifacts => true to the class; then, the current code will actually not turn old artifacts removal off. See what I mean?

Oooh, didn’t get the subtle difference. I think that’s now better regarding this.

> That’s the only blocker left, I think :)

Getting close! Thx for your availability for reviewing, given you might be busy on your side.

#15 Updated by bertagaz 2015-06-21 12:37:46

  • QA Check deleted (Ready for QA)

Please hold on before reviewing this code, might be unneeded. I’ll post a new note on Feature #9597 that might render this branch useless in the end. We’ll see. More coming soon.

#16 Updated by intrigeri 2015-06-22 08:07:22

  • Assignee changed from intrigeri to bertagaz

OK, then it doesn’t make sense that this ticket is assigned to me.

#17 Updated by bertagaz 2015-06-28 09:06:47

  • Target version changed from Tails_1.4.1 to Tails_1.5

#18 Updated by bertagaz 2015-07-08 11:16:34

  • Assignee changed from bertagaz to intrigeri
  • QA Check set to Ready for QA

Now that we clarified we won’t tackle Feature #9597 before the automates builds are deployed, this branch/ticket/feature needs to be merged. It has already gone through an extensive review, so it’s quite ready to be deployed.

If the last changes fix the remaining issues, I’ll be glad to put this in production and close this ticket. :)

#19 Updated by intrigeri 2015-07-09 02:09:41

  • Assignee changed from intrigeri to bertagaz
  • % Done changed from 70 to 80
  • QA Check changed from Ready for QA to Dev Needed

bertagaz wrote:
> intrigeri wrote:
> >
> > It seems that there was some misunderstanding. You added a way to disable tails::jenkins::artifacts_store entirely, which is generally good practice; fine.
> >
> > However, the use case I was speaking of is still not dealt with: imagine one enables this class with $remove_old_artifacts set to true, and then decides to turn this behaviour off by passing remove_old_artifacts => true to the class; then, the current code will actually not turn old artifacts removal off. See what I mean?
>
> Oooh, didn’t get the subtle difference. I think that’s now better regarding this.

Indeed, it’s better! It’s not handling this case correctly, though:

class { 'tails::jenkins::artifacts_store':
  ensure => absent,
}

In this case, $artifacts_cleaner_ensure will still default to “present”, and thus be deployed. I think that parameter should forcibly be set to $ensure when $ensure is not “present”, which should address my concern. IIRC recent enough versions of Puppet support setting the default value for a class parameter based on another one, so if I’m correct this should be the simplest implementation:

--- a/manifests/jenkins/artifacts_store.pp
+++ b/manifests/jenkins/artifacts_store.pp
@@ -3,7 +3,7 @@ class tails::jenkins::artifacts_store (
   $path,
   $ensure                   = 'present',
   $user                     = undef,
-  $artifacts_cleaner_ensure = 'present',
+  $artifacts_cleaner_ensure = $ensure,
   $jenkins_master_url       = undef,
 ){

I would also move the validation of $artifacts_cleaner_ensure to the top, which would avoid indenting all the relevant code in a case, but I won’t insist on that one.

Note that I’ve merged the master branch into this one, to ease reviewing (and so that any testing done is against current code).

#20 Updated by bertagaz 2015-07-09 03:28:57

  • Assignee changed from bertagaz to intrigeri
  • QA Check changed from Dev Needed to Ready for QA
  • Starter set to No

intrigeri wrote:
> bertagaz wrote:
>
> Indeed, it’s better! It’s not handling this case correctly, though:
> In this case, $artifacts_cleaner_ensure will still default to “present”, and thus be deployed. I think that parameter should forcibly be set to $ensure when $ensure is not “present”, which should address my concern. IIRC recent enough versions of Puppet support setting the default value for a class parameter based on another one, so if I’m correct this should be the simplest implementation:

Makes sense. That was catchy, thanks for your help!

> I would also move the validation of $artifacts_cleaner_ensure to the top, which would avoid indenting all the relevant code in a case, but I won’t insist on that one.

True it looks better now.

So much effort for a branch that will probably not be used when Feature #9597 has been deployed… :)

#21 Updated by intrigeri 2015-07-09 09:15:58

  • Assignee changed from intrigeri to bertagaz
  • QA Check changed from Ready for QA to Pass

Reviewed, looks good! Pushed 62e98bb on top. Please deploy!

#22 Updated by bertagaz 2015-07-09 14:14:31

intrigeri wrote:
> Reviewed, looks good! Pushed 62e98bb on top.

Nice trick! didn’t think about the alternative validate_re option. thanks!

> Please deploy!

\o/

#23 Updated by bertagaz 2015-07-10 05:30:06

  • Assignee changed from bertagaz to intrigeri

intrigeri wrote:
> Reviewed, looks good! Pushed 62e98bb on top. Please deploy!

Deployed!

I had to fix typos and push a more important change, see b53ff45

I’ve tested it by adding a new job, triggering a build, then delete the job once completed, change the crontab to run next minute (‘crontab -u nightly -e’ on www.lizard), and see that only this job directory has vanished in www.lizard:/srv/sft/nightly/www/.

Please close this ticket if happy (I don’t do it myself because of the mentionned change), otherwise shout. :)

#24 Updated by intrigeri 2015-07-11 07:46:47

  • Status changed from In Progress to Resolved
  • % Done changed from 80 to 100

> Deployed!

:)

> I had to fix typos and push a more important change, see b53ff45

Urghl, that’s sad. Reviewed that commit, fine.

> I’ve tested it by adding a new job, triggering a build, then delete the job once completed, change the crontab to run next minute (‘crontab -u nightly -e’ on www.lizard), and see that only this job directory has vanished in www.lizard:/srv/sft/nightly/www/.

Excellent. Thanks for describing the tests you did, so I won’t double-check myself.

#25 Updated by intrigeri 2015-07-11 07:47:53

  • Assignee deleted (intrigeri)

#26 Updated by intrigeri 2015-08-26 06:09:57

  • Deliverable for set to 266

#27 Updated by intrigeri 2015-08-30 10:34:55

  • related to Bug #10123: Autobuild ISOs cleanup loses all history for topic branches post-release added