Feature #6195
Rotate published artifacts archives
100%
Description
We have limited disk space, but still want to build quite a lot of branches with our Jenkins instance.
To manage both, we have to make sure we keep only the necessary build artifacts of each branch, and no more.
The easiest way would be to add a build step to our Jenkins Build_Tails_Iso_* jobs that keeps only:
* Last 5 builds of the current day
* Last build of yesterday
* Last build of past week
* Last build of past month
That way we’ll keep <= 8 builds for each branch.
This is at least a first try at something sustainable in terms of disk space, and it might evolve in the future.
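As a rough illustration of the arithmetic above, the policy can be written down as a table of how many builds each period keeps. The constant and key names are hypothetical and not taken from the deployed script.

```ruby
# Hypothetical retention table: period => number of builds kept per branch.
RETENTION = {
  today:      5, # last 5 builds of the current day
  yesterday:  1, # last build of yesterday
  past_week:  1, # last build of the past week
  past_month: 1, # last build of the past month
}.freeze

# Upper bound on the number of builds kept for one branch.
MAX_BUILDS_PER_BRANCH = RETENTION.values.sum

puts MAX_BUILDS_PER_BRANCH # 5 + 1 + 1 + 1 = 8
```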
History
#1 Updated by bertagaz 2013-09-07 04:26:39
- Assignee changed from bertagaz to intrigeri
- QA Check set to Ready for QA
A script has been deployed on builder.lizard, to be used as a Jenkins build step that cleans old artifacts from our nightly archives.
It has been pushed to our lizard puppet repo (commit: 8a16598).
Once reviewed, a corresponding shell buildstep has to be added in our build_Tails_Iso_* jenkins jobs so that it can be used in production.
#2 Updated by intrigeri 2013-09-07 08:34:05
- Assignee changed from intrigeri to bertagaz
- QA Check changed from Ready for QA to Info Needed
Congrats!
I’m not convinced by how the time ranges are currently defined: if we’re Saturday, we have no artifacts kept for Mon-Thu, while if we are Tuesday, then we have artifacts for Sunday and Monday, right? I had personally understood “last week” as “within the last 7 days”, which seems to make more sense to me, by basically making all days equal to each other. Same for “today” (“within the last 24h”), “yesterday” etc.
What do you think?
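To make the difference concrete, here is a small sketch comparing a calendar-aligned "last week" with a rolling one. It assumes weeks run Monday to Sunday, and all helper names are hypothetical; the deployed script may compute its ranges differently.

```ruby
require 'date'

# Previous calendar week (Monday..Sunday) relative to `today`.
def previous_calendar_week(today)
  this_monday = today - ((today.wday + 6) % 7)
  (this_monday - 7)..(this_monday - 1)
end

# Rolling window: the 7 days before `today`, whatever the weekday.
def rolling_week(today)
  (today - 7)..(today - 1)
end

# Weekday names of the days that fall after `range` but before yesterday,
# that is, days for which neither "last week" nor "yesterday" keeps a build.
def uncovered_days(range, today)
  ((range.end + 1)...(today - 1)).map { |d| d.strftime('%a') }
end

saturday = Date.new(2013, 9, 7)
tuesday  = Date.new(2013, 9, 10)

p uncovered_days(previous_calendar_week(saturday), saturday) # ["Mon", "Tue", "Wed", "Thu"]
p uncovered_days(previous_calendar_week(tuesday), tuesday)   # []
p uncovered_days(rolling_week(saturday), saturday)           # []
```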
Also, I’m not 100% comfortable with the hardcoding of /srv/www and the coupling with $JENKINS_JOB. Perhaps ARTIFACTS_DIR could be passed on the command-line instead? This would make it easier to run the script from elsewhere than a Jenkins job, if needed.
> Once reviewed, a corresponding shell buildstep has to be added in our build_Tails_Iso_* jenkins jobs so that it can be used in production.
:)
#3 Updated by bertagaz 2013-09-07 14:51:43
intrigeri wrote:
> I’m not convinced by how the time ranges are currently defined: if we’re Saturday, we have no artifacts kept for Mon-Thu, while if we are Tuesday, then we have artifacts for Sunday and Monday, right? I had personally understood “last week” as “within the last 7 days”, which seems to make more sense to me, by basically making all days equal to each other. Same for “today” (“within the last 24h”), “yesterday” etc.
You’re right, the LastWeek and LastMonth ranges are dumbly aligned to the calendar. The others (Yesterday, Today) are respecting the understanding you describe (“relative is better than absolute”).
I wasn’t sure myself about this part, but see the relevance. It shouldn’t be hard to implement, so expect a new review soon (better say than sorry ;D).
> Also, I’m not 100% comfortable with the hardcoding of /srv/www and the coupling with $JENKINS_JOB. Perhaps ARTIFACTS_DIR could be passed on the command-line instead? This would make it easier to run the script from elsewhere than a Jenkins job, if needed.
That was another black hole of this script. Fine to me, will commit this decision. :)
#4 Updated by intrigeri 2013-09-08 00:37:04
> Issue Feature #6195 has been updated by bertagaz.
>
> It shouldn’t be hard to implement, so expect a new review soon […]
Great :)
>> Also, I’m not 100% comfortable with the hardcoding of /srv/www and the coupling with $JENKINS_JOB. Perhaps ARTIFACTS_DIR could be passed on the command-line instead? This would make it easier to run the script from elsewhere than a Jenkins job, if needed.
> That was another black hole of this script. Fine to me, will commit this decision. :)
Cool.
#5 Updated by intrigeri 2013-09-08 00:39:29
#6 Updated by bertagaz 2013-09-08 04:41:15
- Assignee changed from bertagaz to intrigeri
- QA Check changed from Info Needed to Ready for QA
Script has been updated as decided. Pushed onto the puppet repo again and deployed on builder.lizard.
Not very sure of the new definitions, but seems to do the work actually (tested live by disabling the FileUtils.rm part of the script).
I’ve also pushed a new build step that uses this script to the jenkins-jobs repo, but haven’t deployed it in our Jenkins yet.
Also thanks for your commits. :)
#7 Updated by intrigeri 2013-09-08 05:33:01
- Assignee changed from intrigeri to bertagaz
- QA Check changed from Ready for QA to Dev Needed
bertagaz wrote:
> Script has been updated as decided.
Looks mostly good.
Oh, sorry if I was unclear: I was suggesting that ARTIFACTS_DIR could be passed as an argument on the command line, rather than as an environment variable, since it’s, well, a parameter of this program. Does this make sense to you?
> + [TODAY-(60 * 60 * 24 * 7)..TODAY-(60 * 60 * 24),
I believe this overlaps with the “Today” preset. I guess you really mean ... instead of .. here.
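For reference, a minimal sketch of how the two Ruby range operators treat the end boundary. It assumes the script's TODAY is a Time; the names DAY, now and boundary are only illustrative.

```ruby
DAY = 60 * 60 * 24
now = Time.at(1_378_600_000) # fixed instant so the example is reproducible

boundary  = now - DAY
inclusive = (now - 7 * DAY)..boundary  # two dots: end boundary is part of the range
exclusive = (now - 7 * DAY)...boundary # three dots: end boundary is left out

puts inclusive.cover?(boundary) # true
puts exclusive.cover?(boundary) # false
```

With the two-dot form, an artifact stamped exactly at the boundary would belong both to this range and to the adjacent one that starts there.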
> Not very sure of the new definitions,
Are we keeping the oldest or newest artifacts in a given time range? This is not documented, and the code logic is not overly clear to me.
#8 Updated by intrigeri 2013-09-08 05:38:33
bertagaz wrote:
> (tested live by disabling the FileUtils.rm part of the script).
Though no requirement per se, a --dry-run option that tells what would be done, without actually doing it, would be welcome :)
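One possible shape for this, combined with passing ARTIFACTS_DIR as a command-line argument, is sketched below. This is not the deployed script: the script name in the usage banner and both helper names are made up for illustration. Only OptionParser and the noop/verbose options of FileUtils.rm come from Ruby's standard library.

```ruby
require 'optparse'
require 'fileutils'

# Parses an argument list and returns [artifacts_dir, dry_run].
def parse_args(argv)
  dry_run = false
  parser = OptionParser.new do |opts|
    # Hypothetical script name, for illustration only.
    opts.banner = 'Usage: clean_old_artifacts [--dry-run] ARTIFACTS_DIR'
    opts.on('--dry-run', 'Only report what would be removed') { dry_run = true }
  end
  rest = parser.parse(argv) # non-option arguments, argv itself is left untouched
  raise ArgumentError, parser.banner unless rest.size == 1
  [rest.first, dry_run]
end

# Removes the given files, or only prints the rm command when dry_run is set.
def remove_artifacts(paths, dry_run)
  FileUtils.rm(paths, noop: dry_run, verbose: dry_run)
end
```

A caller would then do something like `dir, dry_run = parse_args(ARGV)` and hand the list of stale files to `remove_artifacts`.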
#9 Updated by bertagaz 2013-09-08 09:54:40
- Assignee changed from bertagaz to intrigeri
- QA Check changed from Dev Needed to Ready for QA
intrigeri wrote:
>
> Oh, sorry if I was unclear: I was suggesting that ARTIFACTS_DIR could be passed as an argument on the command line, rather than as an environment variable, since it’s, well, a parameter of this program. Does this make sense to you?
It does. I’ve committed and pushed a fix to the usual place. :)
>
> > + [TODAY-(60 * 60 * 24 * 7)..TODAY-(60 * 60 * 24),
>
> I believe this overlaps with the “Today” preset. I guess you really mean ... instead of .. here.
Right, nice catch! Fixed and pushed too.
> > Not very sure of the new definitions,
>
> Are we keeping the oldest or newest artifacts in a given time range? This is not documented, and the code logic is not overly clear to me.
We’re keeping the newest artifact(s) of each range. This is done in line 48, by sorting the iso list by their respective time.
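The idea can be sketched as follows. This is not the actual script: the Artifact struct and the stale_in_range helper are hypothetical, and plain timestamps stand in for whatever time the script reads from each ISO.

```ruby
# Hypothetical record for one archived ISO: its path and its time.
Artifact = Struct.new(:path, :time)

# Returns the artifacts inside `range` that should be deleted,
# that is, everything except the `keep` newest ones.
def stale_in_range(artifacts, range, keep)
  in_range     = artifacts.select { |a| range.cover?(a.time) }
  newest_first = in_range.sort_by(&:time).reverse
  newest_first.drop(keep)
end

now   = Time.at(1_378_600_000) # fixed instant so the example is reproducible
hour  = 60 * 60
# Seven builds, one per hour; build-1 is the newest.
builds = (1..7).map { |i| Artifact.new("build-#{i}.iso", now - i * hour) }
today  = (now - 24 * hour)..now

p stale_in_range(builds, today, 5).map(&:path) # ["build-6.iso", "build-7.iso"]
```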
> Though no requirement per se, a --dry-run option that tells what would be done, without actually doing it, would be welcome :)
Done too.
#10 Updated by intrigeri 2013-09-08 13:40:07
- Assignee changed from intrigeri to bertagaz
Review passed. I’ve pushed a minor change or two on top of it, please review and resolve the ticket if happy :)
#11 Updated by bertagaz 2013-09-09 06:41:08
- Status changed from In Progress to Resolved
- QA Check changed from Ready for QA to Pass
Sounds good to me.
I’ve updated and pushed the jenkins-jobs build step that integrates this script, first with the dry-run option as a live test on the experimental branch. As it seemed to work, it is now live in production, without the dry-run option. The experimental branch has already been cleaned up for real by the script.
Closing this ticket, let’s see if we discover a bug now that it is live. :)
#12 Updated by intrigeri 2013-09-10 02:15:52
- Assignee deleted (bertagaz)
#13 Updated by intrigeri 2013-09-10 02:16:44
Awesome!