Bug #12618

Retrieving ISO build artifacts sometimes fails on Jenkins

Added by intrigeri 2017-05-31 07:43:18. Updated 2017-07-28 09:03:46.

Status: Resolved
Priority: Elevated
Assignee:
Category: Continuous Integration
Target version:
Start date: 2017-05-31
Due date:
% Done: 100%
Feature Branch:
Type of work: Sysadmin
Blueprint:
Starter:
Affected tool:
Deliverable for: 289

Description

E.g. https://jenkins.tails.boum.org/job/build_Tails_ISO_feature-12599/17/console, https://jenkins.tails.boum.org/job/reproducibly_build_Tails_ISO_feature-5630-deterministic-builds/4/console and https://jenkins.tails.boum.org/job/reproducibly_build_Tails_ISO_testing/10/consoleFull expose this problem after successfully building an ISO image:

15:05:41 Retrieving artifacts from Vagrant build box.
15:05:42 Warning: Permanently added '192.168.121.131' (ECDSA) to the list of known hosts.
15:05:42 Warning: Permanently added '192.168.121.131' (ECDSA) to the list of known hosts.
15:21:47 packet_write_wait: Connection to 192.168.121.131 port 22: Broken pipe
15:21:47 lost connection
15:21:49 ==> default: Domain is not running. Please run `vagrant up` or `vagrant resume` first.
15:21:49 ==> default: Domain is not running. Please run `vagrant up` or `vagrant resume` first.
15:21:50 ==> default: Domain is not running. Please run `vagrant up` or `vagrant resume` first.
15:21:51 ==> default: Domain is not running. Please run `vagrant up` or `vagrant resume` first.
15:21:53 rake aborted!
15:21:53 CommandError: command ["scp", "-i", "/var/lib/jenkins/workspace/build_Tails_ISO_feature-12599/vagrant/.vagrant/machines/default/libvirt/private_key", "-o", "StrictHostKeyChecking=no", "-o", "UserKnownHostsFile=/dev/null", "vagrant@192.168.121.131:/home/vagrant/amnesia/tails-amd64-feature_12599-3.0-20170530T1428Z-4e95409+testing@60c7405.iso.apt-sources", "vagrant@192.168.121.131:/home/vagrant/amnesia/tails-amd64-feature_12599-3.0-20170530T1428Z-4e95409+testing@60c7405.iso", "vagrant@192.168.121.131:/home/vagrant/amnesia/tails-amd64-feature_12599-3.0-20170530T1428Z-4e95409+testing@60c7405.iso.buildlog", "vagrant@192.168.121.131:/home/vagrant/amnesia/tails-amd64-feature_12599-3.0-20170530T1428Z-4e95409+testing@60c7405.iso.packages", "vagrant@192.168.121.131:/home/vagrant/amnesia/tails-amd64-feature_12599-3.0-20170530T1428Z-4e95409+testing@60c7405.iso.build-manifest", "build-artifacts/"] failed with exit status 1
15:21:53 /var/lib/jenkins/workspace/build_Tails_ISO_feature-12599/Rakefile:71:in `run_command'
15:21:53 /var/lib/jenkins/workspace/build_Tails_ISO_feature-12599/Rakefile:434:in `block in <top (required)>'
15:21:53 Tasks: TOP => build
15:21:53 (See full trace by running task with --trace)
15:21:53 Build step 'Execute shell' marked build as failure
15:21:53 [PostBuildScript] - Execution post build scripts.

One way to list such problems is to look for unusually small ISO images in the artifacts directory:
ssh jenkins.lizard ls -l --sort=size /var/lib/jenkins/jobs/*/builds/*/archive/build-artifacts/*.iso
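
The same check can be scripted with an explicit size threshold instead of eyeballing a sorted listing. A minimal sketch, assuming a find that supports `-size` with the `c` (bytes) unit; the function name `list_small_isos` and the threshold in the usage comment are illustrative assumptions, not values from this ticket:

```shell
#!/bin/sh
# Hypothetical helper: print *.iso files under DIR that are smaller than MIN_BYTES.
list_small_isos() {
    dir=$1
    min_bytes=$2
    # "-size -Nc" selects files whose size is strictly less than N bytes.
    find "$dir" -type f -name '*.iso' -size "-${min_bytes}c" | sort
}

# Example (jobs path from the ticket, threshold is an assumption):
# list_small_isos /var/lib/jenkins/jobs 1000000000
```

Any ISO reported this way is only a candidate: a truncated transfer is one possible cause of a small file, so the matching build log still needs to be checked.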

I don’t know if it’s mere coincidence, but these 3 builds were run on isobuilder2.


Subtasks


Related issues

Blocked by Tails - Bug #13302: /var/lib/libvirt/images sometimes gets filled on isobuilders, take 2 Resolved 2017-06-30

History

#1 Updated by intrigeri 2017-05-31 08:57:11

This might be caused by Bug #12599#note-15 (unlikely, but who knows). I’m logging date + disk/memory usage info in /tmp/log on all isobuilders so we can check what their status was next time we see this failure.
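
The exact logging command is not recorded in this ticket. A minimal sketch of what such a logger could look like, assuming a POSIX shell; the function name `log_builder_status`, the record format, and the watched path in the usage comment are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: append a timestamp plus disk and memory usage to LOGFILE.
# WATCHED_PATH is the directory whose filesystem usage should be recorded.
log_builder_status() {
    logfile=$1
    watched_path=$2
    {
        printf '=== %s ===\n' "$(date -u '+%Y-%m-%d %H:%M:%S UTC')"
        # -P selects the portable, one-line-per-filesystem output format.
        df -P "$watched_path"
        # free(1) is Linux-specific, so skip it where it is not installed.
        if command -v free >/dev/null 2>&1; then
            free -m
        fi
    } >> "$logfile"
}

# Example (watched path is an assumption):
# log_builder_status /tmp/log /var/lib/libvirt/images
```

Run periodically, this yields timestamped records that can be compared against the time of a failed scp in the Jenkins console output.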

#2 Updated by bertagaz 2017-05-31 10:19:18

intrigeri wrote:
> This might be caused by Bug #12599#note-15 (unlikely, but who knows). I’m logging date + disk/memory usage info in /tmp/log on all isobuilders so we can check what their status was next time we see this failure.

I think it is: at least for the first build mentioned in the description, that was the case. /var/lib/libvirt/images was full, and the build VM was paused because there was no space left on that partition. I worked around it by hand at the time for that build.

#3 Updated by bertagaz 2017-06-07 09:26:39

The last failure of that kind I can see in the logs is job reproducibly_build_Tails_ISO_testing, build 2017-05-30_15-54-29. Since then we’ve deployed Bug #12999 and Bug #12577. They should help make this failure disappear, and Bug #12595 will fix this definitely anyway.

#4 Updated by intrigeri 2017-06-08 17:49:51

  • Target version changed from Tails_3.0 to Tails_3.1

Let’s consider our CI infra as frozen until 3.0 is out.

#5 Updated by bertagaz 2017-07-07 13:58:40

  • blocked by Bug #13302: /var/lib/libvirt/images sometimes gets filled on isobuilders, take 2 added

#6 Updated by intrigeri 2017-07-07 16:05:13

> Blocked by “Bug #13302: /var/lib/libvirt/images sometimes gets filled on isobuilders, take 2” added

I don’t get it. Your previous comment said “Bug #12595 will fix this definitely anyway”, and Bug #12595 is about another partition. Can you please clarify?

#7 Updated by bertagaz 2017-07-07 16:39:30

intrigeri wrote:
> > Blocked by “Bug #13302: /var/lib/libvirt/images sometimes gets filled on isobuilders, take 2” added
>
> I don’t get it. Your previous comment said “Bug #12595 will fix this definitely anyway”, and Bug #12595 is about another partition. Can you please clarify?

Yes, I linked the wrong ticket in that comment: it’s the libvirt partition that is problematic here, as explained in Bug #12618#note-2. So Bug #13302 should fix this for the time being.

#8 Updated by intrigeri 2017-07-07 18:40:20

Great, thanks for clarifying :)

#9 Updated by intrigeri 2017-07-26 08:14:10

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 50
  • QA Check set to Ready for QA

I’ve run grep --files-with-match -w scp /var/lib/jenkins/jobs/build_Tails_ISO_*/builds/*/log on jenkins.lizard and the last time this issue happened was on July 5. This is consistent with the libvirt images partition having been grown on July 7 (13302#note-3). So I think this ticket can now be closed. What do you think?
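
To get the most recent affected build directly, the same search can be combined with a sort by modification time. A minimal sketch, assuming log paths contain no whitespace and that a log's modification time reflects when the build ran; the function name `latest_scp_log` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the newest build log under JOBS_DIR that mentions scp.
latest_scp_log() {
    jobs_dir=$1
    # -l is the short form of --files-with-match; -w matches "scp" as a whole word.
    matches=$(grep -l -w scp "$jobs_dir"/build_Tails_ISO_*/builds/*/log 2>/dev/null) || true
    [ -n "$matches" ] || return 1
    # ls -t sorts newest first; $matches is left unquoted on purpose so that
    # each path becomes its own argument (assumes no whitespace in paths).
    ls -t $matches | head -n 1
}

# Example: latest_scp_log /var/lib/jenkins/jobs
```

The function returns a non-zero status when no log matches, which is the outcome one would hope for once the underlying problem is fixed.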

#10 Updated by bertagaz 2017-07-28 09:03:46

  • Status changed from In Progress to Resolved
  • Assignee deleted (bertagaz)
  • % Done changed from 50 to 100
  • QA Check changed from Ready for QA to Pass

intrigeri wrote:
> I’ve run grep --files-with-match -w scp /var/lib/jenkins/jobs/build_Tails_ISO_*/builds/*/log on jenkins.lizard and the last time this issue happened was on July 5. This is consistent with the libvirt images partition having been grown on July 7 (13302#note-3). So I think this ticket can now be closed. What do you think?

Agreed. I had been meaning to check that; thanks for doing it.