Bug #10334

Fragile automated builds on lizard

Added by intrigeri 2015-10-03 16:07:22. Updated 2015-11-06 09:35:36.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Continuous Integration
Target version:
Start date:
2015-10-03
Due date:
% Done:

100%

Feature Branch:
Type of work:
Sysadmin
Blueprint:

Starter:
Affected tool:
Deliverable for:
266

Description

We’ve seen many build failures recently, almost all of them on isobuilder2. I had a look, and it’s not surprising that things fail:

$ df
df: `/tmp/tails-build.4ZSfX1Cx': Stale NFS file handle
df: `/tmp/tails-build.9zHHWL68': Stale NFS file handle
[...]
tmpfs                      tmpfs     8.8G  5.6G  3.3G  64% /tmp/tmpfs.MtxhVedc
tmpfs                      tmpfs     8.8G  782M  8.1G   9% /tmp/tmpfs.NQz5b14m

$ free
             total       used       free     shared    buffers     cached
Mem:          8992       6962       2029          0         10       6487
-/+ buffers/cache:        464       8527
Swap:            0          0          0

So there are two problems there:

  • I don’t know where the NFS errors come from. I see an NFS line in fstab; I thought we were not supposed to have any such thing anymore, but perhaps I’m confused.
  • The build wrapper doesn’t always clean up properly.
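
A minimal sketch of how both symptoms could be detected before a build starts. It is not the actual build wrapper, which this ticket does not show. The path prefixes are taken from the df output above; the function names and the idea of running this as a pre-build check are assumptions made for illustration only.

```python
import errno
import os

# Prefixes seen in the df output above; assumed to identify build leftovers.
LEFTOVER_PREFIXES = ("/tmp/tails-build.", "/tmp/tmpfs.")


def leftover_build_mounts(mounts_text, prefixes=LEFTOVER_PREFIXES):
    """Return mount points, from /proc/mounts-style text, that look like
    leftovers of a previous build (hypothetical helper)."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a well-formed mount entry
        mount_point = fields[1]
        if mount_point.startswith(prefixes):
            found.append(mount_point)
    return found


def is_stale(path, stat=os.stat):
    """Return True if stat() on path fails with ESTALE, the error that df
    reports as "Stale NFS file handle" (hypothetical helper)."""
    try:
        stat(path)
    except OSError as exc:
        return exc.errno == errno.ESTALE
    return False
```

On a build slave, the text of /proc/mounts could be passed to leftover_build_mounts(), and each returned path checked with is_stale(), so that a job fails early with a clear message instead of sending a misleading “build failed” notification.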

I’ve rebooted isobuilder2 so the avalanche of notifications should stop.


Subtasks


Related issues

Related to Tails - Feature #6090: Automated builds Resolved 2013-07-26 2015-02-28
Related to Tails - Bug #10772: Next ISO builds fail on Jenkins when a previous job was aborted uncleanly Resolved 2015-12-17

History

#1 Updated by intrigeri 2015-10-16 02:19:28

#2 Updated by intrigeri 2015-10-16 02:26:52

  • Target version set to Tails_1.7

This happened for the 2nd time (that I noticed) in 2 weeks: all builds on isobuilder2 had been failing for a few hours today, apparently for the same reason as last time, so I rebooted that slave.

As you can guess, since I raised all hell about it 2 weeks ago, my main concern here is the rate of false positive “build failed” notifications sent to developers (last time it took days before the bug was identified and I rebooted the slave), so I’m setting a target version that is not far away. Formally speaking, this is a follow-up to Feature #6090, so it’s on your plate.

#3 Updated by intrigeri 2015-10-16 02:27:13

  • blocks #8668 added

#4 Updated by bertagaz 2015-10-29 07:04:45

  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 30

I’ve reinstalled the faulty isobuilder2 tonight from scratch. In the end that seemed easier than tracking down the root cause of this bug. A first build is in progress; the coming builds will show whether this strange bug reappears.

#5 Updated by intrigeri 2015-10-30 04:46:15

> I’ve reinstalled the faulty isobuilder2 tonight from scratch.

Thanks! And yay for disposable Jenkins slaves.

#6 Updated by bertagaz 2015-11-01 06:34:29

  • Assignee changed from bertagaz to intrigeri
  • % Done changed from 30 to 70
  • QA Check set to Ready for QA

It’s been 3 days since it was reinstalled, and so far the stale NFS file handle bug does not seem to be the cause of the failing builds on this isobuilder: https://jenkins.tails.boum.org/computer/isobuilder2/builds

So I think this ticket can be marked as resolved.

#7 Updated by intrigeri 2015-11-01 06:54:07

  • Status changed from In Progress to Resolved
  • % Done changed from 70 to 100
  • QA Check changed from Ready for QA to Pass

> It’s been 3 days since it was reinstalled, and so far the stale NFS file handle
> bug does not seem to be the cause of the failing builds on this
> isobuilder: https://jenkins.tails.boum.org/computer/isobuilder2/builds

Great!

#8 Updated by intrigeri 2015-11-01 07:35:21

  • Assignee deleted (intrigeri)

#9 Updated by intrigeri 2015-11-06 09:35:36

  • Deliverable for set to 266

#10 Updated by bertagaz 2015-12-17 04:36:15

  • related to Bug #10772: Next ISO builds fail on Jenkins when a previous job was aborted uncleanly added