Bug #17702

Ensure check-mirrors.rb does not run forever

Added by intrigeri 2020-05-10 07:19:32. Updated 2020-05-11 18:50:20.

Status: Confirmed
Priority: Normal
Assignee: zen
Category: Infrastructure
Target version:
Start date:
Due date:
% Done: 0%
Feature Branch:
Type of work: Sysadmin
Blueprint:
Starter:
Affected tool: check-mirrors
Deliverable for:

Description

Today on misc.lizard there were 3 instances of the check-mirrors.rb cronjob running: the current one, plus 2 that started on April 22 and 23 respectively. Both old instances were stuck waiting for a wget command to complete.

This might explain some of the leftover temporary directories we’ve seen recently (Feature #17679): one hypothesis is that these directories were not left behind because “old instances of check-mirrors.rb were killed and did not clean up after themselves”, but rather because “weeks-old check-mirrors.rb instances were still running”.

Either way, I think we need 2 things:

  • sysadmin: a lock on the cronjob (like we already have for most similar cronjobs) and possibly a timeout in the cronjob (see e.g. timeout(1), which seems to fit the bill). If the timeout expires or a new cronjob instance can’t take the lock, I think the mirrors team should be made aware.
  • check-mirrors.rb itself: IMO the script should not wait indefinitely for a wget command to complete; it should run it under some sort of timeout.
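
To make both suggestions concrete, here is a minimal Ruby sketch of what they could look like, assuming nothing about how check-mirrors.rb currently invokes wget. The helper names (run_with_timeout, with_exclusive_lock), the lock path, the URL, and the 300 second limit are all hypothetical and only there for illustration. In practice the lock would more likely live in the cronjob wrapper than in the script itself.

```ruby
require 'timeout'

# Hypothetical helper, not part of check-mirrors.rb: run an external command
# given as an argv array and stop it if it has not finished after `seconds`.
# Returns the exit status on completion, or :timeout if it had to be stopped.
def run_with_timeout(argv, seconds)
  # pgroup: true puts the child in its own process group, so that the
  # whole group can be signalled at once if the timeout expires.
  pid = Process.spawn(*argv, pgroup: true)
  begin
    Timeout.timeout(seconds) do
      _, status = Process.wait2(pid)
      status.exitstatus
    end
  rescue Timeout::Error
    begin
      # A signal name starting with "-" targets the process group.
      # This sketch only sends TERM; it does not escalate to KILL.
      Process.kill('-TERM', pid)
      Process.wait(pid)
    rescue Errno::ESRCH, Errno::ECHILD
      # The process had already exited and been reaped.
    end
    :timeout
  end
end

# Hypothetical helper: run the block only if an exclusive, non-blocking
# flock can be taken on `path`. Returns :locked if another holder exists,
# otherwise the value of the block.
def with_exclusive_lock(path)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |f|
    return :locked unless f.flock(File::LOCK_EX | File::LOCK_NB)
    yield
  end
end

# Example usage (URL, lock path, and 300 s limit are made up for illustration):
#   outcome = with_exclusive_lock('/var/lock/check-mirrors.lock') do
#     run_with_timeout(['wget', '--quiet', 'https://mirror.example/file'], 300)
#   end
#   warn 'check-mirrors: previous run still active' if outcome == :locked
#   warn 'check-mirrors: wget timed out' if outcome == :timeout
```

Both failure outcomes (:locked and :timeout) are returned rather than swallowed, so whatever wraps the script can report them to the mirrors team.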

I’m filing a single issue for both suggestions to limit overhead. I think they can be implemented in parallel.


Subtasks


Related issues

Related to Tails - Feature #17679: Dangling download directories from failed runs of check-mirrors (Resolved)
Blocks Tails - Feature #13242: Core work: Sysadmin (Maintain our already existing services) (Confirmed, 2017-06-29)

History

#1 Updated by intrigeri 2020-05-10 07:19:45

  • related to Feature #17679: Dangling download directories from failed runs of check-mirrors added

#2 Updated by groente 2020-05-11 18:47:09

  • blocks Feature #13242: Core work: Sysadmin (Maintain our already existing services) added

#3 Updated by zen 2020-05-11 18:50:20

  • Assignee changed from Sysadmins to zen