Bug #13433

apt-cacher-ng expiration cronjob fails on apt-proxy.lizard

Added by bertagaz 2017-07-06 12:58:00. Updated 2017-07-10 10:16:51.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Infrastructure
Target version:
Start date:
2017-07-06
Due date:
% Done:

100%

Feature Branch:
Type of work:
Sysadmin
Blueprint:

Starter:
Affected tool:
Deliverable for:

Description

Since July 02, the daily cronjob that runs apt-cacher-ng’s expiration script has been reporting failures of this kind to the sysadmins:

/etc/cron.daily/apt-cacher-ng
Error: cannot fetch http://localhost:3142/acng-report.html?doExpire=Start+Expiration&abortOnErrors=aOe,
HTTP/1.1 500 Connection timeout

Subtasks


Related issues

Blocks Tails - Feature #13233: Core work 2017Q3: Sysadmin (Maintain our already existing services) Resolved 2017-06-29

History

#1 Updated by bertagaz 2017-07-06 12:58:21

  • blocks Feature #13233: Core work 2017Q3: Sysadmin (Maintain our already existing services) added

#2 Updated by intrigeri 2017-07-06 13:52:15

Thanks, I’ll take a look :)

#3 Updated by intrigeri 2017-07-06 15:23:29

Nothing changed on that system after June 27, so I suspect this problem appeared before July 2. Either way, I’ll take care of it.

#4 Updated by intrigeri 2017-07-06 15:42:09

  • Subject changed from Apt-cacher-ng expiration cronjob fails on apt-proxy.lizard to apt-cacher-ng expiration cronjob fails on apt-proxy.lizard
  • Status changed from Confirmed to In Progress
  • % Done changed from 0 to 10

The cronjob seems to have worked fine today according to /var/log/apt-cacher-ng/maint_1499322301.log.html whose mtime is 06:38 UTC. But we got this error from cron earlier, at 06:27 UTC.

And yesterday at 06:26:30 UTC the OOM killer had to kill apt-show-versions (deployed as part of Feature #11523 on June 25, which supports my guess that this problem started before July 2), 25 seconds before the error email about acng timing out was sent. I guess that process was started by /etc/cron.daily/apt-show-versions.

So I think we should give a bit more memory to that system.
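
For anyone who wants to re-check the correlation between OOM kills and the cron failure, here is a minimal sketch. It assumes the kernel log is available as a plain-text file and that OOM messages contain "oom-killer", "Out of memory" or "Killed process" (the exact wording varies between kernel versions). The function name oom_events is hypothetical and not something deployed on the system.

```shell
#!/bin/sh
# Hypothetical helper: print the OOM-killer related lines of a kernel log.
# Assumption: the kernel's OOM messages contain one of the strings below;
# adjust the pattern if the running kernel words them differently.
oom_events() {
    # $1: path to a plain-text kernel log file
    grep -E -i 'oom-killer|out of memory|killed process' "$1"
}
```

Running it against the kernel log (for instance oom_events /var/log/kern.log, a path assumed here) and comparing the timestamps with the time of the cron email should show whether the two events line up.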

#5 Updated by intrigeri 2017-07-07 07:22:23

  • Assignee changed from intrigeri to bertagaz
  • % Done changed from 10 to 50
  • QA Check set to Ready for QA

The cronjob worked just fine today. It seemed wasteful to allocate RAM that’ll be useful for very limited amounts of time every second day or so, so instead I’ve added some swap:

virsh vol-create-as lvm apt-proxy-swap 1G && sudo mkswap /dev/lizard/apt-proxy-swap
# verified the backing PV makes sense
virsh attach-disk apt-proxy /dev/lizard/apt-proxy-swap vdc --config --live --driver qemu --iothread 3 --cache directsync
virsh edit apt-proxy # set io='native', not sure how to do it in the previous command line

Then I declared a swap mount resource for the node in our Puppet manifests, pushed, applied, and finally ran ssh apt-proxy.lizard sudo swapon -a.
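
To confirm that the new swap space is really active on the guest after swapon -a, one option is to read /proc/swaps. This is only a sketch: the function name swap_total_kib is hypothetical, and the optional file argument exists only so the function can be tried against a saved copy of /proc/swaps.

```shell
#!/bin/sh
# Hypothetical helper: sum the Size column (in KiB) of a /proc/swaps-style
# file. The first line of that file is a header, so it is skipped.
swap_total_kib() {
    # $1: file to read; defaults to /proc/swaps on the local machine
    awk 'NR > 1 { total += $3 } END { print total + 0 }' "${1:-/proc/swaps}"
}
```

Called without arguments on apt-proxy.lizard, it should print a non-zero total once the volume is in use, and 0 if no swap is active.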

Please close if you’re fine with this solution and if the problem doesn’t come up again in the next few days :)

#6 Updated by bertagaz 2017-07-10 10:16:51

  • Status changed from In Progress to Resolved
  • Assignee deleted (bertagaz)
  • % Done changed from 50 to 100
  • QA Check changed from Ready for QA to Pass

intrigeri wrote:
> Please close if you’re fine with this solution and if the problem doesn’t come up again in the next few days :)

Sounds good. No new email since then, let’s close this ticket.