Bug #17703
Monitoring broken (since April 29?)
100%
Description
On https://icingaweb2.tails.boum.org/monitoring/health/info I see that the last Icinga status update happened on April 29.
On ecours, I see that icinga2.service cannot start because the config files in the teels.tails.boum.org zone refer to a zone that is not declared anymore. Indeed, in etckeeper’s log on ecours, I see that Puppet removed that zone on April 29 (6ffca75bb39d22133064d5d3f306c5be77a9eb46). I could not find where that zone was configured in Puppet so I tried deleting /etc/icinga2/zones.d/teels.tails.boum.org/, which allowed icinga2.service to start. Then I ran Puppet on ecours, and those files did not come back, so I’m confused.
Then I saw the exact same problem on monitor.lizard, and applied the same solution. Here again, running Puppet on that host did not bring back the config files I had deleted.
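For anyone hitting the same symptom, here is a minimal sketch of how one might spot zones.d directories whose zone is no longer declared, which is the mismatch described above. It is not what was run on ecours or monitor.lizard. The helper names (declared_zones, orphaned_zone_dirs) are hypothetical, and it assumes zones are declared in /etc/icinga2/zones.conf with the usual `object Zone "name"` syntax; zones declared in other files, or declarations inside block comments, are not handled.

```python
import re
from pathlib import Path

# Matches Icinga 2 zone declarations such as: object Zone "example" {
# Lines starting with a comment marker (// or #) do not match.
ZONE_DECL = re.compile(r'^\s*object\s+Zone\s+"([^"]+)"', re.MULTILINE)


def declared_zones(config_text):
    """Return the set of zone names declared in the given config text."""
    return set(ZONE_DECL.findall(config_text))


def orphaned_zone_dirs(zone_dir_names, config_text):
    """Return zones.d directory names that have no matching Zone object."""
    return sorted(set(zone_dir_names) - declared_zones(config_text))


if __name__ == "__main__":
    # Assumed locations; adjust for the host being checked.
    zones_conf = Path("/etc/icinga2/zones.conf")
    zones_d = Path("/etc/icinga2/zones.d")
    if zones_conf.is_file() and zones_d.is_dir():
        dirs = [p.name for p in zones_d.iterdir() if p.is_dir()]
        for name in orphaned_zone_dirs(dirs, zones_conf.read_text()):
            print(f"zones.d/{name} has no matching Zone declaration")
```

This only reports candidates. Whether a reported directory should be deleted or its zone declared again still depends on what Puppet is supposed to manage.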
I’ll stop here for today. I hope I did more good than harm.
Related issues
Blocks Tails - Feature #13242: Core work: Sysadmin (Maintain our already existing services) | Confirmed | 2017-06-29 |
History
#1 Updated by groente 2020-05-11 18:45:43
- Assignee changed from Sysadmins to groente
#2 Updated by groente 2020-05-11 18:46:51
- blocks Feature #13242: Core work: Sysadmin (Maintain our already existing services) added
#3 Updated by groente 2020-05-13 19:10:31
- Status changed from Confirmed to Resolved
- % Done changed from 0 to 100
The cause seems to have been a puppet agent process that had been running on teels since April 22nd. Once I killed that process and ran puppet anew on teels, ecours, and monitor, everyone was happy again (except for monitor, which didn’t have enough memory to run puppet, but after bumping its memory a bit, even monitor was happy).
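A rough sketch of how a stuck agent like this could be caught earlier, assuming process data in the shape printed by `ps -eo pid,etimes,args` on a procps-based system (PID, elapsed seconds, command line). The function names, the "puppet agent" match string and the one-day threshold are illustrative assumptions, not something that exists in our setup.

```python
import subprocess


def long_running(ps_output, needle="puppet agent", max_age_seconds=86400):
    """Return (pid, age_seconds, command) for matching processes older than the limit."""
    found = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        # Skip the header line and anything that is not "pid seconds command".
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        pid, age, command = int(parts[0]), int(parts[1]), parts[2]
        if needle in command and age > max_age_seconds:
            found.append((pid, age, command))
    return found


def read_ps():
    """Return the current process list; assumes a ps that supports the etimes column."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etimes,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `long_running(read_ps())` on a host would list agent processes older than a day, which could then be turned into a monitoring check or reviewed by hand before killing anything.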
#4 Updated by groente 2020-05-13 19:11:18
Oh, and @intrigeri - thanks for catching this issue and applying the quick fix!
#5 Updated by intrigeri 2020-05-14 06:49:28
> The cause seems to have been a puppet agent process that had been running on teels since April 22nd. Once I killed that process and ran puppet anew on teels, ecours, and monitor, everyone was happy again (except for monitor, which didn’t have enough memory to run puppet, but after bumping its memory a bit, even monitor was happy).
Awesome detective work \o/ :)
Cheers!