Feature #11366

Document our monitoring setup

Added by bertagaz 2016-04-25 03:06:55. Updated 2016-07-23 06:37:38.

Status: Resolved
Priority: Normal
Assignee:
Category: Infrastructure
Target version:
Start date: 2016-04-25
Due date:
% Done: 100%
Feature Branch:
Type of work: Contributors documentation
Blueprint:
Starter: 0
Affected tool:
Deliverable for: 268

Description

Once our monitoring setup is fully deployed in production and has stabilized, we should document somewhere (location to be defined) how it is configured.


Related issues

Blocks Tails - Feature #5734: Monitor servers Resolved 2015-01-09 2015-11-09

History

#1 Updated by bertagaz 2016-04-25 03:07:19

  • blocked by Feature #9484: Deploy the monitoring setup to production added

#2 Updated by intrigeri 2016-04-25 14:43:44

  • Deliverable for changed from SponsorS_Internal to 268

#3 Updated by intrigeri 2016-05-10 05:34:23

  • Type of work changed from Sysadmin to Contributors documentation

#4 Updated by intrigeri 2016-05-10 05:34:32

  • blocks deleted (Feature #9484: Deploy the monitoring setup to production)

#5 Updated by intrigeri 2016-05-10 05:34:39

#6 Updated by bertagaz 2016-06-07 11:14:41

  • Assignee changed from bertagaz to intrigeri
  • Target version changed from Tails_2.4 to Tails_2.5
  • % Done changed from 0 to 60
  • QA Check set to Ready for QA

Pushed commit:c1367c8, which documents the classes for sysadmin contributors.

#7 Updated by intrigeri 2016-06-08 05:48:48

  • Status changed from Confirmed to In Progress
  • Assignee changed from intrigeri to bertagaz
  • % Done changed from 60 to 30
  • QA Check changed from Ready for QA to Dev Needed
  • Deliverable for deleted (268)

Thanks, I like the bits you pushed!

What I feel is missing:

  • a high-level description of what the pieces are, and how they’re connected together: you know, the bits we’ve discussed on some now closed ticket, about the satellite etc. design; rationale: pointing to the individual config bits is good, but it makes little sense unless one knows what these bits are about; this can probably fit in a few sentences;
  • basic documentation (not sure where it should live) about adding a check: when I tried, it took me literally hours to reverse-engineer this, and that’s actually why I created this ticket in the first place. A very rough ordered list of resources to create/update would have saved me hours, and will save me hours next time.

#8 Updated by bertagaz 2016-06-10 08:24:28

  • Assignee changed from bertagaz to intrigeri
  • % Done changed from 30 to 70
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:
> * a high-level description of what the pieces are, and how they’re connected together: you know, the bits we’ve discussed on some now closed ticket, about the satellite etc. design; rationale: pointing to the individual config bits is good, but it makes little sense unless one knows what these bits are about; this can probably fit in a few sentences;

Good idea. commit:5372fd8

> * basic documentation (not sure where it should live) about adding a check: when I tried, it took me literally hours to reverse-engineer this, and that’s actually why I created this ticket in the first place. A very rough ordered list of resources to create/update would have saved me hours, and will save me hours next time.

Added that in commit:3e0a7e3

#9 Updated by intrigeri 2016-07-13 10:01:35

  • Assignee changed from intrigeri to bertagaz
  • QA Check changed from Ready for QA to Dev Needed

bertagaz wrote:
> intrigeri wrote:
> > * a high-level description of what the pieces are, and how they’re connected together: you know, the bits we’ve discussed on some now closed ticket, about the satellite etc. design; rationale: pointing to the individual config bits is good, but it makes little sense unless one knows what these bits are about; this can probably fit in a few sentences;
>
> Good idea. commit:5372fd8

I guess you mean commit:3f4d45d instead (it has a buggy “refs:” ID, so it took me a while to find). It looks good to me, thanks!

> > * basic documentation (not sure where it should live) about adding a check: when I tried, it took me literally hours to reverse-engineer this, and that’s actually why I created this ticket in the first place. A very rough ordered list of resources to create/update would have saved me hours, and will save me hours next time.
>
> Added that in commit:3e0a7e3

Please don’t put that in contribute/how/sysadmin, that is about welcoming new sysadmin contributors: this info feels out of place on that page. Instead it should be under contribute/working_together/roles/sysadmins (and link from there) since that’s where we document services.

Other than that, I like the structure of this piece of doc. There are quite a few parts that seem obscure or confusing to me, though mostly due to vague wording, so I’ll list them here (all this is so clear in your mind that I understand it’s hard to guess what will be clear or not to me; and tech documentation writing is hard):

  • s/Deploying/Adding/ (deploying means something more specific than what you mean here, I think)
  • “upstream Icinga2 Puppet module” -> URL please
  • “active record” -> I think it’s called “Active Records”
  • “so we can’t really use” -> looks like “really” adds no info, but makes the sentence confusing
  • softwares -> spell checking
  • In “Once plugins and check commands are checked” I don’t understand what “checked” means. Rephrase?
  • “Have a look at the tails::monitoring::service:torbrowser_archive class” -> I don’t think it’s a class.
  • “the related service configuration template” -> URL please (finding where each of these many small files lives was part of the reverse-engineering pain)
  • “Ran from the master on a remote hosted service” -> this contradicts what I understood until now — isn’t the purpose of a “Remotely executed service” precisely that it’s run on the master? Rephrasing, perhaps? (I guess that “on” is not the right word)
  • tails::monitoring::{master,satellite,agent) class -> parenthesis mismatch; and maybe you mean “classes”?
  • Tails::Monitoring::Service::Memory -> the caps seem wrong in this context (there’s at least another instance of this typo elsewhere, so some proof-reading would be welcome)
  • In “Once all of […] are checked”, I don’t know what “checked” means.
  • “Pay attention to the parameter passed at the exported resources collection.” -> that is? What should I pay attention to?
  • “the related node manifest” -> what’s that?
  • “serveral time” -> spell checker + grammar

I’ve tried hard to list only the issues that are either important for understanding the meaning of the text or trivial to fix. The goal here is not to produce a super-polished text for newbies and end-users, just something I can refer to in 6 months :)

#10 Updated by intrigeri 2016-07-18 06:43:59

  • Deliverable for set to 268

#11 Updated by bertagaz 2016-07-22 07:47:16

  • Assignee changed from bertagaz to intrigeri
  • QA Check changed from Dev Needed to Ready for QA

intrigeri wrote:
> > Added that in commit:3e0a7e3
>
> Please don’t put that in contribute/how/sysadmin, that is about welcoming new sysadmin contributors: this info feels out of place on that page. Instead it should be under contribute/working_together/roles/sysadmins (and link from there) since that’s where we document services.

Ok, moved it in commit:9acbcab

> Other than that, I like the structure of this piece of doc. There are quite a few parts that seem obscure or confusing to me, though mostly due to vague wording, so I’ll list them here (all this is so clear in your mind that I understand it’s hard to guess what will be clear or not to me; and tech documentation writing is hard):

Thanks! I tried to follow the order of the steps one goes through when adding a check.

> * s/Deploying/Adding/ (deploying means something more specific than what you mean here, I think)

commit:19e270b

> * “upstream Icinga2 Puppet module” -> URL please

commit:887a3f6

> * “active record” -> I think it’s called “Active Records”

commit:ce16870

> * “so we can’t really use” -> looks like “really” adds no info, but makes the sentence confusing

commit:389f89f

> * softwares -> spell checking

commit:9d0cdab

> * In “Once plugins and check commands are checked” I don’t understand what “checked” means. Rephrase?

commit:6eab4c1

> * “Have a look at the tails::monitoring::service:torbrowser_archive class” -> I don’t think it’s a class.

Right, commit:fd54ce6

> * “the related service configuration template” -> URL please (finding where each of these many small files lives was part of the reverse-engineering pain)

commit:5375cb8

> * “Ran from the master on a remote hosted service” -> this contradicts what I understood until now — isn’t the purpose of a “Remotely executed service” precisely that it’s run on the master? Rephrasing, perhaps? (I guess that “on” is not the right word)

commit:cfabbff

> * tails::monitoring::{master,satellite,agent) class -> parenthesis mismatch; and maybe you mean “classes”?

commit:f4a7d94

> * Tails::Monitoring::Service::Memory -> the caps seem wrong in this context (there’s at least another instance of this typo elsewhere, so some proof-reading would be welcome)

commit:a0d6d4f

> * In “Once all of […] are checked”, I don’t know what “checked” means.

commit:6eab4c1 already mentioned above.

> * “Pay attention to the parameter passed at the exported resources collection.” -> that is? What should I pay attention to?

commit:6eae4fc

> * “the related node manifest” -> what’s that?

commit:453232b and commit:4b12e90

> * “serveral time” -> spell checker + grammar

commit:384d228

#12 Updated by intrigeri 2016-07-23 06:37:38

  • Status changed from In Progress to Resolved
  • Assignee deleted (intrigeri)
  • % Done changed from 70 to 100
  • QA Check changed from Ready for QA to Pass

Added commit:cec2084ba499c0426911ba1b47adc94d358cbfb3 on top (please have a look), and I think we’re good. Let’s see what happens the first time I try to actually use this piece of doc :)