Bug #10396
Sort out overallocated storage situation on isotesterN.lizard
% Done: 100%
Description
On lizard’s VG we currently have 405.62 GiB (i.e. 435.53 GB) available, which is clearly not enough to cover our planned needs: for example, there’s not enough space left to allocate the planned space for the freezable APT repo (which I’d like to do within 24-48 hours). I suspect I should not have ignored the GiB vs. GB subtleties when doing Feature #9400… and/or some space was allocated somewhere that I had not taken into account when we did the planning.
Anyway, it seems that jenkins-data is using only half of the space we’ve allocated for it, so in the short term I could just steal some space there. In the longer term we might need to go back to the drawing board for Feature #9399; note that this is related to Feature #9264 (if we get another box, it’ll need disks).
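For reference, a minimal sketch of how to inspect the current allocation, assuming the VG is indeed named lizard (as the device paths later on this ticket suggest):
# Free vs. allocated space in the volume group, in GiB
sudo vgs --units g lizard
# Per-LV sizes, to see where the space actually went
sudo lvs --units g lizard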
Subtasks
Related issues
Related to Tails - Feature #9399: Extend lizard's storage capacity | Resolved | 2015-05-14 |
Related to Tails - Feature #9264: Consider buying more server hardware to run our automated test suite | Resolved | 2015-12-15 |
History
#1 Updated by intrigeri 2015-10-20 11:08:52
intrigeri wrote:
> […] and/or some space was allocated somewhere that I had not taken into account when we did the planning.
bertagaz, if you have any clue on this aspect, please let me know :)
#2 Updated by intrigeri 2015-10-20 11:10:31
- Related to Feature #9399: Extend lizard's storage capacity added
#3 Updated by intrigeri 2015-10-20 11:10:40
- Related to Feature #9264: Consider buying more server hardware to run our automated test suite added
#4 Updated by bertagaz 2015-10-21 02:55:44
- Assignee changed from intrigeri to bertagaz
intrigeri wrote:
> intrigeri wrote:
> > […] and/or some space was allocated somewhere that I had not taken into account when we did the planning.
>
> bertagaz, if you have any clue on this aspect, please let me know :)
I’ll have a look.
#5 Updated by intrigeri 2015-10-27 13:12:33
bertagaz wrote:
> intrigeri wrote:
> > intrigeri wrote:
> > > […] and/or some space was allocated somewhere that I had not taken into account when we did the planning.
> >
> > bertagaz, if you have any clue on this aspect, please let me know :)
>
> I’ll have a look.
Any news on this front? If you don’t think you can do this by tomorrow night, please simply reassign it to me and I’ll deal with it somehow.
#6 Updated by intrigeri 2015-11-02 04:23:06
- Status changed from Confirmed to Rejected
- Assignee deleted (bertagaz)
Too late, I’ve found other ways to work on Feature #5926.
#7 Updated by intrigeri 2015-11-06 11:17:42
- Subject changed from Sort out storage situation on lizard to Sort out overallocated storage situation on isotesterN.lizard
- Status changed from Rejected to Confirmed
- Assignee set to bertagaz
- Priority changed from High to Normal
- Target version changed from Tails_1.7 to Tails_1.8
- Parent task set to Feature #5288
- Deliverable for set to 267
We had planned to allocate 20GB for each isotesterN (see details on Feature #9400#note-6) and they got 41GB each, except isotester1, which has 51GB for some reason. So we’re wasting 3*21+31 = 94GB here. I think our estimates were slightly off, and these VMs probably need a bit more than we initially thought, but the overallocated space is simply too much of a waste to be ignored.
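A quick way to double-check these numbers (a sketch, assuming the LVs live in the lizard VG and have isotester in their names):
# List every isotester-related logical volume and its size, in GiB
sudo lvs --units g --noheadings -o lv_name,lv_size lizard | grep isotester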
#8 Updated by bertagaz 2015-11-18 05:29:49
- Assignee changed from bertagaz to intrigeri
- QA Check set to Info Needed
intrigeri wrote:
> We had planned to allocate 20GB for each isotesterN (see details on Feature #9400#note-6) and they got 41GB each, except isotester1, which has 51GB for some reason. So we’re wasting 3*21+31 = 94GB here. I think our estimates were slightly off, and these VMs probably need a bit more than we initially thought, but the overallocated space is simply too much of a waste to be ignored.
Right, I think I messed something up when resizing the isotesterN-tmp LVM volumes. I’m not sure how much is over-allocated, though.
Looking at our Munin graph for disk usage, it seems we had peaks at 98% or so on these tmp partitions. It might just be that some bugs filled the partition at some point, but on the daily graphs this high percentage seems to happen often.
So I wonder whether, in the end, this over-allocation didn’t save our ass at some point. Could you have a look at this graph and tell me what you think?
#9 Updated by intrigeri 2015-11-21 02:49:25
- Assignee changed from intrigeri to bertagaz
- QA Check changed from Info Needed to Dev Needed
> Looking at our Munin graph for disk usage, it seems we had peaks at 98% or so on these tmp partitions
On our Munin I see no filesystem space usage stats for the FS’es we’re talking about, so I don’t know what you’re referring to (in case it’s a source of confusion, diskstats_utilization is not about disk space utilisation; the diskstats plugin is about how the device behaves at the block device layer, and “utilization” means “how busy the device is”). If we have such stats somewhere already, then please point me precisely (with unambiguous language and pointers) to where I can find them, so I can confirm your findings.
Meanwhile, to save one roundtrip and some data gathering time I’m running:
# Every minute, record the space used (in 1K blocks) on each isotester's /tmp/TailsToaster
while true ; do
    for i in $(seq 1 4) ; do
        ssh isotester$i.lizard df /tmp/TailsToaster | tail -n1 | awk '{print $3}' >> ~/tmp/isotester$i-tailstoaster-used
    done
    sleep 60
done
… so when you’re back to this ticket, you can easily check the data (in my $HOME) and fix storage allocation based on these numbers.
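Once there’s enough data, something like this should pull out the peak per isotester (a sketch; note that df reports 1K blocks, so the values are in KiB):
# Highest recorded usage (in 1K blocks) for each isotester's /tmp/TailsToaster
for i in $(seq 1 4) ; do
    echo -n "isotester$i: "
    sort -n ~/tmp/isotester$i-tailstoaster-used | tail -n1
done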
#10 Updated by bertagaz 2015-12-05 07:36:41
- Assignee changed from bertagaz to intrigeri
- QA Check changed from Dev Needed to Info Needed
intrigeri wrote:
> On our Munin I see no filesystem space usage stats for the FS’es we’re talking about, so I don’t know what you’re referring to (in case it’s a source of confusion, diskstats_utilization is not about disk space utilisation; the diskstats plugin is about how the device behaves at the block device layer, and “utilization” means “how busy the device is”). If we have such stats somewhere already, then please point me precisely (with unambiguous language and pointers) to where I can find them, so I can confirm your findings.
Hmm right, thanks for the explanations.
> Meanwhile, to save one roundtrip and some data gathering time I’m running:
>
> […]
>
> … so when you’re back to this ticket, you can easily check the data (in my $HOME) and fix storage allocation based on these numbers.
The biggest number out of these stats is almost 12G.
So we could reduce this tmp partition to 15G for a bit of safety, or 20G if we expect it to grow at some point (but I don’t believe it will). What’s your opinion?
#11 Updated by intrigeri 2015-12-05 10:20:29
- Status changed from Confirmed to In Progress
- Assignee changed from intrigeri to bertagaz
- % Done changed from 0 to 20
- QA Check changed from Info Needed to Dev Needed
bertagaz wrote:
> The biggest number out of these stats is almost 12G.
>
> So we may reduce this tmp partition to 15G for a bit of safety,
According to what I’m quoting on Feature #10503, we could quite easily get this number down by a few GB, so I think that adding a 25% safety margin is over-enthusiastic => I suggest a 10% safety margin. Beware of the units, by the way ;)
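For reference, the observed ~12 GiB peak plus a 10% margin works out to about 13.2 GiB, i.e.:
# 12 GiB plus 10%, in bytes (integer arithmetic, rounds down)
echo $(( 12 * 1024 * 1024 * 1024 * 11 / 10 ))
# => 14173392076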
#12 Updated by bertagaz 2015-12-15 03:26:43
- Target version changed from Tails_1.8 to Tails_2.0
Postponing
#13 Updated by intrigeri 2015-12-19 10:46:10
- % Done changed from 20 to 50
Done on isotester2 and isotester3 with:
VM=isotester3
# Recreate the tmp volume at ~13.2 GiB (14173392076 bytes) and put a fresh ext4 filesystem on it
virsh shutdown "$VM" && \
    virsh vol-delete "${VM}-tmp" --pool lvm && \
    virsh vol-create-as lvm "${VM}-tmp" 14173392076 && \
    sudo mkfs.ext4 -m 0 "/dev/lizard/${VM}-tmp" && \
    virsh start "$VM"
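As a quick sanity check afterwards, something like the following should confirm the recreated volume’s capacity (a sketch):
# Confirm the recreated volume has the expected capacity
virsh vol-info "${VM}-tmp" --pool lvm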
#14 Updated by bertagaz 2015-12-20 07:32:43
- Assignee changed from bertagaz to intrigeri
- % Done changed from 50 to 80
- QA Check changed from Dev Needed to Ready for QA
Done for isotester1 and isotester4, almost following your guidelines: I shut down the VMs by hand before starting to play with the -tmp volumes, just to be sure they were down for real.
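For what it’s worth, a minimal sketch of how to confirm a VM is really down before touching its volumes:
# Should print "shut off" before the -tmp volume is deleted
virsh domstate isotester1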
#15 Updated by intrigeri 2015-12-20 10:15:21
- Status changed from In Progress to Resolved
- Assignee deleted (intrigeri)
- % Done changed from 80 to 100
- QA Check changed from Ready for QA to Pass
Looks good!