Bug #10396
Sort out overallocated storage situation on isotesterN.lizard
% Done: 100%
Description
On lizard’s VG we currently have 405.62 GiB (i.e. 435.53 GB) available, which is clearly not enough to cover our planned needs: for example, there’s not enough space left to allocate the planned space for the freezable APT repo (which I’d like to do within 24-48 hours). I suspect I should not have ignored the GiB vs. GB subtleties when doing Feature #9400… and/or some space was allocated somewhere that I had not taken into account when we did the planning.
Anyway, it seems that jenkins-data is using only half of the space we’ve allocated for it, so in the short term I could just steal some space there. In the longer term we might need to go back to the drawing board for Feature #9399; note that this is related to Feature #9264 (if we get another box, it’ll need disks).
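For reference, a minimal sketch of how to inspect the current allocation, assuming the VG is indeed named lizard (as the device paths later on this ticket suggest):
# Free vs. allocated space in the volume group, in GiB
sudo vgs --units g lizard
# Per-LV sizes, to see where the space actually went
sudo lvs --units g lizard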
Subtasks
Related issues
Related to Tails - Feature #9399: Extend lizard's storage capacity | Resolved | 2015-05-14 |
Related to Tails - Feature #9264: Consider buying more server hardware to run our automated test suite | Resolved | 2015-12-15 |
History
#1 Updated by intrigeri 2015-10-20 11:08:52
intrigeri wrote:
> […] and/or some space was allocated somewhere that I had not taken into account when we did the planning.
bertagaz, if you have any clue on this aspect, please let me know :)
#2 Updated by intrigeri 2015-10-20 11:10:31
- Related to Feature #9399: Extend lizard's storage capacity added
#3 Updated by intrigeri 2015-10-20 11:10:40
- Related to Feature #9264: Consider buying more server hardware to run our automated test suite added
#4 Updated by bertagaz 2015-10-21 02:55:44
- Assignee changed from intrigeri to bertagaz
intrigeri wrote:
> intrigeri wrote:
> > […] and/or some space was allocated somewhere that I had not taken into account when we did the planning.
>
> bertagaz, if you have any clue on this aspect, please let me know :)
I’ll have a look.
#5 Updated by intrigeri 2015-10-27 13:12:33
bertagaz wrote:
> intrigeri wrote:
> > intrigeri wrote:
> > > […] and/or some space was allocated somewhere that I had not taken into account when we did the planning.
> >
> > bertagaz, if you have any clue on this aspect, please let me know :)
>
> I’ll have a look.
Any news on this front? If you don’t think you can do this by tomorrow night, please simply reassign it to me and I’ll deal with it somehow.
#6 Updated by intrigeri 2015-11-02 04:23:06
- Status changed from Confirmed to Rejected
- Assignee deleted (bertagaz)
Too late, I’ve found other ways to work on Feature #5926.
#7 Updated by intrigeri 2015-11-06 11:17:42
- Subject changed from Sort out storage situation on lizard to Sort out overallocated storage situation on isotesterN.lizard
- Status changed from Rejected to Confirmed
- Assignee set to bertagaz
- Priority changed from High to Normal
- Target version changed from Tails_1.7 to Tails_1.8
- Parent task set to Feature #5288
- Deliverable for set to 267
We had planned to allocate 20GB for each isotesterN (see details on Feature #9400#note-6) and they got 41GB each, except isotester1, which has 51GB for some reason. So we’re wasting 3*21+31 = 94GB here. I think our estimates were slightly off, and these VMs probably need a bit more than we initially thought, but the overallocated space is simply too much of a waste to be ignored.
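A quick way to double-check these numbers (a sketch, assuming the LVs live in the lizard VG and have isotester in their names):
# List every isotester-related logical volume and its size, in GiB
sudo lvs --units g --noheadings -o lv_name,lv_size lizard | grep isotester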
#8 Updated by bertagaz 2015-11-18 05:29:49
- Assignee changed from bertagaz to intrigeri
- QA Check set to Info Needed
intrigeri wrote:
> We had planned to allocate 20GB for each isotesterN (see details on Feature #9400#note-6) and they got 41GB each, except isotester1, which has 51GB for some reason. So we’re wasting 3*21+31 = 94GB here. I think our estimates were slightly off, and these VMs probably need a bit more than we initially thought, but the overallocated space is simply too much of a waste to be ignored.
Right, I think I messed something up when resizing the isotesterN-tmp LVM volumes. I’m not sure how much is over-allocated, though.
Looking at our Munin graph for disk usage, it seems we had peaks at 98% or so on these tmp partitions. It might just be that some bugs filled the partition at some point, but on the daily graphs this high percentage seems to happen often.
So I wonder whether, in the end, this over-allocation didn’t save our ass at some point. Could you have a look at this graph and tell me what you think?
#9 Updated by intrigeri 2015-11-21 02:49:25
- Assignee changed from intrigeri to bertagaz
- QA Check changed from Info Needed to Dev Needed
> Looking at our Munin graph for disk usage, it seems we had peaks at 98% or so on these tmp partitions
On our Munin I see no filesystem space usage stats for the FS’es we’re talking about, so I don’t know what you’re referring to (in case it’s a source of confusion, diskstats_utilization is not about disk space utilisation; the diskstats plugin is about how the device behaves at the block device layer, and “utilization” means “how busy the device is”). If we have such stats somewhere already, then please point me precisely (with unambiguous language and pointers) to where I can find them, so I can confirm your findings.
Meanwhile, to save one roundtrip and some data gathering time I’m running:
# Every minute, record the space used (in 1K blocks) on each isotester's /tmp/TailsToaster
while true ; do
    for i in $(seq 1 4) ; do
        ssh isotester$i.lizard df /tmp/TailsToaster | tail -n1 | awk '{print $3}' >> ~/tmp/isotester$i-tailstoaster-used
    done
    sleep 60
done
… so when you’re back to this ticket, you can easily check the data (in my $HOME) and fix storage allocation based on these numbers.
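Once there’s enough data, something like this should pull out the peak per isotester (a sketch; note that df reports 1K blocks, so the values are in KiB):
# Highest recorded usage (in 1K blocks) for each isotester's /tmp/TailsToaster
for i in $(seq 1 4) ; do
    echo -n "isotester$i: "
    sort -n ~/tmp/isotester$i-tailstoaster-used | tail -n1
done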
#10 Updated by bertagaz 2015-12-05 07:36:41
- Assignee changed from bertagaz to intrigeri
- QA Check changed from Dev Needed to Info Needed
intrigeri wrote:
> On our Munin I see no filesystem space usage stats for the FS’es we’re talking about, so I don’t know what you’re referring to (in case it’s a source of confusion, diskstats_utilization is not about disk space utilisation; the diskstats plugin is about how the device behaves at the block device layer, and “utilization” means “how busy the device is”). If we have such stats somewhere already, then please point me precisely (with unambiguous language and pointers) to where I can find them, so I can confirm your findings.
Hmm right, thanks for the explanations.
> Meanwhile, to save one roundtrip and some data gathering time I’m running:
>
> […]
>
> … so when you’re back to this ticket, you can easily check the data (in my $HOME) and fix storage allocation based on these numbers.
The biggest number out of these stats is almost 12G.
So we could reduce this tmp partition to 15G for a bit of safety, or 20G if we expect it to grow at some point (but I don’t believe it will). What’s your opinion?
#11 Updated by intrigeri 2015-12-05 10:20:29
- Status changed from Confirmed to In Progress
- Assignee changed from intrigeri to bertagaz
- % Done changed from 0 to 20
- QA Check changed from Info Needed to Dev Needed
bertagaz wrote:
> The biggest number out of these stats is almost 12G.
>
> So we may reduce this tmp partition to 15G for a bit of safety,
According to what I’m quoting on Feature #10503, we could quite easily get this number down by a few GB, so I think that adding a 25% safety margin is over-enthusiastic => I suggest a 10% safety margin. Beware of the units, by the way ;)
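For reference, the observed ~12 GiB peak plus a 10% margin works out to about 13.2 GiB, i.e.:
# 12 GiB plus 10%, in bytes (integer arithmetic, rounds down)
echo $(( 12 * 1024 * 1024 * 1024 * 11 / 10 ))
# => 14173392076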
#12 Updated by bertagaz 2015-12-15 03:26:43
- Target version changed from Tails_1.8 to Tails_2.0
Postponing
#13 Updated by intrigeri 2015-12-19 10:46:10
- % Done changed from 20 to 50
Done on isotester2 and isotester3 with:
VM=isotester3
# Recreate the tmp volume at ~13.2 GiB (14173392076 bytes) and put a fresh ext4 filesystem on it
virsh shutdown "$VM" && \
    virsh vol-delete "${VM}-tmp" --pool lvm && \
    virsh vol-create-as lvm "${VM}-tmp" 14173392076 && \
    sudo mkfs.ext4 -m 0 "/dev/lizard/${VM}-tmp" && \
    virsh start "$VM"
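As a quick sanity check afterwards, something like the following should confirm the recreated volume’s capacity (a sketch):
# Confirm the recreated volume has the expected capacity
virsh vol-info "${VM}-tmp" --pool lvm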
#14 Updated by bertagaz 2015-12-20 07:32:43
- Assignee changed from bertagaz to intrigeri
- % Done changed from 50 to 80
- QA Check changed from Dev Needed to Ready for QA
Done for isotester1 and isotester4, almost following your guidelines: I shut down the VMs by hand before starting to play with the -tmp volumes, just to be sure they were down for real.
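For what it’s worth, a minimal sketch of how to confirm a VM is really down before touching its volumes:
# Should print "shut off" before the -tmp volume is deleted
virsh domstate isotester1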
#15 Updated by intrigeri 2015-12-20 10:15:21
- Status changed from In Progress to Resolved
- Assignee deleted (intrigeri)
- % Done changed from 80 to 100
- QA Check changed from Ready for QA to Pass
Looks good!