Feature #16041
Replace rotating drives with new SSDs on lizard
% Done: 100%
Description
On the sysadmin side I disabled the old rotating drives:
sudo vgremove spinninglizard
sudo pvremove /dev/mapper/md2_crypt
sudo cryptdisks_stop md2_crypt
sudo mdadm --stop /dev/md2
sudo sed -i --regexp-extended '/^md2_crypt/ d' /etc/crypttab
sudo sed -i --regexp-extended '/^ARRAY \/dev\/md\/2 / d' /etc/mdadm/mdadm.conf
sudo update-initramfs -u
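For context, a few hypothetical follow-up checks (not part of the original ticket) that would confirm nothing still references the old array after the steps above:
cat /proc/mdstat                                              # md2 should no longer be listed
sudo pvs && sudo vgs                                          # the spinninglizard VG and its PV should be gone
grep -E 'md2_crypt|md/2' /etc/crypttab /etc/mdadm/mdadm.conf  # should print nothing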
groente, can you please check if I forgot something? Then reassign to me so I handle the next steps.
Related issues
- Related to Tails - | Resolved | 2018-11-17
- Related to Tails - | Resolved | 2018-11-28
- Blocks Tails - Feature #13242: Core work: Sysadmin (Maintain our already existing services) | Confirmed | 2017-06-29
- Blocks Tails - | Resolved | 2018-11-27
History
#1 Updated by intrigeri 2018-10-11 11:46:03
- related to #15779 added
#2 Updated by intrigeri 2018-10-11 11:46:26
- blocks Feature #13242: Core work: Sysadmin (Maintain our already existing services) added
#3 Updated by groente 2018-10-11 13:11:02
- Assignee changed from groente to intrigeri
- QA Check changed from Ready for QA to Dev Needed
intrigeri wrote:
> On the sysadmin side I disabled the old rotating drives:
>
> […]
>
> groente, can you please check if I forgot something? Then reassign to me so I handle the next steps.
Apart from the systemd services that tried to bring md2_crypt back up again, which I already mentioned on XMPP, I think that pretty much covers it.
Just to be safe, I would recommend running grub-install again on the remaining disks (sda-sdf); it should already be there, but with the occasional ‘grub not found’ during lizard reboots, it’s better to be safe than sorry before pulling disks out.
#4 Updated by intrigeri 2018-10-12 07:44:22
- Target version changed from Tails_3.10.1 to Tails_3.11
#5 Updated by intrigeri 2018-10-17 09:22:27
groente wrote:
> Apart from the systemd services that tried to bring md2_crypt back up again, which I already mentioned on XMPP, I think that pretty much covers it.
FTR that was fixed.
> Just to be safe, I would recommend running grub-install again on the remaining disks (sda-sdf); it should already be there, but with the occasional ‘grub not found’ during lizard reboots, it’s better to be safe than sorry before pulling disks out.
Good idea! I did sudo dpkg-reconfigure grub-pc, selected /dev/sd[c-f], and let it install GRUB on those drives. Note that we can’t install GRUB on /dev/sd[ab] because there’s simply no room for it (fully encrypted, no partition table nor filesystem that GRUB can use).
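For reference, a minimal hand-rolled equivalent would look roughly like the sketch below; it only restates what the comment above says (install on sdc-sdf, skip the fully encrypted sda and sdb), and the device list is taken from that comment rather than verified against the machine:
for disk in /dev/sd{c,d,e,f}; do
    sudo grub-install "$disk"   # embed GRUB into each remaining bootable disk
done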
#6 Updated by intrigeri 2018-11-06 08:53:14
- Status changed from Confirmed to In Progress
#7 Updated by intrigeri 2018-11-17 09:43:53
- related to Bug #16131: Broken Samsung SSD 850 EVO 1TB on lizard added
#8 Updated by intrigeri 2018-11-17 09:45:52
Our BIOS was still configured to start on the rotating drives. I’ve fixed that.
Pinged taggart on IRC today.
#9 Updated by groente 2018-11-17 19:11:45
due to md1 being degraded (see Bug #16131), the following LVs will be moved from md1 to md4 (see the pvmove sketch after the list):
root
puppet-git-system *
apt-system
apt-data
rsync-system
bittorrent-system
apt-proxy-system
apt-proxy-data
whisperback-system
bitcoin-data **
jenkins-system
bridge-system
www-system
misc-system
puppet-git-data
bitcoin-system
bitcoin-swap
isos-www **
isotester1-system
im-system
monitor-system
isotester2-system
isotester3-system
isotester4-system
isotester4-data
apt-snapshots **
isotester5-system
isotester5-data
isotester6-system
isotester6-data
translate-system **
isobuilder1-system
isobuilder4-system
isobuilder3-system
isobuilder3-data **
isobuilder2-system
isobuilder2-libvirt **
isobuilder3-libvirt **
isobuilder4-libvirt **
isobuilder1-libvirt **
apt-proxy-swap
LVs marked * also have a foot in md3; only the parts from md1 will be moved
LVs marked ** were already partially on md4
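As a rough illustration of what such a move involves (not a literal transcript from lizard; the md1_crypt/md4_crypt PV names are assumptions modelled on the md2_crypt naming used in the description), each LV’s extents can be migrated off the degraded array with pvmove:
sudo pvmove -n root /dev/mapper/md1_crypt /dev/mapper/md4_crypt
# repeated for each LV in the list above; pvmove copies extents while the
# LVs stay online, so services keep running during the migration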
#10 Updated by intrigeri 2018-11-20 19:42:18
- Assignee changed from intrigeri to bertagaz
- % Done changed from 0 to 10
Old drives pulled out, new drives plugged in. Please do the basic setup of the new drives (or ask me to do it) and reassign to me so I do the next steps. See ML for required timing & technical details. Thanks in advance!
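The ticket does not spell out what the basic setup consists of; judging from the teardown commands in the description and the md5 array mentioned later, it presumably mirrors the existing stack (RAID + LUKS + LVM PV). A hypothetical sketch, with the member devices, RAID level, and VG name all assumed:
sudo mdadm --create /dev/md5 --level=1 --raid-devices=2 /dev/sdX /dev/sdY
sudo cryptsetup luksFormat /dev/md5
sudo cryptsetup open /dev/md5 md5_crypt
sudo pvcreate /dev/mapper/md5_crypt
sudo vgextend <vg> /dev/mapper/md5_crypt   # VG name not stated in the ticket
# plus the reverse of the teardown above: an ARRAY line in mdadm.conf, an
# md5_crypt entry in /etc/crypttab, and another update-initramfs -u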
#11 Updated by groente 2018-11-27 10:49:34
- Assignee changed from bertagaz to groente
stealing this ticket because we need the diskspace for the sprint
#12 Updated by groente 2018-11-27 11:34:06
- blocks Bug #16155: increase jenkins and iso-archive diskspace added
#13 Updated by intrigeri 2018-11-27 18:30:23
Regarding spreading the I/O load again across PVs, aka RAID arrays:
- this much seems obvious: spread the ISO builders & testers over at least 2 arrays; they don’t use that much I/O though (we’ve set up stuff & memory to minimize I/O needs here)
- top IOPS consumers (average IOPS over a week, max of read & write): jenkins-data (147.91), apt-snapshots (36.66), translate-system (10.90), apt-proxy-data (7.44), puppet-git-system (6.81), isos (4.86), bitcoin-data (3.80)
- ISO builders & testers, when busy, make other volumes busy (mainly jenkins-data, apt-snapshots, apt-proxy-data); let’s separate them if we can
So let’s try this:
- md3 (old, 500GB): translate-system, apt-proxy-data, puppet-git-system, bitcoin-data, half of Jenkins workers (isobuilders 1-2, isotesters 1-3)
- md4 (old, 2TB): jenkins-data, isos, 1/4 of ISO builders & testers (isobuilder3, isotester4)
- md5 (new, 4TB): apt-snapshots, 1/4 of Jenkins workers (isobuilder4, isotesters 5-6)
I’ll do this once the lower part of the stack is ready, and a week or two later I’ll check latency and IOPS per PV, which should tell me how good or bad this first iteration was.
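For what it’s worth, a minimal sketch of that kind of per-PV check, assuming the sysstat tools are installed (the ticket does not say which tool produced the IOPS averages quoted above):
iostat -dxN 60
# -N shows device-mapper names so the *_crypt PVs are recognisable;
# r/s + w/s give IOPS, r_await/w_await give latency in milliseconds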
#14 Updated by groente 2018-11-27 18:37:40
- Assignee changed from groente to intrigeri
- QA Check changed from Dev Needed to Pass
go for it, once that’s done i think this ticket can be closed \o/
#15 Updated by intrigeri 2018-11-28 07:50:36
- % Done changed from 10 to 50
- QA Check deleted (Pass)
#16 Updated by intrigeri 2018-11-28 11:25:05
Amending the plan: md3 would be too full if we do exactly that, so I’ll move the isobuilder2 stuff to md5 instead.
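A quick way to sanity-check this kind of capacity question (a hypothetical command, not quoted from the ticket) is to compare size and free space per PV:
sudo pvs -o pv_name,vg_name,pv_size,pv_free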
#17 Updated by groente 2018-11-28 11:30:42
- related to Bug #16161: optimise pv placement for io-performance added
#18 Updated by groente 2018-11-28 11:31:55
- Status changed from In Progress to Resolved
- Target version deleted (Tails_3.11)
- % Done changed from 50 to 100
all done with the disk replacement, created a new ticket for the pv-switcheroo