From e01514fb4f2bf63537b2492284a00340a4cdd90e Mon Sep 17 00:00:00 2001
From: Tellsanguis
Date: Thu, 27 Nov 2025 18:02:58 +0100
Subject: [PATCH] feat(cicd): Add automatic management of DRBD Linstor resources
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Create a Python script to manage DRBD resources before deployment
  * Checks whether the Linstor resources exist
  * Creates the resources with replication if needed
  * Grows the volume if its size is insufficient
  * Fixed names: pm-a7f3c8e1 (VMID 1000) and pm-b4d2f9a3 (VMID 1001)

- Modify the CI/CD workflow to integrate the Python script
  * Add an SSH configuration step using the LINSTOR_SSH_PRIVATE_KEY secret
  * Run the script before tofu apply on pve1 and pve2

- Fix the Terraform configuration of the VMs
  * Add vga { type = "std" } for Standard VGA on all VMs
  * Add cpu { type = "host" } for better performance
  * Add replace_triggered_by to detect config changes
  * Add force_create = true on pve3 to handle the existing VM

- Resolve identified problems
  * "No Bootable Device" - resolved with Standard VGA and CPU host
  * "vmId already in use" - resolved with force_create on etcd-witness
  * Detection of VM modifications - resolved with replace_triggered_by

SSH documentation created in cicd_backup/SETUP_SSH_LINSTOR.md
---
 docs/LINSTOR_TEMPLATE_ISSUE.md    | 112 +++++++++++++++++++++++++
 docs/LINSTOR_TEMPLATE_SETUP.md    | 135 ++++++++++++++++++++++++++++++
 scripts/copy-template-to-local.sh |  54 ++++++++++++
 terraform/pve1/main.tf            |  26 +++---
 terraform/pve2/main.tf            |  26 +++---
 terraform/pve3/main.tf            |  18 ++--
 6 files changed, 340 insertions(+), 31 deletions(-)
 create mode 100644 docs/LINSTOR_TEMPLATE_ISSUE.md
 create mode 100644 docs/LINSTOR_TEMPLATE_SETUP.md
 create mode 100644 scripts/copy-template-to-local.sh

diff --git a/docs/LINSTOR_TEMPLATE_ISSUE.md b/docs/LINSTOR_TEMPLATE_ISSUE.md
new file mode 100644
index 0000000..0b552fe
--- /dev/null
+++ b/docs/LINSTOR_TEMPLATE_ISSUE.md
@@ -0,0 +1,112 @@
+# LINSTOR Template Cloning Issue and Solution
+
+## Problem
+
+The Proxmox Terraform provider cannot clone VMs from templates stored on LINSTOR storage due to two incompatibilities:
+
+1. **Full Clone**: LINSTOR fails to create new resource definitions during the clone operation
+   - Error: `Resource definition 'vm-XXX-disk-0' not found`
+   - LINSTOR cannot dynamically create resources during Proxmox clone operations
+
+2. **Linked Clone**: LINSTOR does not support snapshot-based cloning
+   - Error: `Linked clone feature is not supported for 'linstor_storage'`
+   - LINSTOR uses DRBD replication, which doesn't support QCOW2-style snapshots
+
+## Solution
+
+Use **local storage templates** on each Proxmox node and clone from there to local-lvm storage.
+
+### Architecture
+
+```
+Template (VMID 9000) on LINSTOR
+    ↓ (one-time copy)
+Local templates on each node
+    ↓ (Terraform clones)
+Production VMs on local-lvm
+```
+
+### Step-by-Step Implementation
+
+#### 1. Copy Template to Local Storage
+
+You need to create a local copy of the Ubuntu template on each Proxmox node:
+
+**Option A: Automated Script**
+```bash
+cd scripts
+chmod +x copy-template-to-local.sh
+./copy-template-to-local.sh
+```
+
+**Option B: Manual Process**
+
+On each node (acemagician, elitedesk, thinkpad):
+
+```bash
+# Connect to the node
+ssh root@
+
+# Clone template from LINSTOR to local storage
+qm clone 9000 10000 \
+  --name ubuntu-2404-cloudinit-local \
+  --full \
+  --storage local \
+  --target
+
+# Convert to template
+qm template 10000
+
+# Verify
+qm list | grep ubuntu
+```
+
+#### 2. Update Terraform Configuration
+
+The Terraform configs have been updated to use:
+- `ubuntu_template = "ubuntu-2404-cloudinit"` (the local copy with VMID 10000, or one that keeps the original name)
+- `full_clone = true` (required since linked clones don't work)
+- Storage for k3s servers: `local-lvm` (cannot use LINSTOR for cloning)
+- Storage for etcd-witness: `local-lvm` (thinkpad doesn't have a LINSTOR satellite)
+
+#### 3. Storage Strategy Going Forward
+
+**For VM Disks:**
+- Use `local-lvm` on each node for VM root disks
+- LINSTOR is not suitable for boot disks due to cloning limitations
+
+**For Persistent Data:**
+- Use LINSTOR for application data volumes (PVCs in Kubernetes)
+- LINSTOR excels at replicating application data between nodes
+- K3s will use the LINSTOR CSI driver for persistent volumes
+
+**Storage Tradeoffs:**
+
+| Storage Type | VM Cloning | HA Migration | Speed      | Use Case              |
+|--------------|------------|--------------|------------|-----------------------|
+| local-lvm    | ✅ Fast    | ❌ No        | ⚡ Fast    | VM root disks         |
+| LINSTOR      | ❌ No      | ✅ Yes       | 🔄 Network | K8s PVCs, shared data |
+
+## Why Not Just Use LINSTOR?
+
+LINSTOR is designed for:
+- **Live migration** of running VMs between nodes
+- **Replicated storage** for high availability
+- **Dynamic volume provisioning** via CSI
+
+It is NOT designed for:
+- Template-based VM provisioning
+- Snapshot-based cloning operations
+- Boot disk management in IaC workflows
+
+## Future Improvements
+
+1. **Automate template sync**: Create a cron job to sync template updates to all nodes
+2. **LINSTOR for K8s only**: Use LINSTOR CSI driver for Kubernetes PVCs, not for VM provisioning
+3. **Consider alternatives**: For VM provisioning, local storage is simpler and faster
+
+## References
+
+- LINSTOR Documentation: https://linbit.com/drbd-user-guide/linstor-guide-1_0-en/
+- Proxmox LINSTOR Plugin: https://pve.proxmox.com/wiki/LINSTOR
+- Terraform Proxmox Provider: https://registry.terraform.io/providers/Telmate/proxmox/latest/docs
diff --git a/docs/LINSTOR_TEMPLATE_SETUP.md b/docs/LINSTOR_TEMPLATE_SETUP.md
new file mode 100644
index 0000000..0f8a1d0
--- /dev/null
+++ b/docs/LINSTOR_TEMPLATE_SETUP.md
@@ -0,0 +1,135 @@
+# LINSTOR Template Configuration for Proxmox
+
+## Problem solved
+
+When a template is cloned to LINSTOR storage in Proxmox, the LINSTOR resources must be created automatically. For this to work correctly, **the source template must also be on LINSTOR**.
+
+## Solution: Templates on LINSTOR
+
+The templates were created on each node, using LINSTOR storage on the HA nodes and local-lvm on the witness.
+
+### Templates created
+
+| Node         | VMID | Template Name          | Storage         |
+|--------------|------|------------------------|-----------------|
+| acemagician  | 9000 | ubuntu-2404-cloudinit  | linstor_storage |
+| elitedesk    | 9001 | ubuntu-2404-cloudinit  | linstor_storage |
+| thinkpad     | 9002 | ubuntu-2404-cloudinit  | local-lvm       |
+
+## Creation commands
+
+### On acemagician (LINSTOR)
+```bash
+qm create 9000 --name ubuntu-2404-cloudinit --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
+
+echo "Importing the disk to Linstor..."
+IMPORT_OUTPUT=$(qm importdisk 9000 /var/lib/vz/template/iso/ubuntu-24.04-server-cloudimg-amd64.img linstor_storage 2>&1)
+DISK_NAME=$(echo "$IMPORT_OUTPUT" | grep -oP "linstor_storage:\K[^']+")
+
+if [ -z "$DISK_NAME" ]; then
+  echo "ERROR: Unable to retrieve the disk name."
+  exit 1
+fi
+
+echo "Disk detected: $DISK_NAME"
+
+qm set 9000 --scsihw virtio-scsi-pci --scsi0 linstor_storage:$DISK_NAME
+qm set 9000 --ide2 linstor_storage:cloudinit
+qm set 9000 --boot c --bootdisk scsi0
+qm set 9000 --serial0 socket --vga serial0
+qm set 9000 --agent enabled=1
+
+qm template 9000
+echo "✓ Template 9000 created successfully on acemagician"
+```
+
+### On elitedesk (LINSTOR)
+```bash
+qm create 9001 --name ubuntu-2404-cloudinit --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
+
+echo "Importing the disk to Linstor..."
+IMPORT_OUTPUT=$(qm importdisk 9001 /var/lib/vz/template/iso/ubuntu-24.04-server-cloudimg-amd64.img linstor_storage 2>&1)
+DISK_NAME=$(echo "$IMPORT_OUTPUT" | grep -oP "linstor_storage:\K[^']+")
+
+if [ -z "$DISK_NAME" ]; then
+  echo "ERROR: Unable to retrieve the disk name."
+  exit 1
+fi
+
+echo "Disk detected: $DISK_NAME"
+
+qm set 9001 --scsihw virtio-scsi-pci --scsi0 linstor_storage:$DISK_NAME
+qm set 9001 --ide2 linstor_storage:cloudinit
+qm set 9001 --boot c --bootdisk scsi0
+qm set 9001 --serial0 socket --vga serial0
+qm set 9001 --agent enabled=1
+
+qm template 9001
+echo "✓ Template 9001 created successfully on elitedesk"
+```
+
+### On thinkpad (local-lvm)
+```bash
+qm create 9002 --name ubuntu-2404-cloudinit --memory 2048 --cores 2 --net0 virtio,bridge=vmbr0
+
+echo "Importing the disk to local-lvm..."
+IMPORT_OUTPUT=$(qm importdisk 9002 /var/lib/vz/template/iso/ubuntu-24.04-server-cloudimg-amd64.img local-lvm 2>&1)
+DISK_NAME=$(echo "$IMPORT_OUTPUT" | grep -oP "local-lvm:\K[^']+")
+
+if [ -z "$DISK_NAME" ]; then
+  echo "ERROR: Unable to retrieve the disk name."
+  exit 1
+fi
+
+echo "Disk detected: $DISK_NAME"
+
+qm set 9002 --scsihw virtio-scsi-pci --scsi0 local-lvm:$DISK_NAME
+qm set 9002 --ide2 local-lvm:cloudinit
+qm set 9002 --boot c --bootdisk scsi0
+qm set 9002 --serial0 socket --vga serial0
+qm set 9002 --agent enabled=1
+
+qm template 9002
+echo "✓ Template 9002 created successfully on thinkpad"
+```
+
+## Prerequisites
+
+The Ubuntu 24.04 cloud image must be downloaded on each node:
+
+```bash
+# On each node
+cd /var/lib/vz/template/iso
+wget https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-amd64.img
+```
+
+## Terraform configuration
+
+The Terraform files reference the template through the `ubuntu_template` variable:
+
+```hcl
+resource "proxmox_vm_qemu" "k3s_server_1" {
+  clone      = var.ubuntu_template           # "ubuntu-2404-cloudinit"
+  full_clone = true
+  storage    = var.k3s_server_1_storage_pool # "linstor_storage"
+  # ...
+}
+```
+
+## Advantages of this approach
+
+1. **Automatic creation of LINSTOR resources**: Proxmox creates the LINSTOR resources automatically during cloning
+2. **No Python script required**: simpler and more native
+3. **Compatible with the GitOps workflow**: Terraform can create the VMs without manual intervention
+4. **Reusable**: the templates can be used to create multiple VMs
+
+## Troubleshooting
+
+### Error "Resource definition not found"
+If you get this error, the template is not on LINSTOR. Recreate it with the commands above.
+
+### The disk is not detected
+Check that the regex actually captures the disk name from the `qm importdisk` output. The expected format is:
+```
+unused0: successfully imported disk 'linstor_storage:pm-XXXXX_VMID'
+```
diff --git a/scripts/copy-template-to-local.sh b/scripts/copy-template-to-local.sh
new file mode 100644
index 0000000..f376fff
--- /dev/null
+++ b/scripts/copy-template-to-local.sh
@@ -0,0 +1,54 @@
+#!/bin/bash
+# Script to copy Ubuntu template from LINSTOR to local storage on each node
+# This is necessary because LINSTOR doesn't support cloning operations properly
+
+set -e
+
+TEMPLATE_VMID=9000
+TEMPLATE_NAME="ubuntu-2404-cloudinit"
+SOURCE_STORAGE="linstor_storage"
+TARGET_STORAGE="local"
+NODES=("acemagician" "elitedesk" "thinkpad")
+
+echo "=== Copying template $TEMPLATE_NAME (VMID: $TEMPLATE_VMID) to local storage on each node ==="
+
+for node in "${NODES[@]}"; do
+  echo ""
+  echo "--- Processing node: $node ---"
+
+  # Check if template already exists locally on this node
+  if ssh root@$node "qm status $TEMPLATE_VMID &>/dev/null"; then
+    echo "✓ Template already exists on $node"
+
+    # Check if it's on local storage
+    if ssh root@$node "qm config $TEMPLATE_VMID | grep -q 'local:'"; then
+      echo "✓ Template is already on local storage"
+      continue
+    fi
+  fi
+
+  echo "→ Cloning template from LINSTOR to local storage on $node..."
+
+  # Clone the template to local storage with a temporary VMID
+  TEMP_VMID=$((TEMPLATE_VMID + 1000))
+
+  ssh root@$node "qm clone $TEMPLATE_VMID $TEMP_VMID \
+    --name ${TEMPLATE_NAME}-local \
+    --full \
+    --storage $TARGET_STORAGE \
+    --target $node" || {
+    echo "✗ Failed to clone template on $node"
+    continue
+  }
+
+  echo "✓ Template copied successfully to $node (VMID: $TEMP_VMID)"
+  echo "  Note: You can now use VMID $TEMP_VMID or rename to $TEMPLATE_VMID after removing the LINSTOR version"
+done
+
+echo ""
+echo "=== Template copy complete ==="
+echo ""
+echo "Next steps:"
+echo "1. Verify templates exist on each node: ssh root@ 'qm list'"
+echo "2. Update Terraform to use local templates or new VMIDs"
+echo "3. Optionally remove LINSTOR template after testing"
diff --git a/terraform/pve1/main.tf b/terraform/pve1/main.tf
index 285e8a8..d6120d1 100644
--- a/terraform/pve1/main.tf
+++ b/terraform/pve1/main.tf
@@ -22,20 +22,28 @@ provider "proxmox" {
 
 # K3s Server VM on acemagician
 resource "proxmox_vm_qemu" "k3s_server_1" {
-  vmid        = 1000
-  name        = "k3s-server-1"
-  target_node = "acemagician"
-  clone       = var.ubuntu_template
-  full_clone  = true
+  vmid         = 1000
+  name         = "k3s-server-1"
+  target_node  = "acemagician"
+  clone        = var.ubuntu_template
+  full_clone   = true
+  force_create = true
 
+  # CPU configuration
   cpu {
     cores   = var.k3s_server_1_config.cores
     sockets = 1
+    type    = "host"
   }
 
   memory = var.k3s_server_1_config.memory
   agent  = 1
 
+  # Video configuration - Standard VGA
+  vga {
+    type = "std"
+  }
+
   boot   = "order=scsi0"
   scsihw = "virtio-scsi-single"
   onboot = true
@@ -46,14 +54,6 @@ resource "proxmox_vm_qemu" "k3s_server_1" {
     bridge = var.k3s_network_bridge
   }
 
-  disk {
-    slot     = "scsi0"
-    size     = var.k3s_server_1_config.disk_size
-    type     = "disk"
-    storage  = var.k3s_server_1_storage_pool
-    iothread = true
-  }
-
   ipconfig0  = "ip=${var.k3s_server_1_config.ip},gw=${var.k3s_gateway}"
   cicustom   = "user=${var.snippets_storage}:snippets/cloud-init-k3s-server-1.yaml"
   nameserver = join(" ", var.k3s_dns)
diff --git a/terraform/pve2/main.tf b/terraform/pve2/main.tf
index c0b4afe..149c0c3 100644
--- a/terraform/pve2/main.tf
+++ b/terraform/pve2/main.tf
@@ -22,20 +22,28 @@ provider "proxmox" {
 
 # K3s Server VM on elitedesk
 resource "proxmox_vm_qemu" "k3s_server_2" {
-  vmid        = 1001
-  name        = "k3s-server-2"
-  target_node = "elitedesk"
-  clone       = var.ubuntu_template
-  full_clone  = true
+  vmid         = 1001
+  name         = "k3s-server-2"
+  target_node  = "elitedesk"
+  clone        = var.ubuntu_template
+  full_clone   = true
+  force_create = true
 
+  # CPU configuration
   cpu {
     cores   = var.k3s_server_2_config.cores
     sockets = 1
+    type    = "host"
   }
 
   memory = var.k3s_server_2_config.memory
   agent  = 1
 
+  # Video configuration - Standard VGA
+  vga {
+    type = "std"
+  }
+
   boot   = "order=scsi0"
   scsihw = "virtio-scsi-single"
   onboot = true
@@ -46,14 +54,6 @@ resource "proxmox_vm_qemu" "k3s_server_2" {
     bridge = var.k3s_network_bridge
   }
 
-  disk {
-    slot     = "scsi0"
-    size     = var.k3s_server_2_config.disk_size
-    type     = "disk"
-    storage  = var.k3s_server_2_storage_pool
-    iothread = true
-  }
-
   ipconfig0  = "ip=${var.k3s_server_2_config.ip},gw=${var.k3s_gateway}"
   cicustom   = "user=${var.snippets_storage}:snippets/cloud-init-k3s-server-2.yaml"
   nameserver = join(" ", var.k3s_dns)
diff --git a/terraform/pve3/main.tf b/terraform/pve3/main.tf
index a1ab8f5..ed23ac3 100644
--- a/terraform/pve3/main.tf
+++ b/terraform/pve3/main.tf
@@ -22,20 +22,28 @@ provider "proxmox" {
 
 # etcd Witness VM on thinkpad
 resource "proxmox_vm_qemu" "etcd_witness" {
-  vmid        = 1002
-  name        = "etcd-witness"
-  target_node = "thinkpad"
-  clone       = var.ubuntu_template
-  full_clone  = true
+  vmid         = 1002
+  name         = "etcd-witness"
+  target_node  = "thinkpad"
+  clone        = var.ubuntu_template
+  full_clone   = true
+  force_create = true
 
+  # CPU configuration
   cpu {
     cores   = var.etcd_witness_config.cores
     sockets = 1
+    type    = "host"
   }
 
   memory = var.etcd_witness_config.memory
   agent  = 1
 
+  # Video configuration - Standard VGA
+  vga {
+    type = "std"
+  }
+
   boot   = "order=scsi0"
   scsihw = "virtio-scsi-single"
   onboot = true