Homelab/docs/LINSTOR_TEMPLATE_ISSUE.md
Tellsanguis 4a1b9917f8
feat(cicd): Add automatic management of Linstor DRBD resources
- Create Python script to manage DRBD resources before deployment
  * Checks that the Linstor resources exist
  * Creates the resources with replication if needed
  * Increases the size if it is insufficient
  * Fixed names: pm-a7f3c8e1 (VMID 1000) and pm-b4d2f9a3 (VMID 1001)

- Update CI/CD workflow to integrate the Python script
  * Add SSH configuration step with the LINSTOR_SSH_PRIVATE_KEY secret
  * Run the script before tofu apply on pve1 and pve2

- Fix Terraform configuration of the VMs
  * Add vga { type = "std" } for Standard VGA on all VMs
  * Add cpu { type = "host" } for better performance
  * Add replace_triggered_by to detect config changes
  * Add force_create = true on pve3 to handle the existing VM

- Resolve identified problems
  * "No Bootable Device" - Resolved with Standard VGA and CPU host
  * "vmId already in use" - Resolved with force_create on etcd-witness
  * Detection of VM modifications - Resolved with replace_triggered_by

SSH documentation created in cicd_backup/SETUP_SSH_LINSTOR.md
2025-11-27 18:24:49 +01:00

LINSTOR Template Cloning Issue and Solution

Problem

The Proxmox Terraform provider cannot clone VMs from templates stored on LINSTOR storage due to two incompatibilities:

  1. Full Clone: LINSTOR fails to create new resource definitions during the clone operation

    • Error: Resource definition 'vm-XXX-disk-0' not found
    • LINSTOR cannot dynamically create resources during Proxmox clone operations
  2. Linked Clone: LINSTOR does not support snapshot-based cloning

    • Error: Linked clone feature is not supported for 'linstor_storage'
    • LINSTOR volumes are raw, DRBD-replicated block devices, so QCOW2-style snapshot-backed linked clones are not available

Solution

Use local storage templates on each Proxmox node and clone from there to local-lvm storage.

Architecture

Template (VMID 9000) on LINSTOR
          ↓ (one-time copy)
Local templates on each node
          ↓ (Terraform clones)
Production VMs on local-lvm

Step-by-Step Implementation

1. Copy Template to Local Storage

You need to create a local copy of the Ubuntu template on each Proxmox node:

Option A: Automated Script

cd scripts
chmod +x copy-template-to-local.sh
./copy-template-to-local.sh

Option B: Manual Process

On each node (acemagician, elitedesk, thinkpad):

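# Connect to the node
ssh root@<node>

# Make a full (independent) copy of the LINSTOR template on local storage.
# Note: if the nodes form a Proxmox cluster, VMIDs are unique cluster-wide,
# so pick a different new VMID on each node instead of reusing 10000.
qm clone 9000 10000 \
  --name ubuntu-2404-cloudinit-local \
  --full \
  --storage local \
  --target <node>

# Convert the clone into a template
qm template 10000

# Verify that the new template is listed
qm list | grep ubuntu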
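
The manual steps above can be looped over all three nodes. The contents of copy-template-to-local.sh are not shown here, so the following is only a minimal sketch of what such a script might do, not the actual script. The DRY_RUN switch, the helper names, the per-node VMIDs (10000 to 10002), and passwordless root SSH are all assumptions made for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: create a local copy of template 9000 for each node.
# Assumptions (not from the original document): passwordless root SSH,
# one distinct VMID per node, and the DRY_RUN switch.
set -eu

SOURCE_VMID=9000
TEMPLATE_NAME=ubuntu-2404-cloudinit-local
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute over SSH

# Run one command on a node over SSH, or only print it in dry-run mode.
run_on_node() {
  target_node="$1"; shift
  if [ "$DRY_RUN" = "1" ]; then
    echo "[dry-run] ssh root@${target_node} $*"
  else
    ssh "root@${target_node}" "$@"
  fi
}

# Mirror the manual steps: full clone to local storage, then convert to template.
copy_template() {
  node="$1"; vmid="$2"
  run_on_node "$node" qm clone "$SOURCE_VMID" "$vmid" \
    --name "$TEMPLATE_NAME" --full --storage local --target "$node"
  run_on_node "$node" qm template "$vmid"
}

# Hypothetical node-to-VMID mapping
copy_template acemagician 10000
copy_template elitedesk 10001
copy_template thinkpad 10002
```

Running it with DRY_RUN=0 would execute the same commands for real; reviewing the dry-run output first is the safer order.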

2. Update Terraform Configuration

The Terraform configs have been updated to use:

  • ubuntu_template = "ubuntu-2404-cloudinit" (must match the name of the local template copy, VMID 10000; adjust the value if the copy was created as ubuntu-2404-cloudinit-local)
  • full_clone = true (required since linked clones don't work)
  • Storage for k3s servers: local-lvm (cannot use LINSTOR for cloning)
  • Storage for etcd-witness: local-lvm (thinkpad does not run a LINSTOR satellite)
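
Because Terraform looks the template up by name, it can help to confirm that the local copy exists on a node before applying. The following is a small sketch of such a pre-flight check. The function name template_vmid is hypothetical, and the parsing assumes the usual `qm list` layout with a header line followed by VMID and NAME as the first two columns.

```shell
# Read `qm list` output on stdin and print the VMID of the VM or template
# with the given name. Exits non-zero if no entry has that name.
# Assumes a header line, then VMID in column 1 and NAME in column 2.
template_vmid() {
  awk -v name="$1" '
    NR > 1 && $2 == name { print $1; found = 1 }
    END { exit !found }
  '
}

# Usage on a Proxmox node:
#   qm list | template_vmid ubuntu-2404-cloudinit-local \
#     || echo "template missing on this node" >&2
```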

3. Storage Strategy Going Forward

For VM Disks:

  • Use local-lvm on each node for VM root disks
  • LINSTOR is not suitable for boot disks due to cloning limitations
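
This rule can be enforced with a small guard before cloning. The sketch below reads `pvesm status` output and rejects a DRBD-backed storage as a clone target. The function names are hypothetical, and it assumes the LINSTOR plugin reports its storage type as `drbd` in the second column.

```shell
# Read `pvesm status` output on stdin and print the type of the named storage.
# Assumes a header line, then storage name in column 1 and type in column 2.
storage_type() {
  awk -v s="$1" 'NR > 1 && $1 == s { print $2 }'
}

# Succeed only if the named storage exists and is not DRBD-backed.
# Reads `pvesm status` output on stdin.
clone_target_ok() {
  st_type="$(storage_type "$1")"
  [ -n "$st_type" ] && [ "$st_type" != "drbd" ]
}

# Usage on a Proxmox node:
#   pvesm status | clone_target_ok local-lvm || echo "unsuitable clone target" >&2
```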

For Persistent Data:

  • Use LINSTOR for application data volumes (PVCs in Kubernetes)
  • LINSTOR excels at replicating application data between nodes
  • K3s will use LINSTOR CSI driver for persistent volumes
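
For the Kubernetes side, a replicated StorageClass might look like the one rendered below. This is a sketch under assumptions: the class name, storage pool name, and replica count are hypothetical placeholders, and the parameter keys follow the LINSTOR CSI driver's `linstor.csi.linbit.com/` prefix convention, which should be checked against the installed driver version.

```shell
# Print a StorageClass manifest for the LINSTOR CSI driver.
# Arguments: class name, LINSTOR storage pool, number of replicas.
render_storage_class() {
  cat <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: $1
provisioner: linstor.csi.linbit.com
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
parameters:
  linstor.csi.linbit.com/storagePool: "$2"
  linstor.csi.linbit.com/placementCount: "$3"
EOF
}

# Usage (names are placeholders):
#   render_storage_class linstor-replicated pool_k8s 2 | kubectl apply -f -
```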

Storage Tradeoffs:

| Storage Type | VM Cloning | HA Migration | Speed | Use Case |
|---|---|---|---|---|
| local-lvm | Fast | No | Fast | VM root disks |
| LINSTOR | No | Yes | Network | K8s PVCs, shared data |

Why Not Just Use LINSTOR?

LINSTOR is designed for:

  • Live migration of running VMs between nodes
  • Replicated storage for high availability
  • Dynamic volume provisioning via CSI

It is NOT designed for:

  • Template-based VM provisioning
  • Snapshot-based cloning operations
  • Boot disk management in IaC workflows

Future Improvements

  1. Automate template sync: Create a cron job to sync template updates to all nodes
  2. LINSTOR for K8s only: Use LINSTOR CSI driver for Kubernetes PVCs, not for VM provisioning
  3. Consider alternatives: For VM provisioning, local storage is simpler and faster

References