- Create a Python script to manage DRBD resources before deployment
  * Checks that the LINSTOR resources exist
  * Creates the resources with replication if needed
  * Grows a resource if its size is insufficient
  * Fixed names: pm-a7f3c8e1 (VMID 1000) and pm-b4d2f9a3 (VMID 1001)
- Modify the CI/CD workflow to integrate the Python script
  * Add an SSH configuration step using the LINSTOR_SSH_PRIVATE_KEY secret
  * Run the script before tofu apply on pve1 and pve2
- Fix the Terraform configuration of the VMs
  * Add vga { type = "std" } for Standard VGA on all VMs
  * Add cpu { type = "host" } for better performance
  * Add replace_triggered_by to detect config changes
  * Add force_create = true on pve3 to handle the existing VM
- Resolve identified problems
  * "No Bootable Device" - resolved with Standard VGA and CPU host
  * "vmId already in use" - resolved with force_create on etcd-witness
  * Detection of VM modifications - resolved with replace_triggered_by
SSH documentation created in cicd_backup/SETUP_SSH_LINSTOR.md
112 lines
3.5 KiB
Markdown
# LINSTOR Template Cloning Issue and Solution

## Problem

The Proxmox Terraform provider cannot clone VMs from templates stored on LINSTOR storage due to two incompatibilities:

1. **Full Clone**: LINSTOR fails to create new resource definitions during the clone operation
   - Error: `Resource definition 'vm-XXX-disk-0' not found`
   - LINSTOR cannot dynamically create resources during Proxmox clone operations

2. **Linked Clone**: LINSTOR does not support snapshot-based cloning
   - Error: `Linked clone feature is not supported for 'linstor_storage'`
   - LINSTOR uses DRBD replication, which doesn't support QCOW2-style snapshots

## Solution

Use **local storage templates** on each Proxmox node and clone from there to local-lvm storage.

### Architecture

```
Template (VMID 9000) on LINSTOR
↓ (one-time copy)
Local templates on each node
↓ (Terraform clones)
Production VMs on local-lvm
```

### Step-by-Step Implementation

#### 1. Copy Template to Local Storage

You need to create a local copy of the Ubuntu template on each Proxmox node:

**Option A: Automated Script**
```bash
cd scripts
chmod +x copy-template-to-local.sh
./copy-template-to-local.sh
```

**Option B: Manual Process**

On each node (acemagician, elitedesk, thinkpad):

```bash
# Connect to the node
ssh root@<node>

# Clone template from LINSTOR to local storage
# (if the nodes form a Proxmox cluster, VMIDs are cluster-wide unique,
# so use a target VMID other than 10000 on each additional node)
qm clone 9000 10000 \
  --name ubuntu-2404-cloudinit-local \
  --full \
  --storage local \
  --target <node>

# Convert to template
qm template 10000

# Verify
qm list | grep ubuntu
```

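The manual steps above could be scripted along the following lines. This is only a sketch, not the contents of `copy-template-to-local.sh`, which are not shown here. The `DRY_RUN` switch, the helper names, and the per-node VMID numbering (10000, 10001, ...) are assumptions made for illustration; the numbering is there because VMIDs must be unique if the nodes share a cluster. By default it only prints the commands it would run.

```bash
# Sketch only: repeats the manual clone/template steps for every node.
set -eu

SOURCE_VMID=9000                          # template on LINSTOR
LOCAL_NAME="ubuntu-2404-cloudinit-local"  # name used in the manual steps
NODES="acemagician elitedesk thinkpad"
DRY_RUN="${DRY_RUN:-1}"                   # assumption: 1 = print only, 0 = execute

# Run a command on a node over SSH, or just print it in dry-run mode.
run_on_node() {
  host="$1"; shift
  if [ "$DRY_RUN" = "1" ]; then
    echo "ssh root@${host} $*"
  else
    ssh "root@${host}" "$@"
  fi
}

# Clone the LINSTOR template to local storage, then mark the copy as a template.
copy_template() {
  tnode="$1"; tvmid="$2"
  run_on_node "$tnode" qm clone "$SOURCE_VMID" "$tvmid" \
    --name "$LOCAL_NAME" --full --storage local --target "$tnode"
  run_on_node "$tnode" qm template "$tvmid"
}

# Hypothetical numbering: one VMID per node, starting at 10000.
vmid=10000
for node in $NODES; do
  copy_template "$node" "$vmid"
  vmid=$((vmid + 1))
done
```

Running it as-is prints two `ssh` lines per node; setting `DRY_RUN=0` would execute them, which is worth doing only after checking the printed commands against your own cluster layout.
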
#### 2. Update Terraform Configuration

The Terraform configs have been updated to use:
- `ubuntu_template = "ubuntu-2404-cloudinit"` (the local copy with VMID 10000, or the original name if it is kept)
- `full_clone = true` (required because linked clones don't work)
- Storage for k3s servers: `local-lvm` (LINSTOR cannot be used for cloning)
- Storage for etcd-witness: `local-lvm` (thinkpad doesn't have a LINSTOR satellite)

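Since the VMs are now cloned from the local template, a deployment fails late if that template is missing on a node. A pre-flight check along these lines could run before `tofu apply`. The function name is hypothetical, and the check assumes that `qm config` prints a `template: 1` line for templates.

```bash
# Sketch only: succeeds if the `qm config` output on stdin describes a template.
# Assumes templates carry a "template: 1" line in their config.
is_template() {
  grep -q '^template: 1$'
}

# Hypothetical usage on a node, before running `tofu apply`:
#   qm config 10000 | is_template || { echo "VMID 10000 is not a template" >&2; exit 1; }
```

If `qm config` fails because the VMID does not exist, the function sees empty input and also reports failure, so one check covers both the "missing" and "not converted" cases.
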
#### 3. Storage Strategy Going Forward

**For VM Disks:**
- Use `local-lvm` on each node for VM root disks
- LINSTOR is not suitable for boot disks due to cloning limitations

**For Persistent Data:**
- Use LINSTOR for application data volumes (PVCs in Kubernetes)
- LINSTOR excels at replicating application data between nodes
- K3s will use LINSTOR CSI driver for persistent volumes

**Storage Tradeoffs:**

| Storage Type | VM Cloning | HA Migration | Speed | Use Case |
|--------------|------------|--------------|-------|----------|
| local-lvm | ✅ Fast | ❌ No | ⚡ Fast | VM root disks |
| LINSTOR | ❌ No | ✅ Yes | 🔄 Network | K8s PVCs, shared data |

## Why Not Just Use LINSTOR?

LINSTOR is designed for:
- **Live migration** of running VMs between nodes
- **Replicated storage** for high availability
- **Dynamic volume provisioning** via CSI

It is NOT designed for:
- Template-based VM provisioning
- Snapshot-based cloning operations
- Boot disk management in IaC workflows

## Future Improvements

1. **Automate template sync**: Create a cron job to sync template updates to all nodes
2. **LINSTOR for K8s only**: Use LINSTOR CSI driver for Kubernetes PVCs, not for VM provisioning
3. **Consider alternatives**: For VM provisioning, local storage is simpler and faster

## References

- LINSTOR Documentation: https://linbit.com/drbd-user-guide/linstor-guide-1_0-en/
- Proxmox LINSTOR Plugin: https://pve.proxmox.com/wiki/LINSTOR
- Terraform Proxmox Provider: https://registry.terraform.io/providers/Telmate/proxmox/latest/docs