After [choosing LINSTOR DRBD as the distributed storage solution](/blog/stockage-distribue-proxmox-ha) for my Proxmox HA cluster, it was time to move on to deployment automation with OpenTofu (open-source fork of Terraform). Spoiler: it didn't go as planned.
<!--truncate-->
## Context Recap
In my [previous article about choosing a distributed storage technology](/blog/stockage-distribue-proxmox-ha), I opted for LINSTOR DRBD over Ceph for several reasons:
- **Superior performance**: DRBD's synchronous block-level replication outperforms Ceph on a 1 Gbps network
- **Simpler architecture**: No monitors, managers, or OSDs to run, unlike Ceph
- **Lower resource consumption**: Lighter on RAM and CPU
The sticking point, which I'll come back to in the takeaways, is that LINSTOR's explicit resource model doesn't lend itself to the dynamic VM provisioning via Proxmox cloning that OpenTofu relies on. That left me weighing several workarounds, including the two detailed below.
### Option 3: Repartition the Disks (local-lvm + LINSTOR)
Partition the NVMe drives on each node into two parts:
- One partition for local LVM storage (`local-lvm`)
- One partition for the LINSTOR DRBD pool (`linstor_storage`)
Then use `local-lvm` for VM disks (simple provisioning) and `linstor_storage` for other needs requiring replication.
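To make the idea concrete, here is a rough sketch of what that split could look like on one node, driven from a small Python helper. Everything in it is a placeholder: device name, partition numbers, sizes, volume-group and pool names are illustrative, not my actual layout.

```python
#!/usr/bin/env python3
"""Sketch of the per-node repartitioning steps (Option 3).

All names and sizes below are placeholders to adapt to the real layout.
"""
import subprocess

NODE = "pve1"            # hypothetical node name
DISK = "/dev/nvme0n1"    # hypothetical NVMe device

def run(cmd: list[str]) -> None:
    """Run a command and stop on the first failure."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Carve out two LVM partitions (sizes are illustrative).
run(["sgdisk", "--new=4:0:+200G", "--typecode=4:8e00", DISK])
run(["sgdisk", "--new=5:0:+300G", "--typecode=5:8e00", DISK])

# 2. Local side: a thin pool that Proxmox will then expose as LVM-thin storage
#    (registered afterwards via pvesm or the GUI).
run(["pvcreate", f"{DISK}p4"])
run(["vgcreate", "vg_local", f"{DISK}p4"])
run(["lvcreate", "-l", "100%FREE", "--thinpool", "data", "vg_local"])

# 3. LINSTOR side: a dedicated VG registered as a LINSTOR storage pool.
run(["pvcreate", f"{DISK}p5"])
run(["vgcreate", "vg_linstor", f"{DISK}p5"])
run(["linstor", "storage-pool", "create", "lvm", NODE, "linstor_storage", "vg_linstor"])
```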
**Important note for my Kubernetes use case**: Using `local-lvm` (without Proxmox-level replication) is viable for a Kubernetes cluster because **Kubernetes handles high availability**, not Proxmox. With etcd distributed across 3 nodes and a replicated control plane, losing a VM doesn't impact the cluster - Kubernetes keeps running on the remaining nodes. The VMs become "cattle" (replaceable via Infrastructure as Code), while the truly precious "pet" data would live in application-level storage solutions.
**Advantages**:
- Simple and fast VM provisioning on `local-lvm`
- Preservation of LINSTOR DRBD for distributed storage needs
- Optimal use of available hardware
- Maximum performance for VMs (direct local access)
- **HA ensured at the right level**: Kubernetes, not Proxmox
**Disadvantages**:
- **Setup complexity**: Disk repartitioning required
- **Risk of data loss**: Invasive operation on existing disks
- **Capacity planning**: Need to determine partition size in advance
- **Less flexibility**: Fixed partition sizes, difficult to modify
- **No HA at Proxmox level**: VMs no longer benefit from replication (acceptable when HA is handled at the Kubernetes level)
### Option 4: Migrate to Ceph with Network Upgrade
Abandon LINSTOR DRBD and migrate to Ceph, upgrading the network to 5 Gbps (or 10 Gbps if budget allows):
**Advantages**:
- Native support for dynamic provisioning in Proxmox
- Perfect integration with OpenTofu/Terraform
- Mature and well-documented ecosystem
- Snapshots and clones natively supported
- Acceptable performance with a 5 Gbps NIC
**Disadvantages**:
- **Hardware cost**: Purchase of 5 Gbps (or 10 Gbps) network cards for the 3 nodes
- **Increased complexity**: Monitors, Managers, OSDs to manage
- **Resource consumption**: More demanding on RAM and CPU than LINSTOR
- **Complete migration**: Reconstruction of existing storage
- **Still inferior performance**: Even with 5 Gbps, greater overhead than DRBD
## My Current Thinking
I'm currently torn between these options:
**Option 1 (Script)** appeals to me because it preserves LINSTOR and automates everything. With fixed VMIDs (1000, 1001, 1002), the script would be relatively simple to maintain; I just need to make sure it runs before OpenTofu in the CI/CD pipeline.
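To give an idea of what that pre-creation step could look like, here is a minimal sketch (not the final script). It assumes the linstor-proxmox plugin will pick up resources named `vm-<vmid>-disk-1`, which I'd still need to verify, and the disk size and replica count are placeholders.

```python
#!/usr/bin/env python3
"""Pre-create the DRBD resources for the fixed VMIDs before `tofu apply`.

Minimal sketch: resource naming, disk size and replica count are assumptions
to validate against the linstor-proxmox plugin, not the final script.
"""
import subprocess

VMIDS = [1000, 1001, 1002]   # fixed VMIDs referenced in the OpenTofu config
DISK_SIZE = "40G"            # placeholder
REPLICAS = "2"               # number of DRBD copies to place

def linstor(*args: str) -> None:
    """Thin wrapper around the linstor CLI, failing fast on any error."""
    subprocess.run(["linstor", *args], check=True)

for vmid in VMIDS:
    res = f"vm-{vmid}-disk-1"
    linstor("resource-definition", "create", res)
    linstor("volume-definition", "create", res, DISK_SIZE)
    linstor("resource", "create", res, "--auto-place", REPLICAS)
```

In the pipeline this would simply run as a job before `tofu apply`. A real version would also need to be idempotent and to clean up orphaned resources, which is exactly where the synchronization risks mentioned further down come from.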
**Option 3 (Partitioning)** is technically interesting but very invasive. Repartitioning NVMe drives in production is risky, and I lose high availability at the Proxmox level for the VMs themselves. However, in my Kubernetes context, this isn't necessarily a problem since HA is managed at the K3s cluster level, not at the individual VM level. If a VM goes down, Kubernetes continues to function with the other nodes.
**Option 4 (Ceph + network upgrade)** solves all technical problems but involves a hardware investment. A 5 Gbps switch + 3 network cards represents a significant budget for a homelab. On the other hand, it opens the door to other future possibilities.
## Key Takeaways
### LINSTOR ≠ General-purpose Storage for Proxmox
LINSTOR excels for certain use cases, but dynamic VM provisioning via Proxmox cloning is not one of them. LINSTOR documentation is heavily focused on `resource-group` and application storage, not Proxmox integration.
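For contrast, this is the kind of workflow the LINSTOR documentation is built around: define a resource group once, then spawn replicated volumes from it, entirely outside any Proxmox cloning logic. Names, pool and size below are purely illustrative.

```python
import subprocess

def linstor(*args: str) -> None:
    subprocess.run(["linstor", *args], check=True)

# Define the placement policy once...
linstor("resource-group", "create", "app_rg",
        "--storage-pool", "linstor_storage", "--place-count", "2")
linstor("volume-group", "create", "app_rg")
# ...then spawn replicated volumes on demand.
linstor("resource-group", "spawn-resources", "app_rg", "app_data", "20G")
```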
### The Limitation Is Architectural, Not a Bug
This isn't a configuration problem or a mistake on my part: LINSTOR is built around an explicit resource management model. On-the-fly dynamic provisioning simply isn't part of its philosophy.
### HA Can Be Delegated to a Higher Layer
For a Kubernetes cluster, losing HA at the Proxmox level (VMs on local storage) isn't necessarily problematic. Kubernetes is designed to handle node failures - that's actually its main role. With distributed etcd and a replicated control plane, the cluster survives the loss of one or more nodes.
### Each Solution Has Its Cost
- **Script** → Software complexity
- **Partitioning** → Operational complexity and loss of HA at Proxmox level
- **Ceph** → System complexity and hardware cost
There's no silver bullet. I must choose which type of complexity I'm willing to accept.
## Next Steps
I'll probably test **Option 1** (pre-creation script) first, as it allows me to:
1. Keep LINSTOR DRBD and its performance
2. Fully automate deployment
3. Avoid immediate hardware investment
4. Learn to better manage LINSTOR programmatically
If this approach proves too complex or fragile, I'll fall back on either **Option 3** (partitioning, acceptable in a Kubernetes context) or **Option 4** (Ceph + network upgrade), the most "standard" and best-documented solution in the Proxmox ecosystem.
I'll document my final decision and its implementation in a future article.
## Update: The Final Decision
After testing Option 1 with a [Python script for LINSTOR resource management](/scripts/blog/2025-11-26-linstor-drbd-opentofu/manage_linstor_resources.py), I found that the approach, while functional, added too much complexity and too many synchronization risks for production use.
**The final decision**: Partition the NVMe drives on each Proxmox node according to the following strategy:
- **300 GB allocated to LINSTOR DRBD** (`linstor_storage`) for:
- VMs and LXC containers requiring high availability at the Proxmox level
- The LXC container hosting the NFS server (see [zfs-sync-nfs-ha project](https://forgejo.tellserv.fr/Tellsanguis/zfs-sync-nfs-ha))
- Any distributed storage managed by Proxmox HA
- **200 GB allocated to local LVM storage** (`local-lvm`) for:
- K3S cluster VMs that **don't need** HA at the Proxmox level
- High availability ensured by the Kubernetes cluster itself
- Simple and fast provisioning via OpenTofu
This architecture allows using the right tool for the right purpose: LINSTOR DRBD for what truly requires synchronous replication at the infrastructure level, and performant local storage for workloads where HA is managed by the application layer (Kubernetes).
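As a quick sanity check once the repartitioning is done, something like the following could confirm that every node exposes both storages. This is a sketch based on the proxmoxer library; the host, user and token are placeholders.

```python
"""Check that every node exposes both `local-lvm` and `linstor_storage`."""
from proxmoxer import ProxmoxAPI

# Hypothetical connection details - adapt to the real cluster and credentials.
proxmox = ProxmoxAPI("pve1.example.lan", user="root@pam",
                     token_name="readonly", token_value="REDACTED",
                     verify_ssl=False)

for node in proxmox.nodes.get():
    name = node["node"]
    storages = {s["storage"]: s for s in proxmox.nodes(name).storage.get()}
    for expected in ("local-lvm", "linstor_storage"):
        present = expected in storages and storages[expected].get("active") == 1
        print(f"{name}: {expected} -> {'ok' if present else 'MISSING'}")
```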
A detailed article on this implementation and the HA NFS container will follow soon.