Longhorn PVC Mounting Issue and Backup Setup
Recently I ran into a persistent error with Longhorn while trying to attach a PVC. The error kept repeating, and initially I thought I needed the v2 data engine, but it turned out the real culprit was multipathd creating a multipath device on top of the Longhorn block device. The Longhorn KB article on troubleshooting volumes with multipath explained exactly how to fix it.
Original Error
This was the original error I kept seeing:
MountVolume.MountDevice failed for volume "pvc-4b04c380-e38a-4462-a71a-875a16b53c9d": rpc error: code = Internal desc = format of disk "/dev/longhorn/pvc-4b04c380-e38a-4462-a71a-875a16b53c9d" failed: type:("ext4") target:("/var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io/.../globalmount") options:("defaults") errcode:(exit status 1) output: mke2fs 1.47.0 reports disk is in use; will not make a filesystem here!
I tried several fixes, but none worked. Every time I restarted all the Longhorn pods, as the AI recommended, the errors turned into a game of whack-a-mole: other PVCs would randomly develop the same issue. Eventually, Claude 4 led me down a path that deleted all the Longhorn drives, and I had no backups, so this was a costly lesson.
Final Solution
The final solution was to go onto each node and configure multipath to ignore Longhorn devices, as explained in the article. I also realized the importance of having a backup solution in place.
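For reference, the KB article's fix boils down to blacklisting the device names Longhorn uses in /etc/multipath.conf and restarting multipathd. This is a rough sketch of the change on each node; check the article before copying it, since the devnode pattern also matches other sd* devices:

# /etc/multipath.conf
blacklist {
    devnode "^sd[a-z0-9]+"
}

Then restart the daemon so the setting takes effect:

sudo systemctl restart multipathd.service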
MariaDB Backups to S3
For some databases, I have a MariaDB Backup resource running that sends scheduled dumps to S3. Here’s an example (sensitive data sanitized):
apiVersion: k8s.mariadb.com/v1alpha1
kind: Backup
metadata:
  name: mariadb-backup
spec:
  mariaDbRef:
    name: mariadb
  storage:
    s3:
      bucket: my-bucket
      prefix: mariadb-backups/
      endpoint: s3.amazonaws.com
      region: us-east-1
      tls:
        enabled: true
      accessKeyIdSecretKeyRef:
        name: aws-credentials
        key: ACCESS_KEY_ID
      secretAccessKeySecretKeyRef:
        name: aws-credentials
        key: SECRET_ACCESS_KEY
  schedule:
    cron: "0 2 * * *"
  maxRetention: "720h" # 30 days
  compression: gzip
  databases:
    - mariadb
    - recisphere
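The Backup resource references an aws-credentials Secret with ACCESS_KEY_ID and SECRET_ACCESS_KEY keys. A minimal sketch of that Secret, assuming it lives in the same namespace as the MariaDB resources (the name and keys just have to match the refs above):

apiVersion: v1
kind: Secret
metadata:
  name: aws-credentials
type: Opaque
stringData:
  ACCESS_KEY_ID: "<your-access-key-id>"
  SECRET_ACCESS_KEY: "<your-secret-access-key>"

Once it's running, something like kubectl get backups.k8s.mariadb.com should show whether the scheduled runs are completing.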
Longhorn Backup Setup
For Longhorn, I first updated the values.yaml with the backup settings (sensitive data sanitized):
defaultSettings:
  backupTarget: "s3://user@us-east-1/longhorn-backups"
  backupTargetCredentialSecret: "s3-backup-secret"
  allowRecurringJobWhileVolumeDetached: ~
  createDefaultDiskLabeledNodes: ~
  defaultDataPath: /mnt/sda
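backupTargetCredentialSecret points at a Secret in the longhorn-system namespace; Longhorn's documented keys for S3 credentials are AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY (plus AWS_ENDPOINTS for non-AWS endpoints). A sketch, using the secret name from the values above:

apiVersion: v1
kind: Secret
metadata:
  name: s3-backup-secret
  namespace: longhorn-system
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "<your-access-key-id>"
  AWS_SECRET_ACCESS_KEY: "<your-secret-access-key>"

After that, roll the values out with your usual helm upgrade, e.g. helm upgrade longhorn longhorn/longhorn -n longhorn-system -f values.yaml (release and repo names assumed).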
Next, I created a recurring backup job:
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup
  namespace: longhorn-system
spec:
  cron: "0 2 * * *"
  task: backup
  groups: ["daily-backup"]
  retain: 7
  concurrency: 2
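Assuming the manifest is saved as recurring-job.yaml (the filename is my own), applying and sanity-checking it looks like this:

kubectl apply -f recurring-job.yaml
kubectl -n longhorn-system get recurringjobs.longhorn.io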
Finally, I labeled all PVCs with the recurring job group:
kubectl get pvc --all-namespaces --no-headers | while read ns name rest; do
  kubectl label pvc "$name" -n "$ns" recurring-job-group.longhorn.io/daily-backup=enabled --overwrite
done
Each PVC then ends up with the label in its metadata:

metadata:
  labels:
    recurring-job-group.longhorn.io/daily-backup: enabled
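To spot-check that the loop caught everything, a label selector works (this is plain kubectl, nothing Longhorn-specific):

kubectl get pvc --all-namespaces -l recurring-job-group.longhorn.io/daily-backup=enabled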
Verification
Now the Longhorn UI shows a backup target connected to S3, and backups appear there. The recurring job handles backups automatically, and you can check the job pods' logs in the longhorn-system namespace for debugging if necessary.
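Beyond the UI, a couple of kubectl checks I find useful; the CRD names are standard Longhorn ones but may vary slightly by version, and the grep pattern assumes the daily-backup job name above:

kubectl -n longhorn-system get backuptargets.longhorn.io
kubectl -n longhorn-system get backups.longhorn.io
kubectl -n longhorn-system get pods | grep daily-backup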
Lessons Learned
- Always verify multipath settings on nodes to avoid PVC mount issues.
- Never rely solely on local storage—set up automated backups for both databases and Longhorn volumes.
- Test backup restores periodically to ensure the data is recoverable (see the restore sketch after this list).
- Use short, descriptive PVC names and recurring job labels to avoid confusion in large clusters.
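For the MariaDB side, mariadb-operator has a Restore resource that can pull from an existing Backup. A minimal sketch of a restore test, assuming the names used earlier; restore-test is my own name, and since it restores into the referenced MariaDB instance, point it at a scratch instance when just testing:

apiVersion: k8s.mariadb.com/v1alpha1
kind: Restore
metadata:
  name: restore-test
spec:
  mariaDbRef:
    name: mariadb
  backupRef:
    name: mariadb-backup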