You are looking at the documentation of a prior release. To read the documentation of the latest release, please
visit here.
Populating volumes of a Deployment
This guide will show you how to use KubeStash to populate the volumes of a Deployment. We’ll walk through backing up the volumes of a Deployment and then restoring the backed up data to new PVCs in a Kubernetes-native way with KubeStash.
Before You Begin
- At first, you need to have a Kubernetes cluster, and the
kubectlcommand-line tool must be configured to communicate with your cluster. If you do not already have a cluster, you can create one by using kind. - Install
KubeStashin your cluster following the steps here. - You should be familiar with the following
KubeStashconcepts:
To keep everything isolated, we are going to use a separate namespace called demo throughout this tutorial.
$ kubectl create ns demo
namespace/demo created
Note: YAML files used in this tutorial are stored in docs/guides/volumes/pvc/examples directory of kubestash/docs repository.
Prepare Workload
At first, we are going to deploy a Deployment with two PVCs and generate some sample data in it.
Create PersistentVolumeClaim :
At first, let’s create two sample PVCs. We are going to mount these PVCs in our targeted Deployment.
Below is the YAML of the sample PVCs,
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: source-data
namespace: demo
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: source-config
namespace: demo
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
Let’s create the PVCs we have shown above.
$ kubectl apply -f https://github.com/kubestash/docs/raw/v2024.9.30/docs/guides/volume-populator/deployment/examples/backup-pvcs.yaml
persistentvolumeclaim/source-data created
persistentvolumeclaim/source-config created
Deploy Deployment :
Now, we are going to deploy a Deployment that uses the above PVCs. This Deployment will automatically create data.txt and config.cfg file in /source/data and /source/config directory.
Below is the YAML of the Deployment that we are going to create,
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: kubestash-demo
name: kubestash-deployment
namespace: demo
spec:
replicas: 2
selector:
matchLabels:
app: kubestash-demo
template:
metadata:
labels:
app: kubestash-demo
name: busybox
spec:
containers:
- args: ["echo sample_data > /source/data/data.txt; echo sample_config > /source/config/config.cfg && sleep 3000"]
command: ["/bin/sh", "-c"]
image: busybox
imagePullPolicy: IfNotPresent
name: busybox
volumeMounts:
- mountPath: /source/data
name: source-data
- mountPath: /source/config
name: source-config
restartPolicy: Always
volumes:
- name: source-data
persistentVolumeClaim:
claimName: source-data
- name: source-config
persistentVolumeClaim:
claimName: source-config
Let’s create the deployment that we have shown above.
$ kubectl apply -f https://github.com/kubestash/docs/raw/v2024.9.30/docs/guides/volume-populator/deployment/examples/backup-deployment.yaml
deployment.apps/kubestash-deployment created
Now, wait for the pod of the Deployment to go into the Running state.
$ kubectl get pods -n demo
NAME READY STATUS RESTARTS AGE
kubestash-deployment-745d49c4cd-vsj8m 1/1 Running 0 62s
kubestash-deployment-745d49c4cd-wwj8w 1/1 Running 0 62s
Verify that the sample data has been created in /source/data and /source/config directory using the following command,
$ kubectl exec -n demo kubestash-deployment-745d49c4cd-vsj8m -- cat /source/data/data.txt
sample_data
$ kubectl exec -n demo kubestash-deployment-745d49c4cd-vsj8m -- cat /source/config/config.cfg
config_data
Prepare Backend
Now, we are going to backup the Deployment kubestash-deployment to a GCS bucket using KubeStash. For this, we have to create a Secret with necessary credentials and a BackupStorage object. If you want to use a different backend, please read the respective backend configuration doc from here.
For GCS backend, if the bucket does not exist, KubeStash needs
Storage Object Adminrole permissions to create the bucket. For more details, please check the following guide.
Create Secret:
Let’s create a Secret named gcs-secret with access credentials of our desired GCS backend,
$ echo -n '<your-project-id>' > GOOGLE_PROJECT_ID
$ cat /path/to/downloaded/sa_key_file.json > GOOGLE_SERVICE_ACCOUNT_JSON_KEY
$ kubectl create secret generic -n demo gcs-secret \
--from-file=./GOOGLE_PROJECT_ID \
--from-file=./GOOGLE_SERVICE_ACCOUNT_JSON_KEY
secret/gcs-secret created
Create BackupStorage:
Now, create a BackupStorage custom resource specifying the desired bucket, and directory inside the bucket where the backed up data will be stored.
Below is the YAML of BackupStorage object that we are going to create,
apiVersion: storage.kubestash.com/v1alpha1
kind: BackupStorage
metadata:
name: gcs-storage
namespace: demo
spec:
storage:
provider: gcs
gcs:
bucket: kubestash-qa
prefix: demo
secretName: gcs-secret
usagePolicy:
allowedNamespaces:
from: All
default: true
deletionPolicy: WipeOut
Let’s create the BackupStorage object that we have shown above,
$ kubectl apply -f https://github.com/kubestash/docs/raw/v2024.9.30/docs/guides/volume-populator/deployment/examples/backupstorage.yaml
backupstorage.storage.kubestash.com/gcs-storage created
Now, we are ready to backup our target volume to this backend.
Create RetentionPolicy:
Now, we have to create a RetentionPolicy object to specify how the old Snapshots should be cleaned up.
Below is the YAML of the RetentionPolicy object that we are going to create,
apiVersion: storage.kubestash.com/v1alpha1
kind: RetentionPolicy
metadata:
name: demo-retention
namespace: demo
spec:
default: true
failedSnapshots:
last: 2
maxRetentionPeriod: 2mo
successfulSnapshots:
last: 5
usagePolicy:
allowedNamespaces:
from: Same
Notice the spec.usagePolicy that allows referencing the RetentionPolicy from all namespaces.For more details on configuring it for specific namespaces, please refer to the following RetentionPolicy usage policy.
Let’s create the RetentionPolicy object that we have shown above,
$ kubectl apply -f https://github.com/kubestash/docs/raw/v2024.9.30/docs/guides/volume-populator/deployment/examples/retentionpolicy.yaml
retentionpolicy.storage.kubestash.com/demo-retention created
Backup
We have to create a BackupConfiguration custom resource targeting the kubestash-demo Deployment that we have created earlier.
We also have to create another Secret with an encryption key RESTIC_PASSWORD for Restic. This secret will be used by Restic for both encrypting and decrypting the backup data during backup & restore.
Create Secret:
Let’s create a secret named encrypt-secret with the Restic password.
$ echo -n 'changeit' > RESTIC_PASSWORD
$ kubectl create secret generic -n demo encrypt-secret \
--from-file=./RESTIC_PASSWORD
secret/encrypt-secret created
Create BackupConfiguration :
Below is the YAML of the BackupConfiguration that we are going to create,
apiVersion: core.kubestash.com/v1alpha1
kind: BackupConfiguration
metadata:
name: sample-backup-dep
namespace: demo
spec:
target:
apiGroup: apps
kind: Deployment
name: kubestash-deployment
namespace: demo
backends:
- name: gcs-backend
storageRef:
namespace: demo
name: gcs-storage
retentionPolicy:
name: demo-retention
namespace: demo
sessions:
- name: frequent-backup
sessionHistoryLimit: 3
scheduler:
schedule: "*/5 * * * *"
jobTemplate:
backoffLimit: 1
repositories:
- name: gcs-repository
backend: gcs-backend
directory: /dep
encryptionSecret:
name: encrypt-secret # some addon may not support encryption
namespace: demo
deletionPolicy: WipeOut
addon:
name: workload-addon
tasks:
- name: logical-backup
params:
paths: /source/data,/source/config
exclude: /source/data/lost+found,/source/config/lost+found
Let’s create the BackupConfiguration object we have shown above.
$ kubectl apply -f https://github.com/kubestash/docs/raw/v2024.9.30/docs/guides/volume-populator/deployment/examples/backupconfiguration.yaml
backupconfiguration.core.kubestash.com/sample-backup-dep created
Verify Backup Setup Successful:
If everything goes well, the phase of the BackupConfiguration should be in Ready state. The Ready phase indicates that the backup setup is successful.
Let’s check the Phase of the BackupConfiguration,
$ kubectl get backupconfiguration -n demo
NAME PHASE PAUSED AGE
sample-backup-dep Ready 2m50s
Verify Repository:
Verify that the Repository specified in the BackupConfiguration has been created using the following command,
$ kubectl get repositories -n demo
NAME INTEGRITY SNAPSHOT-COUNT SIZE PHASE LAST-SUCCESSFUL-BACKUP AGE
gcs-repository Ready 28s
KubeStash keeps the backup for Repository YAMLs. If we navigate to the GCS bucket, we will see the Repository YAML stored in the kubestash-qa/demo/dep directory.
Verify CronJob:
Verify that KubeStash has created a CronJob with the schedule specified in spec.sessions[*].scheduler.schedule field of BackupConfiguration object.
Check that the CronJob has been created using the following command,
$ kubectl get cronjob -n demo
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
trigger-sample-backup-dep-frequent-backup */5 * * * * False 0 <none> 20s
Wait for BackupSession:
Now, wait for the next backup schedule. You can watch for BackupSession CR using the following command,
$ watch -n 1 kubectl get backupsession -n demo -l=kubestash.com/invoker-name=sample-backup-dep
Every 1.0s: kubectl get backupsession -n demo -l=kubestash.com/invoker-name=sample-backup-dep anisur: Wed Jan 17 15:20:09 2024
NAME INVOKER-TYPE INVOKER-NAME PHASE DURATION AGE
sample-backup-dep-frequent-backup-1705483201 BackupConfiguration sample-backup-dep Running 9s
Here, the phase Succeeded means that the backup process has been completed successfully.
Verify Backup:
When backup session is complete, KubeStash will update the respective Repository object to reflect the backup. Check that the repository gcs-repository has been updated by the following command,
$ kubectl get repository -n demo gcs-demo-repo
NAME INTEGRITY SNAPSHOT-COUNT SIZE PHASE LAST-SUCCESSFUL-BACKUP AGE
gcs-repository true 1 806 B Ready 8m27s 9m18s
At this moment we have one Snapshot. Run the following command to check the respective Snapshot.
Verify created Snapshot object by the following command,
$ kubectl get snapshots -n demo -l=kubestash.com/repo-name=gcs-repository
NAME REPOSITORY SESSION SNAPSHOT-TIME DELETION-POLICY PHASE AGE
gcs-repository-sample-backup-dep-frequent-backup-1706015400 gcs-demo-repo demo-session 2024-01-23T13:10:54Z Delete Succeeded 16h
When a backup is triggered according to schedule, KubeStash will create a
Snapshotwith the following labelskubestash.com/app-ref-kind: PersistentVolumeClaim,kubestash.com/app-ref-name: <pvc-name>,kubestash.com/app-ref-namespace: <pvc-namespace>andkubestash.com/repo-name: <repository-name>. We can use these labels to watch only theSnapshotof our desired Workload orRepository.
Now, lets retrieve the YAML for the Snapshot, and inspect the spec.status section to see the backup up components of the Deployment.
$ kubectl get snapshots -n demo gcs-repository-sample-backup-dep-frequent-backup-1706015400 -oyaml
apiVersion: storage.kubestash.com/v1alpha1
kind: Snapshot
metadata:
labels:
kubestash.com/app-ref-kind: Deployment
kubestash.com/app-ref-name: kubestash-deployment
kubestash.com/app-ref-namespace: demo
kubestash.com/repo-name: gcs-repository
name: gcs-repository-sample-backup-dep-frequent-backup-1706015400
namespace: demo
spec:
...
status:
components:
dump:
driver: Restic
duration: 7.534461497s
integrity: true
path: repository/v1/frequent-backup/dump
phase: Succeeded
resticStats:
- hostPath: /source/data
id: f28441a36b2167d64597d66d1046573181cad81aa8ff5b0998b64b31ce16f077
size: 11 B
uploaded: 1.049 KiB
- hostPath: /source/config
id: f28441a36b2167d64597d66d1046573181cad81aa8ff5b0998b64b31ce16f077
size: 11 B
uploaded: 1.049 KiB
size: 806 B
...
For Deployment, KubeStash takes backup from only one pod of the Deployment. So, only one component has been taken backup. The component name is
dump.
Populate Volumes
This section will show you how to populate the volumes of a Deployment with data from the Snapshot of the previous backup using KubeStash.
Create PersistentVolumeClaim :
Now, we need to create two new Persistent Volume Claims (PVCs) with the spec.dataSourceRef set to reference our Snapshot object.
Below is the YAML of the PVCs,
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: restored-source-data
namespace: demo
annotations:
populator.kubestash.com/app-name: kubestash-restored-deployment
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
dataSourceRef:
apiGroup: storage.kubestash.com
kind: Snapshot
name: gcs-repository-sample-backup-dep-frequent-backup-1706015400
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: restored-source-data
namespace: demo
annotations:
populator.kubestash.com/app-name: kubestash-restored-deployment
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
dataSourceRef:
apiGroup: storage.kubestash.com
kind: Snapshot
name: gcs-repository-sample-backup-dep-frequent-backup-1706015400
Here,
spec.dataSourceRefspecifies that whichsnapshotwe want to use for restoring and populating the volume. We have referenced theSnapshotobject that was backed up in the previous sectionmetadata.annotations.populator.kubestash.com/app-namefield is mandatory for any volume population of a deployment through KubeStash.- This field denotes the deployment that will be attached those volumes via mount paths. The volume population will only be successful if the mount path of this volume matches the mount paths of the backup deployment.
- For example, you backed up a deployment with volumes named
source-dataandsource-config, each with corresponding mount paths/source/dataand/source/config. Now you wish to populate volumes namedrestore-source-dataandrestore-source-configattached to a deployment namedrestored-kubestash-deployment, then this deployment must have mount paths set to/source/dataand/source/config, respectively.
Let’s create the PVCs YAMl that we have shown above.
$ kubectl apply -f https://github.com/kubestash/docs/raw/v2024.9.30/docs/guides/volume-populator/deployment/examples/populated-pvcs.yaml
persistentvolumeclaim/restored-source-data created
persistentvolumeclaim/restored-source-config created
Deploy Deployment :
Now, we are going to deploy a Deployment that uses the above PVCs. This Deployment has mount paths /source/data and /source/config corresponding to volumes named restored-source-data and restored-source-config respectively.
Below is the YAML of the Deployment that we are going to create,
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: kubestash-demo
name: restored-kubestash-deployment
namespace: demo
spec:
replicas: 1
selector:
matchLabels:
app: kubestash-demo
template:
metadata:
labels:
app: kubestash-demo
name: busybox
spec:
containers:
- command: ["/bin/sh", "-c"]
image: busybox
imagePullPolicy: IfNotPresent
name: busybox
volumeMounts:
- mountPath: /source/data
name: source-data
- mountPath: /source/config
name: source-config
restartPolicy: Always
volumes:
- name: restored-source-data
persistentVolumeClaim:
claimName: source-data
- name: restored-source-config
persistentVolumeClaim:
claimName: source-config
Let’s create the deployment we have shown above.
$ kubectl apply -f https://github.com/kubestash/docs/raw/v2024.9.30/docs/guides/volume-populator/deployment/examples/restore-deployment.yaml
deployment.apps/restored-kubestash-deployment created
Wait for Populate Volume:
When you create two PVCs with spec.dataSourceRef that refers our Snapshot object, KubeStash automatically creates a populator Job. Now, just wait for the volume population process to finish.
You can watch the PVCs status using the following command,
$ watch kubectl get pvc -n demo
Every 2.0s: kubectl get pvc -n demo anisur: Tue Feb 13 18:37:26 2024
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
restored-source-config Bound pvc-27a19e86-0ed5-4b18-9263-c8a66459aeb0 1Gi RWO standard-rwo 2m50s
restored-source-data Bound pvc-d210787e-5137-40b7-89e7-73d982716b40 1Gi RWO standard-rwo 2m51s
The output of the command shows the PVCs status as Bound, indicating successful completion of the volume population.
Verify Restored Data :
We are going to exec a pod of restored-dep deployment to verify whether the restored data.
Now, wait for the deployment pod to go into the Running state.
$ kubectl get pods -n demo
NAME READY STATUS RESTARTS AGE
restored-kubestash-deployment-84b974dfb5-4jzjd 1/1 Running 0 2m48s
Verify that the backed up data has been restored into /source/data and /source/config directory of above pod using the following command,
$ kubectl exec -it -n demo restored-kubestash-deployment-84b974dfb5-4jzjd -- cat /source/data/data.txt
sample_data
$ kubectl exec -it -n demo restored-kubestash-deployment-84b974dfb5-4jzjd -- cat /source/config/config.cfg
sample_config
Cleaning Up
To clean up the Kubernetes resources created by this tutorial, run:
kubectl delete backupconfiguration -n demo sample-backup-dep
kubectl delete backupstorage -n demo gcs-storage
kubectl delete retentionPolicy -n demo demo-retention
kubectl delete secret -n demo gcs-secret
kubectl delete secret -n demo encrypt-secret
kubectl delete deploy -n demo kubestash-deployment restored-kubestash-deployment
kubectl delete pvc -n demo --all






