Module 6 - Kubernetes Storage
A complete guide to Kubernetes Storage — covering Volumes, Persistent Volumes, PVCs, StorageClasses, StatefulSets, Cloud Storage, CSI, Backup strategies, and hands-on labs with real-world troubleshooting examples.
Table of Contents
- Introduction to Kubernetes Storage
- Kubernetes Volumes
- Persistent Storage Concepts
- Storage Classes
- Stateful Applications
- Cloud Storage Integration
- CSI (Container Storage Interface)
- Backup and Data Protection
- Troubleshooting Kubernetes Storage
- Hands-On Labs
1. Introduction to Kubernetes Storage
Why Storage is Required in Kubernetes
Kubernetes is a container orchestration platform where workloads are designed to be distributed, scalable, and self-healing. A container's writable filesystem is ephemeral: when the container restarts, or its Pod is rescheduled to another node, everything written inside that filesystem is lost.
This creates a fundamental challenge for real-world applications:
Without Storage: With Storage:
┌────────────────────┐ ┌────────────────────┐
│ Pod Crashes │ │ Pod Crashes │
│ ┌──────────────┐ │ │ ┌──────────────┐ │
│ │ Container │ │ │ │ Container │ │
│ │ /data ──X │ │ │ │ /data ──────┼──┼──▶ Volume
│ └──────────────┘ │ │ └──────────────┘ │ (persists)
│ Data is LOST │ │ Data SURVIVES │
└────────────────────┘ └────────────────────┘
Storage is required for:
| Requirement | Example |
|---|---|
| Data Persistence | Database files surviving Pod restarts |
| Data Sharing | Multiple Pods reading the same config file |
| Configuration Injection | Mounting ConfigMaps/Secrets as files |
| Stateful Workloads | MySQL, PostgreSQL, MongoDB, Kafka, Elasticsearch |
| Log Aggregation | Centralising container logs on a shared volume |
| Cache Persistence | Redis RDB/AOF files surviving restarts |
| ML Model Storage | Large model files shared across inference Pods |
Stateless vs Stateful Applications
Understanding this distinction is the foundation of Kubernetes storage design.
┌──────────────────────────────────────────────────────────────────────┐
│ STATELESS APPLICATION │
│ │
│ Request ─▶ Pod A ─▶ Response All Pods are identical │
│ Request ─▶ Pod B ─▶ Response Any Pod can serve any request │
│ Request ─▶ Pod C ─▶ Response Pod death = zero data loss │
│ │
│ Examples: REST APIs, Web servers, Microservices, Nginx │
└──────────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│ STATEFUL APPLICATION │
│ │
│ Client ─▶ Pod A (primary DB) Pods have unique identities │
│ Client ─▶ Pod B (replica DB) Each Pod has its own storage │
│ Pod order and names matter │
│ Pod death = must restore state │
│ │
│ Examples: MySQL, PostgreSQL, Kafka, Zookeeper, Elasticsearch │
└──────────────────────────────────────────────────────────────────────┘
| Characteristic | Stateless | Stateful |
|---|---|---|
| Data persistence | Not required | Critical |
| Pod identity | Interchangeable | Unique (pod-0, pod-1…) |
| Scaling | Simple horizontal | Complex (order matters) |
| Storage | Ephemeral or none | Persistent volumes |
| Kubernetes resource | Deployment | StatefulSet |
| Failure impact | Replace immediately | Must maintain state |
Ephemeral Storage in Containers
Every container gets a writable layer on top of its image. This writable layer is:
- Tied to the container lifecycle — gone when the container is removed
- Not shared between containers (even in the same Pod)
- Local to the node — data cannot follow a rescheduled Pod
- Counted against node disk — excessive writes can evict Pods
Container Filesystem Layers (Union Mount):
┌─────────────────────────────────────────┐
│ Writable Layer (ephemeral) │ ← Container writes here
│ /app/logs, /tmp, /var/cache │ LOST on container death
├─────────────────────────────────────────┤
│ Image Layer 3 (read-only) │
│ /app/config.json │
├─────────────────────────────────────────┤
│ Image Layer 2 (read-only) │
│ /usr/local/bin/node │
├─────────────────────────────────────────┤
│ Image Layer 1 (read-only) │
│ /etc, /usr, /lib │
└─────────────────────────────────────────┘
Consequences of relying on ephemeral storage:
# Demonstrate data loss — run a container, write data, kill it
kubectl run ephemeral-demo --image=busybox -it --rm -- sh
# Inside container:
echo "Important data" > /tmp/mydata.txt
cat /tmp/mydata.txt
# Important data
# Now restart the pod (ctrl+d to exit, pod auto-deletes with --rm)
# If you run it again → /tmp/mydata.txt is gone!
Kubernetes provides Volumes to overcome this limitation.
2. Kubernetes Volumes
What are Volumes?
A Kubernetes Volume is a directory accessible to containers in a Pod. Unlike the container’s ephemeral writable layer, a Volume:
- Survives container restarts within the same Pod (data persists as long as the Pod exists)
- Can be shared between multiple containers in the same Pod
- Supports many backends — local disk, NFS, cloud disks, ConfigMaps, Secrets, etc.
- Is declared in the Pod spec — mounted into containers at specified paths
Pod Spec Structure:
┌─────────────────────────────────────────────────────────────────┐
│ Pod │
│ ┌──────────────────────┐ ┌──────────────────────┐ │
│ │ Container A │ │ Container B │ │
│ │ volumeMounts: │ │ volumeMounts: │ │
│ │ - /data → vol1 │ │ - /shared → vol1 │ │
│ └──────────────────────┘ └──────────────────────┘ │
│ │ │
│ volumes: │ │
│ - name: vol1 ───────────────────┘ │
│ emptyDir: {} │
└─────────────────────────────────────────────────────────────────┘
Volume Types Overview:
| Type | Survives container restart? | Survives Pod deletion? | Shared across Pods? |
|---|---|---|---|
| emptyDir | ✅ Yes | ❌ No | ❌ No |
| hostPath | ✅ Yes | ✅ Yes (on same node) | Only between Pods on the same node |
| configMap | ✅ Yes | ✅ Yes | ✅ Yes (read-only) |
| secret | ✅ Yes | ✅ Yes | ✅ Yes (read-only) |
| persistentVolumeClaim | ✅ Yes | ✅ Yes | Depends on AccessMode |
| nfs | ✅ Yes | ✅ Yes | ✅ Yes |
EmptyDir Volume
An emptyDir volume is created empty when a Pod is assigned to a Node. It exists as long as the Pod is running on that node. All containers in the Pod share the same emptyDir and can read/write to it.
Lifecycle: Pod scheduled → emptyDir created → Pod deleted → emptyDir deleted
Use Cases:
- Scratch space for disk-based merge sort
- Checkpoint files for long computations
- Sharing files between a main container and a sidecar (e.g., log processor)
- Cache directory shared between containers
# emptydir-example.yaml
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-demo
spec:
  containers:
    - name: writer
      image: busybox
      command: ["/bin/sh", "-c"]
      args:
        - |
          while true; do
            echo "$(date): Writing data" >> /shared/output.log
            sleep 5
          done
      volumeMounts:
        - name: shared-data
          mountPath: /shared
    - name: reader
      image: busybox
      command: ["/bin/sh", "-c"]
      args:
        - |
          while true; do
            echo "=== Reading shared log ==="
            cat /shared/output.log 2>/dev/null || echo "File not yet created"
            sleep 10
          done
      volumeMounts:
        - name: shared-data
          mountPath: /shared    # Same volume, same path
  volumes:
    - name: shared-data
      emptyDir: {}              # Empty directory, lives with the Pod
EmptyDir with Memory-Backed Storage:
volumes:
  - name: cache-volume
    emptyDir:
      medium: Memory      # Stored in RAM (tmpfs): faster, but counts against the container memory limit
      sizeLimit: 512Mi    # Limit size to 512 MiB
Test it:
kubectl apply -f emptydir-example.yaml
# Check writer is producing data
kubectl exec emptydir-demo -c writer -- cat /shared/output.log
# Check reader can see the same data
kubectl exec emptydir-demo -c reader -- cat /shared/output.log
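# Restart the writer container. Data survives!
# (A shell running as PID 1 without a signal handler may ignore this signal;
#  check the RESTARTS column of `kubectl get pod emptydir-demo` to confirm the restart.)
kubectl exec emptydir-demo -c writer -- kill 1
# After the container restarts:
kubectl exec emptydir-demo -c writer -- cat /shared/output.log
# Previous data is still there ← emptyDir survived the container restart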
# Delete the Pod — data is lost
kubectl delete pod emptydir-demo
HostPath Volume
A hostPath volume mounts a file or directory from the host Node’s filesystem into the Pod. The data persists beyond the Pod’s lifetime but is tied to the specific node.
Use Cases:
- Accessing the Docker socket (/var/run/docker.sock) for container monitoring tools
- Reading node-level log files (/var/log)
- DaemonSet workloads that need node-local data (log collectors like Fluentd)
- Development/testing where you need node-persistent storage
# hostpath-example.yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo
spec:
  containers:
    - name: app
      image: nginx:1.25
      volumeMounts:
        - name: host-logs
          mountPath: /var/log/nginx-host     # Inside container
        - name: docker-sock
          mountPath: /var/run/docker.sock    # Docker socket access
  volumes:
    - name: host-logs
      hostPath:
        path: /tmp/k8s-logs          # Path on the HOST node
        type: DirectoryOrCreate      # Create if it doesn't exist
    - name: docker-sock
      hostPath:
        path: /var/run/docker.sock
        type: Socket                 # Only if it's a socket file
HostPath Type Values:
| Type | Behaviour |
|---|---|
| "" (empty) | No checks: the path is used as-is |
| DirectoryOrCreate | Creates the directory if it does not exist |
| Directory | Directory must already exist |
| FileOrCreate | Creates an empty file if it does not exist |
| File | File must already exist |
| Socket | Unix socket must exist |
| CharDevice | Character device must exist |
| BlockDevice | Block device must exist |
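The type checks in the table can be mimicked locally to see what each value accepts. The following is only a sketch under assumptions: `check_hostpath` is a hypothetical helper, not kubelet code, and the real kubelet may differ in details such as ownership and error handling.

```python
import os
import stat

# Predicates for the types that only inspect what already exists on the node.
_EXISTING = {
    "Directory": stat.S_ISDIR,
    "File": stat.S_ISREG,
    "Socket": stat.S_ISSOCK,
    "CharDevice": stat.S_ISCHR,
    "BlockDevice": stat.S_ISBLK,
}


def check_hostpath(path: str, type_: str = "") -> bool:
    """Return True if `path` passes the check implied by a hostPath `type` (hypothetical helper)."""
    if type_ == "":
        return True  # empty type: no checks at all
    if type_ == "DirectoryOrCreate":
        os.makedirs(path, mode=0o755, exist_ok=True)
        return os.path.isdir(path)
    if type_ == "FileOrCreate":
        if not os.path.exists(path):
            open(path, "a").close()  # raises if the parent directory is missing
        return os.path.isfile(path)
    if type_ not in _EXISTING:
        raise ValueError(f"unknown hostPath type: {type_!r}")
    return os.path.exists(path) and _EXISTING[type_](os.stat(path).st_mode)
```

For example, `check_hostpath("/var/run/docker.sock", "Socket")` returns True only on a machine where that path exists and is a Unix socket.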
⚠️ Security Warning: hostPath gives containers access to the node filesystem. It should be used sparingly and avoided in multi-tenant clusters. Prefer PersistentVolumes for data persistence.
ConfigMap Volume
Mounts a ConfigMap as a directory of files inside a container. Each key in the ConfigMap becomes a filename; the value becomes the file content.
Use Cases:
- Injecting application configuration files (nginx.conf, app.properties)
- Providing environment-specific settings without rebuilding images
- Storing non-sensitive configuration that can be updated at runtime
# 1. Create a ConfigMap with config file contents
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: default
data:
  app.properties: |
    server.port=8080
    db.host=postgres-service
    db.port=5432
    log.level=INFO
    cache.ttl=3600
  nginx.conf: |
    server {
      listen 80;
      location / {
        proxy_pass http://backend-service:3000;
        proxy_set_header Host $host;
      }
    }
  feature-flags.json: |
    {
      "newDashboard": true,
      "betaCheckout": false,
      "darkMode": true
    }
---
# 2. Mount ConfigMap as volume in a Pod
apiVersion: v1
kind: Pod
metadata:
  name: configmap-volume-demo
spec:
  containers:
    - name: app
      image: mycompany/backend:v1.0
      volumeMounts:
        - name: config-volume
          mountPath: /etc/app-config    # Selected ConfigMap keys appear as files here
          readOnly: true
        - name: nginx-config
          mountPath: /etc/nginx/conf.d
          readOnly: true
  volumes:
    - name: config-volume
      configMap:
        name: app-config                # Reference the ConfigMap
        items:                          # Optional: select specific keys
          - key: app.properties
            path: application.properties    # Rename the file on mount
    - name: nginx-config
      configMap:
        name: app-config
        items:
          - key: nginx.conf
            path: default.conf
Verify the mount:
kubectl exec configmap-volume-demo -- ls /etc/app-config
# application.properties
kubectl exec configmap-volume-demo -- cat /etc/app-config/application.properties
# server.port=8080
# db.host=postgres-service
# ...
# ConfigMap updates propagate to the mounted volume automatically (typically within about a minute; subPath mounts are not updated)
kubectl edit configmap app-config
# Change log.level=DEBUG
# After ~60s:
kubectl exec configmap-volume-demo -- cat /etc/app-config/application.properties
# log.level=DEBUG ← Updated without Pod restart!
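The key-to-file mapping described in this section, including the optional `items` selection and renaming, can be sketched in a few lines. This is an illustration under assumptions: `project_configmap` is a hypothetical helper, and the real kubelet writes the files through symlinks that it swaps atomically on update, which the sketch does not model.

```python
from pathlib import Path
from typing import Dict, List, Optional


def project_configmap(data: Dict[str, str], mount_dir: str,
                      items: Optional[List[Dict[str, str]]] = None) -> List[str]:
    """Write ConfigMap entries as files under mount_dir and return the file names."""
    # Without `items`, every key becomes a file named after the key.
    # With `items`, only the listed keys are written, each under its `path`.
    if items is None:
        mapping = {key: key for key in data}
    else:
        mapping = {item["key"]: item["path"] for item in items}

    root = Path(mount_dir)
    root.mkdir(parents=True, exist_ok=True)
    for key, rel_path in mapping.items():
        if key not in data:
            raise KeyError(f"ConfigMap has no key {key!r}")
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(data[key])
    return sorted(mapping.values())
```

Called with `items=[{"key": "app.properties", "path": "application.properties"}]`, only that one file is written, which matches the `ls /etc/app-config` output shown above.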
Secret Volume
Mounts a Kubernetes Secret as files inside a container. This works like a ConfigMap volume, with two differences: the values are base64-decoded before they are written to the files, and the volume is backed by tmpfs (in-memory), so the secret data is never written to the node's disk.
Use Cases:
- TLS certificates and private keys
- Database passwords
- API keys and tokens
- SSH private keys
# 1. Create a Secret
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
  namespace: default
type: Opaque
data:
  # Values must be base64 encoded: echo -n "value" | base64
  db-password: cGFzc3dvcmQxMjM=          # "password123"
  api-key: c2VjcmV0LWFwaS1rZXktMTIz      # "secret-api-key-123"
stringData:
  # stringData is auto-encoded by Kubernetes, so no manual base64 is needed
  db-url: "postgresql://user:password123@postgres:5432/mydb"
---
# 2. Create TLS Secret from files
# kubectl create secret tls tls-secret \
#   --cert=path/to/tls.crt \
#   --key=path/to/tls.key
# 3. Mount Secret as volume
apiVersion: v1
kind: Pod
metadata:
  name: secret-volume-demo
spec:
  containers:
    - name: app
      image: mycompany/backend:v1.0
      volumeMounts:
        - name: secret-volume
          mountPath: /etc/secrets
          readOnly: true
        - name: tls-certs
          mountPath: /etc/ssl/app
          readOnly: true
  volumes:
    - name: secret-volume
      secret:
        secretName: app-secrets
        defaultMode: 0400          # Restrictive file permissions (owner read-only)
    - name: tls-certs
      secret:
        secretName: tls-secret
        items:
          - key: tls.crt
            path: server.crt
          - key: tls.key
            path: server.key
            mode: 0400             # Extra-restrictive for private key
Verify and inspect:
kubectl exec secret-volume-demo -- ls -lL /etc/secrets
# Illustrative output (-L follows the symlinks the kubelet creates for each key):
# -r-------- 1 root root 18 Jan 20 10:00 api-key
# -r-------- 1 root root 11 Jan 20 10:00 db-password
# -r-------- 1 root root 48 Jan 20 10:00 db-url
kubectl exec secret-volume-demo -- cat /etc/secrets/db-password
# password123 ← Already base64-decoded by Kubernetes!
# Secret files are stored in memory (tmpfs) — not on disk
kubectl exec secret-volume-demo -- df /etc/secrets
# tmpfs ← Confirms in-memory storage
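The relationship between `stringData`, the base64 values under `data`, and the bytes that appear in the mounted files can be checked with the standard library. This is a minimal sketch; the two function names are hypothetical and only mirror the behaviour described above.

```python
import base64
from typing import Dict


def encode_string_data(string_data: Dict[str, str]) -> Dict[str, str]:
    """Mirror what happens to `stringData`: values are stored base64-encoded under `data`."""
    return {key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in string_data.items()}


def decode_for_mount(data: Dict[str, str]) -> Dict[str, bytes]:
    """Mirror what a Secret volume exposes: the decoded bytes, one file per key."""
    return {key: base64.b64decode(value) for key, value in data.items()}


encoded = encode_string_data({"db-password": "password123"})
print(encoded)                    # {'db-password': 'cGFzc3dvcmQxMjM='}
print(decode_for_mount(encoded))  # {'db-password': b'password123'}
print(0o400)                      # 256: YAML 0400 is octal; JSON manifests need the decimal form
```

Base64 is an encoding, not encryption: anyone who can read the Secret object can decode its values.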
3. Persistent Storage Concepts
What is Persistent Volume (PV)?
A Persistent Volume (PV) is a piece of storage in the cluster that has been provisioned by an administrator (or dynamically by a StorageClass). It is a cluster-level resource — not tied to any namespace or Pod — and represents physical storage on a disk, NAS, cloud volume, NFS share, etc.
Persistent Volume = The actual storage resource
(like a hard drive in the cluster)
┌─────────────────────────────────────────────────────────────────┐
│ Kubernetes Cluster │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Persistent Volume (PV) — Cluster Scoped │ │
│ │ │ │
│ │ Name: pv-postgres-data │ │
│ │ Capacity: 50Gi │ │
│ │ AccessMode: ReadWriteOnce │ │
│ │ StorageClass: fast-ssd │ │
│ │ ReclaimPolicy: Retain │ │
│ │ Source: AWS EBS vol-0a1b2c3d4e5f │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Example PV manifest:
# persistent-volume.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-postgres-data
  labels:
    type: ssd
    environment: production
spec:
  capacity:
    storage: 50Gi                          # Total size of this volume
  accessModes:
    - ReadWriteOnce                        # Only one Node can mount read-write at a time
  persistentVolumeReclaimPolicy: Retain    # Keep data after PVC is deleted
  storageClassName: fast-ssd               # Must match PVC's storageClassName
  # Storage backend (choose ONE):
  # Option A: Local path (for testing/on-premise)
  hostPath:
    path: /mnt/data/postgres
  # Option B: NFS
  # nfs:
  #   server: nfs-server.company.com
  #   path: /exports/postgres-data
  # Option C: AWS EBS through the CSI driver (static provisioning;
  # the in-tree awsElasticBlockStore source is deprecated)
  # csi:
  #   driver: ebs.csi.aws.com
  #   volumeHandle: vol-0a1b2c3d4e5f6789
  #   fsType: ext4
What is Persistent Volume Claim (PVC)?
A Persistent Volume Claim (PVC) is a request for storage made by a user or application. It’s namespace-scoped and acts like a storage “order form” — specifying how much storage is needed, what access mode is required, and optionally which StorageClass to use.
Persistent Volume Claim = The storage request
(like ordering a hard drive)
Namespace: production
┌──────────────────────────────────────────────────────────────┐
│ PVC: pvc-postgres-claim │
│ Request: 20Gi │
│ AccessMode: ReadWriteOnce │
│ StorageClass: fast-ssd │
│ Status: Bound → bound to pv-postgres-data │
└──────────────────────────────────────────────────────────────┘
Example PVC manifest:
# persistent-volume-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-postgres-claim
  namespace: production          # PVCs are namespace-scoped
spec:
  accessModes:
    - ReadWriteOnce              # Must be compatible with the PV
  resources:
    requests:
      storage: 20Gi              # Request 20Gi (PV must offer >= 20Gi)
  storageClassName: fast-ssd     # Must match PV's storageClassName
                                 # (omit to use the cluster's default StorageClass)
  # Optional: select a specific PV by labels
  selector:
    matchLabels:
      environment: production
How PV and PVC Work Together
The relationship between PV and PVC follows a bind-and-use lifecycle:
┌─────────────────────────────────────────────────────────────────────┐
│ PV / PVC Lifecycle │
│ │
│ 1. PROVISION │
│ Admin creates PV ──▶ PV Status: Available │
│ (or StorageClass auto-provisions) │
│ │
│ 2. BIND │
│ User creates PVC ──▶ Control plane matches PVC to PV │
│ PVC Status: Bound ◀── PV Status: Bound │
│ │
│ 3. USE │
│ Pod references PVC ──▶ Volume mounted into container │
│ Data read/written ──▶ persists to backend storage │
│ │
│ 4. RELEASE │
│ Pod deleted ──▶ PVC still exists (data safe) │
│ PVC deleted ──▶ PV Status: Released │
│ │
│ 5. RECLAIM (based on ReclaimPolicy) │
│ Retain ──▶ PV stays, data intact, manual cleanup needed │
│ Delete ──▶ PV and underlying storage deleted automatically │
│ Recycle ──▶ Data wiped, PV made Available again (deprecated) │
└─────────────────────────────────────────────────────────────────────┘
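The BIND step can be made concrete with a small matching model. This is a sketch under stated assumptions, not the control plane's implementation: `Volume`, `Claim`, `satisfies`, and `bind` are hypothetical names, `parse_quantity` handles only a few binary suffixes, and preferring the smallest fitting PV is an assumption made here to keep large volumes free for large claims.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(q: str) -> int:
    """Parse a small subset of Kubernetes quantities ("20Gi", "512Mi") into bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * factor
    return int(q)


@dataclass
class Volume:
    name: str
    capacity: str
    access_modes: Set[str]
    storage_class: str
    labels: Dict[str, str] = field(default_factory=dict)
    phase: str = "Available"


@dataclass
class Claim:
    name: str
    request: str
    access_modes: Set[str]
    storage_class: str
    match_labels: Dict[str, str] = field(default_factory=dict)


def satisfies(pv: Volume, pvc: Claim) -> bool:
    """True if the PV is free and meets every requirement stated by the claim."""
    return (
        pv.phase == "Available"
        and pv.storage_class == pvc.storage_class
        and pvc.access_modes <= pv.access_modes
        and parse_quantity(pv.capacity) >= parse_quantity(pvc.request)
        and all(pv.labels.get(k) == v for k, v in pvc.match_labels.items())
    )


def bind(pvs: List[Volume], pvc: Claim) -> Optional[Volume]:
    """Bind the claim to the smallest PV that satisfies it, or return None (claim stays Pending)."""
    candidates = [pv for pv in pvs if satisfies(pv, pvc)]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda pv: parse_quantity(pv.capacity))
    chosen.phase = "Bound"
    return chosen
```

With the manifests from this section, a 20Gi claim binds to the 50Gi `pv-postgres-data`, and the claim then reports the PV's full 50Gi capacity rather than the requested 20Gi.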
Using a PVC in a Pod:
# pod-with-pvc.yaml
apiVersion: v1
kind: Pod
metadata:
  name: postgres-pod
  namespace: production
spec:
  containers:
    - name: postgres
      image: postgres:15
      env:
        - name: POSTGRES_DB
          value: "myapp"
        - name: POSTGRES_USER
          value: "admin"
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password
      ports:
        - containerPort: 5432
      volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data    # Where PostgreSQL stores data
  volumes:
    - name: postgres-storage
      persistentVolumeClaim:
        claimName: pvc-postgres-claim            # Reference the PVC
Verify the binding:
# Check PV status
kubectl get pv pv-postgres-data
# NAME STATUS CLAIM STORAGECLASS AGE
# pv-postgres-data Bound production/pvc-postgres-claim fast-ssd 5m
# Check PVC status
kubectl get pvc pvc-postgres-claim -n production
# NAME STATUS VOLUME CAPACITY ACCESS MODES
# pvc-postgres-claim Bound pv-postgres-data 50Gi RWO
# Check Pod is using the PVC
kubectl describe pod postgres-pod -n production | grep -A5 Volumes
# Volumes:
# postgres-storage:
# Type: PersistentVolumeClaim
# ClaimName: pvc-postgres-claim
# ReadOnly: false
Access Modes in PV
Access modes define how many Nodes can mount the volume simultaneously and in what mode.
| Access Mode | Short | Description | Example Storage |
|---|---|---|---|
| ReadWriteOnce | RWO | One Node mounts read-write (several Pods on that Node can share it) | AWS EBS, GCE PD, Azure Disk |
| ReadOnlyMany | ROX | Many Nodes mount read-only | NFS, CephFS |
| ReadWriteMany | RWX | Many Nodes mount read-write | NFS, CephFS, Azure Files |
| ReadWriteOncePod | RWOP | One Pod mounts read-write (alpha in Kubernetes 1.22, stable since 1.29) | CSI volumes |
ReadWriteOnce (RWO): ReadWriteMany (RWX):
┌────────────┐ ┌────────────┐
│ Node 1 │◀─── Mounted RW │ Node 1 │◀─── Mounted RW
│ (Pod A) │ │ (Pod A) │
└────────────┘ ┌──────────┐ └────────────┘
│ PV │
┌────────────┐ └──────────┘ ┌────────────┐
│ Node 2 │ │ Node 2 │◀─── Mounted RW
│ (empty) │ ✗ Cannot mount │ (Pod B) │
└────────────┘ └────────────┘
⚠️ Important: Access modes describe what the storage supports, not what is currently active. A PV with RWX can still be mounted by just one node — but it allows many.
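The difference between node-level and Pod-level exclusivity is easy to miss, so here is a sketch of the rules from the table. `can_mount` is a hypothetical function for illustration only; in a real cluster the checks are spread across the scheduler, the attach/detach controller, and the storage driver.

```python
from typing import Set, Tuple


def can_mount(mode: str, current: Set[Tuple[str, str]], node: str, pod: str) -> bool:
    """Decide whether `pod` on `node` may use a volume.

    `current` holds the (node, pod) pairs that already use the volume.
    """
    if mode in ("ReadWriteMany", "ReadOnlyMany"):
        return True  # any number of Nodes (read-only for ReadOnlyMany)
    if mode == "ReadWriteOnce":
        # Exclusive per Node: more Pods are fine as long as they share the Node.
        return all(used_node == node for used_node, _ in current)
    if mode == "ReadWriteOncePod":
        # Exclusive per Pod: only the Pod that already holds it (or nobody) qualifies.
        return not current or current == {(node, pod)}
    raise ValueError(f"unknown access mode: {mode!r}")
```

Given `current = {("node-1", "pod-a")}`, a second Pod on node-1 is accepted under ReadWriteOnce but rejected under ReadWriteOncePod.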
Reclaim Policies
When a PVC is deleted, the ReclaimPolicy on the PV determines what happens to the data:
spec:
persistentVolumeReclaimPolicy: Retain # or Delete, or Recycle
| Policy | What Happens to PV | What Happens to Data | Use Case |
|---|---|---|---|
| Retain | PV stays in Released state | Data intact — manual admin action needed | Production databases |
| Delete | PV deleted automatically | Underlying storage deleted | Dynamic provisioning, cloud disks |
| Recycle | PV scrubbed (rm -rf) and made Available | Data wiped | Deprecated — use dynamic provisioning |
Retain workflow (most important for production):
# 1. Delete PVC
kubectl delete pvc pvc-postgres-claim -n production
# 2. Check PV status — it becomes Released
kubectl get pv pv-postgres-data
# STATUS: Released ← PV is not re-usable yet (claimRef still set)
# 3. Manual cleanup — remove claimRef to make PV Available again
kubectl patch pv pv-postgres-data -p '{"spec":{"claimRef": null}}'
# 4. Now PV is Available for a new PVC
kubectl get pv pv-postgres-data
# STATUS: Available
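The three policies and the manual Retain step can be summarised as a tiny state machine. This is a sketch under assumptions: the dictionary layout and the function names `on_claim_deleted` and `clear_claim_ref` are hypothetical and only model the phases named in the table.

```python
from typing import Optional


def on_claim_deleted(pv: dict) -> Optional[dict]:
    """Return the PV's state after its PVC is deleted, or None if the PV is removed."""
    policy = pv["reclaimPolicy"]
    if policy == "Delete":
        return None  # PV object and backing storage are both deleted
    if policy == "Retain":
        return {**pv, "phase": "Released"}  # claimRef still set, data intact
    if policy == "Recycle":
        return {**pv, "phase": "Available", "claimRef": None, "data": None}
    raise ValueError(f"unknown reclaim policy: {policy!r}")


def clear_claim_ref(pv: dict) -> dict:
    """Model the manual `kubectl patch` step: Released -> Available, old data still present."""
    if pv["phase"] != "Released":
        raise ValueError("only a Released PV needs its claimRef cleared")
    return {**pv, "phase": "Available", "claimRef": None}
```

Note that clearing the claimRef does not wipe anything: the next claim that binds to the PV sees the previous data unless an administrator removes it first.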
4. Storage Classes
What is StorageClass?
A StorageClass defines a class or tier of storage and how it should be dynamically provisioned. Think of it like a storage catalogue — administrators create StorageClasses describing different storage offerings (fast SSD, slow HDD, replicated NFS, etc.), and users reference them in PVCs without needing to know the underlying infrastructure.
Without StorageClass: With StorageClass:
Admin manually creates PVs ─▶ PVs auto-created on demand
User waits for admin ─▶ User creates PVC → PV auto-provisioned
Static, slow process ─▶ Dynamic, instant storage
StorageClass anatomy:
# storage-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"    # Not the default
provisioner: ebs.csi.aws.com       # Which driver provisions the storage
                                   # (the in-tree kubernetes.io/aws-ebs plugin is deprecated)
parameters:
  type: gp3                        # AWS EBS volume type
  iops: "3000"
  throughput: "125"
  csi.storage.k8s.io/fstype: ext4
  encrypted: "true"
  kmsKeyId: "arn:aws:kms:us-east-1:..."
reclaimPolicy: Delete              # Delete PV when PVC is deleted
allowVolumeExpansion: true         # Allow PVC resize (kubectl edit pvc)
volumeBindingMode: WaitForFirstConsumer    # Delay PV creation until Pod is scheduled
                                           # (vs Immediate: create PV on PVC creation)
mountOptions:
  - debug
  - discard
Dynamic Provisioning
Dynamic provisioning automatically creates a PV when a PVC is created that references a StorageClass. No admin pre-provisioning required.
Dynamic Provisioning Flow:
User creates PVC with StorageClass "fast-ssd"
│
▼
Control plane calls StorageClass provisioner plugin
│
▼
Provisioner creates the physical volume
(e.g., calls AWS API to create a new EBS volume)
│
▼
PV is automatically created in Kubernetes
│
▼
PV is bound to PVC automatically
│
▼
Pod mounts the PVC — storage ready!
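The flow above, including the effect of volumeBindingMode, can be simulated in a few lines. This is a sketch under assumptions: `DynamicClaim`, `create_claim`, and `schedule_pod` are hypothetical names, and the only real-world detail reproduced is that dynamically provisioned PVs are named `pvc-<claim UID>`.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DynamicClaim:
    name: str
    size: str
    binding_mode: str  # "Immediate" or "WaitForFirstConsumer"
    uid: str = field(default_factory=lambda: str(uuid.uuid4()))
    phase: str = "Pending"
    volume: Optional[str] = None


def _provision(claim: DynamicClaim) -> str:
    """Stand-in for the provisioner: create the backing volume and bind the PV."""
    claim.volume = f"pvc-{claim.uid}"
    claim.phase = "Bound"
    return claim.volume


def create_claim(claim: DynamicClaim) -> DynamicClaim:
    """Creating the PVC provisions at once only in Immediate mode."""
    if claim.binding_mode == "Immediate":
        _provision(claim)
    return claim


def schedule_pod(claim: DynamicClaim, node: str) -> str:
    """Scheduling the first consumer triggers provisioning for a still-Pending claim."""
    if claim.phase == "Pending":
        _provision(claim)
    return claim.volume
```

A claim created with `WaitForFirstConsumer` stays Pending after `create_claim` and only becomes Bound once `schedule_pod` runs, which is the behaviour to expect from the StorageClasses in this module.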
Dynamic provisioning example:
# 1. StorageClass (admin sets up once)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
# 2. PVC (user creates it; no PV needed!)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard-ssd    # References the StorageClass
---
# 3. Pod using the dynamically provisioned PVC
apiVersion: v1
kind: Pod
metadata:
  name: app-with-dynamic-storage
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: app-data
          mountPath: /data
  volumes:
    - name: app-data
      persistentVolumeClaim:
        claimName: dynamic-pvc
# Apply the manifests above (StorageClass, PVC, and Pod)
kubectl apply -f dynamic-pvc.yaml
# With WaitForFirstConsumer the PVC stays Pending until the Pod that uses it is scheduled;
# the provisioner then creates the PV. Watch it appear:
kubectl get pv -w
# NAME CAPACITY STATUS
# pvc-a1b2c3d4-e5f6-7890-abcd-ef1234567890 10Gi Bound
# The PV was auto-created by the provisioner!
kubectl get pvc dynamic-pvc
# STATUS: Bound ← Bound once the Pod has been scheduled
Default StorageClass
When a PVC doesn’t specify a storageClassName, Kubernetes uses the default StorageClass (if one is configured).
# View all StorageClasses and find the default
kubectl get storageclass
# NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE DEFAULT
# standard (default) rancher.io/local-path Delete WaitForFirstConsumer ← DEFAULT
# fast-ssd ebs.csi.aws.com Delete WaitForFirstConsumer
# Set a StorageClass as default
kubectl patch storageclass fast-ssd \
-p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
# Unset the old default
kubectl patch storageclass standard \
-p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
PVC without storageClassName (uses default):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: auto-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  # storageClassName omitted → uses default StorageClass
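How the class of a claim is resolved depends on whether storageClassName is absent or explicitly empty, which the sketch below spells out. `effective_storage_class` is a hypothetical helper; how a real cluster handles several classes marked as default depends on the Kubernetes version, so the sketch simply takes the first one.

```python
from typing import Dict, Optional


def effective_storage_class(pvc_class: Optional[str], classes: Dict[str, bool]) -> Optional[str]:
    """Resolve the StorageClass a PVC ends up with.

    `pvc_class` is the PVC's storageClassName (None when the field is omitted).
    `classes` maps each StorageClass name to whether it is annotated as default.
    """
    if pvc_class is not None:
        # An explicit value wins. The empty string means "no class":
        # the claim can then only bind to a pre-created PV without a class.
        return pvc_class
    defaults = [name for name, is_default in classes.items() if is_default]
    return defaults[0] if defaults else None  # None: no class is assigned
```

For the cluster shown above, `effective_storage_class(None, {"standard": True, "fast-ssd": False})` resolves to `"standard"`.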
Storage Provisioners
A provisioner is the plugin that handles the actual storage creation. The provisioner field in StorageClass tells Kubernetes which plugin to call.
| Provisioner | Storage Backend | Environment |
|---|---|---|
| ebs.csi.aws.com | AWS EBS | AWS |
| disk.csi.azure.com | Azure Managed Disk | Azure |
| pd.csi.storage.gke.io | Google Persistent Disk | GCP |
| file.csi.azure.com | Azure Files (SMB or NFS) | Azure |
| nfs.csi.k8s.io | NFS Server | On-premise/Any |
| rancher.io/local-path | Local node path | Local/Dev |
| docker.io/hostpath | Host path | Docker Desktop |
| rook-ceph.rbd.csi.ceph.com | Ceph RBD | On-premise |
| linstor.csi.linbit.com | LINSTOR/DRBD | On-premise |
5. Stateful Applications
Introduction to StatefulSets
A StatefulSet is a Kubernetes workload resource designed for stateful applications. Unlike Deployments where all Pods are interchangeable, StatefulSets provide:
Deployment Pods: StatefulSet Pods:
┌─────────────────────────┐ ┌─────────────────────────┐
│ web-7d9f8c-abc12 │ │ mysql-0 (Primary) │
│ web-7d9f8c-def34 │ │ mysql-1 (Replica) │
│ web-7d9f8c-ghi56 │ │ mysql-2 (Replica) │
│ │ │ │
│ Random names │ │ Stable, ordered names │
│ Any order │ │ Created 0→1→2 │
│ Any node │ │ Deleted 2→1→0 │
│ Shared or no storage │ │ Each has own PVC │
└─────────────────────────┘ └─────────────────────────┘
StatefulSet guarantees:
- Stable network identities: <statefulset>-<ordinal> (mysql-0, mysql-1)
- Stable storage: each Pod gets its own PVC that persists across rescheduling
- Ordered deployment/scaling: Pods start/stop in a predictable sequence
- Ordered rolling updates: updates proceed in reverse ordinal order
Storage in StatefulSets
Each Pod in a StatefulSet gets its own dedicated PVC — not shared. When a Pod is rescheduled (even to a different node), it reattaches to the same PVC and thus the same data.
StatefulSet Storage Architecture:
mysql-0 ──binds to──▶ data-mysql-0 ──▶ PV (50Gi) ──▶ Actual Disk A
mysql-1 ──binds to──▶ data-mysql-1 ──▶ PV (50Gi) ──▶ Actual Disk B
mysql-2 ──binds to──▶ data-mysql-2 ──▶ PV (50Gi) ──▶ Actual Disk C
If mysql-1 is rescheduled to Node 3:
mysql-1 ──still binds to──▶ data-mysql-1 ──▶ PV (50Gi) ──▶ Same Disk B
(data unchanged!)
VolumeClaimTemplates
The volumeClaimTemplates field in a StatefulSet spec automatically creates a unique PVC for each Pod, named <template-name>-<pod-name>.
# statefulset-mysql.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  namespace: production
spec:
  serviceName: mysql-headless    # Required: headless service for DNS
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      initContainers:
        - name: init-mysql
          image: mysql:8.0
          command:
            - bash
            - "-c"
            - |
              # Assign server-id based on pod ordinal
              [[ $(hostname) =~ -([0-9]+)$ ]] || exit 1
              ordinal=${BASH_REMATCH[1]}
              echo [mysqld] > /mnt/conf.d/server-id.cnf
              echo server-id=$((100 + $ordinal)) >> /mnt/conf.d/server-id.cnf
          volumeMounts:
            - name: conf
              mountPath: /mnt/conf.d
      containers:
        - name: mysql
          image: mysql:8.0
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: root-password
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: data             # References volumeClaimTemplate name
              mountPath: /var/lib/mysql
            - name: conf
              mountPath: /etc/mysql/conf.d
          readinessProbe:
            exec:
              # Exec probes do not expand $(VAR), so the variable is read through a shell
              command: ["bash", "-c", "mysqladmin ping -u root -p\"$MYSQL_ROOT_PASSWORD\""]
            initialDelaySeconds: 30
            periodSeconds: 10
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"
            limits:
              cpu: "2"
              memory: "4Gi"
      volumes:
        - name: conf
          emptyDir: {}
  # This is the key: unique PVC per Pod
  volumeClaimTemplates:
    - metadata:
        name: data    # Creates: data-mysql-0, data-mysql-1, data-mysql-2
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 50Gi
---
# Headless Service (required for StatefulSet DNS)
apiVersion: v1
kind: Service
metadata:
  name: mysql-headless
  namespace: production
spec:
  clusterIP: None    # Headless: no virtual IP
  selector:
    app: mysql
  ports:
    - port: 3306
Verify StatefulSet storage:
kubectl apply -f statefulset-mysql.yaml
# Watch pods come up in order (0 → 1 → 2)
kubectl get pods -n production -l app=mysql -w
# mysql-0 1/1 Running 0 30s
# mysql-1 1/1 Running 0 60s
# mysql-2 1/1 Running 0 90s
# Verify PVCs — one per pod, auto-named
kubectl get pvc -n production
# NAME STATUS VOLUME CAPACITY ACCESS MODES
# data-mysql-0 Bound pvc-abc123... 50Gi RWO
# data-mysql-1 Bound pvc-def456... 50Gi RWO
# data-mysql-2 Bound pvc-ghi789... 50Gi RWO
# DNS for each pod (via headless service):
# mysql-0.mysql-headless.production.svc.cluster.local
# mysql-1.mysql-headless.production.svc.cluster.local
# mysql-2.mysql-headless.production.svc.cluster.local
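The naming rules used throughout this section (Pod names, PVC names, per-Pod DNS names, and the server-id derived in the init container) follow simple patterns that can be reproduced directly. This is a minimal sketch; `statefulset_identities` and `server_id` are hypothetical helper names.

```python
import re
from typing import Dict, List


def statefulset_identities(name: str, replicas: int, service: str,
                           namespace: str, templates: List[str]) -> List[Dict[str, object]]:
    """List the stable identity of every Pod in a StatefulSet."""
    rows = []
    for ordinal in range(replicas):
        pod = f"{name}-{ordinal}"
        rows.append({
            "pod": pod,
            # One PVC per volumeClaimTemplate: <template-name>-<pod-name>
            "pvcs": [f"{template}-{pod}" for template in templates],
            # Per-Pod DNS record published through the headless Service
            "dns": f"{pod}.{service}.{namespace}.svc.cluster.local",
        })
    return rows


def server_id(hostname: str, base: int = 100) -> int:
    """Same rule as the init container: base + the ordinal at the end of the hostname."""
    match = re.search(r"-(\d+)$", hostname)
    if match is None:
        raise ValueError(f"hostname has no ordinal suffix: {hostname!r}")
    return base + int(match.group(1))


for row in statefulset_identities("mysql", 3, "mysql-headless", "production", ["data"]):
    print(row["pod"], row["pvcs"], row["dns"])
```

Because these names are derived only from the StatefulSet name and the ordinal, a rescheduled mysql-1 looks up data-mysql-1 again and finds its previous data.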
6. Cloud Storage Integration
AWS EBS with Kubernetes
AWS Elastic Block Store (EBS) provides block storage volumes for AWS EC2 instances. With the EBS CSI driver, Kubernetes can dynamically provision EBS volumes for PVCs.
Architecture:
┌────────────────────────────────────────────────────────┐
│ AWS EKS Cluster │
│ │
│ ┌──────────────────────────┐ │
│ │ Pod │ │
│ │ ┌────────────────────┐ │ │
│ │ │ Container │ │ │
│ │ │ /var/lib/postgres ─┼──┼──▶ PVC ──▶ EBS Volume │
│ │ └────────────────────┘ │ (gp3, 50Gi) │
│ └──────────────────────────┘ │
└────────────────────────────────────────────────────────┘
Setup EBS CSI Driver (EKS):
# Install EBS CSI Driver using Helm
helm repo add aws-ebs-csi-driver \
https://kubernetes-sigs.github.io/aws-ebs-csi-driver
helm install aws-ebs-csi-driver aws-ebs-csi-driver/aws-ebs-csi-driver \
--namespace kube-system \
--set controller.serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=\
arn:aws:iam::ACCOUNT:role/AmazonEKS_EBS_CSI_DriverRole
EBS StorageClass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
  encrypted: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer    # Critical for EBS (zone-aware)
allowVolumeExpansion: true
⚠️ EBS Limitation: EBS volumes are ReadWriteOnce only; they can attach to only one EC2 instance at a time. For ReadWriteMany, use EFS (Elastic File System) instead.
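Why WaitForFirstConsumer matters for zonal disks such as EBS can be shown with a toy model. This is a sketch under loud assumptions: `volume_zone` and `pod_can_attach` are hypothetical, and picking the alphabetically first zone in Immediate mode is only a stand-in for "a zone chosen without knowing where the Pod will run"; real drivers choose differently.

```python
from typing import List


def volume_zone(binding_mode: str, pod_zone: str, zones: List[str]) -> str:
    """Return the zone in which the volume gets created."""
    if binding_mode == "WaitForFirstConsumer":
        return pod_zone  # provisioning waits for scheduling, so the Pod's zone is known
    if binding_mode == "Immediate":
        return sorted(zones)[0]  # assumption: arbitrary choice made before any Pod exists
    raise ValueError(f"unknown volumeBindingMode: {binding_mode!r}")


def pod_can_attach(binding_mode: str, pod_zone: str, zones: List[str]) -> bool:
    """A zonal volume can only attach to a node in the same zone."""
    return volume_zone(binding_mode, pod_zone, zones) == pod_zone
```

For a Pod that has to run in us-east-1b, the Immediate case in this model places the disk in us-east-1a and the attach fails, while WaitForFirstConsumer creates the disk next to the Pod.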
Azure Disk Storage
Azure Managed Disks provide block storage for Azure Kubernetes Service (AKS).
# Azure Disk StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-premium-ssd
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS          # Premium SSD locally redundant
  # skuName: StandardSSD_LRS    # Standard SSD
  # skuName: Standard_LRS       # Standard HDD
  kind: Managed
  cachingMode: ReadOnly         # None, ReadOnly, ReadWrite
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
Azure Files (ReadWriteMany):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-files-premium
provisioner: file.csi.azure.com
parameters:
  skuName: Premium_LRS
reclaimPolicy: Delete
volumeBindingMode: Immediate
allowVolumeExpansion: true
mountOptions:
  - dir_mode=0777
  - file_mode=0777
  - uid=0
  - gid=0
  - mfsymlinks
  - cache=strict
Google Persistent Disk
Google Persistent Disks are block storage for Google Kubernetes Engine (GKE).
# GKE StorageClass with pd-ssd
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gce-pd-ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd                     # pd-standard, pd-ssd, pd-balanced, pd-extreme
  replication-type: regional-pd    # For regional redundancy
  disk-encryption-kms-key: projects/.../cryptoKeyVersions/...
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
GKE Filestore (ReadWriteMany):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: filestore-rwx
provisioner: filestore.csi.storage.gke.io
parameters:
  tier: standard      # standard, premium, enterprise
  network: default
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
NFS Storage
NFS (Network File System) is a popular on-premises storage solution that supports the ReadWriteMany access mode: multiple Pods on multiple Nodes can mount the same NFS share simultaneously.
Setup NFS CSI Driver:
# Install NFS CSI Driver
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs \
--namespace kube-system
NFS StorageClass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-storage
provisioner: nfs.csi.k8s.io
parameters:
  server: nfs-server.company.com # NFS server hostname or IP
  share: /exports/k8s-volumes # NFS export path
  subDir: ${pvc.metadata.name} # Create subdirectory per PVC
reclaimPolicy: Delete
volumeBindingMode: Immediate
mountOptions:
  - nfsvers=4.1
  - hard
  - timeo=600
  - retrans=3
NFS PVC (ReadWriteMany):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data-pvc
spec:
  accessModes:
    - ReadWriteMany # Multiple Pods can write simultaneously
  storageClassName: nfs-storage
  resources:
    requests:
      storage: 100Gi
Multi-Pod NFS usage (shared model):
# Three Pods all writing to the same NFS volume
apiVersion: apps/v1
kind: Deployment
metadata:
  name: log-processors
spec:
  replicas: 3
  selector: # Required for apps/v1 Deployments
    matchLabels:
      app: log-processors
  template:
    metadata:
      labels:
        app: log-processors # Must match the selector above
    spec:
      containers:
        - name: processor
          image: mycompany/log-processor:v1.0
          volumeMounts:
            - name: shared-logs
              mountPath: /shared/logs
      volumes:
        - name: shared-logs
          persistentVolumeClaim:
            claimName: shared-data-pvc # All 3 pods share the same PVC
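Before scaling a Deployment across nodes on one claim, it can help to confirm that the claim actually requests ReadWriteMany. The helper below is a small sketch; the function name pvc_supports_rwx is hypothetical, and it only inspects the access modes declared on the PVC, not what the backing storage can really do.

```shell
# Return 0 if the PVC requests ReadWriteMany, 1 if not, 2 if the lookup fails.
pvc_supports_rwx() {
  local pvc="$1" ns="${2:-default}"
  local modes
  # jsonpath prints the access modes separated by spaces
  modes="$(kubectl get pvc "$pvc" -n "$ns" -o jsonpath='{.spec.accessModes[*]}')" || return 2
  case " $modes " in
    *" ReadWriteMany "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   pvc_supports_rwx shared-data-pvc default && kubectl scale deployment log-processors --replicas=3
```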
7. CSI (Container Storage Interface)
What is CSI?
The Container Storage Interface (CSI) is a standardised API that enables storage vendors to write one driver that works across any container orchestration system (Kubernetes, Mesos, Nomad, etc.) without modifying the orchestrator’s core code.
Before CSI (in-tree plugins):
┌──────────────────────────────┐
│ Kubernetes Core              │
│  ┌────────┐  ┌────────┐      │
│  │AWS EBS │  │GCE PD  │      │
│  │plugin  │  │plugin  │      │
│  └────────┘  └────────┘      │
│  ┌────────┐  ┌────────┐      │
│  │Azure   │  │Ceph    │      │
│  │plugin  │  │plugin  │      │
│  └────────┘  └────────┘      │
│ (compiled into k8s binary!)  │
└──────────────────────────────┘

After CSI:
┌──────────────────────────────┐
│ Kubernetes Core              │
│  ┌─────────────────────────┐ │
│  │ CSI Interface (stable)  │ │
│  └───────────────┬─────────┘ │
└──────────────────┼───────────┘
                   │
┌──────────────────┼───────────┐
│ CSI Driver Pods  │           │
│  ┌─────────┐  ┌──┴──────┐    │
│  │AWS EBS  │  │GCE PD   │    │
│  │CSI      │  │CSI      │    │
│  └─────────┘  └─────────┘    │
│ (deployed independently!)    │
└──────────────────────────────┘
CSI Drivers
A CSI driver consists of two Kubernetes-deployed components, a controller plugin and a node plugin:
CSI Driver Architecture:
┌────────────────────────────────────────────────────────────┐
│ CSI Driver Deployment │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Controller Plugin (Deployment) │ │
│ │ - CreateVolume / DeleteVolume │ │
│ │ - ControllerPublishVolume / UnpublishVolume │ │
│ │ - CreateSnapshot / DeleteSnapshot │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Node Plugin (DaemonSet — runs on every Node) │ │
│ │ - NodeStageVolume (format/mount on node) │ │
│ │ - NodePublishVolume (bind-mount into Pod) │ │
│ │ - NodeUnpublishVolume (unmount from Pod) │ │
│ └─────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────┘
Popular CSI Drivers:
| Driver | Storage | Typical Install Method |
|---|---|---|
| AWS EBS CSI | Amazon EBS | Helm chart: aws-ebs-csi-driver |
| Azure Disk CSI | Azure Managed Disk | Preinstalled on AKS |
| GCE PD CSI | Google PD | Preinstalled on GKE |
| NFS CSI | NFS Servers | Helm chart: csi-driver-nfs |
| Rook/Ceph CSI | Ceph cluster | Helm chart: rook-ceph |
| Longhorn | Distributed block | Helm chart: longhorn |
| OpenEBS | Local/Distributed | Helm chart: openebs |
# List installed CSI drivers in your cluster
kubectl get csidrivers
# NAME ATTACHREQUIRED PODINFOONMOUNT STORAGECAPACITY
# ebs.csi.aws.com true false false
# efs.csi.aws.com false false false
# nfs.csi.k8s.io false false false
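A common follow-up to listing drivers is checking that a StorageClass points at a provisioner that is actually registered. The sketch below assumes the provisioner is a CSI driver that registers a CSIDriver object; the function name provisioner_registered is hypothetical.

```shell
# Return 0 if the StorageClass's provisioner appears among registered CSIDriver objects.
provisioner_registered() {
  local sc="$1" prov
  prov="$(kubectl get storageclass "$sc" -o jsonpath='{.provisioner}')" || return 2
  # One driver name per line, then an exact whole-line match
  kubectl get csidrivers -o jsonpath='{.items[*].metadata.name}' \
    | tr ' ' '\n' \
    | grep -Fxq "$prov"
}

# Usage:
#   provisioner_registered fast-ssd || echo "provisioner for fast-ssd is not registered"
```

A non-zero result is only a hint: in-tree provisioners and some external provisioners do not create a CSIDriver object.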
Benefits of CSI
| Benefit | Description |
|---|---|
| Vendor-agnostic | One standard for all storage vendors |
| Out-of-tree drivers | Drivers deployed independently, not compiled into Kubernetes |
| Independent versioning | Drivers update without Kubernetes upgrades |
| Rich feature set | Snapshots, cloning, expansion — all standardised |
| Simpler deprecation | In-tree plugins can be removed without breaking existing drivers |
| Faster innovation | Vendors release new features faster without waiting for Kubernetes release cycles |
8. Backup and Data Protection
Volume Snapshots
Kubernetes VolumeSnapshots allow you to take a point-in-time copy of a PVC. Like StorageClasses for PVCs, VolumeSnapshotClasses define how snapshots are taken.
Setup:
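# Install the VolumeSnapshot CRDs (if not pre-installed)
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
# Install the snapshot controller itself (the CRDs alone do not take snapshots)
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml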
Create a VolumeSnapshotClass:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-vsc
driver: ebs.csi.aws.com
deletionPolicy: Delete # Delete or Retain the snapshot when VolumeSnapshot is deleted
parameters:
  tagSpecification_1: "backup-by=k8s-snapshot-controller"
Take a snapshot:
# volumesnapshot.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: postgres-backup-2026-01-20
  namespace: production
spec:
  volumeSnapshotClassName: ebs-vsc
  source:
    persistentVolumeClaimName: pvc-postgres-claim # The PVC to snapshot
kubectl apply -f volumesnapshot.yaml
# Check snapshot status
kubectl get volumesnapshot -n production
# NAME READYTOUSE SOURCEPVC AGE
# postgres-backup-2026-01-20 true pvc-postgres-claim 2m
kubectl describe volumesnapshot postgres-backup-2026-01-20 -n production
# Status:
# Ready To Use: true
# Restore Size: 50Gi
Restore from snapshot:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-restored
  namespace: production
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: ebs-gp3
  dataSource: # Restore from snapshot
    name: postgres-backup-2026-01-20
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
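A restore PVC should only be created once the snapshot reports that it is ready to use. The helper below is a minimal polling sketch; the function name wait_for_snapshot and the default timeout and interval are assumptions for illustration.

```shell
# Poll a VolumeSnapshot until .status.readyToUse is "true" or the timeout expires.
# Args: name, namespace, timeout seconds (default 300), poll interval seconds (default 5)
wait_for_snapshot() {
  local name="$1" ns="$2" timeout="${3:-300}" interval="${4:-5}"
  local waited=0 ready
  while [ "$waited" -lt "$timeout" ]; do
    ready="$(kubectl get volumesnapshot "$name" -n "$ns" \
      -o jsonpath='{.status.readyToUse}' 2>/dev/null)"
    if [ "$ready" = "true" ]; then
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 1
}

# Usage:
#   wait_for_snapshot postgres-backup-2026-01-20 production && kubectl apply -f restore-pvc.yaml
```

Here restore-pvc.yaml stands for the restore manifest shown above saved to a file. Recent kubectl versions can express the same wait with kubectl wait and a jsonpath condition.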
Backup Strategies
Strategy 1: Application-Level Backup (Recommended for Databases)
The most reliable approach for databases — use the database’s own backup tools:
# PostgreSQL logical backup using pg_dumpall
# (no -it flags: an allocated TTY would corrupt the piped dump)
kubectl exec postgres-0 -n production -- \
  pg_dumpall -U postgres | gzip > /backup/postgres-$(date +%Y%m%d).sql.gz
# MySQL logical backup using mysqldump
# (sh -c with single quotes expands MYSQL_ROOT_PASSWORD inside the Pod, assuming it is set there)
kubectl exec mysql-0 -n production -- \
  sh -c 'mysqldump --all-databases -u root -p"$MYSQL_ROOT_PASSWORD"' | \
  gzip > /backup/mysql-$(date +%Y%m%d).sql.gz
Strategy 2: Volume Snapshot Backup (Block-Level)
# Automated snapshot script (run as CronJob)
# The quoted 'EOF' stops the local shell from expanding $(date ...) and ${DATE},
# so the date is evaluated inside the Job on every run, not once at apply time.
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-snapshot
  namespace: production
spec:
  schedule: "0 2 * * *" # Every day at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: snapshot-sa
          containers:
            - name: snapshot-creator
              image: bitnami/kubectl
              command:
                - /bin/sh
                - -c
                - |
                  DATE=$(date +%Y%m%d)
                  kubectl apply -f - <<SNAP
                  apiVersion: snapshot.storage.k8s.io/v1
                  kind: VolumeSnapshot
                  metadata:
                    name: auto-snapshot-${DATE}
                    namespace: production
                  spec:
                    volumeSnapshotClassName: ebs-vsc
                    source:
                      persistentVolumeClaimName: pvc-postgres-claim
                  SNAP
          restartPolicy: OnFailure
EOF
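A daily snapshot job accumulates snapshots indefinitely unless old ones are pruned. The sketch below shows one possible retention rule: keep the newest N and list the rest for deletion. It assumes snapshot names end in YYYYMMDD so that a plain sort is chronological; the function name snapshots_to_prune is hypothetical.

```shell
# Read snapshot names on stdin and print every name except the newest $1.
# Assumes names share a prefix and end in YYYYMMDD.
snapshots_to_prune() {
  local keep="$1"
  sort | awk -v keep="$keep" '
    { names[NR] = $0 }
    END { for (i = 1; i <= NR - keep; i++) print names[i] }
  '
}

# Usage (keeps the 7 newest auto-snapshots):
#   kubectl get volumesnapshot -n production -o name \
#     | grep auto-snapshot- \
#     | snapshots_to_prune 7 \
#     | xargs -r kubectl delete -n production
```

With deletionPolicy: Delete on the VolumeSnapshotClass, deleting a VolumeSnapshot also removes the underlying storage snapshot, so it is worth printing the prune list and reviewing it before piping it to a delete.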
Strategy 3: Velero — Full Cluster Backup
Velero is a widely used open-source tool for backing up entire Kubernetes clusters, including resources and volumes:
# Install Velero with AWS S3 backend
velero install \
--provider aws \
--plugins velero/velero-plugin-for-aws:v1.9.0 \
--bucket my-k8s-backups \
--secret-file ./credentials-velero \
--backup-location-config region=us-east-1 \
--snapshot-location-config region=us-east-1
# Create a backup of a namespace
velero backup create production-backup \
--include-namespaces production \
--wait
# Schedule daily backups
velero schedule create daily-backup \
--schedule="0 1 * * *" \
--include-namespaces production \
--ttl 720h # Keep backups for 30 days
# Restore from backup
velero restore create --from-backup production-backup
Disaster Recovery Basics
Recovery Time Objective (RTO): How long can the system be down?
Recovery Point Objective (RPO): How much data loss is acceptable?
| Strategy | RPO | RTO | Cost |
|---|---|---|---|
| Manual backup | Hours/Days | Hours | Low |
| Daily snapshots | Up to 24h | 30-60 min | Med |
| Hourly snapshots | Up to 1h | 15-30 min | Med |
| Continuous repl. | Near-zero | Minutes | High |
| Active-active | Zero | Seconds | High |
Multi-region DR pattern:
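# Velero with cross-region replication
velero backup create dr-snapshot \
  --storage-location primary-us-east \
  --volume-snapshot-locations primary-us-east \
  --wait
# Replicate the backup bucket to the secondary region, keeping Velero's object layout.
# (A tarball from `velero backup download` is for inspection; Velero cannot restore from it.)
aws s3 sync s3://my-k8s-backups s3://dr-bucket-us-west
# In the secondary cluster, register the replicated bucket as a read-only location
# (the location name dr-us-west and region us-west-2 are examples)
velero backup-location create dr-us-west \
  --provider aws \
  --bucket dr-bucket-us-west \
  --config region=us-west-2 \
  --access-mode ReadOnly
# Restore in the secondary cluster once Velero has synced the backup
velero restore create --from-backup dr-snapshot
# Note: native EBS snapshots are regional and must be copied to the DR region separately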
9. Troubleshooting Kubernetes Storage
Troubleshooting Decision Tree
Storage Problem Reported
│
▼
Is the PVC in Pending or Bound state?
kubectl get pvc
│
├── Pending ──▶ [Problem 1: PVC Pending Issues]
│
├── Bound
│ │
│ ▼
│ Is the Pod Running?
│ kubectl get pods
│ │
│ ├── Pending/ContainerCreating ──▶ [Problem 2: Volume Mount Errors]
│ │
│ ├── CrashLoopBackOff
│ │ │
│ │ ▼
│ │ kubectl logs / describe ──▶ [Problem 4: Permission Issues]
│ │
│ └── Running but app fails ──▶ Check app-level storage usage
│
└── Lost/Failed ──▶ [Problem 3: StorageClass Troubleshooting]
Problem 1: PVC Pending Issues
Symptom:
kubectl get pvc
# NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
# my-pvc Pending fast-ssd 5m
Real-Time Diagnosis — Case A: No matching PV (static provisioning):
# Check PVC events
kubectl describe pvc my-pvc
# Events:
# Warning ProvisioningFailed 3m persistentvolume-controller
# storageclass.storage.k8s.io "fast-ssd" not found
# Check available PVs and their StorageClass
kubectl get pv -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.storageClassName}{"\n"}{end}'
# pv-001 slow-hdd ← StorageClass doesn't match!
# Compare with the class the PVC requests
kubectl get pvc my-pvc -o jsonpath='{.spec.storageClassName}'
# fast-ssd
Fix:
# Option A: Create a PV with matching storageClassName
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-fast-ssd-001
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd # ← Must match PVC
  hostPath:
    path: /mnt/fast-ssd-001
EOF
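# Option B: Recreate the PVC with a StorageClass that exists
# (spec.storageClassName is immutable once set, so patching it is rejected)
kubectl get pvc my-pvc -o yaml > my-pvc.yaml
# Edit my-pvc.yaml: set storageClassName to slow-hdd, remove status and server-set metadata
kubectl delete pvc my-pvc
kubectl apply -f my-pvc.yaml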
Real-Time Diagnosis — Case B: Dynamic provisioning failing:
kubectl describe pvc my-pvc
# Events:
# Warning ProvisioningFailed 2m ebs.csi.aws.com
# failed to provision volume: UnauthorizedOperation: You are not authorized
# to perform this operation
# The CSI driver doesn't have IAM permissions!
# Check CSI driver pod status
kubectl get pods -n kube-system | grep ebs-csi
# ebs-csi-controller-xxx 4/6 Running 0 10m
# ← Only 4/6 containers running — something is wrong
kubectl logs -n kube-system ebs-csi-controller-xxx -c csi-provisioner
# Error: failed to assume role: AccessDenied
Fix:
# Attach the managed policy to the IAM role assumed by the EBS CSI driver's service account
aws iam attach-role-policy \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
--role-name AmazonEKS_EBS_CSI_DriverRole
# Restart the CSI driver pods
kubectl rollout restart deployment ebs-csi-controller -n kube-system
# Re-check PVC
kubectl get pvc my-pvc
# STATUS: Bound ← Fixed!
Real-Time Diagnosis — Case C: WaitForFirstConsumer — PVC stays Pending until Pod exists:
kubectl describe pvc my-pvc
# Events:
# Normal WaitForFirstConsumer 1m persistentvolume-controller
# waiting for first consumer to be created before binding
# This is EXPECTED behaviour for WaitForFirstConsumer binding mode
# The PVC will bind once a Pod that uses it is scheduled
# This is correct — not an error!
# Solution: Create a Pod that uses the PVC
kubectl apply -f pod-with-pvc.yaml
# PVC will bind once the Pod is scheduled
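The three cases above can be told apart from the event reasons recorded against the PVC. The helper below is a rough triage sketch, not an exhaustive classifier; the function name pvc_pending_reason is hypothetical, and it relies on events still being retained by the cluster.

```shell
# Map the event reasons recorded for a PVC to Case A, B, or C described above.
pvc_pending_reason() {
  local pvc="$1" ns="${2:-default}" reasons
  reasons="$(kubectl get events -n "$ns" \
    --field-selector "involvedObject.kind=PersistentVolumeClaim,involvedObject.name=${pvc}" \
    -o jsonpath='{.items[*].reason}')" || return 2
  case " $reasons " in
    *" ProvisioningFailed "*)
      echo "Case B: provisioning failed; check the StorageClass, CSI driver pods, and cloud permissions" ;;
    *" FailedBinding "*)
      echo "Case A: no matching PV; compare storageClassName, size, and access modes" ;;
    *" WaitForFirstConsumer "*)
      echo "Case C: expected; the PVC binds once a Pod using it is scheduled" ;;
    *)
      echo "No known PVC event found; run kubectl describe pvc for details" ;;
  esac
}

# Usage:
#   pvc_pending_reason my-pvc default
```

Failures are checked before WaitForFirstConsumer because a claim can record both, and the failure is the one that needs action.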
Problem 2: Volume Mount Errors
Symptom:
kubectl get pods
# NAME READY STATUS RESTARTS AGE
# my-pod 0/1 ContainerCreating 0 3m
Real-Time Diagnosis:
# Check pod events
kubectl describe pod my-pod
# Events:
# Warning FailedMount 2m kubelet
# MountVolume.SetUp failed for volume "my-pvc" :
# rpc error: code = Internal desc = Could not attach volume
# "vol-0abc123" to node "ip-10-0-1-50":
# attachment of disk "vol-0abc123" failed,
# current node: "ip-10-0-1-50",
# attachment node: "ip-10-0-1-99"
# The EBS volume is still attached to a different node!
# This happens when a Pod moves to a new node but the old node
# didn't fully detach the volume.
Fix:
# Step 1: Find which node the volume is still attached to
aws ec2 describe-volumes --volume-ids vol-0abc123 \
--query 'Volumes[0].Attachments'
# [{"InstanceId": "i-0xyz...", "State": "attached"}]
# ← Still attached to old node!
# Step 2: Force detach from AWS (use with caution!)
aws ec2 detach-volume --volume-id vol-0abc123 --force
# Step 3: Wait and retry
sleep 30
kubectl delete pod my-pod
kubectl apply -f pod-with-pvc.yaml
# Monitor
kubectl get pod my-pod -w
# STATUS: Running ← Fixed
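Before force-detaching in the cloud, it is worth checking what Kubernetes itself believes about the attachment. The sketch below lists VolumeAttachment objects for one PV; the function name pv_attachments is hypothetical, and the PV and node names in the usage line are placeholders.

```shell
# Print "<node> <attached>" for every VolumeAttachment that references the given PV.
pv_attachments() {
  local pv="$1"
  kubectl get volumeattachment \
    -o jsonpath='{range .items[*]}{.spec.source.persistentVolumeName}{"\t"}{.spec.nodeName}{"\t"}{.status.attached}{"\n"}{end}' \
    | awk -F'\t' -v pv="$pv" '$1 == pv { print $2, $3 }'
}

# Usage (PV name from: kubectl get pvc <name> -o jsonpath='{.spec.volumeName}'):
#   pv_attachments pvc-0a1b2c3d
```

An attachment that still names the old node suggests the detach has not completed on the Kubernetes side either, which is useful context before using --force.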
Symptom 2: FailedMount — Wrong filesystem or corrupted volume:
kubectl describe pod my-pod
# Events:
# Warning FailedMount kubelet
# MountVolume.MountDevice failed:
# fsType "xfs" on "/dev/xvdbf": exit status 32
# stderr: mount: /var/lib/kubelet/plugins/.../mount:
# wrong fs type, bad option, bad superblock
# Volume was formatted as ext4 but StorageClass requests xfs
Fix:
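# Check the fsType requested by the StorageClass
kubectl get storageclass fast-ssd -o jsonpath='{.parameters.fsType}'
# xfs
# (some CSI classes use the key csi.storage.k8s.io/fstype instead)
# The volume was previously formatted as ext4.
# Options:
# A) Create a new PVC and migrate data
# B) Recreate the StorageClass with fsType: ext4 so future volumes match.
#    StorageClass parameters are immutable, so `kubectl edit` is rejected:
kubectl get storageclass fast-ssd -o yaml > fast-ssd.yaml
# Edit fast-ssd.yaml: fsType: xfs → fsType: ext4
kubectl delete storageclass fast-ssd
kubectl apply -f fast-ssd.yaml
# Note: This affects new volumes only; existing PVs keep the fsType recorded in their spec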
Problem 3: StorageClass Troubleshooting
Symptom: Dynamic provisioning not working, no PV created.
Real-Time Diagnosis:
# Check if StorageClass exists
kubectl get storageclass
# No resources found. ← StorageClass missing!
# Or: StorageClass exists but provisioner is wrong
kubectl get storageclass fast-ssd -o yaml | grep provisioner
# provisioner: kubernetes.io/aws-ebs ← Old in-tree plugin (deprecated)
# Should be: ebs.csi.aws.com
# Check if the provisioner (CSI driver) is running
kubectl get pods -n kube-system | grep csi
# (no output) ← No CSI driver Pods are running!
# Check CSI drivers registered
kubectl get csidrivers
# No resources found. ← No CSI drivers installed
Fix:
# Install the EBS CSI driver
helm repo add aws-ebs-csi-driver \
https://kubernetes-sigs.github.io/aws-ebs-csi-driver
helm install aws-ebs-csi-driver \
aws-ebs-csi-driver/aws-ebs-csi-driver \
--namespace kube-system
# Update StorageClass to use CSI provisioner
kubectl delete storageclass fast-ssd
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com # ← Updated to CSI
parameters:
  type: gp3
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF
Problem 4: Permission Issues
Symptom:
kubectl logs my-pod
# Error: EACCES: permission denied, open '/data/app.db'
# or:
# mkdir: cannot create directory '/data/uploads': Permission denied
Real-Time Diagnosis:
# Check what user the container runs as
kubectl exec my-pod -- id
# uid=1001(appuser) gid=1001(appgroup)
# Check permissions on the mounted volume
kubectl exec my-pod -- ls -la /data
# drwxr-xr-x 2 root root 6 Jan 20 10:00 .
# ← Owned by root, app runs as uid 1001 — no write access!
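The reason uid 1001 cannot write to a root-owned drwxr-xr-x directory follows from the classic owner/group/other permission check. The function below is a simplified model of that check for illustration only: it ignores supplementary groups, ACLs, and NFS root squashing, and the name can_write is hypothetical.

```shell
# Simplified Unix write-permission check.
# Args: process uid, process gid, owner uid, owner gid, octal mode (for example 755)
can_write() {
  local uid="$1" gid="$2" owner_uid="$3" owner_gid="$4" mode="$5"
  local bits=$((0$mode))            # leading 0 makes the shell read the mode as octal
  if [ "$uid" -eq 0 ]; then
    return 0                        # root bypasses the mode bits
  fi
  if [ "$uid" -eq "$owner_uid" ]; then
    [ $((bits & 0200)) -ne 0 ]      # owner write bit
    return
  fi
  if [ "$gid" -eq "$owner_gid" ]; then
    [ $((bits & 0020)) -ne 0 ]      # group write bit
    return
  fi
  [ $((bits & 0002)) -ne 0 ]        # other write bit
}

# The case above: uid/gid 1001 against root:root with mode 755
can_write 1001 1001 0 0 755 || echo "uid 1001 cannot write"
```

If the directory's group becomes 1001 with group write enabled, the same call with owner gid 1001 and a mode such as 2775 succeeds, which is the effect fsGroup aims for.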
Fix Option A: fsGroup in Pod Security Context (Recommended)
spec:
  securityContext:
    fsGroup: 1001 # All volume files owned by group 1001
    runAsUser: 1001 # Container runs as user 1001
    runAsGroup: 1001
  containers:
    - name: app
      image: mycompany/app:v1.0
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true # Force read-only root (good security)
      volumeMounts:
        - name: data
          mountPath: /data
Fix Option B: Init Container to chmod/chown:
spec:
  initContainers:
    - name: volume-permissions
      image: busybox
      command: ["sh", "-c", "chown -R 1001:1001 /data && chmod 755 /data"]
      volumeMounts:
        - name: data
          mountPath: /data
      securityContext:
        runAsUser: 0 # Run init as root to change permissions
  containers:
    - name: app
      securityContext:
        runAsUser: 1001
      volumeMounts:
        - name: data
          mountPath: /data
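Fix Option C: StorageClass mount options (SMB shares such as Azure Files):
# dir_mode, file_mode, uid, and gid are CIFS (SMB) mount options.
# They work for Azure Files; an NFS mount rejects them.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: azure-files-app # Example name
provisioner: file.csi.azure.com
parameters:
  skuName: Premium_LRS
mountOptions:
  - dir_mode=0777 # World-writable directory
  - file_mode=0666 # World-writable files
  - uid=1001
  - gid=1001
# For NFS (nfs.csi.k8s.io), ownership comes from the server-side export.
# csi-driver-nfs documents a mountPermissions StorageClass parameter
# (for example "0777") that chmods the mounted directory instead.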
Quick Troubleshooting Cheat Sheet
# === PVC INSPECTION ===
kubectl get pvc -A # All PVCs all namespaces
kubectl get pvc -n <ns> # PVCs in namespace
kubectl describe pvc <name> -n <ns> # Full PVC details + events
kubectl get pv # All Persistent Volumes
kubectl describe pv <name> # Full PV details
# === STORAGECLASS ===
kubectl get storageclass # All storage classes
kubectl describe storageclass <name> # StorageClass details
kubectl get csidrivers # Installed CSI drivers
# === POD STORAGE ===
kubectl describe pod <name> # Volume mount events
kubectl exec <pod> -- df -h # Disk usage inside pod
kubectl exec <pod> -- ls -la /mountpath # File permissions
kubectl exec <pod> -- mount | grep <path> # Verify volume is mounted
# === VOLUME SNAPSHOTS ===
kubectl get volumesnapshot -A # All snapshots
kubectl get volumesnapshotcontent # Underlying snapshot content
kubectl describe volumesnapshot <name> # Snapshot details
# === EVENTS (most useful for storage debugging) ===
kubectl get events -n <ns> --sort-by='.lastTimestamp' | grep -i volume
kubectl get events -n <ns> --sort-by='.lastTimestamp' | grep -i pvc
kubectl get events -n <ns> --sort-by='.lastTimestamp' | grep -i mount
# === CAPACITY ===
# (kept on one line: a backslash inside single quotes is literal, not a line continuation)
kubectl get pvc -A -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,STATUS:.status.phase,CAPACITY:.status.capacity.storage,STORAGECLASS:.spec.storageClassName'
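For a quick capacity total, the per-PVC quantities can be summed in the shell. The sketch below handles only whole-number binary quantities (Mi, Gi, Ti), which is an assumption; decimal suffixes such as G and fractional values are not covered. The function names are hypothetical.

```shell
# Convert a Kubernetes binary quantity (Mi, Gi, Ti) to MiB.
quantity_to_mib() {
  case "$1" in
    *Ti) echo $(( ${1%Ti} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} )) ;;
    *)   echo "unsupported quantity: $1" >&2; return 1 ;;
  esac
}

# Sum quantities read from stdin (one per line, blank lines skipped) and print MiB.
total_pvc_mib() {
  local total=0 q mib
  while read -r q; do
    [ -n "$q" ] || continue
    mib="$(quantity_to_mib "$q")" || return 1
    total=$((total + mib))
  done
  echo "$total"
}

# Usage:
#   kubectl get pvc -A -o jsonpath='{range .items[*]}{.status.capacity.storage}{"\n"}{end}' | total_pvc_mib
```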
10. Hands-On Labs
Lab 1: Create EmptyDir Volume
Objective: Create a Pod with two containers sharing an EmptyDir volume.
# Step 1: Create the Pod
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: lab1-emptydir
spec:
  containers:
    - name: producer
      image: busybox
      command: ["sh", "-c", "while true; do date >> /shared/timestamps.txt; sleep 3; done"]
      volumeMounts:
        - name: shared
          mountPath: /shared
    - name: consumer
      image: busybox
      command: ["sh", "-c", "while true; do echo '--- File contents ---'; cat /shared/timestamps.txt 2>/dev/null; sleep 5; done"]
      volumeMounts:
        - name: shared
          mountPath: /shared
  volumes:
    - name: shared
      emptyDir: {}
EOF
# Step 2: Verify both containers are running
kubectl get pod lab1-emptydir
kubectl describe pod lab1-emptydir | grep -A2 "Containers:"
# Step 3: Watch the consumer output
kubectl logs lab1-emptydir -c consumer -f
# Step 4: Verify shared data from producer side
kubectl exec lab1-emptydir -c producer -- cat /shared/timestamps.txt
# Step 5: Simulate container restart — data survives
kubectl exec lab1-emptydir -c producer -- kill 1
# (producer restarts)
kubectl exec lab1-emptydir -c producer -- cat /shared/timestamps.txt
# Previous timestamps still there!
# Step 6: Delete pod — data is lost
kubectl delete pod lab1-emptydir
kubectl run verify-gone --image=busybox --rm -it -- sh
# No /shared directory exists here — ephemeral as expected
echo "✅ Lab 1 Complete"
Lab 2: Create Persistent Volume
Objective: Manually create a static PV backed by a hostPath.
# Step 1: Create the directory on the node (Minikube)
minikube ssh -- sudo mkdir -p /mnt/lab2-data
minikube ssh -- sudo chmod 777 /mnt/lab2-data
# Step 2: Create the PV
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: lab2-pv
  labels:
    lab: lab2
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: lab-storage
  hostPath:
    path: /mnt/lab2-data
    type: DirectoryOrCreate
EOF
# Step 3: Verify PV is Available
kubectl get pv lab2-pv
# STATUS: Available ← Ready to be claimed
kubectl describe pv lab2-pv
echo "✅ Lab 2 Complete"
Lab 3: Create Persistent Volume Claim
Objective: Create a PVC that binds to the PV from Lab 2.
# Step 1: Create PVC
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lab3-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
  storageClassName: lab-storage
  selector:
    matchLabels:
      lab: lab2
EOF
# Step 2: Check binding
kubectl get pvc lab3-pvc
# STATUS: Bound ← PVC is now bound to lab2-pv
kubectl get pv lab2-pv
# STATUS: Bound ← PV is now claimed
# Step 3: Inspect the binding details
kubectl describe pvc lab3-pvc
# Volume: lab2-pv ← Shows which PV it's bound to
echo "✅ Lab 3 Complete"
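The 500Mi claim binds to the 1Gi volume because a PV only has to be at least as large as the request, while the StorageClass name and access mode must match. The function below is a simplified sketch of that rule, assuming sizes are already expressed in MiB; it ignores label selectors, volumeMode, and node affinity, and the name pv_satisfies_claim is hypothetical.

```shell
# Simplified static-binding check.
# Args: pv_class pv_mib pv_modes(comma-separated) pvc_class pvc_mib pvc_mode
pv_satisfies_claim() {
  local pv_class="$1" pv_mib="$2" pv_modes="$3"
  local pvc_class="$4" pvc_mib="$5" pvc_mode="$6"
  [ "$pv_class" = "$pvc_class" ] || return 1    # class names must be identical
  [ "$pv_mib" -ge "$pvc_mib" ] || return 1      # PV may be larger, never smaller
  case ",$pv_modes," in
    *",$pvc_mode,"*) return 0 ;;                # requested mode must be offered
    *) return 1 ;;
  esac
}

# Lab 3: a 1Gi (1024Mi) RWO PV against a 500Mi RWO claim in the same class
pv_satisfies_claim lab-storage 1024 ReadWriteOnce lab-storage 500 ReadWriteOnce \
  && echo "lab2-pv can satisfy lab3-pvc"
```

Once bound, the claim reports the PV's full capacity (1Gi), not the 500Mi that was requested.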
Lab 4: Attach PVC to Pod
Objective: Mount the PVC from Lab 3 into a Pod and verify data persistence.
# Step 1: Create a Pod using the PVC
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: lab4-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: lab3-pvc
EOF
# Step 2: Wait for pod to be running
kubectl wait --for=condition=ready pod/lab4-pod --timeout=60s
# Step 3: Write data to the persistent volume
kubectl exec lab4-pod -- sh -c "echo 'Hello Persistent World!' > /data/test.txt"
kubectl exec lab4-pod -- cat /data/test.txt
# Hello Persistent World!
# Step 4: Delete and recreate the Pod — data must survive!
kubectl delete pod lab4-pod
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: lab4-pod-v2
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: lab3-pvc # Same PVC!
EOF
kubectl wait --for=condition=ready pod/lab4-pod-v2 --timeout=60s
# Step 5: Verify data persisted across pod deletion
kubectl exec lab4-pod-v2 -- cat /data/test.txt
# Hello Persistent World! ← Data survived Pod deletion! ✅
echo "✅ Lab 4 Complete"
Lab 5: Dynamic Provisioning Example
Objective: Use a StorageClass (Minikube’s default) to dynamically provision a PVC.
# Step 1: Check the default StorageClass in Minikube
kubectl get storageclass
# NAME PROVISIONER RECLAIMPOLICY
# standard (default) k8s.io/minikube-hostpath Delete
# Step 2: Create a PVC without specifying a PV — dynamic provisioning!
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: lab5-dynamic-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Mi
  # storageClassName omitted → uses default (standard)
EOF
# Step 3: Check — PVC and PV should both be created automatically
kubectl get pvc lab5-dynamic-pvc
# STATUS: Bound ← Immediately bound!
kubectl get pv
# A new PV was automatically created by the provisioner!
# NAME CAPACITY STATUS
# pvc-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx 200Mi Bound
# Step 4: Use the dynamically provisioned volume in a Pod
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: lab5-dynamic-pod
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: dynamic-storage
          mountPath: /usr/share/nginx/html
  volumes:
    - name: dynamic-storage
      persistentVolumeClaim:
        claimName: lab5-dynamic-pvc
EOF
kubectl wait --for=condition=ready pod/lab5-dynamic-pod --timeout=60s
kubectl exec lab5-dynamic-pod -- df -h /usr/share/nginx/html
# Filesystem Size Used Avail Mounted on
# (Minikube's hostPath provisioner does not enforce the 200Mi request,
# so df reports the size of the node's filesystem here)
echo "✅ Lab 5 Complete"
Lab 6: StatefulSet with Persistent Storage
Objective: Deploy a StatefulSet and verify each Pod gets its own unique PVC.
# Step 1: Create the StatefulSet with volumeClaimTemplates
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: lab6-headless
spec:
  clusterIP: None
  selector:
    app: lab6-stateful
  ports:
    - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: lab6-stateful
spec:
  serviceName: lab6-headless
  replicas: 3
  selector:
    matchLabels:
      app: lab6-stateful
  template:
    metadata:
      labels:
        app: lab6-stateful
    spec:
      containers:
        - name: app
          image: busybox
          command: ["sh", "-c", "echo Pod \$(hostname) started > /data/pod-identity.txt && sleep 3600"]
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Mi
EOF
# Step 2: Watch pods come up in order
kubectl get pods -l app=lab6-stateful -w
# lab6-stateful-0 Running ← First
# lab6-stateful-1 Running ← Second
# lab6-stateful-2 Running ← Third (ordered!)
# Step 3: Verify 3 separate PVCs were created (one per pod)
kubectl get pvc | grep lab6
# data-lab6-stateful-0 Bound 100Mi
# data-lab6-stateful-1 Bound 100Mi
# data-lab6-stateful-2 Bound 100Mi
# Step 4: Verify each pod wrote to ITS OWN volume
kubectl exec lab6-stateful-0 -- cat /data/pod-identity.txt
# Pod lab6-stateful-0 started
kubectl exec lab6-stateful-1 -- cat /data/pod-identity.txt
# Pod lab6-stateful-1 started
kubectl exec lab6-stateful-2 -- cat /data/pod-identity.txt
# Pod lab6-stateful-2 started
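# Step 5: Delete a pod and verify it reattaches to its own PVC
# (pod-identity.txt is rewritten on every start, so write a separate marker first)
kubectl exec lab6-stateful-1 -- sh -c "echo marker > /data/marker.txt"
kubectl delete pod lab6-stateful-1
kubectl get pods -l app=lab6-stateful -w
# lab6-stateful-1 is recreated automatically
kubectl exec lab6-stateful-1 -- cat /data/marker.txt
# marker ← Written before the delete, so the same PVC was reattached!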
# Step 6: Clean up
kubectl delete statefulset lab6-stateful
kubectl delete svc lab6-headless
# Note: PVCs are NOT auto-deleted when StatefulSet is deleted (by design)
kubectl delete pvc data-lab6-stateful-0 data-lab6-stateful-1 data-lab6-stateful-2
echo "✅ Lab 6 Complete — All Labs Done!"
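The PVC names seen in this lab follow the pattern <template name>-<StatefulSet name>-<ordinal>. The small helper below generates them, which can be handy for cleanup scripts; the function name statefulset_pvc_names is hypothetical.

```shell
# Print the PVC names a StatefulSet creates from one volumeClaimTemplate.
# Args: template name, StatefulSet name, replica count
statefulset_pvc_names() {
  local template="$1" sts="$2" replicas="$3" i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "${template}-${sts}-${i}"
    i=$((i + 1))
  done
}

# Usage (same cleanup as Step 6):
#   statefulset_pvc_names data lab6-stateful 3 | xargs kubectl delete pvc
```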
Summary
| Concept | Key Takeaway |
|---|---|
| Ephemeral Storage | Container writable layer — lost on container restart |
| EmptyDir | Shared scratch space within a Pod — lost on Pod deletion |
| HostPath | Node filesystem mount — persists beyond Pod, tied to a node |
| ConfigMap Volume | Injects config files; auto-updates in ~1 min without restart |
| Secret Volume | Injects secrets as files; stored in memory (tmpfs) |
| PersistentVolume (PV) | Cluster-wide storage resource — the actual storage |
| PersistentVolumeClaim (PVC) | Namespace-scoped storage request — binds to a PV |
| Access Modes | RWO (one node), ROX (many read), RWX (many write) |
| Reclaim Policies | Retain (keep data), Delete (remove storage), Recycle (deprecated) |
| StorageClass | Storage tier definition; enables dynamic provisioning |
| Dynamic Provisioning | Auto-creates PV when PVC is created — no manual admin work |
| StatefulSet | Ordered Pods with stable identity and per-Pod PVCs |
| VolumeClaimTemplates | Auto-creates PVC per StatefulSet Pod using a template |
| CSI | Standard interface for storage drivers — vendor-agnostic |
| VolumeSnapshots | Point-in-time copies of PVCs for backup and restore |