Kubernetes Interview Questions & Answers (2026) part 05

70+ Kubernetes interview questions and answers from basic to advanced — covering Pods, Deployments, Services, Networking, RBAC, Helm, Autoscaling, Security, and real-world troubleshooting scenarios.

May 18, 2025 40 min read 70 Questions DB
Kubernetes & EKS Interview Questions & Answers for AWS Cloud and AWS DevOps Engineer

A comprehensive guide covering Basic, Intermediate, and Advanced topics for Kubernetes and Amazon EKS interviews.


Answer:

Kubernetes (K8s) is an open-source container orchestration platform originally developed by Google and now maintained by the CNCF (Cloud Native Computing Foundation). It automates the deployment, scaling, and management of containerized applications.

Key capabilities:

  • Automated rollouts and rollbacks — deploy changes and roll them back if something goes wrong
  • Service discovery and load balancing — expose containers using DNS names or IP addresses
  • Storage orchestration — automatically mount storage systems (local, cloud, NFS, etc.)
  • Self-healing — restarts failed containers, replaces containers, kills containers that don’t respond to health checks
  • Secret and config management — manage sensitive information without rebuilding images
  • Horizontal scaling — scale applications up or down with a command or automatically
# Check Kubernetes version (the --short flag has been removed from recent kubectl releases)
kubectl version

# Get cluster info
kubectl cluster-info

Answer:

Kubernetes follows a control plane / worker node architecture (the control plane was historically called the "master"):

Control Plane (Master) Components:

| Component | Role |
|---|---|
| API Server (kube-apiserver) | Frontend for Kubernetes control plane; all communication goes through it |
| etcd | Consistent and highly-available key-value store for all cluster data |
| Scheduler (kube-scheduler) | Watches for newly created Pods and assigns them to nodes |
| Controller Manager (kube-controller-manager) | Runs controller processes (Node, Replication, Endpoint controllers, etc.) |
| Cloud Controller Manager | Links cluster to cloud provider APIs |

Worker Node Components:

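| Component | Role |
|---|---|
| kubelet | Agent on each node that ensures the containers described in Pod specs are running and healthy |
| kube-proxy | Maintains network rules on nodes for Pod communication |
| Container Runtime | Software that runs containers through the CRI (containerd, CRI-O; Docker Engine needs the cri-dockerd adapter since Kubernetes 1.24) |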
┌──────────────────────────────────────────────┐
│              Control Plane                    │
│  ┌──────────┐ ┌──────────┐ ┌──────────────┐  │
│  │API Server│ │Scheduler │ │  Controller  │  │
│  └──────────┘ └──────────┘ │  Manager     │  │
│        │                   └──────────────┘  │
│  ┌─────▼──────────────────────────────────┐  │
│  │                  etcd                  │  │
│  └────────────────────────────────────────┘  │
└──────────────────────────────────────────────┘
           │
┌──────────▼───────────┐
│     Worker Node      │
│  ┌────────────────┐  │
│  │    kubelet     │  │
│  ├────────────────┤  │
│  │   kube-proxy   │  │
│  ├────────────────┤  │
│  │Container Runtime│ │
│  └────────────────┘  │
└──────────────────────┘

Answer:

A Pod is the smallest and most basic deployable unit in Kubernetes. It represents a single instance of a running process and can contain one or more tightly coupled containers that share the same network namespace, IP address, and storage.

Key characteristics:

  • Each Pod gets a unique IP address within the cluster
  • Containers inside a Pod share localhost networking
  • Pods are ephemeral — they are not self-healing by themselves
  • They are typically managed by higher-level controllers (Deployments, StatefulSets)
# Example Pod manifest
apiVersion: v1
kind: Pod
metadata:
  name: my-app-pod
  labels:
    app: my-app
spec:
  containers:
  - name: app-container
    image: nginx:1.21
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"
      limits:
        cpu: "200m"
        memory: "256Mi"
# Get all pods in all namespaces
kubectl get pods -A

# Describe a pod
kubectl describe pod my-app-pod

# Get pod logs
kubectl logs my-app-pod

Answer:

A Node is a physical or virtual machine that runs workloads (Pods) in a Kubernetes cluster. Every node is managed by the control plane and contains the services needed to run Pods.

Node components:

  • kubelet — communicates with the API server and manages Pod lifecycle
  • kube-proxy — handles networking rules for Service routing
  • Container runtime — runs the containers (e.g., containerd)

Node types:

  • Control plane node (called "master node" in older documentation): runs the control plane components
  • Worker node: runs application workloads
# List all nodes
kubectl get nodes

# Get node details
kubectl describe node <node-name>

# Check node resource usage
kubectl top nodes

Answer:

Namespaces provide a mechanism to isolate groups of resources within a single cluster. They are ideal for multi-team or multi-project environments where resource quotas and access control need to be separated.

Default namespaces:

  • default — for objects with no other namespace
  • kube-system — for system objects created by Kubernetes
  • kube-public — readable by all users; used for public cluster info
  • kube-node-lease — holds Lease objects for node heartbeats
# List namespaces
kubectl get namespaces

# Create a namespace
kubectl create namespace my-team

# Get pods in a specific namespace
kubectl get pods -n my-team

# Set default namespace for kubectl
kubectl config set-context --current --namespace=my-team
# ResourceQuota for a namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: my-team
spec:
  hard:
    pods: "20"
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi

Answer:

A Deployment is a higher-level abstraction that manages a ReplicaSet and provides declarative updates for Pods. It ensures the desired number of Pod replicas are running and handles rollouts and rollbacks.

Features:

  • Declarative updates — you describe the desired state
  • Rolling updates — updates Pods gradually to avoid downtime
  • Rollback — easily revert to previous versions
  • Pause and resume deployments
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:v2
        ports:
        - containerPort: 8080
# Create deployment
kubectl apply -f deployment.yaml

# Check rollout status
kubectl rollout status deployment/my-app

# Rollback to previous version
kubectl rollout undo deployment/my-app

# View rollout history
kubectl rollout history deployment/my-app

Answer:

A ReplicaSet ensures that a specified number of Pod replicas are running at any given time. It replaces Pods that fail, are deleted, or are terminated. Deployments manage ReplicaSets, so in practice you rarely create a ReplicaSet directly.

How it works:

  1. ReplicaSet defines a label selector to identify which Pods it manages
  2. It continuously monitors Pod counts against the desired count
  3. It creates or deletes Pods to match the desired state
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: my-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: nginx

Answer:

A Service is an abstraction that defines a logical set of Pods and a policy to access them. Since Pods are ephemeral and their IPs change, a Service provides a stable IP address and DNS name to access them.

How it works:

  • Services use label selectors to find matching Pods
  • kube-proxy maintains network rules to route traffic
  • Each Service gets a ClusterIP and a DNS entry (e.g., my-service.default.svc.cluster.local)
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080
  type: ClusterIP
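
The label-selector matching described above can be sketched in a few lines of Python. This is only an illustration of the equality-based rule, not how kube-proxy or the endpoints controller is implemented; the Pod names and labels below are hypothetical.

```python
def selector_matches(selector: dict, pod_labels: dict) -> bool:
    """True when every key/value pair in a non-empty selector is present on the Pod."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


# Hypothetical Pods and their labels
pods = {
    "my-app-7d4b9-abcde": {"app": "my-app", "tier": "web"},
    "my-app-7d4b9-fghij": {"app": "my-app", "tier": "web"},
    "other-5c6d7-klmno": {"app": "other"},
}

# Same selector as the my-service manifest above
service_selector = {"app": "my-app"}

# Pods that would back the Service
endpoints = sorted(
    name for name, labels in pods.items()
    if selector_matches(service_selector, labels)
)
print(endpoints)
```

Extra labels on a Pod (such as tier here) do not prevent a match; only the keys named in the selector are compared.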

Answer:

| Type | Description | Use Case |
|---|---|---|
| ClusterIP | Default; exposes Service on internal cluster IP | Internal microservice communication |
| NodePort | Exposes Service on each Node’s IP at a static port (30000-32767) | External access in development |
| LoadBalancer | Exposes Service externally using a cloud load balancer | Production external access |
| ExternalName | Maps Service to a DNS name (e.g., external DB) | Connecting to external services |
# LoadBalancer Service Example (EKS)
apiVersion: v1
kind: Service
metadata:
  name: my-lb-service
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
  type: LoadBalancer

Answer:

A ConfigMap stores non-confidential configuration data as key-value pairs, decoupling configuration from container images. This allows you to change application behavior without rebuilding images.

apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  APP_ENV: "production"
  APP_PORT: "8080"
  config.json: |
    {
      "logLevel": "info",
      "retries": 3
    }

Using ConfigMap in a Pod:

spec:
  containers:
  - name: app
    image: my-app
    envFrom:
    - configMapRef:
        name: app-config
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: app-config

Answer:

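A Secret stores sensitive data such as passwords, tokens, and SSH keys. Values are stored base64-encoded, which is an encoding and not encryption: by default they sit unencrypted in etcd, so enable encryption at rest (for example, with a KMS provider) and restrict read access with RBAC.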

# Create a secret from literal values
kubectl create secret generic db-secret \
  --from-literal=username=admin \
  --from-literal=password=s3cr3t
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  username: YWRtaW4=      # base64("admin")
  password: czNjcjN0      # base64("s3cr3t")
# Using secrets as environment variables
spec:
  containers:
  - name: app
    image: my-app
    env:
    - name: DB_USERNAME
      valueFrom:
        secretKeyRef:
          name: db-secret
          key: username
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-secret
          key: password
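
To see that base64 is reversible encoding rather than protection, the values in the data: block above can be reproduced with Python's standard library. The two helper function names are made up for this sketch.

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Encode a value the way it appears under data: in a Secret manifest."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Anyone who can read the Secret can reverse the encoding."""
    return base64.b64decode(encoded).decode("utf-8")


print(encode_secret_value("admin"))     # YWRtaW4=
print(encode_secret_value("s3cr3t"))    # czNjcjN0
print(decode_secret_value("czNjcjN0"))  # s3cr3t
```

If you prefer to write plain text in a manifest, the stringData: field accepts unencoded values and the API server encodes them for you.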

Answer:

kubectl is the command-line tool for interacting with Kubernetes clusters. It communicates with the Kubernetes API server to create, update, delete, and inspect resources.

Common commands:

# Cluster info
kubectl cluster-info
kubectl get nodes

# Pod management
kubectl get pods -n <namespace>
kubectl describe pod <pod-name>
kubectl logs <pod-name> -f
kubectl exec -it <pod-name> -- /bin/bash

# Apply/delete manifests
kubectl apply -f manifest.yaml
kubectl delete -f manifest.yaml

# Scale deployment
kubectl scale deployment my-app --replicas=5

# Port forwarding
kubectl port-forward pod/my-pod 8080:80

# Resource usage
kubectl top pods
kubectl top nodes

Answer:

A DaemonSet ensures that a copy of a Pod runs on all (or some selected) nodes. When a new node joins the cluster, the DaemonSet controller automatically adds a Pod to it.

Common use cases:

  • Log collectors (Fluentd, Filebeat)
  • Monitoring agents (Prometheus Node Exporter, Datadog)
  • Network plugins (Calico, Cilium)
  • Storage daemons
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      name: node-exporter
  template:
    metadata:
      labels:
        name: node-exporter
    spec:
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists      # tolerate the taint regardless of its value
        effect: NoSchedule
      containers:
      - name: node-exporter
        image: prom/node-exporter:latest
        ports:
        - containerPort: 9100

Answer:

A StatefulSet manages stateful applications that require stable, unique network identifiers, persistent storage, and ordered deployment/scaling/deletion. Unlike Deployments, StatefulSets give each Pod a unique, stable hostname.

When to use:

  • Databases (MySQL, PostgreSQL, Cassandra)
  • Message brokers (Kafka, RabbitMQ)
  • Any app needing stable network identity
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:14
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

Pods are named with ordinal suffixes: postgres-0, postgres-1, postgres-2.

Answer:

Job: Creates one or more Pods and ensures they complete successfully. It is used for batch processing or one-time tasks.

apiVersion: batch/v1
kind: Job
metadata:
  name: data-migration
spec:
  completions: 1
  parallelism: 1
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: migrate
        image: my-migration-tool
        command: ["./migrate.sh"]

CronJob: Creates Jobs on a recurring schedule using cron syntax.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-backup
spec:
  schedule: "0 2 * * *"   # Every day at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: backup
            image: my-backup-tool
            command: ["./backup.sh"]

Answer:

Amazon Elastic Kubernetes Service (EKS) is a fully managed Kubernetes service from AWS that simplifies running Kubernetes by handling the control plane infrastructure, upgrades, patches, and high availability.

Key benefits:

  • AWS manages the Kubernetes control plane across multiple Availability Zones
  • Certified Kubernetes conformant — compatible with standard K8s tooling
  • Integrates natively with AWS services (IAM, VPC, ALB, ECR, CloudWatch)
  • Supports EC2 nodes, Fargate, and EKS Anywhere
  • Managed, in-place Kubernetes version upgrades of the control plane, which you initiate per cluster
# Create EKS cluster using eksctl
eksctl create cluster \
  --name my-cluster \
  --region us-east-1 \
  --nodegroup-name standard-nodes \
  --node-type t3.medium \
  --nodes 3 \
  --nodes-min 1 \
  --nodes-max 5 \
  --managed

# Update kubeconfig
aws eks update-kubeconfig --name my-cluster --region us-east-1

Answer:

| Feature | EKS | Self-Managed Kubernetes |
|---|---|---|
| Control Plane | Fully managed by AWS | You manage it |
| Upgrades | Simplified with one-click | Manual and complex |
| HA | Multi-AZ by default | Must configure manually |
| etcd backup | Managed by AWS | Your responsibility |
| Cost | $0.10/hour per cluster + node costs | Only node costs |
| AWS Integration | Native (IAM, VPC, ALB) | Manual configuration |
| Flexibility | Less control over control plane | Full control |

Answer:

Node Groups in EKS are collections of EC2 instances (worker nodes) that share the same configuration. There are two types:

Managed Node Groups:

  • AWS provisions, registers, and terminates nodes automatically
  • Supports EC2 Auto Scaling Groups
  • Handles AMI updates and node lifecycle
  • Easier to maintain; recommended for most cases

Self-Managed Node Groups:

  • You manage the EC2 instances manually
  • More control over AMIs and configurations
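  • Useful when you need customization beyond what managed node groups offer (for example, a custom kernel or bootstrap process); GPU instance types themselves also work with managed node groups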
# Create managed node group
eksctl create nodegroup \
  --cluster my-cluster \
  --name gpu-nodes \
  --node-type p3.2xlarge \
  --nodes 2 \
  --managed

# List node groups
eksctl get nodegroup --cluster my-cluster

Answer:

AWS Fargate is a serverless compute engine for containers. With EKS + Fargate, you can run Pods without provisioning or managing EC2 instances — AWS manages the underlying infrastructure automatically.

Key points:

  • No node management; pay per Pod CPU/memory
  • Uses Fargate Profiles to define which Pods run on Fargate
  • Each Pod gets its own isolated micro-VM (enhanced security)
  • DaemonSets are NOT supported on Fargate
# Fargate Profile via eksctl
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: us-east-1
fargateProfiles:
- name: default
  selectors:
  - namespace: default
  - namespace: kube-system
# Create fargate profile
eksctl create fargateprofile \
  --cluster my-cluster \
  --name my-profile \
  --namespace my-namespace

Answer:

EKS uses AWS IAM for authentication and Kubernetes RBAC for authorization.

Authentication flow:

  1. kubectl calls the AWS CLI/SDK to get a pre-signed token via STS
  2. The token is passed to the Kubernetes API server
  3. The EKS cluster verifies the token with AWS IAM
  4. Kubernetes RBAC is checked for authorization
# Configure kubectl for EKS
aws eks update-kubeconfig \
  --name my-cluster \
  --region us-east-1 \
  --role-arn arn:aws:iam::123456789012:role/eks-admin-role

# Verify access
kubectl auth can-i get pods

# Check kubeconfig
kubectl config view

Answer:

| Feature | Deployment | StatefulSet |
|---|---|---|
| Pod identity | Interchangeable (random names) | Stable, unique (pod-0, pod-1) |
| Storage | Shared or ephemeral | Dedicated PVC per Pod |
| Scaling | Any order | Ordered (0, 1, 2…) |
| DNS | Single service endpoint | Individual headless DNS per Pod |
| Use case | Stateless apps | Stateful apps (DBs, queues) |
| Rolling update | Parallel or rolling | Sequential (n-1 to 0) |
# StatefulSet pod DNS format:
# <pod-name>.<service-name>.<namespace>.svc.cluster.local
# Example: postgres-0.postgres.default.svc.cluster.local

Answer:

PersistentVolume (PV): A piece of storage in the cluster provisioned by an admin or dynamically via StorageClass. It exists independently of Pods.

PersistentVolumeClaim (PVC): A request for storage by a user. It binds to a PV that matches its requirements (size, access mode, StorageClass).

Access Modes:

  • ReadWriteOnce (RWO) — can be mounted by a single node
  • ReadOnlyMany (ROX) — can be mounted by many nodes in read-only mode
  • ReadWriteMany (RWX) — can be mounted by many nodes in read/write mode
# PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: gp2
# Using PVC in a Pod
spec:
  containers:
  - name: app
    image: my-app
    volumeMounts:
    - mountPath: /data
      name: my-storage
  volumes:
  - name: my-storage
    persistentVolumeClaim:
      claimName: my-pvc

Answer:

A StorageClass defines the “class” of storage (e.g., SSD, HDD, network storage) and enables dynamic provisioning of PersistentVolumes when a PVC is created.

# EKS GP3 StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

Reclaim policies:

  • Delete — automatically delete PV when PVC is deleted
  • Retain — keep PV after PVC deletion for manual reclamation

Answer:

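A rolling update gradually replaces old Pod instances with new ones so the application keeps serving traffic during the rollout. Zero downtime is not automatic: it also depends on readiness probes and on how maxSurge and maxUnavailable are set.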

Parameters:

  • maxSurge — max Pods above desired count during update
  • maxUnavailable — max Pods that can be unavailable during update
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # Allow 1 extra pod
      maxUnavailable: 0    # Never drop below the desired number of available pods

Process:

  1. Create a new ReplicaSet with the updated Pod template
  2. Scale up new ReplicaSet by maxSurge Pods
  3. Scale down old ReplicaSet by maxUnavailable Pods
  4. Repeat until all Pods are updated
# Update image (triggers rolling update)
kubectl set image deployment/my-app my-app=my-app:v2

# Monitor rollout
kubectl rollout status deployment/my-app

# Pause and resume
kubectl rollout pause deployment/my-app
kubectl rollout resume deployment/my-app
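
The effect of maxSurge and maxUnavailable is easiest to see as arithmetic. The sketch below computes the Pod-count bounds a rollout stays within; the function names are made up for illustration, and it relies on the documented rounding rule that a percentage maxSurge rounds up while a percentage maxUnavailable rounds down.

```python
def resolve(value, replicas: int, round_up: bool) -> int:
    """Turn an int or a percentage string such as "25%" into an absolute Pod count."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        if round_up:
            return -(-replicas * percent // 100)  # integer ceiling
        return replicas * percent // 100          # integer floor
    return int(value)


def rollout_bounds(replicas: int, max_surge, max_unavailable) -> dict:
    """Upper bound on total Pods and lower bound on available Pods during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


# The strategy shown above: 3 replicas, maxSurge=1, maxUnavailable=0
print(rollout_bounds(3, 1, 0))
# The Deployment defaults (25% / 25%) applied to 10 replicas
print(rollout_bounds(10, "25%", "25%"))
```

With maxUnavailable set to 0 the rollout can only make progress by surging, so the cluster needs spare capacity for the extra Pod. Kubernetes rejects a strategy where both values are 0.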

Answer:

Liveness Probe: Determines if a container is running. If it fails, the kubelet restarts the container.

Readiness Probe: Determines if a container is ready to accept traffic. If it fails, the Pod is removed from Service endpoints (no traffic sent to it).

Startup Probe: Checks if an application has started. Useful for slow-starting containers; disables liveness/readiness until startup succeeds.

spec:
  containers:
  - name: app
    image: my-app
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
    startupProbe:
      httpGet:
        path: /healthz
        port: 8080
      failureThreshold: 30
      periodSeconds: 10

Probe types: httpGet, tcpSocket, exec (command), grpc

Answer:

HPA automatically scales the number of Pods in a Deployment or StatefulSet based on observed CPU/memory utilization or custom metrics.

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
# Create HPA
kubectl autoscale deployment my-app --cpu-percent=70 --min=2 --max=10

# Check HPA status
kubectl get hpa

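Note: For CPU and memory metrics, HPA requires the metrics-server to be installed in the cluster, and Utilization targets are calculated as a percentage of each container's resource requests, so the target Pods must define requests.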
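
The HPA scaling decision follows the documented formula desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue). Below is a minimal sketch of that calculation for a single metric, including the default 10% tolerance band and clamping to minReplicas/maxReplicas; the function name and sample numbers are illustrative only, and stabilization windows and scaling policies are ignored.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float,
                     min_replicas: int, max_replicas: int, tolerance: float = 0.1) -> int:
    """Single-metric HPA calculation (simplified; target_value must be > 0)."""
    ratio = current_value / target_value
    # Within the tolerance band the HPA leaves the replica count unchanged
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Never go outside the configured bounds
    return max(min_replicas, min(max_replicas, desired))


# Target 70% CPU with minReplicas=2 and maxReplicas=10, as in the manifest above
print(desired_replicas(4, 90, 70, 2, 10))   # scale up
print(desired_replicas(4, 72, 70, 2, 10))   # inside tolerance, no change
print(desired_replicas(5, 20, 70, 2, 10))   # scale down, clamped to minReplicas
print(desired_replicas(8, 140, 70, 2, 10))  # scale up, clamped to maxReplicas
```

When several metrics are configured, as with CPU and memory above, the HPA evaluates each metric separately and uses the largest resulting replica count.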

Answer:

The kube-scheduler is a control plane component that watches for newly created Pods with no assigned node and selects a node for them to run on.

Scheduling process (two phases):

  1. Filtering (Predicates): Finds nodes that are feasible for the Pod (resource availability, taints/tolerations, node selectors, etc.)
  2. Scoring (Priorities): Ranks feasible nodes based on scoring functions (resource balance, affinity rules, etc.)

Factors considered:

  • Resource requests (placement is based on requests rather than limits)
  • Node labels and selectors
  • Taints and tolerations
  • Affinity and anti-affinity rules
  • Pod topology spread constraints
  • Node pressure (DiskPressure, MemoryPressure)
# Force Pod to a specific node
spec:
  nodeSelector:
    kubernetes.io/hostname: ip-192-168-1-100

# Or set nodeName directly (this bypasses the scheduler entirely)
spec:
  nodeName: ip-192-168-1-100
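
The two scheduling phases can be illustrated with a toy model. This is not the real kube-scheduler (which runs many filter and score plugins); the Node and Pod classes, the node names, and the "most free CPU wins" scoring rule are assumptions made purely for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_m: int    # CPU not yet requested on the node, in millicores
    free_mem_mi: int   # memory not yet requested on the node, in MiB
    labels: dict = field(default_factory=dict)


@dataclass
class Pod:
    cpu_m: int         # CPU request in millicores
    mem_mi: int        # memory request in MiB
    node_selector: dict = field(default_factory=dict)


def feasible(node: Node, pod: Pod) -> bool:
    """Filtering phase: the node must fit the requests and match the nodeSelector."""
    fits = node.free_cpu_m >= pod.cpu_m and node.free_mem_mi >= pod.mem_mi
    selected = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    return fits and selected


def score(node: Node, pod: Pod) -> int:
    """Scoring phase (toy rule): prefer the node with the most CPU left afterwards."""
    return node.free_cpu_m - pod.cpu_m


def schedule(pod: Pod, nodes: list):
    candidates = [n for n in nodes if feasible(n, pod)]
    if not candidates:
        return None  # no feasible node: the Pod stays Pending
    return max(candidates, key=lambda n: score(n, pod)).name


nodes = [
    Node("node-a", 500, 1024, {"disk": "ssd"}),
    Node("node-b", 2000, 4096, {"disk": "ssd"}),
    Node("node-c", 4000, 8192, {"disk": "hdd"}),
]

print(schedule(Pod(1000, 2048, {"disk": "ssd"}), nodes))  # node-b
print(schedule(Pod(1000, 2048), nodes))                   # node-c
print(schedule(Pod(8000, 2048), nodes))                   # None
```

In the first call node-a is filtered out for insufficient CPU and node-c for its label, so scoring only has node-b to choose from.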

Answer:

Taints are applied to nodes to repel Pods that don’t explicitly tolerate the taint. Tolerations are applied to Pods to allow them to be scheduled on tainted nodes.

Taint effects:

  • NoSchedule — Pods without toleration are not scheduled on the node
  • PreferNoSchedule — Kubernetes tries to avoid scheduling Pods without toleration
  • NoExecute — Existing Pods without toleration are evicted
# Add a taint to a node
kubectl taint nodes node1 dedicated=gpu:NoSchedule

# Remove a taint
kubectl taint nodes node1 dedicated=gpu:NoSchedule-
# Pod with toleration
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
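
How a toleration is matched against a taint can be sketched as follows. The matching rules (Equal vs Exists, an empty effect matching any effect, an empty key with Exists matching every taint) follow the Kubernetes documentation; the function names and the dict representation are assumptions for this illustration.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """True when a single toleration matches a single taint."""
    # An empty or missing effect on the toleration matches every effect
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")  # Equal is the default
    key = toleration.get("key", "")
    if operator == "Exists":
        # An empty key with Exists tolerates every taint
        return key == "" or key == taint["key"]
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


def schedulable(pod_tolerations: list, node_taints: list) -> bool:
    """NoSchedule and NoExecute taints must all be tolerated; PreferNoSchedule is only a preference."""
    hard = [t for t in node_taints if t["effect"] in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in pod_tolerations) for taint in hard)


# The taint and toleration from the examples above
gpu_taint = {"key": "dedicated", "value": "gpu", "effect": "NoSchedule"}
gpu_toleration = {"key": "dedicated", "operator": "Equal", "value": "gpu", "effect": "NoSchedule"}

print(schedulable([], [gpu_taint]))                # False
print(schedulable([gpu_toleration], [gpu_taint]))  # True
```

A toleration only permits scheduling onto the tainted node; it does not attract the Pod there. To pin GPU workloads to GPU nodes, combine the toleration with a nodeSelector or node affinity.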

Use cases:

  • Dedicated nodes for specific workloads (GPU nodes, prod nodes)
  • Preventing non-system Pods on control plane nodes

Answer:

Node Affinity: Constrains which nodes a Pod can be scheduled on based on node labels. More expressive than nodeSelector.

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:  # hard rule
        nodeSelectorTerms:
        - matchExpressions:
          - key: node.kubernetes.io/instance-type
            operator: In
            values: ["m5.xlarge", "m5.2xlarge"]
      preferredDuringSchedulingIgnoredDuringExecution:  # soft rule
      - weight: 1
        preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values: ["us-east-1a"]

Pod Affinity / Anti-Affinity: Schedules Pods relative to other Pods.

# Anti-affinity: spread pods across nodes
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: my-app
        topologyKey: kubernetes.io/hostname

Answer:

Role-Based Access Control (RBAC) regulates access to Kubernetes resources based on the roles of users or service accounts.

Key objects:

| Object | Scope | Purpose |
|---|---|---|
| Role | Namespace | Grants permissions within a namespace |
| ClusterRole | Cluster-wide | Grants permissions across all namespaces |
| RoleBinding | Namespace | Binds Role to user/group/service account |
| ClusterRoleBinding | Cluster-wide | Binds ClusterRole cluster-wide |
# Role — allows reading pods in default namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: default
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]

---
# RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods-binding
  namespace: default
subjects:
- kind: ServiceAccount
  name: my-service-account
  namespace: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
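
RBAC rules are purely additive: a request is allowed if any rule bound to the subject matches it, and there are no deny rules. The sketch below evaluates the pod-reader rules from the Role above; it is a simplified model that ignores resourceNames, subresources, and non-resource URLs, and the function names are made up for illustration.

```python
def rule_allows(rule: dict, api_group: str, resource: str, verb: str) -> bool:
    """True when one RBAC rule covers the requested group, resource, and verb."""
    def matches(allowed: list, wanted: str) -> bool:
        return "*" in allowed or wanted in allowed

    return (matches(rule["apiGroups"], api_group)
            and matches(rule["resources"], resource)
            and matches(rule["verbs"], verb))


def role_allows(rules: list, api_group: str, resource: str, verb: str) -> bool:
    """A request is allowed if any rule matches (there is no explicit deny)."""
    return any(rule_allows(rule, api_group, resource, verb) for rule in rules)


# Rules copied from the pod-reader Role above ("" is the core API group)
pod_reader_rules = [
    {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]},
]

print(role_allows(pod_reader_rules, "", "pods", "get"))      # True
print(role_allows(pod_reader_rules, "", "pods", "delete"))   # False
print(role_allows(pod_reader_rules, "", "secrets", "get"))   # False
```

On a live cluster the equivalent check is kubectl auth can-i with the --as flag, for example impersonating system:serviceaccount:default:my-service-account.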

Answer:

An Ingress resource defines HTTP/HTTPS routing rules to Services. An Ingress Controller implements those rules (e.g., NGINX, Traefik, or the AWS Load Balancer Controller, formerly the AWS ALB Ingress Controller).

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  # No rewrite-target annotation here: with ingress-nginx, "rewrite-target: /"
  # would rewrite every matched path (for example /api/users) to "/"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - myapp.example.com
    secretName: tls-secret
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
      - path: /
        pathType: Prefix
        backend:
          service:
            name: frontend-service
            port:
              number: 80

In EKS, the AWS Load Balancer Controller creates ALBs automatically from Ingress resources using annotations.

Answer:

etcd is a distributed, consistent key-value store used as Kubernetes’ backing store for all cluster data. Every API object (Pods, Services, ConfigMaps, Secrets, etc.) is stored in etcd.

Key properties:

  • Consistency: Uses Raft consensus algorithm for leader election and data replication
  • High availability: Typically run as a 3 or 5-node cluster (odd number for quorum)
  • Watch API: Enables Kubernetes controllers to watch for changes

Important facts:

  • All communication with etcd goes through the API server
  • Backing up etcd is critical for disaster recovery
  • In EKS, etcd is fully managed by AWS
# In a self-managed cluster — backup etcd
ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
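
The preference for 3 or 5 members follows from Raft's majority rule: a cluster of n members needs floor(n/2) + 1 of them to agree, so an even member count adds no extra fault tolerance. A quick calculation (the function names are illustrative):

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with the given member count."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while the cluster can still reach quorum."""
    return members - quorum(members)


for n in (1, 2, 3, 4, 5):
    print(f"members={n} quorum={quorum(n)} tolerated_failures={fault_tolerance(n)}")
```

Going from 3 to 4 members raises the quorum from 2 to 3 but still tolerates only one failure, which is why odd sizes are recommended.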

Answer:

A Kubernetes Operator is a method of packaging, deploying, and managing a Kubernetes application using custom controllers and CRDs. Operators encode operational knowledge (how to deploy, scale, upgrade, backup) into software.

Operator pattern:

  1. Define a CRD (e.g., PostgreSQLCluster)
  2. Implement a Controller that watches the CRD
  3. Controller reconciles the actual state with the desired state

Popular Operators:

  • Prometheus Operator
  • PostgreSQL Operator (Zalando or CrunchyData)
  • Cert-Manager
  • ArgoCD
# Example: Using the Prometheus Operator CRD
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  replicas: 2
  retention: 30d
  storage:
    volumeClaimTemplate:
      spec:
        resources:
          requests:
            storage: 50Gi

Answer:

| Feature | Role | ClusterRole |
|---|---|---|
| Scope | Single namespace | Cluster-wide |
| Use case | Namespace-scoped resources | Cluster-scoped resources (Nodes, PVs) or all namespaces |
| Bound by | RoleBinding | ClusterRoleBinding (or RoleBinding for namespace scope) |

A ClusterRole can be bound in two ways:

  1. ClusterRoleBinding → grants access across all namespaces
  2. RoleBinding → grants ClusterRole access within a specific namespace only

This is useful when you want to reuse a ClusterRole definition across multiple namespaces.

Answer:

Kubernetes uses CoreDNS as the cluster DNS server. Every Pod gets a /etc/resolv.conf pointing to the CoreDNS service IP. Services and Pods are accessible via DNS.

DNS naming format:

# Service
<service-name>.<namespace>.svc.cluster.local

# Pod
<pod-ip-dashes>.<namespace>.pod.cluster.local
# Example: 10-244-1-5.default.pod.cluster.local

# Headless service Pods (StatefulSet)
<pod-name>.<service-name>.<namespace>.svc.cluster.local
# Example: mysql-0.mysql.default.svc.cluster.local
# Test DNS from inside a Pod
kubectl exec -it my-pod -- nslookup kubernetes.default
kubectl exec -it my-pod -- curl http://my-service.my-namespace.svc.cluster.local
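
The naming formats above are plain string templates, which the following helpers reproduce. The function names are made up for this sketch, and cluster.local is assumed as the cluster domain (it is the common default but is configurable).

```python
def service_fqdn(service: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """DNS name of a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def pod_fqdn(pod_ip: str, namespace: str = "default",
             cluster_domain: str = "cluster.local") -> str:
    """DNS name of a Pod: the IP address with dots replaced by dashes."""
    return f"{pod_ip.replace('.', '-')}.{namespace}.pod.{cluster_domain}"


def statefulset_pod_fqdn(statefulset: str, ordinal: int, service: str,
                         namespace: str = "default",
                         cluster_domain: str = "cluster.local") -> str:
    """DNS name of a StatefulSet Pod behind its headless Service."""
    return f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"


print(service_fqdn("my-service", "my-namespace"))
print(pod_fqdn("10.244.1.5"))
print(statefulset_pod_fqdn("mysql", 0, "mysql"))
```

Inside the same namespace the short name my-service is enough, because the search domains in the Pod's /etc/resolv.conf expand it to the full name.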

Answer:

The EKS Control Plane includes the Kubernetes API server, scheduler, controller manager, and etcd. AWS:

  • Runs the control plane across 3 Availability Zones for HA
  • Manages etcd backups automatically
  • Handles security patches and control plane upgrades
  • Provides a dedicated API server endpoint for each cluster
  • Monitors and auto-replaces unhealthy control plane nodes

Customers are responsible for:

  • Worker nodes and node groups
  • Application deployments
  • Initiating Kubernetes version upgrades (AWS performs the control plane upgrade; you upgrade the nodes and add-ons)
# Check EKS cluster status
aws eks describe-cluster --name my-cluster --region us-east-1

# List EKS clusters
aws eks list-clusters --region us-east-1

Answer:

EKS uses a webhook token authenticator that verifies AWS IAM identities:

  1. kubectl requests a pre-signed STS token via aws eks get-token
  2. The token is sent to the Kubernetes API server
  3. The API server passes it to the AWS IAM Authenticator webhook
  4. The webhook calls STS to validate the token and returns the IAM identity
  5. Kubernetes maps the IAM identity to a Kubernetes RBAC user/group via the aws-auth ConfigMap (or EKS Access Entries)
# Get token manually (for debugging)
aws eks get-token --cluster-name my-cluster

# Check what identity kubectl uses
kubectl auth whoami
aws sts get-caller-identity

Answer:

The aws-auth ConfigMap in the kube-system namespace maps AWS IAM principals (users, roles) to Kubernetes RBAC users and groups. This controls who can access the cluster and with what permissions.

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::123456789012:role/eks-node-group-role
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
    - rolearn: arn:aws:iam::123456789012:role/eks-admin-role
      username: admin
      groups:
        - system:masters
  mapUsers: |
    - userarn: arn:aws:iam::123456789012:user/john
      username: john
      groups:
        - developers

Note: AWS now recommends EKS Access Entries (API-based) as the preferred alternative to the aws-auth ConfigMap.

Answer:

The Amazon VPC CNI (Container Network Interface) plugin is the default networking plugin for EKS. It assigns real VPC IP addresses to Pods from the node’s subnet, enabling direct communication between Pods and other AWS resources.

Key features:

  • Each Pod gets a real VPC IP address (not a virtual overlay network)
  • Pods can communicate directly with RDS, ElastiCache, and other AWS services
  • Security Groups can be applied directly to Pods (SecurityGroupPolicy)
  • Supports IPv4 and IPv6
# Check VPC CNI version
kubectl describe daemonset aws-node -n kube-system | grep Image

# Check IP address allocation per node
kubectl get nodes -o custom-columns=\
  'NAME:.metadata.name,MAX_PODS:.status.capacity.pods'

IP address calculation: Each EC2 instance type has a limit on ENIs and IPs per ENI. Max Pods = (ENIs × (IPs per ENI - 1)) + 2
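
As a quick worked example of the formula above, here is a minimal Python sketch. The ENI and IP-per-ENI figures are the commonly published limits for the two instance types and are treated as assumptions here; check the EC2 documentation for the instance type you actually use.

```python
def max_pods(enis: int, ips_per_eni: int) -> int:
    """Max Pods = (ENIs x (IPs per ENI - 1)) + 2."""
    # Each ENI keeps one IP as its own primary address, so it is not given to Pods.
    # The +2 covers host-network Pods that do not consume a VPC IP of their own.
    return enis * (ips_per_eni - 1) + 2


# Assumed limits: m5.large = 3 ENIs x 10 IPs, t3.medium = 3 ENIs x 6 IPs.
print(max_pods(enis=3, ips_per_eni=10))  # m5.large
print(max_pods(enis=3, ips_per_eni=6))   # t3.medium
```

With these inputs the function gives 29 Pods for m5.large and 17 for t3.medium.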


Answer:

Methods to scale EKS nodes:

  1. Cluster Autoscaler (CA): Automatically adjusts the number of nodes in an Auto Scaling Group based on pending Pods.

  2. Karpenter: AWS-native node autoscaler that provisions optimal EC2 instances on demand (faster and more flexible than CA).

  3. Manual scaling: Update the desired count in the ASG or via eksctl.

# Manual scaling with eksctl
eksctl scale nodegroup \
  --cluster my-cluster \
  --name standard-nodes \
  --nodes 5 \
  --nodes-min 2 \
  --nodes-max 10
# Cluster Autoscaler Deployment (container command flags)
spec:
  template:
    spec:
      containers:
      - name: cluster-autoscaler
        command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
        - --balance-similar-node-groups
        - --skip-nodes-with-system-pods=false

Answer:

Karpenter is an open-source, high-performance node autoscaler for Kubernetes, originally built by AWS. It provisions the right EC2 instance types for your workloads in seconds, rather than minutes.

Advantages over Cluster Autoscaler:

  • Provisions nodes directly (no ASG required)
  • Supports diverse instance types automatically (spot + on-demand mix)
  • Consolidates underutilized nodes automatically
  • Node provisioning in ~60 seconds vs 2-5 minutes for CA
# NodePool (Karpenter v0.32+, v1beta1 API)
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
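    spec:
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default   # assumes an EC2NodeClass named "default" exists
      requirements: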
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot", "on-demand"]
      - key: kubernetes.io/arch
        operator: In
        values: ["amd64"]
      - key: karpenter.k8s.aws/instance-category
        operator: In
        values: ["c", "m", "r"]
  disruption:
    consolidationPolicy: WhenUnderutilized
  limits:
    cpu: 1000

Answer:

EKS Anywhere allows you to create and manage Kubernetes clusters on your own infrastructure (on-premises, VMware vSphere, bare metal, or other cloud providers) using the same EKS tools and configurations used in AWS.

Use cases:

  • Regulatory requirements that prevent cloud usage
  • Data sovereignty requirements
  • Hybrid cloud architectures
  • Air-gapped environments

Key features:

  • Uses same EKS configuration API
  • Supports curated packages (e.g., Harbor, MetalLB, Prometheus) and ships Cilium as the default CNI
  • Optionally connect to AWS via EKS Connector for management in the AWS Console

Answer:

The AWS Load Balancer Controller is a controller that manages AWS Elastic Load Balancers for Kubernetes clusters. It provisions:

  • Application Load Balancers (ALBs) for Ingress resources
  • Network Load Balancers (NLBs) for Service type LoadBalancer
# ALB Ingress via AWS Load Balancer Controller
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
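  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:...
    # ssl-redirect needs both an HTTP and an HTTPS listener
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS": 443}]'
    alb.ingress.kubernetes.io/ssl-redirect: "443"
spec:
  ingressClassName: alb   # replaces the deprecated kubernetes.io/ingress.class annotation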
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-service
            port:
              number: 80
# Install AWS Load Balancer Controller via Helm
helm repo add eks https://aws.github.io/eks-charts
helm repo update
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName=my-cluster \
  --set serviceAccount.create=false \
  --set serviceAccount.name=aws-load-balancer-controller


Answer:

EKS Add-ons are operational software components that extend the functionality of Kubernetes. AWS manages their lifecycle (installation, updates, conflict resolution).

Available add-ons:

  • kube-proxy — network proxy
  • coredns — cluster DNS
  • vpc-cni — Pod networking
  • aws-ebs-csi-driver — EBS storage
  • aws-efs-csi-driver — EFS storage
  • adot — AWS Distro for OpenTelemetry
  • amazon-cloudwatch-observability — CloudWatch monitoring
  • eks-pod-identity-agent — Pod identity
# List available add-ons
aws eks describe-addon-versions --kubernetes-version 1.29

# Install an add-on
aws eks create-addon \
  --cluster-name my-cluster \
  --addon-name aws-ebs-csi-driver \
  --service-account-role-arn arn:aws:iam::123456789:role/ebs-csi-role

# List installed add-ons
aws eks list-addons --cluster-name my-cluster


Answer:

Options for secret management in EKS:

  1. Kubernetes Secrets (base64-encoded, not encrypted by default; enable envelope encryption with AWS KMS)
  2. AWS Secrets Manager + Secrets Store CSI Driver (mount secrets as volumes)
  3. AWS Systems Manager Parameter Store (same CSI driver)
  4. External Secrets Operator (sync external secrets to Kubernetes Secrets)
# Using Secrets Store CSI Driver with AWS Secrets Manager
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: aws-secrets
spec:
  provider: aws
  parameters:
    objects: |
      - objectName: "prod/myapp/db-password"
        objectType: secretsmanager
        objectAlias: db-password

---
# Pod spec fragment that mounts the SecretProviderClass above as a volume
spec:
  containers:
  - name: app
    volumeMounts:
    - name: secrets
      mountPath: /mnt/secrets
      readOnly: true
  volumes:
  - name: secrets
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: aws-secrets
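# Enable EKS secrets encryption with KMS at cluster creation
# (other required flags such as --role-arn and --resources-vpc-config are omitted here)
aws eks create-cluster \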
  --name my-cluster \
  --encryption-config "resources=[secrets],provider={keyArn=arn:aws:kms:...}"


Answer:

Kubernetes enforces a flat networking model with these requirements:

  • All Pods can communicate with each other without NAT
  • All Nodes can communicate with all Pods without NAT
  • The IP a Pod sees for itself is the same IP others see

Network layers:

  1. Pod-to-Pod — via CNI plugin (VPC CNI, Calico, Cilium, Flannel)
  2. Pod-to-Service — via kube-proxy (iptables or IPVS rules)
  3. External-to-Service — via NodePort, LoadBalancer, or Ingress

CNI Plugin responsibilities:

  • Assign IP addresses to Pods
  • Set up routing rules
  • Handle network policy enforcement (Calico, Cilium)
# Inspect CNI config on a node
cat /etc/cni/net.d/10-aws.conflist

# Trace network path
kubectl exec -it my-pod -- traceroute 10.100.0.1


Answer:

A CRD extends the Kubernetes API by defining new resource types. Once a CRD is registered, you can create instances of it using kubectl like any built-in resource.

# Define a CRD
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.mycompany.io
spec:
  group: mycompany.io
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              engine:
                type: string
                enum: [postgres, mysql]
              replicas:
                type: integer
                minimum: 1
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
# Use the custom resource
apiVersion: mycompany.io/v1
kind: Database
metadata:
  name: my-db
spec:
  engine: postgres
  replicas: 3

Answer:

VPA automatically adjusts the CPU and memory requests/limits of containers based on actual usage. Unlike HPA (which scales replicas), VPA scales resources per container.

VPA modes:

  • Off — only provides recommendations (no changes)
  • Initial — sets resources only at Pod creation
  • Auto — updates resources and evicts/restarts Pods to apply changes
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: my-app
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2
        memory: 2Gi

Note: VPA and HPA should not be used together on the same metric (e.g., both on CPU). HPA + VPA on different metrics (e.g., HPA on custom metrics, VPA on CPU/memory) can work together.

Answer:

A PDB limits the number of Pods of a replicated application that are down simultaneously during voluntary disruptions (node drains, cluster upgrades, evictions).

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  selector:
    matchLabels:
      app: my-app
  minAvailable: 2      # OR use maxUnavailable
  # maxUnavailable: 1  # Maximum 1 Pod can be unavailable

Use cases:

  • Ensures minimum replicas during node drain for upgrades
  • Protects stateful applications from data loss during disruptions
  • Works with the Cluster Autoscaler and Karpenter
# Check PDB status
kubectl get pdb

# During cluster upgrade — drain blocks if PDB would be violated
kubectl drain node1 --ignore-daemonsets --delete-emptydir-data
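
The arithmetic behind a blocked drain can be sketched in a few lines of Python. This is a simplified model, assuming integer values only (percentages are not handled); the function name and parameters are illustrative and not part of any Kubernetes client library.

```python
def disruptions_allowed(expected_pods, healthy_pods,
                        min_available=None, max_unavailable=None):
    """Return how many Pods may be evicted right now without violating the PDB."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        desired_healthy = min_available
    else:
        desired_healthy = expected_pods - max_unavailable
    # Evictions are refused once healthy Pods would drop below the desired count.
    return max(0, healthy_pods - desired_healthy)


# 3 replicas, all healthy, minAvailable: 2 -> one eviction may proceed.
print(disruptions_allowed(expected_pods=3, healthy_pods=3, min_available=2))
# One replica is already down -> the drain has to wait.
print(disruptions_allowed(expected_pods=3, healthy_pods=2, min_available=2))
```

The same number appears as ALLOWED DISRUPTIONS in the output of kubectl get pdb.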

Answer:

Kubernetes implements service discovery in two ways:

1. DNS-based (recommended):

  • CoreDNS resolves Service names to ClusterIPs
  • <service>.<namespace>.svc.cluster.local

2. Environment variables:

  • At Pod start, Kubernetes injects env vars for all Services in the namespace
  • e.g., MY_SERVICE_SERVICE_HOST, MY_SERVICE_SERVICE_PORT
  • Limitation: only works for Services created before the Pod

Headless Services (for StatefulSets):

  • Set clusterIP: None
  • DNS returns individual Pod IPs instead of a single VIP
  • Enables direct Pod addressing
# Headless service
apiVersion: v1
kind: Service
metadata:
  name: my-stateful-svc
spec:
  clusterIP: None
  selector:
    app: my-stateful-app
  ports:
  - port: 5432
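
The two discovery mechanisms follow simple naming rules, sketched below in Python. The cluster domain default, the ClusterIP, and the helper names are assumptions chosen for illustration.

```python
def service_fqdn(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the DNS name that CoreDNS resolves for a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def service_env_vars(service: str, cluster_ip: str, port: int) -> dict:
    """Build the env vars injected for a Service that existed when the Pod started."""
    # The Service name is upper-cased and dashes become underscores.
    prefix = service.upper().replace("-", "_")
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
    }


print(service_fqdn("my-service", "default"))
print(service_env_vars("my-service", "10.100.0.10", 80))  # hypothetical ClusterIP
```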

Answer:

A Service Mesh is a dedicated infrastructure layer that manages service-to-service communication (traffic management, observability, security) using sidecar proxies without changing application code.

Istio architecture:

  • Data Plane: Envoy sidecar proxies (injected automatically into Pods) handle all traffic
  • Control Plane (Istiod): Manages proxy configuration, certificate lifecycle, and traffic policies

Key Istio features:

  • Traffic management: canary deployments, circuit breaking, retries, timeouts
  • mTLS: automatic mutual TLS between services
  • Observability: distributed tracing (Jaeger), metrics (Prometheus), logging
  • Authorization policies: fine-grained L7 access control
# VirtualService: traffic splitting (canary)
# Subsets v1 and v2 must be defined in a matching DestinationRule
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
  - my-app
  http:
  - route:
    - destination:
        host: my-app
        subset: v1
      weight: 90
    - destination:
        host: my-app
        subset: v2
      weight: 10

Answer:

Admission Controllers are plugins that intercept API server requests after authentication and authorization but before persisting objects to etcd. They can validate or mutate requests.

Two types:

  • Mutating Admission Webhooks: Modify the request (e.g., inject sidecar, add labels/defaults)
  • Validating Admission Webhooks: Allow or reject the request (e.g., enforce policies)

Built-in admission controllers:

  • LimitRanger — enforces resource limits
  • ResourceQuota — enforces namespace quotas
  • PodSecurity — enforces Pod security standards
  • MutatingAdmissionWebhook — calls external webhook for mutations
  • ValidatingAdmissionWebhook — calls external webhook for validation
# Validating Webhook configuration
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: my-policy-webhook
webhooks:
- name: validate.mycompany.io
  clientConfig:
    service:
      name: policy-service
      namespace: kube-system
      path: /validate
  rules:
  - operations: ["CREATE", "UPDATE"]
    apiGroups: ["apps"]
    resources: ["deployments"]
  admissionReviewVersions: ["v1"]
  sideEffects: None
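
On the receiving end, the webhook service replies with an AdmissionReview object that echoes the request UID and sets allowed to true or false. The Python sketch below shows only that response shape; the label check and the function names are hypothetical, and a real webhook would also need an HTTPS server with a certificate the API server trusts.

```python
def admission_response(review: dict, allowed: bool, message: str = "") -> dict:
    """Build an admission.k8s.io/v1 AdmissionReview response."""
    response = {"uid": review["request"]["uid"], "allowed": allowed}
    if message:
        response["status"] = {"message": message}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


def validate_deployment(review: dict, required_label: str = "team") -> dict:
    """Hypothetical policy: reject objects that lack a required label."""
    labels = review["request"]["object"]["metadata"].get("labels", {})
    if required_label in labels:
        return admission_response(review, True)
    return admission_response(review, False, f"missing required label: {required_label}")


sample_review = {
    "request": {
        "uid": "abc-123",
        "object": {"metadata": {"name": "my-app", "labels": {"app": "my-app"}}},
    }
}
print(validate_deployment(sample_review))
```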

Answer:

Network Policies are Kubernetes resources that control traffic flow at the IP/port level between Pods, namespaces, and external endpoints. They are only enforced if the CNI plugin supports them (Calico, Cilium, Weave, or the Amazon VPC CNI with network policy support enabled).

# Allow ingress to role=db Pods only from role=api Pods in the frontend namespace;
# restrict their egress to 10.0.0.0/8
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-frontend
  namespace: backend
spec:
  podSelector:
    matchLabels:
      role: db
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: frontend   # label set automatically on every namespace
      podSelector:
        matchLabels:
          role: api
    ports:
    - protocol: TCP
      port: 5432
  egress:
  - to:
    - ipBlock:
        cidr: 10.0.0.0/8

Default behavior: Without a NetworkPolicy, all traffic is allowed. Once a NetworkPolicy selects a Pod, traffic in each direction listed in policyTypes is denied unless a rule explicitly allows it.
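
The default-allow versus default-deny behavior can be modeled in a few lines of Python. This is a deliberately simplified sketch: it matches only Pod labels and ports, ignores namespaces, ipBlock, and egress, and uses a made-up dictionary layout rather than the real API objects.

```python
def label_match(selector: dict, labels: dict) -> bool:
    """True if every key/value in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(pod_labels: dict, policies: list, source_labels: dict, port: int) -> bool:
    """Decide whether traffic from a source Pod may reach the target Pod."""
    selecting = [p for p in policies if label_match(p["podSelector"], pod_labels)]
    if not selecting:
        return True  # no policy selects the Pod: everything is allowed
    # At least one policy selects the Pod: allow only what some rule permits.
    for policy in selecting:
        for rule in policy["ingress"]:
            if label_match(rule["from"], source_labels) and port in rule["ports"]:
                return True
    return False


db_policy = {
    "podSelector": {"role": "db"},
    "ingress": [{"from": {"role": "api"}, "ports": [5432]}],
}
print(ingress_allowed({"role": "db"}, [db_policy], {"role": "api"}, 5432))
print(ingress_allowed({"role": "db"}, [db_policy], {"role": "web"}, 5432))
print(ingress_allowed({"role": "cache"}, [db_policy], {"role": "web"}, 6379))
```

The three calls return True, False, and True: the last Pod is not selected by any policy, so it stays open to all sources.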

Answer:

Kubernetes is not inherently multi-tenant but can be made so using multiple isolation mechanisms:

Soft multi-tenancy (shared cluster):

  • Namespaces — logical isolation
  • RBAC — access control per team/namespace
  • ResourceQuotas — limit resource consumption per namespace
  • LimitRanges — default/max resources per Pod in a namespace
  • Network Policies — isolate network traffic between namespaces
  • Pod Security Standards — enforce security contexts

Hard multi-tenancy (strong isolation):

  • Separate clusters per tenant (e.g., one EKS cluster each)
  • VCluster — virtual Kubernetes clusters inside a namespace
  • Capsule / HNC — multi-tenancy frameworks
# ResourceQuota per tenant namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-quota
  namespace: tenant-a
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"
    services: "10"

Answer:

kube-proxy manages network rules on nodes for Service routing. The two modes most commonly compared are iptables and IPVS:

| Feature | iptables | IPVS |
|---|---|---|
| Routing | Sequential rule matching | Hash table lookup |
| Performance | Degrades at scale (O(n)) | Constant time (O(1)) |
| Load balancing algorithms | Random selection only | RR, least connections, source hash, etc. |
| Scale | Good up to ~1000 Services | Scales to 10,000+ Services |
| Health checking | Limited | Built-in |
# Check current kube-proxy mode
kubectl get configmap kube-proxy -n kube-system -o yaml | grep mode

# Switch to IPVS mode (via configmap)
kubectl edit configmap kube-proxy -n kube-system
# Set: mode: "ipvs"

For large clusters (>1000 Services), IPVS mode is strongly recommended. EKS also supports IPVS mode.


Answer:

Kubernetes garbage collection automatically removes objects that are no longer needed:

1. Owner References & Cascading Deletion:

  • Resources have ownerReferences pointing to their owner (e.g., Pod → ReplicaSet → Deployment)
  • When an owner is deleted, dependents are deleted via Foreground or Background cascading deletion
# Delete with cascade (default: background)
kubectl delete deployment my-app

# Orphan dependents (don't delete ReplicaSet/Pods)
kubectl delete deployment my-app --cascade=orphan

2. Image Garbage Collection:

  • kubelet removes unused container images when disk usage exceeds imageGCHighThresholdPercent (default 85%)

3. Container Garbage Collection:

  • kubelet removes terminated containers based on the MinAge, MaxPerPodContainer, and MaxContainers settings

4. API Resource GC:

  • The TTL-after-finished controller removes finished Jobs and their Pods once ttlSecondsAfterFinished has elapsed
# Auto-delete Job after 60 seconds
spec:
  ttlSecondsAfterFinished: 60
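
Cascading deletion amounts to walking the ownerReferences graph from the deleted owner down to its dependents. The Python sketch below models that walk under a simplifying assumption that each object has exactly one owner; the object names are hypothetical.

```python
from collections import defaultdict


def dependents_of(owner: str, owner_refs: dict) -> list:
    """Return every object that would be removed when `owner` is deleted."""
    children = defaultdict(list)
    for obj, parent in owner_refs.items():
        children[parent].append(obj)

    result, stack = [], [owner]
    while stack:
        current = stack.pop()
        for child in sorted(children[current]):
            result.append(child)
            stack.append(child)  # keep walking: Pods hang off the ReplicaSet
    return result


# object -> owner, mirroring Pod -> ReplicaSet -> Deployment
owner_refs = {
    "ReplicaSet/my-app-7d9f": "Deployment/my-app",
    "Pod/my-app-7d9f-abc": "ReplicaSet/my-app-7d9f",
    "Pod/my-app-7d9f-xyz": "ReplicaSet/my-app-7d9f",
}
print(dependents_of("Deployment/my-app", owner_refs))
```

With --cascade=orphan the walk is skipped entirely and the ReplicaSet and Pods are left in place without an owner.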

Answer:

Cluster Autoscaler (CA) automatically adjusts the size of a Kubernetes cluster by adding or removing nodes based on Pod scheduling needs.

Scale-up logic:

  1. Pod is Pending because no node has enough resources
  2. CA simulates which node group could schedule the Pod
  3. CA increases the ASG desired count
  4. New node joins, Pod is scheduled

Scale-down logic:

  1. CA checks whether the CPU and memory requested on a node stay below the utilization threshold (default 50% of allocatable) for 10+ minutes
  2. Verifies all Pods can be rescheduled on other nodes
  3. Drains the node and terminates the EC2 instance
# EKS — ASG tags required for CA
Tags:
  k8s.io/cluster-autoscaler/enabled: "true"
  k8s.io/cluster-autoscaler/<cluster-name>: "owned"

Key CA flags:

--scale-down-delay-after-add=10m       # Wait after scale-up before scale-down check
--scale-down-unneeded-time=10m         # Node must be unneeded for this long
--skip-nodes-with-local-storage=false  # Allow scale-down of nodes with local storage
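
A rough model of the scale-down check, assuming utilization is measured as requested resources divided by allocatable and that the larger of the CPU and memory ratios is used. The function and parameter names are illustrative, and the real autoscaler applies further checks (PDBs, rescheduling simulation) that are not shown.

```python
def is_scale_down_candidate(requested_cpu_m: int, allocatable_cpu_m: int,
                            requested_mem_mi: int, allocatable_mem_mi: int,
                            unneeded_minutes: float,
                            threshold: float = 0.5,
                            unneeded_time_minutes: float = 10) -> bool:
    """True if a node looks idle enough, for long enough, to be removed."""
    utilization = max(requested_cpu_m / allocatable_cpu_m,
                      requested_mem_mi / allocatable_mem_mi)
    return utilization < threshold and unneeded_minutes >= unneeded_time_minutes


# 800m of 2000m CPU and 1 GiB of 8 GiB requested, idle for 12 minutes.
print(is_scale_down_candidate(800, 2000, 1024, 8192, unneeded_minutes=12))
# Same node, but only idle for 5 minutes so far.
print(is_scale_down_candidate(800, 2000, 1024, 8192, unneeded_minutes=5))
```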

Answer:

GitOps is an operational framework where Git is the single source of truth for infrastructure and application configuration. Changes are made via Git commits/PRs, and automated agents reconcile the cluster state to match.

GitOps tools for Kubernetes:

  • ArgoCD — declarative, Git-based continuous delivery
  • Flux CD — lightweight, GitOps toolkit for Kubernetes

ArgoCD workflow:

  1. Developer commits Kubernetes manifests to Git
  2. ArgoCD detects the diff between Git and cluster state
  3. ArgoCD syncs the cluster (applies manifests)
  4. Health status is reported back
# ArgoCD Application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/my-org/my-app
    targetRevision: main
    path: k8s/
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

Answer:

CrashLoopBackOff means a container is repeatedly crashing and Kubernetes is backing off before restarting it.

Step-by-step troubleshooting:

# 1. Check Pod status and events
kubectl get pods
kubectl describe pod <pod-name>

# 2. Check current logs
kubectl logs <pod-name>

# 3. Check previous container logs (if container already crashed)
kubectl logs <pod-name> --previous

# 4. Check exit code (tells you why the container exited)
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'

# 5. Debug with an ephemeral container (useful when the image has no shell)
kubectl debug -it <pod-name> --image=busybox --target=<container-name>

Common exit codes:

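| Exit Code | Meaning |
|---|---|
| 0 | Success (no crash; check for a failing liveness probe or a command that simply finishes) |
| 1 | Application error |
| 137 | SIGKILL (128 + 9), most often OOMKilled |
| 139 | Segmentation fault (SIGSEGV, 128 + 11) |
| 143 | Graceful termination (SIGTERM, 128 + 15) |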

Common root causes:

  • Wrong command or entrypoint in the container
  • Missing environment variables or secrets
  • Application fails healthcheck (liveness probe)
  • OOMKilled — increase memory limits
  • Bad image or missing dependencies
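
Exit codes above 128 follow the convention 128 + signal number, which is why 137, 139, and 143 map to SIGKILL, SIGSEGV, and SIGTERM. A small Python helper (illustrative only, with a deliberately short signal table) makes the decoding explicit:

```python
SIGNALS = {9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}


def explain_exit_code(code: int) -> str:
    """Translate a container exit code into a short explanation."""
    if code == 0:
        return "exited normally"
    if code > 128:
        signal_number = code - 128
        name = SIGNALS.get(signal_number, "unknown")
        return f"killed by signal {signal_number} ({name})"
    return "application error"


for code in (0, 1, 137, 139, 143):
    print(code, explain_exit_code(code))
```

An exit code of 137 only says the process received SIGKILL; the Reason field in kubectl describe pod confirms whether it was OOMKilled.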


Answer:

OPA Gatekeeper is a policy engine for Kubernetes built on Open Policy Agent (OPA). It uses validating admission webhooks and CRDs to enforce custom policies.

Key concepts:

  • ConstraintTemplate — defines the policy logic in Rego
  • Constraint — an instance of a ConstraintTemplate with specific parameters
# ConstraintTemplate — enforce required labels
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: requirelabels
spec:
  crd:
    spec:
      names:
        kind: RequireLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
  - target: admission.k8s.gatekeeper.sh
    rego: |
      package requirelabels
      violation[{"msg": msg}] {
        provided := {label | input.review.object.metadata.labels[label]}
        required := {label | label := input.parameters.labels[_]}
        missing := required - provided
        count(missing) > 0
        msg := sprintf("Missing required labels: %v", [missing])
      }

---
# Constraint — apply the policy
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: RequireLabels
metadata:
  name: must-have-env-label
spec:
  match:
    kinds:
    - apiGroups: ["apps"]
      kinds: ["Deployment"]
  parameters:
    labels: ["env", "team", "app"]

Answer:

IRSA (IAM Roles for Service Accounts) allows Kubernetes Pods to assume AWS IAM roles using Kubernetes Service Accounts. This replaces the old pattern of assigning IAM roles to EC2 nodes.

How it works:

  1. EKS cluster has an OIDC provider configured
  2. IAM role has a trust policy allowing the OIDC provider and specific service account
  3. Pod uses a service account annotated with the IAM role ARN
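  4. The EKS Pod Identity Webhook injects the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE env vars and mounts a projected service account token into the Pod
  5. The AWS SDK in the Pod exchanges that token for temporary credentials by calling sts:AssumeRoleWithWebIdentity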
# Create OIDC provider for EKS cluster
eksctl utils associate-iam-oidc-provider \
  --cluster my-cluster \
  --approve

# Create IAM role for service account
eksctl create iamserviceaccount \
  --cluster my-cluster \
  --namespace my-namespace \
  --name my-service-account \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
  --approve
# Service Account with IRSA annotation
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-service-account
  namespace: my-namespace
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789:role/my-pod-role
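
Step 2 of the flow, the IAM trust policy, is where most IRSA misconfigurations occur. The Python sketch below assembles such a policy for one service account. The account ID and OIDC provider path are hypothetical placeholders, and the helper is not part of any AWS SDK.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Trust policy that lets one Kubernetes service account assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {"StringEquals": {
                # Pins the role to a single namespace/service account pair.
                f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                f"{oidc_provider}:aud": "sts.amazonaws.com",
            }},
        }],
    }


policy = irsa_trust_policy(
    account_id="123456789012",                                     # placeholder
    oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",   # placeholder
    namespace="my-namespace",
    service_account="my-service-account",
)
print(json.dumps(policy, indent=2))
```

eksctl create iamserviceaccount generates an equivalent policy automatically; writing it out by hand is mainly useful with Terraform or CloudFormation.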

Answer:

EKS Pod Identity is a newer, simpler mechanism for granting AWS permissions to Pods. It uses a dedicated Pod Identity Agent DaemonSet instead of OIDC webhooks.

Comparison:

| Feature | IRSA | EKS Pod Identity |
|---|---|---|
| Mechanism | OIDC + webhook | Pod Identity Agent DaemonSet |
| IAM trust policy | Complex (OIDC condition) | Simple (pods.eks.amazonaws.com) |
| Cross-account | Supported | Supported |
| Cluster config | OIDC provider required | Agent add-on required |
| Simplicity | More complex setup | Simpler setup |
# Enable Pod Identity add-on
aws eks create-addon \
  --cluster-name my-cluster \
  --addon-name eks-pod-identity-agent

# Create Pod Identity association
aws eks create-pod-identity-association \
  --cluster-name my-cluster \
  --namespace my-namespace \
  --service-account my-service-account \
  --role-arn arn:aws:iam::123456789:role/my-pod-role
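
The role referenced by the association needs a trust policy for the pods.eks.amazonaws.com service principal. A minimal Python sketch of that policy follows; the helper name is illustrative, and the policy can be narrowed further with conditions that are not shown here.

```python
import json


def pod_identity_trust_policy() -> dict:
    """Trust policy that lets EKS Pod Identity assume the role on behalf of Pods."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "pods.eks.amazonaws.com"},
            # TagSession lets EKS attach cluster/namespace/service-account session tags.
            "Action": ["sts:AssumeRole", "sts:TagSession"],
        }],
    }


print(json.dumps(pod_identity_trust_policy(), indent=2))
```

Unlike the IRSA policy, nothing here is cluster-specific, so the same role can be reused across clusters.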


Answer:

Multi-cluster architectures improve availability, separate concerns, and meet compliance requirements.

Common patterns:

1. Active-Active (Global Load Balancing):

  • Multiple EKS clusters in different regions
  • Route 53 latency/geolocation routing between clusters
  • Data synchronization via CRDTs or database replication

2. Active-Passive (Disaster Recovery):

  • Primary cluster in one region, standby in another
  • Velero for backup/restore
  • Route 53 failover routing

3. Hub-Spoke (Management Cluster):

  • Central management cluster running ArgoCD/Flux
  • Spoke clusters receive workloads from the hub

Tools for multi-cluster:

  • ArgoCD — multi-cluster GitOps
  • Cluster API (CAPI) — manage cluster lifecycle
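  • AWS App Mesh — cross-cluster service mesh (scheduled for end of support in September 2026; Istio multi-cluster or Amazon VPC Lattice are alternatives)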
  • Velero — backup and DR
# Register multiple clusters in ArgoCD
argocd cluster add --kubeconfig ./cluster2-kubeconfig arn:aws:eks:us-west-2:123456789:cluster/cluster2

Answer:

Cost optimization strategies:

1. Right-size workloads:

  • Use VPA recommendations to set appropriate resource requests
  • Avoid over-provisioning CPU/memory

2. Spot Instances:

  • Use Karpenter or CA with mixed instance types and Spot
  • Design apps to handle interruptions gracefully (2-minute notice)

3. Node consolidation:

  • Enable Karpenter’s consolidation policy to bin-pack Pods

4. Fargate for variable workloads:

  • Only pay for actual Pod CPU/memory

5. Cluster Autoscaler / Karpenter:

  • Scale down idle nodes automatically

6. Savings Plans & Reserved Instances:

  • Commit to 1 or 3 years for baseline workloads
# Karpenter — prefer Spot, fall back to On-Demand
spec:
  template:
    spec:
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot", "on-demand"]
  disruption:
    consolidationPolicy: WhenUnderutilized
    # In the v1beta1 API, consolidateAfter may only be set together with WhenEmpty
# Monitor costs with Kubecost
helm install kubecost cost-analyzer \
  --repo https://kubecost.github.io/cost-analyzer/ \
  --namespace kubecost --create-namespace

Answer:

Amazon EKS Distro (EKS-D) is the same Kubernetes distribution that powers Amazon EKS, made available for you to use anywhere. It is a free, open-source distribution of Kubernetes that includes:

  • The same Kubernetes version and patches used in EKS
  • Extended support timelines for older K8s versions
  • Same versions of dependencies: etcd, CoreDNS, metrics-server, etc.
  • Amazon-tested and signed binaries

Use cases:

  • Run the same Kubernetes distribution on-premises as in EKS
  • Consistent behavior across hybrid environments
  • Foundation for EKS Anywhere

Answer:

Blue/Green deployment runs two identical environments (blue = current, green = new) and switches traffic instantaneously.

Method 1: Kubernetes Services + Label Switching

# Switch traffic from blue to green by updating service selector
kubectl patch service my-service \
  -p '{"spec":{"selector":{"version":"green"}}}'

Method 2: AWS ALB Weighted Target Groups

# Ingress with traffic splitting (AWS Load Balancer Controller)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/actions.blue-green: |
      {
        "type": "forward",
        "forwardConfig": {
          "targetGroups": [
            {"serviceName": "blue-service", "servicePort": 80, "weight": 0},
            {"serviceName": "green-service", "servicePort": 80, "weight": 100}
          ]
        }
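      }
spec:
  ingressClassName: alb
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: blue-green        # must match the action name in the annotation
            port:
              name: use-annotation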

Method 3: Argo Rollouts

apiVersion: argoproj.io/v1alpha1
kind: Rollout
spec:
  strategy:
    blueGreen:
      activeService: my-active-service
      previewService: my-preview-service
      autoPromotionEnabled: false

Answer:

In EKS, etcd is fully managed by AWS. You do not have direct access to etcd. AWS automatically handles:

  • etcd backups (multiple times per day)
  • Multi-AZ replication for etcd
  • Automatic etcd recovery

For application-level DR:

  • Velero — backs up Kubernetes resources and PersistentVolumes to S3
# Install Velero with AWS S3 backend
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.8.0 \
  --bucket my-velero-bucket \
  --backup-location-config region=us-east-1 \
  --snapshot-location-config region=us-east-1 \
  --secret-file ./credentials-velero

# Backup all resources in a namespace
velero backup create my-backup --include-namespaces production

# Schedule daily backups
velero schedule create daily-backup \
  --schedule="0 1 * * *" \
  --include-namespaces production

# Restore from backup
velero restore create --from-backup my-backup

Answer:

1. IAM and RBAC:

  • Use IRSA or Pod Identity instead of node-level IAM roles
  • Apply least-privilege IAM policies
  • Use EKS Access Entries instead of aws-auth ConfigMap
  • Regularly audit RBAC bindings

2. Network Security:

  • Enable private API server endpoint
  • Use Security Groups for Pods
  • Implement Network Policies (Calico or Cilium)
  • Use VPC endpoints for AWS service traffic

3. Secrets Management:

  • Encrypt Kubernetes Secrets with KMS at rest
  • Use AWS Secrets Manager via CSI driver or External Secrets Operator

4. Pod Security:

  • Enforce Pod Security Standards (Restricted profile)
  • Disable privilege escalation: allowPrivilegeEscalation: false
  • Run containers as non-root users
  • Use read-only root filesystems

5. Runtime Security:

  • Enable Amazon GuardDuty for EKS (runtime threat detection)
  • Use Falco for real-time runtime security

6. Image Security:

  • Scan images with Amazon ECR image scanning (or Trivy/Snyk)
  • Use immutable image tags
  • Sign images with Notary/Cosign
# Restricted Pod Security Standard
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Answer:

Three pillars of observability: Metrics, Logs, Traces

Metrics:

  • Amazon CloudWatch Container Insights — native AWS monitoring for EKS
  • Prometheus + Grafana — open-source, highly flexible
  • Datadog / New Relic — enterprise observability platforms

Logs:

  • Fluent Bit (DaemonSet) → CloudWatch Logs / OpenSearch
  • Fluentd — more plugins, slightly heavier
  • EKS control plane logging: enable in AWS Console/CLI

Traces:

  • AWS X-Ray — native AWS distributed tracing
  • OpenTelemetry (ADOT) — standard collection pipeline
  • Jaeger / Tempo — open-source tracing
# Enable EKS Control Plane logging
aws eks update-cluster-config \
  --name my-cluster \
  --logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'

# Install Prometheus stack via Helm
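helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install kube-prometheus-stack \
  prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace \
  --set grafana.adminPassword='<strong-password>'   # placeholder; never hardcode real credentials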
# CloudWatch agent as a DaemonSet (Container Insights)
# Install via add-on
aws eks create-addon \
  --cluster-name my-cluster \
  --addon-name amazon-cloudwatch-observability


Answer:

EKS upgrade process (recommended steps):

Phase 1: Preparation

# 1. Review EKS release notes and deprecated APIs
# 2. Test upgrade in lower environments first
# 3. Backup with Velero

# Check current version
aws eks describe-cluster --name my-cluster --query cluster.version

# kubectl convert (installed as a separate plugin) rewrites manifests to a newer API version
kubectl convert --help
# Use Pluto to detect deprecated APIs in the cluster
pluto detect-all-in-cluster --target-versions k8s=v1.29.0

Phase 2: Upgrade the Control Plane

# Upgrade control plane (15-25 min, no downtime)
aws eks update-cluster-version \
  --name my-cluster \
  --kubernetes-version 1.29

# Wait for completion
aws eks wait cluster-active --name my-cluster

Phase 3: Upgrade Add-ons

# Update EKS add-ons (vpc-cni, coredns, kube-proxy)
aws eks update-addon \
  --cluster-name my-cluster \
  --addon-name vpc-cni \
  --resolve-conflicts OVERWRITE   # discards custom settings; use PRESERVE to keep them

Phase 4: Upgrade Node Groups

# For Managed Node Groups
aws eks update-nodegroup-version \
  --cluster-name my-cluster \
  --nodegroup-name standard-nodes

# The process: new nodes → cordon old nodes → drain → terminate
# PodDisruptionBudgets are respected during drain

Phase 5: Validate

kubectl get nodes
kubectl get pods -A
kubectl get events -A | grep Warning

Key tip: Upgrade one minor version at a time (e.g., 1.27 → 1.28 → 1.29). Skipping versions is not supported.
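
The one-minor-version-at-a-time rule is easy to encode when scripting upgrades. A minimal Python sketch, assuming plain "major.minor" version strings (the function name is illustrative):

```python
def upgrade_path(current: str, target: str) -> list:
    """List every minor version that must be applied, in order."""
    cur_major, cur_minor = (int(part) for part in current.split("."))
    tgt_major, tgt_minor = (int(part) for part in target.split("."))
    if cur_major != tgt_major or tgt_minor < cur_minor:
        raise ValueError("target must be the same major version and not older")
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


print(upgrade_path("1.27", "1.29"))
```

Each entry in the returned list corresponds to one full pass through Phases 2 to 5 above.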


📌 Quick Reference

Kubectl Cheat Sheet

# Context management
kubectl config get-contexts
kubectl config use-context <context>

# Resource management
kubectl get all -n <namespace>
kubectl apply -f <file>
kubectl delete -f <file>
kubectl edit <resource> <name>

# Debugging
kubectl describe <resource> <name>
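kubectl logs <pod> -c <container> -f            # stream logs from the running container
kubectl logs <pod> -c <container> --previous    # logs from the previous (crashed) instance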
kubectl exec -it <pod> -- /bin/sh
kubectl top pods --sort-by=memory

# Port forwarding
kubectl port-forward svc/<service> 8080:80

# Rollout management
kubectl rollout status deployment/<name>
kubectl rollout history deployment/<name>
kubectl rollout undo deployment/<name> --to-revision=2

Happy interviewing! 🚀 This guide covers the most commonly asked Kubernetes and EKS interview questions across all experience levels.
