Kubernetes (K8s) is an open-source container orchestration platform. These 18 questions cover core architecture, workloads, networking, storage, scaling, security, and advanced operations — spanning beginner to advanced levels.
Kubernetes is an open-source container orchestration system that automates deployment, scaling, and management of containerized applications. It abstracts infrastructure complexity, provides self-healing, load balancing, and declarative configuration, making it the industry standard for running containers at scale.
The control plane consists of the API Server (single entry point for all REST operations), etcd (distributed key-value store for cluster state), Scheduler (assigns pods to nodes), Controller Manager (runs reconciliation loops), and optionally Cloud Controller Manager (integrates with cloud providers).
A Pod is the smallest deployable unit in Kubernetes, wrapping one or more tightly coupled containers that share the same network namespace, IP address, and storage volumes. Containers in a pod communicate via localhost and are always co-scheduled on the same node.
A Deployment manages stateless replicas with interchangeable pods and random pod names, supporting rolling updates and rollbacks. A StatefulSet manages stateful applications by providing stable network identities, ordered pod creation/deletion, and persistent volume claims that follow each pod — essential for databases like MySQL or Cassandra.
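The stable identities mentioned above follow a predictable naming scheme. The sketch below assumes the default behaviour (ordinals starting at 0) and a governing headless Service; the names `mysql` and `mysql-headless` are hypothetical and only serve as illustration.

```python
from typing import List


def statefulset_pod_names(name: str, replicas: int) -> List[str]:
    """Pods of a StatefulSet are named <statefulset-name>-<ordinal>."""
    return [f"{name}-{ordinal}" for ordinal in range(replicas)]


def pod_dns_name(pod: str, service: str, namespace: str = "default",
                 cluster_domain: str = "cluster.local") -> str:
    """Per-pod DNS record published through the governing headless Service."""
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


if __name__ == "__main__":
    # Hypothetical StatefulSet "mysql" governed by Service "mysql-headless".
    for pod in statefulset_pod_names("mysql", 3):
        print(pod, "->", pod_dns_name(pod, "mysql-headless"))
```

Because the name and DNS record survive rescheduling, a replacement for `mysql-1` is reachable at the same address as its predecessor, which a Deployment's randomly named pods cannot offer.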
A Service provides a stable virtual IP (ClusterIP) and DNS name that load-balances traffic to a dynamic set of pods selected by labels. The four types are ClusterIP (internal only), NodePort (exposes on each node's IP), LoadBalancer (provisions a cloud load balancer), and ExternalName (maps to an external DNS name).
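Label selection can be illustrated with a small model. This is a simplified sketch, not the actual endpoints controller: pods are plain dictionaries, the field names are made up for the example, and only equality-based selectors are handled.

```python
from typing import Dict, List


def matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """A pod matches when it carries every key/value pair in the selector."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_endpoints(selector: Dict[str, str], pods: List[dict]) -> List[str]:
    """Return the IPs of ready pods whose labels satisfy the selector."""
    if not selector:
        # A Service without a selector gets no automatically managed endpoints.
        return []
    return [pod["ip"] for pod in pods
            if pod["ready"] and matches(selector, pod["labels"])]


if __name__ == "__main__":
    pods = [
        {"ip": "10.0.0.1", "ready": True, "labels": {"app": "web", "tier": "frontend"}},
        {"ip": "10.0.0.2", "ready": False, "labels": {"app": "web"}},
        {"ip": "10.0.0.3", "ready": True, "labels": {"app": "db"}},
    ]
    print(select_endpoints({"app": "web"}, pods))
```

Extra labels on a pod do not prevent a match; only the pairs named in the selector are compared.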
A ConfigMap stores non-sensitive configuration data as key-value pairs that can be injected into pods as environment variables or mounted files. A Secret stores sensitive data (passwords, tokens, keys) in base64-encoded form and can be protected with stricter RBAC rules. Base64 is encoding, not encryption, and by default Secrets are stored unencrypted in etcd, so production clusters should enable encryption at rest, ideally backed by a KMS provider.
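The point that base64 is only an encoding is easy to demonstrate. The snippet below uses Python's standard `base64` module; the helper names and the sample value are illustrative.

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Encode a value the way it appears under `data:` in a Secret manifest."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Anyone who can read the Secret can reverse the encoding without a key."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    encoded = encode_secret_value("admin")
    print(encoded)                       # YWRtaW4=
    print(decode_secret_value(encoded))  # admin
```

No key is involved at any step, which is why access to Secrets has to be restricted by RBAC and the stored data protected by encryption at rest.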
Controllers continuously watch the API Server for the current state of resources and compare it to the desired state declared in the spec; when they diverge, the controller takes corrective action to reconcile them. For example, the ReplicaSet controller will create or delete pods to match the desired replica count, embodying the declarative model.
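One pass of such a reconciliation loop can be sketched as a pure function. This is a simplified model, not the real ReplicaSet controller: pod names are hypothetical placeholders, and the choice of which pods to delete (here simply the last ones in the list) is an assumption made for brevity.

```python
from typing import List, Tuple


def reconcile(desired_replicas: int, running_pods: List[str]) -> List[Tuple[str, str]]:
    """Return the actions that move the observed state toward the desired state."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: create the missing ones (placeholder names).
        return [("create", f"new-{i}") for i in range(diff)]
    if diff < 0:
        # Too many pods: delete the surplus (simplified victim selection).
        return [("delete", name) for name in running_pods[diff:]]
    return []  # States already match; nothing to do.


if __name__ == "__main__":
    print(reconcile(3, ["web-a"]))
    print(reconcile(1, ["web-a", "web-b", "web-c"]))
```

A controller runs this comparison repeatedly, so a pod that crashes between passes is simply noticed as a shortfall on the next one.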
etcd is a distributed, strongly consistent key-value store that holds all cluster state, including nodes, pods, configs, and secrets. If etcd becomes unavailable, the API Server cannot read or write cluster state: existing pods keep running, but nothing can be created, updated, or rescheduled. Regular etcd backups and a highly available etcd cluster are therefore essential for production.
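High availability in etcd depends on a majority (quorum) of members being reachable, which is why clusters are usually sized with an odd number of members. The arithmetic is sketched below; the function names are illustrative.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1


def failure_tolerance(members: int) -> int:
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


if __name__ == "__main__":
    for size in (1, 3, 4, 5):
        print(size, "members: quorum", quorum(size),
              "tolerates", failure_tolerance(size), "failure(s)")
```

Going from 3 to 4 members raises the quorum without raising the number of tolerated failures, so the fourth member adds cost but no resilience.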
Update a Deployment by changing the container image (kubectl set image or editing the manifest); Kubernetes replaces pods incrementally using the maxSurge and maxUnavailable parameters to maintain availability. Roll back with kubectl rollout undo deployment/<name>, which reverts to the previous ReplicaSet; kubectl rollout history lists the retained revisions, whose number is bounded by revisionHistoryLimit (10 by default).
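The effect of maxSurge and maxUnavailable on a rollout can be worked out with a little arithmetic. The sketch below assumes the documented rounding rules (percentages round up for maxSurge and down for maxUnavailable) and the default of 25% for both; it ignores edge cases such as both values resolving to zero.

```python
from typing import Tuple, Union


def resolve(value: Union[int, str], replicas: int, round_up: bool) -> int:
    """Turn an absolute number or a percentage string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = int(value[:-1]) * replicas  # integer maths avoids float error
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas: int,
                   max_surge: Union[int, str] = "25%",
                   max_unavailable: Union[int, str] = "25%") -> Tuple[int, int]:
    """Return (max total pods, min available pods) during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas + surge, replicas - unavailable


if __name__ == "__main__":
    print(rollout_bounds(10))       # (13, 8)
    print(rollout_bounds(3, 1, 0))  # (4, 3)
```

With 10 replicas and the defaults, the rollout may run up to 13 pods at once and never drops below 8 available ones.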
Requests are the guaranteed CPU/memory a container needs for scheduling; the Scheduler uses them to find a suitable node. Limits cap the maximum a container can consume — exceeding CPU is throttled, exceeding memory triggers an OOMKill. Setting both correctly prevents noisy-neighbor problems and enables accurate bin-packing.
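How requests drive placement can be shown with a simplified fit check. This is a first-fit sketch under stated assumptions, not the real scheduler, which filters and then scores nodes using many more criteria. Node names and the units (CPU in millicores, memory in MiB) are chosen for illustration.

```python
from typing import Dict, List, Optional

Resources = Dict[str, int]  # e.g. {"cpu": 500, "memory": 512}


def fits(allocatable: Resources, already_requested: Resources,
         pod_requests: Resources) -> bool:
    """A pod fits when its requests plus existing requests stay within allocatable."""
    return all(
        already_requested.get(name, 0) + amount <= allocatable.get(name, 0)
        for name, amount in pod_requests.items()
    )


def pick_node(nodes: List[dict], pod_requests: Resources) -> Optional[str]:
    """Return the first node with enough unrequested capacity, or None."""
    for node in nodes:
        if fits(node["allocatable"], node["requested"], pod_requests):
            return node["name"]
    return None  # The pod would stay Pending.


if __name__ == "__main__":
    nodes = [
        {"name": "node-a", "allocatable": {"cpu": 2000, "memory": 4096},
         "requested": {"cpu": 1800, "memory": 1024}},
        {"name": "node-b", "allocatable": {"cpu": 4000, "memory": 8192},
         "requested": {"cpu": 1000, "memory": 2048}},
    ]
    print(pick_node(nodes, {"cpu": 500, "memory": 512}))
```

Note that the check uses requests, not actual usage or limits: a node can look full to the scheduler while its CPUs are idle if requests are set too high.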
A Liveness probe determines if a container is alive; failure causes the kubelet to restart it, recovering from deadlocks or unrecoverable states. A Readiness probe determines if a container is ready to receive traffic; failure removes the pod's IP from Service endpoints without restarting it, preventing traffic from reaching an unready pod.
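The different consequences of the two probes can be modelled in a few lines. This is a toy model, assuming the default failureThreshold of 3 consecutive failures; the function names and the action strings are made up for the example and do not correspond to kubelet internals.

```python
from typing import List


def threshold_reached(results: List[bool], failure_threshold: int = 3) -> bool:
    """True once the most recent `failure_threshold` probe results all failed."""
    if len(results) < failure_threshold:
        return False
    return not any(results[-failure_threshold:])


def probe_action(kind: str, results: List[bool], failure_threshold: int = 3) -> str:
    """Map a probe history to the action described above for that probe kind."""
    if not threshold_reached(results, failure_threshold):
        return "none"
    if kind == "liveness":
        return "restart container"
    return "remove from Service endpoints"


if __name__ == "__main__":
    history = [True, False, False, False]  # True = probe succeeded
    print(probe_action("liveness", history))
    print(probe_action("readiness", history))
```

The same failure history leads to a restart in one case and only to traffic being withheld in the other, which is why a slow-starting application should fail readiness rather than liveness.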
Kubernetes mandates that every pod gets a unique cluster-wide IP and that pods can communicate with each other without NAT. The Container Network Interface (CNI) is a plugin specification; plugins like Calico, Flannel, or Cilium implement the required routing, often using VXLAN overlays, BGP, or eBPF. kube-proxy translates Service virtual IPs to pod IPs using iptables or IPVS rules; some CNI plugins, such as Cilium, can replace kube-proxy with an eBPF-based implementation.
An Ingress is an API object that defines HTTP/HTTPS routing rules (host-based, path-based) to backend Services; it requires an Ingress Controller (e.g., NGINX, Traefik) to implement the rules. A LoadBalancer Service provisions one cloud load balancer per service, which is expensive at scale; Ingress allows many services to share a single external IP with routing at the L7 layer.
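Host-based and path-based routing can be sketched as follows. The model assumes Prefix-style matching on whole path segments with the longest matching path winning; exact behaviour depends on the Ingress Controller in use. The host and Service names are hypothetical.

```python
from typing import List, Optional


def path_matches(prefix: str, path: str) -> bool:
    """Prefix match on whole path segments, so /api matches /api/v1 but not /apiary."""
    prefix_parts = [part for part in prefix.split("/") if part]
    path_parts = [part for part in path.split("/") if part]
    return path_parts[:len(prefix_parts)] == prefix_parts


def route(rules: List[dict], host: str, path: str) -> Optional[str]:
    """Return the backend Service for a request, preferring the longest matching path."""
    candidates = [rule for rule in rules
                  if rule["host"] == host and path_matches(rule["path"], path)]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: len(rule["path"]))["service"]


if __name__ == "__main__":
    rules = [
        {"host": "shop.example.com", "path": "/", "service": "web"},
        {"host": "shop.example.com", "path": "/api", "service": "api"},
    ]
    print(route(rules, "shop.example.com", "/api/v1/items"))
    print(route(rules, "shop.example.com", "/cart"))
```

Both backends sit behind one external address here, which is the cost advantage over giving each Service its own LoadBalancer.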
A PersistentVolume (PV) is a piece of storage provisioned in the cluster (statically or dynamically via a StorageClass). A PersistentVolumeClaim (PVC) is a user's request for storage specifying size and access mode; Kubernetes binds a matching PV to the PVC, decoupling pod definitions from underlying storage infrastructure.
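The binding step can be illustrated with a simplified matcher. This sketch covers static binding only and assumes a smallest-fit choice among eligible volumes; dynamic provisioning, selectors, and volume modes are left out, and all field names are invented for the example.

```python
from typing import List, Optional


def bind(pvc: dict, pvs: List[dict]) -> Optional[str]:
    """Return the name of an Available PV that satisfies the claim, or None."""
    candidates = [
        pv for pv in pvs
        if pv["status"] == "Available"
        and pv["storage_class"] == pvc["storage_class"]
        and pv["capacity_gi"] >= pvc["request_gi"]
        and set(pvc["access_modes"]) <= set(pv["access_modes"])
    ]
    if not candidates:
        return None  # The claim stays Pending.
    # Assumption: prefer the smallest volume that is large enough.
    return min(candidates, key=lambda pv: pv["capacity_gi"])["name"]


if __name__ == "__main__":
    pvs = [
        {"name": "pv-small", "status": "Available", "storage_class": "standard",
         "capacity_gi": 5, "access_modes": ["ReadWriteOnce"]},
        {"name": "pv-medium", "status": "Available", "storage_class": "standard",
         "capacity_gi": 20, "access_modes": ["ReadWriteOnce"]},
        {"name": "pv-large", "status": "Available", "storage_class": "standard",
         "capacity_gi": 100, "access_modes": ["ReadWriteOnce", "ReadOnlyMany"]},
    ]
    claim = {"storage_class": "standard", "request_gi": 10,
             "access_modes": ["ReadWriteOnce"]}
    print(bind(claim, pvs))
```

The pod only references the claim by name, so the same manifest works whether the volume behind it is a cloud disk or an NFS export.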
The Horizontal Pod Autoscaler (HPA) watches metrics (CPU, memory, or custom and external metrics exposed through the metrics APIs) and automatically adjusts the replica count of a Deployment or StatefulSet to meet a target value. The HPA controller evaluates metrics every 15 seconds by default, with resource metrics such as CPU and memory typically served by metrics-server, and scales within the configured minReplicas and maxReplicas bounds.
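The core scaling decision follows the documented formula desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The sketch below applies it together with the default 10% tolerance and the min/max bounds; it leaves out stabilization windows, scaling policies, and the handling of unready pods.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int,
                     max_replicas: int, tolerance: float = 0.1) -> int:
    """Compute the replica count the autoscaler would aim for."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas  # Close enough to target: do not scale.
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas at 90% average CPU against a 60% target.
    print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10))  # 6
```

In the example, 4 replicas at 1.5 times the target become 6, while a reading of 63% against 60% would fall inside the tolerance and leave the count unchanged.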
Role-Based Access Control restricts what API operations subjects (users, groups, ServiceAccounts) can perform. It uses Roles (namespace-scoped) or ClusterRoles (cluster-scoped) to define allowed verbs on resources, then binds them to subjects via RoleBindings or ClusterRoleBindings. Least-privilege RBAC is critical for securing multi-tenant clusters.
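The evaluation model is purely additive: a request is allowed if any bound rule permits it, and there are no deny rules. The sketch below captures that idea while ignoring API groups, resource names, and namespaces; the rule layout is a simplified stand-in for a Role's `rules` list.

```python
from typing import List


def allowed(rules: List[dict], verb: str, resource: str) -> bool:
    """True if at least one rule grants the verb on the resource ("*" is a wildcard)."""
    return any(
        (verb in rule["verbs"] or "*" in rule["verbs"])
        and (resource in rule["resources"] or "*" in rule["resources"])
        for rule in rules
    )


if __name__ == "__main__":
    # Hypothetical read-only role for pods plus full access to configmaps.
    rules = [
        {"verbs": ["get", "list", "watch"], "resources": ["pods"]},
        {"verbs": ["*"], "resources": ["configmaps"]},
    ]
    print(allowed(rules, "list", "pods"))     # True
    print(allowed(rules, "delete", "pods"))   # False
    print(allowed(rules, "get", "secrets"))   # False
```

With no rules bound at all, every request is denied, which is the starting point for a least-privilege setup.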
A DaemonSet ensures exactly one pod runs on every (or selected) node in the cluster, and automatically adds/removes pods as nodes join or leave. Common use cases include node-level log collection (Fluentd), metrics agents (Prometheus node-exporter), and CNI/network plugins that must run on every node.
An Operator extends Kubernetes with custom resources (CRDs) and a controller that encodes domain-specific operational knowledge — automating complex stateful application lifecycle tasks like backups, failover, and schema migrations. You should build one when kubectl and Helm are insufficient to manage an application's day-2 operational complexity, as popularized by tools like the Prometheus Operator or Strimzi (Kafka).
© RM Full Stack & AI Engineer