DevOps interview questions covering CI/CD, containerization, infrastructure as code, monitoring, cloud, and culture — spanning beginner to advanced levels.
DevOps is a set of practices that combines software development (Dev) and IT operations (Ops) to shorten the development lifecycle and deliver software continuously. It solves the traditional silo problem where dev and ops teams worked separately, causing slow releases, miscommunication, and deployment failures.
CI/CD stands for Continuous Integration and Continuous Delivery/Deployment. Continuous Integration means developers merge changes into a shared branch frequently, with each change built and tested automatically. Continuous Delivery means code is always in a deployable state but requires manual approval to release to production; Continuous Deployment goes one step further and automatically deploys every passing build to production without human intervention.
IaC is the practice of managing and provisioning infrastructure through machine-readable configuration files rather than manual processes. It enables version control, repeatability, and automated provisioning, reducing configuration drift and human error.
Docker containers share the host OS kernel and isolate only the application and its dependencies, making them lightweight and fast to start. Virtual machines include a full guest OS on top of a hypervisor, consuming more resources but providing stronger isolation.
A Dockerfile is a text file containing instructions to build a Docker image. Key instructions include FROM (base image), RUN (execute commands), COPY/ADD (copy files), EXPOSE (declare ports), ENV (set environment variables), and CMD/ENTRYPOINT (define the container's default process).
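Kubernetes control plane components include the API Server (cluster gateway), etcd (distributed key-value store for cluster state), Scheduler (assigns pods to nodes), and Controller Manager (maintains desired state). On the data plane, each worker node runs the kubelet (node-level agent that runs pods), kube-proxy (implements Service networking), and a container runtime. Pods run on the nodes; Services and Deployments are API objects stored in the control plane, not processes that nodes run.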
A Deployment manages stateless application replicas with rolling updates. A StatefulSet manages stateful applications (like databases) with stable network identities and persistent storage per pod. A DaemonSet ensures one pod runs on every (or selected) node, commonly used for log collectors or monitoring agents.
Blue-green deployment maintains two identical production environments (blue=current, green=new); traffic is switched to green after testing, allowing instant rollback by redirecting back to blue. Its main advantages are zero-downtime releases and the ability to quickly recover from bad deployments.
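The cutover and rollback described above can be sketched as a single pointer flip. This is a minimal illustration, not a real load balancer: the `BlueGreenRouter` class and the two URLs are hypothetical names chosen for the example.

```python
class BlueGreenRouter:
    """Tracks which of two identical environments receives live traffic."""

    def __init__(self, blue_url: str, green_url: str) -> None:
        self.environments = {"blue": blue_url, "green": green_url}
        self.live = "blue"  # blue serves production until green is verified

    def idle(self) -> str:
        """Name of the environment that is not receiving traffic."""
        return "green" if self.live == "blue" else "blue"

    def switch(self) -> str:
        """Point all traffic at the idle environment; return the new live name."""
        self.live = self.idle()
        return self.live

    def target(self) -> str:
        """URL that incoming requests should be forwarded to."""
        return self.environments[self.live]


# Hypothetical internal URLs for the two environments.
router = BlueGreenRouter("http://blue.internal", "http://green.internal")
router.switch()                   # cut over to green after it passes testing
after_cutover = router.target()
router.switch()                   # rollback is the same operation in reverse
after_rollback = router.target()
```

Because both environments stay deployed, rollback costs no more than the original cutover. The trade-off is that the infrastructure is doubled while both are running.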
Terraform is a declarative IaC tool focused on provisioning and managing cloud infrastructure resources (VMs, networks, storage). Ansible is a configuration management and orchestration tool focused on installing software and configuring existing servers; its playbooks run tasks in order, so it is usually described as procedural or imperative, although most of its modules are idempotent. They are often used together: Terraform provisions, Ansible configures.
A ConfigMap stores non-sensitive configuration data as key-value pairs that pods can consume as environment variables or mounted files. A Secret stores sensitive data (passwords, tokens, keys) in base64-encoded form, which is an encoding and not encryption, and is usually protected with tighter access controls such as RBAC. By default neither object is encrypted at rest in etcd; that requires additional configuration.
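To see why base64 encoding gives no confidentiality, the short sketch below encodes a value the way it would appear in a Secret manifest and then reverses it without any key. The password string is a made-up value for illustration.

```python
import base64

password = "s3cr3t-value"  # hypothetical secret, for illustration only

# The form the value would take in the `data` field of a Secret manifest.
encoded = base64.b64encode(password.encode("utf-8")).decode("ascii")

# Anyone who can read the manifest can reverse the encoding; no key is involved.
decoded = base64.b64decode(encoded).decode("utf-8")
```

This is why access to Secrets is normally restricted and why encryption at rest has to be configured separately.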
Monitoring tracks predefined metrics and alerts on known failure conditions (you know what to look for). Observability is a broader property of a system — achieved through logs, metrics, and traces (the three pillars) — that lets you understand unknown internal states by querying external outputs, enabling debugging of novel issues.
The Horizontal Pod Autoscaler (HPA) automatically scales the number of pod replicas in a Deployment or StatefulSet based on observed CPU/memory utilization (resource metrics, typically served by the Metrics Server) or on custom and external metrics exposed through metrics adapters. It periodically queries metrics, compares them to target values, and adjusts the replica count up or down within defined min/max bounds.
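The scaling decision can be sketched with the ratio formula given in the Kubernetes documentation, desired = ceil(current replicas × current metric / target metric), clamped to the configured bounds. The function name and parameter names below are illustrative. The 0.1 tolerance matches the documented default, but stabilization windows and scaling policies are left out.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int,
    max_replicas: int,
    tolerance: float = 0.1,
) -> int:
    """Simplified HPA calculation for a single metric."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Never go outside the configured minimum and maximum.
    return max(min_replicas, min(max_replicas, desired))


scale_up = desired_replicas(4, current_metric=90, target_metric=60,
                            min_replicas=2, max_replicas=10)
scale_down = desired_replicas(4, current_metric=30, target_metric=60,
                              min_replicas=2, max_replicas=10)
```

With 4 replicas averaging 90% CPU against a 60% target, the ratio is 1.5 and the result is 6 replicas. At 30% the ratio is 0.5 and the result is 2.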
GitOps is a paradigm where Git is the single source of truth for both application code and infrastructure configuration; automated agents (like Argo CD or Flux) continuously reconcile the live cluster state with the desired state declared in Git. Unlike traditional CI/CD which pushes changes, GitOps uses a pull-based model where the cluster pulls and applies updates, improving auditability and security.
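The reconciliation step can be sketched as a diff between the state declared in Git and the state observed in the cluster. This is a toy model under stated assumptions: both states are plain dictionaries and the resource names and image tags are invented. Real agents such as Argo CD or Flux work against the Kubernetes API and are far more involved.

```python
def reconcile(desired: dict, live: dict) -> list:
    """Return the actions needed to make `live` match `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != spec:
            actions.append(("update", name))  # drift or a new commit
    for name in sorted(live):
        if name not in desired:
            actions.append(("delete", name))  # prune what Git no longer declares
    return actions


# Hypothetical desired state, as it might be parsed from manifests in Git.
desired_state = {
    "web": {"image": "web:1.4", "replicas": 3},
    "worker": {"image": "worker:2.0", "replicas": 1},
}
# Hypothetical live state, as observed in the cluster.
live_state = {
    "web": {"image": "web:1.3", "replicas": 3},
    "cron": {"image": "cron:0.9", "replicas": 1},
}

plan = reconcile(desired_state, live_state)
```

An agent running inside the cluster would repeat this comparison on a timer or when a new commit arrives. This is what makes the model pull-based: no external pipeline needs credentials to push into the cluster.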
Use an expand-contract (parallel change) pattern: first deploy a backward-compatible schema change (expand), update the application to support both old and new schemas, then in a subsequent release drop the old schema (contract). Tools like Flyway or Liquibase manage versioned migrations, and feature flags can control application-side rollout independently of schema changes.
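The three phases can be walked through with an in-memory SQLite database. This is a sketch under stated assumptions: the `users` table and its columns are invented, and the contract step rebuilds the table instead of dropping columns so that it also runs on older SQLite versions. In practice each phase would be a separate versioned migration, for example in Flyway or Liquibase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("INSERT INTO users (first_name, last_name) VALUES ('Ada', 'Lovelace')")

# Expand: add the new column. Old application code that ignores it keeps working.
conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")

# Transition: backfill existing rows; the updated application writes both shapes.
conn.execute(
    "UPDATE users SET full_name = first_name || ' ' || last_name "
    "WHERE full_name IS NULL"
)
conn.execute(
    "INSERT INTO users (first_name, last_name, full_name) "
    "VALUES ('Alan', 'Turing', 'Alan Turing')"
)

# Contract: in a later release, once nothing reads the old columns, remove them.
conn.execute("CREATE TABLE users_new (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)")
conn.execute("INSERT INTO users_new (id, full_name) SELECT id, full_name FROM users")
conn.execute("DROP TABLE users")
conn.execute("ALTER TABLE users_new RENAME TO users")

rows = conn.execute("SELECT id, full_name FROM users ORDER BY id").fetchall()
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
```

At every point between the phases, the version of the application that is running can read and write the schema that exists. That property is what removes the need for downtime.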
In its usual configuration, TLS provides one-way authentication: only the server presents a certificate, which the client verifies. mTLS (mutual TLS) requires both parties to present and verify certificates, ensuring bidirectional identity verification. Service meshes like Istio use mTLS between sidecars to enforce zero-trust security, preventing unauthorized service-to-service communication inside a cluster.
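On the server side, the difference between the two modes comes down to whether a client certificate is demanded. The sketch below uses Python's standard `ssl` module to show only that setting. The certificate file names in the comments are hypothetical, and no network connection is made.

```python
import ssl

# Ordinary TLS server: presents its own certificate, asks nothing of the client.
tls_server = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

# mTLS server: additionally rejects clients without a certificate it can verify.
mtls_server = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
mtls_server.verify_mode = ssl.CERT_REQUIRED

# Client: verifies the server's certificate and hostname in both modes.
client = ssl.create_default_context()

# In a real deployment each side would also load key material, for example
# (file names are hypothetical):
#   mtls_server.load_cert_chain("server.crt", "server.key")
#   mtls_server.load_verify_locations("clients-ca.crt")
#   client.load_cert_chain("client.crt", "client.key")
```

In a service mesh, the sidecar proxies perform the equivalent configuration and certificate rotation automatically, so application code does not handle certificates directly.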
Requests are the guaranteed resources a container needs; the scheduler uses them to find a suitable node. Limits cap the maximum resources a container can consume; exceeding CPU limits causes throttling, while exceeding memory limits causes the container to be OOM-killed. Kubernetes assigns QoS classes (Guaranteed, Burstable, BestEffort) based on whether requests equal limits or are unset, affecting eviction priority under resource pressure.
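The QoS rules can be sketched as a small classifier. It is simplified to a pod with a single container, and it compares quantity strings literally (so "500m" and "0.5" would not be treated as equal, although Kubernetes considers them the same). The function name is illustrative.

```python
def qos_class(requests: dict, limits: dict) -> str:
    """Classify a single-container pod from its cpu/memory requests and limits."""
    if not requests and not limits:
        return "BestEffort"
    # When only a limit is set, Kubernetes defaults the request to the limit.
    effective_requests = {**limits, **requests}
    guaranteed = all(
        resource in limits and effective_requests.get(resource) == limits[resource]
        for resource in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"


best_effort = qos_class({}, {})
guaranteed = qos_class(
    {"cpu": "500m", "memory": "256Mi"},
    {"cpu": "500m", "memory": "256Mi"},
)
burstable = qos_class({"cpu": "250m"}, {"cpu": "500m", "memory": "256Mi"})
```

Under node pressure, BestEffort pods are the first candidates for eviction and Guaranteed pods the last, so setting requests equal to limits is the usual way to protect critical workloads.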
A canary deployment routes a small percentage of traffic to a new version while the majority continues to the stable version, allowing real-user validation before full rollout. In Kubernetes it can be approximated with two Deployments behind one Service, where the traffic split follows the ratio of replicas, or controlled more precisely with a service mesh (Istio/Linkerd), an ingress controller such as NGINX, or a progressive delivery controller such as Argo Rollouts that supports traffic-weight rules.
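A traffic-weight rule can be sketched as deterministic bucketing. Each request identifier is hashed into one of 100 buckets, and the lowest buckets go to the canary. The function name and the `user-N` identifiers are assumptions for illustration; meshes and ingress controllers implement their own weighting internally.

```python
import hashlib


def pick_version(request_id: str, canary_percent: int) -> str:
    """Route a request to 'canary' or 'stable' based on a stable hash bucket."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # value in 0..99
    return "canary" if bucket < canary_percent else "stable"


# Simulate 10,000 hypothetical users with a 10% canary weight.
counts = {"canary": 0, "stable": 0}
for i in range(10_000):
    counts[pick_version(f"user-{i}", 10)] += 1
```

Hashing instead of drawing a random number keeps a given user on the same version across requests, which makes canary metrics easier to interpret. Raising `canary_percent` step by step gives a gradual rollout.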
Use minimal base images (distroless or Alpine), run containers as a non-root user, avoid storing secrets in image layers, pin base image versions, scan images with tools like Trivy or Snyk, use multi-stage builds to exclude build tooling from final images, and enable Docker Content Trust for image signing and verification.
© RM Full Stack & AI Engineer