Blue-green deployment is a release strategy that reduces downtime and risk by running two identical production environments, switching traffic between them to ship new versions safely and enable instant rollbacks.
Blue-green deployment maintains two parallel production environments called 'blue' (the current live version) and 'green' (the new version being released). At any moment only one environment serves real user traffic. The other environment is idle or being prepared with the next release. The names are arbitrary labels, and the roles swap with every release: once green goes live, the next version is prepared on blue. What matters is the concept of two mirrored, switchable environments.
Traditional in-place deployments risk downtime and difficult rollbacks if something goes wrong after release. Blue-green deployments make the cutover near-instantaneous and make rollback as simple as redirecting traffic back to the previous environment. Because recovery is a traffic redirect rather than a rebuild and redeploy, mean time to recovery (MTTR) drops, which gives teams confidence to deploy more frequently.
A router, load balancer, or DNS record sits in front of both environments and determines which one receives live traffic. When the green environment has been deployed, tested, and validated, the router is updated to point 100% of traffic to green. Green is now live and blue becomes idle. How long the switch takes depends on the mechanism: a load balancer or router rule change typically takes effect within milliseconds to seconds, while a DNS change only reaches clients as cached records expire, which is governed by the record's TTL and can take minutes or longer.
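The switch described above can be sketched as a single pointer that names the live environment. This is a minimal in-memory illustration, not a real load balancer API: the `BlueGreenRouter` class and the backend URLs are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class BlueGreenRouter:
    """Toy stand-in for a load balancer rule (hypothetical, for illustration)."""

    backends: Dict[str, str]  # environment name -> backend address
    live: str = "blue"        # environment currently serving users

    @property
    def idle(self) -> str:
        """Name of the environment that is not receiving live traffic."""
        return "green" if self.live == "blue" else "blue"

    def route(self) -> str:
        """Return the backend address that live traffic is sent to."""
        return self.backends[self.live]

    def switch(self) -> str:
        """Point 100% of traffic at the idle environment and return its name."""
        self.live = self.idle
        return self.live


router = BlueGreenRouter(
    backends={
        "blue": "http://blue.internal:8080",    # assumed address
        "green": "http://green.internal:8080",  # assumed address
    }
)
```

Calling `router.switch()` once sends traffic to green, and calling it again is the rollback: the same operation in the opposite direction.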
Because the green environment is running but not yet public, teams can run automated smoke tests or manual checks against it using a direct URL or an internal preview route. Only after those tests pass is the router flipped, ensuring users never hit an untested build. This pre-production validation is one of the strategy's most powerful safety nets.
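As a rough sketch of that gate, the function below checks a list of paths on the green environment and only reports it safe to switch when every check returns HTTP 200. The function names, the paths, and the `fetch_status` callable are assumptions made for this example; in practice `fetch_status` would wrap whatever HTTP client the team already uses.

```python
from typing import Callable, Iterable, List


def failed_smoke_checks(
    base_url: str,
    paths: Iterable[str],
    fetch_status: Callable[[str], int],
) -> List[str]:
    """Return the URLs that did not answer with HTTP 200."""
    failures = []
    for path in paths:
        url = base_url.rstrip("/") + path
        try:
            status = fetch_status(url)
        except Exception:
            # A connection error counts as a failed check.
            failures.append(url)
            continue
        if status != 200:
            failures.append(url)
    return failures


def safe_to_switch(
    base_url: str,
    paths: Iterable[str],
    fetch_status: Callable[[str], int],
) -> bool:
    """Green may go live only when every smoke check passes."""
    return not failed_smoke_checks(base_url, paths, fetch_status)


# Stand-in for a real HTTP client so the sketch runs without a network.
def fake_fetch_status(url: str) -> int:
    return 500 if url.endswith("/checkout") else 200
```

With the fake client above, a check list containing `/checkout` blocks the switch, while a list containing only `/health` allows it.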
The biggest gotcha in blue-green deployments is shared state, particularly databases. If blue and green share the same database, schema migrations must be backward-compatible with both versions simultaneously. For example, you cannot drop a column until the old version is fully decommissioned, because the old code still reads and writes it. Strategies like expand-and-contract migrations (add the new column first, backfill data, then remove the old column in a later release) are essential to making this work safely.
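The expand and backfill steps can be illustrated with an in-memory SQLite database. This is only a sketch under assumed names: the `users` table and the `fullname` and `display_name` columns are hypothetical, and a real migration would normally go through the team's migration tool rather than raw SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema as the old (blue) version expects it.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO users (fullname) VALUES ('Ada Lovelace')")

# Expand: add the new column as nullable so blue's queries keep working.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# Blue is still live and writes rows without knowing about the new column.
conn.execute("INSERT INTO users (fullname) VALUES ('Grace Hopper')")

# Backfill: copy existing data into the new column.
conn.execute(
    "UPDATE users SET display_name = fullname WHERE display_name IS NULL"
)

# Both versions can now read the shared table.
old_rows = conn.execute("SELECT fullname FROM users ORDER BY id").fetchall()
new_rows = conn.execute("SELECT display_name FROM users ORDER BY id").fetchall()

# Contract (a later release, after blue is decommissioned): drop `fullname`.
# Deliberately not executed here, because blue would break if it ran now.
```

Note that rows written by blue after the backfill would still have an empty `display_name`, so a real rollout usually repeats the backfill or has the application write both columns until the contract step.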
Keep blue online for a short 'bake period' after green goes live so you can roll back instantly if a subtle bug surfaces in production. Automate environment provisioning with infrastructure-as-code tools like Terraform or AWS CloudFormation to ensure blue and green are truly identical. Monitor error rates and latency immediately after the switch using observability tooling, and define a clear rollback threshold so the decision to revert is objective rather than reactive.
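One way to make the rollback decision objective is to encode the threshold as data and compare post-switch metrics against it. The sketch below assumes the metrics (error count, request count, p95 latency) are already collected by the observability tooling; the class name, function name, and threshold values are illustrative, not recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackThreshold:
    """Limits agreed on before the release (values are illustrative)."""

    max_error_rate: float       # fraction of failed requests, e.g. 0.02 = 2%
    max_p95_latency_ms: float   # 95th percentile latency in milliseconds


def should_roll_back(
    errors: int,
    requests: int,
    p95_latency_ms: float,
    threshold: RollbackThreshold,
) -> bool:
    """Return True when green's metrics breach either limit."""
    if requests == 0:
        # No traffic observed yet, so there is no evidence either way.
        return False
    error_rate = errors / requests
    return (
        error_rate > threshold.max_error_rate
        or p95_latency_ms > threshold.max_p95_latency_ms
    )


threshold = RollbackThreshold(max_error_rate=0.02, max_p95_latency_ms=500.0)
```

Evaluating this check on a schedule during the bake period turns "should we revert?" into a yes-or-no answer that was defined before the release started.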
© RM Full Stack & AI Engineer