What Is Blue/Green Deployment and Automating Blue/Green in Kubernetes
Blue/green deployment is a software deployment approach that helps organizations deploy frequent updates while maintaining high quality and a smooth user experience. The blue/green deployment method helps to minimize the risk of introducing flaws during software updates while limiting downtime during the transition to new versions.
This model uses two similar production environments (blue and green) to release software updates. The blue environment runs the existing software version, while the green environment runs the new version. Only one environment is live at any time, receiving all production traffic. Once the new version passes the relevant tests, it is safe to transfer the traffic to the new environment. If something goes wrong, traffic is switched to the previous version.
How Does a Blue/Green Deployment Process Work?
The main prerequisite for a blue/green deployment is having two identical production environments, with a router, load balancer, or service mesh that can switch traffic between them.
The blue/green deployment process works as follows:
Deploy new version—deploy the new (green) version alongside the current (blue) version. Test it to ensure it works as expected, and deploy changes to it if needed.
Switch over traffic—when the new version is ready, switch all traffic over from blue to green. This should be done seamlessly so end users aren’t interrupted.
Monitor—closely monitor how users interact with the new version and watch out for errors and issues.
Deploy or rollback—if there is a problem, immediately roll back by switching traffic back to the blue version. Otherwise, keep traffic on the green version and continue using it. The green version now becomes the blue (current) version, and a new version can be deployed alongside it as the “new green” version.
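The four steps above can be sketched as a small simulation. This is an illustrative model only: the `Router` class and the `release` function are hypothetical names standing in for a real load balancer and release pipeline, and the two boolean inputs stand in for real test and monitoring results.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy stand-in for a load balancer that sends all traffic to one environment."""
    live: str = "blue"
    history: list = field(default_factory=list)

    def switch_to(self, environment: str) -> None:
        # Remember the previous target so a rollback is a single step.
        self.history.append(self.live)
        self.live = environment

    def rollback(self) -> None:
        # Point traffic back at whatever was live before the last switch.
        if self.history:
            self.live = self.history.pop()


def release(router: Router, green_passes_tests: bool, green_healthy_in_prod: bool) -> str:
    """Run the four steps and return the environment that ends up live."""
    # Step 1: green is deployed alongside blue and tested. On failure, nothing is switched.
    if not green_passes_tests:
        return router.live
    # Step 2: switch all traffic over to green.
    router.switch_to("green")
    # Steps 3 and 4: monitor, then either keep green or roll back to blue.
    if not green_healthy_in_prod:
        router.rollback()
    return router.live
```

For example, `release(Router(), True, False)` models a release that passes its tests but misbehaves under live traffic, and ends with blue live again.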
Blue/Green Deployment: Use Cases and Benefits
Here are a few scenarios in which blue/green deployments can be useful.
For product owners working in a CI/CD framework, a blue/green deployment is a great way to move software into production quickly. DevOps teams can release software at any time, without having to schedule a weekend or after-hours release. These deployments do not negatively impact users as there is no associated downtime.
Blue/green deployments also make updates easier for DevOps teams. There is no need to rush updates during deployments, which can lead to errors and unnecessary stress.
Blue/green deployments provide an easy way to roll back to a safe, working version in case something goes wrong. This reduces the risks inherent in experimenting in a production environment. Teams can easily eliminate issues with simple routing changes and return to a stable production environment.
If the application is stateful, there is the risk of disrupting ongoing transactions when rolling back. One way to mitigate this issue is to make the application read-only during the transition. Another way is to use a load balancer or service mesh to perform a rolling transition, waiting for each transaction to complete and only then routing the user back to the blue version.
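The second mitigation, moving each user back only after their transaction completes, might look roughly like the sketch below. It assumes the router can see which users still have an open transaction on green. The function name and the shape of the `in_flight` mapping are hypothetical.

```python
from typing import Dict


def route_during_rollback(in_flight: Dict[str, bool]) -> Dict[str, str]:
    """Map each user to an environment while traffic drains from green back to blue.

    in_flight maps a user id to True if that user still has an open transaction on green.
    """
    routes = {}
    for user, has_open_transaction in in_flight.items():
        # Users with an open transaction stay on green until it completes.
        # Everyone else is routed to blue right away.
        routes[user] = "green" if has_open_transaction else "blue"
    return routes
```

Calling the function again after a transaction completes moves that user to blue, so green drains gradually instead of being cut off mid-transaction.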
Testing in Production
No matter how much effort goes into making the staging and production environments identical, there are usually subtle differences. This can lead to exceptions and bugs that may not be discovered until a new version is pushed to production. A blue/green deployment lets DevOps teams test new code in a real production environment—checking for last-minute issues and verifying performance. It is then possible to migrate traffic seamlessly to the new version.
Another potential use case for blue/green deployments is A/B testing. In this use case, a new version of the code is loaded into the green environment and a fraction (typically 50%) of user traffic is sent to the green version instead of the original blue version. You can then monitor key metrics in both environments to determine the impact of the new version.
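One common way to split traffic for an A/B test is to hash a stable user identifier, so that each user consistently sees the same version. The sketch below is a minimal illustration under that assumption. The function name and the `green_percent` parameter are hypothetical, and in practice a load balancer or service mesh would usually apply the split.

```python
import hashlib


def assign_environment(user_id: str, green_percent: int = 50) -> str:
    """Deterministically assign a user to 'green' (new) or 'blue' (current)."""
    # Hash the user id so the same user always lands in the same environment.
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in the range 0..99
    return "green" if bucket < green_percent else "blue"
```

Setting `green_percent` to 0 or 100 sends all users to one environment, which makes the same function usable for a full switchover.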
Blue/Green Deployment Challenges
While blue/green deployments are highly beneficial, there are a few concerns to watch out for:
Setup time and effort—setting up a blue/green deployment is complex, risky, and may need to be repeated several times to work properly. In a Kubernetes environment, you can use dedicated tools like Argo Rollouts that have built-in blue/green deployment functionality.
Cold starts—users can experience performance issues when they are switched to a new environment. Other unexpected issues can also appear at the cutover point from one environment to the other. This can be alleviated by running warm-up routines and stress testing before the switch.
Cost—a blue/green deployment requires doubling the production environment. When running on-premises, this requires purchasing more equipment. In the cloud, it can mean paying double for the infrastructure.
Schema migration—database migration is a difficult task. In general, blue/green deployments do not support changes to the database schema. Even if the schema stays the same, it can be difficult to keep data in sync. Common strategies are using replication or making the database read-only during the transition.
User transactions—when traffic moves from blue to green or vice versa, transactions can be aborted. One way to deal with this is to show an error and ask the user to retry their transaction, but this is a poor user experience. A better way is to send all transactions to both environments in parallel. This requires cleaning up duplicate data after the deployment.
Shared services—if the application depends on an external database or any other third-party or legacy service, these services can leak information between blue and green environments. If one environment indirectly affects the other, this can interfere with the deployment and make testing unreliable.
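For the user transactions challenge above, sending every transaction to both environments leaves duplicate records to clean up afterwards. A minimal sketch of that cleanup follows, assuming each record carries a unique transaction identifier. The `transaction_id` field name is hypothetical.

```python
def deduplicate(records):
    """Keep the first record seen for each transaction id, preserving order."""
    seen = set()
    unique = []
    for record in records:
        if record["transaction_id"] not in seen:
            seen.add(record["transaction_id"])
            unique.append(record)
    return unique
```

This only works if both environments record the same identifier for the same transaction, which is why the identifier is best generated before the request is duplicated.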
Blue/Green Deployment Best Practices
Here are some best practices for implementing a blue/green deployment model.
Use Database Versioning
Database versioning helps organizations keep track of their application databases and address challenges such as mismatched data between environments or missing data in one of the databases. Blue/green deployment requires multiple database instances, which can easily become unsynchronized.
Effective database versioning should include these practices:
Document all changes in version control—treat the application database schema and related reference data as regular code, storing them in a source control system.
Ensure a separate database instance for each developer—developers should each have their own database instances to help avoid collisions between incompatible programming changes.
Decouple schema and code changes—separating schema changes from code changes enables the relational database to serve both environments, as it can reside outside the defined blue/green environment boundary.
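As a small illustration of decoupling schema and code changes, the sketch below applies a backward-compatible schema change (adding a nullable column) before any code depends on it, so that old and new code can share one database. It uses an in-memory SQLite database purely for demonstration. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Schema change shipped on its own, before any code that needs it:
# a new nullable column, which old code can safely ignore.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")

# Old (blue) code names only the columns it knows about, so it keeps working.
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

# New (green) code can start using the extra column.
conn.execute(
    "INSERT INTO users (name, email) VALUES (?, ?)", ("bob", "bob@example.com")
)

rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
```

Destructive changes, such as dropping or renaming a column, would break the old version and are typically deferred until it has been retired.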
Leverage Feature Flags
Feature flags let developers easily switch features on or off, reducing the engineering required for feature testing. Developers can leverage feature flags to control feature access and integrate customer feedback into their features.
In a blue/green deployment, feature flags are useful for testing features in the reserve environment without disrupting the live environment. They allow organizations to assess user feedback and switch off any unsuccessful feature.
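A feature flag can be as simple as a named boolean checked at runtime. The sketch below is a minimal in-memory version. The `FeatureFlags` class and the `new_checkout` flag are hypothetical, and real systems usually load flags from a configuration service so they can be changed without redeploying.

```python
class FeatureFlags:
    """Minimal in-memory feature flag store."""

    def __init__(self, flags=None):
        self._flags = dict(flags or {})

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to off, so new code paths stay dark until enabled.
        return self._flags.get(name, False)

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled


def checkout_page(flags: FeatureFlags) -> str:
    # The new behavior ships in the same build but only runs when the flag is on.
    if flags.is_enabled("new_checkout"):
        return "new checkout flow"
    return "classic checkout flow"
```

Switching off an unsuccessful feature is then a call such as `flags.set("new_checkout", False)` rather than a new deployment.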
Applying chaos engineering in a non-production environment helps an organization avoid introducing risks into the production environment while experimenting and testing the reliability of the production system. Blue/green deployments help isolate problems from production so they don’t impact end users.
While safer, chaos engineering in a testing environment may be limited compared to a live environment. For example, the lack of actual users may limit the engineer’s ability to test some aspects of the application. However, most configuration-related aspects are testable even with pretend users.
These practices can help optimize seamless switching between environments in a blue/green deployment:
Monitor the environments—organizations must monitor both environments in a blue/green deployment, making sure to establish a clear separation to avoid confusion. It should be easy to switch between environments and filter non-critical alerts.
Ensure compatibility of code versions—successful rolling updates require different code versions to co-exist. Mini releases can help ensure that code runs successfully side-by-side during updates and avoid downtime.
Use load balancing or service meshes—load balancers distribute workloads across multiple resources to enable more efficient computing. A load balancer can immediately route traffic to different servers. In a Kubernetes environment, a service mesh provides fine-grained control over traffic routing between resources.
Automate where possible—automation makes processes quicker and easier and can reduce the risk of human error, especially for repetitive tasks.
Can You Do a Blue-Green Deployment in Kubernetes?
Many development teams are moving to Kubernetes as their environment for development, testing, and production deployments. Kubernetes can make application deployment easier, and can also be used to automate blue/green deployments.
However, Kubernetes does not provide blue/green deployments out of the box. It offers the Deployment object, which enables “rolling updates”. This makes it possible to update an application with zero downtime, by gradually replacing pods with a new version of an application.
This is similar to blue/green deployment, but does not provide all its benefits. In a rolling deployment, it can be difficult to roll back if something is wrong with the new version. It also takes time to roll out the application, while a blue/green pattern offers instant switchover.
DevOps teams can customize Kubernetes and implement blue/green deployment using several techniques and tools. One popular tool is Argo Rollouts, a controller that provides advanced deployment capabilities for blue/green deployments and other progressive delivery types.
Kubernetes Blue Green Deployment with Argo Rollouts
Argo Rollouts is a progressive delivery controller for Kubernetes. It supports several deployment strategies including blue/green and canary deployments.
Argo Rollouts provides a new Kubernetes object called a Rollout—which is similar to a Deployment, but with additional parameters that allow teams to specify advanced deployment strategies.
Here is how a Rollout object automatically executes a blue/green deployment:
Users provide a Kubernetes service that routes to the current (blue) version and, optionally, another service that routes to the new (green) version.
The Rollout controller uses ReplicaSets to deploy an initial version of the application and routes traffic to it by injecting a unique hash of the ReplicaSet into the service selectors.
When there is an update to the application, developers change the Rollout template (for example, to include a different container image). The Rollout object detects the change and creates a new ReplicaSet with the new version.
The active service continues to point at the old ReplicaSet while the new one is being deployed.
When the new ReplicaSet becomes available, the controller automatically modifies the active service to route to the new ReplicaSet.
The controller waits for a time period, configured in the Rollout template, and then scales down the old ReplicaSet.
The following is an example template for a Rollout object that performs blue/green deployment:
[...] # this part of the template is identical to a regular Deployment
The template uses these three parameters to control the behavior of the blue/green deployment:
activeService—the service that should route to the current (blue) version. This is the initial ReplicaSet deployed by the controller.
previewService (optional)—the service that should route to the new (green) version before it is promoted. This makes it possible to preview the new version without making it accessible to production traffic.
autoPromotionEnabled—if this is set to false, the new version is not immediately promoted. An operator can promote it manually using the command kubectl argo rollouts promote ROLLOUT. Otherwise, by default, the Rollout instantly switches over traffic when the new version is ready.
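As a rough illustration of where these three parameters sit in a Rollout object, the sketch below builds a manifest as a Python dictionary and serializes it to JSON. It is not the template referred to above: the helper function, the application name, the image, the replica count, and the service names are all hypothetical placeholders. The `argoproj.io/v1alpha1` API version and the `strategy.blueGreen` fields follow the Argo Rollouts specification.

```python
import json


def blue_green_rollout(name, image, active_service, preview_service=None, auto_promote=True):
    """Build a Rollout manifest that uses the blueGreen strategy."""
    blue_green = {
        "activeService": active_service,
        "autoPromotionEnabled": auto_promote,
    }
    # previewService is optional, so include it only when one is given.
    if preview_service is not None:
        blue_green["previewService"] = preview_service

    labels = {"app": name}
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Rollout",
        "metadata": {"name": name},
        "spec": {
            "replicas": 2,  # placeholder value
            "selector": {"matchLabels": labels},
            # The pod template has the same shape as in a regular Deployment.
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
            "strategy": {"blueGreen": blue_green},
        },
    }


manifest = blue_green_rollout(
    name="demo-app",                 # hypothetical application name
    image="example/demo:1.0",        # hypothetical image
    active_service="demo-active",    # hypothetical Service names
    preview_service="demo-preview",
    auto_promote=False,
)
manifest_json = json.dumps(manifest, indent=2)
```

Because kubectl accepts JSON as well as YAML, the resulting document could be saved to a file and applied with `kubectl apply -f`, provided the Argo Rollouts controller is installed and both services already exist in the cluster.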
Blue/Green Deployments in Kubernetes: A General Process
The current version of the application is marked with a color (e.g. blue)
A new deployment is performed with brand new pods and is marked with the new color (e.g. green)
Both versions exist at the same time, but the Kubernetes service is still pointing at the existing (blue) version, so live users cannot see the change yet
Different types of tests (e.g. smoke tests) can be performed on the new version with no impact to existing users
After a user-defined amount of time, the Kubernetes service is switched and now points to the new version. All live users can now use the new functionality without any downtime
If the new version works as expected, the old version is destroyed. The new version becomes the “current version” and the Kubernetes service stays as is
If the new version has issues, the Kubernetes service is switched back to the previous version. This has minimal impact on users. The new version is destroyed and everything is back to the original state
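The switch in the steps above comes down to changing the label selector on the Kubernetes service. The sketch below only builds the `kubectl patch` command as an argument list and does not run it. It assumes the deployments are distinguished by a single label, and the function name, label key, and resource names are hypothetical.

```python
import json


def switch_service_command(service, namespace, version_label, new_version):
    """Build a kubectl command that points a Service at pods with the given version label."""
    # Merging this patch into the Service changes only the version entry of its selector.
    patch = {"spec": {"selector": {version_label: new_version}}}
    return [
        "kubectl", "patch", "service", service,
        "--namespace", namespace,
        "--patch", json.dumps(patch),
    ]


# Switch to green, and keep the reverse command ready in case a rollback is needed.
promote = switch_service_command("my-app", "default", "version", "green")
rollback = switch_service_command("my-app", "default", "version", "blue")
```

The list could be passed to `subprocess.run` in a real pipeline. Keeping the rollback command symmetrical to the promotion is what makes reverting a matter of seconds.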
Automated Blue/Green Deployments With Codefresh: Quick Overview
Kubernetes already comes with the basic building blocks (deployments and services) that make a blue/green deployment possible using plain kubectl commands. The challenge for a sound CI/CD solution is how to automate those kubectl commands so that blue/green deployments happen in a well controlled and repeatable manner.
Let’s see how to package these kubectl invocations into a Docker image that offers a declarative way to do blue/green deployments.
The end goal is that in order to deploy using blue/green you can just insert the following build step in your codefresh.yml:
Here blue/green deployments happen in a completely declarative manner. All kubectl commands are abstracted.
The Blue/Green deploy step is essentially a docker image with a single executable that takes the following parameters as environment variables:
Name of your cluster in Codefresh dashboard
Existing K8s service
Existing k8s deployment
Docker tag for the next version of the app
How many seconds both colors should coexist. After that, the pods of the new version are checked for restarts
K8s Namespace where deployments happen
Prerequisites for Automated Blue/Green Deployments with Codefresh
The blue/green deployment step makes the following assumptions:
An initial service and the respective deployment should already exist in your cluster.
The service and the deployment need to use labels that define their version.
The second assumption is very important, as this is how the blue/green step detects the current version and can switch the load balancer to the next version.
You can use anything you want as a “version”, but the recommended approach is to use Git hashes and tag your Docker images with them. In Codefresh, this is very easy because the built-in variable CF_SHORT_REVISION gives you the git hash of the commit that was pushed.
The build step that creates the Docker image used in the blue/green step is a standard build step that tags the image with the Git hash.
Once you meet the prerequisites above, running a blue/green deployment in Codefresh is as simple as modifying your application template and triggering a deployment in Codefresh.
The Codefresh pipeline step displays the following output:
The blue/green step copies your existing deployment and changes its version, creating a second one with the updated Docker image. Note: At this point, both versions (old and new) of your application are deployed in the Kubernetes cluster. All live traffic is still routed to the old application.
There is a waiting period (configurable as an environment parameter, shown in the previous section). During this period you are free to do any external checks on your own (e.g. check your health dashboard or run some kind of smoke testing).
Once the waiting period is over, the script checks the number of restarts in the pods of the new application. If any pod has restarted, the script destroys the new deployment and the cluster is rolled back to its initial state. Your users are not affected in any way.
If there are no pod restarts, the service is switched to point to the new deployment and the old deployment is discarded.
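The decision made at the end of the waiting period can be sketched as a small function. This is not the actual Codefresh script, only an illustration of the rule described above. The function name and the shape of the `restart_counts` mapping (pod name to restart count) are assumptions.

```python
def decide(restart_counts):
    """Return 'promote' if no new pod restarted during the wait, otherwise 'rollback'."""
    # Any restart in the new deployment is treated as a failed health check.
    if any(count > 0 for count in restart_counts.values()):
        return "rollback"
    return "promote"
```

A restart count is a coarse health signal: it catches crash loops but not, for example, a version that stays up while returning errors, which is why the waiting period is also a chance to run external checks.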
You can also see the changes in the Codefresh Kubernetes dashboard. The following example uses an Azure Kubernetes cluster, but any cluster will work as long as the labels are present in the manifest files.
And there you have it! Now you can deploy your own application using the blue/green strategy.
The blue/green Docker image is also available in Docker Hub.
Learn Progressive Delivery in the Codefresh GitOps Certification Program
Codefresh is offering a new certification program that can help you adopt GitOps in your organization. Within the course, we cover:
What is progressive delivery
How to use Argo Rollouts for blue/green and canary deployments
How to manage Secrets with GitOps
The course includes a live exercise that can help you learn progressive delivery hands on, including how to install the Argo Rollouts controller, release an application with blue/green deployments, and monitor deployment progress.