Codefresh is a Docker-native CI/CD platform that builds and runs thousands of Docker images.
We operate hundreds of Docker clusters with complex business logic built in. As a platform, Codefresh uses dozens of microservices, each updated independently, sometimes several times a day.
Why did we decide to move to Kubernetes?
Kubernetes (aka K8s) was our first choice since we are heavy users of Google Cloud, GCR, Firebase, and other services. Our product is completely cloud agnostic, which allows us to run customer clusters on AWS and Google. From the beginning, we received great support from Google and from the K8s community. After researching the communities, capabilities, and roadmaps, we decided to go with Kubernetes because of its powerful scalability, stability, and networking features. Beyond that, we had a few specific considerations.
We wanted a standard deployment mechanism – We had automated our microservice deployment using Chef and Ansible, but it quickly became clear that some of the features we wanted, like zero downtime, health checks, and auto scaling, had bugs that often blocked us from pushing our code and caused delays.
We didn’t want our DevOps team to be the only ones who could manage deployment – At Codefresh, we push ten times a day, and it was hard to depend on a DevOps team. Our developers were not familiar with Chef and Ansible, so we wanted a flexible DevOps environment in which deployment could be customized by more than just our DevOps engineers.
We wanted the ability to scale through a standard API – For any microservice-based platform, it’s crucial to be able to scale and auto-heal at the microservice level. Kubernetes provides a great way to do this through its API and the command line.
We also wanted the ability to monitor microservices – Though we used New Relic to monitor the code, it did not work at the microservice level. K8s, on the other hand, has an easy-to-use dashboard.
Every new microservice required customization – Because deployment was defined in Chef scripts, every new microservice required actual coding, and that was mostly done by DevOps engineers.
We wanted to be able to get deployment history – so we could use it for rollbacks, problem isolation, and other activities.
We wanted the ability to easily re-create a production environment – Previously, we used Docker Compose, which was very different from how our production environment actually looks. While it could be helpful in some situations, it often failed to deliver and was not good enough for staging. We wanted to create new environments on demand using a standard and simple API. We can now create a full standalone version of Codefresh per feature branch, which eliminates the bottlenecks around running integration tests on one shared environment (like staging).
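One way to picture an environment per feature branch is to give each branch its own Kubernetes namespace. The sketch below is only an illustration under that assumption: the `cf` prefix, the `purpose` label, and the function names are hypothetical and are not taken from the Codefresh setup. It turns a git branch name into a valid namespace name and builds the matching Namespace object.

```python
import json
import re


def namespace_for_branch(branch: str, prefix: str = "cf") -> str:
    """Turn a git branch name into a valid Kubernetes namespace name.

    Namespace names must be DNS-1123 labels: lowercase alphanumerics and '-',
    starting and ending with an alphanumeric, at most 63 characters.
    """
    # Collapse every run of disallowed characters into a single '-'.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    name = f"{prefix}-{slug}" if slug else prefix
    # Truncate to the length limit and make sure we do not end on '-'.
    return name[:63].rstrip("-")


def namespace_manifest(branch: str) -> dict:
    """Build a Namespace object suitable for `kubectl apply -f -` as JSON."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": namespace_for_branch(branch),
            # Hypothetical label, used here only to mark throwaway environments.
            "labels": {"purpose": "feature-environment"},
        },
    }


if __name__ == "__main__":
    print(json.dumps(namespace_manifest("feature/New_Login-Page"), indent=2))
```

Everything a branch needs can then be deployed into that namespace and removed again by deleting the namespace once the branch is merged.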
How did we do it?
- We created a K8s cluster in GCE, ran practical training, and deployed demo microservices to cover all use cases and learn the terminology.
Step 1 – Microservice Definition + Deployment scripts
We reviewed our entire deployment architecture, existing microservices, configuration, and security rules, and decided how to map them to K8s Pods and Services. Kubernetes’ rich deployment options (such as fine-grained control over the deployment strategy, readiness probes, etc.) helped us replace a complex in-house solution for achieving zero-downtime deployments with a much simpler and more readable configuration.
For every microservice, we added a K8s deployment descriptor file that included both the Service and Pod definitions.
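To make the idea of a per-microservice descriptor concrete, here is a minimal sketch that generates one. It is an assumption-laden example, not the actual Codefresh descriptor: the service name, image, `/health` path, port numbers, and probe timings are all hypothetical, and it uses the current `apps/v1` Deployment API. The rolling-update settings and the readiness probe are the two pieces that give zero-downtime behavior.

```python
import json


def deployment_and_service(name: str, image: str, port: int, replicas: int = 2) -> dict:
    """Return a v1 List holding a Deployment and its Service for one microservice."""
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # Start one new Pod at a time and never drop below the desired count.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                            # A Pod receives traffic only after this check passes.
                            "readinessProbe": {
                                "httpGet": {"path": "/health", "port": port},
                                "initialDelaySeconds": 5,
                                "periodSeconds": 10,
                            },
                        }
                    ]
                },
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        # The Service selects the Pods created by the Deployment above.
        "spec": {"selector": labels, "ports": [{"port": 80, "targetPort": port}]},
    }
    return {"apiVersion": "v1", "kind": "List", "items": [deployment, service]}


if __name__ == "__main__":
    descriptor = deployment_and_service("demo-api", "example/demo-api:1.0.0", 8080)
    print(json.dumps(descriptor, indent=2))
```

The printed JSON can be saved next to the microservice source and applied with `kubectl apply -f`, since kubectl accepts JSON as well as YAML.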
Step 2 – Secrets and Configuration
Configuration management – We examined the application configuration and split it into two segments: sensitive data, which is stored using Kubernetes Secrets, and regular data, which is stored using ConfigMaps.
We wanted to be able to track all configuration changes and quickly revert to previous versions. To achieve that, we decided to encrypt and store all the configuration in a centralized repository, and use Codefresh’s pipelines to automate the process of pushing the changes to Kubernetes.
Our team doesn’t change the Secrets or ConfigMaps directly through Kubernetes’ API or CLI. Instead, we use git to track and encrypt the changes and Codefresh’s pipelines to release them to production.
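The split between sensitive and regular configuration described in this step can be sketched as follows. This is a hedged example only: the key names, the `-config` and `-secrets` suffixes, and the `split_config` helper are hypothetical, and the encryption of the repository itself is not shown. Note that the base64 step is just the encoding Kubernetes expects in a Secret's `data` field; it is not encryption.

```python
import base64
import json


def split_config(name: str, config: dict, sensitive_keys: set) -> tuple:
    """Split a flat config dict into a ConfigMap and a Secret manifest."""
    # ConfigMap values must be strings.
    plain = {k: str(v) for k, v in config.items() if k not in sensitive_keys}
    # Secret `data` values must be base64-encoded strings.
    secret = {
        k: base64.b64encode(str(v).encode("utf-8")).decode("ascii")
        for k, v in config.items()
        if k in sensitive_keys
    }
    config_map = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": f"{name}-config"},
        "data": plain,
    }
    secret_manifest = {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": f"{name}-secrets"},
        "type": "Opaque",
        "data": secret,
    }
    return config_map, secret_manifest


if __name__ == "__main__":
    # Hypothetical configuration for a hypothetical service.
    app_config = {"LOG_LEVEL": "info", "PORT": 8080, "DB_PASSWORD": "hunter2"}
    cm, sec = split_config("demo-api", app_config, {"DB_PASSWORD"})
    print(json.dumps(cm, indent=2))
    print(json.dumps(sec, indent=2))
```

Generating both objects from one source file keeps them in step, and a pipeline can apply the output so that nobody edits Secrets or ConfigMaps by hand.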
Steps 3-4 – Load Balancing
We replaced our previous complex load-balancing solution with a simple ingress controller descriptor file.
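For readers who have not seen one, an ingress descriptor is mostly a routing table from hosts and paths to Services. The sketch below shows the general shape using the current `networking.k8s.io/v1` API; the host name, service names, and ports are made up for the example and do not describe the real Codefresh routing.

```python
import json


def ingress_manifest(name: str, host: str, routes: dict) -> dict:
    """Build an Ingress that routes URL path prefixes on `host` to Services.

    `routes` maps a path prefix to a (service name, service port) pair.
    """
    paths = [
        {
            "path": path,
            "pathType": "Prefix",
            "backend": {"service": {"name": service, "port": {"number": port}}},
        }
        for path, (service, port) in routes.items()
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {"rules": [{"host": host, "http": {"paths": paths}}]},
    }


if __name__ == "__main__":
    # Hypothetical routing table: path prefix -> (Service, port).
    routes = {"/api": ("demo-api", 80), "/": ("demo-ui", 80)}
    print(json.dumps(ingress_manifest("demo-ingress", "app.example.com", routes), indent=2))
```

An ingress controller running in the cluster reads objects like this one and configures the actual load balancer.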
Step 5 – Automation and Testing
The Codefresh team uses Codefresh for building, testing, and deploying Codefresh (dog food!), so we built a Kubernetes deployment pipeline that pushes microservices from the master branch to a dedicated namespace. The mechanism we use to deploy internally is the same one that provides out-of-the-box K8s deployments to our users.
We ran load testing and migrated ourselves to K8s before exposing it to our customers.
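The deploy step of such a pipeline can be pictured as a short sequence of `kubectl` commands. The sketch below only builds the command lines and does not run them. It is not the actual Codefresh pipeline: the service name, image tag, and namespace are hypothetical, and it assumes the container inside each Deployment has the same name as the service.

```python
import shlex


def deploy_commands(service: str, image: str, namespace: str) -> list:
    """Commands that roll a new image out to one Deployment and wait for it."""
    target = f"deployment/{service}"
    return [
        # Point the container at the freshly built image (triggers a rollout).
        ["kubectl", "set", "image", target, f"{service}={image}", "--namespace", namespace],
        # Block until the rollout finishes; exits non-zero if it fails.
        ["kubectl", "rollout", "status", target, "--namespace", namespace],
    ]


def rollback_command(service: str, namespace: str) -> list:
    """Command that reverts a Deployment to its previous revision."""
    return ["kubectl", "rollout", "undo", f"deployment/{service}", "--namespace", namespace]


if __name__ == "__main__":
    for cmd in deploy_commands("demo-api", "example/demo-api:abc123", "cf-master"):
        print(shlex.join(cmd))
    print(shlex.join(rollback_command("demo-api", "cf-master")))
```

A pipeline step could pass each list to `subprocess.run(cmd, check=True)` and fall back to the rollback command when the status check fails.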
Step 6 – Training Kube Day
- All developers did an extended practical session with Kubernetes so they could easily push and do rollbacks, hotfixes, etc.
Once we had tested everything, we migrated all of our customer pipelines and environment backends into K8s.
Where are we today?
One of the most positive results of migrating our deployment to K8s is that we have had almost no issues with deployment itself. Everything was standard and stable. Our ability to keep our production environment in a predictable state has dramatically improved.
We are able to create a full Codefresh environment in a few seconds with 100% automation. This dramatically improves our ability to get feedback early, run integration tests, etc.