7 Steps to K8s: How We Moved Codefresh onto Kubernetes

Why Kubernetes?

Codefresh is a Docker-native CI/CD platform that builds and runs thousands of Docker images.

We operate hundreds of Docker clusters with complex business logic built in. As a platform, Codefresh uses dozens of microservices, each updated independently, sometimes multiple times per day.

Why did we decide to move to Kubernetes?

Kubernetes (aka K8s) was our first choice since we are heavy users of Google Cloud, GCR, Firebase, and other services. Our product is completely cloud agnostic, which allows us to run customer clusters on AWS and Google. From the beginning, we received great support from Google and from the K8s community. After researching the communities, capabilities, and roadmaps, we decided to go with Kubernetes because of its powerful scalability, stability, and networking features. Beyond that, we had a few specific considerations.

We wanted a standard deployment mechanism – We had automated our microservice deployment using Chef and Ansible, but it quickly became clear that some of the features we wanted, such as zero downtime, health checks, and auto scaling, had bugs that often blocked us from pushing our code and caused delays.

We didn’t want our DevOps team to be the only ones who could manage deployment – At Codefresh, we push ten times a day, and it was hard to depend on a DevOps team for every push. Our developers were not familiar with Chef and Ansible, so we wanted a flexible environment in which DevOps engineers were not the only ones who could customize deployment.

We wanted the ability to scale through a standard API – For any microservice-based platform, it’s crucial to be able to scale and auto-heal at the microservice level. Kubernetes provides a great way to do this through its API and the command line.

We also wanted the ability to monitor microservices – Though we used New Relic to monitor the code, it did not work at the microservice level. K8s, on the other hand, has an easy-to-use dashboard.

Every new microservice required customization – Because deployment was defined in Chef scripts, every new microservice required actual coding, and that was mostly done by DevOps engineers.

We wanted to be able to get deployment history – so we could use it for rollbacks, problem isolation, and other activities.

We wanted the ability to easily re-create a production environment – Previously, we used Docker Compose, which was very different from how our production environment actually looked. While it could be helpful in some situations, it often failed to deliver; it was not good enough for staging. We wanted to create new environments on demand using a standard and simple API. We can now create a full standalone version of Codefresh per feature branch, which eliminates the bottlenecks around doing integration tests on one environment (like staging).

How did we do it?

Preparation:

We created a K8s cluster in GCE, ran hands-on training, and deployed demo microservices to cover all our use cases and learn the terminology.

Step 1 – Microservice Definition + Deployment scripts

We reviewed our entire deployment architecture, existing microservices, configuration, and security rules, and decided how to map them to K8s Pods and Services. Kubernetes’ rich deployment options (such as fine-grained control over the deployment strategy, readiness probes, etc.) helped us replace a complex in-house solution for achieving zero-downtime deployments with a much simpler and more readable configuration.

For every microservice, we added a K8s deployment descriptor file that included both the Service and Pod definitions.
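A minimal sketch of what such a descriptor might contain, written in Python so it can be generated and dumped as JSON (which kubectl accepts alongside YAML). It is an illustration, not the actual Codefresh descriptor: the service name, image, port, `/health` path, and probe timings are all hypothetical. It shows the two pieces mentioned above, a rolling-update strategy and a readiness probe, bundled with a Service in one file.

```python
import json


def build_deployment(name, image, port, replicas=3):
    """Return a Deployment manifest configured for a zero-downtime rollout."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                # Never drop below the desired replica count during a rollout.
                "rollingUpdate": {"maxUnavailable": 0, "maxSurge": 1},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        # A Pod receives traffic only after this probe passes.
                        # The /health path is an assumption for this sketch.
                        "readinessProbe": {
                            "httpGet": {"path": "/health", "port": port},
                            "initialDelaySeconds": 5,
                            "periodSeconds": 10,
                        },
                    }],
                },
            },
        },
    }


def build_service(name, port):
    """Return a Service that selects the Pods created by build_deployment."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": port}],
        },
    }


def build_descriptor(name, image, port):
    """Bundle both objects in a List so one file describes the microservice."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [build_deployment(name, image, port), build_service(name, port)],
    }


if __name__ == "__main__":
    # Hypothetical microservice; pipe the output to `kubectl apply -f -`.
    print(json.dumps(build_descriptor("demo-api", "example/demo-api:1.0.0", 8080), indent=2))
```

Setting `maxUnavailable` to 0 means old Pods are removed only once their replacements report ready, which is the behavior a zero-downtime deployment depends on.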

Step 2 – Secrets and Configuration

Configuration management – We examined the application configuration and split it into two segments: sensitive data, stored in Kubernetes Secrets, and regular data, stored in ConfigMaps.
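The split described above can be sketched roughly as follows. This is an assumption about how one might do it, not the Codefresh implementation: the `split_config` helper, the key names, and the `-secrets` / `-config` naming are hypothetical. The one Kubernetes detail it relies on is that values under a Secret's `data` field must be base64-encoded, while ConfigMap values are plain strings.

```python
import base64


def split_config(settings, sensitive_keys, name):
    """Split a flat settings dict into a Secret and a ConfigMap manifest."""
    secret_data = {}
    plain_data = {}
    for key, value in settings.items():
        if key in sensitive_keys:
            # Secret `data` values must be base64-encoded strings.
            encoded = base64.b64encode(str(value).encode("utf-8"))
            secret_data[key] = encoded.decode("ascii")
        else:
            # ConfigMap `data` values must be strings.
            plain_data[key] = str(value)

    secret = {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "Opaque",
        "metadata": {"name": name + "-secrets"},
        "data": secret_data,
    }
    config_map = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name + "-config"},
        "data": plain_data,
    }
    return secret, config_map


if __name__ == "__main__":
    # Hypothetical settings for a hypothetical service.
    settings = {"DB_PASSWORD": "s3cret", "LOG_LEVEL": "info", "PORT": 8080}
    secret, config_map = split_config(settings, {"DB_PASSWORD"}, "demo-api")
    print(sorted(secret["data"]), sorted(config_map["data"]))
```

Note that base64 is an encoding, not encryption, so the generated Secret manifest is still sensitive and should not be committed in plain form.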

We wanted to be able to track all configuration changes and to have a quick way to revert to previous versions. To achieve that, we decided to encrypt and store all the configuration in a centralized repository, and use Codefresh’s pipelines to automate the process of pushing the changes to Kubernetes.

Our team doesn’t change the Secrets or ConfigMaps directly through Kubernetes’ API or CLI. Instead, we use git to track and encrypt the changes and Codefresh’s pipelines to release them to production.

Steps 3-4 – Load Balancing

We replaced our previous, complex load-balancing solution with a simple ingress controller descriptor file.
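For a sense of how small such a descriptor can be, here is a sketch that builds an Ingress manifest in Python. The host, paths, and service names are hypothetical, and it targets the current `networking.k8s.io/v1` API rather than whatever API version was in use when this migration took place, so treat the exact field layout as an assumption about a modern cluster.

```python
import json


def build_ingress(name, host, routes):
    """Return an Ingress manifest.

    `routes` maps a URL path prefix to a (service_name, port) tuple.
    """
    paths = [
        {
            "path": prefix,
            "pathType": "Prefix",
            "backend": {"service": {"name": service, "port": {"number": port}}},
        }
        # Sort for a stable, diff-friendly output order.
        for prefix, (service, port) in sorted(routes.items())
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {"rules": [{"host": host, "http": {"paths": paths}}]},
    }


if __name__ == "__main__":
    # Hypothetical routing table: UI at the root, API under /api.
    ingress = build_ingress(
        "demo",
        "app.example.com",
        {"/": ("demo-ui", 80), "/api": ("demo-api", 8080)},
    )
    print(json.dumps(ingress, indent=2))
```

The routing rules live in one declarative object, and the ingress controller running in the cluster is responsible for turning them into actual load-balancer configuration.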

Step 5 – Automation and Testing

The Codefresh team uses Codefresh for building, testing, and deploying Codefresh (dog food!), so we built a Kubernetes deployment pipeline that pushes microservices from the master branch to a dedicated namespace. The same mechanism we use to deploy internally also provides out-of-the-box K8s deployments to our users.
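Deploying each branch to its own namespace requires turning a branch name into a valid namespace name. The sketch below shows one way a pipeline step might do that; it is an assumption, not the Codefresh pipeline itself, and the `cf` prefix and the `example.com/branch` annotation key are hypothetical. Namespace names must be lowercase alphanumerics and hyphens, at most 63 characters, and must start and end with an alphanumeric character.

```python
import re


def namespace_for_branch(branch, prefix="cf"):
    """Turn a git branch name into a valid Kubernetes namespace name."""
    # Replace every run of characters outside [a-z0-9] with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    name = f"{prefix}-{slug}" if slug else prefix
    # Enforce the 63-character limit and an alphanumeric last character.
    return name[:63].rstrip("-")


def build_namespace(branch):
    """Return a Namespace manifest for the given branch."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": namespace_for_branch(branch),
            # Annotation values may hold arbitrary strings, so the raw
            # branch name (slashes included) is recorded here.
            "annotations": {"example.com/branch": branch},
        },
    }


if __name__ == "__main__":
    for branch in ("master", "feature/Add_Login"):
        print(branch, "->", namespace_for_branch(branch))
```

Two long branch names that share their first 60 or so characters would map to the same namespace under this scheme, so a real pipeline might append a short hash of the branch name.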

We ran load testing and migrated ourselves to K8s before exposing it to our customers.

Step 6 – Training: Kube Day

All developers did an extended practical session with Kubernetes so they could easily push and do rollbacks, hotfixes, etc.

Once we had tested everything, we migrated all of our customer pipelines and environment backends into K8s.

Where are we today?

One of the most positive results of migrating our deployment to K8s is that we have had almost no issues with deployment itself. Everything has been standard and stable. Our ability to keep the production environment in a predictable state has improved dramatically.

We are able to create a full Codefresh environment in a few seconds with 100% automation. This dramatically improves our ability to get feedback early, run integration tests, etc.

3 thoughts on “7 Steps to K8s: How We Moved Codefresh onto Kubernetes”

Liran Tal says:

July 2, 2017 at 7:38 pm

Thanks for sharing your story, really interesting how organizations move through this journey!

I’m specifically interested in the configuration management part.
Do developers ship configuration with code to your git repository, but the configuration files/items are specifically encrypted?
1. How do you manage the encryption? Does each developer get their own cert to sign it?
2. How do you automate the distribution/provisioning of the configuration?
Once code is merged there’s a hook to decrypt the config file/items and provision it on ConfigMaps?
3. How do you deal with conflicting changes? E.g., 2 different devs open 2 pull requests, but they both change a specific config item from one thing to another (say it’s a config object with some keys). On the same topic, how do you revert, now that the configuration state is coupled to the deployed version? An older version will look for different config data.

That came out long, but I’m in the midst of working on a solution for configuration management as well 🙂

Thanks again!

Oleg Verhovsky says:

July 3, 2017 at 9:58 am

Hi, thanks for raising those points.

We see configuration updates as a code change, which means it requires a pull request, code review, and release creation.

We have an automatic pipeline that takes a specific release (we are using Helm for this) and updates the environment. As part of the environment update, it might update a ConfigMap and run a rolling update.

There is an option to listen to config changes and restart the pods, but we use it in very specific cases.

We have a pipeline in Codefresh (we use Codefresh for our Docker CI/CD) that runs releases on a specific environment. So basically, when a developer merges a pull request, it will trigger the pipeline that will update both code and configuration.

The conflict problem in this case is the same as when developers work on the same piece of code. But maybe I didn’t fully understand the challenge?

Will be happy to elaborate, feel free to contact me on [email protected]

Liran Tal says:

July 3, 2017 at 8:20 pm

Thanks for replying! I’ll shoot an e-mail 🙂