“Codefresh has quite literally changed the way we work, and the results are incredible.”
When their legacy tech stack started slowing down software delivery, Codefresh helped reduce friction and accelerate how fast this social gaming company delivered value to their players.
The technological backdrop
Recently, this social gaming company (referred to henceforth as ‘the company’ to maintain its privacy) had been moving away from some of the more painful parts of its legacy tech stack. These included a custom-built infrastructure-as-code tool that was difficult to maintain and a neglected Jenkins server set up in the company’s early days.
Games are developed in a mix of TypeScript/Node.js/Fargate and Java/EC2. Most builds ran in CircleCI, with deployments handled by Jenkins.
The challenge
The company found that its deployments were getting bigger and bigger, taking a long time and carrying a lot of risk.
The self-hosted Jenkins server had been set up by people who had long since left the company and, as a result, had been neglected for some time. The pipelines had become bloated and fragile and, with no sysadmin or operations staff tending to them, would only continue to deteriorate. This not only exposed the company to risk but also caused many pipeline failures during deployment. Each failure required escalation to software engineers, who then had to log in to Jenkins and figure out which of the many scripts had failed and why. This long-winded process severely held up production and reduced confidence.
“Like any young startup, we’re always under pressure to get features out and to secure our market share. This obviously doesn’t leave much time to work on things like deployment pipelines and infrastructure-as-code,” says Luke, Senior Site Reliability Engineer. “Once we’d reached the point where the product was proven and we had customers, it was time to go back and review all the other key parts of maintaining a healthy product.”
“Our top priority was to address our brittle deployment pipelines, processes, and the operational risks imposed on us by our neglected Jenkins server.”
The company preferred off-the-shelf SaaS products to self-managed infrastructure and quickly concluded that, even if the issues with Jenkins could be remediated, it still wasn’t the right fit. CircleCI was briefly considered since it was already in use for builds, but its authentication model made it difficult to give access to non-engineering staff such as Producers and Quality Control. It was clear that a new tool was needed.
The solution
Following a broad feature comparison of CI/CD solutions on the market including DroneCI, Harness, and Azure Pipelines, they selected Codefresh for a proof of concept and put together several deployment pipelines to test the solution.
The PoC went well, so they signed up for an enterprise account and began rolling out Codefresh with the recently rebuilt lobby service. The lobby team was soon carrying out both builds and deployments with Codefresh and very quickly saw significant improvements in lead time, release frequency, and deployment failure rates.
“Because Codefresh is entirely based on containers, it was a good fit for us and has helped us make better use of containers across all parts of our stack.
“Codefresh encourages good container practices; it’s helped us rethink how we build and deploy software in a repeatable and reusable way. For example, we’ve cleaned up a lot of our Dockerfiles to take advantage of multi-stage Docker builds and maximize our use of Codefresh’s Docker build caching. These small systemic changes are improving the way we work every single day.”
The result
Since moving to Codefresh, the company’s lobby team has gone from deploying once a day to deploying on demand, often up to 8 or 9 times a day.
The bottleneck now lies in how fast engineers and testers can build and test new features as opposed to deployment processes and brittle pipelines. Now that releases are much smaller and happening more frequently, change failure rates have dropped significantly as well.
The team’s engineers report that container build time is down to 3-4 minutes in Codefresh, compared to 6-8 minutes in CircleCI. Recently, an urgent request from a senior executive to roll out a new feature, which would ordinarily have taken a few days, was fulfilled in just 90 minutes. They credit this huge time saving to Codefresh and the improved processes put in place during implementation.
With an ambitious goal of shutting down the Jenkins server by the end of 2020, they are now following a general strategy that anything new or in need of updating happens in Codefresh, with a view to achieving full migration across the company’s multiple services as soon as possible.
“Our lobby team is doing everything in Codefresh now, and we’re really happy with the results,” says Luke.
“The metrics are extremely encouraging, and we’re moving full steam ahead to migrate our other services over, beginning with the games and platform teams, so we can get everything into Codefresh ASAP.”