Using Codefresh with monorepos


Today Codefresh has released a completely redesigned Git trigger dialog with several new options for controlling the cases that trigger a pipeline.

New options include:

  • The ability to select specific pull request events. If you always wanted a pipeline that only runs when a pull request is opened, you can easily do this in the GUI now (available only for GitHub repositories)
  • The ability to trigger only if the target of a pull request follows a specific naming pattern. Common examples here would be master or production so that this pipeline is triggered only when a team member is trying to merge back to the respective branch (available for all Git providers apart from Atlassian Stash)
  • The ability to trigger a pipeline only if the changed files match a specific naming pattern (available for GitHub and GitLab repositories)

The last feature is especially interesting because this means that you can now control exactly which pipelines trigger according to which files are affected by each commit. This is a game-changing feature with several implications, the biggest one being the easy management of monorepos in Codefresh.

At the moment this feature is available to Codefresh projects that use GitHub, GitLab, Azure DevOps or Bitbucket Server as their Git provider.

Microservices, monorepos, multi-repos and atomic commits

Traditionally, applications developed using the monolith approach were also placed in a single Git repository. This approach was a natural fit given the large size of the project. A developer could simply check out the whole project at once, make changes to any module and also deploy the whole application locally.
This approach was very convenient for both humans and systems as it was very simple to implement and presented a holistic view of the whole application. By checking out a single Git repository you had full access to all aspects of the application.

For really big projects, however, using a single Git repository was sometimes problematic given the size of the code. Checking out code could be really slow, teams working on different modules had to keep track of one another’s pull requests and builds, and in general the scalability of operations suffered. Still, because the application was always deployed as a single entity, using a single Git repository was the most obvious answer.

More recently, companies have started moving their applications to the cloud, and the introduction of containers in the form of Docker has completely changed the way an application can be deployed. Instead of deploying everything as a single entity, individual components are deployed on their own (microservices), which allows for easy upgrades and, most importantly, easy scaling of the application.

With the appearance of microservices, developers had to face the same question of code organization. Again, the most natural choice was to split the different applications into several Git repositories. A team could now own a specific repository where:

  • Commits,
  • Pull requests,
  • Deployments,
  • and auto-scaling decisions

would happen in an independent manner.

Having a different Git repository for each microservice works well for several companies. There are some cases, however, where having multiple, completely separate repositories is cumbersome for the following reasons:

  • It is hard to get a good overview of the system as a whole
  • It is difficult to know which version of each microservice depends on which versions of the others
  • It leads to excessive code duplication
  • It makes “atomic commits” (commits that change the API in multiple modules) a nightmare

This realization is especially true for companies that are now experimenting with serverless architectures. Having multiple functions reside in the same repository is much more straightforward even when in practice each function is deployed on its own.

For specific projects, it therefore makes sense to have a monorepo per service. Each module/function is still deployed in an individual manner (and thus all scalability benefits at runtime are still present), but since all of them exist in the same repository, it is very simple for a developer to check out the whole service at once.

This hybrid approach, where a single Git repository holds multiple modules/functions, is unofficially called ‘monorepo’.

Using monorepos with a traditional CI solution is very challenging. Excess builds happen all the time because multiple people are working on the same repository. Pull requests become stale because they were created against a revision that quickly becomes obsolete.

Limiting triggering of builds to specific folders

To truly gain the benefits of a monorepo, the CI system should be able to trigger pipelines only when changes happen in specific folders. This way each individual project in a monorepo builds only when its own files change.

Codefresh offers this capability today. In the trigger of each pipeline, you can define a glob expression that will map to the project files. Only when matched files change will the pipeline trigger.

Here are some examples of glob expressions:

**/package.json
**/Dockerfile*
my-subproject/**
my-subproject/sub-subproject/package.json
my-subproject/**/pom.xml
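
To make the matching rules concrete, here is a minimal Python sketch of how such expressions could be evaluated against the files changed in a commit. It is an illustration only and not Codefresh's actual matcher: the functions `glob_to_regex` and `should_trigger` are hypothetical, and the sketch assumes that `*` stays within a single path segment while `**` spans any number of folders (including none).

```python
import re


def glob_to_regex(pattern):
    """Translate a simplified glob into a compiled regex (assumed semantics)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:[^/]+/)*")  # zero or more whole folders
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".*")           # anything, across folders
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^/]*")        # anything inside one path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def should_trigger(pattern, changed_files):
    """Return True if at least one changed file matches the glob."""
    regex = glob_to_regex(pattern)
    return any(regex.match(path) for path in changed_files)


# Hypothetical list of files touched by one commit.
commit = ["my-subproject/api/pom.xml", "docs/intro.md"]
print(should_trigger("my-subproject/**/pom.xml", commit))  # True
print(should_trigger("**/package.json", commit))           # False
```

With these assumed semantics, a pipeline whose expression is `my-subproject/**` would run for any change under that folder and stay idle for commits that only touch other folders.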

This means that you can now define pipelines within Codefresh that only compile/run/deploy each individual microservice even though all of them exist in the same Git repository.
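
As a rough illustration of the idea, the following Python sketch decides which pipelines a commit would start when every microservice lives in its own folder. The pipeline names, the folder layout and the `pipelines_for_commit` helper are all hypothetical; in Codefresh itself this mapping is expressed through the glob expression on each pipeline's trigger rather than in code.

```python
# Hypothetical mapping of pipeline name -> folder of the microservice it builds.
PIPELINE_FOLDERS = {
    "build-billing": "services/billing",
    "build-catalog": "services/catalog",
    "build-frontend": "frontend",
}


def pipelines_for_commit(changed_files):
    """Return the sorted names of pipelines whose folder contains a changed file."""
    triggered = set()
    for path in changed_files:
        for pipeline, folder in PIPELINE_FOLDERS.items():
            # Equivalent in spirit to the glob "<folder>/**".
            if path.startswith(folder + "/"):
                triggered.add(pipeline)
    return sorted(triggered)


commit = ["services/billing/src/invoice.py", "README.md"]
print(pipelines_for_commit(commit))  # ['build-billing']
```

A commit that touches only the README starts nothing, while a commit that spans two service folders starts exactly the two matching pipelines.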

This technique not only cuts down the number of builds happening during day-to-day development but also opens several other possibilities on the granularity of your Codefresh builds.

Other scenarios for limiting builds to specific changed files

Glob expressions also allow you to match individual files instead of just folders. This capability lets you run a build only when a specific file changes. Some examples would be:

  • Only trigger a build if a specific Dockerfile changes
  • Only trigger a build if a specific package.json/pom.xml/Gemfile changes

The first example is particularly interesting because it allows an organization to keep a big repo with “blessed” Dockerfiles and only build them when they actually change.
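
The “blessed” Dockerfiles scenario can be sketched in Python as follows. This is only an assumption-laden illustration: it supposes a repository layout with one folder per image, and the helpers `changed_dockerfiles` and `image_name` are hypothetical names, not part of Codefresh.

```python
from pathlib import PurePosixPath


def changed_dockerfiles(changed_files):
    """Return the changed paths whose file name starts with 'Dockerfile'."""
    return [f for f in changed_files
            if PurePosixPath(f).name.startswith("Dockerfile")]


def image_name(dockerfile_path):
    """Derive a hypothetical image name from the folder holding the Dockerfile."""
    return PurePosixPath(dockerfile_path).parent.name or "root"


# Hypothetical commit in a repository of shared base images.
commit = [
    "images/node-12/Dockerfile",
    "images/node-12/README.md",
    "images/jdk-11/Dockerfile.alpine",
]
print([image_name(f) for f in changed_dockerfiles(commit)])  # ['node-12', 'jdk-11']
```

Only the images whose Dockerfile actually changed are selected for a rebuild; the README change on its own would select nothing.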

How will you use the modified files field?

You can read the full documentation here: https://codefresh.io/docs/docs/configure-ci-cd-pipeline/triggers/git-triggers/#using-the-modified-files-field-to-constrain-triggers-to-specific-folderfiles

5 thoughts on “Using Codefresh with monorepos”

  1. Interesting article,
    we are thinking of creating a monorepo and putting all our services in it. The problem is releasing: we currently release a new version of our service by creating a Git release, which then triggers a Codefresh pipeline that does the build, test and deploy. If all of our services are in the same repo, a release of one service will include a release of the other services, although they didn’t change. Codefresh has a “MODIFIED FILES” filter on the trigger, but it does not apply to the release event, just to push/commit. Any suggestions here?

    1. In order for the “modified files” filter to work, we need the information about what changed from the Git provider.
      If the Git provider doesn’t send this info on Git releases, then Codefresh cannot apply this filter.

      My suggestion is to decouple Git tags/releases. Just create a release for a new version in an ad-hoc manner. Creating a Git release should not be a requirement.

  2. Hi, we have a monorepo, but we store artifacts in it (for version control and GitOps), so the size of the repo has become large (accessed by Git LFS). So, we have moved each folder (service) into its own Git submodule. Can Codefresh triggers access folders in those submodules?

    Thanks

    1. Hello

      It depends on your Git provider. I would say in most cases the answer is yes. Please contact our support or sales team for more information.
