
How to Model Your GitOps Environments and Promote Releases between Them

17 min read

Two of the most important questions that people ask themselves on day 2 after adopting GitOps are:

  1. How should I represent different environments on Git?
  2. How should I handle promoting releases between environments?

In the previous article of the series, I focused on what NOT to do and explained why using Git branches for different environments is a bad idea. I also hinted that the “environment-per-folder” approach is a better idea. This article has proved hugely popular and several people wanted to see all the details about the suggested structure for environments when folders are used.

In this article, I am going to explain how to model your GitOps environments using different folders on the same Git branch, and as an added bonus, how to handle environment promotion (both simple and complex) with simple file copy operations.

GitOps promotion

Hopefully this article will help with the endless stream of questions and discussions on this hot topic.

Learn your application first

Before creating your folder structure you need to do some research first and understand the “settings” of your application. Even though several people talk about application configuration in a generic manner, in reality not all configuration settings are equally important.

In the context of a Kubernetes application, we have the following categories of “environment configuration”:

  1. The application version, in the form of the container tag used. This is probably the most important setting in a Kubernetes manifest (as far as environment promotions are concerned). Depending on your use case, you might get away with simply changing the version of the container image. However, often a new change in the source code also requires a new change in the deployment environment.
  2. Kubernetes specific settings for your application. This includes the replicas of the application and other Kubernetes related information such as resource limits, health checks, persistent volumes, affinity rules, etc.
  3. Mostly static business settings. This is the set of settings that are unrelated to Kubernetes but have to do with the business side of your application. It might be external URLs, internal queue sizes, UI defaults, authentication profiles, etc. By “mostly static,” I mean settings that are defined once for each environment and never change afterwards. For example, you always want your production environment to use production.paypal.com and your non-production environments to use staging.paypal.com. This is a setting that you never want to promote between environments.
  4. Non-static business settings. This is the same thing as the previous point, but it includes settings that you DO want to promote between environments. This could be a global VAT setting, your recommendation engine parameters, the available bitrate encodings, and any other setting that is specific to your business.

It is imperative that you understand what all the different settings are and, more importantly, which of them belong to category 4 as these are the ones that you also want to promote along with your application version.

This way you can cover all possible promotion scenarios:

  1. Your application moves from version 1.34 to 1.35 in QA. This is a simple source code change. Therefore you only need to change the container image property in your QA environment.
  2. Your application moves from version 3.23 to 3.24 in Staging. This is not a simple source code change. You need to update the container image property and also bring the new setting “recommender.batch_size” from QA to staging.

I see too many teams that don’t understand the distinction between different configuration parameters and have a single configuration file (or mechanism) with values from different areas (i.e. both runtime and application business settings).

Once you have the list of your settings and which area they belong to, you are ready to create your environment structure and optimize the file copy operations for the settings that change a lot and need to be moved between environments.

Example with 5 GitOps environments and variations between them

Let’s see an actual example. I thought about doing the classic QA/Staging/Production trilogy, but this is rather boring so let’s dive into a more realistic example.

We are going to model the environment situation mentioned in the first article of the series. The company that we will examine has 5 distinct environments:

  • Load Testing
  • Integration Testing
  • QA
  • Staging
  • Production

Then let’s assume that the last 2 environments are also deployed to EU, US, and Asia while the first 2 also have GPU and Non-GPU variations. This means that the company has a total of 11 environments.

You can find the suggested folder structure at https://github.com/kostis-codefresh/gitops-environment-promotion. All environments are different folders in the same branch. There are NO branches for the different environments. If you want to know what is deployed in an environment, you simply look at envs/ in the main branch of the repo.

Before we explain the structure, here are some disclaimers:

Disclaimer 1: Writing this article took a long time because I wasn’t sure if I should cover Kustomize or Helm or plain manifests. I chose Kustomize as it makes things a bit easier (and I also mention Helm at the end of the article). Note however that the Kustomize templates in the example repo are simply for illustration purposes. The present article is NOT a Kustomize tutorial. In a real application, you might have Configmap generators, custom patches and adopt a completely different “component” structure than the one I am showing here. If you are not familiar with Kustomize, spend some time understanding its capabilities first and then come back to this article.

Disclaimer 2: The application I use for the promotions is just a dummy, and its configuration omits several best practices, mainly for brevity and simplicity. For example, some deployments are missing health checks, and all of them are missing resource limits. Again, this article is NOT about how to create Kubernetes deployments. You should already know how proper deployment manifests look. If you want to learn more about production-grade best practices, then see my other article at https://codefresh.io/kubernetes-tutorial/kubernetes-antipatterns-1/

With the disclaimers out of the way, here is the repository structure:

GitOps folder structure
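In text form, the layout is roughly the following (folder names are inferred from the examples in this article; see the example repository for the exact structure):

```
base/                 # manifests common to all environments
variants/             # reusable "mixins" (Kustomize components)
  prod/
  non-prod/
  eu/
  us/
  asia/
  ...                 # e.g. gpu / non-gpu
envs/                 # one folder per final environment (11 in total)
  qa/
  staging-asia/
  prod-eu/
  ...
```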

The base directory holds configuration that is common to all environments. It is not expected to change often. If you want to make changes to multiple environments at the same time, it is best to use the “variants” folder.

The variants folder (a.k.a. mixins, a.k.a. components) holds common characteristics between environments. It is up to you to define what exactly is “common” between your environments after researching your application as discussed in the previous section.

In the example application, we have variants for all prod and non-prod environments and also the regions. Here is an example of the prod variant that applies to ALL production environments.

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-deployment
spec:
  template:
    spec:
      containers:
      - name: webserver-simple
        env:
        - name: ENV_TYPE
          value: "production"
        - name: PAYPAL_URL
          value: "production.paypal.com"   
        - name: DB_USER
          value: "prod_username"
        - name: DB_PASSWORD
          value: "prod_password"                     
        livenessProbe:
          httpGet:
            path: /health
            port: 8080

In the example above, we make sure that all production environments use the production DB credentials, the production payment gateway, and a liveness probe (this is a contrived example; see disclaimer 2 at the start of this section). These settings belong to the set of configuration that we don’t expect to promote between environments; we expect them to stay static across the application lifecycle.
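Note that for an env kustomization to reference a variant under components:, each variant folder also needs its own kustomization file declared as a Kustomize Component. A minimal sketch (the patch file name is an assumption):

```yaml
# variants/prod/kustomization.yml (sketch; patch file name assumed)
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component

patchesStrategicMerge:
- prod.yml
```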

With the base and variants ready, we can now define every final environment with a combination of those properties.

Here is an example of the staging ASIA environment:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: staging
namePrefix: staging-asia-

resources:
- ../../base

components:
  - ../../variants/non-prod
  - ../../variants/asia

patchesStrategicMerge:
- deployment.yml
- version.yml
- replicas.yml
- settings.yml

First we define some common properties. We inherit all configuration from the base, from the non-prod variant, and from the Asia variant.

The key point here is the patches that we apply. The version.yml and replicas.yml are self-explanatory. They only define the image and replicas on their own and nothing else.

The version.yml file (which is the most important thing to promote between environments) defines only the image of the application and nothing else.

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-deployment
spec:
  template:
    spec:
      containers:
      - name: webserver-simple
        image: docker.io/kostiscodefresh/simple-env-app:2.0

The associated settings for each release that we DO expect to promote between environments are also defined in settings.yml:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-deployment
spec:
  template:
    spec:
      containers:
      - name: webserver-simple
        env:
        - name: UI_THEME
          value: "dark"
        - name: CACHE_SIZE
          value: "1024kb"
        - name: PAGE_LIMIT
          value: "25"
        - name: SORTING
          value: "ascending"    
        - name: N_BUCKETS
          value: "42"         

Feel free to look at the whole repository to understand the way all kustomizations are formed.

Performing the initial deployment via GitOps

To deploy an application to its associated environment, just point your GitOps controller to the respective “env” folder and kustomize will create the complete hierarchy of settings and values.

Here is the example application as it runs in Staging/Asia

GitOps application example

You can also use Kustomize on the command line to preview what is going to be deployed for each environment. Examples:

kustomize build envs/staging-asia
kustomize build envs/qa
kustomize build envs/integration-gpu

You can of course pipe the output to kubectl to deploy each environment, but in the context of GitOps, you should always let your GitOps controller deploy your environments and avoid manual kubectl operations.
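For example, assuming Argo CD as the GitOps controller, the “pointer” is an Application manifest like the sketch below. The repoURL and path come from the example repository; the application name, destination cluster, and namespace are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: staging-asia        # placeholder name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/kostis-codefresh/gitops-environment-promotion
    targetRevision: main
    path: envs/staging-asia # point the controller at the env folder
  destination:
    server: https://kubernetes.default.svc   # placeholder cluster
    namespace: staging
  syncPolicy:
    automated: {}           # sync on every commit that touches the folder
```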

Comparing the configuration of two environments

A very common need for a software team is to understand what is different between two environments. I have seen several teams with the misconception that only branches let you easily find differences between environments.

This could not be further from the truth. You can easily use mature file-diffing utilities to find what is different between environments just by comparing files and folders.

The simplest way is to diff only the settings that are critical to the app.

vimdiff envs/integration-gpu/settings.yml envs/integration-non-gpu/settings.yml
GitOps settings diff

And with the help of kustomize, you can compare any number of whole environments for the full picture:

kustomize build envs/qa/ > /tmp/qa.yml
kustomize build envs/staging-us/ > /tmp/staging-us.yml
kustomize build envs/prod-us/ > /tmp/prod-us.yml
vimdiff /tmp/staging-us.yml /tmp/qa.yml /tmp/prod-us.yml
GitOps environment diff

I personally don’t see any disadvantage of this method compared to performing “git diff” between environment branches.

How to perform promotions between GitOps environments

Now that the file structure is clear, we can finally answer the age-old question: “How do I promote releases with GitOps?”

Let’s see some promotion scenarios. If you have been paying attention to the file structure, you should already understand how all promotions resolve to simple file copy operations.

Scenario: Promote application version from QA to staging environment in the US:

  1. cp envs/qa/version.yml envs/staging-us/version.yml
  2. commit/push changes

Scenario: Promote application version from integration testing (GPU) to load testing (GPU) and then to QA. This is a 2-step process:

  1. cp envs/integration-gpu/version.yml envs/load-gpu/version.yml
  2. commit/push changes
  3. cp envs/load-gpu/version.yml envs/qa/version.yml
  4. commit/push changes

Scenario: Promote an application from prod-eu to prod-us along with the extra configuration. Here we also copy our setting file(s).

  1. cp envs/prod-eu/version.yml envs/prod-us/version.yml
  2. cp envs/prod-eu/settings.yml envs/prod-us/settings.yml
  3. commit/push changes

Scenario: Make sure that QA has the same replica count as staging-asia

  1. cp envs/staging-asia/replicas.yml envs/qa/replicas.yml
  2. commit/push changes

Scenario: Backport all settings from qa to integration testing (non-gpu variant)

  1. cp envs/qa/settings.yml envs/integration-non-gpu/settings.yml
  2. commit/push changes

Scenario: Make a global change to all non-prod environments at once (but see also next section for some discussion on this operation)

  1. Make your change in variants/non-prod/non-prod.yml
  2. commit/push changes

Scenario: Add a new configuration file to all US environments (both production and staging).

  1. Add the new manifest in the variants/us folder
  2. Modify the variants/us/kustomization.yml file to include the new manifest
  3. commit/push changes

In general, all promotions are just copy operations. Unlike the environment-per-branch approach, you are now free to promote anything from any environment to any other environment without any fear of taking the wrong changes. Especially when it comes to back-porting configuration, environment-per-folder really shines as you can simply move configuration both “upwards” and “backwards” even between unrelated environments.

Note that I am using cp operations just for illustration purposes. In a real application, this operation would be performed automatically by your CI system or other orchestration tool. And depending on the environment, you might want to create a Pull Request first instead of directly editing the folder in the main branch.
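As a sketch of what such automation might look like (the function name and commit message format are my own, not something from the example repository), a CI job could run something like:

```shell
# promote: copy the promotable files from one environment folder to another
# and record the change in Git. Defaults to promoting only version.yml.
# Usage: promote <source-env> <target-env> [file ...]
promote() {
  src="envs/$1"
  dst="envs/$2"
  shift 2
  if [ "$#" -eq 0 ]; then
    set -- version.yml    # the minimal promotion: just the image version
  fi
  for f in "$@"; do
    cp "$src/$f" "$dst/$f"
  done
  git add "$dst"
  git commit -q -m "Promote $* from $src to $dst"
  # A real pipeline would now push, or open a Pull Request instead:
  # git push origin main
}
```

For example, `promote qa staging-us` performs the first scenario above, and `promote prod-eu prod-us version.yml settings.yml` the third.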

Making changes to multiple environments at once

Several people have asked in the comments of the first article about the use-case of changing multiple environments at once and how to achieve and/or prevent this scenario.

First of all, we need to define what exactly we mean by “multiple” environments. We can assume the following 2 cases.

  1. Changing multiple environments at once that are on the same “level.” As an example, you want to make a change that affects prod-us, prod-eu and prod-asia at the same time
  2. Changing multiple environments at once that are NOT on the same level. As an example, you want to make a change to “integration” and “staging-eu” at the same time

The first case is a valid scenario, and we will cover this below. However, I consider the second scenario an anti-pattern. The whole point of having different environments is to be able to release things in a gradual way and promote a change from one environment to the next. So if you find yourself deploying the same change in environments of different importance, ask yourself if this is really needed and why.

For the valid scenario of deploying a single change to multiple “similar” environments, there are two strategies:

  1. If you are absolutely certain that the change is “safe” and you want it to reach all environments at once, you can make that change in the appropriate variant (or respective folders). For example, if you commit/push a change in the variants/non-prod folder then all non-production environments will get this change at the same time. I am personally against this approach because several changes look “safe” in theory but can be problematic in practice
  2. The preferable approach is to apply the change to each individual folder and then move it to the “parent” variant when it is live on all environments.

Let’s take an example. We want to make a change that affects all EU environments (e.g. a GDPR feature). The naive way would be to commit/push the configuration change directly to the variants/eu folder. This would indeed affect all EU environments (prod-eu and staging-eu). However, this is a bit risky, because if the deployment fails, you have just brought down a production environment.

The suggested approach is the following:

  1. Make the change to envs/staging-eu first
  2. Then make the same change to envs/prod-eu
  3. Finally, delete the change from both environments and add it in variants/eu (in a single commit/push action).
Gradual GitOps promotion

You might recognize this pattern from gradual database refactorings. The final commit is “transitional” in the sense that it doesn’t really affect any environments in any way. Kustomize will create the exact same definition in both cases. Your GitOps controller shouldn’t find any differences at all.

The advantages of this approach are the easy rollback/revert of a change as you move it through environments. The disadvantage is the increased effort (and number of commits) needed to promote the change to all environments, but I believe the reduced risk is worth the extra effort.

If you adopt this approach, it means that you never apply new changes to the base folder directly. If you want a change to happen to all environments, you first apply the change to individual environments and/or variants and then backport it to the base folder while simultaneously removing it from all downstream folders.

The advantages of the “environment-per-folder” approach

Now that we have analyzed all the inner workings of the “environment-per-folder” approach, it is time to explain why it is better than the “environment-per-branch” approach. If you have been paying attention to the previous sections, you should already understand how the “environment-per-folder” approach directly avoids all the problems analyzed in the previous article.

The most glaring issues with environment branches are the order of commits and the danger of bringing unwanted changes when you merge from one environment to another. With the folder approach, these problems are completely eliminated:

  1. The order of commits on the repo is now irrelevant. When you copy a file from one folder to the next, you don’t care about its commit history, just its content.
  2. By only copying files around, you only take exactly what you need and nothing else. When you copy envs/qa/version.yml to envs/staging-asia/version.yml you can be certain that you only promote the container image and nothing else. If somebody else has changed the replicas in the QA environment in the meantime, it doesn’t affect your promotion action.
  3. You don’t need to use git cherry-picks or any other advanced git method to promote releases. You only copy files around and have access to the mature ecosystem of utilities for file processing.
  4. You are free to take any change from any environment to either an upstream or downstream environment without any constraints about the correct “order” of environments. If for example you want to backport your settings from production US to staging US, you can do a simple copy operation of envs/prod-us/settings.yml to envs/staging-us/settings.yml without the fear that you might inadvertently take unrelated hotfixes that were supposed to be only in production.
  5. You can easily use file diff operations to understand what is different between environments in both directions (from source to target and vice versa).

I consider these advantages very important for any non-trivial application, and I bet that several “failed deployments” in big organizations could be directly or indirectly attributed to the problematic environment-per-branch model.

The second problem mentioned in the previous article was the presence of configuration drift when you merge a branch to the next environment. The reason for this is that when you do a “git merge,” git only notifies you about the changes it will bring, and it doesn’t say anything about what changes are already in the target branch.

Again this problem is completely eliminated with folders. As we said already, file diff operations have no concept of “direction.” You can copy any setting from any environment either upwards or downwards, and if you do a diff operation on the files, you will see all changes between environments regardless of their upstream/downstream position.

The last point about environment branches was the linear complexity of branches as the number of environments grows. With 5 environments, you need to juggle changes between 5 branches, and with 20 environments, you need to have 20 branches. Moving a release correctly between a large number of branches is a cumbersome process, and in the case of production environments, it is a recipe for disaster.

With the folder approach, the number of branches is not only static but exactly one. If you have 5 environments, you manage them all with your “main” branch, and if you need more environments, you only add extra folders. If you have 20 environments, you still need a single Git branch. Getting a centralized view of what is deployed where is trivial when you have a single branch.

Using Helm with GitOps environments

If you don’t use Kustomize but prefer Helm instead, it is also possible to create a hierarchy of folders with “common” stuff for all environments, specific features/mixins/components, and final folders specific to each environment.

Here is what the folder structure would look like:

chart/
  [...chart files here...]
common/
  values-common.yml
variants/
  prod/
    values-prod.yml
  non-prod/
    values-non-prod.yml
  [...other variants...]
envs/
  prod-eu/
    values-env-default.yml
    values-replicas.yml
    values-version.yml
    values-settings.yml
  [...other environments...]

Again you need to spend some time to examine your application properties and decide how to split them into different value files for optimal promotion speed.

Other than this, most of the processes are the same when it comes to environment promotion.

Scenario: Promote application version from QA to staging environment in the US:

  1. cp envs/qa/values-version.yml envs/staging-us/values-version.yml
  2. commit/push changes

Scenario: Promote application version from integration testing (GPU) to load testing (GPU) and then to QA. This is a 2-step process:

  1. cp envs/integration-gpu/values-version.yml envs/load-gpu/values-version.yml
  2. commit/push changes
  3. cp envs/load-gpu/values-version.yml envs/qa/values-version.yml
  4. commit/push changes

Scenario: Promote an application from prod-eu to prod-us along with the extra configuration. Here we also copy our setting file(s).

  1. cp envs/prod-eu/values-version.yml envs/prod-us/values-version.yml
  2. cp envs/prod-eu/values-settings.yml envs/prod-us/values-settings.yml
  3. commit/push changes

It is also critical to understand how Helm (or your GitOps agent which handles Helm) works with multiple value files and the order in which they override each other.

If you want to preview one of your environments, instead of “kustomize build” you can use the following command

helm template chart/ --values common/values-common.yml --values variants/prod/values-prod.yml --values envs/prod-eu/values-env-default.yml --values envs/prod-eu/values-replicas.yml --values envs/prod-eu/values-version.yml --values envs/prod-eu/values-settings.yml

You can see that Helm is a bit more cumbersome than Kustomize if you have a large number of variants or files in each environment folder.
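If the flag list gets long, a small helper can assemble it from the folder layout. A sketch (the function name is mine, and it assumes the .yml extensions used in the scenarios above):

```shell
# helm_values_flags: expand an environment folder into the ordered --values
# flags Helm expects: common first, then the variant, then every env file,
# so later files override earlier ones on conflicting keys.
helm_values_flags() {
  env_dir="$1"
  variant="$2"
  printf -- '--values %s ' "common/values-common.yml"
  printf -- '--values %s ' "variants/$variant/values-$variant.yml"
  for f in "$env_dir"/values-*.yml; do
    printf -- '--values %s ' "$f"
  done
}

# Usage:
#   helm template chart/ $(helm_values_flags envs/prod-eu prod)
```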

The “environment-per-git-repo” approach

When I talk with big organizations about the folder approach, one of the first objections I see is that people (especially security teams) don’t like to see a single branch in a single Git repository that contains both prod and non-prod environments.

This is an understandable objection and arguably can be the single weak point of the folder approach against the “environment-per-branch” paradigm. After all, it is much easier to secure individual branches in a Git repository instead of folders in a single branch.

This problem can be easily solved with automation, validation checks, or even manual approvals if you think it is critical for your organization. I want to stress again that I use “cp” in the promotion examples just for illustration purposes. It doesn’t mean that an actual human should run cp manually in an interactive terminal when a promotion happens.

Ideally you should have an automated system that copies files around and commits/pushes them. This can be your Continuous Integration (CI) system or other platform that deals with your software lifecycle. And if you still have humans that make the changes themselves, they should never commit to “main” directly. They should open a Pull Request instead. Then you should have a proper workflow that checks that Pull Request before merging.

I realize however that some organizations are particularly sensitive to security issues and they prefer a bulletproof approach when it comes to Git protection. For these organizations, you can employ 2 Git repositories. One has the base configuration, all prod variants, and all prod environments (and everything else related to production) while the second Git repository has all non-production stuff.

This approach makes promotions a bit harder, as now you need to check out 2 Git repositories before doing any promotion. On the other hand, it allows your security team to place extra security constraints on the “production” Git repository, and you still have a static number of Git repositories (exactly 2) regardless of the number of environments you deploy to.

I personally consider this approach overkill; at least to me, it shows a lack of trust in developers and operators. The discussion on whether or not people should have direct access to production environments is a complex one and probably deserves a blog post of its own.

Embrace folders and forget branches

We hope that with this blog post we addressed all the questions that arose from the “don’t use branches for environments” article and you now have a good understanding about the benefits of the folder approach and why you should use it.

If you have other interesting use cases or have extra questions on the subject of organizing your GitOps environments, please ask in the comments section.

Happy GitOps deployments!

Kostis Kapelonis

Kostis is a software engineer/technical-writer dual class character. He lives and breathes automation, good testing practices and stress-free deployments with GitOps.

24 responses to “How to Model Your GitOps Environments and Promote Releases between Them”

  1. Mohammad Abusaa says:

    Thanks for this article.

    I want to challenge you on the two points below:

    1. How do you plan for rollback? Especially since it’s a single branch: suppose we promote a change to production (“commitX”), then we commit a change to staging (“commitY”), and I want to roll back the production changes only.

    2. You know that in a real automated approach, everything is initiated by the CI system that builds the image, and you may have some new business configuration to be added as configmaps or env vars. How do you start the process after building the image from source code towards the first environment (QA, for example), while also introducing the new business configuration?

    1. Re: your first point: I guess you can just revert “commitX” via “git revert commitX” which creates another commit on top of the main branch. Or, if you have a UAT environment which replicates the settings from production, you could just copy the corresponding setting file from UAT to prod.

      Re: your second point: I’ve decoupled the CI pipeline from the CD pipeline. The former just creates the binaries, Container images, and Helm Chart and publishes it to the corresponding registries/repositories. At the end, it sends an event which is consumed by the CD pipeline. Here, I’m cloning the Git environment repository, generate all required files and commit these to the repo (if you need some further adjustments to the settings, you can commit this to a feature branch and create a pull request). Then it is up to the GitOps controller to deploy/provision the new environment once the change is applied to the main branch.

    2. Kostis Kapelonis says:

      For rollback you can literally do “git revert commitX”. CommitX is reverted and commitY stays as is. It doesn’t get any simpler than this.

      That is a good question, but it has nothing to do with environment-per-folder. You still need to answer it if you are using environment-per-branch. You can use hooks in your CI system, Flux/Argo image updater, or something custom. There is no way around it in either case.

  2. Really interesting reading – thank you for sharing and your efforts to write this article.

    At the moment I’m using the environment-per-branch pattern, but I’ve been thinking of migrating to a trunk-based pattern. Luckily, the configuration is already split into common values (which apply to all environments) and environment specific settings (to avoid merge conflicts). But your recommendation to split this further makes a lot of sense to me. What I’m still struggling with is how the environment-per-folder approach works if you want to use different Helm Chart versions in the environments. For example, if you want to use/test the Helm Chart in version 2.0 in the QA environment, but still want to use version 1.8 in staging until 2.0 has been fully tested. Would it then make sense to move the Helm Chart (Chart.yaml) into the corresponding envs folders? Unfortunately, ArgoCD doesn’t seem to support using the Helm Chart from the Helm repository, but the Helm Value files from the Git repository – otherwise the Helm Chart version could be part of the corresponding (environment specific) Application resource file from ArgoCD. Instead you have to write another Helm Chart in your Git repository and reference the original Helm Chart as a dependency in the Chart.yaml 🙁

    1. Kostis Kapelonis says:

      Yes, it would make absolute sense to move the Helm chart into the envs folder.

      Note however that there is an open issue in Argo CD (I am not sure about Flux) to support getting Helm values from a different place than the chart itself, so when this is implemented you might need to re-evaluate your case.

      That being said (and I know you might not like this suggestion) I personally suggest using Helm for external applications (i.e. those you don’t develop) and instead adopt Kustomize for your own apps (the ones your development team produces).

  3. Joe Bowbeer says:

    Where do you maintain the ArgoCD Application manifests (or Flux equivalents)?

    Are they also in git? Which repo? Where?

    1. Kostis Kapelonis says:

      Certainly in Git. Anywhere you want (same repo or other). I don’t think it matters. I mean how often do you need to change them? At least in the case of ArgoCD each application file says that this folder should be deployed to this cluster (e.g. QA folder goes to QA cluster). I think this is mostly static information. Is there another concern that I am missing?

  4. Frédéric SPIERS says:

    Again, huge thanks for this article and your work.

    I also want to challenge you on 2 main points:

    1. It’s exactly the same as Mohammed Abusaa’s point, which I think is very relevant. How do you handle rollback of your application for a specific environment, since with the folder approach all commits from all environments are mixed in the same branch?
    If you use a tool like ArgoCD to deploy your application, you will have to git revert in order to roll back your changes after some tests, for example. But if your commits are mixed, I don’t see any way to roll back properly, except maybe simulating a git revert by re-committing the state of the previous version of the application. That becomes very hard to handle.

    2. My second point is: how do you handle different deployment approaches?
    For example, I want to do continuous deployment on DEV and QA but I want to stick with continuous delivery on PROD (with a MR/PR and a protected main branch). It seems like the only way is to create a separate repository just for prod, as you said above in the “environment-per-git-repo” approach chapter.

    Anyway, thanks a lot, that’s a really instructive and good quality article.

    1. Kostis Kapelonis says:

      1. I am not sure I understand the issue here. You promote from integration to QA with commitX. Then let’s say you promote from Prod-eu to Prod-US with commitY. The first promotion went wrong. You do “git revert commitX”. The second promotion stays as is. What is there to simulate? What is hard to handle? On a related note, I would use Argo Rollouts or Flagger for progressive delivery instead of manually reverting stuff.

      2. The “environment-per-repo” approach was suggested ONLY for security reasons. No need to adopt it if you don’t have this limitation. To answer your question you always do MR/PRs and the main branch is of course protected. In the case of Continuous deployment you simply auto-approve the PR if all checks are ok. Auto-approving/merging PRs is an essential capability of all good CI systems. It has nothing to do with GitOps or ArgoCD.

      1. Frédéric SPIERS says:

        Indeed you are right, I went through the whole thing too quickly and obviously you can just git revert regardless of the commit order. Fine, thanks for explanation.

        And thanks for explanation about PR/MR.

        Considering your usage of Argo Rollouts, I agree with you that this kind of tool is very useful, but, the way I tried it, I didn’t find any solution to avoid reverting my commit.
        Argo Rollouts will roll back the version if it doesn’t pass the test phase, but only from a pod perspective. That means your application ends up in a degraded state, with version A in the cluster but version B in the Git repository, and therefore OutOfSync in ArgoCD. How do you make the rollback from Argo Rollouts match the actual Git state of your application?

        1. Kostis Kapelonis says:

          Argo Rollouts doesn’t touch Git. It is not an alternative to Git reverts. Normally if a deployment fails (and you only use ArgoCD) your environment is down and you are on the hook for making a quick git revert. With Argo Rollouts, the environment will still be up (of course with the previous version). This means that you can now git revert at your leisure, or even better fix the problem and deploy again (roll forward). Basically with Argo Rollouts you should have zero downtime even on failed deployments. It is a way to avoid hasty git reverts in the first place.

          In theory you could also use Argo Events/Rollout notification to also auto-revert in Git as well, but I think this is too complex. Rolling forward (e.g. fixing the issue) is much more realistic in my opinion.

  5. Wouldn’t the approach of copying the version from one environment to the other cause the release of a broken version in the case of a race condition?

    imagine the following flow:
    1. both staging and production are in version v22
    2. staging gets new version v23
    3. CD tool starts deploying to staging (pipeline 1), and it takes a while
    4. while staging is deploying, the CI tool pushes a new version v24 to staging, and pipeline 2 starts
    5. pipeline 1 finishes deploy to staging, and triggers a copy of the staging version to production

    Now, since the current version in staging is v24, that’s the version that’s going to be applied to production, even though the second pipeline didn’t finish. In case v24 is broken, we just released a broken version to production.

    1. Kostis Kapelonis says:

      First of all there are no “deployment pipelines” in GitOps. Only the GitOps controller is allowed to deploy applications.

      Your question is a valid one, but I don’t see why it is related to the folder approach. I can ask the exact same thing even if you have branches per environments.

      “Imagine the following flow”
      1. both staging and production BRANCHES are in version v22
      2. staging BRANCH gets new version v23
      3. GitOps controller starts deployment of staging BRANCH to v23
      4. while staging BRANCH is deploying, the CI tool pushes a new version v24 to staging BRANCH
      5. staging deployment is finished, and triggers a merge of the staging BRANCH to production.

      “If v24 is broken we just released a broken version to production”.

      Do you see my point?

      Anyway, to answer your question, you can either serialize your CI pipelines (allowing only a single instance of a specific pipeline to run is an essential function of all CI systems) or use a feature of the GitOps controller that solves this (sync windows come to mind for Argo CD).
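
      For example, a sync window on the Argo CD AppProject can restrict when production applications are allowed to sync (a minimal sketch; the project name, schedule, and app name pattern are hypothetical):

      ```yaml
      apiVersion: argoproj.io/v1alpha1
      kind: AppProject
      metadata:
        name: production
        namespace: argocd
      spec:
        syncWindows:
          # Allow automatic syncs of prod apps only during working hours
          - kind: allow
            schedule: '0 9 * * 1-5'
            duration: 8h
            applications:
              - '*-prod'
            manualSync: true
      ```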

      But I want to stress that your question has nothing to do with GitOps or with why the “environment-per-folder” approach is the best recommendation for GitOps environments. Or did you mean something different and I misunderstood your question?

      1. Your imagined flow isn’t quite right.

        In the “branch-per-environment” approach, the flow that promotes from staging to production will be linked to a specific commit revision, and therefore when it gets to the “merge to production” stage, what lands in production is the state that was present at the start of the flow.

        Therefore the result will be “if v24 is broken we just released v23 to production while the broken v24 is on staging”.

        Which is exactly what you want.

        TL;DR: With branch-per-environment you can pin flows to a specific state (point in revision history) of the gitops repo.

        1. Kostis Kapelonis says:

          The article is written in a completely generic way and doesn’t assume any specific GitOps provider. However if we are talking about ArgoCD, I assume you mean that the application entry is pointed at a specific Git revision which itself is changed by CI or other system.

          However if that is the case, you can still use the same solution with folders as well. Have your CI system commit the ArgoCD app and change both the “path” and “targetRevision” fields to specific points in time.

          So this solution will work regardless of whether you use the environment-per-branch or the environment-per-folder approach, if you want that level of granularity.

  6. Very interesting read.

    I am curious how you handle use cases where you deploy something from a 3rd party. Usually you would have something like “https://github.com/jetstack/cert-manager/releases/download/v1.6.1/cert-manager.yaml” in your base kustomization and then apply the necessary patches on top.

    In this example, upgrading the version of cert-manager in the base would upgrade it in all environments. How do you get around this?

    Would you use an empty base and provide a specific cert-manager version in each environment, and once they are all equal, you again put it back into the base?
    Or just keep it into the base but have each argoCD applications refer to a specific tag or commit SHA, so each environment does not get upgraded automatically?

    Same problem would apply for a 3rd party chart, except you wouldn’t have the option of an empty base here.

    1. Kostis Kapelonis says:

      Hello. The article focuses on applications that your developers create, since these applications have a faster/more frequent lifecycle.

      Regarding external applications, the answer is it depends 🙂 . I mean it depends on what you think about their lifecycle.
      If cert-manager is something that you update all the time then yes, I would have each environment use its own version, and if they are all the same, move it back to the parent base. Essentially treat it like your “own” application.
      However, if another external app is something that you only update from time to time, then maybe I would put it in the two bases (non-prod and prod) and update all environments at once (accepting the risk). Up to you.
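
      To make the first option concrete, each environment can pin its own cert-manager release as a remote Kustomize resource (a sketch; the v1.7.1 URL is just an illustration of a newer version being tested):

      ```yaml
      # envs/qa/kustomization.yaml (QA tries the newer release first)
      apiVersion: kustomize.config.k8s.io/v1beta1
      kind: Kustomization
      resources:
        - https://github.com/jetstack/cert-manager/releases/download/v1.7.1/cert-manager.yaml
      ---
      # envs/prod/kustomization.yaml (prod keeps the proven version)
      apiVersion: kustomize.config.k8s.io/v1beta1
      kind: Kustomization
      resources:
        - https://github.com/jetstack/cert-manager/releases/download/v1.6.1/cert-manager.yaml
      ```

      Once all environments converge on the same version, the resource line moves back to the shared base.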

      The point is that the “environment-per-folder” approach is much more flexible and has fewer limitations than the “environment-per-branch” approach.

  7. Lucas Stocksmeier says:

    Hi, thanks for the informative article. I wanted to ask what the problem is with having the k8s manifests in the application repository instead of having 2 Git repos. The GitOps tool of choice can still monitor the path in the repo, and changes that affect the manifest as well as the source code (like introducing a new env variable) can be done in one commit.

    1. Kostis Kapelonis says:

      Hello. People have already written about this here https://argo-cd.readthedocs.io/en/stable/user-guide/best_practices/ and here https://codefresh.io/argo-platform/argo-cd-best-practices/

      The biggest problem for me is reuse. If I have 10 similar microservices/applications, I don’t want to maintain 10 charts/manifests. I will split the code into 10 repos with just source code and one repo with a single chart (or kustomize stuff or whatever). If you keep everything along with the application, I can bet that you have duplication in your manifests. And then if I want to make a global change on all my Kubernetes manifests, I need to go to 10 repositories one-by-one and make the exact same change.

      Can you share how many repositories with applications you have (that include the manifests in the same repo, I mean)?

      The second problem is waste of resources and extra complexity. If you just change a manifest, your CI system will still trigger a pipeline and try to re-build the source code. You need to instruct your CI system to ignore the manifest directory, and not all CI systems have this capability. And for those CI systems that support it, you need to remember to enable it for each and every pipeline you have that touches this repository. You are trying to solve a problem that should not exist in the first place.

      The last point is that by decoupling them you can follow different Git approaches for each one of them. For example you can follow Git-flow for source code and trunk-based-development for manifests.

      1. Lucas Stocksmeier says:

        Hi Kostis, we are planning to move to Kubernetes in 1 month and I am currently building the PoC, so at the moment we have 0 :D. However we want to move 8 stateless applications to Kubernetes, with more to come.

        I followed your ideas and set up separate app and infrastructure repos with different Git approaches. A CI job in the app repo commits the image tag change to the infra repo, where ArgoCD watches each folder. I would love to hear your opinion on the folder structure; my current intention is to have one base dir and then all the applications in the env folder.

        ├── base
        ├── env
        │   ├── app1-production
        │   ├── app1-staging
        │   ├── app1-testing
        │   ├── app2-production
        │   ├── app2-staging
        │   └── app2-testing
        └── variants
            ├── non-prod
            └── prod

        1. Kostis Kapelonis says:

          Yes, this looks good, assuming that app1 and app2 are very, very similar and share the same configuration.

          You could also split per app the folders and do

          env/app1/production
          env/app1/staging
          env/app2/production
          env/app2/staging

          This way you can put common stuff of app1 and app2 in the middle folder.

          If app1 and app2 are completely different, I would just have multiple instances of the structure shown in the article.
          Remember also that Kustomize supports reading configuration from other Git repositories.

          So ask yourself how “similar” app1 and app2 are. The structure you suggest is good only if they have very high coupling (as far as configuration is concerned).
          If they are unrelated or have different configuration, there is no reason to mix them like this.
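
          The “middle folder” idea could be wired up with Kustomize roughly like this (a sketch; the file names are hypothetical):

          ```yaml
          # env/app1/production/kustomization.yaml
          apiVersion: kustomize.config.k8s.io/v1beta1
          kind: Kustomization
          resources:
            - ../common            # settings shared by all app1 environments
          patches:
            - path: replicas.yml   # production-only overrides
          ```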

          1. Lucas Stocksmeier says:

            Hi Kostis, I have another concern. My current plan is the following: I have one infrastructure Git repository, in which 10 applications with 3 environments each are going to live. Each application repository checks out the infra repo, changes the value of the image tag with a simple yq command, and commits the change to the infrastructure repository.

            run: yq -i '.image.tag="${{ inputs.image_tag }}"' ${{ inputs.values_file }}

            I don’t really have a security concern; rather, I am concerned that a developer might touch the CI job and pass the wrong path, like env/frontend-webapp/production/values.yml instead of env/broker/production/values.yml, and now suddenly a completely wrong image tag is deployed.

            Do you have any recommendation to make this workflow more secure to prevent human error of deploying a wrong tag?

          2. Kostis Kapelonis says:

            You didn’t mention which CI system you use, but in general I am against free-text fields in CI jobs (and this was true even before GitOps).

            You should either have a CI system that creates/fills predefined choices, or have simple hardcoded values (i.e. a pipeline that promotes whatever is in staging to production without any capacity to override the version).

            That being said, you can also add an extra layer of security by making everything a Pull Request. So the CI job doesn’t actually commit to an environment. It only creates a PR against it and then another human needs to confirm. You can start with this pattern at least in the production environments and see if you can automate it later.
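
            To make the “no free-text fields” idea concrete, the promotion job could derive the values file from a fixed layout instead of accepting an arbitrary path. A minimal sketch (the function name is hypothetical, and the environment list matches the env/&lt;app&gt;/&lt;environment&gt; structure discussed above):

            ```shell
            #!/usr/bin/env bash
            # Resolve the values file from an app name and a whitelisted environment,
            # refusing anything outside the expected env/<app>/<environment> layout.
            resolve_values_file() {
              local app="$1" environment="$2"
              case "$environment" in
                testing|staging|production) ;;   # fixed choices, no free text
                *) echo "unknown environment: $environment" >&2; return 1 ;;
              esac
              case "$app" in
                */*|*..*|'') echo "invalid app name: $app" >&2; return 1 ;;  # no path tricks
              esac
              echo "env/${app}/${environment}/values.yml"
            }

            # The CI job then updates only the resolved file, e.g.:
            # yq -i ".image.tag=\"${IMAGE_TAG}\"" "$(resolve_values_file broker production)"
            ```

            A wrong path such as env/frontend-webapp/... can then only happen if the pipeline itself hardcodes the wrong app name, which a PR review of the pipeline change would catch.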

  8. Thank you for this great article. I have a question regarding folder-based environment promotion. Is there a way to manage merge permissions for different folders somehow? For example, only the test system team can deploy and approve changes for the Test environment, and SRE can approve changes for the Prod environments.

    1. Kostis Kapelonis says:

      There isn’t any standard mechanism for this. You need to check your Git provider. There are some approaches (using submodules or Codeowners files). But I think simply catching these on CI or with pull requests is much easier.

      As explained in the article, it is best to have this automated anyway (or have 2 Git repos for prod and non-prod).

  9. Thank you for the great article. A couple of questions.

    1) If a developer needs to change files in chart/ or common/, won’t that automatically be picked up in all of the environments? I’m assuming that the Argo applications are looking at the root of this repository for changes rather than at a specific environment (i.e. envs/prod/).

    2) Our development teams current process use features branches and on push, the app is deployed to a feature namespace that is automatically created. The environment-per-folder approach works well for our static environments but has some challenges for these dynamic environments. We are looking at using Kubernetes manifests to dynamically deploy Argo CD applications (https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/). Any suggestions for this sort of a strategy?

    1. Kostis Kapelonis says:

      1) Yes and no. The section “Making changes to multiple environments at once” explains exactly this scenario. Yes in the sense that if you make a change in common, it is indeed picked up by all environments. No, because you will already have “tested” this change in all environments before committing to common. Check the picture in that section: in step 3 the extra setting is committed to “eu common”, but it doesn’t really affect anything because the change was already present in both environments.

      2) This is actually the third article in the series that I am currently writing. You can see an old approach at “https://codefresh.io/continuous-deployment/creating-temporary-preview-environments-based-pull-requests-argo-cd-codefresh/” The new article that I am going to write will use https://argocd-applicationset.readthedocs.io/en/stable/Generators-Pull-Request/. In general my advice is to look at application sets.
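
      For reference, the Pull Request generator looks roughly like this (a sketch; the owner/repo, paths, and namespaces are hypothetical):

      ```yaml
      apiVersion: argoproj.io/v1alpha1
      kind: ApplicationSet
      metadata:
        name: preview-environments
      spec:
        generators:
          # One Application per open pull request in the app repo
          - pullRequest:
              github:
                owner: example-org
                repo: example-app
              requeueAfterSeconds: 300
        template:
          metadata:
            name: 'example-app-pr-{{number}}'
          spec:
            project: default
            source:
              repoURL: https://github.com/example-org/example-app.git
              targetRevision: '{{head_sha}}'
              path: manifests
            destination:
              server: https://kubernetes.default.svc
              namespace: 'preview-{{number}}'
      ```

      When the PR is closed, the generated Application (and its preview environment) goes away with it.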

  10. Hi Kostas,

    Thank you for the article.
    A couple of questions about splitting application code and configuration (k8s manifests) into separate repos.
    Let’s say we have an app change (adding a new env var), which is a change in both the application code and the k8s manifest. With a single-repo structure, developers can easily test locally (i.e. deploy to minikube and check that they didn’t make any mistakes in the k8s manifests) and create a new image version with the git tag corresponding to our app + k8s manifest change.

    If we split application code and configuration (k8s manifests) to separate repos, what is the recommended approach for local app development?
    In other words, what is the easiest way to quickly test a change locally in our app + k8s config? Just making a change in 2 repos and configuring local tooling to pull those changes? Or are there better ways of doing it?
    Also, doesn’t it become a bit tricky to propagate that change to environments, just because the change is now in 2 separate repos which have 2 different git tags?

    Many thanks

    1. Kostis Kapelonis says:

      Local development for Kubernetes apps is a whole topic on its own.

      I would use a dedicated solution such as Telepresence, Tilt, Okteto, DevSpace, garden.io, etc.
      We have some articles on these already.

      Regarding propagation, remember that GitOps controllers know nothing about source code. So in theory, whenever you change the source code, no deployment happens. You need to build the code into an image AND update the manifest in order for the GitOps controller to actually do something. So even though the change is in two Git repos, the GitOps agent only works with the manifest repo (so only one repo).
