Docker anti-patterns

Container usage is exploding. Even if you are not yet convinced that Kubernetes is the way forward, it is very easy to add value just by using Docker on its own. Containers can now simplify both deployments and CI/CD pipelines.

The official Docker best practices page is highly technical and focuses more on the structure of the Dockerfile instead of generic information on how to use containers in general. Every Docker newcomer will at some point understand the usage of Docker layers, how they are cached, and how to create small Docker images. Multi-stage builds are not rocket science either. The syntax of Dockerfiles is fairly easy to understand.

However, the main problem with container adoption is the failure of companies to look at the bigger picture, and especially at the immutable character of containers/images. Many companies attempt to convert their existing VM-based processes to containers, with dubious results. There is a wealth of information on the low-level details of containers (how to create them and run them), but very little information on high-level best practices.

To close this documentation gap, I present to you a list of high-level Docker best practices. Since it is impossible to cover the internal processes of every company out there, I will instead explain bad practices (i.e. what you should not do). Hopefully, this will give you some insights on how you should use containers.

Here is the complete list of bad practices that we will examine:

  1. Attempting to use VM practices on containers.
  2. Creating Dockerfiles that are not transparent.
  3. Creating Dockerfiles that have external side effects.
  4. Confusing images used for deployment with those used for development.
  5. Building different images per environment.
  6. Pulling code from git into production servers and building images on the fly.
  7. Promoting git hashes between teams.
  8. Hardcoding secrets into container images.
  9. Using Docker as poor man’s CI/CD.
  10. Assuming that containers are a dumb packaging method.

Anti-pattern 1 – Treating Docker containers as Virtual Machines

Before going into some more practical examples, let’s get the basic theory out of the way first. Containers are not Virtual Machines. At first glance they might look like they behave like VMs, but the truth is completely different. Stack Overflow and related forums are filled with questions like:

  1. How do I update applications running inside containers?
  2. How do I ssh into a Docker container?
  3. How do I get logs/files from a container?
  4. How do I apply security fixes inside a container?
  5. How do I run multiple programs in a container?

All these questions are technically valid, and the people who have answered them have given technically correct answers. However, all these questions are the canonical example of the XY problem. The real question behind them is:

“How can I unlearn all my VM practices and processes and change my workflow to work with immutable, short-lived, stateless containers instead of mutable, long-running, stateful VMs?”

Many companies out there are trying to reuse the same practices/tools/knowledge from VMs in the container world. Some companies were even caught completely off-guard, as they had not even finished their bare-metal-to-VM migration when containers appeared.

Unlearning something is very difficult. Most people that start using containers see them initially as an extra abstraction layer on top of their existing practices:

Containers are not VMs

In reality, containers require a completely different view and change of existing processes. You need to rethink all your CI/CD processes when adopting containers.

Containers require a new way of thinking

There is no easy fix for this anti-pattern other than reading about the nature of containers, their building blocks, and their history (going all the way back to the venerable chroot).

If you regularly find yourself wanting to open ssh sessions to running containers in order to “upgrade” them, or to manually get logs/files out of them, you are definitely using Docker in the wrong way and you need to do some extra reading on how containers work.
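For reference, here is what the container-native versions of those habits look like. This is only a sketch that assumes a hypothetical container named my-app, but it shows that logs, files, and fixes flow through the Docker CLI and image rebuilds rather than through ssh sessions:

# Logs come from the container runtime, not from an ssh session
docker logs my-app

# Copy a file out of a running container (for ad-hoc debugging only)
docker cp my-app:/app/debug-report.html .

# "Upgrading" means building and running a NEW image,
# never patching the running container in place
docker build -t my-app:1.0.1 .
docker rm -f my-app
docker run -d --name my-app my-app:1.0.1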

Anti-pattern 2 – Creating Docker images that are not transparent

A Dockerfile should be transparent and self-contained. It should describe all the components of an application in plain sight. Anybody should be able to get the same Dockerfile and recreate the same image. It is ok if the Dockerfile downloads extra libraries (in a versioned and well-controlled manner) but creating Dockerfiles that perform “magic” steps should be avoided.

Here is a particularly bad example:

FROM alpine:3.4

RUN apk add --no-cache \
      ca-certificates \
      pciutils \
      ruby \
      ruby-irb \
      ruby-rdoc \
      && \
    echo http://dl-4.alpinelinux.org/alpine/edge/community/ >> /etc/apk/repositories && \
    apk add --no-cache shadow && \
    gem install puppet:"5.5.1" facter:"2.5.1" && \
    /usr/bin/puppet module install puppetlabs-apk

# Install Java application
RUN /usr/bin/puppet agent --onetime --no-daemonize

ENTRYPOINT ["java","-jar","/app/spring-boot-application.jar"]

Now don’t get me wrong. I love Puppet as it is a great tool (or Ansible, or Chef for that matter). Misusing it for application deployments might have been easy with VMs, but with containers it is disastrous.

First of all, it makes this Dockerfile location-dependent. You have to build it on a computer that has access to the production Puppet server. Does your workstation have access to the production Puppet server? If yes, should your workstation really have that access?

But the biggest problem is that this Docker image cannot be easily recreated. Its contents depend on what the Puppet server had at the time of the initial build. If you build the same Dockerfile today you might get a completely different image. And if you don’t have access to the Puppet server, or the Puppet server is down, you cannot build the image at all. You don’t even know which version of the application you are packaging unless you have access to the Puppet scripts.

The team that created this Dockerfile was just lazy. There was already a Puppet script for installing the application in a VM. The Dockerfile was simply retrofitted to do the same thing (see the previous anti-pattern).

The fix here is to have minimal Dockerfiles that describe explicitly what they do. Here is the same application with the “proper” Dockerfile.

FROM openjdk:8-jdk-alpine

ENV MY_APP_VERSION="3.2"

RUN apk add --no-cache \
      ca-certificates

WORKDIR /app
ADD  http://artifactory.mycompany.com/releases/${MY_APP_VERSION}/spring-boot-application.jar .

ENTRYPOINT ["java","-jar","/app/spring-boot-application.jar"]

Notice that:

  1. There is no dependency on Puppet infrastructure. The Dockerfile can be built on any developer machine that has access to the binary repository.
  2. Versions of the software are explicitly defined.
  3. It is very easy to change the version of the application by editing only the Dockerfile (instead of puppet scripts).

This was just a very simple (and contrived) example. I have seen many Dockerfiles in the wild that depend on “magic” recipes with special requirements for the time and place they can be built. Please don’t write your Dockerfiles in this manner, as developers (and other people who don’t have access to all systems) will have great difficulties creating Docker images locally.

An even better alternative would be if the Dockerfile compiled the source Java code on its own (using multi-stage builds). That would give you even greater visibility on what is happening in the Docker image.
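As a rough sketch of that idea (assuming a Maven project whose build produces spring-boot-application.jar; adjust names to your own setup), a multi-stage Dockerfile could compile the code in one stage and copy only the packaged result into the runtime image:

# Build stage: JDK and build tool live here and are never shipped to production
FROM maven:3-jdk-8 AS build
WORKDIR /src
COPY pom.xml .
COPY src ./src
RUN mvn -q package

# Runtime stage: only the JRE and the packaged application
FROM openjdk:8-jre-alpine
WORKDIR /app
# the jar name is an assumption; match it to your build output
COPY --from=build /src/target/spring-boot-application.jar .
ENTRYPOINT ["java","-jar","/app/spring-boot-application.jar"]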

Anti-pattern 3 – Creating Dockerfiles that have external side effects

Let’s imagine that you are an operator/SRE working at a very big company where multiple programming languages are used. It would be very difficult to become an expert in all the programming languages and build systems out there.

This is one of the major advantages of adopting containers in the first place. You should be able to download any Dockerfile from any development team and build it without really caring about side effects (because there shouldn’t be any).

Building a Docker image should be an idempotent operation. It shouldn’t matter if you build the same Dockerfile one time or a thousand times. Or if you build it on a CI server first and then on your workstation.

Yet, there are several Dockerfiles out there that during the build phase…

  1. perform git commits or other git actions,
  2. clean up or tamper with database data,
  3. or call other external services with POST/PUT operations.

Containers offer isolation as far as the host filesystem is concerned, but there is nothing protecting you from a Dockerfile that contains a RUN directive with curl POSTing an HTTP payload to your intranet.

Here is a simple example where a Dockerfile both packages (a safe action) and publishes (an unsafe action) an npm application in a single run.

FROM node:9
WORKDIR /app

COPY package.json ./package.json
COPY package-lock.json ./package-lock.json
RUN npm install
COPY . .

RUN npm test

ARG npm_token

RUN echo "//registry.npmjs.org/:_authToken=${npm_token}" > .npmrc
RUN npm publish --access public

EXPOSE 8080
CMD [ "npm", "start" ]

This Dockerfile conflates two unrelated concerns: releasing a version of the application and creating a Docker image for it. Sometimes these two actions do indeed happen at the same time, but that is no excuse for polluting a Dockerfile with side effects.

Docker is NOT a generic CI system and it was never meant to be one. Don’t abuse Dockerfiles as glorified bash scripts that have unlimited power. Having side effects while containers are running is ok. Having side effects during container build time is not.

The solution is to simplify your Dockerfiles and make sure that they only contain idempotent operations such as:

  • Cloning source code
  • Downloading dependencies
  • Compiling/packaging code
  • Processing/Minifying/Transforming local resources
  • Running scripts and editing files on the container filesystem only

Also, keep in mind the way Docker caches filesystem layers. Docker assumes that if a layer and the ones before it have not “changed”, they can be reused from the cache. If your Dockerfile directives have side effects, you essentially break the Docker caching mechanism. Here is a Dockerfile that illustrates the problem:

FROM node:10.15-jessie

RUN apt-get update && apt-get install -y mysql-client && rm -rf /var/lib/apt

RUN mysql -u root --password="" < test/prepare-db-for-tests.sql

WORKDIR /app

COPY package.json ./package.json
COPY package-lock.json ./package-lock.json
RUN npm install
COPY . .

RUN npm run integration-test

EXPOSE 8080
CMD [ "npm", "start" ]

Let’s say that you try to build this Dockerfile and your integration tests fail. You make a change to the source code and try to rebuild. Docker will assume that the layer that prepares the DB has already “run” and it will reuse the cache. So your integration tests will now run against a DB that wasn’t cleaned and contains data from the previous run.

In this contrived example, the Dockerfile is very small and it is very easy to locate the statement that has side effects (the mysql command) and move it to the correct place in order to fix layer caching. But in a real Dockerfile with many commands, trying to hunt down the correct order of RUN statements is very difficult if you don’t know which ones have side effects and which do not.

Your Dockerfiles will be much simpler if all the actions they perform are local in scope and affect only the container filesystem.
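As a sketch of the fix for the example above (assuming the same test/prepare-db-for-tests.sql script and a CI job that can reach the test database), the stateful preparation moves into the pipeline while the image build stays side-effect free:

# CI pipeline step, NOT a Dockerfile instruction:
# prepare the test database outside of the image build
mysql -u root --password="" < test/prepare-db-for-tests.sql

# The image build itself is now idempotent and cache-friendly
docker build -t my-app:ci .

# Run the integration tests in a container, after the image is built
docker run --rm my-app:ci npm run integration-test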

Anti-pattern 4 – Confusing images that are used for development with those that are used for deployment

In any company that has adopted containers, there are usually two separate categories of Docker images. First, there are the images that are used as the actual deployment artifact sent to production servers.

The deployment images should contain:

  1. The application code in minified/compiled form plus its runtime dependencies.
  2. Nothing else. Really nothing else.

The second category is the images used by CI/CD systems or developers, and these might contain:

  1. The source code in its original form (i.e. unminified)
  2. Compilers/minifiers/transpilers
  3. Testing frameworks/reporting tools
  4. Security scanning, quality scanning, static analyzers
  5. Cloud integration tools
  6. Other utilities needed for the CI/CD pipeline

It should be obvious that these categories of container images should be handled separately as they have different purposes and goals. Images that get deployed to servers should be minimal, secure and battle-hardened. Images that get used in the CI/CD process are never actually deployed anywhere so they have much less strict requirements (for size and security).

Yet for some reason, people do not always understand this distinction. I have seen several companies that try to use the same Docker image both for development and for deployment. Almost always what happens is that unrelated utilities and frameworks end up inside the production Docker image.

There are exactly zero reasons why a production Docker image should contain git, test frameworks, or compilers/minifiers.

The promise of containers as a universal deployment artifact was always about using the same deployment artifact between different environments and making sure that what you are testing is also what you are deploying (more on this later). But trying to consolidate local development with production deployments is a losing battle.

In summary, try to understand the roles of your Docker images. Each image should have a single role. If you are shipping test frameworks/libraries to production you are doing it wrong. You should also spend some time to learn and use multi-stage builds.
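One common way to keep the two roles apart inside a single file is a multi-stage Dockerfile where CI targets the tooling stage and only the last stage is ever deployed. The sketch below assumes a hypothetical Node.js project that emits a dist/ folder; the names are illustrative only:

# "ci" stage: source code, test frameworks, linters - never deployed anywhere
FROM node:20 AS ci
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm test && npm run build

# Deployment stage: production dependencies and compiled output only
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY --from=ci /app/dist ./dist
CMD ["node","dist/server.js"]

A CI pipeline can then run docker build --target ci . to get the image with all the tooling, while deployments build the full file and ship only the final stage.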

Anti-pattern 5 – Using different images for each environment (qa, stage, production)

One of the most important advantages of using containers is their immutable nature: a Docker image should be built only once and then promoted through the various environments until it reaches production.

Promoting the same Docker image

Because the exact same image is promoted as a single entity, you get the guarantee that what you tested in one environment is exactly what will run in the next.

I see a lot of companies that build different artifacts for their environments with slightly different versions of code or configuration.

Different image per environment

This is problematic because there is no guarantee that the images are “similar enough” to verify that they behave in the same manner. It also opens a lot of possibilities for abuse, where developers/operators sneak extra debugging tools into the non-production images, creating an even bigger rift between images for different environments.

Instead of trying to make sure that your different images are as similar as possible, it is far easier to use a single image for all software lifecycle phases.

Note that it is perfectly normal if the different environments use different settings (i.e. secrets and configuration variables) as we will see later in this article. Everything else, however, should be exactly the same.

Anti-pattern 6 – Creating Docker images on production servers

The Docker registry serves as a catalog of existing applications that can be re-deployed at any time to any additional environments. It is also a central location of application assets with extra metadata along with previous historical versions of the same application. It should be very easy to choose a specific tag of a Docker image and deploy it to any environment.

One of the most flexible ways of using Docker registries is by promoting images between them. An organization has at least two registries (the development one and the production one). A Docker image should be built once (see previous anti-pattern) and placed in the development registry. Then, once integration tests, security scans, and other quality gates verify its correct functionality, the image can be promoted to the production Docker registry to be sent to production servers or Kubernetes clusters.

It is also possible to have different Docker registry organizations per region/location or per department. The main point here is that the canonical way of doing Docker deployments also includes a Docker registry. Docker registries serve both as an application asset repository and as intermediate storage before an application is deployed to production.

A very questionable practice is the complete removal of Docker registries from the lifecycle and the pushing of source code directly to production servers.

Building images in production servers

Production servers use “git pull” to get the source code and then “docker build” to create an image on the fly and run it locally (usually with docker-compose or other custom orchestration). This “deployment method” essentially employs multiple anti-patterns all at once!

This deployment practice suffers from a lot of issues, starting with security. Production servers should not have inbound access to your git repositories. If a company is serious about security, this pattern will not even fly with the security committee. There is also no reason why production servers should have git installed. Git (or any other version control system) is a tool intended for developer collaboration and not an artifact delivery solution.

But the most critical issue is that with this “deployment method” you bypass completely the scope of Docker registries. You no longer know what Docker image is deployed on your servers as there is no central place that holds Docker images anymore.

This deployment method might work ok in a startup, but will quickly become inefficient in bigger installations. You need to learn how to use Docker registries and the advantages they bring (also related to security scanning of containers).

Using a Docker registry

Docker registries have a well-defined API, and there are several open-source and proprietary products that can be used to set up one within your organization.

Notice also that with Docker registries your source code securely resides behind the firewall and never leaves the premises.
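The promotion workflow described above is nothing more than a pull/tag/push sequence between the two registries. Here is a sketch, with hypothetical registry hostnames:

# Build once and push to the development registry
docker build -t registry-dev.mycompany.com/my-app:1.2.0 .
docker push registry-dev.mycompany.com/my-app:1.2.0

# After integration tests and security scans pass, promote the SAME image
docker pull registry-dev.mycompany.com/my-app:1.2.0
docker tag registry-dev.mycompany.com/my-app:1.2.0 registry-prod.mycompany.com/my-app:1.2.0
docker push registry-prod.mycompany.com/my-app:1.2.0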

Anti-pattern 7 – Working with git hashes instead of Docker images

A corollary to the previous two anti-patterns is that once you adopt containers, your Docker registry should become the single point of truth for everything. People should talk in terms of Docker tags and image promotions. Developers and operators should use containers as their common language. The hand-over entity between teams should be a container and not a git hash.

Talking about git hashes

This is in contrast to the old way of using git hashes as “promotion” artifacts. The source code is of course important, but re-building the same hash over and over in order to promote it is a waste of resources (see also anti-pattern 5). Several companies think that containers should only be handled by operators, while developers keep working with just the source code. This could not be further from the truth. Containers are the perfect opportunity for developers and operators to work together.

Talking about containers

Ideally, operators should not even care about what goes on with the git repo of an application. All they need to know is if the Docker image they have at hand is ready to be pushed to production or not. They should not be forced to rebuild a git hash in order to get the same Docker image that developers were using in pre-production environments.

You can tell if you are a victim of this anti-pattern by asking the operators in your organization. If they are forced to become familiar with application internals such as build systems or test frameworks, which are normally unrelated to the actual runtime of the application, they carry a heavy cognitive load that is not needed for daily operations.
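In practice, the git hash can stay attached to the image as metadata while the hand-over between teams happens purely in terms of Docker tags. A sketch, with hypothetical registry and tag names:

# CI records the commit on the image once, at build time
docker build -t registry-dev.mycompany.com/my-app:1.3.0 --label git-commit=$(git rev-parse HEAD) .
docker push registry-dev.mycompany.com/my-app:1.3.0

# From here on, teams promote the image tag, not the git hash
docker tag registry-dev.mycompany.com/my-app:1.3.0 registry-dev.mycompany.com/my-app:qa-approved
docker push registry-dev.mycompany.com/my-app:qa-approved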

Anti-pattern 8 – Hardcoding secrets and configuration into container images

This anti-pattern is closely related to Anti-pattern 5 (different images per environment). In most cases when I ask companies why they need different images for qa/staging/production, the usual answer is that they contain different configurations and secrets.

This not only breaks the main promise of Docker (deploy what you tested) but also makes all CI/CD pipelines very complex as they have to manage secrets/configuration during build time.

The anti-pattern here is, of course, the hard-coding of configurations. Applications should not have embedded configurations. This should not be news for anybody who is familiar with 12-factor apps.

Hardcoding configuration at build time

Your applications should fetch their configuration at runtime instead of build time. A Docker image should be configuration-agnostic: configuration should be “attached” to the container only at runtime. There are many solutions for this, and most clustering/deployment systems can work with a solution for runtime configuration (configmaps, zookeeper, consul etc) and secrets (vault, keywhiz, confidant, cerberus).

Loading configuration during runtime

If your Docker image has hardcoded IPs and/or credentials you are definitely doing it wrong.
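As a minimal sketch of the alternative (assuming an application that reads its settings from environment variables, in 12-factor fashion, and hypothetical qa.env/production.env files), the same image can be configured differently in each environment at runtime:

# One configuration-agnostic image...
docker build -t my-app:2.0 .

# ...configured only when a container is started
docker run -d --env-file qa.env my-app:2.0
docker run -d --env-file production.env my-app:2.0

# Secrets come from a runtime source (vault, Kubernetes secrets, etc.),
# never from the Dockerfile or from files baked into the image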

Anti-pattern 9 – Creating Dockerfiles that do too much

I have come across articles that suggest Dockerfiles should be used as a poor man’s CI solution. Here is an actual example of a single Dockerfile.

# Run Sonar analysis
FROM newtmitch/sonar-scanner AS sonar
COPY src src
RUN sonar-scanner
# Build application
FROM node:11 AS build
WORKDIR /usr/src/app
COPY . .
RUN yarn install && \
 yarn run lint && \
 yarn run build && \
 yarn run generate-docs
LABEL stage=build
# Run unit test
FROM build AS unit-tests
RUN yarn run unit-tests
LABEL stage=unit-tests
# Push docs to S3
FROM containerlabs/aws-sdk AS push-docs
ARG push-docs=false
COPY --from=build docs docs
RUN [[ "$push-docs" == true ]] && aws s3 cp -r docs s3://my-docs-bucket/
# Build final app
FROM node:11-slim
EXPOSE 8080
WORKDIR /usr/src/app
COPY --from=build /usr/src/app/node_modules node_modules
COPY --from=build /usr/src/app/dist dist
USER node
CMD ["node", "./dist/server/index.js"]

While at first glance this Docker file might look like a good use of multi-stage builds, it is essentially a combination of previous anti-patterns.

  • It assumes the presence of a SonarQube server (anti-pattern 2).
  • It has potential side effects as it can push to S3 (anti-pattern 3).
  • It acts both as a development as well as a deployment image (anti-pattern 4).

Docker is not a CI system on its own. Container technology can be used as part of a CI/CD pipeline, but this technique is something completely different. Don’t confuse commands that need to run in the Docker container with commands that need to run in a CI build job.

The author of this Dockerfile advocates that you should use build arguments that interact with the labels and switch on/off specific build phases (so you could disable sonar for example). But this approach is just adding complexity for the sake of complexity.

The way to fix this Dockerfile is to split it into five separate Dockerfiles: one for the application deployment and the rest for the different steps of your CI/CD pipeline. A single Dockerfile should have a single purpose/goal.
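As a sketch of that split (command names taken from the example above, everything else illustrative), the non-deployment stages become ordinary pipeline steps, each running in its own small Dockerfile or tool container, and only a minimal runtime-focused Dockerfile is left to build the deployment image:

# CI/CD pipeline steps replacing the extra Dockerfile stages above
sonar-scanner                                      # static analysis step
yarn install && yarn run lint && yarn run build    # build step
yarn run unit-tests                                # test step
yarn run generate-docs                             # docs generation step
aws s3 cp docs s3://my-docs-bucket/ --recursive    # docs publishing step
docker build -t my-app:1.0 .                       # image build from a minimal Dockerfile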

Anti-pattern 10 – Creating Dockerfiles that do too little

Because containers also include their dependencies, they are great for isolating library and framework versions per application. Developers are already familiar with the issues of trying to install multiple versions of the same tool on their workstation. Docker promises to solve this problem by allowing you to describe in your Dockerfile exactly what your application needs and nothing more.

But this Docker promise only holds true if you actually take advantage of it. As an operator, I should not really need to care about the programming tools used inside your Docker image. I should be able to build a Docker image for a Java application, then a Python one, and then a Node.js one, without actually having a development environment for each language on my laptop.

A lot of companies, however, still see Docker as a dumb package format and just use it to package a finished artifact/application that was already created outside of the container. This anti-pattern is especially common in Java-heavy organizations, and even official documentation seems to promote it.

Here is the suggested Dockerfile from the official Spring Boot Docker guide.

FROM openjdk:8-jdk-alpine
VOLUME /tmp
ARG JAR_FILE
COPY ${JAR_FILE} app.jar
ENTRYPOINT ["java","-Djava.security.egd=file:/dev/./urandom","-jar","/app.jar"]

This Dockerfile just packages an existing jar file. How was the jar file created? Nobody knows; it is not described in the Dockerfile. If I am an operator, I am forced to install the whole Java development toolchain locally just to build this Dockerfile. And if you work in an organization that uses multiple programming languages, this process quickly gets out of hand, not only for operators but also for build nodes.

I am using Java as an example here, but this anti-pattern is present in other situations as well. Dockerfiles that don’t work unless you have first performed an “npm install” locally are a very common occurrence.

The solution to this anti-pattern is the same as for anti-pattern 2 (Dockerfiles that are not self-contained). Make sure that your Dockerfiles describe the whole build process of the application. Your operators/SREs will love you even more if you follow this approach. In the case of the Java example above, the Dockerfile should be modified as below:

FROM openjdk:8-jdk-alpine
WORKDIR /tmp/
COPY pom.xml .
COPY src ./src/
# assumes the Maven wrapper (mvnw and .mvn/) is committed with the project
COPY mvnw .
COPY .mvn .mvn/
RUN ./mvnw package
# the jar name matches the earlier example; adjust it to your build output
ENTRYPOINT ["java","-Djava.security.egd=file:/dev/./urandom","-jar","/tmp/target/spring-boot-application.jar"]

This Dockerfile describes exactly how the application is built, so anybody can create the image on any workstation without a local Java installation. You can improve this Dockerfile even further with multi-stage builds (an exercise for the reader).

Summary

A lot of companies have trouble adopting containers because they attempt to shoehorn their existing VM practices into containers. It is best to spend some time rethinking all the advantages that containers offer and to understand how you can create your processes from scratch with that new knowledge.

In this guide, I have presented several bad practices with container usage and also the solution to each one.

  1. Attempting to use VM practices on containers. Solution: understand what containers are.
  2. Creating Dockerfiles that are not transparent. Solution: write Dockerfiles from scratch instead of adopting existing scripts.
  3. Creating Dockerfiles that have external side effects. Solution: move side effects to your CI/CD solution and keep Dockerfiles side-effect free.
  4. Confusing images used for deployment with those used for development. Solution: don’t ship development tools and test frameworks into production servers.
  5. Building different images per environment. Solution: build an image only once and promote it across various environments.
  6. Pulling code from git into production servers and building images on the fly. Solution: use a Docker registry.
  7. Promoting git hashes between teams. Solution: promote container images between teams.
  8. Hardcoding secrets into container images. Solution: build an image only once and use runtime configuration injection.
  9. Using Docker as CI/CD. Solution: use Docker as a deployment artifact and choose a CI/CD solution for CI/CD.
  10. Assuming that containers are a dumb packaging method. Solution: create Dockerfiles that compile/package source code on their own.

Look at your workflows, ask developers (if you are an operator) or operators (if you are a developer) and try to find if your company falls into one or more of these bad practices.

Do you know of any other good/bad container practices? Let us know in the comments below.

39 thoughts on “Docker anti-patterns”

  1. Hey, great article! And I agree with almost all of your points. I am however not fully convinced by point 9 “Using Docker as CI/CD. Solution: use Docker as a deployment artifact and choose a CI/CD solution for CI/CD”. I agree in your example that depending on the external sonarqube and introducing side effects makes the Dockerfile no longer portable and more fragile.

    TLDR: do you have any recommendations on building a CI pipeline without relying on multi-stage docker builds that is without redundancies, fast, and ensures development-production parity.

    So let’s say I created the 5 Dockerfiles as you suggested for your example and use GitLab CI to run each one of them as a stage in sequence. I’d first build the ‘sonar’ Dockerfile (not too familiar with sonar, so maybe with the sonar url as a build arg?). If it builds without fail, I move to the ‘build’ stage and use the build Dockerfile (I should be able to use the production/deployment-ready Dockerfile here, shouldn’t I?). This build will run yarn install. Next stage is testing: here I’ll have to copy the src dir again, run yarn install again, and finally run the tests. Let’s ignore the doc publishing here. My problems with this approach vs. a single multi-stage Dockerfile: I’ll have to copy the src and install dependencies for each stage that depends on it, causing the pipeline to run slower.

    So I’m thinking, what if I stick to a traditional build/test pipeline defined entirely within the gitlab-ci.yml (i.e. no docker-build, but raw script-calls to yarn install, etc. (although of course these steps are executed within docker images anyway)) and if the raw build and test worked out I’ll run a final docker build on the production-ready dockerfile and push it to the registry. But then I’d still have a disparity between the ci build and the dockerfile build: what if I forgot to add a dependency in the final Dockerfile, but not the ci-pipeline?

    1. “So let’s say I created the 5 Dockerfiles as you suggested for your example and use GitLab CI to run each one of them as a stage in sequence. I’d first build the ‘sonar’ Dockerfile (not too familiar with sonar, so maybe with the sonar url as a build arg?). If it builds without fail, I move to the ‘build’ stage and use the build Dockerfile (I should be able to use the production/deployment-ready Dockerfile here, shouldn’t I?). This build will run yarn install. Next stage is testing: here I’ll have to copy the src dir again, run yarn install again, and finally run the tests. Let’s ignore the doc publishing here. My problems with this approach vs. a single multi-stage Dockerfile: I’ll have to copy the src and install dependencies for each stage that depends on it, causing the pipeline to run slower.”

      Any proper CI/CD solution will have a way to share src and deps between steps automatically. Since you are on the Codefresh website, it is also natural to mention that we do this already https://codefresh.io/docs/docs/configure-ci-cd-pipeline/introduction-to-codefresh-pipelines/#sharing-the-workspace-between-build-steps

      All the build steps that you mention run with a common attached volume. So everything that one step does is automatically available to the next. So if you run yarn install on the first step then dependencies that were downloaded are automatically available to all subsequent steps.

      1. Great article, but I too am taking issue with point 9.

        What happens when you add local Docker development to the picture ?

        I’m having a hard enough time convincing devs to run tests locally in Docker instead of the native system (macOS usually), so forget about asking them to run tests only in CI 🙂 They love their short local “inner loop”.

        With a single multi-stage Dockerfile you get a portable caching/code sharing system between steps, and you can run tests locally exactly in the same way as in CI : “docker build --target unit-tests”. This works in CodeFresh too with “target:”.

        If you split this into 5 Dockerfiles, suddenly you can’t use “COPY --from” : you have to either 1. Repeat the exact same “yarn install” commands in each 2. Do some weird manual stuff with “FROM” or 3. rely on an external volume mount for caching (in CodeFresh this is “/codefresh/volume”, but you have to roll your own for local builds…) which I believe violates points 2 & 3.

        Am I missing something ?
        How do you guys recommend bridging the gap between local dev/CI ?
        I still struggle to connect the dots between local tools like Skaffold (that want to own the whole pipeline from dev to prod), and CI using a GitOps methodology.
        I know you have a blog post about Draft/Skaffold/Garden, but in them you don’t actually show how to use these tools together with CodeFresh/GitOps.

        1. Hello

          For local dev I specifically recommend Telepresence. You don’t even need Docker locally. So developers can run their tests with their native system.
          See https://codefresh.io/kubernetes-tutorial/telepresence-2-local-development/

          Other options are also Okteto https://codefresh.io/kubernetes-tutorial/okteto/ or tilt https://codefresh.io/kubernetes-tutorial/local-kubernetes-development-tilt-dev/

          But if you want the simplest way Telepresence is the easiest to start.

          To answer your question I don’t think you need weird stuff with FROM. Just have a base image with all your test dependencies and frameworks and then have all tests extend it. But looking at a dedicated tool like Okteto/Telepresence/Garden/Tilt would be my first choice.

  2. Struggling to understand how 4) and 5) can be true at the same time.

    4) says: dont use same image on production as for dev (no need for test framework in prod, etc.).

    5) says all stages need same images, one image should be passed through all stages (dev, staging, prod).

    what am i missing here?

    1. Yes, a lot of people were confused by this. “dev” In the second case is just an environment called “dev” and not what developers use. I changed all references to “QA” so I think it is much clearer now.

      1. Still not clear to me. So what is the best strategy? Should we have different image for development (which is ‘dev’ environment) and another one for production (‘production’ environment)? How can we use same image for different environments?

        1. No, the image between all environments should be exactly the same:-) You can easily do this if all configuration is NOT inside the image, but somewhere else.
          What is different in your case between the dev image and the prod image? In 99.99% of the cases it is just configuration.

          Have you seen the 12-factor app pages? https://12factor.net/config

          1. Yes that is exactly the anti-pattern. Baking different configurations inside the artifact make things complex for everyone. I don’t understand why such complexity is needed.

            The reasoning according to the article you sent is that maybe you want your front-end to hit http://prod.backend.example.com in production and http://qa.backend.example.com in QA.
            But why fall into this trap? Why not have your frontend simply hit “http://backend”?

            And then in each environment just have the name map to the correct service. This way you only have one artifact, one build process and the guarantee that what you tested in QA is the same thing as gets deployed in production.

            Is there another argument for building a frontend application with different configuration that I am missing?

  3. Don’t blindly follow every blog post on the Internet. Anti-patterns just work and they will continue working even if consul, vault, zookeeper, etc go down. You are not Facebook. Want encryption? Use plain old GPG.

    1. Hello. We don’t follow anything blindly. I have used all the techniques mentioned in the article, in production, in real companies which are not Facebook.

      All things can go down. Even your ISP, even Google, even Amazon. That doesn’t really mean anything. If you are running Consul on your production cluster and the cluster is down, you have bigger problems than Consul not being available.

      I have never used GPG for application level secrets and I would be glad to know more. Maybe you can write an article about this topic? I think it would be very interesting for our readers.

  4. Regarding your point 10.

    A typical java build will execute tests during the packaging cycle, before creating the jar file. We commonly use the Testcontainers library to have e.g. a database container available for module-local system integration tests. This only works if the library can find a docker daemon during execution, which is not possible during docker build. Even if it were possible it just doesn’t feel right to start docker containers during the building of another one. How would you approach this? I see a few options (apart from the widely used approach that you reject):
    1. Execute the tests before the creation of the docker image – you will have to compile everything twice and manually disable tests in the Dockerfile
    2. Use jib to create the docker image as part of the packaging lifecycle (and without using a docker daemon)

    1. Very good question.
      First of all as anti-pattern 4 says, production images should have just the app and nothing else. So in the case of Java, just the JRE, the app (and possibly an app server if you use one).

      The solution is then very simple. You have a multi-stage Dockerfile. The first stage contains maven, jdk, junit and testcontainers or testdbs that you want. The last stage contains only jre and your application. NOTHING else.

      You build that multi-stage dockerfile UNTIL the first stage (using the target docker argument). So you end up with an image that has all test tools (including test containers). After you build the image you run your tests (these tests are actually integration tests, so they should be run at mvn verify and not mvn package). After the tests succeed, you build the final image (the same multi-stage dockerfile) but you let it run until completion so the final production image is created. The docker layer cache will kick-in so it will actually resume from the previous layer

      So the pipeline has 3 steps
      1. Build multi-stage docker file but stop it at a layer that has all your testing tools (using docker target argument)
      2. Run integration test with that image (in Codefresh this would be a freestyle step)
      3. Continue building the image until completion

      The result is very clean. You have a single Dockerfile, nothing is built twice, and your production image has only your app and nothing else. Anti-patterns 4 and 10 are also avoided. You only need to rebind your tests from the test phase to the integration-test phase.

      I don’t like Jib because it takes something that is agnostic (docker) and makes it Java specific. Not all companies are Java-only shops.
      I am happy to clarify more if you have any questions.

      1. I completely agree with your point 4)

        I do understand the workflow you’re proposing. Just to make sure: Are you aware of how the testcontainers library works? If not, take a look at: https://www.testcontainers.org/quickstart/junit_4_quickstart/

        Typically we have many layers of tests:

        Tests that only use the java classes in the project (can be unit tests but also integration tests involving the interaction between classes)
        Tests that run in the same project but start a few containers to test the basic interaction with external dependencies like databases
        Tests that run in a separate project, start the already built project containers and test the interaction between them

        And then of course stuff like acceptance testing, smoke tests, whatever – but I think those first few layers demonstrate the problem, because at (2) I couldn’t find a way to combine that with the scheme you’re proposing. Another solution would be just to not have those kinds of tests and run them together with the (3) tests, but it just feels so much easier to write tests on the lower levels.

        1. Yes I think I understand how testcontainers work. You run a JUnit/Spock test and behind the scenes you launch Docker containers (as test rules) that co-exist with your main application so that your tests can target/use them (am I missing something there?). Other than needing direct access to docker daemon is there anything special about them?

          I basically would split tests into just two categories. Unit tests (those that depend only on Java classes) and integration tests (everything else). The first category can run anytime (pipeline step 1 in my previous message). The second category runs at a different phase (pipeline step 2 in my previous message). So they are never combined (and they shouldn’t be).
          You can see a quick example at the end of this article (not with testcontainers, but just plain integration tests) https://codefresh.io/howtos/using-docker-maven-maven-docker/

          You might also find this article very interesting http://blog.codepipes.com/testing/software-testing-antipatterns.html

          1. Hello, to use Testcontainers in our Codefresh pipeline for integration testing, how can we get direct access to the docker daemon?

          2. In the SAAS version of Codefresh you don’t get direct access to the docker daemon for security reasons. You can either switch to an enterprise plan, or install our new Codefresh runner in your own cluster to run pipelines. If you use your own cluster, then you can also enable direct docker access to pipelines.

  5. Thank you, this was an excellent article. It helps validate, with well-explained reasoning, a lot of the ideas that jump out from the low-level technical documentation on Docker. One of the best posts on the internet about how to use Docker at maximum efficiency.

  6. Thanks for a great article 🙂 I hope that more people, including those from my country, get to know these practices.
    So, can I translate this article into Korean on my personal blog? (of course, I’ll link to this original article!)
    Thank you!

  7. Very good Docker anti-patterns article.

    I have seen so many other blogs posting similar articles that are completely wrong, so it makes me very happy to see one that is so well explained and so true.

    Thanks for posting quality content 🙂

  8. I’m a bit late to the party, but just wanted to say that this is a really good article, and it points out some really common mistakes people make when first switching to a container based workflow.

    Regarding anti-pattern 4 … you do mention the use of multi-stage builds, and this is exactly what I use to overcome this issue. The first few stages will always produce a production ready image, so that this can easily be reproduced and manually tested in a local environment. But then a final stage will install any dev-tools or settings required for local development. That way, during a full local dev build, it will always produce the production image first as part of that build, to ensure that no local changes have caused a potential break in the production build.

  9. Hello,

    thank you for this in-depth article. I’m a bit confused about the 4th and 5th points. Aren’t they cancelling each other out?
    I mean, If you say not to mix dev and prod images, then how can I not use different images for different environments?

    1. It is already explained in another comment thread on the article. No they don’t cancel each other out

      First category of images: dev = what developers use, what CI/CD pipelines use.
      Second category of images: qa/prod/staging/load testing = what is deployed in environments

      Anti-pattern 4 says that images between first and second category should be different
      Anti-pattern 5 says that images WITHIN the second category only should be the same

      I hope it is clear.

  10. Hi,

    Reading about anti-pattern #3, it is written that the offending (non-idempotent) line is the one which initializes the DB.

    But what about the line before, with apt-get update (which I presume is similar to “yum update”)? This could lead to a different set of packages depending on the date the container is built.

    This non-idempotent behavior could also occur when installing a new package for which existing dependencies might be automatically updated.

    Is updating packages a safe exception to the idempotent rule?
    I seem to recall having seen many examples of Dockerfiles that just start with “yum -y update”…

    Regards

    1. Hello. The big difference with apt-get update and yum update is that they only affect the container itself and nothing else. You can build the container as many times as you want and nothing gets affected in an external system. The command shown in the example that inits the DB is bad because it affects something OUTSIDE the container. I hope that the distinction is clear.

      The topic of reproducible builds is of course important, but I think it deserves another discussion.

      I might rephrase the title of the anti-pattern to say “creating dockerfiles that have EXTERNAL side effects”.

  11. “trying to consolidate local development with production deployments is a losing battle”

    So is it ok to have a completely independent docker file for development and deployment?

    1. Yes exactly. In the development dockerfile you need Git, linters, compilers, testing frameworks etc. None of this should be in the deployment dockerfile.

      Some companies have a single Dockerfile that uses multi-stage builds where different targets have different tools. But if you are just starting out, having 2 Dockerfiles is just fine.

      If you search github you will find several projects that use “dockerfile” for deployment and “dockerfile.dev” for development

  12. Thanks for the great article!

    Regarding anti-pattern #10, from my experience, we usually run a build on the CI pipeline and publish the artifact to an Artifactory repository. Then the docker build would just pick up the artifact (a jar file for example) from Artifactory, so the Dockerfile would look simple. I doubt using docker to do everything is a good approach, due to the fact that a CI build process can become complex and can take a lot of resources (where you would need a more powerful build agent). In addition, would you have to deal with snapshot builds and release builds? How would you manage that?

    1. This all comes back to anti-pattern 1 – not rethinking your whole workflow with containers. I know that several Java shops already have a legacy workflow and when containers appeared they just started packing jars/wars (which is anti-pattern 10). This is the solution of least resistance but not the most optimal one.

      Regarding your questions
      1) There is nothing that is stopping you from installing Docker on build agents and doing docker builds there. This is actually the recommended approach. Installing multiple java versions on a single build agent is a non-trivial task. Installing docker once and then testing ANY java version is very easy. It is even more important if you have other frameworks (e.g. nodejs for the frontend) as your build agent can handle this as well with zero modifications.

      2) There is no need to have an artifact repository for intermediate storage. Just use a docker registry, which was created exactly for this purpose. Right now you are paying for traffic to put the jar in the repo, then more traffic to download it, and even more to push the final container. Creating a container in the first place is much easier in both complexity and resources. You can keep the artifact repo for individual shared libraries, but for the final deployable artifact docker registries are the way to go

      3) For snapshots you can simply name a docker image my-app:1.x-SNAPSHOT and the final thing my-app:1.2. It doesn’t get any simpler than this

      Let me know if I missed something in your questions.

      Also check https://codefresh.io/blog/using-docker-maven-maven-docker/ and https://codefresh.io/blog/create-docker-images-for-java/
