Codefresh Runner installation

Run Codefresh pipelines on your private Kubernetes cluster

Install the Codefresh Runner on your Kubernetes cluster to run pipelines and access secure internal services without compromising on-premises security requirements. These pipelines run on your infrastructure, even behind the firewall, and keep code on your Kubernetes cluster secure.

As the Codefresh Runner is not dependent on any special dockershim features, any compliant container runtime is acceptable. The docker socket/daemon used by Codefresh pipelines is NOT the one on the host node (as it might not exist at all in the case of containerd or cri-o), but instead an internal docker daemon created/managed by the pipeline itself.

IMPORTANT:
Using spot instances can cause failures in Codefresh builds as they can be taken down without notice. If you require 100% availability, we do not recommend using spot instances.

System requirements

| Item | Requirement |
| --- | --- |
| Kubernetes cluster | Server version 1.10 to 1.24. Tip: To check the server version, run: kubectl version --short |
| Node requirements | Disk space: 50 GB per node |
| Container runtime | Any compliant container runtime, as the runner is not dependent on any special dockershim features. Examples: containerd, cri-o |
| CLI token | Codefresh CLI token |
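
The supported version range can also be checked in a script. The following is a minimal sketch, assuming the server version has already been extracted as a string such as 1.22 or v1.22.3; the function name k8s_version_supported is hypothetical and not part of any Codefresh tooling.

```shell
# Hypothetical helper: succeeds if a Kubernetes server version such as
# "1.22" or "v1.22.3" falls within the supported 1.10 to 1.24 range.
k8s_version_supported() {
  ver=${1#v}                 # strip an optional leading "v"
  major=${ver%%.*}           # text before the first dot
  rest=${ver#*.}             # text after the first dot
  minor=${rest%%[!0-9]*}     # leading digits of the minor version
  [ "$major" = "1" ] && [ -n "$minor" ] && [ "$minor" -ge 10 ] && [ "$minor" -le 24 ]
}

if k8s_version_supported "v1.22.3"; then
  echo "supported"
else
  echo "not supported"
fi
```

Pass the version reported for the server, not the client, since the requirement applies to the cluster.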

Codefresh Runner installation

Install the Runner from any workstation or laptop with access to the Kubernetes cluster running Codefresh builds, via kubectl. The Codefresh Runner authenticates to your Codefresh account using the CLI token.

Notes:
You must install the Codefresh Runner on every cluster that runs Codefresh pipelines.
The Runner is not needed in clusters used for deployment, as you can deploy applications on clusters without the Runner.

Access to the Codefresh CLI is only needed when installing the Codefresh Runner. After installation, the Runner authenticates on its own using the details provided. You don’t need to install the Codefresh CLI on the cluster running Codefresh pipelines.

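Use any of the following options to install the Codefresh Runner:

  • Install Runner with CLI Wizard
  • Install Codefresh Runner with values file
  • Install Codefresh Runner with Helm

If the Kubernetes cluster with the Codefresh Runner is behind a proxy server, continue with Complete Codefresh Runner installation after you install the Runner.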

Install Runner with CLI Wizard

During installation, you can see which API token will be used by the Runner (if you don’t provide one). The printed token includes the permissions used by the Runner to communicate with the Codefresh platform and run pipelines. If you save the token, you can use it to restore the Runner’s permissions after deleting the deployment, without reinstalling the Codefresh Runner.

Only a Codefresh account administrator can install the Codefresh Runner.

Before you begin
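Make sure you have a Kubernetes cluster that meets the system requirements, and a Codefresh CLI token with the required scopes.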

How to

  1. Install the Codefresh CLI:
    npm install -g codefresh
    
  2. Authenticate the Codefresh CLI:
    codefresh auth create-context --api-key {API_KEY}  
    

    where:
    {API_KEY} is the API key you generated from User Settings.

  3. Start the installation:
    codefresh runner init --token <my-token> [--dry-run]
    

    where:

    • <my-token> is required, and is the CLI token you created with the required scopes.
    • --dry-run is optional. When specified, after you answer the configuration prompts, the installer does the following:
      • Saves all the Kubernetes manifests used by the installer locally in the folder ./codefresh_manifests.
      • Installs the Agent and YAML file describing the runtime.
  4. Reply to the prompts as needed:

Codefresh Runner wizard

The Wizard also creates and runs a sample pipeline that you can see in your Codefresh UI.

Codefresh Runner example pipeline

You have completed installing the Codefresh Runner with CLI Wizard.

  1. Optional. If the Kubernetes cluster with the Codefresh Runner is behind a proxy, continue with Complete Codefresh Runner installation.
  2. Optional. Verify your installation:
    codefresh runner info


Tip:
You can customize the installation by passing your own values in the init command.
To inspect all available options run init with the --help flag:

codefresh runner init --help


Install Codefresh Runner with values file

Use this example as a starting point for your values file.

  1. To install the Codefresh Runner with a predefined values file, add the --values flag, followed by the name of the YAML file:
    codefresh runner init --values values.yaml 
    
  2. Optional. If the Kubernetes cluster with the Codefresh Runner is behind a proxy, continue with Complete Codefresh Runner installation.


Install Codefresh Runner with Helm

Installing the Codefresh Runner with Helm requires you to first create a generated_values.yaml file, and pass the file as part of the Helm installation.

You must create a generated_values.yaml file for every installation of the Codefresh Runner.

Before you begin

How to

  1. Run the following command to create all the necessary entities in Codefresh:

     codefresh runner init --generate-helm-values-file --skip-cluster-integration true
    

    where:

    • --skip-cluster-integration is optional, and when set to true (the default), does not create a cluster integration in Codefresh.

    The command:

    • Creates the Runner Agent and the Runtime Environment in your Codefresh account.
    • Creates a generated_values.yaml file in your current directory, which you will need to provide to the helm install command later.
  2. Install the Codefresh Runner:

     helm repo add cf-runtime https://chartmuseum.codefresh.io/cf-runtime
        
     helm install cf-runtime cf-runtime/cf-runtime -f ./generated_values.yaml --create-namespace --namespace codefresh
    
  3. Optional. If the Kubernetes cluster with the Codefresh Runner is behind a proxy, continue with Complete Codefresh Runner installation.

For reference, have a look at the repository with the chart: https://github.com/codefresh-io/venona/tree/release-1.0/.deploy/cf-runtime.

To verify the installation, run a test pipeline:

codefresh runner execute-test-pipeline --runtime-name <runtime-name>

Note:
The runner init command determines the configuration of the engine and dind components.
The helm install command controls the configuration of only the runner, dind-volume-provisioner and lv-monitor components.


Complete Codefresh Runner installation

If the Kubernetes cluster with the Codefresh Runner is behind a proxy server without direct access to g.codefresh.io, follow the additional steps to complete the installation.

Before you begin
Make sure you have installed the Codefresh Runner using any of the installation options.

How to

  • Run kubectl edit deployment runner -n codefresh-runtime and add the proxy variables:
    spec:
      containers:
      - env:
        - name: HTTP_PROXY
          value: http://<ip of proxy server>:port
        - name: HTTPS_PROXY
          value: http://<ip of proxy server>:port
        - name: http_proxy
          value: http://<ip of proxy server>:port
        - name: https_proxy
          value: http://<ip of proxy server>:port
        - name: no_proxy
          value: localhost,127.0.0.1,<local_ip_of_machine>
        - name: NO_PROXY
          value: localhost,127.0.0.1,<local_ip_of_machine>
  • Add the following variables to your runtime.yaml, both to the runtimeScheduler: and to the dockerDaemonScheduler: blocks, within the envVars: section:
    HTTP_PROXY: http://<ip of proxy server>:port
    http_proxy: http://<ip of proxy server>:port
    HTTPS_PROXY: http://<ip of proxy server>:port
    https_proxy: http://<ip of proxy server>:port
    no_proxy: localhost, 127.0.0.1, <local_ip_of_machine>
    NO_PROXY: localhost, 127.0.0.1, <local_ip_of_machine>
    
  • Add .firebaseio.com to the allowed-sites of the proxy server.
  • Exec into the dind pod, and run ifconfig.
  • If the MTU value for docker0 is greater than or equal to the MTU value of eth0 (sometimes the docker0 MTU is 1500, while eth0 MTU is 1440), change the docker0 MTU value to be lower than the eth0 MTU.
    • To change the docker0 MTU value, edit the configmap in the codefresh-runtime namespace:
      kubectl edit cm codefresh-dind-config -n codefresh-runtime
      
    • Add the string below after one of the commas: \"mtu\":1440,
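
The MTU comparison in the steps above can be expressed as a small check. This is a sketch only: the function name is hypothetical, and it assumes you have already read the two MTU values, for example from the ifconfig output inside the dind pod.

```shell
# Hypothetical helper: succeeds when the docker0 MTU must be lowered,
# that is, when it is greater than or equal to the eth0 MTU.
docker0_mtu_needs_lowering() {
  docker0_mtu=$1
  eth0_mtu=$2
  [ "$docker0_mtu" -ge "$eth0_mtu" ]
}

# Example with the values mentioned above (docker0 = 1500, eth0 = 1440).
if docker0_mtu_needs_lowering 1500 1440; then
  echo "lower the docker0 MTU below 1440"
fi
```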

Post-installation configuration

After installation, configure the Kubernetes cluster with the Codefresh Runner to better match your environment and cloud provider.

AWS backend volume configuration

For Codefresh Runners on EKS or any other custom cluster in Amazon, such as one created with kops, configure the Runner to work with EBS volumes to support caching during pipeline execution.

The configuration assumes that you have installed the Runner with the default options: codefresh runner init


dind-volume-provisioner permissions

The dind-volume-provisioner deployment should have permissions to create/attach/detach/delete/get EBS volumes.

There are three options for this:

  1. Run the dind-volume-provisioner pod on a node or node group with an IAM role.
  2. Provide AWS credentials from a K8s secret, either mounted in AWS credential format at ~/.aws/credentials, or passed as the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables to the dind-volume-provisioner pod.
  3. Use an AWS IAM role for service accounts, assigned to the volume-provisioner-runner service account.

Minimal policy for dind-volume-provisioner

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:AttachVolume",
        "ec2:CreateSnapshot",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:DeleteSnapshot",
        "ec2:DeleteTags",
        "ec2:DeleteVolume",
        "ec2:DescribeInstances",
        "ec2:DescribeSnapshots",
        "ec2:DescribeTags",
        "ec2:DescribeVolumes",
        "ec2:DetachVolume"
      ],
      "Resource": "*"
    }
  ]
}

Configuration

Step 1: Create Storage Class for EBS volumes:

Choose one of the Availability Zones (AZs) to be used for your pipeline builds. Multi-AZ configuration is not supported.

  • Storage Class (gp2)
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: dind-ebs
### Specify name of provisioner
provisioner: codefresh.io/dind-volume-provisioner-runner-<-NAMESPACE-> # <---- rename <-NAMESPACE-> with the runner namespace
volumeBindingMode: Immediate
parameters:
  # ebs or ebs-csi
  volumeBackend: ebs 
  # Valid zone
  AvailabilityZone: us-east-1d # <---- change it to your AZ
  #  gp2, gp3 or io1
  VolumeType: gp2
  # in case of io1 you can set iops
  # iops: 1000
  # ext4 or xfs (default to xfs, ensure that there is xfstools )
  fsType: xfs
  • Storage Class (gp3)
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: dind-ebs
### Specify name of provisioner
provisioner: codefresh.io/dind-volume-provisioner-runner-<-NAMESPACE-> # <---- rename <-NAMESPACE-> with the runner namespace
volumeBindingMode: Immediate
parameters:
  # ebs or ebs-csi
  volumeBackend: ebs
  # Valid zone
  AvailabilityZone: us-east-1d  # <---- change it to your AZ
  #  gp2, gp3 or io1
  VolumeType: gp3
  # ext4 or xfs (default to xfs, ensure that there is xfstools )
  fsType: xfs
  # I/O operations per second. Only effective when gp3 volume type is specified.
  # Default value - 3000.
  # Max - 16,000
  iops: "5000"
  # Throughput in MiB/s. Only effective when gp3 volume type is specified.
  # Default value - 125.
  # Max - 1000.
  throughput: "500"

Step 2: Apply storage class manifest:

kubectl apply -f dind-ebs.yaml
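
If you maintain several Runner namespaces or zones, the manifest from Step 1 can be generated instead of edited by hand. The following is a minimal sketch: the function name is hypothetical, the namespace and AZ passed in the example call are illustrative values, and the optional iops and throughput parameters are left out.

```shell
# Hypothetical helper: prints a StorageClass manifest for the given
# Runner namespace ($1), Availability Zone ($2), and EBS volume type ($3).
render_dind_ebs_storage_class() {
  cat <<EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: dind-ebs
provisioner: codefresh.io/dind-volume-provisioner-runner-$1
volumeBindingMode: Immediate
parameters:
  volumeBackend: ebs
  AvailabilityZone: $2
  VolumeType: $3
  fsType: xfs
EOF
}

# Print the manifest for an example namespace and AZ.
render_dind_ebs_storage_class codefresh us-east-1d gp3
```

Redirect the output to dind-ebs.yaml to produce the file that Step 2 applies.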

Step 3: Get the YAML representation of the runtime you just added:

  • Get a list of all available runtimes:
    codefresh get runtime-environments
    
  • Select the runtime you just added, and get its YAML representation:
    codefresh get runtime-environments my-eks-cluster/codefresh -o yaml > runtime.yaml
    

Step 4: Modify the YAML:

  • In dockerDaemonScheduler.cluster, add nodeSelector: topology.kubernetes.io/zone: <your_az_here>.
    Note: Make sure you define the same AZ you selected for the Storage Class.
  • Modify pvcs.dind to use the Storage Class you created above (dind-ebs).

Here is an example of the runtime.yaml including the required updates:

version: 1
metadata:
  ...
runtimeScheduler:
  cluster:
    clusterProvider:
      accountId: 5f048d85eb107d52b16c53ea
      selector: my-eks-cluster
    namespace: codefresh
    serviceAccount: codefresh-engine
  annotations: {}
dockerDaemonScheduler:
  cluster:
    clusterProvider:
      accountId: 5f048d85eb107d52b16c53ea
      selector: my-eks-cluster
    namespace: codefresh
    nodeSelector:
      topology.kubernetes.io/zone: us-east-1d
    serviceAccount: codefresh-engine
  annotations: {}
  userAccess: true
  defaultDindResources:
    requests: ''
  pvcs:
    dind:
      volumeSize: 30Gi
      storageClassName: dind-ebs
      reuseVolumeSelector: 'codefresh-app,io.codefresh.accountName'
extends:
  - system/default/hybrid/k8s_low_limits
description: '...'
accountId: 5f048d85eb107d52b16c53ea

Step 5: Update your runtime environment with the patch command:

codefresh patch runtime-environment my-eks-cluster/codefresh -f runtime.yaml

Step 6: If necessary, delete all existing PV (Persistent Volume) and PVC (Persistent Volume Claim) objects that remain from the default local provisioner:

kubectl delete pvc -l codefresh-app=dind -n <your_runner_ns>
kubectl delete pv -l codefresh-app=dind -n <your_runner_ns>

Step 7: Restart the volume provisioner pod.

Values YAML for configuration

For a clean Runner installation, you can define all of the options above in a values.yaml file:

values-ebs.yaml example:

### Storage parameter example for aws ebs disks
Storage:
  Backend: ebs
  AvailabilityZone: us-east-1d
  VolumeType: gp3
  #AwsAccessKeyId: ABCDF
  #AwsSecretAccessKey: ZYXWV
  Encrypted:  # encrypt volume, default is false
  VolumeProvisioner: 
    ServiceAccount:
      Annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_NAME>
NodeSelector: topology.kubernetes.io/zone=us-east-1d
...
Runtime:
  NodeSelector: # dind and engine pods node-selector (--build-node-selector)
    topology.kubernetes.io/zone: us-east-1d

Then install the Runner with the values file:

codefresh runner init --values values-ebs.yaml --exec-demo-pipeline false --skip-cluster-integration true

GKE (Google Kubernetes Engine) backend volume configuration

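GKE volume configuration includes:

  • Local SSD storage configuration
  • GCE disk storage configuration
  • Using multiple Availability Zones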


Local SSD storage configuration

Configure the Codefresh Runner to use local SSDs for your pipeline volumes:

How-to: Configuring an existing Runtime Environment with Local SSDs (GKE only)


GCE disk storage configuration

Prerequisites
The dind-volume-provisioner should have permissions to create/delete/get GCE disks.

There are three options to provide cloud credentials:

  1. Run dind-volume-provisioner-runner pod on a node with an IAM role which can create/delete/get GCE disks
  2. Create a Google Service Account with the ComputeEngine.StorageAdmin role, download its key in JSON format, and pass it to codefresh runner init with --set-file=Storage.GoogleServiceAccount=/path/to/google-service-account.json
  3. Use Google Workload Identity to assign IAM role to volume-provisioner-runner service account

Notice that builds run in a single Availability Zone (AZ), so you must specify Availability Zone parameters.

Configuration
How-to: Configuring an existing Runtime Environment with GCE disks


Using multiple Availability Zones

Currently, to support effective caching with GCE disks, the builds/pods need to be scheduled in a single AZ. Note that this is more related to a GCP limitation than a Codefresh Runner issue.

To use the Codefresh Runner on Kubernetes nodes running in multiple Availability Zones, check out our suggestions:

Provision a new Kubernetes cluster that runs in a single AZ
This is the preferred solution and avoids extra complexity. The cluster should be dedicated for usage with the Codefresh Runner.


Install Codefresh Runner in your multi-zone cluster and run it in the default Node Pool

You must specify:
--build-node-selector=<node-az-label> (e.g.: --build-node-selector=topology.kubernetes.io/zone=us-central1-c)
OR
Do the following:

  1. Get the YAML of the runtime environment:
    codefresh get re $RUNTIME_NAME -o yaml > re.yaml
    
  2. Edit the YAML:
    version: 2
    metadata:
      ...
    runtimeScheduler:
      cluster:
        nodeSelector: #schedule engine pod onto a node whose labels match the nodeSelector
          topology.kubernetes.io/zone: us-central1-c
        ...
    dockerDaemonScheduler:
      cluster:
        nodeSelector: #schedule dind pod onto a node whose labels match the nodeSelector
          topology.kubernetes.io/zone: us-central1-c
        ...
      pvcs:
        dind:
          ...
    
  3. Apply changes with:
    codefresh patch re -f re.yaml
    


Install Codefresh Runner in your multi-zone cluster and run it with a dedicated Node Pool
Follow the instructions for the default Node Pool.

Install a Codefresh Runner for every Availability Zone
Install separate Codefresh Runners in each Availability Zone, one for AZ A, and the other for AZ B, for example.
This is technically viable, but to distribute the builds across the Codefresh Runner runtime environments, you must manually specify the runtime environment for the pipelines that don’t use the default runtime environment.

For example, if Venona-zoneA is the default runtime environment, then for pipelines to run in Venona-zoneB, modify their RE settings and explicitly set Venona-zoneB as the one to use.

The Codefresh Runner does not currently support Regional Persistent Disks.

Configure internal registry mirror

You can configure your Codefresh Runner to use an internal registry as a mirror for any container images that are specified in your pipelines.

  1. Set up an internal registry as described in https://docs.docker.com/registry/recipes/mirror/.
  2. Locate the codefresh-dind-config config map in the namespace that houses the Runner.
    kubectl -n codefresh edit configmap codefresh-dind-config
    
  3. To define the single registry to use as a mirror, add the line \ \"registry-mirrors\": [ \"https://<my-docker-mirror-host>\" ], \n to data, after the tlskey:
    data:
      daemon.json: "{\n  \"hosts\": [ \"unix:///var/run/docker.sock\",\n             \"tcp://0.0.0.0:1300\"],\n
     \ \"storage-driver\": \"overlay2\",\n  \"tlsverify\": true,  \n  \"tls\": true,\n
     \ \"tlscacert\": \"/etc/ssl/cf-client/ca.pem\",\n  \"tlscert\": \"/etc/ssl/cf/server-cert.pem\",\n
     \ \"tlskey\": \"/etc/ssl/cf/server-key.pem\",\n  \"insecure-registries\" : [\"192.168.99.100:5000\"],\n
     \ \"registry-mirrors\": [ \"https://<my-docker-mirror-host>\" ], \n
     \ \"metrics-addr\" : \"0.0.0.0:9323\",\n  \"experimental\" : true\n}\n"
    
  4. Save and quit by typing :wq.

Now any container image that is used in your pipeline and isn’t fully qualified is pulled through the Docker registry that is configured as a mirror.

Add custom labels to dind and engine pods

Add custom labels to your Engine and Dind pods in Runtime Environment (RE) by patching it.

  1. Get the configuration of the RE and place it in a file named runtime.yaml.
    codefresh get runtime-environments -o yaml $RUNTIME_ENVIRONMENT > runtime.yaml
    where:
    $RUNTIME_ENVIRONMENT must be replaced with the name of your RE.
  2. Edit the dockerDaemonScheduler.labels or runtimeScheduler.labels property of runtime.yaml to include the label, as in the example below.
    If the dockerDaemonScheduler.labels are not included in the RE configuration by default, add them.
    version: 1
    metadata:
      [...]
    runtimeScheduler:
      labels:
        my-custom-ENGINE-label: "true"
      cluster:
        [...]
    dockerDaemonScheduler:
      cluster:
        [...]
      annotations: {}
      labels:
        my-custom-DIND-label: "true"
    [...]
    
  3. Patch the runtime environment:
    codefresh patch re $RUNTIME_ENVIRONMENT -f runtime.yaml
    where:
    $RUNTIME_ENVIRONMENT must be replaced with the name of your RE.

Once you have applied the patch, future builds will include the custom labels.
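
To confirm that a label reached the pods, you can filter by it while a build is running. This is a sketch with a hypothetical function name; the namespace and label in the usage comment are example values taken from the snippets above.

```shell
# Hypothetical helper: lists the pods in a namespace ($1) that carry a
# given label ($2), and shows all of their labels.
pods_with_label() {
  kubectl get pods -n "$1" -l "$2" --show-labels
}

# Usage (run while a build is in progress, since dind and engine pods
# exist only for the duration of a build):
#   pods_with_label codefresh my-custom-DIND-label=true
```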

View Codefresh Runner and runtime environments

Once installed, the Runner polls Codefresh every three seconds by default to automatically create all resources needed for running pipelines.
To see the cluster with the Runner:

  • In the Codefresh UI, click the Settings icon on the toolbar.
  • From the sidebar, select Pipeline Runtimes, and then click the Codefresh Runners tab.

Available runtime environments

Select a default runtime environment

If you have multiple runtime environments, select the one to use as the default environment for all the pipelines in the account.

  • In the Codefresh UI, click the Settings icon on the toolbar.
  • From the sidebar, select Pipeline Runtimes.
  • From the list of Pipeline Runtimes, select the row with the runtime to set as the default.
  • Click the context menu on the right, and select Set as Default.

Override default runtime environment for a pipeline

Override the default runtime environment for a specific pipeline through the pipeline’s Build Runtime settings.

Running a pipeline on a specific environment

Runner components and resources

Once installed, the Codefresh Runner is similar to any Kubernetes application, and you can monitor it using your existing tools. Among the Runner components, only the runner pod persists within your cluster. Other components, such as the engine, exist for the duration of pipeline builds.

To monitor the Runner, list the resources inside the namespace you chose during installation:

$ kubectl get pods -n codefresh-runtime
NAME                                              READY   STATUS    RESTARTS   AGE
dind-5ee7577017ef40908b784388                     1/1     Running   0          22s
dind-lv-monitor-runner-hn64g                      1/1     Running   0          3d
dind-lv-monitor-runner-pj84r                      1/1     Running   0          3d
dind-lv-monitor-runner-v2lhc                      1/1     Running   0          3d
dind-volume-provisioner-runner-64994bbb84-lgg7v   1/1     Running   0          3d
engine-5ee7577017ef40908b784388                   1/1     Running   0          22s
monitor-648b4778bd-tvzcr                          1/1     Running   0          3d
runner-5d549f8bc5-7h5rc                           1/1     Running   0          3d

You can also list secrets, config-maps, logs, volumes, etc. for the Codefresh builds.

The Runner uses the following pods:

  • runner: Picks tasks (builds) from the Codefresh API
  • engine: Runs pipelines
  • dind: Builds and uses Docker images
  • dind-volume-provisioner: Provisions volumes (PVs) for dind
  • dind-lv-monitor: Cleans local volumes

CPU/Memory

The following table shows the minimum resources for each Runner component:

| Component | CPU requests | RAM requests | Storage | Type | Always on |
| --- | --- | --- | --- | --- | --- |
| runner | 100m | 100Mi | Doesn’t need PV | Deployment | Yes |
| engine | 100m | 500Mi | Doesn’t need PV | Pod | No |
| dind | 400m | 800Mi | 16GB PV | Pod | No |
| dind-volume-provisioner | 300m | 400Mi | Doesn’t need PV | Deployment | Yes |
| dind-lv-monitor | 300m | 400Mi | Doesn’t need PV | DaemonSet | Yes |

NOTES:
Components that are always on consume resources all the time. Components that are not always on, only consume resources when pipelines are running. They are automatically both created and destroyed for each pipeline.
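
As a rough sizing aid, the minimum requests in the table can be added up for a given load. The sketch below uses a hypothetical function name and assumes one runner, one dind-volume-provisioner, one dind-lv-monitor per node (local volume backend), and one engine plus one dind pod per concurrent build. Real pipelines usually need more than these minimums.

```shell
# Hypothetical helper: prints the total minimum CPU (millicores) and RAM (Mi)
# requests for a number of concurrent builds ($1) and cluster nodes ($2).
estimate_runner_requests() {
  builds=$1
  nodes=$2
  # Always on: runner + dind-volume-provisioner, plus one dind-lv-monitor per node.
  cpu=$((100 + 300 + 300 * nodes))
  ram=$((100 + 400 + 400 * nodes))
  # Per running build: one engine pod and one dind pod.
  cpu=$((cpu + (100 + 400) * builds))
  ram=$((ram + (500 + 800) * builds))
  echo "${cpu}m ${ram}Mi"
}

# Four concurrent builds on a three-node cluster.
estimate_runner_requests 4 3   # prints "3300m 6900Mi"
```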

Node size and count depend entirely on how many pipelines you want to be “ready” for, and how many will use “burst” capacity:

  • Ready (nodes): Lower initialization time and faster build times.
  • Burst (nodes): High initialization time and slower build times (not recommended).

The size of your nodes directly relates to the size required for your pipelines and is thus dynamic. If you find that only a few large pipelines require larger nodes, you may want to have two Codefresh Runners associated with different node pools.

Storage

For the storage options needed by the dind pod, we suggest:

  • Local Volumes /var/lib/codefresh/dind-volumes on the K8S nodes filesystem (default)
  • EBS in the case of AWS. See also the notes about getting caching working.
  • Local SSD or GCE Disks in the case of GCP. See notes about configuration.

Networking Requirements

  • dind: Pod creates an internal network in the cluster to run all the pipeline steps; needs outgoing/egress access to Docker Hub and quay.io.
  • runner: Pod needs outgoing/egress access to g.codefresh.io; needs network access to app-proxy if installed.
  • engine: Pod needs outgoing/egress access to g.codefresh.io, *.firebaseio.com and quay.io; needs network access to dind pod

All CNI providers/plugins are compatible with the runner components.

Monitoring disk space in Codefresh Runner

Codefresh pipelines require disk space for:

Codefresh offers two options to manage disk space and prevent out-of-space errors:

To improve performance by using Docker cache and decreasing I/O rate, volume-provisioner can provision previously used disks with Docker images and pipeline volumes from previously run builds.

Types of runtime cleaners

Docker images and volumes must be cleaned on a regular basis.


IN-DIND cleaner

Purpose: Removes unneeded docker containers, images, volumes inside Kubernetes volume mounted on the dind pod

Where it runs: Inside each dind pod as script

Triggered by: SIGTERM, and also during the run when disk usage (cleaner-agent) > 90% (configurable)

Configured by: Environment Variables which can be set in Runtime Environment configuration

Configuration/Logic: README.md

Override dockerDaemonScheduler.envVars on Runtime Environment if necessary (the following are defaults):

dockerDaemonScheduler:
  envVars:
    CLEAN_PERIOD_SECONDS: '21600' # launch clean if last clean was more than CLEAN_PERIOD_SECONDS seconds ago
    CLEAN_PERIOD_BUILDS: '5' # launch clean if more than CLEAN_PERIOD_BUILDS builds have run since the last clean
    IMAGE_RETAIN_PERIOD: '14400' # do not delete docker images if they have events since current_timestamp - IMAGE_RETAIN_PERIOD
    VOLUMES_RETAIN_PERIOD: '14400' # do not delete docker volumes if they have events since current_timestamp - VOLUMES_RETAIN_PERIOD
    DISK_USAGE_THRESHOLD: '0.8' # launch clean based on current disk usage DISK_USAGE_THRESHOLD
    INODES_USAGE_THRESHOLD: '0.8' # launch clean based on current inodes usage INODES_USAGE_THRESHOLD
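
The time-based rule in the comments above can be illustrated with a few lines of shell. This is not the cleaner's actual code, only a sketch of the documented CLEAN_PERIOD_SECONDS behavior; the function name and the epoch-second arguments are assumptions.

```shell
# Hypothetical illustration of the CLEAN_PERIOD_SECONDS rule: succeeds when
# the last clean happened more than CLEAN_PERIOD_SECONDS seconds ago.
clean_is_due() {
  last_clean=$1        # epoch seconds of the last clean
  now=$2               # current epoch seconds
  period=${3:-21600}   # defaults to the documented 21600 seconds (6 hours)
  [ $((now - last_clean)) -gt "$period" ]
}

# 7 hours (25200 seconds) since the last clean exceeds the 6-hour default.
if clean_is_due 0 25200; then
  echo "clean due"
fi
```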


External volume cleaner

Purpose: Removes unused kubernetes volumes and related backend volumes

Where it runs: On Runtime Cluster as CronJob (kubectl get cronjobs -n codefresh -l app=dind-volume-cleanup). Installed in case the Runner uses non-local volumes (Storage.Backend != local)

Triggered by: CronJob every 10min (configurable), part of runtime-cluster-monitor and runner deployment

Configuration:

Set codefresh.io/volume-retention annotation on Runtime Environment:

dockerDaemonScheduler:
  pvcs:
    dind:
      storageClassName: dind-ebs-volumes-runner-codefresh
      reuseVolumeSelector: 'codefresh-app,io.codefresh.accountName,pipeline_id'
      volumeSize: 32Gi
      annotations:
        codefresh.io/volume-retention: 7d

Override environment variables for dind-volume-cleanup cronjob if necessary:

  • RETENTION_DAYS (defaults to 4)
  • MOUNT_MIN (defaults to 3)
  • PROVISIONED_BY (defaults to codefresh.io/dind-volume-provisioner)

About optional -m argument:

  • dind-volume-cleanup to clean volumes that were last used more than RETENTION_DAYS ago
  • dind-volume-cleanup-m to clean volumes that were used more than a day ago, but mounted less than MOUNT_MIN times


Local volume cleaner

Purpose: Deletes local volumes when node disk space is close to the threshold

Where it runs: On each node on runtime cluster as DaemonSet dind-lv-monitor. Installed in case the Runner uses local volumes (Storage.Backend == local)

Triggered by: Disk space usage or node usage that exceeds thresholds (configurable)

Configuration:

Override environment variables for dind-lv-monitor daemonset if necessary:

  • VOLUME_PARENT_DIR - default /var/lib/codefresh/dind-volumes
  • KB_USAGE_THRESHOLD - default 80 (percentage)
  • INODE_USAGE_THRESHOLD - default 80
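
To see how close a node is to the disk threshold, you can compare its current usage with KB_USAGE_THRESHOLD. The sketch below is not the dind-lv-monitor implementation; the function names are hypothetical, and the example checks the root filesystem so that it runs anywhere.

```shell
# Hypothetical helper: prints the disk usage percentage (without "%")
# of the filesystem that holds the given directory.
disk_usage_percent() {
  df -P "$1" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Hypothetical helper: succeeds when usage of the directory's filesystem ($1)
# exceeds the threshold percentage ($2).
usage_exceeds_threshold() {
  [ "$(disk_usage_percent "$1")" -gt "$2" ]
}

threshold=${KB_USAGE_THRESHOLD:-80}
if usage_exceeds_threshold / "$threshold"; then
  echo "disk usage above ${threshold}%"
else
  echo "disk usage within ${threshold}%"
fi
```

On a Runner node, pass the VOLUME_PARENT_DIR path instead of / to check the filesystem that holds the dind volumes.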

Codefresh Runner architecture

Codefresh Runner architecture overview
  1. The Runtime Environment specification defines the engine and dind pod specs and the PVC parameters.
  2. The Runner pod (agent) pulls tasks (builds) from the Codefresh API every 3 seconds.
  3. Once the agent receives a build task (either a manually run build or a webhook-triggered build), it calls the K8s API to create the engine/dind pods and the PVC object.
  4. The Volume Provisioner listens for PVC create events and, based on the StorageClass definition, creates a PV object with the corresponding underlying volume backend (ebs/gcedisk/local).
  5. During the build, each step (clone/build/push/freestyle/composition) is represented as a Docker container inside the dind (docker-in-docker) pod. The shared volume (/codefresh/volume) is represented as a Docker volume and mounted to every step (Docker container). The PV mount point inside the dind pod is /var/lib/docker.
  6. The engine pod controls the dind pod. It deserializes the pipeline YAML to Docker API calls, and terminates dind after the build has finished or on user request (SIGTERM).
  7. The dind-lv-monitor DaemonSet or the dind-volume-cleanup CronJob is part of the runtime cleaners. The app-proxy Deployment and Ingress are described in the App-Proxy installation. The monitor Deployment is for the Kubernetes Dashboard.

Customized Codefresh Runner installations

App-Proxy installation

The App-Proxy is an optional component of the Runner, used mainly when the Git provider server is installed on-premises, behind the firewall.

App-Proxy requirements

App-Proxy requires a Kubernetes cluster:

  1. With the Codefresh Runner installed
  2. With an active ingress controller. The ingress controller must allow incoming connections from the VPC/VPN where users are browsing the Codefresh UI.
    The ingress connection must have a hostname assigned for this route, and must be configured to perform SSL termination.

Currently, App-Proxy is supported for both SaaS and on-prem versions of GitHub and GitLab, and Bitbucket Server.

Install App-Proxy

On a Kubernetes cluster with existing Codefresh Runner:

codefresh install app-proxy --host=<hostname-of-ingress>

Install Codefresh Runner and App-Proxy:

codefresh runner init --app-proxy --app-proxy-host=<hostname-of-ingress> 

Define the ingress class for App-Proxy:

If you have multiple ingress controllers in the Kubernetes cluster, use the --app-proxy-ingress-class parameter to define which ingress will be used.
For additional security, to further limit the web browsers that can access the ingress, you can also define an allowlist of IPs/ranges. Check the documentation of your ingress controller for the exact details.

By default, the app-proxy ingress uses the path hostname/app-proxy. You can change that default by using the values file in the installation with the flag --values values.yaml.
See the AppProxy section in the example values.yaml.

codefresh install app-proxy --values values.yaml

App-Proxy architecture

Here is the architecture of the App-Proxy:

How App Proxy and the Codefresh runner work together

The App-Proxy:

  • Enables you to automatically create webhooks for Git in the Codefresh UI (identical to the SaaS experience)
  • Sends commit status information back to your Git provider (identical to the SaaS experience)
  • Makes all Git operations in the GUI work exactly like the SaaS installation of Codefresh

For a Git GET operation, the Codefresh UI communicates with the App-Proxy to route the request to the Git provider. The confidential Git information never leaves the firewall premises, and the connection between the browser and the ingress is SSL/HTTPS.

The App-Proxy has to work over HTTPS, and by default it uses the ingress controller to terminate the SSL. Therefore, the ingress controller must be configured to perform SSL termination. Check the documentation of your ingress controller (for example nginx ingress). This means that the App-Proxy does not compromise security in any way.

Install multiple runtimes with a single Runner (agent)

Advanced users can install a single Codefresh Runner (agent) to manage multiple runtime environments.

Note:
Make sure the cluster on which the Runner (agent) is installed has network access to the other clusters in the runtime environments.

# 1. Create namespace for the agent: 
kubectl create namespace codefresh-agent

# 2. Install the agent in the namespace (give your agent a unique name as $NAME).
# The first command prints a token; note it and pass it as $TOKEN in the second command.
codefresh create agent $NAME
codefresh install agent --token $TOKEN --kube-namespace codefresh-agent
codefresh get agents

# 3. Create namespace for the first runtime:
kubectl create namespace codefresh-runtime-1

# 4. Install the first runtime in the namespace.
# 5. Note the runtime name that the command prints, and use it as $RUNTIME_NAME below.
codefresh install runtime --runtime-kube-namespace codefresh-runtime-1

# 6. Attach the first runtime to the agent ($AGENT_NAME is the agent name from step 2):
codefresh attach runtime --agent-name $AGENT_NAME --agent-kube-namespace codefresh-agent --runtime-name $RUNTIME_NAME --runtime-kube-namespace codefresh-runtime-1

# 7. Restart the runner pod in namespace `codefresh-agent`
kubectl delete pods $RUNNER_POD -n codefresh-agent

# 8. Create namespace for the second runtime
kubectl create namespace codefresh-runtime-2

# 9. Install the second runtime on the namespace
codefresh install runtime --runtime-kube-namespace codefresh-runtime-2

# 10. Attach the second runtime to agent and restart the Venona pod automatically
codefresh attach runtime --agent-name $AGENT_NAME --agent-kube-namespace codefresh-agent --runtime-name $RUNTIME_NAME --runtime-kube-namespace codefresh-runtime-2 --restart-agent
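
The per-runtime steps above (create namespace, install runtime, attach to the agent) repeat for every additional runtime. The following is a minimal dry-run sketch that only prints those commands for a given number of runtimes; nothing is executed. The helper name print_runtime_steps and the codefresh-runtime-N naming pattern are assumptions taken from the example, and $AGENT_NAME and $RUNTIME_NAME stay as placeholders because the runtime name is only known after the install command prints it.

```shell
#!/bin/sh
# Dry run: print the per-runtime commands from the steps above.
# print_runtime_steps is a hypothetical helper; it executes nothing.
print_runtime_steps() {
  count="$1"
  i=1
  while [ "$i" -le "$count" ]; do
    ns="codefresh-runtime-$i"
    echo "kubectl create namespace $ns"
    echo "codefresh install runtime --runtime-kube-namespace $ns"
    # The runtime name is printed by the install command, so it remains a placeholder.
    echo "codefresh attach runtime --agent-name \$AGENT_NAME --agent-kube-namespace codefresh-agent --runtime-name \$RUNTIME_NAME --runtime-kube-namespace $ns --restart-agent"
    i=$((i + 1))
  done
}

print_runtime_steps 2
```

Review the printed commands and run them one at a time, filling in the runtime name after each install.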

Install Codefresh Runner on Google Kubernetes Engine (GKE)

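You can install the Codefresh Runner on a GKE Kubernetes cluster.
Codefresh supports the following GKE configurations:

  • GKE with local SSD
  • GKE with GCE disks and Google SA JSON key
  • GKE with GCE disks with Workload Identity and IAM role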

Common prerequisites

Before you start the installation, verify the following:

  • Make sure your user has the Kubernetes Engine Cluster Admin role in the Google Cloud console.
  • Bind your user to the cluster-admin Kubernetes cluster role:
kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole cluster-admin \
  --user $(gcloud config get-value account)


Install on GKE with local SSD

Prerequisites
GKE cluster with local SSD

How to

  1. Run the CLI Wizard with these options:
    codefresh runner init [options] --set-value=Storage.LocalVolumeParentDir=/mnt/disks/ssd0/codefresh-volumes \
                             --build-node-selector=cloud.google.com/gke-local-ssd=true
    
  2. Based on the installation mode, edit the predefined values-example.yaml values file or the generated Helm values file:
    ...
    ### Storage parameters example for gke-local-ssd
     Storage:
       Backend: local
       LocalVolumeParentDir: /mnt/disks/ssd0/codefresh-volumes
     NodeSelector: cloud.google.com/gke-local-ssd=true
    ...
     Runtime:
       NodeSelector: # dind and engine pods node-selector (--build-node-selector)
         cloud.google.com/gke-local-ssd: "true"
    ...
    
    codefresh runner init [options] --values values-example.yaml
    


Install Codefresh Runner on GKE with GCE disks and Google SA JSON key

With the CLI Wizard:

codefresh runner init [options] \
  --set-value=Storage.Backend=gcedisk \
  --set-value=Storage.AvailabilityZone=us-central1-c \
  --kube-node-selector=topology.kubernetes.io/zone=us-central1-c \
  --build-node-selector=topology.kubernetes.io/zone=us-central1-c \
  --set-file=Storage.GoogleServiceAccount=/path/to/google-service-account.json

With the values-example.yaml values file:

...
### Storage parameter example for GCE disks
 Storage:
   Backend: gcedisk
   AvailabilityZone: us-central1-c
   GoogleServiceAccount: > #serviceAccount.json content
     {
      "type": "service_account",
      "project_id": "...",
      "private_key_id": "...",
      "private_key": "...",
      "client_email": "...",
      "client_id": "...",
      "auth_uri": "...",
      "token_uri": "...",
      "auth_provider_x509_cert_url": "...",
      "client_x509_cert_url": "..."
      }
 NodeSelector: topology.kubernetes.io/zone=us-central1-c
...
 Runtime:
   NodeSelector: # dind and engine pods node-selector (--build-node-selector)
     topology.kubernetes.io/zone: us-central1-c
...
codefresh runner init [options] --values values-example.yaml


Install Codefresh Runner on GKE with GCE disks with Workload Identity and IAM role

With the values-example.yaml values file:

  1. Configure the storage options for GCE disks as in the example below.
    ...
    ### Storage parameter example for GCE disks
     Storage:
       Backend: gcedisk
       AvailabilityZone: us-central1-c
       VolumeProvisioner:
         ServiceAccount:
           Annotations: #annotation to the volume-provisioner service account, using the email address of the Google service account
             iam.gke.io/gcp-service-account: <GSA_NAME>@<PROJECT_ID>.iam.gserviceaccount.com
     NodeSelector: topology.kubernetes.io/zone=us-central1-c
    ...
     Runtime:
       NodeSelector: # dind and engine pods node-selector (--build-node-selector)
         topology.kubernetes.io/zone: us-central1-c
    ...
    
  2. Install the Codefresh Runner with values-example.yaml:
    codefresh runner init [options] --values values-example.yaml
    
  3. Create the binding between the Kubernetes service account and the Google service account:
    export K8S_NAMESPACE=codefresh
    export KSA_NAME=volume-provisioner-runner
    export GSA_NAME=<google_sa_name>
    export PROJECT_ID=<google_project_name>
    gcloud iam service-accounts add-iam-policy-binding \
      --role roles/iam.workloadIdentityUser \
      --member "serviceAccount:${PROJECT_ID}.svc.id.goog[${K8S_NAMESPACE}/${KSA_NAME}]" \
      ${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
    


Install Codefresh Runner on EKS

Installing the Codefresh Runner on EKS includes:
Step 1: Create an EKS cluster
Step 2: Install autoscaler on EKS cluster
Step 3: (Optional) Configure overprovisioning with Cluster Autoscaler
Step 4: Add an EKS cluster as Runner to the Codefresh platform with EBS support

Step 1: Create an EKS cluster

You need to create three files:

  • A cluster.yaml file with separate node pools for dind, engine, and other services, such as the runner and cluster-autoscaler
  • Two separate IAM policy files for:
    • The volume-provisioner controller (policy/runner-ebs), for creating and deleting volumes
    • Dind pods (policy/dind-ebs), for attaching and detaching volumes to and from the appropriate nodes, using the IAM attachPolicyARNs option

policy/runner-ebs:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:AttachVolume",
                "ec2:CreateSnapshot",
                "ec2:CreateTags",
                "ec2:CreateVolume",
                "ec2:DeleteSnapshot",
                "ec2:DeleteTags",
                "ec2:DeleteVolume",
                "ec2:DescribeInstances",
                "ec2:DescribeSnapshots",
                "ec2:DescribeTags",
                "ec2:DescribeVolumes",
                "ec2:DetachVolume"
            ],
            "Resource": "*"
        }
    ]
}

policy/dind-ebs:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeVolumes"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DetachVolume",
                "ec2:AttachVolume"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}

How to

  1. Create the cluster.yaml as in the example below (my-eks-cluster.yaml).
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    metadata:
      name: my-eks
      region: us-west-2
      version: "1.15"
    nodeGroups:
      - name: dind
        instanceType: m5.2xlarge
        desiredCapacity: 1
        iam:
          attachPolicyARNs:
            - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
            - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
            - arn:aws:iam::aws:policy/ElasticLoadBalancingFullAccess
            - arn:aws:iam::XXXXXXXXXXXX:policy/dind-ebs
          withAddonPolicies:
            autoScaler: true
        ssh: # import public key from file
          publicKeyPath: ~/.ssh/id_rsa.pub
        minSize: 1
        maxSize: 50
        volumeSize: 50
        volumeType: gp2
        ebsOptimized: true
        availabilityZones: ["us-west-2a"]
        kubeletExtraConfig:
          enableControllerAttachDetach: false
        labels:
          node-type: dind
        taints:
          codefresh.io: "dinds:NoSchedule"
      - name: engine
        instanceType: m5.large
        desiredCapacity: 1
        iam:
          withAddonPolicies:
            autoScaler: true
        minSize: 1
        maxSize: 10
        volumeSize: 50
        volumeType: gp2
        availabilityZones: ["us-west-2a"]
        labels:
          node-type: engine
        taints:
          codefresh.io: "engine:NoSchedule"
      - name: addons
        instanceType: m5.2xlarge
        desiredCapacity: 1
        ssh: # import public key from file
          publicKeyPath: ~/.ssh/id_rsa.pub
        minSize: 1
        maxSize: 10
        volumeSize: 50
        volumeType: gp2
        ebsOptimized: true
        availabilityZones: ["us-west-2a"]
        labels:
          node-type: addons
        iam:
          attachPolicyARNs:
            - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
            - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
            - arn:aws:iam::aws:policy/ElasticLoadBalancingFullAccess
            - arn:aws:iam::XXXXXXXXXXXX:policy/runner-ebs
          withAddonPolicies:
            autoScaler: true
    availabilityZones: ["us-west-2a", "us-west-2b", "us-west-2c"]
    
  2. Execute:
    eksctl create cluster -f my-eks-cluster.yaml
    

The configuration leverages Amazon Linux 2 as the default operating system for the nodes in the node group.

Bottlerocket-based nodes
Bottlerocket is an open source, Linux-based operating system built specifically to run containers. It focuses on security, simplicity, and easy updates via transactions. Find more information in the official repository.

To leverage Bottlerocket-based nodes:

  • Specify the AMI Family using amiFamily: Bottlerocket
  • Add these additional IAM Policies:
    • arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
    • arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore

Step 2: Install autoscaler on EKS cluster

Once the cluster is up and running, install the cluster autoscaler.

Because we used IAM AddonPolicies "autoScaler: true" in the cluster.yaml file, everything is done automatically, and there is no need to create a separate IAM policy or add Auto Scaling group tags.

  • Deploy the cluster autoscaler:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml
  • Add the cluster-autoscaler.kubernetes.io/safe-to-evict annotation:
    kubectl -n kube-system annotate deployment.apps/cluster-autoscaler cluster-autoscaler.kubernetes.io/safe-to-evict="false"
    
  • Edit the cluster-autoscaler container command:
kubectl -n kube-system edit deployment.apps/cluster-autoscaler
  • In the editor, make the following changes, as in the example below:
    • Replace <YOUR CLUSTER NAME> with the name of the cluster defined in cluster.yaml (my-eks in this example).
    • Add the following options:
      --balance-similar-node-groups
      --skip-nodes-with-system-pods=false
spec:
      containers:
      - command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-eks
        - --balance-similar-node-groups
        - --skip-nodes-with-system-pods=false
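  • Set the cluster-autoscaler image to the release that matches your cluster version. The example cluster runs Kubernetes 1.15, so it uses v1.15.6:
kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.15.6

Check your EKS version to make sure that you have the correct autoscaler version for it.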
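
To match the autoscaler to the cluster, you need the major.minor part of the Kubernetes version. The following is a small sketch that reduces a version string to that form; the helper name k8s_minor is hypothetical, and the input is the version value from the example cluster.yaml rather than output read from a live cluster.

```shell
#!/bin/sh
# Reduce a Kubernetes version string to "major.minor".
# k8s_minor is a hypothetical helper name used only for this illustration.
k8s_minor() {
  ver="${1#v}"          # drop a leading "v" if present
  major="${ver%%.*}"    # text before the first dot
  rest="${ver#*.}"      # text after the first dot
  minor="${rest%%.*}"   # text before the next dot
  printf '%s.%s\n' "$major" "$minor"
}

cluster_version="1.15"  # metadata.version in the example cluster.yaml
echo "Pick a cluster-autoscaler release tagged v$(k8s_minor "$cluster_version").x"
```

With the example cluster this points at the v1.15.x releases, which is consistent with the v1.15.6 image set above.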

Step 3: (Optional) Configure overprovisioning with Cluster Autoscaler

For details, see the FAQ.

Step 4: Add an EKS cluster as Runner to the Codefresh platform with EBS support

How to

  • Make sure to target the correct cluster:
$ kubectl config current-context 
my-aws-runner
  • Install the Runner with additional options:
    • Specify the zone in which to create your volumes, for example: --set-value=Storage.AvailabilityZone=us-west-2a.
    • (Optional) To assign the volume-provisioner to a specific node, for example, a node group with an IAM role that can create EBS volumes, add --set-value Storage.VolumeProvisioner.NodeSelector=node-type=addons.
    • To use encrypted EBS volumes, add the custom value --set-value=Storage.Encrypted=true.
    • If you already have a key, add its ARN with --set-value=Storage.KmsKeyId=<key id>. Otherwise, AWS generates a key.

    Here is an example with all the options configured:

codefresh runner init \
--name my-aws-runner \
--kube-node-selector=topology.kubernetes.io/zone=us-west-2a \
--build-node-selector=topology.kubernetes.io/zone=us-west-2a \
--kube-namespace cf --kube-context-name my-aws-runner \
--set-value Storage.VolumeProvisioner.NodeSelector=node-type=addons \
--set-value=Storage.Backend=ebs \
--set-value=Storage.AvailabilityZone=us-west-2a \
--set-value=Storage.Encrypted=[false|true] \
--set-value=Storage.KmsKeyId=<key id>
For descriptions of the other options, run `codefresh runner init --help`.
  • When the Wizard completes the installation, modify the runtime environment of my-aws-runner to specify the necessary toleration, nodeSelector and disk size:
    • Run:
      codefresh get re --limit=100 my-aws-runner/cf -o yaml > my-runtime.yml
      
    • Modify the file my-runtime.yml as shown below:
version: null
metadata:
  agent: true
  trial:
    endingAt: 1593596844167
    reason: Codefresh hybrid runtime
    started: 1592387244207
  name: my-aws-runner/cf
  changedBy: ivan-codefresh
  creationTime: '2020/06/17 09:47:24'
runtimeScheduler:
  cluster:
    clusterProvider:
      accountId: 5cb563d0506083262ba1f327
      selector: my-aws-runner
    namespace: cf
    nodeSelector:
      node-type: engine
  tolerations:
  - effect: NoSchedule
    key: codefresh.io
    operator: Equal
    value: engine
  annotations: {}
dockerDaemonScheduler:
  cluster:
    clusterProvider:
      accountId: 5cb563d0506083262ba1f327
      selector: my-aws-runner
    namespace: cf
    nodeSelector:
      node-type: dind
  annotations: {}
  defaultDindResources:
    requests: ''
  tolerations:
  - effect: NoSchedule
    key: codefresh.io
    operator: Equal
    value: dinds
  pvcs:
    dind:
      volumeSize: 30Gi
      reuseVolumeSelector: 'codefresh-app,io.codefresh.accountName'
      storageClassName: dind-local-volumes-runner-cf
  userAccess: true
extends:
  - system/default/hybrid/k8s_low_limits
description: 'Runtime environment configure to cluster: my-aws-runner and namespace: cf'
accountId: 5cb563d0506083262ba1f327
  • Apply changes.
    codefresh patch re my-aws-runner/cf -f my-runtime.yml
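
The tolerations in my-runtime.yml must line up with the node group taints from cluster.yaml. The following sketch splits the dind taint value from that file ("dinds:NoSchedule" under the key codefresh.io) into the value and effect that the matching toleration uses; the variable names are illustrative only.

```shell
#!/bin/sh
# Split an eksctl-style taint ("<value>:<effect>") into the fields
# that the matching toleration needs.
taint_key="codefresh.io"
taint="dinds:NoSchedule"   # from the dind node group in cluster.yaml

value="${taint%%:*}"       # text before the first colon
effect="${taint##*:}"      # text after the last colon

# Assemble the toleration as it appears under dockerDaemonScheduler.tolerations.
toleration="- effect: $effect
  key: $taint_key
  operator: Equal
  value: $value"
printf '%s\n' "$toleration"
```

The same mapping applies to the engine node group, whose taint "engine:NoSchedule" corresponds to the toleration under runtimeScheduler.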
    

You have completed installing the Codefresh Runner on an EKS cluster. You can try running a pipeline on the runtime environment my-aws-runner/cf.

Install Codefresh Runner on Rancher RKE 2.X

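Installing Codefresh Runner on Rancher RKE 2.X includes these steps:
Step 1: Configure kubelet for Runner StorageClass
Step 2: Set kubeconfig user permissions
Step 3: Install the Runner
Step 4: Update Runner Docker MTU
Step 5: Create the cluster integration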

Step 1: Configure kubelet for Runner StorageClass

Configure the cluster to allow the Runner’s default StorageClass to create the persistent cache volume from local storage on each node.

  1. In the Rancher UI: For Rancher v2.5.9 and earlier, drill into the target cluster, and then click Edit Cluster at the top-right.

Drill into your cluster and click Edit Cluster on the right


For Rancher v2.6+ with the updated UI, open Cluster Management in the left panel, then click the three-dot menu near the corresponding cluster and select Edit Config.

Click Edit Cluster on the right in your cluster list

  2. On the edit cluster page, scroll down to the Cluster Options section, and click its Edit as YAML button.

Cluster Options -> Edit as YAML

  3. Edit the YAML to include an extra mount in the kubelet service:
rancher_kubernetes_engine_config:
  ...  
  services:
    ...  
    kubelet:
      extra_binds:
        - '/var/lib/codefresh:/var/lib/codefresh:rshared'

Add volume to rancher_kubernetes_engine_config.services.kubelet.extra_binds

Step 2: Set kubeconfig user permissions

The user in your kubeconfig must be a ClusterAdmin to install the Runner.

There are two options to create a user:

  • For your pipelines to connect to this cluster as a cluster admin, create a Codefresh user in the Rancher UI with a non-expiring kubeconfig token. This is the easiest way to install the Codefresh Runner.

  • For your pipelines to connect to this cluster with fewer privileges, use your personal user account with ClusterAdmin privileges for the installation, and then create a Codefresh account with fewer privileges later.


  1. Do one of the following:
    • To create a Codefresh user with ClusterAdmin rights in Rancher, continue with step 2.
    • To use your personal user account with ClusterAdmin privileges for the installation, continue from Step 3: Install the Runner.
  2. In the Rancher UI do the following:
    • Click Security at the top, and then select Users.
    • Click Add User, and below Global Permissions, select Restricted Administrator.
    • Log out of the Rancher UI, and then log back in as the new user.
    • Click your user icon at the top-right, and then choose API & Keys.
    • Click Add Key, and create a kubeconfig token with Expires set to Never.
    • Copy the Bearer Token field (combines Access Key and Secret Key).
    • Edit your kubeconfig and paste the Bearer Token you copied in the token field of your user.

Create a cluster admin user for Codefresh


Step 3: Install the Runner

If you’ve created your kubeconfig in the Rancher UI, it includes an API endpoint that is not reachable internally from within the cluster. To work around this, the Runner should use Kubernetes’ generic internal API endpoint. If your kubeconfig contains your personal user account, then you should also add the --skip-cluster-integration option.

  1. Do one of the following:
    • Install the Runner with a Codefresh user (ClusterAdmin, non-expiring token):
      codefresh runner init \
        --set-value KubernetesHost=https://kubernetes.default.svc.cluster.local
      
    • Install the Runner with your personal user account:
      codefresh runner init \
        --set-value KubernetesHost=https://kubernetes.default.svc.cluster.local \
        --skip-cluster-integration
      
  2. Answer the prompts to complete the installation.

Step 4: Update Runner Docker MTU

By default, RKE nodes use the Canal CNI, which combines elements of Flannel and Calico, and uses VXLAN encapsulation. The VXLAN encapsulation has a 50-byte overhead, and reduces the MTU of its virtual interfaces from the standard 1500 to 1450.

For example, when running ifconfig on an RKE 2.5.5 node, you might see several interfaces like this. Note the MTU:1450.

cali0f8ac592086 Link encap:Ethernet  HWaddr ee:ee:ee:ee:ee:ee
          inet6 addr: fe80::ecee:eeff:feee:eeee/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:11106 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10908 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:922373 (922.3 KB)  TX bytes:9825590 (9.8 MB)

Reduce the Docker MTU used by the Runner’s Docker in Docker (dind) pods to match this lower MTU:

  1. Edit the configmap in the namespace where the Runner is installed:
    The example shows the edit command if you installed the Runner in the codefresh namespace
    kubectl edit cm codefresh-dind-config -n codefresh
    
  2. In the editor, update the daemon.json field by adding ,\"mtu\":1440 before the final closing curly brace (highlighted in the example below).

Update the runner's Docker MTU
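
The MTU arithmetic and the daemon.json change can be sketched as follows. The daemon.json content here is a hypothetical sample, not the actual value from your configmap, and it is shown unescaped; in the configmap the quotes are escaped, which is why the step above adds ,\"mtu\":1440.

```shell
#!/bin/sh
# VXLAN encapsulation adds 50 bytes of overhead to the standard 1500-byte MTU.
vxlan_mtu=$((1500 - 50))
docker_mtu=1440   # value used in this guide; kept at or below $vxlan_mtu

# Hypothetical daemon.json content; the real value in your configmap may differ.
daemon_json='{"hosts":["unix:///var/run/docker.sock"],"tls":false}'

# Insert the mtu key before the final closing curly brace.
patched="$(printf '%s' "$daemon_json" | sed "s/}\$/,\"mtu\":${docker_mtu}}/")"

echo "interface MTU: $vxlan_mtu"
echo "$patched"
```

The sed expression only touches the last brace on the line, so existing keys are left as they are.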


Step 5: Create the cluster integration

If you installed the Runner with the --skip-cluster-integration option, then you must add the Rancher cluster to your Kubernetes integrations.

Once complete, you can go to the Codefresh UI and run a pipeline on the new runtime, including steps that deploy to the Kubernetes Integration.

Troubleshooting TLS Errors

Depending on your Rancher configuration, you may need to allow insecure HTTPS/TLS connections, by adding an environment variable to the Runner deployment.

  • Edit the Runner deployment:
    The example below assumes that you installed the Runner in the codefresh namespace.
kubectl edit deploy runner -n codefresh
  • In the editor, add this environment variable below spec.containers.env[]:
- name: NODE_TLS_REJECT_UNAUTHORIZED
  value: "0"

Install Codefresh Runner on Azure Kubernetes Service (AKS)

Prerequisites

  • Volume provisioner (dind-volume-provisioner) with permissions to create/delete/get Azure Disks

  • Minimal IAM role for dind-volume-provisioner (dind-volume-provisioner-role.json):

    {
      "Name": "CodefreshDindVolumeProvisioner",
      "Description": "Perform create/delete/get disks",
      "IsCustom": true,
      "Actions": [
          "Microsoft.Compute/disks/read",
          "Microsoft.Compute/disks/write",
          "Microsoft.Compute/disks/delete"
      ],
      "AssignableScopes": ["/subscriptions/<your-subscription-id>"]
    }
    

How to

  1. If you use AKS with managed identities for the node group, run the script below to assign the CodefreshDindVolumeProvisioner role to the AKS node identity:
export ROLE_DEFINITIN_FILE=dind-volume-provisioner-role.json
export SUBSCRIPTION_ID=$(az account show --query "id" | xargs echo )
export RESOURCE_GROUP=codefresh-rt1
export AKS_NAME=codefresh-rt1
export LOCATION=$(az aks show -g $RESOURCE_GROUP -n $AKS_NAME --query location | xargs echo)
export NODES_RESOURCE_GROUP=MC_${RESOURCE_GROUP}_${AKS_NAME}_${LOCATION}
export NODE_SERVICE_PRINCIPAL=$(az aks show -g $RESOURCE_GROUP -n $AKS_NAME --query identityProfile.kubeletidentity.objectId | xargs echo)

az role definition create --role-definition @${ROLE_DEFINITIN_FILE}
az role assignment create --assignee $NODE_SERVICE_PRINCIPAL --scope /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$NODES_RESOURCE_GROUP --role CodefreshDindVolumeProvisioner
  2. Install Codefresh Runner using one of these options:
    CLI Wizard:
    codefresh runner init --set-value Storage.Backend=azuredisk --set-value Storage.VolumeProvisioner.MountAzureJson=true
    

values-example.yaml:

Storage:
  Backend: azuredisk
  VolumeProvisioner:
    MountAzureJson: true
codefresh runner init --values values-example.yaml 

Helm chart values.yaml:

storage:
  backend: azuredisk
  azuredisk:
    skuName: Premium_LRS

volumeProvisioner:
  mountAzureJson: true
helm install cf-runtime cf-runtime/cf-runtime -f ./generated_values.yaml -f values.yaml --create-namespace --namespace codefresh 

Manually install Codefresh Runner

Manually install the Codefresh Runner on a single cluster with both the runtime and the agent:

kubectl create namespace codefresh
codefresh install agent --agent-kube-namespace codefresh --install-runtime

The Codefresh Runner consists of the following:

  • Runner: Gets tasks from the platform and executes them. You can install a single Runner per account that can handle multiple runtimes.
  • Runtime: Includes the components for workflow execution:
    • Volume provisioner (prefix dind-volume-provisioner-runner): Provisions volumes for dind pod
    • lv-monitor (prefix dind-lv-monitor-runner): Daemonset that cleans volumes

You can monitor the runner.

Install monitoring component

If your cluster is located behind the firewall, you can use the Runner’s monitoring component to send information about cluster resources to Codefresh dashboards, for example, the Kubernetes and Helm Releases dashboards.

You can install the monitoring component during Runner installation with cluster integration, or after Runner installation without cluster integration.

Install with cluster integration during Runner install

The cluster integration is created automatically during Runner installation.

codefresh runner init --install-monitor

where:

  • --install-monitor is by default set to true and installs the monitoring component that makes valuable data on the cluster available to Codefresh.

Install without cluster integration after Runner install

If you defined the --skip-cluster-integration flag to skip cluster integration during Runner installation, then you cannot install the monitoring component during the installation. Install the monitoring component separately after completing the Runner installation to get cluster resource information to the Codefresh dashboards.

codefresh install monitor --kube-context-name <CONTEXT> --kube-namespace <NAMESPACE> --cluster-id <CLUSTER_NAME> --token <TOKEN>

where:

  • <CONTEXT>, <NAMESPACE>, and <CLUSTER_NAME> are the context, namespace, and name of the cluster on which to install the monitoring component.
  • <TOKEN> is the token to authenticate to the cluster.

Injecting AWS ARN roles into the cluster

Step 1 - Make sure the OIDC provider is connected to the cluster

See:

Step 2 - Create IAM role and policy as explained in https://docs.aws.amazon.com/eks/latest/userguide/create-service-account-iam-policy-and-role.html

Here, in addition to the policy explained, you need a Trust Relationship established between this role and the OIDC entity.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:${CODEFRESH_NAMESPACE}:codefresh-engine"
        }
      }
    }
  ]
}
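
The trust relationship above uses shell-style variables. The following is a minimal sketch that renders it with the variables substituted; the OIDC provider value is a made-up placeholder in the usual EKS form, and the account ID and namespace are taken from the example output in this section.

```shell
#!/bin/sh
# Placeholder values (assumptions); replace them with your own.
ACCOUNT_ID="123456789012"
OIDC_PROVIDER="oidc.eks.us-west-2.amazonaws.com/id/EXAMPLE"
CODEFRESH_NAMESPACE="codefresh"

# Render the trust relationship; the unquoted heredoc expands the variables.
trust_policy="$(cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:${CODEFRESH_NAMESPACE}:codefresh-engine"
        }
      }
    }
  ]
}
EOF
)"
printf '%s\n' "$trust_policy"
```

Redirect the output to a file if you want to pass it to your IAM tooling when creating the role.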

Step 3 - Annotate the codefresh-engine Kubernetes Service Account in the namespace where the Codefresh Runner is installed with the proper IAM role.

kubectl annotate -n ${CODEFRESH_NAMESPACE} sa codefresh-engine eks.amazonaws.com/role-arn=${ROLE_ARN}

Once the annotation is added, you should see it when you describe the Service Account.

kubectl describe -n ${CODEFRESH_NAMESPACE} sa codefresh-engine

Name:                codefresh-engine
Namespace:           codefresh
Labels:              app=app-proxy
                     version=1.6.8
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/Codefresh
Image pull secrets:  <none>
Mountable secrets:   codefresh-engine-token-msj8d
Tokens:              codefresh-engine-token-msj8d
Events:              <none>

Step 4 - Using the AWS assumed role identity

After annotating the Service Account, run a pipeline to test the AWS resource access:

RunAwsCli:
      title : Communication with AWS
      image : mesosphere/aws-cli
      stage: "build"
      commands :
         - apk update
         - apk add jq
         - env
         - cat /codefresh/volume/sensitive/.kube/web_id_token
         - aws sts assume-role-with-web-identity --role-arn $AWS_ROLE_ARN --role-session-name mh9test --web-identity-token file://$AWS_WEB_IDENTITY_TOKEN_FILE --duration-seconds 1000 > /tmp/irp-cred.txt
         - export AWS_ACCESS_KEY_ID="$(cat /tmp/irp-cred.txt | jq -r ".Credentials.AccessKeyId")"
         - export AWS_SECRET_ACCESS_KEY="$(cat /tmp/irp-cred.txt | jq -r ".Credentials.SecretAccessKey")"
         - export AWS_SESSION_TOKEN="$(cat /tmp/irp-cred.txt | jq -r ".Credentials.SessionToken")"
         - rm /tmp/irp-cred.txt
         - aws s3api get-object --bucket jags-cf-eks-pod-secrets-bucket --key  eks-pod2019-12-10-21-18-32-560931EEF8561BC4 getObjectNotWorks.txt

Runtime environment specification

This section describes the runtime environment specification and the options for modifying it.
Note that Codefresh autogenerates additional, hidden fields that complete the full runtime spec. You can view and edit these fields only in Codefresh On-Premises installations.

Modify runtime

  1. Get a list of all available runtimes:
    codefresh get runtime-environments
    #or
    codefresh get re
    
  2. Choose the runtime you want to inspect or modify, and get its yaml/json representation:
    codefresh get re my-eks-cluster/codefresh -o yaml > runtime.yaml
    #or
    codefresh get re my-eks-cluster/codefresh -o json > runtime.json
    
  3. Update your runtime environment with the patch command:
    codefresh patch re my-eks-cluster/codefresh -f runtime.yaml
    

Below is an example of the default and basic runtime spec after you’ve installed the Runner:

version: 1
metadata:
  ...
runtimeScheduler:
  cluster:
    clusterProvider:
      accountId: 5f048d85eb107d52b16c53ea
      selector: my-eks-cluster
    namespace: codefresh
    serviceAccount: codefresh-engine
  annotations: {}
dockerDaemonScheduler:
  cluster:
    clusterProvider:
      accountId: 5f048d85eb107d52b16c53ea
      selector: my-eks-cluster
    namespace: codefresh
    serviceAccount: codefresh-engine
  annotations: {}
  userAccess: true
  defaultDindResources:
    requests: ''
  pvcs:
    dind:
      storageClassName: dind-local-volumes-runner-codefresh
extends:
  - system/default/hybrid/k8s_low_limits
description: '...'
accountId: 5f048d85eb107d52b16c53ea

Top level fields

Field name | Type | Value
version | string | Runtime environment version
metadata | object | Meta-information
runtimeScheduler | object | Engine pod definition
dockerDaemonScheduler | object | Dind pod definition
extends | array | System field (links to full runtime spec from Codefresh API)
description | string | Runtime environment description (k8s context name and namespace)
accountId | string | Account to which this runtime belongs
appProxy | object | Optional field for app-proxy

runtimeScheduler fields (engine)

Field name | Type | Value
image | string | Override default engine image
imagePullPolicy | string | Override image pull policy (default IfNotPresent)
type | string | KubernetesPod
envVars | object | Override or add environment variables passed into the engine pod
userEnvVars | object | Add external env var(s) to the pipeline. See Custom Global Environment Variables
cluster | object | k8s related information (namespace, serviceAccount, nodeSelector)
resources | object | Specify non-default requests and limits for engine pod. For memory, use Mi (mebibytes); for CPU, use m (millicpu)
tolerations | array | Add tolerations to engine pod
annotations | object | Add custom annotations to engine pod (empty by default {})
labels | object | Add custom labels to engine pod (empty by default {})
dnsPolicy | string | Engine pod’s DNS policy
dnsConfig | object | Engine pod’s DNS config

runtimeScheduler example:

runtimeScheduler:
  imagePullPolicy: Always
  cluster:
    clusterProvider:
      accountId: 5f048d85eb107d52b16c53ea
      selector: my-eks-cluster
    nodeSelector: #schedule engine pod onto a node whose labels match the nodeSelector
      node-type: engine  
    namespace: codefresh
    serviceAccount: codefresh-engine
  annotations: {}
  labels:
    spotinst.io/restrict-scale-down: "true" #optional label to prevent node scaling down when the runner is deployed on spot instances using spot.io
  envVars:
    NODE_TLS_REJECT_UNAUTHORIZED: '0' #disable certificate validation for TLS connections (e.g. to g.codefresh.io)
    METRICS_PROMETHEUS_ENABLED: 'true' #enable /metrics on engine pod
    DEBUGGER_TIMEOUT: '30' #debug mode timeout duration (in minutes)
  userEnvVars:
    - name: GITHUB_TOKEN
      valueFrom:
        secretKeyRef:
          name: github-token
          key: token
  resources:
    requests:
      cpu: 60m
      memory: 500Mi
    limits:
      cpu: 1000m
      memory: 2048Mi
  tolerations:
    - effect: NoSchedule
      key: codefresh.io
      operator: Equal
      value: engine            

dockerDaemonScheduler fields (dind)

Field name Type Value
dindImage string Override default dind image
type string DindPodPvc
envVars object Override or add environment variables passed into the dind pod. See IN-DIND cleaner
userVolumeMounts with userVolumes object Add volume mounts to the pipeline. See Custom Volume Mounts
cluster object k8s related information (namespace, serviceAccount, nodeSelector)
defaultDindResources object Override requests and limits for dind pod (defaults are cpu: 400m and memory: 800Mi). For memory, use Mi (mebibytes); for CPU, use m (millicpu)
tolerations array Add tolerations to dind pod
annotations object Add custom annotations to dind pod (empty by default {})
labels object Add custom labels to dind pod (empty by default {})
pvcs object Override default storage configuration for PersistentVolumeClaim (PVC) with storageClassName, volumeSize, reuseVolumeSelector. See Volume reuse policy
dnsPolicy string Dind pod’s DNS policy
dnsConfig object Dind pod’s DNS config

dockerDaemonScheduler example:

dockerDaemonScheduler:
  cluster:
    clusterProvider:
      accountId: 5f048d85eb107d52b16c53ea
      selector: my-eks-cluster
    nodeSelector: #schedule dind pod onto a node whose labels match the nodeSelector
      node-type: dind  
    namespace: codefresh
    serviceAccount: codefresh-engine
  annotations: {}
  labels:
    spotinst.io/restrict-scale-down: "true" #optional label to prevent node scaling down when the runner is deployed on spot instances using spot.io
  userAccess: true
  defaultDindResources:
    requests: ''
    limits:
      cpu: 1000m
      memory: 2048Mi
  userVolumeMounts:
    my-cert:
      name: cert
      mountPath: /etc/ssl/cert
      readOnly: true
  userVolumes:
    my-cert:
      name: cert
      secret:
        secretName: tls-secret
  pvcs:
    dind:
      storageClassName: dind-local-volumes-runner-codefresh
      volumeSize: 30Gi
      reuseVolumeSelector: 'codefresh-app,io.codefresh.accountName,pipeline_id'
  tolerations:
    - key: codefresh.io
      operator: Equal
      value: dinds
      effect: NoSchedule    

Custom global environment variables

You can add your own environment variables to the runtime environment. All pipeline steps have access to the global variables. A typical example of such a variable would be a shared secret that you want to pass to the pipeline.

In the runtimeScheduler block, you can add an element named userEnvVars that follows the same syntax as secret/environment variables.

runtime.yaml

...
runtimeScheduler:
  userEnvVars:
    - name: GITHUB_TOKEN
      valueFrom:
        secretKeyRef:
          name: github-token
          key: token
...
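
The userEnvVars entry above reads its value from a Kubernetes secret, so that secret has to exist before a build starts. The following is a minimal sketch of generating a matching manifest. It assumes the secret belongs in the namespace of the runtime (codefresh in the earlier example); the make_secret_manifest helper is hypothetical and not part of the Codefresh CLI.

```shell
# Hypothetical helper: print a Secret manifest whose name and key match
# the secretKeyRef used in userEnvVars (github-token / token).
# Assumption: the secret belongs in the runtime's namespace.
make_secret_manifest() {
  name=$1
  key=$2
  value=$3
  namespace=$4
  # Secret data must be base64-encoded; strip any line wrapping.
  encoded=$(printf '%s' "$value" | base64 | tr -d '\n')
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: $name
  namespace: $namespace
type: Opaque
data:
  $key: $encoded
EOF
}
```

For example, make_secret_manifest github-token token "$GITHUB_TOKEN" codefresh | kubectl apply -f - would create the secret referenced above, provided kubectl points at the cluster that hosts the runtime.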

Custom volume mounts

You can add your own volume mounts in the runtime environment, so that all pipeline steps have access to the same set of external files. A typical example of this scenario is when you want to make a set of SSL certificates available to all your pipelines. Rather than manually downloading the certificates for each pipeline, you can provide them centrally at the runtime level.

Under the dockerDaemonScheduler block, you can add two additional elements, userVolumeMounts and userVolumes, to define your own global volumes. They follow the same syntax as normal k8s volumes and volumeMounts.

runtime.yaml

...
dockerDaemonScheduler:
  userVolumeMounts:
    my-cert:
      name: cert
      mountPath: /etc/ssl/cert
      readOnly: true
  userVolumes:
    my-cert:
      name: cert
      secret:
        secretName: tls-secret
...

Debug timeout duration

The default timeout for debug mode is 14 minutes, even if the user is actively working.
To change the debugger timeout for a runtime, update the Runtime Spec of that runtime by adding the DEBUGGER_TIMEOUT environment variable. The timeout is defined in minutes, so ‘30’ corresponds to 30 minutes.

  • Under .runtimeScheduler, add an envVars section
  • Add DEBUGGER_TIMEOUT to envVars with the value you want.
...
runtimeScheduler:
  envVars:
    DEBUGGER_TIMEOUT: '30'
...
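
Updating the Runtime Spec is a round trip: export the spec, edit it, and patch it back. The sketch below wraps the two CLI calls that appear in the ARM builds section of this page (codefresh get re and codefresh patch re). The update_runtime_spec function name and the use of $EDITOR are assumptions made for illustration.

```shell
# Hypothetical wrapper: export the runtime spec, edit it, then patch it back.
update_runtime_spec() {
  runtime_name=$1
  spec_file=${2:-runtime.yaml}
  # Export the current spec of the runtime environment.
  codefresh get re "$runtime_name" -o yaml > "$spec_file" || return 1
  # Edit by hand, for example add DEBUGGER_TIMEOUT under runtimeScheduler.envVars.
  "${EDITOR:-vi}" "$spec_file" || return 1
  # Apply the edited spec.
  codefresh patch re -f "$spec_file"
}
```

Calling update_runtime_spec "$RUNTIME_NAME" opens runtime.yaml in your editor and patches the runtime only if the export and the edit both succeed.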

Volume reuse policy

Volume reuse behavior depends on the configuration for reuseVolumeSelector in the runtime environment spec.

The following options are available:

  • reuseVolumeSelector: 'codefresh-app,io.codefresh.accountName'
    The PV can be used by ANY pipeline in the specified account (this is the default volume selector).
    • Benefit: Fewer PVs, resulting in lower costs. Since any PV can be used by any pipeline, the cluster needs to maintain/reserve fewer PVs in its PV pool for Codefresh.
    • Downside: Since the PV can be used by any pipeline, a PV may hold assets and info from different pipelines, reducing the probability of a cache hit.
  • reuseVolumeSelector: 'codefresh-app,io.codefresh.accountName,project_id'
    The PV can be used by ALL pipelines in your account that are assigned to the same project.
  • reuseVolumeSelector: 'codefresh-app,io.codefresh.accountName,pipeline_id'
    The PV can be used only by a single pipeline.
    • Benefit: Higher probability of a cache hit, without “spam” from other pipelines.
    • Downside: More PVs to maintain and therefore higher costs.
  • reuseVolumeSelector: 'codefresh-app,io.codefresh.accountName,pipeline_id,io.codefresh.branch_name'
    The PV can be used only by a single pipeline AND a single branch.
  • reuseVolumeSelector: 'codefresh-app,io.codefresh.accountName,pipeline_id,trigger'
    The PV can be used only by a single pipeline AND a single trigger.
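
One way to read these selectors is as a list of label names whose values must all match before a build can pick up an existing PV. The sketch below models only that idea. It is not the provisioner's implementation, and the reuse_key helper and the label values are hypothetical.

```shell
# Conceptual model: build a comparison key from the labels named in the
# selector. Two builds could share a PV only if their keys are equal.
# Usage: reuse_key SELECTOR LABEL=VALUE...
reuse_key() {
  selector=$1
  shift
  key=""
  for name in $(printf '%s' "$selector" | tr ',' ' '); do
    value=""
    for pair in "$@"; do
      case $pair in
        "$name="*) value=${pair#*=} ;;
      esac
    done
    key="$key$name=$value;"
  done
  printf '%s\n' "$key"
}
```

Under the default selector, two builds from different pipelines in the same account produce the same key, so in this model they can share a PV. Adding pipeline_id to the selector makes their keys differ, which matches the single-pipeline behavior described above.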

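To change the volume selector:

  • Get the runtime YAML spec:
codefresh get re $RUNTIME_NAME -o yaml > runtime.yaml
  • Under the dockerDaemonScheduler.pvcs.dind block, specify reuseVolumeSelector:
  pvcs:
    dind:
      volumeSize: 30Gi
      reuseVolumeSelector: 'codefresh-app,io.codefresh.accountName,pipeline_id'
  • Patch the runtime spec:
codefresh patch re -f runtime.yaml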

ARM Builds

With the Codefresh Runner, you can run native ARM64v8 builds.

Note:
You cannot run both amd64 and arm64 images within the same pipeline. Multi-architecture builds are not supported, and a pipeline can map to only one runtime, so each pipeline runs either amd64 or arm64 images.

The following scenario is an example of how to set up an ARM Runner on an existing EKS cluster:

Step 1: Preparing nodes

  • Create a new ARM nodegroup:
eksctl utils update-coredns --cluster <cluster-name>
eksctl utils update-kube-proxy --cluster <cluster-name> --approve
eksctl utils update-aws-node --cluster <cluster-name> --approve

eksctl create nodegroup \
--cluster <cluster-name> \
--region <region> \
--name <arm-ng> \
--node-type <a1.2xlarge> \
--nodes <3> \
--nodes-min <2> \
--nodes-max <4> \
--managed
  • Check node status:
kubectl get nodes -l kubernetes.io/arch=arm64
  • It is also recommended to label and taint the required ARM nodes:
kubectl taint nodes <node> arch=aarch64:NoSchedule
kubectl label nodes <node> arch=arm
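
With several ARM nodes, the taint and label commands above can be applied in a loop. The following is a minimal sketch that assumes every node matched by kubernetes.io/arch=arm64 should be dedicated to the Runner; the prepare_arm_nodes function name is hypothetical.

```shell
# Hypothetical helper: taint and label every arm64 node in the cluster.
prepare_arm_nodes() {
  # "-o name" prints entries such as node/<node-name>.
  for node in $(kubectl get nodes -l kubernetes.io/arch=arm64 -o name); do
    name=${node#node/}
    kubectl taint nodes "$name" arch=aarch64:NoSchedule || return 1
    kubectl label nodes "$name" arch=arm || return 1
  done
}
```

The function stops at the first failing command. Re-running it against nodes that already carry the taint may report an error, so treat it as a one-time setup step.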

Step 2: Runner installation

  • Use a values file (values-arm.yaml in this example) to inject tolerations, kube-node-selector, and build-node-selector into the Runtime Environment spec.

values-arm.yaml

...
Namespace: codefresh

### NodeSelector --kube-node-selector: controls runner and dind-volume-provisioner pods
NodeSelector: arch=arm

### Tolerations --tolerations: controls runner, dind-volume-provisioner and dind-lv-monitor
Tolerations: 
- key: arch
  operator: Equal
  value: aarch64
  effect: NoSchedule
...
########################################################
###                Codefresh Runtime                 ###
###                                                  ###
###         configure engine and dind pods           ###
########################################################
Runtime:
### NodeSelector --build-node-selector: controls engine and dind pods
  NodeSelector:
    arch: arm
### Tolerations for engine and dind pods
  tolerations: 
  - key: arch
    operator: Equal
    value: aarch64
    effect: NoSchedule  
...    
  • Install the Runner:
    codefresh runner init --values values-arm.yaml --exec-demo-pipeline false --skip-cluster-integration true
    

Step 3: Post-installation fixes

  • Change the engine image version in the Runtime Environment specification:
# get the latest engine ARM64 tag
curl -X GET "https://quay.io/api/v1/repository/codefresh/engine/tag/?limit=100" --silent | jq -r '.tags[].name' | grep "^1.*arm64$"
1.136.1-arm64
# get runtime spec
codefresh get re $RUNTIME_NAME -o yaml > runtime.yaml
  • Under runtimeScheduler.image, change the image tag:
runtimeScheduler:
  image: 'quay.io/codefresh/engine:1.136.1-arm64'
# patch runtime spec
codefresh patch re -f runtime.yaml
  • For local storage, patch the dind-lv-monitor-runner DaemonSet and add nodeSelector to its pod template (spec.template.spec):
kubectl edit ds dind-lv-monitor-runner
    spec:
      nodeSelector:
        arch: arm
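
The tag lookup at the start of this step prints every matching tag, so you still need to pick one. The sketch below selects the highest 1.x ARM64 version from a list of tag names. It assumes your sort supports -V (version sort), as GNU coreutils does, and it uses a slightly stricter pattern than the grep above. The latest_arm64_tag function name is hypothetical.

```shell
# Hypothetical helper: read tag names on stdin, keep 1.x ARM64 tags,
# and print the highest version.
# Assumption: sort supports -V (version sort).
latest_arm64_tag() {
  grep -E '^1\..*-arm64$' | sort -V | tail -n 1
}
```

You could feed it the output of the curl and jq pipeline shown above, for example by replacing the final grep with latest_arm64_tag.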

Step 4: Run Demo pipeline

Run a modified version of the CF_Runner_Demo pipeline:

version: '1.0'
stages:
  - test
steps:
  test:
    stage: test
    title: test
    image: 'arm64v8/alpine'
    commands:
      - echo hello Codefresh Runner!

Uninstall Codefresh Runner

You may want to uninstall the Codefresh Runner.

Uninstalling the Codefresh Runner does not affect pipelines. You continue to see existing pipelines and can create new pipelines.

  • Run:
codefresh runner delete
  • Answer the prompts as required.

To see the available options, run the command with --help:

codefresh runner delete --help

Troubleshooting

For troubleshooting, refer to the Knowledge Base.

Codefresh installation options
Codefresh On-Premises installation
Codefresh API