A journey to Kubernetes on Azure

    With the ongoing migration to Azure, I would like to share my thoughts on one of the biggest challenges we have faced so far: orchestrating container infrastructure. Many of the Jenkins project’s applications run as Docker containers, which makes Kubernetes a logical choice for running them, but it presents its own set of challenges. For example, what would the workflow from development to production look like?

    Before going deeper into the challenges, let’s review the requirements we started with:

    Git

    We found it mandatory to keep track of all infrastructure changes in Git repositories, including secrets, in order to facilitate review, validation, rollback, etc. of every change.

    Tests

    Infrastructure contributors are geographically distributed and work in different time zones. Getting feedback can take time, so we rely heavily on tests that must pass before any change can be merged.

    Automation

    The change submitter is not necessarily the person who will deploy it. Repetitive tasks are error prone and a waste of time. For these reasons, all steps must be automated and stay as simple as possible.

    A high level overview of our "infrastructure as code" workflow would look like:

    Infrastructure as Code Workflow
      __________       _________       ______________
      |         |      |        |      |             |
      | Changes | ---->|  Test  |----->| Deployment  |
      |_________|      |________|  ^   |_____________|
                                   |
                            ______________
                           |             |
                           | Validation  |
                           |_____________|

    We identified two possible approaches for implementing our container orchestration with Kubernetes:

    1. The Jenkins Way: Jenkins is triggered by a Git commit, runs the tests, and after validation, Jenkins deploys changes into production.

    2. The Puppet Way: Jenkins is triggered by a Git commit, runs the tests, and after validation, it triggers Puppet to deploy into production.

    Let’s discuss these two approaches in detail.

    The Jenkins Way

    Workflow
      _________________       ____________________       ______________
      |                |      |                   |      |             |
      |    Github:     |      |     Jenkins:      |      |   Jenkins:  |
      | Commit trigger | ---->| Test & Validation | ---->|  Deployment |
      |________________|      |___________________|      |_____________|

    In this approach, Jenkins is used to test, validate, and deploy our Kubernetes configuration files. kubectl apply can be run on a directory and is idempotent, which means that we can run it as often as we want: as long as the files do not change, the result will not change either. In theory, this is the simplest way: the only thing needed is to run a kubectl command each time Jenkins detects changes.

    The following Jenkinsfile gives an example of this workflow.

    Jenkinsfile
      pipeline {
        agent any
        stages {
          stage('Init'){
            steps {
              sh 'curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl'
              // The downloaded binary is not executable by default
              sh 'chmod +x ./kubectl'
            }
          }
          stage('Test'){
            steps {
              // Placeholder: replace with the project's actual test command
              sh 'echo "Run tests"'
            }
          }
          stage('Deploy'){
            steps {
              // -R (--recursive) is a boolean flag and takes no value
              sh './kubectl apply -R -f my_project'
            }
          }
        }
      }

    The devil is in the details, of course, and it was not as easy as it looked at first sight.

    Order matters

    Some resources needed to be deployed before others. A workaround was to prefix file names with numbers so that they sort in deployment order, but this pushed ordering logic into the file names themselves, for example:

    project/00-nginx-ingress
    project/09-www.jenkins.io

    Portability

    The deployment environments needed to be the same across development machines and the Jenkins host. Although this is a well-known problem, it was not easy to solve. The more the project grew, the more additional tools our scripts needed (make, bats, jq, gpg, etc.). The more tools we used, the more issues appeared because of the different versions in use.

    Another challenge that emerged when dealing with different environments was: how should we manage environment-specific configurations (dev, prod, etc.)? Would it be better to define different configuration files per environment? Perhaps, but this means either duplicating code or using file templates, which would require more tools (sed, jinja2, erb) and more work.

    We did not discover a golden rule, and the answer is probably somewhere in between.
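
    To make the template option concrete, here is a minimal sketch that renders one manifest per environment from a single template. It uses Python's standard string.Template rather than sed, jinja2, or erb, and the ConfigMap, the variable names, and the per-environment values are all hypothetical.

```python
from string import Template

# Hypothetical manifest template; $namespace and $log_level are placeholders.
CONFIGMAP_TEMPLATE = Template("""\
apiVersion: v1
kind: ConfigMap
metadata:
  name: www-config
  namespace: $namespace
data:
  LOG_LEVEL: "$log_level"
""")

# Hypothetical per-environment values.
ENVIRONMENTS = {
    "dev": {"namespace": "dev", "log_level": "debug"},
    "prod": {"namespace": "prod", "log_level": "warning"},
}


def render(environment):
    """Render the manifest for one environment.

    Raises KeyError for an unknown environment or a missing variable,
    so an incomplete configuration fails before anything is deployed.
    """
    return CONFIGMAP_TEMPLATE.substitute(ENVIRONMENTS[environment])
```

    The trade-off discussed above is visible even here: the template avoids duplicating the manifest, but the values now live in a second place that has to be kept in sync.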

    In any case, the good news is that a Jenkinsfile provides an easy way to execute tasks from a Docker image, and an image can contain all the necessary tools in our environment. We can even use different Docker images for each stage along the way.

    In the following example, I use the my_env Docker image. It contains all the tools needed to test, validate, and deploy changes.

    Jenkinsfile
    pipeline{
      agent {
        docker{
          image 'my_env:1.0'
        }
      }
      options{
        buildDiscarder(logRotator(numToKeepStr: '10'))
        disableConcurrentBuilds()
        timeout(time: 1, unit: 'HOURS')
      }
      triggers{
        pollSCM('* * * * *')
      }
      stages{
        stage('Init'){
          steps{
            // Init everything required to deploy our infra
            sh 'make init'
          }
        }
        stage('Test'){
          steps{
            // Run tests to validate changes
            sh 'make test'
          }
        }
        stage('Deploy'){
          steps{
            // Deploy changes in production
            sh 'make deploy'
          }
        }
      }
      post{
        always {
          sh 'make notify'
        }
      }
    }

    Secret credentials

    Managing secrets is a big subject that brings many different requirements, some of which are very hard to fulfill. For obvious reasons, we couldn’t publish the credentials used within the infra project. On the other hand, we needed to keep track of them and share them, particularly with the Jenkins node that deploys our cluster. This means that we needed a way to encrypt and decrypt those credentials depending on permissions, environments, etc. We analyzed two different approaches to handle this:

    1. Storing secrets in a key management tool like Key Vault or Vault and using them as Kubernetes "secret" resources.
      → Unfortunately, these tools are not yet integrated in Kubernetes, but we may come back to this option later. Kubernetes issue: 10439

    2. Publishing and encrypting using a public GPG key.
      This means that everybody can encrypt credentials for the infrastructure project but only the owner of the private key can decrypt credentials.
      This solution implies:

      • Scripting: as secrets need to be decrypted at deployment time.

      • Templates: as secret values will change depending on the environment.
        → Each Jenkins node should only have the private key to decrypt secrets associated with its environment.
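
    The second approach can be sketched as two small helpers that build the gpg command lines: one to encrypt a file for a given environment, one to decrypt it at deployment time. The recipient addresses, the file layout, and the function names are hypothetical; only the gpg options themselves (--batch, --yes, --recipient, --output, --encrypt, --decrypt) are standard.

```python
import subprocess

# Hypothetical mapping from environment to the public key used for encryption.
RECIPIENTS = {
    "dev": "infra-dev@example.org",
    "prod": "infra-prod@example.org",
}


def encrypt_command(environment, plaintext_path):
    """Build the gpg command that encrypts a file for one environment."""
    return [
        "gpg", "--batch", "--yes",
        "--recipient", RECIPIENTS[environment],
        "--output", f"{plaintext_path}.gpg",
        "--encrypt", plaintext_path,
    ]


def decrypt_command(encrypted_path):
    """Build the gpg command that decrypts a '.gpg' file next to itself."""
    if not encrypted_path.endswith(".gpg"):
        raise ValueError("expected a path ending in .gpg")
    return [
        "gpg", "--batch", "--yes",
        "--output", encrypted_path[: -len(".gpg")],
        "--decrypt", encrypted_path,
    ]


def run(command):
    """Run a command and fail loudly if gpg returns a non-zero status."""
    subprocess.run(command, check=True)
```

    Anyone holding the public key can run the encrypt command, while the decrypt command only succeeds on a node that holds the matching private key.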

    Scripting

    In the end, the system we had built was hard to work with. Our initial Jenkinsfile, which only ran one kubectl command, slowly became a bunch of scripts to accommodate:

    • Resources needing to be updated only in some situations.

    • Secrets needing to be encrypted/decrypted.

    • Tests needing to be run.

    Eventually, the number of scripts required to deploy the Kubernetes resources became unwieldy and we began asking ourselves: "aren’t we re-inventing the wheel?"

    The Puppet Way

    The Jenkins project already uses Puppet, so we decided to look at using Puppet to orchestrate our container deployment with Kubernetes.

    Workflow
      _________________       ____________________       _____________
      |                |      |                   |      |            |
      |    Github:     |      |     Jenkins:      |      | Puppet:    |
      | Commit trigger | ---->| Test & Validation | ---->| Deployment |
      |________________|      |___________________|      |____________|

    In this workflow, Puppet is used to template and deploy all the Kubernetes configuration files needed to orchestrate our cluster. Puppet is also used to automate basic kubectl operations such as 'apply' or 'delete' for various resources based on file changes.

    Puppet workflow
    ______________________
    |                     |
    |  Puppet Code:       |
    |    .                |
    |    ├── apply.pp     |
    |    ├── kubectl.pp   |
    |    ├── params.pp    |
    |    └── resources    |
    |        ├── lego.pp  |
    |        └── nginx.pp |
    |_____________________|
              |                                        _________________________________
              |                                       |                                |
              |                                       |  Host: Prod orchestrator       |
              |                                       |    /home/k8s/                  |
              |                                       |    .                           |
              |                                       |    └── resources               |
              | Puppet generate workspace             |        ├── lego                |
              └-------------------------------------->|        │   ├── configmap.yaml  |
                Puppet apply workspaces' resources on |        │   ├── deployment.yaml |
              ----------------------------------------|        │   └── namespace.yaml  |
              |                                       |        └── nginx               |
              v                                       |            ├── deployment.yaml |
     ______________                                   |            ├── namespace.yaml  |
     |     Azure:  |                                  |            └── service.yaml    |
     | K8s Cluster |                                  |________________________________|
     |_____________|

    The main benefit of this approach is letting Puppet manage the environment and run common tasks. In the following example, we define a Puppet class for Datadog.

    Puppet class for resource Datadog
    # Deploy datadog resources on kubernetes cluster
    #   Class: profile::kubernetes::resources::datadog
    #
    #   This class deploys a datadog agent on each kubernetes node
    #
    #   Parameters:
    #     $apiKey:
    #       Contains the datadog api key.
    #       Used in secret template
    class profile::kubernetes::resources::datadog (
        $apiKey = base64('encode', $::datadog_agent::api_key, 'strict')
      ){
      include ::stdlib
      include profile::kubernetes::params
      require profile::kubernetes::kubectl
    
      file { "${profile::kubernetes::params::resources}/datadog":
        ensure => 'directory',
        owner  => $profile::kubernetes::params::user,
      }
    
      profile::kubernetes::apply { 'datadog/secret.yaml':
        parameters => {
            'apiKey' => $apiKey
        },
      }
      profile::kubernetes::apply { 'datadog/daemonset.yaml':}
      profile::kubernetes::apply { 'datadog/deployment.yaml':}
    
      # As secret changes do not trigger pod updates,
      # we must reload pods 'manually' in order to use updated secrets.
      # If we delete a pod defined by a daemonset,
      # the daemonset will recreate it automatically.
      exec { 'Reload datadog pods':
        path        => ["${profile::kubernetes::params::bin}/"],
        command     => 'kubectl delete pods -l app=datadog',
        refreshonly => true,
        environment => ["KUBECONFIG=${profile::kubernetes::params::home}/.kube/config"] ,
        logoutput   => true,
        subscribe   => [
          Exec['apply datadog/secret.yaml'],
          Exec['apply datadog/daemonset.yaml'],
        ],
      }
    }
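
    The datadog/secret.yaml template itself is not shown here. As a rough idea of what templating a secret involves, the sketch below base64-encodes a value, as the class does for $apiKey, and renders a Kubernetes Secret manifest, whose data values must be base64-encoded. The secret name, the data key, and the function names are assumptions.

```python
import base64


def encode_secret(value):
    """Base64-encode a secret value without line breaks."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def render_secret(name, api_key):
    """Render a Secret manifest; 'api-key' is a hypothetical data key."""
    return (
        "apiVersion: v1\n"
        "kind: Secret\n"
        "metadata:\n"
        f"  name: {name}\n"
        "type: Opaque\n"
        "data:\n"
        f"  api-key: {encode_secret(api_key)}\n"
    )
```

    Base64 is an encoding, not encryption, which is why the clear-text value still has to be protected upstream.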

    Let’s compare the Puppet way with the challenges discovered with the Jenkins way.

    Order Matters

    With Puppet, it becomes easier to define priorities, as Puppet provides relationship metaparameters and the 'require' function (see also: Puppet relationships).

    In our Datadog example, we can be sure that deployment will respect the following order:

    datadog/secret.yaml -> datadog/daemonset.yaml -> datadog/deployment.yaml
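
    The idea behind declared relationships can be illustrated outside Puppet: once each resource states what it depends on, the order is computed instead of being encoded in file names. The sketch below uses Python's standard graphlib module (Python 3.9 or later) and a hypothetical dependency map for the three Datadog files; it illustrates the principle and is not how Puppet is implemented.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each resource lists what must be applied first.
DEPENDENCIES = {
    "datadog/secret.yaml": set(),
    "datadog/daemonset.yaml": {"datadog/secret.yaml"},
    "datadog/deployment.yaml": {"datadog/daemonset.yaml"},
}


def deployment_order(dependencies):
    """Return resources in an order that respects every dependency.

    Raises graphlib.CycleError if the declared dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())
```

    With the map above, deployment_order(DEPENDENCIES) yields the secret first, then the daemonset, then the deployment.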

    Currently, our Puppet code only applies configuration when it detects file changes. It would be better to compare local files with the cluster configuration in order to trigger the required updates, but we haven’t found a good way to implement this yet.
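
    File-change detection of this kind can be sketched with content checksums: a resource is re-applied only when its digest differs from the one recorded on the previous run. The function names and the shape of the recorded state are hypothetical. The sketch shares the limitation described above, since it compares local files with their previous local state and not with what is actually running in the cluster.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def changed_files(paths, previous):
    """Compare files with the digests recorded on the previous run.

    Returns (changed, current): the paths whose content is new or
    different, and the digest map to record for the next run.
    """
    current = {str(path): file_digest(path) for path in paths}
    changed = [path for path, digest in current.items()
               if previous.get(path) != digest]
    return changed, current
```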

    Portability

    As Puppet is used to configure working environments, it becomes easier to be sure that all tools are present and correctly configured. It’s also easier to replicate environments and run tests on them with tools like RSpec-puppet, Serverspec or Vagrant.

    In our Datadog example, we can also easily change the Datadog API key depending on the environment with Hiera.

    Secret credentials

    As we were already using Hiera GPG with Puppet, we decided to continue to use it, making managing secrets for containers very simple.

    Scripting

    Of course, this means using the Puppet DSL, and even if it seems harder at the beginning, Puppet greatly simplifies the management of Kubernetes configuration files.

    Conclusion

    It was much easier to bootstrap the project with a full CI workflow within Jenkins as long as the Kubernetes project itself stayed basic. But as soon as the project grew, and we started deploying different applications with different configurations per environment, it became easier to delegate Kubernetes management to Puppet.

    If you have any comments, feel free to send a message to the Jenkins Infra mailing list.

    Thanks

    Thanks to Lindsay Vanheyste, Jean Marc Meessen, and Damien Duportal for their feedback.

    About the Author
    Olivier Vernin

    Olivier is the Jenkins infrastructure officer and Senior Operations Engineer at CloudBees. As a regular contributor to the Jenkins infrastructure projects, he works on a wide range of tasks from services reliability to applications maintenance.