<h1>Robert’s Blog</h1>
<p>A blog about technology.</p>

<h2><a href="https://robertnorthard.com/writing-a-custom-terraform-provider">Writing a custom Terraform provider!</a> (2019-10-29)</h2>
<p>So, how do you develop a custom Terraform provider? There are a lot of blog posts like this, but the purpose of this one is to help me learn. In this post we will develop a custom Terraform provider that exposes a data source for retrieving the Coinbase API endpoint.</p>
<p>First, I read the <a href="https://www.terraform.io/docs/extend/writing-custom-providers.html">Terraform plugin SDK getting started guide</a>, which is a good overview of the SDK used for developing a custom Terraform provider. This blog post assumes you have all of the Go prerequisites set up successfully.</p>
<p>Next, we need to create two boilerplate files:</p>
<ul>
<li>provider.go - this is the core of the plugin. It declares the resources and data sources that the provider exposes via its interface</li>
<li>main.go - the entry point when the binary is run; it serves the plugin by invoking the “Provider” function from provider.go</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Provider.go
package main
import (
"github.com/hashicorp/terraform-plugin-sdk/helper/schema"
)
func Provider() *schema.Provider {
return &schema.Provider{}
}
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main
import (
"github.com/hashicorp/terraform-plugin-sdk/plugin"
"github.com/hashicorp/terraform-plugin-sdk/terraform"
)
func main() {
plugin.Serve(&plugin.ServeOpts{
ProviderFunc: func() terraform.ResourceProvider {
return Provider()
},
})
}
</code></pre></div></div>
<p>Now that we have our boilerplate code set up, let’s start writing our new data source. First we define the Terraform interface we are working towards for the provider. Based on the below, we want to create a new data source which returns the Coinbase “api_endpoint”. Data sources are read-only, so for this basic example we do not have to persist anything in the statefile.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data "coinbase_address" "api_address" {}
output "endpoint" {
value = ["${data.coinbase_address.api_address.api_endpoint}"]
}
</code></pre></div></div>
<p>Next, let’s create our data source code in a new Go file called “data_address.go”. By convention for a provider we name the file type_name, where type is the Terraform component it provides (e.g. data for a data source, or resource).</p>
<p>Create the file, with the following:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
package main
import (
"github.com/hashicorp/terraform-plugin-sdk/helper/schema"
)
func dataCoinbaseAddress() *schema.Resource {
return &schema.Resource{
Read: resourceCoinbaseSourceAddressRead,
Schema: map[string]*schema.Schema{
"api_endpoint": &schema.Schema{
Type: schema.TypeString,
Computed: true,
},
},
}
}
func resourceCoinbaseSourceAddressRead(d *schema.ResourceData, m interface{}) error {
endpoint := "https://api.coinbase.com/v2"
d.SetId(endpoint)
d.Set("api_endpoint", endpoint)
return nil
}
</code></pre></div></div>
<p>Breaking this down a bit: the <code class="language-plaintext highlighter-rouge">dataCoinbaseAddress</code> function returns a new Resource. “Read” is the function executed when refreshing the Terraform data source and “Schema” defines the data source’s attributes. This resource has one attribute named “api_endpoint”, which is computed and therefore not provided by the user.</p>
<p>The <code class="language-plaintext highlighter-rouge">resourceCoinbaseSourceAddressRead</code> function takes in <code class="language-plaintext highlighter-rouge">ResourceData</code>, which holds the user-provided attributes of the data source. Using <code class="language-plaintext highlighter-rouge">d.Set("api_endpoint", endpoint)</code> we can then set an attribute on this <code class="language-plaintext highlighter-rouge">ResourceData</code> struct. ResourceData is used for CRUD-style operations on the resource, e.g. querying or setting attributes.</p>
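<p>One gap in the boilerplate above: the <code class="language-plaintext highlighter-rouge">Provider</code> function currently returns an empty <code class="language-plaintext highlighter-rouge">schema.Provider</code>, so Terraform does not yet know the data source exists. With the v1 plugin SDK this wiring is typically done through the provider’s <code class="language-plaintext highlighter-rouge">DataSourcesMap</code>; a minimal sketch (the map key is the name used in Terraform configuration):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// provider.go - register the data source so Terraform can resolve
// data "coinbase_address" blocks
package main

import (
    "github.com/hashicorp/terraform-plugin-sdk/helper/schema"
)

func Provider() *schema.Provider {
    return &schema.Provider{
        DataSourcesMap: map[string]*schema.Resource{
            "coinbase_address": dataCoinbaseAddress(),
        },
    }
}
</code></pre></div></div>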
<p>To build the provider you will need to run the following Go commands:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>go get -v
go build -o terraform-provider-coinbase
</code></pre></div></div>
<p>To test this, create a new folder <code class="language-plaintext highlighter-rouge">example/</code> and a file named <code class="language-plaintext highlighter-rouge">main.tf</code> with the following contents.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>data "coinbase_address" "api_address" {}
output "endpoint" {
value = ["${data.coinbase_address.api_address.api_endpoint}"]
}
</code></pre></div></div>
<p>Now, run <code class="language-plaintext highlighter-rouge">terraform init</code> followed by <code class="language-plaintext highlighter-rouge">terraform plan</code> and <code class="language-plaintext highlighter-rouge">terraform apply</code>, targeting the <code class="language-plaintext highlighter-rouge">example/</code> folder. You should get output similar to the below:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>terraform init example/
Initializing the backend...
Initializing provider plugins...
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
terraform plan example/
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
data.coinbase_address.api_address: Refreshing state...
------------------------------------------------------------------------
No changes. Infrastructure is up-to-date.
This means that Terraform did not detect any differences between your
configuration and real physical resources that exist. As a result, no
actions need to be performed.
terraform apply example/
data.coinbase_address.api_address: Refreshing state...
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
endpoint = [
"https://api.coinbase.com/v2",
]
</code></pre></div></div>
<p>Checkout the <a href="https://github.com/RobertNorthard/terraform-provider-skeleton">GitHub repo</a> for the full example source code, along with example Terraform code for testing.</p>
<p>~Robert</p>

<h2><a href="https://robertnorthard.com/improving-maintenance-jenkins-pipelines">Improving maintenance of your Jenkins(file) pipelines</a> (2019-10-20)</h2>
<p>Do you have hundreds of repositories with Jenkinsfiles that look very similar, and do you spend hours updating them after the slightest change?</p>
<p>Declarative Jenkins pipelines are great. They make it easier to create multi-branch pipelines (e.g. run tests or even build environments on feature branches). However, with all these Jenkinsfiles everywhere, maintaining them and sharing assets can be a challenge.</p>
<p>Imagine an environment where there are 10 services and therefore 10 copies of the same Jenkinsfile. To change a step we have to update the Jenkinsfile in 10 different places. With this level of decoupling, teams will struggle to standardise and to benefit from efficiency as they scale.</p>
<p>To try to resolve this you can look at developing Jenkins shared libraries. Libraries enable you to encapsulate build steps in re-usable methods. A shared library is stored in Git and retrieved as part of Jenkins pipeline builds. Jenkins pipelines can then import methods from this library - see the example code below. This is great: we can now build a pipeline with shared step implementations. The library can also be versioned. I prefer using the latest and greatest, but if you roll back to a previous deployment you want to be sure you are using the same version of the deployment scripts.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
@Library('shared-library') _ // no version specified, so always pull the latest and greatest
pipeline {
agent none
stages {
stage('Deploy') {
steps {
deployToEnv('DEV') // shared steps
}
}
}
}
</code></pre></div></div>
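<p>On the library side, a step such as <code class="language-plaintext highlighter-rouge">deployToEnv</code> would typically live as a global variable under the shared library’s <code class="language-plaintext highlighter-rouge">vars/</code> directory. A minimal sketch - the <code class="language-plaintext highlighter-rouge">deploy.sh</code> script it calls is purely illustrative:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// vars/deployToEnv.groovy - exposes deployToEnv(env) to every pipeline importing the library
def call(String environment) {
    echo "Deploying to ${environment}"
    sh "./deploy.sh ${environment}" // placeholder for the real deployment steps
}
</code></pre></div></div>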
<p>Shared libraries are a good start as they enable standard methods that product teams can use for their CI / CD steps. Teams still have control over the order of stages in the pipeline. In my opinion, though, we should be opinionated about what the stages of a CI / CD pipeline for, say, Java should look like.</p>
<p>Imagine we have made this improvement and then scale again. You have 100 services (same technology) spread across 100 repositories, and we want to add a new “stage” to the pipeline for enhanced security testing. With this type of change the product teams would still have to update 100 repositories, as a new stage would need to be added to each. How might you tackle this?</p>
<p>Linking into a previous blog post you could jump to buildpacks, but let’s solve the problem with Jenkins pipelines first.</p>
<p>To resolve this, alongside adding specific Jenkins stage step implementations to the shared library you can add the whole pipeline as a class that is configurable. Product teams could then add Jenkinsfile to their repositories and instantiate the pipeline suitable for their application - see example below.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
@Library('shared-library') // no version specified always pull the latest and greatest
import pipeline.JavaPipeline;
new JavaPipeline(
name: "ServiceA"
);
</code></pre></div></div>
<p>This approach brings greater efficiency at scale (why reinvent the wheel for a typical Java CI / CD pipeline?) and enables standardisation and control.</p>
<p>~Robert</p>

<h2><a href="https://robertnorthard.com/why-use-terraform">Why use Terraform</a> (2019-09-15)</h2>
<p>I am often asked whether we should use Terraform or the cloud provider’s native tooling (e.g. AWS CloudFormation, Azure ARM templates). Terraform is great: like a software design pattern, it provides a common language for orchestrating your infrastructure provisioning. I am a fan of the “monorepo” when developing Terraform code, for the purpose of dependency management, versioning all infrastructure code as one unit and therefore faster development. It makes sense to decouple modules from the monorepo if they are to be consumed by other parties or if you are building a centralised asset catalogue.</p>
<p>How does this work? Well, the monorepo would consist of a number of modules and a single Terraform main.tf file to call these modules. You might also have InSpec tests and a Makefile to orchestrate Terraform commands in a CI / CD pipeline.</p>
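<p>As a rough sketch (module names, variables and outputs here are purely illustrative), the root main.tf simply composes the modules that live alongside it in the repository:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># main.tf at the root of the monorepo, calling local modules
module "network" {
  source     = "./modules/network"
  cidr_block = "10.0.0.0/16"
}

module "compute" {
  source    = "./modules/compute"
  subnet_id = module.network.subnet_id # assumes the network module declares a subnet_id output
}
</code></pre></div></div>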
<p>But why would you use Terraform over other infrastructure-as-code tooling:</p>
<h3 id="support-for-cross-cloud-and-other-services">Support for cross cloud and other services</h3>
<p>Terraform’s range of providers lets you configure resources across multiple clouds and beyond (for example Kubernetes clusters), enabling you to create one-click infrastructure and minimise post-provisioning scripts.</p>
<h3 id="simple-common-language">Simple Common Language</h3>
<p>Terraform code is simple: the syntax is easy to learn and it is easy to write modular code.</p>
<h3 id="templating">Templating</h3>
<p>The ability to override default values and to template files makes it a good candidate for creating multiple environments from the same code base without changing the code.</p>
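<p>For example (file names and values below are illustrative), the same code base can be pointed at different environments purely through variable files:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># variables.tf - sensible defaults for development
variable "instance_count" {
  default = 1
}

# environments/production.tfvars overrides the default:
#   instance_count = 3

# terraform apply -var-file=environments/production.tfvars
</code></pre></div></div>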
<p>Why would you not use Terraform:</p>
<h3 id="api-support">API support</h3>
<p>Terraform naturally lags behind the cloud providers’ APIs, so the latest and greatest features are often not available for new services.</p>
<h3 id="statefile">Statefile</h3>
<p>You have to manage a statefile, so you have a ‘chicken and egg’ problem: to use Terraform you need a storage location for the state, and to provision this you would need to use the cloud provider’s native tool of choice. HashiCorp have recently announced Terraform Cloud remote state management, which should resolve this if you are comfortable with them storing the statefile.</p>
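<p>To make the ‘chicken and egg’ concrete, a remote backend configuration looks something like the sketch below (the bucket name is illustrative); the bucket it refers to is exactly the thing that has to exist before <code class="language-plaintext highlighter-rouge">terraform init</code> can run:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>terraform {
  backend "s3" {
    bucket = "my-terraform-state"       # must already exist
    key    = "global/terraform.tfstate"
    region = "eu-west-1"
  }
}
</code></pre></div></div>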
<h3 id="secrets">Secrets</h3>
<p>The statefile can contain secrets if you are passing them into services.</p>
<h3 id="you-have-a-large-existing-state-managed-by-cloud-providers-native">You have a large existing state managed by cloud providers native</h3>
<p>The <code class="language-plaintext highlighter-rouge">terraform import</code> command will have to be used, which is very time-consuming.</p>
<p>Do you use Terraform?</p>
<p>~Robert</p>

<h2><a href="https://robertnorthard.com/lets-talk-about-buildpacks">Let’s talk about application buildpacks</a> (2019-07-13)</h2>
<p>In micro/nano service development every service should have its own pipeline and be independently deployable. In this world the source code repository would have a pipeline definition, a deployment template (Helm chart or deploy.sh), a definition of the application runtime (Dockerfile) and maybe some ignore files and security policy files (e.g. exclusions).</p>
<p>Defining these components in every service’s source control repository is tiresome, difficult to update and govern centrally, and does not scale when templates need to be changed. You could argue that if you defined this centrally your services would not be decoupled from one another, but in order to scale you need shared assets and templates.</p>
<p>How could you achieve this? Well, buildpacks (not new - they have been around a while, e.g. Heroku, Cloud Foundry, GitLab CI). Buildpacks group these common components together to enable an application to be deployed and run, and are defined for various technologies (Java, JavaScript, Go).</p>
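<p>With the Cloud Native Buildpacks tooling, for example, consuming a buildpack is a single command (the image and builder names below are illustrative):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Build an OCI image for the application without writing a Dockerfile
$ pack build my-service --builder heroku/buildpacks:18
</code></pre></div></div>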
<p>But who owns these buildpacks? Well, you might have an SRE team responsible for running applications and defining central buildpacks. Yes, you are abstracting the mechanics away from the developers, and that is a trade-off, but nothing stops developers writing buildpacks themselves - they just need to be centrally catalogued and governed.</p>
<p>How might you implement buildpacks? That’s for another post.</p>
<p>Additional resources:</p>
<ul>
<li><a href="https://buildpacks.io/#learn-more">buildpacks.io</a></li>
</ul>
<p>~Robert</p>

<h2><a href="https://robertnorthard.com/jailbreak-escaping-the-container">#jailbreak - Escaping the container</a> (2019-05-25)</h2>
<p>It’s common knowledge that Docker containers should not be run in privileged mode with a shared host PID namespace, but why? In this case a malicious actor, assuming they have access to the container via exec or a vulnerable application, could gain root access to the host using a program like <code class="language-plaintext highlighter-rouge">nsenter</code>. <code class="language-plaintext highlighter-rouge">nsenter</code> enables you to execute a process in the context of another process, for example PID 1.</p>
<p>The following provides a concrete example of escaping a container, giving you access to the host’s file system and the ability to execute shell commands on it.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
$ docker-machine start default
$ docker run --privileged --rm --pid=host -ti ubuntu
// This gives you access to the hosts file system
$ nsenter --target 1 --mount sh
</code></pre></div></div>
<p>This is one of many ways you can escape a container. When running containers you will also want to drop other <a href="http://man7.org/linux/man-pages/man7/capabilities.7.html">capabilities</a>, such as the ability to reboot the host from inside the container.</p>
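<p>As a sketch (not a hardened configuration), capabilities can be dropped wholesale when starting a container:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Drop every capability; even root inside the container can no longer use
# capabilities such as CAP_SYS_BOOT
$ docker run --rm -ti --cap-drop=ALL ubuntu
</code></pre></div></div>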
<p>~ Robert</p>

<h2><a href="https://robertnorthard.com/its-okay-to-fail">Fail faster, it’s okay to fail</a> (2019-03-10)</h2>
<p>It’s okay to fail! If you’re not failing you may not be learning effectively. Maybe failing and making mistakes through trial and error is how you learn best, but perseverance is key. Failing fast is a key principle of DevOps: it reduces the impact of failure (which could be financial, by identifying defects earlier - see the image below from Agile Modeling).</p>
<p><a href="http://www.agilemodeling.com/essays/costOfChange.htm"><img src="https://robertnorthard.com/assets/images/cost-of-change.jpg" alt="Cost of change" title="Cost of Change" /></a></p>
<p>As a DevOps enthusiast I’m interested in ways of failing faster and improving efficiency. A recent example: I like readable code (we all do), which means keeping it well formatted (e.g. using the terraform fmt command). I often ignore formatting warnings in my editor and forget to execute unit tests for small changes, and the build then fails in CI. This is waste - we could have failed faster. We should shift further left, out of the pipeline and into the IDE, so I forced myself to fix errors prior to committing by installing a pre-commit hook - this can also be shared with the team.</p>
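<p>A pre-commit hook can be as small as the sketch below (assuming a Terraform code base); it lives at <code class="language-plaintext highlighter-rouge">.git/hooks/pre-commit</code> and must be executable:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/bin/sh
# Block the commit if Terraform files are not formatted - cheaper than waiting for CI
terraform fmt -check || {
  echo "Formatting issues found: run 'terraform fmt' and try again"
  exit 1
}
</code></pre></div></div>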
<p>Every time the pipeline breaks because of a code formatting error, assuming we are following a trunk-based development approach, someone in the team may have to stop and fix it before they can progress their own change, slowing development - this is waste. Should code formatting even fail the build if the code functionally works?</p>
<p>So what can you do to fail faster and reduce risk:</p>
<ul>
<li>Install a pre-commit hook if you forget to validate before committing (e.g. scan for secrets, execute unit tests etc)</li>
<li>Work to a minimum viable product (MVP)</li>
<li>Use metrics and data / feedback to guide solutions</li>
<li>Execute experiments - Hypothesis Driven Development</li>
<li>Don’t over-optimise, not everything needs to be a “golden” solution - as in my previous blog <a href="https://robertnorthard.com/devops-days-well-architected-monoliths-are-okay/">post</a> building a “well architected monolith” is more appropriate for eliciting requirements than building a custom micro-service architecture that takes months to fail or deliver value</li>
<li>Establish CI pipelines from day 1</li>
</ul>
<p>So when you are working on your next project, going out for dinner without a reservation or taking a car for an MOT, ask yourself how you can fail faster!</p>
<p>~ Robert</p>

<h2><a href="https://robertnorthard.com/terraform-inspec-iaac-testing">Terraform + InSpec for automated Infra as Code testing</a> (2019-02-07)</h2>
<p>Our infrastructure CI / CD pipeline is a key enabler for failing fast, controlling releases to environments, delivering with agility and ensuring parity between environments. The typical steps include: linting, Terraform validate, plan and apply.</p>
<p>We often face issues where Terraform apply passes and changes are promoted to higher environments, but the cloud account is not in the desired state. “Null resources” are not idempotent, so try to avoid them at all costs. Manual testing is also slow and painful. We are also aspiring to continuous deployment for infrastructure as code, and we need tests to achieve this.</p>
<p>After a brief investigation we discovered a few approaches:</p>
<ul>
<li>InSpec – RSpec audit and compliance as code framework for testing cloud account configuration</li>
<li>Terratest</li>
<li>AWSSpec</li>
<li>Test Kitchen terraform driver – Manages Terraform lifecycle + InSpec</li>
</ul>
<p>We settled on using InSpec because:</p>
<ul>
<li>It’s well documented</li>
<li>Tests are readable, follows RSpec BDD framework</li>
<li>Has libraries for testing with different cloud providers GCP, Azure, AWS</li>
<li>We didn’t need Test Kitchen as we already use a Makefile to manage the Terraform lifecycle of our environments and would not benefit from further abstraction. We use a Makefile to abstract build steps from the Jenkinsfile, but also to ensure our CI environment and developers’ workflows are identical.</li>
</ul>
<p>What did we do:</p>
<ul>
<li>We updated our Dockerfile.ci, which defines our Jenkins slave execution environment and lives alongside our Jenkinsfile and infrastructure code, to include Ruby and the InSpec gem.</li>
<li>Extended our Makefile with an action to invoke InSpec and pass in an “attributes” file which contains environment-specific config (e.g. an environment resource prefix for the resource naming convention)</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>test-development:
	inspec detect -t azure://
	inspec vendor test --overwrite
	inspec exec test -t azure://$(AZURE_SUBSCRIPTION_ID) --attrs test/fixtures/development.yml
</code></pre></div></div>
<ul>
<li>Developed some simple tests to validate the Terraform code (e.g. virtual networks exist with the correct tags, in the correct regions)</li>
</ul>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>describe azurerm_virtual_network(resource_group: resource_group, name: vnet_name) do
it { should exist }
its('location') { should eq 'ukwest' }
its('type') { should eq 'Microsoft.Network/virtualNetworks' }
end
</code></pre></div></div>
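<p>The <code class="language-plaintext highlighter-rouge">resource_group</code> and <code class="language-plaintext highlighter-rouge">vnet_name</code> values used above come from the attributes file passed via <code class="language-plaintext highlighter-rouge">--attrs</code>; a sketch of the wiring (names and values are illustrative):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># test/fixtures/development.yml
#   resource_group: rg-development-network
#   vnet_name: vnet-development-ukwest

# At the top of the InSpec control file, read the environment-specific values:
resource_group = attribute('resource_group', description: 'Resource group to test')
vnet_name      = attribute('vnet_name', description: 'Expected virtual network name')
</code></pre></div></div>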
<ul>
<li>Developed some general tests to validate security conformance:
<ul>
<li>No Network Security Group rules allowing any protocol, any destination address and any port</li>
<li>All NSGs have explicit deny all inbound and outbound rules</li>
</ul>
</li>
</ul>
<p>In my opinion we are not yet using this to its full potential, but it’s a great first step towards validating our Terraform code, and it can also be used for security compliance. The next step is to look into CIS benchmark implementations in InSpec.</p>
<p>~ Robert</p>

<h2><a href="https://robertnorthard.com/devops-days-well-architected-monoliths-are-okay">Well Architected Monoliths are Okay</a> (2018-09-21)</h2>
<p>DevOps Days London was great this year! The talks were interesting and the culture was inclusive and friendly.</p>
<p>I’ve always thought that we should build the ‘correct size service’ rather than ‘microservices’ just for the sake of having them. It is equivalent to using Kubernetes with one docker container - you would have more etcd nodes…</p>
<p>In this blog post I would like to share one of the outcomes of a DevOps Days open space discussion (suggested by <a href="https://twitter.com/millscj01">Chris Mills</a>) on what a good monolith looks like and why monoliths are okay to start with.</p>
<p>So why are well architected and developed monoliths okay:</p>
<ul>
<li>Fail fast - they let development teams focus on delivering features (to prove or disprove a hypothesis) rather than a complicated microservice architecture</li>
<li>It helps you to understand your requirements (UML diagrams and domain models are not perfect first time they need to evolve)</li>
<li>Microservices are complicated to develop (e.g. graceful degradation, health checks, retries) and monitor</li>
<li>Microservices dependencies are difficult to track</li>
</ul>
<p>So what does a good monolith look like:</p>
<ul>
<li>The code base is modularised by component (e.g. invoices, projects)</li>
<li>Asynchronous communication between components should use a queue (e.g. RabbitMQ). A single code base is publishing and consuming messages</li>
<li>If using queues run them in a separate process (e.g. RabbitMQ docker container)</li>
</ul>
<p><a href="https://robertnorthard.com/assets/images/21-09-18-monolith.png" title="Monolith Architecture"><img src="https://robertnorthard.com/assets/images/21-09-18-monolith.png" alt="Revised Architecture" title="Monolith Architecture" /></a></p>
<p>Once the application is proven, then it would be a good opportunity to start decomposing if required:</p>
<ul>
<li>to support horizontal scaling</li>
<li>to reduce the risk of deployments</li>
<li>to distribute development of components of the application to other squads</li>
<li>to simplify debugging and maintenance</li>
</ul>
<p>What are well architected monoliths?</p>
<p>Participants in the discussion concluded they are a good step for bad monoliths to evolve to before splitting into microservices, because the monolith has been battle tested.</p>
<p>~ Robert</p>

<h2><a href="https://robertnorthard.com/meetup-jenkins-at-telegraph">Jenkins Meetup @ Telegraph</a> (2018-03-29)</h2>
<p>Co-presented “Scaling Jenkins to over 150 projects” with Michael Dukes.</p>
<p>
<iframe width="420" height="315" src="https://www.youtube.com/embed/15-fbmWJFYc" frameborder="0" allowfullscreen=""></iframe>
</p>
<p>~ Robert</p>

<h2><a href="https://robertnorthard.com/recommended-reading">Recommended reading</a> (2017-09-17)</h2>
<p>A list of whitepapers to read:</p>
<ul>
<li>
<p><a href="https://www.usenix.org/system/files/conference/woot13/woot13-kholia.pdf" title="Looking inside (Drop)Box">Looking inside the (Drop) box</a></p>
</li>
<li>
<p><a href="https://www.ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation.pdf" title="Architectural Styles and the Design of Network-based Software Architectures">Architectural Styles and the Design of Network-based Software Architectures</a></p>
</li>
<li>
<p><a href="https://12factor.net/" title="12 Factor App">12 Factor App</a></p>
</li>
<li>
<p><a href="https://content.pivotal.io/ebooks/beyond-the-12-factor-app" title="Beyond the 12 Factor App">Beyond the 12 Factor app</a></p>
</li>
</ul>