Sunday, August 4, 2019

Connect Azure DevOps to AWS

Azure DevOps (ADO) is a CI/CD platform from Microsoft ($MSFT). It permits a great deal of flexibility in the type of code you run, in how jobs are structured and permissioned, and in how you create and manage resources and automated jobs. However, support for other cloud providers is (perhaps obviously) weaker than for $MSFT's native Azure cloud.

Still, that doesn't mean $MSFT hasn't made inroads into helping us connect Azure DevOps jobs to the other cloud providers.

I've spent the week researching how to integrate the two. The closest I could find were write-ups of specific use cases, like Elastic Beanstalk deployments (sans Terraform), or arguments about how things work and why. No one seems to have published the whole thing end to end, so I knew this challenge would make an interesting blog post. I've done my best to package up the code and lessons so you can get this going in your own lab as well.

Install the Microsoft DevLab Terraform Add-On Into Azure DevOps


So first of all, let's install the add-on. Make sure you're signed into dev.azure.com with whatever account you'd like to connect this service to, then go here: https://marketplace.visualstudio.com/items?itemName=ms-devlabs.custom-terraform-tasks

At the top, click the button that says "Get It Free".



Make sure your org is selected, then click "Install". Once complete, you're good to go. Head back to dev.azure.com (or click "Proceed to Organization") to get started.

Remote State, State Locking, Permissions


Azure DevOps builds these items for us in the Azure cloud, so we never have to worry about them there. When we're crossing clouds, though, we'll need to build a few items ourselves to enable Azure DevOps to take over and do its thing in AWS. The items we need to address are: 
  • Remote state storage - Terraform uses a state file to keep track of resources and map the TF configuration text to the resource IDs in the environment. We'll need to store that file somewhere; in AWS, the preferred method is an S3 bucket.
  • State locking - When Terraform is actively making changes to a remote state file, it locks the file so no one else can change it at the same time. This prevents the remote state file from being corrupted by multiple concurrent writes. The preferred way to handle this in AWS is a DynamoDB table.
  • Permissions - This is the most complex bit. We need to create an IAM user in AWS that ADO can connect as (authentication) and attach whatever IAM policies that user requires (authorization).

Catch-22, Immediately


Our next step is to build some resources in AWS to permit this connection (IAM), store the remote state (S3 bucket), and handle state locking (DynamoDB). What intuitively makes the most sense is to use our trusty Azure DevOps to run a Terraform job that builds those things.

But it's a catch-22 - we can't execute the Terraform job without the permissions already in place, and we can't very easily manage the AWS resources with Terraform if we build them by hand. So what do we do?

The best solution I've found is to create the Azure DevOps "seed" configuration in AWS via a Terraform run from my desktop, without using a remote state file. Once all the configuration is in place for Azure DevOps to take over, we'll upload the local state file from our desktop to the S3 bucket and start running our jobs from ADO.

Let's build some resources!

Local Terraform - S3, IAM, DynamoDB


Doing all this from the ground up is time-consuming and complex! So I did that work for you and created a cheat sheet of Terraform to help you get started.

https://github.com/KyMidd/AzureDevOps_Terraform_AwsSeed

This GitHub repo contains a few files you can use to get a running start. Make sure to preserve the folder structure - the main.tf file uses the path to the ado_seed to find it.

Let's walk through what we're doing in the main.tf file. The first block of main.tf initializes Terraform and requires that we use version 0.12.6 exactly. When you run Terraform it'll tell you if your version is behind; as of this writing, 0.12.6 is the latest release.

Then we define the provider - in this case, AWS. Change the region to whichever region you'd like. When we update this later for cloud hosting in ADO, we'll add a remote state location to this block; for now, though, we want to create resources in AWS from our computer and store the tfstate locally.

Then we call the ado_seed module and pass it some variables, which let you name the resources whatever makes sense to you. You'll also have a chance to look over the ado_seed module itself and see where that info ends up.
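If you'd rather see the shape of it first, here's a minimal sketch of that main.tf under the assumptions above - the module path and variable names are illustrative, so check the repo for the exact ones:

```hcl
# Pin Terraform to the exact version this walkthrough was written against
terraform {
  required_version = "= 0.12.6"
}

# The AWS provider - change the region to whatever you'd like
provider "aws" {
  region = "us-east-1"
}

# Call the ado_seed module and pass it naming info.
# Variable names here are illustrative; the repo's may differ.
module "ado_seed" {
  source = "./modules/ado_seed" # adjust to match the repo's folder layout

  s3_bucket_name      = "your-unique-tfstate-bucket-us-east-1"
  dynamodb_table_name = "ado-terraform-state-locking"
  iam_user_name       = "AzureDevOpsTerraform"
}
```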
Let's pop into the ado_seed module and see what TF code we're running. First, we're building the S3 bucket. The name of the S3 bucket can be anything, but it has to be globally unique. Also, the bucket lives in a single region, so it makes sense to include the region ID in the name for ease of use.

We're enabling strong encryption by default, versioning (so we keep a history as the state file changes), and a Terraform lifecycle setting called prevent_destroy, which makes TF error out before replacing or destroying the resource. That's good news for us - we'd be in trouble if our state file got destroyed.
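A sketch of what that bucket resource looks like in 0.12-era Terraform (the bucket name is illustrative and must be globally unique):

```hcl
resource "aws_s3_bucket" "terraform_state" {
  bucket = "your-unique-tfstate-bucket-us-east-1"

  # Keep a history of the state file as it changes
  versioning {
    enabled = true
  }

  # Encrypt the state file at rest by default
  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "AES256"
      }
    }
  }

  # Error out rather than destroy or replace the bucket - losing state is a bad day
  lifecycle {
    prevent_destroy = true
  }
}
```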
Then we build the DynamoDB table. Terraform uses this table for state locking: when Terraform is editing the state file in S3, it puts an entry into the table, and when it's done, it removes the entry. As long as every TF session is configured to use the same table, the state locking mechanism works. The primary key for this table is required to be named LockID.
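The locking table itself is tiny. A sketch, assuming pay-per-request billing:

```hcl
resource "aws_dynamodb_table" "terraform_state_lock" {
  name         = "ado-terraform-state-locking"
  billing_mode = "PAY_PER_REQUEST"

  # Terraform's S3 backend requires the primary key to be named LockID
  hash_key = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```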
Then we need to start on the IAM user, policy, and policy attachment. Bear with me, because the AWS implementation of permissions is incredibly verbose.

First, we create an IAM user. This is the identity we'll generate secret credentials for so that something else - in our case, ADO - can connect as it. The user itself doesn't carry any permissions; it's authentication only, with no authorization.
Then we create a policy for the IAM user, which is the list of permissions we grant it. I've done something here for simplicity that isn't good practice - note that the second rule in this policy grants our IAM user ALL rights to ALL resources. That's convenient, but not great if someone compromises this user. A better idea is to enumerate each permission your ADO service actually requires and grant only those.
Then we link the policy to the IAM user.
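Put together, the IAM portion looks roughly like this. The names and exact policy statements are illustrative - and, as noted above, the catch-all second statement is for lab convenience only:

```hcl
# The IAM user Azure DevOps will authenticate as (authentication only)
resource "aws_iam_user" "azure_devops" {
  name = "AzureDevOpsTerraform"
}

# The permissions we grant that user (authorization). The second statement
# grants everything on everything - convenient, but scope it down for real use.
resource "aws_iam_policy" "azure_devops" {
  name   = "AzureDevOpsTerraformPolicy"
  policy = <<POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "TerraformStateAccess",
      "Effect": "Allow",
      "Action": ["s3:*", "dynamodb:*"],
      "Resource": "*"
    },
    {
      "Sid": "EverythingElse",
      "Effect": "Allow",
      "Action": "*",
      "Resource": "*"
    }
  ]
}
POLICY
}

# Link the policy to the user
resource "aws_iam_user_policy_attachment" "azure_devops" {
  user       = aws_iam_user.azure_devops.name
  policy_arn = aws_iam_policy.azure_devops.arn
}
```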
Despite this step-by-step walkthrough, I'd recommend cloning the whole repo down to your computer to avoid syntax and spelling issues, and going from there.

Local AWS Authentication, TF Apply


Now that we understand all the steps, let's authenticate our local computer to our AWS environment and build these items. Log into your AWS account, click on your account name, then on "My Security Credentials".


Click on "Create New Access Key" and then copy down the data that is displayed. This credential provides root level access to your AWS account, so 100% do not share it. Copy down both before closing this window - it won't be displayed again.


Export that info to your terminal using this type of syntax:
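Something along these lines, substituting your own values (and whichever region you chose in the provider block):

```bash
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
export AWS_DEFAULT_REGION="us-east-1"
```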

Run "terraform init" and then "terraform apply" from your desktop in the directory where the main.tf is. Once you see the confirmation to create, type "yes" and hit enter. Terraform will report if there were any issues.


Now we have an S3 bucket for storage, a DynamoDB for locking, and an IAM user for authentication. Let's switch to Azure DevOps to move our Terraform jobs to the cloud!

State to S3, Create IAM Creds


Now that the environment is in the state we want, we need to make sure our cloud Terraform jobs know about that state as it exists right now. To do that, we'll upload our local terraform.tfstate file into the S3 bucket.

Head over to the S3 bucket and click on Upload in the top left. Find your terraform.tfstate file in the root of the directory where you ran "terraform apply" and upload it. All options can be left at their defaults.
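If you'd rather stay in the terminal you just authenticated (and you have the AWS CLI installed), a one-liner like this does the same thing - swap in your own bucket name:

```bash
aws s3 cp terraform.tfstate s3://your-unique-tfstate-bucket-us-east-1/terraform.tfstate
```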


Once that's done, we need to head into the IAM console to generate credentials for our new IAM user so we can hand them to ADO for authentication. Head over to the IAM console --> Users, find your user, and click it to jump in.

Click on the "Security credentials" tab, then click on "Create access key" to generate an IAM secret.


This IAM secret will only be shown once, so don't close the window yet. Copy down the Access key ID and Secret access key - we'll use them in the next section.
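As an aside, the AWS CLI can generate this key pair too if you prefer; the user name is whatever you passed into the seed module:

```bash
aws iam create-access-key --user-name AzureDevOpsTerraform
```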


Integrating Azure DevOps with AWS IAM


With that done, we're finally(!) ready to head over to Azure DevOps and add a service connection that uses this new IAM user and the credentials we just created.

Drop into ADO --> your project --> Project Settings in the bottom left. Under Pipelines, find "Service connections". Service connections are useful because they store and manage the secrets and configuration required to authenticate to a cloud environment; our Terraform jobs can consume this info and make our lives easier.

Click on "New service connection" in the top left of this panel and find the "AWS for Terraform" selection. If it's not listed there, head back up to the top of this blog and make sure to follow the steps under "Install the Microsoft DevLab Terraform Add-On Into Azure DevOps".


Fill in the information requested. The name is just a string - call it whatever makes sense to you. The access key ID and secret access key are the values we just generated for the IAM user. This ISN'T your own account's root access key - that would work, but it isn't a best practice, since the root user has unfettered access to the account rather than the permissions we set in the IAM policy assigned to this user. Also fill in the region: these service connections (and S3 buckets, for what it's worth) are only valid for the region they're created for, so if you need to deploy this stuff in multiple regions, you'll end up with multiple S3 buckets and multiple service connections, one pair per region.


Update Code, then push to ADO repo


We're moving all our workflows into the Azure DevOps cloud, which means our Terraform code needs to live there too. The only change we have to make before pushing this code to our ADO repo is to add the "backend s3" block to the terraform config in main.tf, like so:
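A sketch of that change is below. The ADO Terraform task we'll configure shortly passes the bucket, key, and region in at init time, so the block itself can stay empty (or you can hard-code those values here if you prefer):

```hcl
terraform {
  required_version = "= 0.12.6"

  # Remote state now lives in the S3 bucket we built from the desktop;
  # the ADO Terraform task supplies bucket, key, and region during init
  backend "s3" {}
}
```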

Since we're just starting this repo, you'll probably push directly to master. For info on how to do that (or how to start up a branch in git and add your changes to it), refer to previous blogs.

I put mine in the folders terraform / terraform-aws / main.tf.


Pipeline Time!


Now all the pieces are in place, and we can get to actually building the pipeline and setting up each step we'll need to actually DO stuff! These are exciting times. Let's do it.

I'm assuming the build pipeline for the terraform repo is already complete. If it isn't, refer back to previous blogs on this site on how to build that.

Head into your project, then click on Pipelines --> Releases. Click on "New" in the top left, then on New release pipeline to build a new one.



ADO really wants to be helpful, which can be somewhat confusing. Make sure to click "start with an empty job" to avoid the wizard.


Click on "Add an artifact" in the Artifacts box on the left to pull up the build artifact selection wizard.


On the "select an artifact" screen, find your terraform build pipeline, then probably use the "Latest" Default version. Click Add to head back to our release pipeline automation screen. 


Click on "Add a stage" to add the new Terraform Plan stage. 


Call the stage AWS Terraform Plan or something similar. This stage will only do validations and planning - no changes will be executed. That'll help us confirm our stages and configuration are working correctly before we move on to executing changes. 


Click the plus (+) sign on the right side of the agent job to add a step and search for "terraform". Look for the "Terraform tool installer" task - it'll handle installing whatever version of Terraform you specify.

Remember we required Terraform version 0.12.6 in our config files, so make sure to specify that exact version here.


Click the plus (+) sign on the agent job again and search for "terraform" once more. This time look for a step called just that - Terraform. There are several add-on modules that sound similar but have different capabilities, so look for the one that matches this picture. Click Add to put this step into your workflow.


Change the provider to AWS and set the TF command to "init". Also make sure to hit the three dots to the right of "Configuration directory" to browse to where the command should be executed - the location of your main.tf file.


Under the Amazon Web Services (AWS) backend configuration, find your Amazon Web Services connection - the service connection we built earlier that uses the IAM user. If nothing shows up, click the refresh button on the right or double-check you created the correct type of ADO service connection. Set the bucket name to the bucket you created from your desktop. Then set the "Key" to the path and name of your terraform.tfstate file in the S3 bucket. I just put mine in the root of the S3 bucket, so my key is simply "terraform.tfstate".

And boom, that's init. You could stop and test here, but I'd recommend adding a few more steps to make sure we're all good to go: a "terraform validate" and a "terraform plan" step in this same stage. The easiest way is to right-click the "Terraform Init" step we just created and click "Clone task(s)".


Update the third task to run "terraform validate" and the fourth to run "terraform plan". Each task requires slightly different information, but it's all information we've covered already. Once you've created them, the stage will look like this:


Once you feel good about it, click on "Save" in the top right, then click on "Create release", then "Create". Click on the release banner at the top to jump into the release logs. 


In my experience these screens freeze a lot, so be aware of the "Refresh" button at the top. If the "AWS Terraform Plan" stage fails, click into the logs to check out why. Click on the "terraform plan" step to see the CLI results. Hopefully yours looks like this too, which means everything has gone well: our ADO Terraform now has the same state we had locally, and all things are working.


Profit!


There ya go, a functional Azure DevOps Terraform pipeline to build and manage your resources in AWS. Woot! 

Try building your own resources and see how things go! Try tacking on pull requests, validation of PRs pre-merge, and anything else, and report back the cool things you find!

Good luck out there! 
kyler
