Sunday, July 14, 2019

Network Engineering is Dying (Except at Cloud Providers)

Hey all!

This past week I spoke to a recruiter for one of the "gang of 4" largest companies in tech - the term refers to Google, Amazon, Facebook, and Apple (and sometimes Microsoft). The recruiter pitched me on a network engineering role - something that I've happily done for years now.

For the past 20+ years, network engineering teams at most companies have maintained the networks connecting the computers that serve up every internet service we interact with each day. Network engineers make sure redundancies exist for the inevitable failures of a network that spans the globe, and they verify the health of all the hardware devices and interfaces which run the network.

A common analogy for network engineering is building the roads for the application "cars" to drive upon.

These jobs have been stable and profitable, integral to the growth and stability of any company that wants to use the internet to drive its business (read: all of them). Most would jump at the opportunity to take any job at these companies. These jobs sparkle on resumes, and even if the day-to-day is similar to most other jobs in the industry, the looming profile these companies have in the news cycle means it'd be foolish to write off an employment opportunity like this one.

However, the world of network engineering is changing. Many would say dying.

With the exception of maybe a dozen companies on the planet, nearly every company is moving away from physical data centers. IT orgs struggle with the long lead times required to make changes in physical data centers. Purchasing hardware, organizing cabling standards, cooling, 24x7 staffing, and dozens of other concerns are simply avoided by moving to the cloud.

Ironically, the only companies who aren't decreasing their data center footprint? Cloud providers. 

Because of the increased demand, cloud providers are growing their physical data centers at an incredible rate. This requires hiring network engineers, data center engineers, and others with the skillsets to grow them in a scalable way.

The gilded cage of skillset lock-in
The problem, of course, is skillset lock-in. Not only do most of the gang of 4 famously build their own tooling, but their business model is shared by almost no other company on the globe - to build world-spanning data centers and massive internet-scale networks.

Only cloud providers still invest in physical data centers - and the skillsets required to run them.

Spending part of your career at one of these companies in a department focused on these legacy networks is a career dead-end because of this skillset lock-in. It'll be difficult for the folks locked into these positions to leave the very small network of a dozen or so companies that provide these massive clouds and take a job just about anywhere else, because those other companies are looking for site reliability engineers (SREs), DevOps engineers, and any number of other software-defined cloud computing experts - roles that need entirely different skillsets than those harbored within these divisions at the gang of 4.

If you have the opportunity to work in these divisions at cloud providers, good luck to you! Their famously great pay and benefits are nothing to scoff at. But I hope you consider my points above about career lock-in. Your career must be played as a strategic long-game, and I worry these jobs might be the wrong move.

Best of luck out there.

Sunday, July 7, 2019

Sync Terraform Config and .tfstate for Existing AWS Resources

Hey all!

Terraform is a great (and dominant) infrastructure automation tool. It is multi-cloud, can build all sorts of resources, and in some cases supports API calls to build resources before the native tooling from cloud providers does.

However, Terraform depends on a local state file, which only reflects resources created by Terraform, and a local configuration file to describe resources. It's not able to reach out to a cloud account and create a configuration and .tfstate file based on existing resources that were built via another method. Or at least, it isn't able to yet. The scaffolding for this functionality exists within Terraform for the AWS cloud, and is called the "import" functionality. It can map a single existing resource to a single configuration block for the same resource type and fill in the state info - which is, of course, a manual and tedious process. And imagine if you have hundreds (or thousands!) of resources. It isn't a feasible way to move forward.
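To make the one-at-a-time pain concrete, here's a sketch of that import flow in shell. The `import_one` helper is purely illustrative (not a real Terraform command), and the resource name and instance ID in the example are hypothetical:

```shell
# Illustrative helper for the one-resource-at-a-time flow described above.
# "terraform import" only populates the state file; the configuration block
# must be hand-written, and its arguments filled in by hand afterwards.
import_one() {
  local resource_type="$1" name="$2" id="$3"

  # 1. Hand-write an empty configuration block for the resource:
  cat >> imported.tf <<EOF
resource "${resource_type}" "${name}" {
  # fill in arguments by hand after the import
}
EOF

  # 2. Map the existing cloud resource onto that block in terraform.tfstate:
  terraform import "${resource_type}.${name}" "$id"
}

# Example, with a hypothetical EC2 instance ID:
# import_one aws_instance web i-0123456789abcdef0
```

Multiply those two steps by every resource in your account and the tedium becomes obvious.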

Terraforming (link) is a wrapper around terraform that can map multiple resources at a time to configuration blocks, as well as build .tfstate files for multiple existing resource types. Still, it's a little awkward to use - only a single resource type can be imported at a time, and if a command is run against a resource type you aren't using (say you don't have a batch configuration, and run a sync against the batch resource), it wipes out the existing .tfstate entirely, removing your progress.
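A typical terraforming session therefore goes one resource type at a time - something like this sketch. The `sync_type` wrapper is mine, not part of the tool; the subcommand names (s3, ec2, and so on) come from `terraforming help`:

```shell
# Illustrative wrapper around terraforming's per-resource-type workflow.
sync_type() {
  local type="$1"   # a terraforming subcommand, e.g. "s3", "ec2", "sg"

  # Append generated HCL for every existing resource of this type:
  terraforming "$type" >> generated.tf

  # Write the matching state entries, merging if a state file already exists:
  if [ -f terraform.tfstate ]; then
    terraforming "$type" --tfstate --merge=terraform.tfstate --overwrite
  else
    terraforming "$type" --tfstate > terraform.tfstate
  fi
}

# Example:
# sync_type s3
```

Run that per type and you've synced a few resource types - but there's no built-in way to walk everything in a region at once.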

Clearly, the tools could use some help, so I wrote some. I imagine both of these tools (terraform import & terraforming) will eventually get this same functionality natively - both are open source, and I'll work on adding it to them via PRs.

However, for the time being, I'm publishing my code which permits:

  • Creating from scratch a .tfstate file for every terraforming supported resource in an AWS region
  • Creating a single (monolithic) configuration file covering every existing resource in an AWS region

This code assumes you don't have an existing .tfstate file - in fact, it wipes out your existing local .tfstate file and builds a new one. So please back up your .tfstate and configuration files before running this tool.

However, if you're new to terraform and want to sync the configuration to an existing AWS region and look at the config for all the resources that exist there, this is a neat shortcut.
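In outline, the approach is roughly this - a simplified sketch, not the published code itself. The handful of type names shown is an illustrative subset; the real list is the full set of subcommands that `terraforming help` prints:

```shell
# Simplified sketch: rebuild the state file and a monolithic config from
# scratch by walking terraforming's supported resource types.
sync_region() {
  # Destructive: starts from a clean slate, so back up these files first!
  rm -f terraform.tfstate generated.tf

  # A small illustrative subset of terraforming subcommands
  # (iamr = IAM roles, sg = security groups):
  for type in s3 ec2 sg vpc iamr; do
    # Append the generated HCL for every existing resource of this type:
    terraforming "$type" >> generated.tf

    # Create the state file on the first pass, merge on later passes:
    if [ -f terraform.tfstate ]; then
      terraforming "$type" --tfstate --merge=terraform.tfstate --overwrite
    else
      terraforming "$type" --tfstate > terraform.tfstate
    fi
  done
}
```

The create-then-merge branch is the part that guards against terraforming's state-wiping quirk mentioned above.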

Rather than post the code here and update each time I (or you! The code is open source) update it, I'll post a link to the public github repo. 

I hope it's useful to you. Please add any corrections, comments, and feature additions you'd like via pull requests. And if you know how to update the terraform or terraforming source code to add these functionalities and make my code obsolete, please do so! That would be the best case scenario. 

Thanks all. Good luck out there!