# Adventures in Terraforming an F5 BigIP cluster - part 1

My team at work has had a project on the back burner for more than a year now to migrate our key sites to a pair of F5 BigIP devices. It keeps getting pushed back when some other urgent project comes up. I decided to take a stab at making it easier to set up these “Virtual Servers”, as they’re called in BigIP terminology.

A year or more ago I attended a local F5 user group meeting here in Portland, put on by the awesome local sales rep for F5. She would set up these kinds of training lunches on a quarterly basis as a way to get customers familiar with new technologies F5 was selling. There was always a catered lunch during the product demonstration, followed by a working lab session where everyone got to work through the training labs on their laptops with the help of the trainer and the other sales engineers who were there. Some of the sessions were not really relevant for our use, but I remember one that demonstrated using Ansible and the BigIP REST API to do some cool stuff.

Fast forward a year and Ansible seems to have lost a bit of its shine - at least for me. It’s still popular, but it is not always the best tool for the job. Ansible is good at making repeatable changes to multiple systems, but getting it to manage the complete desired state of a system can be tricky. For building things in the cloud, the new cool toy is Terraform.

In the past our company has been slow to move things to the cloud. Previous IT management had a real aversion to “the cloud” and wanted everything done in-house. Well, they’re gone now and we’re playing catch-up. For some of our systems it just makes more sense to run them in the cloud, since it lets us stand up systems that are geographically dispersed but still close to our customers. The teams in our company that were early adopters have done a lot of work in Terraform. It’s a great tool for specifying in code what you want your infrastructure to look like in your chosen cloud.

Since most of my projects are “in-house” stuff, I hadn’t had much chance to work with Terraform. I’d taken a couple of O’Reilly training classes, but hadn’t really had the opportunity to put it to use.

When I finally decided that I needed to get with the times and work on something that actually used Terraform, I realized that the F5 migration was a good place to start. F5 has written a provider for BigIP that lets you define the virtual servers on the F5 as Terraform resources. A “provider” is just a plugin that tells Terraform how to work with whatever device, cloud platform, virtualization platform, etc. that you want it to manage. It does the translation from the resources you describe in Terraform code to the REST API calls needed to query, provision, and destroy the objects in that platform.
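As a rough sketch of what that ends up looking like (the hostname, credentials, and virtual server details below are made up for illustration, not our actual config):

```hcl
terraform {
  required_providers {
    bigip = {
      source = "F5Networks/bigip"
    }
  }
}

# Credentials for the BigIP management interface.
variable "bigip_password" {
  type = string
}

# How Terraform reaches the BigIP. The address and username are placeholders.
provider "bigip" {
  address  = "bigip-mgmt.example.com"
  username = "admin"
  password = var.bigip_password
}

# A hypothetical HTTPS virtual server pointing at an existing pool.
resource "bigip_ltm_virtual_server" "example_site" {
  name        = "/Common/example_site_vs"
  destination = "10.0.0.50"
  port        = 443
  pool        = "/Common/example_site_pool"
}
```

Run `terraform plan` and the provider queries the REST API to figure out what (if anything) needs to change; `terraform apply` makes the calls to actually create or update the virtual server.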

## Terraforming as a Team

I started this from the beginning with the idea that it needs to be usable by multiple people at once. Our team works very asynchronously - each of us making changes to our various Puppet or Ansible code bases in our own branches and merging things when we’re done. Terraform work is a little different. By default it works in local mode, where it assumes it has exclusive access to manage the resources defined in your code. If you’re the only one managing these systems, it works fine that way. But as soon as your teammate needs to make some changes while you’re on vacation, there’s a problem, because the “state” of the resources is kept in your local Terraform working directory on your laptop. His copy of the Terraform code base doesn’t have knowledge of the current state and will try to recreate everything when he runs it.

So, the very first thing I did was look into setting up a back-end that could be shared by multiple users and that supported the locking mechanisms necessary to make sure two Terraform runs don’t happen at the same time (which is bad). Most of the training and documentation punts on this and just says to use either Terraform Cloud or an S3 bucket in AWS. Our company wasn’t going to pay for Terraform Cloud, so that was off the table. Even though we do have AWS accounts that I could easily put an S3 bucket in, I felt like it was not the best way to go. That may come back to bite me later, but since this is an “on-prem” device managing local services, it made more sense to me to keep that state “on-prem” as well. My manager tells me I’m paranoid, but we’ll see who’s right or wrong when we have an internet outage at the data center and nobody can make any changes because they can’t get to the S3 bucket. On second thought, maybe I am paranoid.

Looking through the list of Terraform back-ends, the only one that made sense given my self-imposed constraints was the PostgreSQL back-end. We use a lot of Postgres databases in our company and we have some great DBAs managing them, so I figured if I had any trouble I would at least have someone around to help with this part of the system.
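The configuration for it is pretty minimal. Here’s a sketch, assuming a database and role created just for Terraform - the connection string is a placeholder, and in a real setup it would come from the `PG_CONN_STR` environment variable or a partial back-end configuration rather than being committed to the repo:

```hcl
terraform {
  # Keep state in a shared PostgreSQL database instead of a local
  # terraform.tfstate file. The connection string below is a placeholder.
  backend "pg" {
    conn_str = "postgres://terraform:secret@pgstate.example.com/terraform_backend"
  }
}
```

As I understand it, the pg back-end handles the locking side with Postgres advisory locks, which takes care of the “two runs at once” problem.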

## Postgres In a Container

Somehow I got the idea that the best way to stand up a Postgres database was to run it in a container. I think it IS a good idea, but getting there was a struggle. Our company tends to be (*ahem*) “slow” to adopt new OS versions. Without getting into specifics, let’s just say getting a current version of Docker installed was way more work than it should have been. Also, our Puppet code base still has an ancient version of the docker module to manage images and containers. I think it was the last version the author released before Puppet merged his code in as an official Puppet module. Anyway, our version has no support at all for the recent changes Docker made to package naming and repository locations. I ended up having to fork the module and update the portions that assumed the old names and URLs to get it all to work.

Eventually, I got a VM running the official PostgreSQL 13.2 Docker image in a container. Does it make sense to run a dedicated VM just to run a container? Probably not, but we don’t have much of a container infrastructure yet, and I figure that with the data and configuration living on the VM and bind-mounted into the container, upgrading Postgres should be a fairly simple matter. We’ll see if that holds true.
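For reference, the container invocation is roughly the following - the container name, host path, and password are placeholders for this sketch, not necessarily what I used:

```bash
# Run the official Postgres image, keeping the data directory on the VM
# so the container can be replaced or upgraded without losing the state DB.
docker run -d \
  --name terraform-state-db \
  --restart unless-stopped \
  -p 5432:5432 \
  -e POSTGRES_PASSWORD=changeme \
  -v /srv/postgres/data:/var/lib/postgresql/data \
  postgres:13.2
```

The bind mount is the important part: the state database lives on the VM’s filesystem, so the container itself stays disposable.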