You don’t need HashiCorp’s Vault

There are a few things I dislike about the programming industry. Much of what programmers do is driven by fads and trends. There is a lot of cargo culting with little critical analysis. This is especially true when it comes to DevOps tools and practices. Today I’m going to argue that you don’t need to deploy and manage any kind of secret token management system, e.g. Vault, if your workloads are already running in the cloud. I’m going to argue that all you need is a set of GPG/AES keys and whatever key management system (KMS) is offered by your cloud provider. Google has Cloud KMS and Amazon has AWS KMS. I’m sure Microsoft has one too but the point is they’re all equivalent and basically have the same API. For the rest of this post I’m just going to generically refer to all these solutions as KMS.

First I’m going to outline the use case I have in mind. There are cases where you would want to use Vault, so I want to be specific about the one I’m addressing and show how Vault can be entirely avoided for it.

Most web applications need to talk to various other web services. For security purposes these services require secret tokens to be supplied with each request so they can validate who they’re talking to. For example, AWS APIs require that whoever is making the request signs it with HMAC-SHA256. I’m not going to go into the details because you can just search and read up on why they do this. The relevant part for us is that HMAC-SHA256 requires a secret key, which AWS issues to its clients. In the case of AWS there are much better ways to make requests: if you are inside the AWS network and running on an AWS VM then you already implicitly trust them, so you can just use IAM instance profiles to get the necessary keys and tokens at runtime. We can’t do this for 3rd party services, so we have to get the token and store it somewhere that is accessible to our application.
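To make the signing idea concrete, here is a minimal sketch of signing and verifying a request body with HMAC-SHA256 using Python’s standard library. It shows only the underlying primitive, not AWS’s actual request-signing scheme, and the function names, the secret, and the request body are all made up for illustration.

```python
import hashlib
import hmac


def sign_request(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 signature of body under secret."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, body: bytes, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_request(secret, body)
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    secret = b"example-shared-secret"  # hypothetical; never hard-code a real one
    body = b'{"action": "list"}'
    sig = sign_request(secret, body)
    print(verify_request(secret, body, sig))         # the untampered body verifies
    print(verify_request(secret, body + b"x", sig))  # a modified body does not
```

Both sides need the same secret, which is exactly why the token has to be stored somewhere the application can reach it.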

Already we have an issue. Secret tokens are basically configuration information, but we can’t store them like regular configuration alongside the code because that would defeat the purpose. Secret tokens need to remain secret, and if we check them into the codebase in plaintext then anyone who has access to the repository has access to the secrets. This is where GPG comes into play. The token needs to be stored and maybe even deployed alongside the application, and the easiest way to do that is to encrypt the secret with GPG and store the encrypted file alongside the code. I put all secret tokens in a folder called secrets/ and structure that folder in some sensible way to make it obvious which secret is in which file. Then I put that entire folder in .gitignore so that I don’t accidentally check in plaintext tokens, and encrypt all the files with an AES256 key. I’ve argued elsewhere why you should do this so I’m not going to repeat those arguments. GPG can encrypt to multiple recipients, so it also gives you basic access management out of the box. If you want to give other people access to the tokens then you import their public keys and re-encrypt the AES key for all the recipients. With that we have solved the durable storage problem. The next step is to give our applications access to those secrets at runtime.
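The multi-recipient step might look roughly like the sketch below, which shells out to the gpg CLI from Python. The helper names, file paths, and recipient addresses are hypothetical, and it assumes gpg is installed and the recipients’ public keys are already imported into your keyring.

```python
import subprocess
from typing import List, Sequence


def gpg_encrypt_command(infile: str, outfile: str, recipients: Sequence[str]) -> List[str]:
    """Build a gpg command that encrypts infile so any listed recipient can decrypt it."""
    if not recipients:
        raise ValueError("at least one recipient is required")
    cmd = ["gpg", "--batch", "--yes", "--output", outfile]
    for recipient in recipients:
        cmd += ["--recipient", recipient]
    cmd += ["--encrypt", infile]
    return cmd


def encrypt_for_recipients(infile: str, outfile: str, recipients: Sequence[str]) -> None:
    """Run gpg; raises CalledProcessError if gpg exits with a non-zero status."""
    subprocess.run(gpg_encrypt_command(infile, outfile, recipients), check=True)


if __name__ == "__main__":
    # Hypothetical paths and key IDs, shown only to illustrate the resulting command.
    print(gpg_encrypt_command(
        "master.key",
        "master.key.gpg",
        ["alice@example.com", "bob@example.com"],
    ))
```

Giving a new teammate access is then a matter of re-running the command with their key added to the recipient list.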

Now that the secrets are durably stored in encrypted form we can figure out how to make them available to the application at runtime. This is where KMS comes in. We generate another AES key in KMS and make that key available to the application at runtime. The key can be per secret, or it can be a single master key like the one we used to encrypt everything. (In theory we could also have used a different key per secret when checking in the secrets, and you probably want to do that for easier rotation and fine-grained access management.) Now all you need to do is make the original secret tokens available in some form, encrypted with the KMS key. There are many ways to do this; a few that come to mind are blob storage (like S3), volume snapshots, an NFS server, Redis, etc. So when the application starts it queries KMS for the key, queries the place the secret is stored, decrypts the secret with the KMS key and keeps it in memory for as long as necessary. Depending on your request volume you can avoid keeping the secret in memory and just re-query and decrypt it as necessary, but this is starting to get into premature optimization territory so I’m going to leave that for the future since it is also very application specific.
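The startup flow can be sketched as a small class that takes the three moving parts as plain functions. Everything named here (SecretStore, fetch_key, fetch_blob, decrypt, the secret name) is hypothetical, and the demo uses throwaway stand-ins rather than a real KMS client, storage backend, or AES implementation.

```python
from typing import Callable, Dict


class SecretStore:
    """Fetches and decrypts each secret once, then serves it from memory."""

    def __init__(
        self,
        fetch_key: Callable[[], bytes],
        fetch_blob: Callable[[str], bytes],
        decrypt: Callable[[bytes, bytes], bytes],
    ) -> None:
        self._fetch_key = fetch_key    # e.g. a call to your cloud KMS
        self._fetch_blob = fetch_blob  # e.g. a read from S3, NFS, or Redis
        self._decrypt = decrypt        # e.g. AES decryption from a vetted library
        self._cache: Dict[str, bytes] = {}

    def get(self, name: str) -> bytes:
        """Return the plaintext secret, loading it on first use."""
        if name not in self._cache:
            key = self._fetch_key()
            blob = self._fetch_blob(name)
            self._cache[name] = self._decrypt(key, blob)
        return self._cache[name]


if __name__ == "__main__":
    # Stand-ins so the sketch runs without any cloud account. The XOR "cipher"
    # is NOT real encryption; it only marks where AES decryption would go.
    def toy_xor(key: bytes, data: bytes) -> bytes:
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    fake_key = b"not-a-real-key"
    fake_storage = {"payments/api_token": toy_xor(fake_key, b"s3cr3t-token")}

    store = SecretStore(
        fetch_key=lambda: fake_key,
        fetch_blob=fake_storage.__getitem__,
        decrypt=toy_xor,
    )
    print(store.get("payments/api_token"))  # decrypted on the first call
    print(store.get("payments/api_token"))  # served from memory afterwards
```

Dropping the cache dictionary gives the re-query-on-every-use variant, at the cost of a KMS and storage round trip per lookup.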

So what did we accomplish? We used very basic and composable tools to durably and safely store secret tokens and also made them available to the application at runtime using nothing more than what is already available in every cloud provider. I’ve yet to run into a situation where using Vault was the right solution or justified the extra overhead of configuring and managing it in a production environment. Sticking to the basic cloud offerings makes the whole thing much simpler in my opinion. Now this isn’t to say the overhead of managing Vault is never justified. If you’re running everything in some enterprise environment with no access to any of the usual cloud offerings then go ahead and deploy Vault, but if you’re already in the cloud then it’s extra and unnecessary overhead.

Edit: After posting this on r/programming someone mentioned AWS Secrets Manager. I hadn’t heard about it before but it looks like a pretty good managed service if you are already in AWS.