optimizing spot instance allocation

There is surprisingly little information on how to optimize costs using the AWS spot instance market. Several services provide management on top of the spot market if your architecture supports an interruptible workload, but there is very little on how to do it yourself beyond surface-level advice on setting up autoscaling groups. To remedy the situation, here’s an outline of how I’ve solved part of the problem for CI (continuous integration) type workloads.

learn some math

This is in response to the math myth.

The author couldn’t be more wrong if he tried. My current project uses GLPK to do some basic mixed integer programming to optimize AWS spot instance allocation. If I had not taken linear algebra, calculus, and a few courses in linear programming, it would never have crossed my mind that I could use mixed integer programming to solve the spot allocation problem, and that’s just the first half.
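To make the shape of that first half concrete, here is a minimal sketch of a spot allocation problem written as a tiny integer program: pick an integer count of each instance type so that total capacity covers demand at minimum hourly cost. This is not the GLPK model from my project. The instance names, vCPU counts, and prices are made up for illustration, and the "solver" is a plain exhaustive search that only works because the example is so small. A real solver such as GLPK would take the same objective and constraints and handle far larger problems.

```python
from itertools import product

# Hypothetical offers: (name, vCPUs, spot price in $/hour). Made-up numbers.
OFFERS = [
    ("small", 2, 0.03),
    ("medium", 4, 0.05),
    ("large", 8, 0.11),
]


def cheapest_allocation(offers, required_vcpus, max_per_type):
    """Return (cost, counts) for the cheapest integer allocation, or None.

    Decision variables: one integer count per instance type, 0..max_per_type.
    Constraint: total vCPUs >= required_vcpus.
    Objective: minimize total hourly price.
    """
    best = None
    # Enumerate every combination of integer counts (fine for a toy problem).
    for counts in product(range(max_per_type + 1), repeat=len(offers)):
        vcpus = sum(n * offer[1] for n, offer in zip(counts, offers))
        if vcpus < required_vcpus:
            continue  # infeasible: not enough capacity
        cost = sum(n * offer[2] for n, offer in zip(counts, offers))
        if best is None or cost < best[0]:
            names = [offer[0] for offer in offers]
            best = (cost, dict(zip(names, counts)))
    return best


if __name__ == "__main__":
    print(cheapest_allocation(OFFERS, required_vcpus=10, max_per_type=3))
```

The integer constraint is what makes this a mixed integer problem rather than a plain linear program: you cannot rent half an instance, so rounding a fractional solution is not guaranteed to be optimal or even feasible.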

The second half can be considered a problem in control theory because it requires taking the new allocations and gracefully transitioning from the old set of allocations. You can go even further and say that the whole thing would be even better if I understood more about stochastic processes and could potentially model the spot market and make predictions ahead of time to simplify the control problem and get ahead of the price fluctuations.

Saying all you need is Excel and 8th grade math is, in the words of one famous physicist, “not even wrong”. If you’re in an engineering discipline, then the more math you know the better.

varnish, nginx, and S3

A while back I wrote a proxy for accessing an S3 bucket that transparently handles decrypting objects before serving them. It combines some things I’ve already talked about in what I consider interesting ways.

Disclaimer: The proxy is in no way production worthy. There are much better S3 proxies out there with much better documentation. This post is just to document a trick that I came up with to avoid using a database.

basic infrastructure patterns

Some basic patterns I’ve noticed come up over and over again while working on build/CI/deployment related things.

Pipeline

The pipeline is the bedrock on which pretty much everything else is built. A properly designed pipeline takes well-defined inputs and produces well-defined outputs.
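As a rough sketch of that idea, a pipeline can be modeled as an ordered list of stages, where each stage takes the previous stage's output as its input. The stage names and the dictionary fields below are hypothetical and exist only for illustration. They are not taken from any particular CI system.

```python
from functools import reduce


def checkout(state):
    # Input: a repo name. Output: the same state plus a source directory.
    return {**state, "source_dir": f"/tmp/{state['repo']}"}


def build(state):
    # Input: a source directory. Output: the path of a built artifact.
    return {**state, "artifact": f"{state['source_dir']}/out.tar.gz"}


def publish(state):
    # Input: an artifact path. Output: a flag recording that it was published.
    return {**state, "published": True}


def run_pipeline(stages, initial):
    """Feed the initial input through each stage in order."""
    return reduce(lambda state, stage: stage(state), stages, initial)


if __name__ == "__main__":
    print(run_pipeline([checkout, build, publish], {"repo": "example"}))
```

Because each stage returns a new dictionary instead of mutating shared state, any stage can be run and tested on its own by handing it the input it expects.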

infrastructure as a database

The more I work with infrastructure as code, the more I dislike it. I think that’s because it’s fundamentally the wrong metaphor. Code is terrible: it’s brittle and constantly breaking, not to mention the versioning and dependency issues. Even with immutable infrastructure you still have pretty much the same problems. I’ve used pretty much every configuration management tool out there and found them all lacking.