ruby dsl tricks: reifying references

Ruby is great for writing DSLs because it has first-class support for two of the most important ingredients of a DSL: contexts and code blocks. With the proper use of instance_eval, the same block of code can be evaluated in different contexts to different effect, but most often what we want is to evaluate the block in the “freest” possible context to create an AST (abstract syntax tree). I’m almost certain there is a connection here with initial and terminal algebras in category theory, but someone smarter than me will have to chase that analogy. Today I’m just going to demonstrate how to reify references so that we can support cyclic structures in our DSL. Continue reading
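To give a flavor of the technique before the full post, here is a minimal sketch (all names are my own, not the post's): instance_eval runs the DSL block inside a builder, references to not-yet-defined nodes are recorded as placeholder objects, and a second pass reifies the placeholders into real nodes, which is what makes cyclic structures possible.

```ruby
# Hypothetical graph DSL: Ref is a placeholder for a node that may not
# exist yet; GraphBuilder evaluates the DSL block with instance_eval and
# then resolves every Ref into the actual Node in a second pass.
Ref  = Struct.new(:name)
Node = Struct.new(:name, :edges)

class GraphBuilder
  def initialize
    @nodes   = {}
    @pending = []
  end

  # node :a, to: [ref(:b)] -- :b may be defined later, or even be :a itself
  def node(name, to: [])
    @nodes[name] = Node.new(name, [])
    @pending << [name, to]
  end

  def ref(name)
    Ref.new(name)
  end

  def build(&block)
    instance_eval(&block)             # evaluate the DSL body in this context
    @pending.each do |name, targets|  # reify: swap each Ref for its Node
      @nodes[name].edges = targets.map { |t| @nodes.fetch(t.name) }
    end
    @nodes
  end
end

graph = GraphBuilder.new.build do
  node :a, to: [ref(:b)]
  node :b, to: [ref(:a)]              # a cycle: b points back at a
end
```

Because the placeholders are only resolved after the whole block has run, forward references and cycles both fall out for free.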

war story: caching

There was that one time I used strace and a Ruby script to bypass a really long step in a build pipeline. The trick was figuring out the inputs and outputs by running the process under strace and using strace’s output to compute some hashes. The core of the script was a utility class and some convenience methods for computing hashes by shelling out to find, tar, and shasum. Continue reading
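The core idea is to derive one content hash over the step's input files and use it as a cache key for the expensive build step. The post shells out to find, tar, and shasum; the sketch below swaps in Ruby's own Digest::SHA256 so it is self-contained and portable, and the module name is mine, not the post's.

```ruby
require "digest"

# Sketch of the cache-key computation: hash both the paths and their
# contents, in sorted order, so the key is stable regardless of the
# order the filesystem enumerates files in.
module InputHash
  module_function

  def for_paths(paths)
    digest = Digest::SHA256.new
    paths.sort.each do |path|
      digest << path
      digest << File.binread(path)
    end
    digest.hexdigest
  end
end
```

If the hash matches one computed on a previous run, the long build step can be skipped and its cached outputs reused.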

finger exercises: pipes, bytes, and fibers

In which I try to figure out how to pack and unpack bytes over an in-process pipe so that I can use it in some future message framing protocol for a worker pool. There will be a guest appearance by Fiber to simplify the parsing of messages in a non-blocking manner. Continue reading
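A sketch of the kind of framing this is working toward (the exact protocol here is my assumption, not the post's): each message is a 4-byte big-endian length (pack "N") followed by the payload, and a Fiber drives the parser so it can suspend whenever the buffer does not yet hold a complete message.

```ruby
# Length-prefixed framing over an in-process pipe, parsed by a Fiber
# that is fed arbitrary chunks and yields complete messages.
def frame(payload)
  [payload.bytesize].pack("N") + payload
end

def frame_parser
  Fiber.new do |chunk|
    buffer = "".b                      # binary buffer for raw bytes
    loop do
      buffer << chunk
      messages = []
      while buffer.bytesize >= 4 && buffer.bytesize >= 4 + (len = buffer.unpack1("N"))
        messages << buffer.byteslice(4, len)
        buffer = buffer.byteslice(4 + len, buffer.bytesize)
      end
      chunk = Fiber.yield(messages)    # [] means "feed me more bytes"
    end
  end
end

r, w = IO.pipe
w.write frame("hello") + frame("world")
w.close

parser = frame_parser
out = []
while (chunk = r.read(3))              # deliberately tiny reads to force reassembly
  out.concat parser.resume(chunk)
end
out  # => ["hello", "world"]
```

The Fiber keeps the partial-buffer state between chunks, so the reading loop stays non-blocking and oblivious to message boundaries.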

software hygiene: encrypt your secrets

Most of my projects have a Rakefile because common tasks should be expressed in code instead of English, and Rake is a great way to codify those common tasks. One thing I have seen developers do is check secret tokens into their repositories in plaintext. I have done this as well. It is the simplest thing to do, but it is terrible practice, so to atone for my past sins and get others to stop checking in secret tokens, here is some code I now use to handle them. Adapt it to your own workflow accordingly. Continue reading
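As a minimal sketch of the idea (not necessarily the post's exact code): the encrypted blob is what gets checked in, while the key stays out of the repository, here in a SECRET_KEY environment variable holding 64 hex characters. The variable name, module name, and choice of AES-256-GCM are all my assumptions.

```ruby
require "openssl"
require "base64"

# Hypothetical helper: encrypt a token for check-in, decrypt it at task
# run time. The key never touches the repository.
module SecretBox
  module_function

  def key
    [ENV.fetch("SECRET_KEY")].pack("H*")   # 64 hex chars -> 32 raw bytes
  end

  def encrypt(plaintext)
    cipher = OpenSSL::Cipher.new("aes-256-gcm").encrypt
    cipher.key = key
    iv = cipher.random_iv                  # 12 bytes for GCM
    ciphertext = cipher.update(plaintext) + cipher.final
    Base64.strict_encode64(iv + ciphertext + cipher.auth_tag)
  end

  def decrypt(blob)
    raw = Base64.strict_decode64(blob)
    iv, ciphertext, tag = raw[0, 12], raw[12..-17], raw[-16..]
    cipher = OpenSSL::Cipher.new("aes-256-gcm").decrypt
    cipher.key = key
    cipher.iv = iv
    cipher.auth_tag = tag                  # tampering makes final raise
    cipher.update(ciphertext) + cipher.final
  end
end
```

A Rake task can then call SecretBox.decrypt on the checked-in blob and hand the plaintext token to whatever needs it, without the token ever living in git history.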

optimizing spot instance allocation

There is surprisingly little information on how to optimize costs using the AWS spot instance market. Several services provide management on top of the spot market if you have an architecture that supports an interruptible workload, but there is very little on how to go about doing it yourself beyond surface-level advice on setting up autoscaling groups. To remedy the situation, here’s an outline of how I’ve solved part of the problem for a CI (continuous integration) type of workload. Continue reading

production grade logging

Logging to sockets is better than logging to files: it allows for more flexibility in terms of log rotation and data integrity. I started looking around for examples of this, but everything these days when it comes to logging is built for the enterprise. The actual skeleton of what all those enterprise systems are doing is quite simple. In fact, it is so simple that you can do it in less than 30 lines of code in most high-level languages. Here’s the skeleton for a logging server in Ruby: Continue reading
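One plausible shape for that skeleton (the post's actual code is behind the link; the socket path and details here are my guesses): accept connections on a UNIX domain socket and append every line received to a log file. Rotation then only has to swap the file behind the server instead of chasing file handles inside every application process.

```ruby
require "socket"

# Minimal log server sketch: one thread per client, each line received
# over the socket is appended to the log file and flushed immediately.
def run_log_server(socket_path, log_path)
  File.delete(socket_path) if File.exist?(socket_path)
  server = UNIXServer.new(socket_path)
  loop do
    client = server.accept
    Thread.new(client) do |conn|
      File.open(log_path, "a") do |log|
        log.sync = true                  # flush each line as it arrives
        while (line = conn.gets)
          log.write(line)
        end
      end
      conn.close
    end
  end
end
```

Applications then log by opening the socket and writing lines to it, which is exactly what most logging libraries can already be pointed at.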

production grade ruby interpreter deployment process

rvm, chruby, rbenv, etc. do not belong in a production environment. Even if you are deploying and co-hosting applications that require different versions of Ruby, those tools still do not belong in a production environment. All of those tools are strictly for development environments.

Binary shims and other hacks have no place in a production environment. Ideally you have one user per application, with a profile that sets up PATH to point to the right version of Ruby, which has been compiled and deployed wholesale ahead of time. This is actually quite simple, and it is a one-time operation if you do it right and package the binary bits with an RPM or Debian package. Heck, even a tar file would work if you’re willing to have some extra deployment logic, and these days you can use any number of devops tools like Chef and Ansible to codify the initial production environment setup as well. Continue reading

simple in-memory store with consistent reads

Problem Statement

Suppose you want to write a simple in-memory JSON store with an equally simple socket-based protocol. You want this in-memory store to support parallel and consistent reads. By “parallel reads” I mean that if 10 clients request data from the store, no client should block any other client. By “consistent reads” I mean that when a client requests some data from the store, there is absolutely no way that client gets half of the data before a write and the other half after a write, and there is also some kind of ordering for reads and writes. In other words, if we have an array “[1,2,3,4]” that corresponds to the key “ints” in our JSON store, then the following sequence of events is impossible: Continue reading
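One way to get both properties (the post's own design is behind the link; this sketch is my assumption) is copy-on-write snapshots: readers grab a reference to an immutable snapshot without taking any lock, so they never block each other and can never observe a half-applied write, while writers serialize on a mutex and atomically swap in a whole new frozen snapshot.

```ruby
# Copy-on-write store sketch: reads are a single reference fetch of a
# frozen Hash; writes build a fresh Hash and swap it in under a mutex.
class Store
  def initialize
    @snapshot   = {}.freeze
    @write_lock = Mutex.new
  end

  def read(key)
    @snapshot[key]   # one reference read: always a complete snapshot
  end

  def write(key, value)
    @write_lock.synchronize do
      next_snapshot = @snapshot.dup
      next_snapshot[key] = value
      @snapshot = next_snapshot.freeze   # readers see old or new, never a mix
    end
  end
end
```

Because the whole snapshot is replaced in one assignment, every read observes the store as of some specific write, which also gives the ordering guarantee the problem statement asks for.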