war story: caching

There was that one time I used strace and a Ruby script to bypass a really long step in a build pipeline. The trick was figuring out the step's inputs and outputs by running the process under strace, then using the strace output to compute some hashes. The core of the script was a utility class and some convenience methods for computing hashes by shelling out to find, tar, and shasum.
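To give a feel for the idea, here is a minimal sketch, not the original script. It assumes the trace came from something like `strace -f -e trace=open,openat -o trace.log <build command>`, and the names `traced_paths` and `aggregate_hash` are made up for illustration. It also swaps the shelling out to find, tar, and shasum for Ruby's built-in Digest library so the example stays self-contained.

```ruby
require "digest"
require "set"

# Matches successful or failed open()/openat() lines in strace output, e.g.
#   1234 openat(AT_FDCWD, "src/main.c", O_RDONLY|O_CLOEXEC) = 3
# Captures: path, flags (plus mode, if any), and the return value.
OPEN_CALL = /\b(?:open|openat)\((?:[^,"]+,\s*)?"([^"]+)",\s*([^)]*)\)\s*=\s*(-?\d+)/

# Hypothetical helper: split traced paths into files that were read (inputs)
# and files that were opened for writing (outputs).
def traced_paths(strace_lines)
  inputs = Set.new
  outputs = Set.new
  strace_lines.each do |line|
    match = OPEN_CALL.match(line)
    next unless match

    path = match[1]
    flags = match[2]
    result = match[3].to_i
    next if result.negative? # skip failed opens such as ENOENT

    if flags =~ /O_WRONLY|O_RDWR|O_CREAT/
      outputs << path
    else
      inputs << path
    end
  end
  [inputs.to_a.sort, outputs.to_a.sort]
end

# Hypothetical helper: one hash over all input files. Paths are sorted so the
# result does not depend on the order in which the process opened them.
def aggregate_hash(paths)
  digest = Digest::SHA256.new
  paths.sort.each do |path|
    next unless File.file?(path) # ignore directories, devices, vanished files
    digest << path << "\0" << Digest::SHA256.file(path).hexdigest << "\n"
  end
  digest.hexdigest
end

# Example usage (trace.log is an assumed file name):
#   inputs, outputs = traced_paths(File.readlines("trace.log"))
#   key = aggregate_hash(inputs)
```

Hashing the path together with the content means a renamed input also changes the key, which is the conservative choice for a build cache.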

That utility class was used in conjunction with some other files in various folders that were evaled at runtime.

Once the aggregate hash was computed, it was used to generate a tar file of the required outputs.

The resulting cache file was then moved around and shared across hosts so that whoever else needed the outputs could just compute the aggregate hash, then download and unpack the file. The end result was that build times for that step dropped from 5-30 minutes to less than 10 seconds on average, and given how many times a day we ran that step, that added up to quite a bit of savings.
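The pack-and-restore flow could look roughly like the sketch below. It is an illustration under assumptions rather than the original code: the archive is named after the aggregate hash, a plain shared directory stands in for whatever transport actually moved files between hosts, and `cache_path`, `store_outputs`, and `restore_outputs` are hypothetical names. It shells out to tar, so it expects a tar binary on the PATH.

```ruby
require "fileutils"

# Hypothetical layout: one archive per aggregate hash in a shared directory.
def cache_path(cache_dir, hash)
  File.join(cache_dir, "#{hash}.tar.gz")
end

# Pack the given outputs (paths relative to work_dir) into the cache.
# Writes to a temporary name first so readers never see a partial archive.
def store_outputs(cache_dir, hash, work_dir, relative_outputs)
  FileUtils.mkdir_p(cache_dir)
  archive = cache_path(cache_dir, hash)
  tmp = "#{archive}.tmp.#{Process.pid}"
  ok = system("tar", "-czf", tmp, "-C", work_dir, "--", *relative_outputs)
  raise "tar failed while packing #{archive}" unless ok

  File.rename(tmp, archive)
  archive
end

# Unpack cached outputs into work_dir. Returns false on a cache miss so the
# caller can fall back to running the slow build step.
def restore_outputs(cache_dir, hash, work_dir)
  archive = cache_path(cache_dir, hash)
  return false unless File.exist?(archive)

  FileUtils.mkdir_p(work_dir)
  ok = system("tar", "-xzf", archive, "-C", work_dir)
  raise "tar failed while unpacking #{archive}" unless ok

  true
end

# Example usage (key is the aggregate hash of the inputs; paths are assumed):
#   unless restore_outputs("/shared/cache", key, Dir.pwd)
#     run_slow_build_step   # hypothetical
#     store_outputs("/shared/cache", key, Dir.pwd, ["out"])
#   end
```

On a cache hit the slow step is skipped entirely and the only cost left is hashing the inputs and unpacking the archive.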