In signal processing and filter design there is the notion of a linear system. The nice thing about linear systems is that they can be safely analyzed by just poking and prodding them with an impulse. You push a certain form of signal through the system and observe the output and if for signal
s you get output
o then by linearity you can conclude that
2s will lead to
2o. Basically, linear systems are nice because you can add and scale inputs and expect the outputs to also add and scale appropriately. The trouble is that the world is not linear and all our intuitions fail us when dealing with non-linear systems. This is especially problematic in software because the scale of the complexity makes the systems we deal with essentially black boxes and the only view we have into them is by poking and prodding them with impulses that we understand and then assuming some kind of linearity. An anecdote from work about Docker and LXC follows.
When Docker was all the rage everyone and their grandma wanted to build developer tools around it in order to reduce the gap between development and production environments. The issue was that Docker was always in some kind of beta state. It was never stable and mature enough to act as a foundation for anything. If your software team is moving fast and Docker is moving fast and you are trying to build tools to bridge the gap then you are basically someone trying to do a split on top of two fast moving cars. It’s a recipe for disaster and I’ve never seen any kind of Docker tooling help the development workflow. Whatever you could do with Docker you could do with Vagrant without the extra layer of indirection because Docker does not have native support on Macs and you are running a VM somewhere anyway. What is the point of all this? The point is that I was once one of the people that tried to use Docker as a foundation for a scalable CI pipeline. Needless to say I failed miserably.
The impulse I poked Docker with was restarting containers, starting some process inside a container and then connecting to the container from another shell and killing the process, doing these operations in sequence, in parallel, and various other combinations of some basic “impulses” that you would expect Docker as a system to respond linearly to. In other words, if I was starting and stopping containers at some rate and then doubled that rate then I’d expect some slowdown from Docker that would be proportional to that rate. So I’d expect maybe a slowdown of between 2-4x if I doubled the rate of starting and stopping containers. Well, you can guess what happened. Docker is not a linear system. Even at moderate loads I got the Docker daemon to consistently crash and burn in all sorts of weird ways. Sometimes it would forget what filesystem it was running on. Sometimes host mounts would disappear. Sometimes the container would get stuck and be essentially unkillable. Sometimes the entire host would lock up and only a hard reboot would fix things. All of this wouldn’t be so bad if there wasn’t a mountain of code between me and those impulse responses. If I could somehow trace the impulse through the system then maybe I would have some hope of tracking things down but I gave up on that when I tried to build the Docker daemon from scratch and saw what was necessary to do that. Even after getting it built I couldn’t get it to reliably run anyway because there was always some kind of driver mismatch that would send me on some other wild goose chase in kernel land. So I gave up on Docker and looked around to other container things and the best thing I could find was LXC.
I pushed all the same impulses through LXC and it performed admirably. It worked well enough on the version of Ubuntu that we were running in production that I didn’t bother looking for anything else. I put some shell scripts around the LXC tools that ship with Ubuntu and called it a day.
What’s the lesson in all this? I don’t know but if you are dealing with black boxes left and right then the ones that respond linearly to obvious impulses are probably more stable and reliable than the ones that don’t.