No matter how much I wish for formal methods to become more mainstream, I’m going on the record to say that they never will be. It’s too much for the typical software engineer. The gap between what regular software engineering looks like and what formal methods look like is just too wide, and I don’t think there is any simple, or even complicated, way to close it. There is an irreducible amount of complexity when it comes to stating and proving theorems, and the only system up to the task is the human brain. So far the neural net people haven’t figured out how symbolic reasoning works, and formal methods are the domain of symbolic methods.
Programmers tend to throw the word “composable” around a lot, but it’s often not clear what it means. At the lowest level we have syntax, grammar, and source code. At this level almost everything is composable. Moving up a few levels we get to semantics and semantic composition, and this is where things get trickier. This is the level where most of the discussions tend to happen. I’m going to use threads as an example to demonstrate the point. The code snippets will be in Ruby.
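To make the syntax/semantics split concrete, here’s a minimal sketch (the `Counter` class is my own illustration, not from any library): each operation is individually thread-safe, and the pieces compose syntactically without complaint, but the meaning of the composition is not what the meaning of the parts would suggest.

```ruby
# A counter whose individual operations are each thread-safe.
class Counter
  def initialize
    @mutex = Mutex.new
    @value = 0
  end

  def increment
    @mutex.synchronize { @value += 1 } # safe on its own
  end

  def value
    @mutex.synchronize { @value }      # safe on its own
  end

  # Syntactically this composes two safe operations, but the
  # composition is NOT atomic: another thread can run between
  # the `value` check and the `increment`, so the limit can be
  # exceeded. Safety at the semantic level does not compose.
  def increment_if_below(limit)
    increment if value < limit
  end
end

counter = Counter.new
threads = 10.times.map do
  Thread.new { 1_000.times { counter.increment } }
end
threads.each(&:join)
counter.value # exactly 10_000: each increment is safe in isolation
```

The plain increments always sum correctly because each one is protected, yet `increment_if_below` is racy even though every call it makes is “safe” — that gap between syntactic and semantic composition is the one worth arguing about.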
All of logic can be encoded in a suitable type theory, and this includes the usual existential and universal quantifiers. The universal quantifier is easy because it corresponds to Π types, but the existential quantifier is a little trickier. It took me some time to make sense of it, so here’s a small outline of how to think about it (this is as much to help me understand as it is to help anyone else, so if you find any issues then let me know).
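The way it finally clicked for me can be sketched in Coq: the existential is a dependent pair, a witness together with a proof about that witness. The `my_ex` below is my own illustrative re-derivation (Coq’s standard library defines `ex` essentially the same way):

```coq
(* An existential is a dependent pair: a witness x packaged with
   a proof that P holds of that particular x. *)
Inductive my_ex (A : Type) (P : A -> Prop) : Prop :=
  | my_ex_intro : forall x : A, P x -> my_ex A P.

(* To prove an existential, supply the witness and then prove the
   property of that witness. *)
Example witness_demo : my_ex nat (fun n => 3 = n + 1).
Proof.
  apply (my_ex_intro nat (fun n => 3 = n + 1) 2).
  reflexivity.
Qed.
```

So where the Π type asks for a function producing a proof for every input, the existential asks for a single chosen input plus a proof specialized to it — which is exactly the Σ-type shape.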
Let me know if this sounds familiar. You work at a software company that uses a 3/3.5-tier architecture. There are frontend servers, backend servers, some batch work servers, and a database or two. If it doesn’t sound familiar then maybe you have never worked on a web application, because that’s the predominant architecture at most web shops. In fact, even with all the microservice/devops/container/serverless hype, no one has managed to change that 3/3.5-tier architecture. There is always a frontend, there is always a backend, there is always a database. The only differences are in the domain logic and implementation details like which language is used for the backend components and which database is used to hold the data. Let’s now get into some more concrete but still abstract details, because I promise there is a point at the end of this.
I only have anecdotal evidence to back this up but I do have plenty of anecdotal evidence. I’ve worked across enough platforms and technology stacks to conclude that stacks are irrelevant. Here’s some of the anecdotal evidence.
I’m starting to get more of the pieces of the theorem proving puzzle. Going through “Software Foundations” and having “Type Theory and Formal Proof” as a reference has been very helpful in putting the pieces together. I’m starting to understand why folks like type theory. The encoding of logic and computation with types is surprisingly elegant. Most of the pieces fall out of abstraction and application/beta-reduction, which is kinda amazing.
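As a small illustration of that elegance (my own toy example, not one from either book): Church booleans encode logic using nothing but abstraction, and evaluating them is nothing but application and beta-reduction.

```coq
(* A Church boolean chooses between its two arguments. *)
Definition cbool := forall A : Type, A -> A -> A.

Definition ctrue  : cbool := fun A t f => t.
Definition cfalse : cbool := fun A t f => f.

(* Conjunction: if p is true, the answer is q; otherwise false.
   No pattern matching anywhere -- just application. *)
Definition cand (p q : cbool) : cbool :=
  fun A t f => p A (q A t f) f.

(* The proof is pure computation: both sides beta-reduce to the
   same term, so reflexivity closes the goal. *)
Example and_true_false : cand ctrue cfalse = cfalse.
Proof. reflexivity. Qed.
```

That `reflexivity` succeeding is the point: the logic lives entirely in the reduction behavior of lambda terms.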
But enough of that. While going through “Software Foundations” I had to do an exercise about reversing lists. More specifically, list reversal is an involution, and involutions in general are bijections, so I wanted to prove this in the more general setting without having to worry about reversing lists.
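Here’s a sketch of the general statement in Coq (my own formulation, so the section and lemma names are mine): any `f` satisfying `f (f x) = x` is both injective and surjective, hence a bijection.

```coq
Section Involution.
  Variable A : Type.
  Variable f : A -> A.
  (* The involution property: applying f twice is the identity. *)
  Hypothesis invol : forall x, f (f x) = x.

  (* Injective: applying f to both sides of f x = f y and using
     the involution on each side recovers x = y. *)
  Lemma invol_injective : forall x y, f x = f y -> x = y.
  Proof.
    intros x y H.
    rewrite <- (invol x), <- (invol y), H.
    reflexivity.
  Qed.

  (* Surjective: f y itself is a preimage of y. *)
  Lemma invol_surjective : forall y, exists x, f x = y.
  Proof.
    intros y. exists (f y). apply invol.
  Qed.
End Involution.
```

With these two lemmas, `rev` being a bijection on lists is just an instantiation with `rev_involutive` — no list reasoning needed beyond the involution fact itself.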
I’ve been told the best way to learn is to do, so let’s do some propositional logic with Coq. I got these exercises from the first two sessions of “The Incredible Proof Machine”.
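To give a flavor of what the translations look like (this lemma is a typical session-one-style exercise rather than a verbatim reproduction of one): commuting a conjunction, where the proof machine’s wiring of boxes becomes introduction and elimination tactics.

```coq
(* A /\ B |- B /\ A: destruct the conjunction into its parts,
   then rebuild it in the other order. *)
Lemma and_comm' : forall A B : Prop, A /\ B -> B /\ A.
Proof.
  intros A B [HA HB]. (* eliminate the /\ into HA : A, HB : B *)
  split.              (* introduce the new /\ as two goals *)
  - exact HB.
  - exact HA.
Qed.
```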
I’m not sure how to translate contexts in sequent calculus other than making everything above the bar an input and everything below the bar an output, so some of the exercises did not have analogs (like the ones that flip the order of the premises). I either skipped those or made a best-effort translation.
I’ve been in this game for a bit now and have observed a few patterns. What follows is a list of patterns about software infrastructure and how it can be managed or mismanaged depending on your perspective. The list is in no particular order or ranking. Continue reading
Most implementations of generators end up being state machines. This means that instead of implementing explicit state machines, one can use generators to keep the state implicit and let the compiler do the hard work of converting the generator to an explicit state machine. The downside of the generator-based approach is that it becomes much harder to serialize the state for persistence. The upside is that there is less juggling of state overall, because one can use the control flow facilities of the language in which the generator is embedded to keep track of the state. Continue reading
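The trade-off can be sketched in Ruby (both versions are my own toy counters): the generator keeps its position implicit in control flow, while the explicit machine reifies the position in a plain hash — which is exactly what makes it easy to serialize and resume.

```ruby
# Implicit: the loop position *is* the state. The Enumerator's
# resumption point lives in control flow, not in data, so it is
# awkward to persist mid-iteration.
def counter_gen(limit)
  Enumerator.new do |y|
    n = 0
    while n < limit
      y << n
      n += 1
    end
  end
end

# Explicit: the same machine with its state reified in a hash.
# The hash can be dumped to disk and stepping can resume later.
def counter_step(state)
  return nil if state[:n] >= state[:limit]
  value = state[:n]
  state[:n] += 1
  value
end

counter_gen(3).to_a # => [0, 1, 2]

state = { n: 0, limit: 3 }
[counter_step(state), counter_step(state), counter_step(state)] # => [0, 1, 2]
counter_step(state) # => nil, the machine is exhausted
```

The generator version reads like ordinary code because the language’s `while` loop tracks the state; the explicit version is clunkier but its entire state is a serializable value.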
It’s time to bury this argument. It is no longer valid. (Actually, it was never valid; I’ve just accepted this now.)
We now have gradually typed languages that allow for mixing and matching the best aspects of both paradigms. I’ve also recently started to appreciate the value of a good type system. The compiler flagging weird data flow is very valuable and helpful in exploring the problem domain. I now think that, given the choice, it’s always better to use a statically typed language unless the shape of the data is determined at runtime. If the application is highly dynamic then there isn’t much to be gained from using a statically typed language, and the iteration speed of a dynamically typed language is probably more valuable than compiler/type-enforced correctness. Then again, this trade-off is moot with languages like TypeScript, so it’s time to have the cake and eat it too.
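The “have the cake and eat it too” point can be sketched in TypeScript (both functions are my own toy example): the same logic can start out dynamically typed via `any` and be gradually tightened so the compiler starts flagging weird data flow.

```typescript
// Dynamic style: compiles fine, but nothing stops a bad call
// site -- a typo like `i.prise` only fails at runtime.
function totalLoose(items: any): any {
  return items.reduce((sum: any, i: any) => sum + i.price, 0);
}

// Gradually tightened: the shape is declared, so a misspelled
// field or a missing `price` is now a compile-time error.
interface Item {
  price: number;
}

function totalStrict(items: Item[]): number {
  return items.reduce((sum, i) => sum + i.price, 0);
}

const cart: Item[] = [{ price: 3 }, { price: 4 }];
totalStrict(cart); // 7, with the data flow checked by the compiler
```

Both run identically; the difference is purely how much of the data flow the compiler is allowed to check, and you choose that per boundary rather than per language.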