Why Most Distributed Systems Papers Get Consistency Wrong
There is a persistent confusion in the distributed systems literature between consistency as a safety property (nothing bad ever happens) and consistency as a liveness property (something good eventually happens). This confusion leads to systems that claim strong guarantees while silently violating them under network partition.
The Linearizability Problem
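Linearizability is the gold standard, but most papers that cite it describe systems that do not actually provide it. What they provide is something weaker, often sequential consistency or even causal consistency, dressed up in linearizable language.
The distinction matters. Sequential consistency only requires that all operations appear in some single order that agrees with each process's own program order; linearizability additionally requires that order to respect real time, so an operation that completes before another begins must also come first. A sequentially consistent system can therefore return stale reads that a linearizable system cannot. In a financial system, this is the difference between a correct balance and an overdraft.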
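To make the stale-read distinction concrete, here is a minimal brute-force sketch in Python for a single register. The `Op` record, the process names `teller` and `atm`, and the timestamps are all hypothetical, chosen only for illustration. The checker tries every permutation, so it is suitable only for tiny histories and is not how a production checker would work.

```python
from itertools import permutations
from typing import NamedTuple


class Op(NamedTuple):
    proc: str      # hypothetical process name
    kind: str      # "write" or "read"
    value: int     # value written, or value the read returned
    start: float   # real-time invocation
    end: float     # real-time response


def legal_register_order(order, initial=0):
    """Every read must return the most recent write in this order."""
    current = initial
    for op in order:
        if op.kind == "write":
            current = op.value
        elif op.value != current:
            return False
    return True


def respects_program_order(order):
    """Each process's own operations keep their invocation order.
    Assumes a process issues one operation at a time."""
    last_start = {}
    for op in order:
        if op.proc in last_start and op.start < last_start[op.proc]:
            return False
        last_start[op.proc] = op.start
    return True


def respects_real_time(order):
    """If an operation finished before another started, it must come first."""
    for i, later in enumerate(order):
        for earlier in order[:i]:
            if later.end < earlier.start:
                return False
    return True


def is_sequentially_consistent(history):
    return any(legal_register_order(p) and respects_program_order(p)
               for p in permutations(history))


def is_linearizable(history):
    return any(legal_register_order(p) and respects_real_time(p)
               for p in permutations(history))


# The balance is set to 100, then withdrawn to 0. A later read still sees 100.
history = [
    Op("teller", "write", 100, 0, 1),
    Op("teller", "write", 0, 2, 3),
    Op("atm", "read", 100, 4, 5),
]

print(is_sequentially_consistent(history))  # True
print(is_linearizable(history))             # False
```

The read is acceptable under sequential consistency because it can be ordered between the two writes without reordering any single process's operations. It is rejected under linearizability because the read began after the withdrawal had already completed in real time.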
Where the Proofs Break Down
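Most correctness proofs assume a synchronous or partially synchronous model. When the system enters a period in which those timing assumptions fail, which every real system eventually does, any part of the proof that depends on them no longer applies. If safety itself rests on bounded message delay or bounded clock drift, the system continues to operate, but without its safety guarantee.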
A system that is correct "most of the time" is not correct. It is a system with undocumented failure modes.
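One way to see how a safety argument can hinge on a timing assumption is a leader lease. The sketch below is an illustration chosen for this purpose and is not drawn from any particular paper or system. The constants `LEASE_MS` and `MAX_DRIFT_MS` and both function names are hypothetical. The model is deliberately simple: the old leader's local clock lags true time by a fixed amount, and the rest of the cluster treats the lease as expired once `LEASE_MS` of true time has passed.

```python
LEASE_MS = 1000      # hypothetical lease length
MAX_DRIFT_MS = 50    # hypothetical bound on clock lag that the proof assumes


def old_leader_believes_lease_valid(local_now_ms: int) -> bool:
    """The old leader stops serving early by the assumed drift bound."""
    return local_now_ms < LEASE_MS - MAX_DRIFT_MS


def stale_leader_possible(true_elapsed_ms: int, clock_lag_ms: int) -> bool:
    """True if the cluster considers the lease expired (so a new leader
    may exist) while the old leader still believes it holds the lease."""
    cluster_sees_expired = true_elapsed_ms >= LEASE_MS
    local_now_ms = true_elapsed_ms - clock_lag_ms
    return cluster_sees_expired and old_leader_believes_lease_valid(local_now_ms)


# Inside the assumed bound: the safety argument goes through.
print(stale_leader_possible(true_elapsed_ms=1000, clock_lag_ms=20))   # False

# Outside the bound (for example after a long process pause): two leaders.
print(stale_leader_possible(true_elapsed_ms=1100, clock_lag_ms=200))  # True
```

Nothing in the code changes between the two calls. Only the environment does, and nothing in the running system signals that the assumption behind the proof has stopped holding.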
A More Honest Approach
What the field needs is not stronger consistency models, but more honest documentation of what guarantees actually hold and under what conditions they degrade. Every system has a consistency envelope — the set of conditions under which its guarantees hold. Making this envelope explicit would be more useful than another impossibility result.
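As a rough sketch of what an explicit envelope could look like, the following Python fragment encodes guarantees as data alongside the conditions under which each one is claimed. Everything here is hypothetical: the `Conditions` fields, the five-replica cluster, the 10 ms skew threshold, and the guarantee labels are placeholders, not recommendations or a description of any real system.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Conditions:
    partitioned: bool
    clock_skew_ms: float
    reachable_replicas: int   # out of a hypothetical 5-replica cluster


@dataclass(frozen=True)
class Rule:
    guarantee: str                        # what is promised
    when: str                             # human-readable statement of the assumption
    holds: Callable[[Conditions], bool]   # machine-checkable form of the same assumption


# Ordered from strongest to weakest; the first rule whose condition holds applies.
ENVELOPE = [
    Rule(
        "linearizable reads and writes",
        "no partition, at least 3 of 5 replicas reachable, clock skew under 10 ms",
        lambda c: (not c.partitioned
                   and c.reachable_replicas >= 3
                   and c.clock_skew_ms < 10),
    ),
    Rule(
        "causal consistency",
        "at least one replica reachable",
        lambda c: c.reachable_replicas >= 1,
    ),
]


def guarantee_under(conditions: Conditions, envelope=ENVELOPE) -> str:
    """Return the strongest guarantee claimed under the given conditions."""
    for rule in envelope:
        if rule.holds(conditions):
            return rule.guarantee
    return "no guarantee"


print(guarantee_under(Conditions(partitioned=False, clock_skew_ms=2.0, reachable_replicas=5)))
print(guarantee_under(Conditions(partitioned=True, clock_skew_ms=2.0, reachable_replicas=2)))
print(guarantee_under(Conditions(partitioned=True, clock_skew_ms=2.0, reachable_replicas=0)))
```

The point of the design is that the degraded cases, including the explicit "no guarantee" fallback, are written down next to the strong case instead of being left for users to discover.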
Correctness is not a spectrum. A system is either correct under its stated assumptions or it is not. The task is to state the assumptions honestly.