The CAP Theorem: The Backbone of Distributed Systems
Essential Insights into Data Consistency, Availability, and Partition Tolerance
The CAP Theorem, a core principle of distributed systems, states that in the presence of a network partition, a distributed system can provide either consistency or availability, but not both.
In other words, a distributed system can only provide two of three properties simultaneously: Consistency, Availability, and Partition tolerance.
This theorem underpins the architecture of modern databases and distributed systems, influencing how data is stored, accessed, and managed across nodes in a network.
Explanation of CAP
📚 Consistency
Every read operation receives the most recent write or an error.
In a consistent system, all nodes always see the same data.
🌱 Availability
Every request receives a (non-error) response, without the guarantee that it contains the most recent write.
The available system guarantees that it remains operational at all times.
🚧 Partition Tolerance
The system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes.
A system that is tolerant of network partitions can handle situations when a subset of its nodes is unreachable due to a communication break.
Proof of the CAP theorem
Let’s assume there is a Consistent, Available, and Partition-tolerant system.
In such a system, a client can perform a successful request to store a value δ to node A and immediately after that make a successful request to read the value δ out from node B even though there is a network failure between the nodes.
Clearly, without a means to communicate the new value from node A to node B, the client will:
Either receive the old value ω - making the system not consistent.
Or receive an error during writing or reading - making the system unavailable.
Therefore, we can conclude that such a system does not exist and that only two out of the three CAP properties are achievable at any given time.
CAP as a Core Principle in Distributed Systems
The CAP theorem addresses fundamental limitations and trade-offs inherent to systems that span across multiple nodes or are even geographically distributed.
⚠️ Network partitions in distributed systems are inevitable
They can appear due to hardware failures, network outages, or maintenance. These partitions mean that parts of the system may become temporarily isolated. In such cases, systems must choose between the ability to remain consistent or available. For example:
a financial trading platform may prioritize consistency to prevent double-spending
a messaging app may prioritize availability to ensure messages can be sent
These trade-offs are central to the system’s architecture and can significantly impact user experience.
“CP” and Relational Databases (ACID)
SQL databases have traditionally been designed to fulfill ACID requirements:
Atomicity: All operations within a transaction are completed successfully, or none are.
Consistency 📚: A transaction brings the database from one valid state to another.
Isolation: Concurrent transactions occur separately from one another.
Durability 🚧: Once a transaction is committed, it will remain so, even in the event of a system failure.
This means that from the point of view of CAP, consistency and partition tolerance (aka durability) are prioritized. In the event of a network partition, these databases will choose to remain consistent while possibly sacrificing availability.
In banking and financial systems, ensuring transaction integrity and account balance correctness is crucial. SQL databases in these contexts will maintain strict consistency, even if it means some operations must wait or fail during network disruptions, affecting availability but preserving data integrity.
“PA” and NoSQL Databases (BASE)
BASE is a set of database principles that prioritize availability at the cost of consistency:
Basically Available 🌱: Data is always available without promising immediate consistency.
Soft state: The system's state may change over time, even without input.
Eventual consistency 🚧: The system will become consistent over time, given that the system does not receive new inputs during that time.
NoSQL databases often prioritize availability and partition tolerance, making them suitable for applications where immediate consistency is less critical. This is particularly useful in services requiring high levels of scalability and where data can be replicated across many nodes.
Social media platforms like Twitter and Facebook use NoSQL databases to manage the enormous volumes of data they handle daily. These platforms prioritize the ability for users to post and view content over having that content be consistent across all nodes at the exact same time. The system eventually becomes consistent, ensuring all users eventually see the same posts.
The CAP Theorem and Blockchain
Blockchain, by its nature, introduces a paradigm where data consistency, availability, and partition tolerance are managed through decentralized consensus mechanisms rather than traditional centralized control.
This shift offers a new perspective on achieving consistency and availability in the presence of network partitions, which leads us into the territory of the PACELC theorem.
While the CAP theorem addresses the trade-offs between consistency and availability during network partitions, PACELC extends this by adding another dimension:
Even when the system is running normally in the absence of partitions, one has to choose between latency (L) and loss of consistency (C).
A detailed exploration of how blockchain technologies navigate the CAP theorem and an introduction to the PACELC theorem will be featured in a future newsletter post.
In Case You Missed It
In the previous post, we looked into the multi-faceted world of software engineering leadership and how, even as an individual contributor (IC), you can still be in a leadership position.
Top picks this week
The Guide to Stock Options Conversations in
by — An insightful article about a topic that is usually discussed behind closed doors. Give it a read, even if stock options are not part of your compensation package at your current job.Why I write a newsletter every week even with a full-time software engineering job in
by — This is a great read whether you already publish a newsletter, are thinking about starting, or want to appreciate the motivation and work behind writing one.5 Non-Verbal Behaviors Killing Team Health in
by — People often notice negative non-verbal behaviors of others more than their own, but it’s essential to understand how to navigate both.