Skip to content

Understanding the CAP Theorem

Posted on:August 5, 2023 at 06:23 PM

In distributed systems that include a state, it is impossible to implement all properties of consistency, availability, and partition tolerance simultaneously.

cap

Table of Contents

Open Table of Contents

Consistency

Availability

Partition Tolerance


Why can’t we implement the all properties of CAP?

In the real world, the selection both of C(onsistency) and A(vailability) is not possible. Network partitions in distributed systems are inevitable so we select P(artition tolerance) first.

Selection between C and A depends on system requirements. Consider this example:

  1. We have two nodes master-master which are X and Y.
  2. The user updates data in the master node X.
  3. After the data is updated, master X sends the update (synchronizes) to master Y.
  4. While sending the update, a network partition occurred so synchronization failed. In this situation, we have two options:
    • Select the availability: Cancel synchronization of the nodes. Send the update after the problem disappears. (Consistency is sacrificed)
    • Select the consistency: Do not accept any further request (Availability is sacrificed) until synchronization successfully completes.

Another example

Two ATM machines communicate through a network and there is no centralized database.

The ATM machines should synchronize themselves considering if the customer deposit money in ATM 1, the customer should see the same amount of money in ATM 2.

Regarding network partition is inevitable we should select Consistency or Availability:

Consistency may be more important than availability in this situation and the choice between consistency or availability lean on the system requirements.

For example in CP prioritized database MongoDB, there is only one master node for writes for a given replica set. Other nodes(slaves) synchronize themselves according to the master node. If any failure happens in the master node, a slave in the slaves gets the master role, and other slaves connect to the new master. (In this period, MongoDB discards any requests so availability is sacrificed.)

AP prioritized database CassandraDB has a masterless design accordingly single point of failure is prevented. Users can write to any node at any time in the Cassandra cluster. (Consistency is sacrificed due to potential synchronization failures)

In the AP systems, some operations are implemented for preserving consistency, for example, vector clocks, application-specific conflict resolution, etc.

AP systems may good fit for eventual consistency (in the below consistency patterns) requirements.

RDBMS are generally categorized as CA databases because RDBMS’ are often single-node systems. For this reason, partition tolerance is not considered. (just for single-node systems!)

Availability and Consistency metrics of the DBMS’ can be changed by configuration:

Example in Mongodb:

Example in CassandraDB

But considerate while playing with configuration, Consistency, and Availability are mutually exclusive. For example, if we increase the consistency of the system it will result most likely in a drop in availability and vice-versa.

Consistency Patterns

More than one copy of the data leads synchronization problem.

Weak consistency

Lowest consistency level, a read operation that comes just a write operation may see or not see an updated state. This level is generally used for real-time applications like VoIP.

Eventual consistency

Read operation that comes just after a write operation will see an updated state eventually. Data is replicated asynchronously. Preferred in DNS and Email systems. Good fit for high availability systems.

Strong consistency

After a write operation, a read operation sees the updated state. Data is replicated synchronously(hence availability is sacrificed). It is generally used in RDBMS and it is inevitable for transactions.

Thanks for reading

References

  1. Understanding the CAP Theorem, With Real World Examples
  2. CAP Theorem Partition Tolerance and Availability
  3. You Cant Sacrifice Partition Tolerance
  4. CAP Theorem An Impossible Selection
  5. MongoDB, Cassandra, RDBMS and CAP
  6. CAP Simplified
  7. System Design Primer