Linearizability vs. Serializability
SummaryThis section clarifies the distinction between linearizability and...
This section clarifies the distinction between linearizability and...
This section clarifies the distinction between linearizability and serializability, the two strongest distributed consistency models. Linearizability guarantees recency: operations appear to execute in a global order matching real-time, requiring consensus protocols and sacrificing availability during partitions (CP). Serializability guarantees ordering: the outcome of concurrent transactions is equivalent to some serial execution, without real-time constraints. The cost of linearizability is formalized by the CAP theorem and demonstrated through implementation examples like a linearizable register requiring quorum consensus. Real-world trade-offs are illustrated: user password updates (requiring linearizability) versus comment feeds (accepting eventual consistency). System classifications (Google Spanner, CockroachDB as CP; Cassandra, DynamoDB as AP/PA-EL) show the practical impact of these design choices. The immutable trade-off is clear: stricter consistency guarantees (linearizability) come at the cost of higher latency and reduced availability, especially under partition.
Linearizability vs. Serializability
Linearizability and serializability are two of the strongest consistency models used in distributed systems to ensure data consistency across different nodes. While they share some similarities, they have distinct differences in their guarantees and use cases.
Linearizability
Linearizability is the strongest consistency model, which guarantees that all operations appear to execute in a single, global order consistent with real-time. Every read operation returns the value of the most recent write operation. This model is essential in systems where the order of operations matters, such as financial transactions or user password updates.
# Example: Linearizable Register Implementation
class LinearizableRegister:
def __init__(self, node_id: int, quorum_size: int) -> None:
self.node_id = node_id
self.quorum_size = quorum_size
self.value: Optional[str] = None
self.timestamp: int = 0
self.pending_writes: Dict[int, str] = {}
async def write(self, new_value: str) -> bool:
# Generate new timestamp (logical time)
new_timestamp = self.timestamp + 1
# Phase 1: Prepare (propose)
prepare_ok = await self._send_prepare(new_timestamp)
if not prepare_ok:
return False
# Phase 2: Commit
commit_ok = await self._send_commit(new_value, new_timestamp)
if commit_ok:
self.value = new_value
self.timestamp = new_timestamp
return commit_ok
async def read(self) -> Optional[str]:
# Contact quorum to get latest value
responses: List[Tuple[Optional[str], int]] = await self._query_quorum()
# Find value with highest timestamp
latest_value: Optional[str] = None
latest_ts = -1
for value, timestamp in responses:
if timestamp > latest_ts:
latest_value = value
latest_ts = timestamp
# Update local state if behind
if latest_ts > self.timestamp:
self.value = latest_value
self.timestamp = latest_ts
return self.value
The requirement for real-time ordering in linearizability necessitates coordination across a quorum of nodes. This coordination, enforced through consensus protocols like Paxos or Raft, directly implies that during a network partition, a linearizable system cannot safely accept writes without risking split-brain—thus, availability must be sacrificed to preserve consistency, as dictated by the CAP theorem.
Serializability
Serializability is a consistency model that guarantees the result of concurrent transactions is equivalent to some serial (one-at-a-time) execution of those transactions. It does not impose a real-time ordering constraint like linearizability but ensures that the outcome of concurrent operations could have occurred in some sequential order.
# Example: Serializable Transaction Implementation
def user_password_update_serializable(old_password: str, new_password: str) -> bool:
# Critical section requiring serializability
current_password = get_current_password()
if current_password != old_password:
return False
success = set_password(new_password)
if success:
verify_password = get_current_password()
return verify_password == new_password
return False
def add_comment_eventual_consistent(user_id: int, comment: str, post_id: int):
# Write locally first (low latency)
local_success = write_comment_to_local_node(user_id, comment, post_id)
# Asynchronously replicate
if local_success:
replicate_comment_async(user_id, comment, post_id)
return local_success
Comparison of Consistency Models
The following table compares different consistency models, including linearizability and serializability, based on their primary guarantees, real-time constraints, availability during partitions, latency costs, and use case examples.
| Consistency Model | Primary Guarantee | Real-time Constraint | Availability During Partition | Latency Cost | Use Case Example |
|---|---|---|---|---|---|
| Linearizability | Recency (most recent write) | Yes - operations ordered by real-time completion | Must sacrifice availability (CP) | High - requires consensus/quorum | User password update, financial balance |
| Serializability | Ordering (equivalent to serial execution) | No - only requires equivalent serial order | Depends on implementation | Moderate - requires locking/isolation | Database transactions, inventory management |
| Eventual Consistency | Eventually converges | No | Maintains availability (AP) | Low - no coordination required | Comments feed, social media likes |
System Implementations and CAP/PACELC Classifications
Real-world systems can be classified based on their CAP/PACELC choices, which reflect their trade-offs between consistency, availability, and latency. The following table illustrates some examples:
| System | CAP/PACELC Classification | Linearizability Support | Mechanism | Availability Trade-off |
|---|---|---|---|---|
| Google Spanner | CP/PC-EC | Yes (external consistency) | TrueTime synchronized clocks | Sacrifices availability during partition |
| CockroachDB | CP/PC-EC | Yes (serializable) | Hybrid Logical Clocks | Sacrifices availability during partition |
| Cassandra | AP/PA-EL | No (configurable consistency levels) | Tunable quorums, eventual consistency | Maintains availability, configurable consistency |
| DynamoDB | AP/PA-EL (configurable) | Configurable strong consistency | Quorum reads/writes | Can be configured for CP or AP |
Diagrams and Illustrations
The following diagrams illustrate key concepts related to linearizability and serializability:
Diagram 1: Linearizability vs Serializability Timeline
Real Timeline (wall-clock time): T0: Write X=1 starts T1: Write X=1 completes T2: Read X starts T3: Read X must return 1 (linearizability requirement) T4: Write X=2 starts T5: Write X=2 completes T6: Read X starts T7: Read X must return 2
Linearizable Order (must match real-time):
- Write X=1 (T0-T1)
- Read X → 1 (T2-T3)
- Write X=2 (T4-T5)
- Read X → 2 (T6-T7)
Serializable Order (can reorder but equivalent to some serial order): Could be: Write X=2, Write X=1, Read X→1, Read X→2 (Still valid if equivalent to some serial execution)
Diagram 2: Availability Cost of Linearizability
During Network Partition: [Node A] --- Partition --- [Node B]
Scenario: Client connected to Node A wants to write
CP System (Linearizable):
- Node A cannot contact quorum including Node B
- Write fails or blocks
- Consistency preserved, availability sacrificed
AP System (Eventually Consistent):
- Node A accepts write locally
- Write succeeds immediately
- When partition heals, conflict resolution needed
- Availability preserved, consistency sacrificed
Diagram 3: Real-world Examples Architecture
User Password System (Linearizable Required): [Client] → [Load Balancer] → [Primary Database] ↓ [Synchronous Replication] ↓ [Secondary Replicas]
Comments Feed System (Eventually Consistent Acceptable): [Client] → [Any Node] → [Local Write Accepted] ↓ [Async Replication] ↓ [Eventual Convergence]
The choice between linearizability and serializability represents a fundamental trade-off between recency guarantees and transaction isolation, with direct implications for system latency and availability.