Paxos

Definition

Paxos is a distributed consensus algorithm that enables a group of nodes to agree on a single value safely, even if some nodes crash or communication is unreliable, as long as a majority of the nodes remain available.

More formally, Paxos ensures that:

  • Only one value is chosen for a given consensus instance.
  • If a value is chosen, all correct nodes can eventually learn the same value.
  • Progress is possible when a majority of nodes can communicate reliably.

It is not simply a voting system. Instead, it is a carefully designed protocol with roles such as proposers, acceptors, and learners that work together to preserve safety and reach agreement.


Main Content

1. Consensus in Distributed Systems

  • Consensus means all participating nodes must agree on the same decision, such as a transaction outcome, configuration update, or leader selection.
  • In distributed systems, consensus is difficult because nodes can fail independently, networks can partition, and messages may arrive out of order or not at all.

Paxos was created to solve this exact problem. Suppose a replicated database needs to commit a write. If one server says “yes” and another says “no,” the system can become inconsistent. Paxos provides a formal method so that even if a minority of servers fail, the cluster can still decide one value safely. The most important property is safety: no two different values can both be chosen for the same decision.

A practical example is a configuration service used by many servers. If one server wants to update the configuration while others are offline, Paxos ensures that once the update is decided, all healthy nodes eventually agree on the same version.

2. Roles and Responsibilities in Paxos

  • Proposer: suggests a value for consensus and initiates the protocol.
  • Acceptor: votes on proposals and is the core of Paxos safety.
  • Learner: finds out which value was chosen and applies it.

These roles may be separate processes or combined in a single node in real systems. The proposer attempts to drive the system toward agreement by assigning proposal numbers and contacting acceptors. Acceptors are the most important actors for correctness: they remember enough state to prevent conflicting decisions. Learners observe the outcome and use it for state machine replication or service updates.

For example, in a cluster of five nodes, three may act as acceptors. A proposer sends a proposal to them. If a majority of the acceptors (here, at least two of the three) accept the same proposal, the value is chosen. Learners then learn the chosen value so the system can proceed consistently.

3. Safety, Liveness, and Majority Quorums

  • Safety: means the protocol never decides two different values for the same instance.
  • Liveness: means the protocol eventually makes progress if conditions are favorable.
  • Majority quorum: means decisions are based on more than half of acceptors, ensuring overlap between decision groups.

The key idea behind Paxos is quorum intersection: any two majorities of the same acceptor set share at least one acceptor. In a cluster of five acceptors, for example, two groups of three cannot be disjoint. This overlap guarantees that once a value has been accepted by one majority, any later proposer that gathers promises from a majority hears from at least one acceptor that accepted it, and so discovers that value instead of choosing a conflicting one.

Consider five acceptors: A, B, C, D, E. A majority could be A, B, C. Another majority could be C, D, E. Since both include C, anything the first majority accepted is reported by C to a proposer working with the second. Even if some nodes fail, as long as a majority survives, consensus can still be reached.
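
The overlap claim can be checked by brute force. The following is a minimal sketch, not part of any Paxos implementation; the helper name all_pairs_intersect is made up for illustration. It enumerates every 3-of-5 majority and confirms that each pair shares at least one acceptor.

```python
from itertools import combinations

acceptors = ["A", "B", "C", "D", "E"]
majority = len(acceptors) // 2 + 1  # 3 of 5

# Every possible majority quorum, as a set of acceptor names.
quorums = [set(q) for q in combinations(acceptors, majority)]


def all_pairs_intersect(groups):
    """Hypothetical helper: True if every pair of groups shares a member."""
    return all(a & b for a, b in combinations(groups, 2))


print(len(quorums))                       # 10 distinct 3-of-5 quorums
print(all_pairs_intersect(quorums))       # True
print({"A", "B", "C"} & {"C", "D", "E"})  # {'C'}
```

The same check fails for groups smaller than a majority, for instance {A, B} and {C, D}, which is why Paxos counts acceptances against more than half of the acceptors.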


Working / Process

  1. A proposer selects a new proposal number, higher than any it has used before, and asks a majority of acceptors to promise to consider it.
  2. Acceptors respond with a promise and with the highest-numbered proposal they have already accepted, if any, allowing the proposer to avoid violating earlier decisions.
  3. The proposer then sends an accept request. If a majority of acceptors accept the proposal, the value is chosen and then learned by the learners.
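
Step 1 depends on proposal numbers that never collide between proposers. These notes do not prescribe a numbering scheme, so the sketch below is an assumption: one common approach pairs a counter with a unique proposer id and compares the pair in that order. The names ProposalNumber and next_number are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class ProposalNumber:
    """Hypothetical proposal number: ordered by counter, then proposer id."""
    counter: int      # compared first
    proposer_id: int  # tie-breaker; assumed unique per proposer


def next_number(highest_seen: ProposalNumber, proposer_id: int) -> ProposalNumber:
    """Return a number strictly greater than the highest one seen so far."""
    return ProposalNumber(highest_seen.counter + 1, proposer_id)


start = ProposalNumber(0, 0)
p1 = next_number(start, proposer_id=1)
p2 = next_number(start, proposer_id=2)
print(p1 != p2, p1 < p2)         # True True
print(next_number(p2, 1) > p2)   # True
```

Because the proposer id breaks ties, two proposers that pick the same counter still produce different, totally ordered numbers.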

In more detail, Paxos usually works in two phases:

Phase 1: Prepare / Promise

  • The proposer picks a unique proposal number.
  • It sends a prepare request to acceptors.
  • An acceptor that has not already promised a higher number promises not to accept smaller-numbered proposals in the future.
  • They also return any proposal they have already accepted.
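
The acceptor side of Phase 1 can be sketched as follows. This is an illustrative in-memory model under stated assumptions: proposal numbers are plain integers, the class and field names are invented here, and a real acceptor would persist its state to stable storage before replying.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class Acceptor:
    """Hypothetical in-memory acceptor; durable storage is omitted."""
    promised: int = -1                          # highest prepare number promised
    accepted: Optional[Tuple[int, Any]] = None  # (number, value) last accepted

    def on_prepare(self, n: int):
        """Promise only if n is higher than any number promised before."""
        if n > self.promised:
            self.promised = n
            # Report the previously accepted proposal, if any.
            return ("promise", n, self.accepted)
        return ("reject", self.promised, None)


a = Acceptor()
print(a.on_prepare(7))  # ('promise', 7, None)
print(a.on_prepare(5))  # ('reject', 7, None)
```

Returning the promised number in a rejection is a convenience assumed here; it lets a proposer pick a higher number on its next attempt.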

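Phase 2: Accept / Accepted

  • Once a majority of acceptors have promised, the proposer chooses a value.
  • If acceptors reported prior accepted values, the proposer must propose the value of the highest-numbered accepted proposal to avoid conflict; otherwise it may use its own value.
  • It sends an accept request with the proposal number and the chosen value.
  • An acceptor accepts the request unless it has since promised a higher number. If a majority accepts, the value is decided.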
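
The two Phase 2 rules can be written as small pure functions. This is a sketch under assumptions: proposal numbers are integers, each promise reply carries either None or a (number, value) pair, and the function names choose_value and on_accept are hypothetical.

```python
from typing import Any, List, Optional, Tuple

Accepted = Optional[Tuple[int, Any]]  # (proposal number, value) or None


def choose_value(promises: List[Accepted], own_value: Any) -> Any:
    """Proposer rule: reuse the value of the highest-numbered accepted proposal."""
    prior = [p for p in promises if p is not None]
    if not prior:
        return own_value  # nothing accepted yet, free choice
    return max(prior, key=lambda p: p[0])[1]


def on_accept(promised: int, n: int) -> bool:
    """Acceptor rule: accept proposal n unless a higher number was promised."""
    return n >= promised


print(choose_value([None, None, None], "V1"))           # V1
print(choose_value([None, (3, "X"), (5, "Y")], "V1"))   # Y
print(on_accept(promised=7, n=5), on_accept(promised=7, n=7))  # False True
```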

Example:
Suppose a proposer wants to choose value V1. It sends prepare messages with proposal number 7. Acceptors reply that none have accepted anything yet. The proposer then sends an accept request for V1. If a majority accepts, V1 becomes the decided value.

If another proposer later tries proposal number 9 with a different value, and at least one acceptor in its majority reports that it already accepted V1 under proposal 7, the new proposer must drop its own value and send its accept request for V1 under number 9. This is how conflicting decisions are prevented.
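
The example can be replayed in a small single-process simulation. This is a sketch only: messages are direct method calls, nothing fails mid-round, and the names Acceptor, propose, and the second proposer's value V2 are assumptions made for illustration.

```python
class Acceptor:
    """Hypothetical in-memory acceptor for a single consensus instance."""

    def __init__(self):
        self.promised = -1    # highest proposal number promised so far
        self.accepted = None  # (number, value) of the last accepted proposal

    def prepare(self, n):
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(reachable, cluster_size, n, own_value):
    """Run both phases; return the chosen value, or None if no majority."""
    majority = cluster_size // 2 + 1
    replies = [a.prepare(n) for a in reachable]
    promises = [acc for ok, acc in replies if ok]
    if len(promises) < majority:
        return None
    # Carry forward the highest-numbered accepted value, if any was reported.
    prior = [acc for acc in promises if acc is not None]
    value = max(prior, key=lambda p: p[0])[1] if prior else own_value
    votes = sum(a.accept(n, value) for a in reachable)
    return value if votes >= majority else None


cluster = [Acceptor() for _ in range(5)]     # acceptors A, B, C, D, E
first = propose(cluster[:3], 5, 7, "V1")     # reaches A, B, C
second = propose(cluster[2:], 5, 9, "V2")    # reaches C, D, E
print(first, second)  # V1 V1
```

The second proposer wanted V2, but acceptor C reported V1 under proposal 7, so proposal 9 carries V1 and the earlier decision stands.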


Advantages / Applications

  • Strong fault tolerance with safety even when some nodes fail or messages are unreliable.
  • Widely used in distributed databases, configuration systems, and leader election services.
  • Provides a mathematically proven consensus foundation for replicated state machines.

Paxos is especially valuable in environments where consistency is more important than raw availability during failures. It is a standard building block for systems that must preserve correctness under crash failures. Many coordination tools, metadata services, and distributed databases rely on Paxos or Paxos-inspired techniques.

Typical applications include:

  • Replicating logs in storage systems.
  • Maintaining cluster membership and configuration.
  • Electing a leader among multiple servers.
  • Coordinating updates to shared metadata.
  • Ensuring consistent commits in distributed transactions.

A real-world advantage is that Paxos allows systems to continue operating correctly even if several servers go down, as long as a majority remains reachable: a cluster of 2f + 1 acceptors tolerates f crash failures. This makes it well suited to infrastructure that must stay consistent through machine and network failures.
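
The "majority remains reachable" condition translates into simple arithmetic. The sketch below uses hypothetical helper names and shows how many crash failures clusters of different sizes can absorb while still forming a majority.

```python
def majority(n: int) -> int:
    """Smallest group that is more than half of n acceptors."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Crashed acceptors the cluster can lose and still form a majority."""
    return n - majority(n)


for n in (3, 4, 5, 7):
    print(n, majority(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
# 7 4 3
```

A four-node cluster tolerates no more failures than a three-node one, which is one reason odd cluster sizes are the usual choice.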


Summary

Paxos is a consensus protocol that helps distributed systems agree on one value safely despite failures. It relies on proposers, acceptors, and learners, and uses majority quorums to prevent conflicting decisions. Although conceptually difficult, it is one of the most important ideas in fault-tolerant distributed computing.