PAXOS Made Simple: The Consensus Algorithm

Posted in distributed-systems by Christopher R. Wirz on Wed Sep 25 2024



The Problem

In distributed systems, achieving consensus among a collection of processes is crucial. The Paxos algorithm addresses this by ensuring that a single value is chosen from among proposed values. The safety requirements for consensus are:

  • Only a proposed value may be chosen.
  • Only a single value is chosen.
  • A process never learns that a value has been chosen unless it actually has been.

Choosing a Value

The algorithm involves three roles: proposers, acceptors, and learners. Initially, a single acceptor can choose a value, but this is not fault-tolerant. Therefore, multiple acceptors are used, and a value is chosen when a majority of acceptors accept it. This ensures that any two majorities overlap, maintaining consistency.

To handle multiple proposals, each proposal is assigned a unique number. The key properties to maintain are:

  • P1: An acceptor must accept the first proposal it receives.
  • P2: If a proposal with value ( v ) is chosen, every higher-numbered proposal that is chosen must have value ( v ).

The Algorithm

The algorithm operates in two phases:

  1. Prepare Phase: A proposer selects a proposal number ( n ) and sends a prepare request to a majority of acceptors. Acceptors respond with a promise not to accept proposals numbered less than ( n ) and the highest-numbered proposal they have accepted.
  2. Accept Phase: If the proposer receives responses from a majority, it sends an accept request with the proposal number ( n ) and a value ( v ). Acceptors accept the proposal unless they have already promised not to accept proposals numbered less than ( n ).

Learning a Chosen Value

Learners need to know when a value has been chosen. Acceptors can inform learners directly, or a distinguished learner can aggregate responses and inform others. This reduces communication overhead but introduces a single point of failure.

Progress

To ensure progress, a distinguished proposer is elected to issue proposals. This avoids the scenario where multiple proposers continuously issue conflicting proposals, preventing any value from being chosen. The election of a proposer can use randomness or real-time mechanisms like timeouts.

Implementation

In practice, the Paxos algorithm assumes a network of processes where each process can act as a proposer, acceptor, and learner. Stable storage is used to remember the highest-numbered proposal an acceptor has accepted and the highest-numbered prepare request it has responded to. This ensures that the algorithm can recover from failures.

Implementing a State Machine

A distributed system can be implemented as a collection of clients and servers, where servers execute commands in a sequence. Using Paxos, each command corresponds to an instance of the consensus algorithm. A leader is elected to propose commands, ensuring that all servers execute the same sequence of commands.

During normal operation, the leader proposes commands, and servers execute them in order. If the leader fails, a new leader is elected, and it fills any gaps in the command sequence with no-op commands to maintain consistency.

Handling Failures and Leader Election

Failures and leader elections are rare but critical events. When a leader fails, a new leader executes phase 1 of the consensus algorithm for all instances to ensure no gaps in the command sequence. This involves sending prepare requests and receiving responses from acceptors.

Optimizations and Performance

The Paxos algorithm is optimized to minimize communication overhead. For example, acceptors only respond with detailed information if they have already accepted a proposal. The algorithm is designed to be efficient, with the cost of achieving consensus being the minimum possible in the presence of faults.

Conclusion

The Paxos algorithm is a fundamental building block for fault-tolerant distributed systems. By ensuring consensus among processes, it enables reliable and consistent operation even in the presence of failures. Understanding the technical details of Paxos is essential for designing robust distributed systems.

Key Terms

  • Acceptor: An agent in the Paxos algorithm that receives proposals from proposers and decides whether to accept them. A value is chosen when a majority of acceptors accept the same proposal.
  • Byzantine Fault: A type of fault in a distributed system where components may fail and there is imperfect information on whether a component has failed. Byzantine faults include arbitrary failures, such as malicious or random behavior.
  • Client: In a distributed system, a client is an entity that requests services or resources from a server.
  • Consensus Algorithm: An algorithm used in distributed systems to achieve agreement on a single data value among distributed processes or systems. Paxos is a well-known consensus algorithm.
  • Distinguished Learner: A specific learner in the Paxos algorithm that collects responses from acceptors and informs other learners about the chosen value, reducing communication overhead.
  • Distinguished Proposer: A proposer that is elected to issue proposals in the Paxos algorithm to ensure progress and avoid conflicts between multiple proposers.
  • Fault-Tolerant: The ability of a system to continue operating properly in the event of the failure of some of its components.
  • Learner: An agent in the Paxos algorithm that learns the chosen value once it has been accepted by a majority of acceptors.
  • Majority: In the context of Paxos, a majority refers to more than half of the acceptors. This ensures that any two majorities overlap, maintaining consistency.
  • No-op Command: A special command in a state machine that does nothing and leaves the state unchanged. It is used to fill gaps in the sequence of commands.
  • Prepare Request: A request sent by a proposer to acceptors in the first phase of the Paxos algorithm, asking them to promise not to accept any proposals with a number less than the one in the request.
  • Proposer: An agent in the Paxos algorithm that proposes values to be chosen. Proposers initiate the consensus process by sending proposals to acceptors.
  • State Machine: A model of computation representing a system with a set of states and transitions between those states. In distributed systems, state machines are used to ensure consistency across multiple servers.
  • Stable Storage: Non-volatile storage that retains information even after a system failure. In Paxos, stable storage is used to remember the highest-numbered proposal an acceptor has accepted and the highest-numbered prepare request it has responded to.
  • Synod Algorithm: The core consensus algorithm within Paxos, which ensures that a single value is chosen from among proposed values.
  • Value: In the context of Paxos, a value is the data proposed by a proposer and potentially chosen by the consensus algorithm.
  • View: A configuration or state of the system, often used in the context of leader election or reconfiguration in distributed systems.
@article{lamport2001paxos,
	title={Paxos made simple},
	author={Lamport, Leslie},
	journal={ACM SIGACT News (Distributed Computing Column) 32, 4 (Whole Number 121, December 2001)},
	pages={51--58},
	year={2001}
}