VOL.01 // ISS.04

How Replicated Machines Achieve Agreement Without Trust

In 2014, Ongaro and Ousterhout showed that one of the hardest problems in distributed computing, getting crash-prone machines to agree, could be solved by an algorithm simple enough for an engineer to hold in their head.


The Problem

You have five servers. Each can crash at any time. Messages between them can be delayed or lost. Yet clients expect a single, consistent service.

This is the consensus problem: getting unreliable machines to agree on a sequence of operations, as if they were one reliable machine.

All Start as Followers

Every node begins in the follower state. Followers are passive — they listen for heartbeats and respond to requests, but never initiate actions.

Each follower runs an election timeout — a random countdown between 150 and 300 ms. If it expires without a heartbeat, something has gone wrong.
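The timeout mechanism is small enough to sketch in a few lines. A minimal illustration in Python, assuming a millisecond clock; the function names are invented here, not taken from any particular implementation:

```python
import random

ELECTION_TIMEOUT_MS = (150, 300)  # randomized range from the Raft paper

def new_election_deadline(now_ms: float) -> float:
    # Pick a fresh random deadline; called at startup and again
    # every time a heartbeat arrives from the leader.
    return now_ms + random.uniform(*ELECTION_TIMEOUT_MS)

def should_start_election(now_ms: float, deadline_ms: float) -> bool:
    # A follower that reaches its deadline without hearing a heartbeat
    # assumes the leader is gone and starts an election.
    return now_ms >= deadline_ms
```

Because each node draws its own random deadline, the timeouts rarely fire at the same moment.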

The Timeout Fires

Node C's timer runs out first. It assumes the leader has crashed, increments its term number, and transitions to candidate state.

The randomized timeout breaks symmetry: a little randomness prevents endless split votes.

Requesting Votes

The candidate votes for itself, then sends RequestVote RPCs to every other node. Each node votes for at most one candidate per term, first-come-first-served.

Critical rule: a node will not vote for a candidate whose log is less up-to-date than its own.
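Both rules fit in one small handler. A simplified sketch, assuming node state is a plain dictionary (the field names are illustrative):

```python
def log_is_up_to_date(cand_last_term, cand_last_index, my_last_term, my_last_index):
    # Raft compares the last entries: a higher last term wins;
    # with equal terms, the longer log wins.
    if cand_last_term != my_last_term:
        return cand_last_term > my_last_term
    return cand_last_index >= my_last_index

def handle_request_vote(state, term, candidate_id, last_log_term, last_log_index):
    if term < state["current_term"]:
        return False                      # stale candidate: reject
    if term > state["current_term"]:
        state["current_term"] = term      # newer term: forget any old vote
        state["voted_for"] = None
    if state["voted_for"] not in (None, candidate_id):
        return False                      # at most one vote per term
    if not log_is_up_to_date(last_log_term, last_log_index,
                             state["last_log_term"], state["last_log_index"]):
        return False                      # election restriction
    state["voted_for"] = candidate_id
    return True
```

The final check is what later guarantees that a winning candidate already holds every committed entry.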

A Leader Emerges

Node C receives votes from a majority (3 of 5, including itself). It becomes leader and immediately begins sending heartbeats.

Heartbeats are empty AppendEntries RPCs — proof of life that reset followers' election timers and prevent unnecessary elections.
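"Empty AppendEntries" is meant literally. A sketch of what a leader might put on the wire, with hypothetical field names mirroring the paper's RPC arguments:

```python
def make_heartbeat(state: dict) -> dict:
    # A heartbeat is an AppendEntries RPC whose entries list is empty.
    # Followers treat it like any replication traffic: check the term,
    # reset the election timer, and learn the leader's commit index.
    return {
        "term": state["current_term"],
        "leader_id": state["id"],
        "prev_log_index": len(state["log"]) - 1,
        "prev_log_term": state["log"][-1][0],
        "entries": [],                      # empty: nothing to replicate
        "leader_commit": state["commit_index"],
    }
```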

Log Replication

A client sends a command. The leader appends it to its log with the current term number, then sends AppendEntries RPCs to all followers.

Each follower checks the Log Matching Property: if the preceding entry matches, it appends. If not, the leader retries with earlier entries until the logs converge.
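The follower-side check can be sketched in a few lines. Here a log is a Python list of (term, command) pairs with a sentinel at index 0; this simplified version truncates any conflicting suffix before appending, which is safe as long as RPCs are processed in order:

```python
def append_entries(log, prev_index, prev_term, entries):
    # Log Matching Property: we may only append if our entry at prev_index
    # carries prev_term. Otherwise the leader decrements prev_index and retries.
    if prev_index >= len(log) or log[prev_index][0] != prev_term:
        return False
    del log[prev_index + 1:]   # drop any conflicting suffix
    log.extend(entries)        # then splice in the leader's entries
    return True
```

Repeated failures walk prev_index backward until the two logs agree on a common prefix, at which point replication resumes.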

Commitment

Once the leader has replicated an entry to a majority of nodes, that entry is committed — it will never be lost or overwritten.

The leader advances its commitIndex, notifies followers, and all nodes apply the entry to their state machines. The command takes effect.

Leader Failure

The leader crashes. Followers stop receiving heartbeats. After their election timeouts expire, a new election begins.

The election restriction ensures the new leader has every committed entry. Safety is never violated — only liveness is temporarily lost.

Recovery

A new leader is typically elected within a few hundred milliseconds. The crashed node eventually restarts as a follower and catches up by receiving the entries it missed.

The cluster continues serving requests. Clients may not even notice the interruption. This is Raft: agreement without trust, continuity without perfection.

The Algorithm Humanity Could Read

For thirty years, distributed consensus had a canonical solution, Paxos, that almost nobody could implement correctly. Google's Chubby team spent years hardening their implementation. The ZooKeeper developers built a new protocol, ZAB, rather than fight with it.

Ongaro and Ousterhout's radical insight was that understandability is a first-class engineering requirement. An algorithm that engineers cannot hold in their heads is an algorithm they cannot implement safely. Raft proved them right: within a decade, dozens of correct implementations appeared in every major language.

Terms: Raft's Logical Clock

Time in Raft is divided into terms of arbitrary length. Each begins with an election. The term number acts as a distributed consistency check — stale messages from old terms are rejected immediately.

Each colored block is a term. Green indicates a successful leader; amber marks elections that split the vote and produced no leader. The term number only increases — it is Raft's logical clock.
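The rejection rule every RPC handler applies can be sketched as follows (a simplified illustration; the state fields are assumptions, not any library's API):

```python
def observe_term(state: dict, rpc_term: int) -> str:
    # Every RPC carries the sender's term. A stale term is rejected
    # outright; a newer term forces this node to step down to follower
    # and forget any vote it cast in the old term.
    if rpc_term < state["current_term"]:
        return "reject"
    if rpc_term > state["current_term"]:
        state["current_term"] = rpc_term
        state["role"] = "follower"
        state["voted_for"] = None
    return "accept"
```

This single comparison is why a deposed leader that wakes up after a network partition cannot corrupt the cluster: its first RPC reveals its stale term and it steps down.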

The Machines That Run on Agreement

Raft is not an academic exercise. It is the consensus backbone of Kubernetes (via etcd), CockroachDB, TiKV, and HashiCorp Consul. When you deploy a container, commit a distributed transaction, or register a service, Raft is the mechanism ensuring that the operation is consistent and durable.

Every time a hospital records patient data across replicated databases, every time a bank processes a transaction through a distributed ledger — the correctness of that operation rests on the same three mechanisms: leader election, log replication, and the safety guarantee that committed entries are never lost.

Log Replication: The Majority Rule

An entry is committed once replicated to a majority. In a 5-node cluster, that means 3 nodes. This visualizes how entries flow from leader to followers and become committed.

Each row is a node's log. Green entries are committed (replicated to a majority). Dim entries exist only on one node and may be overwritten. The leader's log is always authoritative.

Decomposition as Design

Raft's deepest contribution is not a new theorem but a design principle: decompose the hard problem into subproblems that humans can reason about independently. Leader election, log replication, and safety are each understandable on their own. Together they produce a system that is provably correct and practically buildable.

This is the lesson that extends beyond distributed systems. In any engineering discipline, the systems that endure are not the cleverest — they are the most legible.

Build Your Own Raft Cluster

Configure a cluster, crash nodes, trigger elections, and send log entries. Watch Raft maintain consensus in real time.
