CONSENSUS IN DISTRIBUTED COMPUTING
LET’S TALK ABOUT…
RUBEN TAN LONG ZHENG
▸ CTO of Neuroware, Inc
▸ We Do Blockchain Stuff™
▸ Co-founder of Javascript Developers Malaysia
▸ Proud owner of 2 useless cats
▸ @roguejs
SUPER HIGH-LEVEL OVERVIEW
▸ Consensus in Distributed Computing
▸ Consensus
▸ Agreeing that something is the truth
▸ Distributed Computing
▸ A network of nodes operating together
FAILURE MODES
▸ Fail-stop = a node dies
▸ Fail-recover = a node dies and comes back later (Jesus/Zombie)
▸ Byzantine = a node misbehaves
BYZANTINE GENERALS PROBLEM
▸ One of the earliest impossibility results in computer communications
▸ Cannot be solved with unsigned messages when a third or more of the generals are traitors (tolerating f traitors needs at least 3f + 1 generals)
▸ Originated from the Two Generals’ Problem (1975)
▸ Explored in detail in the paper by Leslie Lamport, Robert Shostak and Marshall Pease: The Byzantine Generals Problem (1982)
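[Diagram: generals A, B, C, D, E and F surround the ENEMY, with a TRAITOR among the generals. Some generals call ATTACK!, others call RETREAT!, and the traitor sends ATTACK! to some and RETREAT! to others.]
[Diagram: TRAITOR: MUAHAHA, NO CONSENSUS! The attackers have insufficient force and are destroyed, and the enemy routs the fleeing army.]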
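The scenario in the diagram can be sketched as a small simulation. This is only an illustration, not the algorithm from the 1982 paper: the separate traitor named "T", the `decide` helper and the plain majority rule are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical setup: six loyal generals plus one traitor called "T".
# Loyal generals send the same honest vote to everyone.
LOYAL_VOTES = {"A": "attack", "B": "attack", "C": "attack",
               "D": "retreat", "E": "retreat", "F": "retreat"}


def traitor_message(recipient):
    """The traitor tells each general whatever that general already prefers."""
    return LOYAL_VOTES[recipient]


def decide(general):
    """A loyal general follows the majority of the seven votes it has seen."""
    heard = dict(LOYAL_VOTES)               # its own vote plus five loyal votes
    heard["T"] = traitor_message(general)   # plus whatever the traitor said
    return Counter(heard.values()).most_common(1)[0][0]


decisions = {general: decide(general) for general in LOYAL_VOTES}
print(decisions)
# A, B and C count 4 attack vs 3 retreat and attack; D, E and F count
# the opposite and retreat, so the loyal generals end up split.
```

Every loyal general followed the same rule on honest input from the other loyal generals, yet one inconsistent node was enough to split them. That is the "different symptoms to different observers" behaviour that defines a Byzantine fault.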
BYZANTINE FAULT TOLERANCE
▸ Byzantine Fault
▸ Any fault that presents different symptoms to different observers (some generals are told to attack, others to retreat)
▸ Byzantine Failure
▸ The loss of a system service that relies on consensus, caused by a Byzantine Fault
▸ Byzantine Fault Tolerance
▸ The ability of a system to keep working correctly despite Byzantine Faults
ON A SIDENOTE…
▸ Distributed computing is inherently unreliable
▸ Peter Deutsch, Bill Joy, Tom Lyon and James Gosling
▸ The Eight Fallacies of Distributed Computing
(1994-1997)
▸ Today, we still have engineers who believe in some, if not all, of the fallacies
EIGHT FALLACIES OF DISTRIBUTED COMPUTING
▸ The network is reliable
▸ Latency is zero
▸ Bandwidth is infinite
▸ The network is secure
▸ Topology does not change
▸ There is only one administrator
▸ Transport cost is zero
▸ The network is homogeneous (same platform)
When you believe in any of the eight fallacies…
CONSENSUS
The Real Talk Begins™
CONSENSUS OVERVIEW
▸ Achieving Consensus = distributed system acting as one entity
▸ Consensus Problem = getting nodes in a distributed system to
agree on something (value, operation, etc)
▸ Basically… consensus = THE HIVE MIND
▸ Common Examples
▸ Commit transactions to a database
▸ Synchronising clocks
FLP IMPOSSIBILITY PROOF
▸ Michael J. Fischer, Nancy A. Lynch, and Michael S. Paterson
▸ Impossibility of Distributed Consensus with One Faulty Process (1985) - Dijkstra (dike-stra) Prize (2001)
▸ In synchronous settings, it is possible to reach consensus at the cost of time
▸ In an asynchronous setting, no deterministic protocol can guarantee consensus if even 1 node may crash
SOLVING THE CONSENSUS PROBLEM
▸ Strong consensus follows these properties:
▸ Termination - all non-faulty nodes eventually decide on a value
▸ Agreement - all nodes that decide choose the same value
▸ Validity - the value decided must have been proposed by a node (AKA no default value to fall back on)
▸ Termination + Agreement + Validity = Consensus
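The three properties can be read as checks on the outcome of a single run. The sketch below is a minimal illustration: the `check_consensus` name and its inputs are hypothetical, and it simplifies by treating every listed node as non-faulty.

```python
def check_consensus(proposed, decisions):
    """Check one finished run against the three properties.

    proposed  -- set of values that some node proposed
    decisions -- dict of node name -> decided value, or None if undecided
    Assumption: every node in `decisions` is non-faulty.
    """
    decided = [value for value in decisions.values() if value is not None]
    return {
        "termination": len(decided) == len(decisions),      # everyone decided
        "agreement": len(set(decided)) <= 1,                # on a single value
        "validity": all(v in proposed for v in decided),    # that was proposed
    }


good = check_consensus({5, 7}, {"n1": 7, "n2": 7, "n3": 7})
split = check_consensus({5, 7}, {"n1": 7, "n2": 5, "n3": 7})
default = check_consensus({5, 7}, {"n1": 0, "n2": 0, "n3": 0})
stuck = check_consensus({5, 7}, {"n1": 7, "n2": None, "n3": 7})

print(good)     # all three properties hold
print(split)    # agreement is violated
print(default)  # validity is violated: 0 was never proposed
print(stuck)    # termination is violated: n2 never decided
```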
CONSENSUS PROTOCOLS
▸ 2 Phase Commit
▸ 3 Phase Commit
▸ Basic Paxos
▸ The Future…
2 PHASE COMMIT
▸ Simplest consensus protocol
▸ Phase 1 - Proposal
▸ A node (called coordinator) proposes a value to all other nodes,
then gathers votes
▸ Phase 2 - Commit-or-abort
▸ The coordinator sends:
▸ Commit if all nodes voted yes. All nodes commit the new value
▸ Abort if 1 or more nodes voted no. All nodes abort the value
[Diagram, step 1: the coordinator (COOR.) proposes a value to four nodes]
[Diagram, step 2: all nodes vote yes or no]
[Diagram, step 3: the coordinator sends commit if all nodes voted yes and sends abort otherwise; all nodes then either update themselves to contain the proposed value or abort]
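The two phases can be sketched in a few lines. This is a minimal in-memory illustration that assumes no crashes and no lost messages; the `Node` class, its `will_vote_yes` flag and the `two_phase_commit` function are hypothetical names, not part of any library.

```python
class Node:
    """A participant. `will_vote_yes` stands in for its local checks."""

    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.value = None     # last committed value
        self.pending = None   # proposed but not yet committed

    def on_propose(self, value):
        self.pending = value
        return self.will_vote_yes

    def on_commit(self):
        self.value, self.pending = self.pending, None

    def on_abort(self):
        self.pending = None


def two_phase_commit(nodes, value):
    # Phase 1 - Proposal: send the value to every node and gather votes.
    votes = [node.on_propose(value) for node in nodes]
    # Phase 2 - Commit-or-abort: commit only on a unanimous yes.
    if all(votes):
        for node in nodes:
            node.on_commit()
        return "commit"
    for node in nodes:
        node.on_abort()
    return "abort"


happy = [Node("n1"), Node("n2"), Node("n3")]
print(two_phase_commit(happy, 7), [n.value for n in happy])

one_no = [Node("n1"), Node("n2", will_vote_yes=False), Node("n3")]
print(two_phase_commit(one_no, 7), [n.value for n in one_no])
```

In the first run every node ends up holding 7; in the second a single no vote aborts the proposal for everyone.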
2 PHASE COMMIT
▸ Agreement - every node accepts the value from the
coordinator at phase 2 = YES
▸ Validity - commit/abort originated from the coordinator =
YES
▸ Termination = no loops in the steps, doesn’t run forever =
YES
▸ Therefore, 2 phase commit fulfils the requirements of a
consensus protocol
2 PHASE COMMIT
▸ Blocking failure when coordinator fails before sending proposal to all nodes
[Diagram, step 1: the coordinator (COOR.), facing three nodes, proposes a value]
[Diagram, step 2: a node receives the proposed value, votes yes, and is now waiting for commit]
[Diagram, step 3: the coordinator crashes… and a different coordinator (NEW COOR.) comes in to propose a different value]
[Diagram, step 4: the node cannot accept the new proposal because it is waiting on commit, and cannot abort because the first coordinator might recover]
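The blocked node can be sketched as a tiny state machine. This is an illustration under assumptions: the `Participant` class, its state names and the "busy" reply are hypothetical, and a real implementation would also involve timeouts and persistent logs.

```python
class Participant:
    """A 2PC node that, once it votes yes, waits for its coordinator."""

    def __init__(self):
        self.state = "idle"       # idle -> voted_yes -> committed / aborted
        self.pending = None
        self.coordinator = None

    def on_propose(self, coordinator, value):
        if self.state == "voted_yes":
            # Still bound by the earlier yes vote: refuse the new proposal.
            return "busy"
        self.state = "voted_yes"
        self.pending = value
        self.coordinator = coordinator
        return "yes"

    def on_decision(self, coordinator, decision):
        # Only the coordinator that collected the vote may resolve it.
        if coordinator != self.coordinator or self.state != "voted_yes":
            return
        self.state = "committed" if decision == "commit" else "aborted"


node = Participant()
print(node.on_propose("coord-1", 7))   # yes
# coord-1 crashes here, so no commit or abort ever arrives.
print(node.on_propose("coord-2", 5))   # busy
print(node.state, node.pending)        # voted_yes 7
```

Nothing in the node's own logic lets it leave `voted_yes`; only a decision from the first coordinator does. That is the sense in which 2PC keeps safety but gives up liveness.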
2 PHASE COMMIT
▸ Guarantees safety, but not liveness
▸ Safety = all nodes agree on a value proposed by a node
▸ Liveness = should still be able to function when some
nodes crash
3 PHASE COMMIT
▸ Similar to 2 Phase Commit, with an extra phase (duh)
▸ Phase 1 - Proposal - same as 2PC
▸ Phase 2 - Pre-approve - similar to 2PC commit-or-abort,
but nodes reply with ACK instead
▸ Phase 3 - Do Commit - now the nodes commit
▸ Tolerant of node crashes, but not network partitions
▸ Won’t cover in detail
PAXOS
▸ Presented by Leslie Lamport in The Part-Time Parliament (first circulated in 1989, published in 1998)
▸ Named after a fictional parliament on the Greek island of Paxos
▸ Remains as:
▸ The hardest to understand in theory
▸ The hardest to implement
▸ The closest we get to reaching ideal consensus
PAXOS
▸ Used (directly or as a close variant) in:
▸ Apache ZooKeeper (Zab, a Paxos-like protocol)
▸ Google Chubby (BigTable)
▸ Google Spanner
▸ Apache Mesos
▸ Apache Cassandra
▸ etc
PAXOS
▸ Components:
▸ Proposers
▸ Propose values to other nodes
▸ Acceptors
▸ Respond to proposers with votes
▸ Commit the chosen value & decision state
▸ A server can host both 1 Proposer & 1 Acceptor
PAXOS
▸ Uses a two-phase approach:
▸ Broadcast Prepare
▸ Find out if there’s already a chosen value
▸ Block older proposals that have yet to be completed
▸ Broadcast Accept
▸ Ask acceptors to accept a value
PAXOS
▸ Prepare(n)
▸ n = proposal number, built as [max++]~[server id] so that no two servers generate the same number
▸ Return(p, v)
▸ p = number of the proposal already accepted (if any)
▸ v = current accepted value (if any)
▸ Accept(p, v)
▸ p = proposal number
▸ v = value to be accepted
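One way to read the [max++]~[server id] scheme is as a pair that is compared counter first and server id second. The sketch below assumes that reading; `ProposalNumber` and `next_proposal` are hypothetical names.

```python
from typing import NamedTuple


class ProposalNumber(NamedTuple):
    counter: int     # the [max++] part: one more than the highest seen
    server_id: int   # the [server id] part: breaks ties between servers


def next_proposal(highest_seen, server_id):
    """Build a number larger than any counter this server has seen so far."""
    return ProposalNumber(highest_seen + 1, server_id)


a = next_proposal(0, server_id=1)
b = next_proposal(0, server_id=2)   # same counter, different server
c = next_proposal(1, server_id=1)   # a later round from server 1

# Tuples compare field by field, so the ordering is total and unique.
print(a < b < c)   # True
print(a == b)      # False
```

Because the server id is part of the number, two proposers that pick the same counter at the same time still produce distinct, ordered proposal numbers.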
PAXOS
▸ Proposal Phase
▸ Proposer generates a proposal number p
▸ Proposer broadcasts p and a value v
▸ Acceptor checks whether p is higher than its min-p, and updates min-p to p if so
▸ Acceptor replies with its acc-p and acc-v, if any
▸ Proposer waits for replies from a majority
▸ If any reply carries an accepted value, the proposer replaces v with the acc-v from the reply with the highest acc-p
PAXOS
▸ Accept Phase
▸ Proposer sends p and v to all acceptors
▸ Each acceptor checks whether p is lower than its min-p, and ignores the request if so. Otherwise, it sets acc-p = min-p = p and acc-v = v
▸ Acceptor replies accepted or rejected
▸ If a majority accepted, terminate with v. Otherwise, restart the Proposal Phase with a new p
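The acceptor's side of both phases fits in a small class. This is a sketch under assumptions: the `Acceptor` class and method names are hypothetical, proposal numbers are plain integers, and a prepare whose number is not higher than min-p is treated as a rejection (returning `None`), which the slides do not spell out.

```python
class Acceptor:
    def __init__(self):
        self.min_p = 0      # proposals numbered below this are ignored
        self.acc_p = None   # number of the accepted proposal, if any
        self.acc_v = None   # value of the accepted proposal, if any

    def on_prepare(self, p):
        """Proposal phase: promise to ignore anything older than p."""
        if p <= self.min_p:
            return None                 # assumption: reject stale prepares
        self.min_p = p
        return self.acc_p, self.acc_v   # report any accepted proposal

    def on_accept(self, p, v):
        """Accept phase: take (p, v) unless a newer prepare has been seen."""
        if p < self.min_p:
            return False
        self.min_p = self.acc_p = p
        self.acc_v = v
        return True


acceptor = Acceptor()
print(acceptor.on_prepare(1))    # (None, None): nothing accepted yet
print(acceptor.on_accept(1, 7))  # True
print(acceptor.on_prepare(2))    # (1, 7): a later proposer learns about 7
print(acceptor.on_accept(1, 9))  # False: proposal 1 is older than min-p 2
```

The last call shows how a prepare blocks older proposals: once the acceptor has seen proposal 2, an accept carrying number 1 is refused.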
[Diagram walkthrough: one proposer P and three acceptors A1, A2, A3]
▸ v7 is proposed with p1. Every acceptor starts with MIN-P 0, ACC-P -, ACC-V -
▸ (P1, 7) reaches A1, A2 and A3 in turn, and each sets MIN-P to 1
▸ Acceptors reply with ACC-P - and ACC-V -
▸ Has majority! Since acc-p and acc-v are both null, we know that we are the only proposer in the network so far
▸ Now, we send out p and v in the accept phase
▸ Acceptors update acc-p and acc-v: each now holds MIN-P 1, ACC-P 1, ACC-V 7
▸ Acceptors reply Accept! one at a time
▸ Oh look, we have majority! v7 is the terminated value then!
▸ The last reply arrives after the decision: Accept? :( / Shuddup, nobody loves you
PAXOS - MULTI PROPOSERS
▸ What if there were multiple proposers?
▸ Brace yourself, It’s Complicated™ (not really)
[Diagram walkthrough: proposers P1 and P2 and three acceptors A1, A2, A3]
▸ P1 proposes v7 with p1, as before; every acceptor sets MIN-P to 1 and replies with ACC-P - and ACC-V -
▸ P1 completes its accept phase: each acceptor now holds MIN-P 1, ACC-P 1, ACC-V 7
▸ v5 is proposed with p2
▸ An acceptor replies to P2 with ACC-P 1 and ACC-V 7
▸ value of p2 is changed to 7
▸ Broadcast accept phase with p2 and v7; acceptors move to MIN-P 2
▸ Both proposers succeed! No blocking here.
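The two-proposer walkthrough can be replayed in code. This is a minimal single-process sketch with no crashes or message loss; the function names, the dictionary-based acceptor state and the integer proposal numbers are assumptions made for the example.

```python
def new_acceptor():
    return {"min_p": 0, "acc_p": None, "acc_v": None}


def prepare(acc, p):
    if p <= acc["min_p"]:
        return None                       # assumption: stale prepares are rejected
    acc["min_p"] = p
    return acc["acc_p"], acc["acc_v"]


def accept(acc, p, v):
    if p < acc["min_p"]:
        return False
    acc.update(min_p=p, acc_p=p, acc_v=v)
    return True


def run_proposer(acceptors, p, v):
    """Run both phases once; return the chosen value, or None to retry."""
    majority = len(acceptors) // 2 + 1
    promises = [r for r in (prepare(a, p) for a in acceptors) if r is not None]
    if len(promises) < majority:
        return None
    already = [r for r in promises if r[0] is not None]
    if already:
        # Adopt the value of the highest-numbered accepted proposal.
        v = max(already, key=lambda r: r[0])[1]
    acks = sum(accept(a, p, v) for a in acceptors)
    return v if acks >= majority else None


acceptors = [new_acceptor() for _ in range(3)]
first = run_proposer(acceptors, p=1, v=7)    # P1 proposes 7
second = run_proposer(acceptors, p=2, v=5)   # P2 wants 5 but must adopt 7
print(first, second)   # 7 7
```

P2 still completes its round, which is why nothing blocks, but it finishes with P1's value rather than its own, so every proposer reports the same decision.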
BASIC PAXOS
▸ This is BASIC Paxos: 2PC with a twist (Quorum)
▸ It has vulnerabilities!
▸ Best of 2PC (safety), with much stronger liveness: progress only needs a majority of acceptors, although competing proposers can still keep pre-empting each other
▸ Many consensus algorithms are variants of Paxos
▸ Forms the basis of much Distributed Computing research
CLOSING…
▸ Basic Paxos is not Byzantine Fault Tolerant
▸ It is a challenge to create a consensus protocol (termination, agreement, validity) that is Byzantine Fault Tolerant
▸ Nakamoto Consensus (aka Bitcoin consensus) skirts around Byzantine problems by imposing proof-of-work
▸ Raft is a separate consensus algorithm, designed to be easier to understand than Paxos, used in etcd and Consul
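The proof-of-work idea can be shown with a toy puzzle: finding a valid nonce is expensive, while checking one is a single hash. This sketch is not Bitcoin's actual scheme, which hashes a block header with double SHA-256 against a numeric target; the `mine` and `verify` functions and the leading-zero rule are simplifications assumed for illustration.

```python
import hashlib


def digest(block, nonce):
    return hashlib.sha256(block + str(nonce).encode()).hexdigest()


def mine(block, difficulty):
    """Try nonces in order until the digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while not digest(block, nonce).startswith(prefix):
        nonce += 1
    return nonce


def verify(block, nonce, difficulty):
    """Checking a claimed nonce costs one hash, however long mining took."""
    return digest(block, nonce).startswith("0" * difficulty)


nonce = mine(b"example block", difficulty=3)
print(nonce, verify(b"example block", nonce, 3))
```

A misbehaving node cannot cheaply flood the network with competing proposals, because each one has to carry a nonce that took real work to find.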
PAXOS - BEST GEEKY PICKUP
LINE NEVER
Ruben Tan