Non-blocking
Michael-Scott queue algorithm
Alexey Fyodorov
JUG.ru Group
What is this talk about?
• Programming
• Algorithms
• Concurrency
Are you sure you need it?
For concurrency beginners
Sorry. Please go to another room
For non-blocking programming beginners
A short introduction
For advanced concurrent programmers
CAS-based queue algorithm
You have another room!
12:10
• Non-blocking Michael-Scott queue algorithm (Alexey Fyodorov)
• Easily scale enterprise applications using distributed data grids (Ondrej Mihaly)
Main Models
Shared Memory: write + read. Similar to how we program it. Concurrent Programming
Messaging: send + onReceive. Similar to how real hardware works. Distributed Programming
Advantages of Parallelism
• Resource utilization: utilization of several cores/CPUs, aka PERFORMANCE
• Simplicity: complexity goes to magic frameworks
  • ArrayBlockingQueue
  • ConcurrentHashMap
  • Akka
• Async handling: responsive services, responsive UI
Disadvantages of Locking
• Deadlocks
• Priority Inversion
• Reliability
  • What will happen if the lock owner dies?
• Performance
  • Scheduler can push the lock owner out
  • No parallelism inside a critical section!
Amdahl’s Law
α: non-parallelizable part of the computation
1-α: parallelizable part of the computation
p: number of threads
$$S = \frac{1}{\alpha + \frac{1 - \alpha}{p}}$$
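To get a feel for the formula, here is a small sketch that evaluates the speedup S for a given α and p. The class and method names (`AmdahlSketch`, `speedup`) are made up for this illustration. With α = 0.1 the speedup can never exceed 1/α = 10, no matter how many threads are added.

```java
// Illustrative sketch only: AmdahlSketch and speedup are hypothetical names.
final class AmdahlSketch {

    /** Amdahl's Law: S = 1 / (alpha + (1 - alpha) / p). */
    static double speedup(double alpha, int p) {
        return 1.0 / (alpha + (1.0 - alpha) / p);
    }

    public static void main(String[] args) {
        double alpha = 0.1; // 10% of the work cannot be parallelized
        for (int p : new int[] {1, 2, 4, 8, 64, 1024}) {
            System.out.printf("p=%d S=%.2f%n", p, speedup(alpha, p));
        }
    }
}
```

The serial part α dominates quickly: a critical section guarded by a lock is exactly such a serial part.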
If-Modify-Write
volatile int value = 0;
if (value == 0) {
    value = 42;
}
Can we run it in a multithreaded environment?
No atomicity
Compare-And-Set
int value = 0;
LOCK
if (value == 0) {
value = 42;
}
UNLOCK
Introducing a Magic Operation
value.compareAndSet(0, 42);
int value = 0;
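A plain `int` has no `compareAndSet` method, so the slide above is pseudocode. A minimal sketch of the same idea with `java.util.concurrent.atomic.AtomicInteger` could look like this; the class and method names (`MagicOperationSketch`, `setIfZero`) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: the check and the write happen as one atomic step.
final class MagicOperationSketch {

    /** Atomically replaces 0 with 42; returns false if the value was not 0. */
    static boolean setIfZero(AtomicInteger value) {
        return value.compareAndSet(0, 42);
    }

    public static void main(String[] args) {
        AtomicInteger value = new AtomicInteger(0);
        System.out.println(setIfZero(value)); // first call finds 0 and writes 42
        System.out.println(setIfZero(value)); // second call finds 42 and changes nothing
        System.out.println(value.get());
    }
}
```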
Simulated CAS
long value;

synchronized long get() {
    return value;
}

// Returns the value seen before the call; the write happens only on a match.
synchronized long compareAndSwap(long expected, long newValue) {
    long oldValue = value;
    if (oldValue == expected) {
        value = newValue;
    }
    return oldValue;
}

// Returns true if the swap happened.
synchronized boolean compareAndSet(long expected, long newValue) {
    return expected == compareAndSwap(expected, newValue);
}
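As a rough illustration of how the simulated operations behave, the sketch below wraps them in a class and builds an increment on top of them with a retry loop. The class name `SimulatedCasCounter` and the `increment` method are assumptions added for this example. Because it uses `synchronized`, this only models CAS semantics and is not itself non-blocking.

```java
// Illustrative sketch only: models CAS semantics with a lock.
final class SimulatedCasCounter {
    private long value;

    synchronized long get() {
        return value;
    }

    /** Writes newValue only if the current value equals expected; returns the old value. */
    synchronized long compareAndSwap(long expected, long newValue) {
        long oldValue = value;
        if (oldValue == expected) {
            value = newValue;
        }
        return oldValue;
    }

    /** Same operation, reported as success or failure. */
    synchronized boolean compareAndSet(long expected, long newValue) {
        return expected == compareAndSwap(expected, newValue);
    }

    /** Hypothetical helper: read, compute, and retry until the CAS succeeds. */
    long increment() {
        long v;
        do {
            v = get();
        } while (!compareAndSet(v, v + 1));
        return v + 1;
    }
}
```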
Compare and Swap: Hardware Support
compare-and-swap (CAS): cmpxchg (x86)
load-link / store-conditional (LL/SC): ldrex/strex (ARM), lwarx/stwcx (PowerPC)
Atomics in JDK
java.util.concurrent.atomic
AtomicReference
• ref.get()
• ref.compareAndSet(v1, v2)
• ...
AtomicLong
• i.get()
• i.compareAndSet(42, 43)
• i.incrementAndGet()
• i.getAndAdd(5)
• ...
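The calls listed above can be tried out in a few lines. This is a minimal sketch; the class and method names (`AtomicsTourSketch`, `longTour`, `refTour`) are invented for the example. Note that `AtomicReference.compareAndSet` compares references by identity (`==`), not with `equals`.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only: walks through the listed AtomicLong and AtomicReference calls.
final class AtomicsTourSketch {

    static long longTour() {
        AtomicLong i = new AtomicLong(42);
        i.compareAndSet(42, 43);      // succeeds: 42 -> 43
        i.compareAndSet(42, 100);     // fails: current value is 43, nothing changes
        i.incrementAndGet();          // 43 -> 44, returns the new value
        long before = i.getAndAdd(5); // returns 44, value becomes 49
        return before + i.get();      // 44 + 49
    }

    static String refTour() {
        String v1 = "v1";
        String v2 = "v2";
        AtomicReference<String> ref = new AtomicReference<>(v1);
        ref.compareAndSet(v1, v2);    // succeeds: ref held exactly the v1 object
        ref.compareAndSet(v1, "v3");  // fails: ref now holds v2
        return ref.get();
    }

    public static void main(String[] args) {
        System.out.println(longTour());
        System.out.println(refTour());
    }
}
```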
Example. Atomic Counter
AtomicLong value = new AtomicLong();

long get() {
    return value.get();
}

void increment() {
    long v;
    do {
        v = value.get();
    } while (!value.compareAndSet(v, v + 1));
}
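To see the counter under contention, the sketch below runs the same CAS loop from several threads and checks that no increment is lost. The wrapper class `CasCounterSketch` and the helper `runDemo` are hypothetical names added for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: the slide's counter plus a small multi-threaded harness.
final class CasCounterSketch {
    private final AtomicLong value = new AtomicLong();

    long get() {
        return value.get();
    }

    void increment() {
        long v;
        do {
            v = value.get();                          // read the current value
        } while (!value.compareAndSet(v, v + 1));     // retry if another thread won
    }

    /** Hypothetical harness: starts the threads, waits for them, returns the total. */
    static long runDemo(int threads, int perThread) {
        CasCounterSketch counter = new CasCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10_000));
    }
}
```

A failed `compareAndSet` means some other thread's increment succeeded, which is why the loop as a whole always makes progress.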
Atomics.
Questions?
Non-blocking Guarantees
Wait-Free: per-thread progress is guaranteed
Lock-Free: overall progress is guaranteed
Obstruction-Free: overall progress is guaranteed if threads don’t interfere with each other
CAS-loop
do {
    v = value.get();
} while (!value.compareAndSet(v, v + 1));
A. Wait-Free
B. Lock-Free
C. Obstruction-Free
*for modern hardware supporting CAS or LL/SC
Stack & Concurrency
class Node<E> {
    final E item;
    Node<E> next;

    Node(E item) {
        this.item = item;
    }
}
[Figure: stack of nodes E3, E1, E2 with the top pointer; Thread 1 adds item1 and Thread 2 adds item2 at the same time]
We need synchronization
Non-blocking Stack
AtomicReference<Node<E>> top = new AtomicReference<>();

void push(E item) {
    Node<E> newHead = new Node<>(item);
    Node<E> oldHead;
    do {
        oldHead = top.get();
        newHead.next = oldHead;
    } while (!top.compareAndSet(oldHead, newHead));
}
[Figure: stack of nodes E3, E1, E2 with the top pointer]
E pop() {
    Node<E> newHead;
    Node<E> oldHead;
    do {
        oldHead = top.get();
        if (oldHead == null) return null;
        newHead = oldHead.next;
    } while (!top.compareAndSet(oldHead, newHead));
    return oldHead.item;
}
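Putting push and pop together, a self-contained version of the stack might look like the sketch below. The class name `NonBlockingStackSketch` is an assumption for this example, and `top` is initialized to an empty reference so that the first `push` sees `null` as the old head.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only: push and pop from the slides in one runnable class.
final class NonBlockingStackSketch<E> {

    private static final class Node<E> {
        final E item;
        Node<E> next;

        Node(E item) {
            this.item = item;
        }
    }

    // Holds null while the stack is empty.
    private final AtomicReference<Node<E>> top = new AtomicReference<>();

    void push(E item) {
        Node<E> newHead = new Node<>(item);
        Node<E> oldHead;
        do {
            oldHead = top.get();
            newHead.next = oldHead;                        // link before publishing
        } while (!top.compareAndSet(oldHead, newHead));    // publish, or retry
    }

    E pop() {
        Node<E> newHead;
        Node<E> oldHead;
        do {
            oldHead = top.get();
            if (oldHead == null) return null;              // empty stack
            newHead = oldHead.next;
        } while (!top.compareAndSet(oldHead, newHead));
        return oldHead.item;
    }

    public static void main(String[] args) {
        NonBlockingStackSketch<String> stack = new NonBlockingStackSketch<>();
        stack.push("E2");
        stack.push("E1");
        stack.push("E3");
        System.out.println(stack.pop()); // last pushed element comes out first
    }
}
```

Only one shared variable (`top`) changes per operation, which is why a single CAS is enough here. The queue needs two pointers and is harder.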
Non-blocking Stack.
Questions?
Non-blocking Queue
Michael and Scott, 1996
https://www.research.ibm.com/people/m/michael/podc-1996.pdf
Threads help each other
class LinkedQueue<E> {
    static class Node<E> {
        final E item;
        final AtomicReference<Node<E>> next;

        // Wrap the successor so that next is never a null AtomicReference.
        Node(E item, Node<E> next) {
            this.item = item;
            this.next = new AtomicReference<>(next);
        }
    }

    Node<E> dummy = new Node<>(null, null);
    AtomicReference<Node<E>> head = new AtomicReference<>(dummy);
    AtomicReference<Node<E>> tail = new AtomicReference<>(dummy);
}
void put(E item) {
    Node<E> newNode = new Node<>(item, null);
    boolean success;
    do {
        Node<E> curTail = tail.get();
        success = curTail.next.compareAndSet(null, newNode);
        tail.compareAndSet(curTail, curTail.next.get());
    } while (!success);
}
[Figure: linked list of nodes dummy, 1, 2 with the head and tail pointers]
Possible outcomes of the two CAS calls in one iteration of put:
• next CAS: true, tail CAS: true. newNode is linked and this thread moves tail itself.
• next CAS: true, tail CAS: false. newNode is linked; another thread has already moved tail forward.
• next CAS: false, tail CAS: false. Another thread linked its node first and tail has already moved; retry.
• next CAS: false, tail CAS: true. Another thread linked its node but has not moved tail yet; this thread moves tail for it (HELP) and retries.
Synchronization
Blocking (lock-based): lock + unlock. Invariant: before & after
Non-blocking (CAS-based): CAS-loop. Semi-invariant
public void put(E item) {
    Node<E> newNode = new Node<>(item, null);
    while (true) {
        Node<E> currentTail = tail.get();
        Node<E> tailNext = currentTail.next.get();
        if (currentTail == tail.get()) {
            if (tailNext != null) {
                // tail lags behind: help the other thread by advancing it
                tail.compareAndSet(currentTail, tailNext);
            } else {
                // tail is the real last node: try to link the new node
                if (currentTail.next.compareAndSet(null, newNode)) {
                    // a failure here is fine: someone else advanced tail for us
                    tail.compareAndSet(currentTail, newNode);
                    return;
                }
            }
        }
    }
}

public E poll() {
    while (true) {
        Node<E> first = head.get();
        Node<E> last = tail.get();
        Node<E> next = first.next.get();
        if (first == head.get()) {
            if (first == last) {
                if (next == null) return null; // queue is empty
                // tail lags behind: help advance it, then retry
                tail.compareAndSet(last, next);
            } else {
                E item = next.item;
                // next becomes the new dummy node
                if (head.compareAndSet(first, next))
                    return item;
            }
        }
    }
}
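For readers who want to run the algorithm, here is a self-contained sketch that combines the node class with put and poll. The class name `MichaelScottQueueSketch` and the helper `produceAndDrain` are hypothetical. The node constructor is simplified to always start with an empty `next` reference, which is an assumption of this sketch and not necessarily how the original slides define it.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only: the put/poll logic from the slides in one runnable class.
final class MichaelScottQueueSketch<E> {

    private static final class Node<E> {
        final E item;
        final AtomicReference<Node<E>> next = new AtomicReference<>();

        Node(E item) {
            this.item = item;
        }
    }

    private final AtomicReference<Node<E>> head;
    private final AtomicReference<Node<E>> tail;

    MichaelScottQueueSketch() {
        Node<E> dummy = new Node<>(null);   // head always points at a dummy node
        head = new AtomicReference<>(dummy);
        tail = new AtomicReference<>(dummy);
    }

    void put(E item) {
        Node<E> newNode = new Node<>(item);
        while (true) {
            Node<E> currentTail = tail.get();
            Node<E> tailNext = currentTail.next.get();
            if (currentTail != tail.get()) continue;           // stale snapshot, retry
            if (tailNext != null) {
                tail.compareAndSet(currentTail, tailNext);     // help a lagging tail
            } else if (currentTail.next.compareAndSet(null, newNode)) {
                tail.compareAndSet(currentTail, newNode);      // may fail if we were helped
                return;
            }
        }
    }

    E poll() {
        while (true) {
            Node<E> first = head.get();
            Node<E> last = tail.get();
            Node<E> next = first.next.get();
            if (first != head.get()) continue;                 // stale snapshot, retry
            if (first == last) {
                if (next == null) return null;                 // empty queue
                tail.compareAndSet(last, next);                // help a lagging tail
            } else {
                E item = next.item;
                if (head.compareAndSet(first, next)) return item;
            }
        }
    }

    /** Hypothetical harness: concurrent producers, then a single-threaded drain. */
    static int produceAndDrain(int producers, int perProducer) {
        MichaelScottQueueSketch<Integer> queue = new MichaelScottQueueSketch<>();
        Thread[] workers = new Thread[producers];
        for (int t = 0; t < producers; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) queue.put(i);
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int drained = 0;
        while (queue.poll() != null) drained++;
        return drained;
    }
}
```

Calling `MichaelScottQueueSketch.produceAndDrain(4, 1000)` should count every element that was put, since a failed CAS only ever causes a retry and never drops a node.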
Non-blocking Queue in JDK
ConcurrentLinkedQueue is based on the Michael-Scott queue
• based on CAS-like operations
• uses the CAS-loop pattern
• threads help one another
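In application code the JDK class is normally used directly. A minimal usage sketch follows; the class and method names (`ClqSketch`, `demo`) are invented for the example. `offer` never blocks and `poll` returns `null` on an empty queue.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative sketch only: basic non-blocking offer/poll on the JDK queue.
final class ClqSketch {

    static String demo() {
        Queue<String> queue = new ConcurrentLinkedQueue<>();
        queue.offer("a");
        queue.offer("b");
        String first = queue.poll();   // elements come out in insertion order
        String second = queue.poll();
        String none = queue.poll();    // null: the queue is empty now
        return first + "," + second + "," + none;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```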
Non-blocking algorithms. Summary
Non-blocking Queue.
Questions?
ArrayBlockingQueue
[Figure: backing array with slots 0, 1, 2, 3, 4, ..., N-1]
ArrayBlockingQueue.put()
void put(E e) throws InterruptedException {
    checkNotNull(e);
    final ReentrantLock lock = this.lock;
    lock.lockInterruptibly();
    try {
        while (count == items.length)
            notFull.await();
        final Object[] items = this.items;
        items[putIndex] = e;
        if (++putIndex == items.length)
            putIndex = 0;
        count++;
        notEmpty.signal();
    } finally {
        lock.unlock();
    }
}
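The blocking behaviour of `put` is easiest to see with a queue that is smaller than the number of elements. In the sketch below the producer has to wait on a full queue until the consumer takes something. The names `BoundedTransferSketch` and `transfer` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch only: one producer and one consumer over a small bounded queue.
final class BoundedTransferSketch {

    /** Sends 0..n-1 through a queue of the given capacity and returns what arrived. */
    static List<Integer> transfer(int capacity, int n) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < n; i++) {
                    queue.put(i);            // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        List<Integer> received = new ArrayList<>();
        try {
            for (int i = 0; i < n; i++) {
                received.add(queue.take());  // blocks while the queue is empty
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(transfer(2, 5));
    }
}
```

Every `put` and `take` goes through the same lock, so the array itself is never touched by two threads at once.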
Modifications
Optimistic Approach (Ladan-Mozes, Shavit, 2004, 2008)
Key IDEA: use Doubly Linked List to avoid 2nd CAS
http://people.csail.mit.edu/edya/publications/OptimisticFIFOQueue-journal.pdf
Baskets Queue (Hoffman, Shalev, Shavit, 2007)
http://people.csail.mit.edu/shanir/publications/Baskets%20Queue.pdf
• Throughput is better
• no FIFO any more
• usually you don’t need strong FIFO in real life
Summary
• Non-blocking algorithms are complicated
• Blocking algorithms are easier
• Correctness checking is difficult
• Difficult to support
• Sometimes non-blocking algorithms have better performance
Engineering is the art of trade-offs
Links & Books
Books
Links
• Nitsan Wakart: http://psy-lob-saw.blogspot.com/
• Alexey Shipilev: https://shipilev.net/
• concurrency-interest mailing list: http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
Q & A


Non-blocking Michael-Scott queue algorithm

  • 2. • Programming • Algorithms • Concurrency What is this talk about?
  • 3. • Programming • Algorithms • Concurrency Are you sure you need it? What is this talk about?
  • 5. For concurrency beginners Sorry Please go to another room For non-blocking programming beginners A short introduction
  • 6. For concurrency beginners Sorry Please go to another room For non-blocking programming beginners A short introduction For advanced concurrent programmers CAS-based queue algorithm
  • 7. You have another room! 12:10 Non-blocking Michael- Scott queue algorithm Alexey Fyodorov Easily scale enterprise applications using distributed data grids Ondrej Mihaly
  • 8. Main Models Shared Memory write + read Similar to how we program it Concurrent Programming
  • 9. Main Models Shared Memory Messaging write + read send + onReceive Similar to how we program it Similar to how a real hardware works Distributed Programming Concurrent Programming
  • 10. Advantages of Parallelism Resource utilization Utilization of several cores/CPUs aka PERFORMANCE
  • 11. Advantages of Parallelism Resource utilization Simplicity Complexity goes to magic frameworks • ArrayBlockingQueue • ConcurrentHashMap • Akka Utilization of several cores/CPUs aka PERFORMANCE
  • 12. Advantages of Parallelism Resource utilization Async handling Simplicity Utilization of several cores/CPUs aka PERFORMANCE Complexity goes to magic frameworks • ArrayBlockingQueue • ConcurrentHashMap • Akka Responsible services, Responsible UI
  • 14. Disadvantages of Locking • Deadlocks • Priority Inversion
  • 15. Disadvantages of Locking • Deadlocks • Priority Inversion • Reliability • What will happen if lock owner die?
  • 16. Disadvantages of Locking • Deadlocks • Priority Inversion • Reliability • What will happen if lock owner die? • Performance • Scheduler can push lock owner out • No parallelism inside a critical section!
  • 17. Amdahl’s Law α non-parallelizable part of the computation 1-α parallelizable part of the computation p number of threads
  • 18. Amdahl’s Law α non-parallelizable part of the computation 1-α parallelizable part of the computation p number of threads S = # α$ %&α '
  • 19. If-Modify-Write volatile int value = 0; Can we run it in multithreaded environment? if (value == 0) { value = 42; }
  • 20. If-Modify-Write volatile int value = 0; No atomicity if (value == 0) { value = 42; } }
  • 21. Compare-And-Set int value = 0; LOCK if (value == 0) { value = 42; } UNLOCK
  • 22. Introducing a Magic Operation value.compareAndSet(0, 42); int value = 0;
  • 23. Simulated CAS long value; synchronized long get() { return value; } synchronized long compareAndSwap(long expected, long newValue) { long oldValue = value; if (oldValue == expected) { value = newValue; } return oldValue; } synchronized boolean compareAndSet(long expected, long newValue) { return expected == compareAndSwap(expected, newValue); }
  • 24. Simulated CAS long value; synchronized long get() { return value; } synchronized long compareAndSwap(long expected, long newValue) { long oldValue = value; if (oldValue == expected) { value = newValue; } return oldValue; } synchronized boolean compareAndSet(long expected, long newValue) { return expected == compareAndSwap(expected, newValue); }
  • 25. Simulated CAS long value; synchronized long get() { return value; } synchronized long compareAndSwap(long expected, long newValue) { long oldValue = value; if (oldValue == expected) { value = newValue; } return oldValue; } synchronized boolean compareAndSet(long expected, long newValue) { return expected == compareAndSwap(expected, newValue); }
  • 26. Simulated CAS long value; synchronized long get() { return value; } synchronized long compareAndSwap(long expected, long newValue) { long oldValue = value; if (oldValue == expected) { value = newValue; } return oldValue; } synchronized boolean compareAndSet(long expected, long newValue){ return expected == compareAndSwap(expected, newValue); }
  • 27. Compare and Swap — Hardware Support compare-and-swap CAS load-link / store-conditional LL/SC cmpxchg ldrex/strex lwarx/stwcx
  • 28. Atomics in JDK AtomicReference • ref.get() • ref.compareAndSet(v1, v2) • ... AtomicLong • i.get() • i.compareAndSet(42, 43) • i.incrementAndGet(1) • i.getAndAdd(5) • ... java.util.concurrent.atomic
  • 29. Atomics in JDK AtomicReference • ref.get() • ref.compareAndSet(v1, v2) • ... AtomicLong • i.get() • i.compareAndSet(42, 43) • i.incrementAndGet(1) • i.getAndAdd(5) • ... java.util.concurrent.atomic
  • 30. Example. Atomic Counter AtomicLong value = new AtomicLong(); long get() { return value.get(); } void increment() { long v; do { v = value.get(); } while (!value.compareAndSet(v, v + 1)); }
  • 31. AtomicLong value = new AtomicLong(); long get() { return value.get(); } void increment() { long v; do { v = value.get(); } while (!value.compareAndSet(v, v + 1)); } Example. Atomic Counter
  • 34. Non-blocking Guarantees Wait-Free Per-thread progress is guaranteed Lock-Free Overall progress is guaranteed
  • 35. Non-blocking Guarantees Wait-Free Per-thread progress is guaranteed Lock-Free Overall progress is guaranteed Obstruction-Free Overall progress is guaranteed if threads don’t interfere with each other
  • 36. CAS-loop do { v = value.get(); } while (!value.compareAndSet(v, v + 1)); A. Wait-Free B. Lock-Free C. Obstruction-Free
  • 37. CAS-loop do { v = value.get(); } while (!value.compareAndSet(v, v + 1)); A. Wait-Free B. Lock-Free C. Obstruction-Free *for modern hardware supporting CAS or LL/SC
  • 39. class Node<E> { final E item; Node<E> next; Node(E item) { this.item = item; } } ...
  • 40. class Node<E> { final E item; Node<E> next; Node(E item) { this.item = item; } } E3 E1 E2
  • 47. E3 E1 E2 item2item1 Thread 1 Thread 2 We need a synchronization top
  • 49. void push(E item) { Node<E> newHead = new Node<>(item); Node<E> oldHead; do { oldHead = top.get(); newHead.next = oldHead; } while (!top.compareAndSet(oldHead, newHead)); } AtomicReference<Node<E>> top; E3 E1 E2 top
  • 59. E pop() { Node<E> newHead; Node<E> oldHead; do { oldHead = top.get(); if (oldHead == null) return null; newHead = oldHead.next; } while (!top.compareAndSet(oldHead, newHead)); return oldHead.item; } E3 E1 E2 top
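Putting the `Node`, `push` and `pop` slides together gives a complete non-blocking stack, a structure commonly known as the Treiber stack. The sketch below is assembled from the slide code; the class name `LockFreeStack` and the `main` method are assumptions added for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical wrapper class; push/pop follow the slides.
class LockFreeStack<E> {
    private static final class Node<E> {
        final E item;
        Node<E> next; // written before the node is published by CAS
        Node(E item) { this.item = item; }
    }

    private final AtomicReference<Node<E>> top = new AtomicReference<>();

    void push(E item) {
        Node<E> newHead = new Node<>(item);
        Node<E> oldHead;
        do {
            oldHead = top.get();
            newHead.next = oldHead;                        // link to the current top
        } while (!top.compareAndSet(oldHead, newHead));    // retry if top moved
    }

    E pop() {
        Node<E> oldHead;
        Node<E> newHead;
        do {
            oldHead = top.get();
            if (oldHead == null) {
                return null;                               // empty stack
            }
            newHead = oldHead.next;
        } while (!top.compareAndSet(oldHead, newHead));    // retry if top moved
        return oldHead.item;
    }

    public static void main(String[] args) {
        LockFreeStack<Integer> stack = new LockFreeStack<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        System.out.println(stack.pop()); // prints 3
        System.out.println(stack.pop()); // prints 2
        System.out.println(stack.pop()); // prints 1
        System.out.println(stack.pop()); // prints null
    }
}
```

The single `top` reference is the only shared mutable state, which is why one CAS per operation is enough here. The queue in the next slides needs two shared references, and that is where helping comes in.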
  • 62. Non-blocking queue: Michael and Scott, 1996. Threads help each other. https://www.research.ibm.com/people/m/michael/podc-1996.pdf
  • 63. class LinkedQueue<E> { static class Node<E> { final E item; final AtomicReference<Node<E>> next; Node(E item, Node<E> next) { this.item = item; this.next = new AtomicReference<>(next); } } Node<E> dummy = new Node<>(null, null); AtomicReference<Node<E>> head = new AtomicReference<>(dummy); AtomicReference<Node<E>> tail = new AtomicReference<>(dummy); }
  • 65. void put(E item) { Node<E> newNode = new Node<>(item, null); boolean success; do { Node<E> curTail = tail.get(); success = curTail.next.compareAndSet(null, newNode); tail.compareAndSet(curTail, curTail.next.get()); } while (!success); } tail dummy 1 2 head
  • 75. put(), case 1: curTail.next.CAS(null, newNode) returns true and tail.CAS(curTail, curTail.next.get()) returns true. No contention: we linked newNode after curTail and advanced tail ourselves. (CAS is slide shorthand for compareAndSet.)
  • 77. put(), case 2: the first CAS returns true, the second returns false. We linked newNode, but another thread has already advanced tail for us.
  • 79. put(), case 3: both CAS operations return false. Another thread linked its node ("another") first and tail has already moved past curTail, so we retry with the new tail.
  • 82. put(), case 4: the first CAS returns false, the second returns true. Another thread linked its node ("another") but has not advanced tail yet. We HELP by advancing tail for it, then retry.
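  • 84. Synchronization. Blocking (lock-based): lock + unlock; the invariant holds before and after the critical section. Non-blocking (CAS-based): CAS-loop; only a semi-invariant holds, e.g. in the queue, tail may lag one node behind the real last node until some thread advances it.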
  • 85. public void put(E item) { Node<E> newNode = new Node<>(item, null); while (true) { Node<E> currentTail = tail.get(); Node<E> tailNext = currentTail.next.get(); if (currentTail == tail.get()) { if (tailNext != null) { tail.compareAndSet(currentTail, tailNext); } else { if (currentTail.next.compareAndSet(null, newNode)) { tail.compareAndSet(currentTail, newNode); return; } } } } }
  • 86. public E poll() { while (true) { Node<E> first = head.get(); Node<E> last = tail.get(); Node<E> next = first.next.get(); if (first == head.get()) { if (first == last) { if (next == null) return null; tail.compareAndSet(last, next); } else { E item = next.item; if (head.compareAndSet(first, next)) return item; } } } }
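The `put` and `poll` methods from the last two slides can be combined with the node definition into one runnable class. This is a sketch under a few assumptions: the class name `MichaelScottQueue`, the one-argument `Node` constructor (each node creates its own `AtomicReference` for `next`) and the `main` method are not from the talk.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical class name; put/poll follow the slide code.
class MichaelScottQueue<E> {
    private static final class Node<E> {
        final E item;
        final AtomicReference<Node<E>> next = new AtomicReference<>(null);
        Node(E item) { this.item = item; }
    }

    private final AtomicReference<Node<E>> head;
    private final AtomicReference<Node<E>> tail;

    MichaelScottQueue() {
        Node<E> dummy = new Node<>(null);          // head always points to a dummy node
        head = new AtomicReference<>(dummy);
        tail = new AtomicReference<>(dummy);
    }

    void put(E item) {
        Node<E> newNode = new Node<>(item);
        while (true) {
            Node<E> currentTail = tail.get();
            Node<E> tailNext = currentTail.next.get();
            if (currentTail == tail.get()) {       // is the snapshot still consistent?
                if (tailNext != null) {
                    // tail is lagging: help the other thread, then retry
                    tail.compareAndSet(currentTail, tailNext);
                } else if (currentTail.next.compareAndSet(null, newNode)) {
                    // node is linked; advancing tail may fail if someone helped us
                    tail.compareAndSet(currentTail, newNode);
                    return;
                }
            }
        }
    }

    E poll() {
        while (true) {
            Node<E> first = head.get();
            Node<E> last = tail.get();
            Node<E> next = first.next.get();
            if (first == head.get()) {             // is the snapshot still consistent?
                if (first == last) {
                    if (next == null) {
                        return null;               // queue is empty
                    }
                    tail.compareAndSet(last, next); // tail is lagging: help
                } else {
                    E item = next.item;            // read before the node becomes the dummy
                    if (head.compareAndSet(first, next)) {
                        return item;
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MichaelScottQueue<Integer> queue = new MichaelScottQueue<>();
        Thread[] producers = new Thread[4];
        for (int i = 0; i < producers.length; i++) {
            producers[i] = new Thread(() -> {
                for (int j = 0; j < 1_000; j++) {
                    queue.put(j);
                }
            });
            producers[i].start();
        }
        for (Thread t : producers) {
            t.join();
        }
        int count = 0;
        while (queue.poll() != null) {
            count++;
        }
        System.out.println(count); // prints 4000
    }
}
```

Because Java is garbage collected, a node cannot be reused while some thread still holds a reference to it, so this sketch does not need the counted pointers that the original paper uses against the ABA problem.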
  • 87. Non-blocking queue in the JDK: ConcurrentLinkedQueue is based on the Michael-Scott queue
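For everyday use there is no need to hand-write the algorithm. A minimal usage sketch of the JDK class follows; the demo class name `ClqDemo` is an assumption for illustration.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical demo class; only standard java.util.concurrent API is used.
class ClqDemo {
    public static void main(String[] args) {
        Queue<String> queue = new ConcurrentLinkedQueue<>();
        queue.offer("a");                  // never blocks, the queue is unbounded
        queue.offer("b");
        System.out.println(queue.poll());  // prints a
        System.out.println(queue.poll());  // prints b
        System.out.println(queue.poll());  // prints null (empty, no waiting)
    }
}
```

Note that `poll()` on an empty queue returns `null` immediately instead of waiting, the same behavior as the `poll()` on the previous slide.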
  • 88. Non-blocking algorithms. Summary: based on CAS-like operations; use the CAS-loop pattern; threads help one another
  • 92. ArrayBlockingQueue.put(): void put(E e) throws InterruptedException { checkNotNull(e); final ReentrantLock lock = this.lock; lock.lockInterruptibly(); try { while (count == items.length) notFull.await(); final Object[] items = this.items; items[putIndex] = e; if (++putIndex == items.length) putIndex = 0; count++; notEmpty.signal(); } finally { lock.unlock(); } }
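For comparison with the CAS-based queue, the lock-based pattern on this slide can be sketched as a small standalone class. This is not the JDK source: `BoundedQueue`, its `take` method and the `main` method are assumptions written for illustration, modeled on the `put` shown above.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical simplified bounded queue, not the real ArrayBlockingQueue.
class BoundedQueue<E> {
    private final Object[] items;
    private int putIndex, takeIndex, count;
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Condition notFull = lock.newCondition();

    BoundedQueue(int capacity) {
        items = new Object[capacity];
    }

    void put(E e) throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (count == items.length) {
                notFull.await();           // block while the buffer is full
            }
            items[putIndex] = e;
            if (++putIndex == items.length) putIndex = 0;
            count++;
            notEmpty.signal();             // wake up one waiting consumer
        } finally {
            lock.unlock();
        }
    }

    @SuppressWarnings("unchecked")
    E take() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (count == 0) {
                notEmpty.await();          // block while the buffer is empty
            }
            E e = (E) items[takeIndex];
            items[takeIndex] = null;
            if (++takeIndex == items.length) takeIndex = 0;
            count--;
            notFull.signal();              // wake up one waiting producer
            return e;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedQueue<Integer> queue = new BoundedQueue<>(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 10; i++) queue.put(i);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        for (int i = 0; i < 10; i++) sum += queue.take();
        producer.join();
        System.out.println(sum); // prints 45
    }
}
```

Everything between `lock` and `unlock` runs with the invariant fully restored on exit, which is what makes this version easier to reason about. The price is that only one thread at a time can be inside `put` or `take`.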
  • 95. Optimistic Approach: Ladan-Mozes, Shavit, 2004, 2008. Key idea: use a doubly linked list to avoid the 2nd CAS. http://people.csail.mit.edu/edya/publications/OptimisticFIFOQueue-journal.pdf
  • 96. Baskets Queue: Hoffman, Shalev, Shavit, 2007. http://people.csail.mit.edu/shanir/publications/Baskets%20Queue.pdf
  • 97. Baskets Queue: throughput is better; no strict FIFO any more, since enqueues that overlap in time may be ordered arbitrarily among themselves; usually you don't need strict FIFO in real life
  • 101. Summary: non-blocking algorithms are complicated (correctness checking is difficult, and they are difficult to support); sometimes they perform better; blocking algorithms are easier. Engineering is the art of trade-offs
  • 103. Books
  • 104. Links • Nitsan Wakart: http://psy-lob-saw.blogspot.com/ • Alexey Shipilev: https://shipilev.net/ • concurrency-interest mailing list: http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
  • 105. Q & A