SlideShare a Scribd company logo
CQRS and Event
Sourcing Applications
with Cassandra_
Matthias Niehoff
#CassandraSummit 2015
1
! The Use Case
! Event Sourcing
! CQRS
! Cassandra for Storage
! Spark for Processing
! Benefits & Pitfalls
! Q&A
Agenda_
2
The Use Case
3
24x7 Proxy_
4
LegacySystems

(Not24x7)
“InternetReady“
Applications
(24x7available)
24x7 Proxy
•Caches data
•Provides data
•Stores changes
•Provides changes
•No business logic/validation
•Solution needs to be highly scalable 

(up to 100.000 reads/s, 10.000 writes/s)
•Read and write access needs to be low latency
•Read/write ratio is 10:1 or higher
•Solution needs to deal with up to 500.000.000
customers
Assumptions_
5
Event Sourcing
6
Traditional Pattern: Saving Application State_
7
Store
ID
Address
Article
Name
StockSize updateInventory()
getInventory()
sells
A series of sales and replenishments for
• a tablet
• Starting with 60, sell 20, replenish 10
• a stove
• Starting with 25, sell 5, no replenishments
What is different with Event Sourcing?_
8
Saving only application state
What is the Difference?_
9
:ArticleInventory
Fancy Tablet
50
:ArticleInventory
Gas Stove
20
Saving events instead of state
What is the Difference?_
10
:ArticleInventory
Fancy Tablet
39
15-08-14T19:..
:ArticleInventory
Gas Stove
20
15-08-14T19:..
:ArticleInventory
Fancy Tablet
45
15-08-14T19:..
:ArticleInventory
Gas Stove
20
15-08-14T19:..
:ArticleInventory
Fancy Tablet
50
15-08-14T19:..
:ArticleInventory
Gas Stove
20
15-08-14T19:..
•Log of all stock changes
•Complete rebuild of the state
•Temporal query
•Event replay and rollback
Benefits of Storing Events_
11
CQRS
12
Default Application Architecture_
13
UserInterface
DomainModel
ApplicationServices
DB
CQRS Application Architecture_
14
UserInterface
Query
Services
Command
Services
DomainModel
DB
•The pattern is simple
•Going further
• Split up the domain model
• Independent scaling of models
• Not using a query model at all
• Different databases for models
A Pattern Changing Your Mindset_
15
Event Sourcing & CQRS_
16
Command
Services
Command
Model
ReadLayer
Query
Services
Query
Services
Query
Services Asynchronous
DB
Event Store
Query
Stores
ProcessorEvent
Processor
DB
DB
DB
Storage with Cassandra
17
•Not only an event sink
• Compaction
• Selective replay
•No single point of failure
•Horizontal scale & Geo Replication
•Write ahead of unmodified data
•Plays well with further processing
•Open source & a huge community
•Easy operations
Why Cassandra…
18
For accessing all entities of a given type
Event Store_
19
CREATE TABLE event_source_by_type (
entity_type TEXT,
bucket INT,
entity_key TEXT,
insert_time TIMESTAMP,
update_time TIMESTAMP,
payload TEXT,
PRIMARY KEY((entity_type,bucket),insert_time,entity_key)
) 

WITH CLUSTERING ORDER BY (created_at DESC,entity_key ASC);
e.g. as JSON, XML, protobuf, Avro
prevent huge partitions
CREATE TABLE event_source_by_key (
entity_type TEXT,
entity_key TEXT,
insert_time TIMESTAMP,
update_time TIMESTAMP,
payload TEXT,
PRIMARY KEY((entity_type,entity_key),created_at)
) 

WITH CLUSTERING ORDER BY (created_at DESC);
For accessing an entity directly
Optional: Second Table_
20
e.g. as JSON, XML or protobuf
•Create tables that fit your queries!
•E.g. „Get articles in category ‚computer‘“
Query Stores_
21
CREATE TABLE articles_by_category (
category TEXT PRIMARY KEY,
article_id UUID,
article_info TEXT
);
may need bucketing
could also be a
JSON document
Query Stores_
22
„I need ad-hoc queries“
„I need specific queries with
a lot of different filters“
Query Stores_
23
Processing with Spark
24
•Command model triggers event processor
•Event processor updates query views
From Event Store to Query Store_
25
Command
Model
Event
Processor DB
DB
DB
Event
Processor
Event
Processor
Event Processing in Detail_
26
Command
Model DB
DB
DB
•Easy scale out
•Easy deployment
•Intuitive Scala & Java API
•Fault tolerant
•Out-of-the-box Kafka adapter
•Integrates well with Cassandra
Why Spark?
27
•Spark Streaming application
•Consumes only topics of interest
•Joins the stream of events with the current view
• Use primary key of entity for correlation
• Use joinWithCassandraTable
Spark Job in Detail_
28
1. Create a table for the query view
2. Create a Spark job filling your table
3. Deploy the Spark job
4. Init reprocess of the event DB
• same transformation logic as in normal processing
• source can be different
5. Mark view as initialized
If you need a new query view_
29
Query
DB
Event
DB
Benefits &
Pitfalls
30
•Scalability
• On storage & processing: just add nodes
• Efficient queries due to separation
•Collaboration
• Every client gets its own data access
• Easy to support new queries
Benefits_
31
•More complexity than simple CRUD
•Side effects on event replay
•Eventual consistency in query views
•Concurrent writes
•Performance of replay
Pitfalls_
32
Lost Updates
•Due to parallel processing
• Two events A and B as sequential input
• A is processed after B
•Solution
• Partition Spark RDD by entity key
• Use a lambda architecture
Pitfalls_
33
speed
Data
Stream
Serving
Layer
batch
•Event Store Compaction
• Compact store to improve processing time
• Only store latest entry of a entity key
• e.g. a Spark batch job / Cassandra TTL
•Snapshot / Master State
• Constantly build a complete state of all data
• Can be used
• To speed up initialization
• As a store for a search engine
Pitfalls_
34
The Use Case
Solved with ES & CQRS
35
24x7 Proxy
24x7 Proxy_
36
LegacyCoreSystems

(Not24x7)
“InternetReady“Applications
(24x7available)
37
Questions?
Thank You!
Matthias Niehoff,
IT-Consultant
codecentric AG
Zeppelinstraße 2
76185 Karlsruhe, Germany
www.codecentric.de
blog.codecentric.de
matthiasniehoff
38

More Related Content

PPTX
Kafka 101
PPTX
Flink SQL & TableAPI in Large Scale Production at Alibaba
PDF
Kafka Streams: What it is, and how to use it?
PPTX
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
PPTX
Introduction to Apache Flink
PDF
Event Sourcing with Cassandra (from Cassandra Japan Meetup in Tokyo March 2016)
PDF
Deep Dive: Memory Management in Apache Spark
PPTX
Kafka Tutorial: Kafka Security
Kafka 101
Flink SQL & TableAPI in Large Scale Production at Alibaba
Kafka Streams: What it is, and how to use it?
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Introduction to Apache Flink
Event Sourcing with Cassandra (from Cassandra Japan Meetup in Tokyo March 2016)
Deep Dive: Memory Management in Apache Spark
Kafka Tutorial: Kafka Security

What's hot (20)

PDF
Unlocking the Power of Apache Flink: An Introduction in 4 Acts
PPSX
Event Sourcing & CQRS, Kafka, Rabbit MQ
PPTX
Kafka at Peak Performance
PDF
Kafka Streams State Stores Being Persistent
PPTX
Programming in Spark using PySpark
PPTX
Introduction to Data Engineering
PDF
From Zero to Hero with Kafka Connect
PDF
Building a fully managed stream processing platform on Flink at scale for Lin...
PDF
MongodB Internals
PPTX
Deep Dive into Apache Kafka
PPTX
01. Kubernetes-PPT.pptx
PDF
How Apache Kafka® Works
PDF
A Deep Dive into Kafka Controller
PPTX
Autoscaling Flink with Reactive Mode
PDF
Productizing Structured Streaming Jobs
PPTX
Microservices Part 3 Service Mesh and Kafka
PDF
Apache Kafka Fundamentals for Architects, Admins and Developers
PPTX
Kafka Tutorial - introduction to the Kafka streaming platform
PPTX
HBase and HDFS: Understanding FileSystem Usage in HBase
PPTX
What is Change Data Capture (CDC) and Why is it Important?
Unlocking the Power of Apache Flink: An Introduction in 4 Acts
Event Sourcing & CQRS, Kafka, Rabbit MQ
Kafka at Peak Performance
Kafka Streams State Stores Being Persistent
Programming in Spark using PySpark
Introduction to Data Engineering
From Zero to Hero with Kafka Connect
Building a fully managed stream processing platform on Flink at scale for Lin...
MongodB Internals
Deep Dive into Apache Kafka
01. Kubernetes-PPT.pptx
How Apache Kafka® Works
A Deep Dive into Kafka Controller
Autoscaling Flink with Reactive Mode
Productizing Structured Streaming Jobs
Microservices Part 3 Service Mesh and Kafka
Apache Kafka Fundamentals for Architects, Admins and Developers
Kafka Tutorial - introduction to the Kafka streaming platform
HBase and HDFS: Understanding FileSystem Usage in HBase
What is Change Data Capture (CDC) and Why is it Important?
Ad

Viewers also liked (9)

PPTX
Unit tests benefits
PDF
Microservice Architecture with CQRS and Event Sourcing
ODP
Event sourcing with Eventuate
PPTX
Moving Beyond Lambda Architectures with Apache Kudu
PDF
CQRS and Event Sourcing with Akka, Cassandra and RabbitMQ
PDF
Developing event-driven microservices with event sourcing and CQRS (svcc, sv...
PPTX
Going Serverless with CQRS on AWS
PDF
Akka persistence == event sourcing in 30 minutes
PPTX
CQRS and Event Sourcing, An Alternative Architecture for DDD
Unit tests benefits
Microservice Architecture with CQRS and Event Sourcing
Event sourcing with Eventuate
Moving Beyond Lambda Architectures with Apache Kudu
CQRS and Event Sourcing with Akka, Cassandra and RabbitMQ
Developing event-driven microservices with event sourcing and CQRS (svcc, sv...
Going Serverless with CQRS on AWS
Akka persistence == event sourcing in 30 minutes
CQRS and Event Sourcing, An Alternative Architecture for DDD
Ad

Similar to codecentric AG: CQRS and Event Sourcing Applications with Cassandra (20)

PDF
Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...
PPT
Scaling Web Applications with Cassandra Presentation.ppt
PDF
Spark and cassandra (Hulu Talk)
PDF
Cassandra & Spark for IoT
PDF
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
PDF
Apache cassandra & apache spark for time series data
PPT
Scaling Web Applications with Cassandra Presentation (1).ppt
PDF
Event Sourcing without any Framework
PPTX
5 Ways to Use Spark to Enrich your Cassandra Environment
PPTX
Introduction to CQRS and Event Sourcing
PDF
Analytics with Cassandra & Spark
PPTX
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
PDF
Analyzing_Data_with_Spark_and_Cassandra
PDF
Springone2gx 2015 Cassandra and Grails
PDF
Owning time series with team apache Strata San Jose 2015
PDF
Getting started with Spark & Cassandra by Jon Haddad of Datastax
PDF
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
PDF
Introduction to Apache Cassandra
PDF
Kafka spark cassandra webinar feb 16 2016
PDF
Kafka spark cassandra webinar feb 16 2016
Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...
Scaling Web Applications with Cassandra Presentation.ppt
Spark and cassandra (Hulu Talk)
Cassandra & Spark for IoT
Lambda Architecture with Spark, Spark Streaming, Kafka, Cassandra, Akka and S...
Apache cassandra & apache spark for time series data
Scaling Web Applications with Cassandra Presentation (1).ppt
Event Sourcing without any Framework
5 Ways to Use Spark to Enrich your Cassandra Environment
Introduction to CQRS and Event Sourcing
Analytics with Cassandra & Spark
Виталий Бондаренко "Fast Data Platform for Real-Time Analytics. Architecture ...
Analyzing_Data_with_Spark_and_Cassandra
Springone2gx 2015 Cassandra and Grails
Owning time series with team apache Strata San Jose 2015
Getting started with Spark & Cassandra by Jon Haddad of Datastax
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Introduction to Apache Cassandra
Kafka spark cassandra webinar feb 16 2016
Kafka spark cassandra webinar feb 16 2016

More from DataStax Academy (20)

PDF
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
PPTX
Introduction to DataStax Enterprise Graph Database
PPTX
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
PPTX
Cassandra on Docker @ Walmart Labs
PDF
Cassandra 3.0 Data Modeling
PPTX
Cassandra Adoption on Cisco UCS & Open stack
PDF
Data Modeling for Apache Cassandra
PDF
Coursera Cassandra Driver
PDF
Production Ready Cassandra
PDF
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
PPTX
Cassandra @ Sony: The good, the bad, and the ugly part 1
PPTX
Cassandra @ Sony: The good, the bad, and the ugly part 2
PDF
Standing Up Your First Cluster
PDF
Real Time Analytics with Dse
PDF
Introduction to Data Modeling with Apache Cassandra
PDF
Cassandra Core Concepts
PPTX
Enabling Search in your Cassandra Application with DataStax Enterprise
PPTX
Bad Habits Die Hard
PDF
Advanced Data Modeling with Apache Cassandra
PDF
Advanced Cassandra
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Cassandra on Docker @ Walmart Labs
Cassandra 3.0 Data Modeling
Cassandra Adoption on Cisco UCS & Open stack
Data Modeling for Apache Cassandra
Coursera Cassandra Driver
Production Ready Cassandra
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 2
Standing Up Your First Cluster
Real Time Analytics with Dse
Introduction to Data Modeling with Apache Cassandra
Cassandra Core Concepts
Enabling Search in your Cassandra Application with DataStax Enterprise
Bad Habits Die Hard
Advanced Data Modeling with Apache Cassandra
Advanced Cassandra

Recently uploaded (20)

PDF
Getting Started with Data Integration: FME Form 101
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PPTX
Machine Learning_overview_presentation.pptx
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PPTX
TechTalks-8-2019-Service-Management-ITIL-Refresh-ITIL-4-Framework-Supports-Ou...
PDF
Univ-Connecticut-ChatGPT-Presentaion.pdf
PDF
A comparative analysis of optical character recognition models for extracting...
PPTX
Spectroscopy.pptx food analysis technology
PDF
Machine learning based COVID-19 study performance prediction
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PPTX
cloud_computing_Infrastucture_as_cloud_p
PDF
Empathic Computing: Creating Shared Understanding
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PPT
Teaching material agriculture food technology
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
gpt5_lecture_notes_comprehensive_20250812015547.pdf
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Getting Started with Data Integration: FME Form 101
MIND Revenue Release Quarter 2 2025 Press Release
Per capita expenditure prediction using model stacking based on satellite ima...
Machine Learning_overview_presentation.pptx
Advanced methodologies resolving dimensionality complications for autism neur...
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
TechTalks-8-2019-Service-Management-ITIL-Refresh-ITIL-4-Framework-Supports-Ou...
Univ-Connecticut-ChatGPT-Presentaion.pdf
A comparative analysis of optical character recognition models for extracting...
Spectroscopy.pptx food analysis technology
Machine learning based COVID-19 study performance prediction
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
cloud_computing_Infrastucture_as_cloud_p
Empathic Computing: Creating Shared Understanding
Reach Out and Touch Someone: Haptics and Empathic Computing
Teaching material agriculture food technology
Mobile App Security Testing_ A Comprehensive Guide.pdf
Diabetes mellitus diagnosis method based random forest with bat algorithm
gpt5_lecture_notes_comprehensive_20250812015547.pdf
Build a system with the filesystem maintained by OSTree @ COSCUP 2025

codecentric AG: CQRS and Event Sourcing Applications with Cassandra

  • 1. CQRS and Event Sourcing Applications with Cassandra_ Matthias Niehoff #CassandraSummit 2015 1
  • 2. ! The Use Case ! Event Sourcing ! CQRS ! Cassandra for Storage ! Spark for Processing ! Benefits & Pitfalls ! Q&A Agenda_ 2
  • 4. 24x7 Proxy_ 4 LegacySystems
 (Not24x7) “InternetReady“ Applications (24x7available) 24x7 Proxy •Caches data •Provides data •Stores changes •Provides changes •No business logic/validation
  • 5. •Solution needs to be highly scalable 
 (up to 100.000 reads/s, 10.000 writes/s) •Read and write access needs to be low latency •Read/write ratio is 10:1 or higher •Solution needs to deal with up to 500.000.000 customers Assumptions_ 5
  • 7. Traditional Pattern: Saving Application State_ 7 Store ID Address Article Name StockSize updateInventory() getInventory() sells
  • 8. A series of sales and replenishments for • a tablet • Starting with 60, sell 20, replenish 10 • a stove • Starting with 25, sell 5, no replenishments What is different with Event Sourcing?_ 8
  • 9. Saving only application state What is the Difference?_ 9 :ArticleInventory Fancy Tablet 50 :ArticleInventory Gas Stove 20
  • 10. Saving events instead of state What is the Difference?_ 10 :ArticleInventory Fancy Tablet 39 15-08-14T19:.. :ArticleInventory Gas Stove 20 15-08-14T19:.. :ArticleInventory Fancy Tablet 45 15-08-14T19:.. :ArticleInventory Gas Stove 20 15-08-14T19:.. :ArticleInventory Fancy Tablet 50 15-08-14T19:.. :ArticleInventory Gas Stove 20 15-08-14T19:..
  • 11. •Log of all stock changes •Complete rebuild of the state •Temporal query •Event replay and rollback Benefits of Storing Events_ 11
  • 15. •The pattern is simple •Going further • Split up the domain model • Independent scaling of models • Not using a query model at all • Different databases for models A Pattern Changing Your Mindset_ 15
  • 16. Event Sourcing & CQRS_ 16 Command Services Command Model ReadLayer Query Services Query Services Query Services Asynchronous DB Event Store Query Stores ProcessorEvent Processor DB DB DB
  • 18. •Not only an event sink • Compaction • Selective replay •No single point of failure •Horizontal scale & Geo Replication •Write ahead of unmodified data •Plays well with further processing •Open source & a huge community •Easy operations Why Cassandra… 18
  • 19. For accessing all entities of a given type Event Store_ 19 CREATE TABLE event_source_by_type ( entity_type TEXT, bucket INT, entity_key TEXT, insert_time TIMESTAMP, update_time TIMESTAMP, payload TEXT, PRIMARY KEY((entity_type,bucket),insert_time,entity_key) ) 
 WITH CLUSTERING ORDER BY (created_at DESC,entity_key ASC); e.g. as JSON, XML, protobuf, Avro prevent huge partitions
  • 20. CREATE TABLE event_source_by_key ( entity_type TEXT, entity_key TEXT, insert_time TIMESTAMP, update_time TIMESTAMP, payload TEXT, PRIMARY KEY((entity_type,entity_key),created_at) ) 
 WITH CLUSTERING ORDER BY (created_at DESC); For accessing an entity directly Optional: Second Table_ 20 e.g. as JSON, XML or protobuf
  • 21. •Create tables that fit your queries! •E.g. „Get articles in category ‚computer‘“ Query Stores_ 21 CREATE TABLE articles_by_category ( category TEXT PRIMARY KEY, article_id UUID, article_info TEXT ); may need bucketing could also be a JSON document
  • 22. Query Stores_ 22 „I need ad-hoc queries“ „I need specific queries with a lot of different filters“
  • 25. •Command model triggers event processor •Event processor updates query views From Event Store to Query Store_ 25 Command Model Event Processor DB DB DB Event Processor Event Processor
  • 26. Event Processing in Detail_ 26 Command Model DB DB DB
  • 27. •Easy scale out •Easy deployment •Intuitive Scala & Java API •Fault tolerant •Out-of-the-box Kafka adapter •Integrates well with Cassandra Why Spark? 27
  • 28. •Spark Streaming application •Consumes only topics of interest •Joins the stream of events with the current view • Use primary key of entity for correlation • Use joinWithCassandraTable Spark Job in Detail_ 28
  • 29. 1. Create a table for the query view 2. Create a Spark job filling your table 3. Deploy the Spark job 4. Init reprocess of the event DB • same transformation logic as in normal processing • source can be different 5. Mark view as initialized If you need a new query view_ 29 Query DB Event DB
  • 31. •Scalability • On storage & processing: just add nodes • Efficient queries due to separation •Collaboration • Every client gets its own data access • Easy to support new queries Benefits_ 31
  • 32. •More complexity than simple CRUD •Side effects on event replay •Eventual consistency in query views •Concurrent writes •Performance of replay Pitfalls_ 32
  • 33. Lost Updates •Due to parallel processing • Two events A and B as sequential input • A is processed after B •Solution • Partition Spark RDD by entity key • Use a lambda architecture Pitfalls_ 33 speed Data Stream Serving Layer batch
  • 34. •Event Store Compaction • Compact store to improve processing time • Only store latest entry of a entity key • e.g. a Spark batch job / Cassandra TTL •Snapshot / Master State • Constantly build a complete state of all data • Can be used • To speed up initialization • As a store for a search engine Pitfalls_ 34
  • 35. The Use Case Solved with ES & CQRS 35
  • 38. Thank You! Matthias Niehoff, IT-Consultant codecentric AG Zeppelinstraße 2 76185 Karlsruhe, Germany www.codecentric.de blog.codecentric.de matthiasniehoff 38