1
Cassandra 2.2 and 3.0
new features
DuyHai DOAN
Apache Cassandra Technical Evangelist
#VoxxedBerlin @doanduyhai
Datastax
2
•  Founded in April 2010
•  We contribute a lot to Apache Cassandra™
•  400+ customers (25 of the Fortune 100), 450+ employees
•  Headquarters in the San Francisco Bay Area
•  EU headquarters in London, offices in France and Germany
•  Datastax Enterprise = OSS Cassandra + extra features
Materialized Views (MV)
•  Why ?
•  Detailed Impl
•  Gotchas
Why Materialized Views ?
•  Relieve the pain of manual denormalization
CREATE TABLE user(
id int PRIMARY KEY,
country text,
…
);
CREATE TABLE user_by_country(
country text,
id int,
…,
PRIMARY KEY(country, id)
);
4
CREATE TABLE user_by_country (
country text,
id int,
firstname text,
lastname text,
PRIMARY KEY(country, id));
Materialized View In Action
CREATE MATERIALIZED VIEW user_by_country
AS SELECT country, id, firstname, lastname
FROM user
WHERE country IS NOT NULL AND id IS NOT NULL
PRIMARY KEY(country, id)
5
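As a rough mental model (not Cassandra's actual implementation), the view can be thought of as the base table re-keyed by (country, id), keeping only rows whose view key columns are non-null. The Python sketch below uses in-memory dictionaries; the sample rows and the `build_view` helper are made up for illustration.

```python
# Toy model: a base table keyed by id, and the view derived from it.
# The rows below are illustrative sample data, not taken from the talk.
user = {
    1: {"country": "FR", "firstname": "Jean", "lastname": "Dupont"},
    2: {"country": "US", "firstname": "John", "lastname": "Doe"},
    3: {"country": None, "firstname": "Ann", "lastname": "Lee"},
}

def build_view(base):
    """Re-key base rows by (country, id), like PRIMARY KEY(country, id)."""
    view = {}
    for user_id, row in base.items():
        # Mirrors: WHERE country IS NOT NULL AND id IS NOT NULL
        if row["country"] is None or user_id is None:
            continue
        view[(row["country"], user_id)] = {
            "firstname": row["firstname"],
            "lastname": row["lastname"],
        }
    return view

user_by_country = build_view(user)

# Equivalent of: SELECT id FROM user_by_country WHERE country = 'FR'
french_users = [uid for (country, uid) in sorted(user_by_country) if country == "FR"]
```

The row with a null country never appears in the view, which is why the IS NOT NULL conditions are required on every view primary key column.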
Materialized View Syntax
CREATE MATERIALIZED VIEW [IF NOT EXISTS]
keyspace_name.view_name
AS SELECT column1, column2, ...
FROM keyspace_name.table_name
WHERE column1 IS NOT NULL AND column2 IS NOT NULL ...
PRIMARY KEY(column1, column2, ...)
SELECT clause:
•  must select all primary key columns of the base table
WHERE clause:
•  only IS NOT NULL conditions for now
•  more complex conditions in the future
PRIMARY KEY clause:
•  at least all primary key columns of the base table (ordering can be different)
•  at most 1 column that is NOT part of the base table's primary key
6
Materialized Views Demo
7
Materialized View Impl
UPDATE user
SET country='FR'
WHERE id=1
[Diagram: ring of eight Cassandra (C*) nodes, repeated on each slide with the current step highlighted]
① Send mutation to all replicas, then wait for ack(s) according to the CL
② Acquire local lock on base table partition
③ Local read to fetch current values:
SELECT * FROM user WHERE id=1
④ Create BatchLog with
•  DELETE FROM user_by_country WHERE country = 'old_value'
•  INSERT INTO user_by_country(country, id, …) VALUES('FR', 1, ...)
⑤ Execute async BatchLog to paired view replica with CL = ONE
⑥ Apply base table update locally: SET country='FR'
⑦ Release local lock
⑧ Return ack to coordinator
⑨ If CL ack(s) received, ack client
16
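The replica-side part of this write path (lock, read, build view mutations, apply, unlock, ack) can be sketched in Python. This is a single-process toy under stated assumptions: `base`, `view`, `partition_locks`, `apply_update` and `send_to_view_replica` are hypothetical names, and the view mutations are applied synchronously here whereas the talk describes them as asynchronous.

```python
import threading

base = {1: {"country": "UK"}}             # base table: id -> row
view = {("UK", 1)}                        # view rows keyed by (country, id)
partition_locks = {1: threading.Lock()}   # one local lock per base partition

def send_to_view_replica(batch):
    """Stand-in for shipping the batch to the paired view replica."""
    for op, key in batch:
        if op == "DELETE":
            view.discard(key)
        else:
            view.add(key)

def apply_update(user_id, new_country):
    """Replica-side handling of: UPDATE user SET country=? WHERE id=?"""
    with partition_locks[user_id]:                 # acquire local lock
        old_country = base[user_id]["country"]     # local read of current values
        batch = [                                  # view mutations for the batchlog
            ("DELETE", (old_country, user_id)),
            ("INSERT", (new_country, user_id)),
        ]
        send_to_view_replica(batch)                # asynchronous in Cassandra
        base[user_id]["country"] = new_country     # apply base table update locally
    # the lock is released when the with-block exits
    return "ack"                                   # ack returned to the coordinator

result = apply_update(1, "FR")
```

Each base update produces two view mutations (DELETE + INSERT) per view, which is the write amplification discussed in the performance slides.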
MV Failure Cases: concurrent updates
Two concurrent updates to the same base row:
1) UPDATE … SET country='US'
2) UPDATE … SET country='FR'

Without local lock (the two updates interleave between t0 and t2)
Update 1:
•  Read base row (country='UK')
•  DELETE FROM mv WHERE country='UK'
•  INSERT INTO mv …(country) VALUES('US')
•  Send async BatchLog
•  Apply update country='US'
Update 2:
•  Read base row (country='UK')
•  DELETE FROM mv WHERE country='UK'
•  INSERT INTO mv …(country) VALUES('FR')
•  Send async BatchLog
•  Apply update country='FR'
Result: both inserts are applied to the view, which now holds two rows for the same user:
INSERT INTO mv …(country) VALUES('US')
INSERT INTO mv …(country) VALUES('FR')

With local lock (update 2 waits until update 1 releases the lock)
Update 1:
•  🔒 Acquire lock
•  Read base row (country='UK')
•  DELETE FROM mv WHERE country='UK'
•  INSERT INTO mv …(country) VALUES('US')
•  Send async BatchLog
•  Apply update country='US'
•  🔓 Release lock
Update 2:
•  🔒 Acquire lock
•  Read base row (country='US')
•  DELETE FROM mv WHERE country='US'
•  INSERT INTO mv …(country) VALUES('FR')
•  Send async BatchLog
•  Apply update country='FR'
•  🔓 Release lock
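The race can be reproduced deterministically with a small Python simulation. It is only a sketch: `read_phase` and `write_phase` are hypothetical helpers, the interleaving is forced by call order rather than by real threads, and the view is modelled as a set of (country, id) keys.

```python
def read_phase(base, user_id, new_country):
    """Local read, then build the view mutations (DELETE old, INSERT new)."""
    old_country = base[user_id]
    return [("DELETE", (old_country, user_id)), ("INSERT", (new_country, user_id))]

def write_phase(base, view, user_id, new_country, batch):
    """Apply the view mutations, then the base table update."""
    for op, key in batch:
        if op == "DELETE":
            view.discard(key)
        else:
            view.add(key)
    base[user_id] = new_country

# Without local lock: both updates read country='UK' before either one writes.
base, view = {1: "UK"}, {("UK", 1)}
batch_us = read_phase(base, 1, "US")
batch_fr = read_phase(base, 1, "FR")
write_phase(base, view, 1, "US", batch_us)
write_phase(base, view, 1, "FR", batch_fr)
view_without_lock = set(view)

# With local lock: the second update cannot read until the first has finished.
base, view = {1: "UK"}, {("UK", 1)}
write_phase(base, view, 1, "US", read_phase(base, 1, "US"))
write_phase(base, view, 1, "FR", read_phase(base, 1, "FR"))
view_with_lock = set(view)
```

In the unlocked run the second update deletes the 'UK' row again instead of the 'US' row, so the stale ('US', 1) entry survives in the view. Serializing read and write per partition removes it.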
MV Failure Cases: failed updates to MV
UPDATE user
SET country='FR'
WHERE id=1
[Diagram: ring of eight Cassandra (C*) nodes]
•  ⑤ Execute async BatchLog to paired view replica with CL = ONE ✘ MV replica down
•  MV replica up → BatchLog replay
21
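The idea behind the replay can be sketched as a queue that keeps every batch until the view replica has applied it. The Python below is a simplified model, not Cassandra's batchlog: `ViewReplica` and `LocalBatchlog` are hypothetical classes, and ordering between batches that fail and batches that succeed is ignored.

```python
class ViewReplica:
    def __init__(self):
        self.up = True
        self.rows = set()

    def apply(self, batch):
        if not self.up:
            raise ConnectionError("view replica is down")
        for op, key in batch:
            if op == "DELETE":
                self.rows.discard(key)
            else:
                self.rows.add(key)

class LocalBatchlog:
    """Keeps each batch until the view replica has applied it."""
    def __init__(self):
        self.pending = []

    def send(self, replica, batch):
        self.pending.append(batch)   # record first, then try to deliver
        self.replay(replica)

    def replay(self, replica):
        still_pending = []
        for batch in self.pending:
            try:
                replica.apply(batch)
            except ConnectionError:
                still_pending.append(batch)   # keep it for a later replay
        self.pending = still_pending

replica = ViewReplica()
replica.rows.add(("UK", 1))
batchlog = LocalBatchlog()

replica.up = False                                   # MV replica down
batchlog.send(replica, [("DELETE", ("UK", 1)), ("INSERT", ("FR", 1))])
rows_while_down = set(replica.rows)
pending_while_down = len(batchlog.pending)

replica.up = True                                    # MV replica up
batchlog.replay(replica)
```

While the replica is down the view is stale, which is the weaker consistency guarantee for views mentioned in the consistency slide; after the replay it converges.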
Materialized View Performance
•  Write performance
•  local lock
•  local read-before-write for MV → update contention on partition (most of the perf hit)
•  local batchlog for MV
•  ☞ you only pay this price once, whatever the number of MVs
•  for each base table update: mv_count x 2 (DELETE + INSERT) extra mutations
22
Materialized View Performance
•  Write performance vs manual denormalization
•  MV better because no client-server network traffic for read-before-write
•  MV better because less network traffic for multiple views (client-side BATCH)
•  Makes developer life easier → priceless
23
Materialized View Performance
•  Read performance vs secondary index
•  MV better because single node read (secondary index can hit many nodes)
•  MV better because single read path (secondary index = read index + read data)
24
Materialized Views Consistency
•  Consistency level
•  CL honoured for base table, ONE for MV + local batchlog
•  Weaker consistency guarantees for MV than for base table.
•  Example: write at QUORUM
•  guarantee that a QUORUM of base table replicas have received the write
•  guarantee that a QUORUM of MV replicas will eventually receive the DELETE + INSERT
25
Materialized Views Gotchas
•  Beware of hot spots !!!
•  MV user_by_gender 😱
26
Q & A
27
User Defined Functions (UDF)
•  Why ?
•  Detailed Impl
•  UDAs
•  Gotchas
Rationale
•  Push computation server-side
•  save network bandwidth (1000 nodes!)
•  simplify client-side code
•  provide standard & useful function (sum, avg …)
•  accelerate analytics use-case (pre-aggregation for Spark)
29
How to create a UDF ?
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
[keyspace.]functionName (param1 type1, param2 type2, …)
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURNS returnType
LANGUAGE language
AS $$
// source code here
$$;
•  [keyspace.]functionName: a UDF is keyspace-wide
•  param1 type1, …: param name to refer to in the code, type = CQL3 type
•  CALLED ON NULL INPUT: always called, null-check mandatory in code
•  RETURNS NULL ON NULL INPUT: if any input is null, the code block is skipped and null is returned
•  RETURNS returnType: CQL types
   •  primitives (boolean, int, …)
   •  collections (list, set, map)
   •  tuples
   •  UDT
•  LANGUAGE language: JVM supported languages
   •  Java, Scala
   •  Javascript (slow)
   •  Groovy, Jython, JRuby
   •  Clojure (JSR 223 impl issue)
37
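The difference between the two null-handling clauses can be modelled in a few lines of Python. This is only an illustration of the semantics described above: `make_udf` and `max_of` are hypothetical names, and Python's `None` stands in for a CQL null.

```python
def make_udf(body, called_on_null_input):
    """Wrap a function body with one of the two null-handling modes."""
    def udf(*args):
        if not called_on_null_input and any(a is None for a in args):
            return None          # RETURNS NULL ON NULL INPUT: body is skipped
        return body(*args)       # CALLED ON NULL INPUT: body always runs
    return udf

def max_of(a, b):
    # Under CALLED ON NULL INPUT the body must do its own null checks.
    if a is None:
        return b
    if b is None:
        return a
    return max(a, b)

called = make_udf(max_of, called_on_null_input=True)
skipped = make_udf(max_of, called_on_null_input=False)

with_null_called = called(None, 3)     # body runs and handles the null itself
with_null_skipped = skipped(None, 3)   # body never runs
no_null = skipped(2, 3)
```

RETURNS NULL ON NULL INPUT keeps the function body simpler; CALLED ON NULL INPUT is needed when a null input should still produce a value.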
UDF Demo
38
UDA
•  Real use-case for UDF
•  Aggregation server-side → huge network bandwidth saving
•  Provide similar behavior for Group By, Sum, Avg etc …
39
How to create a UDA ?
CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS]
[keyspace.]aggregateName(type1, type2, …)
SFUNC accumulatorFunction
STYPE stateType
[FINALFUNC finalFunction]
INITCOND initCond;
•  aggregateName(type1, type2, …): only types, no param names
•  SFUNC: accumulator function. Signature:
   accumulatorFunction(stateType, type1, type2, …) RETURNS stateType
•  STYPE: state type
•  FINALFUNC: optional final function. Signature:
   finalFunction(stateType)
•  INITCOND: initial state (a value of stateType)
•  UDA return type ?
   •  if finalFunction is given: return type of finalFunction
   •  else: stateType
43
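How SFUNC, STYPE, FINALFUNC and INITCOND fit together can be sketched as a fold over the selected rows. The Python below is a model of the semantics, not Cassandra code: `run_aggregate`, `avg_state` and `avg_final` are hypothetical names, and the state of the average is assumed to be a (count, total) tuple.

```python
def run_aggregate(rows, sfunc, initcond, finalfunc=None):
    """Fold the rows into a state, then optionally post-process the state."""
    state = initcond                        # INITCOND
    for row in rows:
        state = sfunc(state, *row)          # SFUNC(stateType, type1, ...) -> stateType
    return finalfunc(state) if finalfunc else state   # FINALFUNC is optional

# Example aggregate: average, with a (count, total) tuple as state.
def avg_state(state, value):
    count, total = state
    return (count + 1, total + value)

def avg_final(state):
    count, total = state
    return total / count if count else None

rows = [(10,), (20,), (60,)]
average = run_aggregate(rows, avg_state, (0, 0), avg_final)
raw_state = run_aggregate(rows, avg_state, (0, 0))
```

Without a final function the aggregate returns the raw state, which matches the return type rule above.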
UDA Demo
44
Gotchas
[Diagram: UDA request on a four-node (C*) cluster, steps ① to ⑤; steps ② & ③ appear on three of the nodes]
④ On the coordinator:
•  apply accumulatorFunction
•  apply finalFunction
Why not apply the UDF/UDA on the replica nodes ?
1.  Because of eventual consistency
2.  UDF/UDA applied AFTER last-write-wins logic
47
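A small Python example can illustrate why aggregation has to wait for reconciliation. It is a sketch under assumptions: replica responses are modelled as dictionaries of key -> (value, write timestamp), `reconcile` is a hypothetical helper implementing last-write-wins, and the data is invented.

```python
def reconcile(replica_responses):
    """Last-write-wins: for each key keep the value with the highest timestamp."""
    merged = {}
    for response in replica_responses:
        for key, (value, ts) in response.items():
            if key not in merged or ts > merged[key][1]:
                merged[key] = (value, ts)
    return merged

# Two replicas; replica_b missed the latest write for key "k1".
replica_a = {"k1": (50, 200), "k2": (7, 100)}
replica_b = {"k1": (10, 100), "k2": (7, 100)}

# Coordinator: reconcile first, then aggregate.
merged = reconcile([replica_a, replica_b])
sum_on_coordinator = sum(value for value, _ in merged.values())

# Hypothetical alternative: each replica aggregates its own, possibly stale, data.
sum_on_replica_a = sum(value for value, _ in replica_a.values())
sum_on_replica_b = sum(value for value, _ in replica_b.values())
```

The two per-replica sums disagree, and once the values are folded into a single number the coordinator has no timestamps left to decide which one is right.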
Gotchas
48
•  UDA in Cassandra is not distributed !
•  Executing a UDA on a large number of rows (10^6 for ex.)
•  single fat partition
•  multiple partitions
•  full table scan
•  → Increase client-side timeout
•  default Java driver timeout = 12 secs
•  JAVA-1033 JIRA for per-request timeout setting
Cassandra UDA or Apache Spark ?

Consistency Level   Single/Multiple Partition(s)   Recommended Approach
ONE                 Single partition               UDA with token-aware driver because node local
ONE                 Multiple partitions            Apache Spark because distributed reads
> ONE               Single partition               UDA because data-locality lost with Spark
> ONE               Multiple partitions            Apache Spark definitely
Q & A
51
New Storage Engine
•  Data structure
•  Disk space usage
Pre 3.0 data structure
Map<byte[ ], SortedMap<byte[ ], Cell>>
53
CREATE TABLE sensor_data(
sensor_id uuid,
date timestamp,
sensor_type text,
sensor_value double,
PRIMARY KEY(sensor_id, date)
);
Pre 3.0 on disk layout
54
RowKey: de305d54-75b4-431b-adb2-eb6b9e546014
=> (column=2015-04-27 10:00:00+0100:, value=, timestamp=1430128800)
=> (column=2015-04-27 10:00:00+0100:sensor_type, value=‘Temperature’, timestamp=1430128800)
=> (column=2015-04-27 10:00:00+0100:sensor_value, value=23.48, timestamp=1430128800)
=> (column=2015-04-27 10:01:00+0100:, value=, timestamp=1430128860)
=> (column=2015-04-27 10:01:00+0100:sensor_type, value=‘Temperature’, timestamp=1430128860)
=> (column=2015-04-27 10:01:00+0100:sensor_value, value=24.08, timestamp=1430128860)
•  Clustering values are repeated for each normal column
•  Full timestamp storage
Cassandra 3.0 data structure
Map<byte[ ], SortedMap<ClusteringColumn, Row>>
55
CREATE TABLE sensor_data(
sensor_id uuid,
date timestamp,
sensor_type text,
sensor_value double,
PRIMARY KEY(sensor_id, date)
);
Cassandra 3.0 on disk layout
56
PartitionKey: de305d54-75b4-431b-adb2-eb6b9e546014
=> clusteringColumn:2015-04-27 10:00:00+0100
=> row_timestamp=1430128800
=> (column_value=‘Temperature’, delta_encoded_timestamp=+0)
=> (column_value=23.48, delta_encoded_timestamp=+0)
=> clusteringColumn:2015-04-27 10:01:00+0100
=> row_timestamp=1430128860
=> (column_value=‘Temperature’, delta_encoded_timestamp=+0)
=> (column_value=24.08, delta_encoded_timestamp=+0)
Delta-encoded timestamp vs row timestamp
Gains
57
•  No clustering value repetition
•  Column labels are stored only once in metadata
•  Delta encoding of timestamp, 8 bytes saved each time
•  Less disk space used
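The two layouts can be compared with a toy Python model built from the sensor_data example. This is a sketch, not the real on-disk format: `pre_30_layout` and `v30_layout` are hypothetical helpers, and the row timestamp is assumed to be the smallest cell timestamp in the row.

```python
# One partition: (clustering value, [(column name, value, write timestamp), ...])
partition = [
    ("2015-04-27 10:00:00+0100", [("sensor_type", "Temperature", 1430128800),
                                  ("sensor_value", 23.48, 1430128800)]),
    ("2015-04-27 10:01:00+0100", [("sensor_type", "Temperature", 1430128860),
                                  ("sensor_value", 24.08, 1430128860)]),
]

def pre_30_layout(partition):
    """Flat sorted cells: the clustering value prefixes every cell name."""
    cells = []
    for clustering, columns in partition:
        cells.append((clustering + ":", "", columns[0][2]))       # row marker cell
        for name, value, ts in columns:
            cells.append((clustering + ":" + name, value, ts))    # full timestamp
    return cells

def v30_layout(partition):
    """One row per clustering value; cell timestamps stored as deltas."""
    rows = {}
    for clustering, columns in partition:
        row_ts = min(ts for _, _, ts in columns)   # assumption: row timestamp = min
        rows[clustering] = {
            "row_timestamp": row_ts,
            "cells": [(value, ts - row_ts) for _, value, ts in columns],
        }
    return rows

first = "2015-04-27 10:00:00+0100"
old = pre_30_layout(partition)
new = v30_layout(partition)
clustering_copies_old = sum(1 for name, _, _ in old if name.startswith(first))
clustering_copies_new = sum(1 for key in new if key == first)
```

In the old layout each clustering value is written once per cell (three times here), while the new layout writes it once per row and the column names move to the table metadata.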
Benchmarks
58
CREATE TABLE events (
id uuid,
date timeuuid,
prop1 int,
prop2 text,
prop3 float,
PRIMARY KEY(id, date));
10^6 rows
Small string
Benchmarks
59
CREATE TABLE largetext(
key int,
prop1 int,
prop2 text,
PRIMARY KEY(key));
10^6 rows
Large string (1000)
Benchmarks
60
CREATE TABLE largeclustering(
key int,
clust text,
prop1 int,
prop2 set<float>,
PRIMARY KEY(key, clust));
10^6 rows
Medium string (100)
50 items
Benchmarks
61
CREATE TABLE events (
id uuid,
date timeuuid,
prop1 int,
prop2 text,
prop3 float,
PRIMARY KEY(id, date))
WITH COMPACT STORAGE;
Q & A
62
@doanduyhai
duy_hai.doan@datastax.com
https://academy.datastax.com/
Thank You
63
