Introduction to Cassandra
DuyHai DOAN
Apache Cassandra Evangelist
DataStax
•  Founded in April 2010
•  We contribute a lot to Apache Cassandra™
•  400+ customers (25 of the Fortune 100), 450+ employees
•  Headquartered in the San Francisco Bay Area
•  EU headquarters in London, offices in France and Germany
•  DataStax Enterprise = OSS Cassandra + extra features
© 2016 DataStax, All Rights Reserved.
Cassandra history
•  created at Facebook
•  open-sourced in 2008
•  current version: 3.2
•  column-oriented ☞ distributed table
5 Cassandra key points
•  Linear scalability
•  Continuous availability
•  Multi Data-center native
•  Operational simplicity
•  Spark integration
1) Linear scalability
[Figure: cluster sizes ranging from NetcoSports (3 nodes, ≈3 GB) to 1k+ nodes and PB+ of data, with a "YOU" marker]
2) Continuous availability
•  thanks to the Dynamo architecture
3) Multi Data-centers
•  out-of-the-box (config only)
•  AWS config for multi-region DCs
•  GCE support
•  Microsoft Azure support
•  CloudStack support
Multi DC usages
Data locality, disaster recovery
[Figure: two rings of C* nodes, New York (DC1) and London (DC2), linked by async replication]
Multi DC usages
Virtual DC for workload segregation
[Figure: two rings of C* nodes in the same room, Production (LIVE) and Analytics (Spark), linked by async replication]
Multi DC usages
Prod data copy for back-up/benchmark
[Figure: a ring of C* nodes replicating asynchronously to "My tiny test DC", which is READ-ONLY!!!]
Use LOCAL_XXX consistency levels
4) Operational simplicity
•  1 node = 1 process + 2 config files (cassandra.yaml + cassandra-rackdc.properties)
•  deployment automation
•  OpsCenter for
•  monitoring
•  provisioning*
•  services* (repair, performance, …)
* only with Datastax Enterprise
5) Spark integration
•  Cassandra + Spark = awesome!
•  Spark/Cassandra connector = most advanced connector right now for a NoSQL db
•  predicate push-down
•  early filtering
•  DataFrame integration
•  Analytics, aggregation, streaming …
Main Cassandra use-cases
Cassandra use-cases
•  Messaging
•  Collections/Playlists
•  Fraud detection
•  Recommendation/Personalization
•  Internet of things/Sensor data
Q & A
Layers
•  Cluster
    •  Amazon Dynamo paper
    •  masterless
•  Storage engine
    •  Google Bigtable
    •  columns/column families ☞ distributed tables
Data Distribution
The tokens
Random hash of #partition → token = hash(#p)
Hash: ]−x, x]
Hash range: 2^64 values
x = 2^64 / 2
[Figure: ring of 8 C* nodes]
Token ranges
A: ]−x, −3x/4]
B: ]−3x/4, −2x/4]
C: ]−2x/4, −x/4]
D: ]−x/4, 0]
E: ]0, x/4]
F: ]x/4, 2x/4]
G: ]2x/4, 3x/4]
H: ]3x/4, x]
[Figure: ring of 8 C* nodes, one per token range]
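The mapping from a partition key to a token, and from a token to one of the eight ranges A to H, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `token` and `owner` are hypothetical helper names, and MD5 is used as a stand-in hash because it ships with the Python standard library (Cassandra's own partitioner uses a different hash function).

```python
import hashlib

X = 2**64 // 2          # x = 2^64 / 2, as on the slide
NODES = "ABCDEFGH"      # one node per token range, in ring order


def token(partition_key: str) -> int:
    """Map a partition key to a token in ]-x, x] (MD5 is an assumed stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    unsigned = int.from_bytes(digest[:8], "big")   # 0 .. 2^64 - 1
    return unsigned - X + 1                        # -x + 1 .. x


def owner(tok: int) -> str:
    """Return the node whose range ]lower, upper] contains the token."""
    width = 2 * X // len(NODES)                    # each range spans x / 4 tokens
    # Ranges are open on the left and closed on the right, hence the "- 1".
    return NODES[(tok + X - 1) // width]


if __name__ == "__main__":
    for key in ("user_id1", "user_id2", "user_id3"):
        print(key, "->", owner(token(key)))
```

Because the hash spreads keys evenly over the token space, each node ends up with roughly the same share of partitions.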
Distributed tables
[Figure: ring of nodes A to H; rows user_id1 … user_id5 are spread across the nodes]
CREATE TABLE users(
user_id int,
…,
PRIMARY KEY(user_id)
);
Linear scalability
[Figure: ring of 8 nodes, A to H]
Today = high load
•  disk occupation 80%
•  CPU 70%
•  saturated memory
Scaling out
[Figure: ring of 10 nodes, A to J]
+2 nodes
•  disk occupation 50%
•  CPU 50%
•  memory ✌︎
Automatic data rebalancing
•  each node gives up some tokens
•  flag to throttle network bandwidth: stream throughput (nodetool setstreamthroughput)
Automatic data re-balancing with virtual nodes
[Figure: token ownership per node before (A to H) and after adding 2 nodes (A to J)]
+2 nodes
Q & A
Replication Model & Consistency
Failure tolerance
Replication factor (RF) = 3
[Figure: ring of nodes A to H; three nodes, labeled 1, 2 and 3, hold the replica sets below]
1: {A, H, G}
2: {B, A, H}
3: {C, B, A}
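One way to read the three sets is as the token ranges stored by three consecutive nodes. The sketch below reproduces them under two assumptions that the slide does not state: the ring order is alphabetical (A to H), and each range is copied to its owner plus the next RF − 1 nodes on the ring. `replicas` and `ranges_on` are hypothetical helper names.

```python
RING = ["A", "B", "C", "D", "E", "F", "G", "H"]   # assumed ring order


def replicas(range_name: str, rf: int = 3) -> list[str]:
    """Nodes storing a token range: its owner plus the next rf - 1 nodes on the ring."""
    start = RING.index(range_name)
    return [RING[(start + i) % len(RING)] for i in range(rf)]


def ranges_on(node: str, rf: int = 3) -> set[str]:
    """Token ranges held by a node: its own plus those of the rf - 1 preceding nodes."""
    return {r for r in RING if node in replicas(r, rf)}


if __name__ == "__main__":
    for node in "ABC":
        print(node, sorted(ranges_on(node)))
```

With RF = 3, losing any single node still leaves two copies of every range it held.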
Coordinator node
Responsible for handling requests (read/write)
Every node can be coordinator
•  masterless
•  round robin master for each request
•  no SPOF
•  proxy role
[Figure: ring of nodes A to H; a request reaches the coordinator, which forwards it to replicas 1, 2 and 3]
Consistency level
Tunable at runtime
•  ONE
•  QUORUM (strict majority w.r.t RF)
•  ALL
Applicable to any request (read/write)
Consistency in action
RF = 3, Write ONE, Read ONE
Replicas: B A A (data replication in progress …)
Write ONE: B (ack)
Read ONE: A
Consistency in action
RF = 3, Write ONE, Read QUORUM
Replicas: B A A (data replication in progress …)
Write ONE: B (ack)
Read QUORUM: A
Consistency in action
RF = 3, Write ONE, Read ALL
Replicas: B A A (data replication in progress …)
Write ONE: B (ack)
Read ALL: B
Consistency in action
RF = 3, Write QUORUM, Read ONE
Replicas: B B A (data replication in progress …)
Write QUORUM: B (ack)
Read ONE: A
Consistency in action
RF = 3, Write QUORUM, Read QUORUM
Replicas: B B A (data replication in progress …)
Write QUORUM: B (ack)
Read QUORUM: B (any 2 of the 3 replicas include at least one that holds B)
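Whether a read is guaranteed to see the last acknowledged write comes down to whether the set of replicas read must overlap the set of replicas written. The sketch below checks this by brute force; `always_overlaps` is a hypothetical helper, and the rule of thumb it confirms is that overlap is guaranteed exactly when write acks + read acks > RF.

```python
from itertools import combinations


def always_overlaps(write_acks: int, read_acks: int, rf: int) -> bool:
    """True if every possible read set shares a replica with every possible write set."""
    replicas = range(rf)
    return all(
        set(written) & set(read)
        for written in combinations(replicas, write_acks)
        for read in combinations(replicas, read_acks)
    )


if __name__ == "__main__":
    rf = 3
    levels = {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}
    for w_name, r_name in [("ONE", "ONE"), ("ONE", "QUORUM"), ("ONE", "ALL"),
                           ("QUORUM", "ONE"), ("QUORUM", "QUORUM")]:
        ok = always_overlaps(levels[w_name], levels[r_name], rf)
        print(f"Write {w_name}, Read {r_name}: latest value guaranteed = {ok}")
```

For RF = 3 this reports a guarantee only for Write ONE + Read ALL and Write QUORUM + Read QUORUM; in the other combinations a read may still return the older value A while replication is in progress.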
Consistency level = trade-off
Consistency level
•  ONE: fast, may not read the latest written value
•  QUORUM: strict majority w.r.t. replication factor, good balance
•  ALL: paranoid, slow, loss of high availability
Consistency level common patterns
ONE read + ONE write
☞ available for read/write even if (RF − 1) replicas are down
QUORUM read + QUORUM write
☞ available for read/write as long as a strict majority of replicas is up (with RF = 3: 1 replica may be down)
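The availability side of the trade-off is simple arithmetic: a request succeeds as long as enough replicas are alive to satisfy its consistency level. A minimal sketch, where `required_acks` and `tolerated_failures` are hypothetical helper names and QUORUM is taken as a strict majority of RF:

```python
def required_acks(level: str, rf: int) -> int:
    """Number of replicas that must answer for the given consistency level."""
    return {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}[level]


def tolerated_failures(level: str, rf: int) -> int:
    """Replicas that may be down while requests at this level still succeed."""
    return rf - required_acks(level, rf)


if __name__ == "__main__":
    for level in ("ONE", "QUORUM", "ALL"):
        print(level, "RF=3 ->", tolerated_failures(level, 3), "replica(s) may be down")
```

With RF = 3, ONE tolerates 2 replicas down, QUORUM tolerates 1, and ALL tolerates none.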
Q & A
Last Write Wins & Compaction
Last Write Wins (LWW)
INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
#partition | age | name
jdoe | 33 | John DOE
Last Write Wins (LWW)
INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
#partition | age (t1) | name (t1)
jdoe | 33 | John DOE
t1 = auto-generated timestamp (μs)
Last Write Wins (LWW)
UPDATE users SET age = 34 WHERE login = 'jdoe';
SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2: jdoe | age (t2) = 34
Last Write Wins (LWW)
DELETE age FROM users WHERE login = 'jdoe';
SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2: jdoe | age (t2) = 34
SSTable3: jdoe | age (t3) = ✕ (tombstone)
Last Write Wins (LWW)
SELECT age FROM users WHERE login = 'jdoe';
SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2: jdoe | age (t2) = 34
SSTable3: jdoe | age (t3) = ✕ (tombstone)
Which value is returned? The cell with the most recent timestamp wins: the tombstone written at t3 shadows 33 (t1) and 34 (t2), so age comes back as null.
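The read path for this example can be sketched as a merge by timestamp. The code below is an illustration only: SSTables are modelled as plain dictionaries of `column -> (timestamp, value)`, `TOMBSTONE` is a sentinel standing in for the deletion marker, and `read_column` is a hypothetical helper name.

```python
TOMBSTONE = object()   # sentinel standing in for a deletion marker

# One dictionary per SSTable for partition 'jdoe': column -> (timestamp, value)
sstable1 = {"age": (1, 33), "name": (1, "John DOE")}
sstable2 = {"age": (2, 34)}
sstable3 = {"age": (3, TOMBSTONE)}


def read_column(column, sstables):
    """Return the value with the highest timestamp, or None if deleted or absent."""
    cells = [table[column] for table in sstables if column in table]
    if not cells:
        return None
    _, value = max(cells, key=lambda cell: cell[0])   # last write wins
    return None if value is TOMBSTONE else value


if __name__ == "__main__":
    tables = [sstable1, sstable2, sstable3]
    print("age  =", read_column("age", tables))
    print("name =", read_column("name", tables))
```

Reading only the first two SSTables would return 34; once the third one is included, the tombstone at t3 wins and age reads as deleted.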
Compaction
SSTable1: jdoe | age (t1) = 33 | name (t1) = John DOE
SSTable2: jdoe | age (t2) = 34
SSTable3: jdoe | age (t3) = ✕ (tombstone)
New SSTable: jdoe | age (t3) = ✕ | name (t1) = John DOE
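Compaction applies the same last-write-wins rule to whole SSTables and writes the survivors into a new one. A minimal sketch, again modelling an SSTable as a dictionary of `column -> (timestamp, value)`; the `compact` function and the `TOMBSTONE` marker are illustrative names, and the tombstone is kept in the output exactly as the slide shows.

```python
TOMBSTONE = "<tombstone>"   # marker standing in for a deleted cell

sstable1 = {"age": (1, 33), "name": (1, "John DOE")}
sstable2 = {"age": (2, 34)}
sstable3 = {"age": (3, TOMBSTONE)}


def compact(sstables):
    """Merge SSTables, keeping for each column only the cell with the latest timestamp."""
    merged = {}
    for table in sstables:
        for column, (timestamp, value) in table.items():
            if column not in merged or timestamp > merged[column][0]:
                merged[column] = (timestamp, value)
    return merged


if __name__ == "__main__":
    print(compact([sstable1, sstable2, sstable3]))
```

The three input SSTables hold four cells for jdoe; the new SSTable holds two, which is where the disk space is reclaimed.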
Basic Data Modeling
Table creation
CREATE TABLE users (
login text,
name text,
age int,
…
PRIMARY KEY(login));
login = partition key (#partition)
DML statements
INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
UPDATE users SET age = 34 WHERE login = 'jdoe';
DELETE age FROM users WHERE login = 'jdoe';
SELECT age FROM users WHERE login = 'jdoe';
What about joins?
How can I join data between tables?
How can I model 1 – N relationships?
How to model a mailbox?
[Diagram: User (1) to Emails (n)]
Compound primary key
CREATE TABLE mailbox (
login text,
message_id timeuuid,
interlocutor text,
message text,
PRIMARY KEY((login), message_id));
•  login: partition key
•  message_id: clustering column
•  (login, message_id): uniqueness
Compound primary key
Partitions are not ordered; rows inside a partition are ordered by clustering column (date)
rsmith
    2014-11-21 16:00:00 | 'bobm', 'It's really…'
    2014-11-21 17:32:12 | 'bobm', 'It depends..'
    2014-11-21 21:21:09 | 'bobm', 'Don't do…'
    …
hsue
    2014-11-21 11:04:43 | 'jdoe', 'Hi, …'
    2014-11-21 11:22:43 | 'rsmith', 'Hello,…'
jdoe
    2014-11-21 11:00:00 | 'hsue', 'Hi there!'
    2014-11-21 11:22:43 | 'rsmith', 'Hello,…'
    2014-11-21 13:06:19 | 'bobm', 'Do you…'
Queries
Get message by user and message_id (date)
SELECT * FROM mailbox WHERE login = 'jdoe'
AND message_id = '2014-11-21 16:00:00';
Get message by user and date interval
SELECT * FROM mailbox WHERE login = 'jdoe'
AND message_id <= '2014-11-25 23:59:59'
AND message_id >= '2014-11-20 00:00:00';
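Both queries are cheap for the same reason: the partition key locates one partition, and the rows inside it are already sorted by the clustering column, so a date interval is a contiguous slice. The sketch below models this with a dictionary of sorted lists; the data comes from the mailbox example, dates are kept as strings (this format sorts chronologically), and `slice_partition` is a hypothetical helper name.

```python
from bisect import bisect_left, bisect_right

# partition key -> rows sorted by clustering column (date, interlocutor, message)
mailbox = {
    "jdoe": [
        ("2014-11-21 11:00:00", "hsue", "Hi there!"),
        ("2014-11-21 11:22:43", "rsmith", "Hello,…"),
        ("2014-11-21 13:06:19", "bobm", "Do you…"),
    ],
}


def slice_partition(login, start, end):
    """Rows of one partition whose clustering value lies in [start, end]."""
    rows = mailbox.get(login, [])                 # 1) locate the partition
    keys = [row[0] for row in rows]
    lo = bisect_left(keys, start)                 # 2) binary search both bounds
    hi = bisect_right(keys, end)
    return rows[lo:hi]                            # 3) contiguous slice


if __name__ == "__main__":
    for row in slice_partition("jdoe", "2014-11-20 00:00:00", "2014-11-25 23:59:59"):
        print(row)
```

An exact match on the clustering column is the same slice with identical lower and upper bounds.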
Queries
Get message by message_id only (#partition not provided): not allowed
SELECT * FROM mailbox WHERE message_id = '2014-11-21 16:00:00';
Get message by date interval (#partition not provided): not allowed
SELECT * FROM mailbox WHERE message_id <= '2014-11-25 23:59:59'
AND message_id >= '2014-11-20 00:00:00';
Without #partition
No #partition
☞ no token
☞ where is my data?
[Figure: ring of 8 C* nodes, each marked with a question mark]
Queries
Get message by user range (range query on #partition)
SELECT * FROM mailbox WHERE login >= 'hsue' AND login <= 'jdoe';
Get message by user pattern (non exact match on #partition)
SELECT * FROM mailbox WHERE login LIKE '%doe%';
WHERE clause restrictions
All DML queries must provide #partition
Only exact match (=) on #partition; range queries (<, ≤, >, ≥) are not allowed
•  ☞ they would require a full cluster scan
On clustering columns, only range queries (<, ≤, >, ≥) and exact match (=)
WHERE clause only possible
•  on columns defined in PRIMARY KEY
•  on indexed columns
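The rules on this slide can be encoded as a small checker. This is a sketch of the slide's rules only, not of Cassandra's actual query validation, which has more cases; `where_is_allowed` and its parameters are hypothetical names, and restrictions on indexed columns are accepted with any operator as a simplification.

```python
RANGE_OPS = {"<", "<=", ">", ">="}


def where_is_allowed(restrictions, partition_key, clustering_columns, indexed_columns=()):
    """Check a list of (column, operator) pairs against the rules listed on the slide."""
    partition_ops = [op for col, op in restrictions if col == partition_key]
    if partition_ops != ["="]:
        return False                  # #partition missing, or not an exact match
    for col, op in restrictions:
        if col == partition_key:
            continue
        if col in clustering_columns:
            if op != "=" and op not in RANGE_OPS:
                return False          # clustering columns: = or range only
        elif col not in indexed_columns:
            return False              # neither in PRIMARY KEY nor indexed
    return True


if __name__ == "__main__":
    by_interval = [("login", "="), ("message_id", "<="), ("message_id", ">=")]
    print(where_is_allowed(by_interval, "login", ["message_id"]))            # True
    print(where_is_allowed([("message_id", "=")], "login", ["message_id"]))  # False
```

Run against the mailbox table, it accepts the two queries that provide login and rejects the ones that omit it or use a range or pattern on it.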
WHERE clause restrictions
What if I want to perform an "arbitrary" WHERE clause?
•  search form scenario, dynamic search fields
DO NOT RE-INVENT THE WHEEL!
•  ☞ Apache Solr (Lucene) integration (DataStax Enterprise Search)
•  ☞ Same JVM, 1-cluster-2-products (Solr & Cassandra)
SELECT * FROM users WHERE solr_query = 'age:[33 TO *] AND gender:male';
SELECT * FROM users WHERE solr_query = 'lastname:*schwei?er';
Q & A
Advanced Data Modeling
Collection types
CREATE TABLE users (
login text,
name text,
age int,
friends set<text>,
hobbies list<text>,
languages map<int, text>,
…
PRIMARY KEY(login));
User Defined Type (UDT)
Instead of
CREATE TABLE users (
login text,
…
street_number int,
street_name text,
postcode int,
country text,
…
PRIMARY KEY(login));
User Defined Type (UDT)
CREATE TYPE address (
street_number int,
street_name text,
postcode int,
country text);
CREATE TABLE users (
login text,
…
location frozen <address>,
…
PRIMARY KEY(login));
UDT Insert
INSERT INTO users(login, name, location) VALUES (
'jdoe',
'John DOE',
{
street_number: 124,
street_name: 'Congress Avenue',
postcode: 95054,
country: 'USA'
});
JSON syntax for INSERT/UPDATE/DELETE
CREATE TABLE users (
id text PRIMARY KEY,
age int,
state text );
INSERT INTO users JSON '{"id": "user123", "age": 42, "state": "TX"}';
INSERT INTO users(id, age, state) VALUES('me', fromJson('20'), 'CA');
UPDATE users SET age = fromJson('25') WHERE id = fromJson('"me"');
DELETE FROM users WHERE id = fromJson('"me"');
JSON syntax for SELECT
> SELECT JSON * FROM users WHERE id = 'me';
[json]
----------------------------------------
{"id": "me", "age": 25, "state": "CA"}
> SELECT JSON age,state FROM users WHERE id = 'me';
[json]
----------------------------------------
{"age": 25, "state": "CA"}
> SELECT age, toJson(state) FROM users WHERE id = 'me';
age | system.tojson(state)
-----+----------------------
25 | "CA"
Why Materialized Views ?
Relieve the pain of manual denormalization
CREATE TABLE user(
id int PRIMARY KEY,
country text,
…);
CREATE TABLE user_by_country(
country text,
id int,
…,
PRIMARY KEY(country, id));
Materialized View in Action
CREATE MATERIALIZED VIEW user_by_country
AS SELECT country, id, firstname, lastname
FROM user
WHERE country IS NOT NULL AND id IS NOT NULL
PRIMARY KEY(country, id);
The view takes the place of the manually maintained table:
CREATE TABLE user_by_country (
country text,
id int,
firstname text,
lastname text,
PRIMARY KEY(country, id));
User Defined Functions (UDF)
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
maxOf (col1 int, col2 int)
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURNS int
LANGUAGE java
AS $$
return Math.max(col1, col2);
$$;
SELECT maxOf(col1, col2) FROM table WHERE id = xxx;
User Defined Aggregates (UDA)
CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS]
sum(bigint)
SFUNC accumulatorFunction
STYPE bigint
[FINALFUNC finalFunction]
INITCOND 0;
CREATE FUNCTION accumulatorFunction(accu bigint, column bigint)
RETURNS NULL ON NULL INPUT RETURNS bigint LANGUAGE java
AS $$ return accu + column; $$;
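The moving parts of an aggregate (SFUNC, STYPE, INITCOND, optional FINALFUNC) amount to a fold over the selected rows. The Python sketch below mirrors that evaluation order for the sum example; `run_aggregate` is a hypothetical helper, and null handling is deliberately left out.

```python
from functools import reduce


def accumulator_function(accu: int, column: int) -> int:
    """State function (SFUNC): combine the running state with the next column value."""
    return accu + column


def run_aggregate(values, sfunc, initcond, finalfunc=None):
    """Start from INITCOND, apply SFUNC once per row, then the optional FINALFUNC."""
    state = reduce(sfunc, values, initcond)
    return finalfunc(state) if finalfunc else state


if __name__ == "__main__":
    print(run_aggregate([10, 20, 12], accumulator_function, 0))
```

A FINALFUNC is useful when the running state differs from the result, for example an average that carries a (sum, count) pair and divides only at the end.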
Q & A
@doanduyhai
duy_hai.doan@datastax.com
https://academy.datastax.com/
Thank You


Cassandra introduction 2016

  • 1. Introduction to Cassandra DuyHai DOAN Apache Cassandra Evangelist
  • 2. Datastax •  Founded in April 2010 •  We contribute a lot to Apache Cassandra™ •  400+ customers (25 of the Fortune 100), 450+ employees •  Headquarter in San Francisco Bay area •  EU headquarter in London, offices in France and Germany •  Datastax Enterprise = OSS Cassandra + extra features © 2016 DataStax, All Rights Reserved. 2
  • 3. Cassandra history •  created at Facebook •  open-sourced since 2008 •  current version: 3.2 •  column-oriented ☞ distributed table © 2016 DataStax, All Rights Reserved. 3
  • 4. 5 Cassandra key points •  Linear scalability •  Continuous availability •  Multi Data-center native •  Operational simplicity •  Spark integration © 2016 DataStax, All Rights Reserved. 4
  • 5. 1) Linear scalability © 2016 DataStax, All Rights Reserved. 5 C* C* C* NetcoSports 3 nodes, ≈3GB 1k+ nodes, PB+ YOU
  • 6. 2) Continuous availability © 2016 DataStax, All Rights Reserved. 6 •  thanks to the Dynamo architecture
  • 7. 3) Multi Data-centers © 2016 DataStax, All Rights Reserved. 7 •  out-of-the-box (config only) •  AWS config for multi-regions DCs •  GCE support •  Microsoft Azure support •  CloudStack support
  • 8. Multi DC usages Data locality, disaster recovery © 2016 DataStax, All Rights Reserved. 8 C* C* C* C* C* C* C* C* C* C* C* C* C* New York (DC1) London (DC2) Async replication
  • 9. Multi DC usages Virtual DC for workload segregation © 2016 DataStax, All Rights Reserved. 9 C* C* C* C* C* C* C* C* C* C* C* C* C* Production (LIVE) Analytics (Spark) Async replication Same room
  • 10. Multi DC usages Prod data copy for back-up/benchmark © 2016 DataStax, All Rights Reserved. 10 C* C* C* C* C* C* C* C* C* C* C* C* C* Use LOCAL_XXX Consistency Levels My tiny test DC READ-ONLY!!! Async replication
  • 11. 4) Operational simplicity © 2016 DataStax, All Rights Reserved. 11 •  1 node = 1 process + 2 config files (cassandra.yaml + cassandra-rackdc.properties) •  deployment automation •  OpsCenter for •  monitoring •  provisioning* •  services* (repair, performance, …) * only with Datastax Enterprise
  • 12. 4) Operational simplicity © 2016 DataStax, All Rights Reserved. 12
  • 13. 5) Spark integration © 2016 DataStax, All Rights Reserved. 13 •  Cassandra + Spark = awesome ! •  Spark/Cassandra connector = most advanced connector right now for NoSQL db •  predicates push-down •  early filtering •  dataframe integration •  Analytics, aggregation, streaming …
  • 14. Main Cassandra use-cases © 2016 DataStax, All Rights Reserved. 14
  • 15. Cassandra use-cases © 2016 DataStax, All Rights Reserved. 15 Messaging Collections/ Playlists Fraud detection Recommendation/ Personalization Internet of things/ Sensor data
  • 16. Cassandra use-cases © 2016 DataStax, All Rights Reserved. 16 Messaging Collections/ Playlists Fraud detection Recommendation/ Personalization Internet of things/ Sensor data
  • 17. © 2016 DataStax, All Rights Reserved. 17 Q & A ! "
  • 18. Layers © 2016 DataStax, All Rights Reserved. 18 •  Cluster •  Amazon DynamoDB paper •  masterless •  Storage engine •  Google Big Table •  columns/columns family ☞ distributed tables
  • 19. Data Distribution © 2016 DataStax, All Rights Reserved. 19
  • 20. The tokens © 2016 DataStax, All Rights Reserved. 20 Random hash of #partition à token = hash(#p) Hash: ] –x, x ] hash range: 264 values x = 264/2 C* C* C* C* C* C* C* C*
  • 21. Token ranges © 2016 DataStax, All Rights Reserved. 21 A: −x,− 3x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ B: − 3x 4 ,− 2x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ C: − 2x 4 ,− x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ D: − x 4 ,0 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ E: 0, x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ F: x 4 , 2x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ G: 2x 4 , 3x 4 ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ H : 3x 4 ,x ⎤ ⎦ ⎥ ⎥ ⎤ ⎦ ⎥ ⎥ C* C* C* C* C* C* C* C*
  • 22. Distributed tables © 2016 DataStax, All Rights Reserved. 22 H A E D B C G F user_id1 user_id2 user_id3 user_id4 user_id5 CREATE TABLE users( user_id int, …, PRIMARY KEY(user_id) ),
  • 23. Distributed tables © 2016 DataStax, All Rights Reserved. 23 H A E D B C G F user_id1 user_id2 user_id3 user_id4 user_id5
  • 24. Linear scalability © 2016 DataStax, All Rights Reserved. 24 H A E D B C G F Today = high load •  disk occupation 80% •  CPU 70% •  saturated memory
  • 25. Scaling out © 2016 DataStax, All Rights Reserved. 25 H A E D B C G F I J +2 nodes •  disk occupation 50% •  CPU 50% •  memory ✌︎ Automatic data rebalancing •  each node gives up some tokens •  flag to throttle network bandwidth •  streamingthroughput
  • 26. Automatic data re-balancing with virtual nodes © 2016 DataStax, All Rights Reserved. 26 A: B: C: D: E: F: G: H: A: B: C: D: E: F: G: H: I: J: +2 nodes
  • 27. © 2016 DataStax, All Rights Reserved. 27 Q & A ! "
  • 28. Replication Model & Consistency © 2016 DataStax, All Rights Reserved. 28
  • 29. Failure tolerance © 2016 DataStax, All Rights Reserved. 29 Replication factor (RF) = 3 H A E D B C G F 1 2 3 {A, H, G} {B, A, H} {C, B, A}
  • 30. Coordinator node © 2016 DataStax, All Rights Reserved. 30 Responsible for handling requests (read/write) Every node can be coordinator •  masterless •  round robin master for each request •  no SPOF •  proxy role H A E D B C G F coordinator request 1 2 3
  • 31. Consistency level © 2016 DataStax, All Rights Reserved. 31 Tunable at runtime •  ONE •  QUORUM (strict majority w.r.t RF) •  ALL Applicable to any request (read/write)
  • 32. Consistency in action © 2016 DataStax, All Rights Reserved. 32 B A A B A A Read ONE: A data replication in progress … Write ONE: B ack RF = 3, Write ONE, Read ONE
  • 33. Consistency in action © 2016 DataStax, All Rights Reserved. 33 B A A B A A Read QUORUM: A data replication in progress … Write ONE: B ack RF = 3, Write ONE, Read QUORUM
  • 34. Consistency in action © 2016 DataStax, All Rights Reserved. 34 B A A B A A Read ALL: B data replication in progress … Write ONE: B ack RF = 3, Write ONE, Read ALL
  • 35. Consistency in action © 2016 DataStax, All Rights Reserved. 35 B B A B B A Read ONE: A data replication in progress … Write QUORUM: B ack RF = 3, Write QUORUM, Read ONE
  • 36. Consistency in action © 2016 DataStax, All Rights Reserved. 36 B B A B B A Read QUORUM: A data replication in progress … Write QUORUM: B ack RF = 3, Write QUORUM, Read QUORUM
  • 37. Consistency level = trade-off © 2016 DataStax, All Rights Reserved. 37
  • 38. Consistency level © 2016 DataStax, All Rights Reserved. 38 ONE Fast, may not read latest written value
  • 39. Consistency level © 2016 DataStax, All Rights Reserved. 39 QUORUM Strict majority w.r.t. Replication Factor Good balance
  • 40. Consistency level © 2016 DataStax, All Rights Reserved. 40 ALL Paranoid Slow, lost of high availability
  • 41. Consistency level common patterns © 2016 DataStax, All Rights Reserved. 41 ONERead + ONEWrite ☞ available for read/write even (N-1) replicas down QUORUMRead + QUORUMWrite ☞ available for read/write even if (RF - 1) replica (s) down
  • 42. © 2016 DataStax, All Rights Reserved. 42 Q & A ! "
  • 43. Last Write Wins & Compaction
  • 44. Last Write Wins (LWW): INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33); Resulting partition (#partition = jdoe): age = 33, name = John DOE
  • 45. Last Write Wins (LWW): INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33); Each cell carries an auto-generated timestamp (μs): jdoe: age (t1) = 33, name (t1) = John DOE
  • 46. Last Write Wins (LWW): UPDATE users SET age = 34 WHERE login = 'jdoe'; SSTable1: jdoe: age (t1) = 33, name (t1) = John DOE. SSTable2: jdoe: age (t2) = 34
  • 47. Last Write Wins (LWW): DELETE age FROM users WHERE login = 'jdoe'; SSTable1: jdoe: age (t1) = 33, name (t1) = John DOE. SSTable2: jdoe: age (t2) = 34. SSTable3: jdoe: age (t3) = ✕ (tombstone)
  • 48. Last Write Wins (LWW): SELECT age FROM users WHERE login = 'jdoe'; SSTable1: jdoe: age (t1) = 33, name (t1) = John DOE. SSTable2: jdoe: age (t2) = 34. SSTable3: jdoe: age (t3) = ✕ (tombstone). Which value is returned ???
  • 49. Last Write Wins (LWW): SELECT age FROM users WHERE login = 'jdoe'; SSTable1: jdoe: age (t1) = 33, name (t1) = John DOE. SSTable2: jdoe: age (t2) = 34. SSTable3: jdoe: age (t3) = ✕ (tombstone). The tombstone in SSTable3 has the latest timestamp (t3) and wins (✓); the values in SSTable1 and SSTable2 are discarded (✕)
  • 50. Compaction: SSTable1 (jdoe: age (t1) = 33, name (t1) = John DOE), SSTable2 (jdoe: age (t2) = 34) and SSTable3 (jdoe: age (t3) = ✕) are merged into a new SSTable: jdoe: age (t3) = ✕ (tombstone), name (t1) = John DOE
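The reconciliation shown in these slides can be sketched in a few lines of Python. This is a simplified model, not Cassandra's storage engine: an SSTable is assumed to be a dictionary of `(partition, column) -> (timestamp, value)`, timestamps are small integers standing in for t1, t2, t3, and ties between equal timestamps are ignored.

```python
TOMBSTONE = object()  # marker written by a DELETE


def compact(*sstables):
    """Merge SSTables, keeping for each cell the version with the highest timestamp."""
    merged = {}
    for sstable in sstables:
        for key, (timestamp, value) in sstable.items():
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)
    return merged


def read(sstable, partition, column):
    """Return the live value of a cell, or None if it is absent or deleted."""
    cell = sstable.get((partition, column))
    if cell is None or cell[1] is TOMBSTONE:
        return None
    return cell[1]


sstable1 = {("jdoe", "age"): (1, 33), ("jdoe", "name"): (1, "John DOE")}  # INSERT at t1
sstable2 = {("jdoe", "age"): (2, 34)}                                     # UPDATE at t2
sstable3 = {("jdoe", "age"): (3, TOMBSTONE)}                              # DELETE at t3

new_sstable = compact(sstable1, sstable2, sstable3)
print(read(new_sstable, "jdoe", "age"))   # None: the t3 tombstone wins
print(read(new_sstable, "jdoe", "name"))  # John DOE: only one version exists
```

The merged table still holds the tombstone for `age`, as in the compaction slide; the sketch does not model when tombstones are eventually purged.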
  • 51. Basic Data Modeling
  • 52. Table creation: CREATE TABLE users ( login text, name text, age int, … PRIMARY KEY(login)); login is the partition key (#partition)
  • 53. DML statements: INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33); UPDATE users SET age = 34 WHERE login = 'jdoe'; DELETE age FROM users WHERE login = 'jdoe'; SELECT age FROM users WHERE login = 'jdoe';
  • 54. What about joins? How can I join data between tables? How can I model 1-N relationships? How to model a mailbox? (diagram: User 1 to n Emails)
  • 55. Compound primary key: CREATE TABLE mailbox ( login text, message_id timeuuid, interlocutor text, message text, PRIMARY KEY((login), message_id)); login is the partition key, message_id is the clustering column; together they guarantee uniqueness
  • 56. Compound primary key: partitions are not ordered; rows inside a partition are ordered by clustering column (date)
      rsmith
        2014-11-21 16:00:00  'bobm', 'It’s really…'
        2014-11-21 17:32:12  'bobm', 'It depends..'
        2014-11-21 21:21:09  'bobm', 'Don’t do…'
        …
      hsue
        2014-11-21 11:04:43  'jdoe', 'Hi, …'
        2014-11-21 11:22:43  'rsmith', 'Hello,…'
      jdoe
        2014-11-21 11:00:00  'hsue', 'Hi there!'
        2014-11-21 11:22:43  'rsmith', 'Hello,…'
        2014-11-21 13:06:19  'bobm', 'Do you…'
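The layout above (unordered partitions, rows sorted inside each partition) can be modeled as a dictionary of sorted lists. The sketch below is an illustration only: `insert_message` and `select_range` are hypothetical helpers, and a sortable date string stands in for the `timeuuid` clustering column.

```python
import bisect
from collections import defaultdict

# partition key (login) -> rows kept sorted by clustering value
mailbox = defaultdict(list)


def insert_message(login, sent_at, interlocutor, message):
    """Insert a row at its sorted position inside the partition."""
    bisect.insort(mailbox[login], (sent_at, interlocutor, message))


def select_range(login, start, end):
    """Rows of one partition whose clustering value lies in [start, end]."""
    rows = mailbox.get(login, [])
    clustering_values = [row[0] for row in rows]
    first = bisect.bisect_left(clustering_values, start)
    last = bisect.bisect_right(clustering_values, end)
    return rows[first:last]


# Rows are inserted out of order on purpose; the partition stays sorted.
insert_message("jdoe", "2014-11-21 13:06:19", "bobm", "Do you…")
insert_message("jdoe", "2014-11-21 11:00:00", "hsue", "Hi there!")
insert_message("jdoe", "2014-11-21 11:22:43", "rsmith", "Hello,…")

for row in select_range("jdoe", "2014-11-21 11:00:00", "2014-11-21 12:00:00"):
    print(row)
```

Because the rows of a partition are already sorted, a date interval is a contiguous slice found by binary search, which is why range queries on a clustering column are cheap once the partition is known.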
  • 57. Queries: Get message by user and message_id (date): SELECT * FROM mailbox WHERE login = 'jdoe' AND message_id = '2014-11-21 16:00:00'; Get message by user and date interval: SELECT * FROM mailbox WHERE login = 'jdoe' AND message_id <= maxTimeuuid('2014-11-25 23:59:59') AND message_id >= minTimeuuid('2014-11-20 00:00:00'); (In the equality query, the date literal stands in for the actual timeuuid value.)
  • 58. Queries: Get message by message_id only: SELECT * FROM mailbox WHERE message_id = '2014-11-21 16:00:00'; ??? Get message by date interval: SELECT * FROM mailbox WHERE message_id <= maxTimeuuid('2014-11-25 23:59:59') AND message_id >= minTimeuuid('2014-11-20 00:00:00'); ???
  • 59. Queries: Get message by message_id only (#partition not provided): SELECT * FROM mailbox WHERE message_id = '2014-11-21 16:00:00'; Get message by date interval (#partition not provided): SELECT * FROM mailbox WHERE message_id <= maxTimeuuid('2014-11-25 23:59:59') AND message_id >= minTimeuuid('2014-11-20 00:00:00');
  • 60. Without #partition: No #partition ☞ no token ☞ where are my data? (diagram: a ring of eight C* nodes, each marked with ❓)
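Why the partition key is needed to locate data can be sketched as follows. This is a toy model under stated assumptions: Cassandra's default partitioner uses Murmur3, whereas this sketch uses MD5 from the Python standard library as a stand-in hash, with a small 32-bit token space and eight evenly spaced, hypothetical nodes.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # token space of this sketch only
NODES = [f"node{i}" for i in range(8)]
# Each node owns the token range that ends at its own token.
NODE_TOKENS = [(i + 1) * RING_SIZE // len(NODES) for i in range(len(NODES))]


def token(partition_key: str) -> int:
    """Hash the partition key to a position on the ring (stand-in for Murmur3)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def owner(partition_key: str) -> str:
    """First node whose token is greater than or equal to the key's token."""
    index = bisect.bisect_left(NODE_TOKENS, token(partition_key))
    return NODES[index % len(NODES)]


print("jdoe is stored on", owner("jdoe"))
print("hsue is stored on", owner("hsue"))
```

Without a partition key there is nothing to hash, so no token can be computed and every node would have to be asked.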
  • 61. Queries: Get message by user range (range query on #partition): SELECT * FROM mailbox WHERE login >= 'hsue' AND login <= 'jdoe'; Get message by user pattern (non exact match on #partition): SELECT * FROM mailbox WHERE login LIKE '%doe%';
  • 62. WHERE clause restrictions: All DML queries must provide the #partition. Only exact match (=) on #partition; range queries (<, ≤, >, ≥) are not allowed ☞ they would require a full cluster scan. On clustering columns, only range queries (<, ≤, >, ≥) and exact match (=). WHERE clause only possible •  on columns defined in PRIMARY KEY •  on indexed columns
  • 63. WHERE clause restrictions: What if I want to perform an "arbitrary" WHERE clause? •  search form scenario, dynamic search fields
  • 64. WHERE clause restrictions: What if I want to perform an "arbitrary" WHERE clause? •  search form scenario, dynamic search fields. DO NOT RE-INVENT THE WHEEL! •  ☞ Apache Solr (Lucene) integration (Datastax Enterprise Search) •  ☞ Same JVM, 1-cluster-2-products (Solr & Cassandra)
  • 65. WHERE clause restrictions: What if I want to perform an "arbitrary" WHERE clause? •  search form scenario, dynamic search fields. DO NOT RE-INVENT THE WHEEL! •  ☞ Apache Solr (Lucene) integration (Datastax Enterprise Search) •  ☞ Same JVM, 1-cluster-2-products (Solr & Cassandra). SELECT * FROM users WHERE solr_query = 'age:[33 TO *] AND gender:male'; SELECT * FROM users WHERE solr_query = 'lastname:*schwei?er';
  • 66. Q & A
  • 67. Advanced Data Modeling
  • 68. Collection types: CREATE TABLE users ( login text, name text, age int, friends set<text>, hobbies list<text>, languages map<int, text>, … PRIMARY KEY(login));
  • 69. User Defined Type (UDT): Instead of: CREATE TABLE users ( login text, … street_number int, street_name text, postcode int, country text, … PRIMARY KEY(login));
  • 70. User Defined Type (UDT): CREATE TYPE address ( street_number int, street_name text, postcode int, country text); CREATE TABLE users ( login text, … location frozen<address>, … PRIMARY KEY(login));
  • 71. UDT Insert: INSERT INTO users(login, name, location) VALUES ( 'jdoe', 'John DOE', { street_number: 124, street_name: 'Congress Avenue', postcode: 95054, country: 'USA' });
  • 72. JSON syntax for INSERT/UPDATE/DELETE: CREATE TABLE users ( id text PRIMARY KEY, age int, state text ); INSERT INTO users JSON '{"id": "user123", "age": 42, "state": "TX"}'; INSERT INTO users(id, age, state) VALUES('me', fromJson('20'), 'CA'); UPDATE users SET age = fromJson('25') WHERE id = fromJson('"me"'); DELETE FROM users WHERE id = fromJson('"me"');
  • 73. JSON syntax for SELECT:
      > SELECT JSON * FROM users WHERE id = 'me';
       [json]
      ----------------------------------------
       {"id": "me", "age": 25, "state": "CA"}
      > SELECT JSON age,state FROM users WHERE id = 'me';
       [json]
      ----------------------------------------
       {"age": 25, "state": "CA"}
      > SELECT age, toJson(state) FROM users WHERE id = 'me';
       age | system.tojson(state)
      -----+----------------------
        25 | "CA"
  • 74. Why Materialized Views? Relieve the pain of manual denormalization: CREATE TABLE user( id int PRIMARY KEY, country text, …); CREATE TABLE user_by_country( country text, id int, …, PRIMARY KEY(country, id));
  • 75. Materialized View in action: CREATE MATERIALIZED VIEW user_by_country AS SELECT country, id, firstname, lastname FROM user WHERE country IS NOT NULL AND id IS NOT NULL PRIMARY KEY(country, id); This replaces the manually maintained table: CREATE TABLE user_by_country ( country text, id int, firstname text, lastname text, PRIMARY KEY(country, id));
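To see what the view saves the application from doing, here is a rough Python sketch of the bookkeeping involved in keeping `user_by_country` in sync with `user`. It is a conceptual illustration only, not how Cassandra implements materialized views; `upsert_user` and the sample rows are hypothetical.

```python
VIEW_COLUMNS = ("country", "id", "firstname", "lastname")

user = {}             # base table, keyed by id
user_by_country = {}  # view, keyed by (country, id)


def upsert_user(user_id, **columns):
    """Write to the base table and keep the view in sync."""
    old_row = user.get(user_id)
    new_row = {**(old_row or {}), **columns, "id": user_id}
    user[user_id] = new_row
    # Remove the previous view row; it is re-created below if still eligible.
    if old_row is not None and old_row.get("country") is not None:
        user_by_country.pop((old_row["country"], user_id), None)
    # The view only holds rows whose primary key columns are all non-null.
    if new_row.get("country") is not None:
        user_by_country[(new_row["country"], user_id)] = {
            name: new_row.get(name) for name in VIEW_COLUMNS
        }


upsert_user(1, firstname="John", lastname="DOE", country="US")
upsert_user(1, country="FR")  # the row moves to another view partition
print(sorted(user_by_country))
```

Changing `country` moves the row from one view partition to another, which is the step that is easy to get wrong when denormalizing by hand.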
  • 76. User Defined Functions (UDF): CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] maxOf (col1 int, col2 int) CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT RETURNS int LANGUAGE java AS $$ return Math.max(col1, col2); $$; SELECT maxOf(col1, col2) FROM table WHERE id = xxx;
  • 77. User Defined Aggregates (UDA): CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS] sum(bigint) SFUNC accumulatorFunction STYPE bigint [FINALFUNC finalFunction] INITCOND 0; CREATE FUNCTION accumulatorFunction(accu bigint, column bigint) RETURNS NULL ON NULL INPUT RETURNS bigint LANGUAGE java AS $$ return accu + column; $$;
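An aggregate defined with SFUNC, STYPE, INITCOND and an optional FINALFUNC behaves like a fold over the selected values. The Python sketch below mirrors that shape; `run_aggregate` is a hypothetical helper, and null handling (RETURNS NULL ON NULL INPUT) is deliberately left out.

```python
from functools import reduce


def run_aggregate(values, sfunc, initcond, finalfunc=None):
    """Fold the values into a state, then optionally transform the final state."""
    state = reduce(sfunc, values, initcond)
    return finalfunc(state) if finalfunc is not None else state


def accumulator_function(accu, column):
    """State function of the sum aggregate: add the next value to the state."""
    return accu + column


# sum: the state is a number, INITCOND is 0, no final function.
print(run_aggregate([10, 20, 12], accumulator_function, 0))

# average: the state is (count, total) and the final function divides them.
print(run_aggregate(
    [10, 20, 12],
    lambda state, value: (state[0] + 1, state[1] + value),
    (0, 0),
    finalfunc=lambda state: state[1] / state[0] if state[0] else None,
))
```

The average example shows why FINALFUNC exists: the running state can carry more information than the value that is finally returned.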
  • 78. Q & A
  • 79. Thank You. @doanduyhai https://academy.datastax.com/