Copyright © ArangoDB Inc., 2018
One Engine, one Query Language.
Multiple Data Models.
Hello, my name is Jan!
I am working for ArangoDB Inc. in Cologne, Germany
I am one of the developers of ArangoDB,
the distributed, multi-model database
About me
Running complex queries
in a distributed system
Until recently, there was a tradeoff to consider when choosing an
OLTP database
Database tradeoffs
Complex queries, joins
Transactional guarantees
Highly available
Scalable
traditional
relational
“NoSQL”
In the last few years, there has been a trend towards distributed
databases adopting complex query functionality and transactions
Database trends
Complex queries, joins
Transactional guarantees
Highly available
Scalable
traditional
relational
“NoSQL”
Highly available
Scalable
Transactional guarantees
Complex queries, joins
“NewSQL”
(insert buzzword of choice)
● Distributed databases primer
● Organizing queries in a distributed database
● Distributed ACID transactions
● Q & A
Today I will only consider OLTP databases
Sorry, no Spark/Hadoop!
Agenda
Distributed databases
primer
A distributed database is a cluster of database nodes
The overall dataset is partitioned into smaller chunks (“shards”)
Adding new nodes to the database increases its capacity (scale out)
Distributed databases
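As a rough illustration of partitioning and scale-out, the sketch below spreads shards over nodes round-robin. The function name `place_shards` and the round-robin policy are assumptions made for illustration only; real systems use their own placement and rebalancing strategies.

```python
def place_shards(shards, nodes):
    """Spread shards over nodes as evenly as possible (assumed: round-robin)."""
    placement = {node: [] for node in nodes}
    for i, shard in enumerate(shards):
        placement[nodes[i % len(nodes)]].append(shard)
    return placement


# 7 shards on 3 nodes: one node has to hold 3 shards.
three_nodes = place_shards([f"S{i}" for i in range(1, 8)], ["A", "B", "C"])

# Adding a node (and a shard): every node now holds only 2 shards.
four_nodes = place_shards([f"S{i}" for i in range(1, 9)], ["A", "B", "C", "D"])
```

With more nodes, each node holds a smaller part of the overall dataset, which is where the additional capacity comes from.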
Sharding example
node A node B node C
Shards: S1, S2 Shards: S3, S4 Shards: S5, S6, S7
3 nodes (A, B, C), 7 shards (S1, S2, S3, S4, S5, S6, S7)
shards
Adding a node = increased capacity
node A node B node C
Shards: S1, S2 Shards: S3, S4 Shards: S5, S6
4 nodes (A, B, C, D), 8 shards (S1, S2, S3, S4, S5, S6, S7, S8)
shards
node D
Shards: S7, S8
What about data loss?
node A node B node C
Shards: S1, S2 Shards: S3, S4 Shards: S5, S6, S7
3 nodes (A, B, C), 7 shards (S1, S2, S3, S4, S5, S6, S7)
shards
Node failure = data loss
node A node B node C
Shards: S1, S2 Shards: S3, S4 Shards: S5, S6, S7
3 nodes (A, B, C), 7 shards (S1, S2, S3, S4, S5, S6, S7)
shards
Shards example with replicas
node A node B node C
Shards: S1, S2
Replicas: S4, S6, S7
Shards: S3, S4
Replicas: S2, S5
Shards: S5, S6, S7
Replicas: S1, S3
shards
replicas
3 nodes (A, B, C), 7 shards (S1, S2, S3, S4, S5, S6, S7)
Node failure with a replica setup
node A node B node C
Shards: S1, S2
Replicas: S4, S6, S7
Shards: S3, S4
Replicas: S2, S5
Shards: S5, S6, S7
Replicas: S1, S3
shards
replicas
3 nodes (A, B, C), 7 shards (S1, S2, S3, S4, S5, S6, S7)
Promoting replicas
node A node B node C
Shards: S1, S2, S4
Replicas: S4, S6, S7
Shards: S3, S4
Replicas: S2, S5
Shards: S3, S5, S6, S7
Replicas: S1, S3
shards
replicas
2 nodes (A, C), 7 shards (S1, S2, S3, S4, S5, S6, S7)
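The promotion step can be sketched as follows, using the shard and replica layout from the replica example and assuming node B fails. The function `promote_replicas` and the "one replica per shard" simplification are assumptions for illustration, not a description of a specific database.

```python
def promote_replicas(leaders, replicas, failed_node):
    """Return a new leader map after `failed_node` is lost.

    leaders:  shard -> node currently leading the shard
    replicas: shard -> node holding the shard's (single) replica
    """
    new_leaders = {}
    for shard, node in leaders.items():
        if node == failed_node:
            # The surviving replica takes over as leader.
            new_leaders[shard] = replicas[shard]
        else:
            new_leaders[shard] = node
    return new_leaders


leaders = {"S1": "A", "S2": "A", "S3": "B", "S4": "B",
           "S5": "C", "S6": "C", "S7": "C"}
replicas = {"S1": "C", "S2": "B", "S3": "C", "S4": "A",
            "S5": "B", "S6": "A", "S7": "A"}

after = promote_replicas(leaders, replicas, failed_node="B")
led_by_a = sorted(s for s, n in after.items() if n == "A")
led_by_c = sorted(s for s, n in after.items() if n == "C")
```

After the promotion, node A leads S1, S2 and S4, and node C leads S3, S5, S6 and S7. The shards that lost their replica on node B still need new replicas to be created.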
Creating new replicas
node A node B node C
Shards: S1, S2, S4
Replicas: S3, S5, S6, S7
Shards: S3, S4
Replicas: S2, S5
Shards: S3, S5, S6, S7
Replicas: S1, S2, S4
shards
replicas
2 nodes (A, C), 7 shards (S1, S2, S3, S4, S5, S6, S7)
Organizing queries in a
distributed database
A typical distributed query will involve multiple nodes, and requires
communication between them
There is normally one coordinating node per query, which is
responsible for
● triggering data processing steps on the other nodes
● putting together the partial results from the other nodes
● sending the merged result back to the client
● shutting down the query on the other nodes
Query coordination
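The four responsibilities of the coordinator can be sketched in a few lines. `ShardQuery` and `coordinate` are hypothetical names, and the shards are plain in-memory lists instead of remote nodes, so this only shows the control flow.

```python
class ShardQuery:
    """Hypothetical handle for the part of a query running on one shard."""

    def __init__(self, rows):
        self.rows = rows
        self.closed = False

    def fetch(self):
        return list(self.rows)

    def shutdown(self):
        self.closed = True


def coordinate(shard_data):
    # 1. trigger a processing step on every shard
    parts = [ShardQuery(rows) for rows in shard_data.values()]
    try:
        # 2. put the partial results together
        merged = [row for part in parts for row in part.fetch()]
    finally:
        # 4. shut the query down on the other nodes, even on error
        for part in parts:
            part.shutdown()
    # 3. hand the merged result back to the client
    return merged, parts


merged, parts = coordinate({"S1": [1, 2], "S2": [3], "S3": []})
```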
Query coordination example
3 data nodes
Query coordinator node:
fetches data from nodes
merges the results
sends result to client
shuts down query on nodes
Query result
data nodes:
return data of
shards
For each inter-node communication, there will be a network
roundtrip (latency++)
One of the major goals when running distributed queries is to
minimize the amount of network communication, e.g. by
● restricting the query to as few shards as possible
● pushing filter conditions to the shards
● pre-aggregating data on the shards
Operations on different shards can also be executed in parallel to
reduce overall latency
Distributed query considerations
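A small worked example of why parallel execution helps: with made-up roundtrip times per shard (the numbers are assumptions, not measurements), contacting the shards one after the other costs the sum of the roundtrips, while contacting them in parallel costs roughly the slowest one.

```python
# Assumed roundtrip times per shard, in milliseconds.
round_trips_ms = {"S1": 4.0, "S2": 6.0, "S3": 5.0}

sequential_ms = sum(round_trips_ms.values())  # one shard after the other
parallel_ms = max(round_trips_ms.values())    # all shards at the same time
```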
The following are some example queries from ArangoDB
ArangoDB is a multi-model NoSQL database, which supports
documents, graphs and key-values
It can be run in single-server or distributed (cluster) mode
ArangoDB provides its own query language AQL, which is similar to
SQL, but has a different syntax
ArangoDB query examples
A simple ArangoDB query with a filter condition:
FOR u IN users
FILTER u.active == true
RETURN u
which is equivalent to SQL's
SELECT * FROM users u WHERE u.active = 1
The coordinator will push the filter condition to the shards,
so they will only return data that satisfies the filter condition
Query example (filter)
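To see what the pushdown saves, here is a sketch that counts how many rows travel over the network in both variants. The sample data and the function names are invented for illustration; the shards are modelled as in-memory lists.

```python
users_by_shard = {
    "S1": [{"name": "ana", "active": True}, {"name": "bob", "active": False}],
    "S2": [{"name": "cy", "active": False}],
    "S3": [{"name": "dee", "active": True}, {"name": "eli", "active": True}],
}


def is_active(user):
    return user["active"] is True


def without_pushdown(shards):
    """Shards ship every row; the coordinator filters afterwards."""
    shipped = [u for rows in shards.values() for u in rows]
    return [u for u in shipped if is_active(u)], len(shipped)


def with_pushdown(shards):
    """Each shard applies the filter locally and ships only matches."""
    shipped = [u for rows in shards.values() for u in rows if is_active(u)]
    return shipped, len(shipped)


result_a, rows_shipped_a = without_pushdown(users_by_shard)
result_b, rows_shipped_b = with_pushdown(users_by_shard)
```

Both variants return the same three users, but without the pushdown five rows are shipped to the coordinator instead of three.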
Query example (filter)
3 data nodes
Query: FOR u IN users
FILTER u.active == true RETURN u
coordinator:
fetches data from all shards
merges the results
Query result
data nodes:
return filtered
data of shards
Now a query using a filter on a shard key attribute:
FOR u IN users
FILTER u._key == "jsteemann"
RETURN u
which is equivalent to SQL's
SELECT * FROM users u WHERE u._key = 'jsteemann'
The coordinator will restrict the query to the one shard the data is
located on, push the filter condition to the shard and fetch the results
from there
Query example (filter on shard key)
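A minimal sketch of how a coordinator could decide which shards to contact. It assumes that keys are mapped to shards by hashing; CRC32 and the shard list are illustrative choices, not ArangoDB's actual hash function or layout.

```python
import zlib

SHARDS = ["S1", "S2", "S3", "S4", "S5", "S6", "S7"]


def shards_to_query(shard_key_value=None):
    """Return the shards a coordinator has to contact.

    With an equality filter on the shard key, the responsible shard can
    be computed up front; without one, every shard has to be asked.
    """
    if shard_key_value is None:
        return list(SHARDS)
    index = zlib.crc32(shard_key_value.encode("utf-8")) % len(SHARDS)
    return [SHARDS[index]]


all_shards = shards_to_query()
one_shard = shards_to_query("jsteemann")
```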
Query example (filter on shard key)
3 data nodes
Query: FOR u IN users FILTER
u._key == "jsteemann" RETURN u
coordinator:
fetches data from single
shard
Query result
single data node:
returns filtered
data of shard
Another ArangoDB query, now with a sort condition and a projection:
FOR u IN users
SORT u.name
RETURN u.name
which is equivalent to SQL's
SELECT u.name FROM users u ORDER BY u.name
The coordinator will push the sort condition and the projection to all
shards, and combine the locally sorted results from the shards into a
totally ordered result (using merge-sort)
Query example (sorting)
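The merge step on the coordinator can be sketched with Python's `heapq.merge`, which combines already sorted inputs without sorting everything again. The sample names are invented; each list stands for the projected `u.name` values of one shard.

```python
import heapq

names_by_shard = {
    "S1": ["mia", "bob", "zoe"],
    "S2": ["ana", "tom"],
    "S3": ["eve", "cy"],
}

# Each shard sorts and projects locally (here: plain name strings).
sorted_parts = [sorted(names) for names in names_by_shard.values()]

# The coordinator only has to merge the already sorted streams.
result = list(heapq.merge(*sorted_parts))
```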
Query example (sorting)
3 data nodes
Query: FOR u IN users
SORT u.name RETURN u.name
coordinator:
fetches data from all shards
merge-sorts the results
Query result
data nodes:
return sorted and
projected data of
shards
One more ArangoDB query, now using aggregation:
FOR u IN users
COLLECT year = DATE_YEAR(u.dob)
AGGREGATE count = COUNT(u.dob)
RETURN { year, count }
which is equivalent to SQL's
SELECT YEAR(u.dob) AS year, COUNT(u.dob) AS count
FROM users u GROUP BY year
The coordinator will push the aggregation to all shards, and combine
the already aggregated results from the shards into a single result
Query example (aggregation)
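Pre-aggregation can be sketched as a two-level count: every shard counts its own rows per year, and the coordinator only adds up the per-shard counts. The birth years below are invented sample data.

```python
from collections import Counter

dob_years_by_shard = {
    "S1": [1980, 1985, 1980],
    "S2": [1985, 1990],
    "S3": [1980],
}

# Each shard counts its own rows per year.
partial_counts = [Counter(years) for years in dob_years_by_shard.values()]

# The coordinator adds up the per-shard counts.
total = Counter()
for part in partial_counts:
    total.update(part)

result = [{"year": y, "count": c} for y, c in sorted(total.items())]
```

Only one small record per year and shard crosses the network, no matter how many users a shard holds.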
Query example (aggregation)
3 data nodes
Query: FOR u IN users COLLECT ...
RETURN { year, count }
coordinator:
fetches data from all shards
aggregates the aggregates
Query result
data nodes:
return
aggregated data
of shards
One final ArangoDB query, now with an equi-join:
FOR u IN users FOR p IN purchases
FILTER u._key == p.user
RETURN { user: u, purchase: p }
which is roughly equivalent to SQL's
SELECT u.*, p.*
FROM users u, purchases p WHERE u._key = p.user
The coordinator will query all shards of the “purchases” collection, and
these will reach out to the coordinator again to get data from all shards
of the “users” collection
Query example (join)
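The two levels of this join can be sketched as nested fetches. The shard names, sample rows and function names are hypothetical, and the lookup table built per shard is just one possible way to do the local join; the point is only that every “purchases” shard needs data from all “users” shards.

```python
users_shards = {"U1": [{"_key": "jan"}], "U2": [{"_key": "kim"}]}
purchases_shards = {
    "P1": [{"user": "jan", "item": "book"}],
    "P2": [{"user": "kim", "item": "pen"}, {"user": "jan", "item": "cup"}],
    "P3": [{"user": "nobody", "item": "hat"}],
}


def fetch_all_users():
    """Inner coordinator step: gather 'users' rows from all their shards."""
    return [u for rows in users_shards.values() for u in rows]


def join_on_shard(purchases):
    """Data-node step: join the local purchases with the fetched users."""
    users_by_key = {u["_key"]: u for u in fetch_all_users()}
    return [{"user": users_by_key[p["user"]], "purchase": p}
            for p in purchases if p["user"] in users_by_key]


# Outer coordinator step: query all 'purchases' shards, merge the results.
result = [row for rows in purchases_shards.values()
          for row in join_on_shard(rows)]
```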
Query example (join)
Query: FOR u IN users ...
RETURN {p , u }
coordinator:
fetches data from all shards
of “purchases”
merges the results
Query result
data nodes:
fetch data from above
fetch data of shards for
“purchases”
join them
coordinator:
fetches data from all shards
of “users”
merges the results
data nodes:
return data of
shards for “users”
3 + 2 data nodes
Distributed
ACID transactions
With transactions, complex operations on multiple data items can be
executed in an all-or-nothing fashion
If something goes wrong, the database will do an automatic
cleanup of partially executed operations
With transactions, the database will ensure consistency of data and
protect us from anomalies, no matter if there are other concurrent
operations on the same data
Key take-away: transactions make application developers' lives easier
Benefits of transactions
Some distributed databases also support ACID transactions
or have plans to add them:
● Google Cloud Spanner (Database as a service)
● CockroachDB
● FoundationDB
● FaunaDB (closed source)
● ...
● MongoDB (announced for future releases, with limitations)
Distributed databases with transactions
While a distributed transaction is ongoing, it may make modifications
on different nodes
These changes need to be ineffective (hidden) until the transaction
actually commits
On commit, the transaction's changes must become instantly
visible on all nodes at the same time
Atomicity
Distributed databases normally store the status of transactions
(pending, committed, aborted) in a private section of the key space,
e.g.:
Key  Value
T0   committed
T1   aborted
T2   pending
When a transaction commits, its status key is atomically updated
from “pending” to “committed”
Atomicity
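A minimal sketch of why a single status update is enough to make all changes of a transaction visible at once. The `writes` list and its values are invented for illustration; in a real system both the writes and the status keys live in the distributed key space.

```python
status = {"T0": "committed", "T1": "aborted", "T2": "pending"}

# Writes are tagged with the transaction that made them.
writes = [("amount", 10, "T0"), ("amount", 13, "T1"), ("amount", 42, "T2")]


def visible_writes():
    """Only writes of committed transactions are effective."""
    return [(k, v) for k, v, txn in writes if status[txn] == "committed"]


before = visible_writes()
status["T2"] = "committed"  # the single atomic step of the commit
after = visible_writes()
```

The writes of T2 were already in place before the commit; flipping its status key is what makes them effective.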
Databases that provide consistency normally serialize all write
operations for a key on the designated “leader” node for its shard
The state of data on the leader shard then is a consistent
“source of truth” for that shard
Write operations are replicated from leaders to replicas in the same
order as applied on the leader
Replicas are thus exact copies of the leader shards and can take over
any time
Consistency – designated leaders
Leader-only writes
Query: put("amount", 10)
Query: put("amount", 42)
Leader determines the order of the
operations for the same key and
executes them one after the other,
e.g.:
1. put("amount", 10)
2. put("amount", 42)
Query: put("amount", 42)
10
42
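Leader-only writes can be sketched with two small classes. `Leader` and `Replica` are hypothetical names, replication is a plain synchronous method call here, and failures are ignored; the sketch only shows that the leader fixes the order and the replicas apply the same order.

```python
class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Leader(Replica):
    def __init__(self, replicas):
        super().__init__()
        self.replicas = replicas
        self.log = []  # the order decided by the leader

    def put(self, key, value):
        self.log.append((key, value))
        self.apply(key, value)
        # Replicate in exactly the order the leader applied the writes.
        for replica in self.replicas:
            replica.apply(key, value)


followers = [Replica(), Replica()]
leader = Leader(followers)
leader.put("amount", 10)
leader.put("amount", 42)
```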
Shard leaders can change over time, e.g. in case of node failures or
planned maintenance
It is necessary that all nodes in the cluster have the same view on
who is the current leader for a specific shard, and which are the
shard's current replicas
Shard leadership
The nodes in the cluster normally use a “consensus protocol” to
exchange status messages
Paxos and Raft are the most commonly used consensus protocols in
distributed databases
These protocols are designed to handle network partitions and node
failures, and will work reliably if a majority of nodes is still available
and can still exchange messages with each other
Consensus protocols
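The majority requirement comes down to simple arithmetic, sketched below. The function names are made up for illustration; the rule itself (more than half of the nodes) is what majority-based protocols such as Paxos and Raft rely on.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def can_make_progress(cluster_size: int, reachable: int) -> bool:
    """True if the reachable nodes still form a majority."""
    return reachable >= majority(cluster_size)
```

For example, a cluster of 5 nodes keeps working with 3 reachable nodes, while a cluster of 4 nodes split into two halves of 2 cannot make progress on either side.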
To ensure consistency, transactions that modify the same data must
be put into an unambiguous order
Having an unambiguous global order allows having a cross-node
consistent view on the data
This is hard to achieve because the transactions can start on different
nodes in parallel
Ordering transactions
Each transaction is assigned a timestamp when it is started
This same timestamp will be used later as the transaction's commit
timestamp
The timestamps of transactions will be used for ordering them
Rule: a transaction with a lower timestamp happened before a
transaction with a higher timestamp
Ordering transactions using timestamps
Timestamps created by different nodes are not reliably comparable
due to clock skew
The solution to make them comparable in most cases is to define an
“uncertainty interval” (which is the maximum tolerable clock skew)
If the timestamp difference is outside of the “uncertainty interval”,
two timestamps are safely comparable
Two timestamps with a difference inside the uncertainty interval are
not comparable safely, and the relative order of them is unknown
Clock skew
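The comparison rule can be sketched as a three-way result: earlier, later, or unknown. The interval of 250 ms is an arbitrary assumption for this example, and `compare` is a hypothetical helper.

```python
MAX_CLOCK_SKEW_MS = 250  # assumed uncertainty interval, in milliseconds


def compare(ts_a: int, ts_b: int):
    """Return -1 or 1 if the order is certain, or None if it is unknown."""
    if abs(ts_a - ts_b) <= MAX_CLOCK_SKEW_MS:
        return None  # inside the uncertainty interval
    return -1 if ts_a < ts_b else 1
```

Timestamps 1000 and 2000 are safely ordered, while 1000 and 1100 are too close together to say which transaction happened first.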
If the transactions could have influence on each other, this is an
(actual or potential) read or write conflict, and one of the
transactions must be aborted or restarted
A transaction restart also means assigning a new, higher timestamp
Consistency using timestamps
To ensure isolation, a running transaction must not overwrite or
remove data that another ongoing transaction may still see
Write operations are stored in a multi-version data structure, which
can handle multiple values for the same key at the same time
Any transaction that reads or writes a key needs to find the “correct”
version of it
Isolation
Key       Transaction ID  Value
“amount”  T0              10
“amount”  T1              42
“name”    T17             “test”
“page”    T2              “index.html”
“page”    T50             <removed>
Any operation can identify whether it can “see” an operation from
another transaction, simply by looking up the status and timestamp
of the corresponding transaction
Isolation – multi-versioning
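A version lookup over the table above can be sketched as follows. It assumes that all listed transactions have committed and that the numeric part of the transaction ID doubles as its timestamp; both are simplifications for illustration, and `read` is a hypothetical helper.

```python
REMOVED = object()  # marker for a deleted version

# (key, transaction id, value), as in the table above
versions = [
    ("amount", 0, 10),
    ("amount", 1, 42),
    ("name", 17, "test"),
    ("page", 2, "index.html"),
    ("page", 50, REMOVED),
]
# Assumption: every listed transaction has committed, and the numeric
# transaction id doubles as its timestamp.
committed = {0, 1, 2, 17, 50}


def read(key, reader_ts):
    """Return the newest committed value written at or before reader_ts."""
    candidates = [(txn, value) for k, txn, value in versions
                  if k == key and txn in committed and txn <= reader_ts]
    if not candidates:
        return None
    value = max(candidates, key=lambda c: c[0])[1]
    return None if value is REMOVED else value
```

A reader with timestamp 10 still sees the page “index.html”, while a reader with timestamp 60 sees it as removed. Neither reader blocks the other.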
Durability
To ensure durability, every write operation (and also transaction status
changes) needs to be persisted on multiple nodes (leader + replicas)
A commit is only considered successful if acknowledged by a
configurable number of nodes
In the last few years, there has been a trend towards distributed
databases adopting complex query functionality and transactions
Database trends
Complex queries, joins
Transactional guarantees
Highly available
Scalable
traditional
relational
“NoSQL”
Highly available
Scalable
Transactional guarantees
Complex queries, joins
“NewSQL”
(insert buzzword of choice)
Thank you very much!
Any questions?
Please star ArangoDB on Github:
https://github.com/arangodb/arangodb
Participate in ArangoDB's community survey to win a t-shirt:
https://arangodb.com/community-survey/
#arangodb | jan@arangodb.com
Icons made by Freepik (www.freepik.com) from www.flaticon.com,
licensed by CC 3.0 BY
Links / credits
Priyanka Aash
Ā 
2025_06_18 - OpenMetadata Community Meeting.pdf
2025_06_18 - OpenMetadata Community Meeting.pdf
OpenMetadata
Ā 
The Future of Product Management in AI ERA.pdf
The Future of Product Management in AI ERA.pdf
Alyona Owens
Ā 
"How to survive Black Friday: preparing e-commerce for a peak season", Yurii ...
"How to survive Black Friday: preparing e-commerce for a peak season", Yurii ...
Fwdays
Ā 
Python Conference Singapore - 19 Jun 2025
Python Conference Singapore - 19 Jun 2025
ninefyi
Ā 
OWASP Barcelona 2025 Threat Model Library
OWASP Barcelona 2025 Threat Model Library
PetraVukmirovic
Ā 
CapCut Pro Crack For PC Latest Version {Fully Unlocked} 2025
CapCut Pro Crack For PC Latest Version {Fully Unlocked} 2025
pcprocore
Ā 
Wenn alles versagt - IBM Tape schützt, was zählt! Und besonders mit dem neust...
Wenn alles versagt - IBM Tape schützt, was zählt! Und besonders mit dem neust...
Josef Weingand
Ā 
UserCon Belgium: Honey, VMware increased my bill
UserCon Belgium: Honey, VMware increased my bill
stijn40
Ā 
GenAI Opportunities and Challenges - Where 370 Enterprises Are Focusing Now.pdf
GenAI Opportunities and Challenges - Where 370 Enterprises Are Focusing Now.pdf
Priyanka Aash
Ā 
Techniques for Automatic Device Identification and Network Assignment.pdf
Techniques for Automatic Device Identification and Network Assignment.pdf
Priyanka Aash
Ā 
Coordinated Disclosure for ML - What's Different and What's the Same.pdf
Coordinated Disclosure for ML - What's Different and What's the Same.pdf
Priyanka Aash
Ā 
EIS-Webinar-Engineering-Retail-Infrastructure-06-16-2025.pdf
EIS-Webinar-Engineering-Retail-Infrastructure-06-16-2025.pdf
Earley Information Science
Ā 
Raman Bhaumik - Passionate Tech Enthusiast
Raman Bhaumik - Passionate Tech Enthusiast
Raman Bhaumik
Ā 
Mastering AI Workflows with FME by Mark Döring
Mastering AI Workflows with FME by Mark Döring
Safe Software
Ā 
From Manual to Auto Searching- FME in the Driver's Seat
From Manual to Auto Searching- FME in the Driver's Seat
Safe Software
Ā 
Using the SQLExecutor for Data Quality Management: aka One man's love for the...
Using the SQLExecutor for Data Quality Management: aka One man's love for the...
Safe Software
Ā 
Daily Lesson Log MATATAG ICT TEchnology 8
Daily Lesson Log MATATAG ICT TEchnology 8
LOIDAALMAZAN3
Ā 

Running complex data queries in a distributed system

  • 1. Copyright © ArangoDB Inc., 2018. One Engine, one Query Language. Multiple Data Models.
  • 2. About me: Hello, my name is Jan! I work for ArangoDB Inc. in Cologne, DE. I am one of the developers of ArangoDB, the distributed, multi-model database.
  • 3. Running complex queries in a distributed system
  • 4. Database tradeoffs: Until recently, there was a tradeoff to consider when choosing an OLTP database. Traditional relational: complex queries, joins; transactional guarantees. “NoSQL”: highly available; scalable.
  • 5. Database trends: In the last few years, there has been a trend towards distributed databases adopting complex query functionality and transactions. Traditional relational: complex queries, joins; transactional guarantees. “NoSQL”: highly available; scalable. “NewSQL” (insert buzzword of choice): highly available; scalable; transactional guarantees; complex queries, joins.
  • 6. Agenda: Distributed databases primer; Organizing queries in a distributed database; Distributed ACID transactions; Q & A. Today I will only consider OLTP databases. Sorry, no Spark/Hadoop!
  • 7. Distributed databases primer
  • 8. Distributed databases: A distributed database is a cluster of database nodes. The overall dataset is partitioned into smaller chunks (“shards”). Adding new nodes to the database increases its capacity (scale out).
  • 9. Sharding example: 3 nodes (A, B, C), 7 shards (S1, S2, S3, S4, S5, S6, S7). Node A: shards S1, S2. Node B: shards S3, S4. Node C: shards S5, S6, S7.
  • 10. Adding a node = increased capacity: 4 nodes (A, B, C, D), 8 shards (S1, S2, S3, S4, S5, S6, S7, S8). Node A: shards S1, S2. Node B: shards S3, S4. Node C: shards S5, S6. Node D: shards S7, S8.
  • 11. What about data loss? 3 nodes (A, B, C), 7 shards (S1, S2, S3, S4, S5, S6, S7). Node A: shards S1, S2. Node B: shards S3, S4. Node C: shards S5, S6, S7.
  • 12. Node failure = data loss: 3 nodes (A, B, C), 7 shards (S1, S2, S3, S4, S5, S6, S7). Node A: shards S1, S2. Node B: shards S3, S4. Node C: shards S5, S6, S7.
  • 13. Shards example with replicas: 3 nodes (A, B, C), 7 shards (S1, S2, S3, S4, S5, S6, S7). Node A: shards S1, S2; replicas S4, S6, S7. Node B: shards S3, S4; replicas S2, S5. Node C: shards S5, S6, S7; replicas S1, S3.
  • 14. Node failure with a replica setup: 3 nodes (A, B, C), 7 shards (S1, S2, S3, S4, S5, S6, S7). Node A: shards S1, S2; replicas S4, S6, S7. Node B: shards S3, S4; replicas S2, S5. Node C: shards S5, S6, S7; replicas S1, S3.
  • 15. Promoting replicas: 2 nodes (A, C), 7 shards (S1, S2, S3, S4, S5, S6, S7). Node A: shards S1, S2, S4; replicas S4, S6, S7. Node B: shards S3, S4; replicas S2, S5. Node C: shards S3, S5, S6, S7; replicas S1, S3.
  • 16. Creating new replicas: 2 nodes (A, C), 7 shards (S1, S2, S3, S4, S5, S6, S7). Node A: shards S1, S2, S4; replicas S3, S5, S6, S7. Node B: shards S3, S4; replicas S2, S5. Node C: shards S3, S5, S6, S7; replicas S1, S2, S4.
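The failover sequence in slides 13 to 16 (a node fails, replicas are promoted, new replicas are created) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the `Node` class and the `fail_node` function are hypothetical names, each shard has exactly one replica, and the placement rule ("first other node") is not how any particular database chooses targets.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One database node holding leader shards and replica shards."""
    name: str
    shards: set = field(default_factory=set)    # shards this node leads
    replicas: set = field(default_factory=set)  # copies of shards led elsewhere


def fail_node(cluster: dict, failed: str) -> None:
    """Remove a node, promote replicas of its shards, then restore redundancy."""
    lost = cluster.pop(failed)
    # Step 1: promote a surviving replica for every shard the failed node led.
    # next() raises StopIteration if no replica exists, i.e. the data is lost.
    for shard in sorted(lost.shards):
        holder = next(n for n in cluster.values() if shard in n.replicas)
        holder.replicas.discard(shard)
        holder.shards.add(shard)
    # Step 2: create a new replica for every shard that has none left.
    for node in list(cluster.values()):
        for shard in sorted(node.shards):
            if not any(shard in other.replicas for other in cluster.values()):
                target = next(o for o in cluster.values() if o is not node)
                target.replicas.add(shard)
```

Starting from the layout of slide 13 and calling `fail_node(cluster, "B")` reproduces the end state of slide 16: node A leads S1, S2, S4 and node C leads S3, S5, S6, S7, with each node holding replicas of the other's shards.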
  • 17. Organizing queries in a distributed database
  • 18. Query coordination: A typical distributed query involves multiple nodes and requires communication between them. There is normally one coordinating node per query, which is responsible for: triggering data processing steps on the other nodes; putting together the partial results from the other nodes; sending the merged result back to the client; shutting down the query on the other nodes.
  • 19. Query coordination example (3 data nodes): Query coordinator node: fetches data from the nodes, merges the results, sends the result to the client, shuts down the query on the nodes. Data nodes: return data of shards.
  • 20. Distributed query considerations: Each inter-node communication costs a network roundtrip (latency++). One of the major goals when running distributed queries is to minimize the amount of network communication, e.g. by: restricting the query to as few shards as possible; pushing filter conditions to the shards; pre-aggregating data on the shards. Operations on different shards can also be executed in parallel to reduce overall latency.
  • 21. ArangoDB query examples: The following are some example queries from ArangoDB. ArangoDB is a multi-model NoSQL database which supports documents, graphs and key-values. It can be run in single-server or distributed (cluster) mode. ArangoDB provides its own query language, AQL, which is similar to SQL but has a different syntax.
  • 22. Query example (filter): A simple ArangoDB query with a filter condition: FOR u IN users FILTER u.active == true RETURN u, which is equivalent to SQL’s SELECT * FROM users u WHERE u.active = 1. The coordinator will push the filter condition to the shards, so they will only return data that satisfies the filter condition.
  • 23. Query example (filter), 3 data nodes: Query: FOR u IN users FILTER u.active == true RETURN u. Coordinator: fetches data from all shards, merges the results. Data nodes: return filtered data of shards.
  • 24. Query example (filter on shard key): Now a query using a filter on a shard key attribute: FOR u IN users FILTER u._key == "jsteemann" RETURN u, which is equivalent to SQL’s SELECT * FROM users u WHERE u._key = "jsteemann". The coordinator will restrict the query to the one shard the data is located on, push the filter condition to that shard and fetch the results from there.
  • 25. Query example (filter on shard key), 3 data nodes: Query: FOR u IN users FILTER u._key == "jsteemann" RETURN u. Coordinator: fetches data from a single shard. Single data node: returns filtered data of shard.
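The difference between the two filter queries above can be sketched as follows. In this minimal Python model, each shard is a dict keyed by `_key`. The CRC32-based `shard_for` function, `NUM_SHARDS` and the helper names are assumptions made for illustration; they are not ArangoDB's actual sharding scheme or API.

```python
import zlib

NUM_SHARDS = 3  # hypothetical shard count


def shard_for(key: str) -> int:
    """Map a shard key to a shard index (illustrative hash, not ArangoDB's)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS


def put(shards, doc):
    """Store a document on the shard that owns its key."""
    shards[shard_for(doc["_key"])][doc["_key"]] = doc


def find_active(shards):
    """Filter on a non-shard-key attribute: every shard filters locally,
    and the coordinator only concatenates the already filtered results."""
    result = []
    for shard in shards:
        result.extend(doc for doc in shard.values() if doc["active"])
    return result


def find_by_key(shards, key):
    """Filter on the shard key: the coordinator contacts exactly one shard."""
    return shards[shard_for(key)].get(key)
```

`find_active` touches every shard, while `find_by_key` touches only one, which is why filtering on the shard key saves network roundtrips.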
  • 26. Query example (sorting): Another ArangoDB query, now with a sort condition and a projection: FOR u IN users SORT u.name RETURN u.name, which is equivalent to SQL’s SELECT u.name FROM users u ORDER BY u.name. The coordinator will push the sort condition and the projection to all shards, and combine the locally sorted results from the shards into a totally ordered result (using merge-sort).
  • 27. Query example (sorting), 3 data nodes: Query: FOR u IN users SORT u.name RETURN u.name. Coordinator: fetches data from all shards, merge-sorts the results. Data nodes: return sorted and projected data of shards.
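The sort pushdown described above can be sketched in Python with the standard library's `heapq.merge`, which lazily merges inputs that are already sorted. The function names and the list-of-dicts shard representation are assumptions for illustration only.

```python
import heapq


def shard_sorted_names(shard_rows):
    """Runs on a data node: project the name attribute and sort locally."""
    return sorted(row["name"] for row in shard_rows)


def coordinator_sort(shards):
    """Runs on the coordinator: merge the already sorted shard results."""
    partials = [shard_sorted_names(rows) for rows in shards]
    return list(heapq.merge(*partials))
```

The coordinator never re-sorts the full dataset. It only picks the smallest head element among the shard results at each step.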
  • 28. Query example (aggregation): One more ArangoDB query, now using aggregation: FOR u IN users COLLECT year = DATE_YEAR(u.dob) AGGREGATE count = COUNT(u.dob) RETURN { year, count }, which is equivalent to SQL’s SELECT YEAR(u.dob) AS year, COUNT(u.dob) AS count FROM users u GROUP BY year. The coordinator will push the aggregation to all shards, and combine the already aggregated results from the shards into a single result.
  • 29. Query example (aggregation), 3 data nodes: Query: FOR u IN users COLLECT ... RETURN { year, count }. Coordinator: fetches data from all shards, aggregates the aggregates. Data nodes: return aggregated data of shards.
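"Aggregating the aggregates" can be sketched with `collections.Counter`: each shard counts its own rows per year, and the coordinator sums the per-shard counts. The function names are hypothetical, and the sketch assumes every row has a non-null `dob` of type `datetime.date`.

```python
from collections import Counter


def shard_count_by_year(shard_rows):
    """Runs on a data node: pre-aggregate the local rows per birth year."""
    return Counter(row["dob"].year for row in shard_rows)


def coordinator_count_by_year(shards):
    """Runs on the coordinator: sum the per-shard counts into one result."""
    total = Counter()
    for rows in shards:
        total.update(shard_count_by_year(rows))
    return [{"year": year, "count": count} for year, count in sorted(total.items())]
```

This works because COUNT is decomposable: the sum of per-shard counts equals the global count. Each shard therefore ships only one small row per year instead of all its documents.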
  • 30. Query example (join): One final ArangoDB query, now with an equi-join: FOR u IN users FOR p IN purchases FILTER u._key == p.user RETURN { user: u, purchase: p }, which is equivalent to SQL’s SELECT u.* AS user, p.* AS purchase FROM users u, purchases p WHERE u._key = p.user. The coordinator will query all shards of the “purchases” collection, and these will reach out to the coordinator again to get data from all shards of the “users” collection.
  • 31. Query example (join), 3 + 2 data nodes: Query: FOR u IN users ... RETURN { p, u }. Coordinator: fetches data from all shards of “purchases”, merges the results. Data nodes: fetch data from above, fetch data of shards for “purchases”, join them. Coordinator: fetches data from all shards of “users”, merges the results. Data nodes: return data of shards for “users”.
  • 32. Distributed ACID transactions
  • 33. Benefits of transactions: With transactions, complex operations on multiple data items can be executed in an all-or-nothing fashion. If something goes wrong, the database will automatically clean up partially executed operations. With transactions, the database will ensure consistency of data and protect us from anomalies, no matter whether there are other concurrent operations on the same data. Key take-away: transactions make application developers’ lives easier.
  • 34. Distributed databases with transactions: Some distributed databases also support ACID transactions or have plans to add them: Google Cloud Spanner (database as a service); CockroachDB; FoundationDB; FaunaDB (closed source); ...; MongoDB (announced for future releases, with limitations).
  • 35. Atomicity: While a distributed transaction is ongoing, it may make modifications on different nodes. These changes need to be ineffective (hidden) until the transaction actually commits. On commit, the transaction’s changes must become instantly visible on all nodes at the same time.
  • 36. Atomicity: Distributed databases normally store the status of transactions (pending, committed, aborted) in a private section of the key space, e.g. (key: value) T0: committed; T1: aborted; T2: pending. When a transaction commits, its status key is atomically updated from “pending” to “committed”.
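The status-key idea can be sketched as follows. Writes are stored on the data nodes tagged with their transaction ID, and a reader treats a write as visible only if the status entry says “committed”. The `StatusTable` class and the `read` function are hypothetical names; a real system would keep the status table replicated and update it through its consensus layer.

```python
class StatusTable:
    """Hypothetical private key space with one status entry per transaction."""

    def __init__(self):
        self._status = {}

    def begin(self, txn_id):
        self._status[txn_id] = "pending"

    def finish(self, txn_id, outcome):
        """Single-key update; this is the only step that changes visibility."""
        if outcome not in ("committed", "aborted"):
            raise ValueError(f"invalid outcome: {outcome}")
        if self._status.get(txn_id) != "pending":
            raise ValueError(f"{txn_id} is not pending")
        self._status[txn_id] = outcome

    def is_committed(self, txn_id):
        return self._status.get(txn_id) == "committed"


def read(node, key, status):
    """Return the latest committed value for key on one node, else None."""
    for txn_id, value in reversed(node.get(key, [])):
        if status.is_committed(txn_id):
            return value
    return None
```

Because visibility depends on a single status entry, flipping that entry makes the writes on all nodes visible in one step, however many nodes the transaction touched.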
  • 37. Consistency – designated leaders: Databases that provide consistency normally serialize all write operations for a key on the designated “leader” node for its shard. The state of data on the leader shard is then a consistent “source of truth” for that shard. Write operations are replicated from leaders to replicas in the same order as applied on the leader. Replicas are thus exact copies of the leader shards and can take over at any time.
  • 38. Leader-only writes: Query: put("amount", 10). Query: put("amount", 42). The leader determines the order of the operations for the same key and executes them one after the other, e.g.: 1. put("amount", 10); 2. put("amount", 42). The value changes from 10 to 42.
  • 39. Shard leadership: Shard leaders can change over time, e.g. in case of node failures or planned maintenance. It is necessary that all nodes in the cluster have the same view of who is the current leader for a specific shard, and which are the shard’s current replicas.
  • 40. Consensus protocols: The nodes in the cluster normally use a “consensus protocol” to exchange status messages. Paxos and Raft are the most commonly used consensus protocols in distributed databases. These protocols are designed to handle network partitions and node failures, and will work reliably if a majority of nodes is still available and can still exchange messages with each other.
  • 41. Ordering transactions: To ensure consistency, transactions that modify the same data must be put into an unambiguous order. Having an unambiguous global order allows having a cross-node consistent view on the data. This is hard to achieve because the transactions can start on different nodes in parallel.
  • 42. Ordering transactions using timestamps: Each transaction is assigned a timestamp when it is started. This same timestamp will be used later as the transaction’s commit timestamp. The timestamps of transactions will be used for ordering them. Rule: a transaction with a lower timestamp happened before a transaction with a higher timestamp.
  • 43. Clock skew: Timestamps created by different nodes are not reliably comparable due to clock skew. The solution to make them comparable in most cases is to define an “uncertainty interval” (which is the maximum tolerable clock skew). If the timestamp difference is outside of the “uncertainty interval”, two timestamps are safely comparable. Two timestamps with a difference inside the uncertainty interval are not comparable safely, and their relative order is unknown.
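The uncertainty-interval rule can be written down as a small comparison function. The names `Order`, `compare_timestamps` and `max_skew` are hypothetical. Treating a difference exactly equal to the interval as uncertain is an assumption of this sketch, since the slide does not specify the boundary case.

```python
from enum import Enum


class Order(Enum):
    BEFORE = "before"
    AFTER = "after"
    UNCERTAIN = "uncertain"


def compare_timestamps(ts_a, ts_b, max_skew):
    """Order two timestamps taken on different nodes.

    max_skew is the uncertainty interval (maximum tolerable clock skew),
    expressed in the same unit as the timestamps.
    """
    diff = ts_a - ts_b
    if abs(diff) <= max_skew:
        # Inside the uncertainty interval: the relative order is unknown.
        return Order.UNCERTAIN
    return Order.BEFORE if diff < 0 else Order.AFTER
```

A caller that gets `Order.UNCERTAIN` for two transactions touching the same data has to treat the pair as a potential conflict.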
  • 44. Consistency using timestamps: If two transactions could have influenced each other, this is an (actual or potential) read or write conflict, and one of the transactions must be aborted or restarted. A transaction restart also means assigning a new, higher timestamp.
  • 45. Isolation: To ensure isolation, a running transaction must not overwrite or remove data that another ongoing transaction may still see. Write operations are stored in a multi-version data structure, which can handle multiple values for the same key at the same time. Any transaction that reads or writes a key needs to find the “correct” version of it.
  • 46. Isolation – multi-versioning: (key, transaction ID, value) “amount”, T0, 10; “amount”, T1, 42; “name”, T17, “test”; “page”, T2, “index.html”; “page”, T50, <removed>. Any operation can identify whether it can “see” an operation from another transaction, simply by looking up the status and timestamp of the corresponding transaction.
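The version lookup for a multi-version store can be sketched as below. The layout mirrors the table above: a list of (key, transaction ID, value) entries plus a lookup from transaction ID to (status, timestamp). The function name, the `REMOVED` tombstone object and the exact visibility rule (committed, with a timestamp not later than the reader's) are assumptions for illustration; real systems add further rules, e.g. for a transaction reading its own writes.

```python
REMOVED = object()  # tombstone marking a deleted key


def visible_value(versions, txn_info, key, read_ts):
    """Return the newest committed value of key as of read_ts, else None.

    versions: iterable of (key, txn_id, value) entries
    txn_info: dict mapping txn_id -> (status, timestamp)
    """
    best = None
    for entry_key, txn_id, value in versions:
        status, ts = txn_info[txn_id]
        if entry_key == key and status == "committed" and ts <= read_ts:
            if best is None or ts > best[0]:
                best = (ts, value)
    if best is None or best[1] is REMOVED:
        return None
    return best[1]
```

With this rule, versions written by aborted or pending transactions are skipped, and a reader with an older timestamp keeps seeing the value that existed before a later removal.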
  • 47. Durability: To ensure durability, every write operation (and also every transaction status change) needs to be persisted on multiple nodes (leader + replicas). A commit is only considered successful if acknowledged by a configurable number of nodes.
  • 48. Database trends: In the last few years, there has been a trend towards distributed databases adopting complex query functionality and transactions. Traditional relational: complex queries, joins; transactional guarantees. “NoSQL”: highly available; scalable. “NewSQL” (insert buzzword of choice): highly available; scalable; transactional guarantees; complex queries, joins.
  • 49. Thank you very much! Any questions?
  • 50. Links / credits: Please star ArangoDB on GitHub: https://github.com/arangodb/arangodb. Participate in ArangoDB’s community survey to win a t-shirt: https://arangodb.com/community-survey/. #arangodb | [email protected]. Icons made by Freepik (www.freepik.com) from www.flaticon.com, licensed by CC 3.0 BY.