Citus 5.0
Extending PostgreSQL to Build a Distributed Database
Ozgun Erdogan, on behalf of the Citus Data team
  
Talk Outline
1.  Introduction
2.  Citus 5.0 and its use of extension APIs
3.  Distributed query planning
4.  Different distributed executors for different workloads
•  Three technical lightning talks in one
  
What is Citus?
•  Citus extends PostgreSQL (it is not a fork) to provide it with distributed functionality.
•  Citus scales out Postgres across servers using sharding and replication. Its query engine parallelizes SQL queries across many servers.
•  Citus 5.0 is open source: https://github.com/citusdata/citus
  
Citus 5.0 Architecture Diagram
[Figure: the Citus coordinator (PostgreSQL + Citus extension) holds the distributed table Events as metadata. Citus workers 1 through N (each PostgreSQL + Citus extension) store the shards as regular tables (1 shard = 1 Postgres table): worker 1 holds E1 and E3’, worker 2 holds E2 and E1’, and worker N holds E3 and E2’.]
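
The placement pattern in the diagram (each shard on one worker, with a primed copy on another) can be reproduced with a simple round-robin rule. This is a minimal sketch, not Citus's placement code: the function name `place_shards`, the worker names, and the replication factor of 2 are all assumptions made for illustration.

```python
from typing import Dict, List


def place_shards(shards: List[str], workers: List[str],
                 replication_factor: int = 2) -> Dict[str, List[str]]:
    """Assign each shard to `replication_factor` distinct workers, round-robin."""
    if replication_factor > len(workers):
        raise ValueError("replication factor cannot exceed the number of workers")
    placements: Dict[str, List[str]] = {}
    for i, shard in enumerate(shards):
        # The first worker holds the shard, the following ones hold its copies.
        placements[shard] = [workers[(i + r) % len(workers)]
                             for r in range(replication_factor)]
    return placements


# Hypothetical three-worker cluster mirroring the diagram's E1, E2, E3.
placements = place_shards(["E1", "E2", "E3"], ["worker1", "worker2", "workerN"])
for shard, nodes in placements.items():
    print(shard, "->", nodes)
```

With these inputs, worker1 ends up with E1 and the copy of E3, worker2 with E2 and the copy of E1, and workerN with E3 and the copy of E2, which matches the layout drawn on the slide.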
  
When is Citus a good fit?
•  Scaling a multi-tenant (B2B) database to 100K+ tenants
•  Sub-second OLAP queries on data as it arrives
•  Powering real-time analytic dashboards
•  Exploratory queries on events as they arrive
•  Who is using Citus?
    •  CloudFlare uses Citus to power their analytic dashboards
    •  Neustar builds ad-tech infrastructure with HyperLogLog
    •  Heap powers funnel, segmentation, and cohort queries
  
SQL, Scaling Out, and What’s Unique About PostgreSQL?
  
“SQL doesn’t scale”
1.  Scaling out is hard. Scaling data, compared to scaling computations, is even harder.
2.  SQL means different things to different people: transactional workloads, short reads/writes, real-time analytics, data warehousing, or triggers.
3.  SQL doesn’t have the notion of “distribution” built into the language. It can be added, but it is not there in SQL itself.
  
Query Languages: An Example
  
SQL Routing / Replication
•  Simple INSERT routing and replication
1.  Parse the plain-text SQL query
2.  Check column values and types against the table schema
3.  Apply optimizations, such as constant folding
4.  Determine that “billgates” is the distribution key value
5.  Only then can you route and replicate the INSERT
•  What about my SELECT queries?
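
The last two INSERT steps above (find the distribution key value, then route and replicate) can be sketched as follows. This is an illustration under stated assumptions, not Citus's implementation: the INSERT is assumed to be already parsed and type-checked into a dict, and the names `SHARD_PLACEMENTS`, `shard_for_key`, `route_insert`, the `username` column, and the CRC32-modulo hash are all hypothetical.

```python
import zlib
from typing import Dict, List, Tuple

# Hypothetical cluster metadata: shard id -> workers holding a copy of that shard.
SHARD_PLACEMENTS: Dict[int, List[str]] = {
    0: ["worker1", "worker2"],
    1: ["worker2", "worker3"],
    2: ["worker3", "worker1"],
}


def shard_for_key(key: str) -> int:
    """Map a distribution key value to a shard id with a stable hash."""
    # zlib.crc32 is used because Python's built-in hash() of str varies between runs.
    return zlib.crc32(key.encode("utf-8")) % len(SHARD_PLACEMENTS)


def route_insert(table: str, row: Dict[str, str],
                 distribution_column: str) -> List[Tuple[str, str]]:
    """Return (worker, shard table) targets for one already-parsed INSERT."""
    if distribution_column not in row:
        raise ValueError(f"INSERT must supply the distribution column {distribution_column!r}")
    shard_id = shard_for_key(row[distribution_column])
    shard_table = f"{table}_{shard_id}"
    # Replication: the same row goes to every placement of the chosen shard.
    return [(worker, shard_table) for worker in SHARD_PLACEMENTS[shard_id]]


targets = route_insert("events", {"username": "billgates", "page": "/home"}, "username")
for worker, shard_table in targets:
    print(f"send INSERT for {shard_table} to {worker}")
```

The routing function only works on a parsed row, which is the slide's point: the key value is not known until the query has been parsed and checked.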
  
Takeaway
When you’re scaling out a SQL query, your “query distribution” logic needs to work together with the part that understands the query.
  
How to overcome this?
1.  Application-level sharding
2.  Build a distributed database from scratch
3.  Extend the core for an agreed-upon use case
    •  Multi-master for replication and HA; partitioning
    •  Build middleware for an open source database
4.  Fork an open source database
  
	
  
PostgreSQL Extension APIs
•  CREATE EXTENSION citus;
•  Metadata stored in Postgres tables
•  User-defined functions to extend SQL syntax
•  Hooks: planner, executor, and utility hooks
    •  Similar to interceptors in Java frameworks
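
The interceptor analogy can be made concrete with a small model of the hook pattern: a global slot that is empty by default, which an extension fills after saving the previous occupant so that calls can be chained. The sketch below is a Python model of that pattern only. The names `planner_hook`, `standard_planner`, `install_extension`, and `DISTRIBUTED_TABLES` are chosen for illustration, and the table check is a deliberately crude substring match.

```python
from typing import Callable, Optional

Planner = Callable[[str], str]


def standard_planner(query: str) -> str:
    """Stand-in for the database's built-in planner."""
    return f"local plan for: {query}"


# Global hook slot; None means no extension has installed a hook.
planner_hook: Optional[Planner] = None


def plan(query: str) -> str:
    """Entry point: call the hook if one is installed, else the built-in planner."""
    if planner_hook is not None:
        return planner_hook(query)
    return standard_planner(query)


DISTRIBUTED_TABLES = {"events"}  # hypothetical metadata


def install_extension() -> None:
    """Install a planner hook, keeping the previous one so calls can be chained."""
    global planner_hook
    previous = planner_hook

    def distributed_planner(query: str) -> str:
        # Crude check for illustration: treat any mention of a distributed table as a hit.
        if any(table in query for table in DISTRIBUTED_TABLES):
            return f"distributed plan for: {query}"
        # Not ours: defer to the earlier hook, or to the built-in planner.
        return previous(query) if previous is not None else standard_planner(query)

    planner_hook = distributed_planner


before = plan("SELECT count(*) FROM events")
install_extension()
after = plan("SELECT count(*) FROM events")
print(before)
print(after)
print(plan("SELECT 1"))
```

Queries that do not touch a distributed table fall through to the built-in planner unchanged, which is what lets an extension coexist with regular tables.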
  
Citus Planner Example
[Figure: diagram labeled “Citus”]
  
Summary
•  PostgreSQL’s extensible architecture puts it in a unique place to scale out SQL and also adapt to evolving hardware trends.
•  It could just be that the monolithic SQL database is dying. If so, long live Postgres!
  
Why is distributed query planning (SELECTs) hard?
  	
  	
  
Past Experiences
•  Built a similar distributed data processing engine at Amazon called CSPIT
•  Led by a visionary architect and built by an extremely talented team
•  Scaled to (at best) a dozen machines. Nicely distributed basic computations across machines
•  Then the dream met reality
  
Why did it fail?
•  You can solve all distributed systems problems in one of two ways:
1.  Bring your data to the computation
2.  Push your computation to the data
  
Bringing data to computation (1)

Bringing computation to data (2)
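
The difference between the two approaches shows up in how many values cross the network. The sketch below computes the same total both ways and counts what each one ships to the coordinator. The worker names, the prices, and both function names are made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical per-worker price columns.
WORKERS: Dict[str, List[float]] = {
    "worker1": [10.0, 20.0, 30.0],
    "worker2": [5.0, 15.0],
    "worker3": [40.0],
}


def sum_by_moving_data(workers: Dict[str, List[float]]) -> Tuple[float, int]:
    """Approach 1: ship every row to the coordinator, then compute there."""
    shipped = [price for rows in workers.values() for price in rows]
    return sum(shipped), len(shipped)


def sum_by_pushing_computation(workers: Dict[str, List[float]]) -> Tuple[float, int]:
    """Approach 2: each worker computes a partial sum; only partials are shipped."""
    partials = [sum(rows) for rows in workers.values()]
    return sum(partials), len(partials)


print(sum_by_moving_data(WORKERS))          # total and number of values shipped
print(sum_by_pushing_computation(WORKERS))  # same total, one value per worker
```

Both calls return the same total, but the first ships one value per row and the second ships one value per worker, so the gap grows with table size.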
  
Slightly more complex queries
•  Sum(price): sum(price) on worker nodes and then sum() intermediate results
•  Avg(price): Can you avg(price) on worker nodes and then avg() intermediate results?
•  Why not?
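
A small numeric check of the two aggregates above. Averaging the per-worker averages weights every shard equally regardless of its row count, so it goes wrong as soon as shards differ in size. Rewriting avg as sum divided by count gives partial results that can be combined safely. The shard contents and function names below are illustrative assumptions.

```python
from typing import List, Tuple

# Two hypothetical shards of very different sizes.
shards: List[List[float]] = [[10.0, 20.0, 30.0], [100.0]]


def naive_avg_of_avgs(shards: List[List[float]]) -> float:
    """Incorrect in general: treats each shard's average as equally weighted."""
    per_shard = [sum(rows) / len(rows) for rows in shards]
    return sum(per_shard) / len(per_shard)


def worker_partial(rows: List[float]) -> Tuple[float, int]:
    """What each worker returns instead of an average: (sum, count)."""
    return sum(rows), len(rows)


def distributed_avg(shards: List[List[float]]) -> float:
    """Combine per-worker (sum, count) pairs, then divide once on the coordinator."""
    partials = [worker_partial(rows) for rows in shards]
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count


print(naive_avg_of_avgs(shards))  # 60.0, which is wrong
print(distributed_avg(shards))    # 40.0, the true average of the four prices
```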
  
Commutative Computations
•  If you can transform your computations into their commutative form, then you can push them down.
•  (a + b = b + a ; a / b ≠ b / a) (*)
•  Associative and distributive properties for other operations (we also knew about this)
  
How does this help me?
•  Commutative, associative, and distributive properties hold for any query language
•  We pick SQL as an example language
•  SQL uses Relational Algebra to express a query
•  If a query has a WHERE clause in it, that’s a FILTER node in the relational algebra tree
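
To illustrate how a FILTER node can be moved within such a tree, here is a minimal sketch of one rewrite rule: a filter sitting above a node that gathers rows from the workers is pushed below it, so the workers filter before anything is shipped. The node classes and the `Collect` operator are hypothetical names for this sketch and are not taken from the Citus source.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Table:
    name: str


@dataclass(frozen=True)
class Filter:
    predicate: str
    child: "Node"


@dataclass(frozen=True)
class Collect:
    """Hypothetical operator: gather rows from the workers onto the coordinator."""
    child: "Node"


Node = Union[Table, Filter, Collect]


def push_down_filters(node: Node) -> Node:
    """Rewrite Filter(Collect(x)) into Collect(Filter(x)) wherever it occurs."""
    if isinstance(node, Filter):
        child = push_down_filters(node.child)
        if isinstance(child, Collect):
            # Filtering commutes with collecting: filter on the workers instead.
            return Collect(push_down_filters(Filter(node.predicate, child.child)))
        return Filter(node.predicate, child)
    if isinstance(node, Collect):
        return Collect(push_down_filters(node.child))
    return node


unoptimized = Filter("price > 100", Collect(Table("orders")))
optimized = push_down_filters(unoptimized)
print(optimized)
```

The rewrite is valid because filtering each worker's rows and then collecting yields the same rows as collecting everything and then filtering.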
  
Simple SQL query

Distributed Logical Plan (unoptimized)

Distributed Logical Plan (optimized)
  
Takeaway
In the land of distributed systems, the commutative (and distributive) property is king! Transform your queries with respect to the king, and they will scale!
  
One example doesn’t make a proof
•  Can you prove this model is complete?
•  Relational Algebra has 10 operators
•  What about optimizing more complex plans with joins, subselects, and other constructs?
  
Multi-Relational Algebra
•  Correctness of Query Execution Strategies in Distributed Databases, Ceri and Pelagatti, 1983
•  A Distributed Database paper from a more civilized age
•  Models each relational algebra operator as a distributed operator and extends it
  
Commutative Property Rules

Distributive Property Rules

Factorization Rules
  
Two important notes (1)
Logical plan ≠ Physical plan
•  “Join” is a logical operator. HashJoin or MergeJoin is a physical operator.
•  It’s easier to reason about logical operators’ mathematical properties than those of physical operators.
•  Distributed databases that start from a “database” usually extend physical operators. (Greenplum, Redshift)
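
The logical/physical distinction can be seen in a few lines: one logical operation (an equi-join on a key) with two physical implementations that must agree on the result. The sketch below is illustrative only. The function names and sample rows are assumptions, and both functions are textbook versions rather than PostgreSQL's HashJoin or MergeJoin.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str]
Joined = Tuple[int, str, str]


def hash_join(left: List[Row], right: List[Row]) -> List[Joined]:
    """Physical operator 1: build a hash table on the left input, probe with the right."""
    table: Dict[int, List[str]] = {}
    for key, value in left:
        table.setdefault(key, []).append(value)
    return [(key, lval, rval) for key, rval in right for lval in table.get(key, [])]


def merge_join(left: List[Row], right: List[Row]) -> List[Joined]:
    """Physical operator 2: sort both inputs, then advance two cursors in step."""
    ls, rs = sorted(left), sorted(right)
    out: List[Joined] = []
    i = j = 0
    while i < len(ls) and j < len(rs):
        lkey, rkey = ls[i][0], rs[j][0]
        if lkey < rkey:
            i += 1
        elif lkey > rkey:
            j += 1
        else:
            # Find the run of equal keys on each side and emit their cross product.
            i_end, j_end = i, j
            while i_end < len(ls) and ls[i_end][0] == lkey:
                i_end += 1
            while j_end < len(rs) and rs[j_end][0] == lkey:
                j_end += 1
            out.extend((lkey, lval, rval)
                       for _, lval in ls[i:i_end] for _, rval in rs[j:j_end])
            i, j = i_end, j_end
    return out


users = [(1, "ann"), (2, "bob"), (3, "cy")]
orders = [(2, "book"), (1, "pen"), (2, "lamp"), (4, "desk")]
print(sorted(hash_join(users, orders)))
print(sorted(merge_join(users, orders)))
```

Properties such as commutativity are statements about the logical join. The two physical operators differ in cost and output order but have to produce the same set of rows.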
  
	
  
Two important notes (2)
Multi-relational Algebra offers a complete foundation for distributing SQL queries.
•  Citus is adding more SQL functionality with each release.
•  From a use-case standpoint, think of Citus not as a replacement for your data warehouse, but as extending it with real-time capabilities.
  
Summary
•  To scale out, you need to transform your computations into their commutative and distributive form.
•  Correctness of Query Execution Strategies in Distributed Databases (1983) offers a framework to do this for relational algebra.
  
Distributed Query Execution across Different Workloads
  
Different Workloads
1.  Simple Insert / Update / Delete / Select commands
    •  High throughput and low latency
2.  Real-time Select queries that get parallelized to hundreds of shards (<300ms)
3.  Long-running Select queries that join large tables
    •  You can’t restart a Select query just because one task (or one machine) in 1M tasks failed
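
The point about long-running queries can be sketched as per-task retry: when one task fails, only that task is reassigned, and the results of finished tasks are kept. This is a simplified illustration, not the Citus executor. It assumes every task's shard is also available on the other workers, and `run_tasks`, `flaky_run`, and the worker names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple


def run_tasks(tasks: List[int], run_on: Callable[[int, str], int],
              workers: List[str]) -> Dict[int, int]:
    """Run every task, retrying a failed task on another worker instead of restarting."""
    results: Dict[int, int] = {}
    for task in tasks:
        last_error: Optional[Exception] = None
        for attempt in range(len(workers)):
            worker = workers[(task + attempt) % len(workers)]
            try:
                results[task] = run_on(task, worker)
                break  # this task is done; earlier results are untouched
            except RuntimeError as error:
                last_error = error
        else:
            raise RuntimeError(f"task {task} failed on every worker") from last_error
    return results


failed_attempts: List[Tuple[int, str]] = []


def flaky_run(task: int, worker: str) -> int:
    """Hypothetical task runner in which worker2 is down."""
    if worker == "worker2":
        failed_attempts.append((task, worker))
        raise RuntimeError("worker2 is down")
    return task * task


results = run_tasks(list(range(6)), flaky_run, ["worker1", "worker2", "worker3"])
print(results)
print(failed_attempts)
```

Only the two tasks first assigned to the failed worker are rerun, and the other four complete once.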
  
	
  
	
  
Different Executors
1.  Router Executor: Simple Insert / Update / Delete / Select commands
2.  Real-time Executor: Real-time Select queries that touch 100s of shards (<300ms)
3.  Task-tracker Executor: Longer running queries that need to scale out to 10K-1M tasks
  
	
  
	
  
Conclusions
•  Distributed relational databases are hard
•  PostgreSQL and its extension APIs are unique
•  Citus targets real-time data ingest and querying
•  Citus 5.0 is open source: https://github.com/citusdata/citus
  
Questions
https://citusdata.com
Forums: groups.google.com/forum/#!forum/citus-users
  