SlideShare a Scribd company logo
Solving PostgreSQL wicked problems
Alexander Korotkov
Oriole DB Inc.
2021
Alexander Korotkov Solving PostgreSQL wicked problems 1 / 40
PostgreSQL has two sides
Alexander Korotkov Solving PostgreSQL wicked problems 2 / 40
The bright side of PostgreSQL
Alexander Korotkov Solving PostgreSQL wicked problems 3 / 40
PostgreSQL – one of the most popular DBMS’es1
1
According to db-engines.com
Alexander Korotkov Solving PostgreSQL wicked problems 4 / 40
PostgreSQL – strong trend2
2
https://db-engines.com/en/ranking_trend/system/PostgreSQL
Alexander Korotkov Solving PostgreSQL wicked problems 5 / 40
PostgreSQL – most loved RDBMS3
3
According to Stackoverflow 2020 survey
Alexander Korotkov Solving PostgreSQL wicked problems 6 / 40
The dark side of PostgreSQL
Alexander Korotkov Solving PostgreSQL wicked problems 7 / 40
Cri cism of PostgreSQL (1/2)
https://eng.uber.com/postgres-to-mysql-migration/
Alexander Korotkov Solving PostgreSQL wicked problems 8 / 40
Cri cism of PostgreSQL (2/2)
https://medium.com/@rbranson/10-things-i-hate-about-postgresql-20dbab8c2791
Alexander Korotkov Solving PostgreSQL wicked problems 9 / 40
10 wicked problems of PostgreSQL
Problem name Known for Work started Resolu on
1. Wraparound 20 years 15 years ago S ll WIP
2. Failover Will Probably Lose Data 20 years 16 years ago S ll WIP
3. Inefficient Replica on That Spreads Corrup on 10 years 8 years ago S ll WIP
4. MVCC Garbage Frequently Painful 20 years 19 years ago Abandoned
5. Process-Per-Connec on = Pain at Scale 20 years 3 years ago Abandoned
6. Primary Key Index is a Space Hog 13 years — Not started
7. Major Version Upgrades Can Require Down me 21 years 16 years ago S ll WIP
8. Somewhat Cumbersome Replica on Setup 10 years 9 years ago S ll WIP
9. Ridiculous No-Planner-Hints Dogma 20 years 11 years ago Extension
10. No Block Compression 12 years 11 years ago S ll WIP
* Scalability on modern hardware
Alexander Korotkov Solving PostgreSQL wicked problems 10 / 40
The exci ng moment
▶ PostgreSQL community have proven to be brilliant on solving
non-design issues, providing fantas c product to the market.
Alexander Korotkov Solving PostgreSQL wicked problems 11 / 40
The exci ng moment
▶ PostgreSQL community have proven to be brilliant on solving
non-design issues, providing fantas c product to the market.
▶ As a result, PostgreSQL has had a strong upwards trend for many
years.
Alexander Korotkov Solving PostgreSQL wicked problems 11 / 40
The exci ng moment
▶ PostgreSQL community have proven to be brilliant on solving
non-design issues, providing fantas c product to the market.
▶ As a result, PostgreSQL has had a strong upwards trend for many
years.
▶ At the same me, the PostgreSQL community appears to be
dysfunc onal in solving design issues, a rac ng severe cri cism.
Nevertheless, cri cs not yet break the upwards trend.
Alexander Korotkov Solving PostgreSQL wicked problems 11 / 40
The exci ng moment
▶ PostgreSQL community have proven to be brilliant on solving
non-design issues, providing fantas c product to the market.
▶ As a result, PostgreSQL has had a strong upwards trend for many
years.
▶ At the same me, the PostgreSQL community appears to be
dysfunc onal in solving design issues, a rac ng severe cri cism.
Nevertheless, cri cs not yet break the upwards trend.
▶ It appears to be a unique moment for PostgreSQL redesign!
Alexander Korotkov Solving PostgreSQL wicked problems 11 / 40
How could we solve the PostgreSQL
wicked problems?
Alexander Korotkov Solving PostgreSQL wicked problems 12 / 40
Tradi onal buffer management
1
2 3
4 5 6 7
Disk
1'
2' 3'
5' 6'
Memory
Buffer
mapping
1 2 3 5 6
4 7
▶ Each page access requires lookup into buffer mapping data structure.
Alexander Korotkov Solving PostgreSQL wicked problems 13 / 40
Tradi onal buffer management
1
2 3
4 5 6 7
Disk
1'
2' 3'
5' 6'
Memory
Buffer
mapping
1 2 3 5 6
4 7
▶ Each page access requires lookup into buffer mapping data structure.
▶ Each B-tree key lookup takes mul ple buffer mapping lookups.
Alexander Korotkov Solving PostgreSQL wicked problems 13 / 40
Tradi onal buffer management
1
2 3
4 5 6 7
Disk
1'
2' 3'
5' 6'
Memory
Buffer
mapping
1 2 3 5 6
4 7
▶ Each page access requires lookup into buffer mapping data structure.
▶ Each B-tree key lookup takes mul ple buffer mapping lookups.
▶ Accessing cached data doesn’t scale on modern hardware.
Alexander Korotkov Solving PostgreSQL wicked problems 13 / 40
Solu on: Dual pointers
1
2 3
5 7
1 2 3 4 5 6 7
Disk
▶ In-memory page refers either in-memory or on-disk page.
Alexander Korotkov Solving PostgreSQL wicked problems 14 / 40
Solu on: Dual pointers
1
2 3
5 7
1 2 3 4 5 6 7
Disk
▶ In-memory page refers either in-memory or on-disk page.
▶ Accessing cached data without buffer mapping lookups.
Alexander Korotkov Solving PostgreSQL wicked problems 14 / 40
Solu on: Dual pointers
1
2 3
5 7
1 2 3 4 5 6 7
Disk
▶ In-memory page refers either in-memory or on-disk page.
▶ Accessing cached data without buffer mapping lookups.
▶ Good scalability!
Alexander Korotkov Solving PostgreSQL wicked problems 14 / 40
PostgreSQL MVCC = bloat + write-amplifica on
▶ New and old row versions shares the same heap.
Alexander Korotkov Solving PostgreSQL wicked problems 15 / 40
PostgreSQL MVCC = bloat + write-amplifica on
▶ New and old row versions shares the same heap.
▶ Non-HOT updates cause index bloat.
Alexander Korotkov Solving PostgreSQL wicked problems 15 / 40
Solu on: undo log for both pages and rows
Undo
row1 row4
row1 row2
row3 row4
page
row2v1
row1v1 row1v2
▶ Old row versions form chains in undo log.
Alexander Korotkov Solving PostgreSQL wicked problems 16 / 40
Solu on: undo log for both pages and rows
Undo
row1 row4
row1 row2
row3 row4
page
row2v1
row1v1 row1v2
▶ Old row versions form chains in undo log.
▶ Page-level chains evict deleted rows from primary storage.
Alexander Korotkov Solving PostgreSQL wicked problems 16 / 40
Solu on: undo log for both pages and rows
Undo
row1 row4
row1 row2
row3 row4
page
row2v1
row1v1 row1v2
▶ Old row versions form chains in undo log.
▶ Page-level chains evict deleted rows from primary storage.
▶ Update only indexes with changed values.
Alexander Korotkov Solving PostgreSQL wicked problems 16 / 40
Block-level WAL
Heap
Index #1 Index #2
WAL
▶ Huge WAL traffic.
Alexander Korotkov Solving PostgreSQL wicked problems 17 / 40
Block-level WAL
Heap
Index #1 Index #2
WAL
▶ Huge WAL traffic.
▶ Problems with parallel apply.
Alexander Korotkov Solving PostgreSQL wicked problems 17 / 40
Block-level WAL
Heap
Index #1 Index #2
WAL
▶ Huge WAL traffic.
▶ Problems with parallel apply.
▶ Not suitable for mul -master replica on.
Alexander Korotkov Solving PostgreSQL wicked problems 17 / 40
Solu on: row-level WAL
Heap
Index #1 Index #2
WAL
▶ Very compact.
Alexander Korotkov Solving PostgreSQL wicked problems 18 / 40
Solu on: row-level WAL
Heap
Index #1 Index #2
WAL
▶ Very compact.
▶ Apply can be parallelized.
Alexander Korotkov Solving PostgreSQL wicked problems 18 / 40
Solu on: row-level WAL
Heap
Index #1 Index #2
WAL
▶ Very compact.
▶ Apply can be parallelized.
▶ Suitable for mul master (row-level conflicts, not block-level).
Alexander Korotkov Solving PostgreSQL wicked problems 18 / 40
Solu on: row-level WAL
Heap
Index #1 Index #2
WAL
▶ Very compact.
▶ Apply can be parallelized.
▶ Suitable for mul master (row-level conflicts, not block-level).
▶ Recovery needs structurally consistent checkpoints.
Alexander Korotkov Solving PostgreSQL wicked problems 18 / 40
Row-level WAL based mul master
OrioleDB instance
Storage
WAL
OrioleDB instance
Storage
WAL
OrioleDB instance
Storage
WAL
Raft replication
Alexander Korotkov Solving PostgreSQL wicked problems 19 / 40
Copy-on-write checkpoints (1/4)
1
2 3
5 7
1 2 3 4 5 6 7
Disk
Alexander Korotkov Solving PostgreSQL wicked problems 20 / 40
Copy-on-write checkpoints (2/4)
1
2 3
5 7*
1 2 3 4 5 6 7
Disk
Alexander Korotkov Solving PostgreSQL wicked problems 21 / 40
Copy-on-write checkpoints (3/4)
1*
2 3*
5 7*
1 2 3 4 5 6 7
Disk
7* 3* 1*
Alexander Korotkov Solving PostgreSQL wicked problems 22 / 40
Copy-on-write checkpoints (4/4)
1*
2 3*
5 7*
2 4 5 6
Disk
7* 3* 1*
Alexander Korotkov Solving PostgreSQL wicked problems 23 / 40
What do we need from PostgreSQL extendability?
Backgroud processes
Backend
Connection
Parser
Rewriter
Planner
Executor
Autovacuum
Background writer
Checkpointer
WAL writer
PostgreSQL server
File system
Data files
WAL files
Log files
OrioleDB

extension
OrioleDB
data files
OrioleDB
undo files
......
File system
▶ Extended table AM.
Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
What do we need from PostgreSQL extendability?
Backgroud processes
Backend
Connection
Parser
Rewriter
Planner
Executor
Autovacuum
Background writer
Checkpointer
WAL writer
PostgreSQL server
File system
Data files
WAL files
Log files
OrioleDB

extension
OrioleDB
data files
OrioleDB
undo files
......
File system
▶ Extended table AM.
▶ Custom toast handlers.
Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
What do we need from PostgreSQL extendability?
Backgroud processes
Backend
Connection
Parser
Rewriter
Planner
Executor
Autovacuum
Background writer
Checkpointer
WAL writer
PostgreSQL server
File system
Data files
WAL files
Log files
OrioleDB

extension
OrioleDB
data files
OrioleDB
undo files
......
File system
▶ Extended table AM.
▶ Custom toast handlers.
▶ Custom row iden fiers.
Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
What do we need from PostgreSQL extendability?
Backgroud processes
Backend
Connection
Parser
Rewriter
Planner
Executor
Autovacuum
Background writer
Checkpointer
WAL writer
PostgreSQL server
File system
Data files
WAL files
Log files
OrioleDB

extension
OrioleDB
data files
OrioleDB
undo files
......
File system
▶ Extended table AM.
▶ Custom toast handlers.
▶ Custom row iden fiers.
▶ Custom error cleanup.
Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
What do we need from PostgreSQL extendability?
Backgroud processes
Backend
Connection
Parser
Rewriter
Planner
Executor
Autovacuum
Background writer
Checkpointer
WAL writer
PostgreSQL server
File system
Data files
WAL files
Log files
OrioleDB

extension
OrioleDB
data files
OrioleDB
undo files
......
File system
▶ Extended table AM.
▶ Custom toast handlers.
▶ Custom row iden fiers.
▶ Custom error cleanup.
▶ Recovery & checkpointer hooks.
Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
What do we need from PostgreSQL extendability?
Backgroud processes
Backend
Connection
Parser
Rewriter
Planner
Executor
Autovacuum
Background writer
Checkpointer
WAL writer
PostgreSQL server
File system
Data files
WAL files
Log files
OrioleDB

extension
OrioleDB
data files
OrioleDB
undo files
......
File system
▶ Extended table AM.
▶ Custom toast handlers.
▶ Custom row iden fiers.
▶ Custom error cleanup.
▶ Recovery & checkpointer hooks.
▶ Snapshot hooks.
Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
What do we need from PostgreSQL extendability?
Backgroud processes
Backend
Connection
Parser
Rewriter
Planner
Executor
Autovacuum
Background writer
Checkpointer
WAL writer
PostgreSQL server
File system
Data files
WAL files
Log files
OrioleDB

extension
OrioleDB
data files
OrioleDB
undo files
......
File system
▶ Extended table AM.
▶ Custom toast handlers.
▶ Custom row iden fiers.
▶ Custom error cleanup.
▶ Recovery & checkpointer hooks.
▶ Snapshot hooks.
▶ Some other miscellaneous hooks
total 1K lines patch to PostgreSQL
Core
Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
OrioleDB = PostgreSQL redesign
PostgreSQL
Block-level WAL Row-level WAL
Buffer mapping Direct page links
Buffer locking Lock-less access
Bloat-prone MVCC Undo log
Cumbersome
block-level WAL
replication
Raft-based
multimaster
replication of row-
level WAL
Alexander Korotkov Solving PostgreSQL wicked problems 25 / 40
OrioleDB’s answer to 10 wicked problems of PostgreSQL
Problem name Solu on
1. Wraparound Na ve 64-bit transac on ids
2. Failover Will Probably Lose Data Mul master replica on
3. Inefficient Replica on That Spreads Corrup on Row-level replica on
4. MVCC Garbage Frequently Painful Non-persistent undo log
5. Process-Per-Connec on = Pain at Scale Migra on to mul thread model
6. Primary Key Index is a Space Hog Index-organized tables
7. Major Version Upgrades Can Require Down me Mul master + per-node upgrade
8. Somewhat Cumbersome Replica on Setup Simple setup of ra -based mul master
9. Ridiculous No-Planner-Hints Dogma In-core planner hints
10. No Block Compression Block-level compression
* Scalability on modern hardware
Alexander Korotkov Solving PostgreSQL wicked problems 26 / 40
Let’s do some benchmarks! 4
4
https://gist.github.com/akorotkov/f5e98ba5805c42ee18bf945b30cc3d67
Alexander Korotkov Solving PostgreSQL wicked problems 27 / 40
OrioleDB benchmark: read-only scalability
0 50 100 150 200 250
# Clients
0
200000
400000
600000
800000
TPS
Read-only scalability test PostgreSQL vs OrioleDB
1 minute of pgbench script reading 9 random values of 100M
PostgreSQL
OrioleDB
OrioleDB: 4X higher TPS!
Alexander Korotkov Solving PostgreSQL wicked problems 28 / 40
OrioleDB benchmark: read-write scalability
in-memory case
0 50 100 150 200 250
# Clients
0
100000
200000
300000
400000
TPS
Read-write scalability test PostgreSQL vs OrioleDB
1 minute of pgbench TPC-B like transactions wrapped into stored procedure
PostgreSQL
OrioleDB
OrioleDB: 3.5X higher TPS!
Alexander Korotkov Solving PostgreSQL wicked problems 29 / 40
OrioleDB benchmark: read-write scalability
external storage case
0 250 500 750 1000 1250 1500 1750 2000
# Clients
0
20000
40000
60000
80000
100000
120000
TPS
pgbench -s 20000 -j $n -c $n -M prepared on odb-node02
mean of 3 3-minute runs with shared_buffers = 32GB(128GB), max_connections = 2500
pgsql-read-write
orioledb-read-write
orioledb-read-write-block-device
OrioleDB: up to 50X higher TPS!
Alexander Korotkov Solving PostgreSQL wicked problems 30 / 40
OrioleDB benchmark: read-write scalability
Intel Optane persistent memory
0 250 500 750 1000 1250 1500 1750 2000
# Clients
0
25000
50000
75000
100000
125000
150000
175000
200000
TPS
pgbench -s 20000 -j $n -c $n -M prepared -f read-write-proc.sql on node03
5-minute run with shared_buffers = 32GB, max_connections = 2500
pgsql
orioledb-fsdax
orioledb-devdax
OrioleDB: up to 50X higher TPS!
Alexander Korotkov Solving PostgreSQL wicked problems 31 / 40
OrioleDB benchmark: write-amplifica on & bloat test: CPU
400 600 800 1000 1200 1400 1600
seconds
0
100000
200000
300000
400000
500000
600000
700000
800000
TPS
Troughtput
PostgreSQL
OrioleDB
400 600 800 1000 1200 1400 1600
seconds
0
20
40
60
80
100
Usage,
%
CPU usage
PostgreSQL
OrioleDB
OrioleDB: 5X higher TPS! 2.3X less CPU/TPS!
Alexander Korotkov Solving PostgreSQL wicked problems 32 / 40
OrioleDB benchmark: write-amplifica on & bloat test: IO
400 600 800 1000 1200 1400 1600
seconds
0
100000
200000
300000
400000
500000
600000
700000
800000
TPS
Troughtput
PostgreSQL
OrioleDB
0 250 500 750 1000 1250 1500 1750
seconds
0
5000
10000
15000
20000
25000
30000
35000
IOPS
IO load
PostgreSQL
OrioleDB
OrioleDB: 5X higher TPS! 22X less IO/TPS!
Alexander Korotkov Solving PostgreSQL wicked problems 33 / 40
OrioleDB benchmark: write-amplifica on & bloat test: space
400 600 800 1000 1200 1400 1600
seconds
0
10
20
30
40
50
60
70
80
GB
Space used
PostgreSQL
OrioleDB
OrioleDB: no bloat!
Alexander Korotkov Solving PostgreSQL wicked problems 34 / 40
OrioleDB benchmark: taxi workload (1/3): read
0 500 1000 1500 2000 2500 3000 3500
seconds
0
25
50
75
100
125
150
175
IOPS
Disk read
PostgreSQL
OrioleDB
OrioleDB: 9X less read IOPS!
Alexander Korotkov Solving PostgreSQL wicked problems 35 / 40
OrioleDB benchmark: taxi workload (2/3): write
0 500 1000 1500 2000 2500 3000 3500
seconds
0
50
100
150
200
250
300
350
IOPS
Disk write
PostgreSQL
OrioleDB
OrioleDB: 4.5X less write IOPS!
Alexander Korotkov Solving PostgreSQL wicked problems 36 / 40
OrioleDB benchmark: taxi workload (3/3): space
0 500 1000 1500 2000 2500 3000 3500
seconds
0
5
10
15
20
25
30
35
40
GB
Space used
PostgreSQL
OrioleDB
OrioleDB: 8X less space usage!
Alexander Korotkov Solving PostgreSQL wicked problems 37 / 40
OrioleDB = Solu on of wicked PostgreSQL
problems + extraordinary performance
Alexander Korotkov Solving PostgreSQL wicked problems 38 / 40
Roadmap
▶ Basic engine features 4
▶ Table AM interface implementa on 4
▶ Data compression 4
▶ Undo log 4
▶ TOAST support 4
▶ Parallel row-level replica on 4
▶ Par al and expression indexes 4
Ini al release
▶ GiST/GIN analogues
Alexander Korotkov Solving PostgreSQL wicked problems 39 / 40
OrioleDB status
▶ Release is scheduled for December 1st 2021;
▶ https://github.com/orioledb/orioledb;
▶ If you need more explana on, don’t hesitate to make pull requests.
Alexander Korotkov Solving PostgreSQL wicked problems 40 / 40

More Related Content

What's hot (20)

The Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication Tutorial
Jean-François Gagné
 
Building an open data platform with apache iceberg
Building an open data platform with apache iceberg
Alluxio, Inc.
 
MyRocks Deep Dive
MyRocks Deep Dive
Yoshinori Matsunobu
 
Introduction to MongoDB
Introduction to MongoDB
Mike Dirolf
 
Cassandra Introduction & Features
Cassandra Introduction & Features
DataStax Academy
 
MySQL GTID 시작하기
MySQL GTID 시작하기
I Goo Lee
 
Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기
NeoClova
 
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
PgDay.Seoul
 
My first 90 days with ClickHouse.pdf
My first 90 days with ClickHouse.pdf
Alkin Tezuysal
 
Introduction to memcached
Introduction to memcached
Jurriaan Persyn
 
Highly efficient backups with percona xtrabackup
Highly efficient backups with percona xtrabackup
Nilnandan Joshi
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
Ryan Blue
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
Alluxio, Inc.
 
MySQL_MariaDB-성능개선-202201.pptx
MySQL_MariaDB-성능개선-202201.pptx
NeoClova
 
PostgreSQL Database Slides
PostgreSQL Database Slides
metsarin
 
Introduction to Redis
Introduction to Redis
Dvir Volk
 
Using ClickHouse for Experimentation
Using ClickHouse for Experimentation
Gleb Kanterov
 
Introduction to Storm
Introduction to Storm
Chandler Huang
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performance
PostgreSQL-Consulting
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
Zalando Technology
 
The Full MySQL and MariaDB Parallel Replication Tutorial
The Full MySQL and MariaDB Parallel Replication Tutorial
Jean-François Gagné
 
Building an open data platform with apache iceberg
Building an open data platform with apache iceberg
Alluxio, Inc.
 
Introduction to MongoDB
Introduction to MongoDB
Mike Dirolf
 
Cassandra Introduction & Features
Cassandra Introduction & Features
DataStax Academy
 
MySQL GTID 시작하기
MySQL GTID 시작하기
I Goo Lee
 
Maria db 이중화구성_고민하기
Maria db 이중화구성_고민하기
NeoClova
 
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
[pgday.Seoul 2022] PostgreSQL구조 - 윤성재
PgDay.Seoul
 
My first 90 days with ClickHouse.pdf
My first 90 days with ClickHouse.pdf
Alkin Tezuysal
 
Introduction to memcached
Introduction to memcached
Jurriaan Persyn
 
Highly efficient backups with percona xtrabackup
Highly efficient backups with percona xtrabackup
Nilnandan Joshi
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
Ryan Blue
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
Alluxio, Inc.
 
MySQL_MariaDB-성능개선-202201.pptx
MySQL_MariaDB-성능개선-202201.pptx
NeoClova
 
PostgreSQL Database Slides
PostgreSQL Database Slides
metsarin
 
Introduction to Redis
Introduction to Redis
Dvir Volk
 
Using ClickHouse for Experimentation
Using ClickHouse for Experimentation
Gleb Kanterov
 
Introduction to Storm
Introduction to Storm
Chandler Huang
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performance
PostgreSQL-Consulting
 
High Availability PostgreSQL with Zalando Patroni
High Availability PostgreSQL with Zalando Patroni
Zalando Technology
 

Similar to Solving PostgreSQL wicked problems (20)

Our answer to Uber
Our answer to Uber
Alexander Korotkov
 
Наш ответ Uber’у
Наш ответ Uber’у
IT Event
 
9.6_Course Material-Postgresql_002.pdf
9.6_Course Material-Postgresql_002.pdf
sreedb2
 
a look at the postgresql engine
a look at the postgresql engine
Federico Campoli
 
Building a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management Application
Jonathan Katz
 
Migrating To PostgreSQL
Migrating To PostgreSQL
Grant Fritchey
 
Postgre sql best_practices
Postgre sql best_practices
Jacques Kostic
 
TechEvent PostgreSQL Best Practices
TechEvent PostgreSQL Best Practices
Trivadis
 
Postgre sql best_practices
Postgre sql best_practices
Emiliano Fusaglia
 
The Accidental DBA
The Accidental DBA
PostgreSQL Experts, Inc.
 
String Comparison Surprises: Did Postgres lose my data?
String Comparison Surprises: Did Postgres lose my data?
Jeremy Schneider
 
PostgreSQL - Case Study
PostgreSQL - Case Study
S.Shayan Daneshvar
 
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
Equnix Business Solutions
 
Lessons PostgreSQL learned from commercial databases, and didn’t
Lessons PostgreSQL learned from commercial databases, and didn’t
PGConf APAC
 
PerlApp2Postgresql (2)
PerlApp2Postgresql (2)
Jerome Eteve
 
Major features postgres 11
Major features postgres 11
EDB
 
Oracle to Postgres Schema Migration Hustle
Oracle to Postgres Schema Migration Hustle
EDB
 
PostgreSQL Enterprise Class Features and Capabilities
PostgreSQL Enterprise Class Features and Capabilities
PGConf APAC
 
The hitchhiker's guide to PostgreSQL
The hitchhiker's guide to PostgreSQL
Federico Campoli
 
12 in 12 – A closer look at twelve or so new things in Postgres 12
12 in 12 – A closer look at twelve or so new things in Postgres 12
BasilBourque1
 
Наш ответ Uber’у
Наш ответ Uber’у
IT Event
 
9.6_Course Material-Postgresql_002.pdf
9.6_Course Material-Postgresql_002.pdf
sreedb2
 
a look at the postgresql engine
a look at the postgresql engine
Federico Campoli
 
Building a Complex, Real-Time Data Management Application
Building a Complex, Real-Time Data Management Application
Jonathan Katz
 
Migrating To PostgreSQL
Migrating To PostgreSQL
Grant Fritchey
 
Postgre sql best_practices
Postgre sql best_practices
Jacques Kostic
 
TechEvent PostgreSQL Best Practices
TechEvent PostgreSQL Best Practices
Trivadis
 
String Comparison Surprises: Did Postgres lose my data?
String Comparison Surprises: Did Postgres lose my data?
Jeremy Schneider
 
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
PGConf.ASIA 2019 Bali - Keynote Speech 2 - Ivan Pachenko
Equnix Business Solutions
 
Lessons PostgreSQL learned from commercial databases, and didn’t
Lessons PostgreSQL learned from commercial databases, and didn’t
PGConf APAC
 
PerlApp2Postgresql (2)
PerlApp2Postgresql (2)
Jerome Eteve
 
Major features postgres 11
Major features postgres 11
EDB
 
Oracle to Postgres Schema Migration Hustle
Oracle to Postgres Schema Migration Hustle
EDB
 
PostgreSQL Enterprise Class Features and Capabilities
PostgreSQL Enterprise Class Features and Capabilities
PGConf APAC
 
The hitchhiker's guide to PostgreSQL
The hitchhiker's guide to PostgreSQL
Federico Campoli
 
12 in 12 – A closer look at twelve or so new things in Postgres 12
12 in 12 – A closer look at twelve or so new things in Postgres 12
BasilBourque1
 
Ad

More from Alexander Korotkov (6)

In-memory OLTP storage with persistence and transaction support
In-memory OLTP storage with persistence and transaction support
Alexander Korotkov
 
Oh, that ubiquitous JSON !
Oh, that ubiquitous JSON !
Alexander Korotkov
 
Jsquery - the jsonb query language with GIN indexing support
Jsquery - the jsonb query language with GIN indexing support
Alexander Korotkov
 
The future is CSN
The future is CSN
Alexander Korotkov
 
Open Source SQL databases enter millions queries per second era
Open Source SQL databases enter millions queries per second era
Alexander Korotkov
 
Использование специальных типов данных PostgreSQL в ORM Doctrine
Использование специальных типов данных PostgreSQL в ORM Doctrine
Alexander Korotkov
 
In-memory OLTP storage with persistence and transaction support
In-memory OLTP storage with persistence and transaction support
Alexander Korotkov
 
Jsquery - the jsonb query language with GIN indexing support
Jsquery - the jsonb query language with GIN indexing support
Alexander Korotkov
 
Open Source SQL databases enter millions queries per second era
Open Source SQL databases enter millions queries per second era
Alexander Korotkov
 
Использование специальных типов данных PostgreSQL в ORM Doctrine
Использование специальных типов данных PostgreSQL в ORM Doctrine
Alexander Korotkov
 
Ad

Recently uploaded (20)

Migrating to Azure Cosmos DB the Right Way
Migrating to Azure Cosmos DB the Right Way
Alexander (Alex) Komyagin
 
SAP PM Module Level-IV Training Complete.ppt
SAP PM Module Level-IV Training Complete.ppt
MuhammadShaheryar36
 
Code and No-Code Journeys: The Coverage Overlook
Code and No-Code Journeys: The Coverage Overlook
Applitools
 
Software Testing & it’s types (DevOps)
Software Testing & it’s types (DevOps)
S Pranav (Deepu)
 
Integrating Survey123 and R&H Data Using FME
Integrating Survey123 and R&H Data Using FME
Safe Software
 
Software Engineering Process, Notation & Tools Introduction - Part 4
Software Engineering Process, Notation & Tools Introduction - Part 4
Gaurav Sharma
 
Milwaukee Marketo User Group June 2025 - Optimize and Enhance Efficiency - Sm...
Milwaukee Marketo User Group June 2025 - Optimize and Enhance Efficiency - Sm...
BradBedford3
 
MOVIE RECOMMENDATION SYSTEM, UDUMULA GOPI REDDY, Y24MC13085.pptx
MOVIE RECOMMENDATION SYSTEM, UDUMULA GOPI REDDY, Y24MC13085.pptx
Maharshi Mallela
 
Reimagining Software Development and DevOps with Agentic AI
Reimagining Software Development and DevOps with Agentic AI
Maxim Salnikov
 
Porting Qt 5 QML Modules to Qt 6 Webinar
Porting Qt 5 QML Modules to Qt 6 Webinar
ICS
 
Meet You in the Middle: 1000x Performance for Parquet Queries on PB-Scale Dat...
Meet You in the Middle: 1000x Performance for Parquet Queries on PB-Scale Dat...
Alluxio, Inc.
 
wAIred_RabobankIgniteSession_12062025.pptx
wAIred_RabobankIgniteSession_12062025.pptx
SimonedeGijt
 
Software Engineering Process, Notation & Tools Introduction - Part 3
Software Engineering Process, Notation & Tools Introduction - Part 3
Gaurav Sharma
 
FME as an Orchestration Tool - Peak of Data & AI 2025
FME as an Orchestration Tool - Peak of Data & AI 2025
Safe Software
 
OpenTelemetry 101 Cloud Native Barcelona
OpenTelemetry 101 Cloud Native Barcelona
Imma Valls Bernaus
 
Agile Software Engineering Methodologies
Agile Software Engineering Methodologies
Gaurav Sharma
 
IBM Rational Unified Process For Software Engineering - Introduction
IBM Rational Unified Process For Software Engineering - Introduction
Gaurav Sharma
 
Who will create the languages of the future?
Who will create the languages of the future?
Jordi Cabot
 
AI-Powered Compliance Solutions for Global Regulations | Certivo
AI-Powered Compliance Solutions for Global Regulations | Certivo
certivoai
 
What is data visualization and how data visualization tool can help.pdf
What is data visualization and how data visualization tool can help.pdf
Varsha Nayak
 
SAP PM Module Level-IV Training Complete.ppt
SAP PM Module Level-IV Training Complete.ppt
MuhammadShaheryar36
 
Code and No-Code Journeys: The Coverage Overlook
Code and No-Code Journeys: The Coverage Overlook
Applitools
 
Software Testing & it’s types (DevOps)
Software Testing & it’s types (DevOps)
S Pranav (Deepu)
 
Integrating Survey123 and R&H Data Using FME
Integrating Survey123 and R&H Data Using FME
Safe Software
 
Software Engineering Process, Notation & Tools Introduction - Part 4
Software Engineering Process, Notation & Tools Introduction - Part 4
Gaurav Sharma
 
Milwaukee Marketo User Group June 2025 - Optimize and Enhance Efficiency - Sm...
Milwaukee Marketo User Group June 2025 - Optimize and Enhance Efficiency - Sm...
BradBedford3
 
MOVIE RECOMMENDATION SYSTEM, UDUMULA GOPI REDDY, Y24MC13085.pptx
MOVIE RECOMMENDATION SYSTEM, UDUMULA GOPI REDDY, Y24MC13085.pptx
Maharshi Mallela
 
Reimagining Software Development and DevOps with Agentic AI
Reimagining Software Development and DevOps with Agentic AI
Maxim Salnikov
 
Porting Qt 5 QML Modules to Qt 6 Webinar
Porting Qt 5 QML Modules to Qt 6 Webinar
ICS
 
Meet You in the Middle: 1000x Performance for Parquet Queries on PB-Scale Dat...
Meet You in the Middle: 1000x Performance for Parquet Queries on PB-Scale Dat...
Alluxio, Inc.
 
wAIred_RabobankIgniteSession_12062025.pptx
wAIred_RabobankIgniteSession_12062025.pptx
SimonedeGijt
 
Software Engineering Process, Notation & Tools Introduction - Part 3
Software Engineering Process, Notation & Tools Introduction - Part 3
Gaurav Sharma
 
FME as an Orchestration Tool - Peak of Data & AI 2025
FME as an Orchestration Tool - Peak of Data & AI 2025
Safe Software
 
OpenTelemetry 101 Cloud Native Barcelona
OpenTelemetry 101 Cloud Native Barcelona
Imma Valls Bernaus
 
Agile Software Engineering Methodologies
Agile Software Engineering Methodologies
Gaurav Sharma
 
IBM Rational Unified Process For Software Engineering - Introduction
IBM Rational Unified Process For Software Engineering - Introduction
Gaurav Sharma
 
Who will create the languages of the future?
Who will create the languages of the future?
Jordi Cabot
 
AI-Powered Compliance Solutions for Global Regulations | Certivo
AI-Powered Compliance Solutions for Global Regulations | Certivo
certivoai
 
What is data visualization and how data visualization tool can help.pdf
What is data visualization and how data visualization tool can help.pdf
Varsha Nayak
 

Solving PostgreSQL wicked problems

  • 1. Solving PostgreSQL wicked problems Alexander Korotkov Oriole DB Inc. 2021 Alexander Korotkov Solving PostgreSQL wicked problems 1 / 40
  • 2. PostgreSQL has two sides Alexander Korotkov Solving PostgreSQL wicked problems 2 / 40
  • 3. The bright side of PostgreSQL Alexander Korotkov Solving PostgreSQL wicked problems 3 / 40
  • 4. PostgreSQL – one of the most popular DBMS’es1 1 According to db-engines.com Alexander Korotkov Solving PostgreSQL wicked problems 4 / 40
  • 5. PostgreSQL – strong trend2 2 https://db-engines.com/en/ranking_trend/system/PostgreSQL Alexander Korotkov Solving PostgreSQL wicked problems 5 / 40
  • 6. PostgreSQL – most loved RDBMS3 3 According to Stackoverflow 2020 survey Alexander Korotkov Solving PostgreSQL wicked problems 6 / 40
  • 7. The dark side of PostgreSQL Alexander Korotkov Solving PostgreSQL wicked problems 7 / 40
  • 8. Cri cism of PostgreSQL (1/2) https://eng.uber.com/postgres-to-mysql-migration/ Alexander Korotkov Solving PostgreSQL wicked problems 8 / 40
  • 9. Cri cism of PostgreSQL (2/2) https://medium.com/@rbranson/10-things-i-hate-about-postgresql-20dbab8c2791 Alexander Korotkov Solving PostgreSQL wicked problems 9 / 40
  • 10. 10 wicked problems of PostgreSQL Problem name Known for Work started Resolu on 1. Wraparound 20 years 15 years ago S ll WIP 2. Failover Will Probably Lose Data 20 years 16 years ago S ll WIP 3. Inefficient Replica on That Spreads Corrup on 10 years 8 years ago S ll WIP 4. MVCC Garbage Frequently Painful 20 years 19 years ago Abandoned 5. Process-Per-Connec on = Pain at Scale 20 years 3 years ago Abandoned 6. Primary Key Index is a Space Hog 13 years — Not started 7. Major Version Upgrades Can Require Down me 21 years 16 years ago S ll WIP 8. Somewhat Cumbersome Replica on Setup 10 years 9 years ago S ll WIP 9. Ridiculous No-Planner-Hints Dogma 20 years 11 years ago Extension 10. No Block Compression 12 years 11 years ago S ll WIP * Scalability on modern hardware Alexander Korotkov Solving PostgreSQL wicked problems 10 / 40
  • 11. The exci ng moment ▶ PostgreSQL community have proven to be brilliant on solving non-design issues, providing fantas c product to the market. Alexander Korotkov Solving PostgreSQL wicked problems 11 / 40
  • 12. The exci ng moment ▶ PostgreSQL community have proven to be brilliant on solving non-design issues, providing fantas c product to the market. ▶ As a result, PostgreSQL has had a strong upwards trend for many years. Alexander Korotkov Solving PostgreSQL wicked problems 11 / 40
  • 13. The exci ng moment ▶ PostgreSQL community have proven to be brilliant on solving non-design issues, providing fantas c product to the market. ▶ As a result, PostgreSQL has had a strong upwards trend for many years. ▶ At the same me, the PostgreSQL community appears to be dysfunc onal in solving design issues, a rac ng severe cri cism. Nevertheless, cri cs not yet break the upwards trend. Alexander Korotkov Solving PostgreSQL wicked problems 11 / 40
  • 14. The exci ng moment ▶ PostgreSQL community have proven to be brilliant on solving non-design issues, providing fantas c product to the market. ▶ As a result, PostgreSQL has had a strong upwards trend for many years. ▶ At the same me, the PostgreSQL community appears to be dysfunc onal in solving design issues, a rac ng severe cri cism. Nevertheless, cri cs not yet break the upwards trend. ▶ It appears to be a unique moment for PostgreSQL redesign! Alexander Korotkov Solving PostgreSQL wicked problems 11 / 40
  • 15. How could we solve the PostgreSQL wicked problems? Alexander Korotkov Solving PostgreSQL wicked problems 12 / 40
  • 16. Tradi onal buffer management 1 2 3 4 5 6 7 Disk 1' 2' 3' 5' 6' Memory Buffer mapping 1 2 3 5 6 4 7 ▶ Each page access requires lookup into buffer mapping data structure. Alexander Korotkov Solving PostgreSQL wicked problems 13 / 40
  • 17. Tradi onal buffer management 1 2 3 4 5 6 7 Disk 1' 2' 3' 5' 6' Memory Buffer mapping 1 2 3 5 6 4 7 ▶ Each page access requires lookup into buffer mapping data structure. ▶ Each B-tree key lookup takes mul ple buffer mapping lookups. Alexander Korotkov Solving PostgreSQL wicked problems 13 / 40
  • 18. Tradi onal buffer management 1 2 3 4 5 6 7 Disk 1' 2' 3' 5' 6' Memory Buffer mapping 1 2 3 5 6 4 7 ▶ Each page access requires lookup into buffer mapping data structure. ▶ Each B-tree key lookup takes mul ple buffer mapping lookups. ▶ Accessing cached data doesn’t scale on modern hardware. Alexander Korotkov Solving PostgreSQL wicked problems 13 / 40
  • 19. Solu on: Dual pointers 1 2 3 5 7 1 2 3 4 5 6 7 Disk ▶ In-memory page refers either in-memory or on-disk page. Alexander Korotkov Solving PostgreSQL wicked problems 14 / 40
  • 20. Solu on: Dual pointers 1 2 3 5 7 1 2 3 4 5 6 7 Disk ▶ In-memory page refers either in-memory or on-disk page. ▶ Accessing cached data without buffer mapping lookups. Alexander Korotkov Solving PostgreSQL wicked problems 14 / 40
  • 21. Solu on: Dual pointers 1 2 3 5 7 1 2 3 4 5 6 7 Disk ▶ In-memory page refers either in-memory or on-disk page. ▶ Accessing cached data without buffer mapping lookups. ▶ Good scalability! Alexander Korotkov Solving PostgreSQL wicked problems 14 / 40
  • 22. PostgreSQL MVCC = bloat + write-amplifica on ▶ New and old row versions shares the same heap. Alexander Korotkov Solving PostgreSQL wicked problems 15 / 40
  • 23. PostgreSQL MVCC = bloat + write-amplifica on ▶ New and old row versions shares the same heap. ▶ Non-HOT updates cause index bloat. Alexander Korotkov Solving PostgreSQL wicked problems 15 / 40
  • 24. Solu on: undo log for both pages and rows Undo row1 row4 row1 row2 row3 row4 page row2v1 row1v1 row1v2 ▶ Old row versions form chains in undo log. Alexander Korotkov Solving PostgreSQL wicked problems 16 / 40
  • 25. Solu on: undo log for both pages and rows Undo row1 row4 row1 row2 row3 row4 page row2v1 row1v1 row1v2 ▶ Old row versions form chains in undo log. ▶ Page-level chains evict deleted rows from primary storage. Alexander Korotkov Solving PostgreSQL wicked problems 16 / 40
  • 26. Solu on: undo log for both pages and rows Undo row1 row4 row1 row2 row3 row4 page row2v1 row1v1 row1v2 ▶ Old row versions form chains in undo log. ▶ Page-level chains evict deleted rows from primary storage. ▶ Update only indexes with changed values. Alexander Korotkov Solving PostgreSQL wicked problems 16 / 40
  • 27. Block-level WAL Heap Index #1 Index #2 WAL ▶ Huge WAL traffic. Alexander Korotkov Solving PostgreSQL wicked problems 17 / 40
  • 28. Block-level WAL Heap Index #1 Index #2 WAL ▶ Huge WAL traffic. ▶ Problems with parallel apply. Alexander Korotkov Solving PostgreSQL wicked problems 17 / 40
  • 29. Block-level WAL Heap Index #1 Index #2 WAL ▶ Huge WAL traffic. ▶ Problems with parallel apply. ▶ Not suitable for mul -master replica on. Alexander Korotkov Solving PostgreSQL wicked problems 17 / 40
  • 30. Solu on: row-level WAL Heap Index #1 Index #2 WAL ▶ Very compact. Alexander Korotkov Solving PostgreSQL wicked problems 18 / 40
  • 31. Solu on: row-level WAL Heap Index #1 Index #2 WAL ▶ Very compact. ▶ Apply can be parallelized. Alexander Korotkov Solving PostgreSQL wicked problems 18 / 40
  • 32. Solu on: row-level WAL Heap Index #1 Index #2 WAL ▶ Very compact. ▶ Apply can be parallelized. ▶ Suitable for mul master (row-level conflicts, not block-level). Alexander Korotkov Solving PostgreSQL wicked problems 18 / 40
  • 33. Solu on: row-level WAL Heap Index #1 Index #2 WAL ▶ Very compact. ▶ Apply can be parallelized. ▶ Suitable for mul master (row-level conflicts, not block-level). ▶ Recovery needs structurally consistent checkpoints. Alexander Korotkov Solving PostgreSQL wicked problems 18 / 40
  • 34. Row-level WAL based mul master OrioleDB instance Storage WAL OrioleDB instance Storage WAL OrioleDB instance Storage WAL Raft replication Alexander Korotkov Solving PostgreSQL wicked problems 19 / 40
  • 35. Copy-on-write checkpoints (1/4) 1 2 3 5 7 1 2 3 4 5 6 7 Disk Alexander Korotkov Solving PostgreSQL wicked problems 20 / 40
  • 36. Copy-on-write checkpoints (2/4) 1 2 3 5 7* 1 2 3 4 5 6 7 Disk Alexander Korotkov Solving PostgreSQL wicked problems 21 / 40
  • 37. Copy-on-write checkpoints (3/4) 1* 2 3* 5 7* 1 2 3 4 5 6 7 Disk 7* 3* 1* Alexander Korotkov Solving PostgreSQL wicked problems 22 / 40
  • 38. Copy-on-write checkpoints (4/4) 1* 2 3* 5 7* 2 4 5 6 Disk 7* 3* 1* Alexander Korotkov Solving PostgreSQL wicked problems 23 / 40
  • 39. What do we need from PostgreSQL extendability? Backgroud processes Backend Connection Parser Rewriter Planner Executor Autovacuum Background writer Checkpointer WAL writer PostgreSQL server File system Data files WAL files Log files OrioleDB extension OrioleDB data files OrioleDB undo files ...... File system ▶ Extended table AM. Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
  • 40. What do we need from PostgreSQL extendability? Backgroud processes Backend Connection Parser Rewriter Planner Executor Autovacuum Background writer Checkpointer WAL writer PostgreSQL server File system Data files WAL files Log files OrioleDB extension OrioleDB data files OrioleDB undo files ...... File system ▶ Extended table AM. ▶ Custom toast handlers. Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
  • 41. What do we need from PostgreSQL extendability? Backgroud processes Backend Connection Parser Rewriter Planner Executor Autovacuum Background writer Checkpointer WAL writer PostgreSQL server File system Data files WAL files Log files OrioleDB extension OrioleDB data files OrioleDB undo files ...... File system ▶ Extended table AM. ▶ Custom toast handlers. ▶ Custom row iden fiers. Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
  • 42. What do we need from PostgreSQL extendability? Backgroud processes Backend Connection Parser Rewriter Planner Executor Autovacuum Background writer Checkpointer WAL writer PostgreSQL server File system Data files WAL files Log files OrioleDB extension OrioleDB data files OrioleDB undo files ...... File system ▶ Extended table AM. ▶ Custom toast handlers. ▶ Custom row iden fiers. ▶ Custom error cleanup. Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
  • 43. What do we need from PostgreSQL extendability? Backgroud processes Backend Connection Parser Rewriter Planner Executor Autovacuum Background writer Checkpointer WAL writer PostgreSQL server File system Data files WAL files Log files OrioleDB extension OrioleDB data files OrioleDB undo files ...... File system ▶ Extended table AM. ▶ Custom toast handlers. ▶ Custom row iden fiers. ▶ Custom error cleanup. ▶ Recovery & checkpointer hooks. Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
  • 44. What do we need from PostgreSQL extendability? Backgroud processes Backend Connection Parser Rewriter Planner Executor Autovacuum Background writer Checkpointer WAL writer PostgreSQL server File system Data files WAL files Log files OrioleDB extension OrioleDB data files OrioleDB undo files ...... File system ▶ Extended table AM. ▶ Custom toast handlers. ▶ Custom row iden fiers. ▶ Custom error cleanup. ▶ Recovery & checkpointer hooks. ▶ Snapshot hooks. Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
  • 45. What do we need from PostgreSQL extendability? Backgroud processes Backend Connection Parser Rewriter Planner Executor Autovacuum Background writer Checkpointer WAL writer PostgreSQL server File system Data files WAL files Log files OrioleDB extension OrioleDB data files OrioleDB undo files ...... File system ▶ Extended table AM. ▶ Custom toast handlers. ▶ Custom row iden fiers. ▶ Custom error cleanup. ▶ Recovery & checkpointer hooks. ▶ Snapshot hooks. ▶ Some other miscellaneous hooks total 1K lines patch to PostgreSQL Core Alexander Korotkov Solving PostgreSQL wicked problems 24 / 40
  • 46. OrioleDB = PostgreSQL redesign PostgreSQL Block-level WAL Row-level WAL Buffer mapping Direct page links Buffer locking Lock-less access Bloat-prone MVCC Undo log Cumbersome block-level WAL replication Raft-based multimaster replication of row- level WAL Alexander Korotkov Solving PostgreSQL wicked problems 25 / 40
  • 47. OrioleDB’s answer to 10 wicked problems of PostgreSQL Problem name Solu on 1. Wraparound Na ve 64-bit transac on ids 2. Failover Will Probably Lose Data Mul master replica on 3. Inefficient Replica on That Spreads Corrup on Row-level replica on 4. MVCC Garbage Frequently Painful Non-persistent undo log 5. Process-Per-Connec on = Pain at Scale Migra on to mul thread model 6. Primary Key Index is a Space Hog Index-organized tables 7. Major Version Upgrades Can Require Down me Mul master + per-node upgrade 8. Somewhat Cumbersome Replica on Setup Simple setup of ra -based mul master 9. Ridiculous No-Planner-Hints Dogma In-core planner hints 10. No Block Compression Block-level compression * Scalability on modern hardware Alexander Korotkov Solving PostgreSQL wicked problems 26 / 40
  • 48. Let’s do some benchmarks! 4 4 https://gist.github.com/akorotkov/f5e98ba5805c42ee18bf945b30cc3d67 Alexander Korotkov Solving PostgreSQL wicked problems 27 / 40
  • 49. OrioleDB benchmark: read-only scalability 0 50 100 150 200 250 # Clients 0 200000 400000 600000 800000 TPS Read-only scalability test PostgreSQL vs OrioleDB 1 minute of pgbench script reading 9 random values of 100M PostgreSQL OrioleDB OrioleDB: 4X higher TPS! Alexander Korotkov Solving PostgreSQL wicked problems 28 / 40
  • 50. OrioleDB benchmark: read-write scalability in-memory case 0 50 100 150 200 250 # Clients 0 100000 200000 300000 400000 TPS Read-write scalability test PostgreSQL vs OrioleDB 1 minute of pgbench TPC-B like transactions wrapped into stored procedure PostgreSQL OrioleDB OrioleDB: 3.5X higher TPS! Alexander Korotkov Solving PostgreSQL wicked problems 29 / 40
  • 51. OrioleDB benchmark: read-write scalability external storage case 0 250 500 750 1000 1250 1500 1750 2000 # Clients 0 20000 40000 60000 80000 100000 120000 TPS pgbench -s 20000 -j $n -c $n -M prepared on odb-node02 mean of 3 3-minute runs with shared_buffers = 32GB(128GB), max_connections = 2500 pgsql-read-write orioledb-read-write orioledb-read-write-block-device OrioleDB: up to 50X higher TPS! Alexander Korotkov Solving PostgreSQL wicked problems 30 / 40
  • 52. OrioleDB benchmark: read-write scalability Intel Optane persistent memory 0 250 500 750 1000 1250 1500 1750 2000 # Clients 0 25000 50000 75000 100000 125000 150000 175000 200000 TPS pgbench -s 20000 -j $n -c $n -M prepared -f read-write-proc.sql on node03 5-minute run with shared_buffers = 32GB, max_connections = 2500 pgsql orioledb-fsdax orioledb-devdax OrioleDB: up to 50X higher TPS! Alexander Korotkov Solving PostgreSQL wicked problems 31 / 40
  • 53. OrioleDB benchmark: write-amplifica on & bloat test: CPU 400 600 800 1000 1200 1400 1600 seconds 0 100000 200000 300000 400000 500000 600000 700000 800000 TPS Troughtput PostgreSQL OrioleDB 400 600 800 1000 1200 1400 1600 seconds 0 20 40 60 80 100 Usage, % CPU usage PostgreSQL OrioleDB OrioleDB: 5X higher TPS! 2.3X less CPU/TPS! Alexander Korotkov Solving PostgreSQL wicked problems 32 / 40
  • 54. OrioleDB benchmark: write-amplifica on & bloat test: IO 400 600 800 1000 1200 1400 1600 seconds 0 100000 200000 300000 400000 500000 600000 700000 800000 TPS Troughtput PostgreSQL OrioleDB 0 250 500 750 1000 1250 1500 1750 seconds 0 5000 10000 15000 20000 25000 30000 35000 IOPS IO load PostgreSQL OrioleDB OrioleDB: 5X higher TPS! 22X less IO/TPS! Alexander Korotkov Solving PostgreSQL wicked problems 33 / 40
  • 55. OrioleDB benchmark: write-amplifica on & bloat test: space 400 600 800 1000 1200 1400 1600 seconds 0 10 20 30 40 50 60 70 80 GB Space used PostgreSQL OrioleDB OrioleDB: no bloat! Alexander Korotkov Solving PostgreSQL wicked problems 34 / 40
  • 56. OrioleDB benchmark: taxi workload (1/3): read 0 500 1000 1500 2000 2500 3000 3500 seconds 0 25 50 75 100 125 150 175 IOPS Disk read PostgreSQL OrioleDB OrioleDB: 9X less read IOPS! Alexander Korotkov Solving PostgreSQL wicked problems 35 / 40
  • 57. OrioleDB benchmark: taxi workload (2/3): write 0 500 1000 1500 2000 2500 3000 3500 seconds 0 50 100 150 200 250 300 350 IOPS Disk write PostgreSQL OrioleDB OrioleDB: 4.5X less write IOPS! Alexander Korotkov Solving PostgreSQL wicked problems 36 / 40
  • 58. OrioleDB benchmark: taxi workload (3/3): space 0 500 1000 1500 2000 2500 3000 3500 seconds 0 5 10 15 20 25 30 35 40 GB Space used PostgreSQL OrioleDB OrioleDB: 8X less space usage! Alexander Korotkov Solving PostgreSQL wicked problems 37 / 40
  • 59. OrioleDB = Solu on of wicked PostgreSQL problems + extraordinary performance Alexander Korotkov Solving PostgreSQL wicked problems 38 / 40
  • 60. Roadmap ▶ Basic engine features 4 ▶ Table AM interface implementa on 4 ▶ Data compression 4 ▶ Undo log 4 ▶ TOAST support 4 ▶ Parallel row-level replica on 4 ▶ Par al and expression indexes 4 Ini al release ▶ GiST/GIN analogues Alexander Korotkov Solving PostgreSQL wicked problems 39 / 40
  • 61. OrioleDB status ▶ Release is scheduled for December 1st 2021; ▶ https://github.com/orioledb/orioledb; ▶ If you need more explana on, don’t hesitate to make pull requests. Alexander Korotkov Solving PostgreSQL wicked problems 40 / 40