Spring Conference, April 26th-27th 2018
Dynamic Earth, Edinburgh
Mike Fowler mlfowler
mike dot fowler at claranet dot uk gh-mlfowler
Mike Fowler, Senior Site Reliability Engineer
Migrating PostgreSQL to the Cloud
About Me
Associate: SA, Developer
Specialist: Big Data
Data Engineer
Cloud Architect
● Senior Site Reliability Engineer in the Public Cloud Practice of claranet
● Background in Software Engineering, Systems Engineering, System & Database Administration
● Been using PostgreSQL since 7.4
● Contributed to YAWL, PostgreSQL & Terraform open source projects
● XMLEXISTS/xpath_exists()
● xml_is_well_formed()
● Regular speaker at PGDay UK
Overview
● Hosted PostgreSQL
● Overview of public cloud hosting options
● Database migration strategies
– Dump & Restore
– Replication failover
– Amazon’s Database Migration Service
– PITR + Logical decoding
What is hosted PostgreSQL?
Your database somewhere else
A managed service
Some providers offer full DBA support
Cloud providers give only the infrastructure
Typically provisioned through an API or GUI
i.e. a self-service environment
Benefits of Hosted PostgreSQL
Reduces adoption costs
Installation & configuration is already done
Generally sane defaults, some tuning often required
Needn’t worry about physical servers
Opex instead of Capex
Most routine DBA tasks are done for you
Easier to grow
Drawbacks of Hosted PostgreSQL
Less control
Latency
Some features are disabled
Migrating existing databases is hard
Potential for vendor lock-in
Resource limits
Fallacies of Hosted PostgreSQL
Automatic scaling
It will fix your slow queries/reports/application
You don’t need a DBA
– A DBA still needs to plan for scale
– A DBA still needs to help address slowness
Hosting Options
We’ll look only at Public Cloud offerings
Amazon Relational Database Service (RDS)
Amazon Aurora
Heroku
Google Cloud SQL
Microsoft Azure
Amazon RDS
PostgreSQL 9.3.12 – 10.3 supported
Numerous instance types
Costs range from $0.018 to $7.96 per hour
Select from 1 vCPU up to 64 vCPUs, all 64-bit
Memory ranges from 1GB to 488GB
Flexible storage options
Choose between SSD or Provisioned IOPS
Up to 16TB with up to 40,000 IOPS
Amazon Aurora
6 instance types costing from $0.29 to $9.28 per hour
Compatible with PostgreSQL 9.6
Up to 2x throughput of conventional PostgreSQL
Up to 16 read replicas with sub-10ms replica lag
Auto-growing filesystem up to 64TB
Filesystem is shared between 3 availability zones
Announced: Multi-master & serverless
Amazon Aurora
https://aws.amazon.com/blogs/aws/amazon-aurora-update-postgresql-compatibility/
Tatsuo Ishii
Heroku
Supports PostgreSQL 9.4, 9.5, 9.6 & 10
Simpler pricing based on choice of tier ($0-8.5k per calendar month)
Tier dictates resource limits
Maximum number of rows (Hobby only)
Cache size (1GB - 240GB)
Storage limit (64GB - 1TB)
Connection limit (120 - 500)
Google Cloud SQL
Became GA April 18th
Only supports PostgreSQL 9.6
Similar replication options to RDS
Poised to be a serious rival to RDS
Billing per minute
Automatic scaling of filesystem
Similar variety of instance types
Microsoft Azure
Became GA April 18th, same day as GCP
Supports PostgreSQL 9.5.7 & 9.6.2
Replication is seamless
Automated failover
PITR
Selectable compute units
Roughly equates to maximum connections
99.99% availability vs 99.95% for others
Encryption
At Rest (AES-256)
Enabled by default for Google & Azure
User-enabled at creation time for Amazon (RDS/Aurora)
Only available in premium plans for Heroku
In transit (SSL)
Enforced on Heroku & Azure
Can be enforced on Google & Amazon
Extensions
Providers dictate the availability of extensions
Well documented except for Aurora
Only step required is to run CREATE EXTENSION
Many extensions are widely available
pgcrypto
PostGIS
pg_stat_statements
hstore
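As a rough illustration of that single step, the sketch below generates the CREATE EXTENSION statements for the extensions listed on this slide. The helper name `create_extension_statements` is hypothetical, and the lower-case extension names (for example `postgis` for PostGIS) are assumptions; check the provider's supported list before relying on them.

```python
import re

# Extension names as they would appear in pg_available_extensions.
# The lower-case spellings are assumptions made for illustration.
WANTED_EXTENSIONS = ["pgcrypto", "postgis", "pg_stat_statements", "hstore"]

_SIMPLE_NAME = re.compile(r"^[a-z_][a-z0-9_]*$")


def create_extension_statements(names):
    """Return one idempotent CREATE EXTENSION statement per extension name."""
    statements = []
    for name in names:
        # Only accept plain identifiers so the name can be used unquoted.
        if not _SIMPLE_NAME.match(name):
            raise ValueError(f"unexpected extension name: {name!r}")
        statements.append(f"CREATE EXTENSION IF NOT EXISTS {name};")
    return statements


if __name__ == "__main__":
    for statement in create_extension_statements(WANTED_EXTENSIONS):
        print(statement)
```

The statements would then be run against the hosted database by a role the provider allows to create extensions.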
Migration Strategies
● Dump & Restore
● Replication failover
● Amazon’s Database Migration Service
● PITR + Logical decoding
Dump & Restore
Simplest strategy
Perceived as low risk for data loss
Fewer “moving parts”
Just a pg_dump & pg_restore
Downtime is a function of database size
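A minimal sketch of the dump and restore pair, written as Python helpers that only build the command lines. The host names, user, database and file names are hypothetical placeholders, and the choice of the custom dump format with a parallel restore is an assumption for illustration, not something the slide prescribes.

```python
import shlex


def pg_dump_command(host, user, dbname, dump_file):
    """Build a pg_dump command that writes a custom-format archive."""
    return [
        "pg_dump",
        "--host", host,
        "--username", user,
        "--dbname", dbname,
        "--format", "custom",   # custom format allows a parallel pg_restore
        "--file", dump_file,
    ]


def pg_restore_command(host, user, dbname, dump_file, jobs=4):
    """Build a pg_restore command that loads the archive with several jobs."""
    return [
        "pg_restore",
        "--host", host,
        "--username", user,
        "--dbname", dbname,
        "--jobs", str(jobs),    # number of parallel restore workers
        "--no-owner",           # hosted services rarely have the same roles
        dump_file,
    ]


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    dump = pg_dump_command("onprem.example.internal", "postgres", "appdb", "appdb.dump")
    restore = pg_restore_command("cloud.example.com", "postgres", "appdb", "appdb.dump")
    for command in (dump, restore):
        print(" ".join(shlex.quote(part) for part in command))
```

Writes must be stopped on the source before the dump starts, which is why the downtime grows with the size of the database.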
Strategies to Minimise Downtime
Move historic data ahead of time
Opportunity to clear out unused data
Consider introducing partitions
Consider moving the dump closer to the target
e.g. upload to an EC2 instance in the same region as the RDS instance and run pg_restore from there
Over-provision resources
Gives higher throughput during data load
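To make the effect of moving historic data ahead of time concrete, here is a back-of-the-envelope sketch. The sizes and the sustained throughput are invented numbers, and the linear model (load time equals size divided by throughput) is a simplifying assumption that ignores index builds and network variance.

```python
def estimated_load_seconds(size_gb, throughput_mb_per_s):
    """Estimate load time in seconds for size_gb at a sustained throughput."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb * 1024 / throughput_mb_per_s


if __name__ == "__main__":
    total_gb = 500        # hypothetical total database size
    historic_gb = 400     # hypothetical data that can be moved before cutover
    throughput = 50       # hypothetical sustained load rate in MB/s

    everything = estimated_load_seconds(total_gb, throughput)
    cutover_only = estimated_load_seconds(total_gb - historic_gb, throughput)
    print(f"load everything at cutover: {everything / 3600:.1f} hours")
    print(f"load only live data at cutover: {cutover_only / 3600:.1f} hours")
```

Over-provisioning the target raises the throughput term, while moving historic data early shrinks the size term; both shorten the cutover window.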
Replication Failover
No one supports external masters!
Trigger based replication failover
Slony, Londiste & Bucardo
Can be used on almost any version of PostgreSQL
Some restrictions apply
DDL is not supported
Rows must be uniquely identifiable
Presents some risk to production environment
AWS Database Migration Service
Web service supporting migration to/from AWS
Can be RDS or an EC2 instance
External DB can be anywhere
Perform a one-time load or a continual load
Change Data Capture
“Supports” heterogeneous migrations
Oracle, MySQL, SQL Server, MongoDB
Schema Conversion Tool exists to assist
PITR & Logical decoding
Most involved approach, least downtime
Combines point-in-time recovery with the changes captured by logical decoding to create a replica
Need to be running at least PostgreSQL 9.4 with wal_level set to logical and have WAL archiving configured
DDL not supported, rows must still be uniquely identifiable
Using barman for managing WAL
http://www.pgbarman.org/
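The prerequisites above can be checked before starting. The sketch below takes a dictionary of settings, such as one built from `SHOW` output on the source server, and lists what is missing. The function name is hypothetical, and the `max_replication_slots` check is an addition not stated on the slide: a logical slot cannot be created unless at least one slot is allowed.

```python
def pitr_logical_prerequisite_problems(settings):
    """Return a list of unmet prerequisites; an empty list means ready."""
    problems = []
    # server_version_num is 90400 for PostgreSQL 9.4.0.
    if int(settings.get("server_version_num", 0)) < 90400:
        problems.append("PostgreSQL 9.4 or later is required for logical decoding")
    if settings.get("wal_level") != "logical":
        problems.append("wal_level must be 'logical'")
    if settings.get("archive_mode") not in ("on", "always"):
        problems.append("archive_mode must be enabled for WAL archiving")
    # Assumption beyond the slide: a slot needs max_replication_slots >= 1.
    if int(settings.get("max_replication_slots", 0)) < 1:
        problems.append("max_replication_slots must be at least 1")
    return problems


if __name__ == "__main__":
    # Hypothetical values for a source server that is not ready yet.
    source = {
        "server_version_num": "90605",
        "wal_level": "hot_standby",
        "archive_mode": "on",
        "max_replication_slots": "0",
    }
    for problem in pitr_logical_prerequisite_problems(source):
        print(problem)
```

Changing `wal_level` requires a server restart, so this is worth checking well before the migration window.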
PITR & Logical decoding
1. Create a logical replication slot
SELECT * FROM pg_create_logical_replication_slot('logical_slot', 'decoder_raw');
2. Note the transaction ID (catalog_xmin)
SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'logical_slot';
PITR & Logical decoding
3. Perform a barman backup
$ barman backup master
4. Perform a barman PITR
$ barman recover --target-xid (catalog_xmin - 1) master latest
5. Start database and verify correct recovery
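The recovery target in step 4 is the transaction just before the slot's catalog_xmin. A small sketch of that arithmetic and of the resulting command line follows. The helper names are hypothetical, the destination directory is an assumption (barman recover expects one, but the slide does not show it), and transaction ID wraparound is deliberately ignored.

```python
def recovery_target_xid(catalog_xmin):
    """Return the last transaction ID that precedes the logical slot."""
    if catalog_xmin <= 1:
        raise ValueError("catalog_xmin must be greater than 1")
    # Everything from catalog_xmin onwards will come from logical decoding.
    return catalog_xmin - 1


def barman_recover_command(server, backup_id, destination, catalog_xmin):
    """Build: barman recover --target-xid XID SERVER BACKUP_ID DESTINATION."""
    return [
        "barman", "recover",
        "--target-xid", str(recovery_target_xid(catalog_xmin)),
        server,
        backup_id,
        destination,  # hypothetical directory, not shown on the slide
    ]


if __name__ == "__main__":
    # 1234 stands in for the catalog_xmin noted in step 2.
    print(" ".join(barman_recover_command("master", "latest", "/var/lib/pgsql/recovered", 1234)))
```

Recovering to this point gives a copy of the database whose contents end where the replication slot's stream of changes begins.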
PITR & Logical decoding
6. Perform pg_dump on the read-only barman node
7. Restore to public cloud
8. Read output of logical decoding and write to cloud
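The final step can be sketched as a small replay loop. It assumes the output plugin emits one SQL statement per change in the `data` column, which is how decoder_raw is generally described. The function name and the two injected callables are hypothetical; the database wiring is shown only as a comment and would need a driver such as psycopg2.

```python
# Query that reads (and consumes) pending changes from the slot.
GET_CHANGES_SQL = (
    "SELECT lsn, xid, data FROM pg_logical_slot_get_changes(%s, NULL, NULL)"
)


def replay_changes(fetch_changes, apply_statement):
    """Apply each decoded change in order and return how many were applied.

    fetch_changes:   callable returning an iterable of (lsn, xid, data) rows
    apply_statement: callable that executes one SQL statement on the target
    """
    applied = 0
    for _lsn, _xid, data in fetch_changes():
        statement = data.strip()
        if not statement:
            continue  # skip empty rows rather than sending them to the target
        apply_statement(statement)
        applied += 1
    return applied


# Hypothetical wiring with psycopg2 (not executed here):
#   source = psycopg2.connect(source_dsn); target = psycopg2.connect(target_dsn)
#   def fetch():
#       with source.cursor() as cur:
#           cur.execute(GET_CHANGES_SQL, ("logical_slot",))
#           return cur.fetchall()
#   with target.cursor() as cur:
#       replay_changes(fetch, cur.execute)
#   target.commit()
```

Because `pg_logical_slot_get_changes` consumes what it returns, a cautious variant would read with `pg_logical_slot_peek_changes` first and only consume once the target has committed.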
Summary
Hosted PostgreSQL gives you high performance PostgreSQL without the hassle of hardware, maintenance and configuration
Opex instead of Capex
Consider the limitations of your intended platform
There are multiple options for migration
Questions?
Mike Fowler
gh-mlfowler
mlfowler
mike dot fowler at claranet dot uk
We’re hiring!
If you’re interested in helping us do what we do email
hr at uk dot clara dot net
Priyanka Aash
 
WebdriverIO & JavaScript: The Perfect Duo for Web Automation
WebdriverIO & JavaScript: The Perfect Duo for Web Automation
digitaljignect
 
Cyber Defense Matrix Workshop - RSA Conference
Cyber Defense Matrix Workshop - RSA Conference
Priyanka Aash
 
Techniques for Automatic Device Identification and Network Assignment.pdf
Techniques for Automatic Device Identification and Network Assignment.pdf
Priyanka Aash
 
Cracking the Code - Unveiling Synergies Between Open Source Security and AI.pdf
Cracking the Code - Unveiling Synergies Between Open Source Security and AI.pdf
Priyanka Aash
 
Tech-ASan: Two-stage check for Address Sanitizer - Yixuan Cao.pdf
Tech-ASan: Two-stage check for Address Sanitizer - Yixuan Cao.pdf
caoyixuan2019
 
OpenACC and Open Hackathons Monthly Highlights June 2025
OpenACC and Open Hackathons Monthly Highlights June 2025
OpenACC
 
The Future of Product Management in AI ERA.pdf
The Future of Product Management in AI ERA.pdf
Alyona Owens
 
AI Agents and FME: A How-to Guide on Generating Synthetic Metadata
AI Agents and FME: A How-to Guide on Generating Synthetic Metadata
Safe Software
 
Security Tips for Enterprise Azure Solutions
Security Tips for Enterprise Azure Solutions
Michele Leroux Bustamante
 
“MPU+: A Transformative Solution for Next-Gen AI at the Edge,” a Presentation...
“MPU+: A Transformative Solution for Next-Gen AI at the Edge,” a Presentation...
Edge AI and Vision Alliance
 
Securing AI - There Is No Try, Only Do!.pdf
Securing AI - There Is No Try, Only Do!.pdf
Priyanka Aash
 
AI vs Human Writing: Can You Tell the Difference?
AI vs Human Writing: Can You Tell the Difference?
Shashi Sathyanarayana, Ph.D
 
Quantum AI: Where Impossible Becomes Probable
Quantum AI: Where Impossible Becomes Probable
Saikat Basu
 

Migrating PostgreSQL to the Cloud

  • 1. Spring Conference, April 26th-27th 2018 Dynamic Earth, Edinburgh Mike Fowler mlfowler mike dot fowler at claranet dot uk gh-mlfowler Mike Fowler, Senior Site Reliability Engineer Migrating PostgreSQL to the Cloud
  • 31. About Me
      Associate: SA, Developer; Specialist: Big Data; Data Engineer; Cloud Architect
      ● Senior Site Reliability Engineer in the Public Cloud Practice of claranet
      ● Background in Software Engineering, Systems Engineering, System & Database Administration
      ● Been using PostgreSQL since 7.4
      ● Contributed to YAWL, PostgreSQL & Terraform open source projects
        – XMLEXISTS/xpath_exists()
        – xml_is_well_formed()
      ● Regular speaker at PGDay UK
  • 33. Overview
      ● Hosted PostgreSQL
      ● Overview of public cloud hosting options
      ● Database migration strategies
        – Dump & Restore
        – Replication failover
        – Amazon’s Database Migration Service
        – PITR + Logical decoding
  • 34. What is hosted PostgreSQL?
      ● Your database somewhere else
      ● A managed service
        – Some providers offer full DBA support
        – Cloud providers give only the infrastructure
      ● Typically provisioned through an API or GUI
        – i.e. a self-service environment
  • 35. Benefits of Hosted PostgreSQL
      ● Reduces adoption costs
      ● Installation & configuration is already done
      ● Generally sane defaults, some tuning often required
      ● Needn’t worry about physical servers
      ● Opex instead of Capex
      ● Most routine DBA tasks are done for you
      ● Easier to grow
  • 36. Drawbacks of Hosted PostgreSQL
      ● Less control
      ● Latency
      ● Some features are disabled
      ● Migrating existing databases is hard
      ● Potential for vendor lock-in
      ● Resource limits
  • 37. Fallacies of Hosted PostgreSQL
      ● Automatic scaling
      ● It will fix your slow queries/reports/application
      ● You don’t need a DBA
        – They need to be planning for scale
        – They need to help address slowness
  • 38. Hosting Options
      We’ll look only at Public Cloud offerings
      ● Amazon Relational Database Service (RDS)
      ● Amazon Aurora
      ● Heroku
      ● Google Cloud SQL
      ● Microsoft Azure
  • 39. Amazon RDS
      ● PostgreSQL 9.3.12 – 10.3 supported
      ● Numerous instance types
        – Costs range from $0.018 to $7.96 per hour
        – Select from 1 vCPU up to 64 vCPUs, all 64-bit
        – Memory ranges from 1GB to 488GB
      ● Flexible storage options
        – Choose between SSD or Provisioned IOPS
        – Up to 16TB with up to 40,000 IOPS
  • 40. Amazon Aurora
      ● 6 instance types costing from $0.29 to $9.28 per hour
      ● Compatible with PostgreSQL 9.6
      ● Up to 2x throughput of conventional PostgreSQL
      ● Up to 16 read replicas with sub-10ms replica lag
      ● Auto-growing filesystem up to 64TB
      ● Filesystem is shared between 3 availability zones
      ● Announced: Multi-master & serverless
  • 41. Amazon Aurora
      https://aws.amazon.com/blogs/aws/amazon-aurora-update-postgresql-compatibility/
      Tatsuo Ishii
  • 42. Heroku
      ● Supports PostgreSQL 9.4, 9.5, 9.6 & 10
      ● Simpler pricing based on choice of tier ($0-8.5k pcm)
      ● Tier dictates resource limits
        – Maximum number of rows (Hobby only)
        – Cache size (1GB - 240GB)
        – Storage limit (64GB - 1TB)
        – Connection limit (120 - 500)
  • 43. Google Cloud SQL
      ● Became GA April 18th
      ● Only supports PostgreSQL 9.6
      ● Similar replication options to RDS
      ● Poised to be a serious rival to RDS
      ● Billing per minute
      ● Automatic scaling of filesystem
      ● Similar variety of instance types
  • 44. Microsoft Azure
      ● Became GA April 18th, same day as GCP
      ● Supports PostgreSQL 9.5.7 & 9.6.2
      ● Replication is seamless
      ● Automated failover
      ● PITR
      ● Selectable compute units
        – Roughly equates to maximum connections
      ● 99.99% availability vs 99.95% for others
  • 45. Encryption
      ● At Rest (AES-256)
        – Enabled by default for Google & Azure
        – User enabled at create for Amazon (RDS/Aurora)
        – Only available in premium plans for Heroku
      ● In transit (SSL)
        – Enforced on Heroku & Azure
        – Can be enforced on Google & Amazon
  • 46. Extensions
      ● Providers dictate the availability of extensions
      ● Well documented except for Aurora
      ● Only step required is to run CREATE EXTENSION
      ● Many extensions are widely available
        – pgcrypto
        – PostGIS
        – pg_stat_statements
        – hstore
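Because the provider dictates which extensions exist, it can help to compare the extensions your database relies on against what the target offers before migrating. The following is a minimal sketch rather than anything shown in the talk: it assumes a Python DB-API style cursor (for example from psycopg2) connected to the target instance, and `missing_extensions` is a hypothetical helper name. It reads the standard `pg_available_extensions` view.

```python
def missing_extensions(cursor, required):
    """Return the required extension names the server does not offer, sorted."""
    # pg_available_extensions lists every extension the server is able to install.
    cursor.execute("SELECT name FROM pg_available_extensions")
    available = {row[0] for row in cursor.fetchall()}
    return sorted(set(required) - available)
```

An empty result suggests every required extension can be enabled with CREATE EXTENSION on the target; note that names are compared exactly as stored in the view (for example `postgis`, lower case).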
  • 47. Migration Strategies
      ● Dump & Restore
      ● Replication failover
      ● Amazon’s Database Migration Service
      ● PITR + Logical decoding
  • 48. Dump & Restore
      ● Simplest strategy
      ● Perceived as low risk for data loss
      ● Fewer “moving parts”
      ● Just a pg_dump & pg_restore
      ● Downtime is a function of database size
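As an illustration of the dump and restore pair, the sketch below builds the two command lines in Python and prints them rather than running them. The connection strings, file name and job count are hypothetical placeholders; the custom archive format and parallel restore are one reasonable choice, not necessarily the options used in the talk.

```python
import shlex

def dump_command(source_dsn, dump_path):
    """Build a pg_dump command writing a custom-format archive."""
    # The custom format is compressed and lets pg_restore load it in parallel.
    return ["pg_dump", "--format=custom", "--file", dump_path, "--dbname", source_dsn]

def restore_command(target_dsn, dump_path, jobs=4):
    """Build a pg_restore command that loads the archive with parallel jobs."""
    # --no-owner avoids errors when the source roles do not exist on the hosted target.
    return ["pg_restore", "--no-owner", "--jobs", str(jobs), "--dbname", target_dsn, dump_path]

# Hypothetical connection strings, for illustration only.
SOURCE = "postgresql://app@onprem.example.com/appdb"
TARGET = "postgresql://app@cloud.example.com/appdb"

for argv in (dump_command(SOURCE, "appdb.dump"), restore_command(TARGET, "appdb.dump")):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Writes to the source must stop before the dump starts and stay stopped until the restore completes, which is why downtime grows with database size.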
  • 49. Strategies to Minimise Downtime
      ● Move historic data ahead of time
        – Opportunity to clear out unused data
        – Consider introducing partitions
      ● Consider moving the dump closer to the target
        – e.g. upload to an EC2 instance in the same region as the RDS instance and run pg_restore from there
      ● Over-provision resources
        – Gives higher throughput during data load
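To see why moving historic data ahead of time matters, a rough back-of-the-envelope model can help. The sketch below assumes the dump, transfer and restore phases run one after another at constant throughput; all sizes and rates are made-up illustrative figures, not measurements from the talk.

```python
def estimated_downtime_hours(size_gb, dump_gb_per_hour, transfer_gb_per_hour, restore_gb_per_hour):
    """Sum the sequential dump, transfer and restore phases, in hours."""
    return (size_gb / dump_gb_per_hour
            + size_gb / transfer_gb_per_hour
            + size_gb / restore_gb_per_hour)

# Illustrative figures only: a 500 GB database, of which 400 GB is historic
# data that could be moved before the cutover window.
RATES = dict(dump_gb_per_hour=200, transfer_gb_per_hour=100, restore_gb_per_hour=50)

everything = estimated_downtime_hours(500, **RATES)
live_only = estimated_downtime_hours(100, **RATES)

print(f"all data moved in the window: {everything:.1f} h")
print(f"historic data moved beforehand: {live_only:.1f} h")
```

In this simplified model, restoring from a machine close to the target raises the transfer rate and over-provisioning raises the restore rate, so each strategy on the slide shrinks a different term of the sum.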
  • 50. Replication Failover
      ● No one supports external masters!
      ● Trigger based replication failover
        – Slony, Londiste & Bucardo
      ● Can be used on most any version of PostgreSQL
      ● Some restrictions apply
        – DDL is not supported
        – Rows must be uniquely identifiable
      ● Presents some risk to production environment
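The requirement that rows be uniquely identifiable can be checked up front by listing tables that lack a primary key. This is a minimal sketch, assuming a DB-API style cursor (for example from psycopg2) connected to the source database; `tables_without_primary_key` is a hypothetical helper. It is best run as the table owner or a superuser, since `information_schema` may hide constraints from other roles.

```python
TABLES_WITHOUT_PK_SQL = """
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('pg_catalog', 'information_schema')
  AND c.constraint_name IS NULL
ORDER BY t.table_schema, t.table_name
"""

def tables_without_primary_key(cursor):
    """Return 'schema.table' names of user tables that have no primary key."""
    cursor.execute(TABLES_WITHOUT_PK_SQL)
    return [f"{schema}.{table}" for schema, table in cursor.fetchall()]
```

Any table this returns would need a primary key (or another unique identifier the chosen replication tool accepts) before trigger based replication can track its rows.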
  • 51. AWS Database Migration Service
      ● Web service supporting migration to/from AWS
        – Can be RDS or an EC2 instance
        – External DB can be anywhere
      ● Perform a one time load or continual load
        – Change Data Capture
      ● “Supports” heterogeneous migrations
        – Oracle, MySQL, SQL Server, MongoDB
        – Schema Conversion Tool exists to assist
  • 52. PITR & Logical decoding
      ● Most involved approach, least downtime
      ● Combines point-in-time recovery with the changes captured by logical decoding to create a replica
      ● Need to be running at least PostgreSQL 9.4 with WAL level logical and have WAL archiving configured
      ● DDL not supported, still need unique rows
      ● Using barman for managing WAL
        – http://www.pgbarman.org/
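The prerequisites listed for this approach can be verified mechanically. The sketch below is a hedged example: it assumes the settings were read from `pg_settings` on the source server into a plain dictionary of strings, and the function name is hypothetical. The `max_replication_slots` check is an addition not mentioned on the slide; creating a slot requires at least one to be configured.

```python
def pitr_logical_prerequisite_problems(settings):
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    # server_version_num encodes 9.4.0 as 90400.
    if int(settings["server_version_num"]) < 90400:
        problems.append("PostgreSQL 9.4 or later is required for logical decoding")
    if settings["wal_level"] != "logical":
        problems.append("wal_level must be 'logical'")
    if settings["archive_mode"] not in ("on", "always"):
        problems.append("archive_mode must be enabled for WAL archiving")
    # Assumption beyond the slide: a slot cannot be created without a free slot.
    if int(settings["max_replication_slots"]) < 1:
        problems.append("max_replication_slots must be at least 1")
    return problems
```

The dictionary could be filled with something like `SELECT name, setting FROM pg_settings WHERE name IN ('server_version_num', 'wal_level', 'archive_mode', 'max_replication_slots')`.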
  • 53. PITR & Logical decoding
      1. Create a logical replication slot
         SELECT * FROM pg_create_logical_replication_slot('logical_slot', 'decoder_raw');
      2. Note the transaction ID (catalog_xmin)
         SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = 'logical_slot';
  • 54. PITR & Logical decoding
      3. Perform a barman backup
         $ barman backup master
      4. Perform a barman PITR
         $ barman recover --target-xid (catalog_xmin - 1) master latest
      5. Start database and verify correct recovery
  • 55. PITR & Logical decoding
      6. Perform pg_dump on the read-only barman node
      7. Restore to public cloud
      8. Read output of logical decoding and write to cloud
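The steps above can be sketched in Python as three small helpers. This is an illustration under stated assumptions, not the speaker's tooling: the cursors are DB-API style objects using the `%s` parameter style (as psycopg2 does), the `decoder_raw` plugin is assumed to emit each change as a directly executable SQL statement, and the barman destination directory is a hypothetical path. All function names are invented for this example.

```python
def create_slot_and_get_xmin(cursor, slot_name="logical_slot", plugin="decoder_raw"):
    """Create the logical slot, then read back its catalog_xmin."""
    cursor.execute(
        "SELECT * FROM pg_create_logical_replication_slot(%s, %s)",
        (slot_name, plugin),
    )
    cursor.execute(
        "SELECT catalog_xmin FROM pg_replication_slots WHERE slot_name = %s",
        (slot_name,),
    )
    # The xid type may arrive as text depending on the driver, so convert it.
    return int(cursor.fetchone()[0])


def barman_recover_command(catalog_xmin, server="master", backup_id="latest",
                           destination="/var/lib/pgsql/recovered"):
    """Build the barman PITR command targeting the transaction before the slot."""
    # destination is a hypothetical path; barman needs a directory to recover into.
    target_xid = catalog_xmin - 1
    return ["barman", "recover", "--target-xid", str(target_xid),
            server, backup_id, destination]


def apply_pending_changes(source_cursor, target_cursor,
                          slot_name="logical_slot", batch_size=1000):
    """Read decoded changes from the source slot and replay them on the target."""
    source_cursor.execute(
        "SELECT * FROM pg_logical_slot_get_changes(%s, NULL, %s)",
        (slot_name, batch_size),
    )
    applied = 0
    # Each row is (lsn, xid, data); data is assumed to be one SQL statement.
    for _lsn, _xid, data in source_cursor.fetchall():
        target_cursor.execute(data)
        applied += 1
    return applied
```

Calling `apply_pending_changes` repeatedly until it returns 0 would drain the slot, after which the application can be pointed at the cloud database. Note that `pg_logical_slot_get_changes` consumes the changes it returns, so a failure while writing to the target would lose them; `pg_logical_slot_peek_changes` reads without consuming and may be a safer basis for a production version.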
  • 56. Summary
      ● Hosted PostgreSQL gives you high performance PostgreSQL without the hassle of hardware, maintenance and configuration
      ● Opex instead of Capex
      ● Consider the limitations of your intended platform
      ● There are multiple options for migration
  • 57. Questions?
      Mike Fowler, gh-mlfowler, mlfowler, mike dot fowler at claranet dot uk
      We’re hiring! If you’re interested in helping us do what we do, email hr at uk dot clara dot net