©2014 TransLattice, Inc. All Rights Reserved.
Geographically Distributed PostgreSQL
PGConf NYC
April 3, 2014
Mason Sharp, Chief Architect
msharp@translattice.com

Agenda
• Why geographically distribute your data?
• General replication background
• PostgreSQL options
• Custom PostgreSQL configurations
• Upcoming solutions

Why geographically distribute your data?
• Improved Availability
• Better performance (in some cases…)
  – Read vs. Write
  – Data closer to applications and users
• Regulatory or Corporate Compliance
  – Data placement concerns

Availability Issues Remain Headline News

Data Center Outages – Causes and Frequency
Primary causes of data center outage [1]:
• Hardware failure: 34.4%
• Power loss/interruption: 31.5%
• Natural disaster: 13.3%
Combined: 79.2%
Most recent unplanned data center outage [1]:
• Last 6 months: 42%
• Last year: 34%
76% experienced an outage in the last year
[1] Survey by Zerto, July 2013, 356 IT professionals from 10 industries

Data Center Outage Costs Increasing
Average cost of an outage is increasing [2]:
• 2010: $5,617/minute
• 2013: $7,908/minute (41% increase)
Length of unplanned outage:
• Average: 86 minutes [2]
• 25%+ of Oracle users had 8+ hours of unplanned downtime in the last year [3]
[2] 2013 Cost of Data Center Outages, Ponemon Institute, December 2013
[3] Bringing Continuous Availability to Oracle Environments, 2013 Mission-Critical Application Availability Survey, Unisphere Research

Current State of Data Replication
Top data management issues for IT executives [4]:
• Providing business continuity at a reasonable cost
• Deploying applications in multiple geographies consistently
• The continued ability to use SQL
“Among respondents with at least two data centers and rapid replication solutions, 46% indicate they are less than satisfied with their current strategies.” [5]
[4] DBMS Evaluation Criteria, IDG Research Services, October 2013
[5] Bringing Continuous Availability to Oracle Environments, 2013 Mission-Critical Application Availability Survey, Unisphere Research

Replication
• Master-Slave
  – One Master, One or more Slaves
• Multi-master
  – Multiple Masters
• Multi-source fan-in
  – Example: consolidate multiple sites
• Fan-out

Master-Slave
[Diagram: one master feeding three slaves]
• All writes go to one master
• With Hot Standby, reads can be done from any node
• Synchronous / Asynchronous
• Slaves get transactions via either
  – Native streaming replication
  – Statement-based replication
    • Could be synchronous, could be via 2PC
    • Could be a replay mechanism via queues or triggers

Multi-master
• Writes can occur at any location
• Synchronous 2PC
  – MVCC concerns
  – May make sense to always write at one location first, acquiring the lock
• Asynchronous
• Conflict Resolution
• Conflict Avoidance through commit ordering
  – Paxos
  – Raft

Multi-source fan-in
[Diagram: Loc1, Loc2, and Loc3 feeding Central]
• Consolidated centrally for reporting

Fan-out
[Diagram: Central feeding Loc1, Loc2, and Loc3]
• Subset sent to remote locations

Understand Your Requirements
• Availability
  – Is read-only access to some data OK in a degraded state?
• Immediacy of Data
  – Nightly refresh? Immediate? 2-second lag?
• Performance & Latency
  – Read vs. Write
• Correctness versus Performance
• Conflicts: Prevent or Resolve

Understand Your Requirements (continued)
• Data Segregation
• Data Ownership
  – Can each location be the “master” for a subset of data?
  – Example: regional customers
  – Expressed either as a subtable or as an expression on a table
    • region_code = 'US'
  – Different availability requirements?
• “Staticity” Classification
  – Static tables that rarely change
  – Frequently updated tables
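The two ways of expressing data ownership could look roughly like this in PostgreSQL. This is a sketch, not something from the talk: the table and view names (customer_all, customer_us, customer_owned_by_us) and the column list are hypothetical, and only region_code = 'US' comes from the slide.

```sql
-- Hypothetical parent table; only region_code is taken from the slide.
CREATE TABLE IF NOT EXISTS customer_all (
    cust_id     uuid PRIMARY KEY,
    cust_name   varchar,
    region_code char(2) NOT NULL
);

-- Option 1: ownership as a subtable. The CHECK constraint states which
-- rows this child may hold; the US site would be "master" for it.
-- Note: the parent's PRIMARY KEY is not inherited by the child.
CREATE TABLE IF NOT EXISTS customer_us (
    CHECK (region_code = 'US')
) INHERITS (customer_all);

-- Option 2: ownership as an expression on one table.
-- A view exposes only the rows that a given site owns.
CREATE OR REPLACE VIEW customer_owned_by_us AS
    SELECT * FROM customer_all WHERE region_code = 'US';
```

With the subtable form, each site's data sits in its own physical table, which makes it easier to replicate or expose sites separately. The expression form keeps one table and relies on every writer honoring the predicate.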

Static Tables
• Less concerned about write performance
• Writing to table
  – BEGIN;
  – Execute DML statement on agreed “master”
  – On success, we have acquired all of the row locks
  – Safely execute on other nodes without risk of deadlock
  – PREPARE TRANSACTION on every node
  – COMMIT PREPARED on every node
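The write procedure for a static table can be sketched as the statements each node runs in its own session. The table country_code, its columns, and the transaction identifier 'static_upd_0001' are placeholders, not names from the talk. The ordering across nodes is enforced by the application, and every node needs max_prepared_transactions set above zero.

```sql
-- Hypothetical static table; name and columns are placeholders.
CREATE TABLE IF NOT EXISTS country_code (
    code char(2) PRIMARY KEY,
    name text NOT NULL
);

-- Run this sequence in one session per node. Order across nodes:
--   1. BEGIN + UPDATE on the agreed "master" (row locks are taken there first)
--   2. BEGIN + UPDATE on each remaining node
--   3. PREPARE TRANSACTION on every node
--   4. COMMIT PREPARED on every node, only after all nodes have prepared
BEGIN;
UPDATE country_code SET name = 'Czechia' WHERE code = 'CZ';

-- The identifier is an arbitrary string chosen by the application.
PREPARE TRANSACTION 'static_upd_0001';
COMMIT PREPARED 'static_upd_0001';

-- If any node fails to prepare, undo on the nodes that did prepare:
-- ROLLBACK PREPARED 'static_upd_0001';
```

Writing on the agreed master first serializes competing writers on a single set of row locks. That is why the later executions on the other nodes cannot deadlock against each other.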

Careful with reflexive UPDATEs
UPDATE inventory SET qty = qty - 1 WHERE …;
• What if this happens on multiple nodes?
• If the conflict resolution policy is last one wins, inventory is reduced only by 1, not 2
• May expect inventory that is not there
May want to handle some tables specially.
• SELECT FOR UPDATE on a master
  – Will block if another transaction is modifying the row
  – Locks won’t propagate to other nodes
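To make the problem concrete: if two nodes each start with qty = 10 and each runs the UPDATE, both write qty = 9. Last one wins keeps one of those two rows, so the result is 9 rather than 8. One way to handle such a table specially is to send every decrement to a single master and lock the row there. A minimal sketch follows; the item_id column and the value 42 are hypothetical, since only inventory and qty appear in the slide.

```sql
-- Hypothetical schema; only the names inventory and qty come from the slide.
CREATE TABLE IF NOT EXISTS inventory (
    item_id integer PRIMARY KEY,
    qty     integer NOT NULL CHECK (qty >= 0)
);

-- Run on the one node designated as master for this table.
BEGIN;
-- Blocks here while another transaction on this node holds the row lock.
SELECT qty FROM inventory WHERE item_id = 42 FOR UPDATE;
-- The decrement now sees the latest committed qty on this node.
UPDATE inventory SET qty = qty - 1 WHERE item_id = 42;
COMMIT;
```

The row lock exists only on the node where it was taken. This protects the counter only if all writers for the table really go through that node.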

Looking in the PostgreSQL Toolbox – Master-Slave
• Native Streaming Replication
  – All databases in instance are replicated
  – Synchronous and Asynchronous options
  – Hot queryable standby option
• Slony
  – Trigger-based, asynchronous replication
  – Flexibility for a subset of data
  – More complex administration
• Londiste

Looking in the PostgreSQL Toolbox – pgpool-II
• Middle Layer
• Synchronous Statement-Based Replication
  – Can instead be combined with other replication, including native streaming replication
• Load Balancer
  – All writes must go through the master node

Looking in the PostgreSQL Toolbox – Postgres-XC
• Can connect to one of multiple nodes
• Good push-down join and operation handling
• Ensures cluster-wide consistency
BUT
• Requires access to the Global Transaction Manager from each node
• Nodes are a modified version of PostgreSQL
• Not currently suited for this use case

Looking in the PostgreSQL Toolbox – PL/Proxy
• Everything is a stored function
  – More cumbersome, but flexible

Looking in the PostgreSQL Toolbox – Multi-master
• Bucardo
  – Perl-based
  – Limited to two masters
  – Custom conflict resolution possible
• RubyRep
  – Ruby-based
  – Limited to two masters
  – Custom conflict resolution possible
• Postgres-R
  – Modified PostgreSQL 9.0

Looking in the PostgreSQL Toolbox – Custom
• Triggers
• Foreign Data Wrappers
• Subtable Partitioning
• Two Phase Commit

Looking in the PostgreSQL Toolbox – Considerations
• Connections and MVCC across multiple instances
• Sequence/Serial
  – UUID as alternative
• Timestamps
  – Use timestamp with time zone
  – Network Time Protocol (NTP)
  – Custom functions for time lag
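A sketch of how these considerations might be applied. The table site_event and the function replay_lag are hypothetical names. uuid_generate_v4() comes from the uuid-ossp extension, and pg_last_xact_replay_timestamp() is a built-in function that is only meaningful on a standby.

```sql
-- uuid-ossp supplies uuid_generate_v4().
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Hypothetical table: each node can mint keys without coordinating
-- a shared sequence, and timestamps carry their time zone.
CREATE TABLE IF NOT EXISTS site_event (
    event_id   uuid PRIMARY KEY DEFAULT uuid_generate_v4(),
    site_code  char(5) NOT NULL,
    created_at timestamp with time zone NOT NULL DEFAULT now()
);

-- Hypothetical time-lag helper, meant to be called on a hot standby.
-- Returns NULL on a server that was started normally (not in recovery).
CREATE OR REPLACE FUNCTION replay_lag() RETURNS interval AS $$
    SELECT now() - pg_last_xact_replay_timestamp();
$$ LANGUAGE sql STABLE;

SELECT replay_lag();
```

This lag figure compares the standby's clock with the commit time of the last replayed transaction. It therefore depends on NTP-synchronized clocks, and it grows while the master is idle even though no data is missing.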

Custom Example
• Multiple Locations
• Locations largely independent
• Most of the writes will occur locally
  – Each site is the “master” for local data
• Want to be able to write data on a remote site
• Want local read performance for remotely originating data
• If the remote site is down, local read-only access is acceptable
• Occasional updates to static data require all nodes online

Custom Example
[Diagram: two data centers. DC1 hosts the DC1 Master and a DC2 Hot Standby; DC2 hosts the DC2 Master and a DC1 Hot Standby.]
[Diagram, second build: tables customer_dc1 and customer_dc2 are added.]
[Diagram, third build: "View: customer" is added over customer_dc1 and customer_dc2, with an FDW link.]

Configuration
• configure --with-ossp-uuid
• CREATE EXTENSION "uuid-ossp";
• CREATE EXTENSION "postgres_fdw";

Configuration
• From dc1:
CREATE SERVER dc2_master
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'dc2_host', dbname 'dc2', port '5434');
CREATE SERVER dc2_slave
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', dbname 'dc2', port '5434');
• From dc2:
CREATE SERVER dc1_master
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'dc1_host', dbname 'dc1', port '5433');
CREATE SERVER dc1_slave
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', dbname 'dc1', port '5433');

Configuration
CREATE USER MAPPING
    FOR user1 SERVER dc2_master
    OPTIONS (user 'user1');
CREATE USER MAPPING
    FOR user1 SERVER dc2_slave
    OPTIONS (user 'user1');
CREATE TABLE customer_dc1
    (cust_id UUID,
     cust_name varchar,
     cust_loc char(5));

Configuration
• On dc1:
CREATE FOREIGN TABLE customer_dc2_master
    (cust_id UUID,
     cust_name varchar,
     cust_loc char(5))
    SERVER dc2_master;
CREATE FOREIGN TABLE customer_dc2_slave
    (cust_id UUID,
     cust_name varchar,
     cust_loc char(5))
    SERVER dc2_slave;

View Handling
• Create a customer view, a union of local data and the local slave
• Include cust_loc condition
CREATE VIEW customer AS
    SELECT *
    FROM customer_dc1
    WHERE cust_loc = 'DC1'
    UNION ALL
    SELECT * FROM customer_dc2_slave
    WHERE cust_loc = 'DC2';

View Handling
# explain select * from customer;
                                QUERY PLAN
--------------------------------------------------------------------------
 Append  (cost=0.00..140.82 rows=8 width=72)
   ->  Seq Scan on customer_dc1
         Filter: (cust_loc = 'DC1'::bpchar)
   ->  Foreign Scan on customer_dc2_slave

Configuration
• PostgreSQL takes qualifications into account for better plans!
# explain select * from customer where cust_loc = 'DC1';
                           QUERY PLAN
----------------------------------------------------------------
 Append  (cost=0.00..20.04 rows=4 width=72)
   ->  Seq Scan on customer_dc1  (cost=…..)
         Filter: (cust_loc = 'DC1'::bpchar)
• Smart enough to know to use just one part of the UNION
  – Leaves off foreign table part
  – Consider in design of application

Triggers
CREATE TRIGGER tr_customer
    INSTEAD OF
    INSERT OR UPDATE OR DELETE ON customer
    FOR EACH ROW
    EXECUTE PROCEDURE update_customer();

Trigger Function
CREATE OR REPLACE FUNCTION update_customer()
RETURNS TRIGGER AS $$
BEGIN
    -- TODO: Handle updating cust_loc
    IF (TG_OP = 'UPDATE') THEN
        IF OLD.cust_loc = 'DC1' THEN
            UPDATE customer_dc1
            SET cust_name = NEW.cust_name
            WHERE cust_id = OLD.cust_id;
        ELSEIF OLD.cust_loc = 'DC2' THEN
            UPDATE customer_dc2_master
            SET cust_name = NEW.cust_name
            WHERE cust_id = OLD.cust_id;
        END IF;
        RETURN NEW;
    :
$$ LANGUAGE plpgsql;
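The slide elides the rest of the function. One possible completion is sketched below. The INSERT and DELETE branches and the error handling are assumptions, not the speaker's code. It assumes it runs on dc1 with the objects from the Configuration slides in place, and a PostgreSQL release where postgres_fdw foreign tables are writable (9.3 or later).

```sql
-- Sketch only: the INSERT/DELETE branches are assumed, not from the talk.
CREATE OR REPLACE FUNCTION update_customer()
RETURNS TRIGGER AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- Route the new row to the site that owns it.
        IF NEW.cust_loc = 'DC1' THEN
            INSERT INTO customer_dc1 (cust_id, cust_name, cust_loc)
            VALUES (NEW.cust_id, NEW.cust_name, NEW.cust_loc);
        ELSIF NEW.cust_loc = 'DC2' THEN
            INSERT INTO customer_dc2_master (cust_id, cust_name, cust_loc)
            VALUES (NEW.cust_id, NEW.cust_name, NEW.cust_loc);
        ELSE
            RAISE EXCEPTION 'unknown cust_loc: %', NEW.cust_loc;
        END IF;
        RETURN NEW;

    ELSIF TG_OP = 'UPDATE' THEN
        -- Moving a row between sites stays unhandled, as in the slide's TODO.
        IF NEW.cust_loc IS DISTINCT FROM OLD.cust_loc THEN
            RAISE EXCEPTION 'changing cust_loc is not handled in this sketch';
        END IF;
        IF OLD.cust_loc = 'DC1' THEN
            UPDATE customer_dc1
            SET cust_name = NEW.cust_name
            WHERE cust_id = OLD.cust_id;
        ELSIF OLD.cust_loc = 'DC2' THEN
            UPDATE customer_dc2_master
            SET cust_name = NEW.cust_name
            WHERE cust_id = OLD.cust_id;
        END IF;
        RETURN NEW;

    ELSIF TG_OP = 'DELETE' THEN
        IF OLD.cust_loc = 'DC1' THEN
            DELETE FROM customer_dc1 WHERE cust_id = OLD.cust_id;
        ELSIF OLD.cust_loc = 'DC2' THEN
            DELETE FROM customer_dc2_master WHERE cust_id = OLD.cust_id;
        END IF;
        RETURN OLD;
    END IF;

    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
```

Reads through the view use customer_dc2_slave (the local hot standby), while writes go to customer_dc2_master. A row written to DC2 becomes visible in the local view only after streaming replication has replayed it.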

Caveats
• Performance will be poor for some queries
  – Joins are not pushed down to the remote server
• Two Phase Commit is not used by the FDW
  – No consistency guarantees!
  – FWIW, will commit remotely before locally
• Repeatable Read is used by the FDW
  – Keeps results the same for a foreign table scanned multiple times
• Differing locale settings may cause problems

Custom Example – Further Enhancement
• Want to reduce loss of ability to write new data
• Add local table for local inserts when remote side is down
  – Especially helpful for append-only workloads
• Change trigger functions to use the local table when the remote side is down
• Allow updates and deletes on these as well
• When the remote side is available again, apply changes to remote side, truncate local table
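A rough sketch of the insert path for this enhancement. The holding table customer_dc2_pending and the helper insert_customer_dc2 are hypothetical. The sketch assumes that an unreachable remote server surfaces from postgres_fdw as a class 08 connection error that PL/pgSQL can trap, which should be verified on the release in use.

```sql
-- Hypothetical local holding table for rows destined for DC2.
CREATE TABLE IF NOT EXISTS customer_dc2_pending (
    cust_id   uuid,
    cust_name varchar,
    cust_loc  char(5)
);

-- Hypothetical helper that the INSTEAD OF trigger function could call
-- in place of a direct INSERT into customer_dc2_master.
CREATE OR REPLACE FUNCTION insert_customer_dc2(
    p_id uuid, p_name varchar, p_loc text
) RETURNS void AS $$
BEGIN
    INSERT INTO customer_dc2_master (cust_id, cust_name, cust_loc)
    VALUES (p_id, p_name, p_loc);
EXCEPTION
    -- Assumption: a down remote raises one of these connection errors.
    WHEN sqlclient_unable_to_establish_sqlconnection
      OR connection_failure THEN
        INSERT INTO customer_dc2_pending (cust_id, cust_name, cust_loc)
        VALUES (p_id, p_name, p_loc);
END;
$$ LANGUAGE plpgsql;

-- Replay once the remote side is reachable again.
BEGIN;
INSERT INTO customer_dc2_master SELECT * FROM customer_dc2_pending;
TRUNCATE customer_dc2_pending;
COMMIT;
```

Because the FDW does not use two-phase commit, the replay is not atomic across the two servers. A failure between the remote commit and the local commit could apply the rows remotely and still leave them in the pending table. The customer view would also need a third UNION ALL branch over the pending table for these rows to be readable locally.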

Custom Example – Additional try
• Tried using table inheritance and adding a rule on a subtable to instead query a remote table, but encountered issues

Another Custom Example
• All tables in just one database on each node
• No streaming replication
• Changes applied at both locations
  – Either via 2PC
  – Or asynchronously via triggers

Upcoming PostgreSQL Multi-master Replication
• Logical Log Streaming Replication (LLSR) in PostgreSQL 9.4
• WAL is read to determine logical commits
• Can be decoded to SQL
• Less overhead than other projects
• Will allow a subset of data to be replicated, not the entire instance, unlike existing streaming replication
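PostgreSQL 9.4 exposes this WAL decoding through SQL-callable functions and replication slots. The sketch below uses the test_decoding example plugin from contrib, which emits a textual description of each change rather than SQL. Producing SQL would take a different output plugin. The slot name demo_slot and the table demo_tab are placeholders.

```sql
-- Requires wal_level = logical and max_replication_slots > 0 in
-- postgresql.conf; changing either needs a server restart.

-- Create a logical slot that uses the test_decoding output plugin.
SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- Hypothetical table and a change to decode.
CREATE TABLE IF NOT EXISTS demo_tab (col1 int PRIMARY KEY, col2 int);
INSERT INTO demo_tab VALUES (1, 10);

-- Read and consume the changes decoded from WAL since the slot was created.
SELECT * FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);

-- Drop the slot when finished; an abandoned slot makes the server retain WAL.
SELECT pg_drop_replication_slot('demo_slot');
```

pg_logical_slot_peek_changes takes the same arguments and returns the changes without consuming them, which is handy while experimenting.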

Upcoming PostgreSQL Multi-master Replication
• A goal in a future PostgreSQL release is multi-master replication with last-one-wins conflict resolution (9.5?)
• Possible 9.4 extension for apply side in future
• Improvements over subsequent releases
  – Improved DDL support may be phased in over time

Bucardo Example
createdb db1
createdb -p 5433 db1
psql -c "CREATE TABLE tab1
    (col1 int, col2 int, PRIMARY KEY(col1))" db1
psql -c "CREATE TABLE tab1
    (col1 int, col2 int, PRIMARY KEY(col1))" -p 5433 db1

Bucardo Example
bucardo_ctl install
bucardo_ctl add database db1 name=db1a
bucardo_ctl add database db1 name=db1b port=5433
bucardo_ctl add all tables db=db1a
psql bucardo:
update bucardo.goat
    set standard_conflict = 'latest'
    where tablename = 'tab1';
bucardo_ctl add sync sync_tab1 type=swap source=db1a targetdb=db1b tables=tab1
bucardo_ctl stop
bucardo_ctl start
-> Updates to tab1 now visible on both servers

Bucardo Notes
• If having trouble, try “bucardo_ctl install” again
• Also try bucardo_ctl stop and bucardo_ctl start
• It seemed to get confused when the same table names exist in multiple databases

Alternative: TransLattice Elastic Database (TED)
• PostgreSQL-based
• Geo-distributed multi-master RDBMS with sharding
• Policy Configurable
  – Degree of redundancy
  – Data location
• Uses Fast Generalized Paxos for global commit ordering
• Easily add nodes
  – New locations
  – Existing locations for scalability
• Nodes recover automatically
• Easy transition
  – Can operate in conjunction with existing database systems

Each TransLattice Node Delivers Capabilities That Replace Numerous Disparate Technologies
A single node type simplifies scaling and management
[Diagram: a TL node at the center, surrounded by:]
• Replication
• Storage Management
• Cluster Management
• Compliance Tools
• Fully Relational Database
• Management Tools
• Data Integration Tools

Thank You!
msharp@translattice.com
@mason_db
@TransLattice

Geographically Distributed PostgreSQL

  • 1. 1©2014 TransLattice, Inc. All Rights Reserved. 11 Geographically Distributed PostgreSQL PGConf NYC April 3, 2014 Mason Sharp, Chief Architect [email protected]
  • 2. 2©2014 TransLattice, Inc. All Rights Reserved. Agenda n  Why geographically distribute your data? n  General replication background n  PostgreSQL options n  Custom PostgreSQL configurations n  Upcoming solutions
  • 3. 3©2014 TransLattice, Inc. All Rights Reserved. Why geographically distribute your data? n  Improved Availability n  Better performance (in some cases…) –  Read vs Write –  Data closer applications and users n  Regulatory or Corporate Compliance –  Data placement concerns
  • 4. 4©2014 TransLattice, Inc. All Rights Reserved. Availability Issues Remain Headline News
  • 5. 5©2014 TransLattice, Inc. All Rights Reserved. 1  Survey by Zerto, July 2013, 356 IT professionals from 10 industries Primary causes of data center outage1: n  Hardware failure 34.4% n  Power loss/interruption 31.5% n  Natural disaster 13.3% 79.2% Most recent unplanned data center outage1: n  Last 6 months 42% n  Last year 34% 76% experienced in last year Data Center Outages – Causes and Frequency
  • 6. 6©2014 TransLattice, Inc. All Rights Reserved. Data Center Outage Costs Increasing 1  2  2013 Cost of Data Center Outages, Ponemon Institute, December 2013 3 Bringing Continuous Availability to Oracle Environments, 2013 Mission-Critical Application Availability Survey, Unisphere Research Average cost of an outage is increasing2: n  2010 $5,617/minute n  2013 $7,908/minute 41% increase Length of unplanned outage: n  Average: 86 minutes2 n  25%+ of Oracle users had 8+ hours of unplanned downtime in last year3
  • 7. 7©2014 TransLattice, Inc. All Rights Reserved. Current State of Data Replication Top data management issues for IT executives:4 n  Providing business continuity at a reasonable cost n  Deploying applications in multiple geographies consistently n  The continued ability to use SQL 4 DBMS Evaluation Criteria, IDG Research Services, October 2013 5 Bringing Continuous Availability to Oracle Environments, 2013 Mission-Critical Application Availability Survey, Unisphere Research “Among respondents with at least two data centers and rapid replication solutions, 46% indicate they are less than satisfied with their current strategies.” 5
  • 8. 8©2014 TransLattice, Inc. All Rights Reserved. Replication n  Master-Slave –  One Master, One or more Slaves n  Multi-master –  Multiple Masters n  Multi-source fan-in –  Example: consolidate multiple sites n  Fan-out
  • 9. 9©2014 TransLattice, Inc. All Rights Reserved. Master-Slave Master Slave Slave Slave
  • 10. 10©2014 TransLattice, Inc. All Rights Reserved. Master-Slave n  All writes go to one master n  Hot Standby reads can be done from any node n  Synchronous / Asynchronous n  Slaves get transactions via either –  Native streaming replication –  Statement based •  Could be synchronous, could be via 2PC •  Could be a replay mechanism via queues or triggers
  • 11. 11©2014 TransLattice, Inc. All Rights Reserved. Multi-master
  • 12. 12©2014 TransLattice, Inc. All Rights Reserved. Multi-master n  Write can occur at any location n  Synchronous 2PC –  MVCC concerns –  May make sense to first always write at one location, acquiring lock n  Asynchronous n  Conflict Resolution n  Conflict Avoidance through commit ordering –  Paxos –  Raft
  • 13. 13©2014 TransLattice, Inc. All Rights Reserved. Multi-source fan-in CentralLoc1 Loc2 Loc3 n  Consolidated centrally for reporting
  • 14. 14©2014 TransLattice, Inc. All Rights Reserved. Multi-source fan-out CentralLoc1 Loc2 Loc3 n  Subset sent to remote locations
  • 15. 15©2014 TransLattice, Inc. All Rights Reserved. Understand Your Requirements n  Availability –  Read-only access of some data ok in downgraded state? n  Immediacy of Data –  Nightly refresh? Immediate? 2 second lag? n  Performance & Latency –  Read vs. Write n  Correctness versus Performance n  Conflicts: Prevent or Resolve
  • 16. 16©2014 TransLattice, Inc. All Rights Reserved. Understand Your Requirements (continued) n  Data Segregation n  Data Ownership –  Can each location be the “master” to a subset of data? –  Example: regional customers –  Expressed either as a subtable, or expression on a table •  region_code = ‘US’ –  Different availability requirements? n  “Staticity” Classification –  Static tables that rarely change –  Frequently updated tables
  • 17. 17©2014 TransLattice, Inc. All Rights Reserved. Static Tables n  Less concerned about write performance n  Writing to table –  BEGIN; –  Execute DML statement on agreed “master” –  On success, we have acquired all of the row locks –  Safely execute on other nodes without risk of deadlock –  PREPARE TRANSACTION; –  COMMIT;
  • 18. 18©2014 TransLattice, Inc. All Rights Reserved. Careful with reflexive UPDATES UPDATE inventory SET qty = qty – 1 WHERE ….; n  What if happens on multiple nodes? n  If conflict resolution policy is last one wins, inventory is reduced only by 1, not 2 n  May expect inventory that is not there May want to handle some tables specially. •  SELECT FOR UPDATE on a master –  Will block if another transaction modifying –  Locks won’t propagate to other nodes
  • 19. 19©2014 TransLattice, Inc. All Rights Reserved. Looking in the PostgreSQL Toolbox – Master-Slave n  Native Streaming Replication –  All databases in instance are replicated –  Synchronous and Asynchronous options –  Hot queryable standby option n  Slony –  Trigger based, asynchronous replication –  Flexibility for a subset of data –  More complex administration n  Londiste
  • 20. 20©2014 TransLattice, Inc. All Rights Reserved. Looking in the PostgreSQL Toolbox – pgpool-II n  Middle Layer n  Synchronous Statement-Based Replication –  Can instead be combined with other replication incl. native streaming replication n  Load Balancer –  Can be All writes must go through master node
  • 21. 21©2014 TransLattice, Inc. All Rights Reserved. Looking in the PostgreSQL Toolbox – Postgres-XC n  Can connect to one of multiple nodes n  Good push-down join and operation handling n  Ensures cluster-wide consistency BUT n  Requires access to Global Transaction Manager from each node n  Nodes are a modified version of PostgreSQL n  Not currently suited for use case
  • 22. 22©2014 TransLattice, Inc. All Rights Reserved. Looking in the PostgreSQL Toolbox – PL/Proxy n  Everything is a stored function –  More cumbersome, but flexible
  • 23. 23©2014 TransLattice, Inc. All Rights Reserved. Looking in the PostgreSQL Toolbox – Multi-master n  Bucardo –  Perl-based –  Limited to two masters –  Custom conflict resolution possible n  RubyRep –  Ruby-based –  Limited to two masters –  Custom conflict resolution possible n  Postgres-R –  Modified PostgreSQL 9.0
  • 24. 24©2014 TransLattice, Inc. All Rights Reserved. Looking in the PostgreSQL Toolbox – Custom n  Triggers n  Foreign Data Wrappers n  Subtable Partitioning n  Two Phase Commit
  • 25. 25©2014 TransLattice, Inc. All Rights Reserved. Looking in the PostgreSQL Toolbox – Considerations n  Connections and MVCC across multiple instances n  Sequence/Serial –  UUID as alternative n  Timestamps –  Use timestamp with timezone –  Network Time Protocol NTP –  Custom functions for time lag
  • 26. 26©2014 TransLattice, Inc. All Rights Reserved. Custom Example n  Multiple Locations n  Locations largely independent n  Most of the writes will occur locally –  Each site is the “master” for local data n  Want to be able to write data on a remote site n  Want local read performance for remote originating data n  If remote site is down, local read-only access is acceptable n  Occasional updates to static data requires all nodes online
  • 27. 27©2014 TransLattice, Inc. All Rights Reserved. Custom Example DC2 Hot Standby DC1 Master DC2 Master DC1 Hot Standby DC1 DC2
  • 28. 28©2014 TransLattice, Inc. All Rights Reserved. Custom Example DC2 Hot Standby DC1 Master DC2 Master DC1 Hot Standby DC1 DC2 customer_dc1 customer_dc2
  • 29. 29©2014 TransLattice, Inc. All Rights Reserved. View: customer Custom Example DC2 Hot Standby DC1 Master DC2 Master DC1 Hot Standby DC1 DC2 customer_dc1 FDW customer_dc2
  • 30. 30©2014 TransLattice, Inc. All Rights Reserved. Configuration n  configure --with-ossp-uuid n  CREATE EXTENSION "uuid-ossp" n  CREATE EXTENSION "postgres_fdw”
  • 31. 31©2014 TransLattice, Inc. All Rights Reserved. Configuration n  From dc1: CREATE SERVER dc2_master FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'dc2_host', dbname 'dc2', port '5434'); CREATE SERVER dc2_slave FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'dc2', port '5434');
  • 32. 32©2014 TransLattice, Inc. All Rights Reserved. Configuration n  From dc2: CREATE SERVER dc1_master FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'dc1_host', dbname 'dc1', port '5433'); CREATE SERVER dc1_slave FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', dbname 'dc1', port '5433');
  • 33. Configuration
          CREATE USER MAPPING FOR user1 SERVER dc2_master
              OPTIONS (user 'user1');
          CREATE USER MAPPING FOR user1 SERVER dc2_slave
              OPTIONS (user 'user1');
  • 34. Configuration
          CREATE TABLE customer_dc1
              (cust_id UUID, cust_name varchar, cust_loc char(5));
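  • 35. Configuration
      ▪ On dc1 (postgres_fdw looks up a remote table with the same name as the foreign table unless table_name is given, so the remote table customer_dc2 is named explicitly):
          CREATE FOREIGN TABLE customer_dc2_master
              (cust_id UUID, cust_name varchar, cust_loc char(5))
              SERVER dc2_master OPTIONS (table_name 'customer_dc2');
          CREATE FOREIGN TABLE customer_dc2_slave
              (cust_id UUID, cust_name varchar, cust_loc char(5))
              SERVER dc2_slave OPTIONS (table_name 'customer_dc2');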
  • 36. View Handling
      ▪ Create a customer view, a union of local data and local slave
      ▪ Include cust_loc condition
          CREATE VIEW customer AS
              SELECT * FROM customer_dc1 WHERE cust_loc = 'DC1'
              UNION ALL
              SELECT * FROM customer_dc2_slave WHERE cust_loc = 'DC2';
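The slides show the view only as created on dc1. A mirror-image definition on dc2 could plausibly look like the sketch below. It assumes dc2 has a local table `customer_dc2` and a foreign table named `customer_dc1_slave` defined against the `dc1_slave` server; that foreign table name is hypothetical, since the slides do not show the dc2-side foreign tables.

```sql
-- On dc2 (assumed mirror of the dc1 setup).
-- customer_dc1_slave is a hypothetical foreign table on dc2 that points at
-- the local hot standby of dc1 through the dc1_slave server.
CREATE VIEW customer AS
    SELECT * FROM customer_dc2       WHERE cust_loc = 'DC2'
    UNION ALL
    SELECT * FROM customer_dc1_slave WHERE cust_loc = 'DC1';
```

Keeping the view name identical at both sites lets the application issue the same queries regardless of which data center it is connected to.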
  • 37. View Handling
          # explain select * from customer;
                                    QUERY PLAN
          --------------------------------------------------------------------------
           Append  (cost=0.00..140.82 rows=8 width=72)
             ->  Seq Scan on customer_dc1
                   Filter: (cust_loc = 'DC1'::bpchar)
             ->  Foreign Scan on customer_dc2_slave
  • 38. Configuration
      ▪ PostgreSQL takes qualifications into account for better plans!
          # explain select * from customer where cust_loc = 'DC1';
                                    QUERY PLAN
          ----------------------------------------------------------------
           Append  (cost=0.00..20.04 rows=4 width=72)
             ->  Seq Scan on customer_dc1  (cost=…..)
                   Filter: (cust_loc = 'DC1'::bpchar)
      ▪ Smart enough to know to use just one part of the UNION
          – Leaves off foreign table part
          – Consider in design of application
  • 39. Triggers
          CREATE TRIGGER tr_customer
              INSTEAD OF INSERT OR UPDATE OR DELETE ON customer
              FOR EACH ROW EXECUTE PROCEDURE update_customer();
  • 40. Trigger Function
          CREATE OR REPLACE FUNCTION update_customer() RETURNS TRIGGER AS $$
          BEGIN
              -- TODO: Handle updating cust_loc
              IF (TG_OP = 'UPDATE') THEN
                  IF OLD.cust_loc = 'DC1' THEN
                      UPDATE customer_dc1 SET cust_name = NEW.cust_name
                          WHERE cust_id = OLD.cust_id;
                  ELSEIF OLD.cust_loc = 'DC2' THEN
                      UPDATE customer_dc2_master SET cust_name = NEW.cust_name
                          WHERE cust_id = OLD.cust_id;
                  END IF;
                  RETURN NEW;
              :
          $$ LANGUAGE plpgsql;
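The slide shows only the UPDATE branch and elides the rest. One way the remaining branches might be filled in is sketched below, under a different function name so that it is not mistaken for the presenter's code. It assumes the dc1 side, the uuid-ossp extension, a writable `customer_dc2_master` foreign table (PostgreSQL 9.3 or later), and that `cust_loc` never changes on UPDATE; `update_customer_sketch` is a hypothetical name.

```sql
-- Hypothetical completion of the INSTEAD OF trigger logic for the dc1 side.
-- Assumptions: uuid-ossp is installed, customer_dc2_master is writable via
-- postgres_fdw, and moving a row between sites is out of scope.
CREATE OR REPLACE FUNCTION update_customer_sketch() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        -- Generate the key locally when the caller did not supply one.
        IF NEW.cust_id IS NULL THEN
            NEW.cust_id := uuid_generate_v4();
        END IF;
        IF NEW.cust_loc = 'DC1' THEN
            INSERT INTO customer_dc1
                VALUES (NEW.cust_id, NEW.cust_name, NEW.cust_loc);
        ELSIF NEW.cust_loc = 'DC2' THEN
            INSERT INTO customer_dc2_master
                VALUES (NEW.cust_id, NEW.cust_name, NEW.cust_loc);
        ELSE
            RAISE EXCEPTION 'unknown cust_loc: %', NEW.cust_loc;
        END IF;
        RETURN NEW;

    ELSIF TG_OP = 'UPDATE' THEN
        -- Changing cust_loc is rejected rather than silently ignored.
        IF NEW.cust_loc IS DISTINCT FROM OLD.cust_loc THEN
            RAISE EXCEPTION 'changing cust_loc is not handled in this sketch';
        END IF;
        IF OLD.cust_loc = 'DC1' THEN
            UPDATE customer_dc1 SET cust_name = NEW.cust_name
                WHERE cust_id = OLD.cust_id;
        ELSIF OLD.cust_loc = 'DC2' THEN
            UPDATE customer_dc2_master SET cust_name = NEW.cust_name
                WHERE cust_id = OLD.cust_id;
        END IF;
        RETURN NEW;

    ELSE  -- TG_OP = 'DELETE'
        IF OLD.cust_loc = 'DC1' THEN
            DELETE FROM customer_dc1 WHERE cust_id = OLD.cust_id;
        ELSIF OLD.cust_loc = 'DC2' THEN
            DELETE FROM customer_dc2_master WHERE cust_id = OLD.cust_id;
        END IF;
        RETURN OLD;
    END IF;
END;
$$ LANGUAGE plpgsql;
```

Reads go through the local standby while writes for DC2 rows go to the remote master, so a row written this way becomes visible in the view only after streaming replication has delivered it back to the local standby.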
  • 41. Caveats
      ▪ Performance will be poor for some queries
          – Joins are not pushed down to the remote server
      ▪ Two Phase Commit is not used by FDW
          – No consistency guarantees!
          – FWIW, will commit remotely before locally
      ▪ Repeatable Read is used by the FDW
          – Keeps results the same for a foreign table scanned multiple times
      ▪ Differing locale settings may cause problems
  • 42. Custom Example – Further Enhancement
      ▪ Want to reduce loss of ability to write new data
      ▪ Add local table for local inserts when remote side is down
          – Especially helpful for append-only workloads
      ▪ Change trigger functions to use the local table when the remote side is down
      ▪ Allow updates and deletes on these as well
      ▪ When the remote side is available again, apply changes to remote side, truncate local table
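The fallback idea on this slide could be sketched roughly as follows, from the dc1 side. The names `customer_dc2_pending` and `insert_customer_dc2` are hypothetical, and the sketch assumes that an unreachable remote master surfaces as a class 08 (connection) error; which error is actually raised depends on how the failure occurs, so the handler would need testing against real outages.

```sql
-- Hypothetical names: customer_dc2_pending, insert_customer_dc2().
-- Local queue with the same columns as the customer tables.
CREATE TABLE customer_dc2_pending (LIKE customer_dc1);

CREATE OR REPLACE FUNCTION insert_customer_dc2(p_id uuid, p_name varchar)
RETURNS void AS $$
BEGIN
    BEGIN
        -- Normal path: write straight to the remote master.
        INSERT INTO customer_dc2_master VALUES (p_id, p_name, 'DC2');
    EXCEPTION
        WHEN sqlclient_unable_to_establish_sqlconnection
          OR connection_failure
          OR connection_exception THEN
            -- Assumed failure mode: remote master unreachable.
            -- Queue the row locally so the write is not lost.
            INSERT INTO customer_dc2_pending VALUES (p_id, p_name, 'DC2');
    END;
END;
$$ LANGUAGE plpgsql;

-- Once dc2 is reachable again: replay the queue, then clear it.
BEGIN;
LOCK TABLE customer_dc2_pending IN EXCLUSIVE MODE;  -- block new queue entries
INSERT INTO customer_dc2_master SELECT * FROM customer_dc2_pending;
TRUNCATE customer_dc2_pending;
COMMIT;
```

Because the FDW does not use two-phase commit, the replay is not atomic across both sites: a failure between the remote and local commits could leave rows applied remotely but still queued locally, so a real implementation would need to tolerate re-applying them.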
  • 43. Custom Example – Additional try
      ▪ Tried using table inheritance and adding a rule on a subtable to instead query a remote table, but encountered issues
  • 44. Another Custom Example
      ▪ All tables in just one database on each node
      ▪ No streaming replication
      ▪ Changes applied at both locations
          – Either via 2PC
          – Or asynchronously via triggers
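For the 2PC option, the sequence that an application or coordinator would drive on each node might look like the sketch below. It assumes `max_prepared_transactions` is set above zero on both nodes and that `customer` is an ordinary table there; the transaction identifier, the UUID value, and the new name are arbitrary illustrations.

```sql
-- Assumption: max_prepared_transactions > 0 on every participating node.
-- The identifier 'cust_update_1' and the literal values are illustrative.

-- Step 1, executed on each node:
BEGIN;
UPDATE customer
   SET cust_name = 'Acme Corp'
 WHERE cust_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
PREPARE TRANSACTION 'cust_update_1';

-- Step 2, executed on each node only after PREPARE succeeded everywhere:
COMMIT PREPARED 'cust_update_1';

-- If any node failed to prepare, run this instead on the nodes that did:
-- ROLLBACK PREPARED 'cust_update_1';
```

A prepared transaction keeps its locks until it is committed or rolled back, so a coordinator that crashes between the two steps leaves work that has to be resolved by hand.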
  • 45. Upcoming PostgreSQL Multi-master Replication
      ▪ Logical Log Streaming Replication (LLSR) in PostgreSQL 9.4
      ▪ WAL is read to determine logical commits
      ▪ Can be decoded to SQL
      ▪ Less overhead than other projects
      ▪ Will allow for a subset of data to be replicated, not the entire instance, unlike existing streaming replication (SR)
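In PostgreSQL 9.4 the decoding of WAL into logical changes can be tried out from SQL with the `test_decoding` contrib plugin. The sketch below assumes `wal_level = logical`, `max_replication_slots` above zero, and that `test_decoding` is installed; the slot name `demo_slot` is hypothetical.

```sql
-- Assumptions: wal_level = logical, max_replication_slots > 0,
-- and the test_decoding contrib module is installed.
-- demo_slot is an illustrative slot name.

-- Create a logical replication slot that decodes with test_decoding.
SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- ... perform some INSERT / UPDATE / DELETE statements here ...

-- Fetch and consume the decoded changes accumulated since the last call.
SELECT * FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);

-- Drop the slot when finished; an unused slot retains WAL indefinitely.
SELECT pg_drop_replication_slot('demo_slot');
```

The `test_decoding` plugin emits a textual description of each change rather than replayable SQL; producing SQL or applying the changes on another node is the job of a different output plugin or an apply process.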
  • 46. Upcoming PostgreSQL Multi-master Replication
      ▪ A goal in a future PostgreSQL release is multi-master replication with last-one-wins conflict resolution (9.5?)
      ▪ Possible 9.4 extension for apply side in future
      ▪ Improvements over subsequent releases
          – Improved DDL support may be phased in over time
  • 47. Bucardo Example
          createdb db1
          createdb -p 5433 db1
          psql -c "CREATE TABLE tab1 (col1 int, col2 int, PRIMARY KEY(col1))" db1
          psql -c "CREATE TABLE tab1 (col1 int, col2 int, PRIMARY KEY(col1))" -p 5433 db1
  • 48. Bucardo Example
          bucardo_ctl install
          bucardo_ctl add database db1 name=db1a
          bucardo_ctl add database db1 name=db1b port=5433
          bucardo_ctl add all tables db=db1a
          psql bucardo:
              update bucardo.goat set standard_conflict = 'latest'
                  where tablename = 'tab1';
  • 49. Bucardo Example
          bucardo_ctl add sync sync_tab1 type=swap source=db1a targetdb=db1b tables=tab1
          bucardo_ctl stop
          bucardo_ctl start
      -> Updates to tab1 now visible on both servers
  • 50. Bucardo Notes
      ▪ If having trouble, try “bucardo_ctl install” again
      ▪ Also try bucardo_ctl stop and bucardo_ctl start
      ▪ It seemed to get confused when the same table names exist in multiple databases
  • 51. Alternative: TransLattice Elastic Database (TED)
      ▪ PostgreSQL-based
      ▪ Geo-distributed multi-master RDBMS with sharding
      ▪ Policy Configurable
          – Degree of redundancy
          – Data location
      ▪ Uses Fast Generalized Paxos for global commit ordering
      ▪ Easily add nodes
          – New locations
          – Existing locations for scalability
      ▪ Nodes recover automatically
      ▪ Easy transition
          – Can operate in conjunction with existing database systems
  • 52. Each TransLattice Node Delivers Capabilities That Replace Numerous Disparate Technologies
      ▪ A single node type simplifies scaling and management
      [Diagram labels: TL Replication Storage Management Cluster Management Compliance Tools Fully Relational Database Management Tools Data Integration Tools]
  • 53. Thank You! msharp@translattice.com @mason_db @TransLattice

Editor's Notes

  • #52: Even sharding within an instance