PostgreSQL is one of the most advanced relational databases, and it offers superb replication capabilities. Its most important features include streaming replication, point-in-time recovery (PITR), and advanced monitoring.
Devrim Gunduz gives a presentation on Write-Ahead Logging (WAL) in PostgreSQL. WAL logs all transactions to files called write-ahead logs (WAL files) before changes are written to data files. This allows for crash recovery by replaying WAL files. WAL files are used for replication, backup, and point-in-time recovery (PITR) by replaying WAL files to restore the database to a previous state. Checkpoints write all dirty shared buffers to disk and update the pg_control file with the checkpoint location.
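The log-first discipline described above can be sketched as a toy model: every change is appended to a log before the data store is touched, so a crash after logging but before the data write can be repaired by replay. This is a simplification for illustration, not PostgreSQL's actual WAL implementation.

```python
class ToyWAL:
    def __init__(self):
        self.log = []      # stands in for the WAL files on disk
        self.data = {}     # stands in for the data files

    def write(self, key, value):
        self.log.append((key, value))   # 1) log first (the "ahead" in WAL)
        self.data[key] = value          # 2) only then touch the data files

    def recover(self):
        # After a crash, replaying the log rebuilds a consistent state.
        recovered = {}
        for key, value in self.log:
            recovered[key] = value
        return recovered

db = ToyWAL()
db.write("a", 1)
db.log.append(("b", 2))   # simulate a crash after logging, before the data write
assert "b" not in db.data
assert db.recover() == {"a": 1, "b": 2}
```

The same replay mechanism is what makes PITR possible: stop replaying at a chosen point, and you get the database as of that moment.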
This document discusses streaming replication in PostgreSQL. It covers how streaming replication works, including the write-ahead log and replication processes. It also discusses setting up replication between a primary and standby server, including configuring the servers and verifying replication is working properly. Monitoring replication is discussed along with views and functions for checking replication status. Maintenance tasks like adding or removing standbys and pausing replication are also mentioned.
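A minimal primary/standby setup of the kind described above might look like the following; parameter names follow recent PostgreSQL releases (12+), and the host, user, and data directory are placeholders:

```conf
# postgresql.conf on the primary (illustrative values)
wal_level = replica          # emit enough WAL for a standby
max_wal_senders = 10         # allow replication connections
hot_standby = on             # let the standby serve read-only queries

# pg_hba.conf on the primary: permit the standby's replication user
# host  replication  replicator  10.0.0.2/32  scram-sha-256

# On the standby, clone the primary and generate standby settings (-R):
# pg_basebackup -h 10.0.0.1 -U replicator -D /var/lib/postgresql/data -R -X stream
```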
Orchestrator allows for easy MySQL failover by monitoring the cluster and promoting a new master when failures occur. Two test cases were demonstrated: 1) using a VIP and scripts to redirect connections during failover, and 2) integrating with ProxySQL to separate reads and writes and automatically redirect write transactions during failover while keeping read queries distributed. In both cases, failover completed within 16 seconds while maintaining application availability.
This document discusses Patroni, an open-source tool for managing high availability PostgreSQL clusters. It describes how Patroni uses a distributed configuration system like Etcd or Zookeeper to provide automated failover for PostgreSQL databases. Key features of Patroni include manual and scheduled failover, synchronous replication, dynamic configuration updates, and integration with backup tools like WAL-E. The document also covers some of the challenges of building automatic failover systems and how Patroni addresses issues like choosing a new master node and reattaching failed nodes.
There are many ways to run high availability with PostgreSQL. Here, we present a template for creating your own customized high-availability solution using Python and, for maximum accessibility, a distributed configuration store like ZooKeeper or etcd.
In 40 minutes, the audience will learn a variety of ways to make a PostgreSQL database suddenly run out of memory on a box with half a terabyte of RAM.
Best practices for developers and DBAs to prevent this will also be discussed, along with a bit of PostgreSQL and Linux memory-management internals.
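One classic way this happens: work_mem is a per-sort/per-hash limit, not a per-connection one, so the worst case multiplies. A back-of-the-envelope sketch, with purely illustrative numbers (not a recommendation):

```python
ram_gib          = 512
max_connections  = 500
work_mem_mib     = 256   # a generous setting someone copied from a blog post
sort_nodes       = 8     # a complex query can have several sorts/hashes per plan

# Worst-case demand if every connection runs such a query at once:
worst_case_gib = max_connections * work_mem_mib * sort_nodes / 1024
assert worst_case_gib == 1000.0   # ~1 TiB of potential demand
assert worst_case_gib > ram_gib   # ...on a 512 GiB box
```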
Using Apache Spark to analyze large datasets in the cloud presents a range of challenges. Different stages of your pipeline may be constrained by CPU, memory, disk and/or network IO. But what if all those stages have to run on the same cluster? In the cloud, you have limited control over the hardware your cluster runs on.
You may have even less control over the size and format of your raw input files. Performance tuning is an iterative and experimental process. It’s frustrating with very large datasets: what worked great with 30 billion rows may not work at all with 400 billion rows. But with strategic optimizations and compromises, 50+ TiB datasets can be no big deal.
Using the Spark UI and simple metrics, we explore how to diagnose and remedy issues in jobs:
Sizing the cluster based on your dataset (shuffle partitions)
Ingestion challenges – well begun is half done (globbing S3, small files)
Managing memory (sorting GC – when to go parallel, when to go G1, when offheap can help you)
Shuffle (give a little to get a lot – configs for better out of box shuffle) – Spill (partitioning for the win)
Scheduling (FAIR vs FIFO, is there a difference for your pipeline?)
Caching and persistence (it’s the cost of doing business, so what are your options?)
Fault tolerance (blacklisting, speculation, task reaping)
Making the best of a bad deal (skew joins, windowing, UDFs, very large query plans)
Writing to S3 (dealing with write partitions, HDFS and s3DistCp vs writing directly to S3)
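The first item above, sizing shuffle partitions to the dataset, can be sketched as a simple heuristic: aim for a target size per shuffle partition and round up to a multiple of the cluster's cores so every wave of tasks is full. The ~200 MiB target is a common rule of thumb, not a Spark default.

```python
def shuffle_partitions(shuffle_bytes, target_mib=200, cores=None):
    """Estimate spark.sql.shuffle.partitions from the shuffle data volume."""
    partitions = max(1, shuffle_bytes // (target_mib * 1024 * 1024))
    if cores:
        # Round up to a multiple of total cores so no wave runs half-empty.
        partitions = ((partitions + cores - 1) // cores) * cores
    return partitions

# A 10 TiB shuffle on a 400-core cluster:
p = shuffle_partitions(10 * 1024**4, target_mib=200, cores=400)
assert p == 52800
```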
The document discusses MySQL's buffer pool and buffer management. It describes how the buffer pool caches frequently accessed data in memory for faster access. The buffer pool contains several lists including a free list, LRU list, and flush list. It explains functions for reading pages from storage into the buffer pool, replacing pages using LRU, and flushing dirty pages to disk including single page flushes during buffer allocation.
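The interplay of the three lists can be modeled in a few lines; this is a deliberate simplification of what InnoDB actually does (no midpoint insertion, no background flushing), meant only to show the free list, LRU eviction, and the single-page flush of a dirty victim:

```python
from collections import OrderedDict

class BufferPool:
    def __init__(self, capacity):
        self.free = list(range(capacity))   # frames not holding any page
        self.lru = OrderedDict()            # page_id -> frame, oldest first
        self.dirty = set()                  # flush list: pages awaiting flush

    def read_page(self, page_id):
        if page_id in self.lru:             # hit: move to the MRU end
            self.lru.move_to_end(page_id)
            return self.lru[page_id]
        if self.free:                       # miss: take a free frame
            frame = self.free.pop()
        else:                               # miss: evict the LRU victim
            victim, frame = self.lru.popitem(last=False)
            if victim in self.dirty:        # dirty victim: single-page flush
                self.dirty.discard(victim)  # (pretend we wrote it to disk)
        self.lru[page_id] = frame
        return frame

    def modify_page(self, page_id):
        self.read_page(page_id)
        self.dirty.add(page_id)

pool = BufferPool(capacity=2)
pool.modify_page(1)
pool.read_page(2)
pool.read_page(3)        # evicts page 1, which is dirty, so it is flushed first
assert 1 not in pool.lru and 1 not in pool.dirty
assert list(pool.lru) == [2, 3]
```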
MariaDB 10.0 introduces domain-based parallel replication which allows transactions in different domains to execute concurrently on replicas. This can result in out-of-order transaction commit. MariaDB 10.1 adds optimistic parallel replication which maintains commit order. The document discusses various parallel replication techniques in MySQL and MariaDB including schema-based replication in MySQL 5.6 and logical clock replication in MySQL 5.7. It provides performance benchmarks of these techniques from Booking.com's database environments.
Learn more at www.opensourceschool.fr
This material is distributed under a Creative Commons license (CC BY-SA 3.0 FR), Attribution-ShareAlike 3.0 France.
Plan :
1. Introduction
2. Installation
3. The psql client
4. Authentication and privileges
5. Backup and restoration
6. Internal Architecture
7. Performance optimization
8. Stats and monitoring
9. Logs
10. Replication
Presented at Percona Live Amsterdam 2016, this is an in-depth look at MariaDB Server right up to MariaDB Server 10.1. Learn how it differs from MySQL, and see which of its features MySQL already has.
This document discusses Spark shuffle, which is an expensive operation that involves data partitioning, serialization/deserialization, compression, and disk I/O. It provides an overview of how shuffle works in Spark and the history of optimizations like sort-based shuffle and an external shuffle service. Key concepts discussed include shuffle writers, readers, and the pluggable block transfer service that handles data transfer. The document also covers shuffle-related configuration options and potential future work.
MariaDB Performance Tuning and Optimization – MariaDB plc
This document discusses MariaDB performance tuning and optimization. It covers common principles like tuning from the start of application development. Specific topics discussed include server hardware, OS settings, MariaDB configuration settings like innodb_buffer_pool_size, database design best practices, and query monitoring and tuning tools. The overall goal is to efficiently use hardware resources, ensure best performance for users, and avoid outages.
PostgreSQL is designed to be easily extensible. For this reason, extensions loaded into the database can function just like built-in features. In this session, we will learn more about the PostgreSQL extension framework: how extensions are built, some popular extensions, and how to manage these extensions in your deployments.
Optimizing MariaDB for maximum performance – MariaDB plc
When it comes to optimizing the performance of a database, DBAs have to look at everything from the OS to the network. In this session, MariaDB Enterprise Architect Manjot Singh shares best practices for getting the most out of MariaDB. He highlights recommended OS settings, important configuration and tuning parameters, options for improving replication and clustering performance and features such as query result caching.
Performance optimization for all flash based on aarch64 v2.0 – Ceph Community
This document discusses performance optimization techniques for All Flash storage systems based on ARM architecture processors. It provides details on:
- The processor used, which is the Kunpeng920 ARM-based CPU with 32-64 cores at 2.6-3.0GHz, along with its memory and I/O controllers.
- Optimizing performance through both software and hardware techniques, including improving CPU usage, I/O performance, and network performance.
- Specific optimization techniques like data placement to reduce cross-NUMA access, multi-port NIC deployment, using multiple DDR channels, adjusting messaging throttling, and optimizing queue wait times in the object storage daemon (OSD).
- Other
MariaDB MaxScale is a database proxy that provides scalability, high availability, and data streaming capabilities for MariaDB and MySQL databases. It acts as a load balancer and router to distribute queries across database servers. MaxScale supports services like read/write splitting, query caching, and security features like selective data masking. It can monitor replication lag and route queries accordingly. MaxScale uses a plugin architecture and its core remains stateless to provide flexibility and high performance.
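The core read/write-splitting idea behind a proxy like MaxScale can be sketched in a few lines: route writes to the primary, round-robin reads over replicas. This is an illustration of the concept only, not MaxScale's actual routing logic (which also handles transactions, session state, and replication lag).

```python
import itertools

class ReadWriteSplitRouter:
    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = itertools.cycle(replicas)  # round-robin over replicas

    def route(self, sql):
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in ("SELECT", "SHOW"):             # read-only statements
            return next(self.replicas)
        return self.primary                        # writes, DDL, transactions

router = ReadWriteSplitRouter("primary", ["replica1", "replica2"])
assert router.route("SELECT * FROM t") == "replica1"
assert router.route("select 1") == "replica2"
assert router.route("UPDATE t SET x = 1") == "primary"
```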
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK) – Jean-François Gagné
To get better replication speed and less lag, MySQL implements parallel replication in the same schema, also known as LOGICAL_CLOCK. But fully benefiting from this feature is not as simple as just enabling it.
In this talk, I explain in detail how this feature works. I also cover how to optimize parallel replication and the improvements made in MySQL 8.0 and back-ported in 5.7 (Write Sets), greatly improving the potential for parallel execution on replicas (but needing RBR).
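The write-set idea mentioned above can be sketched simply: two transactions may be applied in parallel on a replica if the sets of row identifiers they wrote (tracked as hashes on the source) do not intersect. A simplified illustration, not MySQL's actual dependency-tracking code:

```python
def can_run_in_parallel(txn_a_rows, txn_b_rows):
    """Parallel-safe iff the two transactions' write sets are disjoint."""
    write_set_a = {hash(row) for row in txn_a_rows}
    write_set_b = {hash(row) for row in txn_b_rows}
    return write_set_a.isdisjoint(write_set_b)

# Two transactions touching different primary keys: parallel-safe.
assert can_run_in_parallel([("db.t", "pk=1")], [("db.t", "pk=2")])
# Both touch pk=1: they must be applied in order.
assert not can_run_in_parallel([("db.t", "pk=1")], [("db.t", "pk=1")])
```

Because the dependency is computed per row, this needs row-based replication (RBR), which is why the talk calls that requirement out.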
Come to this talk to get all the details about MySQL 5.7 and 8.0 Parallel Replication.
How to set up orchestrator to manage thousands of MySQL servers – Simon J Mudd
This document discusses how to scale Orchestrator to manage thousands of MySQL servers. It describes how Booking.com uses Orchestrator to manage their thousands of MySQL servers. As the number of monitored servers increases, integration with internal infrastructure is needed, Orchestrator performance must be optimized, and high availability and wider user access features are added. The document provides examples of configuration settings and special considerations needed to effectively use Orchestrator at large scale.
The document provides an overview of the InnoDB storage engine used in MySQL. It discusses InnoDB's architecture including the buffer pool, log files, and indexing structure using B-trees. The buffer pool acts as an in-memory cache for table data and indexes. Log files are used to support ACID transactions and enable crash recovery. InnoDB uses B-trees to store both data and indexes, with rows of variable length stored within pages.
The presentation covers improvements made to the redo logs in MySQL 8.0 and their impact on MySQL performance and operations. It covers MySQL versions up to 8.0.30.
Almost Perfect Service Discovery and Failover with ProxySQL and Orchestrator – Jean-François Gagné
Of course there is no such thing as perfect service discovery, and we will see why in the talk. However, the way ProxySQL is deployed in this case minimizes the risk for split-brains, and this is why I qualify it as almost perfect. But let’s step back a little...
MySQL alone is not a high availability solution. To provide resilience to primary failure, other components need to be integrated with MySQL. At MessageBird, these additional components are ProxySQL and Orchestrator. In this talk, we describe how ProxySQL is architectured to provide close to perfect Service Discovery and how this, combined with Orchestrator, allows for automatic failover. The talk presents the details of the integration of MySQL, ProxySQL and Orchestrator in Google Cloud (and it would be easy to re-implement a similar architecture at other cloud vendors or on-premises). We will also cover lessons learned for the 2 years this architecture has been in production. Come to this talk to learn more about MySQL high availability, ProxySQL and Orchestrator.
The paperback version is available on lulu.com: http://goo.gl/fraa8o
This is the first volume of the PostgreSQL database administration book. The book covers the steps for installing, configuring and administering PostgreSQL 9.3 on Debian Linux. It covers both the logical and physical aspects of PostgreSQL, and two chapters are dedicated to backup and restore.
MySQL Parallel Replication: inventory, use-case and limitations – Jean-François Gagné
Booking.com uses MySQL parallel replication extensively, with thousands of servers replicating. The presentation summarizes MySQL and MariaDB parallel replication features, including: 1) MySQL 5.6 uses schema-based parallel replication, but transactions commit out of order. 2) MariaDB 10.0 introduced out-of-order parallel replication using write domains, which can cause gaps. 3) MariaDB 10.1 includes five parallel modes, including optimistic replication to reduce deadlocks during parallel execution. Long transactions and intermediate masters can limit parallelism.
Kernel Recipes 2019 - Faster IO through io_uring – Anne Nicolas
io_uring provides a new asynchronous I/O interface in Linux that aims to address limitations with existing interfaces like aio and libaio. It uses a ring-based model for submission and completion queues to efficiently support asynchronous I/O operations with low latency and high throughput. Though initially skeptical, Linus Torvalds ultimately merged io_uring into the Linux kernel due to improvements in missing features, ease of use, and efficiency over alternatives.
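The two-ring model can be illustrated with a toy queue pair; this is a conceptual sketch of the submission/completion flow, not the real io_uring ABI (which uses shared-memory rings and syscalls like io_uring_enter):

```python
from collections import deque

class ToyRing:
    def __init__(self):
        self.sq = deque()   # submission queue: app pushes SQEs here
        self.cq = deque()   # completion queue: "kernel" pushes CQEs here

    def submit(self, op, user_data):
        self.sq.append((op, user_data))

    def kernel_step(self):
        # Stand-in for the kernel draining submissions asynchronously.
        while self.sq:
            op, user_data = self.sq.popleft()
            self.cq.append((user_data, f"{op} done"))

    def reap(self):
        return self.cq.popleft() if self.cq else None

ring = ToyRing()
ring.submit("read", user_data=1)
ring.submit("write", user_data=2)   # many ops batched per "syscall"
ring.kernel_step()
assert ring.reap() == (1, "read done")
assert ring.reap() == (2, "write done")
```

The point of the design is visible even in the toy: many operations can be queued and reaped per kernel transition, amortizing syscall cost.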
The document discusses using PostgreSQL for data warehousing. It covers advantages like complex queries with joins, windowing functions and materialized views. It recommends configurations like separating the data warehouse onto its own server, adjusting memory settings, disabling autovacuum and using tablespaces. Methods of extract, transform and load (ETL) data discussed include COPY, temporary tables, stored procedures and foreign data wrappers.
Out of the box replication in postgres 9.4 – Denish Patel
This document provides an overview of setting up out of the box replication in PostgreSQL 9.4 without third party tools. It discusses write-ahead logs (WAL), replication slots, pg_basebackup, and pg_receivexlog. The document then demonstrates setting up replication on VMs with pg_basebackup to initialize a standby server, configuration of primary and standby servers, and monitoring of replication.
This document discusses using PostgreSQL for a large database at Inmobi, an independent mobile ad network. It covers topics like partitioning the database into tables by date for improved performance, choosing indexes to optimize queries, ensuring high availability through streaming replication, and automating regular maintenance and archiving of old data.
Performance tuning in PostgreSQL does not have to be hard. Often a few simple changes are enough to speed up the database massively.
Many performance problems are easy to solve. This presentation shows the most common methods for eliminating simple problems quickly and efficiently.
In SQL, joins are a fundamental concept, and many database engines have serious problems when it comes to joining very many tables.
PostgreSQL is a pretty cool database - the question is just: How many joins can it take?
It is known that Oracle does not accept insanely long queries and MySQL is known to core dump with 2000 tables.
This talk shows how to join 1 million tables with PostgreSQL.
By default, PostgreSQL stores data on disk in its standard format. However, in many cases business or legal requirements demand that data on disk be encrypted.
On-disk encryption was paid for by German industry and is a classic case showing how business and Open Source can coexist. This talk shows that it can actually be cheaper to implement a feature in an Open Source product than to license a commercial product such as Oracle, DB2, Informix, or MS SQL Server.
Open Source can make a valuable contribution to save costs for everybody.
This document summarizes a presentation about PostgreSQL replication. It discusses different replication terms like master/slave and primary/secondary. It also covers replication mechanisms like statement-based and binary replication. The document outlines how to configure and administer replication through files like postgresql.conf and recovery.conf. It discusses managing replication including failover, failback, remastering and replication lag. It also covers synchronous replication and cascading replication setups.
Best Practices of HA and Replication of PostgreSQL in Virtualized Environments – Jignesh Shah
This document discusses best practices for high availability (HA) and replication of PostgreSQL databases in virtualized environments. It covers enterprise needs for HA, technologies like VMware HA and replication that can provide HA, and deployment blueprints for HA, read scaling, and disaster recovery within and across datacenters. The document also discusses PostgreSQL's different replication modes and how they can be used for HA, read scaling, and disaster recovery.
This document provides an overview of five steps to improve PostgreSQL performance: 1) hardware optimization, 2) operating system and filesystem tuning, 3) configuration of postgresql.conf parameters, 4) application design considerations, and 5) query tuning. The document discusses various techniques for each step such as selecting appropriate hardware components, spreading database files across multiple disks or arrays, adjusting memory and disk configuration parameters, designing schemas and queries efficiently, and leveraging caching strategies.
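As an illustration of step 3, here are a few commonly tuned postgresql.conf parameters; the values are illustrative starting points for a hypothetical 64 GiB server, not recommendations:

```conf
shared_buffers = 16GB              # often ~25% of RAM
effective_cache_size = 48GB        # planner hint: OS cache + shared_buffers
work_mem = 64MB                    # per sort/hash node, so keep it modest
maintenance_work_mem = 1GB         # vacuum and index builds
checkpoint_completion_target = 0.9 # spread checkpoint I/O over time
```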
Out of the Box Replication in Postgres 9.4 (pgconfsf) – Denish Patel
Denish Patel gave a presentation on PostgreSQL replication. He began by introducing himself and his background. He then discussed PostgreSQL write-ahead logging (WAL), replication history, and how replication is currently setup. The presentation covered replication slots, demoing replication without external tools using pg_basebackup, streaming replication with slots, and pg_receivexlog. Patel also discussed monitoring replication and answered questions from the audience.
This document summarizes different approaches to data warehousing including Inmon's 3NF model, Kimball's conformed dimensions model, Linstedt's data vault model, and Rönnbäck's anchor model. It discusses the challenges of data warehousing and provides examples of open source software that can be used to implement each approach including MySQL, PostgreSQL, Greenplum, Infobright, and Hadoop. Cautions are also noted for each methodology.
This document discusses various issues encountered with a PostgreSQL database at InMobi. It includes discussions around high user connections, idle transactions, long-running queries, temporary file limits, out of memory errors, replication issues, partitions, tablespaces, SSH tunneling, and miscellaneous other topics. Potential solutions are provided around increasing connection pools, killing idle transactions, analyzing query plans, increasing configuration parameters, and ensuring proper replication setup.
PostgreSQL is a strong relational database that is highly capable of replication. As of PostgreSQL 9.4, streaming replication can only replicate an entire database instance.
Walbouncer allows filtering the PostgreSQL WAL and partially replicating single databases.
This talk describes how to write your own custom aggregation functions in PostgreSQL.
Hans also shows how to use window functions and analytics to handle data in PostgreSQL.
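The shape of a custom aggregate (a state-transition step plus a final function, as in PostgreSQL's CREATE AGGREGATE) can be sketched with Python's stdlib sqlite3, which exposes the same step/finalize model. The geometric-mean aggregate here is an illustrative example, not from the talk:

```python
import sqlite3

class GeometricMean:
    def __init__(self):
        self.product, self.count = 1.0, 0

    def step(self, value):          # state-transition function
        self.product *= value
        self.count += 1

    def finalize(self):             # final function
        return self.product ** (1.0 / self.count) if self.count else None

conn = sqlite3.connect(":memory:")
conn.create_aggregate("geomean", 1, GeometricMean)
conn.execute("CREATE TABLE t (x REAL)")
conn.executemany("INSERT INTO t VALUES (?)", [(2.0,), (8.0,)])
(result,) = conn.execute("SELECT geomean(x) FROM t").fetchone()
assert abs(result - 4.0) < 1e-9   # geometric mean of 2 and 8 is 4
```

In PostgreSQL itself, the equivalent would declare an SFUNC and FINALFUNC via CREATE AGGREGATE, and the result could also be used in a window clause.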
In PostgreSQL, you can use "explain" to see which execution plan PostgreSQL chooses for a query. This helps in finding performance problems and increasing database throughput.
Presentation introducing materialized views in PostgreSQL with use cases. These slides were used for my talk at Indian PostgreSQL Users Group meetup at Hyderabad on 28th March, 2014
Patterns provide structure and clarity, enabling architects to establish their solutions across the enterprise. These software patterns also help link technology and business requirements effectively and efficiently. Patterns make for robust solutions to business problems thanks to their wide adoption and reusability. In addition, patterns create a common method to communicate, document and describe solutions. This session explains some of these patterns, ranging from SOA (Service-Oriented Architecture) and WOA (Web-Oriented Architecture) to EDA (Event-Driven Architecture) and IoT (Internet of Things).
PostgreSQL is a free and open-source relational database management system that provides high performance and reliability. It supports replication through various methods including log-based asynchronous master-slave replication, which the presenter recommends as a first option. The upcoming PostgreSQL 9.4 release includes improvements to replication such as logical decoding and replication slots. Future releases may add features like logical replication consumers and SQL MERGE statements. The presenter took questions at the end and provided additional resources on PostgreSQL replication.
Out of the Box Replication in Postgres 9.4 (PgCon) – Denish Patel
The document provides an overview of out-of-the-box replication in PostgreSQL 9.4. It discusses PostgreSQL write-ahead logging (WAL), setting up basic streaming replication between a primary and standby server, taking base backups with pg_basebackup, and using replication slots and pg_receivexlog to archive WAL files without external tools. The presentation includes steps to set up a demo of this replication method on a virtual machine.
Replication in PostgreSQL tutorial given in Postgres Conference 2019 - Abbas Butt
This document provides an overview of replication in PostgreSQL, including the various methods and configurations. It discusses replication at both the physical and logical levels. At the physical level, it covers disk-based replication using NAS, file system based replication using DRBD, and log shipping based approaches at both the file and block levels. It also covers logical replication using trigger-based replication with Slony-I, statement-based replication with pgpool-II, and logical decoding-based approaches. Details are provided on setting up and configuring each method, including performing failovers.
The document provides configuration instructions and guidelines for setting up streaming replication between a PostgreSQL master and standby server, including setting parameter values for wal_level, max_wal_senders, wal_keep_segments, creating a dedicated replication role, using pg_basebackup to initialize the standby, and various recovery target options to control the standby's behavior. It also discusses synchronous replication using replication slots and monitoring the replication process on both the master and standby servers.
Built-in physical and logical replication in PostgreSQL - Firat Gulec
What is Replication?
Why do we need Replication?
How many replication layers do we have?
Understanding milestones of built-in Database Physical Replication.
What is the purpose of replication, and how do you rescue the system in case of failover?
What is Streaming Replication and what are its advantages? Async vs. sync, hot standby, etc.
How do you configure master and standby servers, and what are the most important parameters? Example topology.
What is Cascading Replication and how do you configure it? Live Demo on Terminal.
What is Logical Replication coming with PostgreSQL 10? And what are its advantages?
Logical Replication vs Physical Replication
Limitations of Logical Replication
Quorum Commit for Sync Replication etc.
What is coming up with PostgreSQL 11 about replication?
A ten-question quiz, with gifts given to participants according to their success.
This document provides an overview of replicating a PostgreSQL database. It discusses setting up a primary server for reads and writes and standby servers that are kept in sync with the primary to serve as backups. The primary server writes data to its write-ahead log (WAL) files, which are streamed in real-time to the standby servers via WAL shipping. This allows the standby servers to keep an identical copy of the database. The document also covers configuration of both the primary and standby servers for replication as well as tools for testing the replication setup.
This document outlines an advanced administration training course for PostgreSQL. The agenda covers topics such as installation, configuration, database management, security, backups and recovery, performance tuning, replication, and monitoring. It introduces PostgreSQL and its features, community support resources, architecture including processes, memory, and disk structures, and provides objectives for individual training modules.
This document provides an overview and introduction to PostgreSQL for new users. It covers getting started with PostgreSQL, including installing it, configuring authentication and logging, upgrading to new versions, routine maintenance tasks, hardware recommendations, availability and scalability options, and query tuning and optimization. The document is presented as a slide deck with different sections labeled by letters (e.g. K-0, S-0, U-0).
PostgreSQL Replication High Availability Methods - Mydbops
This slides illustrates the need for replication in PostgreSQL, why do you need a replication DB topology, terminologies, replication nodes and many more.
10 things an Oracle DBA should care about when moving to PostgreSQL - PostgreSQL-Consulting
PostgreSQL can handle many of the same workloads as Oracle and provides alternatives to common Oracle features and practices. Some key differences for DBAs moving from Oracle to PostgreSQL include: using shared_buffers instead of the SGA, with a recommended 25-75% of RAM; using pgbouncer instead of a listener; performing backups with pg_basebackup and WAL archiving instead of RMAN; managing undo data in datafiles instead of undo segments; using streaming replication for high availability instead of RAC; and needing to tune autovacuum instead of manually managing redo and undo logs. PostgreSQL is very capable but may not be suited for some extremely high update workloads of 200K+ transactions per second on a single server.
The 10 best PostgreSQL replication strategies for your enterprise - EDB
This webinar will help you understand the differences between the various replication approaches, recognize the requirements of each strategy, and get a clear picture of what can be achieved with each one. With that, you will hopefully be better able to determine which types of PostgreSQL replication you really need for your system.
- How physical and logical replication work in PostgreSQL
- Differences between synchronous and asynchronous replication
- Advantages, disadvantages, and challenges of multi-master replication
- Which replication strategy is better suited for different use cases
Speaker:
Borys Neselovskyi, Regional Sales Engineer DACH, EDB
PostgreSQL Sharding and HA: Theory and Practice (PGConf.ASIA 2017) - Aleksander Alekseev
This document summarizes Aleksander Alekseev's talk on PostgreSQL sharding and high availability. The talk introduces replication in PostgreSQL, including physical and logical replication. It discusses solutions for high availability like Stolon and PostgreSQL Multimaster, as well as existing solutions for sharding like manual sharding, Citus, Greenplum, and pg_shardman. It also briefly covers other databases like Amazon Aurora and CockroachDB that are inspired by PostgreSQL but with additional distributed functionality.
With any database system, backup is one of the most important tools. PostgreSQL, the most advanced open-source database, facilitates various backup and recovery options:
* Logical Backup
* Physical Backup
* Archive Logging
* Point in Time Recovery
The document discusses lessons learned from setting up and maintaining a PostgreSQL cluster for a data analytics platform. It describes four stories where problems arose: 1) Implementing automatic failover using Repmgr when the master node failed, 2) The disk filling up faster than expected due to PostgreSQL's MVCC implementation, 3) Being unable to add a new standby node due to missing WAL segments, and 4) Long running queries on the standby node causing conflicts with replication. The key lessons are around using the right tools like Repmgr for replication management, tuning autovacuum, archiving WALs, and addressing hardware limitations for analytics workloads.
Building Tungsten clusters with PostgreSQL hot standby and streaming replication - Command Prompt, Inc.
Alex Alexander & Linas Virbalas
Hot standby and streaming replication will move the needle forward for high availability and scaling for a wide number of applications. Tungsten already supports clustering using warm standby. In this talk we will describe how to build clusters using the new PostgreSQL features and give our report from the trenches.
This talk will cover how hot standby and streaming replication work from a user perspective, then dive into a description of how to use them, taking Tungsten as an example. We'll cover the following issues:
* Configuration of warm standby and streaming replication
* Provisioning new standby instances
* Strategies for balancing reads across primary and standby databases
* Managing failover
* Troubleshooting and gotchas
Please join us for an enlightening discussion of a set of PostgreSQL features that are interesting to a wide range of PostgreSQL users.
3. What you will learn
How PostgreSQL writes data
What the transaction log does
How to set up streaming replication
How to handle Point-In-Time-Recovery
Managing conflicts
Monitoring replication
More advanced techniques
Hans-Jürgen Schönig
www.postgresql-support.de
5. Writing a row of data
Understanding how PostgreSQL writes data is key to
understanding replication
Vital to understand PITR
A lot of potential to tune the system
6. Write the log first (1)
It is not possible to send data to a data file directly.
What if the system crashes during a write?
A data file could end up with broken data at potentially
unknown positions
Corruption is not an option
7. Write the log first (2)
Data goes to the xlog (= WAL) first
WAL is short for “Write Ahead Log”
IMPORTANT: The xlog DOES NOT contain SQL
It contains BINARY changes
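The write-first principle above can be sketched as a toy model in Python (an illustration of the idea only, not PostgreSQL's actual implementation):

```python
# Toy model of write-ahead logging: every change is appended to the log
# before the "data file" (a dict here) is modified, so a crash can be
# repaired by replaying the log from the start.

class ToyWAL:
    def __init__(self):
        self.log = []   # durable log records (binary changes in the real WAL)
        self.data = {}  # stands in for the data files

    def update(self, key, value):
        self.log.append((key, value))  # 1. write the log first (and flush it)
        self.data[key] = value         # 2. only then touch the data file

    def replay(self):
        # Crash recovery: rebuild the data file by replaying the log.
        recovered = {}
        for key, value in self.log:
            recovered[key] = value
        return recovered

wal = ToyWAL()
wal.update("a", 1)
wal.update("a", 2)
assert wal.replay() == {"a": 2}  # replay reproduces the final state
```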
8. The xlog
The xlog consists of a set of 16 MB files
The xlog consists of various types of records (heap changes,
btree changes, etc.)
It has to be flushed to disk on commit to achieve durability
9. Expert tip: Debugging the xlog
Change WAL_DEBUG in src/include/pg_config_manual.h
Recompile PostgreSQL
NOTE: This is not for normal use but just for training purposes
10. Enabling wal debug
test=# SET wal_debug TO on;
SET
test=# SET client_min_messages TO debug;
SET
11. Observing changes
Every change will go to the screen now
It helps to understand how PostgreSQL works
Apart from debugging: The practical purpose is limited
12. Making changes
Data goes to the xlog first
Then data is put into shared buffers
At some later point data is written to the data files
This does not happen instantly, leaving a lot of room for optimization and tuning
13. A consistent view of the data
Data is not sent to those data files instantly.
Still: End users will have a consistent view of the data
When a query comes in, it checks the I/O cache (=
shared buffers) and asks the data files only in case of a cache
miss.
The xlog is about the physical, not the logical, level
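The read path described above can be sketched as follows (illustration only; the names are made up and this is not PostgreSQL's internal code):

```python
# Toy read path: a query checks the I/O cache (shared buffers) first and
# reads the data file only on a cache miss, caching the page afterwards.

def read_page(page_id, shared_buffers, data_file, stats):
    if page_id in shared_buffers:
        stats["hits"] += 1
        return shared_buffers[page_id]
    stats["misses"] += 1
    page = data_file[page_id]        # cache miss: go to the data file
    shared_buffers[page_id] = page   # keep the page cached for the next reader
    return page

stats = {"hits": 0, "misses": 0}
buffers = {}
data_file = {1: "page-1", 2: "page-2"}
read_page(1, buffers, data_file, stats)  # first read: miss
read_page(1, buffers, data_file, stats)  # second read: hit
assert stats == {"hits": 1, "misses": 1}
```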
14. Sustaining writes
We cannot write to the xlog forever without recycling it.
The xlog is recycled during a so-called “checkpoint”.
Before the xlog can be recycled, data must be stored safely in
those data files
Checkpoints have a huge impact on performance
16. Checkpointing too frequently
Checkpointing is expensive
PostgreSQL warns about too frequent checkpoints
This is what checkpoint_warning is good for
17. min_wal_size and max_wal_size (1)
This is a replacement for checkpoint_segments
Now the xlog size is auto-tuned
The new configuration was introduced in PostgreSQL 9.5
18. min_wal_size and max_wal_size (2)
Instead of having a single knob (checkpoint_segments) that both triggers checkpoints and determines how many segments to recycle, these are now separate concerns. There is still an internal variable called CheckpointSegments, which triggers checkpoints. But it no longer determines how many segments to recycle at a checkpoint. That is now auto-tuned by keeping a moving average of the distance between checkpoints (in bytes), and trying to keep that many segments in reserve.
19. min_wal_size and max_wal_size (3)
The advantage of this is that you can set max_wal_size very high, but the system won’t actually consume that much space if there isn’t any need for it. min_wal_size sets a floor for that; you can effectively disable the auto-tuning behavior by setting min_wal_size equal to max_wal_size.
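The auto-tuning idea can be sketched as a toy calculation (an illustration of the moving-average logic described above, not the exact server algorithm; the default values here are assumptions):

```python
# Sketch: keep a moving average of the distance between recent checkpoints
# and reserve that many WAL segments, clamped between min_wal_size and
# max_wal_size (16 MB per segment, as in stock PostgreSQL builds).

def segments_to_keep(distances_mb, seg_size_mb=16, min_wal_mb=80, max_wal_mb=1024):
    avg_distance_mb = sum(distances_mb) / len(distances_mb)  # moving average
    target_mb = min(max(avg_distance_mb, min_wal_mb), max_wal_mb)
    return int(target_mb // seg_size_mb)

# Recent checkpoints were ~160 MB apart, so keep ~10 segments in reserve.
assert segments_to_keep([150, 160, 170]) == 10
# A nearly idle system is clamped up to the min_wal_size floor.
assert segments_to_keep([10, 10, 10]) == 5
```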
20. How does it impact replication
The xlog has all the changes needed and can therefore be
used for replication.
Copying data files is not enough to achieve a consistent view
of the data
It has some implications related to base backups
21. Setting up streaming replication
22. The basic process
S: Install PostgreSQL on the slave (no initdb)
M: Adapt postgresql.conf
M: Adapt pg_hba.conf
M: Restart PostgreSQL
S: Pull a base backup
S: Start the slave
23. Changing postgresql.conf
wal_level: Ensure that enough xlog is generated by the master (recovering a server needs more xlog than simple crash-safety does)
max_wal_senders: A streaming slave connects to the master and fetches xlog through a WAL sender process. A base backup will also need 1-2 WAL senders
hot_standby: Not strictly needed here, because it is ignored on the master, but setting it now saves some work on the slave later on
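A minimal master-side postgresql.conf sketch for these parameters (the values are illustrative for a 9.x-era setup, not recommendations):

```ini
# postgresql.conf on the master -- illustrative values, adjust to your setup
wal_level = hot_standby      # enough xlog for a standby (9.6+: 'replica')
max_wal_senders = 5          # streaming slaves plus 1-2 for base backups
hot_standby = on             # ignored on the master, ready for the slave
```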
24. Changing pg_hba.conf
Rules for replication have to be added.
Note that “all” databases does not include replication
A separate rule has to be added, which explicitly states “replication” in the second column
Replication rules work just like any other pg_hba.conf rule
Remember: The first matching line wins
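A minimal pg_hba.conf sketch for such a rule (the network, user name, and auth method are illustrative assumptions):

```ini
# pg_hba.conf on the master -- the "replication" keyword must be explicit,
# "all" does not cover it.
# TYPE  DATABASE     USER        ADDRESS          METHOD
host    replication  replicator  192.168.0.0/24   md5
```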
25. Restarting PostgreSQL
To activate those settings in postgresql.conf the master has to
be restarted.
If only pg_hba.conf is changed, a simple SIGHUP (pg_ctl reload) is enough.
26. Using pg_basebackup (1)
pg_basebackup will fetch a copy of the data from the master
While pg_basebackup is running, the master is fully operational (no downtime needed)
pg_basebackup connects through a database connection and copies all data files as they are
In most cases this does not create a consistent backup
The xlog is needed to “repair” the base backup (this is exactly
what happens during xlog replay anyway)
28. xlog-method: Self-contained backups
By default a base backup is not self-contained.
The database does not start up without additional xlog.
This is fine for Point-In-Time-Recovery because there is an
archive around.
For streaming it can be a problem.
--xlog-method=stream opens a second connection to fetch xlog during the base backup
29. checkpoint=fast: Instant backups
By default pg_basebackup starts as soon as the master checkpoints.
This can take a while.
--checkpoint=fast makes the master checkpoint instantly.
In case of a small backup an instant checkpoint speeds things up.
30. -R: Generating a config file
For a simple streaming setup, all PostgreSQL has to know is already passed to pg_basebackup (host, port, etc.).
-R automatically generates a recovery.conf file, which is quite ok in most cases.
31. Backup throttling
--max-rate=RATE: maximum transfer rate for the data directory transfer (in kB/s, or use suffix “k” or “M”)
If your master is weak, a pg_basebackup running at full speed can lead to high response times and disk wait.
Slowing down the backup can help to make sure the master stays responsive.
32. Adjusting recovery.conf
A basic setup needs:
primary_conninfo: A connect string pointing to the master server
standby_mode = on: Tells the system to stream instantly
Additional configuration parameters are available
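A minimal recovery.conf sketch (host and user are illustrative assumptions; note that since PostgreSQL 12 these settings live in postgresql.conf instead):

```ini
# recovery.conf on the slave (pre-v12 layout)
standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432 user=replicator'
```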
33. Starting up the slave
Make sure the slave has connected to the master
Make sure it has reached a consistent state
Check for the wal sender and wal receiver processes
34. Promoting a slave to master
Promoting a slave to a master is easy:
pg_ctl -D ... promote
After promotion recovery.conf will be renamed to
recovery.done
35. One word about security
So far replication has been done as superuser
This is not necessary
Creating a user, which can do just replication makes sense
CREATE ROLE foo ... REPLICATION ... NOSUPERUSER;
37. Simple checks
The most basic check is to look for the processes:
wal sender (on the master)
wal receiver (on the slave)
Without those processes the party is over
38. More detailed analysis
pg_stat_replication contains a lot of information
Make sure an entry for each slave is there
Check for replication lag
39. Checking for replication lag
A sustained lag is not a good idea.
The distance between the sender and the receiver can be
measured in bytes
SELECT client_addr,
pg_xlog_location_diff(pg_current_xlog_location(),
sent_location)
FROM pg_stat_replication;
In asynchronous replication the replication lag can vary
dramatically (for example during CREATE INDEX, etc.)
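What pg_xlog_location_diff() returns can be reproduced by hand: an LSN such as '16/B374D848' is a 64-bit byte position written as two hexadecimal halves, so the lag in bytes is simply the difference of the decoded positions. A small sketch (illustration only):

```python
# Decode PostgreSQL xlog positions (LSNs) and compute the replication lag
# in bytes, mirroring what pg_xlog_location_diff() does on the server.

def lsn_to_bytes(lsn):
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)

def replication_lag_bytes(master_lsn, standby_lsn):
    return lsn_to_bytes(master_lsn) - lsn_to_bytes(standby_lsn)

assert replication_lag_bytes("0/3000148", "0/3000000") == 0x148
# The high half carries over: '1/0' is exactly one byte past '0/FFFFFFFF'.
assert replication_lag_bytes("1/0", "0/FFFFFFFF") == 1
```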
41. Handling more than 2 nodes
A simple 2 node cluster is easy.
In case of more than 2 servers, life is a bit harder.
If you have two slaves and the master fails: Who is going to
be the new master?
Unless you want to resync all your data, you should elect the server already containing most of the data
Comparing xlog positions is necessary
42. Timeline issues
When a slave is promoted the timeline ID is incremented
Master and slave have to be in the same timeline
In case of two servers it is important to connect one server to
the second one first and do the promotion AFTERWARDS.
This ensures that the timeline switch is already replicated
from the new master to the surviving slave.
43. Cascading slaves
Slaves can be connected to slaves
Cascading can make sense to reduce bandwidth requirements
Cascading can take load from the master
Use pg_basebackup to fetch data from a slave as if it were a master
45. How conflicts happen
During replication conflicts can happen
Example: The master might want to remove a row still visible
to a reading transaction on the slave
46. What happens during a conflict
PostgreSQL will terminate a database connection after some
time
max_standby_archive_delay = 30s
max_standby_streaming_delay = 30s
Those settings define the maximum time the slave waits
during replay before replay is resumed.
In rare cases a connection might be aborted quite soon.
47. Reducing conflicts
Conflicts can be reduced nicely by setting hot_standby_feedback.
The slave will send its oldest transaction ID to tell the master
that cleanup has to be deferred.
49. What happens if a slave reboots?
If a slave is gone for too long, the master might recycle its
transaction log
The slave needs a full history of the xlog
Setting wal_keep_segments on the master helps to prevent the master from recycling the transaction log too early
I recommend always using wal_keep_segments to make sure that a slave can be started after a pg_basebackup
50. Making use of replication slots
Replication slots have been added in PostgreSQL 9.4
There are two types of replication slots:
Physical replication slots (for streaming)
Logical replication slots (for logical decoding)
51. Configuring replication slots
Change max_replication_slots and restart the master
Run . . .
test=# SELECT *
FROM pg_create_physical_replication_slot('some_name');
slot_name | xlog_position
-----------+---------------
some_name |
(1 row)
52. Tweaking the slave
Add this replication slot to primary_slot_name on the slave:
primary_slot_name = 'some_name'
The master will ensure that xlog is only recycled when it has
been consumed by the slave.
53. A word of caution
If a slave is removed, make sure its replication slot is dropped.
Otherwise the master might run out of disk space.
NEVER use replication slots without monitoring the size of
the xlog on the sender.
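A minimal monitoring query (a sketch for 9.4-era servers, where the xlog function names below exist):

```sql
SELECT slot_name, active,
       pg_xlog_location_diff(pg_current_xlog_location(), restart_lsn)
           AS retained_bytes
FROM pg_replication_slots;
```

An inactive slot whose retained_bytes keeps growing is exactly the situation that fills up the sender's disk.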
54. Key advantages of replication slots
The slave can lag behind the master by an arbitrary amount without losing xlog.
During bulk load or CREATE INDEX this can be essential.
It can help to overcome the problems caused by slow networks.
It can help to avoid resyncs.
56. Synchronous vs. asynchronous
Asynchronous replication: Commits on the slave can happen
long after the commit on the master.
Synchronous replication: A transaction has to be written to a
second server before the commit is acknowledged.
Synchronous replication therefore adds network latency
to every commit
57. The application name
During normal operations the application_name setting can be
used to assign a name to a database connection.
In case of synchronous replication this variable is used to
determine synchronous candidates.
58. Configuring synchronous replication:
Master:
add names to synchronous_standby_names
Slave:
add an application_name to your connect string in
primary_conninfo
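A minimal sketch, assuming a standby called standby1 and a replication user repl (both names are placeholders):

```ini
# master: postgresql.conf
synchronous_standby_names = 'standby1'

# slave: recovery.conf
primary_conninfo = 'host=master.example.com port=5432 user=repl application_name=standby1'
```

The application_name in the connect string must match an entry in synchronous_standby_names for the standby to qualify as synchronous.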
59. Fail safety
Synchronous replication needs 2 active servers
If only one server is left, commits will wait until a second
server is available again.
Use AT LEAST 3 servers for synchronous replication to avoid
this risk.
61. What it does
PITR can be used to reach (almost) any point after a base
backup.
It is more of a backup strategy than a replication thing.
Replication and PITR can be combined.
62. Configuring for PITR
S: create an archive (ideally this is not on the master)
M: Change postgresql.conf
set wal_level
set max_wal_senders (if pg_basebackup is desired)
set archive_mode to on
set a proper archive_command to archive xlog
M: adapt pg_hba.conf (if pg_basebackup is desired)
M: restart the master
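The master side might look like this; the archive_command is only an example, in production use a command that copies safely to a separate host:

```ini
# postgresql.conf on the master
wal_level = hot_standby        # 'archive' is enough for pure PITR
max_wal_senders = 5            # only needed if pg_basebackup is used
archive_mode = on
archive_command = 'cp %p /archive/%f'
```

Note that archive_command must fail (return non-zero) if the file cannot be archived, so that PostgreSQL retries instead of losing xlog.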
63. pg_basebackup, etc.
Perform a pg_basebackup as performed before
--xlog-method=stream and -R are not needed
In the archive a .backup file will be available after
pg_basebackup
You can delete all xlog files older than the oldest base backup
you want to keep.
The .backup file will guide you
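A typical invocation might look like this (hostname, user, and target directory are placeholders):

```shell
pg_basebackup -h master.example.com -U repl -D /backup/base -P
```

-P simply shows progress; the target directory must be empty.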
64. Restoring from a crash
Take a base backup.
Write a recovery.conf file:
restore_command: Tell PostgreSQL where to find xlog
recovery_target_time (optional): Use a timestamp to tell the
system how far to recover
Start the server
Make sure the system has reached consistent state
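A recovery.conf sketch; the archive path and timestamp are only examples:

```ini
restore_command = 'cp /archive/%f %p'
recovery_target_time = '2015-03-15 12:00:00'
```

Without a recovery target, PostgreSQL replays all available xlog and recovers as far as possible.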
66. recovery_min_apply_delay: Delayed replay
This setting allows you to tell the slave that a certain replay
delay is desired.
Example: A stock broker might want to provide you with
15-minute-old data
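For the stock-broker example above this is a one-line entry:

```ini
# recovery.conf on the slave
recovery_min_apply_delay = '15min'
```

As a side effect, a delayed slave also gives you a window to stop replay before a mistake (such as an accidental DROP TABLE) is applied.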
67. pause_at_recovery_target
Makes sure that recovery does not simply end (and promote)
at the specified point in time.
Instead, PostgreSQL pauses when the target is reached.
This is essential in case you do not know precisely how far to
recover: you can inspect the data and resume if needed
68. recovery_target_name
Sometimes you want to recover to a certain point
that has been named beforehand.
To create such a named restore point run . . .
SELECT pg_create_restore_point('some_name');
Use this name in recovery.conf to recover to this exact
point
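In recovery.conf the named restore point created above is then used like this (the archive path is only an example):

```ini
restore_command = 'cp /archive/%f %p'
recovery_target_name = 'some_name'
```

Recovery stops at the point where pg_create_restore_point('some_name') was executed.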
70. Contact us . . .
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
More than 15 years of PostgreSQL experience:
Training
Consulting
24x7 support