Taming ReorderBufferWrite - Boost Logical Decoding in Postgres
Performance bottlenecks in a Postgres logical replication or Change Data Capture (CDC) stream can be subtle, but one specific wait event, ReorderBufferWrite, often points directly at how your application interacts with the database during these processes. Let's unpack this wait event and see how your application's workload patterns can trigger it.
The PostGIS Team is pleased to release PostGIS 3.5.3.
This version requires PostgreSQL 12 - 18beta1, GEOS 3.8 or higher, and Proj 6.1+. To take advantage of all features, GEOS 3.12+ is needed. SFCGAL 1.4+ is needed to enable postgis_sfcgal support. To take advantage of all SFCGAL features, SFCGAL 1.5+ is needed.
Backfilling data into a TimescaleDB hypertable in production can be very tricky, especially when automated processes like compression policies are involved. From past experience, we have seen that if backfill operations aren’t handled properly, they can interfere with these automated tasks, sometimes causing them to stop working altogether.
This blog covers a safer and more reliable approach to backfilling hypertables, along with best practices to prevent disruptions to compression and other background processes.
What is a Backfill Operation?
Backfilling means adding old or missing data into a database table after that time has already passed.
Imagine you are collecting temperature readings every hour, but your system was down for a day and didn’t save any data. Later, you recover the missing readings from the device’s local storage or from cloud storage and insert them into the right hypertable; that is backfilling.
In TimescaleDB, this is common with time-series data, but it needs to be done carefully. That’s because TimescaleDB might already be doing things in the background, like compressing old data to save space. If we are not careful, backfilling can mess up these automatic tasks.
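As a rough illustration of the kind of precaution the post recommends, here is a minimal sketch of pausing a compression policy around a backfill. It assumes TimescaleDB 2.x job APIs (timescaledb_information.jobs, alter_job, decompress_chunk, show_chunks) and a hypothetical hypertable named conditions; the job id 1000 and the dates are placeholders.

-- Find the compression policy job for the hypertable
SELECT job_id FROM timescaledb_information.jobs
WHERE proc_name = 'policy_compression' AND hypertable_name = 'conditions';

-- Pause the policy so it cannot compress chunks mid-backfill
SELECT alter_job(1000, scheduled => false);

-- Decompress already-compressed chunks overlapping the backfill window
SELECT decompress_chunk(c, true)
FROM show_chunks('conditions', newer_than => '2025-04-01'::timestamptz, older_than => '2025-04-03'::timestamptz) AS c;

-- Run the backfill, then re-enable the policy
INSERT INTO conditions (time, device_id, temperature) VALUES ('2025-04-02 10:00+00', 42, 21.5);
SELECT alter_job(1000, scheduled => true);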
With volunteer help from: Andreas Froede, Christoph Moench-Tegeder, Cornelia Biacsics, Devrim Gündüz, Dirk C. Aumueller, Ellyne Phneah, Floor Drees, Franck Pachot, Jan Wieremjewicz, Jim Jones, Johannes Paul, Julian Markwort, Mathias Franz, Matthis Gördel, Michael Banck, Oleksii Kliukin, Pavlo Golub, Priyanka Chatterjee, Tim Westermann.
The week before this year's PGConf.DE, I attended the Debian MiniDebConf in Hamburg. Adding Berlin to the itinerary made for a busy two weeks on the road. Still, it gave me time to prep for my PGConf talk while catching up on Debian-related discussions.
Conference Kickoff & Venue
We hit the road to Berlin the day before the conference. My colleague Bernd Helmle drove, and after a relaxed six-hour ride, we arrived. The venue was familiar — the same as PGConf.EU 2022 — a hotel with a large conference floor built around a sprawling atrium. While rooms were somewhat dispersed, the layout helped accommodate the crowd of 340 attendees comfortably.
Andreas "ads" Scherbaum kicked off the event with characteristic German punctuality: “We are in Germany, let’s start on time!”
Day 1: Talk Highlights
Practical Lessons & Query Life Cycles
First up, I sat in on Bernd Patolla's session on migrating from Oracle to PostgreSQL. Honestly, I was distracted, as I was still polishing slides for my own talk.
I then partially tuned into Sergey Dudoladov’s session on the life cycle of a query in PostgreSQL — again, nerves and last-minute edits dominated.
However, I fully engaged with András Váczi’s practical insights into running PostgreSQL on Windows. From backup headaches to tool compatibility, it was a reality check on cross-platform database management.
Read Andras Vaczi's talk on running PostgreSQL on Windows here.
Modern VACUUM: My Talk
In my own session, Modern VACUUM, I explored the evolution of PostgreSQL’s VACUUM and autovacuum processes. The talk stemmed from internal CYBERTEC consulting team discussions: VACUUM has changed a lot, and I needed to catch up and share that knowledge.
I’m pleased to welcome Mankirat Singh to the Postgres community as a 2025 Google Summer of Code contributor. Mankirat will be developing an ABI compliance checker and reporting system to help identify and prevent unintentional ABI changes in future minor Postgres releases. This follows on the heels of the addition of ABI and API guidance in Postgres 18, as well as the ABI-breaking Postgres 17.1 release. What timing!
Please follow Mankirat’s blog as he develops the project this summer under the mentorship of Pavlo Golub and me. It should also soon be on Planet PostgreSQL. We’ve also set up the #gsoc2025-abi-compliance-checker channel on the community Slack for ad-hoc discussion. Join us!
Physical streaming replication in PostgreSQL allows you to maintain a live copy of your database on a standby server, which continuously receives updates from the primary server’s WAL (Write-Ahead Log). This standby (or hot standby) can handle read-only queries and be quickly promoted to primary in case of failover, providing high availability and disaster recovery.
In this guide, I will walk through provisioning a primary PostgreSQL 16 server and a standby server on Linux, configuring them for streaming replication, and verifying that everything works. I assume you are an experienced engineer familiar with Linux, but new to PostgreSQL replication, so I will keep it friendly and straightforward.
Figure: Real-time data streaming from a primary PostgreSQL server (left) to a standby server (right). The standby constantly applies WAL records received from the primary over a network connection, keeping an up-to-date copy of the database ready for failover.
Step 1: Prepare Two Linux Servers and Install PostgreSQL 16
Before diving into PostgreSQL settings, set up two Linux servers (virtual or physical). One will act as the primary database server, and the other as the standby (read replica). For a smooth replication setup, both servers should be as similar as possible in OS, hardware, and PostgreSQL version. In particular, ensure the following prerequisites:
PostgreSQL 16 is installed on both servers via the official PostgreSQL repositories. Both servers must run the same major PostgreSQL version and architecture (mixing different versions won’t work for physical replication). If you haven’t installed PostgreSQL yet, do so now (e.g., on Ubuntu: sudo apt install postgresql-16, or on RHEL/CentOS: use the PostgreSQL Yum repository). Make sure the PostgreSQL service is running on the primary server.
Network connectivity: The standby must be able to reach the primary on the PostgreSQL port (default 5432). If the servers are in a cloud environment like AWS EC2, configure the security group or firewall to allow inbound traffic on that port from the standby’s address.
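To make that concrete, a typical preparatory step on the primary is creating a dedicated replication role and allowing it in pg_hba.conf. This is a minimal sketch; the role name replicator and the standby address 10.0.0.2 are illustrative assumptions.

-- On the primary: a role the standby will connect as
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'choose-a-strong-password';

-- Then add a line like this to pg_hba.conf on the primary and reload:
--   host  replication  replicator  10.0.0.2/32  scram-sha-256
SELECT pg_reload_conf();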
PGConf.DE 2025, the 9th Annual PostgreSQL Conference Germany, was held on May 8–9, 2025, at the Marriott Hotel near Potsdamer Platz in Berlin. The event brought together PostgreSQL enthusiasts, developers, DBAs, and industry sponsors for two days of fascinating talks across four parallel tracks. It was the biggest edition so far, with 347 attendees. The whole conference was very well organized, and special thanks are due to all the organizers—in particular Andreas Scherbaum, the main organizer—for their efforts and hard work.
Our company, credativ GmbH, independent once again, participated as a gold sponsor. credativ CTO Alexander Wirt, Head of Sales & Marketing Peter Dreuw, and Database team leader Tobias Kauder were available for attendees at the credativ booth. Many thanks to our team colleague Sascha Spettmann for transporting all the stands and billboards to the conference and back again.
In total, we held four talks at the conference. Michael Banck, technical leader of our database team, presented the German-language talk “PostgreSQL Performance Tuning.” He provided a deep and comprehensive overview of the most important performance-tuning parameters in PostgreSQL and explained how they influence the database’s behavior. His talk attracted a large audience and was very well received.
I had the unique opportunity to present three different talks in the English track. In my regular talk “PostgreSQL Connections Memory Usage: How Much, Why and When,” I presented the results of my research and tests on PostgreSQL connections’ memory usage. After explaining the most important aspects of Linux memory management and the memory usage reported by standard commands, I detailed PostgreSQL connection memory usage during query execution based on numbers reported in smaps files. I intend to publish detailed blog posts about my findings soon. My other talk, “Building a Data Lakehouse with PostgreSQL,” was originally c[...]
This week saw the routine quarterly round of PostgreSQL minor version updates, so now is the time to upgrade. Note that PostgreSQL 13 is now officially EOL, so start planning upgrades away from it.
PostgreSQL 18 changes this week
As expected, the PostgreSQL 18 beta1 release is now available, and the PostgreSQL 18 documentation is now available, together with the release notes. Note that as the REL_18_STABLE branch has not yet been created, the PostgreSQL 18 and devel documentation are currently identical.
PostgreSQL 18 articles
PostgreSQL 18 Beta Preview – Export or Amend Statistics with Ease (2025-05-10) - Deepak Mahto
Waiting for Postgres 18: Accelerating Disk Reads with Asynchronous I/O (2025-05-07) - Lukas Fittl / pganalyze
Waiting for PostgreSQL 18 - Add function to get memory context stats for processes (2025-05-05) - Hubert 'depesz' Lubaczewski discusses pg_get_process_memory_contexts()
We’re now in the “feature freeze” phase of Postgres 18 development. That means no new features will get in - only bugfixes and cleanups of already committed changes. The goal is to test and stabilize the code before a release. PG 18 beta1 was released a couple days ago, so it’s a perfect time to do some testing and benchmarking.
One of the fundamental changes in PG 18 is going to be support for asynchronous I/O. With beta1 out, it’s the right time to run your own tests and benchmarks against this new feature, both for correctness and for performance regressions.
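As a minimal sketch of how such a test run might start (the io_method settings below match the PG 18 beta documentation; the pgbench workload is just one illustrative choice):

-- Check which I/O method the server uses ('worker' is the PG 18 default)
SHOW io_method;

-- Switch methods for comparison; this parameter requires a server restart
ALTER SYSTEM SET io_method = 'io_uring';  -- Linux builds with liburing; 'sync' and 'worker' are the alternatives

-- After restarting, drive a read-heavy workload and compare timings, e.g.:
--   pgbench -S -T 300 -c 16 bench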
PostgreSQL 18 beta has been released, and it’s the perfect time to start exploring the new features we can expect in the General Availability (GA) release. One feature that particularly caught my attention relates to improvements in statistics collection and usage.
Here’s an excerpt from the official PostgreSQL release notes:
Add functions to modify per-relation and per-column optimizer statistics (Corey Huinker)
Add pg_dump, pg_dumpall, and pg_restore options --statistics-only, --no-statistics, --no-data, and --no-schema (Corey Huinker, Jeff Davis)
One of the key ingredients in performance analysis is understanding the underlying statistics of tables, columns, and indexes. Often, we encounter cases where queries behave differently in production compared to Pre-Prod or UAT environments. A common reason for this discrepancy is the difference in data distribution and statistics. How many times have we wished for a way to replicate production-like statistics in lower environments—without the overhead of copying data and running manual ANALYZE?
With PostgreSQL 18, we can look forward to features that allow us to export, import, or even modify statistics—making it easier to simulate production behavior in non-production environments without needing actual data loads.
Dump Statistics
pg_dump --statistics-only --table=public.skewed_data_int
Reset table, index, or materialized view stats: pg_restore_relation_stats
Reset attribute-level statistics: pg_restore_attribute_stats
The following is a sample output from pg_dump that includes statistics. This can also be used within the same instance—or in another environment—to overwrite existing statistics and influence execution plans.
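The excerpt ends before the sample itself, so here is an illustrative sketch of the shape that output takes (the numbers are invented, and the exact argument list may differ from real pg_dump output):

SELECT * FROM pg_catalog.pg_restore_relation_stats(
    'version', '180000'::integer,
    'schemaname', 'public',
    'relname', 'skewed_data_int',
    'relpages', '443'::integer,
    'reltuples', '100000'::real,
    'relallvisible', '0'::integer
);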
FOR KEY SHARE optimization and the SLRU Trap
Optimizing database concurrency is a constant balancing act. We often tweak locking strategies in PostgreSQL, aiming to allow more simultaneous operations without compromising data integrity. A common scenario involves shifting from stricter row-level locks to more lenient ones. But sometimes, what seems like a straightforward optimization can backfire.
Henrietta Dombrovskaya organized PGDay Chicago 2025, with help from Devrim Gündüz, Mark Wong, Pat Wright and Carlos Aranibar. Speakers: Ryan Booz, Chelsea Dole, BAOVOLA Marie Anna, Paul Whalen, Henrietta Dombrovskaya, Gabriele Quaresima, Melih Mutlu, Keiko Oda, Shaun Thomas, Gülçin Yıldırım Jelínek, David George Andrew Pitts, Teresa Lopes, Jay Miller, Jimmy Zelinskie and Doğaç Eldenk.
My recommended methodology for performance improvement of PostgreSQL starts with query optimization. The second step is architectural improvements, part of which is the partitioning of large tables.
Partitioning in PostgreSQL is one of those advanced features that can be a powerful performance booster. If your PostgreSQL tables are becoming very large and sluggish, partitioning might be the cure.
The Big Table Problem
Large tables tend to grow uncontrollably, especially in OLTP or time-series workloads. As millions or billions of rows accumulate, you begin to notice:
Slow queries due to full table scans or massive indexes.
Heavy I/O usage, especially when indexes cannot fit in memory.
Bloated memory usage during operations like sorting or joining.
Increased maintenance cost, with longer VACUUM, ANALYZE, and REINDEX times.
Hard-to-manage retention policies, as purging old rows becomes expensive.
These problems are amplified in cloud-hosted databases, where every IOPS, GB, or CPU upgrade increases cost.
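To make the remedy concrete, here is a minimal sketch of declarative range partitioning by time, using a hypothetical measurements table (all names and date ranges are illustrative):

-- Parent table: rows are routed to partitions by the ts column
CREATE TABLE measurements (
    device_id int NOT NULL,
    ts        timestamptz NOT NULL,
    value     double precision
) PARTITION BY RANGE (ts);

-- One partition per month
CREATE TABLE measurements_2025_05 PARTITION OF measurements
    FOR VALUES FROM ('2025-05-01') TO ('2025-06-01');
CREATE TABLE measurements_2025_06 PARTITION OF measurements
    FOR VALUES FROM ('2025-06-01') TO ('2025-07-01');

-- Retention becomes a cheap metadata operation instead of a huge DELETE
ALTER TABLE measurements DETACH PARTITION measurements_2025_05;
DROP TABLE measurements_2025_05;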
Jenn and I headed to Bellingham April 26 & 27th for LinuxFest Northwest, a 100% volunteer-run, free event. This is always a fun conference and there were lots of familiar faces this year! Some stats:
Over 100 booth visitors
Just wanted to say how much they love Postgres: 18
Asked about migrating from MySQL to Postgres: 1
“We love […]
Earlier this year, I had the incredible opportunity to present at PGConf India, where I delved into the intricacies of Write-Ahead Logging (WAL) in PostgreSQL. My presentation aimed to demystify this crucial database feature that ensures data integrity and enhances performance.
Bridging Java and C combines the strengths of both languages. A C application may rely on Java for modern libraries, cloud APIs, or UI and web capabilities, while a Java app might need C for low-level system access or performance-critical tasks. Sometimes, there’s simply no alternative—certain features only exist in one language. While modern languages like C++ and Go offer both high- and low-level control, many systems aren’t written in them. For existing C or Java codebases, bridging is often the most practical way to extend functionality without a full rewrite.
In my case, the goal was to build a C-based PostgreSQL extension called SynchDB that integrates with the Java-based Debezium Embedded library to enable heterogeneous database replication into PostgreSQL. Debezium already provides mature connectors for databases like MySQL, SQL Server, and Oracle, so rather than reinventing the wheel in C, I chose to bridge the two runtimes using JNI. This approach allows PostgreSQL to consume change data from other systems in real time. However, maintaining both C and Java components within the PostgreSQL extension framework introduces unique challenges—such as cross-language memory management, threading compatibility, signal handling, and debugging across runtime boundaries. Let’s explore some of those next.
SynchDB – A Heterogeneous Database Replication Tool for PostgreSQL
This project serves as a practical case study for the complexities of bridging Java and C, highlighting the technical challenges and design decisions involved in maintaining two runtimes under PostgreSQL’s extension framework. The basic architecture diagram is shown below, where the yellow box represents the Java space (the Java Virtual Machine, or JVM), the blue box represents the PostgreSQL extension (C space), and the orange box represents the PostgreSQL core.
The working principles of the SynchDB extension start by instantiating a JVM and running a p[...]
With the Postgres 18 Beta 1 release this week, a multi-year effort and significant architectural shift in Postgres is taking shape: Asynchronous I/O (AIO). These capabilities are still under active development, but they represent a fundamental change in how Postgres handles I/O, offering the potential for significant performance gains, particularly in cloud environments where latency is often the bottleneck. The post covers: why asynchronous I/O matters; how Postgres 17’s read streams paved the way; the new io_method…
In this blog, we're continuing to explore the power of multi-master replication (MMR) with the pgEdge distributed Postgres platform and its open source extension, Spock. In Part 1 of this blog topic, we discussed different replication methods and deployment models for PostgreSQL replication, the pros and cons of MMR, and how pgEdge Distributed Postgres uses the Spock extension to perform conflict management to ensure data integrity while implementing multi-master replication. In this blog we'll focus on conflict management. Conflict-free delta-apply columns are a distinguishing feature of the pgEdge Spock extension that provides a definitive way to apply data updates in the correct order, preventing data conflicts and facilitating efficient and accurate replication of incremental changes and aggregate values. Effective conflict avoidance tooling is essential to maintain the integrity of your data and ensure smooth operation in an MMR environment.
Understanding Conflict-Free Delta-Apply Columns
To recap the issues we discussed in the first blog: a conflict arises in an MMR cluster when the same data is updated or inserted by concurrent connections on multiple distributed nodes. Unlike a single-master replication (SMR) cluster, where the master node accepts transactions and supporting nodes answer requests, all nodes in a distributed MMR cluster are tasked with handling both read and write operations for improved performance and efficiency. That improved performance and efficiency comes with caveats; foresight and planning are your best defense against data integrity issues. Several scenarios can cause a data conflict in a distributed MMR cluster. The traditional (and most often used) method of managing conflict resolution is last update wins, where the most recent change overwrites an earlier one. By itself, this approach can lead to data inconsistencies[...]
Before we delve into the main subject of this blog, it is essential to understand the benefits of PostgreSQL replication, and the difference between single-master replication (SMR) and multi-master replication (MMR). In every modern business application, the database is becoming a critical part of the architecture and the demand for making the database performant and highly available is growing tremendously.
Planning Ahead for Better Performance
Our goal when designing a system for high performance is to make the database more efficient when handling application requests - this ensures that the database does not become a business bottleneck. If your database resides on a single host, the resources of that host can be easily exhausted; you need a system that supports scaling the database so it can respond more effectively to heavy application load. With pgEdge Distributed Postgres and the power of PostgreSQL, you can perform both horizontal and vertical scaling. The technique of replicating data across multiple PostgreSQL databases running on multiple servers can also be considered horizontal scaling: the data is not distributed, but database changes are replicated to each cluster node so the application load can be divided across multiple machines to achieve better performance. Reliability and high availability are also crucial for a powerful and responsive system:
Reliability means that the database is able to respond to user/application requests at all times with consistency and without any server interruption.
High-availability is also a critical consideration that ensures that database operations are not interrupted and the database downtime is minimized.
Statistically, downtime per year reflects the ability of your database and application to handle failures and outages without user downtime. Often, downtime per year is negotiated into a service level agreement (SLA) for applications that require high-availability; this clause specifies the cumula[...]
In the PostgreSQL world, as in other database systems, data types play a crucial role in ensuring optimal performance, efficiency, and correct semantics. Moreover, some data types are inherently easier and faster to index than others. Many people are not aware that the choice of data type indeed makes a difference, so let us take a look and see how long it takes to index the very same data using different types.
Creating some sample data
To show what difference a data type makes, we first have to create a sample table. In this case, it contains 5 data types that we want to inspect to understand how they behave.
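The excerpt omits the original CREATE TABLE statement, so here is a plausible reconstruction. It assumes five numeric-compatible column types, so the single integer expression in the INSERT below can be assigned to every column; the types in the original post may differ.

CREATE TABLE t_demo (
    c_int4    int4,
    c_int8    int8,
    c_float4  float4,
    c_float8  float8,
    c_numeric numeric
);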
Finally, we use our good old friend (the generate_series function), which has served me very well over the years:
blog=# INSERT INTO t_demo
SELECT x, x, x, x, x FROM (
SELECT (random()*10000000)::int4 AS x
FROM generate_series(1, 50000000)
) AS y;
INSERT 0 50000000
What this does is generate 50 million random rows and put them into my table. Note that all column entries are identical to ensure fairness for our index creation.
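The measurement the post is building toward is timing an index build per column. A minimal sketch, assuming psql and the reconstructed column names above:

-- \timing makes psql report each statement's duration
\timing on
CREATE INDEX idx_int4    ON t_demo (c_int4);
CREATE INDEX idx_int8    ON t_demo (c_int8);
CREATE INDEX idx_numeric ON t_demo (c_numeric);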
The last Extension Ecosystem Mini-Summit is upon us. How did that happen?
Join us for a virtual conference session featuring Gabriele Bartolini, who will be discussing Extension Management in CNPG. I’m psyched for this one, as the PostgreSQL community has contributed quite a lot to improving extension management in CloudNativePG in the past year, some of which we covered previously. If you miss it, the video, slides, and transcript will appear here soon.
It may take a week or two to get the transcripts done, though, considering that PGConf.dev is next week and features the Extension Ecosystem Summit on Tuesday, 13 May in Montréal, CA. Hope to see you there; be sure to say “hi!”
The PostgreSQL 18 release is slowly taking shape, with the first draft of the release notes now available. Corrections and additions can be sent via the pgsql-hackers thread "PG 18 release notes draft committed".
PostgreSQL 18 articles
Postgres 18 Release Notes (2025-05-02) - Bruce Momjian
Waiting for PostgreSQL 18 – Allow NOT NULL constraints to be added as NOT VALID (2025-05-01) - Hubert 'depesz' Lubaczewski
PostgreSQL 18: part 4 or CommitFest 2025-01 (2025-04-29) - Pavel Luzanov / PostgresPro
Update Your Control Files (2025-04-28) - David E. Wheeler
You know I had to do it as soon as I found it was possible. Yes, I installed and enabled AI in the DBeaver Query Editor so I can use AI with my PostgreSQL database work. Let’s face it. It was inevitable. However, the setup isn’t intuitive.
Setting Up in DBeaver
I’m going to assume […]