git.postgresql.org Git - postgresql.git/log

Fix typo in comment.

Defend against stack overrun in a few more places.

SplitToVariants() in the ispell code, lseg_inside_poly() in geo_ops.c,
and regex_selectivity_sub() in selectivity estimation could recurse
until stack overflow; fix by adding check_stack_depth() calls.
So could next() in the regex compiler, but that case is better fixed by
converting its tail recursion to a loop. (We probably get better code
that way too, since next() can now be inlined into its sole caller.)

There remains a reachable stack overrun in the Turkish stemmer, but
we'll need some advice from the Snowball people about how to fix that.

Per report from Egor Chindyaskin and Alexander Lakhin. These mistakes
are old, so back-patch to all supported branches.

Richard Guo and Tom Lane

Discussion: https://postgr.es/m/1661334672.728714027@f473.i.mail.ru

Doc: document possible need to raise kernel's somaxconn limit.

On fast machines, it's possible for applications such as pgbench
to issue connection requests so quickly that the postmaster's
listen queue overflows in the kernel, resulting in unexpected
failures (with not-very-helpful error messages). Most modern OSes
allow the queue size to be increased, so document how to do that.

Per report from Kevin McKibbin.

Discussion: https://postgr.es/m/CADc_NKg2d+oZY9mg4DdQdoUcGzN2kOYXBu-3--RW_hEe0tUV=g@mail.gmail.com

Doc: prefer sysctl to /proc/sys in docs and comments.

sysctl is more portable than Linux's /proc/sys file tree, and
often easier to use too. That's why most of our docs refer to
sysctl when talking about how to adjust kernel parameters.
Bring the few stragglers into line.

Discussion: https://postgr.es/m/361175.1661187463@sss.pgh.pa.us

Add CHECK_FOR_INTERRUPTS while decoding changes.

While decoding changes in a loop, if we skip all the changes there is no
CFI making the loop uninterruptible.

Reported-by: Whale Song and Andrey Borodin
Bug: 17580
Author: Masahiko Sawada
Reviwed-by: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/17580-849c1d5b6d7eb422@postgresql.org
Discussion: https://postgr.es/m/B319ECD6-9A28-4CDF-A8F4-3591E0BF2369@yandex-team.ru

doc: fix wrong tag used in create sequence manual.

In ref/create_sequence.sgml <literal> tag was used for nextval function name.
This should have been <function> tag.

Author: Noboru Saito
Discussion: https://postgr.es/m/CAAM3qnJTDFFfRf5JHJ4AYrNcqXgMmj0pbH0%2Bvm%3DYva%2BpJyGymA%40mail.gmail.com
Backpatch-through: 10

Add missing bad-PGconn guards in libpq entry points.

There's a convention that externally-visible libpq functions should
check for a NULL PGconn pointer, and fail gracefully instead of
crashing. PQflush() and PQisnonblocking() didn't get that memo
though. Also add a similar check to PQdefaultSSLKeyPassHook_OpenSSL;
while it's not clear that ordinary usage could reach that with a
null conn pointer, it's cheap enough to check, so let's be consistent.

Daniele Varrazzo and Tom Lane

Discussion: https://postgr.es/m/CA+mi_8Zm_mVVyW1iNFgyMd9Oh0Nv8-F+7Y3-BqwMgTMHuo_h2Q@mail.gmail.com

Fix outdated --help message for postgres -f

This option switch supports a total of 8 values, as told by
set_plan_disabling_options() and the documentation, but this was not
reflected in the output generated by --help.

Author: Junwang Zhao
Discussion: https://postgr.es/m/CAEG8a3+pT3cWzyjzKs184L1XMNm8NDnoJLiSjAYSO7XqpRh_vA@mail.gmail.com
Backpatch-through: 10

Preserve memory context of VarStringSortSupport buffers.

When enlarging the work buffers of a VarStringSortSupport object,
varstrfastcmp_locale was careful to keep them in the ssup_cxt
memory context; but varstr_abbrev_convert just used palloc().
The latter creates a hazard that the buffers could be freed out
from under the VarStringSortSupport object, resulting in stomping
on whatever gets allocated in that memory later.

In practice, because we only use this code for ICU collations
(cf. 3df9c374e), the problem is confined to use of ICU collations.
I believe it may have been unreachable before the introduction
of incremental sort, too, as traditional sorting usually just
uses one context for the duration of the sort.

We could fix this by making the broken stanzas in varstr_abbrev_convert
match the non-broken ones in varstrfastcmp_locale.  However, it seems
like a better idea to dodge the issue altogether by replacing the
pfree-and-allocate-anew coding with repalloc, which automatically
preserves the chunk's memory context.  This fix does add a few cycles
because repalloc will copy the chunk's content, which the existing
coding assumes is useless.  However, we don't expect that these buffer
enlargement operations are performance-critical.  Besides that, it's
far from obvious that copying the buffer contents isn't required, since
these stanzas make no effort to mark the buffers invalid by resetting
last_returned, cache_blob, etc.  That seems to be safe upon examination,
but it's fragile and could easily get broken in future, which wouldn't
get revealed in testing with short-to-moderate-size strings.

Per bug #17584 from James Inform.  Whether or not the issue is
reachable in the older branches, this code has been broken on its
own terms from its introduction, so patch all the way back.

Discussion: https://postgr.es/m/17584-95c79b4a7d771f44@postgresql.org

Catch stack overflow when recursing in transformFromClauseItem().

Most parts of the parser can expect that the stack overflow check
in transformExprRecurse() will trigger before things get desperate.
However, transformFromClauseItem() can recurse directly to self
without having analyzed any expressions, so it's possible to drive
it to a stack-overrun crash. Add a check to prevent that.

Per bug #17583 from Egor Chindyaskin. Back-patch to all supported
branches.

Richard Guo

Discussion: https://postgr.es/m/17583-33be55b9f981f75c@postgresql.org

Add missing fields to _outConstraint()

As of 897795240cfaaed724af2f53ed2c50c9862f951f, check constraints can
be declared invalid. But that patch didn't update _outConstraint() to
also show the relevant struct fields (which were only applicable to
foreign keys before that). This currently only affects debugging
output, so no impact in practice.

pg_upgrade: Fix some minor code issues

96ef3b8ff1cf1950e897fd2f766d4bd9ef0d5d56 accidentally copied a not
applicable comment from the float8_pass_by_value code to the
data_checksums code. Remove that.

87d3b35a1ca31a9d947a8f919a6006679216dff0 changed pg_upgrade to
checking the checksum version rather than just the Boolean presence of
checksums, but didn't change the field type in its ControlData struct
from bool. So this would not work correctly if there ever is a
checksum version larger than 1.

doc: add missing role attributes to user management section

Reported-by: Shinya Kato
Discussion: https://postgr.es/m/1ecdb1ff78e9b03dfce37e85eaca725a@oss.nttdata.com

Author: Shinya Kato

Backpatch-through: 10

doc: warn about security issues around log files

Reported-by: Simon Riggs
Discussion: https://postgr.es/m/CANP8+jJESuuXYq9Djvf-+tx2vY2OFLmfEuu+UvwHNJ1RT7iJCQ@mail.gmail.com

Author: Simon Riggs

Backpatch-through: 10

doc: clarify configuration file for Windows builds

The use of file 'config.pl' was not clearly explained.

Reported-by: [email protected]
Discussion: https://postgr.es/m/164246013804.31952.4958087335645367498@wrigleys.postgresql.org

Backpatch-through: 10

doc: document the CREATE INDEX "USING" clause

Somehow this was in the syntax but had no description.

Reported-by: [email protected]
Discussion: https://postgr.es/m/164228771825.31954.2719791849363756957@wrigleys.postgresql.org

Backpatch-through: 10

doc: clarify CREATE TABLE AS ... IF NOT EXISTS

Mention that the table is not modified if it already exists.

Reported-by: [email protected]
Discussion: https://postgr.es/m/164441177106.9677.5991676148704507229@wrigleys.postgresql.org

Backpatch-through: 10

Fix _outConstraint() for "identity" constraints

The set of fields printed by _outConstraint() in the CONSTR_IDENTITY
case didn't match the set of fields actually used in that case. (The
code was probably uncarefully copied from the CONSTR_DEFAULT case.)
Fix that by using the right set of fields. Since there is no read
support for this node type, this is really just for debugging output
right now, so it doesn't affect anything important.

Back-Patch "Add wait_for_subscription_sync for TAP tests."

This was originally done in commit 0c20dd33db for 16 only, to eliminate
duplicate code and as an infrastructure that makes it easier to write
future tests. However, it has been suggested that it would be good to
back-patch this testing infrastructure to aid future tests in
back-branches.

Backpatch to all supported versions.

Author: Masahiko Sawada
Reviewed by: Amit Kapila, Shi yu
Discussion: https://postgr.es/m/CAD21AoC-fvAkaKHa4t1urupwL8xbAcWRePeETvshvy80f6WV1A@mail.gmail.com
Discussion: https://postgr.es/m/[email protected]

Fix catalog lookup with the wrong snapshot during logical decoding.

Previously, we relied on HEAP2_NEW_CID records and XACT_INVALIDATION
records to know if the transaction has modified the catalog, and that
information is not serialized to snapshot. Therefore, after the restart,
if the logical decoding decodes only the commit record of the transaction
that has actually modified a catalog, we will miss adding its XID to the
snapshot. Thus, we will end up looking at catalogs with the wrong
snapshot.

To fix this problem, this changes the snapshot builder so that it
remembers the last-running-xacts list of the decoded RUNNING_XACTS record
after restoring the previously serialized snapshot. Then, we mark the
transaction as containing catalog changes if it's in the list of initial
running transactions and its commit record has XACT_XINFO_HAS_INVALS. To
avoid ABI breakage, we store the array of the initial running transactions
in the static variables InitialRunningXacts and NInitialRunningXacts,
instead of storing those in SnapBuild or ReorderBuffer.

This approach has a false positive; we could end up adding the transaction
that didn't change catalog to the snapshot since we cannot distinguish
whether the transaction has catalog changes only by checking the COMMIT
record. It doesn't have the information on which (sub) transaction has
catalog changes, and XACT_XINFO_HAS_INVALS doesn't necessarily indicate
that the transaction has catalog change. But that won't be a problem since
we use snapshot built during decoding only to read system catalogs.

On the master branch, we took a more future-proof approach by writing
catalog modifying transactions to the serialized snapshot which avoids the
above false positive. But we cannot backpatch it because of a change in
the SnapBuild.

Reported-by: Mike Oh
Author: Masahiko Sawada
Reviewed-by: Amit Kapila, Shi yu, Takamichi Osumi, Kyotaro Horiguchi, Bertrand Drouvot, Ahsan Hadi
Backpatch-through: 10
Discussion: https://postgr.es/m/81D0D8B0-E7C4-4999-B616-1E5004DBDCD2%40amazon.com

Fix handling of R/W expanded datums that are passed to SQL functions.

fmgr_sql must make expanded-datum arguments read-only, because
it's possible that the function body will pass the argument to
more than one callee function. If one of those functions takes
the datum's R/W property as license to scribble on it, then later
callees will see an unexpected value, leading to wrong answers.

From a performance standpoint, it'd be nice to skip this in the
common case that the argument value is passed to only one callee.
However, detecting that seems fairly hard, and certainly not
something that I care to attempt in a back-patched bug fix.

Per report from Adam Mackler. This has been broken since we
invented expanded datums, so back-patch to all supported branches.

Discussion: https://postgr.es/m/WScDU5qfoZ7PB2gXwNqwGGgDPmWzz08VdydcPFLhOwUKZcdWbblbo-0Lku-qhuEiZoXJ82jpiQU4hOjOcrevYEDeoAvz6nR0IU4IHhXnaCA=@mackler.email
Discussion: https://postgr.es/m/187436.1660143060@sss.pgh.pa.us

Stamp 10.22.

Stabilize output of new regression test.

Per buildfarm, the output order of \dx+ isn't consistent across
locales. Apply NO_LOCALE to force C locale. There might be a
more localized way, but I'm not seeing it offhand, and anyway
there is nothing in this test module that particularly cares
about locales.

Security: CVE-2022-2625

Last-minute updates for release notes.

Security: CVE-2022-2625

In extensions, don't replace objects not belonging to the extension.

Previously, if an extension script did CREATE OR REPLACE and there was
an existing object not belonging to the extension, it would overwrite
the object and adopt it into the extension.  This is problematic, first
because the overwrite is probably unintentional, and second because we
didn't change the object's ownership.  Thus a hostile user could create
an object in advance of an expected CREATE EXTENSION command, and would
then have ownership rights on an extension object, which could be
modified for trojan-horse-type attacks.

Hence, forbid CREATE OR REPLACE of an existing object unless it already
belongs to the extension.  (Note that we've always forbidden replacing
an object that belongs to some other extension; only the behavior for
previously-free-standing objects changes here.)

For the same reason, also fail CREATE IF NOT EXISTS when there is
an existing object that doesn't belong to the extension.

Our thanks to Sven Klemm for reporting this problem.

Security: CVE-2022-2625

Translation updates

Source-Git-URL: ssh://[email protected]/pgtranslation/messages.git
Source-Git-Hash: ba202c5db861f49538ae65603908aeecdd9b8b39

Release notes for 14.5, 13.8, 12.12, 11.17, 10.22.

Remove unportable use of timezone in recent test

Per buildfarm member snapper

Discussion: https://postgr.es/m/129951.1659812518@sss.pgh.pa.us

Improve recently-added test reliability

Commit 59be1c942a47 already tried to make
src/test/recovery/t/033_replay_tsp_drops more reliable, but it wasn't
enough.  Try to improve on that by making this use of a replication slot
to be more like others.  Also, don't drop the slot.

Make a few other stylistic changes while at it.  It's still quite slow,
which is another thing that we need to fix in this script.

Backpatch to all supported branches.

Discussion: https://postgr.es/m/349302.1659191875@sss.pgh.pa.us

Backpatch addition of .git-blame-ignore-revs

This makes it more convenient for git config to contain the
blame.ignoreRevsFile setting; otherwise current git versions complain if
the file is not present.

I constructed the file for each branch by scraping the file in branch
master for commits that appear in that branch.  Because a few additional
pgindent commits have been added to the list in master since the list
was first created, this also propagates those to branches 14 and 15
where the file already existed.  Also, some entries appear to have been
made using author-date rather than committer-date in the format string,
so some timestamps are changed.  Also remove bogus whitespace in the
suggested `git log` format string.

Backpatch to all supported branches.

Discussion: https://postgr.es/m/20220711163138 [email protected]

BRIN: mask BRIN_EVACUATE_PAGE for WAL consistency checking

That bit is unlogged and therefore it's wrong to consider it in WAL page
comparison.

Add a test that tickles the case, as branch testing technology allows.

This has been a problem ever since wal consistency checking was
introduced (commit a507b86900f6 for pg10), so backpatch to all supported
branches.

Author: 王海洋 (Haiyang Wang) <[email protected]>
Reviewed-by: Kyotaro Horiguchi <[email protected]>
Discussion: https://postgr.es/m/CACciXAD2UvLMOhc4jX9VvOKt7DtYLr3OYRBhvOZ-jRxtzc_7Jg@mail.gmail.com
Discussion: https://postgr.es/m/CACciXADOfErX9Bx0nzE_SkdfXr6Bbpo5R=v_B6MUTEYW4ya+cg@mail.gmail.com

Add HINT for restartpoint race with KeepFileRestoredFromArchive().

The five commits ending at cc2c7d65fc27e877c9f407587b0b92d46cd6dd16
closed this race condition for v15+. For v14 through v10, add a HINT to
discourage studying the cosmetic problem.

Reviewed by Kyotaro Horiguchi and David Steele.

Discussion: https://postgr.es/m/20220731061747 [email protected]

Add CHECK_FOR_INTERRUPTS in ExecInsert's speculative insertion loop.

Ordinarily the functions called in this loop ought to have plenty
of CFIs themselves; but we've now seen a case where no such CFI is
reached, making the loop uninterruptible. Even though that's from
a recently-introduced bug, it seems prudent to install a CFI at
the loop level in all branches.

Per discussion of bug #17558 from Andrew Kesper (an actual fix for
that bug will follow).

Discussion: https://postgr.es/m/17558-3f6599ffcf52fd4a@postgresql.org

Reduce test runtime of src/test/modules/snapshot_too_old.

The sto_using_cursor and sto_using_select tests were coded to exercise
every permutation of their test steps, but AFAICS there is no value in
exercising more than one.  This matters because each permutation costs
about six seconds, thanks to the "pg_sleep(6)".  Perhaps we could
reduce that, but the useless permutations seem worth getting rid of
in any case.  (Note that sto_using_hash_index got it right already.)

While here, clean up some other sloppiness such as an unused table.

This doesn't make too much difference in interactive testing, since the
wasted time is typically masked by parallelization with other tests.
However, the buildfarm runs this as a serial step, which means we can
expect to shave ~40 seconds from every buildfarm run.  That makes it
worth back-patching.

Discussion: https://postgr.es/m/2515192.1659454702@sss.pgh.pa.us

Be more wary about 32-bit integer overflow in pg_stat_statements.

We've heard a couple of reports of people having trouble with
multi-gigabyte-sized query-texts files.  It occurred to me that on
32-bit platforms, there could be an issue with integer overflow
of calculations associated with the total query text size.
Address that with several changes:

1. Limit pg_stat_statements.max to INT_MAX / 2 not INT_MAX.
The hashtable code will bound it to that anyway unless "long"
is 64 bits.  We still need overflow guards on its use, but
this helps.

2. Add a check to prevent extending the query-texts file to
more than MaxAllocHugeSize.  If it got that big, qtext_load_file
would certainly fail, so there's not much point in allowing it.
Without this, we'd need to consider whether extent, query_offset,
and related variables shouldn't be off_t not size_t.

3. Adjust the comparisons in need_gc_qtexts() to be done in 64-bit
arithmetic on all platforms.  It appears possible that under duress
those multiplications could overflow 32 bits, yielding a false
conclusion that we need to garbage-collect the texts file, which
could lead to repeatedly garbage-collecting after every hash table
insertion.

Per report from Bruno da Silva.  I'm not convinced that these
issues fully explain his problem; there may be some other bug that's
contributing to the query-texts file becoming so large in the first
place.  But it did get that big, so #2 is a reasonable defense,
and #3 could explain the reported performance difficulties.

(See also commit 8bbe4cbd9, which addressed some related bugs.
The second Discussion: link is the thread that led up to that.)

This issue is old, and is primarily a problem for old platforms,
so back-patch.

Discussion: https://postgr.es/m/CAB+Nuk93fL1Q9eLOCotvLP07g7RAv4vbdrkm0cVQohDVMpAb9A@mail.gmail.com
Discussion: https://postgr.es/m/5601D354.5000703@BlueTreble.com

Check maximum number of columns in function RTEs, too.

I thought commit fd96d14d9 had plugged all the holes of this sort,
but no, function RTEs could produce oversize tuples too, either
via long coldeflists or just from multiple functions in one RTE.
(I'm pretty sure the other variants of base RTEs aren't a problem,
because they ultimately refer to either a table or a sub-SELECT,
whose widths are enforced elsewhere. But we explicitly allow join
RTEs to be overwidth, as long as you don't try to form their
tuple result.)

Per further discussion of bug #17561. As before, patch all branches.

Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org

Fix new recovery test for log_error_verbosity=verbose case

The new test is from commit 9e4f914b5e.

With this setting messages have SQL error numbers included, so that
needs to be provided for in the pattern looked for.

Backpatch to all live branches like the original.

In transformRowExpr(), check for too many columns in the row.

A RowExpr with more than MaxTupleAttributeNumber columns would fail at
execution anyway, since we cannot form a tuple datum with more than that
many columns. While heap_form_tuple() has a check for too many columns,
it emerges that there are some intermediate bits of code that don't
check and can be driven to failure with sufficiently many columns.
Checking this at parse time seems like the most appropriate place to
install a defense, since we already check SELECT list length there.

While at it, make the SELECT-list-length error use the same errcode
(TOO_MANY_COLUMNS) as heap_form_tuple does, rather than the generic
PROGRAM_LIMIT_EXCEEDED.

Per bug #17561 from Egor Chindyaskin. The given test case crashes
in all supported branches (and probably a lot further back),
so patch all.

Discussion: https://postgr.es/m/17561-80350151b9ad2ad4@postgresql.org

Fix test instability

On FreeBSD, the new test fails due to a WAL file being removed before
the standby has had the chance to copy it. Fix by adding a replication
slot to prevent the removal until after the standby has connected.

Author: Kyotaro Horiguchi <[email protected]>
Reported-by: Matthias van de Meent <[email protected]>
Discussion: https://postgr.es/m/CAEze2Wj5nau_qpjbwihvmXLfkAWOZ5TKdbnqOc6nKSiRJEoPyQ@mail.gmail.com

Fix replay of create database records on standby

Crash recovery on standby may encounter missing directories
when replaying database-creation WAL records.  Prior to this
patch, the standby would fail to recover in such a case;
however, the directories could be legitimately missing.
Consider the following sequence of commands:

    CREATE DATABASE
    DROP DATABASE
    DROP TABLESPACE

If, after replaying the last WAL record and removing the
tablespace directory, the standby crashes and has to replay the
create database record again, crash recovery must be able to continue.

A fix for this problem was already attempted in 49d9cfc68bf4, but it
was reverted because of design issues.  This new version is based
on Robert Haas' proposal: any missing tablespaces are created
during recovery before reaching consistency.  Tablespaces
are created as real directories, and should be deleted
by later replay.  CheckRecoveryConsistency ensures
they have disappeared.

The problems detected by this new code are reported as PANIC,
except when allow_in_place_tablespaces is set to ON, in which
case they are WARNING.  Apart from making tests possible, this
gives users an escape hatch in case things don't go as planned.

Author: Kyotaro Horiguchi <[email protected]>
Author: Asim R Praveen <[email protected]>
Author: Paul Guo <[email protected]>
Reviewed-by: Anastasia Lubennikova <[email protected]> (older versions)
Reviewed-by: Fujii Masao <[email protected]> (older versions)
Reviewed-by: Michaël Paquier <[email protected]>
Diagnosed-by: Paul Guo <[email protected]>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com

Allow "in place" tablespaces.

This is a backpatch to branches 10-14 of the following commits:

7170f2159fb2 Allow "in place" tablespaces.
c6f2f01611d4 Fix pg_basebackup with in-place tablespaces.
f6f0db4d6240 Fix pg_tablespace_location() with in-place tablespaces
7a7cd84893e0 doc: Remove mention to in-place tablespaces for pg_tablespace_location()
5344723755bd Remove unnecessary Windows-specific basebackup code.

In-place tablespaces were introduced as a testing helper mechanism, but
they are going to be used for a bugfix in WAL replay to be backpatched
to all stable branches.

I (Álvaro) had to adjust some code to account for lack of
get_dirent_type() in branches prior to 14.

Author: Thomas Munro <[email protected]>
Author: Michaël Paquier <[email protected]>
Author: Álvaro Herrera <[email protected]>
Discussion: https://postgr.es/m/20220722081858 [email protected]

Force immediate commit after CREATE DATABASE etc in extended protocol.

We have a few commands that "can't run in a transaction block",
meaning that if they complete their processing but then we fail
to COMMIT, we'll be left with inconsistent on-disk state.
However, the existing defenses for this are only watertight for
simple query protocol.  In extended protocol, we didn't commit
until receiving a Sync message.  Since the client is allowed to
issue another command instead of Sync, we're in trouble if that
command fails or is an explicit ROLLBACK.  In any case, sitting
in an inconsistent state while waiting for a client message
that might not come seems pretty risky.

This case wasn't reachable via libpq before we introduced pipeline
mode, but it's always been an intended aspect of extended query
protocol, and likely there are other clients that could reach it
before.

To fix, set a flag in PreventInTransactionBlock that tells
exec_execute_message to force an immediate commit.  This seems
to be the approach that does least damage to existing working
cases while still preventing the undesirable outcomes.

While here, add some documentation to protocol.sgml that explicitly
says how to use pipelining.  That's latent in the existing docs if
you know what to look for, but it's better to spell it out; and it
provides a place to document this new behavior.

Per bug #17434 from Yugo Nagata.  It's been wrong for ages,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/17434-d9f7a064ce2a88a3@postgresql.org

doc: clarify that auth. names are lower case and case-sensitive

This is true even for acronyms that are usually upper case, like LDAP.

Reported-by: Alvaro Herrera
Discussion: https://postgr.es/m/202205141521 [email protected]

Backpatch-through: 10

Fix ruleutils issues with dropped cols in functions-returning-composite.

Due to lack of concern for the case in the dependency code, it's
possible to drop a column of a composite type even though stored
queries have references to the dropped column via functions-in-FROM
that return the composite type.  There are "soft" references,
namely FROM-clause aliases for such columns, and "hard" references,
that is actual Vars referring to them.  The right fix for hard
references is to add dependencies preventing the drop; something
we've known for many years and not done (and this commit still doesn't
address it).  A "soft" reference shouldn't prevent a drop though.
We've been around on this before (cf. 9b35ddce9, 2c4debbd0), but
nobody had noticed that the current behavior can result in dump/reload
failures, because ruleutils.c can print more column aliases than the
underlying composite type now has.  So we need to rejigger the
column-alias-handling code to treat such columns as dropped and not
print aliases for them.

Rather than writing new code for this, I used expandRTE() which already
knows how to figure out which function result columns are dropped.
I'd initially thought maybe we could use expandRTE() in all cases, but
that fails for EXPLAIN's purposes, because the planner strips a lot of
RTE infrastructure that expandRTE() needs.  So this patch just uses it
for unplanned function RTEs and otherwise does things the old way.

If there is a hard reference (Var), then removing the column alias
causes us to fail to print the Var, since there's no longer a name
to print.  Failing seems less desirable than printing a made-up
name, so I made it print "?dropped?column?" instead.

Per report from Timo Stolz.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/5c91267e-3b6d-5795-189c-d15a55d61dbb@nullachtvierzehn.de

Fix assertion failure and segmentation fault in backup code.

When a non-exclusive backup is canceled, do_pg_abort_backup() is called
and resets some variables set by pg_backup_start (pg_start_backup in v14
or before). But previously it forgot to reset the session state indicating
whether a non-exclusive backup is in progress or not in this session.

This issue could cause an assertion failure when the session running
BASE_BACKUP is terminated after it executed pg_backup_start and
pg_backup_stop (pg_stop_backup in v14 or before). Also it could cause
a segmentation fault when pg_backup_stop is called after BASE_BACKUP
in the same session is canceled.

This commit fixes the issue by making do_pg_abort_backup reset
that session state.

Back-patch to all supported branches.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com

Prevent BASE_BACKUP in the middle of another backup in the same session.

Multiple non-exclusive backups are able to be run conrrently in different
sessions. But, in the same session, only one non-exclusive backup can be
run at the same moment. If pg_backup_start (pg_start_backup in v14 or before)
is called in the middle of another non-exclusive backup in the same session,
an error is thrown.

However, previously, in logical replication walsender mode, even if that
walsender session had already called pg_backup_start and started
a non-exclusive backup, it could execute BASE_BACKUP command and
start another non-exclusive backup. Which caused subsequent pg_backup_stop
to throw an error because BASE_BACKUP unexpectedly reset the session state
marked by pg_backup_start.

This commit prevents BASE_BACKUP command in the middle of another
non-exclusive backup in the same session.

Back-patch to all supported branches.

Author: Fujii Masao
Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas
Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com

postgres_fdw: set search_path to 'pg_catalog' while deparsing constants.

The motivation for this is to ensure successful transmission of the
values of constants of regconfig and other reg* types.  The remote
will be reading them with search_path = 'pg_catalog', so schema
qualification is necessary when referencing objects in other schemas.

Per bug #17483 from Emmanuel Quincerot.  Back-patch to all supported
versions.  (There's some other stuff to do here, but it's less
back-patchable.)

Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us

Make dsm_impl_posix_resize more future-proof.

Commit 4518c798 blocks signals for a short region of code, but it
assumed that whatever called it had the signal mask set to UnBlockSig on
entry.  That may be true today (or may even not be, in extensions in the
wild), but it would be better not to make that assumption.  We should
save-and-restore the caller's signal mask.

The PG_SETMASK() portability macro couldn't be used for that, which is
why it wasn't done before.  But... considering that commit a65e0864
established back in 9.6 that supported POSIX systems have sigprocmask(),
and that this is POSIX-only code, there is no reason not to use standard
sigprocmask() directly to achieve that.

Back-patch to all supported releases, like 4518c798 and 80845b7c.

Reviewed-by: Alvaro Herrera <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/CA%2BhUKGKx6Biq7_UuV0kn9DW%2B8QWcpJC1qwhizdtD9tN-fn0H0g%40mail.gmail.com

pg_upgrade doc: mention that replication slots must be recreated

Reported-by: Nikhil Shetty
Discussion: https://postgr.es/m/CAFpL5Vxastip0Jei-K-=7cKXTg=5sahSe5g=om=x68NOX8+PUA@mail.gmail.com

Backpatch-through: 10

doc: clarify the behavior of identically-named savepoints

Original patch by David G. Johnston.

Reported-by: David G. Johnston
Discussion: https://postgr.es/m/CAKFQuwYQCxSSuSL18skCWG8QHFswOJ3hjovHsOZUE346i4OpVQ@mail.gmail.com

Backpatch-through: 10

doc: clarify that "excluded" ON CONFLICT is a single row

Original patch by David G. Johnston.

Reported-by: David G. Johnston
Discussion: https://postgr.es/m/CAKFQuwa4J0+WuO7kW1PLbjoEvzPN+Q_j+P2bXxNnCLaszY7ZdQ@mail.gmail.com

Backpatch-through: 10

doc: mention that INSERT can block because of unique indexes

Initial patch by David G. Johnston.

Reported-by: David G. Johnston
Discussion: https://postgr.es/m/CAKFQuwZpbdzceO41VE-xt1Xh8rWRRfgopTAK1wL9EhCo0Am-Sw@mail.gmail.com

Backpatch-through: 10

doc: mention the pg_locks lock names in parentheses

Reported-by: Troy Frericks
Discussion: https://postgr.es/m/165653551130.665.8240515669521441325@wrigleys.postgresql.org

Backpatch-through: 10

Don't clobber postmaster sigmask in dsm_impl_resize.

Commit 4518c798 intended to block signals in regular backends that
allocate DSM segments, but dsm_impl_resize() is also reached by
dsm_postmaster_startup(). It's not OK to clobber the postmaster's
signal mask, so only manipulate the signal mask when under the
postmaster.

Back-patch to all releases, like 4518c798.

Discussion: https://postgr.es/m/CA%2BhUKGKNpK%3D2OMeea_AZwpLg7Bm4%3DgYWk7eDjZ5F6YbozfOf8w%40mail.gmail.com

Block signals while allocating DSM memory.

On Linux, we call posix_fallocate() on shm_open()'d memory to avoid
later potential SIGBUS (see commit 899bd785).

Based on field reports of systems stuck in an EINTR retry loop there,
there, we made it possible to break out of that loop via slightly odd
coding where the CHECK_FOR_INTERRUPTS() call was somewhat removed from
the loop (see commit 422952ee).

On further reflection, that was not a great choice for at least two
reasons:

1.  If interrupts were held, the CHECK_FOR_INTERRUPTS() would do nothing
and the EINTR error would be surfaced to the user.

2.  If EINTR was reported but neither QueryCancelPending nor
ProcDiePending was set, then we'd dutifully retry, but with a bit more
understanding of how posix_fallocate() works, it's now clear that you
can get into a loop that never terminates.  posix_fallocate() is not a
function that can do some of the job and tell you about progress if it's
interrupted, it has to undo what it's done so far and report EINTR, and
if signals keep arriving faster than it can complete (cf recovery
conflict signals), you're stuck.

Therefore, for now, we'll simply block most signals to guarantee
progress.  SIGQUIT is not blocked (see InitPostmasterChild()), because
its expected handler doesn't return, and unblockable signals like
SIGCONT are not expected to arrive at a high rate.  For good measure,
we'll include the ftruncate() call in the blocked region, and add a
retry loop.

Back-patch to all supported releases.

Reported-by: Alvaro Herrera <[email protected]>
Reported-by: Nicola Contu <[email protected]>
Reviewed-by: Alvaro Herrera <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Discussion: https://postgr.es/m/20220701154105.jjfutmngoedgiad3%40alvherre.pgsql

Fix \watch's interaction with libedit on ^C.

When you hit ^C, the terminal driver in Unix-like systems echoes "^C" as
well as sending an interrupt signal (depending on stty settings). At
least libedit (but maybe also libreadline) is then confused about the
current cursor location, and corrupts the display if you try to scroll
back. Fix, by moving to a new line before the next prompt is displayed.

Back-patch to all supported released.

Author: Pavel Stehule <[email protected]>
Reported-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/3278793.1626198638%40sss.pgh.pa.us

Fix alias matching in transformLockingClause().

When locking a specific named relation for a FOR [KEY] UPDATE/SHARE
clause, transformLockingClause() finds the relation to lock by
scanning the rangetable for an RTE with a matching eref->aliasname.
However, it failed to account for the visibility rules of a join RTE.

If a join RTE doesn't have a user-supplied alias, it will have a
generated eref->aliasname of "unnamed_join" that is not visible as a
relation name in the parse namespace. Such an RTE needs to be skipped,
otherwise it might be found in preference to a regular base relation
with a user-supplied alias of "unnamed_join", preventing it from being
locked.

In addition, if a join RTE doesn't have a user-supplied alias, but
does have a join_using_alias, then the RTE needs to be matched using
that alias rather than the generated eref->aliasname, otherwise a
misleading "relation not found" error will be reported rather than a
"join cannot be locked" error.

Backpatch all the way, except for the second part which only goes back
to 14, where JOIN USING aliases were added.

Dean Rasheed, reviewed by Tom Lane.

Discussion: https://postgr.es/m/CAEZATCUY_KOBnqxbTSPf=7fz9HWPnZ5Xgb9SwYzZ8rFXe7nb=w@mail.gmail.com

Fix previous commit's ecpg_clocale for ppc Darwin.

Per buildfarm member prairiedog, this platform rejects uninitialized
global variables in shared libraries. Back-patch to v10, like the
addition of the variable.

Reviewed by Tom Lane.

Discussion: https://postgr.es/m/20220703030619 [email protected]

ecpglib: call newlocale() once per process.

ecpglib has been calling it once per SQL query and once per EXEC SQL GET
DESCRIPTOR.  Instead, if newlocale() has not succeeded before, call it
while establishing a connection.  This mitigates three problems:
- If newlocale() failed in EXEC SQL GET DESCRIPTOR, the command silently
  proceeded without the intended locale change.
- On AIX, each newlocale()+freelocale() cycle leaked memory.
- newlocale() CPU usage may have been nontrivial.

Fail the connection attempt if newlocale() fails.  Rearrange
ecpg_do_prologue() to validate the connection before its uselocale().

The sort of program that may regress is one running in an environment
where newlocale() fails.  If that program establishes connections
without running SQL statements, it will stop working in response to this
change.  I'm betting against the importance of such an ECPG use case.
Most SQL execution (any using ECPGdo()) has long required newlocale()
success, so there's little a connection could do without newlocale().

Back-patch to v10 (all supported versions).

Reviewed by Tom Lane.  Reported by Guillaume Lelarge.

Discussion: https://postgr.es/m/20220101074055 [email protected]

Harden dsm_impl.c against unexpected EEXIST.

Previously, we trusted the OS not to report EEXIST unless we'd passed in
IPC_CREAT | IPC_EXCL or O_CREAT | O_EXCL, as appropriate.  Solaris's
shm_open() can in fact do that, causing us to crash because we didn't
ereport and then we blithely assumed the mapping was successful.

Let's treat EEXIST just like any other error, unless we're actually
trying to create a new segment.  This applies to shm_open(), where this
behavior has been seen, and also to the equivalent operations for our
sysv and mmap modes just on principle.

Based on the underlying reason for the error, namely contention on a
lock file managed by Solaris librt for each distinct name, this problem
is only likely to happen on 15 and later, because the new shared memory
stats system produces shm_open() calls for the same path from
potentially large numbers of backends concurrently during
authentication.  Earlier releases only shared memory segments between a
small number of parallel workers under one Gather node.  You could
probably hit it if you tried hard enough though, and we should have been
more defensive in the first place.  Therefore, back-patch to all
supported releases.

Per build farm animal margay.  This isn't the end of the story, though,
it just changes random crashes into random "File exists" errors; more
work needed for a green build farm.

Reviewed-by: Robert Haas <[email protected]>
Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com

Fix visibility check when XID is committed in CLOG but not in procarray.

TransactionIdIsInProgress had a fast path to return 'false' if the
single-item CLOG cache said that the transaction was known to be
committed. However, that was wrong, because a transaction is first
marked as committed in the CLOG but doesn't become visible to others
until it has removed its XID from the proc array. That could lead to an
error:

ERROR: t_xmin is uncommitted in tuple to be updated

or for an UPDATE to go ahead without blocking, before the previous
UPDATE on the same row was made visible.

The window is usually very short, but synchronous replication makes it
much wider, because the wait for synchronous replica happens in that
window.

Another thing that makes it hard to hit is that it's hard to get such
a commit-in-progress transaction into the single item CLOG cache.
Normally, if you call TransactionIdIsInProgress on such a transaction,
it determines that the XID is in progress without checking the CLOG
and without populating the cache. One way to prime the cache is to
explicitly call pg_xact_status() on the XID. Another way is to use a
lot of subtransactions, so that the subxid cache in the proc array is
overflown, making TransactionIdIsInProgress rely on pg_subtrans and
CLOG checks.

This has been broken ever since it was introduced in 2008, but the race
condition is very hard to hit, especially without synchronous
replication. There were a couple of reports of the error starting from
summer 2021, but no one was able to find the root cause then.

TransactionIdIsKnownCompleted() is now unused. In 'master', remove it,
but I left it in place in backbranches in case it's used by extensions.

Also change pg_xact_status() to check TransactionIdIsInProgress().
Previously, it only checked the CLOG, and returned "committed" before
the transaction was actually made visible to other queries. Note that
this also means that you cannot use pg_xact_status() to reproduce the
bug anymore, even if the code wasn't fixed.

Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with
the pg_xact_status() change added by me.

Author: Simon Riggs
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru

Fix PostgreSQL::Test aliasing for Perl v5.10.1.

This Perl segfaults if a declaration of the to-be-aliased package
precedes the aliasing itself. Per buildfarm members lapwing and wrasse.
Like commit 20911775de4ab7ac3ecc68bd714cb3ed0fd68b6a, back-patch to v10
(all supported versions).

Discussion: https://postgr.es/m/20220625171533 [email protected]

CREATE INDEX: use the original userid for more ACL checks.

Commit a117cebd638dd02e5c2e791c25e43745f233111b used the original userid
for ACL checks located directly in DefineIndex(), but it still adopted
the table owner userid for more ACL checks than intended. That broke
dump/reload of indexes that refer to an operator class, collation, or
exclusion operator in a schema other than "public" or "pg_catalog".
Back-patch to v10 (all supported versions), like the earlier commit.

Nathan Bossart and Noah Misch

Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de

For PostgreSQL::Test compatibility, alias entire package symbol tables.

Remove the need to edit back-branch-specific code sites when
back-patching the addition of a PostgreSQL::Test::Utils symbol.  Replace
per-symbol, incomplete alias lists.  Give old and new package names the
same EXPORT and EXPORT_OK semantics.  Back-patch to v10 (all supported
versions).

Reviewed by Andrew Dunstan.

Discussion: https://postgr.es/m/20220622072144 [email protected]

Fix memory leak due to LogicalRepRelMapEntry.attrmap.

When rebuilding the relation mapping on subscribers, we were not releasing
the attribute mapping's memory which was no longer required.

The attribute mapping used in logical tuple conversion was refactored in
PG13 (by commit e1551f96e6) but we forgot to update the related code that
frees the attribute map.

Author: Hou Zhijie
Reviewed-by: Amit Langote, Amit Kapila, Shi yu
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com

doc: improve wording of plpgsql RAISE format text

Reported-by: [email protected]
Discussion: https://postgr.es/m/165455351426.573551.7050474465030525109@wrigleys.postgresql.org

Backpatch-through: 10

doc: clarify wording about phantom reads

Reported-by: [email protected]
Discussion: https://postgr.es/m/165222922369.669.10475917322916060899@wrigleys.postgresql.org

Backpatch-through: 10

Avoid ecpglib core dump with out-of-order operations.

If an application executed operations like EXEC SQL PREPARE
without having first established a database connection, it could
get a core dump instead of the expected clean failure.  This
occurred because we did "pthread_getspecific(actual_connection_key)"
without ever having initialized the TSD key actual_connection_key.
The results of that are probably platform-specific, but at least
on Linux it often leads to a crash.

To fix, add calls to ecpg_pthreads_init() in the code paths that
might use actual_connection_key uninitialized.  It's harmless
(and hopefully inexpensive) to do that more than once.

Per bug #17514 from Okano Naoki.  The problem's ancient, so
back-patch to all supported branches.

Discussion: https://postgr.es/m/17514-edd4fad547c5692c@postgresql.org

Doc: clarify the default collation behavior of domains.

The previous wording was "the underlying data type's default collation
is used", which is wrong or at least misleading. The domain inherits
the base type's collation behavior, which if "default" actually can
mean that we use some non-default collation obtained from elsewhere.

Per complaint from Jian He.

Discussion: https://postgr.es/m/CACJufxHMR8_4WooDPjjvEdaxB2hQ5a49qthci8fpKP0MKemVRQ@mail.gmail.com

Revert "Fix psql's single transaction mode on client-side errors with -c/-f switches".

This reverts commits a04ccf6df et al. in the back branches only.
There was some disagreement already over whether to back-patch
157f8739a, on the grounds that it is the sort of behavioral
change that we don't like to back-patch. Furthermore, it now
looks like the logic needs some more work, which we don't have
time for before the upcoming 14.4 release. Revert for now, and
perhaps reconsider later.

Discussion: https://postgr.es/m/17504-76b68018e130415e@postgresql.org

Fix whitespace

Fix off-by-one loop termination condition in pg_stat_get_subscription().

pg_stat_get_subscription scanned one more LogicalRepWorker array entry
than is really allocated. In the worst case this could lead to SIGSEGV,
if the LogicalRepCtx data structure is near the end of shared memory.
That seems quite unlikely though (thanks to the ordering of calls in
CreateSharedMemoryAndSemaphores) and we've heard no field reports of it.
A more likely misbehavior is one row of garbage data in the function's
result, but even that is not real likely because of the check that the
pid field matches some live backend.

Report and fix by Kuntal Ghosh. This bug is old, so back-patch
to all supported branches.

Discussion: https://postgr.es/m/CAGz5QCJykEDzW6jQK6Yz7Qh_PMtD=95de_7QoocbVR2Qy8hWZA@mail.gmail.com

Don't fail on libpq-generated error reports in ecpg_raise_backend().

An error PGresult generated by libpq itself, such as a report of
connection loss, won't have broken-down error fields.
ecpg_raise_backend() blithely assumed that PG_DIAG_MESSAGE_PRIMARY
would always be present, and would end up passing a NULL string
pointer to snprintf when it isn't.  That would typically crash
before 3779ac62d, and it would fail to provide a useful error report
in any case.  Best practice is to substitute PQerrorMessage(conn)
in such cases, so do that.

Per bug #17421 from Masayuki Hirose.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/17421-790ff887e3188874@postgresql.org

Fix psql's single transaction mode on client-side errors with -c/-f switches

psql --single-transaction is able to handle multiple -c and -f switches
in a single transaction since d5563d7d, but this had the surprising
behavior of forcing a transaction COMMIT even if psql failed with an
error in the client (for example incorrect path given to \copy), which
would generate an error, but still commit any changes that were already
applied in the backend. This commit makes the behavior more consistent,
by enforcing a transaction ROLLBACK if any commands fail, both
client-side and backend-side, so as no changes are applied if one error
happens in any of them.

Some tests are added on HEAD to provide some coverage about all that.
Backend-side errors are unreliable as IPC::Run can complain on SIGPIPE
if psql quits before reading a query result, but that should work
properly in the case where any errors come from psql itself, which is
what the original report is about.

Reported-by: Christoph Berg
Author: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/17504-76b68018e130415e@postgresql.org
Backpatch-through: 10

Doc: fix incorrect bit-reversal in example of macaddr formatting.

Will Mortensen (minor additional copy-editing by me)

Discussion: https://postgr.es/m/CAMpnoC5Y6jiZHSA82FG+e_AqkwMg-i94EYqs1C_9kXXFc3_3Yw@mail.gmail.com

Ignore heap rewrites for materialized views in logical replication.

If you have a publication that specifies FOR ALL TABLES clause, a REFRESH
MATERIALIZED VIEW can break your setup with the following message

ERROR: logical replication target relation "public.pg_temp_NNN" does not exist

Commit 1a499c2520 introduces a heuristic to skip table writes that look
like they are from a heap rewrite for a FOR ALL TABLES publication.
However, it forgot to exclude materialized views (heap rewrites occur when
you execute REFRESH MATERIALIZED VIEW). Since materialized views are not
supported in logical decoding, it is appropriate to filter them too.

As explained in the commit 1a499c2520, a more robust solution was included
in version 11 (commit 325f2ec555), hence only version 10 is affected.

Reported-by: Euler Taveira
Author: Euler Taveira
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/bc557ebe-92dc-4afa-b6bb-285a9eeaa614@www.fastmail.com

Silence compiler warnings from some older compilers.

Since a117cebd6, some older gcc versions issue "variable may be used
uninitialized in this function" complaints for brin_summarize_range.
Silence that using the same coding pattern as in bt_index_check_internal;
arguably, a117cebd6 had too narrow a view of which compilers might give
trouble.

Nathan Bossart and Tom Lane. Back-patch as the previous commit was.

Discussion: https://postgr.es/m/20220601163537.GA2331988@nathanxps13

Fix pl/perl test case so it will still work under Perl 5.36.

Perl 5.36 has reclassified the warning condition that this test
case used, so that the expected error fails to appear. Tweak
the test so it instead exercises a case that's handled the same
way in all Perl versions of interest.

This appears to meet our standards for back-patching into
out-of-support branches: it changes no user-visible behavior
but enables testing of old branches with newer tools.
Hence, back-patch as far as 9.2.

Dagfinn Ilmari Mannsåker, per report from Jitka Plesníková.

Discussion: https://postgr.es/m/564579.1654093326@sss.pgh.pa.us

Ensure ParseTzFile() closes the input file after failing.

We hadn't noticed this because (a) few people feed invalid
timezone abbreviation files to the server, and (b) in typical
scenarios guc.c would throw ereport(ERROR) and then transaction
abort handling would silently clean up the leaked file reference.
However, it was possible to observe file leakage warnings if one
breaks an already-active abbreviation file, because guc.c does
not throw ERROR when loading supposedly-validated settings during
session start or SIGHUP processing.

Report and fix by Kyotaro Horiguchi (cosmetic adjustments by me)

Discussion: https://postgr.es/m/20220530.173740.748502979257582392 [email protected]

Doc: fix mention of pg_dump's minimum supported server version.

runtime.sgml contains a passing reference to the minimum server
version that pg_dump[all] can dump from. That was 7.0 for many
years, but when 64f3524e2 raised it to 8.0, we missed updating this
bit. Then when 30e7c175b raised it to 9.2, we missed it again.

Given that track record, I'm not too hopeful that we'll remember
to fix this in future changes ... but for now, make the docs match
reality in each branch.

Noted by Daniel Westermann.

Discussion: https://postgr.es/m/GV0P278MB041917EB3E2FE8704B5AE2C6D2DC9@GV0P278MB0419.CHEP278.PROD.OUTLOOK.COM

doc: Reword description of roles able to view track_activities's info

The information generated when track_activities is accessible to
superusers, roles with the privileges of pg_read_all_stats, as well as
roles one has the privileges of. The original text did not outline the
last point, while the change done in ac1ae47 was unclear about the
second point.

Per discussion with Nathan Bossart.

Discussion: https://postgr.es/m/20220521185743.GA886636@nathanxps13
Backpatch-through: 10

Handle NULL for short descriptions of custom GUC variables

If a short description is specified as NULL in one of the various
DefineCustomXXXVariable() functions available to external modules to
define a custom parameter, SHOW ALL would crash. This change teaches
SHOW ALL to properly handle NULL short descriptions, as well as any code
paths that manipulate it, to gain in flexibility. Note that
help_config.c was already able to do that, when describing a set of GUCs
for postgres --describe-config.

Author: Steve Chavez
Reviewed by: Nathan Bossart, Andres Freund, Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/CAGRrpzY6hO-Kmykna_XvsTv8P2DshGiU6G3j8yGao4mk0CqjHA%40mail.gmail.com
Backpatch-through: 10

Remove misguided SSL key file ownership check in libpq.

Commits a59c79564 et al. tried to sync libpq's SSL key file
permissions checks with what we've used for years in the backend.
We did not intend to create any new failure cases, but it turns out
we did: restricting the key file's ownership breaks cases where the
client is allowed to read a key file despite not having the identical
UID.  In particular a client running as root used to be able to read
someone else's key file; and having seen that I suspect that there are
other, less-dubious use cases that this restriction breaks on some
platforms.

We don't really need an ownership check, since if we can read the key
file despite its having restricted permissions, it must have the right
ownership --- under normal conditions anyway, and the point of this
patch is that any additional corner cases where that works should be
deemed allowable, as they have been historically.  Hence, just drop
the ownership check, and rearrange the permissions check to get rid
of its faulty assumption that geteuid() can't be zero.  (Note that the
comparable backend-side code doesn't have to cater for geteuid() == 0,
since the server rejects that very early on.)

This does have the end result that the permissions safety check used
for a root user's private key file is weaker than that used for
anyone else's.  While odd, root really ought to know what she's doing
with file permissions, so I think this is acceptable.

Per report from Yogendra Suralkar.  Like the previous patch,
back-patch to all supported branches.

Discussion: https://postgr.es/m/MW3PR15MB3931DF96896DC36D21AFD47CA3D39@MW3PR15MB3931.namprd15.prod.outlook.com

In CREATE FOREIGN TABLE syntax synopsis, fix partitioning stuff.

Foreign tables can be partitioned, but previous documentation commits
left the syntax synopsis both incomplete and incorrect.

Justin Pryzby and Amit Langote

Discussion: http://postgr.es/m/20220521130922 [email protected]

Show 'AS "?column?"' explicitly when it's important.

ruleutils.c was coded to suppress the AS label for a SELECT output
expression if the column name is "?column?", which is the parser's
fallback if it can't think of something better.  This is fine, and
avoids ugly clutter, so long as (1) nothing further up in the parse
tree relies on that column name or (2) the same fallback would be
assigned when the rule or view definition is reloaded.  Unfortunately
(2) is far from certain, both because ruleutils.c might print the
expression in a different form from how it was originally written
and because FigureColname's rules might change in future releases.
So we shouldn't rely on that.

Detecting exactly whether there is any outer-level use of a SELECT
column name would be rather expensive.  This patch takes the simpler
approach of just passing down a flag indicating whether there *could*
be any outer use; for example, the output column names of a SubLink
are not referenceable, and we also do not care about the names exposed
by the right-hand side of a setop.  This is sufficient to suppress
unwanted clutter in all but one case in the regression tests.  That
seems like reasonable evidence that it won't be too much in users'
faces, while still fixing the cases we need to fix.

Per bug #17486 from Nicolas Lutic.  This issue is ancient, so
back-patch to all supported branches.

Discussion: https://postgr.es/m/17486-1ad6fd786728b8af@postgresql.org

doc: Mention pg_read_all_stats in description of track_activities

The description of track_activities mentioned that it is visible to
superusers and that the information related to the current session can
be seen, without telling about pg_read_all_stats. Roles that are
granted the privileges of pg_read_all_stats can also see this
information, so mention it in the docs.

Author: Ian Barwick
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/CAB8KJ=jhPyYFu-A5r-ZGP+Ax715mUKsMxAGcEQ9Cx_mBAmrPow@mail.gmail.com
Backpatch-through: 10

Fix mis-merge of result file

Failed to "git add" this file in this branch.

Fix DDL deparse of CREATE OPERATOR CLASS

When an implicit operator family is created, it wasn't getting reported.
Make it do so.

This has always been missing. Backpatch to 10.

Author: Masahiko Sawada <[email protected]>
Reported-by: Leslie LEMAIRE <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Michael Paquiër <[email protected]>
Discussion: https://postgr.es/m/f74d69e151b22171e8829551b1159e77@developpement-durable.gouv.fr

Backpatch regression tests added by 2d689babe3cb

A new plpgsql test function was added in 14 and up to cover for a bugfix
that was not backpatchable. We can add it to older versions as a way to
cover other bits of DDL event triggers, with an exception clause to
avoid the problematic corner case.

Originally authored by Michaël Paquier.

Backpatch: 10 through 13.

Discussion: https://postgr.es/m/202205201523 [email protected]

Doc: clarify location of libpq's default service file on Windows.

The documentation didn't specify the name of the per-user service file
on Windows, and extrapolating from the pattern used for other config
files gave the wrong answer.  The fact that it isn't consistent with the
others sure seems like a bug, but it's far too late to change that now;
we'd just penalize people who worked it out in the past.  So, simply
document the true state of affairs.

In passing, fix some gratuitous differences between the discussions
of the service file and the password file.

Julien Rouhaud, per question from Dominique Devienne.

Backpatch to all supported branches.  I (tgl) also chose to back-patch
the part of commit ba356a397 that touched libpq.sgml's description of
the service file --- in hindsight, I'm not sure why I didn't do so at
the time, as it includes some fairly essential information.

Discussion: https://postgr.es/m/CAFCRh-_mdLrh8eYVzhRzu4c8bAFEBn=rwoHOmFJcQOTsCy5nig@mail.gmail.com

Update xml_1.out and xml_2.out

Commit 0fbf01120023 should have updated them but didn't.

Check column list length in XMLTABLE/JSON_TABLE alias

We weren't checking the length of the column list in the alias clause of
an XMLTABLE or JSON_TABLE function (a "tablefunc" RTE), and it was
possible to make the server crash by passing an overly long one. Fix it
by throwing an error in that case, like the other places that deal with
alias lists.

In passing, modify the equivalent test used for join RTEs to look like
the other ones, which was different for no apparent reason.

This bug came in when XMLTABLE was born in version 10; backpatch to all
stable versions.

Reported-by: Wang Ke <[email protected]>
Discussion: https://postgr.es/m/17480-1c9d73565bb28e90@postgresql.org

Fix control file update done in restartpoints still running after promotion

If a cluster is promoted (aka the control file shows a state different
than DB_IN_ARCHIVE_RECOVERY) while CreateRestartPoint() is still
processing, this function could miss an update of the control file for
"checkPoint" and "checkPointCopy" but still do the recycling and/or
removal of the past WAL segments, assuming that the to-be-updated LSN
values should be used as reference points for the cleanup. This causes
a follow-up restart attempting crash recovery to fail with a PANIC on a
missing checkpoint record if the end-of-recovery checkpoint triggered by
the promotion did not complete while the cluster abruptly stopped or
crashed before the completion of this checkpoint. The PANIC would be
caused by the redo LSN referred in the control file as located in a
segment already gone, recycled by the previous restartpoint with
"checkPoint" out-of-sync in the control file.

This commit fixes the update of the control file during restartpoints so
as "checkPoint" and "checkPointCopy" are updated even if the cluster has
been promoted while a restartpoint is running, to be on par with the set
of WAL segments actually recycled in the end of CreateRestartPoint().

7863ee4 has fixed this problem already on master, but the release timing
of the latest point versions did not let me enough time to study and fix
that on all the stable branches.

Reported-by: Fujii Masao, Rui Zhao
Author: Kyotaro Horiguchi
Reviewed-by: Nathan Bossart, Michael Paquier
Discussion: https://postgr.es/m/20220316.102444.2193181487576617583 [email protected]
Backpatch-through: 10

Make pull_var_clause() handle GroupingFuncs exactly like Aggrefs.

This follows in the footsteps of commit 2591ee8ec by removing one more
ill-advised shortcut from planning of GroupingFuncs.  It's true that
we don't intend to execute the argument expression(s) at runtime, but
we still have to process any Vars appearing within them, or we risk
failure at setrefs.c time (or more fundamentally, in EXPLAIN trying
to print such an expression).  Vars in upper plan nodes have to have
referents in the next plan level, whether we ever execute 'em or not.

Per bug #17479 from Michael J. Sullivan.  Back-patch to all supported
branches.

Richard Guo

Discussion: https://postgr.es/m/17479-6260deceaf0ad304@postgresql.org

Fix the logical replication timeout during large transactions.

The problem is that we don't send keep-alive messages for a long time
while processing large transactions during logical replication where we
don't send any data of such transactions. This can happen when the table
modified in the transaction is not published or because all the changes
got filtered. We do try to send the keep_alive if necessary at the end of
the transaction (via WalSndWriteData()) but by that time the
subscriber-side can timeout and exit.

To fix this we try to send the keepalive message if required after
processing certain threshold of changes.

Reported-by: Fabrice Chapuis
Author: Wang wei and Amit Kapila
Reviewed By: Masahiko Sawada, Euler Taveira, Hou Zhijie, Hayato Kuroda
Backpatch-through: 10
Discussion: https://postgr.es/m/CAA5-nLARN7-3SLU_QUxfy510pmrYK6JJb=bk3hcgemAM_pAv+w@mail.gmail.com

Improve setup of environment values for commands in MSVC's vcregress.pl

The current setup assumes that commands for lz4, zstd and gzip always
exist by default if not enforced by a user's environment.  However,
vcpkg, as one example, installs libraries but no binaries, so this
default setup to assume that a command should always be present would
cause failures.  This commit improves the detection of such external
commands as follows:
* If a ENV value is available, trust the environment/user and use it.
* If a ENV value is not available, check its execution by looking in the
current PATH, by launching a simple "$command --version" (that should be
portable enough).
** On execution failure, ignore ENV{command}.
** On execution success, set ENV{command} = "$command".

Note that this new rule applies to gzip, lz4 and zstd but not tar that
we assume will always exist.  Those commands are set up in the
environment only when using bincheck and taptest.  The CI includes all
those commands and I have checked that their setup is correct there.  I
have also tested this change in a MSVC environment where we have none of
those commands.

While on it, remove the references to lz4 from the documentation and
vcregress.pl in ~v13.  --with-lz4 has been added in v14~ so there is no
point to have this information in these older branches.

Reported-by: Andrew Dunstan
Reviewed-by: Andrew Dunstan
Discussion: https://postgr.es/m/14402151-376b-a57a-6d0c-10ad12608e12@dunslane.net
Backpatch-through: 10

configure: don't probe for libldap_r if libldap is 2.5 or newer.

In OpenLDAP 2.5 and later, libldap itself is always thread-safe and
there's never a libldap_r.  Our existing coding dealt with that
by assuming it wouldn't find libldap_r if libldap is thread-safe.
But that rule fails to cope if there are multiple OpenLDAP versions
visible, as is likely to be the case on macOS in particular.  We'd
end up using shiny new libldap in the backend and a hoary libldap_r
in libpq.

Instead, once we've found libldap, check if it's >= 2.5 (by
probing for a function introduced then) and don't bother looking
for libldap_r if so.  While one can imagine library setups that
this'd still give the wrong answer for, they seem unlikely to
occur in practice.

Per report from Peter Eisentraut.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/fedacd7c-2a38-25c9-e7ff-dea549d0e979@enterprisedb.com

Stamp 10.21.

Last-minute updates for release notes.

Security: CVE-2022-1552

Fix core dump in transformValuesClause when there are no columns.

The parser code that transformed VALUES from row-oriented to
column-oriented lists failed if there were zero columns.
You can't write that straightforwardly (though probably you
should be able to), but the case can be reached by expanding
a "tab.*" reference to a zero-column table.

Per bug #17477 from Wang Ke. Back-patch to all supported branches.

Discussion: https://postgr.es/m/17477-0af3c6ac6b0a6ae0@postgresql.org