git.postgresql.org Git - postgresql.git/log

doc: Fix typo in function prototype

Remove dead assignment to local variable.

This should have been removed in commit 7e30c186da, which split the loop
into two. Only the first loop uses the 'from' variable; updating it in
the second loop is bogus. It was never read after the first loop, so this
was harmless and surely optimized away by the compiler, but let's be tidy.

Backpatch to all supported versions.

Author: Ranier Vilela
Discussion: https://www.postgresql.org/message-id/CAEudQAoWq%2BAL3BnELHu7gms2GN07k-np6yLbukGaxJ1vY-zeiQ%40mail.gmail.com

Lock the extension during ALTER EXTENSION ADD/DROP.

Although we were careful to lock the object being added or dropped,
we failed to get any sort of lock on the extension itself.  This
allowed the ALTER to proceed in parallel with a DROP EXTENSION,
which is problematic for a couple of reasons.  If both commands
succeeded we'd be left with a dangling link in pg_depend, which
would cause problems later.  Also, if the ALTER failed for some
reason, it might try to print the extension's name, and that could
result in a crash or (in older branches) a silly error message
complaining about extension "(null)".

Per bug #17098 from Alexander Lakhin.  Back-patch to all
supported branches.

Discussion: https://postgr.es/m/17098-b960f3616c861f83@postgresql.org

Fix numeric_mul() overflow due to too many digits after decimal point.

This fixes an overflow error raised by the numeric * operator when the
result has more than 16383 digits after the decimal point; the result
is now rounded instead. Overflow errors should only occur if the result
has too many digits *before* the decimal point.

Discussion: https://postgr.es/m/CAEZATCUmeFWCrq2dNzZpRj5+6LfN85jYiDoqm+ucSXhb9U2TbA@mail.gmail.com

Un-break AIX build, take 2.

I incorrectly diagnosed the reason why hoverfly is unhappy.
Looking closer, it appears that it fails to link libldap
unless libssl is also present; so the problem was my
idea of clearing LIBS before making the check. Revert
to essentially the original coding, except that instead
of failing when libldap_r isn't there, use libldap.

Per buildfarm member hoverfly.

Discussion: https://postgr.es/m/17083-a19190d9591946a7@postgresql.org

Un-break AIX build.

In commit d0a02bdb8, I'd supposed that uniformly probing for
ldap_bind would make the intent clearer. However, that seems
not to work on AIX, for obscure reasons (maybe it's a macro
there?). Revert to the former behavior of probing
ldap_simple_bind for thread-safe cases and ldap_bind otherwise.

Per buildfarm member hoverfly.

Discussion: https://postgr.es/m/17083-a19190d9591946a7@postgresql.org

Update configure's probe for libldap to work with OpenLDAP 2.5.

The separate libldap_r is gone and libldap itself is now always
thread-safe. Unfortunately there seems no easy way to tell by
inspection whether libldap is thread-safe, so we have to take
it on faith that libldap is thread-safe if there's no libldap_r.
That should be okay, as it appears that libldap_r was a standard
part of the installation going back at least 20 years.

Report and patch by Adrian Ho. Back-patch to all supported
branches, since people might try to build any of them with
a newer OpenLDAP.

Discussion: https://postgr.es/m/17083-a19190d9591946a7@postgresql.org

Reject cases where a query in WITH rewrites to just NOTIFY.

Since the executor can't cope with a utility statement appearing
as a node of a plan tree, we can't support cases where a rewrite
rule inserts a NOTIFY into an INSERT/UPDATE/DELETE command appearing
in a WITH clause of a larger query.  (One can imagine ways around
that, but it'd be a new feature not a bug fix, and so far there's
been no demand for it.)  RewriteQuery checked for this, but it
missed the case where the DML command rewrites to *only* a NOTIFY.
That'd lead to crashes later on in planning.  Add the missed check,
and improve the level of testing of this area.

Per bug #17094 from Yaoguang Chen.  It's been busted since WITH
was introduced, so back-patch to all supported branches.

Discussion: https://postgr.es/m/17094-bf15dff55eaf2e28@postgresql.org

Remove more obsolete comments about semaphores.

Commit 6753333f stopped using semaphores as the sleep/wake mechanism for
heavyweight locks, but some obsolete references to that scheme remained
in comments. As with similar commit 25b93a29, back-patch all the way.

Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://postgr.es/m/CA%2BhUKGLafjB1uzXcy%3D%3D2L3cy7rjHkqOVn7qRYGBjk%3D%3DtMJE7Yg%40mail.gmail.com

Add missing Int64GetDatum macro in dbsize.c

I accidentally missed adding this when adjusting 55fe60938 for back
patching. This adjustment was made for 9.6 to 13. 14 and master are not
affected.

Discussion: https://postgr.es/m/CAApHDvp=twCsGAGQG=A=cqOaj4mpknPBW-EZB-sd+5ZS5gCTtA@mail.gmail.com

Fix incorrect return value in pg_size_pretty(bigint)

Due to how pg_size_pretty(bigint) was implemented, the value returned for
a negative number of bytes might not match the value returned for the
equivalent positive number of bytes.  This was due to two separate issues.

1. The function used bit shifting to convert the number of bytes into
larger units.  The rounding performed by bit shifting is not the same as
dividing.  For example -3 >> 1 = -2, but -3 / 2 = -1.  These two
operations are only equivalent with positive numbers.

2. The half_rounded() macro rounded towards positive infinity.  This meant
that negative numbers rounded towards zero and positive numbers rounded
away from zero.

Here we fix #1 by dividing the values instead of bit shifting.  We fix #2
by adjusting the half_rounded macro always to round away from zero.
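
The two issues can be illustrated with a minimal sketch. The function
names below are hypothetical and the bodies are only an approximation of
the behavior described above, not the actual dbsize.c code. Note that
right-shifting a negative value is implementation-defined in C; the
arithmetic shift assumed here is what common compilers provide.

```c
#include <stdint.h>

/* Halve by shifting: with an arithmetic shift this rounds toward
 * negative infinity (implementation-defined for negative values). */
int64_t halve_by_shift(int64_t x)
{
    return x >> 1;
}

/* Halve by dividing: C99 integer division truncates toward zero. */
int64_t halve_by_division(int64_t x)
{
    return x / 2;
}

/* Assumed "before" rounding: only non-negative inputs get the +1 bias,
 * so negative inputs end up rounded toward zero. */
int64_t half_rounded_toward_positive(int64_t x)
{
    return (x + (x < 0 ? 0 : 1)) / 2;
}

/* Assumed "after" rounding: bias by the sign of the input, so halves
 * are rounded away from zero in both directions. */
int64_t half_rounded_away_from_zero(int64_t x)
{
    return (x + (x < 0 ? -1 : 1)) / 2;
}
```

With the sign-aware bias, a negative input yields exactly the negated
result of the matching positive input, which is the symmetry the fix is
after.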

Additionally, adjust the pg_size_pretty(numeric) function to be more
explicit that it's using division rather than bit shifting.  A casual
observer might have believed bit shifting was used due to a static
function being named numeric_shift_right.  However, that function was
calculating the divisor from the number of bits and performed division.
Here we make that more clear.  This change is just cosmetic and does not
affect the return value of the numeric version of the function.

Here we also add a set of regression tests for both versions of
pg_size_pretty() which test the values directly before and after the
function switches to the next unit.

This bug was introduced in 8a1fab36a. Prior to that negative values were
always displayed in bytes.

Author: Dean Rasheed, David Rowley
Discussion: https://postgr.es/m/CAEZATCXnNW4HsmZnxhfezR5FuiGgp+mkY4AzcL5eRGO4fuadWg@mail.gmail.com
Backpatch-through: 9.6, where the bug was introduced.

Avoid doing catalog lookups in postgres_fdw's conversion_error_callback.

As in 50371df26, this is a bad idea since the callback can't really
know what error is being thrown and thus whether or not it is safe
to attempt catalog accesses.  Rather than pushing said accesses into
the mainline code where they'd usually be a waste of cycles, we can
look at the query's rangetable instead.

This change does mean that we'll be printing query aliases (if any
were used) rather than the table or column's true name.  But that
doesn't seem like a bad thing: it's certainly a more useful definition
in self-join cases, for instance.  In any case, it seems unlikely that
any applications would be depending on this detail, so it seems safe
to change.

Patch by me.  Original complaint by Andres Freund; Bharath Rupireddy
noted the connection to conversion_error_callback.

Discussion: https://postgr.es/m/20210106020229 [email protected]

Doc: add info about timestamps with fractional-minute UTC offsets.

Our code has supported fractional-minute UTC offsets for ages, but
there was no mention of the possibility in the main docs, and only
a very indirect reference in Appendix B. Improve that.

Discussion: https://postgr.es/m/162543102827.697.5755498651217979813@wrigleys.postgresql.org

Reduce overhead of cache-clobber testing in LookupOpclassInfo().

Commit 03ffc4d6d added logic to bypass all caching behavior in
LookupOpclassInfo when CLOBBER_CACHE_ALWAYS is enabled.  It doesn't
look like I stopped to think much about what that would cost, but
recent investigation shows that the cost is enormous: it roughly
doubles the time needed for cache-clobber test runs.

There does seem to be value in this behavior when trying to test
the opclass-cache loading logic itself, but for other purposes the
cost is excessive.  Hence, let's back off to doing this only when
debug_invalidate_system_caches_always is at least 3; or in older
branches, when CLOBBER_CACHE_RECURSIVELY is defined.

While here, clean up some other minor issues in LookupOpclassInfo.
Re-order the code so we aren't left with broken cache entries (leading
to later core dumps) in the unlikely case that we suffer OOM while
trying to allocate space for a new entry.  (That seems to be my
oversight in 03ffc4d6d.)  Also, in >= v13, stop allocating one array
entry too many.  That's evidently left over from sloppy reversion in
851b14b0c.

Back-patch to all supported branches, mainly to reduce the runtime
of cache-clobbering buildfarm animals.

Discussion: https://postgr.es/m/1370856.1625428625@sss.pgh.pa.us

doc: Mention the --enable-tap-tests requirement in the section on TAP tests

Author: Greg Sabino Mullane
Discussion: https://postgr.es/m/CAKAnmmJYH2FBn_+Vwd2FD5SaKn8hjhAXOCHpZc6n4wXaUaW_SA@mail.gmail.com
Backpatch-through: 9.6

Doc: mention that VACUUM can't utilize over 1GB of RAM

Document that setting maintenance_work_mem to values over 1GB has no
effect on VACUUM.

Reported-by: Martín Marqués
Author: Laurenz Albe
Discussion: https://postgr.es/m/CABeG9LsZ2ozUMcqtqWu_-GiFKB17ih3p8wBHXcpfnHqhCnsc7A%40mail.gmail.com
Backpatch-through: 9.6, oldest supported release

doc: adjust "cities" example to be consistent with other SQL

Reported-by: [email protected]
Discussion: https://postgr.es/m/162345756191.14472.9754568432103008703@wrigleys.postgresql.org

Backpatch-through: 9.6

add missing tag from commit b8c4261e5e

Add new make targets world-bin and install-world-bin

These are the same as world and install-world respectively, but without
building or installing the documentation. There are many reasons for
wanting to be able to do this, including speed, lack of documentation
building tools, and wanting to build other formats of the documentation.
Plans for simplifying the buildfarm client code include using these
targets.

Backpatch to all live branches.

Discussion: https://postgr.es/m/6a421136-d462-b043-a8eb-e75b2861f3df@dunslane.net

Fix prove_installcheck to use correct paths when used with PGXS

The prove_installcheck recipe in src/Makefile.global.in was emitting
bogus paths for a couple of elements when used with PGXS. Here we create
a separate recipe for the PGXS case that does it correctly. We also take
the opportunity to make the file more readable by breaking up
the prove_installcheck and prove_check recipes across several lines, and
to move the setting for REGRESS_SHLIB to src/test/recovery/Makefile,
which is the only set of tests that actually needs it.

Backpatch to all live branches

Discussion: https://postgr.es/m/f2401388-936b-f4ef-a07c-a0bcc49b3300@dunslane.net

Fix incorrect PITR message for transaction ROLLBACK PREPARED

Reaching PITR on such a transaction would cause the generation of a LOG
message mentioning a transaction committed, not aborted.

Oversight in 4f1b890.

Author: Simon Riggs
Discussion: https://postgr.es/m/CANbhV-GJ6KijeCgdOrxqMCQ+C8QiK657EMhCy4csjrPcEUFv_Q@mail.gmail.com
Backpatch-through: 9.6

Don't use abort(3) in libpq's fe-print.c.

Causing a core dump on out-of-memory seems pretty unfriendly,
and surely is far outside the expected behavior of a general-purpose
library. Just print an error message (as we did already) and return.
These functions unfortunately don't have an error return convention,
but code using them is probably just looking for a quick-n-dirty
print method and wouldn't bother to check anyway.
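
As a rough sketch of the "print a message and return" pattern described
above: the names alloc_fn, always_fail_alloc, alloc_field_widths and
print_fields are hypothetical and do not come from fe-print.c. The
allocator is passed in only so that an out-of-memory condition can be
simulated; the routine assumes the allocator is malloc-compatible.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocator signature, so a failing allocator can be injected. */
typedef void *(*alloc_fn) (size_t);

/* Stand-in allocator that simulates running out of memory. */
void *always_fail_alloc(size_t size)
{
    (void) size;
    return NULL;
}

/* Report the failure and hand NULL back to the caller; never abort(). */
int *alloc_field_widths(size_t nfields, alloc_fn alloc)
{
    int *widths = alloc(nfields * sizeof(int));

    if (widths == NULL)
        fprintf(stderr, "out of memory\n");
    return widths;
}

/* A print routine with no error return convention: on failure it just
 * returns after the message has been printed. */
void print_fields(FILE *out, const char *const *fields, size_t nfields,
                  alloc_fn alloc)
{
    int *widths = alloc_field_widths(nfields, alloc);

    if (widths == NULL)
        return;
    for (size_t i = 0; i < nfields; i++)
        widths[i] = fprintf(out, "%s ", fields[i]);
    fputc('\n', out);
    free(widths);
}
```

A caller would normally pass malloc as the allocator; passing
always_fail_alloc exercises the failure path without a core dump.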

Although these functions are semi-deprecated, it still seems
appropriate to back-patch this. In passing, also back-patch
b90e6cef1, just to reduce cosmetic differences between the
branches.

Discussion: https://postgr.es/m/3122443.1624735363@sss.pgh.pa.us

Add test for CREATE INDEX CONCURRENTLY with not-so-immutable predicate

83158f7 improved index_set_state_flags() so that it is possible to use
transactional updates when updating pg_index state flags, but there was
not really a test case that directly stressed the scenario it fixed.
This commit adds such a test, using a predicate that looks valid in
appearance but calls a stable function.

Author: Andrey Lepikhov
Discussion: https://postgr.es/m/9b905019-5297-7372-0ad2-e1a4bb66a719@postgrespro.ru
Backpatch-through: 9.6

Make index_set_state_flags() transactional

3c84046 is the original commit that introduced index_set_state_flags(),
where the presence of SnapshotNow made necessary the use of an in-place
update. SnapshotNow has been removed in 813fb03, so there is no actual
reason not to make this operation transactional.

As reported by Andrey, it is possible to trigger the assertion of this
routine expecting no transactional updates when switching the pg_index
state flags, using a predicate marked as immutable but calling stable or
volatile functions. 83158f7 has been around for a couple of months on
HEAD now with no issues found related to it, so it looks safe enough for
a backpatch.

Reported-by: Andrey Lepikhov
Author: Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/20200903080440 [email protected]
Discussion: https://postgr.es/m/9b905019-5297-7372-0ad2-e1a4bb66a719@postgrespro.ru
Backpatch-through: 9.6

Remove memory leaks in isolationtester.

specscanner.l leaked a kilobyte of memory per token of the spec file.
Apparently somebody thought that the introductory code block would be
executed once; but it's once per yylex() call.

A couple of functions in isolationtester.c leaked small amounts of
memory due to not bothering to free one-time allocations. Might
as well improve these so that valgrind gives this program a clean
bill of health. Also get rid of an ugly static variable.

Coverity complained about one of the one-time leaks, which led me
to try valgrind'ing isolationtester, which led to discovery of the
larger leak.

Remove unnecessary failure cases in RemoveRoleFromObjectPolicy().

It's not really necessary for this function to open or lock the
relation associated with the pg_policy entry it's modifying.  The
error checks it's making on the rel are if anything counterproductive
(e.g., if we don't want to allow installation of policies on system
catalogs, here is not the place to prevent that).  In particular, it
seems just wrong to insist on an ownership check.  That has the net
effect of forcing people to use superuser for DROP OWNED BY, which
surely is not an effect we want.  Also there is no point in rebuilding
the dependencies of the policy expressions, which aren't being
changed.  Lastly, locking the table also seems counterproductive; it's
not helping to prevent race conditions, since we failed to re-read the
pg_policy row after acquiring the lock.  That means that concurrent
DDL would likely result in "tuple concurrently updated/deleted"
errors; which is the same behavior this code will produce, with less
overhead.

Per discussion of bug #17062.  Back-patch to all supported versions,
as the failure cases this eliminates seem just as undesirable in 9.6
as in HEAD.

Discussion: https://postgr.es/m/1573181.1624220108@sss.pgh.pa.us

Stabilize results of insert-conflict-toast.spec.

This back-branch test script was later absorbed into
insert-conflict-specconflict.spec, which required some stabilization
in commit 741d7f104, so perhaps it's not surprising that it needs a
bit of love too.

It's odd though that we hadn't seen it fail before now, because
I thought that 741d7f104 did not change isolationtester's timing
behavior for scripts without any annotation markers. In any case,
this script is racy on its face, so add an annotation to force stable
reporting order.

Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet&dt=2021-06-24%2009%3A54%3A56
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2021-06-24%2010%3A10%3A00

Another fix to relmapper race condition.

In previous commit, I missed that relmap_redo() was also not acquiring the
RelationMappingLock. Thanks to Thomas Munro for pointing that out.

Backpatch-through: 9.6, like previous commit.
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGLev%3DPpOSaL3WRZgOvgk217et%2BbxeJcRr4eR-NttP1F6Q%40mail.gmail.com

Prevent race condition while reading relmapper file.

Contrary to the comment here, POSIX does not guarantee atomicity of a
read(), if another process calls write() concurrently. Or at least Linux
does not. Add locking to load_relmap_file() to avoid the race condition.

Fixes bug #17064. Thanks to Alexander Lakhin for the report and test case.

Backpatch-through: 9.6, all supported versions.
Discussion: https://www.postgresql.org/message-id/17064-bb0d7904ef72add3@postgresql.org

Doc: Update caveats in synchronous logical replication.

Reported-by: Simon Riggs
Author: Takamichi Osumi
Reviewed-by: Amit Kapila
Backpatch-through: 9.6
Discussion: https://www.postgresql.org/message-id/20210222222847 [email protected]

pgcrypto: avoid name conflicts with OpenSSL in one more case.

I happened to notice that if compiled --with-gssapi, 9.6's
contrib/pgcrypto tests report memory stomps for some SHA operations.

Both MEMORY_CONTEXT_CHECKING and valgrind agree there's a problem,
though nothing crashes; it appears that the buffer overrun
only extends into alignment padding, at least on 64-bit hardware.

Investigation found that pgcrypto's references to SHA224_Init
et al were being captured by the system OpenSSL library, which
of course has slightly incompatible definitions of those functions.
We long ago noticed this problem with respect to the sibling
functions SHA256_Init and so on, and commit 56f44784f introduced
renaming macros to dodge the problem for those.  However, it didn't
cover the SHA224 family because we didn't use that at the time.
When commit 1abf76e82 added those awhile later, it neglected to add
a similar renaming macro.  Better late than never, so do so now.

This appears to affect all branches 8.2 - 9.6, so it's surprising
nobody noticed before now.  Maybe the effect is somehow specific
to the way RHEL8 intertwines its GSS and SSL libraries?  Anyway,
we refactored all this stuff in v10, so newer branches don't have
the problem.

Allow non-quoted identifiers as isolation test session/step names.

For no obvious reason, isolationtester has always insisted that
session and step names be written with double quotes.  This is
fairly tedious and does little for test readability, especially
since the names that people actually choose almost always look
like normal identifiers.  Hence, let's tweak the lexer to allow
SQL-like identifiers, not only double-quoted strings.

(They're SQL-like, not exactly SQL, because I didn't add any
case-folding logic.  Also there's no provision for U&"..." names,
not that anyone's likely to care.)

There is one incompatibility introduced by this change: if you write
"foo""bar" with no space, that used to be taken as two identifiers,
but now it's just one identifier with an embedded quote mark.
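
To make the new reading of "foo""bar" concrete, here is a small
hand-written scanner. It is only a sketch of the doubled-quote rule; the
real change lives in the flex lexer specscanner.l, and the function name
scan_quoted_identifier is hypothetical.

```c
#include <stddef.h>

/*
 * Scan one double-quoted identifier that starts at src[0] == '"'.
 * A doubled quote inside the identifier stands for one literal quote.
 * The unquoted name is written to dst (capacity dstlen).  Returns the
 * number of input characters consumed, or -1 if the input is malformed
 * or the name does not fit.
 */
int scan_quoted_identifier(const char *src, char *dst, size_t dstlen)
{
    size_t in = 1;
    size_t out = 0;

    if (dstlen == 0 || src[0] != '"')
        return -1;
    for (;;)
    {
        char c = src[in];

        if (c == '\0')
            return -1;          /* unterminated identifier */
        if (c == '"')
        {
            if (src[in + 1] == '"')
                in++;           /* doubled quote: keep one, skip one */
            else
                break;          /* closing quote */
        }
        if (out + 1 >= dstlen)
            return -1;          /* no room left for the terminator */
        dst[out++] = c;
        in++;
    }
    dst[out] = '\0';
    return (int) (in + 1);
}
```

Given the ten-character input "foo""bar" the scanner consumes all of it
and yields the single name foo"bar, whereas "foo" "bar" (with a space)
stops after the first five characters with the name foo.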

I converted all the src/test/isolation/ specfiles to remove
unnecessary double quotes, but stopped there because my
eyes were glazing over already.

Like 741d7f104, back-patch to all supported branches, so that this
isn't a stumbling block for back-patching isolation test changes.

Discussion: https://postgr.es/m/759113.1623861959@sss.pgh.pa.us

Doc: fix confusion about LEAKPROOF in syntax summaries.

The syntax summaries for CREATE FUNCTION and allied commands
made it look like LEAKPROOF is an alternative to
IMMUTABLE/STABLE/VOLATILE, when of course it is an orthogonal
option. Improve that.

Per gripe from aazamrafeeque0. Thanks to David Johnston for
suggestions.

Discussion: https://postgr.es/m/162444349581.694.5818572718530259025@wrigleys.postgresql.org

Don't assume GSSAPI result strings are null-terminated.

Our uses of gss_display_status() and gss_display_name() assumed
that the gss_buffer_desc strings returned by those functions are
null-terminated.  It appears that they generally are, given the
lack of field complaints up to now.  However, the available
documentation does not promise this, and some man pages
for gss_display_status() show examples that rely on the
gss_buffer_desc.length field instead of expecting null
termination.  Also, we now have a report that on some
implementations, clang's address sanitizer is of the opinion
that the byte after the specified length is undefined.

Hence, change the code to rely on the length field instead.
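
A minimal sketch of relying on the length field rather than on null
termination follows. The counted_buffer struct is a hypothetical
stand-in with the same two fields as gss_buffer_desc, used so the
example builds without GSSAPI headers; the helper name is likewise an
assumption, not the function used in the patch.

```c
#include <string.h>

/* Stand-in for gss_buffer_desc: a byte count plus a pointer to bytes
 * that are not guaranteed to be null-terminated. */
typedef struct
{
    size_t      length;
    const void *value;
} counted_buffer;

/*
 * Copy a counted buffer into dst as a C string, truncating if needed.
 * Never reads past buf->length bytes of the source.  Returns the number
 * of bytes copied, not counting the terminator.
 */
size_t counted_buffer_to_cstring(const counted_buffer *buf,
                                 char *dst, size_t dstlen)
{
    size_t n;

    if (dstlen == 0)
        return 0;
    n = (buf->length < dstlen - 1) ? buf->length : dstlen - 1;
    if (n > 0)
        memcpy(dst, buf->value, n);
    dst[n] = '\0';
    return n;
}
```

For direct printing, the same idea can be expressed with a precision
argument, as in printf("%.*s", (int) buf.length, (const char *) buf.value).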

This might well be cosmetic rather than fixing any real bug, but
it's hard to be sure, so back-patch to all supported branches.
While here, also back-patch the v12 changes that made pg_GSS_error
deal honestly with multiple messages available from
gss_display_status.

Per report from Sudheer H R.

Discussion: https://postgr.es/m/5372B6D4-8276-42C0-B8FB-BD0918826FC3@tekenlight.com

Improve display of query results in isolation tests.

Previously, isolationtester displayed SQL query results using some
ad-hoc code that clearly hadn't had much effort expended on it.
Field values longer than 14 characters weren't separated from
the next field, and usually caused misalignment of the columns
too.  Also there was no visual separation of a query's result
from subsequent isolationtester output.  This made test result
files confusing and hard to read.

To improve matters, let's use libpq's PQprint() function.  Although
that's long since unused by psql, it's still plenty good enough
for the purpose here.

Like 741d7f104, back-patch to all supported branches, so that this
isn't a stumbling block for back-patching isolation test changes.

Discussion: https://postgr.es/m/582362.1623798221@sss.pgh.pa.us

Use annotations to reduce instability of isolation-test results.

We've long contended with isolation test results that aren't entirely
stable.  Some test scripts insert long delays to try to force stable
results, which is not terribly desirable; but other erratic failure
modes remain, causing unrepeatable buildfarm failures.  I've spent a
fair amount of time trying to solve this by improving the server-side
support code, without much success: that way is fundamentally unable
to cope with diffs that stem from chance ordering of arrival of
messages from different server processes.

We can improve matters on the client side, however, by annotating
the test scripts themselves to show the desired reporting order
of events that might occur in different orders.  This patch adds
three types of annotations to deal with (a) test steps that might or
might not complete their waits before the isolationtester can see them
waiting; (b) test steps in different sessions that can legitimately
complete in either order; and (c) NOTIFY messages that might arrive
before or after the completion of a step in another session.  We might
need more annotation types later, but this seems to be enough to deal
with the instabilities we've seen in the buildfarm.  It also lets us
get rid of all the long delays that were previously used, cutting more
than a minute off the runtime of the isolation tests.

Back-patch to all supported branches, because the buildfarm
instabilities affect all the branches, and because it seems desirable
to keep isolationtester's capabilities the same across all branches
to simplify possible future back-patching of tests.

Discussion: https://postgr.es/m/327948.1623725828@sss.pgh.pa.us

Fix misbehavior of DROP OWNED BY with duplicate polroles entries.

Ordinarily, a pg_policy.polroles array wouldn't list the same role
more than once; but CREATE POLICY does not prevent that.  If we
perform DROP OWNED BY on a role that is listed more than once,
RemoveRoleFromObjectPolicy either suffered an assertion failure
or encountered a tuple-updated-by-self error.  Rewrite it to cope
correctly with duplicate entries, and add a CommandCounterIncrement
call to prevent the other problem.
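
One way to cope with duplicates is to drop every matching entry in a
single pass instead of assuming the role appears once. The sketch below
shows that idea on a plain array; role_id and remove_role are
hypothetical names, and this is not the actual RemoveRoleFromObjectPolicy
code, which works on a catalog array.

```c
#include <stddef.h>

/* Hypothetical role identifier type (PostgreSQL itself uses Oid). */
typedef unsigned int role_id;

/*
 * Compact roles[] in place, dropping every entry equal to target,
 * however many times it occurs.  Returns the new number of entries.
 */
size_t remove_role(role_id *roles, size_t nroles, role_id target)
{
    size_t kept = 0;

    for (size_t i = 0; i < nroles; i++)
    {
        if (roles[i] != target)
            roles[kept++] = roles[i];
    }
    return kept;
}
```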

Per discussion, there's other cleanup that ought to happen here,
but this seems like the minimum essential fix.

Per bug #17062 from Alexander Lakhin.  It's been broken all along,
so back-patch to all supported branches.

Discussion: https://postgr.es/m/17062-11f471ae3199ca23@postgresql.org

Avoid scribbling on input node tree in CREATE/ALTER DOMAIN.

This works fine in the "simple Query" code path; but if the
statement is in the plan cache then it's corrupted for future
re-execution. Apply copyObject() to protect the original
tree from modification, as we've done elsewhere.

This narrow fix is applied only to the back branches. In HEAD,
the problem was fixed more generally by commit 7c337b6b5; but
that changed ProcessUtility's API, so it's infeasible to
back-patch.

Per bug #17053 from Charles Samborski.

Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us
Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org

Update plpython_subtransaction alternative expected files

The original patch only targeted Python 2.6 and newer, since that is
what we have supported in PostgreSQL 13 and newer. For older
branches, we need to fix it up for older Python versions.

Tidy up GetMultiXactIdMembers()'s behavior on error

One of the error paths left *members uninitialized. That's not a live
bug, because most callers don't look at *members when the function
returns -1, but let's be tidy. One caller, in heap_lock_tuple(), does
"if (members != NULL) pfree(members)", but AFAICS it never passes an
invalid 'multi' value so it should not reach that error case.

The callers are also a bit inconsistent in their expectations.
heap_lock_tuple() pfrees the 'members' array if it's not-NULL, others
pfree() it if "nmembers >= 0", and others if "nmembers > 0". That's
not a live bug either, because the function should never return 0, but
add an Assert for that to make it more clear. I left the callers alone
for now.

I also moved the line where we set *nmembers. It wasn't wrong before,
but I like to do that right next to the 'return' statement, to make it
clear that it's always set on return.

Also remove one unreachable return statement after ereport(ERROR), for
brevity and for consistency with the similar if-block right after it.
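
The convention described above (output pointer always defined, never a
zero return, result stored right next to the return) can be sketched as
follows. The member struct and get_members function are hypothetical and
the lookup is faked; this is not the multixact.c code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical member record. */
typedef struct
{
    unsigned int xid;
    int          status;
} member;

/*
 * Returns the number of members (always > 0) and sets *members to a
 * malloc'd array, or returns -1 with *members set to NULL.
 */
int get_members(unsigned int multi, member **members)
{
    member *result;
    int     nmembers;

    *members = NULL;            /* defined value on every error path */

    if (multi == 0)
        return -1;              /* invalid id */

    nmembers = 2;               /* assumption: pretend two members exist */
    result = malloc(nmembers * sizeof(member));
    if (result == NULL)
        return -1;
    for (int i = 0; i < nmembers; i++)
    {
        result[i].xid = multi + (unsigned int) i;
        result[i].status = 0;
    }

    assert(nmembers > 0);       /* callers may rely on "never zero" */
    *members = result;          /* set the output right before returning */
    return nmembers;
}
```

With this shape, callers that test "members != NULL", "nmembers >= 0" or
"nmembers > 0" before freeing all behave the same way.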

Author: Greg Nancarrow with the additional changes by me
Backpatch-through: 9.6, all supported versions

Fix subtransaction test for Python 3.10

Starting with Python 3.10, the stacktrace looks different:
  -  PL/Python function "subtransaction_exit_subtransaction_in_with", line 3, in <module>
  -    s.__exit__(None, None, None)
  +  PL/Python function "subtransaction_exit_subtransaction_in_with", line 2, in <module>
  +    with plpy.subtransaction() as s:
Using try/except specifically makes the error always look the same.

(See https://github.com/python/cpython/pull/25719 for the discussion
of this change in Python.)

Author: Honza Horak <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/853083.1620749597%40sss.pgh.pa.us
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1959080

Document a few caveats in synchronous logical replication.

In a synchronous logical setup, locking [user] catalog tables can cause
deadlock. Logical decoding of transactions can lock catalog tables to
access them, so exclusively locking those tables in transactions can lead
to deadlock. To avoid this, users must refrain from taking exclusive
locks on catalog tables.

Author: Takamichi Osumi
Reviewed-by: Vignesh C, Amit Kapila
Backpatch-through: 9.6
Discussion: https://www.postgresql.org/message-id/20210222222847.tpnb6eg3yiykzpky%40alap3.anarazel.de

Detect unused steps in isolation specs and do some cleanup

This is useful for developers to find out if an isolation spec is
over-engineered or if it needs more work by warning at the end of a
test run if a step is not used, generating a failure with extra diffs.

While on it, clean up all the specs which include steps not used in any
permutations to simplify them.

This is a backpatch of 989d23b and 06fdc4e, as it is becoming useful to
make all the branches consistent for an upcoming patch that will improve
the output generated by isolationtester.

Author: Michael Paquier
Reviewed-by: Asim Praveen, Melanie Plageman
Discussion: https://postgr.es/m/20190819080820 [email protected]
Discussion: https://postgr.es/m/794820.1623872009@sss.pgh.pa.us
Backpatch-through: 9.6

Remove dry-run mode from isolationtester

The original purpose of the dry-run mode is to be able to print all the
possible permutations from a spec file, but it has become less useful
since isolation tests have improved regarding deadlock detection as one
step not wanted by the author could block indefinitely now (originally
the step blocked would have been detected rather quickly). Per
discussion, let's remove it.

This is a backpatch of 9903338 for 9.6~12. Having it on those branches
keeps the code consistent across all supported versions, which helps
with improving the output generated by isolationtester.

Author: Michael Paquier
Reviewed-by: Asim Praveen, Melanie Plageman
Discussion: https://postgr.es/m/20190819080820 [email protected]
Discussion: https://postgr.es/m/794820.1623872009@sss.pgh.pa.us
Backpatch-through: 9.6

Fix plancache refcount leak after error in ExecuteQuery.

When stuffing a plan from the plancache into a Portal, one is
not supposed to risk throwing an error between GetCachedPlan and
PortalDefineQuery; if that happens, the plan refcount incremented
by GetCachedPlan will be leaked.  I managed to break this rule
while refactoring code in 9dbf2b7d7.  There is no visible
consequence other than some memory leakage, and since nobody is
very likely to trigger the relevant error conditions many times
in a row, it's not surprising we haven't noticed.  Nonetheless,
it's a bug, so rearrange the order of operations to remove the
hazard.

Noted on the way to looking for a better fix for bug #17053.
This mistake is pretty old, so back-patch to all supported
branches.

Further refinement of stuck_on_old_timeline recovery test

TestLib::perl2host can take a file argument as well as a directory
argument, so that code becomes substantially simpler. Also add comments
on why we're using forward slashes, and why we're setting
PERL_BADLANG=0.

Discussion: https://postgr.es/m/e9947bcd-20ee-027c-f0fe-01f736b7e345@dunslane.net

Fix decoding of speculative aborts.

During decoding for speculative inserts, we relied on confirmation
records or next change records to clean up the toast hash. But that
could lead to multiple problems: (a) a memory leak if there is neither a
confirmation record nor any other record after toast insertion for a
speculative insert in the transaction, and (b) errors and assertion
failures if the next operation is not an insert/update on the same table.

The fix is to start queuing spec abort change and clean up toast hash
and change record during its processing. Currently, we are queuing the
spec aborts for both toast and main table even though we perform cleanup
while processing the main table's spec abort record. Later, if we have a
way to distinguish between the spec abort record of toast and the main
table, we can avoid queuing the change for spec aborts of toast tables.

Reported-by: Ashutosh Bapat
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 9.6, where it was introduced
Discussion: https://postgr.es/m/CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com

Work around portability issue with newer versions of mktime().

Recent glibc versions have made mktime() fail if tm_isdst is
inconsistent with the prevailing timezone; in particular it fails for
tm_isdst = 1 when the zone is UTC.  (This seems wildly inconsistent
with the POSIX-mandated treatment of "incorrect" values for the other
fields of struct tm, so if you ask me it's a bug, but I bet they'll
say it's intentional.)  This has been observed to cause cosmetic
problems when pg_restore'ing an archive created in a different
timezone.

To fix, do mktime() using the field values from the archive, and if
that fails try again with tm_isdst = -1.  This will give a result
that's off by the UTC-offset difference from the original zone, but
that was true before, too.  It's not terribly critical since we don't
do anything with the result except possibly print it.  (Someday we
should flush this entire bit of logic and record a standard-format
timestamp in the archive instead.  That's not okay for a back-patched
bug fix, though.)
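
The retry described above can be sketched roughly as follows. This is an
illustration only, not the code from the patch: the function name
mktime_with_isdst_retry is hypothetical, and the sketch treats a return
value of (time_t) -1 as failure, which cannot be told apart from the
legitimate instant one second before the epoch.

```c
#include <time.h>

/*
 * Hypothetical helper: convert a broken-down time to time_t, first using
 * the caller's tm_isdst value and, if that fails, retrying with
 * tm_isdst = -1 so that the C library decides whether DST applies.
 */
time_t
mktime_with_isdst_retry(struct tm tm_in)
{
    struct tm   work = tm_in;   /* mktime() may modify its argument */
    time_t      result = mktime(&work);

    if (result == (time_t) -1)
    {
        work = tm_in;           /* start again from the original fields */
        work.tm_isdst = -1;     /* "unknown": let mktime() figure it out */
        result = mktime(&work);
    }
    return result;
}
```

With a tm_isdst value that the library rejects for the prevailing zone,
the second call is the one that produces the result; otherwise the first
call's result is returned unchanged.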

Also, guard our only other use of mktime() by having initdb's
build_time_t() set tm_isdst = -1 not 0.  This case could only have
an issue in zones that are DST year-round; but I think some do exist,
or could in future.

Per report from Wells Oliver.  Back-patch to all supported
versions, since any of them might need to run with a newer glibc.

Discussion: https://postgr.es/m/CAOC+FBWDhDHO7G-i1_n_hjRzCnUeFO+H-Czi1y10mFhRWpBrew@mail.gmail.com

Further tweaks to stuck_on_old_timeline recovery test

Translate path slashes on target directory path. This was confusing old
branches, but is applied to all branches for the sake of uniformity.
Perl is perfectly able to understand paths with forward slashes.

Along the way, restore the previous archive_wait query, for the sake of
uniformity with other tests, per gripe from Tom Lane.

Ignore more environment variables in pg_regress.c

This is similar to the work done in 8279f68 for TestLib.pm, where
environment variables set may cause unwanted failures if using a
temporary installation with pg_regress. The list of variables reset is
adjusted in each stable branch depending on what is supported.

Comments are added to remember that the lists in TestLib.pm and
pg_regress.c had better be kept in sync.

Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 9.6

Ensure pg_filenode_relation(0, 0) returns NULL.

Previously, a zero value for the relfilenode resulted in
a confusing error message about "unexpected duplicate".
This function returns NULL for other invalid relfilenode
values, so zero should be treated likewise.

It's been like this all along, so back-patch to all supported
branches.

Justin Pryzby

Discussion: https://postgr.es/m/20210612023324 [email protected]

Fix new recovery test for use under msys

Commit caba8f0d43 wasn't quite right for msys, as demonstrated by
several buildfarm animals, including jacana and fairywren. We need to
use the msys perl in the archive command, but call it in such a way that
Windows will understand the path. Furthermore, inside the copy script we
need to convert a Windows path to an msys path.

Remove PGSSLCRLDIR from the list of variables ignored in TAP tests

This variable was present in the list added by 9d660670, but it is not
supported by this branch. Issue noticed while diving into a similar
change for pg_regress.c.

Backpatch-through: 9.6

Adjust new test case to set wal_keep_segments.

Per buildfarm member conchuela and Kyotaro Horiguchi, it's possible
for the WAL segment that the cascading standby needs to be removed
too quickly. Hopefully this will prevent that.

Kyotaro Horiguchi

Discussion: http://postgr.es/m/20210610.101240.1270925505780628275 [email protected]

Fix corner case failure of new standby to follow new primary.

This only happens if (1) the new standby has no WAL available locally,
(2) the new standby is starting from the old timeline, (3) the promotion
happened in the WAL segment from which the new standby is starting,
(4) the timeline history file for the new timeline is available from
the archive but the WAL files for it are not (i.e. this is a race),
(5) the WAL files for the new timeline are available via streaming,
and (6) recovery_target_timeline='latest'.

Commit ee994272ca50f70b53074f0febaec97e28f83c4e introduced this
logic and was an improvement over the previous code, but it mishandled
this case. If recovery_target_timeline='latest' and restore_command is
set, validateRecoveryParameters() can change recoveryTargetTLI to be
different from receiveTLI. If streaming is then tried afterward,
expectedTLEs gets initialized with the history of the wrong timeline.
It's supposed to be a list of entries explaining how to get to the
target timeline, but in this case it ends up with a list of entries
explaining how to get to the new standby's original timeline, which
isn't right.

Dilip Kumar and Robert Haas, reviewed by Kyotaro Horiguchi.

Discussion: http://postgr.es/m/CAFiTN-sE-jr=LB8jQuxeqikd-Ux+jHiXyh4YDiZMPedgQKup0g@mail.gmail.com

Back-port a few PostgresNode.pm methods.

The 'lsn' and 'wait_for_catchup' methods only exist in v10 and
higher, but are needed in order to support a planned test
case for a bug that exists all the way back to v9.6. To minimize
cross-branch differences in the test case, back-port these
methods.

Discussion: http://postgr.es/m/CA+TgmoaG5dmA_8Xc1WvbvftPjtwx5uzkGEHxE7MiJ+im9jynmw@mail.gmail.com

Fix inconsistencies in psql --help=commands

The set of subcommands supported by \dAp, \do and \dy was described
incorrectly in psql's --help. The documentation was already consistent
with the code.

Reported-by: inoas, from IRC
Author: Matthijs van der Vleuten
Reviewed-by: Neil Chen
Discussion: https://postgr.es/m/6a984e24-2171-4039-9050-92d55e7b23fe@www.fastmail.com
Backpatch-through: 9.6

Fix incautious handling of possibly-miscoded strings in client code.

An incorrectly-encoded multibyte character near the end of a string
could cause various processing loops to run past the string's
terminating NUL, with results ranging from no detectable issue to
a program crash, depending on what happens to be in the following
memory.

This isn't an issue in the server, because we take care to verify
the encoding of strings before doing any interesting processing
on them.  However, that lack of care leaked into client-side code
which shouldn't assume that anyone has validated the encoding of
its input.

Although this is certainly a bug worth fixing, the PG security team
elected not to regard it as a security issue, primarily because
any untrusted text should be sanitized by PQescapeLiteral or
the like before being incorporated into a SQL or psql command.
(If an app fails to do so, the same technique can be used to
cause SQL injection, with probably much more dire consequences
than a mere client-program crash.)  Those functions were already
made proof against this class of problem, cf CVE-2006-2313.

To fix, invent PQmblenBounded() which is like PQmblen() except it
won't return more than the number of bytes remaining in the string.
In HEAD we can make this a new libpq function, as PQmblen() is.
It seems imprudent to change libpq's API in stable branches though,
so in the back branches define PQmblenBounded as a macro in the files
that need it.  (Note that just changing PQmblen's behavior would not
be a good idea; notably, it would completely break the escaping
functions' defense against this exact problem.  So we just want a
version for those callers that don't have any better way of handling
this issue.)
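
The idea of a bounded character length can be sketched as below. This is
not libpq code: claimed_utf8_len is a hypothetical stand-in for PQmblen()
that only looks at a UTF-8 lead byte, and bounded_utf8_len is a
hypothetical stand-in for PQmblenBounded(). The sketch uses a plain loop;
strnlen() could serve the same purpose where it is available.

```c
/*
 * Hypothetical stand-in for PQmblen(): the length that the first byte of
 * a UTF-8 sequence claims, without looking at the following bytes.
 */
int
claimed_utf8_len(const char *s)
{
    unsigned char c = (unsigned char) *s;

    if ((c & 0x80) == 0)
        return 1;               /* ASCII (including the NUL byte) */
    if ((c & 0xE0) == 0xC0)
        return 2;
    if ((c & 0xF0) == 0xE0)
        return 3;
    if ((c & 0xF8) == 0xF0)
        return 4;
    return 1;                   /* invalid lead byte: treat as one byte */
}

/*
 * Hypothetical stand-in for PQmblenBounded(): like claimed_utf8_len(),
 * but never reports more bytes than remain before the terminating NUL.
 */
int
bounded_utf8_len(const char *s)
{
    int         claimed = claimed_utf8_len(s);
    int         n = 0;

    while (n < claimed && s[n] != '\0')
        n++;
    return n;
}
```

A loop of the form "while (*s) s += bounded_utf8_len(s);" then stops at
the NUL even when the final character is truncated, whereas advancing by
the claimed length could step past it.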

Per private report from houjingyi.  Back-patch to all supported branches.

Support use of strnlen() in pre-v11 branches.

Back-patch a minimal subset of commits fffd651e8 and 46912d9b1,
to support strnlen() on all platforms without adding any callers.
This will be needed by a following bug fix.

In PostgresNode.pm, don't pass SQL to psql on the command line

The Msys shell mangles certain patterns in its command line, so avoid
handing arbitrary SQL to psql on the command line and instead use
IPC::Run's redirection facility for stdin. This pattern is already
mostly what's used, but query_poll_until() was not doing the right thing.

Problem discovered on the buildfarm when a new TAP test failed on msys.

Reduce risks of conflicts in internal queries of REFRESH MATVIEW CONCURRENTLY

The internal SQL queries used by REFRESH MATERIALIZED VIEW CONCURRENTLY
include some aliases for its diff and temporary relations with
rather-generic names: diff, newdata, newdata2 and mv.  Depending on the
queries used for the materialized view, using CONCURRENTLY could lead to
some internal failures if the query and those internal aliases conflict.

Those names were chosen in 841c29c8.  This commit switches instead
to a naming pattern which is less likely to cause conflicts, based
on an idea from Thomas Munro, by appending _$ to those aliases.  This is
not perfect as those new names could still conflict, but at least it has
the advantage of keeping the code readable and simple while reducing the
likelihood of conflicts to close to zero.

Reported-by: Mathis Rudolf
Author: Bharath Rupireddy
Reviewed-by: Bernd Helmle, Thomas Munro, Michael Paquier
Discussion: https://postgr.es/m/109c267a-10d2-3c53-b60e-720fcf44d9e8@credativ.de
Backpatch-through: 9.6

Ignore more environment variables in TAP tests

Various environment variables were not getting reset in the TAP tests,
which would cause failures depending on the tests or the environment
variables involved.  For example, PGSSL{MAX,MIN}PROTOCOLVERSION could
cause failures in the SSL tests.  Even worse, a junk value of
PGCLIENTENCODING makes a server startup fail.  The list of variables
reset is adjusted in each stable branch depending on what is supported.

While at it, simplify the code a bit per a suggestion from Andrew
Dunstan, using a list of variables instead of doing single deletions.

Reviewed-by: Andrew Dunstan, Daniel Gustafsson
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 9.6

Reject SELECT ... GROUP BY GROUPING SETS (()) FOR UPDATE.

This case should be disallowed, just as FOR UPDATE with a plain
GROUP BY is disallowed; FOR UPDATE only makes sense when each row
of the query result can be identified with a single table row.
However, we missed teaching CheckSelectLocking() to check
groupingSets as well as groupClause, so that it would allow
degenerate grouping sets. That resulted in a bad plan and
a null-pointer dereference in the executor.

Looking around for other instances of the same bug, the only one
I found was in examine_simple_variable(). That'd just lead to
silly estimates, but it should be fixed too.

Per private report from Yaoguang Chen.
Back-patch to all supported branches.

fix syntax error

Report configured port in MSVC built pg_config

This is a long standing omission, discovered when trying to write code
that relied on it.

Backpatch to all live branches.

Fix MSVC scripts when building with GSSAPI/Kerberos

The deliverables of upstream Kerberos on Windows are installed with
paths that do not match our MSVC scripts. First, the include folder was
named "inc/" in our scripts, but the upstream MSIs use "include/".
Second, the build would fail with 64-bit environments as the libraries
are named differently.

This commit adjusts the MSVC scripts to be compatible with the latest
installations of upstream, and I have checked that the compilation was
able to work with the 32-bit and 64-bit installations.

Special thanks to Kondo Yuta for the help in investigating the situation
in hamerkop, which had an incorrect configuration for the GSS
compilation.

Reported-by: Brian Ye
Discussion: https://postgr.es/m/162128202219.27274.12616756784952017465@wrigleys.postgresql.org
Backpatch-through: 9.6

doc: Fix description of some GUCs in docs and postgresql.conf.sample

The descriptions of the following parameters were imprecise or
incorrect (PGC_POSTMASTER or PGC_SIGHUP):
- autovacuum_work_mem (docs, as of 9.6~)
- huge_page_size (docs, as of 14~)
- max_logical_replication_workers (docs, as of 10~)
- max_sync_workers_per_subscription (docs, as of 10~)
- min_dynamic_shared_memory (docs, as of 14~)
- recovery_init_sync_method (postgresql.conf.sample, as of 14~)
- remove_temp_files_after_crash (docs, as of 14~)
- restart_after_crash (docs, as of 9.6~)
- ssl_min_protocol_version (docs, as of 12~)
- ssl_max_protocol_version (docs, as of 12~)

This commit adjusts the description of all these parameters to be more
consistent with the practice used for the others.

Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 9.6

Disallow SSL renegotiation

SSL renegotiation is already disabled as of 48d23c72; however, this does
not prevent the server from complying with a client willing to use
renegotiation.  In the last couple of years, renegotiation has had its
share of security issues and flaws (like the recent CVE-2021-3449), and it
could be possible to crash the backend with a client attempting
renegotiation.

This commit takes one extra step by disabling renegotiation in the
backend in the same way as SSL compression (f9264d15) or tickets
(97d3a0b0).  OpenSSL 1.1.0h has added an option named
SSL_OP_NO_RENEGOTIATION able to achieve that.  In older versions
there is an option called SSL3_FLAGS_NO_RENEGOTIATE_CIPHERS that
was undocumented, and could be set within the SSL object created when
the TLS connection opens, but I have decided not to use it, as it feels
trickier to rely on, and it is not official.  Note that this option is
not usable in OpenSSL < 1.1.0h as the internal contents of the *SSL
object are hidden to applications.

SSL renegotiation concerns protocols up to TLSv1.2.

Per original report from Robert Haas, with a patch based on a suggestion
by Andres Freund.

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 9.6

Clean up cpluspluscheck violation.

"typename" is a C++ keyword, so pg_upgrade.h fails to compile in C++.
Fortunately, there seems no likely reason for somebody to need to
do that. Nonetheless, it's project policy that all .h files should
pass cpluspluscheck, so rename the argument to fix that.

Oversight in 57c081de0; back-patch as that was. (The policy requiring
pg_upgrade.h to pass cpluspluscheck only goes back to v12, but it
seems best to keep this code looking the same in all branches.)

Fix typo and outdated information in README.barrier

README.barrier didn't seem to get the memo when atomics were added. Fix
that.

Author: Tatsuo Ishii, David Rowley
Discussion: https://postgr.es/m/20210516.211133.2159010194908437625.t-ishii%40sraoss.co.jp
Backpatch-through: 9.6, oldest supported release

Be more careful about barriers when releasing BackgroundWorkerSlots.

ForgetBackgroundWorker lacked any memory barrier at all, while
BackgroundWorkerStateChange had one but unaccountably did
additional manipulation of the slot after the barrier. AFAICS,
the rule must be that the barrier is immediately before setting
or clearing slot->in_use.
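
The ordering rule can be illustrated with a small sketch. It uses C11
release/acquire atomics in place of PostgreSQL's own barrier primitives,
so it shows the rule rather than the patched code; DemoSlot and both
function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical slot: some payload plus an in_use flag. */
typedef struct DemoSlot
{
    int         payload;
    atomic_bool in_use;
} DemoSlot;

/*
 * Release a slot.  All other manipulation of the slot happens first; the
 * release store then orders those writes before in_use becomes false.
 * Nothing in the slot is touched after the flag is cleared.
 */
void
demo_slot_release(DemoSlot *slot)
{
    slot->payload = 0;
    atomic_store_explicit(&slot->in_use, false, memory_order_release);
}

/*
 * Read the payload only if the slot is in use.  The acquire load pairs
 * with the release store of whoever last changed in_use.
 */
bool
demo_slot_try_read(DemoSlot *slot, int *out)
{
    if (!atomic_load_explicit(&slot->in_use, memory_order_acquire))
        return false;
    *out = slot->payload;
    return true;
}
```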

It looks like back in 9.6 when ForgetBackgroundWorker was first
written, there might have been some case for not needing a
barrier there, but I'm not very convinced of that --- the fact
that the load of bgw_notify_pid is in the caller doesn't seem
to guarantee no memory ordering problem. So patch 9.6 too.

It's likely that this doesn't fix any observable bug on Intel
hardware, but machines with weaker memory ordering rules could
have problems here.

Discussion: https://postgr.es/m/4046084.1620244003@sss.pgh.pa.us

Prevent infinite insertion loops in spgdoinsert().

Formerly we just relied on operator classes that assert longValuesOK
to eventually shorten the leaf value enough to fit on an index page.
That fails since the introduction of INCLUDE-column support (commit
09c1c6ab4), because the INCLUDE columns might alone take up more
than a page, meaning no amount of leaf-datum compaction will get
the job done.  At least with spgtextproc.c, that leads to an infinite
loop, since spgtextproc.c won't throw an error for not being able
to shorten the leaf datum anymore.

To fix without breaking cases that would otherwise work, add logic
to spgdoinsert() to verify that the leaf tuple size is decreasing
after each "choose" step.  Some opclasses might not decrease the
size on every single cycle, and in any case, alignment roundoff
of the tuple size could obscure small gains.  Therefore, allow
up to 10 cycles without additional savings before throwing an
error.  (Perhaps this number will need adjustment, but it seems
quite generous right now.)
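
One way to picture the progress check is the sketch below. It is an
interpretation of the description above, not the logic in spgdoinsert():
ShrinkTracker and its functions are hypothetical names, and "up to 10
cycles" is read here as tolerating ten consecutive cycles without a
smaller size and giving up on the eleventh.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_STALLED_CYCLES 10  /* the "10 cycles" from the text */

/* Hypothetical bookkeeping for one insertion attempt. */
typedef struct ShrinkTracker
{
    size_t      best_size;      /* smallest leaf tuple size seen so far */
    int         stalled_cycles; /* consecutive cycles without savings */
} ShrinkTracker;

void
shrink_tracker_init(ShrinkTracker *t, size_t initial_size)
{
    t->best_size = initial_size;
    t->stalled_cycles = 0;
}

/*
 * Record the tuple size after one "choose" step.  Returns true if the
 * caller may keep looping, false if it should give up and report an error.
 */
bool
shrink_tracker_step(ShrinkTracker *t, size_t new_size)
{
    if (new_size < t->best_size)
    {
        t->best_size = new_size;    /* real progress: reset the counter */
        t->stalled_cycles = 0;
        return true;
    }
    t->stalled_cycles++;
    return t->stalled_cycles <= DEMO_MAX_STALLED_CYCLES;
}
```

Counting only consecutive stalled cycles lets an opclass that shrinks the
datum every few steps continue, while a loop that never makes progress
ends after a bounded number of iterations.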

As long as we've developed this logic, let's back-patch it.
The back branches don't have INCLUDE columns to worry about, but
this seems like a good defense against possible bugs in operator
classes.  We already know that an infinite loop here is pretty
unpleasant, so having a defense seems to outweigh the risk of
breaking things.  (Note that spgtextproc.c is actually the only
known opclass with longValuesOK support, so that this is all moot
for known non-core opclasses anyway.)

Per report from Dilip Kumar.

Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com

Fix query-cancel handling in spgdoinsert().

Knowing that a buggy opclass could cause an infinite insertion loop,
spgdoinsert() intended to allow its loop to be interrupted by query
cancel.  However, that never actually worked, because in iterations
after the first, we'd be holding buffer lock(s) which would cause
InterruptHoldoffCount to be positive, preventing servicing of the
interrupt.

To fix, check if an interrupt is pending, and if so fall out of
the insertion loop and service the interrupt after we've released
the buffers.  If it was indeed a query cancel, that's the end of
the matter.  If it was a non-canceling interrupt reason, make use
of the existing provision to retry the whole insertion.  (This isn't
as wasteful as it might seem, since any upper-level index tuples we
already created should be usable in the next attempt.)

While there's no known instance of such a bug in existing release
branches, it still seems like a good idea to back-patch this to
all supported branches, since the behavior is fairly nasty if a
loop does happen --- not only is it uncancelable, but it will
quickly consume memory to the point of an OOM failure.  In any
case, this code is certainly not working as intended.

Per report from Dilip Kumar.

Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com

Refactor CHECK_FOR_INTERRUPTS() to add flexibility.

Split up CHECK_FOR_INTERRUPTS() to provide an additional macro
INTERRUPTS_PENDING_CONDITION(), which just tests whether an
interrupt is pending without attempting to service it. This is
useful in situations where the caller knows that interrupts are
blocked, and would like to find out if it's worth the trouble
to unblock them.

Also add INTERRUPTS_CAN_BE_PROCESSED(), which indicates whether
CHECK_FOR_INTERRUPTS() can be relied on to clear the pending interrupt.
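
The split can be pictured with the toy model below. All names carry a
demo prefix and are hypothetical; the variables and macro bodies are
simplified assumptions that mirror the roles described above, not the
definitions in PostgreSQL's headers.

```c
#include <stdbool.h>

/* Toy state: an interrupt flag and a holdoff counter (both assumed). */
bool        demo_interrupt_pending = false;
int         demo_holdoff_count = 0;
int         demo_serviced = 0;

/* Test only: is an interrupt pending?  Does not service it. */
#define DEMO_INTERRUPTS_PENDING() (demo_interrupt_pending)

/* Would servicing actually clear the pending interrupt right now? */
#define DEMO_INTERRUPTS_CAN_BE_PROCESSED() (demo_holdoff_count == 0)

/* Service the interrupt, unless interrupts are currently held off. */
void
demo_process_interrupts(void)
{
    if (!DEMO_INTERRUPTS_CAN_BE_PROCESSED())
        return;                 /* held off: the interrupt stays pending */
    demo_interrupt_pending = false;
    demo_serviced++;
}

/* The combined check: test the condition, then try to service. */
#define DEMO_CHECK_FOR_INTERRUPTS() \
    do { \
        if (DEMO_INTERRUPTS_PENDING()) \
            demo_process_interrupts(); \
    } while (0)
```

A caller that knows interrupts are held off can test
DEMO_INTERRUPTS_PENDING() alone and, if it is true, release whatever
causes the holdoff before running the full check.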

This commit doesn't actually add any uses of the new macros,
but a follow-on bug fix will do so. Back-patch to all supported
branches to provide infrastructure for that fix.

Alvaro Herrera and Tom Lane

Discussion: https://postgr.es/m/20210513155351 [email protected]

Stamp 9.6.22.

Last-minute updates for release notes.

Security: CVE-2021-32027, CVE-2021-32028, CVE-2021-32029

Fix mishandling of resjunk columns in ON CONFLICT ... UPDATE tlists.

It's unusual to have any resjunk columns in an ON CONFLICT ... UPDATE
list, but it can happen when MULTIEXPR_SUBLINK SubPlans are present.
If it happens, the ON CONFLICT UPDATE code path would end up storing
tuples that include the values of the extra resjunk columns.  That's
fairly harmless in the short run, but if new columns are added to
the table then the values would become accessible, possibly leading
to malfunctions if they don't match the datatypes of the new columns.

This had escaped notice through a confluence of missing sanity checks,
including

* There's no cross-check that a tuple presented to heap_insert or
heap_update matches the table rowtype.  While it's difficult to
check that fully at reasonable cost, we can easily add assertions
that there aren't too many columns.

* The output-column-assignment cases in execExprInterp.c lacked
any sanity checks on the output column numbers, which seems like
an oversight considering there are plenty of assertion checks on
input column numbers.  Add assertions there too.

* We failed to apply nodeModifyTable's ExecCheckPlanOutput() to
the ON CONFLICT UPDATE tlist.  That wouldn't have caught this
specific error, since that function is chartered to ignore resjunk
columns; but it sure seems like a bad omission now that we've seen
this bug.

In HEAD, the right way to fix this is to make the processing of
ON CONFLICT UPDATE tlists work the same as regular UPDATE tlists
now do, that is don't add "SET x = x" entries, and use
ExecBuildUpdateProjection to evaluate the tlist and combine it with
old values of the not-set columns.  This adds a little complication
to ExecBuildUpdateProjection, but allows removal of a comparable
amount of now-dead code from the planner.

In the back branches, the most expedient solution seems to be to
(a) use an output slot for the ON CONFLICT UPDATE projection that
actually matches the target table, and then (b) invent a variant of
ExecBuildProjectionInfo that can be told to not store values resulting
from resjunk columns, so it doesn't try to store into nonexistent
columns of the output slot.  (We can't simply ignore the resjunk columns
altogether; they have to be evaluated for MULTIEXPR_SUBLINK to work.)
This works back to v10.  In 9.6, projections work much differently and
we can't cheaply give them such an option.  The 9.6 version of this
patch works by inserting a JunkFilter when it's necessary to get rid
of resjunk columns.

In addition, v11 and up have the reverse problem when trying to
perform ON CONFLICT UPDATE on a partitioned table.  Through a
further oversight, adjust_partition_tlist() discarded resjunk columns
when re-ordering the ON CONFLICT UPDATE tlist to match a partition.
This accidentally prevented the storing-bogus-tuples problem, but
at the cost that MULTIEXPR_SUBLINK cases didn't work, typically
crashing if more than one row has to be updated.  Fix by preserving
resjunk columns in that routine.  (I failed to resist the temptation
to add more assertions there too, and to do some minor code
beautification.)

Per report from Andres Freund.  Back-patch to all supported branches.

Security: CVE-2021-32028

Prevent integer overflows in array subscripting calculations.

While we were (mostly) careful about ensuring that the dimensions of
arrays aren't large enough to cause integer overflow, the lower bound
values were generally not checked.  This allows situations where
lower_bound + dimension overflows an integer.  It seems that that's
harmless so far as array reading is concerned, except that array
elements with subscripts notionally exceeding INT_MAX are inaccessible.
However, it confuses various array-assignment logic, resulting in a
potential for memory stomps.

Fix by adding checks that array lower bounds aren't large enough to
cause lower_bound + dimension to overflow.  (Note: this results in
disallowing cases where the last subscript position would be exactly
INT_MAX.  In principle we could probably allow that, but there's a lot
of code that computes lower_bound + dimension and would need adjustment.
It seems doubtful that it's worth the trouble/risk to allow it.)
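
A check of this kind can be written without ever computing the
overflowing sum. The sketch below is illustrative only; the function
name array_bounds_fit is hypothetical and is not taken from the patch.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Return true if lower_bound + dimension can be computed as an int
 * without overflow.  The comparison is rearranged so that the sum itself
 * is never formed, since signed overflow is undefined behavior in C.
 */
bool
array_bounds_fit(int lower_bound, int dimension)
{
    if (dimension < 0)
        return false;           /* negative dimensions are never valid */
    return lower_bound <= INT_MAX - dimension;
}
```

Under this rule a one-element array with lower bound INT_MAX is
rejected, because its last subscript would be exactly INT_MAX, while the
same array with lower bound INT_MAX - 1 is accepted.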

Somewhat independently of that, array_set_element() was careless
about possible overflow when checking the subscript of a fixed-length
array, creating a different route to memory stomps.  Fix that too.

Security: CVE-2021-32027

Translation updates

Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 9ff8d81b53760d6603761384e52e7c643cf88b3a

Release notes for 13.3, 12.7, 11.12, 10.17, 9.6.22.

Document lock level used by ALTER TABLE VALIDATE CONSTRAINT

Backpatch all the way back to 9.6.

Author: Simon Riggs <[email protected]>
Discussion: https://postgr.es/m/CANbhV-EwxvdhHuOLdfG2ciYrHOHXV=mm6=fD5aMhqcH09Li3Tg@mail.gmail.com

Doc: add an example of a self-referential foreign key to ddl.sgml.

While we've always allowed such cases, the documentation didn't
say you could do it.

Discussion: https://postgr.es/m/161969805833.690.13680986983883602407@wrigleys.postgresql.org

Doc: update libpq's documentation for PQfn().

Mention specifically that you can't call aggregates, window functions,
or procedures this way (the inability to call SRFs was already
mentioned).

Also, the claim that PQfn doesn't support NULL arguments or results
has been a lie since we invented protocol 3.0. Not sure why this
text was never updated for that, but do it now.

Discussion: https://postgr.es/m/2039442.1615317309@sss.pgh.pa.us

Disallow calling anything but plain functions via the fastpath API.

Reject aggregates, window functions, and procedures.  Aggregates
failed anyway, though with a somewhat obscure error message.
Window functions would hit an Assert or null-pointer dereference.
Procedures seemed to work as long as you didn't try to do
transaction control, but (a) transaction control is sort of the
point of a procedure, and (b) it's not entirely clear that no
bugs lurk in that path.  Given the lack of testing of this area,
it seems safest to be conservative in what we support.

Also reject proretset functions, as the fastpath protocol can't
support returning a set.

Also remove an easily-triggered assertion that the given OID
isn't 0; the subsequent lookups can handle that case themselves.

Per report from Theodor-Arsenij Larionov-Trichkin.
Back-patch to all supported branches.  (The procedure angle
only applies in v11+, of course.)

Discussion: https://postgr.es/m/2039442.1615317309@sss.pgh.pa.us

Fix some more omissions in pg_upgrade's tests for non-upgradable types.

Commits 29aeda6e4 et al closed up some oversights involving not checking
for non-upgradable types within container types, such as arrays and
ranges.  However, I only looked at version.c, failing to notice that
there were substantially-equivalent tests in check.c.  (The division
of responsibility between those files is less than clear...)

In addition, because genbki.pl does not guarantee that auto-generated
rowtype OIDs will hold still across versions, we need to consider that
the composite type associated with a system catalog or view is
non-upgradable.  It seems unlikely that someone would have a user
column declared that way, but if they did, trying to read it in another
PG version would likely draw "no such pg_type OID" failures, thanks
to the type OID embedded in composite Datums.

To support the composite and reg*-type cases, extend the recursive
query that does the search to allow any base query that returns
a column of pg_type OIDs, rather than limiting it to exactly one
starting type.

As before, back-patch to all supported branches.

Discussion: https://postgr.es/m/2798740.1619622555@sss.pgh.pa.us

Doc: fix discussion of how to get real Julian Dates.

Somehow I'd convinced myself that rotating to UTC-12 was the way
to do this, but upon further review, it's definitely UTC+12.

Discussion: https://postgr.es/m/1197050.1619123213@sss.pgh.pa.us

Fix use-after-release issue with pg_identify_object_as_address()

Spotted by buildfarm member prion, with -DRELCACHE_FORCE_RELEASE.

Introduced in f7aab36.

Discussion: https://postgr.es/m/2759018.1619577848@sss.pgh.pa.us
Backpatch-through: 9.6

Fix pg_identify_object_as_address() with event triggers

Attempting to use this function with event triggers failed, as, since
its introduction in a676201, this code has never associated an object
name with event triggers. This addresses the failure by adding the
event trigger name to the set defining its object address.

Note that regression tests are added within event_trigger and not
object_address to avoid issues with concurrent connections in parallel
schedules.

Author: Joel Jacobson
Discussion: https://postgr.es/m/3c905e77-a026-46ae-8835-c3f6cd1d24c8@www.fastmail.com
Backpatch-through: 9.6

Doc: document EXTRACT(JULIAN ...), improve Julian Date explanation.

For some reason, the "julian" option for extract()/date_part() has
never gotten listed in the manual. Also, while Appendix B mentioned
in passing that we don't conform to the usual astronomical definition
that a Julian date starts at noon UTC, it was kind of vague about what
we do instead. Clarify that, and add an example showing how to get
the astronomical definition if you want it.

It's been like this for ages, so back-patch to all supported branches.

Discussion: https://postgr.es/m/1197050.1619123213@sss.pgh.pa.us

fix silly perl error in commit d064afc720

Only ever test for non-127.0.0.1 addresses on Windows in PostgresNode

This has been found to cause hangs where tcp usage is forced.

Alexey Kondratov

Discussion: https://postgr.es/m/82e271a9a11928337fcb5b5e57b423c0@postgrespro.ru

Backpatch to all live branches

Allow TestLib::slurp_file to skip contents, and use as needed

In order to avoid getting old logfile contents certain functions in
PostgresNode were doing one of two things. On Windows it rotated the
logfile and restarted the server, while elsewhere it truncated the log
file. Both of these are unnecessary. We borrow from the buildfarm which
does this instead: note the size of the logfile before we start, and
then when fetching the logfile skip to that position before accumulating
contents. This is spelled differently on Windows but the effect is the
same. This is largely centralized in TestLib's slurp_file function,
which has a new optional parameter, the offset to skip to before
starting to read the file. Code in the client becomes much neater.

Backpatch to all live branches.

Michael Paquier, slightly modified by me.

Discussion: https://postgr.es/m/[email protected]

Fix some inappropriately-disallowed uses of ALTER ROLE/DATABASE SET.

Most GUC check hooks that inspect database state have special checks
that prevent them from throwing hard errors for state-dependent issues
when source == PGC_S_TEST.  This allows, for example,
"ALTER DATABASE d SET default_text_search_config = foo" when the "foo"
configuration hasn't been created yet.  Without this, we have problems
during dump/reload or pg_upgrade, because pg_dump has no idea about
possible dependencies of GUC values and can't ensure a safe restore
ordering.
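
The pattern can be modelled roughly as below. This is a toy Python model,
not the server's C code: the hook name, the exception class and the way
notices are collected are all assumptions made for illustration.

```python
from enum import Enum, auto


class GucSource(Enum):
    PGC_S_TEST = auto()     # value is only being tested, e.g. by ALTER ... SET
    PGC_S_SESSION = auto()  # value is actually being applied


class GucCheckError(Exception):
    """Stands in for a hard error raised by a check hook."""


def check_text_search_config(name, existing_configs, source, notices):
    """Toy check hook: accept an unknown name only when just testing."""
    if name in existing_configs:
        return True
    message = f'text search configuration "{name}" does not exist'
    if source is GucSource.PGC_S_TEST:
        # State-dependent problem: report it softly and accept the value.
        notices.append(message)
        return True
    raise GucCheckError(message)


notices = []
accepted = check_text_search_config("foo", {"english"}, GucSource.PGC_S_TEST, notices)
print(accepted, notices)
```

With any source other than PGC_S_TEST the same call raises the hard error.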

However, check_role() and check_session_authorization() hadn't gotten
the memo about that, and would throw hard errors anyway.  It's not
entirely clear what is the use-case for "ALTER ROLE x SET role = y",
but we've now heard two independent complaints about that bollixing
an upgrade, so apparently some people are doing it.

Hence, fix these two functions to act more like other check hooks
with similar needs.  (But I did not change their insistence on
being inside a transaction, as it's still not apparent that setting
either GUC from the configuration file would be wise.)

Also fix check_temp_buffers, which had a different form of the disease
of making state-dependent checks without any exception for PGC_S_TEST.
A cursory survey of other GUC check hooks did not find any more issues
of this ilk.  (There are a lot of interdependencies among
PGC_POSTMASTER and PGC_SIGHUP GUCs, which may be a bad idea, but
they're not relevant to the immediate concern because they can't be
set via ALTER ROLE/DATABASE.)

Per reports from Charlie Hornsby and Nathan Bossart.  Back-patch
to all supported branches.

Discussion: https://postgr.es/m/HE1P189MB0523B31598B0C772C908088DB7709@HE1P189MB0523.EURP189.PROD.OUTLOOK.COM
Discussion: https://postgr.es/m/20160711223641 [email protected]

Use "-I." in directories holding Bison parsers, for Oracle compilers.

With the Oracle Developer Studio 12.6 compiler, #line directives alter
the current source file location for purposes of #include "..."
directives.  Hence, a VPATH build failed with 'cannot find include file:
"specscanner.c"'.  With two exceptions, parser-containing directories
already add "-I. -I$(srcdir)"; eliminate the exceptions.  Back-patch to
9.6 (all supported versions).

Port regress-python3-mangle.mk to Solaris "sed".

It doesn't support "$foo$*" like a POSIX "sed" implementation does;
see the Autoconf manual. Back-patch to 9.6 (all supported versions).

Fix old bug with coercing the result of a COLLATE expression.

There are hacks in parse_coerce.c to push down a requested coercion
to below any CollateExpr that may appear.  However, we did that even
if the requested data type is non-collatable, leading to an invalid
expression tree in which CollateExpr is applied to a non-collatable
type.  The fix is just to drop the CollateExpr altogether, reasoning
that it's useless.
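
A toy model of the rule, using a hypothetical miniature expression tree in
Python rather than the real node types in parse_coerce.c (the class layout
and the set of collatable types are assumptions of this sketch):

```python
from dataclasses import dataclass


@dataclass
class Const:
    type_name: str
    value: str


@dataclass
class Coerce:
    arg: object
    target_type: str


@dataclass
class CollateExpr:
    arg: object
    collation: str


# Assumed set of collatable types, for the purposes of this sketch only.
COLLATABLE_TYPES = {"text", "varchar"}


def coerce(node, target_type):
    """Push the coercion below any CollateExpr; drop it if it is useless."""
    if isinstance(node, CollateExpr):
        inner = coerce(node.arg, target_type)
        if target_type in COLLATABLE_TYPES:
            return CollateExpr(inner, node.collation)
        # Non-collatable result type: the COLLATE wrapper is dropped.
        return inner
    return Coerce(node, target_type)


expr = CollateExpr(Const("text", "1"), "C")
print(coerce(expr, "int4"))
print(coerce(expr, "text"))
```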

This bug is ten years old, dating to the original addition of
COLLATE support.  The lack of field complaints suggests that there
aren't a lot of user-visible consequences.  We noticed the problem
because it would trigger an assertion in DefineVirtualRelation if
the invalid structure appears as an output column of a view; however,
in a non-assert build, you don't see a crash, just a (subtly incorrect)
complaint about applying collation to a non-collatable type.  I found
that by putting the incorrect structure further down in a view, I could
make a view definition that would fail dump/reload, per the added
regression test case.  But CollateExpr doesn't do anything at run-time,
so this likely doesn't lead to any really exciting consequences.

Per report from Yulin Pei.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/HK0PR01MB22744393C474D503E16C8509F4709@HK0PR01MB2274.apcprd01.prod.exchangelabs.com

Fix out-of-bound memory access for interval -> char conversion

Using Roman numbers (via "RM" or "rm") for a conversion to calculate a
number of months has never considered the case of negative numbers,
where a conversion could easily cause out-of-bound memory accesses. The
conversions in themselves were not completely consistent either, as
specifying 12 would result in NULL, but it should mean XII.

This commit reworks the conversion calculation to have a more
consistent behavior:
- If the number of months and years is 0, return NULL.
- If the number of months is positive, return the exact month number.
- If the number of months is negative, do a backward calculation, with
-1 meaning December, -2 November, etc.
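
The rules above can be sketched like this in Python (an illustration only,
not the C code in the patch). Treating "zero months with a positive number
of years" as XII is this sketch's reading of the "12 should mean XII"
remark; negative whole years are deliberately left out.

```python
ROMAN_MONTHS = ["I", "II", "III", "IV", "V", "VI",
                "VII", "VIII", "IX", "X", "XI", "XII"]


def roman_month(months, years=0):
    """Map an interval's month field (-11..11) to a Roman numeral."""
    if not -11 <= months <= 11:
        raise ValueError("months must be within -11..11")
    if months == 0 and years == 0:
        return None                      # stands in for SQL NULL
    if months > 0:
        return ROMAN_MONTHS[months - 1]  # exact month number
    if months < 0:
        # Backward calculation: -1 is December, -2 is November, etc.
        return ROMAN_MONTHS[12 + months]
    if years > 0:
        return "XII"                     # assumption: 12 months means XII
    raise ValueError("negative whole years are not covered by this sketch")


print(roman_month(3))      # III
print(roman_month(-1))     # XII
print(roman_month(-2))     # XI
print(roman_month(0, 0))   # None
```

Indexing from the end of the list keeps every negative input inside the
array, which is the property the original code lacked.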

Reported-by: Theodor Arsenij Larionov-Trichkin
Author: Julien Rouhaud
Discussion: https://postgr.es/m/16953-f255a18f8c51f1d5@postgresql.org
backpatch-through: 9.6

Fix typo

Author: Daniel Westermann
Backpatch-through: 9.6
Discussion: https://postgr.es/m/GV0P278MB0483A7AA85BAFCC06D90F453D2739@GV0P278MB0483.CHEP278.PROD.OUTLOOK.COM

Fix typos and grammar in documentation and code comments

Comment fixes are applied on HEAD, and documentation improvements are
applied on back-branches where needed.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20210408164008 [email protected]
Backpatch-through: 9.6

Don't add non-existent pages to bitmap from BRIN

The code in bringetbitmap() simply added the whole matching page range
to the TID bitmap, as determined by pages_per_range, even if some of the
pages were beyond the end of the heap. The query then might fail with
an error like this:

  ERROR:  could not open file "base/20176/20228.2" (target block
          262144): previous segment is only 131021 blocks

In this case, the relation has 262093 pages (131072 and 131021 pages),
but we're trying to access block 262144, i.e. first block of the 3rd
segment. At that point _mdfd_getseg() notices the preceding segment is
incomplete, and fails.

Hitting this in practice is rather unlikely, because:

* Most indexes use power-of-two ranges, so segments and page ranges
  align perfectly (segment end is also a page range end).

* The table size has to be just right, with the last segment being
  almost full - less than one page range from a full segment, so that the
  last page range actually crosses the segment boundary.

* Prefetch has to be enabled. The regular page access checks that
  pages are not beyond heap end, but prefetch does not. On older
  releases (before 12) the execution stops after hitting the first
  non-existent page, so the prefetch distance has to be sufficient
  to reach the first page in the next segment to trigger the issue.
  Since 12 it's enough to just have prefetch enabled; the prefetch
  distance does not matter.

Fixed by not adding non-existent pages to the TID bitmap. Backpatch
all the way back to 9.6 (BRIN indexes were introduced in 9.5, but that
release is EOL).
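
A minimal sketch of the clamping idea, in Python rather than the C of
bringetbitmap(). The function name is made up, and pages_per_range = 1000
is a hypothetical non-power-of-two setting chosen so that the last range
crosses the segment boundary; the block counts come from the example above.

```python
def existing_pages_in_range(range_start, pages_per_range, nblocks):
    """Pages of one BRIN range that actually exist in the heap."""
    range_end = min(range_start + pages_per_range, nblocks)
    return range(range_start, range_end)


SEGMENT_BLOCKS = 131072
nblocks = SEGMENT_BLOCKS + 131021   # 262093 pages, as in the report
pages_per_range = 1000              # hypothetical setting

# Start of the page range that contains the last existing heap page.
last_range_start = ((nblocks - 1) // pages_per_range) * pages_per_range

unclamped = range(last_range_start, last_range_start + pages_per_range)
clamped = existing_pages_in_range(last_range_start, pages_per_range, nblocks)

print(262144 in unclamped)  # True: the old behaviour reaches the 3rd segment
print(clamped[-1])          # 262092: the last page that really exists
```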

Backpatch-through: 9.6