git.postgresql.org Git - postgresql.git/log

Back-port a few PostgresNode.pm methods.

The 'lsn' and 'wait_for_catchup' methods only exist in v10 and
higher, but are needed in order to support a test planned test
case for a bug that exists all the way back to v9.6. To minimize
cross-branch differences in the test case, back-port these
methods.

Discussion: http://postgr.es/m/CA+TgmoaG5dmA_8Xc1WvbvftPjtwx5uzkGEHxE7MiJ+im9jynmw@mail.gmail.com

Fix inconsistencies in psql --help=commands

The set of subcommands supported by \dAp, \do and \dy was described
incorrectly in psql's --help. The documentation was already consistent
with the code.

Reported-by: inoas, from IRC
Author: Matthijs van der Vleuten
Reviewed-by: Neil Chen
Discussion: https://postgr.es/m/6a984e24-2171-4039-9050-92d55e7b23fe@www.fastmail.com
Backpatch-through: 9.6

Fix incautious handling of possibly-miscoded strings in client code.

An incorrectly-encoded multibyte character near the end of a string
could cause various processing loops to run past the string's
terminating NUL, with results ranging from no detectable issue to
a program crash, depending on what happens to be in the following
memory.

This isn't an issue in the server, because we take care to verify
the encoding of strings before doing any interesting processing
on them.  However, that lack of care leaked into client-side code
which shouldn't assume that anyone has validated the encoding of
its input.

Although this is certainly a bug worth fixing, the PG security team
elected not to regard it as a security issue, primarily because
any untrusted text should be sanitized by PQescapeLiteral or
the like before being incorporated into a SQL or psql command.
(If an app fails to do so, the same technique can be used to
cause SQL injection, with probably much more dire consequences
than a mere client-program crash.)  Those functions were already
made proof against this class of problem, cf CVE-2006-2313.

To fix, invent PQmblenBounded() which is like PQmblen() except it
won't return more than the number of bytes remaining in the string.
In HEAD we can make this a new libpq function, as PQmblen() is.
It seems imprudent to change libpq's API in stable branches though,
so in the back branches define PQmblenBounded as a macro in the files
that need it.  (Note that just changing PQmblen's behavior would not
be a good idea; notably, it would completely break the escaping
functions' defense against this exact problem.  So we just want a
version for those callers that don't have any better way of handling
this issue.)

Per private report from houjingyi.  Back-patch to all supported branches.

Support use of strnlen() in pre-v11 branches.

Back-patch a minimal subset of commits fffd651e8 and 46912d9b1,
to support strnlen() on all platforms without adding any callers.
This will be needed by a following bug fix.

In PostgresNode.pm, don't pass SQL to psql on the command line

The Msys shell mangles certain patterns in its command line, so avoid
handing arbitrary SQL to psql on the command line and instead use
IPC::Run's redirection facility for stdin. This pattern is already
mostly whats used, but query_poll_until() was not doing the right thing.

Problem discovered on the buildfarm when a new TAP test failed on msys.

Reduce risks of conflicts in internal queries of REFRESH MATVIEW CONCURRENTLY

The internal SQL queries used by REFRESH MATERIALIZED VIEW CONCURRENTLY
include some aliases for its diff and temporary relations with
rather-generic names: diff, newdata, newdata2 and mv.  Depending on the
queries used for the materialized view, using CONCURRENTLY could lead to
some internal failures if the query and those internal aliases conflict.

Those names have been chosen in 841c29c8.  This commit switches instead
to a naming pattern which is less likely going to cause conflicts, based
on an idea from Thomas Munro, by appending _$ to those aliases.  This is
not perfect as those new names could still conflict, but at least it has
the advantage to keep the code readable and simple while reducing the
likelihood of conflicts to be close to zero.

Reported-by: Mathis Rudolf
Author: Bharath Rupireddy
Reviewed-by: Bernd Helmle, Thomas Munro, Michael Paquier
Discussion: https://postgr.es/m/109c267a-10d2-3c53-b60e-720fcf44d9e8@credativ.de
Backpatch-through: 9.6

Ignore more environment variables in TAP tests

Various environment variables were not getting reset in the TAP tests,
which would cause failures depending on the tests or the environment
variables involved.  For example, PGSSL{MAX,MIN}PROTOCOLVERSION could
cause failures in the SSL tests.  Even worse, a junk value of
PGCLIENTENCODING makes a server startup fail.  The list of variables
reset is adjusted in each stable branch depending on what is supported.

While on it, simplify a bit the code per a suggestion from Andrew
Dunstan, using a list of variables instead of doing single deletions.

Reviewed-by: Andrew Dunstan, Daniel Gustafsson
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 9.6

Reject SELECT ... GROUP BY GROUPING SETS (()) FOR UPDATE.

This case should be disallowed, just as FOR UPDATE with a plain
GROUP BY is disallowed; FOR UPDATE only makes sense when each row
of the query result can be identified with a single table row.
However, we missed teaching CheckSelectLocking() to check
groupingSets as well as groupClause, so that it would allow
degenerate grouping sets. That resulted in a bad plan and
a null-pointer dereference in the executor.

Looking around for other instances of the same bug, the only one
I found was in examine_simple_variable(). That'd just lead to
silly estimates, but it should be fixed too.

Per private report from Yaoguang Chen.
Back-patch to all supported branches.

fix syntax error

Report configured port in MSVC built pg_config

This is a long standing omission, discovered when trying to write code
that relied on it.

Backpatch to all live branches.

Fix MSVC scripts when building with GSSAPI/Kerberos

The deliverables of upstream Kerberos on Windows are installed with
paths that do not match our MSVC scripts. First, the include folder was
named "inc/" in our scripts, but the upstream MSIs use "include/".
Second, the build would fail with 64-bit environments as the libraries
are named differently.

This commit adjusts the MSVC scripts to be compatible with the latest
installations of upstream, and I have checked that the compilation was
able to work with the 32-bit and 64-bit installations.

Special thanks to Kondo Yuta for the help in investigating the situation
in hamerkop, which had an incorrect configuration for the GSS
compilation.

Reported-by: Brian Ye
Discussion: https://postgr.es/m/162128202219.27274.12616756784952017465@wrigleys.postgresql.org
Backpatch-through: 9.6

doc: Fix description of some GUCs in docs and postgresql.conf.sample

The following parameters have been imprecise, or incorrect, about their
description (PGC_POSTMASTER or PGC_SIGHUP):
- autovacuum_work_mem (docs, as of 9.6~)
- huge_page_size (docs, as of 14~)
- max_logical_replication_workers (docs, as of 10~)
- max_sync_workers_per_subscription (docs, as of 10~)
- min_dynamic_shared_memory (docs, as of 14~)
- recovery_init_sync_method (postgresql.conf.sample, as of 14~)
- remove_temp_files_after_crash (docs, as of 14~)
- restart_after_crash (docs, as of 9.6~)
- ssl_min_protocol_version (docs, as of 12~)
- ssl_max_protocol_version (docs, as of 12~)

This commit adjusts the description of all these parameters to be more
consistent with the practice used for the others.

Revewed-by: Justin Pryzby
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 9.6

Disallow SSL renegotiation

SSL renegotiation is already disabled as of 48d23c72, however this does
not prevent the server to comply with a client willing to use
renegotiation.  In the last couple of years, renegotiation had its set
of security issues and flaws (like the recent CVE-2021-3449), and it
could be possible to crash the backend with a client attempting
renegotiation.

This commit takes one extra step by disabling renegotiation in the
backend in the same way as SSL compression (f9264d15) or tickets
(97d3a0b0).  OpenSSL 1.1.0h has added an option named
SSL_OP_NO_RENEGOTIATION able to achieve that.  In older versions
there is an option called SSL3_FLAGS_NO_RENEGOTIATE_CIPHERS that
was undocumented, and could be set within the SSL object created when
the TLS connection opens, but I have decided not to use it, as it feels
trickier to rely on, and it is not official.  Note that this option is
not usable in OpenSSL < 1.1.0h as the internal contents of the *SSL
object are hidden to applications.

SSL renegotiation concerns protocols up to TLSv1.2.

Per original report from Robert Haas, with a patch based on a suggestion
by Andres Freund.

Author: Michael Paquier
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 9.6

Clean up cpluspluscheck violation.

"typename" is a C++ keyword, so pg_upgrade.h fails to compile in C++.
Fortunately, there seems no likely reason for somebody to need to
do that. Nonetheless, it's project policy that all .h files should
pass cpluspluscheck, so rename the argument to fix that.

Oversight in 57c081de0; back-patch as that was. (The policy requiring
pg_upgrade.h to pass cpluspluscheck only goes back to v12, but it
seems best to keep this code looking the same in all branches.)

Fix typo and outdated information in README.barrier

README.barrier didn't seem to get the memo when atomics were added. Fix
that.

Author: Tatsuo Ishii, David Rowley
Discussion: https://postgr.es/m/20210516.211133.2159010194908437625.t-ishii%40sraoss.co.jp
Backpatch-through: 9.6, oldest supported release

Be more careful about barriers when releasing BackgroundWorkerSlots.

ForgetBackgroundWorker lacked any memory barrier at all, while
BackgroundWorkerStateChange had one but unaccountably did
additional manipulation of the slot after the barrier. AFAICS,
the rule must be that the barrier is immediately before setting
or clearing slot->in_use.

It looks like back in 9.6 when ForgetBackgroundWorker was first
written, there might have been some case for not needing a
barrier there, but I'm not very convinced of that --- the fact
that the load of bgw_notify_pid is in the caller doesn't seem
to guarantee no memory ordering problem. So patch 9.6 too.

It's likely that this doesn't fix any observable bug on Intel
hardware, but machines with weaker memory ordering rules could
have problems here.

Discussion: https://postgr.es/m/4046084.1620244003@sss.pgh.pa.us

Prevent infinite insertion loops in spgdoinsert().

Formerly we just relied on operator classes that assert longValuesOK
to eventually shorten the leaf value enough to fit on an index page.
That fails since the introduction of INCLUDE-column support (commit
09c1c6ab4), because the INCLUDE columns might alone take up more
than a page, meaning no amount of leaf-datum compaction will get
the job done.  At least with spgtextproc.c, that leads to an infinite
loop, since spgtextproc.c won't throw an error for not being able
to shorten the leaf datum anymore.

To fix without breaking cases that would otherwise work, add logic
to spgdoinsert() to verify that the leaf tuple size is decreasing
after each "choose" step.  Some opclasses might not decrease the
size on every single cycle, and in any case, alignment roundoff
of the tuple size could obscure small gains.  Therefore, allow
up to 10 cycles without additional savings before throwing an
error.  (Perhaps this number will need adjustment, but it seems
quite generous right now.)

As long as we've developed this logic, let's back-patch it.
The back branches don't have INCLUDE columns to worry about, but
this seems like a good defense against possible bugs in operator
classes.  We already know that an infinite loop here is pretty
unpleasant, so having a defense seems to outweigh the risk of
breaking things.  (Note that spgtextproc.c is actually the only
known opclass with longValuesOK support, so that this is all moot
for known non-core opclasses anyway.)

Per report from Dilip Kumar.

Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com

Fix query-cancel handling in spgdoinsert().

Knowing that a buggy opclass could cause an infinite insertion loop,
spgdoinsert() intended to allow its loop to be interrupted by query
cancel.  However, that never actually worked, because in iterations
after the first, we'd be holding buffer lock(s) which would cause
InterruptHoldoffCount to be positive, preventing servicing of the
interrupt.

To fix, check if an interrupt is pending, and if so fall out of
the insertion loop and service the interrupt after we've released
the buffers.  If it was indeed a query cancel, that's the end of
the matter.  If it was a non-canceling interrupt reason, make use
of the existing provision to retry the whole insertion.  (This isn't
as wasteful as it might seem, since any upper-level index tuples we
already created should be usable in the next attempt.)

While there's no known instance of such a bug in existing release
branches, it still seems like a good idea to back-patch this to
all supported branches, since the behavior is fairly nasty if a
loop does happen --- not only is it uncancelable, but it will
quickly consume memory to the point of an OOM failure.  In any
case, this code is certainly not working as intended.

Per report from Dilip Kumar.

Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com

Refactor CHECK_FOR_INTERRUPTS() to add flexibility.

Split up CHECK_FOR_INTERRUPTS() to provide an additional macro
INTERRUPTS_PENDING_CONDITION(), which just tests whether an
interrupt is pending without attempting to service it. This is
useful in situations where the caller knows that interrupts are
blocked, and would like to find out if it's worth the trouble
to unblock them.

Also add INTERRUPTS_CAN_BE_PROCESSED(), which indicates whether
CHECK_FOR_INTERRUPTS() can be relied on to clear the pending interrupt.

This commit doesn't actually add any uses of the new macros,
but a follow-on bug fix will do so. Back-patch to all supported
branches to provide infrastructure for that fix.

Alvaro Herrera and Tom Lane

Discussion: https://postgr.es/m/20210513155351 [email protected]

Stamp 9.6.22.

Last-minute updates for release notes.

Security: CVE-2021-32027, CVE-2021-32028, CVE-2021-32029

Fix mishandling of resjunk columns in ON CONFLICT ... UPDATE tlists.

It's unusual to have any resjunk columns in an ON CONFLICT ... UPDATE
list, but it can happen when MULTIEXPR_SUBLINK SubPlans are present.
If it happens, the ON CONFLICT UPDATE code path would end up storing
tuples that include the values of the extra resjunk columns.  That's
fairly harmless in the short run, but if new columns are added to
the table then the values would become accessible, possibly leading
to malfunctions if they don't match the datatypes of the new columns.

This had escaped notice through a confluence of missing sanity checks,
including

* There's no cross-check that a tuple presented to heap_insert or
heap_update matches the table rowtype.  While it's difficult to
check that fully at reasonable cost, we can easily add assertions
that there aren't too many columns.

* The output-column-assignment cases in execExprInterp.c lacked
any sanity checks on the output column numbers, which seems like
an oversight considering there are plenty of assertion checks on
input column numbers.  Add assertions there too.

* We failed to apply nodeModifyTable's ExecCheckPlanOutput() to
the ON CONFLICT UPDATE tlist.  That wouldn't have caught this
specific error, since that function is chartered to ignore resjunk
columns; but it sure seems like a bad omission now that we've seen
this bug.

In HEAD, the right way to fix this is to make the processing of
ON CONFLICT UPDATE tlists work the same as regular UPDATE tlists
now do, that is don't add "SET x = x" entries, and use
ExecBuildUpdateProjection to evaluate the tlist and combine it with
old values of the not-set columns.  This adds a little complication
to ExecBuildUpdateProjection, but allows removal of a comparable
amount of now-dead code from the planner.

In the back branches, the most expedient solution seems to be to
(a) use an output slot for the ON CONFLICT UPDATE projection that
actually matches the target table, and then (b) invent a variant of
ExecBuildProjectionInfo that can be told to not store values resulting
from resjunk columns, so it doesn't try to store into nonexistent
columns of the output slot.  (We can't simply ignore the resjunk columns
altogether; they have to be evaluated for MULTIEXPR_SUBLINK to work.)
This works back to v10.  In 9.6, projections work much differently and
we can't cheaply give them such an option.  The 9.6 version of this
patch works by inserting a JunkFilter when it's necessary to get rid
of resjunk columns.

In addition, v11 and up have the reverse problem when trying to
perform ON CONFLICT UPDATE on a partitioned table.  Through a
further oversight, adjust_partition_tlist() discarded resjunk columns
when re-ordering the ON CONFLICT UPDATE tlist to match a partition.
This accidentally prevented the storing-bogus-tuples problem, but
at the cost that MULTIEXPR_SUBLINK cases didn't work, typically
crashing if more than one row has to be updated.  Fix by preserving
resjunk columns in that routine.  (I failed to resist the temptation
to add more assertions there too, and to do some minor code
beautification.)

Per report from Andres Freund.  Back-patch to all supported branches.

Security: CVE-2021-32028

Prevent integer overflows in array subscripting calculations.

While we were (mostly) careful about ensuring that the dimensions of
arrays aren't large enough to cause integer overflow, the lower bound
values were generally not checked.  This allows situations where
lower_bound + dimension overflows an integer.  It seems that that's
harmless so far as array reading is concerned, except that array
elements with subscripts notionally exceeding INT_MAX are inaccessible.
However, it confuses various array-assignment logic, resulting in a
potential for memory stomps.

Fix by adding checks that array lower bounds aren't large enough to
cause lower_bound + dimension to overflow.  (Note: this results in
disallowing cases where the last subscript position would be exactly
INT_MAX.  In principle we could probably allow that, but there's a lot
of code that computes lower_bound + dimension and would need adjustment.
It seems doubtful that it's worth the trouble/risk to allow it.)

Somewhat independently of that, array_set_element() was careless
about possible overflow when checking the subscript of a fixed-length
array, creating a different route to memory stomps.  Fix that too.

Security: CVE-2021-32027

Translation updates

Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 9ff8d81b53760d6603761384e52e7c643cf88b3a

Release notes for 13.3, 12.7, 11.12, 10.17, 9.6.22.

Document lock level used by ALTER TABLE VALIDATE CONSTRAINT

Backpatch all the way back to 9.6.

Author: Simon Riggs <[email protected]>
Discussion: https://postgr.es/m/CANbhV-EwxvdhHuOLdfG2ciYrHOHXV=mm6=fD5aMhqcH09Li3Tg@mail.gmail.com

Doc: add an example of a self-referential foreign key to ddl.sgml.

While we've always allowed such cases, the documentation didn't
say you could do it.

Discussion: https://postgr.es/m/161969805833.690.13680986983883602407@wrigleys.postgresql.org

Doc: update libpq's documentation for PQfn().

Mention specifically that you can't call aggregates, window functions,
or procedures this way (the inability to call SRFs was already
mentioned).

Also, the claim that PQfn doesn't support NULL arguments or results
has been a lie since we invented protocol 3.0. Not sure why this
text was never updated for that, but do it now.

Discussion: https://postgr.es/m/2039442.1615317309@sss.pgh.pa.us

Disallow calling anything but plain functions via the fastpath API.

Reject aggregates, window functions, and procedures.  Aggregates
failed anyway, though with a somewhat obscure error message.
Window functions would hit an Assert or null-pointer dereference.
Procedures seemed to work as long as you didn't try to do
transaction control, but (a) transaction control is sort of the
point of a procedure, and (b) it's not entirely clear that no
bugs lurk in that path.  Given the lack of testing of this area,
it seems safest to be conservative in what we support.

Also reject proretset functions, as the fastpath protocol can't
support returning a set.

Also remove an easily-triggered assertion that the given OID
isn't 0; the subsequent lookups can handle that case themselves.

Per report from Theodor-Arsenij Larionov-Trichkin.
Back-patch to all supported branches.  (The procedure angle
only applies in v11+, of course.)

Discussion: https://postgr.es/m/2039442.1615317309@sss.pgh.pa.us

Fix some more omissions in pg_upgrade's tests for non-upgradable types.

Commits 29aeda6e4 et al closed up some oversights involving not checking
for non-upgradable types within container types, such as arrays and
ranges.  However, I only looked at version.c, failing to notice that
there were substantially-equivalent tests in check.c.  (The division
of responsibility between those files is less than clear...)

In addition, because genbki.pl does not guarantee that auto-generated
rowtype OIDs will hold still across versions, we need to consider that
the composite type associated with a system catalog or view is
non-upgradable.  It seems unlikely that someone would have a user
column declared that way, but if they did, trying to read it in another
PG version would likely draw "no such pg_type OID" failures, thanks
to the type OID embedded in composite Datums.

To support the composite and reg*-type cases, extend the recursive
query that does the search to allow any base query that returns
a column of pg_type OIDs, rather than limiting it to exactly one
starting type.

As before, back-patch to all supported branches.

Discussion: https://postgr.es/m/2798740.1619622555@sss.pgh.pa.us

Doc: fix discussion of how to get real Julian Dates.

Somehow I'd convinced myself that rotating to UTC-12 was the way
to do this, but upon further review, it's definitely UTC+12.

Discussion: https://postgr.es/m/1197050.1619123213@sss.pgh.pa.us

Fix use-after-release issue with pg_identify_object_as_address()

Spotted by buildfarm member prion, with -DRELCACHE_FORCE_RELEASE.

Introduced in f7aab36.

Discussion: https://postgr.es/m/2759018.1619577848@sss.pgh.pa.us
Backpatch-through: 9.6

Fix pg_identify_object_as_address() with event triggers

Attempting to use this function with event triggers failed, as, since
its introduction in a676201, this code has never associated an object
name with event triggers. This addresses the failure by adding the
event trigger name to the set defining its object address.

Note that regression tests are added within event_trigger and not
object_address to avoid issues with concurrent connections in parallel
schedules.

Author: Joel Jacobson
Discussion: https://postgr.es/m/3c905e77-a026-46ae-8835-c3f6cd1d24c8@www.fastmail.com
Backpatch-through: 9.6

Doc: document EXTRACT(JULIAN ...), improve Julian Date explanation.

For some reason, the "julian" option for extract()/date_part() has
never gotten listed in the manual. Also, while Appendix B mentioned
in passing that we don't conform to the usual astronomical definition
that a Julian date starts at noon UTC, it was kind of vague about what
we do instead. Clarify that, and add an example showing how to get
the astronomical definition if you want it.

It's been like this for ages, so back-patch to all supported branches.

Discussion: https://postgr.es/m/1197050.1619123213@sss.pgh.pa.us

fix silly perl error in commit d064afc720

Only ever test for non-127.0.0.1 addresses on Windows in PostgresNode

This has been found to cause hangs where tcp usage is forced.

Alexey Kodratov

Discussion: https://postgr.es/m/82e271a9a11928337fcb5b5e57b423c0@postgrespro.ru

Backpatch to all live branches

Allow TestLib::slurp_file to skip contents, and use as needed

In order to avoid getting old logfile contents certain functions in
PostgresNode were doing one of two things. On Windows it rotated the
logfile and restarted the server, while elsewhere it truncated the log
file. Both of these are unnecessary. We borrow from the buildfarm which
does this instead: note the size of the logfile before we start, and
then when fetching the logfile skip to that position before accumulating
contents. This is spelled differently on Windows but the effect is the
same. This is largely centralized in TestLib's slurp_file function,
which has a new optional parameter, the offset to skip to before
starting to reading the file. Code in the client becomes much neater.

Backpatch to all live branches.

Michael Paquier, slightly modified by me.

Discussion: https://postgr.es/m/[email protected]

Fix some inappropriately-disallowed uses of ALTER ROLE/DATABASE SET.

Most GUC check hooks that inspect database state have special checks
that prevent them from throwing hard errors for state-dependent issues
when source == PGC_S_TEST.  This allows, for example,
"ALTER DATABASE d SET default_text_search_config = foo" when the "foo"
configuration hasn't been created yet.  Without this, we have problems
during dump/reload or pg_upgrade, because pg_dump has no idea about
possible dependencies of GUC values and can't ensure a safe restore
ordering.

However, check_role() and check_session_authorization() hadn't gotten
the memo about that, and would throw hard errors anyway.  It's not
entirely clear what is the use-case for "ALTER ROLE x SET role = y",
but we've now heard two independent complaints about that bollixing
an upgrade, so apparently some people are doing it.

Hence, fix these two functions to act more like other check hooks
with similar needs.  (But I did not change their insistence on
being inside a transaction, as it's still not apparent that setting
either GUC from the configuration file would be wise.)

Also fix check_temp_buffers, which had a different form of the disease
of making state-dependent checks without any exception for PGC_S_TEST.
A cursory survey of other GUC check hooks did not find any more issues
of this ilk.  (There are a lot of interdependencies among
PGC_POSTMASTER and PGC_SIGHUP GUCs, which may be a bad idea, but
they're not relevant to the immediate concern because they can't be
set via ALTER ROLE/DATABASE.)

Per reports from Charlie Hornsby and Nathan Bossart.  Back-patch
to all supported branches.

Discussion: https://postgr.es/m/HE1P189MB0523B31598B0C772C908088DB7709@HE1P189MB0523.EURP189.PROD.OUTLOOK.COM
Discussion: https://postgr.es/m/20160711223641 [email protected]

Use "-I." in directories holding Bison parsers, for Oracle compilers.

With the Oracle Developer Studio 12.6 compiler, #line directives alter
the current source file location for purposes of #include "..."
directives.  Hence, a VPATH build failed with 'cannot find include file:
"specscanner.c"'.  With two exceptions, parser-containing directories
already add "-I. -I$(srcdir)"; eliminate the exceptions.  Back-patch to
9.6 (all supported versions).

Port regress-python3-mangle.mk to Solaris "sed".

It doesn't support "$foo$*" like a POSIX "sed" implementation does;
see the Autoconf manual. Back-patch to 9.6 (all supported versions).

Fix old bug with coercing the result of a COLLATE expression.

There are hacks in parse_coerce.c to push down a requested coercion
to below any CollateExpr that may appear.  However, we did that even
if the requested data type is non-collatable, leading to an invalid
expression tree in which CollateExpr is applied to a non-collatable
type.  The fix is just to drop the CollateExpr altogether, reasoning
that it's useless.

This bug is ten years old, dating to the original addition of
COLLATE support.  The lack of field complaints suggests that there
aren't a lot of user-visible consequences.  We noticed the problem
because it would trigger an assertion in DefineVirtualRelation if
the invalid structure appears as an output column of a view; however,
in a non-assert build, you don't see a crash just a (subtly incorrect)
complaint about applying collation to a non-collatable type.  I found
that by putting the incorrect structure further down in a view, I could
make a view definition that would fail dump/reload, per the added
regression test case.  But CollateExpr doesn't do anything at run-time,
so this likely doesn't lead to any really exciting consequences.

Per report from Yulin Pei.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/HK0PR01MB22744393C474D503E16C8509F4709@HK0PR01MB2274.apcprd01.prod.exchangelabs.com

Fix out-of-bound memory access for interval -> char conversion

Using Roman numbers (via "RM" or "rm") for a conversion to calculate a
number of months has never considered the case of negative numbers,
where a conversion could easily cause out-of-bound memory accesses. The
conversions in themselves were not completely consistent either, as
specifying 12 would result in NULL, but it should mean XII.

This commit reworks the conversion calculation to have a more
consistent behavior:
- If the number of months and years is 0, return NULL.
- If the number of months is positive, return the exact month number.
- If the number of months is negative, do a backward calculation, with
-1 meaning December, -2 November, etc.

Reported-by: Theodor Arsenij Larionov-Trichkin
Author: Julien Rouhaud
Discussion: https://postgr.es/m/16953-f255a18f8c51f1d5@postgresql.org
backpatch-through: 9.6

Fix typo

Author: Daniel Westermann
Backpatch-through: 9.6
Discussion: https://postgr.es/m/GV0P278MB0483A7AA85BAFCC06D90F453D2739@GV0P278MB0483.CHEP278.PROD.OUTLOOK.COM

Fix typos and grammar in documentation and code comments

Comment fixes are applied on HEAD, and documentation improvements are
applied on back-branches where needed.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20210408164008 [email protected]
Backpatch-through: 9.6

Don't add non-existent pages to bitmap from BRIN

The code in bringetbitmap() simply added the whole matching page range
to the TID bitmap, as determined by pages_per_range, even if some of the
pages were beyond the end of the heap. The query then might fail with
an error like this:

  ERROR:  could not open file "base/20176/20228.2" (target block
          262144): previous segment is only 131021 blocks

In this case, the relation has 262093 pages (131072 and 131021 pages),
but we're trying to acess block 262144, i.e. first block of the 3rd
segment. At that point _mdfd_getseg() notices the preceding segment is
incomplete, and fails.

Hitting this in practice is rather unlikely, because:

* Most indexes use power-of-two ranges, so segments and page ranges
  align perfectly (segment end is also a page range end).

* The table size has to be just right, with the last segment being
  almost full - less than one page range from full segment, so that the
  last page range actually crosses the segment boundary.

* Prefetch has to be enabled. The regular page access checks that
  pages are not beyond heap end, but prefetch does not. On older
  releases (before 12) the execution stops after hitting the first
  non-existent page, so the prefetch distance has to be sufficient
  to reach the first page in the next segment to trigger the issue.
  Since 12 it's enough to just have prefetch enabled, the prefetch
  distance does not matter.

Fixed by not adding non-existent pages to the TID bitmap. Backpatch
all the way back to 9.6 (BRIN indexes were introduced in 9.5, but that
release is EOL).

Backpatch-through: 9.6

Shut down transaction tracking at startup process exit.

Maxim Orlov reported that the shutdown of standby server could result in
the following assertion failure. The cause of this issue was that,
when the shutdown caused the startup process to exit, recovery-time
transaction tracking was not shut down even if it's already initialized,
and some locks the tracked transactions were holding could not be released.
At this situation, if other process was invoked and the PGPROC entry that
the startup process used was assigned to it, it found such unreleased locks
and caused the assertion failure, during the initialization of it.

TRAP: FailedAssertion("SHMQueueEmpty(&(MyProc->myProcLocks[i]))"

This commit fixes this issue by making the startup process shut down
transaction tracking and release all locks, at the exit of it.

Back-patch to all supported branches.

Reported-by: Maxim Orlov
Author: Fujii Masao
Reviewed-by: Maxim Orlov
Discussion: https://postgr.es/m/ad4ce692cc1d89a093b471ab1d969b0b@postgrespro.ru

Use macro MONTHS_PER_YEAR instead of '12' in /ecpg/pgtypeslib

All other places already use MONTHS_PER_YEAR appropriately.

Backpatch-through: 9.6

Clarify documentation of RESET ROLE

Command-line options, or previous "ALTER (ROLE|DATABASE) ...
SET ROLE ..." commands, can change the value of the default role
for a session. In the presence of one of these, RESET ROLE will
change the current user identifier to the default role rather
than the session user identifier. Fix the documentation to
reflect this reality. Backpatch to all supported versions.

Author: Nathan Bossart
Reviewed-By: Laurenz Albe, David G. Johnston, Joe Conway
Reported by: Nathan Bossart
Discussion: https://postgr.es/m/flat/925134DB-8212-4F60-8AB1-B1231D750CB4%40amazon.com
Backpatch-through: 9.6

doc: Clarify how to generate backup files with non-exclusive backups

The current instructions describing how to write the backup_label and
tablespace_map files are confusing. For example, opening a file in text
mode on Windows and copy-pasting the file's contents would result in a
failure at recovery because of the extra CRLF characters generated. The
documentation was not stating that clearly, and per discussion this is
not considered as a supported scenario.

This commit extends a bit the documentation to mention that it may be
required to open the file in binary mode before writing its data.

Reported-by: Wang Shenhao
Author: David Steele
Reviewed-by: Andrew Dunstan, Magnus Hagander
Discussion: https://postgr.es/m/8373f61426074f2cb6be92e02f838389@G08CNEXMBPEKD06.g08.fujitsu.local
Backpatch-through: 9.6

doc: mention that intervening major releases can be skipped

Also mention that you should read the intervening major releases notes.
This change was also applied to the website.

Discussion: https://postgr.es/m/20210330144949 [email protected]

Backpatch-through: 9.6

Fix pg_restore's misdesigned code for detecting archive file format.

Despite the clear comments pointing out that the duplicative code
segments in ReadHead() and _discoverArchiveFormat() needed to be
in sync, they were not: the latter did not bother to apply any of
the sanity checks in the former.  We'd missed noticing this partly
because none of those checks would fail in scenarios we customarily
test, and partly because the oversight would be masked if both
segments execute, which they would in cases other than needing to
autodetect the format of a non-seekable stdin source.  However,
in a case meeting all these requirements --- for example, trying
to read a newer-than-supported archive format from non-seekable
stdin --- pg_restore missed applying the version check and would
likely dump core or otherwise misbehave.

The whole thing is silly anyway, because there seems little reason
to duplicate the logic beyond the one-line verification that the
file starts with "PGDMP".  There seems to have been an undocumented
assumption that multiple major formats (major enough to require
separate reader modules) would nonetheless share the first half-dozen
fields of the custom-format header.  This seems unlikely, so let's
fix it by just nuking the duplicate logic in _discoverArchiveFormat().

Also get rid of the pointless attempt to seek back to the start of
the file after successful autodetection.  That wastes cycles and
it means we have four behaviors to verify not two.

Per bug #16951 from Sergey Koposov.  This has been broken for
decades, so back-patch to all supported versions.

Discussion: https://postgr.es/m/16951-a4dd68cf0de23048@postgresql.org

doc: Clarify use of ACCESS EXCLUSIVE lock in various sections

Some sections of the documentation used "exclusive lock" to describe
that an ACCESS EXCLUSIVE lock is taken during a given operation. This
can be confusing to the reader as ACCESS SHARE is allowed with an
EXCLUSIVE lock is used, but that would not be the case with what is
described on those parts of the documentation.

Author: Greg Rychlewski
Discussion: https://postgr.es/m/CAKemG7VptD=7fNWckFMsMVZL_zzvgDO6v2yVmQ+ZiBfc_06kCQ@mail.gmail.com
Backpatch-through: 9.6

Update obsolete comment.

Back-patch to all supported branches.

Author: Etsuro Fujita
Discussion: https://postgr.es/m/CAPmGK17DwzaSf%2BB71dhL2apXdtG-OmD6u2AL9Cq2ZmAR0%2BzapQ%40mail.gmail.com

doc: Define TLS as an acronym

Commit c6763156589 added an acronym reference for "TLS" but the definition
was never added.

Author: Daniel Gustafsson
Reviewed-by: Michael Paquier
Backpatch-through: 9.6
Discussion: https://postgr.es/m/27109504-82DB-41A8-8E63-C0498314F5B0@yesql.se

Fix bug in WAL replay of COMMIT_TS_SETTS record.

Previously the WAL replay of COMMIT_TS_SETTS record called
TransactionTreeSetCommitTsData() with the argument write_xlog=true,
which generated and wrote new COMMIT_TS_SETTS record.
This should not be acceptable because it's during recovery.

This commit fixes the WAL replay of COMMIT_TS_SETTS record
so that it calls TransactionTreeSetCommitTsData() with write_xlog=false
and doesn't generate new WAL during recovery.

Back-patch to all supported branches.

Reported-by: lx zou <[email protected]>
Author: Fujii Masao
Reviewed-by: Alvaro Herrera
Discussion: https://postgr.es/m/16931-620d0f2fdc6108f1@postgresql.org

Fix psql's \connect command some more.

Jasen Betts reported yet another unintended side effect of commit
85c54287a: reconnecting with "\c service=whatever" did not have the
expected results.  The reason is that starting from the output of
PQconndefaults() effectively allows environment variables (such
as PGPORT) to override entries in the service file, whereas the
normal priority is the other way around.

Not using PQconndefaults at all would require yet a third main code
path in do_connect's parameter setup, so I don't really want to fix
it that way.  But we can have the logic effectively ignore all the
default values for just a couple more lines of code.

This patch doesn't change the behavior for "\c -reuse-previous=on
service=whatever".  That remains significantly different from before
85c54287a, because many more parameters will be re-used, and thus
not be possible for service entries to replace.  But I think this
is (mostly?) intentional.  In any case, since libpq does not report
where it got parameter values from, it's hard to do differently.

Per bug #16936 from Jasen Betts.  As with the previous patches,
back-patch to all supported branches.  (9.5 is unfortunately now
out of support, so this won't get fixed there.)

Discussion: https://postgr.es/m/16936-3f524322a53a29f0@postgresql.org

pg_waldump: Fix bug in per-record statistics.

pg_waldump --stats=record identifies a record by a combination
of the RmgrId and the four bits of the xl_info field of the record.
But XACT records use the first bit of those four bits for an optional
flag variable, and the following three bits for the opcode to
identify a record. So previously the same type of XACT record
could have different four bits (three bits are the same but the
first one bit is different), and which could cause
pg_waldump --stats=record to show two lines of per-record statistics
for the same XACT record. This is a bug.

This commit changes pg_waldump --stats=record so that it processes
only XACT record differently, i.e., filters the opcode out of xl_info
and uses a combination of the RmgrId and those three bits as
the identifier of a record, only for XACT record. For other records,
the four bits of the xl_info field are still used.

Back-patch to all supported branches.

Author: Kyotaro Horiguchi
Reviewed-by: Shinya Kato, Fujii Masao
Discussion: https://postgr.es/m/2020100913412132258847@highgo.ca

Don't leak malloc'd strings when a GUC setting is rejected.

Because guc.c prefers to keep all its string values in malloc'd
not palloc'd storage, it has to be more careful than usual to
avoid leaks. Error exits out of string GUC hook checks failed
to clear the proposed value string, and error exits out of
ProcessGUCArray() failed to clear the malloc'd results of
ParseLongOption().

Found via valgrind testing.
This problem is ancient, so back-patch to all supported branches.

Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us

Don't leak compiled regex(es) when an ispell cache entry is dropped.

The text search cache mechanisms assume that we can clean up
an invalidated dictionary cache entry simply by resetting the
associated long-lived memory context.  However, that does not work
for ispell affixes that make use of regular expressions, because
the regex library deals in plain old malloc.  Hence, we leaked
compiled regex(es) any time we dropped such a cache entry.  That
could quickly add up, since even a fairly trivial regex can use up
tens of kB, and a large one can eat megabytes.  Add a memory context
callback to ensure that a regex gets freed when its owning cache
entry is cleared.

Found via valgrind testing.
This problem is ancient, so back-patch to all supported branches.

Discussion: https://postgr.es/m/3816764.1616104288@sss.pgh.pa.us

Prevent buffer overrun in read_tablespace_map().

Robert Foggia of Trustwave reported that read_tablespace_map()
fails to prevent an overrun of its on-stack input buffer.
Since the tablespace map file is presumed trustworthy, this does
not seem like an interesting security vulnerability, but still
we should fix it just in the name of robustness.

While here, document that pg_basebackup's --tablespace-mapping option
doesn't work with tar-format output, because it doesn't. To make it
work, we'd have to modify the tablespace_map file within the tarball
sent by the server, which might be possible but I'm not volunteering.
(Less-painful solutions would require changing the basebackup protocol
so that the source server could adjust the map. That's not very
appetizing either.)

Fix race condition in psql \e's detection of file modification.

psql's editing commands decide whether the user has edited the file
by checking for change of modification timestamp. This is probably
fine for a pre-existing file, but with a temporary file that is
created within the command, it's possible for a fast typist to
save-and-exit in less than the one-second granularity of stat(2)
timestamps. On Windows FAT filesystems the granularity is even
worse, 2 seconds, making the race a bit easier to hit.

To fix, try to set the temp file's mod time to be two seconds ago.
It's unlikely this would fail, but then again the race condition
itself is unlikely, so just ignore any error.

Also, we might as well check the file size as well as its mod time.

While this is a difficult bug to hit, it still seems worth
back-patching, to ensure that users' edits aren't lost.

Laurenz Albe, per gripe from Jacob Champion; based on fix suggestions
from Jacob and myself

Discussion: https://postgr.es/m/0ba3f2a658bac6546d9934ab6ba63a805d46a49b [email protected]

Re-simplify management of inStart in pqParseInput3's subroutines.

Commit 92785dac2 copied some logic related to advancement of inStart
from pqParseInput3 into getRowDescriptions and getAnotherTuple,
because it wanted to allow user-defined row processor callbacks to
potentially longjmp out of the library, and inStart would have to be
updated before that happened to avoid an infinite loop.  We later
decided that that API was impossibly fragile and reverted it, but
we didn't undo all of the related code changes, and this bit of
messiness survived.  Undo it now so that there's just one place in
pqParseInput3's processing where inStart is advanced; this will
simplify addition of better tracing support.

getParamDescriptions had grown similar processing somewhere along
the way (not in 92785dac2; I didn't track down just when), but it's
actually buggy because its handling of corrupt-message cases seems to
have been copied from the v2 logic where we lacked a known message
length.  The cases where we "goto not_enough_data" should not simply
return EOF, because then we won't consume the message, potentially
creating an infinite loop.  That situation now represents a
definitively corrupt message, and we should report it as such.

Although no field reports of getParamDescriptions getting stuck in
a loop have been seen, it seems appropriate to back-patch that fix.
I chose to back-patch all of this to keep the logic looking more alike
in supported branches.

Discussion: https://postgr.es/m/2217283.1615411989@sss.pgh.pa.us

tutorial:  land height is "elevation", not "altitude"

This is a follow-on patch to 92c12e46d5.  In that patch, we renamed
"altitude" to "elevation" in the docs, based on these details:

   https://mapscaping.com/blogs/geo-candy/what-is-the-difference-between-elevation-relief-and-altitude

This renames the tutorial SQL files to match the documentation.

Reported-by: [email protected]
Discussion: https://postgr.es/m/161512392887.1046.3137472627109459518@wrigleys.postgresql.org

Backpatch-through: 9.6

Fix some typos, grammar and style in docs and comments

The portions fixing the documentation are backpatched where needed.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20210210235557 [email protected]
backpatch-through: 9.6

Reinstate HEAP_XMAX_LOCK_ONLY|HEAP_KEYS_UPDATED as allowed

Commit 866e24d47db1 added an assert that HEAP_XMAX_LOCK_ONLY and
HEAP_KEYS_UPDATED cannot appear together, on the faulty assumption that
the latter necessarily referred to an update and not a tuple lock; but
that's wrong, because SELECT FOR UPDATE can use precisely that
combination, as evidenced by the amcheck test case added here.

Remove the Assert(), and also patch amcheck's verify_heapam.c to not
complain if the combination is found. Also, out of overabundance of
caution, update (across all branches) README.tuplock to be more explicit
about this.

Author: Julien Rouhaud <[email protected]>
Reviewed-by: Mahendra Singh Thalor <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>
Discussion: https://postgr.es/m/20210124061758.GA11756@nol

Fix another ancient bug in parsing of BRE-mode regular expressions.

While poking at the regex code, I happened to notice that the bug
squashed in commit afcc8772e had a sibling: next() failed to return
a specific value associated with the '}' token for a "\{m,n\}"
quantifier when parsing in basic RE mode.  Again, this could result
in treating the quantifier as non-greedy, which it never should be in
basic mode.  For that to happen, the last character before "\}" that
sets "nextvalue" would have to set it to zero, or it'd have to have
accidentally been zero from the start.  The failure can be provoked
repeatably with, for example, a bound ending in digit "0".

Like the previous patch, back-patch all the way.

Fix compiler warning in back branches (9.6, 10).

Back-patch a tiny bit of commit fbb2e9a0 into 9.6 and 10, to silence an
uninitialized variable warning from GCC 10.2. Seen on buildfarm member
handfish, and my own development workflow where I like to use -Werror.

Discussion: https://postgr.es/m/CA%2BhUKGJRcwvK86Uf5t-FrTekZjqHtpv3u%3D3MuBg8Zw8R933Mqg%40mail.gmail.com

Default to wal_sync_method=fdatasync on FreeBSD.

FreeBSD 13 gained O_DSYNC, which would normally cause wal_sync_method to
choose open_datasync as its default value. That may not be a good
choice for all systems, and performs worse than fdatasync in some
scenarios. Let's preserve the existing default behavior for now.

Like commit 576477e73c4, which did the same for Linux, back-patch to all
supported releases.

Discussion: https://postgr.es/m/CA%2BhUKGLsAMXBQrCxCXoW-JsUYmdOL8ALYvaX%3DCrHqWxm-nWbGA%40mail.gmail.com

Hold interrupts while running dsm_detach() callbacks.

While cleaning up after a parallel query or parallel index creation that
created temporary files, we could be interrupted by a statement timeout.
The error handling path would then fail to clean up the files when it
ran dsm_detach() again, because the callback was already popped off the
list. Prevent this hazard by holding interrupts while the cleanup code
runs.

Thanks to Heikki Linnakangas for this suggestion, and also to Kyotaro
Horiguchi, Masahiko Sawada, Justin Pryzby and Tom Lane for discussion of
this and earlier ideas on how to fix the problem.

Back-patch to all supported releases.

Reported-by: Justin Pryzby <[email protected]>
Discussion: https://postgr.es/m/20191212180506 [email protected]

pg_attribute_no_sanitize_alignment() macro

Modern gcc and clang compilers offer alignment sanitizers, which help to detect
pointer misalignment.  However, our codebase already contains x86-specific
crc32 computation code, which uses unalignment access.  Thankfully, those
compilers also support the attribute, which disables alignment sanitizers at
the function level.  This commit adds pg_attribute_no_sanitize_alignment(),
which wraps this attribute, and applies it to pg_comp_crc32c_sse42() function.

Back-patch of commits 993bdb9f9 and ad2ad698a, to enable doing
alignment testing in all supported branches.

Discussion: https://postgr.es/m/CAPpHfdsne3%3DT%3DfMNU45PtxdhSL_J2PjLTeS8rwKnJzUR4YNd4w%40mail.gmail.com
Discussion: https://postgr.es/m/475514.1612745257%40sss.pgh.pa.us
Author: Alexander Korotkov, revised by Tom Lane
Reviewed-by: Tom Lane

Avoid divide-by-zero in regex_selectivity() with long fixed prefix.

Given a regex pattern with a very long fixed prefix (approaching 500
characters), the result of pow(FIXED_CHAR_SEL, fixed_prefix_len) can
underflow to zero.  Typically the preceding selectivity calculation
would have underflowed as well, so that we compute 0/0 and get NaN.
In released branches this leads to an assertion failure later on.
That doesn't happen in HEAD, for reasons I've not explored yet,
but it's surely still a bug.

To fix, just skip the division when the pow() result is zero, so
that we'll (most likely) return a zero selectivity estimate.  In
the edge cases where "sel" didn't yet underflow, perhaps this
isn't desirable, but I'm not sure that the case is worth spending
a lot of effort on.  The results of regex_selectivity_sub() are
barely worth the electrons they're written on anyway :-(

Per report from Alexander Lakhin.  Back-patch to all supported versions.

Discussion: https://postgr.es/m/6de0a0c3-ada9-cd0c-3e4e-2fa9964b41e3@gmail.com

Stamp 9.6.21.

Translation updates

Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: c1022f89a3531073349e8954e4f4fae64fe34552

Release notes for 13.2, 12.6, 11.11, 10.16, 9.6.21, 9.5.25.

Revert "Propagate CTE property flags when copying a CTE list into a rule."

This reverts commit ed290896335414c6c069b9ccae1f3dcdd2fac6ba and
equivalent back-branch commits. The issue is subtler than I thought,
and it's far from new, so just before a release deadline is no time
to be fooling with it. We'll consider what to do at a bit more
leisure.

Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com

Propagate CTE property flags when copying a CTE list into a rule.

rewriteRuleAction() neglected this step, although it was careful to
propagate other similar flags such as hasSubLinks or hasRowSecurity.
Omitting to transfer hasRecursive is just cosmetic at the moment,
but omitting hasModifyingCTE is a live bug, since the executor
certainly looks at that.

The proposed test case only fails back to v10, but since the executor
examines hasModifyingCTE in 9.x as well, I suspect that a test case
could be devised that fails in older branches. Given the nearness
of the release deadline, though, I'm not going to spend time looking
for a better test.

Report and patch by Greg Nancarrow, cosmetic changes by me

Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com

Disallow converting an inheritance child table to a view.

Generally, members of inheritance trees must be plain tables (or,
in more recent versions, foreign tables).  ALTER TABLE INHERIT
rejects creating an inheritance relationship that has a view at
either end.  When DefineQueryRewrite attempts to convert a relation
to a view, it already had checks prohibiting doing so for partitioning
parents or children as well as traditional-inheritance parents ...
but it neglected to check that a traditional-inheritance child wasn't
being converted.  Since the planner assumes that any inheritance
child is a table, this led to making plans that tried to do a physical
scan on a view, causing failures (or even crashes, in recent versions).

One could imagine trying to support such a case by expanding the view
normally, but since the rewriter runs before the planner does
inheritance expansion, it would take some very fundamental refactoring
to make that possible.  There are probably a lot of other parts of the
system that don't cope well with such a situation, too.  For now,
just forbid it.

Per bug #16856 from Yang Lin.  Back-patch to all supported branches.
(In versions before v10, this includes back-patching the portion of
commit 501ed02cf that added has_superclass().  Perhaps the lack of
that infrastructure partially explains the missing check.)

Discussion: https://postgr.es/m/16856-0363e05c6e1612fd@postgresql.org

Fix backslash-escaping multibyte chars in COPY FROM.

If a multi-byte character is escaped with a backslash in TEXT mode input,
and the encoding is one of the client-only encodings where the bytes after
the first one can have an ASCII byte "embedded" in the char, we didn't
skip the character correctly. After a backslash, we only skipped the first
byte of the next character, so if it was a multi-byte character, we would
try to process its second byte as if it was a separate character. If it
was one of the characters with special meaning, like '\n', '\r', or
another '\\', that would cause trouble.

One such exmple is the byte sequence '\x5ca45c2e666f6f' in Big5 encoding.
That's supposed to be [backslash][two-byte character][.][f][o][o], but
because the second byte of the two-byte character is 0x5c, we incorrectly
treat it as another backslash. And because the next character is a dot, we
parse it as end-of-copy marker, and throw an "end-of-copy marker corrupt"
error.

Backpatch to all supported versions.

Reviewed-by: John Naylor, Kyotaro Horiguchi
Discussion: https://www.postgresql.org/message-id/a897f84f-8dca-8798-3139-07da5bb38728%40iki.fi

Fix ancient memory leak in contrib/auto_explain.

The ExecutorEnd hook is invoked in a context that could be quite
long-lived, not the executor's own per-query context as I think
we were sort of assuming.  Thus, any cruft generated while producing
the EXPLAIN output could accumulate over multiple queries.  This can
result in spectacular leakage if log_nested_statements is on, and
even without that I'm surprised nobody complained before.

To fix, just switch into the executor's context so that anything we
allocate will be released when standard_ExecutorEnd frees the executor
state.  We might as well nuke the code's retail pfree of the explain
output string, too; that's laughably inadequate to the need.

Japin Li, per report from Jeff Janes.  This bug is old, so
back-patch to all supported branches.

Discussion: https://postgr.es/m/CAMkU=1wCVtbeRn0s9gt12KwQ7PLXovbpM8eg25SYocKW3BT4hg@mail.gmail.com

Fix CREATE INDEX CONCURRENTLY for simultaneous prepared transactions.

In a cluster having used CREATE INDEX CONCURRENTLY while having enabled
prepared transactions, queries that use the resulting index can silently
fail to find rows.  Fix this for future CREATE INDEX CONCURRENTLY by
making it wait for prepared transactions like it waits for ordinary
transactions.  This expands the VirtualTransactionId structure domain to
admit prepared transactions.  It may be necessary to reindex to recover
from past occurrences.  Back-patch to 9.5 (all supported versions).

Andrey Borodin, reviewed (in earlier versions) by Tom Lane and Michael
Paquier.

Discussion: https://postgr.es/m/2E712143-97F7-4890-B470-4A35142ABC82@yandex-team.ru

Silence another gcc 11 warning.

Per buildfarm and local experimentation, bleeding-edge gcc isn't
convinced that the MemSet in reorder_function_arguments() is safe.
Shut it up by adding an explicit check that pronargs isn't negative,
and by changing MemSet to memset. (It appears that either change is
enough to quiet the warning at -O2, but let's do both to be sure.)

Make ecpg's rjulmdy() and rmdyjul() agree with their declarations.

We had "short *mdy" in the extern declarations, but "short mdy[3]"
in the actual function definitions. Per C99 these are equivalent,
but recent versions of gcc have started to issue warnings about
the inconsistency. Clean it up before the warnings get any more
widespread.

Back-patch, in case anyone wants to build older PG versions with
bleeding-edge compilers.

Discussion: https://postgr.es/m/2401575.1611764534@sss.pgh.pa.us

pgbench: Remove dead code

doConnect() never returns connections in state CONNECTION_BAD, so
checking for that is pointless. Remove the code that does.

This code has been dead since ba708ea3dc84, 20 years ago.

Discussion: https://postgr.es/m/20210126195224 [email protected]
Reviewed-by: Tom Lane <[email protected]>

Report the true database name on connection errors

When reporting connection errors, we might show a database name in the
message that's not the one we actually tried to connect to, if the
database was taken from libpq defaults instead of from user parameters.
Fix such error messages to use PQdb(), which reports the correct name.

(But, per commit 2930c05634bc, make sure not to try to print NULL.)

Apply to branches 9.5 through 13. Branch master has already been
changed differently by commit 58cd8dca3de0.

Reported-by: Robert Haas <[email protected]>
Discussion: https://postgr.es/m/CA+TgmobssJ6rS22dspWnu-oDxXevGmhMD8VcRBjmj-b9UDqRjw@mail.gmail.com

Code review for psql's helpSQL() function.

The loops to identify word boundaries could access past the end of
the input string.  Likely that would never result in an actual
crash, but it makes valgrind unhappy.

The logic to try different numbers of words didn't work when the
input has two words but we only have a match to the first, eg
"\h with select".  (We must "continue" the pass loop, not "break".)

The logic to compute nl_count was bizarrely managed, and in at
least two code paths could end up calling PageOutput with
nl_count = 0, resulting in failing to paginate output that should
have been fed to the pager.  Also, in v12 and up, the nl_count
calculation hadn't been updated to account for the addition of a URL.

The PQExpBuffer holding the command syntax details wasn't freed,
resulting in a session-lifespan memory leak.

While here, improve some comments, choose a more descriptive name
for a variable, fix inconsistent datatype choice for another variable.

Per bug #16837 from Alexander Lakhin.  This code is very old,
so back-patch to all supported branches.

Kyotaro Horiguchi and Tom Lane

Discussion: https://postgr.es/m/16837-479bcd56040c71b3@postgresql.org

Fix broken ruleutils support for function TRANSFORM clauses.

I chanced to notice that this dumped core due to a faulty Assert.
To add insult to injury, the output has been misformatted since v11.
Obviously we need some regression testing here.

Discussion: https://postgr.es/m/d1cc628c-3953-4209-957b-29427acc38c8@www.fastmail.com

Fix hypothetical bug in heap backward scans

Both heapgettup() and heapgettup_pagemode() incorrectly set the first page
to scan in a backward scan in which the number of pages to scan was
specified by heap_setscanlimits(). The code incorrectly started the scan
at the end of the relation when startBlk was 0, or otherwise at
startBlk - 1, neither of which is correct when only scanning a subset of
pages.

The fix here checks if heap_setscanlimits() has changed the number of
pages to scan and if so we set the first page to scan as the final page in
the specified range during backward scans.

Proper adjustment of this code was forgotten when heap_setscanlimits() was
added in 7516f5259 back in 9.5. However, practice, nowhere in core code
performs backward scans after having used heap_setscanlimits(), yet, it is
possible an extension uses the heap functions in this way, hence
backpatch.

An upcoming patch does use heap_setscanlimits() with backward scans, so
this must be fixed before that can go in.

Author: David Rowley
Discussion: https://postgr.es/m/CAApHDvpGc9h0_oVD2CtgBcxCS1N-qDYZSeBRnUh+0CWJA9cMaA@mail.gmail.com
Backpatch-through: 9.5, all supported versions

Update time zone data files to tzdata release 2021a.

DST law changes in Russia (Volgograd zone) and South Sudan.
Historical corrections for Australia, Bahamas, Belize, Bermuda,
Ghana, Israel, Kenya, Nigeria, Palestine, Seychelles, and Vanuatu.
Notably, the Australia/Currie zone has been corrected to the point
where it is identical to Australia/Hobart.

Make check-prepared-txns use temp-install binaries.

It tested the installed binaries. This fixes 9.6 and 9.5 to follow
later versions and "make -C src/test/isolation check" in this respect.

Doc: improve directions for building on macOS.

In light of recent discussions, we should instruct people to
install Apple's command line tools; installing Xcode is secondary.

Also, fix sample command for finding out the default sysroot,
as we now know that the command originally recommended can give
a result that doesn't match your OS version.

Also document the workaround to use if you really don't want
configure to select a sysroot at all.

Discussion: https://postgr.es/m/20210119111625 [email protected]

Further tweaking of PG_SYSROOT heuristics for macOS.

It emerges that in some phases of the moon (perhaps to do with
directory entry order?), xcrun will report that the SDK path is
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk
which is normally a symlink to a version-numbered sibling directory.
Our heuristic to skip non-version-numbered pathnames was rejecting
that, which is the wrong thing to do. We'd still like to end up
with a version-numbered PG_SYSROOT value, but we can have that by
dereferencing the symlink.

Like the previous fix, back-patch to all supported versions.

Discussion: https://postgr.es/m/522433.1611089678@sss.pgh.pa.us

Fix ALTER DEFAULT PRIVILEGES with duplicated objects

Specifying duplicated objects in this command would lead to unique
constraint violations in pg_default_acl or "tuple already updated by
self" errors. Similarly to GRANT/REVOKE, increment the command ID after
each subcommand processing to allow this case to work transparently.

A regression test is added by tweaking one of the existing queries of
privileges.sql to stress this case.

Reported-by: Andrus
Author: Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/ae2a7dc1-9d71-8cba-3bb9-e4cb7eb1f44e@hot.ee
Backpatch-through: 9.5

Remove faulty support for MergeAppend plan with WHERE CURRENT OF.

Somebody extended search_plan_tree() to treat MergeAppend exactly
like Append, which is 100% wrong, because unlike Append we can't
assume that only one input node is actively returning tuples.
Hence a cursor using a MergeAppend across a UNION ALL or inheritance
tree could falsely match a WHERE CURRENT OF query at a row that
isn't actually the cursor's current output row, but coincidentally
has the same TID (in a different table) as the current output row.

Delete the faulty code; this means that such a case will now return
an error like 'cursor "foo" is not a simply updatable scan of table
"bar"', instead of silently misbehaving. Users should not find that
surprising though, as the same cursor query could have failed that way
already depending on the chosen plan. (It would fail like that if the
sort were done with an explicit Sort node instead of MergeAppend.)

Expand the clearly-inadequate commentary to be more explicit about
what this code is doing, in hopes of forestalling future mistakes.

It's been like this for awhile, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/482865.1611075182@sss.pgh.pa.us

Avoid crash with WHERE CURRENT OF and a custom scan plan.

execCurrent.c's search_plan_tree() assumed that ForeignScanStates
and CustomScanStates necessarily have a valid ss_currentRelation.
This is demonstrably untrue for postgres_fdw's remote join and
remote aggregation plans, and non-leaf custom scans might not have
an identifiable scan relation either.  Avoid crashing by ignoring
such nodes when the field is null.

This solution will lead to errors like 'cursor "foo" is not a
simply updatable scan of table "bar"' in cases where maybe we
could have allowed WHERE CURRENT OF to work.  That's not an issue
for postgres_fdw's usages, since joins or aggregations would render
WHERE CURRENT OF invalid anyway.  But an otherwise-transparent
upper level custom scan node might find this annoying.  When and if
someone cares to expend work on such a scenario, we could invent a
custom-scan-provider callback to determine what's safe.

Report and patch by David Geier, commentary by me.  It's been like
this for awhile, so back-patch to all supported branches.

Discussion: https://postgr.es/m/0253344d-9bdd-11c4-7f0d-d88c02cd7991@swarm64.com

Fix pg_dump for GRANT OPTION among initial privileges.

The context is an object that no longer bears some aclitem that it bore
initially.  (A user issued REVOKE or GRANT statements upon the object.)
pg_dump is forming SQL to reproduce the object ACL.  Since initdb
creates no ACL bearing GRANT OPTION, reaching this bug requires an
extension where the creation script establishes such an ACL.  No PGXN
extension does that.  If an installation did reach the bug, pg_dump
would have omitted a semicolon, causing a REVOKE and the next SQL
statement to fail.  Separately, since the affected code exists to
eliminate an entire aclitem, it wants plain REVOKE, not REVOKE GRANT
OPTION FOR.  Back-patch to 9.6, where commit
23f34fa4ba358671adab16773e79c17c92cbc870 first appeared.

Discussion: https://postgr.es/m/20210109102423 [email protected]

Prevent excess SimpleLruTruncate() deletion.

Every core SLRU wraps around.  With the exception of pg_notify, the wrap
point can fall in the middle of a page.  Account for this in the
PagePrecedes callback specification and in SimpleLruTruncate()'s use of
said callback.  Update each callback implementation to fit the new
specification.  This changes SerialPagePrecedesLogically() from the
style of asyncQueuePagePrecedes() to the style of CLOGPagePrecedes().
(Whereas pg_clog and pg_serial share a key space, pg_serial is nothing
like pg_notify.)  The bug fixed here has the same symptoms and user
followup steps as 592a589a04bd456410b853d86bd05faa9432cbbb.  Back-patch
to 9.5 (all supported versions).

Reviewed by Andrey Borodin and (in earlier versions) by Tom Lane.

Discussion: https://postgr.es/m/20190202083822 [email protected]

Improve our heuristic for selecting PG_SYSROOT on macOS.

In cases where Xcode is newer than the underlying macOS version,
asking xcodebuild for the SDK path will produce a pointer to the
SDK shipped with Xcode, which may end up building code that does
not work on the underlying macOS version.  It appears that in
such cases, xcodebuild's answer also fails to match the default
behavior of Apple's compiler: assuming one has installed Xcode's
"command line tools", there will be an SDK for the OS's own version
in /Library/Developer/CommandLineTools, and the compiler will
default to using that.  This is all pretty poorly documented,
but experimentation suggests that "xcrun --show-sdk-path" gives
the sysroot path that the compiler is actually using, at least
in some cases.  Hence, try that first, but revert to xcodebuild
if xcrun fails (in very old Xcode, it is missing or lacks the
--show-sdk-path switch).

Also, "xcrun --show-sdk-path" may give a path that is valid but lacks
any OS version identifier.  We don't really want that, since most
of the motivation for wiring -isysroot into the build flags at all
is to ensure that all parts of a PG installation are built against
the same SDK, even when considering extensions built later and/or on
a different machine.  Insist on finding "N.N" in the directory name
before accepting the result.  (Adding "--sdk macosx" to the xcrun
call seems to produce the same answer as xcodebuild, but usually
more quickly because it's cached, so we also try that as a fallback.)

The core reason why we don't want to use Xcode's default SDK in cases
like this is that Apple's technology for introducing new syscalls
does not play nice with Autoconf: for example, configure will think
that preadv/pwritev exist when using a Big Sur SDK, even when building
on an older macOS version where they don't exist.  It'd be nice to
have a better solution to that problem, but this patch doesn't attempt
to fix that.

Per report from Sergey Shinderuk.  Back-patch to all supported versions.

Discussion: https://postgr.es/m/ed3b8e5d-0da8-6ebd-fd1c-e0ac80a4b204@postgrespro.ru

Doc: clarify behavior of back-half options in pg_dump.

Options that change how the archive data is converted to SQL text
are ignored when dumping to archive formats. The documentation
previously said "not meaningful", which is not helpful.

Discussion: https://postgr.es/m/161052021249.12228.9598689907884726185@wrigleys.postgresql.org

Fix memory leak in SnapBuildSerialize.

The memory for the snapshot was leaked while serializing it to disk during
logical decoding. This memory will be freed only once walsender stops
streaming the changes. This can lead to a huge memory increase when master
logs Standby Snapshot too frequently say when the user is trying to create
many replication slots.

Reported-by: [email protected]
Diagnosed-by: [email protected]
Author: Amit Kapila
Backpatch-through: 9.5
Discussion: https://postgr.es/m/033ab54c-6393-42ee-8ec9-2b399b5d8cde [email protected]

Fix thinko in comment

This comment has been wrong since its introduction in commit
2c03216d8311.

Author: Masahiko Sawada <[email protected]>
Discussion: https://postgr.es/m/CAD21AoAzz6qipFJBbGEaHmyWxvvNDp8httbwLR9tUQWaTjUs2Q@mail.gmail.com