git.postgresql.org Git - postgresql.git/log

Improve memory management in regex compiler.

The previous logic here created a separate pool of arcs for each
state, so that the out-arcs of each state were physically stored
within it.  Perhaps this choice was driven by trying to not include
a "from" pointer within each arc; but Spencer gave up on that idea
long ago, and it's hard to see what the value is now.  The approach
turns out to be fairly disastrous in terms of memory consumption,
though.  In the first place, NFAs built by this engine seem to have
about 4 arcs per state on average, with a majority having only one
or two out-arcs.  So pre-allocating 10 out-arcs for each state is
already cause for a factor of two or more bloat.  Worse, the NFA
optimization phase moves arcs around with abandon.  In a large NFA,
some of the states will have hundreds of out-arcs, so towards the
end of the optimization phase we have a significant number of states
whose arc pools have room for hundreds of arcs each, even though only
a few of those arcs are in use.  We have seen real-world regexes in
which this effect bloats the memory requirement by 25X or even more.

Hence, get rid of the per-state arc pools in favor of a single arc
pool for the whole NFA, with variable-sized allocation batches
instead of always asking for 10 at a time.  While we're at it,
let's batch the allocations of state structs too, to further reduce
the malloc traffic.

This incidentally allows moveouts() to be optimized in a similar
way to moveins(): when moving an arc to another state, it's now
valid to just re-link the same arc struct into a different outchain,
where before the code invariants required us to make a physically
new arc and then free the old one.

These changes reduce the regex compiler's typical space consumption
for average-size regexes by about a factor of two, and much more for
large or complicated regexes.  In a large test set of real-world
regexes, we formerly had half a dozen cases that failed with "regular
expression too complex" due to exceeding the REG_MAX_COMPILE_SPACE
limit (about 150MB); we would have had to raise that limit to
something close to 400MB to make them work with the old code.  Now,
none of those cases need more than 13MB to compile.  Furthermore,
the test set is about 10% faster overall due to less malloc traffic.

Discussion: https://postgr.es/m/168861.1614298592@sss.pgh.pa.us

Extend a test case a little

This will possibly help a subsequent patch by making sure the notice
messages are distinct so that it's clear that they come out in the
right order.

Author: Fabien COELHO <[email protected]>
Discussion: https://www.postgresql.org/message-id/alpine.DEB.2.21.1904240654120.3407%40lancre

doc: Improve {archive,restore}_command for compressed logs

The commands mentioned in the docs with gzip and gunzip did not prefix
the archives with ".gz" and used inconsistent paths for the archives,
which can be confusing.

Reported-by: Philipp Gramzow
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/161397938841.15451.13129264141285167267@wrigleys.postgresql.org

Revert "pg_collation_actual_version() -> pg_collation_current_version()."

This reverts commit 9cf184cc0599b6e65e7e5ecd9d91cd42e278bcd8. Name
change less well received than anticipated.

Discussion: https://postgr.es/m/afcfb97e-88a1-a540-db95-6c573b93bc2b%40eisentraut.org

Fix list-manipulation bug in WITH RECURSIVE processing.

makeDependencyGraphWalker and checkWellFormedRecursionWalker
thought they could hold onto a pointer to a list's first
cons cell while the list was modified by recursive calls.
That was okay when the cons cell was actually separately
palloc'd ... but since commit 1cff1b95a, it's quite unsafe,
leading to core dumps or incorrect complaints of faulty
WITH nesting.

In the field this'd require at least a seven-deep WITH nest
to cause an issue, but enabling DEBUG_LIST_MEMORY_USAGE
allows the bug to be seen with lesser nesting depths.

Per bug #16801 from Alexander Lakhin. Back-patch to v13.

Michael Paquier and Tom Lane

Discussion: https://postgr.es/m/16801-393c7922143eaa4d@postgresql.org

VACUUM VERBOSE: Count "newly deleted" index pages.

Teach VACUUM VERBOSE to report on pages deleted by the _current_ VACUUM
operation -- these are newly deleted pages.  VACUUM VERBOSE continues to
report on the total number of deleted pages in the entire index (no
change there).  The former is a subset of the latter.

The distinction between each category of deleted index page only arises
with index AMs where page deletion is supported and is decoupled from
page recycling for performance reasons.

This is follow-up work to commit e5d8a999, which made nbtree store
64-bit XIDs (not 32-bit XIDs) in pages at the point at which they're
deleted.  Note that the btm_last_cleanup_num_delpages metapage field
added by that commit usually gets set to pages_newly_deleted.  The
exceptions (the scenarios in which they're not equal) all seem to be
tricky cases for the implementation (of page deletion and recycling) in
general.

Author: Peter Geoghegan <[email protected]>
Discussion: https://postgr.es/m/CAH2-WznpdHvujGUwYZ8sihX%3Dd5u-tRYhi-F4wnV2uN2zHpMUXw%40mail.gmail.com

Doc: remove src/backend/regex/re_syntax.n.

We aren't publishing this file as documentation, and it's been
much more haphazardly maintained than the real docs in func.sgml,
so let's just drop it. I think the only reason I included it in
commit 7bcc6d98f was that the Berkeley-era sources had had a man
page in this directory.

Discussion: https://postgr.es/m/4099447.1614186542@sss.pgh.pa.us

Change regex \D and \W shorthands to always match newlines.

Newline is certainly not a digit, nor a word character, so it is
sensible that it should match these complemented character classes.
Previously, \D and \W acted that way by default, but in
newline-sensitive mode ('n' or 'p' flag) they did not match newlines.

This behavior was previously forced because explicit complemented
character classes don't match newlines in newline-sensitive mode;
but as of the previous commit that implementation constraint no
longer exists. It seems useful to change this because the primary
real-world use for newline-sensitive mode seems to be to match the
default behavior of other regex engines such as Perl and Javascript
... and their default behavior is that these match newlines.

The old behavior can be kept by writing an explicit complemented
character class, i.e. [^[:digit:]] or [^[:word:]]. (This means
that \D and \W are not exactly equivalent to those strings, but
they weren't anyway.)

Discussion: https://postgr.es/m/3220564.1613859619@sss.pgh.pa.us

Allow complemented character class escapes within regex brackets.

The complement-class escapes \D, \S, \W are now allowed within
bracket expressions.  There is no semantic difficulty with doing
that, but the rather hokey macro-expansion-based implementation
previously used here couldn't cope.

Also, invent "word" as an allowed character class name, thus "\w"
is now equivalent to "[[:word:]]" outside brackets, or "[:word:]"
within brackets.  POSIX allows such implementation-specific
extensions, and the same name is used in e.g. bash.

One surprising compatibility issue this raises is that constructs
such as "[\w-_]" are now disallowed, as our documentation has always
said they should be: character classes can't be endpoints of a range.
Previously, because \w was just a macro for "[:alnum:]_", such a
construct was read as "[[:alnum:]_-_]", so it was accepted so long as
the character after "-" was numerically greater than or equal to "_".

Some implementation cleanup along the way:

* Remove the lexnest() hack, and in consequence clean up wordchrs()
to not interact with the lexer.

* Fix colorcomplement() to not be O(N^2) in the number of colors
involved.

* Get rid of useless-as-far-as-I-can-see calls of element()
on single-character character element names in brackpart().
element() always maps these to the character itself, and things
would be quite broken if it didn't --- should "[a]" match something
different than "a" does?  Besides, the shortcut path in brackpart()
wasn't doing this anyway, making it even more inconsistent.

Discussion: https://postgr.es/m/2845172.1613674385@sss.pgh.pa.us
Discussion: https://postgr.es/m/3220564.1613859619@sss.pgh.pa.us

Improve tab-completion for TRUNCATE.

Author: Kota Miyake
Reviewed-by: Muhammad Usama
Discussion: https://postgr.es/m/f5d30053d00dcafda3280c9e267ecb0f@oss.nttdata.com

doc: Mention PGDATABASE as supported by pgbench

PGHOST, PGPORT and PGUSER were already mentioned, but not PGDATABASE.
Like 5aaa584, backpatch down to 12.

Reported-by: Christophe Courtois
Discussion: https://postgr.es/m/161399398648.21711.15387267201764682579@wrigleys.postgresql.org
Backpatch-through: 12

Use full 64-bit XIDs in deleted nbtree pages.

Otherwise we risk "leaking" deleted pages by making them non-recyclable
indefinitely.  Commit 6655a729 did the same thing for deleted pages in
GiST indexes.  That work was used as a starting point here.

Stop storing an XID indicating the oldest bpto.xact across all deleted
though unrecycled pages in nbtree metapages.  There is no longer any
reason to care about that condition/the oldest XID.  It only ever made
sense when wraparound was something _bt_vacuum_needs_cleanup() had to
consider.

The btm_oldest_btpo_xact metapage field has been repurposed and renamed.
It is now btm_last_cleanup_num_delpages, which is used to remember how
many non-recycled deleted pages remain from the last VACUUM (in practice
its value is usually the precise number of pages that were _newly
deleted_ during the specific VACUUM operation that last set the field).

The general idea behind storing btm_last_cleanup_num_delpages is to use
it to give _some_ consideration to non-recycled deleted pages inside
_bt_vacuum_needs_cleanup() -- though never too much.  We only really
need to avoid leaving a truly excessive number of deleted pages in an
unrecycled state forever.  We only do this to cover certain narrow cases
where no other factor makes VACUUM do a full scan, and yet the index
continues to grow (and so actually misses out on recycling existing
deleted pages).

These metapage changes result in a clear user-visible benefit: We no
longer trigger full index scans during VACUUM operations solely due to
the presence of only 1 or 2 known deleted (though unrecycled) blocks
from a very large index.  All that matters now is keeping the costs and
benefits in balance over time.

Fix an issue that has been around since commit 857f9c36, which added the
"skip full scan of index" mechanism (i.e. the _bt_vacuum_needs_cleanup()
logic).  The accuracy of btm_last_cleanup_num_heap_tuples accidentally
hinged upon _when_ the source value gets stored.  We now always store
btm_last_cleanup_num_heap_tuples in btvacuumcleanup().  This fixes the
issue because IndexVacuumInfo.num_heap_tuples (the source field) is
expected to accurately indicate the state of the table _after_ the
VACUUM completes inside btvacuumcleanup().

A backpatchable fix cannot easily be extracted from this commit.  A
targeted fix for the issue will follow in a later commit, though that
won't happen today.

I (pgeoghegan) have chosen to remove any mention of deleted pages in the
documentation of the vacuum_cleanup_index_scale_factor GUC/param, since
the presence of deleted (though unrecycled) pages is no longer of much
concern to users.  The vacuum_cleanup_index_scale_factor description in
the docs now seems rather unclear in any case, and it should probably be
rewritten in the near future.  Perhaps some passing mention of page
deletion will be added back at the same time.

Bump XLOG_PAGE_MAGIC due to nbtree WAL records using full XIDs now.

Author: Peter Geoghegan <[email protected]>
Reviewed-By: Masahiko Sawada <[email protected]>
Discussion: https://postgr.es/m/CAH2-WznpdHvujGUwYZ8sihX=d5u-tRYhi-F4wnV2uN2zHpMUXw@mail.gmail.com

Fix relcache reference leak introduced by ce0fdbfe97.

Author: Sawada Masahiko
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAD21AoA7ZEfsOXQ9HQqMv3QYGsEm2H5Wk5ic5S=mvzDf-3a3SA@mail.gmail.com

Fix some typos, grammar and style in docs and comments

The portions fixing the documentation are backpatched where needed.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20210210235557 [email protected]
backpatch-through: 9.6

Message style fix

Don't quote type name placeholders.

doc: Improve description of wal_receiver_status_interval

This parameter description was previously confusing, telling that a
value of 0 disabled completely status updates. This is not true as
there are cases where an update is sent while ignoring this parameter
value. The documentation is improved to outline the difference of
treatment for scheduled status messages and when these are forced.

Reported-by: Dmitriy Kuzmin
Author: Michael Paquier
Reviewed-by: Euler Taveira
Discussion: https://postgr.es/m/161346024420.3455.1345266601055047937@wrigleys.postgresql.org

Fix confusion in comments about generate_gather_paths

d2d8a229bc58 introduced a new function generate_useful_gather_paths to
be used as a replacement for generate_gather_paths, but forgot to update
a couple of places that referenced the older function.

This is possibly not 100% complete (ref. create_ordered_paths), but it's
better than not changing anything.

Author: "Hou, Zhijie" <[email protected]>
Reviewed-by: Tomas Vondra <[email protected]>
Discussion: https://postgr.es/m/4ce1d5116fe746a699a6d29858c6a39a@G08CNEXMBPEKD05.g08.fujitsu.local

Reinstate HEAP_XMAX_LOCK_ONLY|HEAP_KEYS_UPDATED as allowed

Commit 866e24d47db1 added an assert that HEAP_XMAX_LOCK_ONLY and
HEAP_KEYS_UPDATED cannot appear together, on the faulty assumption that
the latter necessarily referred to an update and not a tuple lock; but
that's wrong, because SELECT FOR UPDATE can use precisely that
combination, as evidenced by the amcheck test case added here.

Remove the Assert(), and also patch amcheck's verify_heapam.c to not
complain if the combination is found. Also, out of overabundance of
caution, update (across all branches) README.tuplock to be more explicit
about this.

Author: Julien Rouhaud <[email protected]>
Reviewed-by: Mahendra Singh Thalor <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>
Discussion: https://postgr.es/m/20210124061758.GA11756@nol

Suppress compiler warning in new regex match-all detection code.

gcc 10 is smart enough to notice that control could reach this
"hasmatch[depth]" assignment with depth < 0, but not smart enough
to know that that would require a badly broken NFA graph. Change
the assert() to a plain runtime test to shut it up.

Per report from Andres Freund.

Discussion: https://postgr.es/m/20210223173437 [email protected]

VACUUM: ignore indexing operations with CONCURRENTLY

As envisioned in commit c98763bf51bf, it is possible for VACUUM to
ignore certain transactions that are executing CREATE INDEX CONCURRENTLY
and REINDEX CONCURRENTLY for the purposes of computing Xmin; that's
because we know those transactions are not going to examine any other
tables, and are not going to execute anything else in the same
transaction. (Only operations on "safe" indexes can be ignored: those
on indexes that are neither partial nor expressional).

This is extremely useful in cases where CIC/RC can run for a very long
time, because that used to be a significant headache for concurrent
vacuuming of other tables.

Reviewed-by: Matthias van de Meent <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Discussion: https://postgr.es/m/20210115133858 [email protected]

Simplify printing of LSNs

Add a macro LSN_FORMAT_ARGS for use in printf-style printing of LSNs.
Convert all applicable code to use it.

Reviewed-by: Ashutosh Bapat <[email protected]>
Reviewed-by: Kyotaro Horiguchi <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CAExHW5ub5NaTELZ3hJUCE6amuvqAtsSxc7O+uK7y4t9Rrk23cw@mail.gmail.com

Fix an oversight in ReorderBufferFinishPrepared.

We don't have anything to decode in a transaction if ReorderBufferTXN
doesn't exist by the time we decode the commit prepared. So don't create a
new ReorderBufferTXN here. This is an oversight in commit a271a1b5.

Reported-by: Markus Wanner
Discussion: https://postgr.es/m/dbec82e2-dbd7-95a2-c6b6-e488cbbdf853@bluegap.ch

Change the error message for logical replication authentication failure.

The authentication failure error message wasn't distinguishing whether
it is a physical replication or logical replication connection failure and
was giving incomplete information on what led to failure in case of logical
replication connection.

Author: Paul Martinez and Amit Kapila
Reviewed-by: Euler Taveira and Amit Kapila
Discussion: https://postgr.es/m/CACqFVBYahrAi2OPdJfUA3YCvn3QMzzxZdw0ibSJ8wouWeDtiyQ@mail.gmail.com

Remove pointless HeapTupleHeaderIndicatesMovedPartitions calls

Pavan Deolasee recently noted that a few of the
HeapTupleHeaderIndicatesMovedPartitions calls added by commit
5db6df0c0117 are useless, since they are done after comparing t_self
with t_ctid. But because t_self can never be set to the magical values
that indicate that the tuple moved partition, this can never succeed: if
the first test fails (so we know t_self equals t_ctid), necessarily the
second test will also fail.

So these checks can be removed and no harm is done. There's no bug
here, just a code legibility issue.

Reported-by: Pavan Deolasee <[email protected]>
Discussion: https://postgr.es/m/20200929164411 [email protected]

Fix typo

Fix docs build for website styles

Building the docs with STYLE=website referenced a stylesheet that long
longer exists on the website, since we changed it to use versioned
references.

To make it less likely for this to happen again, point to a single
stylesheet on the website which will in turn import the required one.
That puts the process entirely within the scope of the website
repository, so next time a version is switched that's the only place
changes have to be made, making them less likely to be missed.

Per (off-list) discussion with Peter Geoghegan and Jonathan Katz.

Tab-complete CREATE COLLATION.

Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com

Refactor get_collation_current_version().

The code paths for three different OSes finished up with three different
ways of excluding C[.xxx] and POSIX from consideration. Merge them.

Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com

pg_collation_actual_version() -> pg_collation_current_version().

The new name seems a bit more natural.

Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com

Hide internal error for pg_collation_actual_version(<bad OID>).

Instead of an unsightly internal "cache lookup failed" message, just
return NULL for bad OIDs, as is the convention for other similar things.

Reported-by: Justin Pryzby <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/20210117215940.GE8560%40telsasoft.com

Initialize atomic variable waitStart in PGPROC, at postmaster startup.

Commit 46d6e5f567 added the atomic variable "waitStart" into PGPROC struct,
to store the time at which wait for lock acquisition started. Previously
this variable was initialized every time each backend started. Instead,
this commit makes postmaster initialize it at the startup, to ensure that
the variable should be initialized before any use of it.

This commit also moves the code to initialize "waitStart" variable for
prepare transaction, from TwoPhaseGetDummyProc() to MarkAsPreparingGuts().
Because MarkAsPreparingGuts() is more proper place to do that since
it initializes other PGPROC variables.

Author: Fujii Masao
Reviewed-by: Atsushi Torikoshi
Discussion: https://postgr.es/m/1df88660-6f08-cc6e-b7e2-f85296a2bdab@oss.nttdata.com

Improve new hash partition bound check error messages

For the error message "every hash partition modulus must be a factor
of the next larger modulus", add a detail message that shows the
particular numbers and existing partition involved. Also comment the
code more.

Reviewed-by: Amit Langote <[email protected]>
Reviewed-by: Heikki Linnakangas <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/bb9d60b4-aadb-607a-1a9d-fdc3434dddcd%40enterprisedb.com

Use pgstat_progress_update_multi_param() where possible

This commit changes one code path in REINDEX INDEX and one code path
in CREATE INDEX CONCURRENTLY to report the progress of each operation
using pgstat_progress_update_multi_param() rather than
multiple calls to pgstat_progress_update_param(). This has the
advantage to make the progress report more consistent to the end-user
without impacting the amount of information provided.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACV5zW7GxD8D_tyO==bcj6ZktQchEKWKPBOAGKiLhAQo=w@mail.gmail.com

Remove outdated reference to RAID spindles.

Commit b09ff536 left behind some outdated advice in the long_desc field
of the GUC "effective_io_concurrency". Remove it.

Back-patch to 13.

Reported-by: Andrew Gierth <[email protected]>
Reviewed-by: Julien Rouhaud <[email protected]>
Discussion: https://postgr.es/m/CA%2BhUKGJyyWqFBxL9gEj-qtjBThGjhAOBE8GBnF8MUJOJ3vrfag%40mail.gmail.com

Simplify memory management for regex DFAs a little.

Coverity complained that functions in regexec.c might leak DFA
storage. It's wrong, but this logic is confusing enough that it's
not so surprising Coverity couldn't make sense of it. Rewrite
in hopes of making it more legible to humans as well as machines.

Fix invalid array access in trgm_regexp.c.

Brown-paper-bag bug in 08c0d6ad6: I missed one place that needed
to guard against RAINBOW arc colors. Remarkably, nothing noticed
the invalid array access except buildfarm member thorntail.

Thanks to Noah Misch for assistance with tracking this down.

Avoid generating extra subre tree nodes for capturing parentheses.

Previously, each pair of capturing parentheses gave rise to a separate
subre tree node, whose only function was to identify that we ought to
capture the match details for this particular sub-expression.  In
most cases we don't really need that, since we can perfectly well
put a "capture this" annotation on the child node that does the real
matching work.  As with the two preceding commits, the main value
of this is to avoid generating and optimizing an NFA for a tree node
that's not really pulling its weight.

The chosen data representation only allows one capture annotation
per subre node.  In the legal-per-spec, but seemingly not very useful,
case where there are multiple capturing parens around the exact same
bit of the regex (i.e. "((xyz))"), wrap the child node in N-1 capture
nodes that act the same as before.  We could work harder at that but
I'll refrain, pending some evidence that such cases are worth troubling
over.

In passing, improve the comments in regex.h to say what all the
different re_info bits mean.  Some of them were pretty obvious
but others not so much, so reverse-engineer some documentation.

This is part of a patch series that in total reduces the regex engine's
runtime by about a factor of four on a large corpus of real-world regexes.

Patch by me, reviewed by Joel Jacobson

Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

Convert regex engine's subre tree from binary to N-ary style.

Instead of having left and right child links in subre structs,
have a single child link plus a sibling link.  Multiple children
of a tree node are now reached by chasing the sibling chain.

The beneficiary of this is alternation tree nodes.  A regular
expression with N (>1) branches is now represented by one alternation
node with N children, rather than a tree that includes N alternation
nodes as well as N children.  While the old representation didn't
really cost anything extra at execution time, it was pretty horrid
for compilation purposes, because each of the alternation nodes had
its own NFA, which we were too stupid not to separately optimize.
(To make matters worse, all of those NFAs described the entire
alternation pattern, not just the portion of it that one might
expect from the tree structure.)

We continue to require concatenation nodes to have exactly two
children.  This data structure is now prepared to support more,
but the executor's logic would need some careful redesign, and
it's not clear that a lot of benefit could be had.

This is part of a patch series that in total reduces the regex engine's
runtime by about a factor of four on a large corpus of real-world regexes.

Patch by me, reviewed by Joel Jacobson

Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

Fix regex engine to suppress useless concatenation sub-REs.

The comment for parsebranch() claims that it avoids generating
unnecessary concatenation nodes in the "subre" tree, but it missed
some significant cases.  Once we've decided that a given atom is
"messy" and can't be bundled with the preceding atom(s) of the
current regex branch, parseqatom() always generated two new concat
nodes, one to concat the messy atom to what follows it in the branch,
and an upper node to concatenate the preceding part of the branch
to that one.  But one or both of these could be unnecessary, if the
messy atom is the first, last, or only one in the branch.  Improve
the code to suppress such useless concat nodes, along with the
no-op child nodes representing empty chunks of a branch.

Reducing the number of subre tree nodes offers significant savings
not only at execution but during compilation, because each subre node
has its own NFA that has to be separately optimized.  (Maybe someday
we'll figure out how to share the optimization work across multiple
tree nodes, but it doesn't look easy.)  Eliminating upper tree nodes
is especially useful because they tend to have larger NFAs.

This is part of a patch series that in total reduces the regex engine's
runtime by about a factor of four on a large corpus of real-world regexes.

Patch by me, reviewed by Joel Jacobson

Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

Recognize "match-all" NFAs within the regex engine.

This builds on the previous "rainbow" patch to detect NFAs that will
match any string, though possibly with constraints on the string length.
This definition is chosen to match constructs such as ".*", ".+", and
".{1,100}".  Recognizing such an NFA after the optimization pass is
fairly cheap, since we basically just have to verify that all arcs
are RAINBOW arcs and count the number of steps to the end state.
(Well, there's a bit of complication with pseudo-color arcs for string
boundary conditions, but not much.)

Once we have these markings, the regex executor functions longest(),
shortest(), and matchuntil() don't have to expend per-character work
to determine whether a given substring satisfies such an NFA; they
just need to check its length against the bounds.  Since some matching
problems require O(N) invocations of these functions, we've reduced
the runtime for an N-character string from O(N^2) to O(N).  Of course,
this is no help for non-matchall sub-patterns, but those usually have
constraints that allow us to avoid needing O(N) substring checks in the
first place.  It's precisely the unconstrained "match-all" cases that
cause the most headaches.

This is part of a patch series that in total reduces the regex engine's
runtime by about a factor of four on a large corpus of real-world regexes.

Patch by me, reviewed by Joel Jacobson

Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

Invent "rainbow" arcs within the regex engine.

Some regular expression constructs, most notably the "." match-anything
metacharacter, produce a sheaf of parallel NFA arcs covering all
possible colors (that is, character equivalence classes).  We can make
a noticeable improvement in the space and time needed to process large
regexes by replacing such cases with a single arc bearing the special
color code "RAINBOW".  This requires only minor additional complication
in places such as pull() and push().

Callers of pg_reg_getoutarcs() must now be prepared for the possibility
of seeing a RAINBOW arc.  For the one known user, contrib/pg_trgm,
that's a net benefit since it cuts the number of arcs to be dealt with,
and the handling isn't any different than for other colors that contain
too many characters to be dealt with individually.

This is part of a patch series that in total reduces the regex engine's
runtime by about a factor of four on a large corpus of real-world regexes.

Patch by me, reviewed by Joel Jacobson

Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

doc: Mention that partitions_{done,total} is 0 for REINDEX progress reports

REINDEX has recently gained support for partitions, so it can be
confusing to see those fields not being set. Making useful reports for
for such relations is more complicated than it looks with the current
set of columns available in pg_stat_progress_create_index, and this
touches equally REINDEX DATABASE/SYSTEM/SCHEMA. This commit documents
that those two columns are not touched during a REINDEX.

Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20210216064214 [email protected]

Fix inconsistent configure data for --with-ssl

This inconsistency was showing up after an autoreconf.

Reported-by: Antonin Houska
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/47255.1613716807@antos

Fix psql's ON_ERROR_ROLLBACK so that it handles COMMIT AND CHAIN.

When ON_ERROR_ROLLBACK is enabled, psql releases a temporary savepoint
if it's idle in a valid transaction block after executing a query. But psql
doesn't do that after RELEASE or ROLLBACK is executed because a temporary
savepoint has already been destroyed in that case.

This commit changes psql's ON_ERROR_ROLLBACK so that it doesn't release
a temporary savepoint also when COMMIT AND CHAIN is executed. A temporary
savepoint doesn't need to be released in that case because
COMMIT AND CHAIN also destroys any savepoints defined within the transaction
to commit. Otherwise psql tries to release the savepoint that
COMMIT AND CHAIN has already destroyed and cause an error
"ERROR: savepoint "pg_psql_temporary_savepoint" does not exist".

Back-patch to v12 where transaction chaining was added.

Reported-by: Arthur Nascimento
Author: Arthur Nascimento
Reviewed-by: Fujii Masao, Vik Fearing
Discussion: https://postgr.es/m/16867-3475744069228158@postgresql.org

Fix bug in COMMIT AND CHAIN command.

This commit fixes COMMIT AND CHAIN command so that it starts new transaction
immediately even if savepoints are defined within the transaction to commit.
Previously COMMIT AND CHAIN command did not in that case because
commit 280a408b48 forgot to make CommitTransactionCommand() handle
a transaction chaining when the transaction state was TBLOCK_SUBCOMMIT.

Also this commit adds the regression test for COMMIT AND CHAIN command
when savepoints are defined.

Back-patch to v12 where transaction chaining was added.

Reported-by: Arthur Nascimento
Author: Fujii Masao
Reviewed-by: Arthur Nascimento, Vik Fearing
Discussion: https://postgr.es/m/16867-3475744069228158@postgresql.org

Update snowball

Update to snowball tag v2.1.0. Major changes are new stemmers for
Armenian, Serbian, and Yiddish.

Add nbtree README section on page recycling.

Consolidate discussion of how VACUUM places pages in the FSM for
recycling by adding a new section that comes after discussion of page
deletion.  This structure reflects the fact that page recycling is
explicitly decoupled from page deletion in Lanin & Shasha's paper.  Page
recycling in nbtree is an implementation of what the paper calls "the
drain technique".

This decoupling is an important concept for nbtree VACUUM.  Searchers
have to detect and recover from concurrent page deletions, but they will
never have to reason about concurrent page recycling.  Recycling can
almost always be thought of as a low level garbage collection operation
that asynchronously frees the physical space that backs a logical tree
node.  Almost all code need only concern itself with logical tree nodes.
(Note that "logical tree node" is not currently a term of art in the
nbtree code -- this all works implicitly.)

This is preparation for an upcoming patch that teaches nbtree VACUUM to
remember the details of pages that it deletes on the fly, in local
memory.  This enables the same VACUUM operation to consider placing its
own deleted pages in the FSM later on, when it reaches the end of
btvacuumscan().

Fix another ancient bug in parsing of BRE-mode regular expressions.

While poking at the regex code, I happened to notice that the bug
squashed in commit afcc8772e had a sibling: next() failed to return
a specific value associated with the '}' token for a "\{m,n\}"
quantifier when parsing in basic RE mode.  Again, this could result
in treating the quantifier as non-greedy, which it never should be in
basic mode.  For that to happen, the last character before "\}" that
sets "nextvalue" would have to set it to zero, or it'd have to have
accidentally been zero from the start.  The failure can be provoked
repeatably with, for example, a bound ending in digit "0".

Like the previous patch, back-patch all the way.

Fix "invalid spinlock number: 0" error in pg_stat_wal_receiver.

Commit 2c8dd05d6c added the atomic variable writtenUpto into
walreceiver's shared memory information. It's initialized only
when walreceiver started up but could be read via pg_stat_wal_receiver
view anytime, i.e., even before it's initialized. In the server built
with --disable-atomics and --disable-spinlocks, this uninitialized
atomic variable read could cause "invalid spinlock number: 0" error.

This commit changed writtenUpto so that it's initialized at
the postmaster startup, to avoid the uninitialized variable read
via pg_stat_wal_receiver and fix the error.

Also this commit moved the read of writtenUpto after the release
of spinlock protecting walreceiver's shared variables. This is
necessary to prevent new spinlock from being taken by atomic
variable read while holding another spinlock, and to shorten
the spinlock duration. This change leads writtenUpto not to be
consistent with the other walreceiver's shared variables protected
by a spinlock. But this is OK because writtenUpto should not be
used for data integrity checks.

Back-patch to v13 where commit 2c8dd05d6c introduced the bug.

Author: Fujii Masao
Reviewed-by: Michael Paquier, Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/7ef8708c-5b6b-edd3-2cf2-7783f1c7c175@oss.nttdata.com

Add tests for bytea LIKE operator

Add test coverage for the following operations, which were previously
not tested at all:

bytea LIKE bytea (bytealike)
bytea NOT LIKE bytea (byteanlike)
ESCAPE clause for the above (like_escape_bytea)

also

name NOT ILIKE text (nameicnlike)

Discussion: https://www.postgresql.org/message-id/flat/4d13563a-2c8d-fd91-20d5-e71b7a4eaa87%40enterprisedb.com

Allow specifying CRL directory

Add another method to specify CRLs, hashed directory method, for both
server and client side. This offers a means for server or libpq to
load only CRLs that are required to verify a certificate. The CRL
directory is specifed by separate GUC variables or connection options
ssl_crl_dir and sslcrldir, alongside the existing ssl_crl_file and
sslcrl, so both methods can be used at the same time.

Author: Kyotaro Horiguchi <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/20200731.173911.904649928639357911 [email protected]

nbtree README: move VACUUM linear scan section.

Discuss VACUUM's linear scan after discussion of tuple deletion by
VACUUM, but before discussion of page deletion by VACUUM. This
progression is a lot more natural.

Also tweak the wording a little. It seems unnecessary to talk about how
it worked prior to PostgreSQL 8.2.

Fix tuple routing to initialize batching only for inserts

A cross-partition update on a partitioned table is implemented as a
delete followed by an insert. With foreign partitions, this was however
causing issues, because the FDW and core may disagree on when to enable
batching. postgres_fdw was only allowing batching for plain inserts
(CMD_INSERT) while core was trying to batch the insert component of the
cross-partition update. Fix by restricting core to apply batching only
to plain CMD_INSERT queries.

It's possible to allow batching for cross-partition updates, but that
will require more extensive changes, so better to leave that for a
separate patch.

Author: Amit Langote
Reviewed-by: Tomas Vondra, Takayuki Tsunakawa
Discussion: https://postgr.es/m/20200628151002.7x5laxwpgvkyiu3q@development

Fix pointer type in ExecForeignBatchInsert SGML docs

Reported-by: Ian Barwick
Discussion: https://postgr.es/m/20200628151002.7x5laxwpgvkyiu3q@development

Make some minor improvements in the regex code.

Push some hopefully-uncontroversial bits extracted from an upcoming
patch series, to remove non-relevant clutter from the main patches.

In compact(), return immediately after setting REG_ASSERT error;
continuing the loop would just lead to assertion failure below.
(Ask me how I know.)

In parseqatom(), remove assertion that moresubs() did its job.
When moresubs actually did its job, this is redundant with that
function's final assert; but when it failed on OOM, this is an
assertion crash.  We could avoid the crash by adding a NOERR()
check before the assertion, but it seems better to subtract code
than add it.  (Note that there's a NOERR exit a few lines further
down, and nothing else between here and there requires moresubs
to have succeeded.  So we don't really need an extra error exit.)
This is a live bug in assert-enabled builds, but given the very
low likelihood of OOM in moresub's tiny allocation, I don't think
it's worth back-patching.

On the other hand, it seems worthwhile to add an assertion that
our intended v->subs[subno] target is still null by the time we
are ready to insert into it, since there's a recursion in between.

In pg_regexec, ensure we fflush any debug output on the way out,
and try to make MDEBUG messages more uniform and helpful.  (In
particular, ensure that all of them are prefixed with the subre's
id number, so one can match up entry and exit reports.)

Add some test cases in test_regex to improve coverage of lookahead
and lookbehind constraints.  Adding these now is mainly to establish
that this is indeed the existing behavior.

Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us

Routine usage information schema tables

Several information schema views track dependencies between
functions/procedures and objects used by them.  These had not been
implemented so far because PostgreSQL doesn't track objects used in a
function body.  However, formally, these also show dependencies used
in parameter default expressions, which PostgreSQL does support and
track.  So for the sake of completeness, we might as well add these.
If dependency tracking for function bodies is ever implemented, these
views will automatically work correctly.

Reviewed-by: Erik Rijkers <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/ac80fc74-e387-8950-9a31-2560778fc1e3%40enterprisedb.com

Fix typo

Author: Daniel Gustafsson <[email protected]>
Discussion: https://postgr.es/m/0CF087FC-BEAD-4010-8BB9-3CDD74DC9060@yesql.se

Use errmsg_internal for debug messages

An inconsistent set of debug-level messages was not using
errmsg_internal(), thus uselessly exposing the messages to translation
work. Fix those.

Add psql completion for [ NO ] DEPENDS ON EXTENSION

ALTER INDEX was able to handle that already. This adds tab completion
for all the remaining commands that support this grammar:
- ALTER FUNCTION
- ALTER PROCEDURE
- ALTER ROUTINE
- ALTER TRIGGER
- ALTER MATERIALIZED VIEW

Author: Ian Lawrence Barwick
Discussion: https://postgr.es/m/CAB8KJ=iypYudXuMOAMOP4BpkaYbXxk=a2cdJppX0e9mJXWtuig@mail.gmail.com

Convert tsginidx.c's GIN indexing logic to fully ternary operation.

Commit 2f2007fbb did this partially, but there were two remaining
warts.  checkcondition_gin handled some uncertain cases by setting
the out-of-band recheck flag, some by returning TS_MAYBE, and some
by doing both.  Meanwhile, TS_execute arbitrarily converted a
TS_MAYBE result to TS_YES.  Thus, if checkcondition_gin chose to
only return TS_MAYBE, the outcome would be TS_YES with no recheck
flag, potentially resulting in wrong query outputs.

The case where this'd happen is if there were GIN_MAYBE entries
in the indexscan results passed to gin_tsquery_[tri]consistent,
which so far as I can see would only happen if the tidbitmap used
to accumulate indexscan results grew large enough to become lossy.

I initially thought of fixing this by ensuring we always set the
recheck flag as well as returning TS_MAYBE in uncertain cases.
But that errs in the other direction, potentially forcing rechecks
of rows that provably match the query (since the recheck flag
remains set even if TS_execute later finds that the answer must be
TS_YES).  Instead, let's get rid of the out-of-band recheck flag
altogether and rely on returning TS_MAYBE.  This requires exporting
a version of TS_execute that will actually return the full ternary
result of the evaluation ... but we likely should have done that
to start with.

Unfortunately it doesn't seem practical to add a regression test case
that covers this: the amount of data needed to cause the GIN bitmap to
become lossy results in a longer runtime than I think we want to have
in the tests.  (I'm wondering about allowing smaller work_mem settings
to ameliorate that, but it'd be a matter for a separate patch.)

Per bug #16865 from Dimitri Nüscheler.  Back-patch to v13 where
the faulty commit came in.

Discussion: https://postgr.es/m/16865-4ffdc3e682e6d75b@postgresql.org

Remove the unnecessary PrepareWrite in pgoutput.

This issue exists from the inception of this code (PG-10) but got exposed
by the recent commit ce0fdbfe97 where we are using origins in tablesync
workers. The problem was that we were sometimes sending the prepare_write
('w') message but then the actual message was not being sent and on the
subscriber side, we always expect a message after prepare_write message
which led to this bug.

I refrained from backpatching this because there is no way in the core
code to hit this prior to commit ce0fdbfe97 and we haven't received any
complaints so far.

Reported-by: Erik Rijkers
Author: Amit Kapila and Vignesh C
Tested-by: Erik Rijkers
Discussion: https://postgr.es/m/1295168140.139428.1613133237154@webmailclassic.xs4all.nl

Fix heap_page_prune() parameter order confusion introduced in dc7420c2c92.

Both luckily and unluckily the passed values meant the same for all
types. Luckily because that meant my confusion caused no harm,
unluckily because otherwise the compiler might have warned...

In passing, synchronize parameter names between definition and
declaration.

Reported-By: Peter Geoghegan <[email protected]>
Author: Andres Freund <[email protected]>
Discussion: https://postgr.es/m/CAH2-Wz=L=nBoepQdH9b5Qd0nMvepFT2CnT6sjWvvpOXa=K8HVQ@mail.gmail.com

Remove backwards compat ugliness in snapbuild.c.

In 955a684e040 we fixed a bug in initial snapshot creation. In the
course of which several members of struct SnapBuild were obsoleted. As
SnapBuild is serialized to disk we couldn't change the memory layout.

Unfortunately I subsequently forgot about removing the backward compat
gunk, but luckily Heikki just reminded me.

This commit bumps SNAPBUILD_VERSION, therefore breaking existing
slots (which is fine in a major release).

Author: Andres Freund
Reminded-By: Heikki Linnakangas <[email protected]>
Discussion: https://postgr.es/m/c94be044-818f-15e3-1ad3-7a7ae2dfed0a@iki.fi

Simplify loop logic in nodeIncrementalSort.c.

The inner loop in switchToPresortedPrefixMode() can be implemented
as a conventional integer-counter for() loop, removing a couple of
redundant boolean state variables. The old logic here was a remnant
of earlier development, but as things now stand there's no reason
for extra complexity.

Also, annotate the test case added by 82e0e2930 to explain why it
manages to hit the corner case fixed in that commit, and add an
EXPLAIN to verify that it's creating an incremental-sort plan.

Back-patch to v13, like the previous patch.

James Coleman and Tom Lane

Discussion: https://postgr.es/m/16846-ae49f51ac379a4cb@postgresql.org

Make ExecGetInsertedCols() and friends more robust and improve comments.

If ExecGetInsertedCols(), ExecGetUpdatedCols() or ExecGetExtraUpdatedCols()
were called with a ResultRelInfo that's not in the range table and isn't a
partition routing target, the functions would dereference a NULL pointer,
relinfo->ri_RootResultRelInfo. Such ResultRelInfos are created when firing
RI triggers in tables that are not modified directly. None of the current
callers of these functions pass such relations, so this isn't a live bug,
but let's make them more robust.

Also update comment in ResultRelInfo; after commit 6214e2b228,
ri_RangeTableIndex is zero for ResultRelInfos created for partition tuple
routing.

Noted by Coverity. Backpatch down to v11, like commit 6214e2b228.

Reviewed-by: Tom Lane, Amit Langote

Display the time when the process started waiting for the lock, in pg_locks, take 2

This commit adds new column "waitstart" into pg_locks view. This column
reports the time when the server process started waiting for the lock
if the lock is not held. This information is useful, for example, when
examining the amount of time to wait on a lock by subtracting
"waitstart" in pg_locks from the current time, and identify the lock
that the processes are waiting for very long.

This feature uses the current time obtained for the deadlock timeout
timer as "waitstart" (i.e., the time when this process started waiting
for the lock). Since getting the current time newly can cause overhead,
we reuse the already-obtained time to avoid that overhead.

Note that "waitstart" is updated without holding the lock table's
partition lock, to avoid the overhead by additional lock acquisition.
This can cause "waitstart" in pg_locks to become NULL for a very short
period of time after the wait started even though "granted" is false.
This is OK in practice because we can assume that users are likely to
look at "waitstart" when waiting for the lock for a long time.

The first attempt of this patch (commit 3b733fcd04) caused the buildfarm
member "rorqual" (built with --disable-atomics --disable-spinlocks) to report
the failure of the regression test. It was reverted by commit 890d2182a2.
The cause of this failure was that the atomic variable for "waitstart"
in the dummy process entry created at the end of prepare transaction was
not initialized. This second attempt fixes that issue.

Bump catalog version.

Author: Atsushi Torikoshi
Reviewed-by: Ian Lawrence Barwick, Robert Haas, Justin Pryzby, Fujii Masao
Discussion: https://postgr.es/m/a96013dc51cdc56b2a2b84fa8a16a993@oss.nttdata.com

Add "LP_DEAD item?" column to GiST pageinspect functions

This brings gist_page_items() and gist_page_items_bytea() in line with
nbtree's bt_page_items() function.

Minor follow-up to commit 756ab291, which added the GiST functions.

Author: Andrey Borodin <[email protected]>
Discussion: https://postgr.es/m/E0794687-7315-4C29-A9C7-EC54D448596D@yandex-team.ru

Avoid misinterpreting GiST pages in pageinspect.

GistPageSetDeleted() sets pd_lower when deleting a page, and sets the
page contents to a GISTDeletedPageContents. Avoid treating deleted GiST
pages as regular slotted pages within pageinspect.

Oversight in commit 756ab291.

Author: Andrey Borodin <[email protected]>

Adjust lazy_scan_heap() accounting comments.

Explain which particular LP_DEAD line pointers get accounted for by the
tups_vacuumed variable.

Default to wal_sync_method=fdatasync on FreeBSD.

FreeBSD 13 gained O_DSYNC, which would normally cause wal_sync_method to
choose open_datasync as its default value. That may not be a good
choice for all systems, and performs worse than fdatasync in some
scenarios. Let's preserve the existing default behavior for now.

Like commit 576477e73c4, which did the same for Linux, back-patch to all
supported releases.

Discussion: https://postgr.es/m/CA%2BhUKGLsAMXBQrCxCXoW-JsUYmdOL8ALYvaX%3DCrHqWxm-nWbGA%40mail.gmail.com

Use pg_pwrite() in pg_test_fsync.

For consistency with the PostgreSQL behavior this test program is
intended to simulate, use pwrite() instead of lseek() + write().

Also fix the final "non-sync" test, which was opening and closing the
file for every write.

Discussion: https://postgr.es/m/CA%2BhUKGJjjid2BJsvjMALBTduo1ogdx2SPYaTQL3wAy8y2hc4nw%40mail.gmail.com

Fix the warnings introduced in commit ce0fdbfe97.

Author: Amit Kapila
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/1610789.1613170207@sss.pgh.pa.us

Hold interrupts while running dsm_detach() callbacks.

While cleaning up after a parallel query or parallel index creation that
created temporary files, we could be interrupted by a statement timeout.
The error handling path would then fail to clean up the files when it
ran dsm_detach() again, because the callback was already popped off the
list. Prevent this hazard by holding interrupts while the cleanup code
runs.

Thanks to Heikki Linnakangas for this suggestion, and also to Kyotaro
Horiguchi, Masahiko Sawada, Justin Pryzby and Tom Lane for discussion of
this and earlier ideas on how to fix the problem.

Back-patch to all supported releases.

Reported-by: Justin Pryzby <[email protected]>
Discussion: https://postgr.es/m/20191212180506 [email protected]

Add result size as argument of pg_cryptohash_final() for overflow checks

With its current design, a careless use of pg_cryptohash_final() could
would result in an out-of-bound write in memory as the size of the
destination buffer to store the result digest is not known to the
cryptohash internals, without the caller knowing about that.  This
commit adds a new argument to pg_cryptohash_final() to allow such sanity
checks, and implements such defenses.

The internals of SCRAM for HMAC could be tightened a bit more, but as
everything is based on SCRAM_KEY_LEN with uses particular to this code
there is no need to complicate its interface more than necessary, and
this comes back to the refactoring of HMAC in core.  Except that, this
minimizes the uses of the existing DIGEST_LENGTH variables, relying
instead on sizeof() for the result sizes.  In ossp-uuid, this also makes
the code more defensive, as it already relied on dce_uuid_t being at
least the size of a MD5 digest.

This is in philosophy similar to cfc40d3 for base64.c and aef8948 for
hex.c.

Reported-by: Ranier Vilela
Author: Michael Paquier, Ranier Vilela
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/CAEudQAoqEGmcff3J4sTSV-R_16Monuz-UpJFbf_dnVH=APr02Q@mail.gmail.com

Minor fixes to improve regex debugging code.

When REG_DEBUG is defined, ensure that an un-filled "struct cnfa"
is all-zeroes, not just that it has nstates == 0. This is mainly
so that looking at "struct subre" structs in gdb doesn't distract
one with a lot of garbage fields during regex compilation.

Adjust some places that print debug output to have suitable fflush
calls afterwards.

In passing, correct an erroneous ancient comment: the concatenation
subre-s created by parsebranch() have op == '.' not ','.

Noted while fooling around with some regex performance improvements.

ReadNewTransactionId() -> ReadNextTransactionId().

The new name conveys the effect better, is more consistent with similar
functions ReadNextMultiXactId(), ReadNextFullTransactionId(), and
matches the name of the variable that it reads.

Reported-by: Peter Geoghegan <[email protected]>
Discussion: https://postgr.es/m/CAH2-WzmVR4SakBXQUdhhPpMf1aYvZCnna5%3DHKa7DAgEmBAg%2B8g%40mail.gmail.com

README/C-comment: document GiST's NSN value

doc: Mention NO DEPENDS ON EXTENSION in its supported ALTER commands

This grammar flavor has been added by 5fc7039.

Author: Ian Lawrence Barwick
Discussion: https://postgr.es/m/CAB8KJ=ii6JScodxkA6-DO8bjatsMYU3OcewnL0mdN9geR+tTaw@mail.gmail.com
Backpatch-through: 13

Tweak compiler version cutoff for no_sanitize("alignment") support.

Buildfarm results show that gcc up through 7.x produces annoying
warnings for this construct (and, presumably, wouldn't do the right
thing anyway). clang seems okay with the cutoff we have, though.

Discussion: https://postgr.es/m/CAPpHfdsne3%3DT%3DfMNU45PtxdhSL_J2PjLTeS8rwKnJzUR4YNd4w%40mail.gmail.com
Discussion: https://postgr.es/m/475514.1612745257%40sss.pgh.pa.us

Avoid divide-by-zero in regex_selectivity() with long fixed prefix.

Given a regex pattern with a very long fixed prefix (approaching 500
characters), the result of pow(FIXED_CHAR_SEL, fixed_prefix_len) can
underflow to zero.  Typically the preceding selectivity calculation
would have underflowed as well, so that we compute 0/0 and get NaN.
In released branches this leads to an assertion failure later on.
That doesn't happen in HEAD, for reasons I've not explored yet,
but it's surely still a bug.

To fix, just skip the division when the pow() result is zero, so
that we'll (most likely) return a zero selectivity estimate.  In
the edge cases where "sel" didn't yet underflow, perhaps this
isn't desirable, but I'm not sure that the case is worth spending
a lot of effort on.  The results of regex_selectivity_sub() are
barely worth the electrons they're written on anyway :-(

Per report from Alexander Lakhin.  Back-patch to all supported versions.

Discussion: https://postgr.es/m/6de0a0c3-ada9-cd0c-3e4e-2fa9964b41e3@gmail.com

pg_attribute_no_sanitize_alignment() macro

Modern gcc and clang compilers offer alignment sanitizers, which help to detect
pointer misalignment.  However, our codebase already contains x86-specific
crc32 computation code, which uses unalignment access.  Thankfully, those
compilers also support the attribute, which disables alignment sanitizers at
the function level.  This commit adds pg_attribute_no_sanitize_alignment(),
which wraps this attribute, and applies it to pg_comp_crc32c_sse42() function.

Discussion: https://postgr.es/m/CAPpHfdsne3%3DT%3DfMNU45PtxdhSL_J2PjLTeS8rwKnJzUR4YNd4w%40mail.gmail.com
Discussion: https://postgr.es/m/475514.1612745257%40sss.pgh.pa.us
Author: Alexander Korotkov, revised by Tom Lane
Reviewed-by: Tom Lane

Fix Subscription test added by commit ce0fdbfe97.

We want to test the variants of Alter Subscription that are not allowed in
the transaction block but for that, we don't need to create a subscription
that tries to connect to the publisher. As such, there is no problem with
this test but it is good to allow such tests to run with
wal_level = minimal and max_wal_senders = 0 so as to keep them consistent
with other tests.

Reported by buildfarm.

Author: Amit Kapila
Reviewed-by: Ajin Cherian
Discussion: https://postgr.es/m/CAA4eK1Lw0V+e1JPGHDq=+hVACv=14H8sR+2eJ1k3PEgwKmU-jQ@mail.gmail.com

Allow multiple xacts during table sync in logical replication.

For the initial table data synchronization in logical replication, we use
a single transaction to copy the entire table and then synchronize the
position in the stream with the main apply worker.

There are multiple downsides of this approach: (a) We have to perform the
entire copy operation again if there is any error (network breakdown,
error in the database operation, etc.) while we synchronize the WAL
position between tablesync worker and apply worker; this will be onerous
especially for large copies, (b) Using a single transaction in the
synchronization-phase (where we can receive WAL from multiple
transactions) will have the risk of exceeding the CID limit, (c) The slot
will hold the WAL till the entire sync is complete because we never commit
till the end.

This patch solves all the above downsides by allowing multiple
transactions during the tablesync phase. The initial copy is done in a
single transaction and after that, we commit each transaction as we
receive. To allow recovery after any error or crash, we use a permanent
slot and origin to track the progress. The slot and origin will be removed
once we finish the synchronization of the table. We also remove slot and
origin of tablesync workers if the user performs DROP SUBSCRIPTION .. or
ALTER SUBSCRIPTION .. REFERESH and some of the table syncs are still not
finished.

The commands ALTER SUBSCRIPTION ... REFRESH PUBLICATION and
ALTER SUBSCRIPTION ... SET PUBLICATION ... with refresh option as true
cannot be executed inside a transaction block because they can now drop
the slots for which we have no provision to rollback.

This will also open up the path for logical replication of 2PC
transactions on the subscriber side. Previously, we can't do that because
of the requirement of maintaining a single transaction in tablesync
workers.

Bump catalog version due to change of state in the catalog
(pg_subscription_rel).

Author: Peter Smith, Amit Kapila, and Takamichi Osumi
Reviewed-by: Ajin Cherian, Petr Jelinek, Hou Zhijie and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1KHJxaZS-fod-0fey=0tq3=Gkn4ho=8N4-5HWiCfu0H1A@mail.gmail.com

Remove obsolete IndexBulkDeleteResult stats field.

The pages_removed field is no longer used for anything. It hasn't been
possible for an index to physically shrink since old-style VACUUM FULL
was removed by commit 0a469c87.

Remove dead code in ECPGconnect(), and improve documentation.

The stanza in ECPGconnect() that intended to allow specification of a
Unix socket directory path in place of a port has never executed since
it was committed, nearly two decades ago; the preceding strrchr()
already found the last colon so there cannot be another one. The lack
of complaints about that is doubtless related to the fact that no
user-facing documentation suggested it was possible.

Rather than try to fix that up, let's just remove the unreachable
code, and instead document the way that does work to write a socket
directory path, namely specifying it as a "host" option.

In support of that, make another pass at clarifying the syntax
documentation for ECPG connection targets, particularly documenting
which things are parsed as identifiers and where to use double quotes.
Rearrange some things that seemed poorly ordered, and fix a couple of
minor doc errors.

Kyotaro Horiguchi, per gripe from Shenhao Wang
(docs changes mostly by me)

Discussion: https://postgr.es/m/ae52a416bbbf459c96bab30b3038e06c@G08CNEXMBPEKD06.g08.fujitsu.local

Simplify jsonfuncs.c code by using strtoint() not strtol().

Explicitly testing for INT_MIN and INT_MAX isn't particularly good
style; it's tedious and may draw useless compiler warnings on
machines where int and long are the same width. We invented
strtoint() precisely for this usage, so use that instead.

While here, remove gratuitous variations in the way the tests for
did-strtoint-succeed were spelled. Also, avoid attempting to
negate INT_MIN; that would probably work given that the result
is implicitly cast to uint32, but I think it's nominally undefined
behavior.

Per gripe from Ranier Vilela, though this isn't his proposed patch.

Discussion: https://postgr.es/m/CAEudQAqge3QfzoBRhe59QrB_5g+NmQUj2QpzqZ9Nc7QepXGAEw@mail.gmail.com

Remove no-longer-used RTE argument of markVarForSelectPriv().

In the wake of c028faf2a, this is no longer needed. I left it
out of that patch since the API change would be undesirable in
a released branch; but there's no reason not to do it in HEAD.

Fix copy-paste error with SHA256 digest length in checksum_helper.c

Issue introduced by 87ae969, noticed while working on the area. While
on it, fix some grammar in the surrounding static assertions.

Add test case for abbrev(cidr)

This will in particular add some good test coverage for
inet_cidr_ntop.c, which was previously completely uncovered.

Reviewed-by: Tom Lane <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/cb0c4662-4596-dab4-7f64-839c5e8582c8%40enterprisedb.com

Remove extra Success message at the end of initdb

This was accidentally included in e09155bd62 and is redundant with the
lines right above it.

Reported-By: Peter Eisentraut
Discussion: https://postgr.es/m/455845d1-441d-cc40-d2a7-b47f4e422489@2ndquadrant.com

pg_dump: Add const decorations

Add const decorations to the *info arguments of the dump* functions,
to clarify that they don't modify that argument. Many other nearby
functions modify their arguments, so this can help clarify these
different APIs a bit.

Discussion: https://www.postgresql.org/message-id/flat/012d3030-9a2c-99a1-ed2d-988978b5632f%40enterprisedb.com

Fix lack of message pluralization

Fix ORDER BY clause in new regression test of REINDEX CONCURRENTLY

Oversight in bd12080.

Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20210210065805 [email protected]
Backpatch-through: 12

Simplify code related to compilation of SSL and OpenSSL

This commit makes more generic some comments and code related to the
compilation with OpenSSL and SSL in general to ease the addition of more
SSL implementations in the future. In libpq, some OpenSSL-only code is
moved under USE_OPENSSL and not USE_SSL.

While on it, make a comment more consistent in libpq-fe.h.

Author: Daniel Gustafsson
Discussion: https://postgr.es/m/5382CB4A-9CF3-4145-BA46-C802615935E0@yesql.se

Preserve pg_attribute.attstattarget across REINDEX CONCURRENTLY

For an index, attstattarget can be updated using ALTER INDEX SET
STATISTICS. This data was lost on the new index after REINDEX
CONCURRENTLY.

The update of this field is done when the old and new indexes are
swapped to make the fix back-patchable. Another approach we could look
after in the long-term is to change index_create() to pass the wanted
values of attstattarget when creating the new relation, but, as this
would cause an ABI breakage this can be done only on HEAD.

Reported-by: Ronan Dunklau
Author: Michael Paquier
Reviewed-by: Ronan Dunklau, Tomas Vondra
Discussion: https://postgr.es/m/16628084.uLZWGnKmhe@laptop-ronand
Backpatch-through: 12

Make pg_replication_origin_drop safe against concurrent drops.

Currently, we get the origin id from the name and then drop the origin by
taking ExclusiveLock on ReplicationOriginRelationId. So, two concurrent
sessions can get the id from the name at the same time and then when they
try to drop the origin, one of the sessions will get the either
"tuple concurrently deleted" or "cache lookup failed for replication
origin ..".

To prevent this race condition we do the entire operation under lock. This
obviates the need for replorigin_drop() API and we have removed it so if
any extension authors are using it they need to instead use
replorigin_drop_by_name. See it's usage in pg_replication_origin_drop().

Author: Peter Smith
Reviewed-by: Amit Kapila, Euler Taveira, Petr Jelinek, and Alvaro
Herrera
Discussion: https://www.postgresql.org/message-id/CAHut%2BPuW8DWV5fskkMWWMqzt-x7RPcNQOtJQBp6SdwyRghCk7A%40mail.gmail.com

Fix obsolete FSM remarks in nbtree README.

The free space map has used a dedicated relation fork rather than shared
memory segments for over a decade.

Revert "Display the time when the process started waiting for the lock, in pg_locks."

This reverts commit 3b733fcd04195399db56f73f0616b4f5c6828e18.

Per buildfarm members prion and rorqual.

Display the time when the process started waiting for the lock, in pg_locks.

This commit adds new column "waitstart" into pg_locks view. This column
reports the time when the server process started waiting for the lock
if the lock is not held. This information is useful, for example, when
examining the amount of time to wait on a lock by subtracting
"waitstart" in pg_locks from the current time, and identify the lock
that the processes are waiting for very long.

This feature uses the current time obtained for the deadlock timeout
timer as "waitstart" (i.e., the time when this process started waiting
for the lock). Since getting the current time newly can cause overhead,
we reuse the already-obtained time to avoid that overhead.

Note that "waitstart" is updated without holding the lock table's
partition lock, to avoid the overhead by additional lock acquisition.
This can cause "waitstart" in pg_locks to become NULL for a very short
period of time after the wait started even though "granted" is false.
This is OK in practice because we can assume that users are likely to
look at "waitstart" when waiting for the lock for a long time.

Bump catalog version.

Author: Atsushi Torikoshi
Reviewed-by: Ian Lawrence Barwick, Robert Haas, Justin Pryzby, Fujii Masao
Discussion: https://postgr.es/m/a96013dc51cdc56b2a2b84fa8a16a993@oss.nttdata.com

Add option PROCESS_TOAST to VACUUM

This option controls if toast tables associated with a relation are
vacuumed or not when running a manual VACUUM.  It was already possible
to trigger a manual VACUUM on a toast relation without processing its
main relation, but a manual vacuum on a main relation always forced a
vacuum on its toast table.  This is useful in scenarios where the level
of bloat or transaction age of the main and toast relations differs a
lot.

This option is an extension of the existing VACOPT_SKIPTOAST that was
used by autovacuum to control if toast relations should be skipped or
not.  This internal flag is renamed to VACOPT_PROCESS_TOAST for
consistency with the new option.

A new option switch, called --no-process-toast, is added to vacuumdb.

Author: Nathan Bossart
Reviewed-by: Kirk Jamison, Michael Paquier, Justin Pryzby
Discussion: https://postgr.es/m/BA8951E9-1524-48C5-94AF-73B1F0D7857F@amazon.com