[patch] [doc] Further note required activity aspect of automatic checkpoint and archving

Lists: pgsql-hackers
From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: [patch] [doc] Further note required activity aspect of automatic checkpoint and archving
Date: 2020-10-12 21:54:28
Message-ID: CAKFQuwZ1Vsc7VZbq=0w=OsXNHFbmqtvs82JK+=eAKWxnUGKRTg@mail.gmail.com

Hackers,

Over on pgsql-general [1], Robert Inder griped about the not-so-recent change to
our automatic checkpointing, and thus archiving, behavior where
non-activity results in nothing happening. In looking over the
documentation I felt a few changes could be made to increase the chance
that a reader learns this key dynamic. Attached is a patch with those
changes. Copied inline for ease of review.

commit 8af7f653907688252d8663a80e945f6f5782b0de
Author: David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
Date: Mon Oct 12 21:32:32 2020 +0000

Further note required activity aspect of automatic checkpoint and archiving

A few spots in the documentation could use a reminder that checkpoints
and archiving require that actual WAL records be written in order to
happen automatically.

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 42a8ed328d..c312fc9387 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -722,6 +722,8 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/0
short <varname>archive_timeout</varname> &mdash; it will bloat your archive
storage. <varname>archive_timeout</varname> settings of a minute or so are
usually reasonable.
+ This is mitigated by the fact that empty WAL segments will not be archived
+ even if the archive_timeout period has elapsed.
</para>

<para>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ee914740cc..306f78765c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3131,6 +3131,8 @@ include_dir 'conf.d'
<listitem>
<para>
Maximum time between automatic WAL checkpoints.
+ The automatic checkpoint will do nothing if no new WAL has been
+ written since the last recorded checkpoint.
If this value is specified without units, it is taken as seconds.
The valid range is between 30 seconds and one day.
The default is five minutes (<literal>5min</literal>).
@@ -3337,18 +3339,17 @@ include_dir 'conf.d'
</term>
<listitem>
<para>
+ Force the completion of the current, non-empty, WAL segment when
+ this amount of time (if non-zero) has elapsed since the last
+ segment file switch.
The <xref linkend="guc-archive-command"/> is only invoked for
completed WAL segments. Hence, if your server generates little WAL
traffic (or has slack periods where it does so), there could be a
long delay between the completion of a transaction and its safe
recording in archive storage. To limit how old unarchived
data can be, you can set <varname>archive_timeout</varname> to force the
- server to switch to a new WAL segment file periodically. When this
- parameter is greater than zero, the server will switch to a new
- segment file whenever this amount of time has elapsed since the last
- segment file switch, and there has been any database activity,
- including a single checkpoint (checkpoints are skipped if there is
- no database activity). Note that archived files that are closed
+ server to switch to a new WAL segment file periodically.
+ Note that archived files that are closed
early due to a forced switch are still the same length as completely
full files. Therefore, it is unwise to use a very short
<varname>archive_timeout</varname> &mdash; it will bloat your archive

David J.

[1]
https://www.postgresql.org/message-id/flat/CAKqjJm83gnw2u0ugpkgc4bq58L%3DcLwbvmh69TwKKo83Y1CnANw%40mail.gmail.com

Attachment Content-Type Size
v1-doc-automatic-checkpoint-and-archive-skips.patch application/octet-stream 3.1 KB
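
To put a number on the "bloat your archive storage" warning in the patched paragraph, here is a rough back-of-the-envelope sketch. It assumes the default 16 MB WAL segment size and the worst case where every timeout interval sees at least a little WAL activity, so that every forced switch produces a full-length archived file. The function name is made up for illustration.

```python
# Rough upper bound on archive growth caused by forced segment switches.
# Assumption: default 16 MB WAL segment size; every interval has some
# WAL activity, so every timeout produces one full-length archived file.
SEGMENT_SIZE_MB = 16
SECONDS_PER_DAY = 86_400


def max_archived_mb_per_day(archive_timeout_s: int,
                            segment_mb: int = SEGMENT_SIZE_MB) -> int:
    """Worst-case MB archived per day purely from forced switches."""
    if archive_timeout_s <= 0:
        return 0  # a zero timeout means no forced switches at all
    switches_per_day = SECONDS_PER_DAY // archive_timeout_s
    return switches_per_day * segment_mb


for timeout in (10, 60, 300):
    mb = max_archived_mb_per_day(timeout)
    print(f"archive_timeout={timeout}s -> up to {mb} MB/day")
```

On an idle server the real figure is lower, because intervals with no WAL activity do not produce a switch.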

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [patch] [doc] Further note required activity aspect of automatic checkpoint and archving
Date: 2021-01-15 07:16:51
Message-ID: [email protected]

On 2020-10-12 23:54, David G. Johnston wrote:
> --- a/doc/src/sgml/backup.sgml
> +++ b/doc/src/sgml/backup.sgml
> @@ -722,6 +722,8 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/0
>      short <varname>archive_timeout</varname> &mdash; it will bloat your archive
>      storage.  <varname>archive_timeout</varname> settings of a minute or so are
>      usually reasonable.
> +    This is mitigated by the fact that empty WAL segments will not be archived
> +    even if the archive_timeout period has elapsed.
>     </para>

This is hopefully not what happens. What this would mean is that I'd
then have a sequence of WAL files named, say,

1, 2, 3, 7, 8, ...

because a few in the middle were not archived because they were empty.

> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -3131,6 +3131,8 @@ include_dir 'conf.d'
>        <listitem>
>         <para>
>          Maximum time between automatic WAL checkpoints.
> +        The automatic checkpoint will do nothing if no new WAL has been
> +        written since the last recorded checkpoint.
>          If this value is specified without units, it is taken as seconds.
>          The valid range is between 30 seconds and one day.
>          The default is five minutes (<literal>5min</literal>).

I think what happens is that the checkpoint is skipped, not that the
checkpoint happens but does nothing. That is the wording you cited in
the other thread from
<https://www.postgresql.org/docs/13/wal-configuration.html>.


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [patch] [doc] Further note required activity aspect of automatic checkpoint and archving
Date: 2021-01-15 19:50:43
Message-ID: CAKFQuwZJtgq53sGDR+zxd-oBMwwJx4SqJu8HkjSo4LP92giyHg@mail.gmail.com

On Fri, Jan 15, 2021 at 12:16 AM Peter Eisentraut <
peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:

> On 2020-10-12 23:54, David G. Johnston wrote:
> > --- a/doc/src/sgml/backup.sgml
> > +++ b/doc/src/sgml/backup.sgml
> > @@ -722,6 +722,8 @@ test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/0
> > short <varname>archive_timeout</varname> &mdash; it will bloat your archive
> > storage. <varname>archive_timeout</varname> settings of a minute or so are
> > usually reasonable.
> > + This is mitigated by the fact that empty WAL segments will not be archived
> > + even if the archive_timeout period has elapsed.
> > </para>
>
> This is hopefully not what happens. What this would mean is that I'd
> then have a sequence of WAL files named, say,
>
> 1, 2, 3, 7, 8, ...
>
> because a few in the middle were not archived because they were empty.
>

This addition assumes it is known that the archive process first fills the
files to their maximum size and then archives them. That filling of the
file is what causes the next file in the sequence to be created. So if the
archiving doesn't happen, the files do not get filled and the status quo
prevails.
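
A toy model may help show why this does not leave gaps in the archived sequence. It is only a sketch of the behavior described above, not PostgreSQL code: time is divided into abstract ticks, the function and variable names are invented, and the rule is simply "once the timeout has elapsed since the last switch, switch and archive only if the active segment holds some WAL".

```python
def simulate_archive(activity, timeout_ticks):
    """Toy model of archive_timeout-driven segment switches.

    activity: one boolean per tick; True means some WAL was written.
    timeout_ticks: stand-in for archive_timeout, measured in ticks.
    Returns the list of archived segment numbers, in order.
    """
    current_segment = 1
    has_data = False          # has the active segment received any WAL?
    ticks_since_switch = 0
    archived = []

    for wrote_wal in activity:
        has_data = has_data or wrote_wal
        ticks_since_switch += 1
        # A forced switch needs both an expired timeout and a non-empty
        # segment; an empty segment simply stays the active one.
        if ticks_since_switch >= timeout_ticks and has_data:
            archived.append(current_segment)
            current_segment += 1
            has_data = False
            ticks_since_switch = 0
    return archived


# Two busy ticks, a long idle stretch, then activity again.
ticks = [True, True] + [False] * 6 + [True, True]
print(simulate_archive(ticks, timeout_ticks=2))
```

In this model the idle stretch produces no archived files at all, and the segment numbers that do get archived remain consecutive because the segment counter only advances when a switch actually happens.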

If the above wants to be made more explicit in this change maybe:

"This is mitigated by the fact that archiving, and thus filling, the active
WAL segment will not happen if that segment is empty; it will continue as
the active segment."

> > --- a/doc/src/sgml/config.sgml
> > +++ b/doc/src/sgml/config.sgml
> > @@ -3131,6 +3131,8 @@ include_dir 'conf.d'
> > <listitem>
> > <para>
> > Maximum time between automatic WAL checkpoints.
> > + The automatic checkpoint will do nothing if no new WAL has been
> > + written since the last recorded checkpoint.
> > If this value is specified without units, it is taken as seconds.
> > The valid range is between 30 seconds and one day.
> > The default is five minutes (<literal>5min</literal>).
>
> I think what happens is that the checkpoint is skipped, not that the
> checkpoint happens but does nothing. That is the wording you cited in
> the other thread from
> <https://www.postgresql.org/docs/13/wal-configuration.html>.
>

Consistency is good; and considering it further the skipped wording is
generally better anyway.

"The automatic checkpoint will be skipped if no new WAL has been written
since the last recorded checkpoint."
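
As a minimal sketch of the "skipped" semantics, under simplifying assumptions: the function name is hypothetical, WAL positions are plain integers, and the model deliberately ignores any WAL a checkpoint itself would write. Each element of the input is the WAL position observed when checkpoint_timeout fires.

```python
def run_checkpoint_timer(wal_positions):
    """Toy model: decide, at each timer expiry, whether to checkpoint.

    wal_positions: WAL position seen each time the timer fires.
    Returns 'performed' or 'skipped' for each expiry.
    """
    last_checkpoint_pos = None
    outcomes = []
    for pos in wal_positions:
        if pos == last_checkpoint_pos:
            # No new WAL since the last recorded checkpoint: skip it.
            outcomes.append("skipped")
        else:
            outcomes.append("performed")
            last_checkpoint_pos = pos
    return outcomes


print(run_checkpoint_timer([100, 100, 100, 250, 250]))
```

The point of the wording change is visible in the result: an idle interval yields no checkpoint at all, rather than a checkpoint that runs and does nothing.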

David J.


From: David Steele <david(at)pgmasters(dot)net>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [patch] [doc] Further note required activity aspect of automatic checkpoint and archving
Date: 2021-03-18 15:36:52
Message-ID: [email protected]

Hi David,

On 1/15/21 2:50 PM, David G. Johnston wrote:
>
> If the above wants to be made more explicit in this change maybe:
>
> "This is mitigated by the fact that archiving, and thus filling, the
> active WAL segment will not happen if that segment is empty; it will
> continue as the active segment."

"archiving, and thus filling" seems awkward to me. Perhaps:

This is mitigated by the fact that WAL segments will not be archived
until they have been filled with some data, even if the archive_timeout
period has elapsed.

> Consistency is good; and considering it further the skipped wording is
> generally better anyway.
>
> "The automatic checkpoint will be skipped if no new WAL has been written
> since the last recorded checkpoint."

Looks good to me.

Could you produce a new patch so Peter has something complete to look at?

Regards,
--
-David
david(at)pgmasters(dot)net


From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [patch] [doc] Further note required activity aspect of automatic checkpoint and archving
Date: 2021-11-04 09:36:37
Message-ID: [email protected]

> On 18 Mar 2021, at 16:36, David Steele <david(at)pgmasters(dot)net> wrote:

> Could you produce a new patch so Peter has something complete to look at?

As this thread has been stalled for a few commitfests by now I'm marking
this patch as returned with feedback. Feel free to open a new entry for an
updated patch.

--
Daniel Gustafsson https://vmware.com/