improve DEBUG1 logging of parallel workers for CREATE INDEX?

Lists: pgsql-hackers
From: Sami Imseih <samimseih(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-03 18:30:14
Message-ID: CAA5RZ0trTUL6_vpvW79daGgkp7B-ZtWUc5yrPz5Sjm8Ns4KRgQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

While reviewing patch [1], I realized that the DEBUG1 message
for CREATE INDEX could do better in providing information
about parallel workers launched. Currently, the message just
shows how many workers are planned, but a user may want
to ensure that they have the appropriate number of workers
launched as well when they enable DEBUG1 logging.

Therefore, I want to suggest that instead of:

postgres=# CREATE INDEX tbl_c1 ON tbl(c1);
DEBUG: building index "tbl_c1" on table "tbl" with request for 2
parallel workers
DEBUG: index "tbl_c1" can safely use deduplication
CREATE INDEX

to enhance the message to:

postgres=# create index tbl_c1 on tbl(c1);
DEBUG: building index "tbl_c1" on table "tbl"
DEBUG: launched 0 parallel vacuum workers for index vacuuming (planned: 2)
DEBUG: index "tbl_c1" can safely use deduplication
CREATE INDEX
postgres=#

I prepared a simple patch, attached, for this. The log message matches
the format
used in VACUUM VERBOSE ( for consistency sake ).

Also, I think the documentation in [2] should be updated to mention that DEBUG1
can be used to view parallel worker usage for the build, but I have
not included it in
the patch yet.

Any thoughts?

[1] https://www.postgresql.org/message-id/CAA5RZ0tS%3DFmgku9%3DqCp-U4EBC-dqVtj%2Bv_EGrKV_NA%3DTHnUDsQ%40mail.gmail.com

[2] https://www.postgresql.org/docs/current/sql-createindex.html

Regards,

Sami Imseih
Amazon Web Services (AWS)

Attachment Content-Type Size
0001-improve-DEBUG1-logging-of-parallel-workers-for-CREAT.patch application/octet-stream 2.5 KB

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-08 15:21:07
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Fri, Jan 03, 2025 at 12:30:14PM -0600, Sami Imseih wrote:
> While reviewing patch [1], I realized that the DEBUG1 message
> for CREATE INDEX could do better in providing information
> about parallel workers launched. Currently, the message just
> shows how many workers are planned, but a user may want
> to ensure that they have the appropriate number of workers
> launched as well when they enable DEBUG1 logging.

Yeah, one could see how many workers are currently running but would not be
able to figure out once the index is created. So adding the info in the log
makes sense to me.

> I prepared a simple patch, attached, for this. The log message matches
> the format
> used in VACUUM VERBOSE ( for consistency sake ).

A few random comments:

=== 1

s/parallel vacuum workers for index vacuuming/parallel workers for index creation/?

(2 times)

=== 2

- (errmsg_internal("building index \"%s\" on table \"%s\" with request for %d parallel workers",
+ (errmsg_internal("building index \"%s\" on table \"%s\"",

I'd add "in parallel" to match its counterpart "serially" above. That would
make it more clear in case one just look for "building index" in the log.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-08 17:36:51
Message-ID: CAA5RZ0t2y1NVt3V+wVp=1cwi1oKKQif7+NrCVRoJg-oj_ScPfg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> s/parallel vacuum workers for index vacuuming/parallel workers for index creation/?

oops, that's my oversight from copying the message from vacuum. fixed.

> I'd add "in parallel" to match its counterpart "serially" above. That would
> make it more clear in case one just look for "building index" in the log.

good point.

Below is the new output with the attached v2.

postgres=# create index foo_idx1 on foo(id);
DEBUG: building index "foo_idx1" on table "foo" with parallel workers
DEBUG: launched 1 parallel workers for index creation (planned: 1)
DEBUG: index "foo_idx1" can safely use deduplication
CREATE INDEX

Regards,

Sami

Attachment Content-Type Size
v2-0001-improve-DEBUG1-logging-of-parallel-workers-for-CR.patch application/octet-stream 2.5 KB

From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-08 20:21:46
Message-ID: CAECtzeVgxsxHSXhbB=UQJXN5ifUTEEQTJZXB00NQj2X25e=-jg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Le mer. 8 janv. 2025 à 18:37, Sami Imseih <samimseih(at)gmail(dot)com> a écrit :

> > s/parallel vacuum workers for index vacuuming/parallel workers for index
> creation/?
>
> oops, that's my oversight from copying the message from vacuum. fixed.
>
> > I'd add "in parallel" to match its counterpart "serially" above. That
> would
> > make it more clear in case one just look for "building index" in the log.
>
> good point.
>
> Below is the new output with the attached v2.
>
> postgres=# create index foo_idx1 on foo(id);
> DEBUG: building index "foo_idx1" on table "foo" with parallel workers
> DEBUG: launched 1 parallel workers for index creation (planned: 1)
> DEBUG: index "foo_idx1" can safely use deduplication
> CREATE INDEX
>
>
I tend to agree it might be better than Benoit's patch on the index
messages, though I'm afraid that DEBUG1 level won't work for many users.
DEBUGx are for developers, not users. A better log level for me would be
LOG. For client messages, LOG messages won't be displayed by default. So
there's still a need for a "SET client_min_messages to LOG" but it won't be
necessary to go all the way to DEBUG1.

Regards.

--
Guillaume.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-08 20:35:38
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Guillaume Lelarge <guillaume(at)lelarge(dot)info> writes:
> I tend to agree it might be better than Benoit's patch on the index
> messages, though I'm afraid that DEBUG1 level won't work for many users.
> DEBUGx are for developers, not users. A better log level for me would be
> LOG.

Please, no. That would result in spamming the postmaster log
for perfectly ordinary actions, with no usable way to prevent it.

I think DEBUG is perfectly appropriate for this sort of message.

regards, tom lane


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-08 20:44:00
Message-ID: CAA5RZ0vc1VN_-L3Y0BgAzshkc9xd5OFvJEKPFYrox71doYqLaA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> I think DEBUG is perfectly appropriate for this sort of message.

I am curious what are the thoughts on introducing a
CREATE INDEX VERBOSE which can provide this info?
similar to users scripting VACUUM VERBOSE to log
more details about the vacuum operation including parallel
usage.

I have not convinced myself this is a good idea, but maybe
someone has an opinion. If not, we can just stick with a better
DEBUG1 message.

Regards,

Sami


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Sami Imseih <samimseih(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-08 20:46:39
Message-ID: CAECtzeW+hUHM9u=E+H5d3b_GvXy5Ei0=d9TwjVKBMS9OTa=bxw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Le mer. 8 janv. 2025 à 21:35, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> a écrit :

> Guillaume Lelarge <guillaume(at)lelarge(dot)info> writes:
> > I tend to agree it might be better than Benoit's patch on the index
> > messages, though I'm afraid that DEBUG1 level won't work for many users.
> > DEBUGx are for developers, not users. A better log level for me would be
> > LOG.
>
> Please, no. That would result in spamming the postmaster log
> for perfectly ordinary actions, with no usable way to prevent it.
>
>
Not sure why I was thinking it would only go to the client. You're right,
it will also go to the logs, and it will be stored by default, which would
be bad. Sorry about this...

> I think DEBUG is perfectly appropriate for this sort of message.
>
>
Still think DEBUG level is pretty bad for a user experience. Anyway, I much
prefer a DEBUG message than no message at all :)

--
Guillaume.


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-08 20:48:17
Message-ID: CAECtzeX7Y2Z=mttK75LP2wPAuaPxk6NA=MKRQT6NyNibEVJ_Fw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Le mer. 8 janv. 2025 à 21:44, Sami Imseih <samimseih(at)gmail(dot)com> a écrit :

> > I think DEBUG is perfectly appropriate for this sort of message.
>
> I am curious what are the thoughts on introducing a
> CREATE INDEX VERBOSE which can provide this info?
> similar to users scripting VACUUM VERBOSE to log
> more details about the vacuum operation including parallel
> usage.
>
> I have not convinced myself this is a good idea, but maybe
> someone has an opinion. If not, we can just stick with a better
> DEBUG1 message.
>
>
Sounds better to me than relying on DEBUG messages. So, yeah, VERBOSE has
my vote.

--
Guillaume.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-08 20:58:25
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Sami Imseih <samimseih(at)gmail(dot)com> writes:
> I am curious what are the thoughts on introducing a
> CREATE INDEX VERBOSE which can provide this info?
> similar to users scripting VACUUM VERBOSE to log
> more details about the vacuum operation including parallel
> usage.

What I can recall being discussed in the past is to extend
EXPLAIN and/or EXPLAIN ANALYZE to cover utility statements
that have nontrivial execution complexity --- for example,
ALTER TABLE has a lot of machinery underneath, and people
often wish to know things like whether a particular ALTER
will cause a table rewrite or not.

Of course, a patch for that would be a few orders of magnitude
larger than what you've got here :-(. But if you're looking
for a framework for reporting these sorts of details, I'd
much rather go in that direction than follow the model of
VACUUM VERBOSE. VACUUM VERBOSE is a kluge with little to
recommend it other than having been easy to implement.

regards, tom lane


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-09 03:24:27
Message-ID: CAA5RZ0uzC12pVHKYkrmmSi9p0vLqkRN-iKTy03_X-ZRLuP9vEg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> Of course, a patch for that would be a few orders of magnitude
> larger than what you've got here :-(. But if you're looking
> for a framework for reporting these sorts of details, I'd
> much rather go in that direction than follow the model of
> VACUUM VERBOSE. VACUUM VERBOSE is a kluge with little to
> recommend it other than having been easy to implement.

To my surprise, REINDEX does have a VERBOSE option.
should have check this earlier :)

postgres=# reindex (verbose) index t_idx1;
INFO: index "t_idx1" was reindexed
DETAIL: CPU: user: 5.33 s, system: 0.48 s, elapsed: 6.26 s
REINDEX

Is there a reason not to do the same for CREATE INDEX?

Also, we can improve the REINDEX verbose message by
also providing the parallel usage.

Regards,

Sami


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-11 10:29:57
Message-ID: CAECtzeX28w03vV-_fYmnU6PQHYSb-nDarwnwPqd3tmghoih3GQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Le jeu. 9 janv. 2025 à 04:24, Sami Imseih <samimseih(at)gmail(dot)com> a écrit :

> > Of course, a patch for that would be a few orders of magnitude
> > larger than what you've got here :-(. But if you're looking
> > for a framework for reporting these sorts of details, I'd
> > much rather go in that direction than follow the model of
> > VACUUM VERBOSE. VACUUM VERBOSE is a kluge with little to
> > recommend it other than having been easy to implement.
>
> To my surprise, REINDEX does have a VERBOSE option.
> should have check this earlier :)
>
> postgres=# reindex (verbose) index t_idx1;
> INFO: index "t_idx1" was reindexed
> DETAIL: CPU: user: 5.33 s, system: 0.48 s, elapsed: 6.26 s
> REINDEX
>
> Is there a reason not to do the same for CREATE INDEX?
>
>
Sounds a good idea to me.

> Also, we can improve the REINDEX verbose message by
> also providing the parallel usage.
>
>
+1

--
Guillaume.


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-18 23:42:50
Message-ID: CAA5RZ0v1DsqibU_am-CGM=_ve4_YMH+PXyfPL64rBzCgojpVMg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

The opinion in this thread is leaning towards a VERBOSE
option and I agree with this as it provides the users with a
much more intuitive way to gather parallel information about
parallel index builds. Also, the current DEBUG1 output is
useless as it is now.

Here is a v1 that implements CREATE INDEX (VERBOSE)
to output the parallel workers planned and launched.

A few points about this patch:

1/ the (VERBOSE) option is added after the INDEX.
It seemed better than adding it after the CREATE keyword
to make the option specific to the "CREATE INDEX" operation
rather than the "CREATE" operation which can be many
things.

2/ The workers launched is determined in the index AM,
therefore a new field called "ii_verbose" is added to IndexInfo
so it can be available to the AM to determine if logging
is required. We currently handle the concurrently
option in the same manner.

3/ currently REINDEX (VERBOSE) does not show parallel
workers planned, but it will with this patch. Both
CREATE INDEX and REINDEX will show the same output.

4/ Updated auto-completed in psql to deal with (VERBOSE)

5/ Removed the existing DEBUG1 output.

6/ Updated the tests for create index/reinde

Here is sample output:

postgres=# CREATE INDEX (VERBOSE ON) ON t(c1);
INFO: building index "t_c1_idx" on table "t" with parallel workers
INFO: launched 1 parallel workers for index creation (planned: 1)
INFO: index "public.t" was created
DETAIL: CPU: user: 0.88 s, system: 0.01 s, elapsed: 0.89 s.
CREATE INDEX
postgres=#
postgres=# REINDEX (verbose) INDEX t_c1_idx ;
INFO: building index "t_c1_idx" on table "t" with parallel workers
INFO: launched 1 parallel workers for index creation (planned: 1)
INFO: index "t_c1_idx" was reindexed
DETAIL: CPU: user: 0.88 s, system: 0.00 s, elapsed: 0.89 s
REINDEX

Looking forward to feedback.

Regards,

Sami

Attachment Content-Type Size
v1-0001-Improve-Parallel-worker-reporting-in-CREATE-INDEX.patch application/octet-stream 30.9 KB

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-19 00:52:14
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Jan 18, 2025 at 05:42:50PM -0600, Sami Imseih wrote:
> Here is a v1 that implements CREATE INDEX (VERBOSE)
> to output the parallel workers planned and launched.

Hmm. I am reading Tom's opinion that goes toward not going in this
direction for more commands, with the point to extend EXPLAIN to show
this kind of information:
https://www.postgresql.org/message-id/[email protected]

So do we really want to do what's proposed here? I'm +-0 about the
VERBOSE option attached to more commands, as it is a bit harder for
clients to catch the information wanted. So here comes my question:
how do we want to consume this information at the end from the
perspective of the client?
--
Michael


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Guillaume Lelarge <guillaume(at)lelarge(dot)info>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-19 05:40:32
Message-ID: CAA5RZ0tTpqUjNc3FcegjW9njZWSPSE0tFN9d4BCLh_c+whgZwA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> Hmm. I am reading Tom's opinion that goes toward not going in this
> direction for more commands, with the point to extend EXPLAIN to show
> this kind of information:
> https://www.postgresql.org/message-id/[email protected]

That sounds like the ability to do something like EXPLAIN CREATE INDEX ...
is that correct?

> So do we really want to do what's proposed here? I'm +-0 about the
> VERBOSE option attached to more commands, as it is a bit harder for
> clients to catch the information wanted. So here comes my question:
> how do we want to consume this information at the end from the
> perspective of the client?

For interactive usage in psql or pgadmin, it's trivial to capture this
information. The
information can also be written to the server log with log_min_messages=INFO

A bit more work is required to redirect messages to an out file from psql, as
you need to ensure that stderr is redirected to a file.
It's also a bit more work to capture this information from something like a
JDBC application.

IMO the interactive use-case is where this is the most useful as you can start a
CREATE INDEX (VERBOSE) and ensure that it's going to launch all the parallel
workers that it planned before letting it it continue; or control-c and figure
out why not all planned workers launched.

Regards,

Sami


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-19 10:30:13
Message-ID: CAECtzeVGvSJzuVwaALtmZZYUZYP4=cWku3J9yAktCENWy3OjgQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Le dim. 19 janv. 2025 à 06:40, Sami Imseih <samimseih(at)gmail(dot)com> a écrit :

> > Hmm. I am reading Tom's opinion that goes toward not going in this
> > direction for more commands, with the point to extend EXPLAIN to show
> > this kind of information:
> > https://www.postgresql.org/message-id/[email protected]
>
> That sounds like the ability to do something like EXPLAIN CREATE INDEX ...
> is that correct?
>
>
Yes, and I suppose this would be quite some work to do. Though I see how
interesting it would be, and a nice project to hack on.

> > So do we really want to do what's proposed here? I'm +-0 about the
> > VERBOSE option attached to more commands, as it is a bit harder for
> > clients to catch the information wanted.

It sounds to me a lot easier to know about a VERBOSE option that gives you
more information, than using a DEBUG level to get the same information.
DEBUG level messages aren't explained in the CREATE INDEX manpage, whereas
a VERBOSE option would be.

> So here comes my question:
> > how do we want to consume this information at the end from the
> > perspective of the client?
>
> For interactive usage in psql or pgadmin, it's trivial to capture this
> information.

+1

The information can also be written to the server log with
> log_min_messages=INFO
>
> A bit more work is required to redirect messages to an out file from psql,
> as
> you need to ensure that stderr is redirected to a file.
> It's also a bit more work to capture this information from something like a
> JDBC application.
>
> IMO the interactive use-case is where this is the most useful as you can
> start a
> CREATE INDEX (VERBOSE) and ensure that it's going to launch all the
> parallel
> workers that it planned before letting it it continue; or control-c and
> figure
> out why not all planned workers launched.
>

--
Guillaume.


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-20 03:27:57
Message-ID: CAA5RZ0v+Lq+zFwWZy4oUgnqmxzY1gb-OmJr1CDz_iTQOtZmCRQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> > > Hmm. I am reading Tom's opinion that goes toward not going in this
> > > direction for more commands, with the point to extend EXPLAIN to show
> > > this kind of information:
> > > https://www.postgresql.org/message-id/[email protected]
> >
> > That sounds like the ability to do something like EXPLAIN CREATE INDEX ...
> > is that correct?
> >

> Yes, and I suppose this would be quite some work to do. Though I see how
> interesting it would be, and a nice project to hack on.

I have been contemplating this and do see how something like EXPLAIN
could be useful, as it would give users the ability to examine
what a command will do, i.e. EXPLAIN CREATE INDEX will tell you
how many parallel workers are planned and EXPLAIN (ANALYZE) CREATE
INDEX will actually execute the command and provide the actual parallel
workers launched. Tom also mentioned the example of a user wanting to know if
a table rewrite will occur. There are other cases where this could be useful.

FWIW, Oracle does support EXPLAIN PLAN FOR CREATE INDEX and
EXPLAIN PLAN FOR REBUILD (for index rebuilds) [1][2], but this functionality
is not officially documented [3]. Oracle also provides a documented function to
estimate the size of an index [4]. I do not have access to an Oracle instance to
verify the state of current versions of Oracle, but I do vaguely remember this
from when I worked on Oracle many years ago :)

Now, I also realized that v1 does not include logging for ALTER TABLE commands
that add an index, such as "ALTER TABLE foo ADD CONSTRAINT foo_uq UNIQUE (c1)"
or attaching a partition and ensuring the index is built. So this
needs to be added to
the current proposal.

So far these are 2 options being discussed:

1/
The current proposal of:

CREATE INDEX (VERBOSE) and ALTER TABLE (VERBOSE)
and output debugging at INFO level.

2/
EXPLAIN

This is a lot more work. Currently QueryDesc has all the info we
need to generate the execution plan. But, we will likely need a
new struct that can track the debugging info to relay it back
up to explain. And this has to be generic enough for all cases.

Also, implementing an EXPLAIN and EXPLAIN ANALYZE for such commands
may not even be feasible. Plans are different in that they are generated
up-front and we can choose to execute them or not. This is not the same
for the other types of commands being discussed.

I am not convinced it's worth the effort.

thoughts? other approaches?

[1] https://oracle-randolf.blogspot.com/2009/02/explain-plan-on-ddls.html
[2] https://stackoverflow.com/questions/827123/how-can-i-estimate-the-size-of-an-oracle-index
[3] https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/EXPLAIN-PLAN.html
[4] https://docs.oracle.com/en/database/oracle/oracle-database/21/arpls/DBMS_SPACE.html#GUID-B6EB7527-BC67-4394-9E7A-01F2790C409A

Regards,

Sami


From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-20 07:24:54
Message-ID: CAECtzeXHnpsFUUSanLs_0ppDLGyXWhMQju_AHe9-HGgJjZZHDw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Le lun. 20 janv. 2025 à 04:28, Sami Imseih <samimseih(at)gmail(dot)com> a écrit :

> > > > Hmm. I am reading Tom's opinion that goes toward not going in this
> > > > direction for more commands, with the point to extend EXPLAIN to show
> > > > this kind of information:
> > > >
> https://www.postgresql.org/message-id/[email protected]
> > >
> > > That sounds like the ability to do something like EXPLAIN CREATE INDEX
> ...
> > > is that correct?
> > >
>
> > Yes, and I suppose this would be quite some work to do. Though I see how
> > interesting it would be, and a nice project to hack on.
>
> I have been contemplating this and do see how something like EXPLAIN
> could be useful, as it would give users the ability to examine
> what a command will do, i.e. EXPLAIN CREATE INDEX will tell you
> how many parallel workers are planned and EXPLAIN (ANALYZE) CREATE
> INDEX will actually execute the command and provide the actual parallel
> workers launched. Tom also mentioned the example of a user wanting to know
> if
> a table rewrite will occur. There are other cases where this could be
> useful.
>
> FWIW, Oracle does support EXPLAIN PLAN FOR CREATE INDEX and
> EXPLAIN PLAN FOR REBUILD (for index rebuilds) [1][2], but this
> functionality
> is not officially documented [3]. Oracle also provides a documented
> function to
> estimate the size of an index [4]. I do not have access to an Oracle
> instance to
> verify the state of current versions of Oracle, but I do vaguely remember
> this
> from when I worked on Oracle many years ago :)
>
> Now, I also realized that v1 does not include logging for ALTER TABLE
> commands
> that add an index, such as "ALTER TABLE foo ADD CONSTRAINT foo_uq UNIQUE
> (c1)"
> or attaching a partition and ensuring the index is built. So this
> needs to be added to
> the current proposal.
>
> So far these are 2 options being discussed:
>
> 1/
> The current proposal of:
>
> CREATE INDEX (VERBOSE) and ALTER TABLE (VERBOSE)
> and output debugging at INFO level.
>
>
I guess the major issue is that it won't scale fine.

> 2/
> EXPLAIN
>
> This is a lot more work. Currently QueryDesc has all the info we
> need to generate the execution plan. But, we will likely need a
> new struct that can track the debugging info to relay it back
> up to explain. And this has to be generic enough for all cases.
>
> Also, implementing an EXPLAIN and EXPLAIN ANALYZE for such commands
> may not even be feasible. Plans are different in that they are generated
> up-front and we can choose to execute them or not. This is not the same
> for the other types of commands being discussed.
>
> I am not convinced it's worth the effort.
>
>
> thoughts? other approaches?
>
>
You might be interested by this thread "Thinking about EXPLAIN ALTER TABLE":

https://www.postgresql.org/message-id/CAM-w4HNm1M5J-ow8UjTcqRe3JPxkVCrGe56tRpPUSePSdGcZ_w%40mail.gmail.com

I didn't had the time to read the whole thread, but it might be quite
interesting.

>
> [1] https://oracle-randolf.blogspot.com/2009/02/explain-plan-on-ddls.html
> [2]
> https://stackoverflow.com/questions/827123/how-can-i-estimate-the-size-of-an-oracle-index
> [3]
> https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/EXPLAIN-PLAN.html
> [4]
> https://docs.oracle.com/en/database/oracle/oracle-database/21/arpls/DBMS_SPACE.html#GUID-B6EB7527-BC67-4394-9E7A-01F2790C409A
>
> Regards,
>
> Sami
>

--
Guillaume.


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-01-22 00:33:00
Message-ID: CAA5RZ0tTC66X2TZURqR9ThFt7YPVwGwFUBVYkvjn3=zuRKq4UA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> You might be interested by this thread "Thinking about EXPLAIN ALTER TABLE":

> https://www.postgresql.org/message-id/CAM-w4HNm1M5J-ow8UjTcqRe3JPxkVCrGe56tRpPUSePSdGcZ_w%40mail.gmail.com

I reviewed this thread, and the primary issue with the EXPLAIN command lies
in the inability to predict all the steps the ALTER TABLE will take as some
are made in phase 2 or 3. It is unlikely that all significant
decisions can be made
in phase 1.

However, I don't think that EXPLAIN addresses the same problem as the proposed
VERBOSE option. Consider, for instance, a user who intends to perform a schema
change that includes a long sequence of ALTERs. These steps will depend on each
other. Tom raises this point of dependent ALTERs [2] but for a
different reason.
I think however, this is an important point. How will EXPLAIN help
here? It may not tell
you the truth, because it does not actually do the work and can't know
the future
state of the schema.

VERBOSE on the other hand will perform the steps, and a user
can test these changes in a test environment or a schema-only restore
and know exactly what to expect in production.

As this thread has expanded beyond its original scope,
I believe this broader discussion is valuable. Rather than merely
addressing the DEBUG1 issue, we can work toward a more complete
and beneficial solution.

Thoughts?

[0] https://www.postgresql.org/message-id/26597.1544460770%40sss.pgh.pa.us

Regards,

Sami


From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: improve DEBUG1 logging of parallel workers for CREATE INDEX?
Date: 2025-02-03 20:29:03
Message-ID: CAA5RZ0tW=9qE6zLxuSQBimG_JjHg-Z0PK7dKn3r21NH9_Q+Rwg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

At this point I am planning on withdrawing this patch
from the March commitfest. I don't think fixing the REINDEX
debug1 output makes a whole lot of sense. I still think more logging
for (CREATE|ALTER) (INDEX|TABLE) will be a good to have but there
needs to be more discussion about the best approach.

Regards,

Sami