Lists: | pgsql-hackers |
---|
From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-05-09 00:01:08 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Restarting a large instance took twice as long as I expected due to not
checking interrupts in (at least) statext_ndistinct_build. Long enough that I
attached (and was able to attach) a debugger to verify, which I think is too
long. I think it could cause issues for an high-availability cluster or other
script if it takes too long to shut down.
The tables being auto-analyzed have 9 exteneded stats objects, each with stats
target=10. 7 of those are (ndistinct) stats on 4 simple columns plus 1
expression (5 total). And the other 2 stats objects are expressional stats
(necessarily on a single expression).
From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-05-09 03:31:16 |
Message-ID: | YniLBN/[email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Sun, May 08, 2022 at 07:01:08PM -0500, Justin Pryzby wrote:
> Restarting a large instance took twice as long as I expected due to not
> checking interrupts in (at least) statext_ndistinct_build. Long enough that I
> attached (and was able to attach) a debugger to verify, which I think is too
> long. I think it could cause issues for an high-availability cluster or other
> script if it takes too long to shut down.
Hmm. That's annoying.
> The tables being auto-analyzed have 9 exteneded stats objects, each with stats
> target=10. 7 of those are (ndistinct) stats on 4 simple columns plus 1
> expression (5 total). And the other 2 stats objects are expressional stats
> (necessarily on a single expression).
How long can the backend remain unresponsive? I don't think that
anybody would object to the addition of some CHECK_FOR_INTERRUPTS() in
areas where it would be efficient to make the shutdown quicker, but
we need to think carefully about the places where we'd want to add
these.
--
Michael
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-05-09 03:36:33 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Michael Paquier <michael(at)paquier(dot)xyz> writes:
> How long can the backend remain unresponsive? I don't think that
> anybody would object to the addition of some CHECK_FOR_INTERRUPTS() in
> areas where it would be efficient to make the shutdown quicker, but
> we need to think carefully about the places where we'd want to add
> these.
CHECK_FOR_INTERRUPTS is really quite cheap, just a test-and-branch.
I wouldn't put it in a *very* tight loop, but one test per row
processed while gathering stats is unlikely to be a problem.
regards, tom lane
From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-05-09 13:11:37 |
Message-ID: | CA+TgmoY=Q9C9n=P=e8gTqLFTLza89t-cz+X-16LPaxq2+=qY2g@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Sun, May 8, 2022 at 11:36 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Michael Paquier <michael(at)paquier(dot)xyz> writes:
> > How long can the backend remain unresponsive? I don't think that
> > anybody would object to the addition of some CHECK_FOR_INTERRUPTS() in
> > areas where it would be efficient to make the shutdown quicker, but
> > we need to think carefully about the places where we'd want to add
> > these.
>
> CHECK_FOR_INTERRUPTS is really quite cheap, just a test-and-branch.
> I wouldn't put it in a *very* tight loop, but one test per row
> processed while gathering stats is unlikely to be a problem.
+1. If we're finding things stalling that would be fixed by adding
CHECK_FOR_INTERRUPTS(), we should generally just add it. In the
unlikely event that this causes a performance problem, we can try to
figure out some other solution, but not responding to interrupts isn't
the right way to economize.
--
Robert Haas
EDB: http://www.enterprisedb.com
From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-06-03 15:28:37 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Mon, May 09, 2022 at 09:11:37AM -0400, Robert Haas wrote:
> On Sun, May 8, 2022 at 11:36 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Michael Paquier <michael(at)paquier(dot)xyz> writes:
> > > How long can the backend remain unresponsive? I don't think that
> > > anybody would object to the addition of some CHECK_FOR_INTERRUPTS() in
> > > areas where it would be efficient to make the shutdown quicker, but
> > > we need to think carefully about the places where we'd want to add
> > > these.
> >
> > CHECK_FOR_INTERRUPTS is really quite cheap, just a test-and-branch.
> > I wouldn't put it in a *very* tight loop, but one test per row
> > processed while gathering stats is unlikely to be a problem.
>
> +1. If we're finding things stalling that would be fixed by adding
> CHECK_FOR_INTERRUPTS(), we should generally just add it. In the
> unlikely event that this causes a performance problem, we can try to
> figure out some other solution, but not responding to interrupts isn't
> the right way to economize.
Reproduce the problem for ndistinct and dependencies like:
DROP TABLE t; CREATE TABLE t AS SELECT i A,1+i B,2+i C,3+i D,4+i E,5+i F, now() AS ts FROM generate_series(1.0, 99999.0)i; VACUUM t;
DROP STATISTICS stxx; CREATE STATISTICS stxx (ndistinct) ON mod(a,14),mod(b,15),mod(c,16),mod(d,17),mod(e,18),mod(f,19) FROM t;
ANALYZE VERBOSE t;
Maybe this should actually call vacuum_delay_point(), like
compute_scalar_stats(). For MCV, there seems to be no issue, since those
functions are being called (but only for expressional stats). But maybe I've
just failed to make a large enough, non-expressional MCV list for the problem
to be apparent.
The patch is WIP, but whatever we end up with should probably be backpatched at
least to v14, where expressional indexes were introduced, since they're likely
to have more columns, and are slower to compute.
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index e8f71567b4e..e5538dcd4e1 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -19,6 +19,7 @@
#include "catalog/pg_statistic_ext.h"
#include "catalog/pg_statistic_ext_data.h"
#include "lib/stringinfo.h"
+#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "nodes/nodes.h"
#include "nodes/pathnodes.h"
@@ -383,6 +384,8 @@ statext_dependencies_build(StatsBuildData *data)
MVDependency *d;
MemoryContext oldcxt;
+ CHECK_FOR_INTERRUPTS();
+
/* release memory used by dependency degree calculation */
oldcxt = MemoryContextSwitchTo(cxt);
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index dd67b19b6fa..9db1d0325cd 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -269,6 +269,8 @@ statext_mcv_build(StatsBuildData *data, double totalrows, int stattarget)
SortItem **freqs;
int *nfreqs;
+ // CHECK_FOR_INTERRUPTS();
+
/* used to search values */
tmp = (MultiSortSupport) palloc(offsetof(MultiSortSupportData, ssup)
+ sizeof(SortSupportData));
diff --git a/src/backend/statistics/mvdistinct.c b/src/backend/statistics/mvdistinct.c
index 6ade5eff78c..3b739ab7ca0 100644
--- a/src/backend/statistics/mvdistinct.c
+++ b/src/backend/statistics/mvdistinct.c
@@ -29,6 +29,7 @@
#include "catalog/pg_statistic_ext.h"
#include "catalog/pg_statistic_ext_data.h"
#include "lib/stringinfo.h"
+#include "miscadmin.h"
#include "statistics/extended_stats_internal.h"
#include "statistics/statistics.h"
#include "utils/fmgrprotos.h"
@@ -114,6 +115,8 @@ statext_ndistinct_build(double totalrows, StatsBuildData *data)
MVNDistinctItem *item = &result->items[itemcnt];
int j;
+ CHECK_FOR_INTERRUPTS();
+
item->attributes = palloc(sizeof(AttrNumber) * k);
item->nattributes = k;
From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-06-05 01:42:33 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Fri, Jun 03, 2022 at 10:28:37AM -0500, Justin Pryzby wrote:
> Maybe this should actually call vacuum_delay_point(), like
> compute_scalar_stats().
I think vacuum_delay_point() would be wrong for these cases, since they don't
call "fetchfunc()", like the other places which use vacuum_delay_point.
> For MCV, there seems to be no issue, since those
> functions are being called (but only for expressional stats). But maybe I've
> just failed to make a large enough, non-expressional MCV list for the problem
> to be apparent.
I reproduced the issue with MCV like this:
DROP TABLE IF EXISTS t; CREATE TABLE t AS SELECT a::text,b::text,c::text,d::text,e::text,f::text,g::text FROM generate_series(1000001,1000006)a,generate_series(1000001,1000006)b,generate_series(1000001,1000006)c,generate_series(1000001,1000006)d,generate_series(1000001,1000006)e,generate_series(1000001,1000006)f,generate_series(1000001,1000006)g,generate_series(1000001,1000006)h; VACUUM t;
DROP STATISTICS IF EXISTS stxx; CREATE STATISTICS stxx (mcv) ON a,b,c,d,e,f FROM t; ALTER STATISTICS stxx SET STATISTICS 9999; ANALYZE VERBOSE t;
This is slow (25 seconds) inside qsort:
(gdb) bt
#0 __memcmp_sse4_1 () at ../sysdeps/x86_64/multiarch/memcmp-sse4.S:1020
#1 0x00005653d8686fac in varstrfastcmp_locale (a1p=0x5653dce67d54 "1000004~", len1=7, a2p=0x5653e895ffa4 "1000004~", len2=7, ssup=ssup(at)entry=0x5653d98c37b8) at varlena.c:2444
#2 0x00005653d8687161 in varlenafastcmp_locale (x=94918188367184, y=94918384418720, ssup=0x5653d98c37b8) at varlena.c:2270
#3 0x00005653d85134d8 in ApplySortComparator (ssup=0x5653d98c37b8, isNull2=<optimized out>, datum2=<optimized out>, isNull1=<optimized out>, datum1=<optimized out>) at ../../../src/include/utils/sortsupport.h:224
#4 multi_sort_compare (a=0x7fa587b44e58, b=0x7fa5875f0dd0, arg=0x5653d98c37b0) at extended_stats.c:903
#5 0x00005653d8712eed in qsort_arg (data=data(at)entry=0x7fa5875f0050, n=<optimized out>, n(at)entry=1679616, element_size=element_size(at)entry=24, compare=compare(at)entry=0x5653d8513483 <multi_sort_compare>,
arg=arg(at)entry=0x5653d98c37b0) at ../../src/include/lib/sort_template.h:349
#6 0x00005653d851415f in build_sorted_items (data=data(at)entry=0x7fa58f2e1050, nitems=nitems(at)entry=0x7ffe4f764e5c, mss=mss(at)entry=0x5653d98c37b0, numattrs=6, attnums=0x7fa58f2e1078) at extended_stats.c:1134
#7 0x00005653d8515d84 in statext_mcv_build (data=data(at)entry=0x7fa58f2e1050, totalrows=totalrows(at)entry=1679616, stattarget=stattarget(at)entry=9999) at mcv.c:204
#8 0x00005653d8513819 in BuildRelationExtStatistics (onerel=onerel(at)entry=0x7fa5b26ef658, inh=inh(at)entry=false, totalrows=1679616, numrows=numrows(at)entry=1679616, rows=rows(at)entry=0x7fa5a4103050, natts=natts(at)entry=7,
vacattrstats=vacattrstats(at)entry=0x5653d98b76b0) at extended_stats.c:213
The fix seems to be to CHECK_FOR_INTERRUPTS() within multi_sort_compare().
That would supercede the other two CHECK_FOR_INTERRUPTS I'd proposed, and
handle mcv, depends, and ndistinct all at once.
Does that sound right ?
For MCV, there's also ~0.6sec spent in build_column_frequencies(), which (if
needed) would be addressed by adding CFI in sort_item_compare.
From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-06-06 07:23:34 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Sat, Jun 04, 2022 at 08:42:33PM -0500, Justin Pryzby wrote:
> The fix seems to be to CHECK_FOR_INTERRUPTS() within multi_sort_compare().
> That would supercede the other two CHECK_FOR_INTERRUPTS I'd proposed, and
> handle mcv, depends, and ndistinct all at once.
Hmm. I have to admit that adding a CFI() in multi_sort_compare()
stresses me a bit as it is dependent on the number of rows involved,
and it can be used as a qsort routine.
--
Michael
From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-06-17 22:25:49 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Mon, Jun 06, 2022 at 04:23:34PM +0900, Michael Paquier wrote:
> On Sat, Jun 04, 2022 at 08:42:33PM -0500, Justin Pryzby wrote:
> > The fix seems to be to CHECK_FOR_INTERRUPTS() within multi_sort_compare().
> > That would supercede the other two CHECK_FOR_INTERRUPTS I'd proposed, and
> > handle mcv, depends, and ndistinct all at once.
>
> Hmm. I have to admit that adding a CFI() in multi_sort_compare()
> stresses me a bit as it is dependent on the number of rows involved,
> and it can be used as a qsort routine.
That's exactly the problem for which I showed a backtrace - it took 10s of
seconds to do qsort, which is (uh) a human timescale and too long to be
unresponsive, even if I create on a table with many rows a stats object with a
lot of columns and a high stats target.
Attachment | Content-Type | Size |
---|---|---|
0001-wip-check-for-interrupts-during-extended-stats.patch | text/x-diff | 950 bytes |
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-01 23:19:11 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Justin Pryzby <pryzby(at)telsasoft(dot)com> writes:
> On Mon, Jun 06, 2022 at 04:23:34PM +0900, Michael Paquier wrote:
>> Hmm. I have to admit that adding a CFI() in multi_sort_compare()
>> stresses me a bit as it is dependent on the number of rows involved,
>> and it can be used as a qsort routine.
> That's exactly the problem for which I showed a backtrace - it took 10s of
> seconds to do qsort, which is (uh) a human timescale and too long to be
> unresponsive, even if I create on a table with many rows a stats object with a
> lot of columns and a high stats target.
Hmm. On my machine, the example last shown upthread takes about 9
seconds, which I agree is a mighty long time to be unresponsive
--- but it appears that fully half of that elapses before we
reach multi_sort_compare for the first time. The first half of
the ANALYZE run does seem to contain some CFI calls, but they
are not exactly thick on the ground there either. So I'm feeling
like this patch isn't ambitious enough.
I tried interrupting at a random point and then stepping, and
look what I hit after just a couple of steps:
(gdb) s
qsort_arg (data=data(at)entry=0x13161410, n=<optimized out>, n(at)entry=1679616,
element_size=element_size(at)entry=16,
compare=compare(at)entry=0x649450 <compare_scalars>,
arg=arg(at)entry=0x7ffec539c0f0) at ../../src/include/lib/sort_template.h:353
353 if (r == 0)
(gdb)
358 pc -= ST_POINTER_STEP;
(gdb)
359 DO_CHECK_FOR_INTERRUPTS();
That, um, piqued my interest. After a bit of digging,
I modestly propose the attached. I'm not sure if it's
okay to back-patch this, because maybe someone out there
is relying on qsort() to be incapable of throwing an error
--- but it'd solve the problem at hand and a bunch of other
issues of the same ilk.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
enable-check-for-interrupts-in-qsort.patch | text/x-diff | 1.1 KB |
From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-02 00:07:04 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Fri, Jul 01, 2022 at 07:19:11PM -0400, Tom Lane wrote:
> I tried interrupting at a random point and then stepping, and
> look what I hit after just a couple of steps:
I'd come up with the trick of setting
SET backtrace_functions='ProcessInterrupts';
> That, um, piqued my interest. After a bit of digging,
> I modestly propose the attached. I'm not sure if it's
> okay to back-patch this, because maybe someone out there
> is relying on qsort() to be incapable of throwing an error
> --- but it'd solve the problem at hand and a bunch of other
> issues of the same ilk.
Confirmed this fixes the 3 types of extended stats, at least for the example
cases I made.
If it's okay to backpatch, v14 seems adequate since the problem is more
prominent with expressional statistics (I'm sure that's why I hit it) ...
otherwise v15 would do.
--
Justin
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-05 22:48:01 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Justin Pryzby <pryzby(at)telsasoft(dot)com> writes:
> On Fri, Jul 01, 2022 at 07:19:11PM -0400, Tom Lane wrote:
>> That, um, piqued my interest. After a bit of digging,
>> I modestly propose the attached. I'm not sure if it's
>> okay to back-patch this, because maybe someone out there
>> is relying on qsort() to be incapable of throwing an error
>> --- but it'd solve the problem at hand and a bunch of other
>> issues of the same ilk.
> Confirmed this fixes the 3 types of extended stats, at least for the example
> cases I made.
> If it's okay to backpatch, v14 seems adequate since the problem is more
> prominent with expressional statistics (I'm sure that's why I hit it) ...
> otherwise v15 would do.
After thinking for awhile, my inclination is to apply my patch in
HEAD and yours in v15/v14. We have a year to find out if there's
any problem with the more invasive check if we do it in HEAD;
but there's a lot less margin for error in v14, and not that
much in v15 either.
regards, tom lane
From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-05 23:20:13 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
On 2022-07-01 19:19:11 -0400, Tom Lane wrote:
> That, um, piqued my interest. After a bit of digging,
> I modestly propose the attached. I'm not sure if it's
> okay to back-patch this, because maybe someone out there
> is relying on qsort() to be incapable of throwing an error
> --- but it'd solve the problem at hand and a bunch of other
> issues of the same ilk.
I'm worried about this. Interrupting random qsorts all over seems like it
could end up corrupting state. We do things like qsort in building snapshots
etc. Are we confident that we handle interrupts reliably in all those places?
Greetings,
Andres Freund
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-05 23:37:03 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
I wrote:
>>> I modestly propose the attached. I'm not sure if it's
>>> okay to back-patch this, because maybe someone out there
>>> is relying on qsort() to be incapable of throwing an error
I thought I'd better try to check that, and I soon found several
places that *are* relying on qsort() to not be any smarter than the
libc version. Notably, guc.c qsorts its persistent-state GUC array
during add_guc_variable --- an interrupt there would leave the GUC
data structure in an inconsistent state. There are lesser hazards,
typically memory leaks, elsewhere. So I fear this can't fly as-is.
Nonetheless, it'd be a good idea to use an interruptible sort in
as many places as we can. If we don't mind YA copy of the qsort
code, I'd be inclined to propose inventing qsort_interruptible
which is like qsort_arg but also does CHECK_FOR_INTERRUPTS.
(There seems no reason to support a non-arg version, since you
can just pass NULL.) Or we could redefine qsort_arg as allowing
interrupts, but that might still carry some surprises for existing
code.
regards, tom lane
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-05 23:38:55 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Andres Freund <andres(at)anarazel(dot)de> writes:
> I'm worried about this. Interrupting random qsorts all over seems like it
> could end up corrupting state. We do things like qsort in building snapshots
> etc. Are we confident that we handle interrupts reliably in all those places?
Nope ... see my followup just now.
regards, tom lane
From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-05 23:49:46 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Tue, Jul 05, 2022 at 07:37:03PM -0400, Tom Lane wrote:
> Nonetheless, it'd be a good idea to use an interruptible sort in
> as many places as we can. If we don't mind YA copy of the qsort
> code, I'd be inclined to propose inventing qsort_interruptible
> which is like qsort_arg but also does CHECK_FOR_INTERRUPTS.
> (There seems no reason to support a non-arg version, since you
> can just pass NULL.) Or we could redefine qsort_arg as allowing
> interrupts, but that might still carry some surprises for existing
> code.
Agreed to use a new and different API for this purpose. It seems like
a tool where one could write some new code and use an interruptible
qsort() thinking that it is fine, still a preperly-documented API has
clear benefits.
--
Michael
From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-06 00:44:08 |
Message-ID: | CA+hUKGLB=cNksokWCPDwscUud0sy_LZ02o750S5ZNr_ny-spfQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Wed, Jul 6, 2022 at 11:37 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> qsort_interruptible
+1
FWIW compute_scalar_stats() was already noted as hot and perhaps a
candidate for specialisation (with more study needed), but
qsort_interruptible() seems like a sane use of ~4KB of text to me.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-07 19:50:51 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> On Wed, Jul 6, 2022 at 11:37 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> qsort_interruptible
> +1
So here's a patch that does it that way. I first meant to put the
new file into src/port/, but after remembering that that directory
has no backend-only functions, I went with src/backend/utils/sort/
instead.
For the moment I contented myself with changing qsort[_arg] calls
that occur during statistics collection. We have a lot more of course,
but many of them can be expected to not be dealing with much data,
and in some cases we might want some closer analysis to be sure there's
no performance hit. So I'm inclined to not worry too much about the
remaining qsort calls until somebody complains.
This could be back-patched to v14 without much worry, I should think.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
check-for-interrupts-in-statistics-qsorts.patch | text/x-diff | 16.5 KB |
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-12 20:31:39 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
I wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
>> On Wed, Jul 6, 2022 at 11:37 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> qsort_interruptible
>> +1
> So here's a patch that does it that way.
Hearing no comments, pushed.
regards, tom lane
From: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-12 21:03:26 |
Message-ID: | CALNJ-vR0M+zpYe30tLu4Y8HKFxTST9P8jNnWW-D_yh3Y9KDC0g@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Tue, Jul 12, 2022 at 1:31 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
> > Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> >> On Wed, Jul 6, 2022 at 11:37 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>> qsort_interruptible
>
> >> +1
>
> > So here's a patch that does it that way.
>
> Hearing no comments, pushed.
>
> regards, tom lane
>
>
> Hi,
Looking at the files under src/backend/utils/sort/, looks like license
header is missing from qsort_interruptible.c
Please consider the patch which adds license header to qsort_interruptible.c
Cheers
Attachment | Content-Type | Size |
---|---|---|
qsort-interruptible-copyright.patch | application/octet-stream | 595 bytes |
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: should check interrupts in BuildRelationExtStatistics ? |
Date: | 2022-07-12 21:09:31 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Zhihong Yu <zyu(at)yugabyte(dot)com> writes:
> Looking at the files under src/backend/utils/sort/, looks like license
> header is missing from qsort_interruptible.c
[ shrug ... ] qsort.c and qsort_arg.c don't have that either.
regards, tom lane