Re: Partial aggregates pushdown - Mailing list pgsql-hackers
From | Alexander Pyhalov |
---|---|
Subject | Re: Partial aggregates pushdown |
Date | |
Msg-id | [email protected] Whole thread Raw |
In response to | RE: Partial aggregates pushdown ("[email protected]" <[email protected]>) |
List | pgsql-hackers |
[email protected] писал 2023-09-25 06:18: > Hi Mr.Bruce, Mr.Pyhalov, Mr.Finnerty, hackers. > > Thank you for your valuable comments. I sincerely apologize for the > very late reply. > Here is a response to your comments or a fix to the patch. > > Tuesday, August 8, 2023 at 3:31 Bruce Momjian >> > I have modified the program except for the point "if the version of the remote server is less than PG17". >> > Instead, we have addressed the following. >> > "If check_partial_aggregate_support is true and the remote server version is older than the local server >> > version, postgres_fdw does not assume that the partial aggregate function is on the remote server unless >> > the partial aggregate function and the aggregate function match." >> > The reason for this is to maintain compatibility with any aggregate function that does not support partial >> > aggregate in one version of V1 (V1 is PG17 or higher), even if the next version supports partial aggregate. >> > For example, string_agg does not support partial aggregation in PG15, but it will support partial aggregation >> > in PG16. >> >> Just to clarify, I think you are saying: >> >> If check_partial_aggregate_support is true and the remote >> server >> version is older than the local server version, postgres_fdw >> checks if the partial aggregate function exists on the remote >> server during planning and only uses it if it does. >> >> I tried to phrase it in a positive way, and mentioned the plan time >> distinction. Also, I am sorry I was away for most of July and am just >> getting to this. > Thanks for your comment. In the documentation, the description of > check_partial_aggregate_support is as follows > (please see postgres-fdw.sgml). > -- > check_partial_aggregate_support (boolean) > Only if this option is true, during query planning, postgres_fdw > connects to the remote server and check if the remote server version > is older than the local server version. If so, postgres_fdw assumes > that for each built-in aggregate function, the partial aggregate > function is not defined on the remote server unless the partial > aggregate function and the aggregate function match. The default is > false. > -- > > Thursday, 20 July 2023 19:23 Alexander Pyhalov > <[email protected]>: >> [email protected] писал 2023-07-19 03:43: >> > Hi Mr.Pyhalov, hackers. >> >> > 3) >> > I modified the patch to safely do a partial aggregate pushdown for >> > queries which contain having clauses. >> > >> >> Hi. >> Sorry, but I don't see how it could work. > We apologize for any inconvenience caused. > Thanks to Pyhalov's and Jim's comments, I have realized that I have > made a fundamental mistake regarding the pushdown of the HAVING clause > and the difficulty of achieving it performing Partial aggregate > pushdown. > So, I removed the codes about pushdown of the HAVING clause performing > Partial aggregate pushdown. > > Thursday, 20 July 2023 19:23 Alexander Pyhalov > <[email protected]>: >> As for changes in planner.c (setGroupClausePartial()) I have several >> questions. >> >> 1) Why don't we add non_group_exprs to pathtarget->exprs when >> partial_target->exprs is not set? >> >> 2) We replace extra->partial_target->exprs with partial_target->exprs >> after processing. Why are we sure that after this tleSortGroupRef is >> correct? > Response to 1) > The code you pointed out was unnecessary. I have removed this code. > Also, the process of adding PlaceHolderVar's expr to partial_target was > missing. > So I fixed this. > > Response to 2) > The making procedures extra->groupClausePartial and > extra->partial_target > in make_partial_grouping_target for this patch is as follows. > STEP1. From grouping_target->exprs, extract Aggref, Var and > Placeholdervar that are not included in Aggref. > STEP2. setGroupClausePartial sets the copy of original groupClause to > extra->groupClausePartial > and sets the copy of original partial_target to extra->partial_target. > STEP3. setGroupClausePartial adds Var and Placeholdervar in STEP1 to > partial_target. > The sortgroupref of partial_target->sortgrouprefs to be added to value > is set to > (the maximum value of the existing sortgroupref) + 1. > setGroupClausePartial adds data sgc of sortgroupclause type where > sgc->tlesortgroupref > matches the sortgroupref to GroupClause. > STEP4. add_new_columns_to_pathtarget adds STEP1's Aggref to > partial_target. > > Due to STEP2, the list of tlesortgrouprefs set in > extra->groupClausePartial is not duplicated. Do you mean that extra->partial_target->sortgrouprefs is not replaced, and so we preserve tlesortgroupref numbers? I'm suspicious about rewriting extra->partial_target->exprs with partial_target->exprs - I'm still not sure why we don't we loose information, added by add_column_to_pathtarget() to extra->partial_target->exprs? Also look at the following example. EXPLAIN VERBOSE SELECT count(*) , (b/2)::numeric FROM pagg_tab GROUP BY b/2 ORDER BY 1; QUERY PLAN --------------------------------------------------------------------------------------------------- Sort (cost=511.35..511.47 rows=50 width=44) Output: (count(*)), ((((pagg_tab.b / 2)))::numeric), ((pagg_tab.b / 2)) Sort Key: (count(*)) -> Finalize HashAggregate (cost=509.06..509.94 rows=50 width=44) Output: count(*), (((pagg_tab.b / 2)))::numeric, ((pagg_tab.b / 2)) Group Key: ((pagg_tab.b / 2)) -> Append (cost=114.62..506.06 rows=600 width=16) -> Foreign Scan (cost=114.62..167.69 rows=200 width=16) Output: ((pagg_tab.b / 2)), (PARTIAL count(*)), pagg_tab.b Relations: Aggregate on (public.fpagg_tab_p1 pagg_tab) Remote SQL: SELECT (b / 2), count(*), b FROM public.pagg_tab_p1 GROUP BY 1, 2 -> Foreign Scan (cost=114.62..167.69 rows=200 width=16) Output: ((pagg_tab_1.b / 2)), (PARTIAL count(*)), pagg_tab_1.b Relations: Aggregate on (public.fpagg_tab_p2 pagg_tab_1) Remote SQL: SELECT (b / 2), count(*), b FROM public.pagg_tab_p2 GROUP BY 1, 2 -> Foreign Scan (cost=114.62..167.69 rows=200 width=16) Output: ((pagg_tab_2.b / 2)), (PARTIAL count(*)), pagg_tab_2.b Relations: Aggregate on (public.fpagg_tab_p3 pagg_tab_2) Remote SQL: SELECT (b / 2), count(*), b FROM public.pagg_tab_p3 GROUP BY 1, 2 Note that group by is still deparsed incorrectly. -- Best regards, Alexander Pyhalov, Postgres Professional
pgsql-hackers by date: