Re: Partial aggregates pushdown - Mailing list pgsql-hackers
From | Alexander Pyhalov |
---|---|
Subject | Re: Partial aggregates pushdown |
Date | |
Msg-id | [email protected] Whole thread Raw |
In response to | RE: Partial aggregates pushdown ("[email protected]" <[email protected]>) |
List | pgsql-hackers |
[email protected] писал(а) 2024-05-28 00:30: > Hi Mr. Pyhalov. > > Sorry for the late reply. > Thank you for your modification and detailed review. > I attach a fixed patch, have been not yet rebased. > > Monday, 25 March 2024 16:01 Alexander Pyhalov > <[email protected]>:. >> Comment in nodeAgg.c seems to be strange: >> >> 1079 /* >> 1080 * If the agg's finalfn is provided and PARTIAL_AGGREGATE >> keyword is >> 1081 * not specified, apply the agg's finalfn. >> 1082 * If PARTIAL_AGGREGATE keyword is specified and the >> transValue type >> 1083 * is internal, apply the agg's serialfn. In this case, if >> the agg's >> 1084 * serialfn must not be invalid. Otherwise return >> transValue. >> 1085 */ >> >> Likely, you mean: >> >> ... In this case the agg'ss serialfn must not be invalid... > Fixed. > >> Lower, in the same file, please, correct error message: >> >> 1136 if(!OidIsValid(peragg->serialfn_oid)) >> 1137 elog(ERROR, "serialfunc is note provided >> for partial aggregate"); >> >> it should be "serialfunc is not provided for partial aggregate" > Fixed. > >> Also something is wrong with the following test : >> >> SELECT /* aggregate <> partial aggregate */ >> array_agg(c_int4array), array_agg(b), >> avg(b::int2), avg(b::int4), avg(b::int8), avg(c_interval), >> avg(b::float4), avg(b::float8), >> corr(b::float8, (b * b)::float8), >> covar_pop(b::float8, (b * b)::float8), >> covar_samp(b::float8, (b * b)::float8), >> regr_avgx((2 * b)::float8, b::float8), >> ..... >> >> Its results have changed since last patch. Do they depend on daylight >> saving time? > You are right. In my environment, TimeZone is set to 'PST8PDT' > with which timetz values depends on daylight saving time. > Changed TimeZone to 'UTC' in this test. > >> You can see that filter is applied before append. The result is >> correct >> only by chance, as sum in every partition is actually < 700. If you >> lower this bound, let's say, to 200, you'll start getting wrong >> results >> as data is filtered prior to aggregation. >> >> It seems, however, that in partial case you should just avoid pulling >> conditions from having qual at all, all filters will be applied on >> upper >> level. Something like > Thank you for your modification. > >> Found one more problem. You can fire partial aggregate over >> partitioned >> table, but convert_combining_aggrefs() will make non-partial copy, >> which >> leads to >> 'variable not found in subplan target list' error. > Thanks for the correction as well. > As you pointed out, > the original patch certainly had the potential to cause problems. > However, I could not actually reproduce the problem in cases such as > the following. > > Settings: > t(c1, c2) is a patitioned table whose partition key is c1. > t1, t2 are patitions of t and are partitioned table. > t11, t12: partitions of t1 and foreign table of postgres_fdw. > t21, t22: partitions of t2 and foreign table of postgres_fdw. > Query: > select c2 / 2, sum(c1) from t group by c2 / 2 order by 1 > > If you have a reproducible example, I would like to add it to > the regression test. > Do you have a reproducible example? > >> Also denied partial agregates pushdown on server version mismatch. >> Should check_partial_aggregate_support be 'true' by default? > Could we discuss this point after we determine how to transfer state > values? > If we determine this point, we can easly determine whether > check_partial_aggregate_support shold be 'true' by default. > >> I'm not sure what to do with current grammar - it precludes partial >> distinct aggregates. I understand that it's currently impossible to >> have >> partial aggregation for distinct agregates -but does it worth to have >> such restriction at grammar level? > If partial aggregation for distinct agregates becomes possible in the > future, > I see no problem with the policy of accepting new SQL keywords, > such as "PARTIL_AGGREGATE DISTINCT". BTW, there's I have an issue with test results in the last version of the patch. Attaching regression diffs. I have partial sum over c_interval instead of sum(c_interval). -- Best regards, Alexander Pyhalov, Postgres Professional
Attachment
pgsql-hackers by date: