Home > mailing lists

Re: Partial aggregates pushdown - Mailing list pgsql-hackers

From	Alexander Pyhalov
Subject	Re: Partial aggregates pushdown
Date	May 28, 2024 05:57:39
Msg-id	[email protected] Whole thread Raw
In response to	RE: Partial aggregates pushdown ("[email protected]" <[email protected]>)
List	pgsql-hackers

Tree view

[email protected] писал(а) 2024-05-28 00:30:
> Hi Mr. Pyhalov.
> 
> Sorry for the late reply.
> Thank you for your modification and detailed review.
> I attach a fixed patch, have been not yet rebased.
> 
> Monday, 25 March 2024 16:01 Alexander Pyhalov 
> <[email protected]>:.
>> Comment in nodeAgg.c seems to be strange:
>> 
>> 1079 /*
>> 1080 * If the agg's finalfn is provided and PARTIAL_AGGREGATE
>> keyword is
>> 1081 * not specified, apply the agg's finalfn.
>> 1082 * If PARTIAL_AGGREGATE keyword is specified and the
>> transValue type
>> 1083 * is internal, apply the agg's serialfn. In this case, if
>> the agg's
>> 1084 * serialfn must not be invalid. Otherwise return
>> transValue.
>> 1085 */
>> 
>> Likely, you mean:
>> 
>> ... In this case the agg'ss serialfn must not be invalid...
> Fixed.
> 
>> Lower, in the same file, please, correct error message:
>> 
>> 1136 if(!OidIsValid(peragg->serialfn_oid))
>> 1137  elog(ERROR, "serialfunc is note provided
>> for partial aggregate");
>> 
>> it should be "serialfunc is not provided for partial aggregate"
> Fixed.
> 
>> Also something is wrong with the following test :
>> 
>> SELECT /* aggregate <> partial aggregate */
>>  array_agg(c_int4array), array_agg(b),
>>  avg(b::int2), avg(b::int4), avg(b::int8), avg(c_interval),
>>  avg(b::float4), avg(b::float8),
>>  corr(b::float8, (b * b)::float8),
>>  covar_pop(b::float8, (b * b)::float8),
>>  covar_samp(b::float8, (b * b)::float8),
>>  regr_avgx((2 * b)::float8, b::float8),
>> .....
>> 
>> Its results have changed since last patch. Do they depend on daylight
>> saving time?
> You are right. In my environment, TimeZone is set to 'PST8PDT'
> with which timetz values depends on daylight saving time.
> Changed TimeZone to 'UTC' in this test.
> 
>> You can see that filter is applied before append. The result is 
>> correct
>> only by chance, as sum in every partition is actually < 700. If you
>> lower this bound, let's say, to 200, you'll start getting wrong 
>> results
>> as data is filtered prior to aggregation.
>> 
>> It seems, however, that in partial case you should just avoid pulling
>> conditions from having qual at all, all filters will be applied on 
>> upper
>> level. Something like
> Thank you for your modification.
> 
>> Found one more problem. You can fire partial aggregate over 
>> partitioned
>> table, but convert_combining_aggrefs() will make non-partial copy, 
>> which
>> leads to
>> 'variable not found in subplan target list' error.
> Thanks for the correction as well.
> As you pointed out,
> the original patch certainly had the potential to cause problems.
> However, I could not actually reproduce the problem in cases such as 
> the following.
> 
>   Settings:
>     t(c1, c2) is a patitioned table whose partition key is c1.
>     t1, t2 are patitions of t and are partitioned table.
>     t11, t12: partitions of t1 and foreign table of postgres_fdw.
>     t21, t22: partitions of t2 and foreign table of postgres_fdw.
>   Query:
>     select c2 / 2, sum(c1) from t group by c2 / 2 order by 1
> 
> If you have a reproducible example, I would like to add it to
> the regression test.
> Do you have a reproducible example?
> 
>> Also denied partial agregates pushdown on server version mismatch.
>> Should check_partial_aggregate_support be 'true' by default?
> Could we discuss this point after we determine how to transfer state 
> values?
> If we determine this point, we can easly determine whether 
> check_partial_aggregate_support shold be 'true' by default.
> 
>> I'm not sure what to do with current grammar - it precludes partial
>> distinct aggregates. I understand that it's currently impossible to 
>> have
>> partial aggregation for distinct agregates -but does it worth to have
>> such restriction at grammar level?
> If partial aggregation for distinct agregates becomes possible in the 
> future,
> I see no problem with the policy of accepting new SQL keywords,
> such as "PARTIL_AGGREGATE DISTINCT".

BTW, there's I have an issue with test results in the last version of 
the patch. Attaching regression diffs.
I have partial sum over c_interval instead of sum(c_interval).


-- 
Best regards,
Alexander Pyhalov,
Postgres Professional

Attachment

regression.diffs

pgsql-hackers by date:

From: Alexander Pyhalov
Date: 28 May 2024, 05:45:04
Subject: Re: Partial aggregates pushdown

From: Alexander Pyhalov
Date: 28 May 2024, 06:05:02
Subject: Re: CREATE INDEX CONCURRENTLY on partitioned index

Re: Partial aggregates pushdown - Mailing list pgsql-hackers

Attachment

Previous

Next