On 27/12/2023 12:07, Tom Lane wrote:
> Andrei Lepikhov <[email protected]> writes:
>> To be clear. In [1], I mentioned we can perform micro-benchmarks and
>> structure costs of operators. At least for fixed-length operators, it is
>> relatively easy.
>
> I repeat what I said: this is a fool's errand. You will not get
> trustworthy results even for the cases you measured, let alone
> all the rest. I'd go as far as to say I would not believe your
> microbenchmarks, because they would only apply for one platform,
> compiler, backend build, phase of the moon, etc.
Thanks for the explanation.
I have removed all the cost-related code. The patch still needs to be
finished: I will smooth the code further and rewrite the regression tests,
since many of them look silly without cost-dependent reorderings.
Alexander's remarks also still have to be addressed.
But it already works well. Look:
Preliminaries:
CREATE TABLE t(x int, y int, z text, w int);
INSERT INTO t SELECT gs%100,gs%100, 'abc' || gs%10, gs
FROM generate_series(1,10000) AS gs;
CREATE INDEX abc ON t(x,y);
ANALYZE t;
SET enable_hashagg = 'off';
The patch eliminates an unneeded Sort operation:
explain SELECT x,y FROM t GROUP BY (x,y);
explain SELECT x,y FROM t GROUP BY (y,x);
It engages incremental sort:
explain SELECT x,y FROM t GROUP BY (x,y,z,w);
explain SELECT x,y FROM t GROUP BY (z,y,w,x);
explain SELECT x,y FROM t GROUP BY (w,z,x,y);
explain SELECT x,y FROM t GROUP BY (w,x,z,y);
Works with subqueries:
explain SELECT x,y
FROM (SELECT * FROM t ORDER BY x,y,w,z) AS q1
GROUP BY (w,x,z,y);
explain SELECT x,y
FROM (SELECT * FROM t ORDER BY x,y,w,z LIMIT 100) AS q1
GROUP BY (w,x,z,y);
But rearranging the grouping keys to match an ORDER BY clause doesn't work:
DROP INDEX abc;
explain SELECT x,w,z FROM t GROUP BY (w,x,z) ORDER BY (x,z,w);
I think the reason is that sort_pathkeys and group_pathkeys are
physically different structures, so we can't just compare pointers here.
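
To illustrate the distinction I have in mind, here is a minimal standalone
sketch. It is not PostgreSQL code: ToyPathKey, ToyKeyList and both helper
functions are hypothetical names invented for this example, and the sketch
does not model how the planner actually builds its pathkey lists. It only
shows that two separately built containers can hold the very same key
objects in the same order, so comparing the container pointers reports a
mismatch while an element-wise walk reports a match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sort key; one shared object per column. */
typedef struct ToyPathKey
{
    const char *column;
} ToyPathKey;

/* Hypothetical stand-in for a key list: an array of pointers to shared keys. */
typedef struct ToyKeyList
{
    size_t      len;
    const ToyPathKey *const *keys;
} ToyKeyList;

/* Identity of the containers: true only if both point to the same list. */
static bool
same_list_pointer(const ToyKeyList *a, const ToyKeyList *b)
{
    return a == b;
}

/* Element-wise check: same length and the same key object at each position. */
static bool
same_keys_in_order(const ToyKeyList *a, const ToyKeyList *b)
{
    assert(a != NULL && b != NULL);

    if (a->len != b->len)
        return false;

    for (size_t i = 0; i < a->len; i++)
    {
        if (a->keys[i] != b->keys[i])
            return false;
    }
    return true;
}
```

If two ToyKeyList values are filled from the same three ToyPathKey objects,
same_list_pointer() returns false for them, while same_keys_in_order()
returns true. Whether this is what really happens in the ORDER BY case
above is still only my guess.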
--
regards,
Andrei Lepikhov
Postgres Professional