
Commit 3d00162

Remove useless self-joins
The Self Join Elimination (SJE) feature removes an inner join of a plain table to itself in the query tree if it is proved that the join can be replaced with a scan without impacting the query result. The self join and the inner relation are replaced with the outer one in the query, equivalence classes, and planner info structures. The inner restrictlist is also moved to the outer one, with duplicated clauses removed. Thus, this optimization reduces the length of the range table list (this especially makes sense for partitioned relations), reduces the number of restriction clauses and hence selectivity estimations, and can potentially improve the overall planner prediction for the query.

The SJE proof is based on the innerrel_is_unique machinery.

We can remove a self-join when for each outer row:
1. At most one inner row matches the join clause.
2. Each matched inner row must be (physically) the same row as the outer one.

In this patch we use the following approach to identify a self-join:
1. Collect all merge-joinable join quals which look like a.x = b.x
2. Add to the list above the baserestrictinfo of the inner table.
3. Check innerrel_is_unique() for the qual list. If it returns false, skip this pair of joining tables.
4. Check uniqueness, proved by the baserestrictinfo clauses. To prove the possibility of self-join elimination, the inner and outer clauses must have an exact match.

The relation replacement procedure is not trivial, and it is partly combined with the one used to remove useless left joins. Tests covering this feature were added to join.sql. Some regression tests changed due to the self-join removal logic.

Discussion: https://postgr.es/m/flat/64486b0b-0404-e39e-322d-0801154901f3%40postgrespro.ru
Author: Andrey Lepikhov, Alexander Kuzmenkov
Reviewed-by: Tom Lane, Robert Haas, Andres Freund, Simon Riggs, Jonathan S. Katz
Reviewed-by: David Rowley, Thomas Munro, Konstantin Knizhnik, Heikki Linnakangas
Reviewed-by: Hywel Carver, Laurenz Albe, Ronan Dunklau, vignesh C, Zhihong Yu
Reviewed-by: Greg Stark, Jaime Casanova, Michał Kłeczek, Alena Rybakina
Reviewed-by: Alexander Korotkov
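
The two removal conditions in the commit message can be illustrated with a small standalone model. This is only a sketch and not planner code: the `Row` type and `self_join_is_removable()` are hypothetical names invented here, the table is a plain array, and the join clause is assumed to be `a.x = b.x` on non-NULL integers. It checks by brute force what the planner proves from unique indexes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy row; "x" stands in for the join column. */
typedef struct Row
{
    int x;
    int payload;
} Row;

/*
 * Brute-force check of the two conditions for a self-join on a.x = b.x:
 *   1. each outer row matches at most one inner row;
 *   2. every matched inner row is physically the outer row itself.
 * Returns true when both hold for all outer rows, i.e. when the join
 * would produce exactly the rows of a single scan of the table.
 */
bool
self_join_is_removable(const Row *tab, size_t n)
{
    for (size_t outer = 0; outer < n; outer++)
    {
        size_t matches = 0;

        for (size_t inner = 0; inner < n; inner++)
        {
            if (tab[inner].x != tab[outer].x)
                continue;       /* join clause not satisfied */

            matches++;
            if (inner != outer)
                return false;   /* condition 2 violated */
        }

        if (matches > 1)
            return false;       /* condition 1 violated */
    }

    return true;
}
```

In this toy setting every outer row matches itself, so a violation of condition 1 always shows up as a violation of condition 2 as well. The planner cannot inspect rows this way, which is why the patch relies on uniqueness proofs over the collected quals instead.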
1 parent 99b9928 commit 3d00162

15 files changed, +2951 -82 lines changed


doc/src/sgml/config.sgml

Lines changed: 16 additions & 0 deletions
@@ -5593,6 +5593,22 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
        </listitem>
       </varlistentry>
 
+     <varlistentry id="guc-enable_self_join_removal" xreflabel="enable_self_join_removal">
+      <term><varname>enable_self_join_removal</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_self_join_removal</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's optimization which analyses
+        the query tree and replaces self joins with semantically equivalent
+        single scans. Takes into consideration only plain tables.
+        The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-seqscan" xreflabel="enable_seqscan">
       <term><varname>enable_seqscan</varname> (<type>boolean</type>)
       <indexterm>

src/backend/optimizer/path/indxpath.c

Lines changed: 39 additions & 0 deletions
@@ -3440,6 +3440,22 @@ bool
 relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
                               List *restrictlist,
                               List *exprlist, List *oprlist)
+{
+    return relation_has_unique_index_ext(root, rel, restrictlist,
+                                         exprlist, oprlist, NULL);
+}
+
+/*
+ * relation_has_unique_index_ext
+ *    Same as relation_has_unique_index_for(), but supports extra_clauses
+ *    parameter. If extra_clauses isn't NULL, return baserestrictinfo clauses
+ *    which were used to derive uniqueness.
+ */
+bool
+relation_has_unique_index_ext(PlannerInfo *root, RelOptInfo *rel,
+                              List *restrictlist,
+                              List *exprlist, List *oprlist,
+                              List **extra_clauses)
 {
     ListCell *ic;

@@ -3495,6 +3511,7 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
     {
         IndexOptInfo *ind = (IndexOptInfo *) lfirst(ic);
         int c;
+        List *exprs = NIL;
 
         /*
          * If the index is not unique, or not immediately enforced, or if it's
@@ -3546,6 +3563,24 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
                 if (match_index_to_operand(rexpr, c, ind))
                 {
                     matched = true; /* column is unique */
+
+                    if (bms_membership(rinfo->clause_relids) == BMS_SINGLETON)
+                    {
+                        MemoryContext oldMemCtx =
+                            MemoryContextSwitchTo(root->planner_cxt);
+
+                        /*
+                         * Add filter clause into a list allowing caller to
+                         * know if uniqueness have made not only by join
+                         * clauses.
+                         */
+                        Assert(bms_is_empty(rinfo->left_relids) ||
+                               bms_is_empty(rinfo->right_relids));
+                        if (extra_clauses)
+                            exprs = lappend(exprs, rinfo);
+                        MemoryContextSwitchTo(oldMemCtx);
+                    }
+
                     break;
                 }
             }
@@ -3588,7 +3623,11 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 
         /* Matched all key columns of this index? */
         if (c == ind->nkeycolumns)
+        {
+            if (extra_clauses)
+                *extra_clauses = exprs;
             return true;
+        }
     }
 
     return false;
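
The indxpath.c change keeps the old entry point as a thin wrapper that passes NULL for a new optional out-parameter, and fills that parameter only when the uniqueness proof succeeds. The standalone sketch below models that calling convention with invented types. `Clause`, `ClauseList`, `MAX_KEYS`, `unique_index_matched()` and `unique_index_matched_ext()` are hypothetical names and are not part of PostgreSQL; the real code works on `RestrictInfo` nodes and `List` structures and performs many more checks.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_KEYS 8              /* hypothetical limit for this sketch */

/* Hypothetical stand-in for an equality clause on one column. */
typedef struct Clause
{
    int  column;                /* column constrained by the clause */
    bool is_filter;             /* true: single-relation filter clause;
                                 * false: join clause */
} Clause;

/* Hypothetical fixed-size list of the filter clauses that were used. */
typedef struct ClauseList
{
    const Clause *items[MAX_KEYS];
    size_t        len;
} ClauseList;

/*
 * Returns true if every key column of a (toy) unique index is matched by
 * some clause.  If extra_clauses is not NULL and the proof succeeds, it
 * receives the filter clauses that took part in the proof.
 */
bool
unique_index_matched_ext(const int *key_cols, size_t nkeys,
                         const Clause *clauses, size_t nclauses,
                         ClauseList *extra_clauses)
{
    ClauseList used = {.len = 0};

    if (nkeys > MAX_KEYS)
        return false;

    for (size_t k = 0; k < nkeys; k++)
    {
        bool matched = false;

        for (size_t i = 0; i < nclauses; i++)
        {
            if (clauses[i].column != key_cols[k])
                continue;

            matched = true;
            /* Remember filter clauses; at most one is added per key. */
            if (clauses[i].is_filter)
                used.items[used.len++] = &clauses[i];
            break;
        }

        if (!matched)
            return false;       /* output parameter is left untouched */
    }

    if (extra_clauses)
        *extra_clauses = used;
    return true;
}

/* Old-style entry point: same check, caller does not want the clauses. */
bool
unique_index_matched(const int *key_cols, size_t nkeys,
                     const Clause *clauses, size_t nclauses)
{
    return unique_index_matched_ext(key_cols, nkeys, clauses, nclauses, NULL);
}
```

Keeping the original signature as a wrapper means existing callers need no changes, while a caller such as the self-join removal code can ask which filter clauses contributed to the proof.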
