Home > mailing lists

Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers

From	Anastasia Lubennikova
Subject	Re: WIP: Covering + unique indexes.
Date	September 26, 2016 15:17:48
Msg-id	[email protected] Whole thread Raw
In response to	Re: WIP: Covering + unique indexes. (Amit Kapila <[email protected]>)
Responses	Re: WIP: Covering + unique indexes. Re: WIP: Covering + unique indexes.
List	pgsql-hackers

Tree view

24.09.2016 15:36, Amit Kapila:
> On Wed, Sep 21, 2016 at 6:51 PM, Anastasia Lubennikova
> <[email protected]> wrote:
>> 20.09.2016 08:21, Amit Kapila:
>>
>> On Tue, Sep 6, 2016 at 10:18 PM, Anastasia Lubennikova
>> <[email protected]> wrote:
>>
>> 28.08.2016 09:13, Amit Kapila:
>>
>>
>> The problem seems really tricky, but the answer is simple.
>> We store included columns unordered. It was mentioned somewhere in
>> this thread.
>>
> Is there any fundamental problem in storing them in ordered way?  I
> mean to say, you need to anyway store all the column values on leaf
> page, so why can't we find the exact location for the complete key.
> Basically use truncated key to reach to leaf level and then use the
> complete key to find the exact location to store the key.  I might be
> missing some thing here, but if we can store them in ordered fashion,
> we can use them even for queries containing ORDER BY (where ORDER BY
> contains included columns).
>

I'd say that the reason for not using included columns in any
operations which require comparisons, is that they don't have opclass.
Let's go back to the example of points.
This data type don't have any opclass for B-tree, because of fundamental 
reasons.
And we can not apply _bt_compare() and others to this attribute, so
we don't include it to scan key.

create table t (i int, i2 int, p point);
create index idx1 on (i) including (i2);
create index idx2 on (i) including (p);
create index idx3 on (i) including (i2, p);
create index idx4 on (i) including (p, i2);

You can keep tuples ordered in idx1, but not for idx2, partially ordered 
for idx3, but not for idx4.

At the very beginning of this thread [1], I suggested to use opclass, 
where possible.
Exactly the same idea, you're thinking about. But after short 
discussion, we came
to conclusion that it would require many additional checks and will be 
too complicated,
at least for the initial patch.

>> Let me give you an example:
>>
>> create table t (i int, p point);
>> create index on (i) including (p);
>> "point" data type doesn't have any opclass for btree.
>> Should we insert (0, '(0,2)') before (0, '(1,1)') or after?
>> We have no idea what is the "correct order" for this attribute.
>> So the answer is "it doesn't matter". When searching in index,
>> we know that only key attrs are ordered, so only them can be used
>> in scankey. Other columns are filtered after retrieving data.
>>
>> explain select i,p from t where i =0 and p <@ circle '((0,0),2)';
>>                              QUERY PLAN
>> -------------------------------------------------------------------
>>   Index Only Scan using idx on t  (cost=0.14..4.20 rows=1 width=20)
>>     Index Cond: (i = 0)
>>     Filter: (p <@ '<(0,0),2>'::circle)
>>
> I think here reason for using Filter is that because we don't keep
> included columns in scan keys, can't we think of having them in scan
> keys, but use only key columns in scan key to reach till leaf level
> and then use complete scan key at leaf level.

>> What should I add to README (or to documentation),
>> to make it more understandable?
>>
> May be add the data representation like only leaf pages contains all
> the columns and how the scan works.  I think you can see if you can
> extend "Notes About Data Representation" and or "Other Things That Are
> Handy to Know" sections in existing README.

Ok, I'll write it in a few days.


[1] https://www.postgresql.org/message-id/[email protected]

-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

pgsql-hackers by date:

From: Stephen Frost
Date: 26 September 2016, 14:48:55
Subject: Re: Showing parallel status in \df+

From: "[email protected]"
Date: 26 September 2016, 15:39:28
Subject: temporary table vs array performance

Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers

Previous

Next