Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers
From | Anastasia Lubennikova |
---|---|
Subject | Re: WIP: Covering + unique indexes. |
Date | |
Msg-id | [email protected] Whole thread Raw |
In response to | Re: WIP: Covering + unique indexes. (Amit Kapila <[email protected]>) |
Responses |
Re: WIP: Covering + unique indexes.
Re: WIP: Covering + unique indexes. |
List | pgsql-hackers |
24.09.2016 15:36, Amit Kapila: > On Wed, Sep 21, 2016 at 6:51 PM, Anastasia Lubennikova > <[email protected]> wrote: >> 20.09.2016 08:21, Amit Kapila: >> >> On Tue, Sep 6, 2016 at 10:18 PM, Anastasia Lubennikova >> <[email protected]> wrote: >> >> 28.08.2016 09:13, Amit Kapila: >> >> >> The problem seems really tricky, but the answer is simple. >> We store included columns unordered. It was mentioned somewhere in >> this thread. >> > Is there any fundamental problem in storing them in ordered way? I > mean to say, you need to anyway store all the column values on leaf > page, so why can't we find the exact location for the complete key. > Basically use truncated key to reach to leaf level and then use the > complete key to find the exact location to store the key. I might be > missing some thing here, but if we can store them in ordered fashion, > we can use them even for queries containing ORDER BY (where ORDER BY > contains included columns). > I'd say that the reason for not using included columns in any operations which require comparisons, is that they don't have opclass. Let's go back to the example of points. This data type don't have any opclass for B-tree, because of fundamental reasons. And we can not apply _bt_compare() and others to this attribute, so we don't include it to scan key. create table t (i int, i2 int, p point); create index idx1 on (i) including (i2); create index idx2 on (i) including (p); create index idx3 on (i) including (i2, p); create index idx4 on (i) including (p, i2); You can keep tuples ordered in idx1, but not for idx2, partially ordered for idx3, but not for idx4. At the very beginning of this thread [1], I suggested to use opclass, where possible. Exactly the same idea, you're thinking about. But after short discussion, we came to conclusion that it would require many additional checks and will be too complicated, at least for the initial patch. >> Let me give you an example: >> >> create table t (i int, p point); >> create index on (i) including (p); >> "point" data type doesn't have any opclass for btree. >> Should we insert (0, '(0,2)') before (0, '(1,1)') or after? >> We have no idea what is the "correct order" for this attribute. >> So the answer is "it doesn't matter". When searching in index, >> we know that only key attrs are ordered, so only them can be used >> in scankey. Other columns are filtered after retrieving data. >> >> explain select i,p from t where i =0 and p <@ circle '((0,0),2)'; >> QUERY PLAN >> ------------------------------------------------------------------- >> Index Only Scan using idx on t (cost=0.14..4.20 rows=1 width=20) >> Index Cond: (i = 0) >> Filter: (p <@ '<(0,0),2>'::circle) >> > I think here reason for using Filter is that because we don't keep > included columns in scan keys, can't we think of having them in scan > keys, but use only key columns in scan key to reach till leaf level > and then use complete scan key at leaf level. >> What should I add to README (or to documentation), >> to make it more understandable? >> > May be add the data representation like only leaf pages contains all > the columns and how the scan works. I think you can see if you can > extend "Notes About Data Representation" and or "Other Things That Are > Handy to Know" sections in existing README. Ok, I'll write it in a few days. [1] https://www.postgresql.org/message-id/[email protected] -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
pgsql-hackers by date: