Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers
From | Anastasia Lubennikova |
---|---|
Subject | Re: WIP: Covering + unique indexes. |
Date | |
Msg-id | [email protected] Whole thread Raw |
In response to | Re: WIP: Covering + unique indexes. (David Rowley <[email protected]>) |
Responses |
Re: WIP: Covering + unique indexes.
Re: WIP: Covering + unique indexes. |
List | pgsql-hackers |
31.01.2016 11:04, David Rowley: > On 27 January 2016 at 03:35, Anastasia Lubennikova > <[email protected]> wrote: >> including_columns_3.0 is the latest version of patch. >> And changes regarding the previous version are attached in a separate patch. >> Just to ease the review and debug. > Hi, > > I've made another pass over the patch. There's still a couple of > things that I think need to be looked at. Thank you again. I just write here to say that I do not disappear and I do remember about the issue. But I'm very very busy this week. I'll send an updated patch next week as soon as possible. > Do we need the "b (included)" here? The key is (a) = (1). Having > irrelevant details might be confusing. > > postgres=# create table a (a int not null, b int not null); > CREATE TABLE > postgres=# create unique index on a (a) including(b); > CREATE INDEX > postgres=# insert into a values(1,1); > INSERT 0 1 > postgres=# insert into a values(1,1); > ERROR: duplicate key value violates unique constraint "a_a_b_idx" > DETAIL: Key (a, b (included))=(1, 1) already exists. I thought that it could be strange if user inserts two values and then sees only one of them in error message. But now I see that you're right. I'll also look at the same functional in other DBs and fix it. > In index_reform_tuple() I find it a bit scary that you change the > TupleDesc's number of attributes then set it back again once you're > finished reforming the shortened tuple. > Maybe it would be better to modify index_form_tuple() to accept a new > argument with a number of attributes, then you can just Assert that > this number is never higher than the number of attributes in the > TupleDesc. Good point. I agree that this function is a bit strange. I have to set tupdesc->nattrs to support compatibility with index_form_tuple(). I didn't want to add neither a new field to tupledesc nor a new parameter to index_form_tuple(), because they are used widely. > I'm also not that keen on index_reform_tuple() in general. I wonder if > there's a way we can just keep the Datum/isnull arrays a bit longer, > and only form the tuple when needed. I've not looked into this in > detail, but it does look like reforming the tuple is not going to be > cheap. It is used in splits, for example. There is no datum array, we just move tuple key from a child page to a parent page or something like that. And according to INCLUDING algorithm we need to truncate nonkey attributes. > If we do need to keep this function, I think a better name might be > index_trim_tuple() and I don't think you need to pass the original > length. It might make sense to Assert() that the trim length is > smaller than the tuple size As regards the performance, I don't think that it's a big problem here. Do you suggest to do it in a following way memcpy(oldtup, newtup, newtuplength)? I will > in gistrescan() there is some code: > > for (attno = 1; attno <= natts; attno++) > { > TupleDescInitEntry(so->giststate->fetchTupdesc, attno, NULL, > scan->indexRelation->rd_opcintype[attno - 1], > -1, 0); > } > > Going by RelationInitIndexAccessInfo() rd_opcintype[] is allocated to > be sized by the number of key columns, but this loop goes over the > number of attribute columns. > Perhaps this is not a big problem since GIST does not support > INCLUDING columns, but it does seem wrong still. GiST doesn't support INCLUDING clause, so natts and nkeyatts are always equal. I don't see any problem here. And I think that it's an extra work to this patch. Maybe I or someone else would add this feature to other access methods later. > Which brings me to the fact that I've spent a bit of time trying to > look for places where you've forgotten to change natts to nkeyatts. I > did find this one, but I don't have much confidence that there's not > lots more places that have been forgotten. Apart from this one, how > confident are you that you've found all the places? I'm getting > towards being happy with the code that I see that's been changed, but > I'm hesitant to mark as "Ready for committer" due to not being all > that comfortable that all the code that needs to be updated has been > updated. I'm not quite sure of a good way to find all these places. I found all mentions of natts and other related variables with grep, and replaced (or expand) them with nkeyatts where it was necessary. As mentioned before, I didn't change other AMs. I strongly agree that any changes related to btree require thorough inspection, so I'll recheck it again. But I'm almost sure that it's okay. > I wondering if hacking the code so that each btree index which is > created with > 1 column puts all but the first column into the > INCLUDING columns, then run the regression tests to see if there are > any crashes. I'm really not that sure of how else to increase the > confidence levels on this. Do you have ideas? Do I understand correctly that you suggest to replace all multicolumn indexes with (1key column) + included? I don't think it's a good idea. INCLUDING clause brings some disadvantages. For example, included columns must be filtered after the search, while key columns could be used in scan key directly. I already mentioned this in test example: explain analyze select c1, c2 from tbl where c1<10000 and c3<20; If columns' opclasses are used, new query plan uses them in Index Cond: ((c1 < 10000) AND (c3 < 20)) Otherwise, new query can not use included column in Index Cond and uses filter instead: Index Cond: (c1 < 10000) Filter: (c3 < 20) Rows Removed by Filter: 9993 It slows down the query significantly. And besides that, we still want to have multicolumn unique indexes. CREATE UNIQUE INDEX on tbl (a, b, c) INCLUDING (d); -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
pgsql-hackers by date: