Unusable index - Mailing list pgsql-hackers

From:    Konstantin Knizhnik
Subject: Unusable index
Msg-id:  [email protected]
List:    pgsql-hackers
Hi hackers,

HOT experts' help is highly needed :) One of our customers is faced with the following problem.

Session 1:

    create table x (i int);
    begin;
    insert into x values(1);
    ...

Session 2:

    select i as id, 0 as v into t from generate_series(1, 100000) i;
    create unique index idx on t (id);
    explain analyze select v from t where id = 10000;

                                                  QUERY PLAN
    -------------------------------------------------------------------------------------------------------
     Index Scan using idx on t  (cost=0.29..8.31 rows=1 width=4) (actual time=0.062..0.064 rows=1 loops=1)
       Index Cond: (id = 10000)

Everything is OK here. Now some magic:

    update t set v = v + 1 where id = 10000;
    update t set v = v + 1 where id = 10000;
    update t set v = v + 1 where id = 10000;
    update t set v = v + 1 where id = 10000;
    update t set v = v + 1 where id = 10000;
    drop index idx;
    create unique index idx on t (id);
    explain analyze select v from t where id = 10000;
    explain analyze select v from t where id = 10000;

                                            QUERY PLAN
    ------------------------------------------------------------------------------------------------
     Seq Scan on t  (cost=0.00..1693.00 rows=1 width=4) (actual time=27.557..27.558 rows=1 loops=1)

Now no backend can use this index until the transaction in Session 1 is completed. According to README.HOT this is more or less expected behavior:

---------------------------------------------
Practically, we prevent certain transactions from using the new index by
setting pg_index.indcheckxmin to TRUE.  Transactions are allowed to use
such an index only after pg_index.xmin is below their TransactionXmin
horizon, thereby ensuring that any incompatible rows in HOT chains are
dead to them.  (pg_index.xmin will be the XID of the CREATE INDEX
transaction.  The reason for using xmin rather than a normal column is
that the regular vacuum freezing mechanism will take care of converting
xmin to FrozenTransactionId before it can wrap around.)
This means in particular that the transaction creating the index will be
unable to use the index if the transaction has old snapshots.  We
alleviate that problem somewhat by not setting indcheckxmin unless the
table actually contains HOT chains with RECENTLY_DEAD members.
---------------------------------------------

But a few notes:

1. Not only is the transaction that created the index unable to use it; any other transaction cannot use it either.
2. It happens even if Session 1 and Session 2 work with different databases, which can really confuse users! (OK, isolation of databases in Postgres is a separate topic.)

I want to clarify why this index is disabled for all new transactions.
The code disabling use of the index is the following:

    /*
     * If the index is valid, but cannot yet be used, ignore it; but
     * mark the plan we are generating as transient. See
     * src/backend/access/heap/README.HOT for discussion.
     */
    if (index->indcheckxmin &&
        !TransactionIdPrecedes(HeapTupleHeaderGetXmin(indexRelation->rd_indextuple->t_data),
                               TransactionXmin))
    {
        root->glob->transientPlan = true;
        index_close(indexRelation, NoLock);
        continue;
    }

And "indcheckxmin" is set here:

    /*
     * If we found any potentially broken HOT chains, mark the index as not
     * being usable until the current transaction is below the event horizon.
     * See src/backend/access/heap/README.HOT for discussion.  Also set this
     * if early pruning/vacuuming is enabled for the heap relation.  While it
     * might become safe to use the index earlier based on actual cleanup
     * activity and other active transactions, the test for that would be much
     * more complex and would require some form of blocking, so keep it simple
     * and fast by just using the current transaction.
     *
     * However, when reindexing an existing index, we should do nothing here.
     * Any HOT chains that are broken with respect to the index must predate
     * the index's original creation, so there is no need to change the
     * index's usability horizon.  Moreover, we *must not* try to change the
     * index's pg_index entry while reindexing pg_index itself, and this
     * optimization nicely prevents that.  The more complex rules needed for a
     * reindex are handled separately after this function returns.
     *
     * We also need not set indcheckxmin during a concurrent index build,
     * because we won't set indisvalid true until all transactions that care
     * about the broken HOT chains or early pruning/vacuuming are gone.
     *
     * Therefore, this code path can only be taken during non-concurrent
     * CREATE INDEX.  Thus the fact that heap_update will set the pg_index
     * tuple's xmin doesn't matter, because that tuple was created in the
     * current transaction anyway.  That also means we don't need to worry
     * about any concurrent readers of the tuple; no other transaction can
     * see it yet.
     */
    if ((indexInfo->ii_BrokenHotChain || EarlyPruningEnabled(heapRelation)) &&
        !isreindex &&
        !indexInfo->ii_Concurrent)
    {
        Oid         indexId = RelationGetRelid(indexRelation);
        Relation    pg_index;
        HeapTuple   indexTuple;
        Form_pg_index indexForm;

        pg_index = heap_open(IndexRelationId, RowExclusiveLock);

        indexTuple = SearchSysCacheCopy1(INDEXRELID,
                                         ObjectIdGetDatum(indexId));
        if (!HeapTupleIsValid(indexTuple))
            elog(ERROR, "cache lookup failed for index %u", indexId);
        indexForm = (Form_pg_index) GETSTRUCT(indexTuple);

        /* If it's a new index, indcheckxmin shouldn't be set ... */
        Assert(!indexForm->indcheckxmin);

        indexForm->indcheckxmin = true;
        CatalogTupleUpdate(pg_index, &indexTuple->t_self, indexTuple);

        heap_freetuple(indexTuple);
        heap_close(pg_index, RowExclusiveLock);
    }

This path is taken because ii_BrokenHotChain is true.  And ii_BrokenHotChain in turn is set to true here:

    case HEAPTUPLE_RECENTLY_DEAD:

        /*
         * If tuple is recently deleted then we must index it
         * anyway to preserve MVCC semantics.  (Pre-existing
         * transactions could try to use the index after we finish
         * building it, and may need to see such tuples.)
         *
         * However, if it was HOT-updated then we must only index
         * the live tuple at the end of the HOT-chain.  Since this
         * breaks semantics for pre-existing snapshots, mark the
         * index as unusable for them.
         *
         * We don't count recently-dead tuples in reltuples, even
         * if we index them; see acquire_sample_rows().
         */
        if (HeapTupleIsHotUpdated(heapTuple))
        {
            indexIt = false;
            /* mark the index as unsafe for old snapshots */
            indexInfo->ii_BrokenHotChain = true;
        }

So:

1. We mark the index as unsafe for old snapshots.
2. This prevents all transactions from using the index until index creation is below their horizon.

It seems to me that we are over-conservative here... but I have failed to understand the source of the potential problems we are trying to prevent, and how this check could be done more cleverly. When I simply comment out the assignment indexInfo->ii_BrokenHotChain = true in the code above, everything works correctly. No surprise: old snapshots do not see this newly created index at all.

I would be very pleased if somebody could explain to me the bad scenario we are trying to prevent here by disabling use of this index. Also I wonder if something can be done about it. Certainly, indexes are not usually created very often, and long-living transactions are a bad thing in any case. But the fact that a "hanging" transaction in one database prevents use of an index, and thereby dramatically increases query execution time, in another database also seems very disappointing.

Thanks in advance,

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
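To make the interaction between the two code paths above easier to follow, here is a minimal Python model (a sketch, not PostgreSQL code; all class and function names are invented for illustration, and XIDs are plain integers with wraparound ignored). It mirrors the quoted logic: a non-concurrent CREATE INDEX that saw a broken HOT chain sets indcheckxmin, and the planner then skips the index until pg_index.xmin precedes the backend's TransactionXmin, which an idle transaction in any database can hold back:

```python
class Index:
    """Toy stand-in for a pg_index entry (hypothetical model)."""
    def __init__(self, xmin, indcheckxmin):
        self.xmin = xmin                  # XID of the CREATE INDEX transaction
        self.indcheckxmin = indcheckxmin  # set when broken HOT chains were seen

def create_index(creating_xid, broken_hot_chain, is_reindex=False, concurrent=False):
    """Models the guarded block in index build: mark the index only for a
    plain, non-concurrent CREATE INDEX that found broken HOT chains."""
    check = broken_hot_chain and not is_reindex and not concurrent
    return Index(xmin=creating_xid, indcheckxmin=check)

def planner_can_use(index, transaction_xmin):
    """Models the planner test: if indcheckxmin is set, the index is
    ignored unless its xmin precedes the backend's TransactionXmin."""
    if index.indcheckxmin and not index.xmin < transaction_xmin:
        return False  # plan marked transient, index skipped
    return True

# Session 1 opened a transaction at XID 100 and is still idle.
# Session 2 later recreates the index at XID 500 over broken HOT chains.
idx = create_index(creating_xid=500, broken_hot_chain=True)

# Every backend's TransactionXmin is held back to 100 by Session 1,
# so even brand-new transactions cannot use the index:
print(planner_can_use(idx, transaction_xmin=100))  # False
# Once Session 1 finishes and TransactionXmin advances past 500:
print(planner_can_use(idx, transaction_xmin=501))  # True
```

In this model it is also visible why CREATE INDEX CONCURRENTLY avoids the problem: the concurrent path never sets indcheckxmin at all, as the quoted comment explains.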