Commit db76b1e

Allow SetHintBits() to succeed if the buffer's LSN is new enough.
Previously we only allowed SetHintBits() to succeed if the commit LSN of the last transaction touching the page had already been flushed to disk. We can't generally change the LSN of the page, because we don't necessarily have the required locks on the page. But the required LSN interlock does not mean the commit record has to be flushed immediately; it just requires that the commit record be flushed before the page is written out. Therefore, if the buffer LSN is newer than the commit LSN, the hint bit can be safely set.

In a number of scenarios (e.g. pgbench) this noticeably increases the number of hint bits that are set. But more importantly, it also keeps the success rate up when flushing WAL less frequently. That was the original reason for commit 4de82f7, which has negative performance consequences in a number of scenarios. This will allow a followup commit to reduce the flush rate.

Discussion: [email protected]
1 parent cfafd8b commit db76b1e

1 file changed, 13 additions and 8 deletions

src/backend/utils/time/tqual.c

Lines changed: 13 additions & 8 deletions
@@ -89,12 +89,13 @@ static bool XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot);
  * Set commit/abort hint bits on a tuple, if appropriate at this time.
  *
  * It is only safe to set a transaction-committed hint bit if we know the
- * transaction's commit record has been flushed to disk, or if the table is
- * temporary or unlogged and will be obliterated by a crash anyway.  We
- * cannot change the LSN of the page here because we may hold only a share
- * lock on the buffer, so we can't use the LSN to interlock this; we have to
- * just refrain from setting the hint bit until some future re-examination
- * of the tuple.
+ * transaction's commit record is guaranteed to be flushed to disk before the
+ * buffer, or if the table is temporary or unlogged and will be obliterated by
+ * a crash anyway.  We cannot change the LSN of the page here, because we may
+ * hold only a share lock on the buffer, so we can only use the LSN to
+ * interlock this if the buffer's LSN already is newer than the commit LSN;
+ * otherwise we have to just refrain from setting the hint bit until some
+ * future re-examination of the tuple.
  *
  * We can always set hint bits when marking a transaction aborted.  (Some
  * code in heapam.c relies on that!)
@@ -122,8 +123,12 @@ SetHintBits(HeapTupleHeader tuple, Buffer buffer,
         /* NB: xid must be known committed here! */
         XLogRecPtr commitLSN = TransactionIdGetCommitLSN(xid);
 
-        if (XLogNeedsFlush(commitLSN) && BufferIsPermanent(buffer))
-            return; /* not flushed yet, so don't set hint */
+        if (BufferIsPermanent(buffer) && XLogNeedsFlush(commitLSN) &&
+            BufferGetLSNAtomic(buffer) < commitLSN)
+        {
+            /* not flushed and no LSN interlock, so don't set hint */
+            return;
+        }
     }
 
     tuple->t_infomask |= infomask;
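
The decision described in the commit message can be illustrated with a small standalone model. This is only a sketch, not PostgreSQL code: `ModelLSN`, `ModelState`, `model_needs_flush()` and the two `can_set_committed_hint_*()` functions are hypothetical names introduced here, and the flush check is assumed to reduce to a plain comparison against a "flushed up to" position. Note that, as in the diff, the new test uses `<`, so a buffer LSN equal to the commit LSN is also treated as new enough.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL position (PostgreSQL uses XLogRecPtr). */
typedef uint64_t ModelLSN;

/* Hypothetical model of the state the check consults; not PostgreSQL API. */
typedef struct ModelState
{
    bool     buffer_is_permanent; /* models BufferIsPermanent(buffer) */
    ModelLSN flushed_upto;        /* assumed: WAL is durable up to here */
    ModelLSN buffer_lsn;          /* models BufferGetLSNAtomic(buffer) */
} ModelState;

/* Simplified model of XLogNeedsFlush(): is WAL up to lsn not yet durable? */
bool
model_needs_flush(const ModelState *s, ModelLSN lsn)
{
    return lsn > s->flushed_upto;
}

/* Rule before the commit: refuse whenever the commit record is unflushed. */
bool
can_set_committed_hint_old(const ModelState *s, ModelLSN commit_lsn)
{
    if (model_needs_flush(s, commit_lsn) && s->buffer_is_permanent)
        return false;           /* not flushed yet, so don't set hint */
    return true;
}

/*
 * Rule after the commit: additionally allow the hint when the buffer's LSN
 * is at least the commit LSN, because the page cannot be written out before
 * WAL has been flushed up to the buffer's LSN.
 */
bool
can_set_committed_hint_new(const ModelState *s, ModelLSN commit_lsn)
{
    if (s->buffer_is_permanent &&
        model_needs_flush(s, commit_lsn) &&
        s->buffer_lsn < commit_lsn)
        return false;           /* not flushed and no LSN interlock */
    return true;
}
```

For example, with WAL flushed up to 100, a commit record at 120 and a permanent buffer whose LSN is 150, the old rule refuses the hint while the new rule allows it. If the buffer LSN is 110 instead, both rules refuse, and the hint has to wait for a later re-examination of the tuple.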