Skip to content

Commit 2d38a79

Browse files
author
Sergey Shinderuk
committed
PGPRO-6599: Avoid race when accessing the request shared variable.
Consider the following sequence of events: 1. Session 1 calls pg_wait_sampling_reset_profile and sets the request shared variable to PROFILE_RESET. 2. The collector reads request and saves PROFILE_RESET to a local variable. 3. Session 2 queries pg_wait_sampling_profile, which sets request to PROFILE_REQUEST and waits for the collector in shm_mq_receive. 4. The collector continues and clears shared request, thus dropping PROFILE_REQUEST from Session 2. 5. Session 2 waits indefinitely in shm_mq_receive. A similar example with query cancellation: 1. Session 1 queries pg_wait_sampling_history and sets request to HISTORY_REQUEST. 2. Session 1 cancels the query while waiting for the collector. 3. The collector reads request and saves HISTORY_REQUEST to a local variable. 4. Session 2 queries pg_wait_sampling_profile, sets request to PROFILE_REQUEST and waits for the collector. 5. The collector continues and responds to HISTORY_REQUEST. 6. Session 2 receives history data and renders them as profile data returning invalid counts. These interleavings are avoided by acquiring the collector lock before reading request from shared memory in the collector. But we also need to hold the collector lock when we set request in receive_array in a backend. Otherwise, the following interleaving is possible: 1. Session 1 calls pg_wait_sampling_reset_profile and sets request to PROFILE_RESET. 2. Session 2 queries pg_wait_sampling_profile, acquires and releases the collector lock. 3. The collector acquires the lock, reads request and saves PROFILE_RESET to a local variable. 4. Session 2 sets request to PROFILE_REQUEST. 5. The collector clears request, and PROFILE_REQUEST is lost. 6. Session 2 waits indefinitely in shm_mq_receive. Same for the second example above. This patch, however, doesn't prevent loosing PROFILE_RESET requests: 1. Session 1 calls pg_wait_sampling_reset_profile and sets request to PROFILE_RESET. 2. Session 2 queries pg_wait_sampling_profile before the collector reads request. 3. The collector reads PROFILE_REQUEST, while PROFILE_RESET is lost. To fix this, we could make pg_wait_sampling_reset_profile wait for the collector, but we decided not to, as loosing a PROFILE_RESET isn't critical. Resolves #48. Author: Roman Zharkov Reported-By: Alexander Lakhin Reviewed-By: Maksim Milyutin, Sergey Shinderuk
1 parent 9dad24b commit 2d38a79

File tree

2 files changed

+4
-5
lines changed

2 files changed

+4
-5
lines changed

collector.c

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -441,11 +441,12 @@ collector_main(Datum main_arg)
441441
if (collector_hdr->request != NO_REQUEST)
442442
{
443443
LOCKTAG tag;
444-
SHMRequest request = collector_hdr->request;
444+
SHMRequest request;
445445

446446
init_lock_tag(&tag, PGWS_COLLECTOR_LOCK);
447447

448448
LockAcquire(&tag, ExclusiveLock, false, false);
449+
request = collector_hdr->request;
449450
collector_hdr->request = NO_REQUEST;
450451

451452
if (request == HISTORY_REQUEST || request == PROFILE_REQUEST)

pg_wait_sampling.c

Lines changed: 2 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -594,13 +594,11 @@ receive_array(SHMRequest request, Size item_size, Size *count)
594594
init_lock_tag(&queueTag, PGWS_QUEUE_LOCK);
595595
LockAcquire(&queueTag, ExclusiveLock, false, false);
596596

597-
/* Ensure collector has processed previous request */
598597
init_lock_tag(&collectorTag, PGWS_COLLECTOR_LOCK);
599598
LockAcquire(&collectorTag, ExclusiveLock, false, false);
600-
LockRelease(&collectorTag, ExclusiveLock, false);
601-
602599
recv_mq = shm_mq_create(collector_mq, COLLECTOR_QUEUE_SIZE);
603600
collector_hdr->request = request;
601+
LockRelease(&collectorTag, ExclusiveLock, false);
604602

605603
if (!collector_hdr->latch)
606604
ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR),
@@ -770,9 +768,9 @@ pg_wait_sampling_reset_profile(PG_FUNCTION_ARGS)
770768

771769
init_lock_tag(&tagCollector, PGWS_COLLECTOR_LOCK);
772770
LockAcquire(&tagCollector, ExclusiveLock, false, false);
771+
collector_hdr->request = PROFILE_RESET;
773772
LockRelease(&tagCollector, ExclusiveLock, false);
774773

775-
collector_hdr->request = PROFILE_RESET;
776774
SetLatch(collector_hdr->latch);
777775

778776
LockRelease(&tag, ExclusiveLock, false);

0 commit comments

Comments
 (0)