PGPRO-6599: Avoid race when accessing the request shared variable.
Consider the following sequence of events:
1. Session 1 calls pg_wait_sampling_reset_profile and sets the request shared
variable to PROFILE_RESET.
2. The collector reads request and saves PROFILE_RESET to a local variable.
3. Session 2 queries pg_wait_sampling_profile, which sets request to
PROFILE_REQUEST and waits for the collector in shm_mq_receive.
4. The collector continues and clears the shared request variable, thus
dropping PROFILE_REQUEST from Session 2.
5. Session 2 waits indefinitely in shm_mq_receive.
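The interleaving above can be replayed deterministically in a few lines of C.
This is only an illustrative sketch, not the extension's code: SharedState and
replay_lost_request are hypothetical names, the enum values are assumed, and a
local struct stands in for the PostgreSQL shared memory segment.

```c
#include <assert.h>
#include <stdbool.h>

/* Request kinds named in the commit message; numeric values are assumed. */
typedef enum
{
    NO_REQUEST,
    HISTORY_REQUEST,
    PROFILE_REQUEST,
    PROFILE_RESET
} RequestKind;

/* Hypothetical stand-in for the shared-memory header. */
typedef struct
{
    RequestKind request;
} SharedState;

/*
 * Replays steps 1-4 in order, with no locking at all.
 * Returns true if the collector served PROFILE_RESET and left nothing
 * pending, i.e. Session 2's PROFILE_REQUEST was silently dropped.
 */
bool
replay_lost_request(void)
{
    SharedState shared = { NO_REQUEST };
    RequestKind local;

    shared.request = PROFILE_RESET;     /* 1. Session 1 asks for a reset */
    local = shared.request;             /* 2. collector copies the request */
    shared.request = PROFILE_REQUEST;   /* 3. Session 2 asks for the profile */
    shared.request = NO_REQUEST;        /* 4. collector clears the variable */

    return local == PROFILE_RESET && shared.request == NO_REQUEST;
}
```

When the function returns true, nothing is left for the collector to answer,
which corresponds to Session 2 blocking forever in step 5.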
A similar example with query cancellation:
1. Session 1 queries pg_wait_sampling_history and sets request to
HISTORY_REQUEST.
2. Session 1 cancels the query while waiting for the collector.
3. The collector reads request and saves HISTORY_REQUEST to a local variable.
4. Session 2 queries pg_wait_sampling_profile, sets request to
PROFILE_REQUEST and waits for the collector.
5. The collector continues and responds to HISTORY_REQUEST.
6. Session 2 receives history data and renders it as profile data, returning
invalid counts.
These interleavings are avoided by acquiring the collector lock before reading
request from shared memory in the collector. But we also need to hold the
collector lock when we set request in receive_array in a backend. Otherwise,
the following interleaving is possible:
1. Session 1 calls pg_wait_sampling_reset_profile and sets request to
PROFILE_RESET.
2. Session 2 queries pg_wait_sampling_profile, acquires and releases the
collector lock.
3. The collector acquires the lock, reads request and saves PROFILE_RESET to
a local variable.
4. Session 2 sets request to PROFILE_REQUEST.
5. The collector clears request, and PROFILE_REQUEST is lost.
6. Session 2 waits indefinitely in shm_mq_receive.
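The locking rule described above can be sketched with a toy lock. Everything
here is an assumption made for illustration: the lock is modelled as an owner
field rather than a real PostgreSQL lock, try_lock, unlock, backend_set_request
and replay_with_lock are hypothetical helpers, and the collector is assumed to
keep the lock until it has cleared request.

```c
#include <assert.h>
#include <stdbool.h>

/* Request kinds named in the commit message; numeric values are assumed. */
typedef enum
{
    NO_REQUEST,
    HISTORY_REQUEST,
    PROFILE_REQUEST,
    PROFILE_RESET
} RequestKind;

enum { NOBODY = 0, COLLECTOR = 1, SESSION_2 = 2 };

/* Toy shared state: owner records who holds the "collector lock". */
typedef struct
{
    int         owner;
    RequestKind request;
} SharedState;

bool
try_lock(SharedState *s, int who)
{
    if (s->owner != NOBODY)
        return false;           /* someone else holds the lock */
    s->owner = who;
    return true;
}

void
unlock(SharedState *s, int who)
{
    assert(s->owner == who);
    s->owner = NOBODY;
}

/* Backend side: request is written only while the lock is held. */
bool
backend_set_request(SharedState *s, int who, RequestKind kind)
{
    if (!try_lock(s, who))
        return false;           /* a real backend would block here */
    s->request = kind;
    unlock(s, who);
    return true;
}

/* Replays the interleaving; returns what is pending once the collector is done. */
RequestKind
replay_with_lock(void)
{
    SharedState s = { NOBODY, NO_REQUEST };
    RequestKind local;
    bool        ok;

    s.request = PROFILE_RESET;          /* Session 1 posts a reset */

    ok = try_lock(&s, COLLECTOR);       /* collector locks before reading */
    assert(ok);
    local = s.request;

    /* Session 2 cannot slip in between the read and the clear. */
    ok = backend_set_request(&s, SESSION_2, PROFILE_REQUEST);
    assert(!ok);

    s.request = NO_REQUEST;             /* collector clears, then unlocks */
    unlock(&s, COLLECTOR);
    assert(local == PROFILE_RESET);

    /* Once the lock is free, Session 2's request lands and stays pending. */
    ok = backend_set_request(&s, SESSION_2, PROFILE_REQUEST);
    assert(ok);
    return s.request;
}
```

In this model replay_with_lock returns PROFILE_REQUEST: the request survives
because it can only be written before the collector's read or after its clear.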
The same reasoning applies to the query cancellation example. This patch,
however, does not prevent losing PROFILE_RESET requests:
1. Session 1 calls pg_wait_sampling_reset_profile and sets request to
PROFILE_RESET.
2. Session 2 queries pg_wait_sampling_profile before the collector reads
request.
3. The collector reads PROFILE_REQUEST, while PROFILE_RESET is lost.
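The remaining loss happens because request is a single slot, so a later write
simply overwrites an earlier one regardless of locking. A minimal sketch, with
assumed enum values and a hypothetical replay_lost_reset helper:

```c
#include <assert.h>

/* Request kinds named in the commit message; numeric values are assumed. */
typedef enum
{
    NO_REQUEST,
    HISTORY_REQUEST,
    PROFILE_REQUEST,
    PROFILE_RESET
} RequestKind;

/* Returns the request the collector finally observes. */
RequestKind
replay_lost_reset(void)
{
    RequestKind request = NO_REQUEST;   /* the single shared slot */

    request = PROFILE_RESET;            /* 1. Session 1 posts a reset */
    request = PROFILE_REQUEST;          /* 2. Session 2 overwrites it */

    return request;                     /* 3. collector sees only this */
}
```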
To fix this, we could make pg_wait_sampling_reset_profile wait for the
collector, but we decided not to, as losing a PROFILE_RESET isn't critical.
Resolves #48.
Author: Roman Zharkov
Reported-By: Alexander Lakhin
Reviewed-By: Maksim Milyutin, Sergey Shinderuk