From: | Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-02-25 19:44:33 |
Message-ID: | [email protected] |
Lists: | pgsql-hackers |
Hi, hackers!
Historically, the checkpointer process has used palloc() in the
AbsorbSyncRequests() function. As a consequence, the checkpointer does not
expect to receive a request queue larger than 1 GB.
We encountered a case where the database went into recovery; after
applying all WAL, the checkpointer process generated an "invalid memory
alloc request size" error and entered a loop. Yet it is quite plausible
for recovery to produce such a large allocation request.
A simple solution to this problem is to use palloc_extended() instead of
palloc(). But is it safe to allow the checkpointer to allocate so much
memory at once?
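For concreteness, a minimal sketch of that swap, assuming the allocation in
AbsorbSyncRequests() (src/backend/postmaster/checkpointer.c) keeps its
current shape; this is illustrative, not a finished patch:

```c
/*
 * Sketch: let the local copy of the request queue exceed MaxAllocSize.
 * MCXT_ALLOC_HUGE lifts the 1 GB cap that plain palloc() enforces.
 */
requests = (CheckpointerRequest *)
    palloc_extended(n * sizeof(CheckpointerRequest), MCXT_ALLOC_HUGE);
memcpy(requests, CheckpointerShmem->requests,
       n * sizeof(CheckpointerRequest));
```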
I have a proposal to update this memory allocation, but I need your ideas
and advice on how to do it in an appropriate way. One idea is to replace
the array with a list of arrays, so that memory is allocated in chunks. A
bad idea would be to process a temporary array without locking.
I would be glad to hear your ideas and suggestions about this topic.
Have a nice day!
--
Ekaterina Sokolova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From: | Maxim Orlov <orlovmg(at)gmail(dot)com> |
---|---|
To: | Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-02-26 08:46:45 |
Message-ID: | CACG=ezZ5_TvRtt-a3hwANDGzbHOVXFsr6vPGSgV7WdJyM6tgXw@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, 25 Feb 2025 at 22:44, Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>
wrote:
> Hi, hackers!
>
> Historically, the checkpointer process has used palloc() in the
> AbsorbSyncRequests() function. As a consequence, the checkpointer does not
> expect to receive a request queue larger than 1 GB.
Yeah. And the most unpleasant thing is that it won't simply fail with an
error or a helpful message suggesting a workaround (reduce the amount of
shared memory). The checkpointer will just get "stuck".
AFAICS, we have a few options:
1. Leave it as it is, but FATAL on allocation of a chunk larger than 1 GB.
2. Use palloc_extended() with the MCXT_ALLOC_HUGE flag.
3. Do not use any allocation and use CheckpointerShmem->requests directly
when the required allocation would exceed 1 GB.
Case (3) is not an option, in my opinion. So, we should follow (1) or (2).
Personally, I'm for (2), PFA v0 patch.
--
Best regards,
Maxim Orlov.
Attachment | Content-Type | Size |
---|---|---|
v0-0001-Expect-huge-number-of-requests-in-checkpointer.patch | application/octet-stream | 1.5 KB |
From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Maxim Orlov <orlovmg(at)gmail(dot)com> |
Cc: | Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-02-26 08:54:17 |
Message-ID: | bjno37ickfafixkqmd2lcyopsajnuig5mm4rg6tn2ackpqyiba@w3sjfo3usuos |
Lists: | pgsql-hackers |
Hi,
On 2025-02-26 11:46:45 +0300, Maxim Orlov wrote:
> On Tue, 25 Feb 2025 at 22:44, Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>
> wrote:
>
> > Hi, hackers!
> >
> > Historically, the checkpointer process has used palloc() in the
> > AbsorbSyncRequests() function. As a consequence, the checkpointer does not
> > expect to receive a request queue larger than 1 GB.
>
>
> Yeah. And the most unpleasant thing is that it won't simply fail with an
> error or a helpful message suggesting a workaround (reduce the amount of
> shared memory). The checkpointer will just get "stuck".
>
> AFAICS, we have a few options:
> 1. Leave it as it is, but FATAL on allocation of a chunk larger than 1 GB.
> 2. Use palloc_extended() with the MCXT_ALLOC_HUGE flag.
> 3. Do not use any allocation and use CheckpointerShmem->requests directly
> when the required allocation would exceed 1 GB.
4) Do compaction incrementally, instead of doing it for all requests at once.
That'd probably be better, because
a) it'll take some time to compact tens to hundreds of millions of requests,
which makes it much more likely that backends will have to perform syncs
themselves and that the lock will be held for an extended period of time
b) allocating gigabytes of memory obviously makes it more likely that you'll
fail with out-of-memory at runtime or even get OOM killed.
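For concreteness, a rough sketch of what (4) could look like, assuming the
shared queue keeps its current flat-array layout; the function name, batch
size, and loop structure are illustrative, not an actual patch:

```c
/*
 * Drain the shared queue in bounded batches, releasing the lock between
 * batches so backends can keep queueing requests. ABSORB_BATCH_SIZE and
 * AbsorbSyncRequestsInBatches are assumed names.
 */
#define ABSORB_BATCH_SIZE 10000

static void
AbsorbSyncRequestsInBatches(void)
{
    CheckpointerRequest *batch;
    int         n;

    batch = (CheckpointerRequest *)
        palloc(ABSORB_BATCH_SIZE * sizeof(CheckpointerRequest));

    do
    {
        LWLockAcquire(CheckpointerCommLock, LW_EXCLUSIVE);

        /* Take at most one batch from the head of the shared queue. */
        n = Min(CheckpointerShmem->num_requests, ABSORB_BATCH_SIZE);
        if (n > 0)
        {
            memcpy(batch, CheckpointerShmem->requests,
                   n * sizeof(CheckpointerRequest));
            /* Keep any requests queued behind the batch we took. */
            memmove(CheckpointerShmem->requests,
                    CheckpointerShmem->requests + n,
                    (CheckpointerShmem->num_requests - n) *
                    sizeof(CheckpointerRequest));
            CheckpointerShmem->num_requests -= n;
        }

        LWLockRelease(CheckpointerCommLock);

        /* Replay the batch into the checkpointer's local sync state. */
        for (int i = 0; i < n; i++)
            RememberSyncRequest(&batch[i].ftag, batch[i].type);
    } while (n == ABSORB_BATCH_SIZE);

    pfree(batch);
}
```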
Greetings,
Andres Freund
From: | Maxim Orlov <orlovmg(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-02-26 11:10:33 |
Message-ID: | CACG=ezaP5j_HnSS4_bc51Fw=O4-Sdd=-tQTiodrdq8QRThshPw@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, 26 Feb 2025 at 11:54, Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> 4) Do compaction incrementally, instead of doing it for all requests at
> once.
Yeah, good idea! I completely forgot about that. Thanks!
--
Best regards,
Maxim Orlov.
From: | Maxim Orlov <orlovmg(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-02-27 14:16:28 |
Message-ID: | CACG=ezZJGhJgZEKc3arPFKMUWjmXLFt_GeaMTqXcMxQPfR31jg@mail.gmail.com |
Lists: | pgsql-hackers |
I tried to implement idea (4). This is the patch.
But there is a problem. See, when we release the lock to call
RememberSyncRequest() and then acquire it again,
CompactCheckpointerRequestQueue() may have been called in the meantime and
the state of the request array may have changed. Of course, we can recheck
num_requests after retaking the lock and restart the algorithm all over
again. But this is not a great idea, since we can get stuck in this loop
if someone keeps pushing requests into the queue.
As for case (3): in fact, the described problem happens only with high
enough values of NBuffers. Thus, the user already expects Postgres to use
a huge amount of RAM. Is it really a problem if it demands some more to
process sync requests?
--
Best regards,
Maxim Orlov.
Attachment | Content-Type | Size |
---|---|---|
v1-0001-AbsorbSyncRequests-incrementally-instead-of-doing.patch | application/octet-stream | 3.3 KB |
From: | Maxim Orlov <orlovmg(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-02-28 08:13:29 |
Message-ID: | CACG=ezbnPkZwKbQ=YrRfawrSEw0fbz7WsWohusGsE+ycHyfvvA@mail.gmail.com |
Lists: | pgsql-hackers |
After doing some testing, I found a bug in the patch: if more requests
were pushed while the lock was released, num_requests must not simply be
reset to zero. Here is a fixed version.
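The essence of the fix, sketched; "num_absorbed" is an illustrative name
for the number of requests taken in this pass, not necessarily what the
patch calls it:

```c
/*
 * After re-taking the lock, subtract only what we actually absorbed,
 * so requests pushed while the lock was released survive.
 */
LWLockAcquire(CheckpointerCommLock, LW_EXCLUSIVE);
Assert(CheckpointerShmem->num_requests >= num_absorbed);
CheckpointerShmem->num_requests -= num_absorbed;    /* not: = 0 */
LWLockRelease(CheckpointerCommLock);
```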
--
Best regards,
Maxim Orlov.
Attachment | Content-Type | Size |
---|---|---|
v2-0001-AbsorbSyncRequests-incrementally-instead-of-doing.patch | application/octet-stream | 3.8 KB |
From: | Maxim Orlov <orlovmg(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-02-28 08:55:45 |
Message-ID: | CACG=ezbQrdr8orTXnu6_kWpJ_VC1MEm7Y89ACMVBB0mm+zCROA@mail.gmail.com |
Lists: | pgsql-hackers |
Here is an alternative solution. We can limit the number of processed
requests so that each pass fits in a 1 GB memory chunk. Obviously, this
leaves some requests in the queue to be processed in the next call. We can
overcome that with multi-step processing: estimate the number of steps at
the beginning and process the remaining requests in further passes.
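The per-pass cap would presumably fall out of MaxAllocSize; a sketch, with
MAX_REQUESTS_PER_PASS as an assumed name:

```c
/* The largest slice that still fits in a single palloc(). */
#define MAX_REQUESTS_PER_PASS \
    (MaxAllocSize / sizeof(CheckpointerRequest))

n = Min(CheckpointerShmem->num_requests, (int) MAX_REQUESTS_PER_PASS);
requests = (CheckpointerRequest *) palloc(n * sizeof(CheckpointerRequest));
```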
I'd like to hear your opinion on the subject.
--
Best regards,
Maxim Orlov.
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Limit-AbsorbSyncRequests-to-1Gb-at-once.patch | application/octet-stream | 1.5 KB |
From: | Maxim Orlov <orlovmg(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-02-28 16:54:12 |
Message-ID: | CACG=ezaXQroNuSwhvymza=gqpxV93nTjRvpq-EoEgLWhGfMubw@mail.gmail.com |
Lists: | pgsql-hackers |
I think I figured it out. Here is v4.
If the requests fit within 1 GB, the algorithm stays the same as before.
If we need to process more, we do it incrementally, in slices of 1 GB.
Best regards,
Maxim Orlov.
Attachment | Content-Type | Size |
---|---|---|
v4-0001-Process-sync-requests-incrementally-in-AbsorbSync.patch | application/octet-stream | 3.7 KB |
From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | Maxim Orlov <orlovmg(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-03-12 07:27:31 |
Message-ID: | CABPTF7XYajjHFG1quHFUa_t5h8T4SBMqMdkgQcPNmU99yT2S5g@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
The patch itself looks OK to me. I'm curious about the trade-offs between
this incremental approach and the alternative of using palloc_extended()
with the MCXT_ALLOC_HUGE flag. The approach of splitting the requests into
fixed-size slices avoids OOM failures or process termination by the OOM
killer, which is good. However, it does add some overhead with additional
lock acquisition/release cycles and memory movement operations via
memmove(). The natural question is whether the added safety justifies the
cost. Regarding the slice size of 1 GB: is it derived from the
MaxAllocSize limit, or was it chosen for other performance reasons? Might
a different size offer better performance under typical workloads?
It would be helpful to know the reasoning behind these design decisions.
On Sat, 1 Mar 2025 at 00:54, Maxim Orlov <orlovmg(at)gmail(dot)com> wrote:
> I think I figured it out. Here is v4.
>
> If the number of requests is less than 1 GB, the algorithm stays the same
> as before. If we need to process more, we will do it incrementally with
> slices of 1 GB.
>
> Best regards,
> Maxim Orlov.
>
From: | Maxim Orlov <orlovmg(at)gmail(dot)com> |
---|---|
To: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-03-12 11:00:53 |
Message-ID: | CACG=ezaqBbJcChcznoh2qE8+vF6A5uTyvMeSKHdrAb7sDYg5CQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, 12 Mar 2025 at 10:27, Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> Hi,
> The patch itself looks OK to me. I'm curious about the trade-offs between
> this incremental approach and the alternative of using palloc_extended()
> with the MCXT_ALLOC_HUGE flag. The approach of splitting the requests into
> fixed-size slices avoids OOM failures or process termination by the OOM
> killer, which is good. However, it does add some overhead with additional
> lock acquisition/release cycles and memory movement operations via
> memmove(). The natural question is whether the added safety justifies the
> cost. Regarding the slice size of 1 GB: is it derived from the
> MaxAllocSize limit, or was it chosen for other performance reasons? Might
> a different size offer better performance under typical workloads?
>
I think 1 GB is derived purely from MaxAllocSize. This palloc() call is a
relatively old one, and no one expected the number of requests to exceed
the 1 GB limit. Now we have the ability to set shared_buffers to a huge
number (leaving aside whether this makes any real sense), so this palloc
limit becomes a problem.
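For reference, MaxAllocSize is defined in src/include/utils/memutils.h:

```c
#define MaxAllocSize   ((Size) 0x3fffffff)   /* 1 gigabyte - 1 */
```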
--
Best regards,
Maxim Orlov.
From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Maxim Orlov <orlovmg(at)gmail(dot)com>, Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-04-04 21:22:52 |
Message-ID: | [email protected] |
Lists: | pgsql-hackers |
On 12/03/2025 13:00, Maxim Orlov wrote:
> On Wed, 12 Mar 2025 at 10:27, Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
>
>> The patch itself looks OK to me. I'm curious about the trade-offs
>> between this incremental approach and the alternative of
>> using palloc_extended() with the MCXT_ALLOC_HUGE flag. The approach
>> of splitting the requests into fixed-size slices avoids OOM
>> failures or process termination by the OOM killer, which is good.
+1
>> However, it does add some overhead with additional lock acquisition/
>> release cycles and memory movement operations via memmove(). The
>> natural question is whether the added safety justifies the cost.
Hmm, if you turned the array into a ring buffer, you could absorb part
of the queue without the memmove().
With that, I'd actually suggest using a much smaller allocation, maybe
10000 entries or even less. That should be enough to keep the lock
acquire/release overhead acceptable.
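A sketch of that shape, to make the suggestion concrete; the struct layout,
field names, and TakeBatch() are assumptions, not the committed code:

```c
/*
 * Ring-buffer queue: the checkpointer consumes from "head" without
 * shifting the array, so absorbing a batch needs no memmove().
 */
typedef struct
{
    int         head;           /* index of the oldest request */
    int         num_requests;   /* valid entries, wrapping past the end */
    int         max_requests;   /* ring capacity */
    CheckpointerRequest requests[FLEXIBLE_ARRAY_MEMBER];
} CheckpointerQueue;

/* Copy up to batch_size requests out; caller holds CheckpointerCommLock. */
static int
TakeBatch(CheckpointerQueue *q, CheckpointerRequest *batch, int batch_size)
{
    int         n = Min(q->num_requests, batch_size);

    for (int i = 0; i < n; i++)
        batch[i] = q->requests[(q->head + i) % q->max_requests];

    q->head = (q->head + n) % q->max_requests;
    q->num_requests -= n;
    return n;
}
```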
>> Regarding the slice size of 1 GB: is it derived from the
>> MaxAllocSize limit, or was it chosen for other performance
>> reasons? Might a different size offer better performance
>> under typical workloads?
>
> I think 1 GB is derived purely from MaxAllocSize. This palloc() call is a
> relatively old one, and no one expected the number of requests to exceed
> the 1 GB limit. Now we have the ability to set shared_buffers to a huge
> number (leaving aside whether this makes any real sense), so this palloc
> limit becomes a problem.
Yeah. Frankly even 1 GB seems excessive for this. We set max_requests =
NBuffers, which makes some sense so that you can fit the worst case
scenario of quickly evicting all pages from the buffer cache. But even
then I'd expect the checkpointer to be able to absorb the changes before
the queue fills up. And we have the compaction logic now too.
So I wonder if we should cap max_requests at, say, 10 million requests now?
CompactCheckpointerRequestQueue() makes a pretty large allocation too,
BTW, although much smaller than the one in AbsorbSyncRequests(). The
hash table it builds could grow quite large though.
--
Heikki Linnakangas
Neon (https://neon.tech)
From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Maxim Orlov <orlovmg(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-04-10 15:43:20 |
Message-ID: | CABPTF7WGU1dNjOn_Ea2Os78zqu90jEUr=oyveOtpG8bNVh87bA@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
I’ve updated and rebased Maxim's patch v4 with the optimizations Heikki
proposed, namely the ring buffer and a capped max_requests. Here is a
summary of the changes in v5 (a sketch of the cap follows the list):
• Replaced the linear array model in AbsorbSyncRequests() with a bounded
ring buffer to avoid large memmove() operations when processing sync
requests.
• Introduced a tunable batch size (CKPT_REQ_BATCH_SIZE, default: 10,000)
to incrementally absorb requests while respecting MaxAllocSize.
• Capped max_requests with a hard upper bound (MAX_CHECKPOINT_REQUESTS =
10,000,000) to avoid pathological memory growth when shared_buffers is very
large.
• Updated CompactCheckpointerRequestQueue() to support the ring buffer
layout.
• Retained the existing compacting logic but modified it to operate
in-place over the ring.
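As mentioned above, a sketch of the cap, using the constant named in the
summary; where exactly this lands in the shared-memory initialization is an
assumption:

```c
/* Hard ceiling on the request queue, independent of shared_buffers. */
#define MAX_CHECKPOINT_REQUESTS 10000000

CheckpointerShmem->max_requests = Min(NBuffers, MAX_CHECKPOINT_REQUESTS);
```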
> >> The patch itself looks OK to me. I'm curious about the trade-offs
> >> between this incremental approach and the alternative of
> >> using palloc_extended() with the MCXT_ALLOC_HUGE flag. The approach
> >> of splitting the requests into fixed-size slices avoids OOM
> >> failures or process termination by the OOM killer, which is good.
>
> +1
>
> >> However, it does add some overhead with additional lock acquisition/
> >> release cycles and memory movement operations via memmove(). The
> >> natural question is whether the added safety justifies the cost.
>
> Hmm, if you turned the array into a ring buffer, you could absorb part
> of the queue without the memmove().
>
> With that, I'd actually suggest using a much smaller allocation, maybe
> 10000 entries or even less. That should be enough to keep the lock
> acquire/release overhead acceptable
>
> >> Regarding the slice size of 1 GB: is it derived from the
> >> MaxAllocSize limit, or was it chosen for other performance
> >> reasons? Might a different size offer better performance
> >> under typical workloads?
> >
> > I think 1 GB is derived purely from MaxAllocSize. This palloc() call is
> > a relatively old one, and no one expected the number of requests to
> > exceed the 1 GB limit. Now we have the ability to set shared_buffers to
> > a huge number (leaving aside whether this makes any real sense), so
> > this palloc limit becomes a problem.
>
> Yeah. Frankly even 1 GB seems excessive for this. We set max_requests =
> NBuffers, which makes some sense so that you can fit the worst case
> scenario of quickly evicting all pages from the buffer cache. But even
> then I'd expect the checkpointer to be able to absorb the changes before
> the queue fills up. And we have the compaction logic now too.
>
> So I wonder if we should cap max_requests at, say, 10 million requests now?
>
> CompactCheckpointerRequestQueue() makes a pretty large allocation too,
> BTW, although much smaller than the one in AbsorbSyncRequests(). The
> hash table it builds could grow quite large though.
>
I’m not certain what the optimal cap for max_requests should be, whether
it’s 10 million (setting it to 10 million would result in a peak temporary
allocation of roughly 700 MB within CompactCheckpointerRequestQueue), 1
million, or some other value, but introducing an upper bound seems
important. Without a cap, max_requests could theoretically scale with
NBuffers, potentially resulting in excessive temporary memory allocations.
As this is my first patch submission, it might be somewhat naive and
experimental, so I appreciate your patience and feedback.
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Process-sync-requests-incrementally-in-AbsorbSyncReq.patch | application/octet-stream | 9.4 KB |
From: | Xuneng Zhou <xunengzhou(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Maxim Orlov <orlovmg(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Ekaterina Sokolova <e(dot)sokolova(at)postgrespro(dot)ru>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Proposal: Limitations of palloc inside checkpointer |
Date: | 2025-04-15 04:02:25 |
Message-ID: | CABPTF7VbszDaqyqS8WeyxJgyyjHvb9x1L3Ls7gSwr3KuiBr0wg@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
I've moved this entry to the July CommitFest. The CI reported an
unused-variable warning in patch v5, so I've addressed it by removing the
unused variable. Sorry for not catching the issue locally.
On Thu, 10 Apr 2025 at 23:43, Xuneng Zhou <xunengzhou(at)gmail(dot)com> wrote:
> Hi,
>
> I’ve updated and rebased Maxim's patch v4 with the optimizations Heikki
> proposed, namely the ring buffer and a capped max_requests. Here is a
> summary of the changes in v5:
>
> • Replaced the linear array model in AbsorbSyncRequests() with a bounded
> ring buffer to avoid large memmove() operations when processing sync
> requests.
>
> • Introduced a tunable batch size (CKPT_REQ_BATCH_SIZE, default: 10,000)
> to incrementally absorb requests while respecting MaxAllocSize.
>
> • Capped max_requests with a hard upper bound (MAX_CHECKPOINT_REQUESTS =
> 10,000,000) to avoid pathological memory growth when shared_buffers is
> very large.
>
> • Updated CompactCheckpointerRequestQueue() to support the ring buffer
> layout.
>
> • Retained the existing compacting logic but modified it to operate
> in-place over the ring.
>
>> >> The patch itself looks OK to me. I'm curious about the trade-offs
>> >> between this incremental approach and the alternative of
>> >> using palloc_extended() with the MCXT_ALLOC_HUGE flag. The approach
>> >> of splitting the requests into fixed-size slices avoids OOM
>> >> failures or process termination by the OOM killer, which is good.
>>
>> +1
>>
>> >> However, it does add some overhead with additional lock acquisition/
>> >> release cycles and memory movement operations via memmove(). The
>> >> natural question is whether the added safety justifies the cost.
>>
>> Hmm, if you turned the array into a ring buffer, you could absorb part
>> of the queue without the memmove().
>>
>> With that, I'd actually suggest using a much smaller allocation, maybe
>> 10000 entries or even less. That should be enough to keep the lock
>> acquire/release overhead acceptable
>>
>
>
>> >> Regarding the slice size of 1 GB: is it derived from the
>> >> MaxAllocSize limit, or was it chosen for other performance
>> >> reasons? Might a different size offer better performance
>> >> under typical workloads?
>> >
>> > I think 1 GB is derived purely from MaxAllocSize. This palloc() call
>> > is a relatively old one, and no one expected the number of requests to
>> > exceed the 1 GB limit. Now we have the ability to set shared_buffers
>> > to a huge number (leaving aside whether this makes any real sense), so
>> > this palloc limit becomes a problem.
>>
>> Yeah. Frankly even 1 GB seems excessive for this. We set max_requests =
>> NBuffers, which makes some sense so that you can fit the worst case
>> scenario of quickly evicting all pages from the buffer cache. But even
>> then I'd expect the checkpointer to be able to absorb the changes before
>> the queue fills up. And we have the compaction logic now too.
>>
>> So I wonder if we should cap max_requests at, say, 10 million requests
>> now?
>>
>> CompactCheckpointerRequestQueue() makes a pretty large allocation too,
>> BTW, although much smaller than the one in AbsorbSyncRequests(). The
>> hash table it builds could grow quite large though.
>>
> I’m not certain what the optimal cap for max_requests should be, whether
> it’s 10 million (setting it to 10 million would result in a peak temporary
> allocation of roughly 700 MB within CompactCheckpointerRequestQueue), 1
> million, or some other value, but introducing an upper bound seems
> important. Without a cap, max_requests could theoretically scale with
> NBuffers, potentially resulting in excessive temporary memory allocations.
>
> As this is my first patch submission, it might be somewhat naive and
> experimental, so I appreciate your patience and feedback.
>
Attachment | Content-Type | Size |
---|---|---|
v6-0001-Process-sync-requests-incrementally-in-AbsorbSyncReq.patch | application/octet-stream | 9.3 KB |