Skip to content

Add extension_distdir GUC #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

theory
Copy link
Owner

@theory theory commented Apr 2, 2024

Based on a patch by Christophe Berg in the Debian Project, add a new GUC, extension_destdir, that prepends a directory prefix for extension loading. This directory is prepended to the SHAREDIR paths when loading extensions (control and SQL files), and to the $libdir directive when loading modules that back functions. Requires a superuser or user with the appropriate SET privilege.

Also document the PGXS DESTDIR variable, which should be used to install extensions into the proper destination directory.

@theory theory self-assigned this Apr 2, 2024
@theory theory force-pushed the extension-directory-guc branch 2 times, most recently from 5b903ec to 5d5a0eb Compare April 3, 2024 13:29
@theory theory force-pushed the extension-directory-guc branch from 4836162 to f6c852a Compare April 11, 2024 17:46
@theory theory changed the title Update extension configuration patch Add extension_distdir GUC Apr 11, 2024
@theory theory force-pushed the extension-directory-guc branch 4 times, most recently from 0bb93da to 12d1230 Compare June 17, 2024 22:25
@theory theory force-pushed the extension-directory-guc branch 4 times, most recently from 698bdd9 to 9d68067 Compare June 26, 2024 17:43
theory pushed a commit that referenced this pull request Jul 2, 2024
1. TruncateMultiXact() performs the SLRU truncations in a critical
section. Deleting the SLRU segments calls ForwardSyncRequest(), which
will try to compact the request queue if it's full
(CompactCheckpointerRequestQueue()). That in turn allocates memory,
which is not allowed in a critical section. Backtrace:

    TRAP: failed Assert("CritSectionCount == 0 || (context)->allowInCritSection"), File: "../src/backend/utils/mmgr/mcxt.c", Line: 1353, PID: 920981
    postgres: autovacuum worker template0(ExceptionalCondition+0x6e)[0x560a501e866e]
    postgres: autovacuum worker template0(+0x5dce3d)[0x560a50217e3d]
    postgres: autovacuum worker template0(ForwardSyncRequest+0x8e)[0x560a4ffec95e]
    postgres: autovacuum worker template0(RegisterSyncRequest+0x2b)[0x560a50091eeb]
    postgres: autovacuum worker template0(+0x187b0a)[0x560a4fdc2b0a]
    postgres: autovacuum worker template0(SlruDeleteSegment+0x101)[0x560a4fdc2ab1]
    postgres: autovacuum worker template0(TruncateMultiXact+0x2fb)[0x560a4fdbde1b]
    postgres: autovacuum worker template0(vac_update_datfrozenxid+0x4b3)[0x560a4febd2f3]
    postgres: autovacuum worker template0(+0x3adf66)[0x560a4ffe8f66]
    postgres: autovacuum worker template0(AutoVacWorkerMain+0x3ed)[0x560a4ffe7c2d]
    postgres: autovacuum worker template0(+0x3b1ead)[0x560a4ffecead]
    postgres: autovacuum worker template0(+0x3b620e)[0x560a4fff120e]
    postgres: autovacuum worker template0(+0x3b3fbb)[0x560a4ffeefbb]
    postgres: autovacuum worker template0(+0x2f724e)[0x560a4ff3224e]
    /lib/x86_64-linux-gnu/libc.so.6(+0x27c8a)[0x7f62cc642c8a]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7f62cc642d45]
    postgres: autovacuum worker template0(_start+0x21)[0x560a4fd16f31]

To fix, bail out in CompactCheckpointerRequestQueue() without doing
anything, if it's called in a critical section. That covers the above
call path, as well as any other similar cases where
RegisterSyncRequest might be called in a critical section.

2. After fixing that, another problem became apparent: Autovacuum
process doing that truncation can deadlock with the checkpointer
process. TruncateMultiXact() sets "MyProc->delayChkptFlags |=
DELAY_CHKPT_START". If the sync request queue is full and cannot be
compacted, the process will repeatedly sleep and retry, until there is
room in the queue. However, if the checkpointer is trying to start a
checkpoint at the same time, and is waiting for the DELAY_CHKPT_START
processes to finish, the queue will never shrink.

More concretely, the autovacuum process is stuck here:

    #0  0x00007fc934926dc3 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x000056220b24348b in WaitEventSetWaitBlock (set=0x56220c2e4b50, occurred_events=0x7ffe7856d040, nevents=1, cur_timeout=<optimized out>) at ../src/backend/storage/ipc/latch.c:1570
    #2  WaitEventSetWait (set=0x56220c2e4b50, timeout=timeout@entry=10, occurred_events=<optimized out>, occurred_events@entry=0x7ffe7856d040, nevents=nevents@entry=1,
        wait_event_info=wait_event_info@entry=150994949) at ../src/backend/storage/ipc/latch.c:1516
    #3  0x000056220b243224 in WaitLatch (latch=<optimized out>, latch@entry=0x0, wakeEvents=wakeEvents@entry=40, timeout=timeout@entry=10, wait_event_info=wait_event_info@entry=150994949)
        at ../src/backend/storage/ipc/latch.c:538
    #4  0x000056220b26cf46 in RegisterSyncRequest (ftag=ftag@entry=0x7ffe7856d0a0, type=type@entry=SYNC_FORGET_REQUEST, retryOnError=true) at ../src/backend/storage/sync/sync.c:614
    #5  0x000056220af9db0a in SlruInternalDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1495
    #6  0x000056220af9dab1 in SlruDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1566
    #7  0x000056220af98e1b in PerformMembersTruncation (oldestOffset=<optimized out>, newOldestOffset=<optimized out>) at ../src/backend/access/transam/multixact.c:3006
    #8  TruncateMultiXact (newOldestMulti=newOldestMulti@entry=3221225472, newOldestMultiDB=newOldestMultiDB@entry=4) at ../src/backend/access/transam/multixact.c:3201
    #9  0x000056220b098303 in vac_truncate_clog (frozenXID=749, minMulti=<optimized out>, lastSaneFrozenXid=749, lastSaneMinMulti=3221225472) at ../src/backend/commands/vacuum.c:1917
    #10 vac_update_datfrozenxid () at ../src/backend/commands/vacuum.c:1760
    #11 0x000056220b1c3f76 in do_autovacuum () at ../src/backend/postmaster/autovacuum.c:2550
    postgres#12 0x000056220b1c2c3d in AutoVacWorkerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/autovacuum.c:1569

and the checkpointer is stuck here:

    #0  0x00007fc9348ebf93 in clock_nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x00007fc9348fe353 in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
    #2  0x000056220b40ecb4 in pg_usleep (microsec=microsec@entry=10000) at ../src/port/pgsleep.c:50
    #3  0x000056220afb43c3 in CreateCheckPoint (flags=flags@entry=108) at ../src/backend/access/transam/xlog.c:7098
    #4  0x000056220b1c6e86 in CheckpointerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/checkpointer.c:464

To fix, add AbsorbSyncRequests() to the loops where the checkpointer
waits for DELAY_CHKPT_START or DELAY_CHKPT_COMPLETE operations to
finish.

Backpatch to v14. Before that, SLRU deletion didn't call
RegisterSyncRequest, which avoided this failure. I'm not sure if there
are other similar scenarios on older versions, but we haven't had
any such reports.

Discussion: https://www.postgresql.org/message-id/[email protected]
@theory theory force-pushed the extension-directory-guc branch 3 times, most recently from b3c3b82 to c1a146b Compare July 8, 2024 15:59
@theory theory force-pushed the extension-directory-guc branch 2 times, most recently from 8157854 to bd74d01 Compare July 10, 2024 14:38
@theory theory force-pushed the extension-directory-guc branch from bd74d01 to c5cc0b2 Compare July 19, 2024 13:54
@theory theory force-pushed the extension-directory-guc branch 2 times, most recently from e0cc589 to 72e18d3 Compare July 31, 2024 13:44
@ringerc
Copy link

ringerc commented Aug 21, 2024

(this is being discussed in the mailing lists as https://www.postgresql.org/message-id/flat/[email protected])


A single extension dir isn't IMO quite enough if we're going to make extension directories configurable.

I suggest a search path, like we have with dynamic_library_path, so it's possible to configure postgres to find extension .control and .sql files in multiple directories.

Use cases include things like app-bundled extensions, vendor add-ons, and local-built-from-source extensions compiled for distro-packaged postgres. E.g., assuming there is a default-extensions location var $extsdir added too, one might:

SET extension_search_path = $extsdir, /opt/myapp/extensions, /usr/local/postgres/my-custom-extension/extension;
SET dynamic_library_path = $libdir, /opt/myapp/lib, /usr/local/postgres/my-custom-extension/lib;

I elaborated in https://www.postgresql.org/message-id/CAGRY4nxP6A5Dz23g+aGD-agdVwUj_qrG6szd-mWc0E5OFMBg4w@mail.gmail.com

@@ -4491,6 +4492,17 @@ struct config_string ConfigureNamesString[] =
check_canonical_path, NULL, NULL
},

{
{"extension_destdir", PGC_SUSET, FILE_LOCATIONS,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dynamic_library_path is CLIENT_CONN_OTHER not FILE_LOCATIONS. Should this be the same for consistency, and is there a security reason for the dynamic library path choice?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's superuser-only, so I'm not sure it matters. Happy to change it to whatever the consensus determines.

gettext_noop("This directory is prepended to paths when loading extensions (control and SQL files), and to the '$libdir' directive when loading modules that back functions. The location is made configurable to allow build-time testing of extensions that do not have been installed to their proper location yet."),
GUC_SUPERUSER_ONLY
},
&extension_destdir,
Copy link

@ringerc ringerc Aug 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be made extensions_search_path or similar, like dynamic_library_path https://github.com/theory/postgres/blob/72e18d36734ced0cacb9c24022a85d88a14b4cb8/src/backend/utils/misc/guc_tables.c#L4161C39-L4161C56, so it can search multiple directories for extensions?

That would be a real help for container environments, local testing, etc, and would be more consistent with how the library search path works.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replied to the -hackers thread, but I'd argue that's a completely different feature that requires a fair bit of thought. I would definitely prefer it, but the approach proposed here feels like a decent interim step to me, one that's already proven (in the Debian packaging system) and does most of what we need to solve the immediate challenges, at the expense of a slightly funky directory layout.

@theory theory force-pushed the extension-directory-guc branch from 72e18d3 to beb1ee9 Compare September 10, 2024 19:41
Based on a [patch] by Christophe Berg in the Debian Project, add a new
GUC, `extension_destdir`, that prepends a directory prefix for extension
loading. This directory is prepended to the `SHAREDIR` paths when
loading extensions (control and SQL files), and to the `$libdir`
directive when loading modules that back functions. Requires a superuser
or user with the appropriate SET privilege.

Also document the PGXS `DESTDIR` variable, which should be used to
install extensions into the proper destination directory.

  [patch]: https://salsa.debian.org/postgresql/postgresql/-/blob/17/debian/patches/extension_destdir?ref_type=heads
@theory theory force-pushed the extension-directory-guc branch from beb1ee9 to a69a8e5 Compare September 16, 2024 17:21
@theory
Copy link
Owner Author

theory commented Apr 24, 2025

Alternate patch committed in 4f7f7b0.

@theory theory closed this Apr 24, 2025
@theory theory deleted the extension-directory-guc branch April 28, 2025 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants