Re: [HACKERS] Moving relation extension locks out of heavyweight lockmanager - Mailing list pgsql-hackers
From | Konstantin Knizhnik |
---|---|
Subject | Re: [HACKERS] Moving relation extension locks out of heavyweight lockmanager |
Date | |
Msg-id | [email protected] Whole thread Raw |
In response to | Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager (Masahiko Sawada <[email protected]>) |
Responses |
Re: [HACKERS] Moving relation extension locks out of heavyweightlock manager
Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager |
List | pgsql-hackers |
On 26.04.2018 09:10, Masahiko Sawada wrote:
> On Thu, Apr 26, 2018 at 3:30 AM, Robert Haas <[email protected]> wrote:
>> On Tue, Apr 10, 2018 at 9:08 PM, Masahiko Sawada <[email protected]> wrote:
>>> Never mind. There was a lot of items especially at the last CommitFest.
>>>
>>>> In terms of moving forward, I'd still like to hear what
>>>> Andres has to say about the comments I made on March 1st.
>>> Yeah, agreed.
>> $ ping -n andres.freund
>> Request timeout for icmp_seq 0
>> Request timeout for icmp_seq 1
>> Request timeout for icmp_seq 2
>> Request timeout for icmp_seq 3
>> Request timeout for icmp_seq 4
>> ^C
>> --- andres.freund ping statistics ---
>> 6 packets transmitted, 0 packets received, 100.0% packet loss
>>
>> Meanwhile, https://www.postgresql.org/message-id/[email protected]
>> shows that this patch has some benefits for other cases, which is a
>> point in favor IMHO.
> Thank you for sharing. That's good to know.
>
> Andres pointed out the performance degradation due to hash collision
> when multiple loading. I think the point is that it happens at where
> users don't know. Therefore even if we make N_RELEXTLOCK_ENTS
> configurable parameter, since users don't know the hash collision they
> don't know when they should tune it.
>
> So it's just an idea but how about adding an SQL-callable function
> that returns the estimated number of lock waiters of the given
> relation? Since user knows how many processes are loading to the
> relation, if a returned value by the function is greater than the
> expected value user can know hash collision and will be able to start
> to consider to increase N_RELEXTLOCK_ENTS.
>
> Regards,
>
> --
> Masahiko Sawada
> NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> NTT Open Source Software Center
>

We at Postgres Professional ran into relation extension lock contention at two more customers and tried this patch (v13) to address it.
Unfortunately, replacing the heavyweight lock with an lwlock could not completely eliminate the contention. Now most of the backends are blocked on a condition variable:

0x00007fb03a318903 in __epoll_wait_nocancel () from /lib64/libc.so.6
#0  0x00007fb03a318903 in __epoll_wait_nocancel () from /lib64/libc.so.6
#1  0x00000000007024ee in WaitEventSetWait ()
#2  0x0000000000718fa6 in ConditionVariableSleep ()
#3  0x000000000071954d in RelExtLockAcquire ()
#4  0x00000000004ba99d in RelationGetBufferForTuple ()
#5  0x00000000004b3f18 in heap_insert ()
#6  0x00000000006109c8 in ExecInsert ()
#7  0x0000000000611a49 in ExecModifyTable ()
#8  0x00000000005ef97a in standard_ExecutorRun ()
#9  0x000000000072440a in ProcessQuery ()
#10 0x0000000000724631 in PortalRunMulti ()
#11 0x00000000007250ec in PortalRun ()
#12 0x0000000000721287 in exec_simple_query ()
#13 0x0000000000722532 in PostgresMain ()
#14 0x000000000047a9eb in ServerLoop ()
#15 0x00000000006b9fe9 in PostmasterMain ()
#16 0x000000000047b431 in main ()

There is nothing surprising here: if many processes try to acquire the same exclusive lock, high contention is expected. I just want to note that this patch cannot completely eliminate the problem with a large number of concurrent inserts into the same table.

The second problem we observed is even more critical: if a backend is granted the relation extension lock and then hits an error before releasing it, aborting the current transaction does not release the lock (unlike a heavyweight lock), so the relation stays locked. The database is effectively stalled and the server has to be restarted.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
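
P.S. To make the second problem concrete, here is a minimal toy model in Python. It is not code from the patch or from PostgreSQL: ToyExtensionLock, LockBusy, SimulatedError and run_insert are hypothetical names invented for this sketch, and a real backend would sleep on the lock instead of raising. It only shows how a lock that is not released on the abort path stays held after an error.

```python
class LockBusy(Exception):
    """Raised where a real backend would go to sleep waiting for the lock."""


class SimulatedError(Exception):
    """Stands in for any error raised while the lock is held."""


class ToyExtensionLock:
    """Toy stand-in for a relation extension lock (hypothetical, not PostgreSQL code)."""

    def __init__(self):
        self.holder = None  # id of the backend holding the lock, or None

    def acquire(self, backend_id):
        if self.holder is not None:
            raise LockBusy(f"held by backend {self.holder}")
        self.holder = backend_id

    def release(self, backend_id):
        # Only the current holder can release the lock.
        if self.holder == backend_id:
            self.holder = None


def run_insert(lock, backend_id, fail, release_on_abort):
    """Run one toy 'transaction' that takes the lock and may fail before releasing it."""
    lock.acquire(backend_id)
    try:
        if fail:
            raise SimulatedError("error while extending the relation")
        lock.release(backend_id)
        return "committed"
    except SimulatedError:
        # Abort path: without explicit cleanup here the lock stays held.
        if release_on_abort:
            lock.release(backend_id)
        return "aborted"


if __name__ == "__main__":
    leaked = ToyExtensionLock()
    print(run_insert(leaked, 1, fail=True, release_on_abort=False), leaked.holder)

    cleaned = ToyExtensionLock()
    print(run_insert(cleaned, 1, fail=True, release_on_abort=True), cleaned.holder)
```

With release_on_abort=False the lock remains held by backend 1 after the abort, so every later run_insert on that lock raises LockBusy. That mirrors the stall we saw, where only a restart cleared the lock. With release_on_abort=True the next backend proceeds normally.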