Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers
From | Yura Sokolov |
---|---|
Subject | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Date | |
Msg-id | [email protected] Whole thread Raw |
In response to | Re: [PoC] Improve dead tuple storage for lazy vacuum (Masahiko Sawada <[email protected]>) |
List | pgsql-hackers |
Masahiko Sawada писал 2021-07-27 07:06: > On Mon, Jul 26, 2021 at 11:01 PM Masahiko Sawada > <[email protected]> wrote: >> >> I'll experiment with the proposed ideas including this idea in more >> scenarios and share the results tomorrow. >> > > I've done some benchmarks for proposed data structures. In this trial, > I've done with the scenario where dead tuples are concentrated on a > particular range of table blocks (test 5-8), in addition to the > scenarios I've done in the previous trial. Also, I've done benchmarks > of each scenario while increasing table size. In the first test, the > maximum block number of the table is 1,000,000 (i.g., 8GB table) and > in the second test, it's 10,000,000 (80GB table). We can see how > performance and memory consumption changes with a large-scale table. > Here are the results: > > * Test 1 > select prepare( > 1000000, -- max block > 10, -- # of dead tuples per page > 1, -- dead tuples interval within a page > 1, -- # of consecutive pages having dead tuples > 20 -- page interval > ); > > name | attach | attach | shuffled | size_x10 | attach_x10| > shuffled_x10 > --------+-----------+--------+----------+------------+-----------+------------- > array | 57.23 MB | 0.040 | 98.613 | 572.21 MB | 0.387 | > 1521.981 > intset | 46.88 MB | 0.114 | 75.944 | 468.67 MB | 0.961 | > 997.760 > radix | 40.26 MB | 0.102 | 18.427 | 336.64 MB | 0.797 | > 266.146 > rtbm | 64.02 MB | 0.234 | 22.443 | 512.02 MB | 2.230 | > 275.143 > svtm | 27.28 MB | 0.060 | 13.568 | 274.07 MB | 0.476 | > 211.073 > tbm | 96.01 MB | 0.273 | 10.347 | 768.01 MB | 2.882 | > 128.103 > > * Test 2 > select prepare( > 1000000, -- max block > 10, -- # of dead tuples per page > 1, -- dead tuples interval within a page > 1, -- # of consecutive pages having dead tuples > 1 -- page interval > ); > > name | attach | attach | shuffled | size_x10 | attach_x10| > shuffled_x10 > --------+-----------+--------+----------+------------+-----------+------------- > array | 57.23 MB | 0.041 | 4.757 | 572.21 MB | 0.344 | > 71.228 > intset | 46.88 MB | 0.127 | 3.762 | 468.67 MB | 1.093 | > 49.573 > radix | 9.95 MB | 0.048 | 0.679 | 82.57 MB | 0.371 | > 16.211 > rtbm | 34.02 MB | 0.179 | 0.534 | 288.02 MB | 2.092 | > 8.693 > svtm | 5.78 MB | 0.043 | 0.239 | 54.60 MB | 0.342 | > 7.759 > tbm | 96.01 MB | 0.274 | 0.521 | 768.01 MB | 2.685 | > 6.360 > > * Test 3 > select prepare( > 1000000, -- max block > 2, -- # of dead tuples per page > 100, -- dead tuples interval within a page > 1, -- # of consecutive pages having dead tuples > 1 -- page interval > ); > > name | attach | attach | shuffled | size_x10 | attach_x10| > shuffled_x10 > --------+-----------+--------+----------+------------+-----------+------------- > array | 11.45 MB | 0.009 | 57.698 | 114.45 MB | 0.076 | > 1045.639 > intset | 15.63 MB | 0.031 | 46.083 | 156.23 MB | 0.243 | > 848.525 > radix | 40.26 MB | 0.063 | 13.755 | 336.64 MB | 0.501 | > 223.413 > rtbm | 36.02 MB | 0.123 | 11.527 | 320.02 MB | 1.843 | > 180.977 > svtm | 9.28 MB | 0.053 | 9.631 | 92.59 MB | 0.438 | > 212.626 > tbm | 96.01 MB | 0.228 | 10.381 | 768.01 MB | 2.258 | > 126.630 > > * Test 4 > select prepare( > 1000000, -- max block > 100, -- # of dead tuples per page > 1, -- dead tuples interval within a page > 1, -- # of consecutive pages having dead tuples > 1 -- page interval > ); > > name | attach | attach | shuffled | size_x10 | attach_x10| > shuffled_x10 > --------+-----------+--------+----------+------------+-----------+------------- > array | 572.21 MB | 0.367 | 78.047 | 5722.05 MB | 3.942 | > 1154.776 > intset | 93.74 MB | 0.777 | 45.146 | 937.34 MB | 7.716 | > 643.708 > radix | 40.26 MB | 0.203 | 9.015 | 336.64 MB | 1.775 | > 133.294 > rtbm | 36.02 MB | 0.369 | 5.639 | 320.02 MB | 3.823 | > 88.832 > svtm | 7.28 MB | 0.294 | 3.891 | 73.60 MB | 2.690 | > 103.744 > tbm | 96.01 MB | 0.534 | 5.223 | 768.01 MB | 5.679 | > 60.632 > > > * Test 5 > select prepare( > 1000000, -- max block > 150, -- # of dead tuples per page > 1, -- dead tuples interval within a page > 10000, -- # of consecutive pages having dead tuples > 20000 -- page interval > ); > > There are 10000 consecutive pages that have 150 dead tuples at every > 20000 pages. > > name | attach | attach | shuffled | size_x10 | attach_x10| > shuffled_x10 > --------+-----------+--------+----------+------------+-----------+------------- > array | 429.16 MB | 0.274 | 75.664 | 4291.54 MB | 3.067 | > 1259.501 > intset | 46.88 MB | 0.559 | 36.449 | 468.67 MB | 4.565 | > 517.445 > radix | 20.26 MB | 0.166 | 8.466 | 196.90 MB | 1.273 | > 166.587 > rtbm | 18.02 MB | 0.242 | 8.491 | 160.02 MB | 2.407 | > 171.725 > svtm | 3.66 MB | 0.243 | 3.635 | 37.10 MB | 2.022 | > 86.165 > tbm | 48.01 MB | 0.344 | 9.763 | 384.01 MB | 3.327 | > 151.824 > > * Test 6 > select prepare( > 1000000, -- max block > 10, -- # of dead tuples per page > 1, -- dead tuples interval within a page > 10000, -- # of consecutive pages having dead tuples > 20000 -- page interval > ); > > There are 10000 consecutive pages that have 10 dead tuples at every > 20000 pages. > > name | attach | attach | shuffled | size_x10 | attach_x10| > shuffled_x10 > --------+-----------+--------+----------+------------+-----------+------------- > array | 28.62 MB | 0.022 | 2.791 | 286.11 MB | 0.170 | > 46.920 > intset | 23.45 MB | 0.061 | 2.156 | 234.34 MB | 0.501 | > 32.577 > radix | 5.04 MB | 0.026 | 0.433 | 48.57 MB | 0.191 | > 11.060 > rtbm | 17.02 MB | 0.074 | 0.533 | 144.02 MB | 0.954 | > 11.502 > svtm | 3.16 MB | 0.023 | 0.206 | 27.60 MB | 0.175 | > 4.886 > tbm | 48.01 MB | 0.132 | 0.656 | 384.01 MB | 1.284 | > 10.231 > > * Test 7 > select prepare( > 1000000, -- max block > 150, -- # of dead tuples per page > 1, -- dead tuples interval within a page > 1000, -- # of consecutive pages having dead tuples > 999000 -- page interval > ); > > There are pages that have 150 dead tuples at first 1000 blocks and > last 1000 blocks. > > name | attach | attach | shuffled | size_x10 | attach_x10| > shuffled_x10 > --------+-----------+--------+----------+------------+-----------+------------- > array | 1.72 MB | 0.002 | 7.507 | 17.17 MB | 0.011 | > 76.510 > intset | 0.20 MB | 0.003 | 6.742 | 1.89 MB | 0.022 | > 52.122 > radix | 0.20 MB | 0.001 | 1.023 | 1.07 MB | 0.007 | > 12.023 > rtbm | 0.15 MB | 0.001 | 2.637 | 0.65 MB | 0.009 | > 34.528 > svtm | 0.52 MB | 0.002 | 0.721 | 0.61 MB | 0.010 | > 6.434 > tbm | 0.20 MB | 0.002 | 2.733 | 1.51 MB | 0.015 | > 38.538 > > * Test 8 > select prepare( > 1000000, -- max block > 100, -- # of dead tuples per page > 1, -- dead tuples interval within a page > 50, -- # of consecutive pages having dead tuples > 100 -- page interval > ); > > There are 50 consecutive pages that have 100 dead tuples at every 100 > pages. > > name | attach | attach | shuffled | size_x10 | attach_x10| > shuffled_x10 > --------+-----------+--------+----------+------------+-----------+------------- > array | 286.11 MB | 0.184 | 67.233 | 2861.03 MB | 1.743 | > 979.070 > intset | 46.88 MB | 0.389 | 35.176 | 468.67 MB | 3.698 | > 505.322 > radix | 21.82 MB | 0.116 | 6.160 | 186.86 MB | 0.891 | > 117.730 > rtbm | 18.02 MB | 0.182 | 5.909 | 160.02 MB | 1.870 | > 112.550 > svtm | 4.28 MB | 0.152 | 3.213 | 37.60 MB | 1.383 | > 79.073 > tbm | 48.01 MB | 0.265 | 6.673 | 384.01 MB | 2.586 | > 101.327 > > Overall, 'svtm' is faster and consumes less memory. 'radix' tree also > has good performance and memory usage. > > From these results, svtm is the best data structure among proposed > ideas for dead tuple storage used during lazy vacuum in terms of > performance and memory usage. I think it can support iteration by > extracting the offset of dead tuples for each block while iterating > chunks. > > Apart from performance and memory usage points of view, we also need > to consider the reusability of the code. When I started this thread, I > thought the best data structure would be the one optimized for > vacuum's dead tuple storage. However, if we can use a data structure > that can also be used in general, we can use it also for other > purposes. Moreover, if it's too optimized for the current TID system > (32 bits block number, 16 bits offset number, maximum block/offset > number, etc.) it may become a blocker for future changes. > > In that sense, radix tree also seems good since it can also be used in > gist vacuum as a replacement for intset, or a replacement for hash > table for shared buffer as discussed before. Are there any other use > cases? On the other hand, I’m concerned that radix tree would be an > over-engineering in terms of vacuum's dead tuples storage since the > dead tuple storage is static data and requires only lookup operation, > so if we want to use radix tree as dead tuple storage, I'd like to see > further use cases. I can evolve svtm to transparent intset replacement certainly. Using same trick from radix_to_key it will store tids efficiently: shift = pg_ceil_log2_32(MaxHeapTuplesPerPage); tid_i = ItemPointerGetOffsetNumber(tid); tid_i |= ItemPointerGetBlockNumber(tid) << shift; Will do today's evening. regards Yura Sokolov aka funny_falcon
pgsql-hackers by date: