POSTGRES INDEXES:
A DEEP DIVE
INTRODUCTION
Bartosz Sypytkowski
▪ @Horusiath
▪ b.sypytkowski@gmail.com
▪ bartoszsypytkowski.com
AGENDA
▪ B+Tree: how do databases manage data?
▪ How indexes work
▪ Query plans & execution
▪ Types of indexes
B+TREE
[Diagram: a B+Tree built from 8KB pages, each starting with a HEADER.
Root page: keys 4, 32.
Internal pages: keys 4, 25 and keys 32, 49.
Leaf pages (key + DATA): 4, 8, 16, 20 | 25, 27, 30 | 32, 42, 43, 44 | 49, 62, 64.]
B+TREE
[Diagram: the same tree of 8KB pages, now with links drawn between sibling pages.]
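To see these pages on a live system, a minimal sketch. It assumes a table named books (used later in this deck) already exists; relpages and reltuples are estimates refreshed by vacuum/analyze.

```sql
-- page size the server was built with (8192 bytes by default)
show block_size;

-- estimated number of pages and rows for the table and its indexes
select c.relname, c.relkind, c.relpages, c.reltuples
from pg_class c
where c.relname = 'books'
   or c.oid in (select indexrelid
                from pg_index
                where indrelid = 'books'::regclass);
```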
SEARCH BY EQUALITY
[Diagram: the same B+Tree of 8KB pages. The lookup descends from the root page (4, 32) through one internal page to the single leaf page that can hold the searched key.]
RANGE SCAN
[Diagram: the same B+Tree of 8KB pages. Root page: 4, 32. Internal pages: 4, 25 and 32, 49. Leaf pages: 4, 8, 16, 20 | 25, 27, 30 | 32, 42, 43, 44 | 49, 62, 64. The scan descends to the leaf holding the start of the range, then follows the links between leaf pages.]
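Both access patterns can be observed with explain. This is a sketch under assumptions: the index name ix_books_author_id is hypothetical, and on a small table the planner may still prefer a sequential scan. The last query needs the pageinspect extension and superuser rights.

```sql
-- hypothetical index; btree is the default access method
create index ix_books_author_id on books(author_id);

-- equality: one descent from the root to a leaf page
explain select * from books where author_id = 10;

-- range: descend to the first matching leaf, then follow the leaf links
explain select * from books where author_id between 10 and 20;

-- height of the tree (level 0 means the root is itself a leaf)
create extension if not exists pageinspect;
select level from bt_metap('ix_books_author_id');
```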
OVERFLOW PAGES
[Diagram: a page whose DATA does not fit links to separate OVERFLOW pages, each with its own HEADER, that hold the remainder.]
POSTGRESQL
TOAST TABLES
public.books
book_id  title        content
1        'Haikus'     'In the twilight rain these brilliant-hued hibiscus - A lovely sunset.'
2        'Moby Dick'  (stored out of line in the TOAST table)
pg_toast.pg_toast_{$OID}
chunk_id  chunk_seq  chunk_data
16403     0          0xFF2F000000436
16403     1          0x6167F27521F25B
16403     2          0x23FB21030E6F6
16403     3          0x7974686108676
-- find the name of the corresponding TOAST table
select relname
from pg_class
where oid = (
  select reltoastrelid
  from pg_class
  where relname = 'books');
Condition for moving a value to the TOAST table: if(compress(content).len > 2000)
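A quick way to check how much of a table lives in its TOAST relation. A sketch only, assuming the books table above with a text column named content.

```sql
-- size of the main heap versus its TOAST table
select c.relname,
       pg_size_pretty(pg_relation_size(c.oid))           as heap_size,
       pg_size_pretty(pg_relation_size(c.reltoastrelid)) as toast_size
from pg_class c
where c.relname = 'books'
  and c.reltoastrelid <> 0;

-- stored (possibly compressed) size versus raw size of each value, in bytes
select book_id,
       pg_column_size(content) as stored_bytes,
       octet_length(content)   as raw_bytes
from books;
```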
TUPLE IDs
[Diagram: Table Heap pages store each key together with its row DATA (4, 8, 16, 20, 25, 27, 30, 32). Index Storage pages store each key together with a TID, a link to the row in the table heap.]
TID = <block id, tuple offset>
select ctid from my_table
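A short sketch of working with ctid, using the same placeholder table my_table. Note that ctid is a physical address: it changes when a row is updated or the table is rewritten, so it is not a stable identifier.

```sql
-- ctid is a system column: (block number, offset within the block)
select ctid, * from my_table limit 5;

-- fetch a row directly by its physical address (planned as a TID scan)
select * from my_table where ctid = '(0,1)';
```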
QUERY EXECUTION
select *
from books
where author_id = 10
SEQ SCAN
[Diagram: pages M1, T1-T7 (Table Heap) and I1-I3 (Index Storage).]
1. (Hopefully) sequential I/O
2. Scans all pages of the table
3. Doesn't use index pages
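To see when the planner picks a sequential scan, compare the plan with sequential scans discouraged. A sketch for experimentation only; enable_seqscan should not be switched off in production sessions.

```sql
explain select * from books where author_id = 10;

-- discourage sequential scans for this session and compare the plan
set enable_seqscan = off;
explain select * from books where author_id = 10;
reset enable_seqscan;
```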
create index on books(publication_date);
select publication_date
from books
where publication_date > '2020/01/01'
INDEX ONLY SCAN
[Diagram: pages M1, T1-T7 (Table Heap) and I1-I3 (Index Storage).]
1. Sequential I/O over index pages
2. Doesn't use the table's pages
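Skipping the table's pages is only possible for pages marked all-visible in the visibility map, which vacuum maintains. A sketch of how to check this, assuming the index above exists:

```sql
-- refresh the visibility map and planner statistics
vacuum (analyze) books;

explain (analyze)
select publication_date
from books
where publication_date > '2020-01-01';
-- in the output, look for "Index Only Scan" and the "Heap Fetches" counter:
-- a non-zero value means some rows still had to be checked in the table heap
```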
create index on books(publication_date);
select title, publication_date
from books
where publication_date > '2020/01/01'
INDEX SCAN
[Diagram: pages M1, T1-T7 (Table Heap) and I1-I3 (Index Storage).]
1. Uses the index to find the first matching entry and the table page it points to…
2. Positions the read cursor on that page…
3. Reads the table's pages that the following index entries point to, until the condition no longer holds (sequential I/O only if rows are stored in index order)
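The buffers option shows how many pages a plan actually touched, which makes the cost of following index entries into the heap visible. A sketch; whether an Index Scan is chosen depends on table size and on how selective the condition is.

```sql
explain (analyze, buffers)
select title, publication_date
from books
where publication_date > '2020-01-01';
```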
create index on books
using gist(description_lex);
select title, publication_date
from books
where description_lex @@ 'epic'
BITMAP SCAN
[Diagram: pages M1, T1-T7 (Table Heap) and I1-I3 (Index Storage), plus a Bitmap marking the matching table pages.]
1. Using the index, create a bitmap of matching pages
2. Random I/O over pages covered by the bitmap
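Bitmaps from several indexes can also be combined before the heap is visited. A sketch with two hypothetical single-column indexes; the planner may or may not choose a BitmapOr depending on statistics.

```sql
-- hypothetical indexes, named here only for illustration
create index ix_books_author  on books(author_id);
create index ix_books_pubdate on books(publication_date);

-- each index can produce a bitmap; the bitmaps are OR-ed, then the heap is read once
explain
select title
from books
where author_id = 10
   or publication_date > '2020-01-01';
```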
INCLUDE & PARTIAL INDEXES
create index ix_books_by_author
on books(author_id)
include (created_at)
where author_id is not null;
[Diagram: Table Heap leaf pages hold keys 4, 7, 13, 16, 19, 25, 32, 47, 61 with their row DATA. Index entries 4, 16, 25, 32 each hold the key, a TID and INC: the included columns, duplicated from the table.]
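Two queries that show what the index ix_books_by_author can and cannot serve. A sketch; the actual plans depend on table statistics and the visibility map.

```sql
-- key and included column are both stored in the index,
-- so this can be answered by an index-only scan
explain select author_id, created_at from books where author_id = 42;

-- outside the partial index predicate: rows with a NULL author are not indexed
explain select * from books where author_id is null;
```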
BTREE INDEX
create index ix_users_birthdate on users(birthdate desc);
1. Default
2. Access time: always O(log N)
3. Supports most of the index features
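Because a B-Tree keeps keys sorted, it can also serve order by. A sketch against the index ix_users_birthdate; the planner decides whether the sort step is actually avoided.

```sql
-- matches the index order (birthdate desc), so a separate sort can be skipped
explain select * from users order by birthdate desc limit 10;

-- the same index can be walked backwards for the opposite order
explain select * from users order by birthdate asc limit 10;
```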
HASH INDEX
create index ix_orders_no on orders using hash(order_no);
[Diagram: a Meta page, bucket pages (Bucket 0, Bucket 1), an Overflow page and a Bitmap page. Hash codes (0x01, 0x02) select a bucket; bucket and overflow pages store entries together with their TIDs.]
1. Access time: O(1) – O(N)
2. Doesn't shrink in size*
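A hash index only helps with equality. A sketch assuming order_no is a text column and 'A-1001' is a made-up value:

```sql
-- equality is the only operator a hash index supports
explain select * from orders where order_no = 'A-1001';

-- a range predicate cannot use ix_orders_no
explain select * from orders where order_no > 'A-1001';

-- on-disk size of the index
select pg_size_pretty(pg_relation_size('ix_orders_no'));
```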
BRIN INDEX
create index ih_events_created_at on events
using brin(created_at) with (pages_per_range = 128);
block range  min         max
0..127       2020/01/01  2020/01/21
128..255     2020/01/21  2020/03/14
256..383     2020/02/09  2020/07/28
384..511     2020/03/17  2020/11/10
[Diagram: upper-level summaries 2020/01/01 to 2020/03/14 and 2020/02/09 to 2020/11/10 link down to these block ranges.]
COLUMN-TUPLE CORRELATION
select tablename, attname, correlation
from pg_stats
where tablename = 'film'
tablename  attname       correlation
film       film_id       0.9979791
film       title         0.9979791
film       description   0.04854498
film       release_year  1
film       rating        0.1953281
film       last_update   1
film       fulltext      <null>
1. Imprecise
2. Very small in size
3. Good for columns aligned with tuple insert order and immutable records
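The size and imprecision claims can be checked directly. A sketch: the btree index ix_events_created_at_btree is hypothetical and created only for comparison, and the date range is an arbitrary example.

```sql
-- hypothetical btree on the same column, for size comparison only
create index ix_events_created_at_btree on events(created_at);

select relname, pg_size_pretty(pg_relation_size(oid)) as size
from pg_class
where relname in ('ih_events_created_at', 'ix_events_created_at_btree');

-- summarise block ranges that were added since the last vacuum
select brin_summarize_new_values('ih_events_created_at');

-- BRIN returns whole block ranges, so matching rows are rechecked in the heap
explain (analyze)
select count(*)
from events
where created_at >= '2020-03-01' and created_at < '2020-04-01';
```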
BLOOM INDEX
create index ix_active_codes
on active_codes using bloom(keycode)
with (length=80, col1=2);
BLOOM FILTER
INSERT(C) = h1(C) mod len | h2(C) mod len .. | hn(C) mod len
INSERT(D) = h1(D) mod len | h2(D) mod len .. | hn(D) mod len
CONTAINS(A) = h1(A) mod len | h2(A) mod len .. | hn(A) mod len → FALSE
CONTAINS(B) = h1(B) mod len | h2(B) mod len .. | hn(B) mod len → MAYBE?
1. Small in size
2. Good for exclusion/narrowing
3. False positive ratio: hur.st/bloomfilter/
create extension bloom;
create index ix_active_codes
on active_codes using bloom(keycode)
with (length=80, col1=2);
-- length: number of bits per record
-- col1..colN: number of hashes for each column
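A bloom index is most useful when queries test arbitrary combinations of several columns. A sketch with a hypothetical table codes_demo; the parameter values are illustrative, not tuned.

```sql
create extension if not exists bloom;

-- hypothetical three-column table
create table codes_demo (a int, b int, c int);

create index ix_codes_demo_bloom on codes_demo
  using bloom (a, b, c)
  with (length = 80, col1 = 2, col2 = 2, col3 = 4);

-- any subset of the indexed columns can be tested for equality;
-- false positives are filtered out by a recheck against the heap
explain select * from codes_demo where b = 5 and c = 7;
```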
GiST INDEX
create index ix_books_content on books using gist(content_lex);
GiST INDEX
GEO POINTS
image: https://postgrespro.com/blog/pgsql/4175817
GiST INDEX
TSVECTOR
-- gist cannot be applied directly on text columns
alter table film add column
description_lex tsvector
generated always as (to_tsvector('english', description))
stored;
create index idx_film_description_lex
on film using gist(description_lex);
select * from film where description_lex @@ 'epic';
Query Plan
Bitmap Heap Scan on film (cost=4.18..20.32 rows=5 width=416)
  Recheck Cond: (description_lex @@ '''epic'''::tsquery)
  -> Bitmap Index Scan on idx_film_description_lex (cost=0.00..4.18 rows=5 width=0)
       Index Cond: (description_lex @@ '''epic'''::tsquery)
GiST INDEX
TSVECTOR
[Diagram: a tree of pages, each with a HEADER. Inner pages hold bit signatures (011011, 110111 | 010011, 011010 | 100110, 110001). Leaf pages hold the indexed entries with their TIDs: aab, aac, aba | adf, azf, bac | brc, caa, cdl | cff, fre, klm.]
1. Supports specialized operators
2. Index is not updated during deletes
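One example of such specialized operators, on the geo points case mentioned earlier. A sketch with a hypothetical table places; the coordinates are arbitrary.

```sql
-- hypothetical table of named locations
create table places (name text, location point);
create index ix_places_location on places using gist(location);

-- nearest-neighbour search: <-> is the distance operator
select name
from places
order by location <-> point '(52.2, 21.0)'
limit 5;

-- containment: points inside a box
select name
from places
where location <@ box '((0,0),(10,10))';
```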
SP-GiST INDEX
create index ix_files_path on files using spgist(path);
SP-GiST INDEX
GEO POINTS
image: https://postgrespro.com/blog/pgsql/4220639
SP-GiST INDEX
TEXT
-- spgist can be created on text column but not on nvarchar
create index idx_film_title on film using spgist(title);
select * from film
where title like 'A Fast-Paced% in New Orleans';
Query Plan
Bitmap Heap Scan on film (cost=8.66..79.03 rows=51 width=416)
  Filter: (description ~~ 'A Fast-Paced%'::text)
  -> Bitmap Index Scan on idx_film_title (cost=0.00..8.64 rows=50 width=0)
       Index Cond: ((description ~>=~ 'A Fast-Paced'::text) AND (description ~<~ 'A Fast-Pacee'::text))
1. Just like GiST, but faster for some ops…
2. … but unable to perform some others
3. Indexed space is partitioned into non-overlapping regions
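Prefix search is a typical fit for SP-GiST on text. A sketch assuming path is a text column and PostgreSQL 11 or later, where the starts-with operator ^@ is available; '/var/log/' is an arbitrary example prefix.

```sql
-- prefix search with the starts-with operator
explain select * from files where path ^@ '/var/log/';
```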
GIN INDEX
create index ix_books_content on books using gin(content_lex);
GIN INDEX
-- gin cannot be applied directly on text columns
alter table film add column
description_lex tsvector
generated always as (to_tsvector('english', description))
stored;
create index idx_film_description_lex
on film using gin(description_lex);
select * from film where description_lex @@ 'epic';
Query Plan
Bitmap Heap Scan on film (cost=8.04..24.18 rows=5 width=416)
  Recheck Cond: (description_lex @@ '''epic'''::tsquery)
  -> Bitmap Index Scan on idx_film_description_lex (cost=0.00..8.04 rows=5 width=0)
       Index Cond: (description_lex @@ '''epic'''::tsquery)
GIN INDEX
[Diagram: an entry tree over the indexed words (call, dome, ever, faucet, gather, gone, leather, omit). Each word points to the TIDs of the rows containing it: either a short Posting list stored with the entry, e.g. call: (1,1), (3,1), or a separate Posting tree of TID pages when the list grows large.]
1. Reads usually faster than GiST
2. Writes are usually slower than GiST
3. Index size greater than GiST
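These trade-offs can be measured on your own data. A sketch that builds both index types on the film.description_lex column from the earlier slides; the index names are hypothetical.

```sql
-- build both index types on the same tsvector column
create index ix_film_lex_gist on film using gist(description_lex);
create index ix_film_lex_gin  on film using gin(description_lex);

select relname, pg_size_pretty(pg_relation_size(oid)) as size
from pg_class
where relname in ('ix_film_lex_gist', 'ix_film_lex_gin');

-- disable the pending list: inserts update the index immediately
-- (slower writes, but reads no longer have to scan the pending list)
alter index ix_film_lex_gin set (fastupdate = off);
```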
RUM INDEX
create index ix_books_content on books using rum(content_lex);
RUM INDEX
[Diagram: the same layout as GIN (entry tree over the words call, dome, ever, faucet, gather, gone, leather, omit, with posting lists and posting trees of TIDs), with additional information such as word positions stored next to each TID.]
RUM INDEX
-- similarity ranking
select description_lex <=> to_tsquery('epic') as similarity
from books;
-- find description with 2 words located one after another
select * from books
where description_lex @@ to_tsquery('hello <-> world');
1. GIN on steroids (bigger but more capable)
2. Allows querying for terms and their relative positions in text
3. Supports Index Scan and EXCLUDE
create extension rum; -- required before creating the index
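Since RUM supports ordered index scans, ranking can be pushed into the index. A sketch assuming the rum extension is installed and books has a tsvector column description_lex with a RUM index on it; <=> returns a distance, so smaller values rank first.

```sql
select title,
       description_lex <=> to_tsquery('english', 'epic') as distance
from books
where description_lex @@ to_tsquery('english', 'epic')
order by description_lex <=> to_tsquery('english', 'epic')
limit 10;
```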
SUMMARY
THANK YOU
