PostgreSQL
Database Administrator- Part4
By Poguttu
Database
Maintenance &
Performance
and
Concurrency
Day-4
• PostgreSQL Tuning and Performance
• Find and Tune Slow Running Queries
• Collecting regular statistics from pg_stat* views
• Finding out what makes SQL slow
• Speeding up queries without rewriting them
• Discovering why a query is not using an index
• Forcing a query to use an index
• EXPLAIN and SQL Execution
• Workload Analysis
database_info.sql
SELECT db.datname, au.rolname AS datdba,
       pg_encoding_to_char(db.encoding) AS encoding,
       db.datallowconn, db.datconnlimit, db.datfrozenxid,
       tb.spcname AS tblspc,
       -- db.datconfig,
       db.datacl
  FROM pg_database db
  JOIN pg_authid au ON au.oid = db.datdba
  JOIN pg_tablespace tb ON tb.oid = db.dattablespace
 ORDER BY 1;
database_sizes.sql
SELECT datname,
pg_size_pretty(pg_database_size(datname))as size_pretty,
pg_database_size(datname) as size,
(SELECT pg_size_pretty (SUM( pg_database_size(datname))::bigint)
FROM pg_database) AS total,
((pg_database_size(datname) / (SELECT SUM( pg_database_size(datname))
FROM pg_database) ) *
100)::numeric(6,3) AS pct
FROM pg_database ORDER BY datname;
blocked_transactions.sql
/* Requires PostgreSQL 9.2 through 9.5. The waiting column was removed in 9.6,
   where wait_event_type and wait_event replace it. */
SELECT w.query AS waiting_query, w.pid AS w_pid, w.usename AS w_user,
       l.query AS locking_query, l.pid AS l_pid, l.usename AS l_user,
       t.schemaname || '.' || t.relname AS tablename
  FROM pg_stat_activity w
  JOIN pg_locks l1 ON (w.pid = l1.pid AND NOT l1.granted)
  JOIN pg_locks l2 ON (l1.relation = l2.relation AND l2.granted)
  JOIN pg_stat_activity l ON (l2.pid = l.pid)
  JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
 WHERE w.waiting;
cache_hit_ratio.sql
SELECT pg_stat_database.datname, pg_stat_database.blks_read, pg_stat_database.blks_hit,
round((pg_stat_database.blks_hit::double precision / (pg_stat_database.blks_read
+ pg_stat_database.blks_hit
+1)::double precision * 100::double precision)::numeric, 2) AS
cachehitratio
FROM pg_stat_database
WHERE pg_stat_database.datname !~ '^(template(0|1)|postgres)$'::text
ORDER BY round((pg_stat_database.blks_hit::double precision
/ (pg_stat_database.blks_read
+ pg_stat_database.blks_hit
+ 1)::double precision * 100::double precision)::numeric, 2) DESC;
connection_counts.sql
SELECT COUNT(*) FROM pg_stat_activity;
SELECT usename, count(*) FROM pg_stat_activity GROUP BY 1 ORDER BY 1;
SELECT datname, usename, count(*) FROM pg_stat_activity GROUP BY 1, 2 ORDER BY 1, 2;
SELECT usename, datname, count(*) FROM pg_stat_activity GROUP BY 1, 2 ORDER BY 1, 2;
current_locks.sql
/* For PostgreSQL 9.1 or earlier. On 9.2 or later, pg_stat_activity.procpid
   is named pid. */
SELECT database, relation, n.nspname, c.relname, pid, a.usename,
       locktype, mode, granted, tuple
  FROM pg_locks l
  JOIN pg_class c ON (c.oid = l.relation)
  JOIN pg_namespace n ON (n.oid = c.relnamespace)
  JOIN pg_stat_activity a ON (a.procpid = l.pid)
 ORDER BY database, relation, pid;
current_queries.sql
/* For PostgreSQL 9.1 or earlier (procpid, current_query). On 9.2 or later,
   use current_queries_blocked.sql instead. */

-- All other sessions, with the session blocking each one (if any)
SELECT a.datname, a.procpid AS pid,
       CASE WHEN a.client_addr IS NULL
            THEN 'local'
            ELSE a.client_addr::text
       END AS client_addr,
       a.usename AS user, a.waiting, l.procpid AS blocking_pid, l.usename AS blocking_user,
       a.current_query, a.query_start, current_timestamp - a.query_start AS duration
  FROM pg_stat_activity a
  LEFT JOIN pg_locks l1 ON (a.procpid = l1.pid AND NOT l1.granted)
  LEFT JOIN pg_locks l2 ON (l1.relation = l2.relation AND l2.granted)
  LEFT JOIN pg_stat_activity l ON (l2.pid = l.procpid)
  LEFT JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
 WHERE pg_backend_pid() <> a.procpid
 ORDER BY a.datname, a.query_start;

-- Only the waiting sessions, with the query that holds the lock
SELECT w.current_query AS waiting_query, w.procpid AS w_pid, w.usename AS w_user,
       l.current_query AS locking_query, l.procpid AS l_pid, l.usename AS l_user,
       t.schemaname || '.' || t.relname AS tablename
  FROM pg_stat_activity w
  JOIN pg_locks l1 ON (w.procpid = l1.pid AND NOT l1.granted)
  JOIN pg_locks l2 ON (l1.relation = l2.relation AND l2.granted)
  JOIN pg_stat_activity l ON (l2.pid = l.procpid)
  JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
 WHERE w.waiting;

-- All other sessions, flagged BLOCKED when waiting on a lock
SELECT datname, procpid AS pid, client_addr, usename AS user, current_query,
       CASE WHEN waiting = TRUE
            THEN 'BLOCKED'
            ELSE 'no'
       END AS waiting,
       query_start, current_timestamp - query_start AS duration
  FROM pg_stat_activity
 WHERE pg_backend_pid() <> procpid
 ORDER BY datname, query_start;
current_queries_blocked.sql
/* Use for PostgreSQL 9.2 through 9.5. The waiting column was removed in 9.6,
   where wait_event_type and wait_event replace it. */
SELECT c.datname, c.pid AS pid, c.client_addr, c.usename AS user, c.query,
       CASE WHEN c.waiting = TRUE
            THEN 'BLOCKED'
            ELSE 'no'
       END AS waiting,
       l.pid AS blocked_by,
       current_timestamp - c.query_start AS duration
  FROM pg_stat_activity c
  LEFT JOIN pg_locks l1 ON (c.pid = l1.pid AND NOT l1.granted)
  LEFT JOIN pg_locks l2 ON (l1.relation = l2.relation AND l2.granted)
  LEFT JOIN pg_stat_activity l ON (l2.pid = l.pid)
  LEFT JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
 WHERE pg_backend_pid() <> c.pid
 ORDER BY c.datname, c.query_start;
most_active_tables.sql
SELECT schemaname, relname, idx_tup_fetch + seq_tup_read as TotalReads
FROM pg_stat_all_tables WHERE idx_tup_fetch + seq_tup_read != 0
AND schemaname NOT IN ( 'pg_catalog', 'pg_toast' )
order by TotalReads desc LIMIT 10;
pg_runtime.sql
SELECT pg_postmaster_start_time() as pg_start, current_timestamp - pg_postmaster_start_time()
as runtime;
table_row_counts.sql
SELECT s.nspname, c.relname as table, c.reltuples::int4 as rows FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace s ON (c.relnamespace = s.oid) WHERE relkind = 'r'
AND c.reltuples::int4 > 0 ORDER BY rows DESC;
pg_stat_all_tables.sql
SELECT n.nspname, s.relname, c.reltuples::bigint,-- n_live_tup,
n_tup_ins, n_tup_upd, n_tup_del,
date_trunc('second', last_vacuum) as last_vacuum,
date_trunc('second', last_autovacuum) as last_autovacuum, date_trunc('second',
last_analyze) as last_analyze, date_trunc('second', last_autoanalyze) as last_autoanalyze
, round( current_setting('autovacuum_vacuum_threshold')::integer +
current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples) AS av_threshold
/* ,CASE WHEN reltuples > 0
THEN round(100.0 * n_dead_tup / (reltuples))
ELSE 0
END AS pct_dead,
CASE WHEN n_dead_tup > round( current_setting('autovacuum_vacuum_threshold')::integer +
current_setting('autovacuum_vacuum_scale_factor')::numeric * C.reltuples)
THEN 'VACUUM'
ELSE 'ok'
END AS "av_needed"
*/
FROM pg_stat_all_tables s
JOIN pg_class c ON c.oid = s.relid
JOIN pg_namespace n ON (n.oid = c.relnamespace)
WHERE s.relname NOT LIKE 'pg_%'
AND s.relname NOT LIKE 'sql_%'
-- AND s.relname LIKE '%TBL%'
ORDER by 1, 2;
pg_stat_all_indexes.sql
SELECT n.nspname as schema, i.relname as table, i.indexrelname as index,
i.idx_scan, i.idx_tup_read, i.idx_tup_fetch,
CASE WHEN idx.indisprimary
THEN 'pkey'
WHEN idx.indisunique
THEN 'uidx'
ELSE 'idx'
END AS type,
CASE WHEN idx.indisvalid
THEN 'valid'
ELSE 'INVALID'
END as statusi,
pg_relation_size( quote_ident(n.nspname) || '.' || quote_ident(i.relname) ) as size_in_bytes,
pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' || quote_ident(i.relname) )) as
size
FROM pg_stat_all_indexes i
JOIN pg_class c ON (c.oid = i.relid)
JOIN pg_namespace n ON (n.oid = c.relnamespace)
JOIN pg_index idx ON (idx.indexrelid = i.indexrelid )
WHERE i.relname LIKE '%%'
AND n.nspname NOT LIKE 'pg_%'
-- AND idx.indisunique = TRUE
-- AND NOT idx.indisprimary
-- AND i.indexrelname LIKE 'tmp%'
-- AND idx.indisvalid IS false
/* AND NOT idx.indisprimary
AND NOT idx.indisunique
AND idx_scan = 0
*/ ORDER BY 1, 2, 3;
Standard Statistics Views
pg_stat_activity - One row per server process, showing information related to the current activity of that process, such as state and current query.
pg_stat_archiver - One row only, showing statistics about the WAL archiver process's activity.
pg_stat_bgwriter - One row only, showing statistics about the background writer process's activity.
pg_stat_database - One row per database, showing database-wide statistics.
pg_stat_all_tables - One row for each table in the current database, showing statistics about accesses to that specific table.
pg_stat_sys_tables - Same as pg_stat_all_tables, except that only system tables are shown.
pg_stat_user_tables - Same as pg_stat_all_tables, except that only user tables are shown.
pg_stat_xact_all_tables - Similar to pg_stat_all_tables, but counts actions taken so far within the current transaction (which are not yet included in pg_stat_all_tables and related views). The columns for numbers of live and dead rows and vacuum and analyze actions are not present in this view.
pg_stat_xact_sys_tables - Same as pg_stat_xact_all_tables, except that only system tables are shown.
pg_stat_xact_user_tables - Same as pg_stat_xact_all_tables, except that only user tables are shown.
pg_statio_user_sequences - Same as pg_statio_all_sequences, except that only user sequences are shown.
pg_stat_user_functions - One row for each tracked function, showing statistics about executions of that function.
pg_stat_xact_user_functions - Similar to pg_stat_user_functions, but counts only calls during the current transaction (which are not yet included in pg_stat_user_functions).
pg_statio_sys_indexes - Same as pg_statio_all_indexes, except that only indexes on system tables are shown.
pg_statio_user_indexes - Same as pg_statio_all_indexes, except that only indexes on user tables are shown.
pg_statio_all_sequences - One row for each sequence in the current database, showing statistics about I/O on that specific sequence.
pg_statio_sys_sequences - Same as pg_statio_all_sequences, except that only system sequences are shown. (Presently, no system sequences are defined, so this view is always empty.)
pg_stat_replication - One row per WAL sender process, showing statistics about replication to that sender's connected standby server.
pg_stat_database_conflicts - One row per database, showing database-wide statistics about query cancels due to conflict with recovery on standby servers.
pg_stat_all_indexes - One row for each index in the current database, showing statistics about accesses to that specific index.
pg_stat_sys_indexes - Same as pg_stat_all_indexes, except that only indexes on system tables are shown.
pg_stat_user_indexes - Same as pg_stat_all_indexes, except that only indexes on user tables are shown.
pg_statio_all_tables - One row for each table in the current database, showing statistics about I/O on that specific table.
pg_statio_sys_tables - Same as pg_statio_all_tables, except that only system tables are shown.
pg_statio_user_tables - Same as pg_statio_all_tables, except that only user tables are shown.
pg_statio_all_indexes - One row for each index in the current database, showing statistics about I/O on that specific index.
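As a rough illustration of how the I/O views listed above can be queried, the sketch below reads pg_statio_user_tables to estimate a per-table heap cache hit percentage. The alias heap_hit_pct and the limit of 10 rows are illustrative choices, not part of the original slides.

```sql
-- Per-table heap cache hit percentage from pg_statio_user_tables.
-- NULLIF avoids a division by zero for tables that have never been read.
SELECT schemaname,
       relname,
       heap_blks_read,
       heap_blks_hit,
       round(100.0 * heap_blks_hit
             / NULLIF(heap_blks_hit + heap_blks_read, 0), 2) AS heap_hit_pct
  FROM pg_statio_user_tables
 ORDER BY heap_blks_read DESC
 LIMIT 10;
```

Tables at the top of this list cause the most physical reads, so they are reasonable first candidates when looking at slow queries.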
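gen_table_compare_counts.sql
/* Generates one SELECT per table that compares the estimated row count
   (pg_class.reltuples) with an exact COUNT(*). */
SELECT 'SELECT ''' || c.relname || ''' as table, ' || c.reltuples::int4 || ' as rows, ' ||
       '(SELECT COUNT(*) FROM ' || s.nspname || '."' || c.relname || '") as cnt;'
  FROM pg_catalog.pg_class c
  -- The source is cut off after the FROM clause. The join and filter below are
  -- an assumption, modelled on table_row_counts.sql.
  JOIN pg_catalog.pg_namespace s ON (c.relnamespace = s.oid)
 WHERE c.relkind = 'r';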
What is Vacuum?
• Vacuum does the following:
• #1: Gathers table and index statistics
• #2: Reorganizes the table
• #3: Cleans up dead rows in tables and indexes
• #4: Freezes old row XIDs to prevent XID wraparound
• #1 and #2 are generally required for DBMS management. #3 and #4 are necessary because of the PostgreSQL MVCC feature.
VACUUM
• Restructures pages and reclaims space taken by dead rows (rows that were deleted before any of the current transactions started)
• Removes dead rows from indexes and TOAST tables
• Long-running transactions can hold all of this back (including long transactions on a replica if hot_standby_feedback = on)
• Truncates the table if possible
• Updates the free space map
• Done regularly, it avoids the need for VACUUM FULL
NOT NEEDED:
• On a replica
• After TRUNCATE
VACUUM FULL
• Shrinks the table (rewrites all "alive" tuples into a new file as compactly as possible)
• Can only be launched manually (not by autovacuum)
• OID of the relation stays the same, relfilenode (on-disk name) changes
Cons:
• ACCESS EXCLUSIVE lock (no reading or writing allowed)
• table size ≤ needed space ≤ table size * 2
• Indexes are rebuilt as part of the rewrite (on PostgreSQL 9.0 or later no separate REINDEX is needed)
• Takes a long time
Alternative:
pg_repack - does allow reads and writes, but needs more space (≥ table size * 2)
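The difference between VACUUM and VACUUM FULL can be observed on a scratch table. The following is a minimal sketch, assuming a throwaway database. The table vacuum_demo and its contents are hypothetical and exist only for this illustration. Run the statements outside a transaction block, because VACUUM cannot run inside one.

```sql
-- vacuum_demo is a hypothetical scratch table used only for this illustration.
CREATE TABLE vacuum_demo (id integer, payload text);
INSERT INTO vacuum_demo
SELECT g, repeat('x', 100) FROM generate_series(1, 100000) AS g;

-- Delete every second row so that half of each page is dead.
DELETE FROM vacuum_demo WHERE id % 2 = 0;

-- Before: note the oid, the relfilenode and the on-disk size.
SELECT oid, relfilenode, pg_size_pretty(pg_relation_size(oid)) AS size
  FROM pg_class
 WHERE oid = 'vacuum_demo'::regclass;

-- Plain VACUUM makes the dead rows' space reusable; the file usually keeps its size.
VACUUM vacuum_demo;

-- VACUUM FULL rewrites the live rows into a new file under an ACCESS EXCLUSIVE lock.
VACUUM FULL vacuum_demo;

-- After: same oid, different relfilenode, smaller size.
SELECT oid, relfilenode, pg_size_pretty(pg_relation_size(oid)) AS size
  FROM pg_class
 WHERE oid = 'vacuum_demo'::regclass;

DROP TABLE vacuum_demo;
```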
The following changes make autovacuum run more frequently.
First, enable logging for the autovacuum process:
log_autovacuum_min_duration = 0
Increase the number of workers and check tables more often:
autovacuum_max_workers = 6
autovacuum_naptime = 15s
Lower the thresholds and scale factors so that vacuum and analyze trigger sooner:
autovacuum_vacuum_threshold = 25
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_threshold = 10
autovacuum_analyze_scale_factor = 0.05
Raise the cost limit so that autovacuum pauses less often:
autovacuum_vacuum_cost_delay = 10ms
autovacuum_vacuum_cost_limit = 1000
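To see what these numbers mean for one table: autovacuum vacuums a table once its dead rows exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples, the same expression the pg_stat_all_tables.sql script computes as av_threshold. With the values above, a table of 1,000,000 rows is vacuumed after roughly 100,025 dead rows. The sketch below works through that arithmetic and shows a per-table override. The table name orders and the override values 0.01 and 1000 are hypothetical, chosen only for illustration.

```sql
-- Dead-row count that triggers autovacuum on a 1,000,000-row table
-- with threshold = 25 and scale_factor = 0.1.
SELECT 25 + 0.1 * 1000000 AS dead_rows_to_trigger;

-- Hypothetical large table "orders": override the global values for this
-- table only, so that it is vacuumed after about 1% dead rows instead of 10%.
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold = 1000
);

-- Per-table overrides are stored in pg_class.reloptions.
SELECT relname, reloptions
  FROM pg_class
 WHERE relname = 'orders';
```

Per-table settings are often a gentler option than lowering the global scale factor, because small tables keep the default behaviour.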
Script to check the status of AutoVacuum for all Tables
SELECT
schemaname
,relname
,n_live_tup
,n_dead_tup
,last_autovacuum
FROM pg_stat_all_tables
ORDER BY n_dead_tup
/(n_live_tup
* current_setting('autovacuum_vacuum_scale_factor')::float8
+ current_setting('autovacuum_vacuum_threshold')::float8)
DESC;
Controlling automatic database maintenance
Autovacuum is enabled by default in PostgreSQL and mostly does a great job of maintaining your PostgreSQL database. We say mostly because
it doesn't know everything you do about the database, such as the best time to perform maintenance actions. Let's explore the settings that
can be tuned so that you can use vacuums efficiently.
Exercising control requires some thinking about what you actually want:
What are the best times of day to do things? When are system resources more available?
Which days are quiet, and which are not?
Which tables are critical to the application, and which are not?
Perform the following steps:
The first thing to do is make sure that autovacuum is switched on, which is the default. Check that you have the following parameters
enabled in your postgresql.conf file:
autovacuum = on
track_counts = on
PostgreSQL controls autovacuum with more than 40 individually tunable parameters that provide a wide range of...
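The values currently in effect can also be checked from a running session rather than from the configuration file. A minimal sketch, using the pg_settings view:

```sql
-- Autovacuum-related settings in effect, and where each value comes from
-- (default, configuration file, ALTER SYSTEM, and so on).
SELECT name, setting, unit, source
  FROM pg_settings
 WHERE name LIKE 'autovacuum%'
    OR name = 'track_counts'
 ORDER BY name;
```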
Removing issues that cause bloat
Bloat can be caused by long-running queries or long-running write transactions that execute alongside write-heavy workloads.
Resolving that is mostly down to understanding the workloads running on the server.
Look at the age of the oldest snapshots that are running, like this:
postgres=# SELECT now() -
             CASE
               WHEN backend_xid IS NOT NULL
               THEN xact_start
               ELSE query_start END
             AS age
           , pid, backend_xid AS xid, backend_xmin AS xmin, state
           FROM pg_stat_activity
           WHERE backend_type = 'client backend'
           ORDER BY 1 DESC;
age | pid | xid | xmin | state
----------------+-------+----------+----------+------------------
00:00:25.791098 | 27624 | | 10671262 | active
00:00:08.018103 | 27591 | | | idle in transaction
00:00:00.002444 | 27630 | 10703641 | 10703639 | active
00:00:00.001506 | 27631 | 10703642 | 10703640 | active
00:00:00.000324 | 27632 | 10703643 | 10703641 | active
00:00:00...
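Sessions that sit idle inside an open transaction are a common source of old snapshots. The following sketch lists them. The five-minute cutoff and the placeholder pid 12345 are assumptions for illustration, not values from the original material.

```sql
-- Sessions that have been idle inside an open transaction for a while.
SELECT pid,
       usename,
       state,
       now() - xact_start AS xact_age,
       left(query, 60) AS last_query
  FROM pg_stat_activity
 WHERE state = 'idle in transaction'
   AND now() - xact_start > interval '5 minutes'
 ORDER BY xact_age DESC;

-- After reviewing a session, it can be ended by pid (12345 is a placeholder):
-- SELECT pg_terminate_backend(12345);
```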
Identifying and fixing bloated tables and indexes
PostgreSQL implements Multiversion Concurrency Control (MVCC), which allows users to read data at the same
time as writers make changes.
This is an important feature for concurrency in database applications, as it can allow the following:
Better performance because of fewer locks
Greatly reduced deadlocking
Simplified application design and management
Bloated tables and indexes are a natural consequence of the MVCC design in PostgreSQL. Bloat is caused mainly by
updates, as both the old and new row versions must be retained for a certain period of time.
Bloating results in increased disk consumption, as well as performance loss—if a table is twice as big as it
should be, scanning it takes twice as long. VACUUM is one of the best ways of removing bloat.
Many users execute VACUUM far too frequently, while at the same time complaining about the cost of doing
so. This recipe is all about understanding when you need to run VACUUM by estimating the amount of bloat...
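A quick first approximation of bloat can be taken from the dead-row counters in pg_stat_user_tables. This is only a rough proxy based on collected statistics, not an exact measurement. The alias dead_pct and the limit of 20 rows are illustrative choices.

```sql
-- Tables with the most dead rows, and the share of dead rows in each.
SELECT schemaname,
       relname,
       n_live_tup,
       n_dead_tup,
       round(100.0 * n_dead_tup
             / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
  FROM pg_stat_user_tables
 WHERE n_dead_tup > 0
 ORDER BY n_dead_tup DESC
 LIMIT 20;
```

A high dead_pct on a large table that has not been vacuumed recently suggests a VACUUM is worthwhile. A low value suggests it is not.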
Monitoring and tuning a vacuum
If you're currently waiting for a long-running vacuum (or autovacuum) to finish, go straight to the How to do it... section.
If you've just had a long-running vacuum complete, then you may want to think about setting a few parameters.
autovacuum_max_workers should always be set to more than 2. Setting it too high may not be very useful, and so you need to be careful.
Setting vacuum_cost_delay too high is counterproductive. VACUUM is your friend, not your enemy, so delaying it until it doesn't happen at all just makes
things worse.
maintenance_work_mem should be set to anything up to 1 GB, according to how much memory you can allocate to this task at this time.
Let's watch what happens when we run a large VACUUM. Don't run VACUUM FULL, because it runs for a long time while holding an AccessExclusiveLock
on the table.
First, locate which process is running the VACUUM by using the pg_stat_activity view to identify the specific pid (34399 is just an example...
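One way to follow a running vacuum, assuming PostgreSQL 9.6 or later, is the pg_stat_progress_vacuum view. The sketch below first finds vacuum processes in pg_stat_activity and then reports how far each heap scan has progressed. The alias scanned_pct is illustrative.

```sql
-- Find sessions that are running a manual VACUUM or an autovacuum worker.
SELECT pid, query
  FROM pg_stat_activity
 WHERE query ILIKE 'vacuum%'
    OR query LIKE 'autovacuum:%';

-- Progress of vacuums running in the current database (PostgreSQL 9.6+).
SELECT p.pid,
       p.relid::regclass AS table_name,
       p.phase,
       p.heap_blks_total,
       p.heap_blks_scanned,
       p.heap_blks_vacuumed,
       round(100.0 * p.heap_blks_scanned
             / NULLIF(p.heap_blks_total, 0), 1) AS scanned_pct
  FROM pg_stat_progress_vacuum p
 WHERE p.datname = current_database();
```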
test=# SELECT oid::regclass::text AS table,
              age(relfrozenxid) AS xid_age,
              mxid_age(relminmxid) AS mxid_age,
              least(
                (SELECT setting::int FROM pg_settings
                  WHERE name = 'autovacuum_freeze_max_age') - age(relfrozenxid),
                (SELECT setting::int FROM pg_settings
                  WHERE name = 'autovacuum_multixact_freeze_max_age') - mxid_age(relminmxid)
              ) AS tx_before_wraparound_vacuum,
              pg_size_pretty(pg_total_relation_size(oid)) AS size,
              pg_stat_get_last_autovacuum_time(oid) AS last_autovacuum
         FROM pg_class
        WHERE relfrozenxid != 0 AND oid > 16384
        ORDER BY tx_before_wraparound_vacuum;
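The same check can be made per database, which is a quicker first look on a cluster with many tables. A minimal sketch, reusing the tx_before_wraparound_vacuum naming from the query above:

```sql
-- XID age per database and the distance to an anti-wraparound autovacuum.
SELECT datname,
       age(datfrozenxid) AS xid_age,
       current_setting('autovacuum_freeze_max_age')::int
         - age(datfrozenxid) AS tx_before_wraparound_vacuum
  FROM pg_database
 ORDER BY xid_age DESC;
```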
Database Maintenance
Maintenance Tools
Optimizer Statistics
Demo - Optimizer Statistics
Example - Updating Statistics
Data Fragmentation and Bloat
Routine Vacuuming
Vacuuming Commands
Vacuum and Vacuum Full
Demo - Vacuum Command
Vacuumdb Utility
Autovacuuming
Autovacuuming Parameters
Per-Table Thresholds
Routine Reindexing
When to Reindex
Demo - Reindexing

The State of Web3 Industry- Industry Report
Liveplex
 
Oracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI Professional
VICTOR MAESTRE RAMIREZ
 
Artificial Intelligence in the Nonprofit Boardroom.pdf
Artificial Intelligence in the Nonprofit Boardroom.pdf
OnBoard
 
Kubernetes Security Act Now Before It’s Too Late
Kubernetes Security Act Now Before It’s Too Late
Michael Furman
 
MuleSoft for AgentForce : Topic Center and API Catalog
MuleSoft for AgentForce : Topic Center and API Catalog
shyamraj55
 
Crypto Super 500 - 14th Report - June2025.pdf
Crypto Super 500 - 14th Report - June2025.pdf
Stephen Perrenod
 
Introduction to Typescript - GDG On Campus EUE
Introduction to Typescript - GDG On Campus EUE
Google Developer Group On Campus European Universities in Egypt
 
PyData - Graph Theory for Multi-Agent Integration
PyData - Graph Theory for Multi-Agent Integration
barqawicloud
 
Oracle Cloud and AI Specialization Program
Oracle Cloud and AI Specialization Program
VICTOR MAESTRE RAMIREZ
 
Reducing Conflicts and Increasing Safety Along the Cycling Networks of East-F...
Reducing Conflicts and Increasing Safety Along the Cycling Networks of East-F...
Safe Software
 
Edge-banding-machines-edgeteq-s-200-en-.pdf
Edge-banding-machines-edgeteq-s-200-en-.pdf
AmirStern2
 
FIDO Seminar: Authentication for a Billion Consumers - Amazon.pptx
FIDO Seminar: Authentication for a Billion Consumers - Amazon.pptx
FIDO Alliance
 
High Availability On-Premises FME Flow.pdf
High Availability On-Premises FME Flow.pdf
Safe Software
 
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
AmirStern2
 
Supporting the NextGen 911 Digital Transformation with FME
Supporting the NextGen 911 Digital Transformation with FME
Safe Software
 
Agentic AI: Beyond the Buzz- LangGraph Studio V2
Agentic AI: Beyond the Buzz- LangGraph Studio V2
Shashikant Jagtap
 
ENERGY CONSUMPTION CALCULATION IN ENERGY-EFFICIENT AIR CONDITIONER.pdf
ENERGY CONSUMPTION CALCULATION IN ENERGY-EFFICIENT AIR CONDITIONER.pdf
Muhammad Rizwan Akram
 
“From Enterprise to Makers: Driving Vision AI Innovation at the Extreme Edge,...
“From Enterprise to Makers: Driving Vision AI Innovation at the Extreme Edge,...
Edge AI and Vision Alliance
 
Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...
Integration of Utility Data into 3D BIM Models Using a 3D Solids Modeling Wor...
Safe Software
 
“Addressing Evolving AI Model Challenges Through Memory and Storage,” a Prese...
“Addressing Evolving AI Model Challenges Through Memory and Storage,” a Prese...
Edge AI and Vision Alliance
 
The State of Web3 Industry- Industry Report
The State of Web3 Industry- Industry Report
Liveplex
 
Oracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI Professional
VICTOR MAESTRE RAMIREZ
 
Artificial Intelligence in the Nonprofit Boardroom.pdf
Artificial Intelligence in the Nonprofit Boardroom.pdf
OnBoard
 
Kubernetes Security Act Now Before It’s Too Late
Kubernetes Security Act Now Before It’s Too Late
Michael Furman
 
MuleSoft for AgentForce : Topic Center and API Catalog
MuleSoft for AgentForce : Topic Center and API Catalog
shyamraj55
 
Crypto Super 500 - 14th Report - June2025.pdf
Crypto Super 500 - 14th Report - June2025.pdf
Stephen Perrenod
 
PyData - Graph Theory for Multi-Agent Integration
PyData - Graph Theory for Multi-Agent Integration
barqawicloud
 
Oracle Cloud and AI Specialization Program
Oracle Cloud and AI Specialization Program
VICTOR MAESTRE RAMIREZ
 
Reducing Conflicts and Increasing Safety Along the Cycling Networks of East-F...
Reducing Conflicts and Increasing Safety Along the Cycling Networks of East-F...
Safe Software
 
Edge-banding-machines-edgeteq-s-200-en-.pdf
Edge-banding-machines-edgeteq-s-200-en-.pdf
AmirStern2
 
FIDO Seminar: Authentication for a Billion Consumers - Amazon.pptx
FIDO Seminar: Authentication for a Billion Consumers - Amazon.pptx
FIDO Alliance
 
High Availability On-Premises FME Flow.pdf
High Availability On-Premises FME Flow.pdf
Safe Software
 
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
vertical-cnc-processing-centers-drillteq-v-200-en.pdf
AmirStern2
 
Supporting the NextGen 911 Digital Transformation with FME
Supporting the NextGen 911 Digital Transformation with FME
Safe Software
 
Ad

PostgreSQL Database Administration - Day 4

  • 2. Database Maintenance & Performance and Concurrency, Day-4
      • PostgreSQL Tuning and Performance
      • Find and Tune Slow Running Queries
      • Collecting regular statistics from pg_stat* views
      • Finding out what makes SQL slow
      • Speeding up queries without rewriting them
      • Discovering why a query is not using an index
      • Forcing a query to use an index
      • EXPLAIN and SQL Execution
      • Workload Analysis
  • 3. database_info.sql:
        SELECT db.datname,
               au.rolname AS datdba,
               pg_encoding_to_char(db.encoding) AS encoding,
               db.datallowconn,
               db.datconnlimit,
               db.datfrozenxid,
               tb.spcname AS tblspc,
               -- db.datconfig,
               db.datacl
          FROM pg_database db
          JOIN pg_authid au ON au.oid = db.datdba
          JOIN pg_tablespace tb ON tb.oid = db.dattablespace
         ORDER BY 1;

      database_sizes.sql:
        SELECT datname,
               pg_size_pretty(pg_database_size(datname)) AS size_pretty,
               pg_database_size(datname) AS size,
               (SELECT pg_size_pretty(SUM(pg_database_size(datname))::bigint)
                  FROM pg_database) AS total,
               ((pg_database_size(datname) / (SELECT SUM(pg_database_size(datname))
                                                FROM pg_database)) * 100)::numeric(6,3) AS pct
          FROM pg_database
         ORDER BY datname;

      blocked_transactions.sql:
        /* Requires PostgreSQL 9.2 or greater */
        -- The waiting column exists in 9.2 through 9.5; 9.6 replaced it
        -- with wait_event_type and wait_event.
        SELECT w.query AS waiting_query, w.pid AS w_pid, w.usename AS w_user,
               l.query AS locking_query, l.pid AS l_pid, l.usename AS l_user,
               t.schemaname || '.' || t.relname AS tablename
          FROM pg_stat_activity w
          JOIN pg_locks l1 ON (w.pid = l1.pid AND NOT l1.granted)
          JOIN pg_locks l2 ON (l1.relation = l2.relation AND l2.granted)
          JOIN pg_stat_activity l ON (l2.pid = l.pid)
          JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
         WHERE w.waiting;

      cache_hit_ratio.sql:
        SELECT pg_stat_database.datname,
               pg_stat_database.blks_read,
               pg_stat_database.blks_hit,
               round((pg_stat_database.blks_hit::double precision
                      / (pg_stat_database.blks_read + pg_stat_database.blks_hit + 1)::double precision
                      * 100::double precision)::numeric, 2) AS cachehitratio
          FROM pg_stat_database
         WHERE pg_stat_database.datname !~ '^(template(0|1)|postgres)$'::text
         ORDER BY cachehitratio DESC;

      connection_counts.sql:
        SELECT COUNT(*) FROM pg_stat_activity;

        SELECT usename, count(*)
          FROM pg_stat_activity
         GROUP BY 1 ORDER BY 1;

        SELECT datname, usename, count(*)
          FROM pg_stat_activity
         GROUP BY 1, 2 ORDER BY 1, 2;

        SELECT usename, datname, count(*)
          FROM pg_stat_activity
         GROUP BY 1, 2 ORDER BY 1, 2;

      current_locks.sql:
        -- a.procpid is the pre-9.2 column name; on 9.2 or greater use a.pid.
        SELECT database, relation, n.nspname, c.relname, pid, a.usename,
               locktype, mode, granted, tuple
          FROM pg_locks l
          JOIN pg_class c ON (c.oid = l.relation)
          JOIN pg_namespace n ON (n.oid = c.relnamespace)
          JOIN pg_stat_activity a ON (a.procpid = l.pid)
         ORDER BY database, relation, pid;
  • 4. current_queries.sql:
        -- Written for PostgreSQL before 9.2 (procpid, current_query).
        -- On 9.2 or greater these columns are named pid and query.
        SELECT a.datname,
               a.procpid AS pid,
               CASE WHEN a.client_addr IS NULL THEN 'local'
                    ELSE a.client_addr::text
               END AS client_addr,
               a.usename AS user,
               a.waiting,
               l.procpid AS blocking_pid,
               l.usename AS blocking_user,
               a.current_query,
               a.query_start,
               current_timestamp - a.query_start AS duration
          FROM pg_stat_activity a
          LEFT JOIN pg_locks l1 ON (a.procpid = l1.pid)
          LEFT JOIN pg_locks l2 ON (l1.relation = l2.relation)
          LEFT JOIN pg_stat_activity l ON (l2.pid = l.procpid)
          LEFT JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
         WHERE pg_backend_pid() <> a.procpid
         ORDER BY a.datname, a.query_start;

        SELECT w.current_query AS waiting_query, w.procpid AS w_pid, w.usename AS w_user,
               l.current_query AS locking_query, l.procpid AS l_pid, l.usename AS l_user,
               t.schemaname || '.' || t.relname AS tablename
          FROM pg_stat_activity w
          JOIN pg_locks l1 ON (w.procpid = l1.pid AND NOT l1.granted)
          JOIN pg_locks l2 ON (l1.relation = l2.relation AND l2.granted)
          JOIN pg_stat_activity l ON (l2.pid = l.procpid)
          JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
         WHERE w.waiting;

        SELECT datname,
               procpid AS pid,
               client_addr,
               usename AS user,
               current_query,
               CASE WHEN waiting = TRUE THEN 'BLOCKED' ELSE 'no' END AS waiting,
               query_start,
               current_timestamp - query_start AS duration
          FROM pg_stat_activity
         WHERE pg_backend_pid() <> procpid
         ORDER BY datname, query_start;

      current_queries_blocked.sql:
        /* Use for PostgreSQL 9.2 or greater */
        SELECT c.datname,
               c.pid AS pid,
               c.client_addr,
               c.usename AS user,
               c.query,
               CASE WHEN c.waiting = TRUE THEN 'BLOCKED' ELSE 'no' END AS waiting,
               l.pid AS blocked_by,
               current_timestamp - c.query_start AS duration
          FROM pg_stat_activity c
          LEFT JOIN pg_locks l1 ON (c.pid = l1.pid AND NOT l1.granted)
          LEFT JOIN pg_locks l2 ON (l1.relation = l2.relation AND l2.granted)
          LEFT JOIN pg_stat_activity l ON (l2.pid = l.pid)
          LEFT JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
         WHERE pg_backend_pid() <> c.pid
         ORDER BY datname, query_start;

      most_active_tables.sql:
        SELECT schemaname, relname,
               idx_tup_fetch + seq_tup_read AS TotalReads
          FROM pg_stat_all_tables
         WHERE idx_tup_fetch + seq_tup_read != 0
           AND schemaname NOT IN ('pg_catalog', 'pg_toast')
         ORDER BY TotalReads DESC
         LIMIT 10;

      pg_runtime.sql:
        SELECT pg_postmaster_start_time() AS pg_start,
               current_timestamp - pg_postmaster_start_time() AS runtime;

      table_row_counts.sql:
        SELECT s.nspname, c.relname AS table, c.reltuples::int4 AS rows
          FROM pg_catalog.pg_class c
          JOIN pg_catalog.pg_namespace s ON (c.relnamespace = s.oid)
         WHERE relkind = 'r'
           AND c.reltuples::int4 > 0
         ORDER BY rows DESC;
  • 5. pg_stat_all_tables.sql:
        SELECT n.nspname, s.relname, c.reltuples::bigint,
               -- n_live_tup,
               n_tup_ins, n_tup_upd, n_tup_del,
               date_trunc('second', last_vacuum) AS last_vacuum,
               date_trunc('second', last_autovacuum) AS last_autovacuum,
               date_trunc('second', last_analyze) AS last_analyze,
               date_trunc('second', last_autoanalyze) AS last_autoanalyze,
               round(current_setting('autovacuum_vacuum_threshold')::integer
                     + current_setting('autovacuum_vacuum_scale_factor')::numeric * c.reltuples) AS av_threshold
               /* ,CASE WHEN reltuples > 0
                        THEN round(100.0 * n_dead_tup / (reltuples))
                        ELSE 0
                   END AS pct_dead,
                   CASE WHEN n_dead_tup > round(current_setting('autovacuum_vacuum_threshold')::integer
                                                + current_setting('autovacuum_vacuum_scale_factor')::numeric * c.reltuples)
                        THEN 'VACUUM'
                        ELSE 'ok'
                   END AS "av_needed" */
          FROM pg_stat_all_tables s
          JOIN pg_class c ON c.oid = s.relid
          JOIN pg_namespace n ON (n.oid = c.relnamespace)
         WHERE s.relname NOT LIKE 'pg_%'
           AND s.relname NOT LIKE 'sql_%'
           -- AND s.relname LIKE '%TBL%'
         ORDER BY 1, 2;

      pg_stat_all_indexes.sql:
        SELECT n.nspname AS schema,
               i.relname AS table,
               i.indexrelname AS index,
               i.idx_scan,
               i.idx_tup_read,
               i.idx_tup_fetch,
               CASE WHEN idx.indisprimary THEN 'pkey'
                    WHEN idx.indisunique THEN 'uidx'
                    ELSE 'idx'
               END AS type,
               CASE WHEN idx.indisvalid THEN 'valid' ELSE 'INVALID' END AS statusi,
               pg_relation_size(quote_ident(n.nspname) || '.' || quote_ident(i.relname)) AS size_in_bytes,
               pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' || quote_ident(i.relname))) AS size
          FROM pg_stat_all_indexes i
          JOIN pg_class c ON (c.oid = i.relid)
          JOIN pg_namespace n ON (n.oid = c.relnamespace)
          JOIN pg_index idx ON (idx.indexrelid = i.indexrelid)
         WHERE i.relname LIKE '%%'
           AND n.nspname NOT LIKE 'pg_%'
           -- AND idx.indisunique = TRUE
           -- AND NOT idx.indisprimary
           -- AND i.indexrelname LIKE 'tmp%'
           -- AND idx.indisvalid IS false
           /* AND NOT idx.indisprimary
              AND NOT idx.indisunique
              AND idx_scan = 0 */
         ORDER BY 1, 2, 3;
  • 8. Standard Statistics Views
      • pg_stat_activity: One row per server process, showing information related to the current activity of that process, such as state and current query.
      • pg_stat_archiver: One row only, showing statistics about the WAL archiver process's activity.
      • pg_stat_bgwriter: One row only, showing statistics about the background writer process's activity.
      • pg_stat_database: One row per database, showing database-wide statistics.
      • pg_stat_all_tables: One row for each table in the current database, showing statistics about accesses to that specific table.
      • pg_stat_sys_tables: Same as pg_stat_all_tables, except that only system tables are shown.
      • pg_stat_user_tables: Same as pg_stat_all_tables, except that only user tables are shown.
      • pg_stat_xact_all_tables: Similar to pg_stat_all_tables, but counts actions taken so far within the current transaction (which are not yet included in pg_stat_all_tables and related views). The columns for numbers of live and dead rows and vacuum and analyze actions are not present in this view.
      • pg_stat_xact_sys_tables: Same as pg_stat_xact_all_tables, except that only system tables are shown.
      • pg_stat_xact_user_tables: Same as pg_stat_xact_all_tables, except that only user tables are shown.
      • pg_statio_user_sequences: Same as pg_statio_all_sequences, except that only user sequences are shown.
      • pg_stat_user_functions: One row for each tracked function, showing statistics about executions of that function.
      • pg_stat_xact_user_functions: Similar to pg_stat_user_functions, but counts only calls during the current transaction (which are not yet included in pg_stat_user_functions).
  • 9. Standard Statistics Views
      • pg_statio_sys_indexes: Same as pg_statio_all_indexes, except that only indexes on system tables are shown.
      • pg_statio_user_indexes: Same as pg_statio_all_indexes, except that only indexes on user tables are shown.
      • pg_statio_all_sequences: One row for each sequence in the current database, showing statistics about I/O on that specific sequence.
      • pg_statio_sys_sequences: Same as pg_statio_all_sequences, except that only system sequences are shown. (Presently, no system sequences are defined, so this view is always empty.)
      • pg_stat_replication: One row per WAL sender process, showing statistics about replication to that sender's connected standby server.
      • pg_stat_database_conflicts: One row per database, showing database-wide statistics about query cancels due to conflict with recovery on standby servers.
      • pg_stat_all_indexes: One row for each index in the current database, showing statistics about accesses to that specific index.
      • pg_stat_sys_indexes: Same as pg_stat_all_indexes, except that only indexes on system tables are shown.
      • pg_stat_user_indexes: Same as pg_stat_all_indexes, except that only indexes on user tables are shown.
      • pg_statio_all_tables: One row for each table in the current database, showing statistics about I/O on that specific table.
      • pg_statio_sys_tables: Same as pg_statio_all_tables, except that only system tables are shown.
      • pg_statio_user_tables: Same as pg_statio_all_tables, except that only user tables are shown.
      • pg_statio_all_indexes: One row for each index in the current database, showing statistics about I/O on that specific index.
  • 10. gen_table_compare_counts.sql:
        SELECT 'SELECT ''' || c.relname || ''' as table, '
               || c.reltuples::int4 || ' as rows, '
               || '(SELECT COUNT(*) FROM ' || s.nspname || '."' || c.relname || '") as cnt;'
          FROM pg_catalog.pg_class c
  • 11. What is Vacuum?
      Vacuum does the following:
        1. Gathers table and index statistics (when run as VACUUM ANALYZE)
        2. Reorganizes the table
        3. Cleans up dead blocks in tables and indexes
        4. Freezes old row XIDs to prevent XID wraparound
      #1 and #2 are generally required for DBMS management. #3 and #4 are necessary because of the PostgreSQL MVCC feature.

      VACUUM:
        • Restructures pages and reclaims space taken by dead rows (rows that were deleted before any of the current transactions started)
        • Removes dead rows from indexes and TOAST tables
        • Long-running transactions block this cleanup (including long transactions on a replica if hot_standby_feedback = on)
        • Truncates the table if possible
        • Updates the free space map
        • Done to avoid needing VACUUM FULL
      NOT NEEDED:
        • On a replica
        • After TRUNCATE
  • 12. VACUUM FULL
      • Shrinks the table (rewrites all "alive" tuples into a new file as compactly as possible)
      • Can only be launched manually (not by autovacuum)
      • The OID of the relation stays the same; the relfilenode (on-disk name) changes
      Cons:
      • ACCESS EXCLUSIVE lock (no reading or writing allowed)
      • table size ≤ needed space ≤ table size * 2
      • Need a REINDEX (before PostgreSQL 9.0 only; since 9.0, VACUUM FULL rebuilds the table's indexes as part of the rewrite)
      • Takes a long time
      Alternative: pg_repack allows reads and writes, but needs more space (≥ table size * 2)
  • 15. The following parameter changes make autovacuum run more frequently.
      First, enable logging for the autovacuum process:
          log_autovacuum_min_duration = 0
      Increase the number of workers and check tables more often:
          autovacuum_max_workers = 6
          autovacuum_naptime = 15s
      Lower the vacuum and analyze thresholds so that both trigger sooner:
          autovacuum_vacuum_threshold = 25
          autovacuum_vacuum_scale_factor = 0.1
          autovacuum_analyze_threshold = 10
          autovacuum_analyze_scale_factor = 0.05
      Throttle autovacuum less, so each run finishes sooner:
          autovacuum_vacuum_cost_delay = 10ms
          autovacuum_vacuum_cost_limit = 1000
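To see what the lowered thresholds mean in practice: autovacuum acts on a table once the number of changed rows exceeds threshold + scale_factor * reltuples. The Python sketch below compares the stock PostgreSQL defaults with the values on this slide. The function name and the one-million-row table are assumptions made for illustration.

```python
def trigger_point(reltuples: int, threshold: int, scale_factor: float) -> float:
    """Row-change count above which autovacuum acts on a table:
    threshold + scale_factor * reltuples."""
    return threshold + scale_factor * reltuples


# (threshold, scale_factor) pairs.
# DEFAULTS are the stock PostgreSQL settings; TUNED are the slide's values.
DEFAULTS = {"vacuum": (50, 0.2), "analyze": (50, 0.1)}
TUNED = {"vacuum": (25, 0.1), "analyze": (10, 0.05)}

if __name__ == "__main__":
    rows = 1_000_000  # hypothetical table size
    for kind in ("vacuum", "analyze"):
        before = trigger_point(rows, *DEFAULTS[kind])
        after = trigger_point(rows, *TUNED[kind])
        print(f"{kind}: {before:,.0f} -> {after:,.0f} changed rows")
```

On large tables the scale factor dominates the sum, so halving it roughly halves the number of row changes that accumulate before autovacuum starts.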
  • 16. Script to check the status of autovacuum for all tables
        SELECT schemaname,
               relname,
               n_live_tup,
               n_dead_tup,
               last_autovacuum
          FROM pg_stat_all_tables
         ORDER BY n_dead_tup
                  / (n_live_tup * current_setting('autovacuum_vacuum_scale_factor')::float8
                     + current_setting('autovacuum_vacuum_threshold')::float8) DESC;
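The ORDER BY expression in this script is the ratio of dead tuples to the autovacuum trigger point, so tables at or above 1.0 are due for a vacuum and appear first. A minimal Python sketch of that ranking, assuming the slide 15 settings (threshold 25, scale factor 0.1). The table names and counts are made up for illustration.

```python
from typing import List, NamedTuple


class TableStats(NamedTuple):
    relname: str
    n_live_tup: int
    n_dead_tup: int


def vacuum_pressure(t: TableStats, threshold: float = 25.0,
                    scale_factor: float = 0.1) -> float:
    """Dead tuples divided by the trigger point, as in the ORDER BY clause."""
    return t.n_dead_tup / (t.n_live_tup * scale_factor + threshold)


def rank_by_pressure(tables: List[TableStats]) -> List[TableStats]:
    """Most vacuum-needy tables first (descending ratio)."""
    return sorted(tables, key=vacuum_pressure, reverse=True)


if __name__ == "__main__":
    # Hypothetical rows, shaped like pg_stat_all_tables output.
    sample = [
        TableStats("orders", 1_000_000, 150_000),
        TableStats("sessions", 1_000, 200),
        TableStats("audit_log", 5_000_000, 10_000),
    ]
    for t in rank_by_pressure(sample):
        print(f"{t.relname}: {vacuum_pressure(t):.2f}")
```

A small table can outrank a large one here: the ratio measures how far past its own trigger point a table is, not how many dead tuples it holds in absolute terms.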
  • 17. Controlling automatic database maintenance
      Autovacuum is enabled by default in PostgreSQL and mostly does a great job of maintaining your PostgreSQL database. We say mostly because it doesn't know everything you do about the database, such as the best time to perform maintenance actions. Let's explore the settings that can be tuned so that you can use vacuums efficiently.
      Exercising control requires some thinking about what you actually want:
      • What are the best times of day to do things?
      • When are system resources more available?
      • Which days are quiet, and which are not?
      • Which tables are critical to the application, and which are not?
      Perform the following steps: The first thing to do is make sure that autovacuum is switched on, which is the default. Check that you have the following parameters enabled in your postgresql.conf file:
          autovacuum = on
          track_counts = on
      PostgreSQL controls autovacuum with more than 40 individually tunable parameters that provide a wide range of...
  • 18. Removing issues that cause bloat
      Bloat can be caused by long-running queries or long-running write transactions that execute alongside write-heavy workloads. Resolving that is mostly down to understanding the workloads running on the server. Look at the age of the oldest snapshots that are running, like this:
        postgres=# SELECT now() -
                     CASE WHEN backend_xid IS NOT NULL
                          THEN xact_start
                          ELSE query_start
                     END AS age,
                     pid,
                     backend_xid AS xid,
                     backend_xmin AS xmin,
                     state
                   FROM pg_stat_activity
                   WHERE backend_type = 'client backend'
                   ORDER BY 1 DESC;
               age       |  pid  |   xid    |   xmin   |        state
        -----------------+-------+----------+----------+---------------------
         00:00:25.791098 | 27624 |          | 10671262 | active
         00:00:08.018103 | 27591 |          |          | idle in transaction
         00:00:00.002444 | 27630 | 10703641 | 10703639 | active
         00:00:00.001506 | 27631 | 10703642 | 10703640 | active
         00:00:00.000324 | 27632 | 10703643 | 10703641 | active
         00:00:00...
  • 19. Identifying and fixing bloated tables and indexes
      PostgreSQL implements Multiversion Concurrency Control (MVCC), which allows users to read data at the same time as writers make changes. This is an important feature for concurrency in database applications, as it can allow the following:
      • Better performance because of fewer locks
      • Greatly reduced deadlocking
      • Simplified application design and management
      Bloated tables and indexes are a natural consequence of the MVCC design in PostgreSQL. Bloat is caused mainly by updates, as both the old and new versions of a row must be retained for a certain period of time. Bloating results in increased disk consumption as well as performance loss: if a table is twice as big as it should be, scanning it takes twice as long. VACUUM is one of the best ways of removing bloat.
      Many users execute VACUUM far too frequently, while at the same time complaining about the cost of doing so. This recipe is all about understanding when you need to run VACUUM by estimating the amount of bloat...
  • 20. Monitoring and tuning a vacuum
      If you're currently waiting for a long-running vacuum (or autovacuum) to finish, go straight to the How to do it... section. If you've just had a long-running vacuum complete, then you may want to think about setting a few parameters.
      • autovacuum_max_workers should always be set to more than 2. Setting it too high may not be very useful, and so you need to be careful.
      • Setting vacuum_cost_delay too high is counterproductive. VACUUM is your friend, not your enemy, so delaying it until it doesn't happen at all just makes things worse.
      • maintenance_work_mem should be set to anything up to 1 GB, according to how much memory you can allocate to this task at this time.
      Let's watch what happens when we run a large VACUUM. Don't run VACUUM FULL, because it runs for a long time while holding an AccessExclusiveLock on the table. First, locate which process is running the VACUUM by using the pg_stat_activity view to identify the specific pid (34399 is just an example...
        test=# SELECT oid::regclass::text AS table,
                      age(relfrozenxid) AS xid_age,
                      mxid_age(relminmxid) AS mxid_age,
                      least(
                        (SELECT setting::int FROM pg_settings
                          WHERE name = 'autovacuum_freeze_max_age') - age(relfrozenxid),
                        (SELECT setting::int FROM pg_settings
                          WHERE name = 'autovacuum_multixact_freeze_max_age') - mxid_age(relminmxid)
                      ) AS tx_before_wraparound_vacuum,
                      pg_size_pretty(pg_total_relation_size(oid)) AS size,
                      pg_stat_get_last_autovacuum_time(oid) AS last_autovacuum
                 FROM pg_class
                WHERE relfrozenxid != 0
                  AND oid > 16384
                ORDER BY tx_before_wraparound_vacuum;
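The tx_before_wraparound_vacuum column in this query is the smaller of two headroom values: how far the table's XID age is from autovacuum_freeze_max_age, and how far its multixact age is from autovacuum_multixact_freeze_max_age. A small Python sketch of that arithmetic, assuming the stock defaults of 200 million and 400 million for the two settings. The function name and the sample ages are illustrative.

```python
def tx_before_wraparound_vacuum(
    xid_age: int,
    mxid_age: int,
    freeze_max_age: int = 200_000_000,            # autovacuum_freeze_max_age default
    multixact_freeze_max_age: int = 400_000_000,  # autovacuum_multixact_freeze_max_age default
) -> int:
    """Transactions left before an anti-wraparound autovacuum is forced.

    Same logic as the SQL least(...) expression: the tighter of the
    XID headroom and the multixact headroom. A value of zero or less
    means the forced vacuum is already due.
    """
    return min(freeze_max_age - xid_age, multixact_freeze_max_age - mxid_age)


if __name__ == "__main__":
    # Hypothetical ages, as returned by age(relfrozenxid) and mxid_age(relminmxid).
    print(tx_before_wraparound_vacuum(xid_age=150_000_000, mxid_age=10_000_000))
    print(tx_before_wraparound_vacuum(xid_age=210_000_000, mxid_age=0))
```

Sorting ascending on this value, as the query does, puts the tables closest to a forced anti-wraparound vacuum at the top.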
  • 24. Demo - Optimizer Statistics
  • 25. Example - Updating Statistics
  • 30. Demo - Vacuum Command