Understanding Postgres Database VACUUM , VACUUM FULL & VACUUM ANALYZE
Definition:
VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table; they remain present until a VACUUM is done. Therefore it’s necessary to do VACUUM periodically, especially on frequently-updated tables.
- VACUUM can only be performed by a superuser
- VACUUM will skip over any tables that the calling user does not have permission to vacuum.
- We recommend that active production databases be vacuumed frequently (at least when less transaction)
- adding or deleting a large number of rows, it might be a good idea to issue a VACUUM ANALYZE command for the affected table. This will update the system catalogs with the results of all recent changes, and allow the PostgreSQL query planner to make better choices in planning queries.
- The FULL option is not recommended for routine use, but might be useful in special cases. An example is when you have deleted or updated most of the rows in a table and would like the table to physically shrink to occupy less disk space and allow faster table scans. VACUUM FULL will usually shrink the table more than a plain VACUUM would.
SYNTAX:
VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ table [ (column [, ...] ) ] ]
parameters | Explanation |
VACUUM ANALYZE | It performs a VACUUM and then an ANALYZE for each selected table. |
ANALYZE | It collects statistics about the contents of tables in the database, and stores the results in the pg_statistic system catalog) |
FREEZE | Selects aggressive “freezing” of tuples. Specifying FREEZE is equivalent to performing VACUUM with the vacuum_freeze_min_ageparameter set to zero. |
COLUMN | The name of a specific column to analyze. Defaults to all columns. If a column list is specified, ANALYZE is implied. |
Difference Between VACUUM and VACUUM FULL:-
VACUUM | VACUUM FULL |
marks expired data in tables and indexes for future reuse, it wont release space only marks empty space. |
|
Transaction will be allowed(exclusive lock is not obtained) | transaction will not be allowed(exclusive lock on each table while it is being processed) |
|
While performing VACUMM FULL its taking more time compare to vacuum becouse it’s checking dependencies also. |
If you need to vacuum full and don’t want to lock your table, you can use the pg_repack extension |
Practical:
STEP 1. Create a big table and insert the values like following procedure.
postgres=# create table k1 as select * from pg_tables;
SELECT 115
postgres=# insert into k1 select * from pg_tables;
INSERT 0 116
postgres=# insert into k1 select * from pg_tables;
INSERT 0 116
postgres=# insert into k1 select * from pg_tables;
INSERT 0 116
postgres=# insert into k1 select * from k1;
INSERT 0 463
postgres=# insert into k1 select * from k1;
INSERT 0 926
postgres=# insert into k1 select * from k1;
INSERT 0 1852
postgres=# insert into k1 select * from k1;
INSERT 0 3704
postgres=# insert into k1 select * from k1;
INSERT 0 7408
postgres=# insert into k1 select * from k1;
INSERT 0 14816
postgres=# insert into k1 select * from k1;
INSERT 0 29632
postgres=# insert into k1 select * from k1;
INSERT 0 59264
postgres=# insert into k1 select * from k1;
INSERT 0 118528
postgres=# insert into k1 select * from k1;
INSERT 0 237056
postgres=# insert into k1 select * from k1;
INSERT 0 474112
STEP 2. Before updating or deleting check the k1 table if any dead tubles or fragmented is occure or not
postgres=# \d pg_stat_all_tables
View "pg_catalog.pg_stat_all_tables"
Column | Type | Modifiers
-------------------+--------------------------+-----------
relid | oid |
schemaname | name |
relname | name |
seq_scan | bigint |
seq_tup_read | bigint |
idx_scan | bigint |
idx_tup_fetch | bigint |
n_tup_ins | bigint |
n_tup_upd | bigint |
n_tup_del | bigint |
n_tup_hot_upd | bigint |
n_live_tup | bigint |
n_dead_tup | bigint |
last_vacuum | timestamp with time zone |
last_autovacuum | timestamp with time zone |
last_analyze | timestamp with time zone |
last_autoanalyze | timestamp with time zone |
vacuum_count | bigint |
autovacuum_count | bigint |
analyze_count | bigint |
autoanalyze_count | bigint |
postgres=# select n_dead_tup ,last_vacuum,last_analyze,n_tup_upd, n_tup_del,n_tup_hot_upd,relname ,seq_scan,idx_scan from pg_stat_all_tables where relname='k1';
n_dead_tup | last_vacuum | last_analyze | n_tup_upd | n_tup_del | n_tup_hot_upd | relname | seq_scan | idx_scan
------------+-------------+--------------+-----------+-----------+---------------+---------+----------+----------
0 | | | 0 | 0 | 0 | k1 | 11 |
(1 row)
Note: Values n_tup_upd and n_dead_tup values is zero because k1 table did not making any operation like update,delete..etc this is why it showed values is zero, after making some update and delete operation this values will be change depend on your updation and deletion, next step we will make some fragmented operation(update,delete operation).
STEP 3. Now check the k1 table structure for making updating operation (making fragmented)
postgres=# \d k1
Table "public.k1"
Column | Type | Modifiers
-------------+---------+-----------
schemaname | name |
tablename | name |
tableowner | name |
tablespace | name |
hasindexes | boolean |
hasrules | boolean |
hastriggers | boolean |
— Count the rows before updataing
postgres=# select count(*) from k1;
count
--------
948224
(1 row)
— we are going to updating all records so it will take some time
postgres=# update k1 set tableowner='nijam';
UPDATE 948224
— Now delete some records
postgres=# delete from k1 where tablename='t1';
DELETE 8192
— check the k1 table for any dead tubles is occur or not
postgres=# select n_dead_tup ,last_vacuum,last_analyze,n_tup_upd, n_tup_del,n_tup_hot_upd,relname ,seq_scan,idx_scan from pg_stat_all_tables where relname='k1';
n_dead_tup | last_vacuum | last_analyze | n_tup_upd | n_tup_del | n_tup_hot_upd | relname | seq_scan | idx_scan
------------+-------------+--------------+-----------+-----------+---------------+---------+----------+----------
8268 | | | 948224 | 8192 | 0 | k1 | 17 |
(1 row)
Note: Here no of dead tubles is 8268 this is the unwanted(unused) space so we need to voccum it
STEP 4. make the vacuum on k1 table and see the below query after completed vaccum “n_dead_tup” values will be reduced
postgres=# vacuum k1;
VACUUM
postgres=# select n_dead_tup ,last_vacuum,last_analyze,n_tup_upd, n_tup_del,n_tup_hot_upd,relname ,seq_scan,idx_scan from pg_stat_all_tables where relname='k1';
n_dead_tup | last_vacuum | last_analyze | n_tup_upd | n_tup_del | n_tup_hot_upd | relname | seq_scan | idx_scan
------------+-------------------------------+--------------+-----------+-----------+---------------+---------+----------+----------
0 | 2017-05-15 05:20:05.216778-04 | | 948224 | 8192 | 0 | k1 | 19 |
(1 row)
VACUUM (VERBOSE, ANALYZE):
STEP 1. update the k1 table again for making operation of VACUUM (VERBOSE, ANALYZE)
update k1 set tableowner='john';
STEP 2. Check the dead tubles
postgres=# select n_dead_tup ,last_vacuum,last_analyze,n_tup_upd, n_tup_del,n_tup_hot_upd,relname ,seq_scan,idx_scan from pg_stat_all_tables where relname='k1';
n_dead_tup | last_vacuum | last_analyze | n_tup_upd | n_tup_del | n_tup_hot_upd | relname | seq_scan | idx_scan
------------+-------------------------------+--------------+-----------+-----------+---------------+---------+----------+----------
1880064 | 2017-05-15 05:20:05.216778-04 | | 4708352 | 8192 | 8333 | k1 | 21 |
(1 row)
STEP 3. Now let us making VACUUM
VACUUM (VERBOSE, ANALYZE);
STEP 4. Again check the dead tubles
postgres=# select n_dead_tup ,last_vacuum,last_analyze,n_tup_upd, n_tup_del,n_tup_hot_upd,relname ,seq_scan,idx_scan from pg_stat_all_tables where relname='k1';
n_dead_tup | last_vacuum | last_analyze | n_tup_upd | n_tup_del | n_tup_hot_upd | relname | seq_scan | idx_scan
------------+-------------------------------+-------------------------------+-----------+-----------+---------------+---------+----------+----------
0 | 2017-05-15 13:48:05.985591-04 | 2017-05-15 13:48:16.555511-04 | 4708352 | 8192 | 8333 | k1 | 21 |
(1 row)