Postgresql Partitioned Tables
- Maintenance of large tables and indexes can become very time and resource consuming. At the same time, data access performance can reduce drastically for these objects. Partitioning of tables and indexes can benefit the performance and maintenance in several ways
- Query performance can be improved as access can be limited to relevant partitions only.
- Partitioning refers to splitting what is logically one large table into smaller physical pieces,these partitions would be assigned to different tablespaces to reduce device contention.
- There is a greater ability for parallelism with more partitions by using WITH (parallel_workers = 4) key word.
- Table partitioned is faster than index.
- New Features Available in Postgresql 10x.It like same as oracle Partition.you can make table partition based on tablespace
Types of Partitioned Tables:-
1.Range Partitioning
The table is partitioned into “ranges” defined by a key column or set of columns, with no overlap between the ranges of values assigned to different partitions. For example, one might partition by date ranges or by ranges of numbers(1 to 100,101 to 200,…etc).
For Example: – Key based
- number
- date [FROM (‘2017-01-01’) TO (‘2018-01-01’)]
2.List Partitioning
The table is partitioned by explicitly listing which key values appear in each partition.
For Example: – Fixed List (selected Key value)
- village (pulicat, ponneri, zimcan)
- city (chennai, new york, london)
- Department (like college department ECE, EEE, CSE, government department)
Before Postgresql 10x :
Create steps:-
Step 1:-
Create the “master” table, from which all of the partitions will inherit.
CREATE TABLE m1 (city_id int not null,logdate date not null);
Step 2:-
Create several “child” tables that each inherit from the master table.
Normally, these tables will not add any columns to the set inherited from the master.
CREATE TABLE m1_part1 (CHECK ( logdate >= DATE '2017-02-01' AND logdate < DATE '2017-03-01' ) ) INHERITS (m1); CREATE TABLE m1_part2 ( CHECK ( logdate >= DATE '2017-03-01' AND logdate < DATE '2017-04-01' )) INHERITS (m1); CREATE TABLE m1_part3 ( ) INHERITS (m1); CREATE TABLE m1_part4 ( ) INHERITS (m1);
Step 3:-
Define a trigger or rule to redirect data inserted into the master table to the appropriate partition.
CREATE OR REPLACE FUNCTION part_func() RETURNS TRIGGER AS $$ BEGIN IF ( NEW.logdate >= DATE '2017-02-01' AND NEW.logdate < DATE '2017-03-01' ) THEN INSERT INTO m1_part1 VALUES (NEW.*); ELSIF ( NEW.logdate >= DATE '2017-03-01' AND NEW.logdate < DATE '2017-04-01' ) THEN INSERT INTO m1_part2 VALUES (NEW.*); ELSE RAISE EXCEPTION 'Date out of range. Fix the part_func() function!'; END IF; RETURN NULL; END; $$ LANGUAGE plpgsql;
CREATE TRIGGER insert_measurement_trigger BEFORE INSERT ON m1 FOR EACH ROW EXECUTE PROCEDURE part_func();
Making partition existing table in postgres:
Since #1 requires copying data from the master to the child while it is in an active production environment, I personally went with #2 (creating a new master). This prevents disruptions to the original table while it is actively in use and if there are any issues, I can easily delete the new master without issue and continue using the original table. Here are the steps to do it:
- Create new master table.
CREATE TABLE new_master ( id serial, counter integer, dt_created DATE DEFAULT CURRENT_DATE NOT NULL );
- Create children that inherit from master.
CREATE TABLE child_2014 ( CONSTRAINT pk_2014 PRIMARY KEY (id), CONSTRAINT ck_2014 CHECK ( dt_created < DATE '2015-01-01' ) ) INHERITS (new_master); CREATE INDEX idx_2014 ON child_2014 (dt_created);
CREATE TABLE child_2015 ( CONSTRAINT pk_2015 PRIMARY KEY (id), CONSTRAINT ck_2015 CHECK ( dt_created >= DATE '2015-01-01' AND dt_created < DATE '2016-01-01' ) ) INHERITS (new_master);
CREATE INDEX idx_2015 ON child_2015 (dt_created);
- Copy all historical data to new master table
INSERT INTO child_2014 (id,counter,dt_created) SELECT id,counter,dt_created from old_master where dt_created < '01/01/2015'::date;
- Temporarily pause new inserts/updates to production database
- Copy most recent data to new master table
INSERT INTO child_2015 (id,counter,dt_created) SELECT id,counter,dt_created from old_master where dt_created >= '01/01/2015'::date AND dt_created < '01/01/2016'::date;
- Rename tables so that new_master becomes the production database.
ALTER TABLE old_master RENAME TO old_master_backup; ALTER TABLE new_master RENAME TO old_master;
- Add function for INSERT statements to old_master so that data gets passed to correct partition.
CREATE OR REPLACE FUNCTION fn_insert() RETURNS TRIGGER AS $$ BEGIN IF ( NEW.dt_created >= DATE '2015-01-01' AND NEW.dt_created < DATE '2016-01-01' ) THEN INSERT INTO child_2015 VALUES (NEW.*); ELSIF ( NEW.dt_created < DATE '2015-01-01' ) THEN INSERT INTO child_2014 VALUES (NEW.*); ELSE RAISE EXCEPTION 'Date out of range'; END IF; RETURN NULL; END; $$ LANGUAGE plpgsql;
- Add trigger so that function is called on INSERTS
CREATE TRIGGER tr_insert BEFORE INSERT ON old_master FOR EACH ROW EXECUTE PROCEDURE fn_insert();
- Set constraint exclusion to ON
SET constraint_exclusion = on;
- Re-enable UPDATES and INSERTS on production database
- Set up trigger or cron so that new partitions get created and function gets updated to assign new data to correct partition.
- Delete old_master_backup
Postgresql 10x Features – table partition:
- New Features Available in Postgresql 10x.It like same as oracle Partition.you can make table partition based on tablespace and you can increase a worker process also.
- Indexes must be created separately for each partition.
- Third-Party Tools : pg_partman is an extension to create and manage both time-based and serial-based table partition sets. Native partitioning in PostgreSQL 10 is supported as of pg_partman v3.0.1. Note that all the features of trigger-based partitioning are not yet supported in native, but performance in both reads & writes is significantly better.
- You can ADD or REMOVE partition by using ATTACH PARTITION and DETACH PARTITION sub-commands.
- You cannot drop the NOT NULL constraint on a partition’s column if the constraint is present in the parent table.
- it must have all the NOT NULL and CHECK constraints of the target table.
- if any of the partition keys is an expression and the partition does not accept NULL values. If attaching a list partition that will not accept NULL values, also add NOT NULL constraint to the partition key column, unless it’s an expression.
- If the new partition is a regular table, a full table scan is performed to check that no existing row in the table violates the partition constraint.
- PRIMARY key/UNIQUE constraints are not supported on partitioned tables.
- CHECK constraints are supported on partitioned tables
1.primary key constraints are not supported on partitioned tables:
CREATE TABLE cities2 ( id int primary key, city varchar not null, country varchar not null ) PARTITION BY RANGE (id); ERROR: primary key constraints are not supported on partitioned tables LINE 2: id int primary key,
2.unique constraints are not supported on partitioned tables:
CREATE TABLE cities2 ( id int, city varchar not null unique, country varchar not null ) PARTITION BY RANGE (id); ERROR: unique constraints are not supported on partitioned tables LINE 3: city varchar not null unique,
3.CHECK constraints are supported on partitioned tables:
CREATE TABLE cities3 ( id int not null, city varchar not null, country varchar not null, code varchar CHECK (char_length(code) = 5) ) PARTITION BY RANGE (id); postgres=# \d cities3 Table "public.cities3" Column | Type | Collation | Nullable | Default ---------+-------------------+-----------+----------+--------- id | integer | | not null | city | character varying | | not null | country | character varying | | not null | code | character varying | | | Partition key: RANGE (id) Check constraints: "cities3_code_check" CHECK (char_length(code::text) = 5)
CREATE TABLE cities ( id int not null, city varchar not null, country varchar not null ) PARTITION BY RANGE (id); postgres=# \d cities Table "public.cities" Column | Type | Collation | Nullable | Default ---------+-------------------+-----------+----------+--------- id | integer | | not null | city | character varying | | not null | country | character varying | | not null | Partition key: RANGE (id) postgres=# insert into cities values(1,'chennai','india'); ERROR: no partition of relation "cities" found for row DETAIL: Partition key of the failing row contains (id) = (1).
This is to be expected – after all – there shouldn’t be any rows in in master table. So I need to add some partitions.
Let’s assume, for now, that I will create three partitions – one for id from 1 to 100, and one for id from 101 to 200 and last one is id from 201 to 300.
CREATE TABLE id_01_to_100 PARTITION OF cities FOR VALUES from (1) to (100); CREATE TABLE id_101_to_200 PARTITION OF cities FOR VALUES FROM (101) TO (200); CREATE TABLE id_201_to_300 PARTITION OF cities FOR VALUES FROM (201) TO (300);
Let’s see how it changes our cities table definition, and how the partition definition looks like:
postgres=# \d cities Table "public.cities" Column | Type | Collation | Nullable | Default ---------+-------------------+-----------+----------+--------- id | integer | | not null | city | character varying | | not null | country | character varying | | not null | Partition key: RANGE (id) Number of partitions: 3 (Use \d+ to list them.) postgres=# \d id_01_to_100 Table "public.id_01_to_100" Column | Type | Collation | Nullable | Default ---------+-------------------+-----------+----------+--------- id | integer | | not null | city | character varying | | not null | country | character varying | | not null | Partition of: cities FOR VALUES FROM (1) TO (100) postgres=# \d id_101_to_200 Table "public.id_101_to_200" Column | Type | Collation | Nullable | Default ---------+-------------------+-----------+----------+--------- id | integer | | not null | city | character varying | | not null | country | character varying | | not null | Partition of: cities FOR VALUES FROM (101) TO (200) postgres=# \d id_201_to_300 Table "public.id_201_to_300" Column | Type | Collation | Nullable | Default ---------+-------------------+-----------+----------+--------- id | integer | | not null | city | character varying | | not null | country | character varying | | not null | Partition of: cities FOR VALUES FROM (201) TO (300)
list out the full table definition:
postgres=# \d+ cities Table "public.cities" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ---------+-------------------+-----------+----------+---------+----------+--------------+------------- id | integer | | not null | | plain | | city | character varying | | not null | | extended | | country | character varying | | not null | | extended | | Partition key: RANGE (id) Partitions: id_01_to_100 FOR VALUES FROM (1) TO (100), id_101_to_200 FOR VALUES FROM (101) TO (200), id_201_to_300 FOR VALUES FROM (201) TO (300)
Now everything is good let make to insert some records:
insert into cities values(50,'chennai','india'); insert into cities values(150,'chennai','india'); insert into cities values(230,'chennai','india');
Let’s list the tables including child tables:
postgres=# select * from cities; id | city | country -----+---------+--------- 50 | chennai | india 150 | chennai | india 230 | chennai | india (3 rows) postgres=# select * from id_01_to_100 ; id | city | country ----+---------+--------- 50 | chennai | india (1 row) postgres=# select * from id_101_to_200 ; id | city | country -----+---------+--------- 150 | chennai | india (1 row) postgres=# select * from id_201_to_300 ; id | city | country -----+---------+--------- 230 | chennai | india (1 row)
postgres=# insert into cities values(100,'chennai','india'); ERROR: no partition of relation "cities" found for row DETAIL: Partition key of the failing row contains (id) = (100). postgres=# insert into cities values(101,'chennai','india'); INSERT 0 1 postgres=# insert into cities values(200,'chennai','india'); ERROR: no partition of relation "cities" found for row DETAIL: Partition key of the failing row contains (id) = (200). postgres=# insert into cities values(300,'chennai','india'); ERROR: no partition of relation "cities" found for row DETAIL: Partition key of the failing row contains (id) = (300).
So, it works, but for the duration of the process, partition id_01_to_100 will not be used, when querying users.
- The proper solution, in this case, seems to be to avoid unbounded ranges, and simply make sure you always keep enough partitions ready.
- There is also one more thing. As docs say, when you attach partition, it will have to be fully scanned, to make sure that all rows in it match partition definition.
- So, if you’d doing detach, attach – it will render the partition inaccessible for duration of while table scan.
- Basically – given range partitioning, simply keep a number of partitions ready for future use, and create new ones when you’re close to filling already existing ones. For this you will need some way to schedule jobs – using cron, pgagent or (possibly)pg_partman
postgres=# alter table cities attach partition id_01_to_100 FOR VALUES from (1) to (101); ALTER TABLE postgres=# alter table cities attach partition id_101_to_200 FOR VALUES FROM (101) TO (201); ERROR: "id_101_to_200" is already a partition alter table cities detach partition id_101_to_200; alter table cities attach partition id_101_to_200 FOR VALUES FROM (101) TO (201);
Check the partition status:
postgres=# \d+ cities Table "public.cities" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ---------+-------------------+-----------+----------+---------+----------+--------------+------------- id | integer | | not null | | plain | | city | character varying | | not null | | extended | | country | character varying | | not null | | extended | | Partition key: RANGE (id) Partitions: id_01_to_100 FOR VALUES FROM (1) TO (101), id_101_to_200 FOR VALUES FROM (101) TO (201), id_201_to_300 FOR VALUES FROM (201) TO (300)
Now insert the data’s untill 299
insert into cities values(100,'chennai','india'); insert into cities values(200,'chennai','india'); postgres=# insert into cities values(300,'chennai','india'); ERROR: no partition of relation "cities" found for row DETAIL: Partition key of the failing row contains (id) = (300).
Making partition Based on unique/primary key:
CREATE TABLE id_300_to_500 PARTITION OF cities(id,primary key(id)) FOR VALUES FROM (301) TO (501); postgres=# \d id_300_to_500 Table "public.id_300_to_500" Column | Type | Collation | Nullable | Default ---------+-------------------+-----------+----------+--------- id | integer | | not null | city | character varying | | not null | country | character varying | | not null | Partition of: cities FOR VALUES FROM (301) TO (501) Indexes: "id_300_to_500_pkey" PRIMARY KEY, btree (id)
Making partition based on tablespace:
CREATE TABLESPACE tbs1 LOCATION '/tab1/tbs1'; CREATE TABLE id_500_to_1000 PARTITION OF cities FOR VALUES FROM (501) TO (1001) TABLESPACE tbs1; postgres=# \d+ id_500_to_1000 Table "public.id_500_to_1000" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ---------+-------------------+-----------+----------+---------+----------+--------------+------------- id | integer | | not null | | plain | | city | character varying | | not null | | extended | | country | character varying | | not null | | extended | | Partition of: cities FOR VALUES FROM (501) TO (1001) Partition constraint: ((id IS NOT NULL) AND (id >= 501) AND (id < 1001)) Tablespace: "tbs1"
2.LIST PARTITION:
create table USA ( id int not null, username text not null, Region text not null ) partition by list (region);
making partition based on Regions people:
CREATE TABLE region1 partition of USA (primary key (id)) for values in ('New Jersey','New York','and Pennsylvania'); CREATE TABLE region2 partition of USA (primary key (id)) for values in ('Alaska','California','Hawaii','Oregon','Washington'); CREATE TABLE region3 partition of USA (primary key (id)) for values in ('virginia','columbia');
postgres=# \d+ USA Table "public.usa" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ----------+---------+-----------+----------+---------+----------+--------------+------------- id | integer | | not null | | plain | | username | text | | not null | | extended | | region | text | | not null | | extended | | Partition key: LIST (region) Partitions: region1 FOR VALUES IN ('New Jersey', 'New York', 'and Pennsylvania'), region2 FOR VALUES IN ('Alaska', 'California', 'Hawaii', 'Oregon', 'Washington'), region3 FOR VALUES IN ('virginia', 'columbia') postgres=# \d+ region1 Table "public.region1" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ----------+---------+-----------+----------+---------+----------+--------------+------------- id | integer | | not null | | plain | | username | text | | not null | | extended | | region | text | | not null | | extended | | Partition of: usa FOR VALUES IN ('New Jersey', 'New York', 'and Pennsylvania') Partition constraint: ((region IS NOT NULL) AND (region = ANY (ARRAY['New Jersey'::text, 'New York'::text, 'and Pennsylvania'::text]))) Indexes: "region1_pkey" PRIMARY KEY, btree (id) postgres=# \d+ region2 Table "public.region2" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ----------+---------+-----------+----------+---------+----------+--------------+------------- id | integer | | not null | | plain | | username | text | | not null | | extended | | region | text | | not null | | extended | | Partition of: usa FOR VALUES IN ('Alaska', 'California', 'Hawaii', 'Oregon', 'Washington') Partition constraint: ((region IS NOT NULL) AND (region = ANY (ARRAY['Alaska'::text, 'California'::text, 'Hawaii'::text, 'Oregon'::text, 'Washington'::text]))) Indexes: "region2_pkey" PRIMARY KEY, btree (id) postgres=# \d+ region3 Table "public.region3" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description ----------+---------+-----------+----------+---------+----------+--------------+------------- id | integer | | not null | | plain | | username | text | | not null | | extended | | region | text | | not null | | extended | | Partition of: usa FOR VALUES IN ('virginia', 'columbia') Partition constraint: ((region IS NOT NULL) AND (region = ANY (ARRAY['virginia'::text, 'columbia'::text]))) Indexes: "region3_pkey" PRIMARY KEY, btree (id)
Insert Some Datas:
insert into USA values(1,'nijam','New York'); insert into USA values(2,'johnson','California'); insert into USA values(3,'kane','virginia');
Check the datas how was partition:
postgres=# select * from USA; id | username | region ----+----------+------------ 1 | nijam | New York 2 | johnson | California 3 | kane | virginia (3 rows) postgres=# select * from region1; id | username | region ----+----------+---------- 1 | nijam | New York (1 row) postgres=# select * from region2; id | username | region ----+----------+------------ 2 | johnson | California (1 row) postgres=# select * from region3; id | username | region ----+----------+---------- 3 | kane | virginia (1 row)
Basic Commands For Table partitioning:
To create a INDEX
For each partition, create an index on the key column(s), as well as any other indexes you might want.
CREATE INDEX m1_part1_date_ind ON m1_part1 (logdate); CREATE INDEX m1_part2_date_ind ON m1_part2 (logdate);
To Add a Partition
create table m1_part5(CHECK ( logdate >= DATE '2017-02-01' AND logdate < DATE '2017-03-01' )) INHERITS (m1);
To Drop a Partition
DROP TABLE m1_part1;
To Eliminate a Partition
ALTER TABLE m1_part3 NO INHERIT m1;
Constraint_exclusion configuration parameter is not disabled in postgresql.conf.
set constraint_exclusion=on/off —to enable and disable session level..
Default is partition in version 10
postgres=# show constraint_exclusion; constraint_exclusion ---------------------- partition (1 row)