HOW TO SETUP/CONFIGURE THE POSTGRESQL STREAMING REPLICATION ON POSTGRESQL 10?
what is the postgresql Streaming Replication (SR) ?
Streaming Replication (SR) provides the capability to continuously ship and apply the WAL XLOG records to some number of standby servers in order to keep them current. This feature was added to PostgreSQL 9.0.
Advantages of postgresql Streaming Replication (SR):
1.Switch over/Fail over after the primary fails.
2.Data loss so less and data integrity.
3.No need more down time after primary fail.
4.synchronous mode Zero data loss.
5. Load balancing using load balancer ex.pgpool-II,pgbouncer.
6.Auto failover using repmgr.
Now We can go postgresql streaming replication setup using postgresql-10.3 version
master IP: 192.168.2.2 slave IP: 192.168.2.3
ON SLAVE:
stop the slave server
/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data/ stop
move the data directory to backup location:
mv /var/lib/pgsql/10/data/* /home/postgres/backup/
ON MASTER:
on postgresql.conf:
listen_addresses = 'localhost,192.168.2.2' wal_level = replica # minimal, replica, or logical archive_mode = on archive_command = 'rsync -av %p /home/postgres/archive/%f && rsync -av %p postgres@192.168.2.3:/home/postgres/archive/%f' max_wal_senders = 2 wal_keep_segments = 10
on pg_hba.conf:
host replication postgres 192.168.2.3/24 md5
Restart the master server:
/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data/ restart
taking consistancy backup:
$ psql -c "select pg_start_backup('initial_backup');" $ rsync -cva --inplace /var/lib/pgsql/10/data/* postgres@192.168.2.3:/var/lib/pgsql/10/data/ $ psql -c "select pg_stop_backup();"
ON SLAVE:
on postgresql.conf:
listen_addresses = 'localhost,192.168.2.2' wal_level = replica # minimal, replica, or logical archive_mode = on archive_command = '/bin/cp -av %p /home/postgres/archive/%f' max_wal_senders = 2 wal_keep_segments = 10 hot_standby = on
on pg_hba.conf:
host replication postgres 192.168.2.2/24 md5
On recovery.conf
cat /var/lib/pgsql/10/data/recovery.conf standby_mode = 'on' # to enable the standby (read-only) mode. primary_conninfo = 'host=192.168.2.2 port=5432 user=postgres password=postgres' # to specify a connection info to the master node. trigger_file = '/tmp/pg_failover_trigger' # to specify a trigger file to recognize a fail over. restore_command = 'cp /home/postgres/archive/%f "%p"' archive_cleanup_command = '/usr/pgsql-10/bin/pg_archivecleanup /home/postgres/archive/ %r'
Restart the slave server:
/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data/ restart
if you get any error error like arch missing wal log …etc while restarting slave server again take the incremental physical backup
$ psql -c "select pg_start_backup('initial_backup');" $ rsync -cva --inplace --exclude=pg_hba.conf --exclude=postgresql.conf --exclude=recovery.conf /var/lib/pgsql/10/data/* postgres@192.168.2.3:/var/lib/pgsql/10/data/ $ psql -c "select pg_stop_backup();"
WHILE RESTARTING SLAVE SERVER GOT FOLLOWING ERROR:
[postgres@slave root]$ /usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data/ restart could not change directory to "/root": Permission denied pg_ctl: PID file "/var/lib/pgsql/10/data/postmaster.pid" does not exist Is server running? starting server anyway waiting for server to start....2018-05-26 20:36:34.834 IST [45453] LOG: listening on IPv6 address "::1", port 5432 2018-05-26 20:36:34.834 IST [45453] LOG: listening on IPv4 address "127.0.0.1", port 5432 2018-05-26 20:36:34.836 IST [45453] LOG: listening on IPv4 address "192.168.2.3", port 5432 2018-05-26 20:36:34.840 IST [45453] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2018-05-26 20:36:34.849 IST [45453] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" .2018-05-26 20:36:35.685 IST [45453] LOG: redirecting log output to logging collector process 2018-05-26 20:36:35.685 IST [45453] HINT: Future log output will appear in directory "log".
ON LOG FILE:
tail -f postgresql-2018-05-26_203635.log 2018-05-26 20:36:35.957 IST [45461] FATAL: could not connect to the primary server: FATAL: password authentication failed for user "postgres" cp: cannot stat ‘/home/postgres/archive/000000010000000000000006’: No such file or directory 2018-05-26 20:36:40.964 IST [45465] FATAL: could not connect to the primary server: FATAL: password authentication failed for user "postgres" cp: cannot stat ‘/home/postgres/archive/000000010000000000000006’: No such file or directory 2018-05-26 20:36:45.963 IST [45467] FATAL: could not connect to the primary server: FATAL: password authentication failed for user "postgres" cp: cannot stat ‘/home/postgres/archive/000000010000000000000006’: No such file or directory 2018-05-26 20:36:50.977 IST [45471] FATAL: could not connect to the primary server: FATAL: password authentication failed for user "postgres" 2018-05-26 20:36:55.459 IST [45453] LOG: received fast shutdown request 2018-05-26 20:36:55.465 IST [45457] LOG: shutting down 2018-05-26 20:36:55.472 IST [45453] LOG: database system is shut down
on master:
–change the postgres password becouse you given wrong password on recovery.conf file for postgres user so change the password as per recovery.conf
alter user postgres with password 'postgres';
Then Restart the slave server:
/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data/ restart
MONITORING STREAMING REPLICATION:
At MASTER:
1.create some dummy table check the slave server whther is replicated or not.
postgres=# create table t(id int); CREATE TABLE --check master server whether is recovery mode or not,master will not be recovery mode slave only will be recovery mode. postgres=# select pg_is_in_recovery(); pg_is_in_recovery ------------------- f (1 row)
2.Using pg_stat_replication views
postgres=# select client_addr,client_hostname,client_port,state,sent_lsn,write_lsn,flush_lsn,replay_lsn,write_lag,replay_lag,flush_lag,sync_state from pg_stat_replication; client_addr | client_hostname | client_port | state | sent_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | replay_lag | flush_lag | sync_state -------------+-----------------+-------------+-----------+-----------+-----------+-----------+------------+-----------+------------+-----------+------------ 192.168.2.3 | | 60000 | streaming | 0/8017A28 | 0/8017A28 | 0/8017A28 | 0/8017A28 | | | | async (1 row)
3.using linux command check the wal sender process whether started or not:
[postgres@mster ~]$ ps -ef|grep postgres root 82472 82437 0 08:34 pts/1 00:00:00 su postgres postgres 82473 82472 0 08:34 pts/1 00:00:00 bash root 94549 94494 0 20:36 pts/3 00:00:00 su postgres postgres 94550 94549 0 20:36 pts/3 00:00:00 bash postgres 94582 1 0 20:36 pts/3 00:00:00 /usr/pgsql-10/bin/postgres -D /var/lib/pgsql/10/data postgres 94584 94582 0 20:36 ? 00:00:00 postgres: logger process postgres 94586 94582 0 20:36 ? 00:00:00 postgres: checkpointer process postgres 94587 94582 0 20:36 ? 00:00:00 postgres: writer process postgres 94588 94582 0 20:36 ? 00:00:00 postgres: wal writer process postgres 94589 94582 0 20:36 ? 00:00:00 postgres: autovacuum launcher process postgres 94591 94582 0 20:36 ? 00:00:00 postgres: stats collector process postgres 94592 94582 0 20:36 ? 00:00:00 postgres: bgworker: logical replication launcher postgres 94741 94582 0 20:43 ? 00:00:00 postgres: wal sender process postgres 192.168.2.3(60000) streaming 0/8017B08 postgres 95178 94550 0 21:08 pts/3 00:00:00 ps -ef postgres 95179 94550 0 21:08 pts/3 00:00:00 grep --color=auto postgres
–you can calculate using this linux command how many wal sender is replicated to slave.
$ ps -ef|grep sender postgres 7585 3383 0 15:59 ? 00:00:00 postgres: wal sender process postgres 192.168.2.2(42586) streaming 0/18017CD8 postgres 7598 6564 0 15:59 pts/2 00:00:00 grep --color=auto sender
AT SLAVE SERVER:
1.check the previously created table whether is replicated or not
postgres=# \dt List of relations Schema | Name | Type | Owner --------+---------------+-------+---------- public | qrtransaction | table | postgres public | t | table | postgres (2 rows)
yes! it is successfully replicated the table.
–check the slave mode using following function whether is recovery mode or not
postgres=# select pg_is_in_recovery(); pg_is_in_recovery ------------------- t (1 row)
–and try to insert the valuse on slave server, slave is a read only mode it will not permitted write transaction.
postgres=# insert into t values(1); ERROR: cannot execute INSERT in a read-only transaction
2. using pg_stat_replication view:
postgres=# select status,receive_start_lsn,received_lsn,last_msg_send_time,latest_end_lsn,latest_end_time,conninfo from pg_stat_wal_receiver ; status | receive_start_lsn | received_lsn | last_msg_send_time | latest_end_lsn | latest_end_time | conninfo -----------+-------------------+--------------+----------------------------------+----------------+----------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- streaming | 0/6000000 | 0/8017B08 | 2018-05-26 21:26:34.577733+05:30 | 0/8017B08 | 2018-05-26 21:06:32.309548+05:30 | user=postgres password=******** dbname=replication host=192.168.2.2 port=5432 fallback_application_name=walreceiver sslmode=prefer sslcompression=1 krbsrvname=postgres target_session_attrs=any (1 row)
3.using linux command you can monitor wether wal receiver is started or not.this linux command will be helpful for finding out the postgres data directory as well as postgres utility path means bin path
[postgres@slave data]$ ps -ef|grep postgres root 3971 3573 0 10:27 pts/0 00:00:00 su postgres postgres 3972 3971 0 10:27 pts/0 00:00:00 bash root 45410 45321 0 20:36 pts/1 00:00:00 su postgres postgres 45411 45410 0 20:36 pts/1 00:00:00 bash postgres 45606 1 0 20:43 pts/1 00:00:00 /usr/pgsql-10/bin/postgres -D /var/lib/pgsql/10/data postgres 45607 45606 0 20:43 ? 00:00:00 postgres: logger process postgres 45608 45606 0 20:43 ? 00:00:00 postgres: startup process recovering 000000010000000000000008 postgres 45610 45606 0 20:43 ? 00:00:00 postgres: checkpointer process postgres 45611 45606 0 20:43 ? 00:00:00 postgres: writer process postgres 45612 45606 0 20:43 ? 00:00:06 postgres: wal receiver process streaming 0/8017B08 postgres 45613 45606 0 20:43 ? 00:00:00 postgres: stats collector process postgres 45995 45411 0 21:16 pts/1 00:00:00 ps -ef postgres 45996 45411 0 21:16 pts/1 00:00:00 grep --color=auto postgres
4.this command will be helpful for how many wal segment get postgres receiver
[postgres@slave data]$ ps -ef|grep receiver; postgres 45612 45606 0 20:43 ? 00:00:06 postgres: wal receiver process streaming 0/8017B08 postgres 46018 45411 0 21:18 pts/1 00:00:00 grep --color=auto receiver
5.If the slave is up in hot standby mode, you can tell the time in seconds the delay of transactions applied on the slave with this query:
postgres=# select now() - pg_last_xact_replay_timestamp() AS replication_delay; replication_delay ------------------- 00:18:35.207663 (1 row)
Note: above timing is noted for last 18 minutes slave not get any transaction