Fastest Ways To Configure Streaming replication with PostgreSQL 10
In this post, i will explain how to setup a streaming replication with PostgreSQL 10. I will explain how to install PostgreSQL 10 on your system and how to monitor the postgres replication server and how many ways is there to monitoring the streaming replication everything am mentioned here.
If you want to install the postgres server means please go through the below post, this posts will be helpful to install the postgres server and how to split (linux fdisk) the partition for data , archive , backup..etc.
Easy way To Install Postgres Database
What is the Postgresql Streaming Replication (SR) ?
Streaming Replication (SR) provides the capability to continuously ship and apply the WAL XLOG records to some number of standby servers in order to keep them current. This feature was added to PostgreSQL 9.0.
Advantages of postgresql Streaming Replication (SR):
Configuring Configuring between postgres cluster is considered to be a best strategy towards achieving high availability during disasters and provides fault tolerance against unexpected failures.
1. Switch over/Fail over after the primary fails.
2. Data loss so less and data integrity.
3. No need more down time after primary fail.
4. Synchronous mode Zero data loss.
5. Load balancing using load balancer ex.pgpool-II,pgbouncer.
6. Auto failover using repmgr.
My setup is :
- Master server ip 192.168.2.2
- Slave server ip 192.168.2.3
Securise your communications and use SSL
This step is not mandatory but recommended. If you already have a SSL certificate, skip the first step.
Generate a self signed certificate,Â
Setup SSL on PostgreSQL:
Copy your private key and your certificate in the directory of your choice. Be carefull that the postgresql user can read them (usually user postgres on Linux or _postgresql on OpenBSD)
Edit the file postgresql.conf and change these lines:
ssl = on
ssl_cert_file = '/etc/ssl/postgresql/cert/server.crt'
ssl_key_file = '/etc/ssl/postgresql/private/server.key'
Of course, change the directory by yours. If you don’t specify a directory but only the filename, PostgreSQL will search them in the PGDATA directory.
Configure the master
Create a role dedicated to the replication
postgres=# CREATE ROLE replicate WITH REPLICATION LOGIN ; CREATE ROLE postgres=# set password_encryption = 'scram-sha-256'; SET postgres=# \password replicate Enter new password: Enter it again:
ON MASTER:
on postgresql.conf:
Verify that your PostgreSQL server listen on your interface. Edit postgresql.conf and change this line
#listen_addresses = 'localhost'
by something like this
listen_addresses = '*'
#or
listen_addresses = 'xxx.xxx.xxx.xxx'
Change the parameters for the streaming replication in postgresql.conf
listen_addresses = 'localhost,192.168.2.2'
wal_level = replica # minimal, replica, or logical
archive_mode = on
archive_command = 'rsync -av %p /home/postgres/archive/%f && rsync -av %p postgres@192.168.2.3:/home/postgres/archive/%f'
max_wal_senders = 2
wal_keep_segments = 10
The archive_command will copy the wal segments on a directory that must be accessible by the standby server. In the example above, i use the rsync command to copy them directly on the standby itself.
Warning : if, like me, you use rsync, be sure to configure the ssh access by keys !!
on pg_hba.conf:
Now allow your slave(s) to connect to the master. Edit pg_hba.conf and add something like this:
hostssl replication replicate xxx.xxx.xxx.xxx/yy scram-sha-256
Replace xxx.xxx.xxx.xxx/yy by the ip of your slave or maybe by the subnet used by your slave if you want to have many.
Restart your master server
/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data/ restart
Setup the slave
Now that your master is ready, it’s time to configure the slave.
Stop postgresql on the slave
/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data/ stop
move the data directory to backup location:
mv /var/lib/pgsql/10/data/* /home/postgres/backup/
Edit your postgresql.conf and pg_hba.conf and report the changes you made on the master (like this, your slave will have the same configuration and could act as a master)
Go to your PGDATA directory and delete all the files. WARNING : if the files postgresql.conf and pg_hba.conf are in this directory, you must backup them (same for the certificate files)
Now we will copy all the data from the master with the rsync command. You must run this command as the postgresql user (postgres on Debian, _postgresql on OpenBSD for example)
Taking consistancy backup from master for salve database :
# su - postgres
$ psql -c "select pg_start_backup('initial_backup');"
$ rsync -cva --inplace /var/lib/pgsql/10/data/* postgres@192.168.2.3:/var/lib/pgsql/10/data/
$ psql -c "select pg_stop_backup();"
ON SLAVE :
Now, all your master’s data are copied on the slave.
Edit your postgresql.conf and change this line :
hot_standby = on
Now create a file recovery.conf in your PGDATA directory
vi recovery.conf
standby_mode = 'on' # to enable the standby (read-only) mode.
primary_conninfo = 'host=192.168.2.2 port=5432 user=replicate password=MySuperPassword'
# to specify a connection info to the master node.
trigger_file = '/tmp/pg_failover_trigger'
# to specify a trigger file to recognize a fail over.
restore_command = 'cp /home/postgres/archive/%f "%p"'
archive_cleanup_command = '/usr/pgsql-10/bin/pg_archivecleanup /home/postgres/archive/ %r'
Here is an explanation for each line :
hot_standby : Must be set to ON on standby/replica and has no effect on the master. However, when you setup your replication, parameters set on the master are automatically copied. This parameter is important to enable READS on slave. Otherwise, you cannot run your SELECT queries against slave.
primary_conninfo : the parameters to use to connect to the master
trigger_file : if this file exists, the server will stop the replication and act as a master
restore_command : this command is only needed if you have used the archive_command on the master
wal_level : Must be at least set to hot_standby until version 9.5 or replica in the later versions.
max_wal_senders : Must be set to 3 if you are starting with one slave. For every slave, you may add 2 wal senders.
wal_keep_segments : Set the WAL retention in pg_xlog (until PostgreSQL 9.x) and pg_wal (from PostgreSQL 10). Every WAL requires 16MB of space unless you have explicitly modified the WAL segment size. You may start with 100 or more depending on the space and the amount of WAL that could be generated during a backup.
Start the slave server:
/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data/ start
While starting the postgres slave server am getting below error
[postgres@slave root]$ /usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data/ restart
could not change directory to "/root": Permission denied
pg_ctl: PID file "/var/lib/pgsql/10/data/postmaster.pid" does not exist
Is server running?
starting server anyway
waiting for server to start....2018-05-26 20:36:34.834 IST [45453] LOG: listening on IPv6 address "::1", port 5432
2018-05-26 20:36:34.834 IST [45453] LOG: listening on IPv4 address "127.0.0.1", port 5432
2018-05-26 20:36:34.836 IST [45453] LOG: listening on IPv4 address "192.168.2.3", port 5432
2018-05-26 20:36:34.840 IST [45453] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2018-05-26 20:36:34.849 IST [45453] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
.2018-05-26 20:36:35.685 IST [45453] LOG: redirecting log output to logging collector process
2018-05-26 20:36:35.685 IST [45453] HINT: Future log output will appear in directory "log".
ON LOG FILE:
tail -f postgresql-2018-05-26_203635.log
2018-05-26 20:36:35.957 IST [45461] FATAL: could not connect to the primary server: FATAL: password authentication failed for user "replicate"
cp: cannot stat ‘/home/postgres/archive/000000010000000000000006’: No such file or directory
2018-05-26 20:36:40.964 IST [45465] FATAL: could not connect to the primary server: FATAL: password authentication failed for user "replicate"
cp: cannot stat ‘/home/postgres/archive/000000010000000000000006’: No such file or directory
2018-05-26 20:36:45.963 IST [45467] FATAL: could not connect to the primary server: FATAL: password authentication failed for user "replicate"
cp: cannot stat ‘/home/postgres/archive/000000010000000000000006’: No such file or directory
2018-05-26 20:36:50.977 IST [45471] FATAL: could not connect to the primary server: FATAL: password authentication failed for user "replicate"
2018-05-26 20:36:55.459 IST [45453] LOG: received fast shutdown request
2018-05-26 20:36:55.465 IST [45457] LOG: shutting down
2018-05-26 20:36:55.472 IST [45453] LOG: database system is shut down
Actually issue is password changed somebody for replicate user else password is incorrect this is why getting permission denied error , alter the correct password on master server by using below command
alter user replicate with password 'MySuperPassword';
Then start the slave server by using below command
/usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/10/data/ start
Your slave is ready !
if you get any error like arch missing , wal log missing …etc while starting slave server again take the incremental backup from master then try to start the server.
For this missing wal log, mismatching log file  use to take backup again using below command
$ psql -c "select pg_start_backup('initial_backup');"
$ rsync -cva --inplace --exclude=pg_hba.conf --exclude=postgresql.conf --exclude=recovery.conf /var/lib/pgsql/10/data/* postgres@192.168.2.3:/var/lib/pgsql/10/data/
$ psql -c "select pg_stop_backup();"
Then again start the slave server we had faced this error multiple times while configuring postgres streaming replication that is what mentioned those type of error.
We have configured streaming replication successfully now have to monitor our streaming replication whether is working fine or not by using below steps
MONITORING STREAMING REPLICATION:
Concept of wal receiver & wal sender :
   Wal receiver writes the WAL data sent by wal sender to WAL segments. It is the startup process on slave that replays the data written to WAL segment. And then the streaming replication begins.
At MASTER:
By using below command you can find the postgres server which one is master and which one is as salve server for example if you issue the below command means your postgres server replied as values ‘t’ means that is salve server means server is in recovery mode otherwise if you get values ‘f’ means that is master server means server is not in recovery mode .
postgres=# select pg_is_in_recovery(); Â
pg_is_in_recoveryÂ
------------------- Â
f
(1 row)
- On master create some dummy tables check the slave server whether is replicated or not in slave server successfully.
postgres=# create table t(id int);
CREATE TABLE
 2. By using pg_stat_replication views easily you can monitor the postgres streaming replication below script will be helpful to monitor the streaming replication
You can see more details by querying the master’s pg_stat_replication view.
postgres=# select client_addr,client_hostname,client_port,state,sent_lsn,write_lsn,flush_lsn,replay_lsn,write_lag,replay_lag,flush_lag,sync_state from pg_stat_replication;
 client_addr | client_hostname | client_port |  state  | sent_lsn  | write_lsn | flush_lsn | replay_lsn | write_lag | replay_lag | flush_lag | sync_state
-------------+-----------------+-------------+-----------+-----------+-----------+-----------+------------+-----------+------------+-----------+------------
 192.168.2.3 |         |    60000 | streaming | 0/8017A28 | 0/8017A28 | 0/8017A28 | 0/8017A28  |      |       |      | async
(1 row)
3. Using below linux command easily you can check the wal sender process , you can view the status Streaming Replication whether is in streaming mode or standalone mode and can check Wal sequence as well
ps -ef|grep postgres root    82472  82437  0 08:34 pts/1   00:00:00 su postgres postgres  82473  82472  0 08:34 pts/1   00:00:00 bash root    94549  94494  0 20:36 pts/3   00:00:00 su postgres postgres  94550  94549  0 20:36 pts/3   00:00:00 bash postgres  94582    1  0 20:36 pts/3   00:00:00 /usr/pgsql-10/bin/postgres -D /var/lib/pgsql/10/data postgres  94584  94582  0 20:36 ?     00:00:00 postgres: logger process postgres  94586  94582  0 20:36 ?     00:00:00 postgres: checkpointer process postgres  94587  94582  0 20:36 ?     00:00:00 postgres: writer process postgres  94588  94582  0 20:36 ?     00:00:00 postgres: wal writer process postgres  94589  94582  0 20:36 ?     00:00:00 postgres: autovacuum launcher process postgres  94591  94582  0 20:36 ?     00:00:00 postgres: stats collector process postgres  94592  94582  0 20:36 ?     00:00:00 postgres: bgworker: logical replication launcher postgres  94741  94582  0 20:43 ?     00:00:00 postgres: wal sender process postgres 192.168.2.3(60000) streaming 0/8017B08 postgres  95178  94550  0 21:08 pts/3   00:00:00 ps -ef postgres  95179  94550  0 21:08 pts/3   00:00:00 grep --color=auto postgres
By using below linux command you can see the Wal sender processor information.
$ ps -ef|grep sender postgres  7585  3383  0 15:59 ?     00:00:00 postgres: wal sender process postgres 192.168.2.2(42586) streaming 0/18017CD8 postgres  7598  6564  0 15:59 pts/2   00:00:00 grep --color=auto sender
Monitoring from Slave standby server :
1. Checking and monitoring previously created Whether is reached on slave server or not , \dt or \d public.t command will be helpful to check the table description and table name.
postgres=# \dt        List of relations  Schema |   Name    | Type  |  Owner --------+---------------+-------+----------  public | qrtransaction | table | postgres  public | t       | table | postgres (2 rows)
yes! The table t is successfully reached the standby server which was already created on master database.
You can confirm the postgres server which is master or slave by using below command.
postgres=# select pg_is_in_recovery();  pg_is_in_recovery -------------------  t (1 row)
This is the alternate method to easily find out the postgres server which is master or slave.
If you try to create a temporary table else try to install already exists temporary table means standby server will not allow to do any modifications only master can do.
From below example you can understand this concept.
postgres=# insert into t values(1); ERROR: Â cannot execute INSERT in a read-only transaction
2. Quering pg_stat_replication you can monitor the postgres replication.
postgres=# select status,receive_start_lsn,received_lsn,last_msg_send_time,latest_end_lsn,latest_end_time,conninfo from pg_stat_wal_receiver ;  status  | receive_start_lsn | received_lsn |     last_msg_send_time     | latest_end_lsn |     latest_end_time      |                              conninfo -----------+-------------------+--------------+----------------------------------+----------------+----------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  streaming | 0/6000000     | 0/8017B08   | 2018-05-26 21:26:34.577733+05:30 | 0/8017B08    | 2018-05-26 21:06:32.309548+05:30 | user=postgres password=******** dbname=replication host=192.168.2.2 port=5432 fallback_application_name=walreceiver sslmode=prefer sslcompression=1 krbsrvname=postgres target_session_attrs=any (1 row)
3. Below linux command will give the information about postgres process is running or not including this can find postgres replication lags and whether is master slave Everything can understand
For example master means wal sender process will be running ,slave server having Wal receiver.
[postgres@slave data]$ ps -ef|grep postgres root    3971  3573  0 10:27 pts/0   00:00:00 su postgres postgres  3972  3971  0 10:27 pts/0   00:00:00 bash root    45410  45321  0 20:36 pts/1   00:00:00 su postgres postgres  45411  45410  0 20:36 pts/1   00:00:00 bash postgres  45606    1  0 20:43 pts/1   00:00:00 /usr/pgsql-10/bin/postgres -D /var/lib/pgsql/10/data postgres  45607  45606  0 20:43 ?     00:00:00 postgres: logger process postgres  45608  45606  0 20:43 ?     00:00:00 postgres: startup process  recovering 000000010000000000000008 postgres  45610  45606  0 20:43 ?     00:00:00 postgres: checkpointer process postgres  45611  45606  0 20:43 ?     00:00:00 postgres: writer process postgres  45612  45606  0 20:43 ?     00:00:06 postgres: wal receiver process  streaming 0/8017B08 postgres  45613  45606  0 20:43 ?     00:00:00 postgres: stats collector process postgres  45995  45411  0 21:16 pts/1   00:00:00 ps -ef postgres  45996  45411  0 21:16 pts/1   00:00:00 grep --color=auto postgres
4. This is the alternate command to check the postgres replication lags , this command will give the information about right now what is the sequence getting from master.
[postgres@slave data]$ ps -ef|grep receiver;
postgres  45612  45606  0 20:43 ?     00:00:06 postgres: wal receiver process  streaming 0/8017B08
postgres  46018  45411  0 21:18 pts/1   00:00:00 grep --color=auto receiver
5. If the slave is up in hot standby mode, you can tell the time in seconds the delay of transactions applied on the slave with this query:
postgres=# select now() - pg_last_xact_replay_timestamp() AS replication_delay;
replication_delay
-------------------
00:18:35.207663
(1 row)
Above timing is noted for last 18 minutes slave not get any transaction means slave not received any lags.