Configuring pgpool-II + PostgreSQL + Streaming replication + Hot standby in AWS environment
Hello world!
This article is based on material from habrahabr.ru.
I decided to describe the basics of configuring a failover (HA) cluster for PostgreSQL in an IaaS environment such as Amazon AWS.
A lot has already been written about setting up this stack since native replication appeared in PostgreSQL 9.x, so I won't dwell on configuring PostgreSQL and pgpool themselves — everything there is standard. The config snippets below are not intended for mindless copy-paste; you will have to open your own configs and edit the required parameters in any case. I don't want to encourage configuration by copy-paste.
Terminology:
Streaming replication means that the Postgres standby nodes pull updates from the master; no additional WAL archiving is required.
Hot standby lets the slave nodes serve READ requests for load balancing, in contrast to warm standby, where the slave does not serve client requests and only continuously pulls the current state of the master database. Replication itself can be synchronous or asynchronous (sync or async).
My example uses a master-slave pair of database servers, so synchronous replication is not an option: with synchronous replication a master that cannot ship a record to its slave simply will not complete the request — it will hang waiting for the slave, and the whole point of the scheme would be lost.
Initial conditions:
The resulting configuration should have no single point of failure. At the pgpool level we use its native watchdog functionality to detect the failure of a node and to move the IP that the client application connects to. At the PostgreSQL level we use streaming replication + hot standby. If the master goes down, its role is quickly taken over by the slave, and pgpool performs the promotion by creating a trigger file in $PGDATA. If the slave goes down, we bring it back to life by hand: automatic manipulation of the databases after a failure rarely ends well, and a failed node deserves attention in any case. In all of the failure scenarios described, the client application must keep working with minimal downtime.
Four virtual machines were created in the AWS (Amazon Web Services) cloud: pgpool-1 (IP: 10.0.3.11), pgpool-2 (IP: 10.0.3.12), db-1 (IP: 10.0.3.21), db-2 (IP: 10.0.3.22). The machines are created inside a VPC so that private addresses can be assigned and persist across instance reboots. When creating the instances I used an Ubuntu AMI (ami-8e987ef9). But if you are free to choose any image, take Amazon Linux — the reason will become clear further in the text.
Configurations:
1. db-1 — the master at stack startup
...
wal_level = hot_standby
#The following parameter is ignored on the master server, but since master and slave may swap roles, we include it in the master's config as well
hot_standby = on
...
Adjust the values of checkpoint_segments, max_wal_senders and wal_keep_segments to your needs.
To get replication going you can leave them at their defaults and tune them later, after reading wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
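For orientation, a minimal sketch of what those lines might look like on the master — the numbers are illustrative placeholders, not tuned recommendations:
max_wal_senders = 3       # at least one per standby, plus a little headroom
wal_keep_segments = 64    # how many 16 MB WAL segments to keep for a lagging standby
checkpoint_segments = 16  # 9.3-era setting; later releases replace it with max_wal_size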
In pg_hba.conf configure access for replication. You could use password authentication, but since everything here runs inside a VPC with no direct access from outside, I simply used trust:
host replication postgres 10.0.3.0/24 trust
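If you prefer password authentication over trust, the equivalent line would look something like this (with a password set for the replication role):
host replication postgres 10.0.3.0/24 md5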
2. db-2 — the slave at stack startup
...
wal_level = hot_standby
hot_standby = on
...
The standby behaviour of the slave is driven by the recovery.conf file, which must live in $PGDATA.
In my case that is the directory /var/lib/postgresql/9.3/main
recovery.conf:
standby_mode = 'on'
primary_conninfo = 'host=10.0.3.21 port=5432 user=postgres'
trigger_file = '/var/lib/postgresql/9.3/main/postgresql.trigger'
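Incidentally, this is also how you would promote the slave by hand: creating the file named in trigger_file switches it to READ-WRITE, e.g.:
sudo -u postgres touch /var/lib/postgresql/9.3/main/postgresql.trigger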
Don't forget the access settings in pg_hba.conf.
3. pgpool-1
pgpool.conf:
...
backend_hostname0 = '10.0.3.21'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/var/lib/postgresql/9.3/main'
backend_flag0 = 'ALLOW_TO_FAILOVER'
backend_hostname1 = '10.0.3.22'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/var/lib/postgresql/9.3/main'
backend_flag1 = 'ALLOW_TO_FAILOVER'
#because we have Hot Standby:
load_balance_mode = on
#We use streaming replication
master_slave_mode = on
master_slave_sub_mode = 'stream'
sr_check_period = 10
sr_check_user = 'postgres'
sr_check_password = ''
delay_threshold = 100
#Makes sense when there is more than one slave
follow_master_command = ''
#What to do when a node goes down:
failover_command = '/etc/pgpool2/failover.sh %d %P %H %R'
#What to do when a detached node comes back:
failback_command = ''
#Run failover when pgpool cannot connect to a backend
fail_over_on_backend_error = on
search_primary_node_timeout = 10
#Which user performs online recovery
recovery_user = 'postgres'
recovery_password = ''
#During the first stage pgpool keeps accepting client connections and queries; during the second it does not.
#The script being run must reside in $PGDATA
recovery_1st_stage_command = 'basebackup.sh'
recovery_2nd_stage_command = ''
#How many seconds to wait for node recovery
recovery_timeout = 90
#Use the watchdog to monitor the state of pgpool
use_watchdog = on
wd_hostname = 'pgpool-1'
wd_port = 9000
wd_authkey = ''
#The virtual address that the client application connects to:
delegate_IP = '10.0.3.10'
#Where the interface control scripts live:
ifconfig_path = '/opt/AWS'
#Command executed to assign the virtual IP to the node
if_up_cmd = 'if.sh up $_IP_$'
#Command executed to remove the virtual IP from the node
if_down_cmd = 'if.sh down $_IP_$'
#pgpool runs arping when it takes over the virtual IP, to refresh the ARP caches quickly
arping_cmd = ''
#How pgpool checks the liveness of the neighbouring pgpool nodes:
#via heartbeat, or via queries sent to the database
wd_lifecheck_method = 'heartbeat'
#Interval in seconds between checks:
wd_interval = 4
#Which port the heartbeat uses:
wd_heartbeat_port = 9694
#Interval between keepalive packets
wd_heartbeat_keepalive = 2
#Time after which a silent node is considered dead:
wd_heartbeat_deadtime = 15
#Address of the neighbouring node:
heartbeat_destination0 = 'pgpool-2'
heartbeat_destination_port0 = 9694
#You can specify which interface the heartbeat should use
heartbeat_device0 = ''
#Parameters of the other pgpool nodes:
other_pgpool_hostname0 = 'pgpool-2'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
...
4. pgpool-2
The config is identical to pgpool-1, except that the descriptions of the neighbouring nodes swap pgpool-2 for pgpool-1.
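Roughly, the lines that change on pgpool-2 are the ones naming the local node and its neighbour (a sketch based on the pgpool-1 config above):
wd_hostname = 'pgpool-2'
heartbeat_destination0 = 'pgpool-1'
other_pgpool_hostname0 = 'pgpool-1'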
In /etc/hosts on both nodes, bind the names to their IPs:
10.0.3.11 pgpool-1
10.0.3.12 pgpool-2
10.0.3.21 db-1
10.0.3.22 db-2
5. Integrating pgpool with our databases
On pgpool-1 and pgpool-2, create the script referenced by the failover_command parameter, which runs when a node goes down (in my setup an automatic action is taken only when the master node fails). All it really does is check whether the failed node was the master, and if so it creates the trigger file on the slave, which automatically switches the slave to READ-WRITE, i.e. makes it the master:
#!/bin/bash -x
FALLING_NODE=$1 # %d
OLDPRIMARY_NODE=$2 # %P
NEW_PRIMARY=$3 # %H
PGDATA=$4 # %R
KEY_PATH="/var/lib/postgresql/.ssh/id_rsa"
if [ $FALLING_NODE = $OLDPRIMARY_NODE ]; then
    if [ $UID -eq 0 ]
    then
        sudo -u postgres ssh -T -i $KEY_PATH postgres@$NEW_PRIMARY "touch $PGDATA/postgresql.trigger"
        exit 0;
    fi
    ssh -T -i $KEY_PATH postgres@$NEW_PRIMARY "touch $PGDATA/postgresql.trigger"
fi;
exit 0;
On db-1 and db-2, install the pgpool schema:
sudo -u postgres psql -f /usr/share/postgresql/9.3/extension/pgpool-recovery.sql template1
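To check that the functions landed where pgpool expects them, you can list them — a quick sanity check, assuming they were installed into template1 as above:
sudo -u postgres psql template1 -c '\df pgpool_*'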
In $PGDATA, create the pgpool_remote_start script, which starts PostgreSQL on the neighbouring node [depending on the PostgreSQL version you use, you may need a second parameter, passed by pgpool, that specifies the server's $PGDATA directory]:
#! /bin/sh
DEST=$1
PGCTL=/usr/bin/pg_ctlcluster
KEY_PATH="/var/lib/postgresql/.ssh/id_rsa"
ssh -T -i $KEY_PATH postgres@$DEST "$PGCTL 9.3 main stop --force; $PGCTL 9.3 main restart"
And the script referenced by the recovery_1st_stage_command option, which synchronizes a new slave with the current master (it also lives in $PGDATA next to pgpool_remote_start):
#! /bin/sh
datadir=$1
desthost=$2
destdir=$3
KEY_PATH="/var/lib/postgresql/.ssh/id_rsa"
PGCTL="/usr/bin/pg_ctlcluster"
ssh -T -i $KEY_PATH postgres@$desthost "$PGCTL 9.3 main stop --force"
psql -c "SELECT pg_start_backup('Streaming Replication', true)" postgres
rsync -C -a -c --delete -e ssh --exclude postgresql.conf --exclude postmaster.pid \
--exclude recovery.conf --exclude recovery.done \
--exclude pg_xlog $datadir/ $desthost:$destdir/
ssh -T -i $KEY_PATH postgres@$desthost "cp $destdir/../recovery.done $destdir/recovery.conf; rm $destdir/postgresql.trigger"
psql -c "SELECT pg_stop_backup()" postgres
6. To be able to execute the commands related to restarting services and taking backups on the remote hosts, passwordless access must be configured. You can use host-based ssh authentication, or generate ssh keys and add them to authorized_keys (a minimal key-based sketch follows the list below):
pgpool-1 -> db-1, db-2
pgpool-2 -> db-1,db-2
db-1 -> db-2
db-2 -> db-1
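A minimal key-based variant, run as the postgres user on each source host (the key path matches KEY_PATH in the scripts above; if ssh-copy-id is not available, append the public key to ~postgres/.ssh/authorized_keys on the target by hand):
sudo -u postgres ssh-keygen -t rsa -N "" -f /var/lib/postgresql/.ssh/id_rsa
sudo -u postgres ssh-copy-id -i /var/lib/postgresql/.ssh/id_rsa.pub postgres@db-1
sudo -u postgres ssh-copy-id -i /var/lib/postgresql/.ssh/id_rsa.pub postgres@db-2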
Once access is configured, it must be checked as the user that will run the scripts from the pgpool process — in my case postgres.
From the pgpool-1 host this is done with:
sudo -u postgres ssh -i /path_to_key -T postgres@db-1 id
Repeat for all the required hosts, checking access and populating ssh's known_hosts file.
At this stage the stack of four hosts is assembled, and it could already be started for normal operation — were it not running in the AWS environment.
- Start the master host (db-1).
- Synchronize the slave with it (PostgreSQL is not started on the slave yet); from the directory above $PGDATA execute the following (recovery.done is a pre-created template for recovery.conf that points at the master's IP):
mv main main.bak && sudo -u postgres pg_basebackup -h 10.0.3.21 -D /var/lib/postgresql/9.3/main -U postgres -v -P && cp recovery.done main/recovery.conf && chown postgres:postgres main/recovery.conf
- Start postgresql on the slave:
sudo service postgresql restart
- Check the replication status with "select * from pg_stat_replication"; you should see something like this:
application_name | walreceiver
client_addr | 10.0.3.22
state | streaming
sent_location | 1/2A000848
write_location | 1/2A000848
flush_location | 1/2A000848
replay_location | 1/2A000848
sync_priority | 0
sync_state | async
Or simply check that the wal sender/receiver processes appear in the process lists on db-1 and db-2.
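From the shell on the master the same check can be run non-interactively; -x gives the expanded one-column-per-line output shown above:
sudo -u postgres psql -x -c "select * from pg_stat_replication"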
Once the first pgpool starts, it takes over the virtual address from the delegate_IP parameter by executing the command from the if_up_cmd parameter (by default this is plain ifconfig).
The log entry:
wd_escalation: escalated to master pgpool successfully
After the second pgpool starts, its logs show that the neighbouring pgpool node was detected successfully and the pair is working:
find_primary_node: primary node id is 0
The pool status can be viewed with one of the pcp_* commands — pcp_pool_status, pcp_node_info — or with queries sent through pgpool: "show pool_nodes;", "show pool_pools;".
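For example, a check through the virtual address from a client host might look like this (assuming pgpool listens on its usual port 9999, the same port configured for the neighbouring node above):
psql -h 10.0.3.10 -p 9999 -U postgres -c "show pool_nodes;"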
All of these commands and the node statuses in the pool are described very well in the pgpool documentation — www.pgpool.net/docs/pgpool-II-3.3.2/doc/pgpool-en.html
If the first pgpool goes down, the second one pulls delegate_IP over to itself using the command from the if_up_cmd parameter.
If backend db-1 or db-2 fails, the command from the failover_command parameter is executed.
To return a backend to the pool, use the pcp_attach_node and pcp_recovery_node commands.
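In pgpool-II 3.3 the pcp utilities take a timeout, the pcp host and port, the pcp credentials and a node id. The port, user and password come from pcp.conf, which is not shown in this article, so treat the values below as placeholders:
pcp_attach_node 10 pgpool-1 9898 pcp_user pcp_password 1
pcp_recovery_node 10 pgpool-1 9898 pcp_user pcp_password 1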
AWS specifics:
What changes in the AWS environment? Everything stays the same, except that the IP address must be assigned to the network interface in advance via "Assign a secondary private address" in the Network Interfaces menu. On Amazon Linux, which I mentioned earlier, the secondary IP can be assigned to the instance automatically and can then move between pgpool-1 and pgpool-2 as needed (I have not tested Amazon Linux myself; it would be very interesting to see how smoothly that works). With an image that is not adapted for AWS, I have to use additional scripts from the ec2-api-tools package.
The latest version of the api-tools can be downloaded from Amazon.
ec2-api-tools requires Java; install it with apt-get install default-jre.
After unpacking the api-tools archive, the scripts for managing AWS will be in its bin/ directory.
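Something along these lines, using the download URL referenced in the if.sh script below and the /opt layout that script assumes (EC2_HOME=/opt):
wget http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
unzip ec2-api-tools.zip
cp -r ec2-api-tools-*/* /opt/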
But to work with the Amazon API, authorization keys are required.
The process of obtaining the authentication data is described in detail by Amazon here — docs.aws.amazon.com/IAM/latest/UserGuide/ManagingCredentials.html and here — docs.aws.amazon.com/IAM/latest/UserGuide/ManagingUserCerts.html
From the first link we learn how to create a user with keys and assign it to the required group via the IAM menu (console.aws.amazon.com/iam/home?#users; the key is shown in clear text to its owner only once, at creation time — if you don't manage to write it down, you will have to generate another one). On the first attempt to create a key, Amazon strongly recommends creating a separate user in the IAM menu for this purpose rather than creating a key under the administrative AWS account.
openssl genrsa 1024 > private-key.pem
openssl pkcs8 -topk8 -nocrypt -inform PEM -in private-key.pem -out private-key-in-PCKS8-format.pem
openssl req -new -x509 -nodes -sha1 -days 3650 -key private-key.pem -outform PEM > certificate.pem
The contents of certificate.pem are uploaded to AWS IAM. Certificates can be managed via "Security Credentials" in IAM:

After all these manipulations we have:
certificate.pem for the EC2_CERT parameter
private-key-in-PCKS8-format.pem for EC2_PRIVATE_KEY, plus AWS_ACCESS_KEY and AWS_SECRET_KEY.
You can start using ec2-api-tools.
For this I wrote the if.sh script, which moves delegate_IP between the instances for pgpool. The script takes as parameters the action to perform on the interface (up/down) and the desired IP address. It then derives the subnet from the given IP (I use /24 and simply cut off the last octet; if your mask is not /24, the script will need adjusting). The subnet is needed because the instances have two interfaces — a primary one and a management one — and we have to figure out which of them the secondary IP should be attached to.
Script if.sh:
#!/bin/sh
if test $# -eq 0
then
echo "This scripts adds and removes ip to subinterfaces and to AWS VPC configuration."
echo "Don't forget to set variables inside this script."
echo
echo "Usage: $0 [up|down] IP_ADDRESS"
echo
exit 1
fi
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
#CORRECT VALUES MUST BE SET PRIOR TO RUN THIS SCRIPT
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
#If Proxy is used:
export EC2_JVM_ARGS='-Dhttp.proxySet=true -Dhttps.proxySet=true -Dhttp.proxyHost=x.x.x.x -Dhttp.proxyPort=3128 -Dhttps.proxyHost=x.x.x.x -Dhttps.proxyPort=3128'
#Path to unpacked ec2-api from http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
export EC2_HOME=/opt
#Path to java
export JAVA_HOME=/usr
#Path to generated private key & cert (READ http://docs.aws.amazon.com/IAM/latest/UserGuide/ManagingUserCerts.html)
export EC2_PRIVATE_KEY=/opt/private-key-in-PCKS8-format.pem
export EC2_CERT=/opt/certificate.pem
#User access &secret key (READ http://docs.aws.amazon.com/IAM/latest/UserGuide/ManagingCredentials.html)
export AWS_ACCESS_KEY=YOUR_ACCESS_KEY
export AWS_SECRET_KEY=YOUR_SECRET_KEY
#Region for this EC2 instance
REGION=YOUR_REGION
#!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
AWS_PATH=$EC2_HOME
VIP=$2
subnet () {
SUB=`echo $VIP | awk '{ split($0,a,"."); print a[1]"."a[2]"."a[3]"."; }'`
SUBNET=`$AWS_PATH/bin/ec2-describe-subnets --region $REGION | grep -F $SUB | awk '{print $2}'`
echo Subnet id: $SUBNET
if [ -z "$SUBNET" ]; then
echo "Wrong subnet!"
exit 1;
fi
Instance_ID=`/usr/bin/curl --silent http://169.254.169.254/latest/meta-data/instance-id`
echo Instance_ID=$Instance_ID
ENI_ID=`$AWS_PATH/bin/ec2-describe-instances $Instance_ID --region $REGION | cut -f 2,3 | grep $SUBNET | awk '{print $1}'`
echo ENI_ID=$ENI_ID
}
if_up () {
subnet
/usr/bin/sudo /sbin/ifconfig eth1:0 inet $VIP netmask 255.255.255.255
$AWS_PATH/bin/ec2-assign-private-ip-addresses -n $ENI_ID --secondary-private-ip-address $VIP --allow-reassignment --region $REGION
}
if_down (){
subnet
/usr/bin/sudo /sbin/ifconfig eth1:0 down
$AWS_PATH/bin/ec2-unassign-private-ip-addresses -n $ENI_ID --secondary-private-ip-address $VIP --region $REGION
}
case $1 in
    [uU][pP])
        if_up
        ;;
    [dD][oO][wW][nN])
        if_down
        ;;
    *)
        echo "Up/Down command missing!"
        exit 1
        ;;
esac
/usr/sbin/service networking restart > /dev/null 2>&1
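A quick manual test of the script, matching how pgpool invokes it via if_up_cmd/if_down_cmd (the path corresponds to the ifconfig_path setting above):
/opt/AWS/if.sh up 10.0.3.10
/opt/AWS/if.sh down 10.0.3.10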
To manage an actual Elastic IP you can use ec2-associate-address and ec2-unassign-private-ip-addresses.
These, essentially, are the moves I had to make to get pgpool running on a non-Amazon-Linux instance to play nicely with AWS.