Friday, April 30, 2010

PostgreSQL 8.4 Servers and Warm Standby (Point-In-Time Recovery)

PostgreSQL is a great database server, but when your dataset is rather large, you want a back-up server with minimal downtime. To achieve this, we will be using the PostgreSQL built in features for PITR (Point in time recovery). This tutorial will teach you how to configure it from the basics starting from installation.

Assumptions


This tutorial assumes you have 2 PostgreSQL 8.4 servers running on a Debian / Ubuntu based Linux distribution. The database versions need to be the same. One can be your machine and other one can be your VM.

Installation


First install postgresql
$ sudo apt-get install postgresql

pgadmin3 is a excellent PostgreSQL client
$ sudo apt-get install pgadmin3

Server is running ?


Try this out:
$ sudo -u postgres psql

It should prompt
postgres=#

Type \password and give a password and type \q to quit and come out. It means its successfully running :)
postgres=# \password
Enter new password: <some password>
Enter again: <enter same password again>

postgres=# \q

if any problem persists. May be your server is not running, start it by using:
$ sudo /etc/init.d/postgresql-8.4 start

You can now login to your pgadmin3 by selecting new connection, with following credentials:



What do we intend to do




Step 1: Configure the Master PostgreSQL server


As stated earlier we will be using PITR (Point In Time Recovery) for this. This creates WAL files, which will be copied to the destination/standby database server. Each WAL file is (by default) 16MB in size.
$ mkdir /archive; chown postgres.postgres /archive
$ vim /etc/postgresql/8.4/main/postgresql.conf

archive_mode = on # for pgpoolII
archive_command = 'cp -i %p /archive%f'
archive_timeout = 300

Next, run the following commands to initiate postgres in backup mode.
$ sudo /etc/init.d/postgresql-8.4 reload;
$ sudo su postgres;
$ psql;
postgres=# SELECT pg_start_backup('backup');
postgres=# \q

Next, copy the files in /var/lib/postgresql/8.4/main to the destination/standby server. I prefer to use rsync for this.
rsync --progress -azv --delete  /var/lib/postgresql/8.4/main/  root@<destinationserver>:/var/lib/postgresql/8.4/main

When this is finished, we will need to stop postgresql's backup mode;
$ sudo su postgres;
$ psql;
postgres=# SELECT pg_stop_backup();
postgres=# \q

Destination/Standby server


Now we need to configure the destination server may be VM in this case. First, create another archive directory. Again, substitute /archive/ for your desired directory. For ease I recommend keeping these directories the same.
$ sudo mkdir /archive; sudo chown postgres.postgres /archive;

we need pg_standby. pg_standby supports creation of a "warm standby" database server
$ apt-get install postgresql-contrib-8.4

Create a symlink to pg_standby, so that its universally available.

$ sudo ln -s /usr/lib/postgresql/8.4/bin/pg_standby /usr/bin/pg_standby


Create a file names recovery here:
$ sudo vim /var/lib/postgresql/8.4/main/recovery.conf

Paste the following line into the newly created file:

restore_command = 'pg_standby -l -d -s 2 -t /tmp/pgsql.trigger /archive %f %p %r 2>>/var/log/postgresql/pg_standby.log'


Next, we will need to rsync the /archive directories across. This can be done from either the master or standby server. The following assumes the destination server:
$ sudo rsync --progress -azv --delete root@<sourceserver>:/archive  /archive

When finished, start postgresql..
$ sudo /etc/init.d/postgresql-8.4 start

When you want to switch over, all you have to do is re-rsync the /archive directory and create a trigger file to start postgresql into normal mode. Please note that from the moment you create the trigger file, new data from the master server will not be added to the standby server.
$ sudo rsync --progress -azv -e ssh --delete root@<sourceserver>:/archive  /archive
$ sudo su -c 'touch /tmp/pgsql.trigger' postgres

You may add a cron jobs to sync your archive directory between your master and slave PostgreSQL servers.

So once file is touched it runs into normal node, with minimal downtime. Hurray!