On Github keithf4 / omniti_pg_training_centos
Presented by Keith Fiske / @keithf4
Database Administrator @ OmniTI
Follow along at http://slides.keithf4.com/pgtraining_centos
sudo yum install http://yum.postgresql.org/9.3/redhat/rhel-6-x86_64/pgdg-centos93-9.3-1.noarch.rpmInstall PostgreSQL 9.3 & Contrib modules
sudo yum install postgresql93-server postgresql93-contribNo clusters automatically created and no initial automatic startup (Redhat policy)
sudo service postgresql-9.3 initdb sudo chkconfig postgresql-9.3 on sudo service postgresql-9.3 start
sudo su - postgresLog into database (default postgres role & database already created)
psql
CREATE ROLE training WITH LOGIN SUPERUSER; CREATE DATABASE training;Should now be able to log into PostgreSQL as your training user Create replication role for later
CREATE ROLE replication WITH LOGIN REPLICATION;
\password replicationRecommend keeping second terminal open logged in as postgres user. Avoids having to exit and re-sudo throughout training. Demo trying to connect before the training user & database are created.
training=# show data_directory;Open pg_hba.conf file located in data directory (must be postgres user) All authentication into cluster is controlled by this file Evaluated in order top-down Default only allows local system users with matching role names (peer) Avoid "trust" if at all possible
host replication replication 127.0.0.1/32 md5Only requires a reload to put new HBA settings into place
training=# select pg_reload_conf(); OR sudo service postgresql-9.3 reloadCheck log file for SIGHUP
sudo service postgresql-9.3 restartIf restart unsuccessful, check postgresql log files
host replication replication 127.0.0.1/32 md5Use pg_basebackup to do a backup of master (as postgres system user)
pg_basebackup -h 127.0.0.1 -U replication -D /var/lib/pgsql/9.3/slave -R -Xs -P -vEdit slave postgresql.conf
port = 5488 (doesn’t matter when using CentOS init.d) hot_standby = onEdit slave recovery.conf
standby_mode = 'on' primary_conninfo = 'host=127.0.0.1 port=5432 user=replication password=password' trigger_file = '/var/lib/pgsql/9.3/slave/finish.recovery' recovery_target_timeline='latest'Mention delay settings and what they’re for near the hot_standby setting
sudo cp /etc/init.d/postgresql-9.3 /etc/init.d/postgresql-9.3-5488Edit config file to change (as root):
PGPORT=5488 PGDATA=/var/lib/pgsql/9.3/slave PGLOG=/var/lib/pgsql/9.3/pgstartup-5488.log PGUPLOG=/var/lib/pgsql/$PGMAJORVERSION/pgupgrade-5488.logRegister service, start up slave & ensure it connects to master (as training user)
sudo chkconfig postgresql-9.3-5488 on sudo service postgresql-9.3-5488 start
started streaming WAL from primary at 0/3000000 on timeline 1Check from master
select * from pg_stat_replication;Create an object and make sure it appears on slave (as training user)
training=# CREATE TABLE testing ( id serial primary key, stuff text, inserted_at timestamptz default now());Connect to slave
psql -p 5488
mkdir /var/lib/pgsql/9.3/omnislave chmod 700 /var/lib/pgsql/9.3/omnislaveEdit master postgresql.conf to set archive_command /w omnipitr-archive (ensure it's a single line) Clone of git repo already in /opt/omnipitr - https://github.com/omniti-labs/omnipitr
archive_command = '/opt/omnipitr/bin/omnipitr-archive -D /var/lib/pgsql/9.3/data -dl gzip=/var/lib/pgsql/9.3/backups/walarchive -l /var/lib/pgsql/9.3/data/pg_log/omnipitr-archive-^Y-^m-^d.log -v -s /var/lib/pgsql/9.3/data/pg_log/state "%p"'Create a destination folder for the WAL archives
mkdir /var/lib/pgsql/9.3/backups/walarchiveCreate state folder for omnipitr (used when there are multiple destinations)
mkdir /var/lib/pgsql/9.3/data/pg_log/stateCentOS does not install HiRes perl module by default (used by omnipitr)
sudo yum install perl-Time-HiRes.x86_64Reload master to activate new archive_command Check omnipitr & postgres logs. Check WAL folder for WAL files
ssh-keygen -t rsa (accept all defaults, no password) cd ~/.ssh cp id_rsa.pub authorized_keys ssh localhost (yes) exit
/opt/omnipitr/bin/omnipitr-synch -o localhost:/var/lib/pgsql/9.3/omnislave -l /var/lib/pgsql/9.3/data/pg_log/omnipitr-synch-^Y-^m-^d.log -vConfirm with all caps “YES” Edit postgresql.conf
cp /var/lib/pgsql/9.3/slave/recovery.conf /var/lib/pgsql/9.3/omnislave/
restore_command = '/opt/omnipitr/bin/omnipitr-restore -D /var/lib/pgsql/9.3/omnislave -s gzip=/var/lib/pgsql/9.3/backups/walarchive -l /var/lib/pgsql/9.3/omnislave/pg_log/omnipitr-restore-^Y-^m-^d.log -f /var/lib/pgsql/9.3/omnislave/finish.recovery -v -sr %f %p'
archive_cleanup_command = '/opt/omnipitr/bin/omnipitr-cleanup -a gzip=/var/lib/pgsql/9.3/backups/walarchive -l /var/lib/pgsql/9.3/omnislave/pg_log/omnipitr-cleanup-^Y-^m-^d.log -v %r'Also update trigger file to point to a different location (just in case)
trigger_file = '/var/lib/pgsql/9.3/omnislave/finish.recovery'
/usr/pgsql-9.3/bin/pg_ctl start -D /var/lib/pgsql/9.3/omnislaveWill not autostart on boot Cannot connect to this slave (hot_standby = off in postgresql.conf) Should see two entries for streaming slaves when checking from master now After a few minutes, should start seeing archived WAL files cleaned up automatically Check logs
/usr/pgsql-9.3/bin/pg_ctl restart -m fast -D /var/lib/pgsql/9.3/omnislave/archive_timeout of master determines how far behind slave could possibly be --recovery-delay (-w) option to omnipitr-restore can force slave to lag behind a specified period of time
sudo yum install policycoreutils-python-2.0.83-19.39.el6.x86_64 sudo semanage fcontext -a -t bin_t "/usr/bin/rsync" sudo restorecon -R -v /usr/bin/rsync sudo setsebool -P postgresql_can_rsync onJust disable SELinux if you don’t need it (did this in VM already)
pg_dumpall -g -f globals.sql -vpg_dump
pg_dump -Fc training -f training.pgr -v
archive_command = '/opt/omnipitr/bin/omnipitr-archive -D /var/lib/pgsql/9.3/data -dl gzip=/var/lib/pgsql/9.3/backups/walarchive -l /var/lib/pgsql/9.3/data/pg_log/omnipitr-archive-^Y-^m-^d.log -v -s /var/lib/pgsql/9.3/data/pg_log/state -db /var/lib/pgsql/9.3/backups/dst-backup "%p"'
sudo yum install perl-Digest-SHA.x86_64Run omnipitr-backup-master (as postgres system user)
/opt/omnipitr/bin/omnipitr-backup-master -D /var/lib/pgsql/9.3/data -x /var/lib/pgsql/9.3/backups/dst-backup -dl gzip=/var/lib/pgsql/9.3/backups -l /var/lib/pgsql/9.3/data/pg_log/omnipitr-backup-master-^Y-^m-^d.log -v -dg SHA-512
mkdir mydb /usr/pgsql-9.3/bin/initdb -D /home/training/mydb Edit port in postgresql.conf - port = 5444 /usr/pgsql-9.3/bin/pg_ctl start -D /home/training/mydbAutomatically made role “training” psql used to restore pg_dumpall or plaintext version of pg_dump
psql -p 5444 -d postgres -f globals.sql -apg_restore can be used to restore binary dump of pg_dump
psql -p 5444 postgres postgres=# create database new_training; pg_restore -p 5444 -d new_training -v training.pgrSmaller backups & more flexible restore Must recreate all indexes & constraints Stop this instance
/usr/pgsql-9.3/bin/pg_ctl stop -m fast -D /home/training/mydb
mkdir /var/lib/pgsql/9.3/omnirestoreUntar the backup to it
cd /var/lib/pgsql/9.3/omnirestore tar xvzpf ../backups/localhost.localdomain-data-YYYY-MM-DD.tar.gz tar xvzpf ../backups/localhost.localdomain-xlog-YYYY-MM-DD.tar.gz chmod 700 data cd dataChange the port in postgresql.conf
port = 5444Start it up
/usr/pgsql-9.3/bin/pg_ctl start -D /var/lib/pgsql/9.3/omnirestore/dataMuch faster disaster recovery Less flexible restore options (all or nothing)
SELECT extract(epoch from now() - pg_last_xact_replay_timestamp()) AS slave_lagMonitor from master if streaming (bytes behind)
SELECT client_hostname , client_addr , pg_xlog_location_diff(pg_stat_replication.sent_location, pg_stat_replication.replay_location) AS byte_lag FROM pg_stat_replication;Use omnipitr-monitor to monitor a non-streaming, non-hot-standby slave
/opt/omnipitr/bin/omnipitr-monitor -c last-restore-age -l /var/lib/pgsql/9.3/omnislave/pg_log/omnipitr-restore-^Y-^m-^d.log -s /var/log/omnipitr/Backup monitor if using omnipitr-backup
/opt/omnipitr/bin/omnipitr-monitor -c last-backup-age -l /var/lib/pgsql/9.3/data/pg_log/omnipitr-backup-master-^Y-^m-^d.log -s /var/log/omnipitr/
create or replace function pg_stat_activity() returns setof pg_catalog.pg_stat_activity as $$begin return query(select * from pg_catalog.pg_stat_activity); end$$ language plpgsql security definer; revoke all on function pg_stat_activity() from public;
select max_connections , total_used , coalesce(round(100*(total_used/max_connections)),0) as pct_used , idle , idle_in_txn , ((total_used - idle) - idle_in_txn) as active , (select coalesce(extract(epoch from (max(now() - query_start))),0) from pg_stat_activity() where state = 'idle in transaction') as max_idle_in_txn , (select coalesce(extract(epoch from (max(now() - query_start))),0) from pg_stat_activity() where state <> 'idle') as max_txn_time from (select count(*) as total_used , coalesce(sum(case when state = 'idle' then 1 else 0 end),0) as idle , coalesce(sum(case when state = 'idle in transaction' then 1 else 0 end),0) as idle_in_txn from pg_stat_activity()) x join (select setting::float AS max_connections FROM pg_settings WHERE name = 'max_connections') xx ON (true);
select sum(n_tup_ins) as inserts , sum(n_tup_upd) as updates , sum(n_tup_del) as deletes , sum(idx_scan) as index_scans , sum(seq_scan) as seq_scans , sum(idx_tup_fetch) as index_tup_fetch , sum(seq_tup_read) as seq_tup_read , coalesce(extract(epoch from now() - max(last_autovacuum))) as max_last_autovacuum , coalesce(extract(epoch from now() - max(last_vacuum))) as max_last_vacuum , coalesce(extract(epoch from now() - max(last_autoanalyze))) as max_last_autoanalyze , coalesce(extract(epoch from now() - max(last_analyze))) as max_last_analyze from pg_stat_all_tables;
SELECT datname , txns as "age/txn" , wrap , ROUND(100*(txns/wrap::float)) as wrap_perc , freez , ROUND(100*(txns/freez::float)) AS perc FROM ( SELECT foo.wrap::int , foo.freez::int , age(datfrozenxid) AS txns , datname FROM pg_database d JOIN (SELECT 2000000000 AS wrap , setting AS freez FROM pg_settings WHERE name = 'autovacuum_freeze_max_age') AS foo ON (true) WHERE d.datallowconn) AS foo2 ORDER BY 6 DESC, 1 ASC;
select txid_snapshot_xmax(txid_current_snapshot()) as xmax , commits , rollback from ( select sum(xact_commit) as commits , sum(xact_rollback) as rollback from pg_stat_database) as x;
select datname as name , pg_database_size(datname) as size from pg_catalog.pg_database;
SELECT sum(pg_relation_size(c.oid)) as size_table FROM pg_class c, pg_namespace n WHERE relkind not in ( 'i' , 'c','v') AND n.oid = c.relnamespace
SELECT sum(pg_relation_size(c.oid)) as size_index FROM pg_class c, pg_namespace n WHERE (relkind = 'i') AND n.oid = c.relnamespace;
select count(*) as total , count(nullif(granted,true)) as waiting , count(nullif(mode ilike '%exclusive%',false)) as exclusive from pg_locks;
create or replace function pg_ls_dir_sec_def(text) returns setof text as $$begin return query(select pg_catalog.pg_ls_dir('pg_xlog')); end$$ language plpgsql security definer; revoke all on function pg_ls_dir_sec_def(text) from public;
SELECT count(*) as total , (2 + current_setting('checkpoint_completion_target')::float) * current_setting('checkpoint_segments')::int + 1 + current_setting('wal_keep_segments')::int as expected , ((count(*) / ((2 + current_setting('checkpoint_completion_target')::float) * current_setting('checkpoint_segments')::int + 1 + current_setting('wal_keep_segments')::int)) * 100)::int as pct FROM pg_ls_dir_sec_def('pg_xlog') WHERE pg_ls_dir_sec_def ~ '^[0-9A-F]{24}$';
mkdir /var/lib/pgsql/9.3/slave/pg_log/state
archive_command = '/opt/omnipitr/bin/omnipitr-archive -D /var/lib/pgsql/9.3/slave -dl gzip=/var/lib/pgsql/9.3/backups/walarchive -l /var/lib/pgsql/9.3/slave/pg_log/omnipitr-archive-^Y-^m-^d.log -v -s /var/lib/pgsql/9.3/slave/pg_log/state -db /var/lib/pgsql/9.3/backups/dst-backup "%p"'Reload slave config to turn on new archive command
psql -p 5488 select pg_reload_conf();Check that it's in place
show archive_command;Archive command is only called if postgresql is in master mode
sudo service postgresql-9.3 stopCheck slave logs. Should see connection errors Touch failover file for first slave we made
touch /var/lib/pgsql/9.3/slave/finish.recoveryEdit recovery.conf in omnislave to point to new master
primary_conninfo = 'user=replication password=password host=127.0.0.1 port=5488'Restart omnislave
/usr/pgsql-9.3/bin/pg_ctl restart -D /var/lib/pgsql/9.3/omnislaveCheck omnislave log to see if it reconnected to new primary