primary:10.189.102.118
standby:10.189.100.195
1. 配置ssh互信机制
- 在primary主库执行
$ ssh-keygen -t rsa$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys$ ssh-copy-id postgres@10.189.100.195
- 在standby备库执行
$ ssh-keygen -t rsa$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys$ ssh-copy-id postgres@10.189.102.118
2. 在standby服务器安装postgres数据库,不需要初始化.
安装过程详见:
3. 在primary服务器创建具有REPLICATION权限的复制用户
postgres=# CREATE ROLE repl WITH REPLICATION PASSWORD 'repl' LOGIN;
4. 允许复制用户远程连接到primary服务器
$ grep "^host" pg_hba.confhost all all 127.0.0.1/32 trusthost replication repl 0.0.0.0/0 md5 host all all ::1/128 trust
5. 在primary服务器设置流复制相关的参数
$ mkdir /usr/local/pgsql/arch $ egrep "archive_mode|max_wal_senders|wal_keep_segments|archive_command|wal_level|hot_standby" postgresql.confal_level = hot_standby # minimal, archive, hot_standby, or logicalarchive_mode = on # enables archiving; off, on, or alwaysarchive_command = 'test ! -f /usr/local/pgsql/arch/%f && cp %p /usr/local/pgsql/arch/%f && rsync -a %p postgres@10.189.100.195:/usr/local/pgsql/arch/%f' max_wal_senders = 5 # max number of walsender processeswal_keep_segments = 30 # in logfile segments, 16MB each; 0 disableshot_standby = on # "on" allows queries during recovery#hot_standby_feedback = off # send info from standby to prevent
6. 重新启动primary服务器进程
$ pg_ctl stop -m fast$ pg_ctl start
7. 对primary服务器做一个全备并传输到standby服务器
- 方法一,在primary服务器通过pg_(start|stop)_backup函数进行备份
postgres=# SELECT pg_start_backup('label', true); pg_start_backup ----------------- 7/E6000060(1 row)$ rsync -az --progress ${PGDATA} postgres@10.189.100.195:/usr/local/pgsql/ --exclude postmaster.pidpostgres=# SELECT pg_stop_backup();NOTICE: pg_stop_backup complete, all required WAL segments have been archived pg_stop_backup ---------------- 7/E60005C8(1 row)
- 方法二,在standby服务器通过pg_basebackup命令进行备份,要求standby的PGDATA目录为空
$ pg_basebackup --host=10.189.102.118 --username=repl --port=5432 --label=backup --verbose --progress --pgdata=/usr/local/pgsql/data --checkpoint=fast --format=p --xlog-method=streamPassword: transaction log start point: 7/EA000028 on timeline 1pg_basebackup: starting background WAL receiver65933562/65933562 kB (100%), 1/1 tablespace transaction log end point: 7/EA000830pg_basebackup: waiting for background process to finish streaming ...pg_basebackup: base backup completed
8. 设置standby数据库复制相关参数,使得standby失效转移后可以作为主库工作
$ mkdir /usr/local/pgsql/arch$ grep "hot_standby" postgresql.confhot_standby = on # "on" allows queries during recovery#hot_standby_feedback = off # send info from standby to prevent
9. 在standby文件创建恢复文件
$ cat recovery.conf restore_command = 'cp /usr/local/pgsql/arch/%f "%p"'standby_mode = 'on'primary_conninfo = 'user=repl password=repl host=10.189.102.118 port=5432 sslmode=disable sslcompression=1'archive_cleanup_command = 'pg_archivecleanup -d /usr/local/pgsql/arch %r >> /usr/local/pgsql/arch/archive_cleanup.log'trigger_file = '/usr/local/pgsql/data/trigger_active_standby'
10. 启动standby数据库进程,自动启动流复制
$ pg_ctl start -wwaiting for server to start....LOG: could not create IPv6 socket: Address family not supported by protocolLOG: redirecting log output to logging collector processHINT: Future log output will appear in directory "pg_log". doneserver started
11. 检查primary和standby数据库的状态
- 通过函数和系统表查看
edbstore=# select * from pg_stat_replication; #在primary主库查看-[ RECORD 1 ]----+------------------------------pid | 15013usesysid | 19206usename | replapplication_name | walreceiverclient_addr | 10.189.100.195client_hostname | client_port | 56072backend_start | 2017-06-13 08:10:35.400508-07backend_xmin | state | streamingsent_location | 7/EC01A588write_location | 7/EC01A588flush_location | 7/EC01A588replay_location | 7/EC01A588sync_priority | 0sync_state | asyncedbstore=# SELECT pg_current_xlog_location(); #在primary主库查看 pg_current_xlog_location -------------------------- 7/EC01A588(1 row)postgres=# select pg_last_xlog_receive_location(),pg_last_xlog_replay_location(),pg_last_xact_replay_timestamp(); #在standby备库查看 pg_last_xlog_receive_location | pg_last_xlog_replay_location | pg_last_xact_replay_timestamp -------------------------------+------------------------------+------------------------------- 7/EC01A588 | 7/EC01A588 | 2017-06-13 08:25:20.281568-07(1 row)
- 通过进程查看
$ ps -ef | grep sender | grep -v grep #在primary库查看postgres 15013 24883 0 08:10 ? 00:00:00 postgres: wal sender process repl 10.189.100.195(56072) streaming 7/EC01A668$ ps -ef | grep receiver | grep -v grep #在standby库查看postgres 12857 12843 0 08:10 ? 00:00:00 postgres: wal receiver process streaming 7/EC01A668
- 查看备库落后主库多少字节的WAl日志
postgres=# select pg_xlog_location_diff(pg_current_xlog_location(),replay_location) from pg_stat_replication ; #在primary库查看 pg_xlog_location_diff ----------------------- 0(1 row)
- 把主库WAL日志位置转换成WAL文件名称和偏移量
postgres=# select pg_current_xlog_location(); pg_current_xlog_location -------------------------- 7/EC021790(1 row)postgres=# select * from pg_xlogfile_name_offset('7/EC021790'); file_name | file_offset --------------------------+------------- 000000010000000700000076 | 137104(1 row)
- 查看主库备库的状态
postgres=# select pg_is_in_recovery(); #在primary主库查看 pg_is_in_recovery ------------------- f(1 row)postgres=# select pg_is_in_recovery(); #在standby库查看 pg_is_in_recovery ------------------- t(1 row)