意见箱
恒创运营部门将仔细参阅您的意见和建议,必要时将通过预留邮箱与您保持联络。感谢您的支持!
意见/建议
提交建议

MySQL MGR 某个节点宕机之后开启组复制

来源:恒创科技 编辑:恒创科技编辑部
2023-12-20 13:27:59


环境如下

主机名

IP


MySQL MGR 某个节点宕机之后开启组复制

系统

数据库版本

db01

192.168.179.102

CentOS7.4

5.7.30

db02

192.168.179.103

CentOS7.4

5.7.30

db03

192.168.179.104

CentOS7.4

5.7.30

MGR集群中某个节点宕机(db02,db03宕机)

主节点db01上查看,可以看到db01,db03状态是不存在的,在db01节点中可以看到成员被移除了

mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 18f5da07-a096-11ea-8c70-000c290e1abf | db01 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+


2020-05-31T02:17:34.210629Z 0 [Warning] Plugin group_replication reported: 'Member with address db02:3306 is reachable again.'
2020-05-31T02:17:34.210705Z 0 [Warning] Plugin group_replication reported: 'The member has resumed contact with a majority of the members in the group. Regular operation is restored and transactions are unblocked.'
2020-05-31T02:17:39.209784Z 0 [Warning] Plugin group_replication reported: 'Member with address db03:3306 is reachable again.'
2020-05-31T02:17:42.855196Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:42.855378Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:42.855470Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:42.855702Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:42.855770Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:43.397373Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:43.397855Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:43.398351Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:43.398705Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:46.072741Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:46.073304Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:46.073611Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:46.073827Z 0 [Note] Plugin group_replication reported: '[GCS] Removing members that have failed while processing new view.'
2020-05-31T02:17:47.220541Z 0 [Warning] Plugin group_replication reported: 'Members removed from the group: db02:3306, db03:3306'
2020-05-31T02:17:47.221214Z 0 [Note] Plugin group_replication reported: 'Group membership changed to db01:3306 on view 15907170974285400:4.'
2020-05-31T02:17:48.217697Z 0 [Note] Plugin group_replication reported: 'Group membership changed to db01:3306 on view 15907170974285400:5.'

在db02,db03上面可以看到状态是error的

#db02状态error和错误日志,下面可以看到由于网络问题导致节点被驱除
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 7d3e894a-a0a9-11ea-82f8-000c29a7fff7 | db03 | 3306 | ERROR |
+---------------------------+--------------------------------------+-------------+-------------+--------------+

2020-05-31T02:17:50.950971Z 0 [ERROR] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2020-05-31T02:17:50.954613Z 0 [Note] Plugin group_replication reported: 'Going to wait for view modification'
2020-05-31T02:17:50.950971Z 0 [ERROR] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2020-05-31T02:17:53.940774Z 0 [Warning] Plugin group_replication reported: 'Skipping this round of stable set computation as certification garbage collection process is still running.'
2020-05-31T02:18:00.384705Z 0 [ERROR] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2020-05-31T02:18:00.385155Z 0 [Note] Plugin group_replication reported: 'Going to wait for view modification'

#db03和db02一样
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 7d3e894a-a0a9-11ea-82f8-000c29a7fff7 | db03 | 3306 | ERROR |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
2020-05-31T02:17:43.394771Z 0 [ERROR] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2020-05-31T02:17:43.405616Z 0 [Note] Plugin group_replication reported: 'Going to wait for view modification'
2020-05-31T02:17:46.070755Z 0 [ERROR] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2020-05-31T02:17:46.071941Z 0 [Note] Plugin group_replication reported: 'Going to wait for view modification'
2020-05-31T02:17:46.072140Z 0 [Warning] Plugin group_replication reported: 'Skipping this round of stable set computation as certification garbage collection process is still running.'
2020-05-31T02:17:46.072887Z 0 [ERROR] Plugin group_replication reported: 'Member was expelled from the group due to network failures, changing member status to ERROR.'
2020-05-31T02:17:46.073299Z 0 [Note] Plugin group_replication reported: 'Going to wait for view modification'

如果是单主模式,某个节点宕机,如果该节点是可读可写节点,那么可写权限会漂移到其他节点,如果该节点只是个可读节点,那么宕机也不影响集群运行;
多主模式下某个节点宕机更不会影响MGR集群的运行;
所以某个节点宕机或者多个节点宕机,但集群中还存在可读可写的节点,那么重启节点实例后,直接开启MGR加入复制集群即可。

我们这是单主集群,由于主节点存在可读可写,所以只需要将从节点先stop group_replication;然后再start group_replication;即可

如果你先不关闭组复制,会报出如下错误:

mysql> start group_replication;
ERROR 3093 (HY000): The START GROUP_REPLICATION command failed since the group is already running.

由于db02,db03在尚未正确加入集群,在db01主库插入一条数据,看看是否在db02,db03在之后加入集群之后数据可以同步

mysql> insert into tbs02 values('3','test');
Query OK, 1 row affected (0.01 sec)

在db02上操作如下:

mysql> stop group_replication;
Query OK, 0 rows affected (6.03 sec)
mysql> start group_replication;
Query OK, 0 rows affected, 1 warning (5.05 sec)


#db02上的日志如下:
2020-05-31T02:22:56.014718Z 29 [Warning] Plugin group_replication reported: ''group_replication_allow_local_disjoint_gtids_join' is deprecated and will be removed in a future release.'
2020-05-31T02:22:56.015880Z 29 [Note] Plugin group_replication reported: 'Group communication SSL configuration: group_replication_ssl_mode: "DISABLED"'
2020-05-31T02:22:56.016130Z 29 [Note] Plugin group_replication reported: '[GCS] Added automatically IP ranges 127.0.0.1/8,192.168.179.103/24 to the whitelist'
2020-05-31T02:22:56.016383Z 29 [Note] Plugin group_replication reported: '[GCS] Translated 'db02' to 192.168.179.103'
2020-05-31T02:22:56.016793Z 29 [Warning] Plugin group_replication reported: '[GCS] Automatically adding IPv4 localhost address to the whitelist. It is mandatory that it is added.'
2020-05-31T02:22:56.016864Z 29 [Note] Plugin group_replication reported: '[GCS] SSL was not enabled'
2020-05-31T02:22:56.016896Z 29 [Note] Plugin group_replication reported: 'Initialized group communication with configuration: group_replication_group_name: "5a421130-2674-11ea-bbce-00505639ee45"; group_replication_local_address: "db02:33061"; group_replication_group_seeds: "db01:33061,db02:33061,db03:33061"; group_replication_bootstrap_group: false; group_replication_poll_spin_loops: 0; group_replication_compression_threshold: 1000000; group_replication_ip_whitelist: "AUTOMATIC"'
2020-05-31T02:22:56.016936Z 29 [Note] Plugin group_replication reported: '[GCS] Configured number of attempts to join: 0'
2020-05-31T02:22:56.016948Z 29 [Note] Plugin group_replication reported: '[GCS] Configured time between attempts to join: 5 seconds'
2020-05-31T02:22:56.016985Z 29 [Note] Plugin group_replication reported: 'Member configuration: member_id: 102; member_uuid: "dfe895a8-a0df-11ea-bb7d-000c296190c1"; single-primary mode: "true"; group_replication_auto_increment_increment: 7; '
2020-05-31T02:22:56.021378Z 33 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 2334, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2020-05-31T02:22:56.034915Z 36 [Note] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log './db02-relay-bin-group_replication_applier.000002' position: 2599
2020-05-31T02:22:56.035837Z 29 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2020-05-31T02:22:56.035896Z 29 [Note] Plugin group_replication reported: 'auto_increment_increment is set to 7'
2020-05-31T02:22:56.035910Z 29 [Note] Plugin group_replication reported: 'auto_increment_offset is set to 102'
2020-05-31T02:22:56.037586Z 0 [Note] Plugin group_replication reported: 'XCom protocol version: 3'
2020-05-31T02:22:56.037631Z 0 [Note] Plugin group_replication reported: 'XCom initialized and ready to accept incoming connections on port 33061'
2020-05-31T02:23:00.068204Z 0 [ERROR] Plugin group_replication reported: 'This member has more executed transactions than those present in the group. Local transactions: 5a421130-2674-11ea-bbce-00505639ee45:1-11,
dfe895a8-a0df-11ea-bb7d-000c296190c1:1-2 > Group transactions: 5a421130-2674-11ea-bbce-00505639ee45:1-12'
2020-05-31T02:23:00.068280Z 0 [Warning] Plugin group_replication reported: 'The member contains transactions not present in the group. It is only allowed to join due to group_replication_allow_local_disjoint_gtids_join option'
2020-05-31T02:23:00.068980Z 29 [Note] Plugin group_replication reported: 'This server is working as secondary member with primary member address db01:3306.'
2020-05-31T02:23:00.070720Z 39 [Note] Plugin group_replication reported: 'Establishing group recovery connection with a possible donor. Attempt 1/10'
2020-05-31T02:23:00.071511Z 0 [Note] Plugin group_replication reported: 'Group membership changed to db01:3306, db02:3306 on view 15907170974285400:6.'
2020-05-31T02:23:00.089567Z 39 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='db01', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2020-05-31T02:23:00.105240Z 39 [Note] Plugin group_replication reported: 'Establishing connection to a group replication recovery donor 18f5da07-a096-11ea-8c70-000c290e1abf at db01 port: 3306.'
2020-05-31T02:23:00.106148Z 41 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2020-05-31T02:23:00.112140Z 42 [Note] Slave SQL thread for channel 'group_replication_recovery' initialized, starting replication in log 'FIRST' at position 0, relay log './db02-relay-bin-group_replication_recovery.000001' position: 4
2020-05-31T02:23:00.114535Z 41 [Note] Slave I/O thread for channel 'group_replication_recovery': connected to master 'repl@db01:3306',replication started in log 'FIRST' at position 4
2020-05-31T02:23:00.135846Z 39 [Note] Plugin group_replication reported: 'Terminating existing group replication donor connection and purging the corresponding logs.'
2020-05-31T02:23:00.137323Z 42 [Note] Slave SQL thread for channel 'group_replication_recovery' exiting, replication stopped in log 'mysql-bin.000001' at position 3454
2020-05-31T02:23:00.140746Z 41 [Note] Slave I/O thread killed while reading event for channel 'group_replication_recovery'
2020-05-31T02:23:00.140790Z 41 [Note] Slave I/O thread exiting for channel 'group_replication_recovery', read up to log 'mysql-bin.000001', position 3454
2020-05-31T02:23:00.162635Z 39 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='db01', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 0, master_log_file='', master_log_pos= 4, master_bind=''.
2020-05-31T02:23:00.179468Z 0 [Note] Plugin group_replication reported: 'This server was declared online within the replication group'


#主库日志信息,可以看到db02已经成功加入到集群里面了
2020-05-31T02:23:00.068064Z 0 [Note] Plugin group_replication reported: 'Members joined the group: db02:3306'
2020-05-31T02:23:00.068315Z 0 [Note] Plugin group_replication reported: 'Group membership changed to db01:3306, db02:3306 on view 15907170974285400:6.'
2020-05-31T02:23:00.122246Z 30 [Note] Start binlog_dump to master_thread_id(30) slave_server(102), pos(, 4)
2020-05-31T02:23:00.179611Z 0 [Note] Plugin group_replication reported: 'The member with address db02:3306 was declared online within the replication group'
2020-05-31T02:24:00.123889Z 30 [Note] Aborted connection 30 to db: 'unconnected' user: 'repl' host: 'db02' (failed on flush_net())


#查看集群状态,可以看到db02加入了
mysql> melect * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 18f5da07-a096-11ea-8c70-000c290e1abf | db01 | 3306 | ONLINE |
| group_replication_applier | dfe895a8-a0df-11ea-bb7d-000c296190c1 | db02 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+

db03如法炮制参照db02:

mysql> stop group_replication;
Query OK, 0 rows affected (6.03 sec)
mysql> start group_replication;
Query OK, 0 rows affected, 1 warning (3.07 sec)

#db01日志
2020-05-31T02:23:00.068064Z 0 [Note] Plugin group_replication reported: 'Members joined the group: db02:3306'
2020-05-31T02:23:00.068315Z 0 [Note] Plugin group_replication reported: 'Group membership changed to db01:3306, db02:3306 on view 15907170974285400:6.'
2020-05-31T02:23:00.122246Z 30 [Note] Start binlog_dump to master_thread_id(30) slave_server(102), pos(, 4)
2020-05-31T02:23:00.179611Z 0 [Note] Plugin group_replication reported: 'The member with address db02:3306 was declared online within the replication group'
2020-05-31T02:24:00.123889Z 30 [Note] Aborted connection 30 to db: 'unconnected' user: 'repl' host: 'db02' (failed on flush_net())
2020-05-31T02:39:28.051102Z 0 [Note] Plugin group_replication reported: 'Members joined the group: db03:3306'
2020-05-31T02:39:28.051830Z 0 [Note] Plugin group_replication reported: 'Group membership changed to db01:3306, db03:3306, db02:3306 on view 15907170974285400:7.'
2020-05-31T02:39:28.274123Z 0 [Note] Plugin group_replication reported: 'The member with address db03:3306 was declared online within the replication group'

#db02日志
2020-05-31T02:39:28.051213Z 0 [Note] Plugin group_replication reported: 'Members joined the group: db03:3306'
2020-05-31T02:39:28.051941Z 0 [Note] Plugin group_replication reported: 'Group membership changed to db01:3306, db03:3306, db02:3306 on view 15907170974285400:7.'
2020-05-31T02:39:28.157716Z 47 [Note] Start binlog_dump to master_thread_id(47) slave_server(103), pos(, 4)
2020-05-31T02:39:28.274040Z 0 [Note] Plugin group_replication reported: 'The member with address db03:3306 was declared online within the replication group'


#通过上面日志可以看到在加入db03节点时候,db01,db02都可以得知该节点的加入,再去看看全部都online了
mysql> select * from performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 18f5da07-a096-11ea-8c70-000c290e1abf | db01 | 3306 | ONLINE |
| group_replication_applier | 7d3e894a-a0a9-11ea-82f8-000c29a7fff7 | db03 | 3306 | ONLINE |
| group_replication_applier | dfe895a8-a0df-11ea-bb7d-000c296190c1 | db02 | 3306 | ONLINE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.01 sec)

最后验证在正确加入集群db02,db03上是否可以查询到db01在db02,db03尚未正确加入集群插入的数据

#db02上可以查询到
mysql> select * from tbs02;
+----+------+
| id | name |
+----+------+
| 1 | fxkt |
| 3 | test |
+----+------+
2 rows in set (0.00 sec)

#db03上同样可以查询到
mysql> select * from tbs02;
+----+------+
| id | name |
+----+------+
| 1 | fxkt |
| 3 | test |
+----+------+
2 rows in set (0.00 sec)

#结论,在db02,db03宕机重启重新加入集群会将主库db01上修改的数据进行同步

上一篇: MySQL 组复制(MGR) 以及常用复制技术介绍 下一篇: #yyds干货盘点#mysql删除重复记录