MMM实现MySQL高可用详解

hejunbinfendou 2018-01-16

一、MMM简介

MMM(Master-Master replication manager for MySQL)是一套支持双主故障切换和双主日常管理的脚本程序。MMM使用Perl语言开发,主要用来监控和管理MySQL Master-Master(双主)复制,虽然叫做双主复制,但是业务上同一时刻只允许对一个主进行写入,另一台备选主上提供部分读服务,以加速在主主切换时刻备选主的预热,可以说MMM这套脚本程序一方面实现了故障切换的功能,另一方面其内部附加的工具脚本也可以实现多个slave的read负载均衡。

MMM提供了自动和手动两种方式移除一组服务器中复制延迟较高的服务器的虚拟ip,同时它还可以备份数据,实现两节点之间的数据同步等。由于MMM无法完全的保证数据一致性,所以MMM适用于对数据的一致性要求不是很高,但是又想最大程度的保证业务可用性的场景。对于那些对数据的一致性要求很高的业务,非常不建议采用MMM这种高可用架构。

优点:

1 稳定和成熟的开源产品,经过了时间的考验 核心技术是mysql自己的技术,只是使用脚本程序来控制,所以在原理上比较容易理解,而且管理能够更智能化。
2 安装简单,配置简单,使用简单
3 功能强大 (HA,failover,tools套件,cluster模式可以一个monitor管理多个mmm组)

缺点:

1 由于架构里只有一个写入点,所以扩展性是有限的,但是对一般中型企业够用了。
   解决方案:对于大应用可以采取垂直拆分到多个mmm架构的方式,使用mmm cluster来管理。 
2 对于读写分离和读负载均衡还是要程序来开发或者使用其他工具完成。

二、架构及配置环境

MySQL-MMM架构图:

MMM实现MySQL高可用详解

mysql-mmm运行机制:

MMM实现MySQL高可用详解

 mysql-mmm安装需求

Server n+1:
       N台安装mysql的机器和1台安装mmm monitor的机器。 
2*(n+1)Ips:
      每个主机一个固定ip、一个虚拟IP(reader role),全局一个writer role IP 
Monitor User: 
      一个可以在mmm monitor机器上使用的并且拥有REPLICATION,CLIENT权限的mysql用户 
Agent User:
       一个可以在mmm agent机器上使用的并且拥有super,replication,client,process权限的mysql用户 
Replication user: 
     一个slaves主机上可以使用的并且有用replication slave权限的用户 
Tools user: 
     一个mmm tools主机可以使用的,并且有用super,replication client,reload权限的mysql用户

安装环境:

1.角色:

MMM实现MySQL高可用详解

2.虚拟ip规划:

MMM实现MySQL高可用详解

3.hosts文件配置(全部机器):

192.168.1.11 master-db1
192.168.1.12 master-db2
192.168.1.13 slave-db1
192.168.1.14 slave-db2
192.168.1.15 mmm-monitor

4.其他:

1 关闭iptables
2 同步时间
3 配置yum和epel源

三、安装MySQL并配置

1.mysql安装

master-db1

[mysqld]
server-id = 1
datadir = /Data/apps/mysql-5.6.36/data
log-bin = /Data/apps/mysql-5.6.36/data/mysql-bin
binlog_format = ROW
relay_log = /Data/apps/mysql-5.6.36/data/relay-log
auto-increment-increment = 2
auto-increment-offset = 1
sync_binlog = 1
sync_master_info = 1
sync_relay_log = 1
sync_relay_log_info = 1
max_binlog_size  = 100M
log_slave_updates   = 1
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES

master-db2

[mysqld]
server-id = 2
datadir = /Data/apps/mysql-5.6.36/data
log-bin = /Data/apps/mysql-5.6.36/data/mysql-bin
binlog_format = ROW
relay_log = /Data/apps/mysql-5.6.36/data/relay-log
auto-increment-increment = 2
auto-increment-offset = 2
sync_binlog = 1
sync_master_info = 1
sync_relay_log = 1
sync_relay_log_info = 1
max_binlog_size  = 100M
log_slave_updates   = 1
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES

slave-db1

[mysqld]
server-id = 3
datadir = /Data/apps/mysql-5.6.36/data
log-bin = /Data/apps/mysql-5.6.36/data/mysql-bin
binlog_format = ROW
relay_log = /Data/apps/mysql-5.6.36/data/relay-log
sync_binlog = 1
sync_master_info = 1
sync_relay_log = 1
sync_relay_log_info = 1
max_binlog_size  = 100M
log_slave_updates   = 1
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES

slave-db2

[mysqld]
server-id = 4
datadir = /Data/apps/mysql-5.6.36/data
log-bin = /Data/apps/mysql-5.6.36/data/mysql-bin
binlog_format = ROW
relay_log = /Data/apps/mysql-5.6.36/data/relay-log
sync_binlog = 1
sync_master_info = 1
sync_relay_log = 1
sync_relay_log_info = 1
max_binlog_size  = 100M
log_slave_updates   = 1
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES

3.重启以上四台mysql服务

service mysqld restart

4.在master-db1上创建mmm架构中需要的用户和权限

[root@master-db1 ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.6.36-log MySQL Community Server (GPL)
.......
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> GRANT REPLICATION CLIENT ON *.* TO 'mmm_monitor'@'192.168.1.%' IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.19 sec)
mysql> GRANT SUPER,REPLICATION CLIENT,PROCESS ON *.* TO 'mmm_agent'@'192.168.1.%' IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.00 sec)

mysql> GRANT REPLICATION SLAVE ON *.* TO 'replication'@'192.168.1.%' IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.01 sec)

5.查看二进制日志位置

mysql> FLUSH TABLES WITH READ LOCK;
Query OK, 0 rows affected (0.07 sec)

mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      796 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.05 sec)

6.不要关闭这个mysql进程连接,避免锁失效,我们另起一个ssh连接db1服务器,进行数据库备份:

[root@master-db1 ~]# mysqldump --all-databases > /tmp/database-backup.sql

7.回到刚才mysql进程,进行解锁:

mysql> UNLOCK TABLES;
Query OK, 0 rows affected (0.05 sec)

8.将database-backup.sql文件复制到其他db节点:

[root@master-db1 ~]# scp /tmp/database-backup.sql master-db2:/tmp
[root@master-db1 ~]# scp /tmp/database-backup.sql slave-db1:/tmp
[root@master-db1 ~]# scp /tmp/database-backup.sql slave-db2:/tmp

9.master-db,slave-db1,slave-db2三台主机导入sql文件,并刷新权限:

[root@master-db2 ~]# mysql < /tmp/database-backup.sql
[root@master-db2 ~]# mysql -e "FLUSH PRIVILEGES;"

四、设置MySQL主-从和主-主配置

1.在其他三台mysql上将master-db1设为主服务器

[root@master-db2 ~]# mysql
mysql> CHANGE MASTER TO MASTER_HOST='192.168.1.11',MASTER_USER='replication',MASTER_PASSWORD='123456',MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=796;
Query OK, 0 rows affected, 2 warnings (0.11 sec)
mysql> start slave;
Query OK, 0 rows affected (0.11 sec)


mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.11
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 796
               Relay_Log_File: relay-log.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

2.查看master-db2的master日志位置:

mysql> SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |   636231 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

3.在master-db1上操作,将master-db2设置为主:

mysql> CHANGE MASTER TO MASTER_HOST='192.168.1.12',MASTER_USER='replication',MASTER_PASSWORD='123456',MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=636231;
Query OK, 0 rows affected, 2 warnings (0.03 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.12
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 636231
               Relay_Log_File: relay-log.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

五、安装MMM

1.创建Tools user

useradd -s /sbin/nologin mmmd #所有机器

2.查看mmm版本:

yum list all|grep ^mysql-mmm
mysql-mmm.noarch                            2.2.1-2.el6                  @epel  
mysql-mmm-agent.noarch                      2.2.1-2.el6                  @epel  
mysql-mmm-monitor.noarch                    2.2.1-2.el6                  epel   
mysql-mmm-tools.noarch                      2.2.1-2.el6                  epel

3.在mmm-monitor上安装:

[root@mmm-monitor ~]# yum -y install mysql-mmm-monitor

4.在四台mysql服务器上安装:

yum -y install mysql-mmm-agent

5.编写配置文件,五台主机必须一致:

[root@mmm-monitor ~]# vim /etc/mysql-mmm/mmm_common.conf
active_master_role      writer    #积极的master角色的标示,所有的db服务器要开启read_only参数,对于writer服务器监控代理会自动将read_only属性关闭。

<host default>
    cluster_interface       eth0 #群集的网络接口
    pid_path                /var/run/mysql-mmm/mmm_agentd.pid  #pid路径
    bin_path                /usr/libexec/mysql-mmm/  #可执行文件路径
    replication_user        replication    #复制用户
    replication_password    123456      #复制用户密码
    agent_user              mmm_agent    #代理用户
    agent_password          123456      #代理用户密码
</host>

<host master-db1>   #master-db1的host名称
    ip      192.168.1.11   #master-db1的ip
    mode    master        #角色属性,master代表是主
    peer    master-db2   #与master-db1对等的服务器的host名,也就是master-db2的服务器host名
</host>

<host master-db2>   #和master-db1的概念一样
    ip      192.168.1.12
    mode    master
    peer    master-db1
</host>

<host slave-db1>     #从库的host名,如果存在多个从库可以重复一样的配置
    ip      192.168.1.13    #从的ip
    mode    slave      #slave的角色属性代表当前host是从
</host>

<host slave-db2>      #和slave-db1的概念一样
    ip      192.168.1.14
    mode    slave
</host>
<role writer>      #writer角色配置
    hosts   master-db1, master-db2   #能进行写操作的服务器的host名,如果不想切换写操作这里可以只配置master,这样也可以避免因为网络延时而进行write的切换,但是一旦master出现故障那么当前的MMM就没有writer了只有对外的read操作。
    ips     192.168.1.250    #对外提供的写操作的虚拟IP
    mode    exclusive    #exclusive代表只允许存在一个主,也就是只能提供一个写的IP
</role>

<role reader>   #read角色配置
    hosts   master-db1, master-db2, slave-db1, slave-db2    #对外提供读操作的服务器的host名,当然这里也可以把master加进来
    ips     192.168.1.251, 192.168.1.252, 192.168.1.253, 192.168.1.254    #对外提供读操作的虚拟ip,这三个ip和host不是一一对应的,并且ips也hosts的数目也可以不相同,如果这样配置的话其中一个hosts会分配两个ip
    mode    balanced  #balanced代表负载均衡
</role>

 6.复制到其他服务器上

scp /etc/mysql-mmm/mmm_common.conf 192.168.1.11:/etc/mysql-mmm/mmm_common.conf 
scp /etc/mysql-mmm/mmm_common.conf 192.168.1.12:/etc/mysql-mmm/mmm_common.conf 
scp /etc/mysql-mmm/mmm_common.conf 192.168.1.13:/etc/mysql-mmm/mmm_common.conf 
scp /etc/mysql-mmm/mmm_common.conf 192.168.1.14:/etc/mysql-mmm/mmm_common.conf

 7.在所有的MySQL上修改mmm_agent.conf,只需要修改master-db1这里,是哪台就改成哪台,这里只给出master-db1的:

vim /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf

# The 'this' variable refers to this server.  Proper operation requires 
# that 'this' server (db1 by default), as well as all other servers, have the 
# proper IP addresses set in mmm_common.conf.
this master-db1

8.配置mmm-monitor上的mmm_mon.conf:

[root@mmm-monitor ~]# vim /etc/mysql-mmm/mmm_mon.conf
include mmm_common.conf

<monitor>
    ip                  192.168.1.15
    pid_path            /var/run/mysql-mmm/mmm_mond.pid
    bin_path            /usr/libexec/mysql-mmm
    status_path         /var/lib/mysql-mmm/mmm_mond.status
    ping_ips            192.168.1.11, 192.168.1.12, 192.168.1.13, 192.168.1.14
    auto_set_online     60

    # The kill_host_bin does not exist by default, though the monitor will
    # throw a warning about it missing.  See the section 5.10 "Kill Host
    # Functionality" in the PDF documentation.
    #
    # kill_host_bin     /usr/libexec/mysql-mmm/monitor/kill_host
    #
</monitor>

<host default>
    monitor_user        mmm_monitor
    monitor_password    123456
</host>

debug 0

9.启动服务:

在mmm-monitor启动:

[root@mmm-monitor ~]# chkconfig mysql-mmm-monitor on
[root@mmm-monitor ~]# service mysql-mmm-monitor start

在所有mysql服务器上启动

chkconfig mysql-mmm-agent on
service mysql-mmm-agent start

六、高可用性测试:

服务器读写采有VIP地址进行读写,出现故障时VIP会漂移到其它节点,由其它节点提供服务。

mysql-mmm故障处理机制:

MMM实现MySQL高可用详解

 

1.首先查看整个集群的状态,可以看到整个集群状态正常

[root@mmm-monitor ~]# mmm_control show
  master-db1(192.168.1.11) master/ONLINE. Roles: reader(192.168.1.251), writer(192.168.1.250)
  master-db2(192.168.1.12) master/ONLINE. Roles: reader(192.168.1.254)
  slave-db1(192.168.1.13) slave/ONLINE. Roles: reader(192.168.1.252)
  slave-db2(192.168.1.14) slave/ONLINE. Roles: reader(192.168.1.253)

2.关闭master-db1上的mysql服务,模拟mysql宕机

[root@master-db1 ~]# service mysqld stop
Shutting down MySQL...                                     [确定]

3.mmm-monitor上查看集群状态

[root@mmm-monitor ~]# mmm_control show
  master-db1(192.168.1.11) master/HARD_OFFLINE. Roles: 
  master-db2(192.168.1.12) master/ONLINE. Roles: reader(192.168.1.254), writer(192.168.1.250)
  slave-db1(192.168.1.13) slave/ONLINE. Roles: reader(192.168.1.251), reader(192.168.1.252)
  slave-db2(192.168.1.14) slave/ONLINE. Roles: reader(192.168.1.253)

从显示结果可以看出master-db1的状态有ONLINE转换为HARD_OFFLINE,写VIP转移到了master-db2主机上。

4.查看slave-db1和slave-db2主从状态

mysql> show slave status\G   #slave-db1
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.12
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 636231
               Relay_Log_File: relay-log.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
 
 
mysql> show slave status\G   #slave-db2
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.12
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 636231
               Relay_Log_File: relay-log.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

可以看到写请求的VIP已经转移到master-db2上了,且从节点的主都指向了master-db2

5.启动master-db1的mysql服务

[root@master-db1 ~]# service mysqld start
Starting MySQL......                                       [确定]

6.再次查看集群状态(大概等待一分钟左右)

[root@mmm-monitor ~]# mmm_control show
  master-db1(192.168.1.11) master/ONLINE. Roles: reader(192.168.1.252)
  master-db2(192.168.1.12) master/ONLINE. Roles: reader(192.168.1.254), writer(192.168.1.250)
  slave-db1(192.168.1.13) slave/ONLINE. Roles: reader(192.168.1.251)
  slave-db2(192.168.1.14) slave/ONLINE. Roles: reader(192.168.1.253)

7.再次slave-db1和slave-db2主从状态

mysql> show slave status\G  #slave-db1
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.12
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 636231
               Relay_Log_File: relay-log.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
 
 
mysql> show slave status\G   #slave-db2
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.12
                  Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 636231
               Relay_Log_File: relay-log.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

可以看到主库启动不会接管主,直到现有的主再次宕机。

mysql-mmm故障处理机制中总结:

相关推荐