MHA High-Availability Cluster

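Preface

1. Server clusters

Server clustering connects multiple servers so that the data and programs running in the cluster gain high availability and become easier to manage. A server cluster offers three main benefits:

Higher availability. Services and applications in the cluster keep running without interruption through hardware or software component failures and during planned maintenance.

Higher scalability. Servers can be scaled up by adding processors (up to 8 in Windows Server 2003 Enterprise Edition and up to 32 in Windows Server 2003 Datacenter Edition) and memory (up to 8 GB of RAM in Enterprise Edition and up to 64 GB in Datacenter Edition).

Higher manageability. Administrators can manage the devices and resources of the whole cluster as if they were managing a single computer.

The cluster service is one of two complementary Windows clustering technologies provided as extensions to the base Windows Server 2003 and Windows 2000 operating systems. The other is Network Load Balancing (NLB), which complements server clusters by supporting highly available and scalable clusters for front-end applications and services such as Internet or intranet sites, web-based applications, media streaming, and Microsoft Terminal Services.

This white paper covers only the architecture and features of server clusters, introducing their terminology, concepts, design goals, key components, and planned direction. The "More Information" section at the end of the white paper lists resources with further detail on server clusters and NLB.

Background. Computer clusters have been in use for more than a decade. G. Pfister, one of the earliest cluster architects, defined a cluster as "a parallel or distributed system consisting of a collection of fully interconnected computers that can be used as a single, unified computing resource".

Combining several server computers into a unified cluster lets them share the computing load without users or administrators needing to know the details. For example, if any resource in the cluster fails, whether the failed component is hardware or software, the cluster as a whole can keep serving users by using resources on the other servers in the cluster.

In other words, when a resource fails, users connected to the cluster may see a brief drop in performance but will not lose access to the service entirely. When more processing power is needed, administrators can add new resources through a rolling upgrade. During that process the cluster as a whole stays online: it remains available to users, and its performance improves once the upgrade is complete.

Windows Server 2003 Enterprise Edition and Windows Server 2003 Datacenter Edition were designed around what users and businesses require of clustering technology. The main goal was an operating system service that meets the clustering needs of most businesses and organizations, rather than only small, specific market segments. Microsoft's market research showed that as the daily operations of small and medium-sized businesses depend more and more on databases and email, their demand for highly available systems is large and still growing. Ease of installation and management is seen as the most critical requirement for organizations of this size. The same research showed that large enterprises with very demanding performance and availability requirements are also increasingly interested in Windows-based servers.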

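2. Clusters

As I understand it, a cluster is a group of independent computers interconnected by a high-speed network and managed as a single system. When a client interacts with the cluster, the cluster appears to be a single server. Cluster configurations are used to improve availability and scalability.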

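3. MHA

MHA (Master High Availability) is a relatively mature solution for MySQL high availability. It was developed by Yoshinori Matsunobu (yoshinorim) at DeNA in Japan (he later joined Facebook), and it handles failover and master promotion in a MySQL high-availability environment. When the master fails, MHA can complete the database failover automatically, typically within 10 to 30 seconds, and during the failover it preserves data consistency as far as possible.

MHA also supports online master switching: it can safely move the running master role to a new master (by promoting a slave), which takes roughly 0.5 to 2 seconds.

The software has two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can be deployed on a dedicated machine to manage multiple master-slave clusters, or on one of the slave nodes. MHA Node runs on every MySQL server. MHA Manager periodically probes the master of each cluster. When the master fails, it automatically promotes the slave with the most recent data to be the new master and then repoints all other slaves to the new master. The whole failover is transparent to applications.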

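1. Summary of MHA advantages

1) Master failover and slave promotion can be done very quickly

Automatic failover is fast.

2) Master crash does not result in data inconsistency

A master crash does not cause data inconsistency.

3) No need to modify current MySQL settings (MHA works with regular MySQL)

No major changes to the existing MySQL environment are required.

4) No need to increase lots of servers

No extra servers are needed (a single manager can manage hundreds of replication groups).

5) No performance penalty

MHA works with both semi-synchronous and asynchronous replication. To monitor MySQL it only sends a ping to the master every N seconds (3 seconds by default), so the overhead is negligible. You can think of MHA as performing the same as a plain master-slave replication setup.

6) Works with any storage engine

MHA supports any storage engine that replication supports; it is not limited to InnoDB.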

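2. How MHA works

1. Basic MHA architecture for a corporate website

[Figure: basic MHA architecture]

2. Why MHA is needed

1. A production network can run into many problems. Suppose the master server goes down: how do we use MHA to remove the failed master from the cluster and elect a standby server as the new master, so that the business is not interrupted?

2. Topology of the production network

[Figure: production network topology]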

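3. The MHA workflow

1) Save the binary logs from the crashed master.

2) Find the slave with the most recent binlog position.

3) Use the relay logs (the differential logs) of that most up-to-date slave to bring the other slaves up to date.

4) Apply the binary logs saved from the crashed master to the slave with the most recent position.

5) Promote the slave with the most recent binlog position to master.

6) Repoint the other slaves to the newly promoted master and start replication.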
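Step 2 can be illustrated with a small shell sketch. This is not how MHA implements it internally; `pick_latest_slave` is a hypothetical helper, and the input format (one line per slave with host, master binlog file, and read position) is an assumption made for the example.

```shell
#!/bin/sh
# Hypothetical helper: pick the slave whose replication coordinates are most advanced.
# stdin: one line per slave, "<host> <master_log_file> <read_master_log_pos>"
pick_latest_slave() {
    # Turn the binlog file suffix (e.g. 000001) into a number, sort by
    # (file number, position) ascending, and keep the host on the last line.
    awk '{ n = $2; sub(/^.*\./, "", n); printf "%d %d %s\n", n, $3, $1 }' |
        sort -k1,1n -k2,2n | tail -n 1 | awk '{ print $3 }'
}

# Example with made-up positions: slave1 has read further than slave2.
pick_latest_slave <<'EOF'
slave1 master-bin.000001 1897
slave2 master-bin.000001 1500
EOF
```

The coordinates would normally come from the `Master_Log_File` and `Read_Master_Log_Pos` fields of `show slave status` on each slave.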

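MHA feature overview:

1. Monitor all nodes

2. Automatic failover

This requires at least three nodes, two of which are slaves.

(1) Candidates are considered in the order they appear in the configuration file, but a node that is more than 100 MB of relay log behind the master is not chosen.

(2) If you set a weight on a node, failover always goes to that node. In multi-site, multi-datacenter deployments the weight is usually set on the local node.

(3) Select slave1 as the new master.

(4) Save the binlog of the old master.

3. Rebuild replication

(1) Remove the failed node from MHA.

Run the first phase of data compensation: fill in whatever slave2 is missing.

(2) slave1 takes over as the new master, and slave2 is pointed at the new master slave1.

slave2 change master to slave1

(3) Second phase of data compensation

Apply to the new master the saved binlog events that it is missing relative to the old master.

(4) The virtual IP moves to the new master, so the switch is transparent to applications.

(5) Notify the administrator of the failover.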

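3. Building the MHA high-availability cluster

1. Basic master-slave replication setup

1. MHA cluster topology

[Figure: MHA cluster topology]

2. Install the MySQL database (required on master, slave1, and slave2; to save time only one installation is shown)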

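//Install ntp and synchronize the clocks
[root@localhost ~]# yum -y install ntp ntpdate
//Edit the main ntp configuration file
[root@localhost ~]# vim /etc/ntp.conf
//Local clock source. The local clock driver address is always 127.127.1.0, whatever the subnet;
//ntpd rejects 127.127.73.0 with "clock type 73 invalid"
server 127.127.1.0
//Set the stratum to 8
fudge 127.127.1.0 stratum 8
//The slave servers need one more step.
//This step is essential: the slaves must synchronize their time with the master
//[root@slave1 mysql]# /usr/sbin/ntpdate 192.168.73.140
//[root@slave2 mysql]# /usr/sbin/ntpdate 192.168.73.140
[root@localhost ~]# systemctl start ntpd
//Stop the firewall and put SELinux into permissive mode
[root@localhost ~]# systemctl stop firewalld
[root@localhost ~]# setenforce 0
//Install MySQL 5.7
//Install the packages needed to build MySQL from source
[root@localhost ~]# yum -y install ncurses ncurses-devel bison cmake gcc gcc-c++
//Create the mysql service account
[root@localhost ~]# useradd -s /sbin/nologin mysql
//Extract the source tarball into /opt
[root@localhost ~]# tar -zxvf mysql-boost-5.7.20.tar.gz -C /opt
[root@localhost ~]# cd /opt/mysql-5.7.20/
//Configure the MySQL 5.7 build with cmake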
[root@localhost mysql-5.7.20]# cmake \
-DCMAKE_INSTALL_PREFIX=/usr/local/mysql \
-DMYSQL_UNIX_ADDR=/usr/local/mysql/mysql.sock \
-DSYSCONFDIR=/etc \
-DSYSTEMD_PID_DIR=/usr/local/mysql \
-DDEFAULT_CHARSET=utf8 \
-DDEFAULT_COLLATION=utf8_general_ci \
-DWITH_INNOBASE_STORAGE_ENGINE=1 \
-DWITH_ARCHIVE_STORAGE_ENGINE=1 \
-DWITH_BLACKHOLE_STORAGE_ENGINE=1 \
-DWITH_PERFSCHEMA_STORAGE_ENGINE=1 \
-DMYSQL_DATADIR=/usr/local/mysql/data \
-DWITH_BOOST=boost \
-DWITH_SYSTEMD=1
[root@localhost mysql-5.7.20]# make 
[root@localhost mysql-5.7.20]# make install
//Give the mysql user and group ownership of everything under /usr/local/mysql
[root@localhost mysql-5.7.20]# chown -R mysql:mysql /usr/local/mysql/
//Write the MySQL configuration file my.cnf
[root@localhost mysql-5.7.20]# vi /etc/my.cnf
[client]
port = 3306
default-character-set=utf8
socket = /usr/local/mysql/mysql.sock

[mysql]
port = 3306
default-character-set=utf8
socket = /usr/local/mysql/mysql.sock

[mysqld]
user = mysql
basedir = /usr/local/mysql
datadir = /usr/local/mysql/data
port = 3306
character_set_server=utf8
pid-file = /usr/local/mysql/mysqld.pid
socket = /usr/local/mysql/mysql.sock
server-id = 1

sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,PIPES_AS_CONCAT,ANSI_QUOTES

//Change the owner and group of my.cnf
[root@localhost mysql-5.7.20]# chown mysql:mysql /etc/my.cnf
//Add MySQL to the PATH environment variable
[root@localhost mysql-5.7.20]# echo 'PATH=/usr/local/mysql/bin:/usr/local/mysql/lib:$PATH' >> /etc/profile
[root@localhost mysql-5.7.20]# echo 'export PATH' >> /etc/profile
//Reload the environment variables
[root@localhost mysql-5.7.20]# source /etc/profile
//Initialize the MySQL data directory
[root@localhost mysql-5.7.20]# cd /usr/local/mysql/
[root@localhost mysql]# bin/mysqld \
--initialize-insecure \
--user=mysql \
--basedir=/usr/local/mysql \
--datadir=/usr/local/mysql/data
[root@localhost mysql]# cp usr/lib/systemd/system/mysqld.service /usr/lib/systemd/system/
//Start the service and enable it at boot
[root@localhost mysql]# systemctl start mysqld
[root@localhost mysql]# systemctl enable mysqld
Created symlink from /etc/systemd/system/multi-user.target.wants/mysqld.service to /usr/lib/systemd/system/mysqld.service.
[root@localhost mysql]# netstat -ntap | grep 3306
tcp6       0      0 :::3306                 :::*                    LISTEN      22547/mysqld 

3. Change the hostnames to tell the servers apart, and check the service status

master host

[root@localhost ~]# hostnamectl set-hostname master
[root@localhost ~]# su
[root@master ~]# systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since 六 2020-01-11 08:40:15 CST; 32min ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
 Main PID: 1331 (mysqld)
   CGroup: /system.slice/mysqld.service
           └─1331 /usr/local/mysql/bin/mysqld --daemonize --pid-file=/usr/local/mysql/mysqld.pid

1月 11 08:40:13 localhost.localdomain mysqld[1234]: 2020-01-11T00:40:13.769557Z 0 [Note] IPv6 is available.
1月 11 08:40:13 localhost.localdomain mysqld[1234]: 2020-01-11T00:40:13.769594Z 0 [Note]   - '::' resolves...:';
1月 11 08:40:13 localhost.localdomain mysqld[1234]: 2020-01-11T00:40:13.769639Z 0 [Note] Server socket cre...:'.
1月 11 08:40:14 localhost.localdomain mysqld[1234]: 2020-01-11T00:40:14.554678Z 0 [Note] Event Scheduler: ...nts
1月 11 08:40:14 localhost.localdomain mysqld[1234]: 2020-01-11T00:40:14.555387Z 0 [Note] /usr/local/mysql/...ns.
1月 11 08:40:14 localhost.localdomain mysqld[1234]: Version: '5.7.20'  socket: '/usr/local/mysql/mysql.soc...ion
1月 11 08:40:14 localhost.localdomain mysqld[1234]: 2020-01-11T00:40:14.555402Z 0 [Note] Executing 'SELECT...ck.
1月 11 08:40:14 localhost.localdomain mysqld[1234]: 2020-01-11T00:40:14.555405Z 0 [Note] Beginning of list...les
1月 11 08:40:15 localhost.localdomain mysqld[1234]: 2020-01-11T00:40:15.452007Z 0 [Note] End of list of no...les
1月 11 08:40:15 localhost.localdomain systemd[1]: Started MySQL Server.
Hint: Some lines were ellipsized, use -l to show in full.
[root@master ~]# systemctl status ntpd
● ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
   Active: active (running) since 六 2020-01-11 09:11:06 CST; 1min 41s ago
  Process: 2253 ExecStart=/usr/sbin/ntpd -u ntp:ntp $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 2256 (ntpd)
   CGroup: /system.slice/ntpd.service
           └─2256 /usr/sbin/ntpd -u ntp:ntp -g

1月 11 09:11:06 localhost.localdomain ntpd[2256]: Listening on routing socket on fd #23 for interface updates
1月 11 09:11:06 localhost.localdomain systemd[1]: Started Network Time Service.
1月 11 09:11:07 localhost.localdomain ntpd[2256]: refclock_newpeer: clock type 73 invalid
1月 11 09:11:07 localhost.localdomain ntpd[2256]: 127.127.73.0 interface 127.0.0.1 -> (none)
1月 11 09:11:07 localhost.localdomain ntpd[2256]: 0.0.0.0 c016 06 restart
1月 11 09:11:07 localhost.localdomain ntpd[2256]: 0.0.0.0 c012 02 freq_set kernel 0.000 PPM
1月 11 09:11:07 localhost.localdomain ntpd[2256]: 0.0.0.0 c011 01 freq_not_set
1月 11 09:11:13 localhost.localdomain ntpd[2256]: 0.0.0.0 c61c 0c clock_step +0.578610 s
1月 11 09:11:14 localhost.localdomain ntpd[2256]: 0.0.0.0 c614 04 freq_mode
1月 11 09:11:15 localhost.localdomain ntpd[2256]: 0.0.0.0 c618 08 no_sys_peer

slave1 host

[root@localhost ~]# hostnamectl set-hostname slave1
[root@localhost ~]# su
[root@slave1 ~]# systemctl status ntpd
● ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
   Active: active (running) since 六 2020-01-11 09:18:58 CST; 38s ago
  Process: 2352 ExecStart=/usr/sbin/ntpd -u ntp:ntp $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 2355 (ntpd)
   CGroup: /system.slice/ntpd.service
           └─2355 /usr/sbin/ntpd -u ntp:ntp -g

1月 11 09:18:58 slave1 ntpd[2355]: Listening on routing socket on fd #23 for interface updates
1月 11 09:18:58 slave1 systemd[1]: Started Network Time Service.
1月 11 09:18:58 slave1 ntpd[2355]: refclock_newpeer: clock type 73 invalid
1月 11 09:18:58 slave1 ntpd[2355]: 127.127.73.0 interface 127.0.0.1 -> (none)
1月 11 09:18:58 slave1 ntpd[2355]: 0.0.0.0 c016 06 restart
1月 11 09:18:58 slave1 ntpd[2355]: 0.0.0.0 c012 02 freq_set kernel 0.000 PPM
1月 11 09:18:58 slave1 ntpd[2355]: 0.0.0.0 c011 01 freq_not_set
1月 11 09:19:07 slave1 ntpd[2355]: 0.0.0.0 c61c 0c clock_step -0.138434 s
1月 11 09:19:07 slave1 ntpd[2355]: 0.0.0.0 c614 04 freq_mode
1月 11 09:19:08 slave1 ntpd[2355]: 0.0.0.0 c618 08 no_sys_peer
[root@slave1 ~]# systemctl status ntpdate
● ntpdate.service - Set time via NTP
   Loaded: loaded (/usr/lib/systemd/system/ntpdate.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
[root@slave1 ~]# systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since 六 2020-01-11 08:45:48 CST; 34min ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
 Main PID: 1355 (mysqld)
   CGroup: /system.slice/mysqld.service
           └─1355 /usr/local/mysql/bin/mysqld --daemonize --pid-file=/usr/local/mysql/mysqld.pid

1月 11 08:45:47 localhost.localdomain mysqld[1284]: 2020-01-11T00:45:47.549220Z 0 [Note] IPv6 is available.
1月 11 08:45:47 localhost.localdomain mysqld[1284]: 2020-01-11T00:45:47.549246Z 0 [Note]   - '::' resolves...:';
1月 11 08:45:47 localhost.localdomain mysqld[1284]: 2020-01-11T00:45:47.549278Z 0 [Note] Server socket cre...:'.
1月 11 08:45:48 localhost.localdomain mysqld[1284]: 2020-01-11T00:45:48.218099Z 0 [Note] Event Scheduler: ...nts
1月 11 08:45:48 localhost.localdomain mysqld[1284]: 2020-01-11T00:45:48.219130Z 0 [Note] /usr/local/mysql/...ns.
1月 11 08:45:48 localhost.localdomain mysqld[1284]: Version: '5.7.20'  socket: '/usr/local/mysql/mysql.soc...ion
1月 11 08:45:48 localhost.localdomain mysqld[1284]: 2020-01-11T00:45:48.219150Z 0 [Note] Executing 'SELECT...ck.
1月 11 08:45:48 localhost.localdomain mysqld[1284]: 2020-01-11T00:45:48.219154Z 0 [Note] Beginning of list...les
1月 11 08:45:48 localhost.localdomain mysqld[1284]: 2020-01-11T00:45:48.828705Z 0 [Note] End of list of no...les
1月 11 08:45:48 localhost.localdomain systemd[1]: Started MySQL Server.
Hint: Some lines were ellipsized, use -l to show in full.

slave2 host

[root@localhost ~]# hostnamectl set-hostname slave2
[root@localhost ~]# su
[root@slave2 ~]# systemctl status ntpd
● ntpd.service - Network Time Service
   Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
   Active: active (running) since 六 2020-01-11 09:31:58 CST; 1min 4s ago
  Process: 2458 ExecStart=/usr/sbin/ntpd -u ntp:ntp $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 2461 (ntpd)
   CGroup: /system.slice/ntpd.service
           └─2461 /usr/sbin/ntpd -u ntp:ntp -g

1月 11 09:31:58 localhost.localdomain ntpd[2461]: Listen normally on 6 ens33 fe80::d440:5edc:e02e:f317 UDP 123
1月 11 09:31:58 localhost.localdomain ntpd[2461]: Listening on routing socket on fd #23 for interface updates
1月 11 09:31:58 localhost.localdomain ntpd[2461]: refclock_newpeer: clock type 73 invalid
1月 11 09:31:58 localhost.localdomain ntpd[2461]: 127.127.73.0 interface 127.0.0.1 -> (none)
1月 11 09:31:58 localhost.localdomain ntpd[2461]: 0.0.0.0 c016 06 restart
1月 11 09:31:58 localhost.localdomain ntpd[2461]: 0.0.0.0 c012 02 freq_set kernel 0.000 PPM
1月 11 09:31:58 localhost.localdomain ntpd[2461]: 0.0.0.0 c011 01 freq_not_set
1月 11 09:32:05 localhost.localdomain ntpd[2461]: 0.0.0.0 c61c 0c clock_step +0.185545 s
1月 11 09:32:05 localhost.localdomain ntpd[2461]: 0.0.0.0 c614 04 freq_mode
1月 11 09:32:06 localhost.localdomain ntpd[2461]: 0.0.0.0 c618 08 no_sys_peer
[root@slave2 ~]# systemctl status ntpdate
● ntpdate.service - Set time via NTP
   Loaded: loaded (/usr/lib/systemd/system/ntpdate.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
[root@slave2 ~]# systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since 六 2020-01-11 08:47:07 CST; 46min ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
 Main PID: 1361 (mysqld)
   CGroup: /system.slice/mysqld.service
           └─1361 /usr/local/mysql/bin/mysqld --daemonize --pid-file=/usr/local/mysql/mysqld.pid

1月 11 08:47:02 localhost.localdomain mysqld[1266]: 2020-01-11T00:47:02.638376Z 0 [Note] IPv6 is available.
1月 11 08:47:02 localhost.localdomain mysqld[1266]: 2020-01-11T00:47:02.638394Z 0 [Note]   - '::' resolves...:';
1月 11 08:47:02 localhost.localdomain mysqld[1266]: 2020-01-11T00:47:02.638422Z 0 [Note] Server socket cre...:'.
1月 11 08:47:04 localhost.localdomain mysqld[1266]: 2020-01-11T00:47:04.560179Z 0 [Note] Event Scheduler: ...nts
1月 11 08:47:04 localhost.localdomain mysqld[1266]: 2020-01-11T00:47:04.575284Z 0 [Note] /usr/local/mysql/...ns.
1月 11 08:47:04 localhost.localdomain mysqld[1266]: Version: '5.7.20'  socket: '/usr/local/mysql/mysql.soc...ion
1月 11 08:47:04 localhost.localdomain mysqld[1266]: 2020-01-11T00:47:04.575312Z 0 [Note] Executing 'SELECT...ck.
1月 11 08:47:04 localhost.localdomain mysqld[1266]: 2020-01-11T00:47:04.575317Z 0 [Note] Beginning of list...les
1月 11 08:47:07 localhost.localdomain mysqld[1266]: 2020-01-11T00:47:07.205380Z 0 [Note] End of list of no...les
1月 11 08:47:07 localhost.localdomain systemd[1]: Started MySQL Server.
Hint: Some lines were ellipsized, use -l to show in full.

manager host

[root@localhost ~]# hostnamectl set-hostname manager
[root@localhost ~]# su
[root@manager ~]# 

client host

[root@localhost ~]# hostnamectl set-hostname client
[root@localhost ~]# su
[root@client ~]#

2. Configure replication between master, slave1, and slave2

1. Edit the main configuration file

master host

[root@master ~]# vim /etc/my.cnf
[mysqld]
server-id = 1
log-bin = master-bin
log-slave-updates = true
[root@master ~]# systemctl restart mysqld

slave1 host

[root@slave1 ~]# vim /etc/my.cnf
[mysqld]
server-id = 11
# MHA only promotes slaves that have binary logging enabled
log-bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index
[root@slave1 ~]# systemctl restart mysqld

slave2 host

[root@slave2 ~]# vim /etc/my.cnf
[mysqld]
server-id = 12
# MHA only promotes slaves that have binary logging enabled
log-bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index
[root@slave2 ~]# systemctl restart mysqld
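Replication only works if every server has a distinct server-id (1, 11, and 12 above). A minimal sketch of a sanity check follows; `find_duplicate_server_ids` is a hypothetical helper, and how you collect the ids (for example with `select @@server_id` on each host) is left as an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print any server-id value that appears more than once.
# Arguments: the server-id of each MySQL server in the replication group.
find_duplicate_server_ids() {
    printf '%s\n' "$@" | sort -n | uniq -d
}

# The ids used in this article are all distinct, so this prints nothing.
find_duplicate_server_ids 1 11 12
```

An empty result means the ids are unique; any printed value has to be changed in /etc/my.cnf on one of the servers.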

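2. Grant privileges on all database servers

master server

//Log in to MySQL on master (this assumes the root password has already been set to abc123)
[root@master ~]# mysql -uroot -pabc123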
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.20-log Source distribution

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

//Create the replication account that the slaves use
mysql> grant replication slave on *.* to 'myslave'@'192.168.73.%' identified by '123';
Query OK, 0 rows affected, 1 warning (0.00 sec)

//Create the mha account used by the manager
mysql> grant all privileges on *.* to 'mha'@'192.168.73.%' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.00 sec)

//Reload the privilege tables
mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)

//The MHA replication check reported that the two slaves could not connect to the master by hostname,
//so add the grants below on every database server.
//Allow mha to log in from master
mysql> grant all privileges on *.* to 'mha'@'master' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.00 sec)

//Allow mha to log in from slave1
mysql> grant all privileges on *.* to 'mha'@'slave1' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.00 sec)

//Allow mha to log in from slave2
mysql> grant all privileges on *.* to 'mha'@'slave2' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.00 sec)

//Reload the privilege tables
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

slave1 server

[root@slave1 ~]# mysql -uroot -pabc123
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.20 Source distribution

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> grant replication slave on *.* to 'myslave'@'192.168.73.%' identified by '123';
Query OK, 0 rows affected, 1 warning (0.03 sec)

mysql> grant all privileges on *.* to 'mha'@'192.168.73.%' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> grant all privileges on *.* to 'mha'@'master' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> grant all privileges on *.* to 'mha'@'slave1' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> grant all privileges on *.* to 'mha'@'slave2' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.01 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

slave2 server

[root@slave2 ~]# mysql -uroot -pabc123
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.20 Source distribution

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> grant replication slave on *.* to 'myslave'@'192.168.73.%' identified by '123';
Query OK, 0 rows affected, 1 warning (1.07 sec)

mysql> grant all privileges on *.* to 'mha'@'192.168.73.%' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> grant all privileges on *.* to 'mha'@'master' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> grant all privileges on *.* to 'mha'@'slave1' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> grant all privileges on *.* to 'mha'@'slave2' identified by 'manager';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

3. Start replication

master server

mysql> show master status;
+-------------------+----------+--------------+------------------+-------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000001 |     1897 |              |                  |                   |
+-------------------+----------+--------------+------------------+-------------------+
1 row in set (0.01 sec)
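The File and Position values above are what the slaves need in their `change master to` statement. As a rough sketch, the two hypothetical helpers below pull the coordinates out of the tabular output and format the statement; the account name and password are the ones created earlier in this article, and the parsing assumes the table layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read tabular "show master status" output on stdin
# and print "<binlog file> <position>".
master_coords() {
    awk -F'|' '$2 ~ /\.[0-9]+ *$/ {
        gsub(/ /, "", $2); gsub(/ /, "", $3); print $2, $3
    }'
}

# Hypothetical helper: $1 = master host, $2 = binlog file, $3 = binlog position.
build_change_master() {
    printf "change master to master_host='%s',master_user='myslave',master_password='123',master_log_file='%s',master_log_pos=%s;\n" "$1" "$2" "$3"
}

# Example with the values from the output above.
build_change_master 192.168.73.140 master-bin.000001 1897
```

Taking the coordinates from the live output rather than retyping them avoids pointing a slave at a stale position.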

slave1 server

mysql> change master to master_host='192.168.73.140',master_user='myslave',master_password='123',master_log_file='master-bin.000001',master_log_pos=1897;
Query OK, 0 rows affected, 2 warnings (0.11 sec)

mysql> start slave;
Query OK, 0 rows affected (0.02 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.73.140
                  Master_User: myslave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: master-bin.000001
          Read_Master_Log_Pos: 1897
               Relay_Log_File: relay-log-bin.000002
                Relay_Log_Pos: 321
        Relay_Master_Log_File: master-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1897
              Relay_Log_Space: 526
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: a101d92d-3202-11ea-b018-000c290cd2cd
             Master_Info_File: /usr/local/mysql/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

//Make the slave read-only
mysql> set global read_only=1;
Query OK, 0 rows affected (0.00 sec)

slave2 server

mysql> change master to master_host='192.168.73.140',master_user='myslave',master_password='123',master_log_file='master-bin.000001',master_log_pos=1897;
Query OK, 0 rows affected, 2 warnings (0.34 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.73.140
                  Master_User: myslave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: master-bin.000001
          Read_Master_Log_Pos: 1897
               Relay_Log_File: relay-log-bin.000002
                Relay_Log_Pos: 321
        Relay_Master_Log_File: master-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1897
              Relay_Log_Space: 526
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: a101d92d-3202-11ea-b018-000c290cd2cd
             Master_Info_File: /usr/local/mysql/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 
                Auto_Position: 0
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.01 sec)

mysql> set global read_only=1;
Query OK, 0 rows affected (0.00 sec)
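In the status output, the two fields that matter most are Slave_IO_Running and Slave_SQL_Running: both must be Yes. A small sketch of checking that from a script follows; `replication_healthy` is a hypothetical helper that reads `show slave status\G` output on standard input.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if both replication threads report "Yes".
# stdin: the output of "show slave status\G".
replication_healthy() {
    awk -F': *' '
        /^ *Slave_IO_Running:/  { io  = $2; sub(/ *$/, "", io) }
        /^ *Slave_SQL_Running:/ { sql = $2; sub(/ *$/, "", sql) }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Possible usage on a slave (prompts for the root password):
#   mysql -uroot -p -e 'show slave status\G' | replication_healthy && echo "replication OK"
```

The pattern requires a colon directly after the field name, so the Slave_SQL_Running_State line is not mistaken for Slave_SQL_Running.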

3. Verify replication

Create a database named test on master

mysql> create database test;
Query OK, 1 row affected (0.00 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.05 sec)

Check on slave1

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.11 sec)

Check on slave2

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.12 sec)

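Both slaves show the database created on the master, so replication is working.

4. Install the MHA node component

The MHA packages differ by operating system version; on CentOS 7.4 you must use version 0.57. The node component must be installed on every server, and the manager component is installed last on the MHA manager node, because the manager depends on the node component. The steps below demonstrate the node installation on master.

1. Build and install cmake with gmake (on all servers)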

tar zxvf cmake-2.8.6.tar.gz -C /opt
cd /opt/cmake-2.8.6/
./configure 
gmake
gmake install

2. Install the MHA dependencies and the node component on all servers

yum install epel-release --nogpgcheck -y
[root@mha_manager ~]# yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl-CPAN
tar zxvf mha/mha4mysql-node-0.57.tar.gz 
cd mha4mysql-node-0.57/
yum install perl-Module-Install -y
perl Makefile.PL 
//At the y/n prompt, answer y
make
make install

3. Install the manager component on the manager server

tar zxvf mha4mysql-manager-0.57.tar.gz
cd mha4mysql-manager-0.57/
perl Makefile.PL
make
make install
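After both installs it is worth confirming that the MHA tools are on the PATH. The sketch below uses a hypothetical helper, `missing_commands`, that prints every command name it cannot find; the tool names in the usage comments are the scripts shipped by the manager and node packages.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not found on the PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Possible usage on the manager:
#   missing_commands masterha_check_ssh masterha_check_repl masterha_manager masterha_check_status
# Possible usage on every MySQL server:
#   missing_commands save_binary_logs apply_diff_relay_logs purge_relay_logs
```

No output means all the listed tools were found.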

5. Configure passwordless SSH login

1. Push the manager's public key to the three MySQL servers

[root@manager ~]# ssh-keygen -t rsa
//Create the asymmetric key pair
//For passwordless login, just press Enter at all three prompts; do not set a passphrase
Enter file in which to save the key (/root/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
//Upload the public key to the node servers
[root@manager ~]# ssh-copy-id 192.168.73.140
Are you sure you want to continue connecting (yes/no)? yes
root@192.168.73.140's password: 	//enter the password
//Do the same for the other servers
[root@manager ~]# ssh-copy-id 192.168.73.141
[root@manager ~]# ssh-copy-id 192.168.73.138

2. On master, push the key to slave1 and slave2

ssh-keygen -t rsa
ssh-copy-id 192.168.73.141
ssh-copy-id 192.168.73.138

3. On slave1, push the key to master and slave2

ssh-keygen -t rsa
ssh-copy-id 192.168.73.140
ssh-copy-id 192.168.73.138

4. On slave2, push the key to master and slave1

ssh-keygen -t rsa
ssh-copy-id 192.168.73.140
ssh-copy-id 192.168.73.141
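The three per-host steps above follow one pattern: each MySQL server pushes its key to the other two. A sketch of that pattern follows; `peers_of` and `MYSQL_HOSTS` are names invented for this example, and the address list is the one used in this article.

```shell
#!/bin/sh
# The three MySQL servers used in this article: master, slave1, slave2.
MYSQL_HOSTS="192.168.73.140 192.168.73.141 192.168.73.138"

# Hypothetical helper: print every MySQL host except the one given as $1.
peers_of() {
    for h in $MYSQL_HOSTS; do
        [ "$h" = "$1" ] || printf '%s\n' "$h"
    done
}

# Example: the peers of slave1.
peers_of 192.168.73.141

# Possible usage on slave1, after ssh-keygen:
#   for h in $(peers_of 192.168.73.141); do ssh-copy-id "$h"; done
```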

6. Configure the MHA manager component

1. Copy the sample scripts to /usr/local/bin

[root@manager ~]# cp -ra /root/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
[root@manager samples]# ls -l /usr/local/bin/scripts/
-rwxr-xr-x. 1 1001 1001  3648 5月  31 2015 master_ip_failover  script that manages the VIP during automatic failover
-rwxr-xr-x. 1 1001 1001  9870 5月  31 2015 master_ip_online_change  script that manages the VIP during an online switch
-rwxr-xr-x. 1 1001 1001 11867 5月  31 2015 power_manager  script that powers off the host after a failure
-rwxr-xr-x. 1 1001 1001  1360 5月  31 2015 send_report  script that sends an alert after a failover
//Copy the script that manages the VIP during automatic failover
[root@manager ~]# cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin/

2. Edit the master_ip_failover script

[root@manager ~]# vim /usr/local/bin/master_ip_failover 
//Delete the existing content and rewrite the script; the shebang must be on the very first line, with no leading spaces
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '192.168.73.100';
my $brdc = '192.168.73.255';
my $ifdev = 'ens33';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $exit_code = 0;
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {

my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

3. Create the MHA directory and edit the configuration file

[root@manager ~]# mkdir /etc/masterha
[root@manager ~]# cp /root/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha/
[root@manager ~]# vim /etc/masterha/app1.cnf 
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=manager
ping_interval=1
remote_workdir=/tmp
repl_password=123456
repl_user=myslave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.73.141 -s 192.168.73.138
shutdown_script=""
ssh_user=root
user=mha

[server1]
hostname=192.168.73.140
port=3306

[server2]
candidate_master=1
hostname=192.168.73.141
check_repl_delay=0
port=3306

[server3]
hostname=192.168.73.138
port=3306
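
Before running the checks, it can help to confirm which hosts the manager will actually read from the file. The following is a minimal sketch, assuming the simple `key=value` layout shown above. `list_mha_hosts` is a hypothetical helper and not an MHA tool. It prints the section name, the hostname, and the `candidate_master` flag (0 when absent) for each `[serverN]` block.

```shell
#!/bin/sh
# list_mha_hosts FILE: summarize the [serverN] sections of an MHA config.
# Hypothetical helper; assumes plain key=value lines without spaces.
list_mha_hosts() {
    awk -F= '
        function emit() { if (sect != "") print sect, host, cand }
        /^\[server[0-9]+\]/ {
            emit()
            sect = substr($0, 2, length($0) - 2)   # drop the brackets
            host = ""; cand = 0
            next
        }
        /^\[/ { emit(); sect = ""; next }          # e.g. [server default]
        sect != "" && $1 == "hostname"         { host = $2 }
        sect != "" && $1 == "candidate_master" { cand = $2 }
        END { emit() }
    ' "$1"
}

# Demo against a trimmed copy of the configuration from this article.
cnf=$(mktemp)
cat > "$cnf" <<'EOF'
[server default]
user=mha
ssh_user=root

[server1]
hostname=192.168.73.140
port=3306

[server2]
candidate_master=1
hostname=192.168.73.141
check_repl_delay=0
port=3306

[server3]
hostname=192.168.73.138
port=3306
EOF
list_mha_hosts "$cnf"
rm -f "$cnf"
```

On the real manager node the call would be `list_mha_hosts /etc/masterha/app1.cnf`.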

7. Test the SSH and MySQL connections

1. Verify the SSH key pairs

[root@manager ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Fri Jan 10 01:14:10 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Fri Jan 10 01:14:10 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Fri Jan 10 01:14:10 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Fri Jan 10 01:14:10 2020 - [info] Starting SSH connection tests..
Fri Jan 10 01:14:11 2020 - [debug] 
Fri Jan 10 01:14:10 2020 - [debug]  Connecting via SSH from root@192.168.73.141(192.168.73.141:22) to root@192.168.73.138(192.168.73.138:22)..
Fri Jan 10 01:14:11 2020 - [debug]   ok.
Fri Jan 10 01:14:12 2020 - [debug] 
Fri Jan 10 01:14:10 2020 - [debug]  Connecting via SSH from root@192.168.73.138(192.168.73.138:22) to root@192.168.73.141(192.168.73.141:22)..
Fri Jan 10 01:14:11 2020 - [debug]   ok.
Fri Jan 10 01:14:12 2020 - [info] All SSH connection tests passed successfully.
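
The output above is easy to misread when many hosts are involved. As a rough sketch, the helper below condenses it into one line per tested connection and succeeds only if the final "All SSH connection tests passed successfully." line is present. `summarize_ssh_check` is a hypothetical name, and the parsing assumes the log format shown in this article.

```shell
#!/bin/sh
# summarize_ssh_check: read masterha_check_ssh output on stdin.
# Prints "source -> destination" per tested pair; exit status 0 only if
# the success line was seen. Hypothetical helper, not part of MHA.
summarize_ssh_check() {
    awk '
        /Connecting via SSH from / {
            for (i = 1; i < NF; i++) {
                if ($i == "from") src = $(i + 1)
                if ($i == "to")   dst = $(i + 1)
            }
            sub(/\(.*/, "", src)   # strip "(host:port).."
            sub(/\(.*/, "", dst)
            print src " -> " dst
        }
        /All SSH connection tests passed successfully/ { ok = 1 }
        END { exit (ok ? 0 : 1) }
    '
}

# Demo with lines copied from the output in this article.
# Real use: masterha_check_ssh --conf=/etc/masterha/app1.cnf 2>&1 | summarize_ssh_check
summarize_ssh_check <<'EOF' || echo "SSH check did not report success"
Fri Jan 10 01:14:10 2020 - [debug]  Connecting via SSH from root@192.168.73.141(192.168.73.141:22) to root@192.168.73.138(192.168.73.138:22)..
Fri Jan 10 01:14:11 2020 - [debug]   ok.
Fri Jan 10 01:14:12 2020 - [info] All SSH connection tests passed successfully.
EOF
```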

2. Test MySQL master-slave replication

[root@manager ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf

8. Configure the virtual IP and start MHA

1. Configure the virtual IP

[root@master ~]# /sbin/ifconfig ens33:1 192.168.73.100/24

2. Start MHA

[root@manager ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
// Check which node is the current master
[root@manager ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
// View the current log
[root@manager ~]# cat /var/log/masterha/app1/manager.log 

9. Simulate a failure

Start following the manager log, then simulate a failure of the master server.

[root@manager ~]# tailf /var/log/masterha/app1/manager.log 
[root@master ~]# pkill -9 mysqld 

10. Verify the results

1. Check on slave1

[root@slave1 ~]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.73.141  netmask 255.255.255.0  broadcast 192.168.73.255
        inet6 fe80::159a:a8d1:5769:74d0  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:34:57:c1  txqueuelen 1000  (Ethernet)
        RX packets 347068  bytes 28229347 (26.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 528201  bytes 67755670 (64.6 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens33:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.73.100  netmask 255.255.255.0  broadcast 192.168.73.255
        ether 00:0c:29:34:57:c1  txqueuelen 1000  (Ethernet)
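
Checking for the VIP by eye works for one host. As a small sketch, the helper below does the same check against `ifconfig` output in the format shown above (it also accepts `ip addr` output, where the address carries a `/24` suffix). `has_vip` is a hypothetical name. It compares the whole address, so 192.168.73.10 is not mistaken for 192.168.73.100.

```shell
#!/bin/sh
# has_vip ADDRESS: read ifconfig or ip-addr output on stdin and succeed
# if ADDRESS is configured. Hypothetical helper for this walkthrough.
has_vip() {
    awk -v vip="$1" '
        $1 == "inet" {
            addr = $2
            sub(/\/.*/, "", addr)   # drop a /24-style prefix length
            if (addr == vip) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Demo with lines copied from the slave1 output in this article.
# Real use: ifconfig | has_vip 192.168.73.100
if has_vip 192.168.73.100 <<'EOF'
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.73.141  netmask 255.255.255.0  broadcast 192.168.73.255
ens33:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.73.100  netmask 255.255.255.0  broadcast 192.168.73.255
EOF
then
    echo "VIP is on this host"
else
    echo "VIP is not on this host"
fi
```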

2. Check on the MHA manager

// Later log entries appear as they are written
Generating relay diff files from the latest slave succeeded.
192.168.73.138(192.168.73.138:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.73.141(192.168.73.141:3306)
192.168.73.141(192.168.73.141:3306): Resetting slave info succeeded.
Master failover to 192.168.73.141(192.168.73.141:3306) completed successfully.
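
The last line of this log is the one that names the new master. A minimal sketch for pulling that address out of the manager log follows, assuming the "Master failover to ... completed successfully." wording shown above. `new_master_from_log` is a hypothetical helper name.

```shell
#!/bin/sh
# new_master_from_log: read the manager log on stdin and print the host
# from the most recent "Master failover to HOST(...) completed successfully."
# line. Prints nothing if no such line exists. Hypothetical helper.
new_master_from_log() {
    sed -n 's/^.*Master failover to \([^(]*\)(.*) completed successfully\..*$/\1/p' | tail -n 1
}

# Demo with lines copied from the log in this article.
# Real use: new_master_from_log < /var/log/masterha/app1/manager.log
new_master_from_log <<'EOF'
Generating relay diff files from the latest slave succeeded.
192.168.73.138(192.168.73.138:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.73.141(192.168.73.141:3306)
192.168.73.141(192.168.73.141:3306): Resetting slave info succeeded.
Master failover to 192.168.73.141(192.168.73.141:3306) completed successfully.
EOF
```

An empty result means no completed failover was found in the log.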

3. Install the MySQL client on a client machine and connect through the VIP

yum -y install mysql
mysql -umha -pmanager -h 192.168.73.100 
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.20 Source distribution

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.05 sec)