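Keepalived, as the name suggests a liveness-checking mechanism, is a lightweight high-availability solution for Linux. It was originally developed for LVS. It uses heartbeat checks to monitor the health of each service node in the system and supports automatic failover.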
Deployment

```shell
yum install -y keepalived
```
Edit the configuration file with `vi /etc/keepalived/keepalived.conf`. `state` can be set to MASTER or BACKUP.
```
global_defs {
    router_id ka01
}

vrrp_instance VI_1 {
    state MASTER
    priority 150
    interface eth0
    virtual_router_id 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.3
    }
}
```
```shell
systemctl enable keepalived
systemctl start keepalived
systemctl stop keepalived
```
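The peer node needs a matching file: VRRP peers share `virtual_router_id`, the authentication settings, and the VIP, and differ in `router_id`, `state`, and `priority`. The following is a minimal sketch of generating both files from one template. `render_keepalived_conf` is a hypothetical helper, and `ka02` and priority `100` are assumed values for the second node.

```shell
#!/bin/bash
# Sketch: print a keepalived.conf for one node. Only router_id, state and
# priority vary; the shared values mirror the example configuration above.
render_keepalived_conf() {
    local router_id=$1 state=$2 priority=$3
    cat <<EOF
global_defs {
    router_id ${router_id}
}

vrrp_instance VI_1 {
    state ${state}
    priority ${priority}
    interface eth0
    virtual_router_id 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.3
    }
}
EOF
}

# Assumed values for the second node (not taken from the original text).
render_keepalived_conf ka02 BACKUP 100
```

On a real host the output would be redirected to `/etc/keepalived/keepalived.conf` after review.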
Modes

Keepalived high availability comes in two modes: preemptive and non-preemptive.
Preemptive (default)

When the MASTER goes down, the BACKUP takes over. Once the MASTER restarts, it preempts the IP and takes it back.
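Non-preemptive

Both nodes are configured as BACKUP and are distinguished by priority. If the node acting as MASTER goes down, the other node takes over and becomes MASTER; when the failed node comes back, it stays BACKUP.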
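- Set `state` to BACKUP on both nodes (officially recommended).
- Add `nopreempt` inside `vrrp_instance` on both nodes.
- Give one node a higher priority than the other.

PS. Once `nopreempt` is enabled on both servers, `state` must be BACKUP on both of them; the only difference between the two is the priority.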
```
# Node 1 (higher priority)
vrrp_instance VI_1 {
    state BACKUP
    priority 150
    nopreempt
}

# Node 2 (lower priority)
vrrp_instance VI_1 {
    state BACKUP
    priority 100
    nopreempt
}
```
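The difference between the two modes shows up when a failed higher-priority node returns. The sketch below models only that one decision and is not how Keepalived is implemented; `vip_holder_after_return` is a hypothetical name, and the priorities reuse the 150/100 values from the configuration above.

```shell
#!/bin/bash
# Sketch: which node holds the VIP after a failed node comes back.
# $1 = priority of the node currently holding the VIP
# $2 = priority of the returning node
# $3 = "yes" if the returning node has nopreempt set, otherwise "no"
vip_holder_after_return() {
    local holder_prio=$1 returning_prio=$2 nopreempt=$3
    if [ "$nopreempt" = "no" ] && [ "$returning_prio" -gt "$holder_prio" ]; then
        echo "returning"   # preemptive: the higher priority reclaims the VIP
    else
        echo "holder"      # otherwise the current holder keeps the VIP
    fi
}

vip_holder_after_return 100 150 no    # preemptive mode
vip_holder_after_return 100 150 yes   # non-preemptive mode
```

The first call prints `returning` and the second prints `holder`, which is why non-preemptive mode avoids a second VIP move when the original node recovers.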
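Failures

Split brain

When two high-availability servers cannot detect each other's heartbeat within the specified time, each starts its own failover and takes ownership of the resources and services, even though both servers are still alive and running normally. The same service then runs on both sides at once and conflicts. The most serious case is both hosts holding the same VIP at the same time (similar to the idea of importing from both ends): when users write data, the writes may land on both sides, which can leave the data on the two servers inconsistent or cause data loss. This situation is called split brain; some people also call it a partitioned cluster.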
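The symptom described above, two hosts holding the VIP at once, can be checked mechanically. The following is a minimal sketch and not part of the original setup: `has_vip` and `count_vip_holders` are hypothetical helpers, and the listings are sample data shaped like `ip -o -4 addr show` output (the addresses 10.0.0.11 and 10.0.0.12 are assumptions).

```shell
#!/bin/bash
# Sketch: succeed if the address listing on stdin contains the given VIP.
has_vip() {
    local vip=$1
    grep -qF "inet ${vip}/"
}

# Sketch: count how many of the given listings contain the VIP.
count_vip_holders() {
    local vip=$1 count=0 listing
    shift
    for listing in "$@"; do
        if printf '%s\n' "$listing" | has_vip "$vip"; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Sample listings; on real hosts they would come from `ip -o -4 addr show`
# run locally and on the peer.
node_a='2: eth0    inet 10.0.0.11/24 brd 10.0.0.255 scope global eth0
2: eth0    inet 10.0.0.3/32 scope global eth0'
node_b='2: eth0    inet 10.0.0.12/24 brd 10.0.0.255 scope global eth0
2: eth0    inet 10.0.0.3/32 scope global eth0'

holders=$(count_vip_holders 10.0.0.3 "$node_a" "$node_b")
if [ "$holders" -gt 1 ]; then
    echo "split brain suspected: $holders nodes hold 10.0.0.3"
fi
```

Matching on `inet <vip>/` rather than on the bare address keeps 10.0.0.3 from matching longer addresses such as 10.0.0.30.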
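Common causes of split brain include:

- Network faults, such as a loose network cable on a server
- A server hardware failure that crashes the machine
- The firewalld firewall being enabled on both the master and backup servers, which can block the heartbeat traffic between them

Solution

If Nginx goes down, user requests fail, but Keepalived does not move the address on its own. A script is therefore needed to check whether Nginx is alive; if it is not, the script tries to restart Nginx and, if that fails, stops Keepalived so that the VIP moves to the backup.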
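Create the script with `vim check_nginx.sh`:

```shell
#!/bin/bash
# Count running nginx processes
nginxpid=$(ps -C nginx --no-header | wc -l)
if [ "$nginxpid" -eq 0 ]; then
    # nginx is down: try to start it, then check again
    systemctl start nginx
    sleep 3
    nginxpid=$(ps -C nginx --no-header | wc -l)
    if [ "$nginxpid" -eq 0 ]; then
        # Still down: stop keepalived so the VIP moves to the other node
        systemctl stop keepalived
    fi
fi
```

Make it executable:

```shell
chmod +x check_nginx.sh
```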
Configure Keepalived to use the script
Edit the file with `vi /etc/keepalived/keepalived.conf`:

```
global_defs {
    router_id 01
}

vrrp_script check_nginx {
    script "/root/check_nginx.sh"
    interval 5
}

vrrp_instance VI_1 {
    state BACKUP
    priority 150
    nopreempt
    interface eth0
    virtual_router_id 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        10.0.0.3
    }
    track_script {
        check_nginx
    }
}
```
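Nginx

The setup needs at least two Nginx servers and one VIP (virtual IP). Install Keepalived on every server and configure the VIP. When the primary Nginx fails, the VIP is automatically rebound to the standby Nginx and the corresponding script runs.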
Build from source: see [[Nginx 入门配置]]
YUM installation

```shell
yum install -y nginx
yum install -y keepalived

rpm -q -a keepalived
systemctl enable keepalived
```
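Keepalived

Use one MASTER and one BACKUP, and build Keepalived from source on each. Because the setup involves operations such as binding to a network interface, it must be done as the root user.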
```shell
tar -zxvf keepalived-2.2.2.tar.gz -C /root && cd /root/keepalived-2.2.2

./configure
make && make install
```
Configuration file

After the installation completes, copy and modify the configuration file.
```shell
ip addr
mkdir /etc/keepalived
vi /etc/keepalived/keepalived.conf
```

`global_defs` options to review in the file:

```
vrrp_skip_check_adv_addr
vrrp_strict
```
PS. Set the priority with `priority`, enable non-preemptive mode by adding `nopreempt`, and define the failover check with `vrrp_script`.
```
! Configuration File for keepalived

global_defs {
    router_id keep_xxx
}

vrrp_script check_nginx {
    script "/etc/keepalived/nginx_check.sh"
    interval 2
    weight -20
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    track_script {
        check_nginx
    }
    virtual_ipaddress {
        192.168.200.16
    }
}
```
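The `weight -20` line changes the node's priority based on the script result. As a rough sketch of the arithmetic, assuming the rule from the Keepalived documentation that a negative weight lowers the priority while the script fails and a positive weight raises it while the script succeeds: `effective_priority` is a hypothetical helper, and clamping to the valid priority range is not modeled.

```shell
#!/bin/bash
# Sketch: effective VRRP priority after a tracked script result.
# $1 = configured priority, $2 = weight,
# $3 = 1 if the script exited 0, 0 if it exited non-zero
effective_priority() {
    local base=$1 weight=$2 script_ok=$3
    local prio=$base
    if [ "$weight" -lt 0 ] && [ "$script_ok" -eq 0 ]; then
        prio=$((base + weight))    # failing check lowers the priority
    elif [ "$weight" -gt 0 ] && [ "$script_ok" -eq 1 ]; then
        prio=$((base + weight))    # passing check raises the priority
    fi
    echo "$prio"
}

effective_priority 100 -20 1   # check passes
effective_priority 100 -20 0   # check fails
```

With the values from the configuration above, the two calls print 100 and 80. A peer with a priority between those two values (90, for example, which is an assumed number) would then win the election. The adjustment only applies when the check script actually exits with a non-zero status.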
Check script

Create the Nginx check script: `vi /etc/keepalived/nginx_check.sh`
```shell
#!/bin/bash
# Check whether nginx is down and try to restart it
if [ "$(ps -C nginx --no-header | wc -l)" -eq 0 ]; then
    /app/midware/nginx/sbin/nginx
    # Wait 5 s and check nginx again; if it still has not started,
    # stop keepalived so that the backup node takes over
    sleep 5
    if [ "$(ps -C nginx --no-header | wc -l)" -eq 0 ]; then
        killall keepalived
    fi
fi
```
Add execute permission: `chmod +x /etc/keepalived/nginx_check.sh`
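The script above forces failover by killing Keepalived. An alternative, sketched here under assumptions rather than taken from the original setup, is to report failure through the exit status so that a `weight` configured in `vrrp_script` can lower the priority instead. `count_procs`, `check_service`, and `CHECK_WAIT` are hypothetical names.

```shell
#!/bin/bash
# Seconds to wait after a restart attempt (assumed default, overridable).
CHECK_WAIT=${CHECK_WAIT:-5}

# Count running processes with the given command name.
count_procs() {
    ps -C "$1" --no-header | wc -l
}

# Sketch: return 0 if the service is running, otherwise run the start
# command given after the name, wait, and return 0 only if it came up.
check_service() {
    local name=$1
    shift
    if [ "$(count_procs "$name")" -gt 0 ]; then
        return 0
    fi
    "$@"
    sleep "$CHECK_WAIT"
    [ "$(count_procs "$name")" -gt 0 ]
}
```

In a real check script the last line would be a call such as `check_service nginx /app/midware/nginx/sbin/nginx`, so that the script's exit status is the result of the check.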
Service commands

```shell
cp /usr/local/etc/sysconfig/keepalived /etc/sysconfig/
cp /usr/local/sbin/keepalived /usr/sbin/
```
Create the system service file `/etc/init.d/keepalived` and make it executable: `chmod +x /etc/init.d/keepalived`
```shell
#!/bin/sh

# Load the init function library and the keepalived options
. /etc/rc.d/init.d/functions
. /etc/sysconfig/keepalived

RETVAL=0
prog="keepalived"

start() {
    echo -n $"Starting $prog: "
    daemon keepalived ${KEEPALIVED_OPTIONS}
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$prog
}

stop() {
    echo -n $"Stopping $prog: "
    killproc keepalived
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/$prog
}

reload() {
    echo -n $"Reloading $prog: "
    # Signal 1 (SIGHUP) makes keepalived reload its configuration
    killproc keepalived -1
    RETVAL=$?
    echo
}

# Note: "$1" must not contain a trailing space, or no branch matches
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    reload)
        reload
        ;;
    restart)
        stop
        start
        ;;
    condrestart)
        if [ -f /var/lock/subsys/$prog ]; then
            stop
            start
        fi
        ;;
    status)
        status keepalived
        RETVAL=$?
        ;;
    *)
        echo "Usage: $0 {start|stop|reload|restart|condrestart|status}"
        RETVAL=1
esac

exit $RETVAL
```
Then reload with `systemctl daemon-reload`.
```shell
systemctl status keepalived

systemctl start keepalived

systemctl restart keepalived

systemctl stop keepalived
```