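keepalived: VIP present on both nodes at the same time
Contents
Background:
Log analysis:
Causes and solutions:
Background:
The VIP came up on both keepalived-master and keepalived-slave at the same time, producing a dual-master (split-brain) state and asymmetric routing.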
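To confirm the symptom, it helps to check on each node whether the VIP is actually assigned. The following is a minimal sketch, not the author's procedure. It assumes the VIP is 192.168.2.100 (the virtual server address that appears in the logs), and `has_vip` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Assumption: the VIP is 192.168.2.100; override with VIP=<address>.
VIP="${VIP:-192.168.2.100}"

# Read `ip -o -4 addr show` output on stdin.
# Succeeds (exit 0) if the given address is assigned to any interface.
has_vip() {
    grep -qF "inet ${1}/"
}

# Run on both test1 and test2:
#   ip -o -4 addr show | has_vip "$VIP" && echo "VIP is up on $(hostname)"
```

If the message is printed on both nodes, both hold the VIP and the cluster is in the dual-master state described above.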
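Log analysis:
- The master receives advertisements from the slave, and the master's priority (100) is higher than the slave's (80), so the master keeps the VIP. This part is normal.
- The slave's log has no record of a master/backup transition and no sign of advertisements received from the master; it only shows repeated smtp write errors from the health checker. Unable to confirm the master's state, the slave assumes the master is down and brings the VIP up on itself.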
[root@test1 ~]# journalctl -fu keepalived
-- Logs begin at Wed 2024-10-30 18:47:13 CST. --
May 28 21:46:28 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:29 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:30 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:31 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:32 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:33 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
May 28 21:46:34 test1 Keepalived_vrrp[28060]: (VI_1) Received advert from 192.168.2.191 with lower priority 80, ours 100
[root@test2 ~]# journalctl -fu keepalived
-- Logs begin at Wed 2024-10-30 18:43:15 CST. --
May 28 21:39:30 test2 Keepalived_healthcheckers[22487]: Removing service [192.168.2.193]:tcp:80 from VS [192.168.2.100]:tcp:80
May 28 21:39:30 test2 Keepalived_healthcheckers[22487]: Lost quorum 1-0=1 > 0 for VS [192.168.2.100]:tcp:80
May 28 21:39:30 test2 Keepalived_healthcheckers[22487]: smtp fd 10 returned write error
May 28 21:43:48 test2 Keepalived_healthcheckers[22487]: TCP connection to [192.168.2.192]:tcp:80 success.
May 28 21:43:48 test2 Keepalived_healthcheckers[22487]: Adding service [192.168.2.192]:tcp:80 to VS [192.168.2.100]:tcp:80
May 28 21:43:48 test2 Keepalived_healthcheckers[22487]: Gained quorum 1+0=1 <= 1 for VS [192.168.2.100]:tcp:80
May 28 21:43:48 test2 Keepalived_healthcheckers[22487]: smtp fd 10 returned write error
May 28 21:43:56 test2 Keepalived_healthcheckers[22487]: TCP connection to [192.168.2.193]:tcp:80 success.
May 28 21:43:56 test2 Keepalived_healthcheckers[22487]: Adding service [192.168.2.193]:tcp:80 to VS [192.168.2.100]:tcp:80
May 28 21:43:56 test2 Keepalived_healthcheckers[22487]: smtp fd 10 returned write error
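The logs alone do not show whether the master's advertisements reach the slave's network interface. One way to check is to capture VRRP traffic (IP protocol 112) on the slave. This is a rough sketch under assumptions: the interface name `eth0` is a guess, `capture_vrrp` and `vrrp_sources` are hypothetical helper names, and the parser assumes tcpdump's usual `HH:MM:SS IP src > dst: VRRPv2, ...` line layout.

```shell
#!/usr/bin/env bash
# Assumption: VRRP runs on eth0; override with IFACE=<name>.
IFACE="${IFACE:-eth0}"

# Capture N VRRP advertisements (IP protocol 112) on the interface. Needs root.
capture_vrrp() {
    tcpdump -i "$IFACE" -nn -c "${1:-10}" 'ip proto 112'
}

# Read tcpdump lines on stdin and print each distinct source address.
vrrp_sources() {
    awk '$2 == "IP" && /VRRP/ { print $3 }' | sort -u
}

# On test2 (the slave):
#   capture_vrrp 10 | vrrp_sources
```

If only 192.168.2.191 (the slave itself) is listed, the master's advertisements are not arriving at all. If the master's address is listed as well but keepalived still behaves as if the master were down, a local packet filter on the slave is a likely suspect, since tcpdump normally sees inbound packets before the host firewall filters them.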
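Causes and solutions:
- Check the firewall state. Given the logs above, the firewall on the slave is the main suspect; turn it off. (In my case the firewall was the cause.)
- After a master switchover, a network fault or delay can keep advertisements from arriving in time and cause dual master. In that case, stop keepalived manually, delete the VIP from one side, then start the master and the slave in descending order of configured priority.
- The vrrp_instance block in the master or slave configuration file is wrong. Pay particular attention to whether interface (the NIC name) is set correctly.
- Check whether the clocks on the master and slave are synchronized. If they are not, chrony can be used to bring them into sync.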
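The checks in the list above can be sketched as a few shell helpers. This is an illustration under assumptions, not a tested procedure: it assumes firewalld is the firewall in use, that the configuration lives at /etc/keepalived/keepalived.conf, and that chrony is installed. All function names are hypothetical. As an alternative to turning the firewall off entirely, `allow_vrrp` only opens VRRP (IP protocol 112).

```shell
#!/usr/bin/env bash
# Assumption: default keepalived config path; override with CONF=<path>.
CONF="${CONF:-/etc/keepalived/keepalived.conf}"

# Allow VRRP through firewalld instead of disabling the firewall. Needs root.
allow_vrrp() {
    firewall-cmd --permanent --add-rich-rule='rule protocol value="vrrp" accept'
    firewall-cmd --reload
}

# Print every interface named by an "interface" line in a keepalived config.
conf_interfaces() {
    awk '$1 == "interface" { print $2 }' "$1" | sort -u
}

# Report configured interfaces that do not exist on this host.
check_interfaces() {
    local ifc
    for ifc in $(conf_interfaces "$CONF"); do
        ip link show "$ifc" >/dev/null 2>&1 || echo "missing interface: $ifc"
    done
}

# Show the clock offset as seen by chrony; compare the output of both nodes.
check_time() {
    chronyc tracking
}

# Recovery order after a dual-master event:
#   1. systemctl stop keepalived     (on both nodes)
#   2. remove any VIP left behind on one side with `ip addr del`
#   3. systemctl start keepalived    (highest priority first: master, then slave)
```

Nothing runs until a function is called, so the file can be sourced on each node and the checks run one at a time, for example `check_interfaces` followed by `check_time`.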