Linux High-Availability Solutions: Viewing the Heartbeat Logs

HeKing 2011-11-29

Let's take a detailed look at the Heartbeat logs.
Starting the Heartbeat service on the primary node

# /etc/init.d/heartbeat start
While Heartbeat starts, watch the primary node's system log with "tail -f /var/log/messages". The output looks like this:
# tail -f /var/log/messages
    Nov 26 07:52:21 node1 heartbeat: [3688]: info: Configuration validated. Starting heartbeat 2.0.8
    Nov 26 07:52:21 node1 heartbeat: [3689]: info: heartbeat: version 2.0.8
    Nov 26 07:52:21 node1 heartbeat: [3689]: info: Heartbeat generation: 3
    Nov 26 07:52:21 node1 heartbeat: [3689]: info: G_main_add_TriggerHandler: Added signal manual handler
    Nov 26 07:52:21 node1 heartbeat: [3689]: info: G_main_add_TriggerHandler: Added signal manual handler
    Nov 26 07:52:21 node1 heartbeat: [3689]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth1
    Nov 26 07:52:21 node1 heartbeat: [3689]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth1 - Status: 1
    Nov 26 07:52:21 node1 heartbeat: [3689]: info: glib: ping heartbeat started.
    Nov 26 07:52:21 node1 heartbeat: [3689]: info: G_main_add_SignalHandler: Added signal handler for signal 17
    Nov 26 07:52:21 node1 heartbeat: [3689]: info: Local status now set to: 'up'
    Nov 26 07:52:22 node1 heartbeat: [3689]: info: Link node1:eth1 up.
    Nov 26 07:52:23 node1 heartbeat: [3689]: info: Link 192.168.60.1:192.168.60.1 up.
    Nov 26 07:52:23 node1 heartbeat: [3689]: info: Status update for node 192.168.60.1: status ping

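In this part of the log Heartbeat is initializing its configuration, for example the heartbeat interval, the UDP broadcast port, and the status of the ping node. The log output then pauses. After about 120 seconds Heartbeat resumes logging, and that 120-second wait is exactly the value set by the "initdead" option in ha.cf. Heartbeat then prints the following: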
   
    Nov 26 07:54:22 node1 heartbeat: [3689]: WARN: node node2: is dead
    Nov 26 07:54:22 node1 heartbeat: [3689]: info: Comm_now_up(): updating status to active
    Nov 26 07:54:22 node1 heartbeat: [3689]: info: Local status now set to: 'active'
    Nov 26 07:54:22 node1 heartbeat: [3689]: info: Starting child client "/usr/lib/heartbeat/ipfail" (694,694)
    Nov 26 07:54:22 node1 heartbeat: [3689]: WARN: No STONITH device configured.
    Nov 26 07:54:22 node1 heartbeat: [3689]: WARN: Shared disks are not protected.
    Nov 26 07:54:22 node1 heartbeat: [3689]: info: Resources being acquired from node2.
    Nov 26 07:54:22 node1 heartbeat: [3712]: info: Starting "/usr/lib/heartbeat/ipfail" as uid 694  gid 694 (pid 3712)
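
The length of the pause can be checked from the log timestamps themselves. The following is a minimal sketch, not part of Heartbeat: the helper names to_seconds and gap_seconds are made up for illustration, and it assumes both timestamps fall on the same day.

```shell
# Convert an HH:MM:SS syslog time to seconds since midnight.
# awk treats the fields as numbers, so "07" is read as 7, not as octal.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Print the number of seconds between two times on the same day.
gap_seconds() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

# Last log line before the pause vs. first log line after it.
gap_seconds 07:52:23 07:54:22
```

For the two timestamps in this log the result is 119, which is in line with a 120-second initdead given that syslog records times only to the second.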

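In this part of the log, node2 has not been started yet, so Heartbeat issues the warning "node2: is dead". It then starts the Heartbeat plugin ipfail. Because no STONITH device is configured in ha.cf, the log also shows the warning "No STONITH device configured".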
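
To pick such warnings out of a busy system log, a small filter can help. This is only a sketch: the function name heartbeat_warnings is hypothetical, and it assumes each message sits on a single line in the real log file with the "heartbeat: [pid]: LEVEL:" layout seen in this article.

```shell
# List WARN and ERROR messages logged by the heartbeat daemon.
# The log path defaults to /var/log/messages, as used in this article.
heartbeat_warnings() {
    grep -E 'heartbeat: \[[0-9]+\]: (WARN|ERROR):' "${1:-/var/log/messages}"
}
```

Run against the log excerpt above, it should print the three WARN lines: the dead node2, the missing STONITH device, and the unprotected shared disks.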
The log continues:
   
    Nov 26 07:54:23 node1 harc[3713]: info: Running /etc/ha.d/rc.d/status status
    Nov 26 07:54:23 node1 mach_down[3735]: info: /usr/lib/heartbeat/mach_down: nice_failback: foreign resources acquired
    Nov 26 07:54:23 node1 mach_down[3735]: info: mach_down takeover complete for node node2.
    Nov 26 07:54:23 node1 heartbeat: [3689]: info: mach_down takeover complete.
    Nov 26 07:54:23 node1 heartbeat: [3689]: info: Initial resource acquisition complete (mach_down)
    Nov 26 07:54:24 node1 IPaddr[3768]: INFO:  Resource is stopped
    Nov 26 07:54:24 node1 heartbeat: [3714]: info: Local Resource acquisition completed.
    Nov 26 07:54:24 node1 harc[3815]: info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
    Nov 26 07:54:24 node1 ip-request-resp[3815]: received ip-request-resp 192.168.60.200/24/eth0 OK yes
    Nov 26 07:54:24 node1 ResourceManager[3830]: info: Acquiring resource group: node1 192.168.60.200/24/eth0 Filesystem::/dev/sdb5::/webdata::ext3
    Nov 26 07:54:24 node1 IPaddr[3854]: INFO:  Resource is stopped
    Nov 26 07:54:25 node1 ResourceManager[3830]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.60.200/24/eth0 start
    Nov 26 07:54:25 node1 IPaddr[3932]: INFO: Using calculated netmask for 192.168.60.200: 255.255.255.0
    Nov 26 07:54:25 node1 IPaddr[3932]: DEBUG: Using calculated broadcast for 192.168.60.200: 192.168.60.255
    Nov 26 07:54:25 node1 IPaddr[3932]: INFO: eval /sbin/ifconfig eth0:0 192.168.60.200 netmask 255.255.255.0 broadcast 192.168.60.255
    Nov 26 07:54:25 node1 avahi-daemon[1854]: Registering new address record for 192.168.60.200 on eth0.
    Nov 26 07:54:25 node1 IPaddr[3932]: DEBUG: Sending Gratuitous Arp for 192.168.60.200 on eth0:0 [eth0]
    Nov 26 07:54:26 node1 IPaddr[3911]: INFO:  Success
    Nov 26 07:54:26 node1 Filesystem[4021]: INFO:  Resource is stopped
    Nov 26 07:54:26 node1 ResourceManager[3830]: info: Running /etc/ha.d/resource.d/Filesystem /dev/sdb5 /webdata ext3 start
    Nov 26 07:54:26 node1 Filesystem[4062]: INFO: Running start for /dev/sdb5 on /webdata
    Nov 26 07:54:26 node1 kernel: kjournald starting.  Commit interval 5 seconds
    Nov 26 07:54:26 node1 kernel: EXT3 FS on sdb5, internal journal
    Nov 26 07:54:26 node1 kernel: EXT3-fs: mounted filesystem with ordered data mode.
    Nov 26 07:54:26 node1 Filesystem[4059]: INFO: Success
    Nov 26 07:54:33 node1 heartbeat: [3689]: info: Local Resource acquisition completed. (none)
    Nov 26 07:54:33 node1 heartbeat: [3689]: info: local resource transition completed

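This part of the log covers resource monitoring and takeover. It mainly carries out what is configured in the haresources file: here, bringing up the cluster virtual IP and mounting the disk partition.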
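
The resource group that ResourceManager reports appears to mirror the corresponding haresources entry (node name, virtual IP, then the Filesystem resource), though that mapping is an inference from this log rather than something the article states. As a rough sketch, the hypothetical helper below pulls that group out of the log so it can be compared with /etc/ha.d/haresources by eye. It assumes the message occupies a single line in the real log file.

```shell
# Print the resource group(s) that ResourceManager reported acquiring.
# The log path defaults to /var/log/messages, as used in this article.
acquired_resources() {
    sed -n 's/.*ResourceManager\[[0-9]*]: info: Acquiring resource group: //p' \
        "${1:-/var/log/messages}"
}
```

For the log above this yields "node1 192.168.60.200/24/eth0 Filesystem::/dev/sdb5::/webdata::ext3".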
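At this point, running ifconfig on the primary node shows that the node has automatically bound the cluster IP address. From a host outside the HA cluster, pinging the cluster IP 192.168.60.200 now succeeds, which means the address is available.
A look at the mounted partitions likewise confirms that the shared disk partition /dev/sdb5 has been mounted automatically.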
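
The mount check can also be scripted. The sketch below is one possible way to do it: is_mounted is a hypothetical helper name, and it assumes a Linux system where /proc/mounts lists the device in the first column and the mount point in the second.

```shell
# Succeed if device $1 is mounted on mount point $2.
# $3 is the mount table to read; it defaults to /proc/mounts on Linux.
is_mounted() {
    awk -v dev="$1" -v mp="$2" \
        '$1 == dev && $2 == mp { found = 1 } END { exit !found }' \
        "${3:-/proc/mounts}"
}
```

On the primary node, "is_mounted /dev/sdb5 /webdata && echo mounted" would then confirm the mount. The IP side can be checked by hand with "ifconfig eth0:0" on the node (the alias name comes from the IPaddr log line) and "ping -c 3 192.168.60.200" from a host outside the cluster.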
