Linux 内存 CPU 磁盘 网络流量的监控

chenchuang 2017-04-14

内存

free 命令

free命令由procps.*.rpm提供(在Redhat系列的OS上), free命令的所有输出值都是从/proc/meminfo中读出的。

1          2          3          4          5          6
               total       used       free     shared    buffers     cached
1 Mem:      24677460   23276064    1401396          0     870540   12084008
2 -/+ buffers/cache:   10321516   14355944
3 Swap:     25151484     224188   24927296

第一行 是从操作系统角度讲的, total代表总内存, used代表被使用的内存, shared代表被进程共享的内存,关于buffers和cached有更精彩的解释:  

  • A buffer is something that has yet to be "written" to disk. 
  • A cache is something that has been "read" from the disk and stored for later use.

 total = used + free 即:  24677460 = 23276064 + 1401396

第二行是从程序角度讲的,

程序used = 系统used - 系统buffers - 系统cached,

即 10321516=23276064-870540-12084008

程序可用free = 系统free + 系统buffers + 系统cached,

即 14355944=1401396+870540+12084008

例:监控交换分区,如果swap超过3000M发出报警邮件。 

#!/bin/bash
useSwap=`free -m |tail -1 |awk '{print $3}'`
if [ $useSwap -ge 3000 ]; then
	echo " Use swap more than 3000M, Please add memory in time" |mail -s "(Memory  Warning)"  ***@qq.com
fi

注:要根据具体运行的程序来确定报警触发规则

 

CPU

uptime 或 vi /proc/cpuinfo

13:40:08 up  4:48,  1 user,  load average: 0.46, 0.28, 0.15

0.46, 0.28, 0.15 分别代表1分钟,5分钟,15分钟CPU负载, 其数值n=cpu核心数表示负载刚好达到100%, 数值越大表示CPU负载越大

例: 监控CPU负载

#!/bin/bash
cpus=`cat /proc/cpuinfo |grep "physical id" |wc -l`
threshold=$[ $cpus - 1 ]
load=`uptime |cut -d, -f 4 | cut -d: -f 2`
control=`echo "$load > $threshold"|bc`  #shell中小数和整数不能直接比较,要用bc计算器
if [ $control -eq 1 ];then
	echo "  10.1.1.183(nginx server) CPU load is too large" |mail -s "10.1.1.183(nginx server)"  lianshubash.com
fi

磁盘

df

常用 df -hl

例:监控磁盘使用情况

#!/bin/bash
sda2DiskFree=`df -hl |grep "^/dev/sda2"|awk '{print $4}'|cut -dG -f 1`
sda3DiskFree=`df -hl |grep "^/dev/sda3"|awk '{print $4}'|cut -dG -f 1`
sda5DiskFree=`df -hl |grep "^/dev/sda5"|awk '{print $4}'|cut -dG -f 1`
if [ $sda2DiskFree -le 100 ]; then
	echo "  /dev/sda2 Disk space less than 100G,Please check the disk occupancy" |mail -s "(Disk Warning)"  com	
fi
if [ $sda3DiskFree -le 100 ]; then
	echo "  /dev/sda3 Disk space less than 100G,Please check the disk occupancy" |mail -s "(Disk Warning)"  com
fi
if [ $sda5DiskFree -le 70 ] ;then
	echo "  /dev/sda5 Disk space less than 70G,Please check the disk occupancy" |mail -s "(Disk Warning)"  com
fi

du

查看文件夹的磁盘占用大小 常用 du -sh /

fdisk

查看磁盘分区 常用fdisk -l

网络流量

cat /proc/net/dev

查看网络流量情况

例 编写一个报警邮件,当主机的出口流量超过500M的时候,发送报警邮件

#!/bin/bash
#
outcount=`cat /proc/net/dev | grep eth1 | gawk '{print $10}'`
outcheck=`echo "${outcount} > 500*1024*1024" |bc`
if [ $outcheck -eq 1 ];then
	echo "Output on eth1 is greator than 500M." | mail -s "Network Waring"  *@qq.com
fi
 注:查看网络的其它命令 http://os.51cto.com/art/201404/435279.htm 

相关推荐