清醒疯子 2018-02-09
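I previously set up ELK for log analysis, but the server was short on resources: after starting several Logstash instances, memory usage was too high. So I have now switched to Filebeat for log collection. This post records the setup process and the fixes for the problems I ran into.
Step 1: install JDK 8.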
tar -zxvf jdk-8u112-linux-x64.tar.gz
Set the environment variables:
vi /etc/profile
At the end of the profile file, add:
#set java environment
JAVA_HOME=/usr/local/java/jdk1.8.0_112
JRE_HOME=$JAVA_HOME/jre
CLASS_PATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME JRE_HOME CLASS_PATH PATH
After adding these lines, run
source /etc/profile
so that the configuration takes effect. Then run
java -version
to check that the installation succeeded.
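The check can also be scripted. The following is a minimal sketch, not part of the original setup: java_major_version is a hypothetical helper name, and it assumes version strings shaped like "1.8.0_112" (the legacy scheme used by JDK 8) or "11.0.2".

```shell
#!/bin/sh
# java_major_version VERSION (hypothetical helper): print the major Java
# version for strings such as "1.8.0_112" or "11.0.2".
java_major_version() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;    # legacy scheme: 1.8.0_112 becomes 8.0_112
  esac
  echo "${v%%[._-]*}"      # keep what precedes the first '.', '_' or '-'
}

if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; the version is the quoted token on line 1.
  raw=$(java -version 2>&1 | head -n 1 | sed -n 's/.*"\(.*\)".*/\1/p')
  echo "JAVA_HOME=${JAVA_HOME:-unset} java=$raw major=$(java_major_version "$raw")"
else
  echo "java not found on PATH; re-check /etc/profile and source it again"
fi
```

With the JDK from this step, the major version reported should be 8.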
Once that works, move on to step 2, installing Elasticsearch:
Download the Elasticsearch 5.1.1 package from https://www.elastic.co/downloads/past-releases/elasticsearch-5-1-1
Run
rpm -ivh elasticsearch-5.1.1.rpm
and you should see
[root@localhost elk]# rpm -ivh elasticsearch-5.1.1.rpm
warning: elasticsearch-5.1.1.rpm: Header V4 RSA/SHA512 Signature, key ID d88e42b4: NOKEY
Preparing... ########################################### [100%]
Creating elasticsearch group... OK
Creating elasticsearch user... OK
1:elasticsearch ########################################### [100%]
### NOT starting on installation, please execute the following statements to configure elasticsearch service to start automatically using chkconfig
sudo chkconfig --add elasticsearch
### You can start elasticsearch service by executing
which means the installation succeeded. Now start it with service elasticsearch start.
Directories and files after installation:
#/usr/share/elasticsearch/ home directory
#/var/log/elasticsearch logs
#/etc/sysconfig/elasticsearch Elasticsearch environment variables
#/etc/elasticsearch/elasticsearch.yml Elasticsearch cluster configuration
#/etc/elasticsearch/jvm.options Elasticsearch JVM options
#/etc/elasticsearch/log4j2.properties Elasticsearch logging configuration
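Various errors may come up; for fixes see: http://blog.csdn.net/cardinalzbk/article/details/54924511
Note: Elasticsearch requires some system settings to be raised before it will start; otherwise it reports errors.
a. Raise vm.max_map_count to at least 262144. Edit the file
sudo vim /etc/sysctl.conf
add the line
vm.max_map_count=262144
and apply it with
sudo sysctl -p
b. Raise the file-handle limit to at least 65536 (check the current values with ulimit -a). Edit the file
sudo vim /etc/security/limits.conf
and add
* soft nofile 65536
* hard nofile 65536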
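To confirm that both settings took effect, something like the sketch below may help. The helper names meets_minimum and check_limit are made up for illustration. Note that ulimit -n reports the limit of the current shell, which is not necessarily the limit the elasticsearch service user gets, so treat the result as a hint only.

```shell
#!/bin/sh
# meets_minimum ACTUAL REQUIRED (hypothetical helper): succeed when ACTUAL >= REQUIRED.
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# check_limit NAME ACTUAL REQUIRED (hypothetical helper): print a one-line verdict.
check_limit() {
  if meets_minimum "$2" "$3"; then
    echo "OK   $1=$2 (>= $3)"
  else
    echo "LOW  $1=$2 (need >= $3)"
  fi
}

# On Linux the current vm.max_map_count is readable under /proc.
if [ -r /proc/sys/vm/max_map_count ]; then
  check_limit vm.max_map_count "$(cat /proc/sys/vm/max_map_count)" 262144
fi

# Open-file limit of the current shell.
nofile=$(ulimit -n)
case "$nofile" in
  unlimited) echo "OK   nofile=unlimited" ;;
  *) check_limit nofile "$nofile" 65536 ;;
esac
```

Changes to limits.conf normally apply to new login sessions only, so log in again before re-running the check.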
Next, edit the Elasticsearch cluster configuration file.
vi /etc/elasticsearch/elasticsearch.yml
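Uncomment the following settings:
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 0.0.0.0
#
# Set a custom port for HTTP:
#
http.port:
Restart the service with service elasticsearch restart. Now let's see how much memory Elasticsearch is using, since the whole point this time is to deal with the shortage of resources. Run top.
Hmm... only a few dozen MB of memory left. What is going on? Let's look at the JVM configuration.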
/etc/elasticsearch/jvm.options
There it is. The defaults are
-Xms2g
-Xmx2g
Let's set both to 500m as a test. After a restart, Elasticsearch starts normally.
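The edit to jvm.options can be scripted instead of done by hand. This is only a sketch: set_heap is a hypothetical helper, it keeps -Xms and -Xmx equal as in the shipped defaults, and the demonstration runs on a scratch file so nothing under /etc is touched.

```shell
#!/bin/sh
# set_heap FILE SIZE (hypothetical helper): rewrite the -Xms and -Xmx lines
# of a jvm.options file, keeping the previous version as FILE.bak.
set_heap() {
  file="$1"
  size="$2"
  cp "$file" "$file.bak"
  sed -e "s/^-Xms.*/-Xms${size}/" \
      -e "s/^-Xmx.*/-Xmx${size}/" "$file.bak" > "$file"
}

# Demonstrate on a scratch copy that mimics the default settings.
demo=$(mktemp)
printf '%s\n' '# heap settings' '-Xms2g' '-Xmx2g' > "$demo"
set_heap "$demo" 500m
grep '^-Xm' "$demo"    # prints -Xms500m and -Xmx500m
rm -f "$demo" "$demo.bak"
```

On the real system the call would be set_heap /etc/elasticsearch/jvm.options 500m, run as root and followed by a service restart.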
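Elasticsearch is now set up.
Step 3: download logstash-5.1.1, again as an rpm, and install it.
As before, the configuration lives under /etc/logstash, while the program itself is under /usr/share/logstash. Go into its bin directory first and run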
./logstash -e 'input { stdin { } } output { stdout {} }'
Then type anything, and Logstash echoes back whatever we enter:
[root@localhost bin]# ./logstash -e 'input { stdin { } } output { stdout {} }'
112
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs to console
The stdin plugin is now waiting for input:
00:00:19.669 [[main]-pipeline-manager] INFO logstash.pipeline - Starting pipeline {"id"=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>125}
00:00:19.688 [[main]-pipeline-manager] INFO logstash.pipeline - Pipeline main started
00:00:19.802 [Api Webserver] INFO logstash.agent - Successfully started Logstash API endpoint {:port=>9600}
2018-02-06T16:00:20.050Z localhost.localdomain 112
Notice, though, that there is a warning:
Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings.
It says we have no logstash.yml. This file was introduced in Logstash 5.0; see the official documentation for the detailed settings.
Started this way, Logstash still uses too much memory, so let's see how to reduce it.
Again the settings are in jvm.options, under /etc/logstash.
Set the heap sizes:
vi /etc/logstash/jvm.options
-Xms128m
-Xmx256m
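Try this first, and raise it if it is not enough.
Running logstash is quite a hassle, since we have to change into its directory first. That won't do, so run the following command: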
ln -s /usr/share/logstash/bin/logstash /usr/bin/logstash
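After that, logstash can be run from anywhere.
Step 4: install Kibana.
wget https://artifacts.elastic.co/downloads/kibana/kibana-5.1.1-x86_64.rpm
Then install it. After installation, the configuration file is at /etc/kibana/kibana.yml:
server.port:
server.host: 0.0.0.0
elasticsearch.url: "http://192.168.2.178:9200"
Now it can be started, but as before we first create a symlink: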
ln -s /usr/share/kibana/bin/kibana /usr/bin/kibana
and Kibana can then be started with the kibana command.
At this point ELK itself is installed.
Step 5: install Filebeat.
wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.1.1-x86_64.rpm
Install it and create a symlink:
ln -s /usr/share/filebeat/bin/filebeat /usr/bin/filebeat
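The three symlinks created so far can be handled by one small script that is safe to re-run. This is a sketch under assumptions: link_tool is a hypothetical helper, and the paths are the ones used in this post. It replaces an existing symlink but refuses to overwrite a regular file.

```shell
#!/bin/sh
# link_tool SRC DEST (hypothetical helper): point DEST at SRC.
# An existing symlink is replaced; a regular file at DEST is left alone.
link_tool() {
  if [ ! -e "$1" ]; then
    echo "skip: $1 does not exist"
    return 1
  fi
  if [ -e "$2" ] && [ ! -L "$2" ]; then
    echo "skip: $2 exists and is not a symlink"
    return 1
  fi
  ln -sfn "$1" "$2" && echo "linked $2 -> $1"
}

# Same source and destination paths as the ln -s commands in this post.
for tool in logstash kibana filebeat; do
  link_tool "/usr/share/$tool/bin/$tool" "/usr/bin/$tool" || true
done
```

Because a plain ln -s fails when the link already exists, a wrapper like this is mainly useful when the steps are repeated on the same machine.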
Next, hook Filebeat up to Logstash.
First create a directory for the custom patterns: /usr/local/elk/app/logstash-5.1.1/patterns
Create the Logstash configuration file:
vi /etc/logstash/conf.d/pro-log.conf
input {
    beats {
        port =>
    }
}
filter {
    if [fields][logIndex] == "nginx" {
        grok {
            patterns_dir => "/usr/local/elk/app/logstash-5.1.1/patterns"
            match => {
                "message" => "%{NGINXACCESS}"
            }
        }
        urldecode {
            charset => "UTF-8"
            field => "url"
        }
        if [upstreamtime] == "" or [upstreamtime] == "null" {
            mutate {
                update => { "upstreamtime" => "" }
            }
        }
        date {
            match => ["logtime", "dd/MMM/yyyy:HH:mm:ss Z"]
            target => "@timestamp"
        }
        mutate {
            convert => {
                "responsetime" => "float"
                "upstreamtime" => "float"
                "size" => "integer"
            }
            remove_field => ["port","logtime","message"]
        }
    }
}
output {
    elasticsearch {
        hosts => "192.168.2.178:9200"
        manage_template => false
        index => "%{[fields][logIndex]}-%{+YYYY.MM.dd}"
        document_type => "%{[fields][docType]}"
    }
}
We will try this with the nginx access_log. First, the nginx configuration:
log_format logstash '$http_host $server_addr $remote_addr [$time_local] "$visit_flag" "$jsession_id" "$login_name" "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" '
                    '$request_time $upstream_response_time $http_x_forwarded_for $upstream_addr';
Then create the custom pattern file:
vi /usr/local/elk/app/logstash-5.1.1/patterns/nginx
URIPARM1 [A-Za-z0-9$.+!*'|(){},~@#%&/=:;^\\_<>`?\-\[\]]*
URIPATH1 (?:/[\\A-Za-z0-9$.+!*'(){},~:;=@#% \[\]_<>^\-&?]*)+
HOSTNAME1 \b(?:[0-9A-Za-z_\-][0-9A-Za-z-_\-]{0,62})(?:\.(?:[0-9A-Za-z_\-][0-9A-Za-z-:\-_]{0,62}))*(\.?|\b)
STATUS ([0-9.]{0,3}[, ]{0,2})+
HOSTPORT1 (%{IPV4}:%{POSINT}[, ]{0,2})+
FORWORD (?:%{IPV4}[,]?[ ]?)+|%{WORD}
NGINXACCESS (%{HOSTNAME1:http_host}|-) %{IPORHOST:serveraddr} %{IPORHOST:remoteaddr} \[%{HTTPDATE:logtime}\] %{QS:visitflag} %{QS:sessionid} %{QS:loginname} %{QS:request} %{NUMBER:status} %{NUMBER:body_bytes_sent} %{QS:referrer} %{QS:agent} %{NUMBER:upstreamtime} %{NUMBER:responsetime} (%{FORWORD:x_forword_for}|-) (?:%{HOSTPORT1:upstream_addr}|-)
Start Logstash:
logstash -f /etc/logstash/conf.d/pro-log.conf &
OK, with Logstash running, it is time to start Filebeat and try it out.
Edit filebeat.yml:
vi /etc/filebeat/filebeat.yml
filebeat.prospectors:

# Each - is a prospector. Most options can be set at the prospector level, so
# you can use different prospectors for various configurations.
# Below are the prospector specific configurations.

- input_type: log

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /opt/nginx/logs/app.access.log
  fields:
    logIndex: nginx
    docType: nginx-access
    project: app-nginx

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["{your-logstash-ip}:5044"]
Start it:
filebeat -path.config /etc/filebeat/ &
Monitoring is now running. Open http://192.168.2.178:5601/
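If nothing shows up in Kibana, it helps to know which index to look for. With the output section above, events land in a daily index named from fields.logIndex plus the event date, so a Kibana index pattern of nginx-* should match them. The sketch below computes the expected name; index_name is a hypothetical helper, and it uses UTC because Logstash formats %{+YYYY.MM.dd} from @timestamp in UTC.

```shell
#!/bin/sh
# index_name PREFIX [DATE] (hypothetical helper): print the daily index name.
# DATE is optional and given as YYYY-MM-DD; the default is today in UTC.
index_name() {
  if [ -n "$2" ]; then
    day=$(echo "$2" | sed 's/-/./g')
  else
    day=$(date -u +%Y.%m.%d)
  fi
  echo "$1-$day"
}

echo "expecting index: $(index_name nginx)"

# With Elasticsearch reachable, the existing indices can be listed with:
#   curl -s 'http://192.168.2.178:9200/_cat/indices/nginx-*?v'
```

For example, index_name nginx 2018-02-06 prints nginx-2018.02.06.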
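We can now see the nginx access_log there, but many requests for static resources are mixed in. Let's configure nginx to keep static resources out of the access_log.
Setting access_log off in nginx for those requests is enough.
That more or less wraps up ELK + Filebeat. The rest is all custom configuration, which I won't go into here; I'll write a configuration post when I have time.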
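The other part first needs clustering and classification: the aggregated classification results are stored in a clustering index in the ES cluster. The data-processing layer writes its aggregation results to a designated index in ES, and stores the data related to each aggregated topic under a field of the corresponding document.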