Installing and Configuring a Highly Available Big Data Cluster Environment (08): Installing Ganglia to Monitor the Cluster

1. Install the dependencies and software

Run the following commands on every server to install the packages

yum install epel-release -y
yum install ganglia-web ganglia-gmetad ganglia-gmond -y
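
Since the same two commands must run on every server, a dry-run helper can print them per host for review first. This is only a sketch: the function name print_install_commands is hypothetical, and root SSH access to each host is an assumption not stated in this article.

```shell
# Hypothetical helper: print (do not run) one ssh command per host.
# Assumes root SSH access to every server, which this article does not state.
print_install_commands() (
  for host in "$@"; do
    printf 'ssh root@%s "yum install -y epel-release && yum install -y ganglia-web ganglia-gmetad ganglia-gmond"\n' "$host"
  done
)

# Review the output, then paste or pipe the lines into a shell to execute them.
print_install_commands master master-backup node1 node2 node3
```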

 

2. Configure the monitoring server on master

vi /etc/ganglia/gmetad.conf

Edit the following settings

data_source "server" 50 master:8649 master-backup:8649 node1:8649 node2:8649 node3:8649
case_sensitive_hostnames 1
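
The data_source line names the cluster, gives the polling interval in seconds (50 here), and lists every gmond endpoint that gmetad may poll. As a rough sketch, the line can be generated from a host list so it stays in sync when nodes are added; the helper name build_data_source is hypothetical.

```shell
# Hypothetical helper: print a gmetad data_source line.
# Usage: build_data_source CLUSTER INTERVAL PORT HOST...
build_data_source() (
  cluster=$1
  interval=$2
  port=$3
  shift 3
  line="data_source \"$cluster\" $interval"
  for host in "$@"; do
    line="$line $host:$port"
  done
  printf '%s\n' "$line"
)

# Reproduces the line used in this article.
build_data_source server 50 8649 master master-backup node1 node2 node3
```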

vi /etc/ganglia/gmond.conf

Edit the following settings

cluster {
  name = "server"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}
udp_send_channel {
  #mcast_join = 239.2.11.71
  host = master
  port = 8649
  ttl = 1
}
udp_recv_channel {
  #mcast_join = 239.2.11.71
  port = 8649
  #bind = 239.2.11.71
  #retry_bind = true
  # Size of the UDP buffer. If you are handling lots of metrics you really
  # should bump it up to e.g. 10MB or even higher.
  # buffer = 10485760
}
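
With mcast_join commented out and host = master set, every gmond sends its metrics by unicast UDP to master on port 8649. One way to check what has arrived is to read the XML that gmond serves on its TCP accept channel and count the HOST elements. The sketch below assumes the stock tcp_accept_channel on port 8649 is still enabled in gmond.conf and that nc is installed; the helper name count_hosts is hypothetical.

```shell
# Hypothetical helper: count <HOST ...> elements in a gmond XML dump on stdin.
# gmond writes each HOST element on its own line, so counting lines is enough.
count_hosts() {
  grep -c '<HOST '
}

# Example (not run here; assumes nc is installed and gmond is up on master):
#   nc master 8649 | count_hosts
```

If the count is lower than the number of servers, the missing hosts are probably not reaching master:8649 over UDP.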

Update the HTTP access configuration

vi /etc/httpd/conf.d/ganglia.conf

Edit the following settings

Alias /ganglia /usr/share/ganglia
<Location /ganglia>
  Order deny,allow
  Allow from all
  #Require local
  # Require ip 10.1.2.3
  # Require host example.org
</Location>

Create a symbolic link from the Ganglia web application into the web root directory

ln -s /usr/share/ganglia/ /var/www/html/ganglia

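Update the Apache configuration

vi /etc/httpd/conf/httpd.conf

Change the Listen port and the contents of the Directory block as follows

# Change port 80 to 10080 to avoid a later port conflict with nginx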
Listen 10080

<Directory />
    AllowOverride none
    Order Allow,Deny
    Allow from all
    #Require all denied
</Directory>
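
Before starting Apache, it can be worth confirming that the only active Listen directive is the new port. The following is a minimal sketch; the helper name listen_ports is hypothetical.

```shell
# Hypothetical helper: print the port of every uncommented Listen directive.
# Handles both "Listen 10080" and "Listen 192.0.2.1:10080" forms.
listen_ports() {
  awk '$1 == "Listen" { n = split($2, a, ":"); print a[n] }' "$1"
}

# Example (not run here):
#   listen_ports /etc/httpd/conf/httpd.conf
#   apachectl configtest   # Apache's own syntax check
```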

 

3. Start Apache and Ganglia, and enable them at boot

systemctl start httpd.service
systemctl start gmetad
systemctl start gmond
systemctl enable httpd.service
systemctl enable gmetad
systemctl enable gmond

Start the rrdcached service

/usr/bin/rrdcached -p /var/run/ganglia/hdp/rrdcached.pid -m 664 -l unix:/var/run/ganglia/hdp/rrdcached.sock -m 777 -P FLUSH,STATS,HELP -l unix:/var/run/ganglia/hdp/rrdcached.limited.sock -b /var/lib/ganglia/rrds -B -t 4 -w 3600 -f 7200 -z 1800 -F

Enable it at boot

vi /etc/rc.local

Append the following line to the end of the file

/usr/bin/rrdcached -p /var/run/ganglia/hdp/rrdcached.pid -m 664 -l unix:/var/run/ganglia/hdp/rrdcached.sock -m 777 -P FLUSH,STATS,HELP -l unix:/var/run/ganglia/hdp/rrdcached.limited.sock -b /var/lib/ganglia/rrds -B -t 4 -w 3600 -f 7200 -z 1800 -F
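
The rrdcached command writes its PID file and sockets under /var/run/ganglia/hdp, so that directory must exist before the command runs. Assuming a CentOS 7 style system, /var/run is a tmpfs that is emptied on every reboot, and /etc/rc.d/rc.local is not executable by default. The sketch below shows one way to recreate the directories; the helper name prepare_rrdcached_dirs is hypothetical.

```shell
# Hypothetical helper: create the directories the rrdcached command expects.
# Arguments default to the paths used in the rrdcached command line above.
prepare_rrdcached_dirs() (
  run_dir=${1:-/var/run/ganglia/hdp}
  rrd_dir=${2:-/var/lib/ganglia/rrds}
  mkdir -p "$run_dir" "$rrd_dir"
)

# Example for /etc/rc.local (not run here), placed before the rrdcached line:
#   mkdir -p /var/run/ganglia/hdp
# On CentOS 7, also make the script executable once:
#   chmod +x /etc/rc.d/rc.local
```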

 

4. Configure the monitored nodes

On the other servers, do the following

vi /etc/ganglia/gmond.conf

Edit the following settings

cluster {
  name = "server"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}
udp_send_channel {
  #mcast_join = 239.2.11.71
  host = master
  port = 8649
  ttl = 1
}
udp_recv_channel {
  #mcast_join = 239.2.11.71
  port = 8649
  #bind = 239.2.11.71
  #retry_bind = true
}

 

5. Integrate HDFS and YARN with Ganglia

vi /usr/local/hadoop/etc/hadoop/hadoop-metrics2.properties

Comment out all existing values in the file, then replace them with the following configuration

# for Ganglia 3.1 support
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10
# default for supportsparse is false
*.sink.ganglia.supportsparse=true
*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40
# For the host, refer to the definition in gmond.conf
namenode.sink.ganglia.servers=master:8649
datanode.sink.ganglia.servers=master:8649
resourcemanager.sink.ganglia.servers=master:8649
nodemanager.sink.ganglia.servers=master:8649
mrappmaster.sink.ganglia.servers=master:8649
jobhistoryserver.sink.ganglia.servers=master:8649
# Note the parameters below: changing them may generate so much data that Ganglia's disk space fills up quickly.
# Switch off container metrics
*.source.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter
nodemanager.*.source.filter.exclude=*ContainerResource*
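
Java-style .properties files generally treat # as a comment marker only at the start of a line, so text placed after a value usually ends up inside that value. The sketch below is a simple heuristic for spotting such lines in the metrics files; the helper name find_inline_comments is hypothetical, and a value that legitimately contains " #" would be flagged too.

```shell
# Hypothetical helper: list "key=value # comment" lines with line numbers.
# Lines that start with # or ! are full-line comments and are skipped.
find_inline_comments() {
  grep -nE '^[^#!][^=]*=.*[[:space:]]#' "$1"
}

# Example (not run here):
#   find_inline_comments /usr/local/hadoop/etc/hadoop/hadoop-metrics2.properties
```

Any line it prints is worth rewriting so that the comment sits on its own line above the property.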

 

6. Integrate HBase with Ganglia

vi /usr/local/hbase/conf/hadoop-metrics2-hbase.properties

Comment out all existing values in the file, then replace them with the following configuration

*.sink.file*.class=org.apache.hadoop.metrics2.sink.FileSink
# default sampling period
*.period=10
*.source.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter
*.record.filter.class=${*.source.filter.class}
*.metric.filter.class=${*.source.filter.class}
hbase.sink.ganglia.record.filter.exclude=*Regions*
hbase.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
hbase.sink.ganglia.tagsForPrefix.jvm=ProcessName
*.sink.ganglia.period=20
# For the host, refer to the definition in gmond.conf
hbase.sink.ganglia.servers=master:8649

 

7. Start gmond on the monitored nodes and enable it at boot

systemctl start gmond
systemctl enable gmond

 

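8. Check that the services are working

Restart the Hadoop and HBase services on the master server, then open http://192.168.10.90:10080/ganglia/ to view the monitoring page.

On the master server, run the following commands to check the status of the monitoring services

gstat -a

systemctl status gmetad -l

If no monitoring information appears, use the commands above to check the service status. If gmetad and gmond are both running normally but the web page shows no graph data, run systemctl restart gmetad on the master server to restart the monitoring service, and run systemctl restart gmond on all three servers to restart the metric collection service.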
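
When the page has no graphs for some hosts, it also helps to see which hosts gmetad has written RRD files for. By default gmetad stores them under /var/lib/ganglia/rrds/<cluster>/<host>/. The following is a minimal sketch; the helper name missing_rrd_hosts is hypothetical, and the directory names depend on how each gmond reports its hostname (short name, FQDN, or IP).

```shell
# Hypothetical helper: print each host with no RRD directory under
# RRD_ROOT/CLUSTER. Usage: missing_rrd_hosts RRD_ROOT CLUSTER HOST...
missing_rrd_hosts() (
  rrd_root=$1
  cluster=$2
  shift 2
  for host in "$@"; do
    [ -d "$rrd_root/$cluster/$host" ] || printf '%s\n' "$host"
  done
)

# Example (not run here):
#   missing_rrd_hosts /var/lib/ganglia/rrds server master master-backup node1 node2 node3
```

Hosts that it prints have not delivered any metrics to gmetad yet, so those are the ones where restarting gmond is worth trying first.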

 

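Copyright notice: This article was originally published on cnblogs (博客园) by AllEmpty. Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a clearly visible link to the original article must be given on the page; otherwise it will be treated as infringement.

Author's blog: http://www.cnblogs.com/EmptyFS/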
