Notes on setting up HDFS, Hive, and HBase

eternityzzy 2020-07-19

Installing the JDK

  • Upload the JDK archive to the Linux machine

  • Extract and rename

    [ software]# tar -zxvf jdk-8u221-linux-x64.tar.gz -C /usr/local/
    [ software]# cd /usr/local
    [ local]# mv jdk1.8.0_221/ jdk
  • Configure the environment variables

    [ local]# vi /etc/profile
    ......(omitted)......
    # java environment
    JAVA_HOME=/usr/local/jdk
    PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
    
    export JAVA_HOME PATH
  • Reload the profile

    [ local]# source /etc/profile
  • Verify the configuration

    [ local]# java -version
    [ local]# javac

Setting up fully distributed HDFS

  • Upload and extract Hadoop

    [ ~]# tar -zxvf hadoop-2.7.6.tar.gz -C /usr/local/
  • Rename

    [ ~]# cd /usr/local
    [ local]# mv hadoop-2.7.6/ hadoop
  • Configure the environment variables

    [ local]# vi /etc/profile
    ......(omitted)......
    #hadoop environment
    export HADOOP_HOME=/usr/local/hadoop
    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
  • Reload the profile

    [ local]# source /etc/profile
  • Verify the installation

    [ local]# hadoop version

Layout

qianfeng01:    namenode             datanode    resourcemanager    nodemanager
qianfeng02:    secondarynamenode    datanode                       nodemanager
qianfeng03:                         datanode                       nodemanager
  • Configure core-site.xml

    <!--  Name of the fully distributed file system: schema, IP, port -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://qianfeng01/</value>
    </property>
    
    <!--  Base path that the file system's other paths depend on. Do not keep the default in a fully distributed setup: the default is a temporary path, and Linux may delete its contents on reboot -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/usr/local/hadoop/tmp</value>
    </property>
  • Configure hdfs-site.xml

    <!--  Storage path for the files managed by the namenode daemon -->
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>file://${hadoop.tmp.dir}/dfs/name</value>
    </property>
    
    <!--  Storage path for the files managed by the datanode daemon -->
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file://${hadoop.tmp.dir}/dfs/data</value>
    </property>
    
    
    <!--  Number of replicas of each HDFS block -->
    <property>
      <name>dfs.replication</name>
      <value>3</value>
    </property>
    
    <!--  HDFS block size in bytes; the default is 128 MB -->
    <property>
      <name>dfs.blocksize</name>
      <value>134217728</value>
    </property>
    
    <!--  IP and port of the secondarynamenode's HTTP service -->
    <property>
      <name>dfs.namenode.secondary.http-address</name>
      <value>qianfeng02:50090</value>
    </property>
  • Configure mapred-site.xml

    <!--  Name of the framework that MapReduce programs run on -->
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
    
    <!--  IP and port of the MapReduce job history server -->
    <property>
      <name>mapreduce.jobhistory.address</name>
      <value>qianfeng01:10020</value>
    </property>
    
    <!--  IP and port of the job history web UI -->
    <property>
      <name>mapreduce.jobhistory.webapp.address</name>
      <value>qianfeng01:19888</value>
    </property>
  • Configure yarn-site.xml

    <!--  Configure the auxiliary shuffle service that YARN relies on -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    
    <!--  Hostname of the machine running the resourcemanager -->
    <property>
        <description>The hostname of the RM.</description>
        <name>yarn.resourcemanager.hostname</name>
        <value>qianfeng01</value>
    </property>
  • Configure hadoop-env.sh

    [ hadoop]# vi hadoop-env.sh
    .........
    # The java implementation to use.
    export JAVA_HOME=/usr/local/jdk
  • Configure yarn-env.sh

    [ hadoop]# vi yarn-env.sh
    .........
    # some Java parameters
    export JAVA_HOME=/usr/local/jdk
  • Configure the slaves file (important)

    Hostnames of the machines that run the datanode daemon
    
    [ hadoop]# vi slaves
    qianfeng01
    qianfeng02
    qianfeng03
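The guide stops at the slaves file, but before HDFS can run, the configured installation has to reach the other nodes, the NameNode must be formatted, and the daemons started. A minimal sketch of these remaining steps, assuming the three hostnames resolve and passwordless SSH is already in place:

```shell
# Copy the configured Hadoop installation and environment file to the other nodes
scp -r /usr/local/hadoop qianfeng02:/usr/local/
scp -r /usr/local/hadoop qianfeng03:/usr/local/
scp /etc/profile qianfeng02:/etc/
scp /etc/profile qianfeng03:/etc/

# Format the NameNode -- on qianfeng01 only, and only once
hdfs namenode -format

# Start HDFS and YARN from qianfeng01
start-dfs.sh
start-yarn.sh

# Each node should now show its daemons from the layout table (namenode, datanode, ...)
jps
```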

Passwordless SSH login

  • Generate a key pair

    ssh-keygen -t rsa        # press Enter at every prompt
  • Distribute the public key; copying it to this machine alone is enough, since the clones made later will then accept ssh from each other

    Syntax: ssh-copy-id -i <public key file> <remote user>@<remote machine IP>
    
    Effect: copies the current user's public key into the remote user's hidden ~/.ssh directory on the remote machine, automatically renaming it authorized_keys.
    
    Note: the .ssh directory must have permission 700,
    	and authorized_keys permission 600
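Following the syntax above, a sketch of the whole key setup, assuming the root user and the hostname qianfeng01 (per the note, copying to the local machine alone is enough before cloning):

```shell
# Generate the key pair (press Enter at every prompt)
ssh-keygen -t rsa

# Copy the public key to this machine itself; clones made afterwards inherit it
ssh-copy-id -i ~/.ssh/id_rsa.pub root@qianfeng01

# Verify: this should log in without asking for a password
ssh qianfeng01 hostname
```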

Disabling firewalld, NetworkManager, and SELinux

Check the service status:	systemctl status firewalld

Stop now:	systemctl stop firewalld
Start now:	systemctl start firewalld

Disable at boot: systemctl disable firewalld        	# takes effect on next boot
Enable at boot: 	systemctl enable firewalld		# takes effect on next boot



systemctl  status NetworkManager
systemctl  start   NetworkManager
systemctl  stop  NetworkManager
systemctl  disable NetworkManager
systemctl  enable  NetworkManager


[ ~]# vi /etc/selinux/config

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=enforcing		<----  change enforcing to disabled
# SELINUXTYPE= can take one of three values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
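Editing /etc/selinux/config only takes effect after a reboot; to stop enforcement for the current session as well, something like:

```shell
# Switch SELinux to permissive mode immediately (root required)
setenforce 0

# Verify the current mode: prints "Permissive" now,
# "Disabled" after rebooting with SELINUX=disabled
getenforce
```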

Cloning the virtual machines

  • Change the hostname

    ]# hostnamectl set-hostname qianfeng02
  • Change the IP address

    [ ~]# vi /etc/sysconfig/network-scripts/ifcfg-ens33
    
    TYPE=Ethernet
    BOOTPROTO=static
    NAME=ens33
    DEVICE=ens33
    ONBOOT=yes
    
    
    IPADDR=192.168.10.102			<--- only the IP address needs to change; leave the rest alone
    NETMASK=255.255.255.0
    GATEWAY=192.168.10.2
    DNS1=192.168.10.2
    DNS2=8.8.8.8
    DNS3=114.114.114.114
  • Restart the network and check the IP

    systemctl restart network
    ip addr
    
    ping an external host
    ping the host machine from the VM
    ping the VM from the host machine
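For the pings above (and every configuration file that refers to qianfeng01-03 by name) to work, each node needs hostname mappings, a step the guide does not show. A sketch of /etc/hosts entries, where the addresses for qianfeng01 and qianfeng03 are assumptions following the 192.168.10.x scheme used above:

```shell
# Append the cluster hostname mappings on every node
# (adjust the addresses to your own network)
cat >> /etc/hosts <<'EOF'
192.168.10.101 qianfeng01
192.168.10.102 qianfeng02
192.168.10.103 qianfeng03
EOF
```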

Installing MySQL

  • Upload and extract the bundle

    [ ~]# mkdir -p /usr/local/mysql
    [ ~]# tar -xvf mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar -C /usr/local/mysql
    
    mysql-community-embedded-5.7.28-1.el7.x86_64.rpm
    mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm
    mysql-community-devel-5.7.28-1.el7.x86_64.rpm
    mysql-community-embedded-compat-5.7.28-1.el7.x86_64.rpm
    mysql-community-libs-5.7.28-1.el7.x86_64.rpm
    mysql-community-test-5.7.28-1.el7.x86_64.rpm
    mysql-community-common-5.7.28-1.el7.x86_64.rpm
    mysql-community-embedded-devel-5.7.28-1.el7.x86_64.rpm
    mysql-community-client-5.7.28-1.el7.x86_64.rpm
    mysql-community-server-5.7.28-1.el7.x86_64.rpm
  • Install MySQL's dependency perl, and remove the conflicting mariadb packages

    [ ~]# yum -y install perl
    [ ~]# yum -y install net-tools
    [ ~]# rpm -qa | grep mariadb
    mariadb-libs-5.5.64-1.el7.x86_64
    [ ~]# rpm -e mariadb-libs-5.5.64-1.el7.x86_64 --nodeps
  • Install the MySQL rpm packages in dependency order

    [ ~]# rpm -ivh mysql-community-common-5.7.28-1.el7.x86_64.rpm
    [ ~]# rpm -ivh mysql-community-libs-5.7.28-1.el7.x86_64.rpm
    [ ~]# rpm -ivh mysql-community-client-5.7.28-1.el7.x86_64.rpm
    [ ~]# rpm -ivh mysql-community-server-5.7.28-1.el7.x86_64.rpm
  • Start the mysqld service and check its status

    [ ~]# systemctl start mysqld
    [ ~]# systemctl status mysqld
    ● mysqld.service - MySQL Server
       Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
       Active: active (running) since Fri 2020-05-29 11:25:57 CST; 9s ago
         Docs: man:mysqld(8)
               http://dev.mysql.com/doc/refman/en/using-systemd.html
      Process: 2406 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
      Process: 2355 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
     Main PID: 2409 (mysqld)
       CGroup: /system.slice/mysqld.service
               └─2409 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid
    
    May 29 11:25:52 qianfeng01 systemd[1]: Starting MySQL Server...
    May 29 11:25:57 qianfeng01 systemd[1]: Started MySQL Server.
  • Look up MySQL's initial password (written to /var/log/mysqld.log, which is created when the service first starts)

    [ ~]# cat /var/log/mysqld.log | grep password
  • Log in with the initial password

    [ ~]# mysql -uroot -p'<initial password>'
  • After logging in, lower the password validation policy to LOW, and reduce the minimum length to 6.

    set global validate_password_policy=low;
    set global validate_password_length=6;
    
    Check that the policy change took effect:
    show variables like '%validate_password%';
  • Change the password

    alter user 'root'@'localhost' identified by '<new password>';
  • To connect to MySQL remotely, grant remote access (be sure the VM firewall is off)

    *.*: every table in every database
    "%": the root user connecting from any IP
    
    grant all privileges on *.* to 'root'@'%' identified by '111111' with grant option;
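After granting, the privilege tables can be reloaded and the remote login verified. The hostname qianfeng03 is an assumption (it is where the Hive configuration later points its JDBC URL), and 111111 is the password from the grant above:

```shell
# Reload the privilege tables so the grant takes effect immediately
mysql -uroot -p111111 -e "flush privileges;"

# From another node, check that the remote login works
mysql -h qianfeng03 -uroot -p111111 -e "select 1;"
```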

Installing Hive

  • Upload, extract, and rename

    [ local]# tar -zxvf apache-hive-2.1.1-bin.tar.gz -C /usr/local
    [ local]# mv apache-hive-2.1.1-bin/ hive
  • Configure the environment variables

    [ local]# vi /etc/profile
    # add the following:
    export HIVE_HOME=/usr/local/hive
    export PATH=$HIVE_HOME/bin:$PATH
    
    # reload the profile
    [ local]# source /etc/profile
  • hive-env.sh

    export HIVE_CONF_DIR=/usr/local/hive/conf 
    export JAVA_HOME=/usr/local/jdk 
    export HADOOP_HOME=/usr/local/hadoop 
    export HIVE_AUX_JARS_PATH=/usr/local/hive/lib
  • hive-site.xml

    <!-- Location of the Hive warehouse on HDFS -->
    <property>
    		<name>hive.metastore.warehouse.dir</name>
    		<value>/user/hive/warehouse</value>
    		<description>location of default database for the warehouse</description>
    </property>
    
    <!-- Hive's scratch (temporary file) directory -->
    <property>
    		<name>hive.exec.scratchdir</name>
    		<value>/tmp/hive</value>
    </property>
    
    <!-- JDBC URL for the MySQL connection -->
    <property>
    		<name>javax.jdo.option.ConnectionURL</name>
    		<value>jdbc:mysql://qianfeng03:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=latin1</value>
    </property>
    <!-- MySQL driver class -->
    <property>
    		<name>javax.jdo.option.ConnectionDriverName</name>
    		<value>com.mysql.jdbc.Driver</value>
    </property>
    <!-- MySQL username -->
    <property>
    		<name>javax.jdo.option.ConnectionUserName</name>
    		<value>root</value>
    		</property>
    <!-- Password for the remote MySQL login -->
    <property>
    		<name>javax.jdo.option.ConnectionPassword</name>
    		<value>111111</value>
    </property>
    
    
    <!-- Local scratch space for Hive jobs -->
    <property>
    		<name>hive.exec.local.scratchdir</name>
    		<value>/usr/local/hive/iotmp/root</value>
    </property>
    <!-- Top-level directory for operation logs, if logging is enabled -->
    <property>
    		<name>hive.server2.logging.operation.log.location</name>
    		<value>/usr/local/hive/iotmp/root/operation_logs</value>
    </property>
    <!-- Location of Hive's structured runtime logs -->
    <property>
    		<name>hive.querylog.location</name>
    		<value>/usr/local/hive/iotmp/root</value>
    </property>
    <!-- Temporary local directory for resources added from remote file systems -->
    <property>
    		<name>hive.downloaded.resources.dir</name>
    		<value>/usr/local/hive/iotmp/${hive.session.id}_resources</value>
    </property>
    			
    			
    Note: for remote mode, add the following properties to Hadoop's core-site.xml
    <property>
    		<name>hadoop.proxyuser.root.hosts</name>
    		<value>*</value>
    </property>
    <property>
    		<name>hadoop.proxyuser.root.groups</name>
    		<value>*</value>
    </property>
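Two steps this section does not show: Hive needs the MySQL JDBC driver on its classpath, and Hive 2.x requires the metastore schema to be initialized before first use. A sketch, where the driver jar name and version are examples rather than a file named in the guide:

```shell
# Put the MySQL JDBC driver on Hive's classpath
# (jar name is an example; use the driver you actually downloaded)
cp mysql-connector-java-5.1.47.jar /usr/local/hive/lib/

# Initialize the metastore schema in MySQL (required by Hive 2.x)
schematool -dbType mysql -initSchema

# For remote mode, run the metastore service in the background, then test
nohup hive --service metastore > /dev/null 2>&1 &
hive -e "show databases;"
```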

Fully distributed HBase

Installing ZooKeeper

  • Upload, extract, and rename

    [ ~]# tar -zxvf zookeeper-3.4.10.tar.gz -C /usr/local/
    [ ~]# cd /usr/local/
    [ local]# mv zookeeper-3.4.10 zookeeper
  • Configure the environment variables

    [ local]# vi /etc/profile
    ......(omitted)......
    export ZOOKEEPER_HOME=/usr/local/zookeeper
    export PATH=$ZOOKEEPER_HOME/bin:$PATH
  • Reload

    [ local]# source /etc/profile
  • Verify the configuration: press Tab and check that the zookeeper scripts are suggested

  • In the conf directory, make a copy named zoo.cfg

    cp zoo_sample.cfg  zoo.cfg
  • Edit zoo.cfg

    dataDir=/usr/local/zookeeper/zkData
    clientPort=2181
    server.1=qianfeng01:2888:3888
    server.2=qianfeng02:2888:3888
    server.3=qianfeng03:2888:3888
  • If the directory set in dataDir does not exist, create it

    mkdir /usr/local/zookeeper/zkData
  • Create a myid file under zkData and write the node's number into it

  • scp the /etc/profile file and the zookeeper directory to the other machines

  • Start the ZooKeeper cluster; run the following commands on every machine

    zkServer.sh start
    zkServer.sh status
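The myid and distribution bullets above can be sketched as follows; note that scp copies the myid file too, so it has to be rewritten with the matching server number on each node:

```shell
# On qianfeng01: this node is server.1 in zoo.cfg
mkdir -p /usr/local/zookeeper/zkData
echo 1 > /usr/local/zookeeper/zkData/myid

# Copy ZooKeeper and the environment file to the other nodes
scp -r /usr/local/zookeeper qianfeng02:/usr/local/
scp -r /usr/local/zookeeper qianfeng03:/usr/local/
scp /etc/profile qianfeng02:/etc/
scp /etc/profile qianfeng03:/etc/

# Overwrite the copied myid with each node's own number
ssh qianfeng02 'echo 2 > /usr/local/zookeeper/zkData/myid'
ssh qianfeng03 'echo 3 > /usr/local/zookeeper/zkData/myid'

# Then on every node:
zkServer.sh start
zkServer.sh status   # one node reports "leader", the others "follower"
```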

Installing HBase

  • Upload, extract, and rename

    [ software]# tar -zxvf hbase-1.2.1-bin.tar.gz -C /opt/apps/
  • Configure hbase-env.sh

    [ conf]# vi hbase-env.sh
    
    # The java implementation to use.  Java 1.7+ required.
    export JAVA_HOME=/opt/apps/jdk1.8.0_45
    
    # Tell HBase whether it should manage its own instance of Zookeeper or not.
    # false here, since the external ZooKeeper cluster installed above is used.
    export HBASE_MANAGES_ZK=false
  • hbase-site.xml

    <configuration>
            <property>
                    <name>hbase.cluster.distributed</name>
                    <value>true</value>
            </property>
            <property>
                    <name>hbase.rootdir</name>
                    <value>hdfs://qphone01:9000/hbase</value>
            </property>
            <property>
                    <name>hbase.zookeeper.quorum</name>
                    <value>qphone01,qphone02,qphone03</value>
            </property>
    </configuration>
  • regionservers

    qphone01
    qphone02
    qphone03
  • scp the /etc/profile file and the hbase directory to the other machines

  • In HBase's conf directory on qphone01, create a backup-masters file listing qphone02 as the standby master

    qphone02
  • Test

    [ apps]# start-hbase.sh
    
    http://192.168.49.200:16010/master-status
  • Check the processes with jps

    • HMaster          // required; indicates HBase is up. Two masters are configured, one active and one standby, so qphone01 and qphone02 each run an HMaster
    • QuorumPeerMain   // the separately configured ZooKeeper cluster (this process would be HQuorumPeer if HBase managed ZooKeeper itself); indicates ZooKeeper is up
  • Enter the HBase shell

    hbase shell
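A quick smoke test once the shell is available; the table name t1 and column family cf are arbitrary examples, not names from the guide:

```shell
# Create a table, write one cell, read it back, then clean up
hbase shell <<'EOF'
create 't1', 'cf'
put 't1', 'row1', 'cf:name', 'hello'
scan 't1'
disable 't1'
drop 't1'
EOF
```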
