Hive Installation and Configuration Study Notes

zhulinyanyu 2012-11-24

If you repost this, please credit the source, SpringsSpace: http://springsfeng.iteye.com

1.  First install MySQL and change the root account password, then run the following commands as root:

      su - root

      mysql

      GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY 'root' WITH GRANT OPTION; 

2.  Create the Hive user. Run the following commands as root:

      su - root

      mysql -uroot -p

      CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive'; 

      CREATE USER 'hive'@'linux-fdc.linux.com' IDENTIFIED BY 'hive'; 

      CREATE USER 'hive'@'192.168.81.251' IDENTIFIED BY 'hive'; 

      CREATE DATABASE metastore DEFAULT CHARACTER SET latin1 DEFAULT COLLATE latin1_swedish_ci; 

     
      GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'localhost' IDENTIFIED BY 'hive' WITH GRANT OPTION; 

      GRANT ALL PRIVILEGES ON metastore.* TO 'hive'@'192.168.81.251' IDENTIFIED BY 'hive' WITH GRANT OPTION; 
      flush privileges;

3.  Import the MySQL schema script

     Log in with the hive account:

     mysql -uhive -p -h192.168.81.251

     mysql> use metastore;
     Database changed
     mysql> source /opt/custom/hive-0.11.0/scripts/metastore/upgrade/mysql/hive-schema-0.10.0.mysql.sql

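4.  Install and configure Hive

     (1) Build: for the current Hive-0.11.0-SNAPSHOT version

     Download the latest Hive source package, hive-trunk.zip, extract it to /home/kevin/Downloads/hive-trunk, and edit the following entries in build.properties: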

...
hadoop-0.20.version=0.20.2
hadoop-0.20S.version=1.1.2
hadoop-0.23.version=2.0.3-alpha
...

    To change the version of another dependency, edit the libraries.properties file in the ivy directory. For example, to change the HBase version:

    ...

    guava-hadoop23.version=11.0.2
    hbase.version=0.94.6
    jackson.version=1.8.8

    ...
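
    The version edits above can also be scripted rather than made by hand. The following is a minimal sketch, not part of the Hive build itself: set_property is a hypothetical helper name, and it assumes each property sits on its own line in key=value form.

```shell
#!/bin/sh
# Hypothetical helper: set key=value in a Java-style properties file.
# An existing "key=" line is replaced; otherwise the pair is appended.
set_property() {
    key=$1
    value=$2
    file=$3
    tmp="$file.tmp"
    awk -v k="$key" -v v="$value" '
        # Literal prefix match, so dots in the key are not treated as regex.
        index($0, k "=") == 1 { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example calls, run from the source root (values taken from the notes above):
# set_property hadoop-0.23.version 2.0.3-alpha build.properties
# set_property hbase.version 0.94.6 ivy/libraries.properties
```

    The match is on a literal "key=" prefix, so hadoop-0.20.version and hadoop-0.20S.version are treated as distinct keys.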

    Run the following in the current directory:

     ant tar -Dforrest.home=/usr/custom/apache-forrest-0.9

     See section 3.

     (2)  Extract: copy hive-0.11.0-SNAPSHOT.tar.gz from the build directory produced by the compile to /usr/custom/ and extract it.

     (3)  Set the environment variables:

     export HIVE_HOME=/usr/custom/hive-0.11.0
     export PATH=$HIVE_HOME/bin:$PATH
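
     To make these variables persistent, the export lines can be appended to a shell profile. The sketch below is one way to do that without duplicating lines on repeated runs; append_once is a hypothetical helper name, and the profile path in the example is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
append_once() {
    line=$1
    file=$2
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example calls (the profile path is an assumption):
# append_once 'export HIVE_HOME=/usr/custom/hive-0.11.0' "$HOME/.bashrc"
# append_once 'export PATH=$HIVE_HOME/bin:$PATH' "$HOME/.bashrc"
```

     The single quotes in the example keep $HIVE_HOME and $PATH unexpanded, so they are evaluated when the profile is sourced rather than when the line is written.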

     (4) Configuration files:

     Copy the .template files in the conf directory to create the corresponding .xml or .properties files:
     cp hive-default.xml.template hive-site.xml
     cp hive-log4j.properties.template hive-log4j.properties

     (5) Configure hive-config.sh:

...
#
# processes --config option from command line
#

export JAVA_HOME=/usr/custom/jdk1.6.0_43
export HIVE_HOME=/usr/custom/hive-0.11.0
export HADOOP_HOME=/usr/custom/hadoop-2.0.3-alpha


this="$0"
while [ -h "$this" ]; do
  ls=`ls -ld "$this"`
  link=`expr "$ls" : '.*-> \(.*\)$'`
  if expr "$link" : '.*/.*' > /dev/null; then
    this="$link"
  else
    this=`dirname "$this"`/"$link"
  fi
done
...

    (6) Configure logging in hive-log4j.properties: special handling for version 0.10.0.

     Change org.apache.hadoop.metrics.jvm.EventCounter to org.apache.hadoop.log.metrics.EventCounter. This resolves the following warning:

     WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. 
     Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
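
     The class-name change can be applied with sed instead of a manual edit. This is a minimal sketch; fix_event_counter is a hypothetical helper name, and it writes to a temporary file rather than relying on sed -i, whose syntax differs between sed implementations.

```shell
#!/bin/sh
# Hypothetical helper: replace the deprecated EventCounter class name
# with the new one in the given log4j properties file.
fix_event_counter() {
    file=$1
    tmp="$file.tmp"
    sed 's/org\.apache\.hadoop\.metrics\.jvm\.EventCounter/org.apache.hadoop.log.metrics.EventCounter/g' \
        "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example call (path follows the install location used in these notes):
# fix_event_counter /usr/custom/hive-0.11.0/conf/hive-log4j.properties
```

     Running it a second time changes nothing, because the new class name does not contain the old one.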

     (7) Create the hive-site.xml file:

<configuration>

	<!-- WARNING!!! This file is provided for documentation purposes ONLY! -->
	<!-- WARNING!!! Any changes you make to this file will be ignored by Hive. -->
	<!-- WARNING!!! You must make your changes in hive-site.xml instead. -->

	<!-- Hive Execution Parameters -->
	<property>
		<name>javax.jdo.option.ConnectionURL</name>
		<value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
		<description>
			JDBC connect string for a JDBC metastore.
			Note: the text between the value tags above must not have leading or
			trailing whitespace; otherwise the Hive client reports:
			FAILED: Error in metadata: java.lang.RuntimeException: Unable 
			to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
		</description>
	</property>
	
	<property>
		<name>javax.jdo.option.ConnectionDriverName</name>
		<value>com.mysql.jdbc.Driver</value>
		<description>Driver class name for a JDBC metastore</description>
	</property>

	<property>
		<name>javax.jdo.option.ConnectionUserName</name>
		<value>hive</value>
		<description>username to use against metastore database</description>
	</property>

	<property>
		<name>javax.jdo.option.ConnectionPassword</name>
		<value>hive</value>
		<description>password to use against metastore database</description>
	</property>

	<property>
		<name>hive.metastore.uris</name>
		<value>thrift://linux-fdc.linux.com:8888</value>
		<description>
			Thrift uri for the remote metastore. Used by metastore
			client to connect to remote metastore.
		</description>
	</property>

</configuration>
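
    Since the note in the description says stray whitespace inside a value element breaks the metastore client, a quick scan of the file before starting Hive can help. The sketch below is a rough line-based check, not an XML parser; check_value_whitespace is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print (with line numbers) every line where the text
# inside <value>...</value> starts or ends with whitespace, including the
# case where a tag is followed or preceded by a line break.
# Returns 0 when the file is clean, 1 when a suspicious line was found.
check_value_whitespace() {
    if grep -nE '<value>([[:space:]]|$)|(^|[[:space:]])</value>' "$1"; then
        return 1
    fi
    return 0
}

# Example call (path follows the install location used in these notes):
# check_value_whitespace /usr/custom/hive-0.11.0/conf/hive-site.xml
```

    Because grep works line by line, this only inspects the characters directly next to the tags; it does not validate the XML as a whole.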

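    (8) Install the MySQL Connector/J driver

    Download mysql-connector-java-5.1.22-bin.jar and put it in the /usr/custom/hive-0.11.0/lib directory; otherwise the show tables; command fails with an error that the driver named in ConnectionDriverName cannot be found.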
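
    Whether the driver jar is in place can be checked before starting Hive. This is a minimal sketch; has_mysql_connector is a hypothetical helper name, and the file-name pattern assumes the jar keeps its original mysql-connector-java-*.jar name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given lib directory contains a
# MySQL Connector/J jar.
has_mysql_connector() {
    lib_dir=$1
    for jar in "$lib_dir"/mysql-connector-java-*.jar; do
        # If the glob matched nothing, $jar is the literal pattern and -f fails.
        [ -f "$jar" ] && return 0
    done
    return 1
}

# Example call (path follows the install location used in these notes):
# has_mysql_connector /usr/custom/hive-0.11.0/lib || echo "MySQL driver jar is missing"
```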

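5. Start and use Hive

    (1) Start
    Change to the bin directory and run the command: hive
   (2) List the current databases and tables
   -- The default database is named default.
   show databases;
   show tables;
   (3) Table creation examples

   This part is my own test; the test data is in the attachment.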

  
   CREATE TABLE cite (citing INT, cited INT)
   ROW FORMAT DELIMITED
   FIELDS TERMINATED BY ','
   STORED AS TEXTFILE;
    
   CREATE TABLE cite_count (cited INT, count INT);
    
   INSERT OVERWRITE TABLE cite_count
   SELECT cited,COUNT(citing)
   FROM cite
   GROUP BY cited;
    
   SELECT * FROM cite_count WHERE count > 10 LIMIT 10;
    
   CREATE TABLE age (name STRING, birthday INT)
   ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\t'
   LINES TERMINATED BY '\n'
   STORED AS TEXTFILE;
    
   CREATE TABLE age_out (birthday INT, birthday_count INT)
   ROW FORMAT DELIMITED
   FIELDS TERMINATED BY '\t'
   LINES TERMINATED BY '\n'
   STORED AS TEXTFILE;
   (4) View the table structure
   describe cite;
   (5) Load data
   hive> LOAD DATA LOCAL INPATH '/home/kevin/Documents/age.txt' OVERWRITE INTO TABLE age;
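
   Hive does not reject rows that do not fit the table definition when a file is loaded; fields that are missing or cannot be parsed show up as NULL at query time. It can therefore be worth checking the field count of a data file before running LOAD DATA. The sketch below does that with awk; check_columns is a hypothetical helper name, and the delimiter and column count must match the table's FIELDS TERMINATED BY clause and column list.

```shell
#!/bin/sh
# Hypothetical helper: verify that every line of a delimited text file has
# the expected number of fields. Prints each offending line number and
# returns non-zero if any line does not match.
check_columns() {
    file=$1
    delim=$2
    expected=$3
    awk -F"$delim" -v n="$expected" '
        NF != n { printf "line %d: %d fields\n", NR, NF; bad = 1 }
        END { exit bad }
    ' "$file"
}

# Example calls; the cite.txt path is an assumption, age.txt is the file
# loaded above:
# check_columns /home/kevin/Documents/age.txt '\t' 2
# check_columns /home/kevin/Documents/cite.txt ',' 2
```

   Empty lines are reported as well, since they have zero fields.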
