tugangkai 2015-02-03
Loading a local file into a Hive table
1. Create the table in the hxl database
hive> create table tb_emp_info
> (id int,
> name string,
> age int,
> tel string)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|'
> STORED AS TEXTFILE;
OK
Time taken: 0.296 seconds
hive> show tables in hxl;
OK
tb_emp_info
Time taken: 0.073 seconds
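Before loading, the column definitions can be double-checked with describe (a quick sanity check; the exact output formatting varies by Hive version):

```sql
-- List the columns and types of the newly created table
hive> describe hxl.tb_emp_info;
```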
2. Prepare the data to load
[hadoop1@node1 hive]$ more tb_emp_info.txt
1|name1|25|13188888888888
2|name2|30|13888888888888
3|name3|3|147896221
4|name4|56|899314121
5|name5|12|899314121
6|name6|9|899314121
7|name7|32|899314121
8|name8|42|158964
9|name9|86|899314121
10|name10|45|789541
3. Load the file from the local filesystem
Change into the directory containing tb_emp_info.txt, then run hive to enter the Hive CLI:
[hadoop1@node1 hive]$ hive
hive> use hxl;
OK
Time taken: 0.103 seconds
hive> load data local inpath 'tb_emp_info.txt' into table tb_emp_info;
Copying data from file:/home/hadoop1/file/hive/tb_emp_info.txt
Copying file: file:/home/hadoop1/file/hive/tb_emp_info.txt
Loading data to table hxl.tb_emp_info
OK
Time taken: 0.694 seconds
If the table is partitioned, you must specify the target partition when loading, e.g.:
hive> load data local inpath 'login.txt' into table tb_sso_ver_login_day partition(statedate=20141201);
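Also note that load data ... into table appends to whatever is already in the table (or partition). To replace the existing contents instead, add the overwrite keyword; a sketch using the same table as above:

```sql
-- OVERWRITE first clears the table (or partition), then loads the file
hive> load data local inpath 'tb_emp_info.txt' overwrite into table tb_emp_info;
```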
4. Check the loaded data
hive> select * from tb_emp_info;
OK
1 name1 25 13188888888888
2 name2 30 13888888888888
3 name3 3 147896221
4 name4 56 899314121
5 name5 12 899314121
6 name6 9 899314121
7 name7 32 899314121
8 name8 42 158964
9 name9 86 899314121
10 name10 45 789541
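A quick aggregate is another way to confirm the load; it should report the ten rows from tb_emp_info.txt (unlike the plain select *, this launches a MapReduce job):

```sql
-- Count the loaded rows
hive> select count(*) from tb_emp_info;
```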
5. You can also inspect the file backing this table in its HDFS directory
hive> dfs -ls /user/hive/warehouse/hxl.db/tb_emp_info;
Found 1 items
-rw-r--r-- 3 hadoop1 supergroup 214 2014-10-28 17:31 /user/hive/warehouse/hxl.db/tb_emp_info/tb_emp_info.txt
Loading an HDFS file into a Hive table
1. View the file on HDFS
$hadoop fs -cat /user/hadoop1/myfile/tb_class.txt
Partial output:
0|班级0|2014-10-29 14:10:17|2014-10-29 14:10:17
1|班级1|2014-10-29 14:10:17|2014-10-29 14:10:17
2|班级2|2014-10-29 14:10:17|2014-10-29 14:10:17
3|班级3|2014-10-29 14:10:17|2014-10-29 14:10:17
4|班级4|2014-10-29 14:10:17|2014-10-29 14:10:17
5|班级5|2014-10-29 14:10:17|2014-10-29 14:10:17
6|班级6|2014-10-29 14:10:17|2014-10-29 14:10:17
7|班级7|2014-10-29 14:10:17|2014-10-29 14:10:17
8|班级8|2014-10-29 14:10:17|2014-10-29 14:10:17
2. Create the table
create table tb_class_info
(id int,
class_name string,
createtime timestamp,
modifytime timestamp)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;
3. Load the file into the table
load data inpath '/user/hadoop1/myfile/tb_class.txt' into table tb_class_info;
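One difference worth noting: without the local keyword, load data moves the HDFS source file into the table's warehouse directory rather than copying it, so /user/hadoop1/myfile/tb_class.txt will no longer exist at its original path afterwards. This can be confirmed from within the Hive CLI (the path below assumes the default warehouse location, as in the earlier example):

```sql
-- The loaded file now lives under the table's directory in the warehouse
hive> dfs -ls /user/hive/warehouse/hxl.db/tb_class_info;
-- ...and the original source directory no longer contains tb_class.txt
hive> dfs -ls /user/hadoop1/myfile;
```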