李春春 2017-03-29
SQL executed in Hive is converted into one or more MapReduce jobs that run on the cluster.
1. Create a table and load the text file to be counted
create table flwordcount(line string);
load data inpath '/user/hadoop/input/readme.txt' overwrite into table flwordcount;
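LOAD DATA INPATH expects readme.txt to already exist at that HDFS path, and for a managed table it moves (rather than copies) the file into the table's warehouse directory. If the file instead sits on the local filesystem of the HiveServer2 host, a LOCAL load can be used; this is only a sketch with a hypothetical path, not part of the original run:
load data local inpath '/home/hadoop/readme.txt' overwrite into table flwordcount;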
Check the loaded data with a query:
0: jdbc:hive2://master:10000> select * from flwordcount;
INFO : Compiling command(queryId=hadoop_20170329124545_6ebd5606-18d3-4982-8706-4a8aaad5d582): select * from flwordcount
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:flwordcount.line, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hadoop_20170329124545_6ebd5606-18d3-4982-8706-4a8aaad5d582); Time taken: 0.86 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=hadoop_20170329124545_6ebd5606-18d3-4982-8706-4a8aaad5d582): select * from flwordcount
INFO : Completed executing command(queryId=hadoop_20170329124545_6ebd5606-18d3-4982-8706-4a8aaad5d582); Time taken: 0.006 seconds
INFO : OK
+------------------------------------------------------------------------+--+
| flwordcount.line |
+------------------------------------------------------------------------+--+
| For the latest information about Hadoop, please visit our website at: |
| |
| http://hadoop.apache.org/core/ |
| |
| and our wiki, at: |
| |
| http://wiki.apache.org/hadoop/ |
| |
| This distribution includes cryptographic software. The country in |
| which you currently reside may have restrictions on the import, |
| possession, use, and/or re-export to another country, of |
| encryption software. BEFORE using any encryption software, please |
| check your country's laws, regulations and policies concerning the |
| import, possession, or use, and re-export of encryption software, to |
| see if this is permitted. See <http://www.wassenaar.org/> for more |
| information. |
| |
| The U.S. Government Department of Commerce, Bureau of Industry and |
| Security (BIS), has classified this software as Export Commodity |
| Control Number (ECCN) 5D002.C.1, which includes information security |
| software using or performing cryptographic functions with asymmetric |
| algorithms. The form and manner of this Apache Software Foundation |
| distribution makes it eligible for export under the License Exception |
| ENC Technology Software Unrestricted (TSU) exception (see the BIS |
| Export Administration Regulations, Section 740.13) for both object |
| code and source code. |
| |
| The following provides more details on the included cryptographic |
| software: |
| Hadoop Core uses the SSL libraries from the Jetty project written |
| by mortbay.org. |
+------------------------------------------------------------------------+--+
31 rows selected (1.251 seconds)
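Note that this SELECT * came back in about a second and the log shows no MapReduce job being launched: a plain full-table scan is usually served by a simple fetch task (controlled by hive.fetch.task.conversion). An aggregate query, by contrast, would normally be compiled into a MapReduce job; for example (not part of the original run):
select count(*) from flwordcount;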
2. Run an HQL query that counts the occurrences of each word and saves the result into a newly created table, flwordcount1.
0: jdbc:hive2://master:10000> CREATE TABLE flwordcount1 as select w.word ,count(1) as count from (select EXPLODE(SPLIT(line,'\s')) as word from flwordcount) w group by word order by word;
INFO : Compiling command(queryId=hadoop_20170329113737_2ad7ab0a-bb65-4f3f-b038-f60233318add): CREATE TABLE flwordcount1 as select w.word ,count(1) as count from (select EXPLODE(SPLIT(line,'\s')) as word from flwordcount) w group by word order by word
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:w.word, type:string, comment:null), FieldSchema(name:count, type:bigint, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hadoop_20170329113737_2ad7ab0a-bb65-4f3f-b038-f60233318add); Time taken: 0.136 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=hadoop_20170329113737_2ad7ab0a-bb65-4f3f-b038-f60233318add): CREATE TABLE flwordcount1 as select w.word ,count(1) as count from (select EXPLODE(SPLIT(line,'\s')) as word from flwordcount) w group by word order by word
INFO : Query ID = hadoop_20170329113737_2ad7ab0a-bb65-4f3f-b038-f60233318add
INFO : Total jobs = 2
INFO : Launching Job 1 out of 2
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Number of reduce tasks not specified. Estimated from input data size: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO : set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO : set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO : set mapreduce.job.reduces=<number>
DEBUG : Configuring job job_1490754097195_0009 with /tmp/hadoop-yarn/staging/hadoop/.staging/job_1490754097195_0009 as the submit dir
DEBUG : adding the following namenodes' delegation tokens:[hdfs://master:9000]
DEBUG : Creating splits at hdfs://master:9000/tmp/hadoop-yarn/staging/hadoop/.staging/job_1490754097195_0009
INFO : number of splits:1
INFO : Submitting tokens for job: job_1490754097195_0009
INFO : The url to track the job: http://master:8088/proxy/application_1490754097195_0009/
INFO : Starting Job = job_1490754097195_0009, Tracking URL = http://master:8088/proxy/application_1490754097195_0009/
INFO : Kill Command = /opt/hadoop-2.6.0-cdh5.8.0/bin/hadoop job -kill job_1490754097195_0009
INFO : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
INFO : 2017-03-29 11:38:20,957 Stage-1 map = 0%, reduce = 0%
INFO : 2017-03-29 11:38:29,425 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.56 sec
INFO : 2017-03-29 11:38:38,847 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.95 sec
INFO : MapReduce Total cumulative CPU time: 2 seconds 950 msec
INFO : Ended Job = job_1490754097195_0009
INFO : Launching Job 2 out of 2
INFO : Starting task [Stage-2:MAPRED] in serial mode
INFO : Number of reduce tasks determined at compile time: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO : set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO : set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO : set mapreduce.job.reduces=<number>
DEBUG : Configuring job job_1490754097195_0010 with /tmp/hadoop-yarn/staging/hadoop/.staging/job_1490754097195_0010 as the submit dir
DEBUG : adding the following namenodes' delegation tokens:[hdfs://master:9000]
DEBUG : Creating splits at hdfs://master:9000/tmp/hadoop-yarn/staging/hadoop/.staging/job_1490754097195_0010
INFO : number of splits:1
INFO : Submitting tokens for job: job_1490754097195_0010
INFO : The url to track the job: http://master:8088/proxy/application_1490754097195_0010/
INFO : Starting Job = job_1490754097195_0010, Tracking URL = http://master:8088/proxy/application_1490754097195_0010/
INFO : Kill Command = /opt/hadoop-2.6.0-cdh5.8.0/bin/hadoop job -kill job_1490754097195_0010
INFO : Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
INFO : 2017-03-29 11:38:52,123 Stage-2 map = 0%, reduce = 0%
INFO : 2017-03-29 11:39:02,271 Stage-2 map = 100%, reduce = 0%, Cumulative CPU 1.15 sec
INFO : 2017-03-29 11:39:12,733 Stage-2 map = 100%, reduce = 100%, Cumulative CPU 2.93 sec
INFO : MapReduce Total cumulative CPU time: 2 seconds 930 msec
INFO : Ended Job = job_1490754097195_0010
INFO : Starting task [Stage-0:MOVE] in serial mode
INFO : Moving data to: hdfs://master:9000/user/hive/warehouse/flwordcount1 from hdfs://master:9000/user/hive/warehouse/.hive-staging_hive_2017-03-29_11-37-58_284_2062860527712359321-1/-ext-10001
INFO : Starting task [Stage-4:DDL] in serial mode
INFO : Starting task [Stage-3:STATS] in serial mode
INFO : Table default.flwordcount1 stats: [numFiles=1, numRows=79, totalSize=1507, rawDataSize=1428]
INFO : MapReduce Jobs Launched:
INFO : Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.95 sec HDFS Read: 8114 HDFS Write: 2886 SUCCESS
INFO : Stage-Stage-2: Map: 1 Reduce: 1 Cumulative CPU: 2.93 sec HDFS Read: 7150 HDFS Write: 1585 SUCCESS
INFO : Total MapReduce CPU Time Spent: 5 seconds 880 msec
INFO : Completed executing command(queryId=hadoop_20170329113737_2ad7ab0a-bb65-4f3f-b038-f60233318add); Time taken: 75.85 seconds
INFO : OK
No rows affected (75.999 seconds)
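The log above shows Total jobs = 2: Stage-1 performs the GROUP BY aggregation and Stage-2 the global ORDER BY sort. To inspect the plan Hive generates without actually running the jobs, the same query can be prefixed with EXPLAIN (shown here for reference, not part of the original run):
explain select w.word, count(1) as count from (select explode(split(line,'\s')) as word from flwordcount) w group by word order by word;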
0: jdbc:hive2://master:10000>
0: jdbc:hive2://master:10000> select * from flwordcount1;
INFO : Compiling command(queryId=hadoop_20170329124848_8a2e0256-63f1-4a73-b91d-815fe4dabb00): select * from flwordcount1
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:flwordcount1.word, type:string, comment:null), FieldSchema(name:flwordcount1.count, type:bigint, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hadoop_20170329124848_8a2e0256-63f1-4a73-b91d-815fe4dabb00); Time taken: 0.344 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Executing command(queryId=hadoop_20170329124848_8a2e0256-63f1-4a73-b91d-815fe4dabb00): select * from flwordcount1
INFO : Completed executing command(queryId=hadoop_20170329124848_8a2e0256-63f1-4a73-b91d-815fe4dabb00); Time taken: 0.001 seconds
INFO : OK
+-------------------------------------------------------------+---------------------+--+
| flwordcount1.word | flwordcount1.count |
+-------------------------------------------------------------+---------------------+--+
| | 15 |
| | 1 |
| http://hadoop.apache.org/core/ | 1 |
| http://wiki.apache.org/hadoop/ | 1 |
| Hadoop Core u | 1 |
| Apache Software Foundation | 1 |
| Export Commodity | 1 |
| and policie | 1 |
| cla | 1 |
| concerning the | 1 |
| cryptographic | 1 |
| di | 1 |
| from the Jetty project written | 1 |
| i | 1 |
| information | 1 |
| it eligible for export under the Licen | 1 |
| law | 1 |
| more detail | 1 |
| on the import, | 1 |
| on the included cryptographic | 1 |
| permitted. See <http://www.wa | 1 |
| the SSL librarie | 1 |
| with a | 1 |
| , Section 740.13) for both object | 1 |
| , regulation | 1 |
| . The form and manner of thi | 1 |
| Control Number (ECCN) 5D002.C.1, which include | 1 |
| ENC Technology Software Unre | 1 |
| Export Admini | 1 |
| For the late | 1 |
| Security (BIS), ha | 1 |
| The U.S. Government Department of Commerce, Bureau of Indu | 1 |
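Note: many of the "words" above are cut off at the letter s ("law", "cla", "the SSL librarie", "Hadoop Core u"). In a Hive string literal the backslash in '\s' is swallowed, so the pattern that reaches split() is just the letter s and each line is split on every 's' instead of on whitespace; the blank word rows are empty tokens produced by the same split. Escaping the backslash fixes this. A corrected sketch (flwordcount2 is an arbitrary new table name, and the empty-string filter is optional, since blank lines still yield empty tokens):
create table flwordcount2 as select w.word, count(1) as count from (select explode(split(line,'\\s+')) as word from flwordcount) w where w.word != '' group by word order by word;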