使用sqoop将MySQL数据库中的数据导入Hbase

大伟爱自由 2012-03-26

前提:安装好 sqoop、hbase。

下载jbdc驱动:mysql-connector-java-5.1.10.jar

将 mysql-connector-java-5.1.10.jar 复制到 /usr/lib/sqoop/lib/ 下

MySQL导入HBase命令:

sqoopimport--connectjdbc:mysql://10.10.97.116:3306/rsearch--tableresearchers--hbase-tableA--column-familyperson--hbase-row-keyid--hbase-create-table--username'root'-P

说明:

--connectjdbc:mysql://10.10.97.116:3306/rsearch表示远程或者本地Mysql服务的URI,3306是Mysql默认监听端口,rsearch是数据库,若是其他数据库,如Oracle,只需修改URI即可。

--tableresearchers表示导出rsearch数据库的researchers表。

--hbase-tableA表示在HBase中建立表A。

--column-familyperson表示在表A中建立列族person。

--hbase-row-keyid表示表A的row-key是researchers表的id字段。

--hbase-create-table表示在HBase中建立表。

--username'root'表示使用用户root连接Mysql。

注意:

HBase的所有节点必须能够访问MySQL数据库,不然会出现如下错误:

java.sql.SQLException: null,  message from server: "Host '10.10.104.3' is not allowed to connect to this MySQL server"
  1. 在MySQL数据库服务器节点上执行以下命令允许远程机器使用相应用户访问本地数据库服务器:  
  2. [root@gc01vm6 htdocs] # /opt/lampp/bin/mysql  
  3.   
  4. mysql> use mysql;  
  5. Database changed  
  6. mysql> GRANT ALL PRIVILEGES ON rsearch.* TO 'root'@'10.10.104.3' IDENTIFIED BY '' WITH GRANT OPTION;   
  7. mysql> GRANT ALL PRIVILEGES ON rsearch.* TO 'root'@'10.10.104.5' IDENTIFIED BY '' WITH GRANT OPTION;   
  8. mysql> GRANT ALL PRIVILEGES ON rsearch.* TO 'root'@'10.10.104.2' IDENTIFIED BY '' WITH GRANT OPTION;   

在MySQL数据库服务器节点上执行以下命令允许远程机器使用相应用户访问本地数据库服务器:[root@gc01vm6 htdocs] # /opt/lampp/bin/mysqlmysql> use mysql;Database changedmysql> GRANT ALL PRIVILEGES ON rsearch.* TO 'root'@'10.10.104.3' IDENTIFIED BY '' WITH GRANT OPTION; mysql> GRANT ALL PRIVILEGES ON rsearch.* TO 'root'@'10.10.104.5' IDENTIFIED BY '' WITH GRANT OPTION; mysql> GRANT ALL PRIVILEGES ON rsearch.* TO 'root'@'10.10.104.2' IDENTIFIED BY '' WITH GRANT OPTION;

这里10.10.104.2,10.10.104.3,10.10.104.5是HBase节点。

-------------------------------------------------------------------------------------------------

MySQL导入HBase的日志:

[root@gd02hadoop]#sqoopimport--connectjdbc:mysql://10.10.97.116:3306/rsearch--tableresearchers--hbase-tableA--column-familyperson--hbase-row-keyid--hbase-create-table--username'root'-P

Enterpassword:

11/06/2919:08:00INFOtool.CodeGenTool:Beginningcodegeneration

11/06/2919:08:00INFOmanager.MySQLManager:ExecutingSQLstatement:SELECTt.*FROM`researchers`AStLIMIT1

11/06/2919:08:00INFOmanager.MySQLManager:ExecutingSQLstatement:SELECTt.*FROM`researchers`AStLIMIT1

11/06/2919:08:00INFOorm.CompilationManager:HADOOP_HOMEis/usr/lib/hadoop

11/06/2919:08:00INFOorm.CompilationManager:Foundhadoopcorejarat:/usr/lib/hadoop/hadoop-core.jar

Note:/tmp/sqoop-root/compile/d4dd4cb4e1e325fce31ca72c00a5589c/researchers.javausesoroverridesadeprecatedAPI.

Note:Recompilewith-Xlint:deprecationfordetails.

11/06/2919:08:02INFOorm.CompilationManager:Writingjarfile:/tmp/sqoop-root/compile/d4dd4cb4e1e325fce31ca72c00a5589c/researchers.jar

11/06/2919:08:02WARNmanager.MySQLManager:Itlookslikeyouareimportingfrommysql.

11/06/2919:08:02WARNmanager.MySQLManager:Thistransfercanbefaster!Usethe--direct

11/06/2919:08:02WARNmanager.MySQLManager:optiontoexerciseaMySQL-specificfastpath.

11/06/2919:08:02INFOmanager.MySQLManager:SettingzeroDATETIMEbehaviortoconvertToNull(mysql)

11/06/2919:08:02INFOmapreduce.ImportJobBase:Beginningimportofresearchers

11/06/2919:08:02INFOmanager.MySQLManager:ExecutingSQLstatement:SELECTt.*FROM`researchers`AStLIMIT1

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:zookeeper.version=3.3.3-cdh3u0--1,builton03/26/201100:21GMT

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:host.name=gd02

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:java.version=1.6.0_13

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:java.vendor=SunMicrosystemsInc.

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:java.home=/usr/java/jdk1.6.0_13/jre

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:java.class.path=/usr/lib/hadoop/conf:/usr/java/jdk1.6.0_13/lib/tools.jar:/usr/lib/hadoop:/usr/lib/hadoop/hadoop-core-0.20.2-cdh3u0.jar:/usr/lib/hadoop/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-cdh3u0.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-api-2.1.jar:/usr/lib/sqoop/conf:/usr/lib/hbase/conf::/usr/lib/sqoop/lib/ant-contrib-1.0b3.jar:/usr/lib/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar:/usr/lib/sqoop/lib/commons-io-1.4.jar:/usr/lib/sqoop/lib/hadoop-mrunit-0.20.2-CDH3b2-SNAPSHOT.jar:/usr/lib/sqoop/lib/ivy-2.0.0-rc2.jar:/usr/lib/sqoop/lib/mysql-connector-java-5.1.10.jar:/usr/lib/hbase/hbase-0.90.1-cdh3u0.jar:/usr/lib/hbase/hbase-0.90.1-cdh3u0-tests.jar:/usr/lib/hbase/lib/activation-1.1.jar:/usr/lib/hbase/lib/asm-3.1.jar:/usr/lib/hbase/lib/avro-1.3.3.jar:/usr/lib/hbase/lib/commons-cli-1.2.jar:/usr/lib/hbase/lib/commons-codec-1.4.jar:/usr/lib/hbase/lib/commons-el-1.0.jar:/usr/lib/hbase/lib/commons-httpclient-3.1.jar:/usr/lib/hbase/lib/commons-lang-2.5.jar:/usr/lib/hbase/lib/commons-logging-1.1.1.jar:/usr/lib/hbase/lib/commons-net-1.4.1.jar:/usr/lib/hbase/lib/core-3.1.1.jar:/usr/lib/hbase/lib/guava-r06.jar:/usr/lib/hbase/lib/hadoop-core.jar:/usr/lib/hbase/lib/hbase-0.90.1-cdh3u0.jar:/usr/lib/hbase/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hbase/lib/jackson-jaxrs-1.5.5.jar:/usr/lib/hbase/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hbase/lib/jackson-xc-1.5.5.jar:/usr/lib/hbase/lib/jasper-compiler-5.5.23.jar:/usr/lib/hbase/lib/jasper-runtime-5.5.23.jar:/usr/lib/hbase/lib/jaxb-api-2.1.jar:/usr/lib/hbase/lib/jaxb-impl-2.1.12.jar:/usr/lib/hbase/lib/jersey-core-1.4.jar:/usr/lib/hbase/lib/jersey-json-1.4.jar:/usr/lib/hbase/lib/jersey-server-1.4.jar:/usr/lib/hbase/lib/jettison-1.1.jar:/usr/lib/hbase/lib/jetty-6.1.26.jar:/usr/lib/hbase/lib/jetty-util-6.1.26.jar:/usr/lib/hbase/lib/jruby-complete-1.0.3.jar:/usr/lib/hbase/lib/jsp-2.1-6.1.14.jar:/usr/lib/hbase/lib/jsp-api-2.1-6.1.14.jar:/usr/lib/hbase/lib/jsp-api-2.1.jar:/usr/lib/hbase/lib/jsr311-api-1.1.1.jar:/usr/lib/hbase/lib/log4j-1.2.16.jar:/usr/lib/hbase/lib/protobuf-java-2.3.0.jar:/usr/lib/hbase/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hbase/lib/servlet-api-2.5.jar:/usr/lib/hbase/lib/slf4j-api-1.5.8.jar:/usr/lib/hbase/lib/slf4j-log4j12-1.5.8.jar:/usr/lib/hbase/lib/stax-api-1.0.1.jar:/usr/lib/hbase/lib/thrift-0.2.0.jar:/usr/lib/hbase/lib/xmlenc-0.52.jar:/usr/lib/hbase/lib/zookeeper.jar:/usr/lib/zookeeper/zookeeper-3.3.3-cdh3u0.jar:/usr/lib/zookeeper/zookeeper.jar:/usr/lib/zookeeper/lib/jline-0.9.94.jar:/usr/lib/zookeeper/lib/log4j-1.2.15.jar:/usr/lib/sqoop/sqoop-1.2.0-cdh3u0.jar:/usr/lib/sqoop/sqoop-test-1.2.0-cdh3u0.jar:

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:java.library.path=/usr/java/jdk1.6.0_13/jre/lib/amd64/server:/usr/java/jdk1.6.0_13/jre/lib/amd64:/usr/java/jdk1.6.0_13/jre/../lib/amd64:/usr/java/packages/lib/amd64:/lib:/usr/lib

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:java.io.tmpdir=/tmp

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:java.compiler=<NA>

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:os.name=Linux

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:os.arch=amd64

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:os.version=2.6.18-164.el5

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:user.name=root

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:user.home=/root

11/06/2919:08:02INFOzookeeper.ZooKeeper:Clientenvironment:user.dir=/home/hadoop

11/06/2919:08:02INFOzookeeper.ZooKeeper:Initiatingclientconnection,connectString=gd05:2181,gd03:2181,gd02:2181sessionTimeout=180000watcher=hconnection

11/06/2919:08:02INFOzookeeper.ClientCnxn:Openingsocketconnectiontoservergd03/10.10.104.3:2181

11/06/2919:08:02INFOzookeeper.ClientCnxn:Socketconnectionestablishedtogd03/10.10.104.3:2181,initiatingsession

11/06/2919:08:02INFOzookeeper.ClientCnxn:Sessionestablishmentcompleteonservergd03/10.10.104.3:2181,sessionid=0x130b2e901cd0012,negotiatedtimeout=180000

11/06/2919:08:02INFOzookeeper.ZooKeeper:Initiatingclientconnection,connectString=gd05:2181,gd03:2181,gd02:2181sessionTimeout=180000watcher=hconnection

11/06/2919:08:02INFOzookeeper.ClientCnxn:Openingsocketconnectiontoservergd03/10.10.104.3:2181

11/06/2919:08:02INFOzookeeper.ClientCnxn:Socketconnectionestablishedtogd03/10.10.104.3:2181,initiatingsession

11/06/2919:08:02INFOzookeeper.ClientCnxn:Sessionestablishmentcompleteonservergd03/10.10.104.3:2181,sessionid=0x130b2e901cd0013,negotiatedtimeout=180000

11/06/2919:08:02INFOclient.HConnectionManager$HConnectionImplementation:Closedzookeepersessionid=0x130b2e901cd0013

11/06/2919:08:02INFOzookeeper.ZooKeeper:Session:0x130b2e901cd0013closed

11/06/2919:08:02INFOzookeeper.ClientCnxn:EventThreadshutdown

11/06/2919:08:02INFOmapreduce.HBaseImportJob:Creatingmissingcolumnfamilyperson

11/06/2919:08:02INFOclient.HBaseAdmin:StarteddisableofA

11/06/2919:08:03INFOclient.HBaseAdmin:DisabledA

11/06/2919:08:03INFOclient.HBaseAdmin:StartedenableofA

11/06/2919:08:06INFOclient.HBaseAdmin:EnabledtableA

11/06/2919:08:07INFOmapred.JobClient:Runningjob:job_201106212352_0010

11/06/2919:08:08INFOmapred.JobClient:map0%reduce0%

11/06/2919:08:19INFOmapred.JobClient:map40%reduce0%

11/06/2919:08:20INFOmapred.JobClient:map80%reduce0%

11/06/2919:08:34INFOmapred.JobClient:map100%reduce0%

11/06/2919:08:34INFOmapred.JobClient:Jobcomplete:job_201106212352_0010

11/06/2919:08:34INFOmapred.JobClient:Counters:11

11/06/2919:08:34INFOmapred.JobClient:JobCounters

11/06/2919:08:34INFOmapred.JobClient:SLOTS_MILLIS_MAPS=82848

11/06/2919:08:34INFOmapred.JobClient:Totaltimespentbyallreduceswaitingafterreservingslots(ms)=0

11/06/2919:08:34INFOmapred.JobClient:Totaltimespentbyallmapswaitingafterreservingslots(ms)=0

11/06/2919:08:34INFOmapred.JobClient:Launchedmaptasks=5

11/06/2919:08:34INFOmapred.JobClient:SLOTS_MILLIS_REDUCES=0

11/06/2919:08:34INFOmapred.JobClient:FileSystemCounters

11/06/2919:08:34INFOmapred.JobClient:HDFS_BYTES_READ=527

11/06/2919:08:34INFOmapred.JobClient:FILE_BYTES_WRITTEN=310685

11/06/2919:08:34INFOmapred.JobClient:Map-ReduceFramework

11/06/2919:08:34INFOmapred.JobClient:Mapinputrecords=81868

11/06/2919:08:34INFOmapred.JobClient:SpilledRecords=0

11/06/2919:08:34INFOmapred.JobClient:Mapoutputrecords=81868

11/06/2919:08:34INFOmapred.JobClient:SPLIT_RAW_BYTES=527

11/06/2919:08:34INFOmapreduce.ImportJobBase:Transferred0bytesin28.108seconds(0bytes/sec)

11/06/2919:08:34INFOmapreduce.ImportJobBase:Retrieved81868records.

参考资料:

利用sqoop将mysql数据同步到hive手记

http://www.54chen.com/java-ee/sqoop-mysql-to-hive.html

利用Sqoop将数据从数据库导入到HDFS

http://www.cnblogs.com/gpcuster/archive/2011/03/01/1968027.html

Sqoop

http://www.duyifan.com/

MySQL向Hive/HBase的迁移工具

http://www.javabloger.com/article/hadoop-hive-mysql-sqoop.html

官方手册

http://archive.cloudera.com/cdh/3/sqoop/SqoopUserGuide.html

相关推荐

strongyoung / 0评论 2020-01-25