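I had set up Hadoop + Oozie + Sqoop and was ready to write an Oozie workflow that uses Sqoop to pull data from a database and write it to HDFS. Everything seemed to be in place, but submitting the workflow failed with this error: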
Error: E0701 : E0701: XML schema error, cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'sqoop'.
The log showed a long stack trace, but it was enough to start narrowing things down. The exception:
- 2011-11-11 11:39:44,658 WARN V1JobsServlet:539 - USER[?] GROUP[users] TOKEN[-] APP[-] JOB[-] ACTION[-] URL[POST http:
- at org.apache.oozie.servlet.V1JobsServlet.submitWorkflowJob(V1JobsServlet.java:163)
- at org.apache.oozie.servlet.V1JobsServlet.submitJob(V1JobsServlet.java:74)
- at org.apache.oozie.servlet.BaseJobsServlet.doPost(BaseJobsServlet.java:92)
- at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
- at org.apache.oozie.servlet.JsonRestServlet.service(JsonRestServlet.java:281)
- at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
- at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
- at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
- at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
- at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
- at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
- at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
- at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
- at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
- at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
- at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
- at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
- at java.lang.Thread.run(Thread.java:662)
- Caused by: org.apache.oozie.DagEngineException: E0701: XML schema error, cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'sqoop'.
- at org.apache.oozie.DagEngine.submitJob(DagEngine.java:137)
- at org.apache.oozie.servlet.V1JobsServlet.submitWorkflowJob(V1JobsServlet.java:159)
- ... 17 more
- Caused by: org.apache.oozie.command.CommandException: E0701: XML schema error, cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'sqoop'.
- at org.apache.oozie.command.wf.SubmitXCommand.execute(SubmitXCommand.java:185)
- at org.apache.oozie.command.wf.SubmitXCommand.execute(SubmitXCommand.java:61)
- at org.apache.oozie.command.XCommand.call(XCommand.java:257)
- at org.apache.oozie.DagEngine.submitJob(DagEngine.java:125)
- ... 18 more
- Caused by: org.apache.oozie.workflow.WorkflowException: E0701: XML schema error, cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'sqoop'.
- at org.apache.oozie.workflow.lite.LiteWorkflowAppParser.validateAndParse(LiteWorkflowAppParser.java:120)
- at org.apache.oozie.workflow.lite.LiteWorkflowLib.parseDef(LiteWorkflowLib.java:47)
- at org.apache.oozie.service.LiteWorkflowAppService.parseDef(LiteWorkflowAppService.java:46)
- at org.apache.oozie.service.LiteWorkflowAppService.parseDef(LiteWorkflowAppService.java:41)
- at org.apache.oozie.command.wf.SubmitXCommand.execute(SubmitXCommand.java:95)
- ... 21 more
- Caused by: org.xml.sax.SAXParseException: cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'sqoop'.
- at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:195)
- at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:131)
- at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:384)
- at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:318)
- at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator$XSIErrorReporter.reportError(XMLSchemaValidator.java:417)
- at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.reportSchemaError(XMLSchemaValidator.java:3182)
- at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.handleStartElement(XMLSchemaValidator.java:1927)
- at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.startElement(XMLSchemaValidator.java:705)
- at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:400)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2755)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
- at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
- at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
- at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
- at com.sun.org.apache.xerces.internal.jaxp.validation.StreamValidatorHelper.validate(StreamValidatorHelper.java:144)
- at com.sun.org.apache.xerces.internal.jaxp.validation.ValidatorImpl.validate(ValidatorImpl.java:111)
- at javax.xml.validation.Validator.validate(Validator.java:127)
- at org.apache.oozie.workflow.lite.LiteWorkflowAppParser.validateAndParse(LiteWorkflowAppParser.java:106)
- ... 25 more
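In short, XML schema validation does not recognize the custom element 'sqoop': the validator accepts an element from another namespace only if it can find a declaration (a schema) for it, and here it found none. I need Sqoop inside Oozie to import data from the database, so workflow.xml contains the following action: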
- <action name="db_export">
- <sqoop xmlns="uri:oozie:sqoop-action:0.2">
- <job-tracker>${job_tracker}</job-tracker>
- <name-node>${name_node}</name-node>
- <prepare>
- <delete path="${wf_job_base_path}/${wf:id()}/db_export"/>
- </prepare>
- <configuration>
- <property>
- <name>mapred.job.queue.name</name>
- <value>${queue_name}</value>
- </property>
- </configuration>
- <arg>import</arg>
- <arg>-D</arg>
- <arg>mapred.output.compress=false</arg>
- <arg>--connect</arg>
- <arg>jdbc:mysql://${db_hostname}/${db_name}</arg>
- <arg>--query</arg>
- <arg>${db_banner_query}</arg>
- <arg>--target-dir</arg>
- <arg>${wf_job_base_path}/${wf:id()}/db_export</arg>
- <arg>--num-mappers</arg>
- <arg>${sqoop_export_mappers_num}</arg>
- <arg>--username</arg>
- <arg>${db_user}</arg>
- <arg>--password</arg>
- <arg>${db_password}</arg>
- <arg>--as-sequencefile</arg>
- <arg>--class-name</arg>
- <arg>${db_class_name}</arg>
- </sqoop>
- <ok to="sqoop_export_done"/>
- <error to="fail"/>
- </action>
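The validator's complaint is about the namespace declared on the sqoop element, so before submitting it can help to list which actions in a workflow.xml rely on a namespace other than the workflow's own. Below is a minimal Python sketch of such a check, not part of Oozie: the function names `split_tag` and `extension_actions` and the trimmed-down sample workflow are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down workflow used only to demonstrate the check.
WORKFLOW_XML = """\
<workflow-app xmlns="uri:oozie:workflow:0.2" name="demo">
  <start to="db_export"/>
  <action name="db_export">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
      <arg>import</arg>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>failed</message></kill>
  <end name="end"/>
</workflow-app>
"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def extension_actions(workflow_xml):
    """Return (action name, element name, namespace) for every direct child of
    an <action> whose namespace differs from the workflow's own namespace."""
    root = ET.fromstring(workflow_xml)
    workflow_ns, _ = split_tag(root.tag)
    found = []
    for elem in root.iter():
        ns, local = split_tag(elem.tag)
        if local != "action" or ns != workflow_ns:
            continue
        for child in elem:
            child_ns, child_local = split_tag(child.tag)
            if child_ns != workflow_ns:
                found.append((elem.get("name"), child_local, child_ns))
    return found


if __name__ == "__main__":
    for action_name, element, namespace in extension_actions(WORKFLOW_XML):
        print(f"action '{action_name}' uses <{element}> from {namespace}")
```

Every namespace reported this way has to be known to the Oozie server that receives the workflow; otherwise submission fails with an E0701 error like the one above.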
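Common sense says a custom element needs a matching namespace and schema. The sqoop element here references uri:oozie:sqoop-action:0.2, so there should be an oozie-sqoop-action-0.2.xsd file somewhere. I looked inside $OOZIE_HOME/lib/oozie-client-x.x.jar and did not find it, so I created the file following http://archive.cloudera.com/cdh/3/oozie/DG_SqoopActionExtension.html and added it to that jar. I ran the workflow again and got the same error.
After learning more about custom actions and reading http://www.infoq.com/cn/articles/ExtendingOozie, I realized that the schema alone is not enough: the action also needs a matching ActionExecutor, and oozie-core did not contain the class org.apache.oozie.action.hadoop.SqoopActionExecutor. Some searching finally revealed the cause: I had downloaded Yahoo's Oozie, and only Cloudera's Oozie supported Sqoop.
So I downloaded oozie-2.3-cdhu30.tar.gz from Cloudera and found both the XSD and the ActionExecutor in it. I ran the workflow again and the problem was solved. It looks like once you are on Cloudera, everything else has to revolve around it. Download: http://archive.cloudera.com/cdh/3/oozie-2.3.2-cdh3u2.tar.gz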
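In hindsight, inspecting the jars up front would have shown that the distribution lacked Sqoop support. The following is a minimal Python sketch of that inspection, under stated assumptions: the class entry path is derived from the class name mentioned above, the XSD is matched by the assumed file-name suffix sqoop-action-0.2.xsd, and `check_jar` and `make_demo_jar` are hypothetical helper names, not Oozie tooling.

```python
import io
import zipfile

# Jar entry for org.apache.oozie.action.hadoop.SqoopActionExecutor.
EXECUTOR_CLASS = "org/apache/oozie/action/hadoop/SqoopActionExecutor.class"
# Assumption: the schema file name ends with this suffix.
XSD_SUFFIX = "sqoop-action-0.2.xsd"


def check_jar(jar):
    """Report whether a jar (path or file-like object) contains the Sqoop
    action executor class and a Sqoop action schema."""
    with zipfile.ZipFile(jar) as archive:
        names = archive.namelist()
    return {
        "executor": EXECUTOR_CLASS in names,
        "xsd": any(name.endswith(XSD_SUFFIX) for name in names),
    }


def make_demo_jar(entry_names):
    """Build an in-memory jar with empty entries, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in entry_names:
            archive.writestr(name, b"")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    with_sqoop = make_demo_jar([EXECUTOR_CLASS, "sqoop-action-0.2.xsd"])
    without_sqoop = make_demo_jar(["org/apache/oozie/DagEngine.class"])
    print("with Sqoop support:   ", check_jar(with_sqoop))
    print("without Sqoop support:", check_jar(without_sqoop))
```

Against a real installation, `check_jar` would be called with the path of each jar under $OOZIE_HOME/lib; a jar set in which neither entry appears suggests that the build has no Sqoop action at all, which copying in an XSD by hand cannot fix.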