While importing data with the command below, I ran into the following errors.
sqoop import --hive-import --connect jdbc:oracle:thin:@192.168.29.16:1521/testdb --username NAME --password PASS --verbose -m 1 --table T_USERINFO
Error 1: File does not exist: hdfs://opt/sqoop-1.4.4/lib/commons-io-1.4.jar
FileNotFoundException: File does not exist: hdfs://opt/sqoop-1.4.4/lib/commons-io-1.4.jar
at org.apache ... ...
at org.apache ... ...
Cause:
Thanks to Daniel Koverman's answer: http://stackoverflow.com/questions/19375784/sqoop-jar-files-not-found
It is common for Hadoop services to look for jars in HDFS because all nodes in the cluster can access files in HDFS. This is important if the MapReduce job being kicked off by the Hadoop service, in this case Sqoop, has a dependence on those jars. Remember, the Mappers are running on a DataNode, not the NameNode even though you are (probably) running the Sqoop command from the NameNode. Putting the jars on HDFS is not the only possible solution to this problem, but it is a sensible one.
Now we can deal with the actual error. At least one, but probably all, of your Mappers are unable to find a jar they need. That means that either the jar does not exist or the user trying to access them does not have the required permissions.
First check if the file exists by running hadoop fs -ls home/SqoopUser/sqoop-1.4.3-cdh4.4.0/sqoop-1.4.3-cdh4.4.0.jar by a user with superuser privileges on the cluster. If it does not exist, put it there with hadoop fs -put {jarLocationOn/NameNode/fileSystem/sqoop-1.4.3-cdh4.4.0.jar} /home/SqoopUser/sqoop-1.4.3-cdh4.4.0/sqoop-1.4.3-cdh4.4.0.jar.
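Solution:
Put the jar files named in the error message at the same location in HDFS. If the corresponding directory does not exist in HDFS, create it first. In my case, hdfs://master:8020/ was missing the jars that live under /opt/sqoop-1.4.4/lib/, so I put the entire local /opt/sqoop-1.4.4/lib folder into hdfs://master:8020/.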
# List the directories in HDFS. This listing is recursive; if there are many files, drop -R and browse level by level instead.
hadoop fs -ls -R /
# Create the same directory structure
hadoop fs -mkdir /opt
hadoop fs -mkdir /opt/sqoop-1.4.4
# Copy the local /opt/sqoop-1.4.4/lib into /opt/sqoop-1.4.4 in HDFS
hadoop fs -put /opt/sqoop-1.4.4/lib /opt/sqoop-1.4.4/
# Check the result to confirm the copy succeeded
hadoop fs -ls -R /opt/sqoop-1.4.4
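The manual steps above could also be wrapped in a small helper so they are safe to repeat. This is only a sketch, assuming a Hadoop 2.x `hadoop fs` client (for `-test -d` and `-mkdir -p`); the function name `mirror_to_hdfs` is hypothetical and not part of Sqoop or Hadoop.

```shell
# Hypothetical helper: copy a local directory to the same absolute path in HDFS,
# unless a directory with that path is already present there.
mirror_to_hdfs() {
  src_dir=$1                        # e.g. /opt/sqoop-1.4.4/lib
  parent_dir=$(dirname "$src_dir")  # e.g. /opt/sqoop-1.4.4

  if hadoop fs -test -d "$src_dir"; then
    echo "already in HDFS: $src_dir"
  else
    # -p creates missing parent directories and does not fail if they exist
    hadoop fs -mkdir -p "$parent_dir"
    hadoop fs -put "$src_dir" "$parent_dir/"
    echo "uploaded: $src_dir"
  fi
}

# Usage (requires a configured hadoop client):
#   mirror_to_hdfs /opt/sqoop-1.4.4/lib
```

Checking first with `-test -d` avoids the "File exists" failure that `-put` reports when the directory was already uploaded on an earlier attempt.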
Error 2: java.lang.ClassNotFoundException: Class U_BASICINFO not found
For the table being imported into Hive, the error says that the corresponding .class and .jar files cannot be found.
java.lang.Exception: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class U_BASICINFO not found
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class U_BASICINFO not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1895)
at org.apache.sqoop.mapreduce.db.DBConfiguration.getInputClass(DBConfiguration.java:394)
at .....
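Cause: unknown for now.
Solution:
Thanks to user236575's answer: http://stackoverflow.com/questions/21599785/sqoop-not-able-to-import-table/21626010#21626010
By default, when Sqoop imports a table it generates a .java file for that table and compiles it into .class and .jar files. The .java file is saved in the sqoop/bin directory, while the .class and .jar files are saved in a subfolder of /tmp/sqoop-hduser/compile/.
My fix was to find the .class and .jar files for the table being imported and copy them into the sqoop/bin directory and into /user/USERNAME/ in HDFS. (Later testing showed that copying the .class and .jar files into sqoop/bin alone is enough for the import to succeed.)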
# Copy into the sqoop/bin directory
cp /tmp/sqoop-root/compile/<temp folder containing the needed class and jar files>/* /opt/sqoop-1.4.4/bin/
# Put into the /user/USERNAME/ folder in HDFS
hadoop fs -put /tmp/sqoop-root/compile/<temp folder containing the needed class and jar files>/* /user/root/
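Because the temporary folder name differs on every run, locating the right one by hand can be tedious. The sketch below is one way to find the newest jar for a table. It assumes the compile directory follows the /tmp/sqoop-<user>/compile/ pattern seen above and that the jar is named after the table; `find_table_jar` is a hypothetical helper name.

```shell
# Hypothetical helper: print the most recently modified <TABLE>.jar found in
# any subfolder of the Sqoop compile directory (prints nothing if none exists).
find_table_jar() {
  table=$1
  # Assumption: default location is /tmp/sqoop-<current user>/compile
  compile_root=${2:-/tmp/sqoop-$(whoami)/compile}
  # ls -t sorts newest first; head keeps only the newest match
  ls -t "$compile_root"/*/"$table".jar 2>/dev/null | head -n 1
}

# Usage:
#   jar=$(find_table_jar T_USERINFO)
#   cp "$(dirname "$jar")"/* /opt/sqoop-1.4.4/bin/
```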
Error 3: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://user/root/... ...
Solution:
After the import command has been run once, running it again may produce this error. To fix it, delete the corresponding file or folder in HDFS.
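# -r is needed because the leftover output is a directory; narrow the path if other data lives under /user/USERNAME
hadoop fs -rm -r /user/USERNAME/*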
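A narrower variant removes only the leftover output directory of one table instead of everything under the user's home. This is a sketch, assuming a Hadoop 2.x client and assuming the import writes to /user/<user>/<table>; `clear_target_dir` is a hypothetical name.

```shell
# Hypothetical helper: delete an HDFS directory only if it exists,
# so that a repeated import does not fail on a leftover output directory.
clear_target_dir() {
  target=$1   # e.g. /user/root/T_USERINFO (assumed output location)
  if hadoop fs -test -d "$target"; then
    hadoop fs -rm -r "$target"
  fi
}

# Usage (before re-running the sqoop import):
#   clear_target_dir /user/root/T_USERINFO
```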
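Error 4: java.sql.SQLException: ORA-01017: invalid username/password; logon denied (raised while Sqoop is importing data)
Cause: Oracle 11 is case-sensitive at logon, so case sensitivity needs to be turned off in the Oracle database.
Solution:
1. Log in to the database and run: alter system set sec_case_sensitive_logon=false
2. Or create a new user whose username and password are entirely uppercase or entirely lowercase. (In Sqoop the username and password have to be given in uppercase, but it is unclear whether they reach the database in uppercase or lowercase, so you may need to try both.)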
Error 5: INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s)
13/12/14 20:12:07 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/12/14 20:12:08 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/12/14 20:12:09 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/12/14 20:12:10 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/12/14 20:12:11 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/12/14 20:12:12 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/12/14 20:12:13 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
13/12/14 20:12:14 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032.
Cause: the Hadoop platform may have started only DFS and not YARN.
Solution: start Hadoop with start-all.sh, or with start-dfs.sh and start-yarn.sh together.
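Before restarting anything, it may help to confirm that the YARN ResourceManager really is the missing piece. The sketch below filters the output of the JDK's `jps` tool, which lists running Java processes by class name. `daemon_running` is a hypothetical helper, and the check only makes sense on the host where the ResourceManager is supposed to run.

```shell
# Hypothetical helper: read `jps` output on stdin and succeed only if a
# process with the given name is listed (-w matches the whole word, so
# "ResourceManager" is not confused with other daemons).
daemon_running() {
  grep -qw -- "$1"
}

# Usage (on the ResourceManager host):
#   jps | daemon_running ResourceManager || start-yarn.sh
```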