Installing RHadoop
1. Installing the R Language
Install the required dependencies:
    yum install -y perl* pcre-devel tcl-devel zlib-devel bzip2-devel libX11-devel tk-devel tetex-latex *gfortran* compat-readline5
    yum install libRmath-*
    rpm -Uvh --force --nodeps R-core-2.10.0-2.el5.x86_64.rpm
    rpm -Uvh R-2.10.0-2.el5.x86_64.rpm R-devel-2.10.0-2.el5.x86_64.rpm
Compile and install R-3.0.1:
    tar -zxvf R-3.0.1.tar.gz
    cd R-3.0.1
    ./configure
    make
    make install
    # needed when running R
    export HADOOP_CMD=/usr/bin/hadoop
Troubleshooting
1. Error 1
    error: --with-readline=yes (default)
Install readline:
    yum install readline*
2. Error 2
    error: No F77 compiler found
Install gfortran.
3. Error 3
    error: --with-x=yes (default) and X11 headers/libs are not available
Install the X toolkit libraries:
    yum install libXt*
4. Error 4
    error: C++ preprocessor "/lib/cpp" fails sanity check
Install g++ or build-essential (on Red Hat 6.2, install gcc-c++ and glibc-headers).
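Most of the configure errors above come down to a build tool that is not installed. As a rough sketch, a check like the following could be run before `./configure` to list what is missing. The function name `missing_tools` is hypothetical, and the tool list is an assumption based on the errors listed in this section, not an exhaustive list of what R's configure script probes for.

```shell
#!/bin/sh
# Print the name of every command in the argument list that cannot be
# found on the current PATH (hypothetical helper, not part of R).
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Assumed tool list, derived from the errors described above.
missing_tools gcc g++ gfortran make
```

An empty result means every listed tool was found. Missing headers (readline, X11) are not detected by this check, since those are libraries rather than commands.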
Verify that the installation succeeded:
    [root@node1 bin]# R
    R version 3.0.1 (2013-05-16) -- "Good Sport"
    Copyright (C) 2013 The R Foundation for Statistical Computing
    Platform: x86_64-unknown-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.
2. Installing RHadoop
Install rhdfs and rmr2, together with the packages they depend on:
    cd Rhadoop/
    R CMD javareconf
    R CMD INSTALL 'plyr_1.8.tar.gz'
    R CMD INSTALL 'stringr_0.6.2.tar.gz'
    R CMD INSTALL 'reshape2_1.2.2.tar.gz'
    R CMD INSTALL 'digest_0.6.3.tar.gz'
    R CMD INSTALL 'functional_0.4.tar.gz'
    R CMD INSTALL 'iterators_1.0.6.tar.gz'
    R CMD INSTALL 'itertools_0.1-1.tar.gz'
    R CMD INSTALL 'Rcpp_0.10.3.tar.gz'
    R CMD INSTALL 'rJava_0.9-4.tar.gz'
    R CMD INSTALL 'RJSONIO_1.0-3.tar.gz'
    R CMD INSTALL 'rhdfs_1.0.5.tar.gz'
    R CMD INSTALL 'rmr2_2.2.0.tar.gz'
Then start R and load rhdfs to check that it works:
    R
    library(rhdfs)
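Because the tarballs have to be installed in dependency order, a long list of separate commands can hide a failure in the middle. The sketch below wraps the same `R CMD INSTALL` call in a loop that stops at the first error. The function name `install_in_order` and the `DRY_RUN` variable are hypothetical conveniences, not part of R or RHadoop.

```shell
#!/bin/sh
# Install source tarballs in the order given, stopping at the first
# failure. With DRY_RUN=1 the commands are printed instead of executed.
install_in_order() {
    for pkg in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "R CMD INSTALL $pkg"
        else
            R CMD INSTALL "$pkg" || return 1
        fi
    done
}

# Preview the first few installs from the list above without running them.
DRY_RUN=1
install_in_order plyr_1.8.tar.gz stringr_0.6.2.tar.gz reshape2_1.2.2.tar.gz
```

Setting `DRY_RUN=0` (or leaving it unset) runs the real installs; the loop returns a non-zero status as soon as one package fails to build.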
Verification test
rmr test command:
    > train.mr<-mapreduce(
    + train.hdfs,
    + map = function(k, v) {
    + keyval(k,v$item)
    + }
    + ,reduce=function(k,v){
    + m<-merge(v,v)
    + keyval(m$x,m$y)
    + }
    + )
The following error appears:
    packageJobJar: [/tmp/RtmpCuhs7d/rmr-local-env18916b6f86b3, /tmp/RtmpCuhs7d/rmr-global-env18913824c681, /tmp/RtmpCuhs7d/rmr-streaming-map18912d6c2b1c, /tmp/RtmpCuhs7d/rmr-streaming-reduce1891179bb645, /tmp/hadoop-root/hadoop-unjar4575094085541826184/] [] /tmp/streamjob2910108622786868147.jar tmpDir=null
    13/06/05 18:22:28 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    13/06/05 18:22:28 INFO mapred.FileInputFormat: Total input paths to process : 1
    13/06/05 18:22:29 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop-root/mapred/local]
    13/06/05 18:22:29 INFO streaming.StreamJob: Running job: job_201306050931_0004
    13/06/05 18:22:29 INFO streaming.StreamJob: To kill this job, run:
    13/06/05 18:22:29 INFO streaming.StreamJob: /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=cdh1:8021 -kill job_201306050931_0004
    13/06/05 18:22:29 INFO streaming.StreamJob: Tracking URL: http://cdh1:50030/jobdetails.jsp?jobid=job_201306050931_0004
    13/06/05 18:22:30 INFO streaming.StreamJob: map 0% reduce 0%
    13/06/05 18:22:56 INFO streaming.StreamJob: map 100% reduce 100%
    13/06/05 18:22:56 INFO streaming.StreamJob: To kill this job, run:
    13/06/05 18:22:56 INFO streaming.StreamJob: /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=cdh1:8021 -kill job_201306050931_0004
    13/06/05 18:22:56 INFO streaming.StreamJob: Tracking URL: http://cdh1:50030/jobdetails.jsp?jobid=job_201306050931_0004
    13/06/05 18:22:56 ERROR streaming.StreamJob: Job not successful. Error: NA
    13/06/05 18:22:56 INFO streaming.StreamJob: killJob...
    Streaming Command Failed!
    Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
      hadoop streaming failed with error code 1
Fix: the job logs show that Hadoop could not find Rscript under /usr/bin. Create symbolic links for R and Rscript from R's install directory, /usr/local/bin, into /usr/bin, then run the job again; the error no longer occurs.
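Before changing anything, it can help to confirm that the problem really is a lookup failure. The sketch below checks whether a command resolves when only a restricted set of directories is searched. The function name `visible_in_path` is hypothetical, and `/usr/bin:/bin` is only an assumed stand-in for the search path the streaming tasks use on this cluster.

```shell
#!/bin/sh
# Succeed if command $1 can be resolved using only the directories in $2.
# The lookup runs in a subshell so the caller's PATH is left untouched.
visible_in_path() {
    ( PATH="$2"; command -v "$1" >/dev/null 2>&1 )
}

# Assumption: the task processes search roughly /usr/bin:/bin.
if visible_in_path Rscript /usr/bin:/bin; then
    echo "Rscript is visible"
else
    echo "Rscript is NOT visible; a symlink into /usr/bin may be needed"
fi
```

If the check reports that Rscript is not visible, the symbolic links are the next step.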
    # ln -s /usr/local/bin/R /usr/bin
    # ln -s /usr/local/bin/Rscript /usr/bin
3. Installing rhbase
## Install dependencies
    # yum install boost*
    # yum install openssl*
Install Thrift:
    # tar -zxvf thrift-0.9.0.tar.gz
    # mv thrift-0.9.0/lib/cpp/src/thrift/qt/moc_TQTcpServer.cpp thrift-0.9.0/lib/cpp/src/thrift/qt/moc_TQTcpServer.cpp.bak
    # cd thrift-0.9.0
    # ./configure --with-boost=/usr/include/boost JAVAC=/usr/java/jdk1.6.0_31/bin/javac
    # make
    # make install
If configure fails with "Error: libcrypto required.", install the OpenSSL packages:
    # yum install openssl*
If the build fails with:
    src/thrift/qt/moc_TQTcpServer.cpp:14:2: error: #error "This file was generated using the moc from 4.8.1. It"
    src/thrift/qt/moc_TQTcpServer.cpp:15:2: error: #error "cannot be used with the include files from this version of Qt."
    src/thrift/qt/moc_TQTcpServer.cpp:16:2: error: #error "(The moc has changed too much.)"
then run the following command to move the generated file out of the way:
    # mv thrift-0.9.0/lib/cpp/src/thrift/qt/moc_TQTcpServer.cpp thrift-0.9.0/lib/cpp/src/thrift/qt/moc_TQTcpServer.cpp.bak
Configure PKG_CONFIG_PATH:
    export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig/
    pkg-config --cflags thrift    # correct if it returns: -I/usr/local/include/thrift
    cp /usr/local/lib/libthrift-0.9.0.so /usr/lib/
    cp /usr/local/lib/libthrift-0.9.0.so /usr/lib64/
Start the HBase Thrift server:
    /usr/lib/hbase/bin/hbase-daemon.sh start thrift
Use jps to check that the thrift process is running.
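The jps check can be scripted rather than read by eye. The sketch below filters jps-style output (one `<pid> <main class>` pair per line) for the Thrift server. The function name `has_thrift_server` is hypothetical, and the process name `ThriftServer` is an assumption about how the HBase Thrift daemon appears in jps on this installation; adjust it if jps prints something else.

```shell
#!/bin/sh
# Read jps-style output on stdin and succeed if a process whose second
# column is exactly "ThriftServer" (assumed name) is listed.
has_thrift_server() {
    awk '$2 == "ThriftServer" { found = 1 } END { exit (found ? 0 : 1) }'
}

# Typical use on the HBase node (requires a JDK that ships jps):
#   jps | has_thrift_server && echo "thrift is running"
```

Matching on the whole second column avoids false positives from other processes whose names merely contain the word "Thrift".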
Install rhbase
    R CMD INSTALL 'rhbase_1.1.1.tar.gz'
Verify and test
At the R prompt, enter library(rmr2), library(rhdfs), and library(rhbase). If all three load without error, the installation succeeded.
    [root@desktop27 hadoop]# R
    R version 3.0.1 (2013-05-16) -- "Good Sport"
    Copyright (C) 2013 The R Foundation for Statistical Computing
    Platform: x86_64-unknown-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

      Natural language support but running in an English locale

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    > library(rhdfs)
    Loading required package: rJava
    HADOOP_CMD=/usr/bin/hadoop
    Be sure to run hdfs.init()
    > library(rmr2)
    Loading required package: Rcpp
    Loading required package: RJSONIO
    Loading required package: digest
    Loading required package: functional
    Loading required package: stringr
    Loading required package: plyr
    Loading required package: reshape2
    > library(rhbase)
    >
4. Installing RHive
Environment variables
Set the environment variables: open /etc/profile with vim and append the following lines at the end:
    export HADOOP_CMD=/usr/bin/hadoop
    export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig/
    export HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.2.1.jar
    export HADOOP_HOME=/usr/lib/hadoop
    export RHIVE_DATA=/hadoop/dfs/rhive/data
    export HIVE_HOME=/usr/lib/hive
Install Rserve:
    # R CMD INSTALL 'Rserve_1.7-1.tar.gz'
As the user who installed Rserve, choose a directory, create an Rserv.conf file in it containing the line "remote enable", then save and exit:
    # cd /usr/local/lib64/R/
    # echo remote enable > Rserv.conf
Start Rserve:
    # R CMD Rserve --RS-conf /usr/local/lib64/R/Rserv.conf
Check that Rserve started correctly:
    # telnet localhost 6311
If Rsrv0103QAP1 is displayed, the connection succeeded.
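The string shown by telnet is the identification banner Rserve sends on connect; in the example above it begins with `Rsrv` followed by a protocol version. As a minimal sketch, a script that has already captured the first bytes from port 6311 could test them like this. The function name `is_rserve_banner` is hypothetical, and checking only the `Rsrv` prefix is a simplifying assumption.

```shell
#!/bin/sh
# Succeed if the string in $1 starts with "Rsrv", the prefix seen in the
# banner above. How the string is captured is left to the caller.
is_rserve_banner() {
    case "$1" in
        Rsrv*) return 0 ;;
        *)     return 1 ;;
    esac
}

if is_rserve_banner "Rsrv0103QAP1"; then
    echo "looks like Rserve"
fi
```

Any other response, or no response at all, suggests that Rserve is not listening on that port or that remote access was not enabled in Rserv.conf.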
Install RHive
Install the package and create the data directory:
    # R CMD INSTALL RHive_0.0-7.tar.gz
    # cd /usr/local/lib64/R/
    # mkdir -p rhive/data
Then upload rhive_udf.jar to HDFS:
    hadoop fs -mkdir /rhive/lib
    cd /usr/local/lib64/R/library/RHive/java
    hadoop fs -put rhive_udf.jar /rhive/lib
    hadoop fs -chmod a+rw /rhive/lib/rhive_udf.jar
    cd /usr/lib/hadoop
    ln -s /etc/hadoop/conf conf
Test whether RHive was installed successfully:
    R
    library(RHive)
    rhive.connect('192.168.0.27')    # address of the Hive server
    rhive.env()
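If the RHive test fails, one thing worth ruling out is that the variables added to /etc/profile never reached the current shell, for example because the file was not re-sourced after editing. The sketch below lists any of them that are unset or empty. The function name `unset_vars` is hypothetical, and the variable list is simply the one from the environment-variable step of this guide.

```shell
#!/bin/sh
# Print the name of each listed variable that is unset or empty.
# eval is used for indirect lookup, so pass plain variable names only.
unset_vars() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}

# Variables taken from the /etc/profile step of this guide.
unset_vars HADOOP_CMD HADOOP_STREAMING HADOOP_HOME HIVE_HOME RHIVE_DATA
```

No output means all five variables are set; running `source /etc/profile` and repeating the check is a quick way to pick up missing ones.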