Hadoop Fully Distributed Multi-Node Installation (VMs: 1 master + 2 slaves)
Table of Contents
- 1. Install CentOS 7 in Virtual Machines
- 2. Configure a Static IP
- 3. Change the Hostnames
- 4. Edit the Hostname Mapping
- 5. Install and Configure Java
- 6. Set Up Passwordless SSH Login
- 7. Install Hadoop
- 8. Disable the Firewall
- 9. Format the File System
- 10. Start and Verify
- 11. A First MapReduce Program: WordCount
- 12. Shut Down Hadoop

Reference book: 《Hadoop大數據原理與應用》 (Principles and Applications of Hadoop Big Data)
1. Install CentOS 7 in Virtual Machines
- Install three CentOS 7 virtual machines: one master and two slaves. You can set the hostname during installation; remember to set a password.
- I installed from the 4.7 GB ISO and chose the "Server with GUI" installation.
- Choose NAT as the network connection type.
- `ip route show` shows the gateway IP.
- `ip addr` shows this machine's IP (both IPs are used below).
2. Configure a Static IP

```
vim /etc/sysconfig/network-scripts/ifcfg-ens33
```

```
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static           # change to static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens33
UUID=caf90547-4b5a-46b3-ab7c-2c8fb1f5e4d7
DEVICE=ens33
ONBOOT=yes                 # change to yes
IPADDR=192.168.253.130     # this machine's IP
NETMASK=255.255.255.0
GATEWAY=192.168.253.2      # gateway
DNS1=192.168.253.2         # same as the gateway is fine
```

If you get a permission error when saving, type `:w !sudo tee %`.

- Restart the network service.

Likewise, set the other two machines to 192.168.253.128 and 192.168.253.129 (adapt the addresses to your own network).
3. Change the Hostnames
- If you already set them during installation, skip this step.
- Switch to root: `sudo su`
- `vi /etc/hostname`, setting the contents to master, slave1, and slave2 respectively.
- `reboot`, then run `hostname` to check that the change took effect.
4. Edit the Hostname Mapping
For convenient access, make the following change on all three machines (as root: `sudo su`).
Append the mapping entries to /etc/hosts, then reboot.
Check that the machines can ping each other:
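The entries to append to /etc/hosts are not reproduced in the post; with the static IPs configured above, a plausible mapping (the slave-to-IP assignment is my assumption) is:

```
192.168.253.130 master
192.168.253.128 slave1
192.168.253.129 slave2
```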
```
ping master
ping slave1
ping slave2
```

5. Install and Configure Java
- Uninstall the old JDK.
  Check the current version: `java -version`
  Uninstall the bundled OpenJDK; we will use the Oracle JDK instead.
- Download the JDK, choosing the build that matches your system architecture.
- I copied the installer over from the host machine; see my separate JDK installation notes for the detailed steps.
  Install it to `/opt/jdk1.8.0_281/`.
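The post does not show the environment-variable setup for the JDK. A typical snippet would be the following (the file name `/etc/profile.d/jdk.sh` is my assumption; the path comes from the install step above):

```shell
# /etc/profile.d/jdk.sh -- hypothetical file name; JDK path taken from the step above
export JAVA_HOME=/opt/jdk1.8.0_281
export PATH=$JAVA_HOME/bin:$PATH
```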
6. Set Up Passwordless SSH Login
- Check whether SSH is installed: `rpm -qa | grep ssh`
- Install it if it is missing.
- `vim /etc/ssh/sshd_config`
  Uncomment line 43 and add one line; do this on all three machines.
- Restart the service: `systemctl restart sshd.service`
- Switch back to the normal user (Ctrl+D) and go to the home directory: `cd ~`
- `ssh-keygen`, pressing Enter at every prompt.
- `cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys`
- `chmod 0600 ~/.ssh/authorized_keys`
- Copy master's public key to slave1 and slave2 so that master can reach the slaves without a password.
  On master, run:

```
ssh-copy-id -i ~/.ssh/id_rsa.pub dnn@slave1
ssh slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub dnn@slave2
ssh slave2
ssh master
```

  Type `yes` when prompted and enter the password when asked.
  You can also repeat the same steps on the other two machines.
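The post does not show which directive sits at line 43 of sshd_config. In the stock CentOS 7 file, the commented `PubkeyAuthentication` directive is in that area, so the intended change is plausibly the following (an assumption; verify against your own file, and note that `RSAAuthentication` is deprecated in newer OpenSSH releases):

```
# line 43, uncommented:
PubkeyAuthentication yes
# added line (assumption; some guides add this):
RSAAuthentication yes
```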
7. Install Hadoop
For a pseudo-distributed setup, see my earlier guide on single-machine pseudo-distributed Hadoop installation.
Download or copy the installation package to all three machines, then unpack it:

```
scp dnn@michael:/home/dnn/hadoop-3.3.0.tar.gz /home/dnn/hadoop-3.3.0.tar.gz
tar -zxvf hadoop-3.3.0.tar.gz
```

Move it to the target directory: `sudo mv hadoop-3.3.0 /opt/hadoop-3.3.0`
Grant ownership to the normal user dnn: `chown -R dnn /opt/hadoop-3.3.0`
On the master node:
- Switch to root and create a new file: `vim /etc/profile.d/hadoop.sh`
  Add the Hadoop environment variables.
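The contents of hadoop.sh are not shown in the post; a typical version for this install path (assumed by me, using the `/opt/hadoop-3.3.0` directory from the steps above) is:

```shell
# /etc/profile.d/hadoop.sh -- contents assumed; install path from the steps above
export HADOOP_HOME=/opt/hadoop-3.3.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
```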
- Switch back to the normal user (ownership was granted above) and edit `vim /opt/hadoop-3.3.0/etc/hadoop/hadoop-env.sh`:
  - line 54, change to: `export JAVA_HOME=/opt/jdk1.8.0_281/`
  - line 55, add: `export HADOOP_SSH_OPTS='-o StrictHostKeyChecking=no'`
  - line 199, change to: `export HADOOP_PID_DIR=${HADOOP_HOME}/pids`
- `vim /opt/hadoop-3.3.0/etc/hadoop/mapred-env.sh`, adding `export JAVA_HOME=/opt/jdk1.8.0_281/` and `export HADOOP_MAPRED_PID_DIR=${HADOOP_HOME}/pids`
- `vim /opt/hadoop-3.3.0/etc/hadoop/yarn-env.sh` and add the corresponding entries.
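The yarn-env.sh addition is not shown; by analogy with the other env scripts above, it is plausibly:

```shell
# assumed by analogy with hadoop-env.sh and mapred-env.sh above
export JAVA_HOME=/opt/jdk1.8.0_281/
```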
- vim /opt/hadoop-3.3.0/etc/hadoop/core-site.xml
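The core-site.xml properties are not reproduced in the post; a minimal configuration consistent with this cluster (the port and temp directory are my assumptions) would look like:

```xml
<configuration>
  <!-- default file system: the NameNode on master (port 9000 is an assumption) -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <!-- base directory for Hadoop's working files (path is an assumption) -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-3.3.0/hdfsdata</value>
  </property>
</configuration>
```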
- vim /opt/hadoop-3.3.0/etc/hadoop/mapred-site.xml
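The mapred-site.xml properties are not reproduced either; a minimal sketch follows. The `HADOOP_MAPRED_HOME` entries are my assumption: on Hadoop 3.x their absence is a common cause of the MRAppMaster class-not-found error mentioned in Section 11.

```xml
<configuration>
  <!-- run MapReduce on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- assumed: point the AM and tasks at the Hadoop install -->
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop-3.3.0</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop-3.3.0</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/opt/hadoop-3.3.0</value>
  </property>
</configuration>
```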
- vim /opt/hadoop-3.3.0/etc/hadoop/yarn-site.xml
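The yarn-site.xml properties are not shown; a minimal configuration for this layout (my assumption) would be:

```xml
<configuration>
  <!-- the ResourceManager runs on master -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <!-- required for MapReduce shuffle on YARN -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```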
- In /opt/hadoop-3.3.0/etc/hadoop/, edit `vim workers`:
  delete `localhost` and add the worker hostnames.
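The hostnames added to workers are not listed in the post; with this 1-master/2-slave layout they are presumably:

```
slave1
slave2
```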
- Sync the configuration files to the two slaves.
  As root:

```
scp /etc/profile.d/hadoop.sh root@slave1:/etc/profile.d/
scp /etc/profile.d/hadoop.sh root@slave2:/etc/profile.d/
```

  As the normal user:

```
scp -r /opt/hadoop-3.3.0/etc/hadoop/* dnn@slave1:/opt/hadoop-3.3.0/etc/hadoop/
scp -r /opt/hadoop-3.3.0/etc/hadoop/* dnn@slave2:/opt/hadoop-3.3.0/etc/hadoop/
```

8. Disable the Firewall
As root: `systemctl disable firewalld.service`
Reboot, then check the status: `systemctl status firewalld.service`
It should show `inactive (dead)`. Do this on all three machines.
9. Format the File System
On master only, as the normal user:

```
hdfs namenode -format
```

10. Start and Verify
Run these three commands on master:

```
start-dfs.sh
start-yarn.sh
mr-jobhistory-daemon.sh start historyserver
# the third command is reported as deprecated; use this instead:
mapred --daemon start historyserver
```

Run `jps`; you should see the daemon processes running.
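The expected `jps` output is not shown in the post; with the daemons started above, a typical process list for this layout (my assumption) looks like:

```
# on master            # on slave1 / slave2
NameNode               DataNode
SecondaryNameNode      NodeManager
ResourceManager
JobHistoryServer
Jps                    Jps
```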
11. A First MapReduce Program: WordCount
- On master, create a directory under the HDFS root.
- Upload some files into the InputDataTest directory.
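The exact HDFS commands are not reproduced; they would be along these lines (the three input file names are placeholders; the job log below shows three input files were processed):

```
hdfs dfs -mkdir /InputDataTest
hdfs dfs -put file1.txt file2.txt file3.txt /InputDataTest
hdfs dfs -ls /InputDataTest
```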
- Run the example job: `hadoop jar /opt/hadoop-3.3.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.0.jar wordcount /InputDataTest /OutputDataTest`
  This first failed with an error mentioning org.apache.hadoop.mapreduce.v2.app.MRAppMaster.
  I restarted the cluster (the three stop commands in Section 12 below, then the three start commands) and ran the WordCount job again:
```
[dnn@master ~]$ hadoop jar /opt/hadoop-3.3.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.0.jar wordcount /InputDataTest /OutputDataTest
2021-03-12 07:11:51,635 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at master/192.168.253.130:8032
2021-03-12 07:11:52,408 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/dnn/.staging/job_1615504213995_0001
2021-03-12 07:11:53,547 INFO input.FileInputFormat: Total input files to process : 3
2021-03-12 07:11:54,066 INFO mapreduce.JobSubmitter: number of splits:3
2021-03-12 07:11:54,271 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1615504213995_0001
2021-03-12 07:11:54,271 INFO mapreduce.JobSubmitter: Executing with tokens: []
2021-03-12 07:11:54,624 INFO conf.Configuration: resource-types.xml not found
2021-03-12 07:11:54,624 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2021-03-12 07:11:55,117 INFO impl.YarnClientImpl: Submitted application application_1615504213995_0001
2021-03-12 07:11:55,164 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1615504213995_0001/
2021-03-12 07:11:55,164 INFO mapreduce.Job: Running job: job_1615504213995_0001
2021-03-12 07:12:05,308 INFO mapreduce.Job: Job job_1615504213995_0001 running in uber mode : false
2021-03-12 07:12:05,319 INFO mapreduce.Job:  map 0% reduce 0%
2021-03-12 07:12:21,455 INFO mapreduce.Job:  map 33% reduce 0%
2021-03-12 07:12:22,460 INFO mapreduce.Job:  map 100% reduce 0%
2021-03-12 07:12:29,514 INFO mapreduce.Job:  map 100% reduce 100%
2021-03-12 07:12:29,526 INFO mapreduce.Job: Job job_1615504213995_0001 completed successfully
2021-03-12 07:12:29,652 INFO mapreduce.Job: Counters: 54
	File System Counters
		FILE: Number of bytes read=20470
		FILE: Number of bytes written=1097885
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=25631
		HDFS: Number of bytes written=12134
		HDFS: Number of read operations=14
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
		HDFS: Number of bytes read erasure-coded=0
	Job Counters
		Launched map tasks=3
		Launched reduce tasks=1
		Data-local map tasks=3
		Total time spent by all maps in occupied slots (ms)=42362
		Total time spent by all reduces in occupied slots (ms)=4808
		Total time spent by all map tasks (ms)=42362
		Total time spent by all reduce tasks (ms)=4808
		Total vcore-milliseconds taken by all map tasks=42362
		Total vcore-milliseconds taken by all reduce tasks=4808
		Total megabyte-milliseconds taken by all map tasks=43378688
		Total megabyte-milliseconds taken by all reduce tasks=4923392
	Map-Reduce Framework
		Map input records=667
		Map output records=3682
		Map output bytes=39850
		Map output materialized bytes=20482
		Input split bytes=358
		Combine input records=3682
		Combine output records=1261
		Reduce input groups=912
		Reduce shuffle bytes=20482
		Reduce input records=1261
		Reduce output records=912
		Spilled Records=2522
		Shuffled Maps =3
		Failed Shuffles=0
		Merged Map outputs=3
		GC time elapsed (ms)=800
		CPU time spent (ms)=2970
		Physical memory (bytes) snapshot=615825408
		Virtual memory (bytes) snapshot=10951270400
		Total committed heap usage (bytes)=385785856
		Peak Map Physical memory (bytes)=168960000
		Peak Map Virtual memory (bytes)=2738552832
		Peak Reduce Physical memory (bytes)=110534656
		Peak Reduce Virtual memory (bytes)=2742329344
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=25273
	File Output Format Counters
		Bytes Written=12134
```

- View the results:
`_SUCCESS` indicates the job ran successfully.
The results are in the file part-r-00000.
- View them with `hdfs dfs -cat /OutputDataTest/part-r-00000`
12. Shut Down Hadoop

```
mr-jobhistory-daemon.sh stop historyserver   # or: mapred --daemon stop historyserver
stop-yarn.sh
stop-dfs.sh
```

It took several days, but by following the book, the installation finally succeeded!