Running Hadoop's bundled wordcount word-count program
1. Counting words with the example program

(1) The wordcount program

The wordcount program ships in Hadoop's share directory:

    [root@leaf mapreduce]# pwd
    /usr/local/hadoop/share/hadoop/mapreduce
    [root@leaf mapreduce]# ls
    hadoop-mapreduce-client-app-2.6.5.jar         hadoop-mapreduce-client-jobclient-2.6.5-tests.jar
    hadoop-mapreduce-client-common-2.6.5.jar      hadoop-mapreduce-client-shuffle-2.6.5.jar
    hadoop-mapreduce-client-core-2.6.5.jar        hadoop-mapreduce-examples-2.6.5.jar
    hadoop-mapreduce-client-hs-2.6.5.jar          lib
    hadoop-mapreduce-client-hs-plugins-2.6.5.jar  lib-examples
    hadoop-mapreduce-client-jobclient-2.6.5.jar   sources

The program is inside hadoop-mapreduce-examples-2.6.5.jar.
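The jar name carries the Hadoop version, so the path changes from one release to the next. A minimal sketch of looking it up with a glob instead of typing the version follows. The helper name find_examples_jar is hypothetical, and the installation directory /usr/local/hadoop is the one used in this article.

```shell
# Hypothetical helper: print the path of the examples jar found under a
# Hadoop installation directory, whatever version the file name carries.
find_examples_jar() {
    hadoop_home="$1"
    # The glob matches hadoop-mapreduce-examples-<version>.jar.
    for candidate in "$hadoop_home"/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no examples jar found under $hadoop_home" >&2
    return 1
}

# Report the jar only if Hadoop is installed at the article's location.
if jar=$(find_examples_jar /usr/local/hadoop 2>/dev/null); then
    echo "examples jar: $jar"
fi
```

On the machine shown above this should print the path of hadoop-mapreduce-examples-2.6.5.jar.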
(2) Create the HDFS data directories

Create a directory to hold the input files of the MapReduce job:

    [root@leaf ~]# hadoop fs -mkdir -p /data/wordcount

Create a directory to hold the output files of the MapReduce job:

    [root@leaf ~]# hadoop fs -mkdir /output

List the two directories just created:

    [root@leaf ~]# hadoop fs -ls /
    drwxr-xr-x   - root supergroup          0 2017-09-01 20:34 /data
    drwxr-xr-x   - root supergroup          0 2017-09-01 20:35 /output
(3) Create a word file and upload it to HDFS

The word file looks like this:

    [root@leaf ~]# cat myword.txt
    leaf yyh
    yyh xpleaf
    katy ling
    yeyonghao leaf
    xpleaf katy

Upload the file to HDFS:

    [root@leaf ~]# hadoop fs -put myword.txt /data/wordcount

Check the uploaded file and its contents in HDFS:

    [root@leaf ~]# hadoop fs -ls /data/wordcount
    -rw-r--r--   1 root supergroup         57 2017-09-01 20:40 /data/wordcount/myword.txt
    [root@leaf ~]# hadoop fs -cat /data/wordcount/myword.txt
    leaf yyh
    yyh xpleaf
    katy ling
    yeyonghao leaf
    xpleaf katy
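For anyone following along, the sample file can be recreated without an editor. The sketch below writes the same five lines with a here-document and prints the byte count. It assumes single spaces between words and a newline at the end of every line, which is what produces the 57-byte size shown in the HDFS listing above.

```shell
# Recreate the article's sample input in the current directory.
cat > myword.txt <<'EOF'
leaf yyh
yyh xpleaf
katy ling
yeyonghao leaf
xpleaf katy
EOF

# Five newline-terminated lines add up to 57 bytes.
wc -c < myword.txt
```

If the count differs, the file probably contains trailing spaces or Windows line endings.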
(4) Run the wordcount program

Run the following command:

    [root@leaf ~]# hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount /data/wordcount /output/wordcount
    ...
    17/09/01 20:48:14 INFO mapreduce.Job: Job job_local1719603087_0001 completed successfully
    17/09/01 20:48:14 INFO mapreduce.Job: Counters: 38
            File System Counters
                    FILE: Number of bytes read=585940
                    FILE: Number of bytes written=1099502
                    FILE: Number of read operations=0
                    FILE: Number of large read operations=0
                    FILE: Number of write operations=0
                    HDFS: Number of bytes read=114
                    HDFS: Number of bytes written=48
                    HDFS: Number of read operations=15
                    HDFS: Number of large read operations=0
                    HDFS: Number of write operations=4
            Map-Reduce Framework
                    Map input records=5
                    Map output records=10
                    Map output bytes=97
                    Map output materialized bytes=78
                    Input split bytes=112
                    Combine input records=10
                    Combine output records=6
                    Reduce input groups=6
                    Reduce shuffle bytes=78
                    Reduce input records=6
                    Reduce output records=6
                    Spilled Records=12
                    Shuffled Maps =1
                    Failed Shuffles=0
                    Merged Map outputs=1
                    GC time elapsed (ms)=92
                    CPU time spent (ms)=0
                    Physical memory (bytes) snapshot=0
                    Virtual memory (bytes) snapshot=0
                    Total committed heap usage (bytes)=241049600
            Shuffle Errors
                    BAD_ID=0
                    CONNECTION=0
                    IO_ERROR=0
                    WRONG_LENGTH=0
                    WRONG_MAP=0
                    WRONG_REDUCE=0
            File Input Format Counters
                    Bytes Read=57
            File Output Format Counters
                    Bytes Written=48
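One point the run above does not show: a MapReduce job refuses to start if its output directory already exists, so repeating the same command fails until /output/wordcount is removed. The wrapper below is a minimal sketch of a repeatable run. The function name run_wordcount is hypothetical, and the paths are the ones used in this article.

```shell
# Paths taken from this article; adjust them for another installation.
JAR=/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar
INPUT=/data/wordcount
OUTPUT=/output/wordcount

# Hypothetical helper: clear the old result, then run the example job.
run_wordcount() {
    # The job will not overwrite an existing output directory,
    # so delete the previous result first if there is one.
    if hadoop fs -test -d "$OUTPUT"; then
        hadoop fs -rm -r "$OUTPUT"
    fi
    hadoop jar "$JAR" wordcount "$INPUT" "$OUTPUT"
}

# Only run when the hadoop command is actually available.
if command -v hadoop >/dev/null 2>&1; then
    run_wordcount
fi
```

Deleting the old output is a design choice that suits a throwaway example. For results worth keeping, writing each run to a fresh output path is the safer option.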
(5) View the results

The word counts are in the job's output file:

    [root@leaf ~]# hadoop fs -cat /output/wordcount/part-r-00000
    katy    2
    leaf    2
    ling    1
    xpleaf  2
    yeyonghao       1
    yyh     2
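The result can be cross-checked locally without Hadoop. The sketch below is not how the example jar is implemented. It is an ordinary shell pipeline that happens to give the same counts for this input, assuming words are separated by single spaces as in the sample file. The function name local_wordcount is hypothetical.

```shell
# Recreate the article's input so the sketch is self-contained.
cat > myword.txt <<'EOF'
leaf yyh
yyh xpleaf
katy ling
yeyonghao leaf
xpleaf katy
EOF

# Hypothetical helper: put one word per line, sort in byte order,
# count duplicates, then print "word<TAB>count".
local_wordcount() {
    tr -s ' ' '\n' < "$1" | LC_ALL=C sort | uniq -c |
        awk '{ printf "%s\t%d\n", $2, $1 }'
}

local_wordcount myword.txt

# The same input also accounts for three of the job counters:
wc -l < myword.txt                                # 5 lines: Map input records
tr -s ' ' '\n' < myword.txt | wc -l               # 10 words: Map output records
tr -s ' ' '\n' < myword.txt | sort -u | wc -l     # 6 distinct words: Reduce output records
```

The six lines printed by local_wordcount match part-r-00000 above, and the three counts match the corresponding entries under Map-Reduce Framework in the job log.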
This article is reposted from xpleaf's 51CTO blog. Original link: http://blog.51cto.com/xpleaf/1962271. To republish it, please contact the original author.