MapReduce Source Code Analysis (Part 1): The Preparation Phase
MapReduce Source Code Analysis

This post walks through the underlying MapReduce source code starting from the WordCount program below, hereafter referred to as the WC class.
```java
package com.henu;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

/**
 * @author George
 * @description
 * hello you
 */
public class WC {

    public static class WCMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        Text k1 = new Text();
        IntWritable v1 = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Split each input line on whitespace and emit a (word, 1) pair per word.
            String line = value.toString();
            String[] strings = line.split("\\s+");
            for (String s : strings) {
                k1.set(s);
                context.write(k1, v1);
            }
        }
    }

    public static class WCReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        int count;
        IntWritable v2 = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum the grouped counts for each word.
            count = 0;
            for (IntWritable value : values) {
                count += value.get();
            }
            v2.set(count);
            context.write(key, v2);
        }
    }

    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(WC.class);
        job.setMapperClass(WCMapper.class);
        job.setReducerClass(WCReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Set the partitioner for the map stage. MyPartitoner returns two partition
        // numbers (0 and 1), so we need two reduce tasks; with only one reduce task
        // the custom partitioner would effectively be ignored.
        job.setPartitionerClass(MyPartitoner.class);
        job.setNumReduceTasks(2);
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }

    private static class MyPartitoner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text text, IntWritable intWritable, int i) {
            // Send the key "hello" to partition 0 and every other key to partition 1.
            String kStr = text.toString();
            return kStr.equalsIgnoreCase("hello") ? 0 : 1;
        }
    }
}
```

In the main method of the WC class, step into job.waitForCompletion().
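For comparison with the custom MyPartitoner above: when no partitioner is set, Hadoop's default HashPartitioner dispatches each key by its hash. A minimal Hadoop-free sketch of both dispatch rules (String keys stand in for Text so the code compiles without the Hadoop jars):

```java
public class PartitionDemo {
    // Mirrors HashPartitioner.getPartition: clear the sign bit, then take the
    // remainder modulo the number of reduce tasks.
    static int hashPartition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Mirrors MyPartitoner above: "hello" goes to partition 0, all other keys to 1.
    static int customPartition(String key) {
        return key.equalsIgnoreCase("hello") ? 0 : 1;
    }

    public static void main(String[] args) {
        System.out.println(hashPartition("hello", 4));
        System.out.println(customPartition("hello")); // 0
        System.out.println(customPartition("you"));   // 1
    }
}
```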
waitForCompletion() calls submit().
Step into submit().
submit() calls submitJobInternal(), which submits the job to the cluster.
Step into submitJobInternal().
Step into writeSplits(); writeSplits() calls writeNewSplits().
Step into writeNewSplits().
Then search for and step into the FileInputFormat class, where the input splits are computed.
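Inside FileInputFormat.getSplits(), the split size is derived from the block size bounded by the configured minimum and maximum: computeSplitSize returns max(minSize, min(maxSize, blockSize)). A self-contained sketch of that formula (the real method is protected inside FileInputFormat):

```java
public class SplitSizeDemo {
    // Mirrors FileInputFormat.computeSplitSize:
    // splitSize = max(minSize, min(maxSize, blockSize))
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        // With default min/max bounds, the split size equals the HDFS block size.
        System.out.println(computeSplitSize(128 * mb, 1, Long.MAX_VALUE)); // 134217728
        // Lowering maxSize shrinks splits; raising minSize enlarges them.
        System.out.println(computeSplitSize(128 * mb, 1, 64 * mb));        // 67108864
    }
}
```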
Everything up to this point is preparation; only then is the job actually submitted.
In summary, the client does the following:
- completes the job configuration
- checks the input and output paths
- computes the input splits, which determine the number of map tasks
- uploads the job resources to HDFS
- submits the job
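On the third point, the number of splits (and therefore map tasks) follows the loop in FileInputFormat.getSplits(): it keeps cutting off full splits while the remaining bytes exceed SPLIT_SLOP (1.1) times the split size, so a slightly oversized tail is folded into the last split instead of becoming a tiny extra map task. A Hadoop-free sketch of that loop:

```java
public class SplitCountDemo {
    // Same slack constant FileInputFormat uses.
    private static final double SPLIT_SLOP = 1.1;

    // Counts how many splits a file of fileSize bytes yields for a given splitSize.
    static int countSplits(long fileSize, long splitSize) {
        int splits = 0;
        long remaining = fileSize;
        // Cut full splits while the remainder is more than 1.1x the split size.
        while (((double) remaining) / splitSize > SPLIT_SLOP) {
            remaining -= splitSize;
            splits++;
        }
        // Whatever is left (up to 1.1x splitSize) becomes the final split.
        if (remaining != 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        System.out.println(countSplits(260 * mb, 128 * mb)); // 2 (132 MB tail merges in)
        System.out.println(countSplits(300 * mb, 128 * mb)); // 3
    }
}
```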
The ApplicationMaster then requests resources from the ResourceManager based on the split list; the ResourceManager allocates containers, and the ApplicationMaster launches those containers and runs the MapReduce tasks inside them.
A diagram in the original post summarizes this flow (figure not preserved).