MR Example: CombineFileInputFormat
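CombineFileInputFormat is an abstract class. Hadoop provides two implementations of it: CombineTextInputFormat and CombineSequenceFileInputFormat.
This example made three things clear to me. For details, see the posts "Explained: MR Multi-Path Input" and "Explained: The CombineFileInputFormat Class".
- Single input path:
- Multi-path input, case ①:
- Multi-path input, case ②:
Looking closely, you will also notice a difference between the two multi-path input cases ① and ② (verified):
Complete code: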
package test0820;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.VLongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount0826 {

    // args[0], args[1]: input paths; args[2]: output path
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCount0826.class);

        job.setMapperClass(IIMapper.class);
        job.setReducerClass(IIReducer.class);
        job.setNumReduceTasks(5);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(VLongWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(VLongWritable.class);

        // CombineFileInputFormat class
        //job.setInputFormatClass(CombineTextInputFormat.class);
        // Maximum size of a combined split: 60 MB
        CombineTextInputFormat.setMaxInputSplitSize(job, 60*1024*1024L);
        //CombineTextInputFormat.addInputPath(job, new Path(args[0]));
        //CombineTextInputFormat.addInputPath(job, new Path(args[1]));
        MultipleInputs.addInputPath(job, new Path(args[0]), CombineTextInputFormat.class);
        MultipleInputs.addInputPath(job, new Path(args[1]), CombineTextInputFormat.class);

        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

    // map: emit (word, 1) for every space-separated token in the line
    public static class IIMapper extends Mapper<LongWritable, Text, Text, VLongWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] splited = value.toString().split(" ");
            for (String word : splited) {
                context.write(new Text(word), new VLongWritable(1L));
            }
        }
    }

    // reduce: sum the counts for each word
    public static class IIReducer extends Reducer<Text, VLongWritable, Text, VLongWritable> {
        @Override
        protected void reduce(Text key, Iterable<VLongWritable> v2s, Context context)
                throws IOException, InterruptedException {
            long sum = 0;
            for (VLongWritable vl : v2s) {
                sum += vl.get();
            }
            context.write(key, new VLongWritable(sum));
        }
    }
}
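The listing sets a 60 MB limit with `CombineTextInputFormat.setMaxInputSplitSize(job, 60*1024*1024L)`. To get a feel for why such a limit cuts down the number of map tasks when the input is many small files, here is a minimal plain-Java sketch of greedy packing. It is a simplified model and not Hadoop's actual implementation: it ignores node and rack locality and the chunking of files larger than the limit. The class name `CombineSplitSketch`, the method `packIntoSplits`, and the file sizes are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop or of the original post.
final class CombineSplitSketch {

    /**
     * Greedily groups file sizes into splits: files are appended to the
     * current split, and the split is closed once its running total
     * reaches maxSplitBytes (so a split may overshoot the limit slightly).
     */
    static List<List<Long>> packIntoSplits(long[] fileSizes, long maxSplitBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : fileSizes) {
            current.add(size);
            currentBytes += size;
            if (currentBytes >= maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
        }
        // Whatever is left over forms the final, smaller split.
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed sizes of six small input files.
        long[] fileSizes = {10 * mb, 20 * mb, 35 * mb, 5 * mb, 50 * mb, 8 * mb};
        List<List<Long>> splits = packIntoSplits(fileSizes, 60 * mb);
        for (int i = 0; i < splits.size(); i++) {
            long total = 0;
            for (long size : splits.get(i)) {
                total += size;
            }
            System.out.println("split " + i + ": " + splits.get(i).size()
                    + " files, " + (total / mb) + " MB");
        }
    }
}
```

With these made-up sizes the six files land in two splits, so under this model the job would run two map tasks rather than one per file.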
Reposted from: https://www.cnblogs.com/skyl/p/4761662.html