Apache Flink for Absolute Beginners (13): Flink Counters
Requirement: when a text file is read in, some of its lines may be malformed or garbled. How do we count those bad lines, and how do we extract them?
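A first attempt keeps the count in a plain field of the map function:

    import org.apache.flink.api.common.functions.RichMapFunction
    import org.apache.flink.api.scala._

    def main(args: Array[String]): Unit = {
      val env = ExecutionEnvironment.getExecutionEnvironment
      val data = env.fromElements("hadoop", "spark", "pyspark", "storm")
      data.map(new RichMapFunction[String, Long] {
        // plain field: every parallel copy of this function has its own
        var counter = 0L
        override def map(value: String): Long = {
          counter = counter + 1
          println("counter:" + counter)
          counter
        }
      }).setParallelism(2).print()
    }

With this approach the count is no longer correct once a parallelism is set, because each parallel subtask increments only its own copy of counter.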
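One way to picture the miscount: every parallel subtask works with its own copy of the function, field included, so no copy ever sees all the elements. The plain-Java sketch below imitates that without using Flink at all. The names `SubtaskCounterDemo`, `LocalCountingMapper`, and `finalCountsPerSubtask`, as well as the round-robin split of elements, are assumptions made for illustration and do not describe how Flink really distributes data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only: no Flink classes are used here.
class SubtaskCounterDemo {

    // Stands in for one parallel copy of the map function (hypothetical name).
    static final class LocalCountingMapper {
        private long counter = 0L; // belongs to this copy only, never shared

        long map(String value) {
            counter = counter + 1;
            return counter;
        }
    }

    // Spreads the elements round-robin over `parallelism` mapper copies and
    // returns the last counter value each copy reached.
    static long[] finalCountsPerSubtask(List<String> elements, int parallelism) {
        List<LocalCountingMapper> mappers = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            mappers.add(new LocalCountingMapper());
        }
        long[] last = new long[parallelism];
        for (int i = 0; i < elements.size(); i++) {
            int subtask = i % parallelism; // assumed distribution, for the sketch only
            last[subtask] = mappers.get(subtask).map(elements.get(i));
        }
        return last;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("hadoop", "spark", "pyspark", "storm");
        // prints [2, 2]: neither copy ever reaches the true total of 4
        System.out.println(Arrays.toString(finalCountsPerSubtask(data, 2)));
    }
}
```

With a parallelism of 1 the same method returns a single count of 4, which is why the problem only shows up after the parallelism is raised.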
The correct approach is to define an Accumulator and do the counting through it. The Scala implementation is as follows:
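    import org.apache.flink.api.common.accumulators.LongCounter
    import org.apache.flink.api.common.functions.RichMapFunction
    import org.apache.flink.configuration.Configuration
    import org.apache.flink.core.fs.FileSystem.WriteMode

    // continues from the env and data values defined in the first example
    val info = data.map(new RichMapFunction[String, String] {
      // step 1: define the counter
      val counter = new LongCounter()

      override def open(parameters: Configuration): Unit = {
        // step 2: register the counter
        getRuntimeContext.addAccumulator("ele-counts-scala", counter)
      }

      override def map(in: String): String = {
        counter.add(1)
        in
      }
    })

    info.writeAsText("E:/test3", WriteMode.OVERWRITE).setParallelism(4)
    val jobResult = env.execute("CounterApp")

    // step 3: read the counter from the job result
    val num = jobResult.getAccumulatorResult[Long]("ele-counts-scala")
    println("num:" + num)

The Java implementation: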
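    import org.apache.flink.api.common.JobExecutionResult;
    import org.apache.flink.api.common.accumulators.LongCounter;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.operators.DataSource;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.core.fs.FileSystem;

    public class JavaCounterApp {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment executionEnvironment = ExecutionEnvironment.getExecutionEnvironment();
            DataSource<String> data = executionEnvironment.fromElements("hadoop", "spark", "pyspark", "storm");

            DataSet<String> dataSet = data.map(new RichMapFunction<String, String>() {
                // step 1: define the counter
                LongCounter counter = new LongCounter();

                @Override
                public void open(Configuration parameters) throws Exception {
                    // step 2: register the counter
                    getRuntimeContext().addAccumulator("ele-counts-java", counter);
                }

                @Override
                public String map(String value) throws Exception {
                    counter.add(1);
                    return value;
                }
            });

            dataSet.writeAsText("E:/test4", FileSystem.WriteMode.OVERWRITE).setParallelism(3);
            JobExecutionResult javaCounterApp = executionEnvironment.execute("JavaCounterApp");

            // step 3: read the counter from the job result
            long num = javaCounterApp.getAccumulatorResult("ele-counts-java");
            System.out.println("num:" + num);
        }
    }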
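To connect the accumulator back to the original requirement, here is a rough plain-Java sketch of the idea behind it: each subtask counts bad lines in its own local counter, and the partial counts are merged into one total at the end. Nothing in it is Flink code. `BadLineCountSketch`, `SketchLongCounter`, `processPartition`, and `countBadLines` are hypothetical names, and the rule in `isMalformed` (a good line has exactly three comma-separated fields) is an assumption chosen only for the example. In a Flink job the counter's role would be played by the `LongCounter` registered through `addAccumulator`, as in the listings above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only: it mimics the accumulator idea without Flink.
class BadLineCountSketch {

    // Hypothetical stand-in for a long counter accumulator.
    static final class SketchLongCounter {
        private long localValue = 0L;

        void add(long value) { localValue += value; }

        long getLocalValue() { return localValue; }

        // Folds another subtask's partial count into this one.
        void merge(SketchLongCounter other) { localValue += other.localValue; }
    }

    // Assumed rule for this sketch: a well-formed line has exactly 3 comma-separated fields.
    static boolean isMalformed(String line) {
        return line.split(",", -1).length != 3;
    }

    // Handles one partition the way a single subtask would: counts bad lines in
    // its own counter and appends them to badLines so they can be inspected later.
    static SketchLongCounter processPartition(List<String> partition, List<String> badLines) {
        SketchLongCounter counter = new SketchLongCounter();
        for (String line : partition) {
            if (isMalformed(line)) {
                counter.add(1);
                badLines.add(line);
            }
        }
        return counter;
    }

    // Merges the partial counts of all partitions into one total.
    static long countBadLines(List<List<String>> partitions, List<String> badLines) {
        SketchLongCounter total = new SketchLongCounter();
        for (List<String> partition : partitions) {
            total.merge(processPartition(partition, badLines));
        }
        return total.getLocalValue();
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("1,hadoop,ok", "garbled###"),
                Arrays.asList("2,spark,ok", "3,storm", "4,pyspark,ok"));
        List<String> badLines = new ArrayList<>();
        long bad = countBadLines(partitions, badLines);
        System.out.println("bad lines: " + bad + " " + badLines);
    }
}
```

For the sample input the sketch reports two bad lines, "garbled###" and "3,storm", regardless of how the lines are split across partitions, which is the property the field-based counter lacked.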