java 如何实现计数_如何高效的实现一个计数器map
這本是多年前一個(gè)stackoverflow上的一個(gè)討論,回答中涉及到了多種計(jì)數(shù)方法。對(duì)于一個(gè)key-value結(jié)構(gòu)的map,我們?cè)诰幊虝r(shí)會(huì)經(jīng)常涉及到key是對(duì)象,而value是一個(gè)integer或long來(lái)負(fù)責(zé)計(jì)數(shù),從而統(tǒng)計(jì)多個(gè)key的頻率。
面對(duì)這樣一個(gè)基本需求,可能有很多種實(shí)現(xiàn)。比如最基本的使用jdk的map直接實(shí)現(xiàn)——value是一個(gè)integer或者long。其基本代碼型如下:
1: final Map freq = new HashMap();2: int count = freq.containsKey(word) ? freq.get(word) : 0;3: freq.put(word, count + 1);
邏輯簡(jiǎn)單,判斷是否存在,是則get取值,否則為0,再put進(jìn)去一個(gè)加1后的值。總共要contain判斷,get,put做三次方法調(diào)用。
當(dāng)然進(jìn)一步我們可以把contain判斷去掉,代碼如下:
1: final Map freq = new HashMap();2: Integer count = freq.get(word);3: if (count == null) {4: freq.put(word, 1);5: } else {6: freq.put(word, count + 1);7: }
一般情況,我們做到這個(gè)地步,多數(shù)人對(duì)其邏輯已經(jīng)滿足,簡(jiǎn)單性能也能接受,試著想一下,難道不是這樣嗎?get加put,解決了。
當(dāng)然這樣的實(shí)現(xiàn)還不夠高效,于是我們開始去嘗試實(shí)現(xiàn)或?qū)ふ腋咝У姆椒?#xff0c;看看開源的集合類庫(kù)是否有需要的:
有個(gè)Trove,可以讓我們參考一下:
1: final TObjectIntHashMap freq = new TObjectIntHashMap();2: freq.adjustOrPutValue(word, 1, 1);
這樣做,非常優(yōu)雅啊,性能如何呢?不知道,需要看源碼了解細(xì)節(jié)。那再看看大名鼎鼎的guava如何呢?
1: AtomicLongMap map = AtomicLongMap.create();2: map.getAndIncrement(word);
實(shí)現(xiàn)依然優(yōu)雅,但是,但是看這名字,再看源碼,好吧,線程安全的,支持并發(fā),這不好搞了,我們場(chǎng)景需要嗎?不需要的話直覺(jué)告訴我們這肯定是“慢”的。再找:
1: Multiset bag = HashMultiset.create();2: bag.add(word);
這個(gè)看上去合適了,bag的實(shí)現(xiàn)明顯好很多,而且從語(yǔ)義理解上,這樣的接口更容易讓人理解。
那么這些方法,性能如何呢?做了個(gè)簡(jiǎn)單的比較,將26個(gè)英文字母做key,均勻循環(huán)若干次比較各個(gè)方法的效率(單純時(shí)間效率),而且時(shí)間不統(tǒng)計(jì)構(gòu)建開銷。外加了一個(gè)線程安全版的concurrentMap實(shí)現(xiàn),而其實(shí)google的guava里的AtomicLongMap也是包裝了juc的concurrentMap而已。里面有最終極的MutableInt方法,找找看吧,性能最好的就是它了。
1: /**2: *3: */4:5:6: import gnu.trove.map.hash.TObjectIntHashMap;7:8: import java.util.HashMap;9: import java.util.Map;10: import java.util.concurrent.ConcurrentHashMap;11: import java.util.concurrent.ConcurrentMap;12: import java.util.concurrent.atomic.AtomicLong;13:14: import com.google.common.collect.HashMultiset;15: import com.google.common.collect.Multiset;16: import com.google.common.util.concurrent.AtomicLongMap;17:18: /**19: * @author Administrator20: *21: */22: public class IntMapTest {23:24: /**25: * @param args26: */27: public static void main(String[] args) {28: // TODO Auto-generated method stub29: int cycles[] = { 100, 1000, 10000, 100000 };30: Tester baseLine = new BaseLine();31: Tester testForNull = new UseNullTest();32: Tester useAtomicLong = new UseAtomicLong();33: Tester useTrove = new UseTrove();34: Tester useMutableInt = new UseMutableInt();35: Tester useGuava = new UseGuava();36: Tester useGuava2 = new UseGuava2();37:38: for (int i = 0; i < cycles.length; i++) {39: System.out.println("-----With " + cycles[i] + " cycles-----");40: baseLine.test(cycles[i]);41: testForNull.test(cycles[i]);42: useAtomicLong.test(cycles[i]);43: useTrove.test(cycles[i]);44: useMutableInt.test(cycles[i]);45: useGuava.test(cycles[i]);46: useGuava2.test(cycles[i]);47: System.out.println("------------------------");48: }49:50: }51:52: }53:54: abstract class Tester {55: long ms;56: static String[] strs = "abcdefghijklmnopqrstuvwxyz".split("");57:58: void pre() {59: System.out.println("====" + this.getName() + "Test Case ");60: ms = System.currentTimeMillis();61: System.out.println("start at " + ms);62: }63:64: void post() {65: ms = System.currentTimeMillis() - ms;66: System.out.println("Time used: " + ms + " ms");67: }68:69: abstract void doAction(int cycles);70:71: public void test(int cycles) {72: pre();73: doAction(cycles);74: post();75: }76:77: abstract String getName();78: }79:80: class BaseLine extends Tester {81: final Map freq = new HashMap();82:83: @Override84: void doAction(int cycles) {85: for (int i = 0; i < cycles; i++) {86: for (String word : strs) {87: int count = freq.containsKey(word) ? freq.get(word) : 0;88: freq.put(word, count + 1);89: }90: }91: }92:93: @Override94: String getName() {95: return "BaseLine";96: }97:98: }99:100: class UseNullTest extends Tester {101: final Map freq = new HashMap();102:103: @Override104: void doAction(int cycles) {105: for (int i = 0; i < cycles; i++) {106: for (String word : strs) {107: Integer count = freq.get(word);108: if (count == null) {109: freq.put(word, 1);110: } else {111: freq.put(word, count + 1);112: }113: }114: }115: }116:117: @Override118: String getName() {119: return "TestForNull";120: }121:122: }123:124: class UseAtomicLong extends Tester {125: final ConcurrentMap map = new ConcurrentHashMap();126:127: @Override128: void doAction(int cycles) {129: for (int i = 0; i < cycles; i++) {130: for (String word : strs) {131: map.putIfAbsent(word, new AtomicLong(0));132: map.get(word).incrementAndGet();133: }134: }135: }136:137: @Override138: String getName() {139: return "AtomicLong";140: }141:142: }143:144: class UseTrove extends Tester {145: final TObjectIntHashMap freq = new TObjectIntHashMap();146:147: @Override148: void doAction(int cycles) {149: for (int i = 0; i < cycles; i++) {150: for (String word : strs) {151: freq.adjustOrPutValue(word, 1, 1);152: }153: }154: }155:156: @Override157: String getName() {158: return "Trove";159: }160:161: }162:163: class MutableInt {164: int value = 1; // note that we start at 1 since we're counting165:166: public void increment() {167: ++value;168: }169:170: public int get() {171: return value;172: }173: }174:175: class UseMutableInt extends Tester {176: Map freq = new HashMap();177:178: @Override179: void doAction(int cycles) {180: for (int i = 0; i < cycles; i++) {181: for (String word : strs) {182: MutableInt count = freq.get(word);183: if (count == null) {184: freq.put(word, new MutableInt());185: } else {186: count.increment();187: }188: }189: }190: }191:192: @Override193: String getName() {194: return "MutableInt";195: }196:197: }198:199: class UseGuava extends Tester {200: AtomicLongMap map = AtomicLongMap.create();201:202: @Override203: void doAction(int cycles) {204: for (int i = 0; i < cycles; i++) {205: for (String word : strs) {206: map.getAndIncrement(word);207: }208: }209: }210:211: @Override212: String getName() {213: return "Guava AtomicLongMap";214: }215:216: }217:218: class UseGuava2 extends Tester {219: Multiset bag = HashMultiset.create();220:221: @Override222: void doAction(int cycles) {223: for (int i = 0; i < cycles; i++) {224: for (String word : strs) {225: bag.add(word);226: }227: }228: }229:230: @Override231: String getName() {232: return "Guava HashMultiSet";233: }234:235: }
輸出結(jié)果如下:
1: -----With 100 cycles-----2: ====BaseLineTest Case3: start at 13586557027294: Time used: 7 ms5: ====TestForNullTest Case6: start at 13586557027367: Time used: 3 ms8: ====AtomicLongTest Case9: start at 135865570273910: Time used: 14 ms11: ====TroveTest Case12: start at 135865570275313: Time used: 2 ms14: ====MutableIntTest Case15: start at 135865570275516: Time used: 2 ms17: ====Guava AtomicLongMapTest Case18: start at 135865570275719: Time used: 4 ms20: ====Guava HashMultiSetTest Case21: start at 135865570276122: Time used: 7 ms23: ------------------------24: -----With 1000 cycles-----25: ====BaseLineTest Case26: start at 135865570276827: Time used: 17 ms28: ====TestForNullTest Case29: start at 135865570278530: Time used: 7 ms31: ====AtomicLongTest Case32: start at 135865570279233: Time used: 44 ms34: ====TroveTest Case35: start at 135865570283636: Time used: 17 ms37: ====MutableIntTest Case38: start at 135865570285339: Time used: 5 ms40: ====Guava AtomicLongMapTest Case41: start at 135865570285842: Time used: 9 ms43: ====Guava HashMultiSetTest Case44: start at 135865570286845: Time used: 50 ms46: ------------------------47: -----With 10000 cycles-----48: ====BaseLineTest Case49: start at 135865570291850: Time used: 16 ms51: ====TestForNullTest Case52: start at 135865570293453: Time used: 14 ms54: ====AtomicLongTest Case55: start at 135865570294856: Time used: 29 ms57: ====TroveTest Case58: start at 135865570297759: Time used: 10 ms60: ====MutableIntTest Case61: start at 135865570298862: Time used: 5 ms63: ====Guava AtomicLongMapTest Case64: start at 135865570299365: Time used: 15 ms66: ====Guava HashMultiSetTest Case67: start at 135865570300968: Time used: 77 ms69: ------------------------70: -----With 100000 cycles-----71: ====BaseLineTest Case72: start at 135865570308673: Time used: 124 ms74: ====TestForNullTest Case75: start at 135865570321076: Time used: 118 ms77: ====AtomicLongTest Case78: start at 135865570332979: Time used: 240 ms80: ====TroveTest Case81: start at 135865570356982: Time used: 102 ms83: ====MutableIntTest Case84: start at 135865570367185: Time used: 45 ms86: ====Guava AtomicLongMapTest Case87: start at 135865570371688: Time used: 126 ms89: ====Guava HashMultiSetTest Case90: start at 135865570384291: Time used: 98 ms92: ------------------------
一般結(jié)論:單線程使用MutableInt,多線程使用guava的AtomicLongMap,其實(shí)可以看看guava對(duì)addAndGet的實(shí)現(xiàn),循環(huán),很有趣。
最后總結(jié)一下,我們?cè)趯?duì)這個(gè)問(wèn)題做優(yōu)化的時(shí)候,明顯的思路就是減少方法調(diào)用,而MutableInt效率最高,明顯的是它將方法調(diào)用減少到最小——1次get,指針的威力頓時(shí)顯現(xiàn)。當(dāng)然實(shí)際業(yè)務(wù)代碼實(shí)現(xiàn)的時(shí)候還要考慮到多個(gè)因素,比如代碼可讀性,與業(yè)務(wù)結(jié)合等等,我們現(xiàn)實(shí)中不一定要追求如此的效率,但是也要避免毫無(wú)思考的寫下baseline里的代碼,因?yàn)槊黠@是可優(yōu)化的,why not?
注:文中單個(gè)實(shí)現(xiàn)代碼來(lái)自stackoverflow的各個(gè)回答,測(cè)試代碼本人編寫。
ref:
posted on 2013-01-20 12:40 changedi 閱讀(4269) 評(píng)論(0) ?編輯 ?收藏 所屬分類: Java技術(shù)
總結(jié)
以上是生活随笔為你收集整理的java 如何实现计数_如何高效的实现一个计数器map的全部?jī)?nèi)容,希望文章能夠幫你解決所遇到的問(wèn)題。
- 上一篇: java 获取枚举对象_Java:获取与
- 下一篇: 现代计划在韩国新建一座电动汽车工厂,投资