Lucene Source Code Analysis (5): lucene-group
1. Basic query usage
org.apache.lucene.search.IndexSearcher
public void search(Query query, Collector results)
where Collector is defined as follows:
/**
 * <p>Expert: Collectors are primarily meant to be used to
 * gather raw results from a search, and implement sorting
 * or custom result filtering, collation, etc. </p>
 *
 * <p>Lucene's core collectors are derived from {@link Collector}
 * and {@link SimpleCollector}. Likely your application can
 * use one of these classes, or subclass {@link TopDocsCollector},
 * instead of implementing Collector directly:
 *
 * <ul>
 *
 *   <li>{@link TopDocsCollector} is an abstract base class
 *   that assumes you will retrieve the top N docs,
 *   according to some criteria, after collection is
 *   done. </li>
 *
 *   <li>{@link TopScoreDocCollector} is a concrete subclass
 *   {@link TopDocsCollector} and sorts according to score +
 *   docID. This is used internally by the {@link
 *   IndexSearcher} search methods that do not take an
 *   explicit {@link Sort}. It is likely the most frequently
 *   used collector.</li>
 *
 *   <li>{@link TopFieldCollector} subclasses {@link
 *   TopDocsCollector} and sorts according to a specified
 *   {@link Sort} object (sort by field). This is used
 *   internally by the {@link IndexSearcher} search methods
 *   that take an explicit {@link Sort}.
 *
 *   <li>{@link TimeLimitingCollector}, which wraps any other
 *   Collector and aborts the search if it's taken too much
 *   time.</li>
 *
 *   <li>{@link PositiveScoresOnlyCollector} wraps any other
 *   Collector and prevents collection of hits whose score
 *   is <= 0.0</li>
 *
 * </ul>
 *
 * @lucene.experimental
 */
Collector class hierarchy
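The Javadoc above describes a search that hands each hit to a collector, with TopScoreDocCollector ordering hits by score and then docID. The following is a minimal plain-Java sketch of that idea, not Lucene code: CollectorSketch, HitCollector, Hit, TopNCollector and the search loop are all hypothetical names invented for illustration, and the real Lucene classes work per index segment and have a richer contract.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Simplified model of the search/collector handshake; none of these types are Lucene's. */
final class CollectorSketch {

    /** Hypothetical per-hit callback, standing in for a collector. */
    interface HitCollector {
        void collect(int docId, float score);
    }

    /** One collected hit. */
    static final class Hit {
        final int docId;
        final float score;

        Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Orders hits from worst to best: lower score is worse, then higher docId is worse. */
    static int worstFirst(Hit a, Hit b) {
        int byScore = Float.compare(a.score, b.score);
        return byScore != 0 ? byScore : Integer.compare(b.docId, a.docId);
    }

    /** Keeps the n best hits: higher score wins, ties go to the lower docId. */
    static final class TopNCollector implements HitCollector {
        private final int n;
        // Min-heap whose head is always the weakest hit currently kept.
        private final PriorityQueue<Hit> heap = new PriorityQueue<>(CollectorSketch::worstFirst);
        private int totalHits = 0;

        TopNCollector(int n) {
            this.n = n;
        }

        @Override
        public void collect(int docId, float score) {
            totalHits++;
            heap.add(new Hit(docId, score));
            if (heap.size() > n) {
                heap.poll(); // evict the weakest hit
            }
        }

        /** Number of hits seen, including those that were evicted. */
        int totalHits() {
            return totalHits;
        }

        /** Returns the kept hits, best first. */
        List<Hit> topHits() {
            List<Hit> out = new ArrayList<>(heap);
            out.sort((a, b) -> worstFirst(b, a));
            return out;
        }
    }

    /** Stand-in for a search loop: every array index is a matching docId with its score. */
    static void search(float[] scoresByDocId, HitCollector collector) {
        for (int docId = 0; docId < scoresByDocId.length; docId++) {
            collector.collect(docId, scoresByDocId[docId]);
        }
    }
}
```

Calling CollectorSketch.search(new float[] {1.0f, 3.0f, 3.0f, 0.5f}, new CollectorSketch.TopNCollector(2)) leaves docs 1 and 2 in the collector, in that order, while totalHits() reports all 4 matches. Keeping the weakest hit at the head of the heap is what lets the collector decide in logarithmic time whether a new hit displaces an old one.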
2. lucene-group
The lucene-group module provides grouped queries through GroupingSearch, which is backed by corresponding collectors.
3. Example:
public Map<String, Integer> groupBy(Query query, String field, int topCount) {
    Map<String, Integer> map = new HashMap<String, Integer>();
    long begin = System.currentTimeMillis();
    int topNGroups = topCount;
    int groupOffset = 0;
    int maxDocsPerGroup = 100;
    int withinGroupOffset = 0;
    try {
        // First pass: find the top groups for the query.
        FirstPassGroupingCollector c1 = new FirstPassGroupingCollector(field, Sort.RELEVANCE, topNGroups);
        // Cache the hits (and scores) so the second pass can replay them.
        boolean cacheScores = true;
        double maxCacheRAMMB = 4.0;
        CachingCollector cachedCollector = CachingCollector.create(c1, cacheScores, maxCacheRAMMB);
        indexSearcher.search(query, cachedCollector);
        Collection<SearchGroup<String>> topGroups = c1.getTopGroups(groupOffset, true);
        if (topGroups == null) {
            // No groups were collected.
            return null;
        }
        // Second pass: collect the documents that belong to each top group.
        SecondPassGroupingCollector c2 = new SecondPassGroupingCollector(field, topGroups, Sort.RELEVANCE, Sort.RELEVANCE, maxDocsPerGroup, true, true, true);
        if (cachedCollector.isCached()) {
            // Cache fit within maxCacheRAMMB, so we can replay it:
            cachedCollector.replay(c2);
        } else {
            // Cache was too large; must re-execute query:
            indexSearcher.search(query, c2);
        }
        TopGroups<String> tg = c2.getTopGroups(withinGroupOffset);
        GroupDocs<String>[] gds = tg.groups;
        // Map each group value to the number of hits in that group.
        for (GroupDocs<String> gd : gds) {
            map.put(gd.groupValue, gd.totalHits);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    long end = System.currentTimeMillis();
    System.out.println("group by time :" + (end - begin) + "ms");
    return map;
}

Parameter notes:
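- groupField: the field to group by
- groupSort: the sort order of the groups
- topNGroups: the maximum number of groups
- groupOffset: offset used for paging through groups
- withinGroupSort: the sort order of results within a group
- maxDocsPerGroup: the maximum number of results per group
- withinGroupOffset: offset used for paging within a group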
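To show how these parameters interact, here is a small plain-Java model of the two-pass approach used in the example. It is a sketch under stated assumptions, not Lucene code: TwoPassGroupingSketch and all of its members are hypothetical names, hits live in an in-memory list, both sorts are fixed to descending score, and the paging rules (keep the best topNGroups groups and then skip groupOffset of them; keep the best maxDocsPerGroup docs and then skip withinGroupOffset of them) reflect my reading of the collectors rather than a verified specification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of two-pass grouping; these are not Lucene types. */
final class TwoPassGroupingSketch {

    /** A scored hit together with the value of its group field. */
    static final class Hit {
        final int docId;
        final String group;
        final float score;

        Hit(int docId, String group, float score) {
            this.docId = docId;
            this.group = group;
            this.score = score;
        }
    }

    /** One group: how many hits it has in total, plus the requested page of its docs. */
    static final class GroupResult {
        final String groupValue;
        final int totalHits;
        final List<Hit> docs;

        GroupResult(String groupValue, int totalHits, List<Hit> docs) {
            this.groupValue = groupValue;
            this.totalHits = totalHits;
            this.docs = docs;
        }
    }

    /** Pass 1: rank groups by their best score, keep topNGroups, then skip groupOffset. */
    static List<String> firstPass(List<Hit> hits, int groupOffset, int topNGroups) {
        Map<String, Float> bestScore = new LinkedHashMap<>();
        for (Hit h : hits) {
            bestScore.merge(h.group, h.score, Math::max);
        }
        List<String> groups = new ArrayList<>(bestScore.keySet());
        // Stable sort, so groups with equal best scores keep first-seen order.
        groups.sort((a, b) -> Float.compare(bestScore.get(b), bestScore.get(a)));
        int to = Math.min(topNGroups, groups.size());
        int from = Math.min(groupOffset, to);
        return new ArrayList<>(groups.subList(from, to));
    }

    /** Pass 2: for each selected group, count its hits and keep one page of its best docs. */
    static List<GroupResult> secondPass(List<Hit> hits, List<String> topGroups,
                                        int withinGroupOffset, int maxDocsPerGroup) {
        Map<String, List<Hit>> byGroup = new LinkedHashMap<>();
        for (String g : topGroups) {
            byGroup.put(g, new ArrayList<>());
        }
        for (Hit h : hits) {
            List<Hit> bucket = byGroup.get(h.group);
            if (bucket != null) {
                bucket.add(h); // hits from groups not selected in pass 1 are ignored
            }
        }
        List<GroupResult> out = new ArrayList<>();
        for (Map.Entry<String, List<Hit>> e : byGroup.entrySet()) {
            List<Hit> docs = e.getValue();
            docs.sort((a, b) -> Float.compare(b.score, a.score));
            int to = Math.min(maxDocsPerGroup, docs.size());
            int from = Math.min(withinGroupOffset, to);
            out.add(new GroupResult(e.getKey(), docs.size(), new ArrayList<>(docs.subList(from, to))));
        }
        return out;
    }
}
```

The second pass needs the full hit stream again, which is why the example above wraps the first collector in a CachingCollector: replaying cached hits avoids running the query twice. In this model the hit list itself plays the role of that cache.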
References
https://blog.csdn.net/wyyl1/article/details/7388241
Reposted from: https://www.cnblogs.com/davidwang456/p/10000765.html