Lucene 3.5 Study Notes 02: Creating an Index and Running a Search

Published: 2025/3/14

First, a rough look at how Lucene is organized.

From the point of view of an external application, the index module (index) and the search module (search) are the two main entry points.

org.apache.lucene.search	search entry point
org.apache.lucene.index	indexing entry point
org.apache.lucene.analysis	language analyzers
org.apache.lucene.queryParser	query parser
org.apache.lucene.document	document (storage) structures
org.apache.lucene.store	low-level I/O and storage
org.apache.lucene.util	shared utility data structures

Next, let's build the simplest possible file-search example.

First, create two empty folders on the machine:
E:\lucene\data	holds the data, that is, the files to be searched
E:\lucene\index	holds the index files that Lucene creates for that data

Then add some dummy data:
E:\lucene\data\1.txt	content: a1a2a3
E:\lucene\data\2.txt	content: b1b2b3
E:\lucene\data\3.txt	content: c1c2c3 honor
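
The setup above can also be scripted. The following is only a sketch using the JDK: the class name SampleData and its create method are made up for this illustration, and the root directory is a parameter so the same code works outside E:\lucene.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the original post): builds the sample layout.
class SampleData {

    /** Creates root/data with the three sample files and an empty root/index. */
    static void create(Path root) throws IOException {
        Path data = Files.createDirectories(root.resolve("data"));
        Files.createDirectories(root.resolve("index"));

        // File name -> content, exactly as listed in the article
        Map<String, String> files = new LinkedHashMap<>();
        files.put("1.txt", "a1a2a3");
        files.put("2.txt", "b1b2b3");
        files.put("3.txt", "c1c2c3 honor");

        for (Map.Entry<String, String> e : files.entrySet()) {
            Files.write(data.resolve(e.getKey()),
                        e.getValue().getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        // Root directory: first argument if given, otherwise the article's E:/lucene
        create(Paths.get(args.length > 0 ? args[0] : "E:/lucene"));
    }
}
```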

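Building the index
import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public static void createIndex(String filePath, String indexPath) throws IOException {
    Version version = Version.LUCENE_35;
    File indexFile = new File(indexPath);
    FSDirectory directory = FSDirectory.open(indexFile);
    // The default open mode is CREATE_OR_APPEND, so running this twice
    // against the same index directory indexes every file twice.
    IndexWriterConfig conf = new IndexWriterConfig(version, new SimpleAnalyzer(version));
    IndexWriter writer = new IndexWriter(directory, conf);
    try {
        // FileList and FileText are the author's own helper classes (not shown in this post)
        List<File> files = FileList.getFiles(filePath); // all files under filePath
        for (File file : files) {
            System.out.println("Indexing file " + file);
            // Build one Document per file
            Document doc = new Document();
            doc.add(new Field("filename", file.getName(), Field.Store.YES, Field.Index.ANALYZED));
            // The path is stored so it can be displayed, but it is not searchable
            doc.add(new Field("uri", file.getPath(), Field.Store.YES, Field.Index.NO));
            String text = FileText.getText(file); // contents of the file
            // Index the file contents in the "text" field
            doc.add(new Field("text", text, Field.Store.YES, Field.Index.ANALYZED));
            // Write the document to the index
            writer.addDocument(doc);
        }
    } finally {
        // Close the index writer even if indexing fails
        writer.close();
    }
}

public static void main(String[] args) {
    String filePath = "E:/lucene/data";
    String indexPath = "E:/lucene/index";
    try {
        createIndex(filePath, indexPath);
    } catch (IOException e) {
        e.printStackTrace();
    }
}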
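
The indexing code calls FileList.getFiles and FileText.getText, which the post does not show. The sketch below is one guess at what such helpers could look like, using only the JDK. It assumes the files sit directly in the directory (no recursion) and are UTF-8 text; the author's real helpers may behave differently.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Assumed implementation: the original post does not include this class.
class FileList {

    /** Returns the regular files directly under dirPath, sorted by path. */
    static List<File> getFiles(String dirPath) {
        List<File> result = new ArrayList<>();
        File[] entries = new File(dirPath).listFiles();
        if (entries == null) {
            return result; // not a directory, or it could not be read
        }
        Arrays.sort(entries);
        for (File f : entries) {
            if (f.isFile()) { // skip subdirectories
                result.add(f);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Prints each file under the given directory together with its content
        for (File f : getFiles(args.length > 0 ? args[0] : ".")) {
            System.out.println(f + ": " + FileText.getText(f));
        }
    }
}

// Assumed implementation: the original post does not include this class.
class FileText {

    /** Reads the whole file as UTF-8 text (an assumption about the encoding). */
    static String getText(File file) throws IOException {
        return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    }
}
```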
At this point Lucene has generated the index files under E:\lucene\index\.

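Searching the index
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public static void search(String keyword, String indexPath)
        throws CorruptIndexException, IOException, ParseException {
    Version version = Version.LUCENE_35;
    // Open a searcher on the index directory
    File indexFile = new File(indexPath);
    FSDirectory directory = FSDirectory.open(indexFile);
    IndexReader reader = IndexReader.open(directory);
    IndexSearcher searcher = new IndexSearcher(reader);
    try {
        // Query parser: search the "text" field with the same analyzer used for indexing
        QueryParser parser = new QueryParser(version, "text", new SimpleAnalyzer(version));
        Query query = parser.parse(keyword);
        // The top 10 hits are returned as a TopDocs
        TopDocs hits = searcher.search(query, null, 10);
        // TopDocs gives access to the matching documents and their scores
        System.out.println(hits.totalHits + " total results");
        System.out.println("-----Matches------");
        ScoreDoc[] scoredocs = hits.scoreDocs;
        for (int i = 0; i < scoredocs.length; i++) {
            ScoreDoc scoreDoc = scoredocs[i];
            Document d = searcher.doc(scoreDoc.doc);
            String path = d.get("uri");
            System.out.println(i + "--score:" + scoreDoc.score + " file path:" + path);
        }
    } finally {
        searcher.close();
        // A searcher built from an IndexReader does not close that reader itself
        reader.close();
    }
}

public static void main(String[] args) {
    String indexPath = "E:/lucene/index";
    try {
        // Search for the keyword "honor"
        search("honor", indexPath);
    } catch (CorruptIndexException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (ParseException e) {
        e.printStackTrace();
    }
}

The console prints:

1 total results
-----Matches------
0--score:0.70273256 file path:E:\lucene\data\3.txt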
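
The score 0.70273256 can be reproduced by hand. The sketch below is not Lucene code; it re-implements, under stated assumptions, the classic TF-IDF formula used by Lucene 3.x's DefaultSimilarity for a single-term query with no boosts. It assumes the index holds exactly the three sample documents, that "honor" occurs once in one of them, and that SimpleAnalyzer splits "c1c2c3 honor" into four tokens (c, c, c, honor). The class and method names are made up for this illustration.

```java
// Hand calculation of a single-term score, assuming DefaultSimilarity and no boosts.
class ScoreSketch {

    /** idf = 1 + ln(numDocs / (docFreq + 1)) */
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    /** Field length norm = 1 / sqrt(number of terms in the field) */
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    static float singleTermScore(int termFreq, int docFreq, int numDocs, int numTerms) {
        float idf = idf(docFreq, numDocs);
        float tf = (float) Math.sqrt(termFreq);
        // With one query term, queryNorm = 1 / sqrt(idf^2), so the query weight is about 1
        float queryNorm = (float) (1.0 / Math.sqrt(idf * idf));
        float queryWeight = idf * queryNorm;
        return queryWeight * tf * idf * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // "honor": termFreq 1, found in 1 of 3 documents, field has 4 terms
        System.out.println(singleTermScore(1, 1, 3, 4));
    }
}
```

Here idf is 1 + ln(3/2), roughly 1.4055, and the length norm is 1/sqrt(4) = 0.5, so the product is roughly 0.7027, in line with the score printed above. Lucene stores norms in a lossy one-byte encoding, so for other field lengths a hand calculation like this may differ slightly from the reported score.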

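As you can see, implementing search with Lucene is quite simple.
Since no Chinese text is involved here, an analyzer bundled with Lucene is enough.
Chinese text would need a Chinese word segmenter, which I will study next.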
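
To see why the bundled analyzer is enough here but not for Chinese, it helps to look at what SimpleAnalyzer does: it splits text at non-letter characters and lowercases the result. The sketch below is a JDK-only approximation of that behaviour, not Lucene's own code, and the class name is made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Approximation of SimpleAnalyzer: split at non-letters, lowercase each token.
class LetterLowercaseSketch {

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (Character.isLetter(ch)) {
                current.append(Character.toLowerCase(ch));
            } else if (current.length() > 0) {
                // A digit, space, or punctuation mark ends the current token
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("c1c2c3 honor")); // prints [c, c, c, honor]
    }
}
```

Digits count as separators, so "c1c2c3 honor" yields the tokens c, c, c and honor, which is why the query for honor matches 3.txt. Chinese characters are all letters as far as Character.isLetter is concerned, so an unbroken run of Chinese text would come out as one long token; that is the gap a Chinese word segmenter fills.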

Reposted from: https://www.cnblogs.com/hercules9/archive/2012/03/04/2461396.html