ElasticSearch (Notes)

Published: 2023/12/3

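Introduction

This tutorial is based on ElasticSearch 7.6.1. Note that the ES7 API differs considerably from the ES6 API. The latest release when the tutorial was published was ES 7.6.2 (updated 2020-04-01).

ES is a tool for full-text search:

SQL: fuzzy matching with LIKE '%keyword%' is very slow on large data sets, and adding an index helps only a little (a pattern with a leading wildcard generally cannot use a B-tree index);
ElasticSearch: a search engine (the kind of search found on Baidu, GitHub, Taobao)
Topics covered in these notes:
- The IK analyzer
- Operating ES through its RESTful API
- CRUD
- Integrating ES with Spring Boot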

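Doug Cutting, creator of the Lucene library

Lucene: a library written in Java that adds full-text search to small and medium-sized applications;

Nutch: a web search application built on the Lucene core. It adds a crawler and web-related features, so its scope is broader than Lucene's (web-wide search rather than search inside one application)

Big data has to solve two problems, storage and computation (MapReduce):

In 2004, Doug Cutting built a distributed file storage system (NDFS) based on Google's GFS design;
In 2005, Doug Cutting implemented the MapReduce algorithm in the Nutch search engine;
After joining Yahoo, Doug Cutting combined MapReduce and NDFS to create Hadoop, and became known as the father of Hadoop;
The BigTable design was later brought into the Hadoop ecosystem as well (as HBase)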

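Back to the topic:

Lucene is an information-retrieval toolkit shipped as a jar; it is not a complete search engine system;
Lucene provides the index structures, tools for reading and writing indexes, ranking, search rules, and utility classes;
Relationship between Lucene and ES:
- ES wraps and extends Lucene and is fairly easy to get started with (the author finds it easier to pick up than Redis)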

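Elastic overview

A distributed full-text search engine that scales out well;

Near-real-time search: newly indexed data becomes searchable shortly after it is written;

ES exposes a RESTful API (accessed with GET, POST, DELETE, PUT);

ES also supports complex data analysis, as in the ELK stack (Elasticsearch + Logstash + Kibana)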

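Elastic vs Solr

While an index is being built, Solr suffers from I/O blocking and its query performance drops; Elastic has a clear advantage when indexing and querying happen at the same time;

The source material reports roughly a 50x speed-up after moving a conventional project to Elastic; no benchmark is given, so treat the figure as anecdotal;

Elastic works right after unpacking; Solr needs configuration

Solr relies on ZooKeeper for distributed coordination; Elastic has distributed coordination built in

Solr accepts more data formats (JSON, XML, CSV); Elastic accepts only JSON

Solr is described as having the richer built-in feature set

Solr is fast when querying existing data but slow while the index is being updated (inserts and deletes); Elastic is slower on purely static queries but faster for real-time search, which is why the notes associate it with sites such as Facebook and Sina

Solr is the solution for traditional search applications; Elastic suits newer real-time search applications

Solr is more mature; Elastic is currently evolving quickly;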

Environment setup (matching versions)

These notes follow the 狂神說 course and use version 7.6.x

Getting started

Requirements: JDK 1.8 or later, an ES client, and a UI tool
Keep the versions of all components consistent.

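Download

Download from the official website

On Windows, unzip the archive and it is ready to use

Directory layout:
bin: startup scripts
config: configuration files
  log4j2.properties: logging configuration
  jvm.options: JVM settings
  elasticsearch.yml: ES configuration, e.g. the default port 9200
lib: dependency jars
modules: functional modules
plugins: plugins, e.g. the IK plugin

Start ES, then open localhost:9200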

Visual UI: head

The elasticsearch-head plugin can be downloaded from GitHub

https://github.com/mobz/elasticsearch-head

npm install
npm run start   # starts the plugin at localhost:9100

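Fixing the cross-origin (CORS) problem

Edit elasticsearch.yml, then restart ES so the change takes effect

# allow cross-origin requests
http.cors.enabled: true
http.cors.allow-origin: "*"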

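Kibana: log analysis and a command console

ELK: a technology stack for log analysis
Note: download the Kibana version that matches your ES version; the UI language can be changed in the configuration file
Default address: localhost:5601

Localization (Chinese UI)

Set it in the configuration file XXX.yml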

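Core ES concepts

ES is document-oriented: everything is JSON
Comparison

Relational database	Elasticsearch
database	index (indices)
table	type (deprecated, to be removed)
row	document
column	field

Physical design
- Behind the scenes ES splits each index into multiple shards, and each shard can be moved between servers in the cluster;
Logical design
- Document: the smallest unit of data that is indexed and searched;
  - Self-contained: a document holds both keys and values (key: value)
  - Hierarchical: a document can contain nested documents (JSON objects)
- Type: a logical container for documents
- Index: comparable to a database
Inverted index
- ES uses an inverted index, with Lucene's implementation underneath, for fast full-text search: instead of scanning every document for a term, it maps each term to the list of documents that contain it.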
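
The inverted-index idea can be sketched in a few lines of plain Java. This is only a toy illustration, not how Lucene stores its index: the class name InvertedIndexSketch and the whitespace "analyzer" are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy inverted index (hypothetical class): maps each term to the ids of the documents containing it.
class InvertedIndexSketch {
    private final Map<String, List<Integer>> postings = new TreeMap<>();

    // Lowercase and split on whitespace; a stand-in for a real analyzer.
    void add(int docId, String text) {
        for (String term : text.toLowerCase().split("\\s+")) {
            if (term.isEmpty()) {
                continue;
            }
            List<Integer> ids = postings.computeIfAbsent(term, k -> new ArrayList<>());
            if (!ids.contains(docId)) {
                ids.add(docId); // record each document once per term
            }
        }
    }

    // Look the term up directly instead of scanning every document.
    List<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), new ArrayList<>());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "study every day");
        index.add(2, "good good study");
        System.out.println(index.search("study")); // [1, 2]
        System.out.println(index.search("good"));  // [2]
    }
}
```

A lookup touches only the posting list of the requested term, which is why search stays fast as the number of documents grows.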

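IK analyzer plugin

What the IK analyzer does:
- Splits a sentence into terms (tokenization)
- Recommended when the text is Chinese
- Two analysis modes: ik_smart (coarsest split, fewest tokens) and ik_max_word (finest-grained split)

4.1 Download and install

Download: https://github.com/medcl/elasticsearch-analysis-ik/releases

Create a folder named "ik" under the plugins directory of elasticsearch and unzip the archive into it;

Restart ES and watch the startup log: the ik plugin is now loaded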

ik_smart

Input:

GET _analyze
{
  "analyzer": "ik_smart",
  "text": "我是社會主義接班人"
}

Output:

ik_max_word

Input:

GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "我是社會主義接班人"
}

Output:

{
  "tokens": [
    {"token": "我", "start_offset": 0, "end_offset": 1, "type": "CN_CHAR", "position": 0},
    {"token": "是", "start_offset": 1, "end_offset": 2, "type": "CN_CHAR", "position": 1},
    {"token": "社會主義", "start_offset": 2, "end_offset": 6, "type": "CN_WORD", "position": 2},
    {"token": "社會", "start_offset": 2, "end_offset": 4, "type": "CN_WORD", "position": 3},
    {"token": "主義", "start_offset": 4, "end_offset": 6, "type": "CN_WORD", "position": 4},
    {"token": "接班人", "start_offset": 6, "end_offset": 9, "type": "CN_WORD", "position": 5},
    {"token": "接班", "start_offset": 6, "end_offset": 8, "type": "CN_WORD", "position": 6},
    {"token": "人", "start_offset": 8, "end_offset": 9, "type": "CN_CHAR", "position": 7}
  ]
}

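User dictionary

When special words (such as personal names) are not recognized as a single term, users can add them to a custom dictionary:

Restart ES and Kibana, then test again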

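REST style

5.1 Overview

REST is a set of architectural constraints and principles; an architecture that follows them is called RESTful.

Operations

Method	URL	Description

PUT	localhost:9200/index_name/type_name/doc_id	Create a document (with a specified id)
POST	localhost:9200/index_name/type_name	Create a document (with a random id)
POST	localhost:9200/index_name/type_name/doc_id/_update	Update a document
DELETE	localhost:9200/index_name/type_name/doc_id	Delete a document
GET	localhost:9200/index_name/type_name/doc_id	Get a document by id
POST	localhost:9200/index_name/type_name/_search	Search all documents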
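
As a rough illustration of how these operations look from code, the sketch below builds (but does not send) a few requests with the JDK's own java.net.http API (Java 11 or later). It assumes a local node on the default port 9200 and uses the typeless ES 7 endpoints (/{index}/_doc/{id} and /{index}/_update/{id}); the class EsRestSketch and its method names are made up for the example.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds requests against the typeless ES 7 document endpoints.
class EsRestSketch {
    // Assumption: a local single-node ES listening on the default HTTP port.
    static final String BASE = "http://localhost:9200";

    // PUT /{index}/_doc/{id}: create or replace the document with the given id.
    static HttpRequest putDocument(String index, String id, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/_doc/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // POST /{index}/_update/{id}: partial update, body of the form {"doc": {...}}.
    static HttpRequest updateDocument(String index, String id, String json) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/_update/" + id))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // DELETE /{index}/_doc/{id}: delete the document.
    static HttpRequest deleteDocument(String index, String id) {
        return HttpRequest.newBuilder(URI.create(BASE + "/" + index + "/_doc/" + id))
                .DELETE()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest put = putDocument("psz", "1", "{\"name\":\"psz\",\"age\":22}");
        System.out.println(put.method() + " " + put.uri()); // PUT http://localhost:9200/psz/_doc/1
    }
}
```

To actually execute a request, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) while ES is running.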

5.2 Test

1. Create a document (the index is created along with it): PUT /index_name/type_name/id
The default type is _doc

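Data types

Basic data types

String: text, keyword
Numeric: long, integer, short, byte, double, float, half_float, scaled_float
Date: date
Boolean: boolean
Binary: binary

Specifying field types

Create the mapping rules

PUT /test2
{
  "mappings": {
    "properties": {
      "name": {"type": "text"},
      "age": {"type": "long"},
      "birthday": {"type": "date"}
    }
  }
}

Output:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "test2"
}

If no type is specified, ES assigns a default type to each field (dynamic mapping)

View the index

GET test2

View ES information

GET _cat/

Update

1. The old way: overwrite the document directly with PUT
2. The current way:

POST /test1/_doc/1/_update
{
  "doc": {"name": "龐世宗"}
}

Delete the index

DELETE test1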

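Basic document operations (important)

Basic operations

Add data

PUT /psz/user/1
{
  "name": "psz",
  "age": 22,
  "desc": "偶像派程序員",
  "tags": ["暖", "帥"]
}

Get data

GET psz/user/1
=============== Output ===========
{
  "_index": "psz",
  "_type": "user",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "name": "psz",
    "age": 22,
    "desc": "偶像派程序員",
    "tags": ["暖", "帥"]
  }
}

Updating data with PUT

Updating data: POST _update is recommended

Not recommended

# Overwrites the whole document: fields missing from the body (age, desc, tags) are lost
PUT psz/user/1
{
  "name": "龐龐胖"
}

Recommended!

# Partial update: only name changes, the other fields are kept
POST psz/user/1/_update
{
  "doc": {"name": "龐龐胖"}
}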
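
The difference between the two update styles can be modelled with ordinary maps. This is a simplified sketch under stated assumptions: UpdateSemanticsSketch and its methods are invented for illustration, and the merge here is shallow, whereas the real _update endpoint also merges nested objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of "overwrite" versus "partial update" on a stored document.
class UpdateSemanticsSketch {
    // Full overwrite (PUT with a new body): the stored document becomes exactly the new body.
    static Map<String, Object> overwrite(Map<String, Object> stored, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    // Partial update (POST _update with a "doc" part): the given fields are merged into the stored document.
    static Map<String, Object> partialUpdate(Map<String, Object> stored, Map<String, Object> doc) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(doc);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("name", "psz");
        stored.put("age", 22);

        Map<String, Object> change = new LinkedHashMap<>();
        change.put("name", "new");

        System.out.println(overwrite(stored, change));     // {name=new}
        System.out.println(partialUpdate(stored, change)); // {name=new, age=22}
    }
}
```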

Simple search with GET

GET psz/user/1

Simple conditional query: a basic query built from the default mapping rules

GET psz/user/_search?q=name:龐世宗

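Complex queries

Query with the parameters in a JSON body

GET psz/user/_search
{
  "query": {
    "match": {"name": "龐世宗"}          // match on name
  },
  "_source": ["name", "age"],            // source filtering: return only name and age
  "sort": [{"age": {"order": "desc"}}],  // sort by age, descending
  "from": 0,                             // paging: offset of the first hit, counted from 0
  "size": 1                              // number of hits to return
}

When ES is later driven from Java, the objects and methods correspond to the keys used here
Front-end paging route: /search/{current}/{pagesize}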
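
Because "from" is an offset into the hit list rather than a page number, a route such as /search/{current}/{pagesize} needs a small conversion before the values reach the query. A minimal sketch, assuming 1-based page numbers (PagingSketch and fromOffset are hypothetical names):

```java
// Hypothetical helper: converts a 1-based page number into the 0-based "from" offset.
class PagingSketch {
    static int fromOffset(int current, int pageSize) {
        int page = Math.max(current, 1);  // treat page 0 or negative pages as page 1
        int size = Math.max(pageSize, 1); // at least one hit per page
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 10)); // 0
        System.out.println(fromOffset(3, 10)); // 20
    }
}
```

Passing the page number itself as "from" would shift every page by only one hit, so consecutive pages would overlap almost entirely.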

Boolean queries

must (like AND in MySQL): every condition must match

GET psz/user/_search
{
  "query": {
    "bool": {
      "must": [                          // behaves like AND
        {"match": {"name": "龐世宗"}},
        {"match": {"age": 22}}
      ]
    }
  }
}

should (like OR in MySQL)

GET psz/user/_search
{
  "query": {
    "bool": {
      "should": [                        // behaves like OR
        {"match": {"name": "龐世宗"}},
        {"match": {"age": 22}}
      ]
    }
  }
}

must_not (like NOT in MySQL)

Filter

GET psz/user/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"name": "龐世宗"}}
      ],
      "filter": [
        {"range": {"age": {"gt": 20}}}   // keep only documents with age greater than 20
      ]
    }
  }
}
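
The AND / OR / NOT reading of must, should and must_not can be mimicked with Java predicates. The sketch below is a simplification that ignores scoring and treats filter clauses like must clauses; BoolQuerySketch is a name invented for the example. It follows the default ES behaviour that should clauses become optional once a must (or filter) clause is present.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of bool-query matching (scoring is ignored).
class BoolQuerySketch {
    static boolean matches(Map<String, Object> doc,
                           List<Predicate<Map<String, Object>>> must,
                           List<Predicate<Map<String, Object>>> should,
                           List<Predicate<Map<String, Object>>> mustNot) {
        // must: every clause has to match (AND)
        boolean allMust = must.stream().allMatch(p -> p.test(doc));
        // must_not: no clause may match (NOT)
        boolean noneMustNot = mustNot.stream().noneMatch(p -> p.test(doc));
        // should: with no must clause, at least one should clause has to match (OR);
        // when must clauses exist, should clauses only affect the score
        boolean shouldOk = !must.isEmpty() || should.isEmpty()
                || should.stream().anyMatch(p -> p.test(doc));
        return allMust && noneMustNot && shouldOk;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("name", "psz", "age", 22);
        Predicate<Map<String, Object>> nameIsPsz = d -> "psz".equals(d.get("name"));
        Predicate<Map<String, Object>> olderThan60 = d -> (Integer) d.get("age") > 60;

        System.out.println(matches(doc, List.of(nameIsPsz, olderThan60), List.of(), List.of())); // false
        System.out.println(matches(doc, List.of(), List.of(nameIsPsz, olderThan60), List.of())); // true
        System.out.println(matches(doc, List.of(), List.of(), List.of(nameIsPsz)));              // false
    }
}
```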

Multi-condition queries

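Exact (term) queries

A term query looks up the given term directly in the inverted index, as an exact match.

About analysis:

term: looks up the exact value, without analysis

match: the query text is first parsed by the analyzer

About field types:

text: the value is parsed by the analyzer (split into terms)

keyword: the value is not split and is stored as a single term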
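
The interplay between term/match and text/keyword can be sketched with a toy analyzer. Everything here is an assumption made for illustration: the class TermVsMatchSketch, its method names, and the analyzer that simply lowercases and splits on whitespace (real analyzers, including IK, are far more elaborate).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of how text/keyword fields are indexed and how term/match look them up.
class TermVsMatchSketch {
    // Stand-in analyzer: lowercase, then split on whitespace.
    static Set<String> analyze(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().trim().split("\\s+")));
    }

    // text field: the value is analyzed and every resulting term is indexed.
    static Set<String> indexText(String value) {
        return analyze(value);
    }

    // keyword field: the whole value is indexed as one term.
    static Set<String> indexKeyword(String value) {
        return Collections.singleton(value);
    }

    // term query: the query string is looked up as-is.
    static boolean term(Set<String> indexedTerms, String query) {
        return indexedTerms.contains(query);
    }

    // match query on a text field: the query is analyzed first; any shared term is a hit.
    static boolean match(Set<String> indexedTerms, String query) {
        for (String t : analyze(query)) {
            if (indexedTerms.contains(t)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> text = indexText("Quick Brown Fox");
        Set<String> keyword = indexKeyword("Quick Brown Fox");

        System.out.println(term(text, "Quick Brown Fox"));    // false: only quick, brown, fox are indexed
        System.out.println(term(text, "quick"));              // true
        System.out.println(term(keyword, "Quick Brown Fox")); // true: the whole value is one term
        System.out.println(match(text, "QUICK dog"));         // true: "quick" is shared after analysis
    }
}
```

This is why a term query for a whole phrase usually finds nothing on a text field but works on a keyword field.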

Highlighting

GET psz/user/_search
{
  "query": {
    "match": {"name": "龐世宗"}
  },
  "_source": ["name", "age"],
  "sort": [{"age": {"order": "desc"}}],
  "highlight": {                         // highlighting
    "pre_tags": "<P>",                   // custom opening tag
    "post_tags": "</P>",                 // custom closing tag
    "fields": {"name": {}}               // fields to highlight
  }
}
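
Conceptually, highlighting wraps each matched term in the configured tags (the ES defaults are <em> and </em>). The sketch below imitates that with a plain string replacement; it is only an approximation, since the real highlighter works on analyzed tokens rather than raw substrings, and HighlightSketch is a made-up name.

```java
// Hypothetical illustration of what pre_tags / post_tags do to a matched term.
class HighlightSketch {
    static String highlight(String text, String term, String preTag, String postTag) {
        if (term.isEmpty()) {
            return text; // nothing to highlight
        }
        // Wrap every occurrence of the term in the opening and closing tag.
        return text.replace(term, preTag + term + postTag);
    }

    public static void main(String[] args) {
        System.out.println(highlight("java in action", "java", "<em>", "</em>"));
        // <em>java</em> in action
    }
}
```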

Spring Boot integration

Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/index.html

Creating a module (the new way)


1. Find the native dependency

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.6.1</version>
</dependency>

<properties>
    <java.version>1.8</java.version>
    <elasticsearch.version>7.6.1</elasticsearch.version>
</properties>

2. Find the client object

Initialization

A RestHighLevelClient instance needs a REST low-level client builder to be built as follows:

package com.kuang.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticSearchClientConfig {

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("localhost", 9200, "http"),
                        new HttpHost("localhost", 9201, "http")));
        return client;
    }
}

The high-level client will internally create the low-level client used to perform requests based on the provided builder. That low-level client maintains a pool of connections and starts some threads so you should close the high-level client when you are well and truly done with it and it will in turn close the internal low-level client to free those resources. This can be done through the close:

client.close();

In the rest of this documentation about the Java High Level Client, the RestHighLevelClient instance will be referenced as client.

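3. Examine the methods of the class

The versions must match! The ES version pulled in by default is 6.8.1; change it to the version installed locally.

Java configuration class

@Configuration // plays the role of an XML bean definition
public class EsConfig {

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http"))); // double-check this port: a wrong value cost the author some debugging time
        return client;
    }
}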

Index API operations

1. Create an index

@SpringBootTest
class EsApplicationTests {

    @Autowired
    @Qualifier("restHighLevelClient")
    private RestHighLevelClient restHighLevelClient;

    // Create an index
    @Test
    void testCreateIndex() throws IOException {
        // 1. Build the create-index request
        CreateIndexRequest request = new CreateIndexRequest("index_name");
        // 2. Execute it through the indices client and read the response
        CreateIndexResponse createIndexResponse =
                restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
        System.out.println(createIndexResponse);
    }
}

2. Check whether an index exists

@Test
void testExistIndex() throws IOException {
    GetIndexRequest request = new GetIndexRequest("index_name");
    boolean exist = restHighLevelClient.indices().exists(request, RequestOptions.DEFAULT);
    System.out.println(exist);
}

3. Delete an index

@Test
void deleteIndex() throws IOException {
    DeleteIndexRequest request = new DeleteIndexRequest("index_name");
    AcknowledgedResponse delete = restHighLevelClient.indices().delete(request, RequestOptions.DEFAULT);
    System.out.println(delete.isAcknowledged());
}

Document API operations

package com.kuang.pojo;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.springframework.stereotype.Component;

@Data
@AllArgsConstructor
@NoArgsConstructor
@Component
public class User {
    private String name;
    private int age;
}

1. Add a document

Import

<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.16</version>
</dependency>

// Add a document
@Test
void testAddDocument() throws IOException {
    // Create the object
    User user = new User("psz", 22);
    IndexRequest request = new IndexRequest("ppp");
    // Equivalent to: PUT /ppp/_doc/1
    request.id("1");
    request.timeout(TimeValue.timeValueSeconds(1));
    // Put the data into the request as JSON
    request.source(JSON.toJSONString(user), XContentType.JSON);
    // Send the request and read the response
    IndexResponse indexResponse = restHighLevelClient.index(request, RequestOptions.DEFAULT);
    System.out.println(indexResponse.toString());
    System.out.println(indexResponse.status());
}

2. Check whether a document exists

// Check whether the document exists: GET /index/_doc/1
@Test
void testIsExists() throws IOException {
    GetRequest getRequest = new GetRequest("ppp", "1");
    // Only existence matters, so do not fetch _source or stored fields
    getRequest.fetchSourceContext(new FetchSourceContext(false));
    getRequest.storedFields("_none_");
    boolean exists = restHighLevelClient.exists(getRequest, RequestOptions.DEFAULT);
    System.out.println(exists);
}

3. Get a document

// Get the document
@Test
void getDocument() throws IOException {
    GetRequest getRequest = new GetRequest("ppp", "1");
    GetResponse getResponse = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);
    System.out.println(getResponse.getSourceAsString());
    System.out.println(getResponse);
}
============== Output ==========================
{"age":22,"name":"psz"}
{"_index":"ppp","_type":"_doc","_id":"1","_version":2,"_seq_no":1,"_primary_term":1,"found":true,"_source":{"age":22,"name":"psz"}}

4. Update a document

// Update the document
@Test
void updateDocument() throws IOException {
    UpdateRequest updateRequest = new UpdateRequest("ppp", "1");
    updateRequest.timeout("1s");
    // Pass the object in as JSON
    User user = new User("新名字", 21);
    updateRequest.doc(JSON.toJSONString(user), XContentType.JSON);
    // Send the request and read the response
    UpdateResponse updateResponse = restHighLevelClient.update(updateRequest, RequestOptions.DEFAULT);
    System.out.println(updateResponse);
}

5. Delete a document

// Delete the document
@Test
void deleteDocument() throws IOException {
    DeleteRequest deleteRequest = new DeleteRequest("ppp", "1");
    deleteRequest.timeout("1s");
    DeleteResponse deleteResponse = restHighLevelClient.delete(deleteRequest, RequestOptions.DEFAULT);
    System.out.println(deleteResponse);
}

Bulk operations

Real projects will certainly need to process documents in large batches
If no id is set, a random id is generated

@Test
void testBulkRequest() throws IOException {
    BulkRequest bulkRequest = new BulkRequest();
    bulkRequest.timeout("10s"); // increase the timeout for large batches

    ArrayList<User> userList = new ArrayList<>();
    userList.add(new User("psz", 11));
    userList.add(new User("psz2", 12));
    userList.add(new User("psz3", 13));
    userList.add(new User("psz4", 14));
    userList.add(new User("psz5", 15));

    for (int i = 0; i < userList.size(); i++) {
        bulkRequest.add(new IndexRequest("ppp")
                .id("" + (i + 1))
                .source(JSON.toJSONString(userList.get(i)), XContentType.JSON));
    }
    // Send the request and read the response
    BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
    System.out.println(bulkResponse.hasFailures()); // false means every operation succeeded
}
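
For orientation, the REST _bulk endpoint behind such a request takes a newline-delimited body: each index operation is one action line followed by one line with the document source, and the body ends with a newline. The sketch below builds that body by hand for the same kind of data; BulkBodySketch is a hypothetical helper and it assumes every document is already single-line JSON.

```java
import java.util.List;

// Hypothetical helper: builds the newline-delimited body of a _bulk request.
class BulkBodySketch {
    static String bulkBody(String index, List<String> jsonDocs) {
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < jsonDocs.size(); i++) {
            // Action line: which operation, which index, which id (1, 2, 3, ...)
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(i + 1).append("\"}}\n");
            // Source line: the document itself
            body.append(jsonDocs.get(i)).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        System.out.print(bulkBody("ppp", List.of(
                "{\"name\":\"psz\",\"age\":11}",
                "{\"name\":\"psz2\",\"age\":12}")));
    }
}
```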

Search

/*
 * Search:
 *   request object:    SearchRequest
 *   condition builder: SearchSourceBuilder
 */
@Test
void testSearch() throws IOException {
    SearchRequest searchRequest = new SearchRequest("ppp");
    // Build the search conditions
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    // QueryBuilders is the helper for query conditions, e.g. an exact (term) query
    TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("name", "psz");
    sourceBuilder.query(termQueryBuilder);
    // Set the query timeout
    sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
    // Highlighting could be configured here
    // sourceBuilder.highlighter()
    searchRequest.source(sourceBuilder);
    SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    System.out.println(JSON.toJSONString(searchResponse.getHits()));
}

Project setup

1. Start ES and head-master, and create the index with head-master

This step is optional: the index is created automatically when data is first added

2. Import the dependencies Spring Boot needs

Note: the elasticsearch version must match the locally installed version, so also set a custom version in the pom (the elasticsearch.version property)

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.10.2</version>
</dependency>
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.73</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <optional>true</optional>
</dependency>

3. Static resources used by the project (modified)

Link: https://pan.baidu.com/s/1X1kwMHsDvML-0rBEJnUOdA
Extraction code: qjqy

4. Add the Spring Boot configuration (application.yml)

# change the port to 9090
server:
  port: 9090
# disable the thymeleaf cache
spring:
  thymeleaf:
    cache: false

5. Overall project structure

6. Add the static resources to the project

7. Add the ES client configuration class to Spring Boot

ElasticSearchClientConfig.java

package com.wu.config;

@Configuration
public class ElasticSearchClientConfig {

    @Bean
    public RestHighLevelClient restHighLevelClient() {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("127.0.0.1", 9200, "http")));
        return client;
    }
}

Scraping JD data with Jsoup

Scraping the data

1. Open the JD website and search for java

2. Press F12 to inspect the page and locate the elements that hold the books

3. Create HtmlParseUtil.java in the utils package and test the scraping

// Test run
public static void main(String[] args) throws IOException, InterruptedException {
    // Request URL
    String url = "https://search.jd.com/Search?keyword=java";
    // Parse the page (the Document returned by Jsoup corresponds to the browser's Document object)
    Document document = Jsoup.parse(new URL(url), 30000);
    // Look up by id; the DOM methods known from JavaScript are available here
    Element element = document.getElementById("J_goodsList");
    // Get all li elements
    Elements elements = element.getElementsByTag("li");
    // Counter
    int c = 0;
    // Read the content of each element; el is one li tag
    for (Element el : elements) {
        c++;
        // Note: attr("src") returns nothing here because JD lazy-loads its images
        String img = el.getElementsByTag("img").eq(0).attr("data-lazy-img");
        // Get the price, taking only the text of the first match
        String price = el.getElementsByClass("p-price").eq(0).text();
        String title = el.getElementsByClass("p-name").eq(0).text();
        String shopName = el.getElementsByClass("p-shop").eq(0).text();
        System.out.println("========================================");
        System.out.println(img);
        System.out.println(price);
        System.out.println(title);
        System.out.println(shopName);
    }
    System.out.println(c);
}

Test result

The results look fine, so the next step is to wrap the code in a utility class

4. Create a POJO entity class

Entity class Content.java

package com.wu.pojo;

@Data
@AllArgsConstructor
@NoArgsConstructor
public class Content {
    private String img;
    private String price;
    private String title;
    private String shopName;
    // more properties can be added as needed
}

Utility class HtmlParseUtil.java

package com.wu.utils;

@Component
public class HtmlParseUtil {

    public List<Content> parseJD(String keyword) throws IOException {
        List<Content> list = new ArrayList<>();
        // URL-encode the keyword so that non-ASCII search terms also work
        String url = "https://search.jd.com/Search?keyword=" + URLEncoder.encode(keyword, "UTF-8");
        Document document = Jsoup.parse(new URL(url), 30000);
        Element element = document.getElementById("J_goodsList");
        if (element == null) {
            // The page layout changed or the request was redirected: nothing to parse
            return list;
        }
        Elements elements = element.getElementsByTag("li");
        for (Element el : elements) {
            String img = el.getElementsByTag("img").eq(0).attr("data-lazy-img");
            String price = el.getElementsByClass("p-price").eq(0).text();
            String title = el.getElementsByClass("p-name").eq(0).text();
            String shopName = el.getElementsByClass("p-shopnum").eq(0).text();
            list.add(new Content(img, price, title, shopName));
        }
        return list;
    }
}

5. Service layer (no separate interface is written here)

ContentService.java

First write a method that puts the scraped data into ES

package com.wu.service;

// Business logic
@Service
public class ContentService {

    // Inject the client
    @Autowired
    @Qualifier("restHighLevelClient")
    private RestHighLevelClient client;

    // 1. Parse the data and put it into ES
    public boolean parseContent(String keyword) throws IOException {
        List<Content> contents = new HtmlParseUtil().parseJD(keyword);
        // Put the scraped data into ES
        BulkRequest request = new BulkRequest();
        request.timeout("2m");
        for (int i = 0; i < contents.size(); i++) {
            request.add(new IndexRequest("jd_goods")
                    .source(JSON.toJSONString(contents.get(i)), XContentType.JSON));
        }
        BulkResponse bulk = client.bulk(request, RequestOptions.DEFAULT);
        return !bulk.hasFailures();
    }
}

6. Create the controller in the controller package

ContentController.java

package com.wu.controller;

// Request handling
@RestController
public class ContentController {

    @Autowired
    private ContentService contentService;

    @GetMapping("/parse/{keyword}")
    public Boolean parse(@PathVariable("keyword") String keyword) throws IOException {
        return contentService.parseContent(keyword);
    }
}

7. Start the Spring Boot project and open the URL below to scrape the data and add it to ES

http://127.0.0.1:9090/parse/java

Implementing search

1. Add to ContentService.java

// 2. Basic search over the stored data
public List<Map<String, Object>> searchPage(String keyword, int pageNo, int pageSize) throws IOException {
    if (pageNo <= 1) {
        pageNo = 1;
    }
    if (pageSize <= 1) {
        pageSize = 1;
    }
    // Conditional search
    SearchRequest searchRequest = new SearchRequest("jd_goods");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    // Paging: "from" is the offset of the first hit, not the page number
    sourceBuilder.from((pageNo - 1) * pageSize).size(pageSize);
    // Exact (term) match
    TermQueryBuilder termQuery = QueryBuilders.termQuery("title", keyword);
    sourceBuilder.query(termQuery);
    sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
    // Execute the search
    searchRequest.source(sourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    // Parse the result
    List<Map<String, Object>> list = new ArrayList<>();
    for (SearchHit documentFields : searchResponse.getHits().getHits()) {
        list.add(documentFields.getSourceAsMap());
    }
    return list;
}

2. Add the search request to ContentController

@GetMapping("/search/{keyword}/{pageNo}/{pageSize}")
public List<Map<String, Object>> search(@PathVariable("keyword") String keyword,
                                        @PathVariable("pageNo") int pageNo,
                                        @PathVariable("pageSize") int pageSize) throws IOException {
    List<Map<String, Object>> list = contentService.searchPage(keyword, pageNo, pageSize);
    return list;
}

3. Open http://127.0.0.1:9090/search/java/1/10

OK, scraping and searching both work; the next step is connecting the front end

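Front-end interaction

1. Receive the data on the front end

index.html

1. Receive the data with Vue

2. Pass the data to the page with Vue

2. Open 127.0.0.1:9090 and search for java

OK, works perfectly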

Implementing keyword highlighting

1. Only the search method in ContentService.java needs to change

// 3. Search over the stored data with highlighting
public List<Map<String, Object>> searchPagehighlighter(String keyword, int pageNo, int pageSize) throws IOException {
    if (pageNo <= 1) {
        pageNo = 1;
    }
    if (pageSize <= 1) {
        pageSize = 1;
    }
    // Conditional search
    SearchRequest searchRequest = new SearchRequest("jd_goods");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    // Paging: "from" is the offset of the first hit, not the page number
    sourceBuilder.from((pageNo - 1) * pageSize).size(pageSize);
    // Exact (term) match
    TermQueryBuilder termQuery = QueryBuilders.termQuery("title", keyword);

    // ==================== Highlighting ====================
    HighlightBuilder highlightBuilder = new HighlightBuilder(); // highlight builder
    highlightBuilder.field("title");                            // field to highlight
    highlightBuilder.requireFieldMatch(false);                  // no need to highlight multiple fields
    highlightBuilder.preTags("<span style='color:red'>");       // prefix
    highlightBuilder.postTags("</span>");                       // suffix
    sourceBuilder.highlighter(highlightBuilder);                // attach the highlight builder to sourceBuilder

    sourceBuilder.query(termQuery);
    sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
    // Execute the search
    searchRequest.source(sourceBuilder);
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    // Parse the result
    List<Map<String, Object>> list = new ArrayList<>();
    for (SearchHit hit : searchResponse.getHits().getHits()) {
        Map<String, HighlightField> highlightFields = hit.getHighlightFields(); // all highlighted fields
        HighlightField title = highlightFields.get("title");                    // the field we want highlighted
        Map<String, Object> sourceAsMap = hit.getSourceAsMap();                 // the original result
        // Join the highlighted fragments
        if (title != null) {
            StringBuilder newTitle = new StringBuilder();
            for (Text text : title.fragments()) {
                newTitle.append(text);
            }
            sourceAsMap.put("title", newTitle.toString()); // replace the original content with the highlighted version
        }
        list.add(sourceAsMap);
    }
    return list;
}

2. Change the search request in the Controller

3. A problem appears

The highlighted field now carries the prefix and suffix tags, but the page shows them as literal text, which is not the result we want

4. Fixing the problem

Vue offers a convenient fix here (presumably binding the title with the v-html directive, so the tags are rendered as HTML instead of escaped text)

5. Perfect
