Elasticsearch in Plain Language 56: Data Modeling, Using the Path Hierarchy Tokenizer to Model a File System and Search Files
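Contents
- Overview
- Official documentation
- Example
- Simulation: building the file system data
- Testing path_hierarchy tokenization
- Requirement 1: find a file whose contents include "ES" and that sits in the directory /workspace/projects/helloworld
- Requirement 2: find all files under /workspace whose contents include "ES"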
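Overview
This is part 56 of my notes from the Elasticsearch course taught by 中華石杉.
Course URL: https://www.roncoo.com/view/55
Official documentation
In short, the path_hierarchy tokenizer tokenizes data that has a multi-level hierarchy, such as file system paths.
The relevant pages in the Elasticsearch reference are "Path Hierarchy Tokenizer" and "Path Hierarchy Tokenizer Examples".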
Example
Simulation: building the file system data
Create the index with a custom analyzer named paths that uses the path_hierarchy tokenizer:

    PUT /filesystem
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "paths": {
              "tokenizer": "path_hierarchy"
            }
          }
        }
      }
    }

Testing path_hierarchy tokenization

    POST filesystem/_analyze
    {
      "tokenizer": "path_hierarchy",
      "text": "/home/elasticsearch/image"
    }

Response:

    {
      "tokens": [
        {
          "token": "/home",
          "start_offset": 0,
          "end_offset": 5,
          "type": "word",
          "position": 0
        },
        {
          "token": "/home/elasticsearch",
          "start_offset": 0,
          "end_offset": 19,
          "type": "word",
          "position": 0
        },
        {
          "token": "/home/elasticsearch/image",
          "start_offset": 0,
          "end_offset": 25,
          "type": "word",
          "position": 0
        }
      ]
    }

In other words, the path_hierarchy tokenizer turns the path /a/b/c/d into the tokens /a, /a/b, /a/b/c and /a/b/c/d: one token for every ancestor directory plus the full path itself.
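To make the splitting rule concrete, here is a minimal Python sketch that imitates it. It is not Elasticsearch's implementation: the function name `path_hierarchy_tokens` is made up for illustration, and the sketch assumes the default `/` delimiter and ignores the tokenizer's other options (such as reversing or skipping levels).

```python
from typing import List


def path_hierarchy_tokens(path: str, delimiter: str = "/") -> List[str]:
    """Return every ancestor prefix of `path`, shortest first, then the path itself.

    Illustrative approximation of the path_hierarchy tokenizer's default
    behaviour; the function name and signature are hypothetical.
    """
    tokens: List[str] = []
    # Skip a leading delimiter so that "/home" is the first token, not "".
    start = 1 if path.startswith(delimiter) else 0
    idx = path.find(delimiter, start)
    while idx != -1:
        tokens.append(path[:idx])            # prefix up to (not including) this delimiter
        idx = path.find(delimiter, idx + 1)  # move on to the next level
    if path and path != delimiter:
        tokens.append(path)                  # the full path is always the last token
    return tokens


tokens = path_hierarchy_tokens("/home/elasticsearch/image")
print(tokens)
# ['/home', '/home/elasticsearch', '/home/elasticsearch/image']
```

The printed list matches the three tokens in the _analyze response above.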
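Requirement 1: find a file whose contents include "ES" and that sits in the directory /workspace/projects/helloworld
Declare the field types explicitly, then index one sample document. The path field is a keyword (for exact matching) with a sub-field path.tree that is analyzed by the paths analyzer defined earlier. These requests use a mapping type named file, which matches the Elasticsearch 6.x API; mapping types were removed in later versions.

    # Declare the field types
    PUT /filesystem/_mapping/file
    {
      "properties": {
        "name": {
          "type": "keyword"
        },
        "path": {
          "type": "keyword",
          "fields": {
            "tree": {
              "type": "text",
              "analyzer": "paths"
            }
          }
        }
      }
    }

    # Inspect the mapping
    GET /filesystem/_mapping

    # Index a document
    PUT /filesystem/file/1
    {
      "name": "README.txt",
      "path": "/workspace/projects/helloworld",
      "contents": "小工匠跟石杉老師學習ES"
    }

Query DSL for the requirement:

    # File search: find a file whose contents include ES
    # and that sits in the directory /workspace/projects/helloworld
    GET /filesystem/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "contents": "ES"
              }
            }
          ],
          "filter": {
            "term": {
              "path": "/workspace/projects/helloworld"
            }
          }
        }
      }
    }

Because path is a keyword field, the term filter only matches documents whose path is exactly this directory, so the query should return the document with id 1 indexed above.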
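The request body combines a scored full-text clause (must/match) with a non-scoring filter (filter/term). As a rough illustration, the same body can be assembled in Python. The helper name `build_file_search` and its `subtree` flag are hypothetical and not part of any Elasticsearch client; the sketch only builds the JSON body and does not send it anywhere.

```python
import json
from typing import Any, Dict


def build_file_search(text: str, directory: str, subtree: bool = False) -> Dict[str, Any]:
    """Build a bool query: full-text match on `contents`, term filter on the path.

    Hypothetical helper for illustration only.
    subtree=False filters on the keyword field `path` (exact directory);
    subtree=True filters on `path.tree` (the directory and everything below it).
    """
    field = "path.tree" if subtree else "path"
    return {
        "query": {
            "bool": {
                "must": [{"match": {"contents": text}}],
                "filter": {"term": {field: directory}},
            }
        }
    }


body = build_file_search("ES", "/workspace/projects/helloworld")
print(json.dumps(body, indent=2))
```

Passing `subtree=True` swaps the filter field to path.tree, the sub-field analyzed with the path_hierarchy tokenizer in the mapping above.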
Requirement 2: find all files under /workspace whose contents include "ES"
Index a few more documents:

    PUT /filesystem/file/2
    {
      "name": "README.txt",
      "path": "/workspace/projects",
      "contents": "小工匠跟石杉老師學習ES"
    }

    PUT /filesystem/file/3
    {
      "name": "README.txt",
      "path": "/workspace/xxxxx",
      "contents": "小工匠跟石杉老師學習ES"
    }

    PUT /filesystem/file/4
    {
      "name": "README.txt",
      "path": "/home/artisan",
      "contents": "小工匠跟石杉老師學習ES"
    }

    PUT /filesystem/file/5
    {
      "name": "README.txt",
      "path": "/workspace",
      "contents": "小工匠跟石杉老師學習ES"
    }

Query DSL for the requirement. The key part is the filter "path.tree": "/workspace". The term query is not analyzed, so it looks for the exact token /workspace among the tokens that path_hierarchy produced for each document's path at index time.

    GET filesystem/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "contents": "ES"
              }
            }
          ],
          "filter": {
            "term": {
              "path.tree": "/workspace"
            }
          }
        }
      }
    }

Response:

    {
      "took": 8,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 4,
        "max_score": 0.2876821,
        "hits": [
          {
            "_index": "filesystem",
            "_type": "file",
            "_id": "5",
            "_score": 0.2876821,
            "_source": {
              "name": "README.txt",
              "path": "/workspace",
              "contents": "小工匠跟石杉老師學習ES"
            }
          },
          {
            "_index": "filesystem",
            "_type": "file",
            "_id": "1",
            "_score": 0.2876821,
            "_source": {
              "name": "README.txt",
              "path": "/workspace/projects/helloworld",
              "contents": "小工匠跟石杉老師學習ES"
            }
          },
          {
            "_index": "filesystem",
            "_type": "file",
            "_id": "3",
            "_score": 0.2876821,
            "_source": {
              "name": "README.txt",
              "path": "/workspace/xxxxx",
              "contents": "小工匠跟石杉老師學習ES"
            }
          },
          {
            "_index": "filesystem",
            "_type": "file",
            "_id": "2",
            "_score": 0.18232156,
            "_source": {
              "name": "README.txt",
              "path": "/workspace/projects",
              "contents": "小工匠跟石杉老師學習ES"
            }
          }
        ]
      }
    }

As expected, the document with id 4 (path /home/artisan) does not meet the requirement and is not returned.
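Why documents 1, 2, 3 and 5 match while document 4 does not can be checked with a small simulation. The sketch below is only a model of the behaviour, not how Elasticsearch stores or searches data: the names `path_hierarchy_tokens`, `term_on_tree` and `term_on_keyword` are made up for illustration, and the tokenizer is approximated with the default `/` delimiter.

```python
from typing import Dict, List


def path_hierarchy_tokens(path: str, delimiter: str = "/") -> List[str]:
    """Approximate path_hierarchy: every ancestor prefix, then the full path."""
    tokens: List[str] = []
    start = 1 if path.startswith(delimiter) else 0
    idx = path.find(delimiter, start)
    while idx != -1:
        tokens.append(path[:idx])
        idx = path.find(delimiter, idx + 1)
    if path and path != delimiter:
        tokens.append(path)
    return tokens


# The five paths indexed in this article, keyed by document id.
docs: Dict[int, str] = {
    1: "/workspace/projects/helloworld",
    2: "/workspace/projects",
    3: "/workspace/xxxxx",
    4: "/home/artisan",
    5: "/workspace",
}


def term_on_tree(target: str) -> List[int]:
    """Model a term filter on path.tree: target must equal one indexed token."""
    return sorted(i for i, p in docs.items() if target in path_hierarchy_tokens(p))


def term_on_keyword(target: str) -> List[int]:
    """Model a term filter on the keyword field path: whole value must be equal."""
    return sorted(i for i, p in docs.items() if p == target)


subtree_hits = term_on_tree("/workspace")
exact_hits = term_on_keyword("/workspace")
print(subtree_hits)  # [1, 2, 3, 5]
print(exact_hits)    # [5]
```

The contrast between the two results is the reason the mapping keeps both fields: path answers "exactly this directory", path.tree answers "this directory or anything beneath it".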