Sphinx全文检索引擎测试
數據表
1、documents
CREATE TABLE `documents` (
? `id` int(13) NOT NULL auto_increment,
? `group_id` int(11) NOT NULL,
? `group_id2` int(11) NOT NULL,
? `date_added` datetime NOT NULL,
? `title` varchar(255) NOT NULL,
? `content` text NOT NULL,
? PRIMARY KEY ?(`id`),
? KEY `title` (`title`),
? FULLTEXT KEY `content` (`content`)
) ENGINE=MyISAM? DEFAULT CHARSET=utf8 AUTO_INCREMENT=500006 ;
2、Sphinx
CREATE TABLE `sphinx` (
? `id` int(13) NOT NULL,
? `weight` int(11) NOT NULL,
? `query` varchar(255) NOT NULL,
? KEY `Query` (`query`)
) ENGINE=SPHINX DEFAULT CHARSET=utf8;
?
?
本次測試,在sphinx.conf中共建立三個索引。
1、dramx?? 對 documents表的title、content字段均建立索引。
2、dramx_title? 對 documents表的title字段建立索引。
3、dramx_content? 對 documents表的content字段建立索引。
三個索引在具體測試時,分別被用到。
?
?
?
程序調用Sphinx全文檢索引擎有兩種方式:
一、Php調用 API
二、安裝SphinxSE.
安裝方法:
1、? 在官網下載mysql-5.0.45-sphinxse-r871-win32.zip解壓。
2、? 停掉本地mysql所有服務。并將解壓后的文件覆蓋到mysql相應目錄。
3、? 啟動mysql ,建立一個至少含有三個字段的表(sphinx),字段數據類型依次是int、int、varchar,
Storage Engine選擇sphinx即可。
4、? 查詢時結合表sphinx進行聯合查詢。
?
下面先將php+mysql和 Php+ Sphinx 通過like 、order by 、group by進行測試比較。最后單獨給出使用第二種方法SphinxSE的查詢測試。
?
?
測試環境:
內存:1GB? 、CPU:intel pentium(R) dual? CPU E2140
數據量:50萬條
?
?
| 測試項目 | Php+Mysql | Php + Sphinx 全文檢索引擎 | ||
| 所花時間(秒) | 代碼 | 所花時間 | 代碼 | |
| Title like ‘%Fm5%’ | 0.93 | mysql_query('set names utf8'); $sql = "select title? from? documents where? title? like '%Fm5%'"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; | 0.012 | $cl = new SphinxClient (); //設置連接Sphinx主機名與端口 $start = getmicrotime(); $cl->SetServer('localhost',3312); $cl->SetMatchMode(SPH_MATCH_ALL); //搜索模式均采用匹配所有查詢詞模式 $cl->SetLimits(0,100000); $res = $cl->Query('Fm5', 'dramx_title'); $end = getmicrotime(); print_r($res);? print $end-$start; |
| Content Like ‘%中國LED照明市場%’ | 11.75 | mysql_query('set names utf8'); $sql = "select title? from? documents where? content? like '%中國LED照明市場%'"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; | 0.020 | $cl = new SphinxClient (); //設置連接Sphinx主機名與端口 $start = getmicrotime(); $cl->SetServer('localhost',3312); $cl->SetMatchMode(SPH_MATCH_ALL); $cl->SetLimits(0,100000); $res = $cl->Query('中國LED照明市場', 'dramx_content'); $end = getmicrotime(); print_r($res);? print $end-$start; |
| MATCH (content) AGAINST ('無線網卡廠商') | 0.93 | mysql_query('set names utf8'); $sql = "select title? from documents where MATCH (content) AGAINST ('無線網卡廠商')"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; | 0.020 | $cl = new SphinxClient (); //設置連接Sphinx主機名與端口 $start = getmicrotime(); $cl->SetServer('localhost',3312); $cl->SetMatchMode(SPH_MATCH_ALL); $cl->SetLimits(0,100000); $res = $cl->Query('無線網卡廠商', 'dramx_content'); $end = getmicrotime(); print_r($res);? print $end-$start; |
| MATCH (content) AGAINST ('產業資訊') 并按照 id 排序(Order By id desc) | 1.50 | mysql_query('set names utf8'); $sql = "select title? from documents where MATCH (content) AGAINST ('產業資訊')? order? by? id? desc"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; | 0.031 | $cl = new SphinxClient (); //設置連接Sphinx主機名與端口 $start = getmicrotime(); $cl->SetServer('localhost',3312); $cl->SetMatchMode(SPH_MATCH_ALL); $cl->SetSortMode(SPH_SORT_EXTENDED, 'id desc'); $cl->SetLimits(0,100000); $res = $cl->Query('產業資訊', 'dramx'); $end = getmicrotime(); print_r($res);? print $end-$start; |
| MATCH (content) AGAINST ('專題報道') 并根據group_id分組(Group By group_id ) | 1.01 | mysql_query('set names utf8'); $sql = "select title? from documents where MATCH (content) AGAINST ('專題報道')? group? by? group_id"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; | 0.081 | $cl = new SphinxClient (); //設置連接Sphinx主機名與端口 $start = getmicrotime(); $cl->SetServer('localhost',3312); $cl->SetMatchMode(SPH_MATCH_ALL); $cl->SetGroupBy('group_id', SPH_SORT_ATTR_DESC,"@group? desc "); $cl->SetLimits(0,100000); $res = $cl->Query('專題報道臺', 'dramx'); $end = getmicrotime(); print_r($res);? print $end-$start; |
| Content Like ‘%中國LED照明市場%’? order by id desc | 52.54 | mysql_query('set names utf8'); $sql = "select title? from documents where content? like '%中國LED照明市場%'? order?? by? id? desc"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; | 0.031 | $cl = new SphinxClient (); //設置連接Sphinx主機名與端口 $start = getmicrotime(); $cl->SetServer('localhost',3312); $cl->SetMatchMode(SPH_MATCH_ALL); $cl->SetSortMode(SPH_SORT_EXTENDED, 'id desc'); $cl->SetLimits(0,100000); $res = $cl->Query(中國LED照明市場', 'dramx'); $end = getmicrotime(); print_r($res);? print $end-$start; |
| Content Like ‘%國際Dram報價平臺%’? group by group_id? desc | 40.05 | $sql = "select title? from documents where content? like '%中國LED照明市場%'? group?? by? group_id"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; | 0.081 | $cl = new SphinxClient (); //設置連接Sphinx主機名與端口 $start = getmicrotime(); $cl->SetServer('localhost',3312); $cl->SetMatchMode(SPH_MATCH_ALL); $cl->SetGroupBy('group_id', SPH_SORT_ATTR_DESC,"@group? desc "); $cl->SetLimits(0,100000); $res = $cl->Query('國際Dram報價平臺', 'dramx'); $end = getmicrotime(); print_r($res);? print $end-$start; |
?
針對php+mysql的查詢,mysql-nt進程占用CPU的峰值最高到50%, 一般在20%之間,而php+Sphinx由于查詢時根本沒有用到mysql引擎而是查詢自身的索引引擎,所以,在查詢上CPU的占用可以忽略。
?
?
SphinxSE性能測試
?
| 測試項目 | SphinxSE | |
| 所花時間(秒) | 代碼 | |
| Title ?包含 ‘Dramexchange 國際Dram報價平臺’ | 0.045 | mysql_query('set names utf8'); $sql = "select c.* from documents as c,sphinx as t where c.id=t.id and t.query='@title 國際Dram報價平臺;mode=extended'"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; |
| Content ?包含 ?國際Dram報價平臺 | 0.043 | mysql_query('set names utf8'); $sql = "select c.* from documents as c,sphinx as t where c.id=t.id and t.query='@content 國際Dram報價平臺;mode=extended'"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; |
| Title、Content 包含無線網卡廠商 | 0.016 | mysql_query('set names utf8'); $sql = "select c.* from documents as c? inner? join?? sphinx as t? on? c.id=t.id ?where ?t.query='包含無線網卡廠商'"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; |
| Title、Content 包含’中國LED照明市場’并按照group_id排序 | 0.052 | mysql_query('set names utf8'); $sql = "select c.* from documents as c? inner? join?? sphinx as t? on? c.id=t.id where t.query='中國LED照明市場;mode=extended;sort=extended:group_id desc'"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; |
| Title、Content 包含’中國LED照明市場’并按照group_id 分組 | 0.052 | mysql_query('set names utf8'); $sql = "select c.* from documents as c? inner? join?? sphinx as t? on? c.id=t.id where t.query='中國LED照明市場;mode=extended;groupby=attr:group_id'"; $start = getmicrotime(); $result = mysql_query($sql) or die(mysql_error()); $end = getmicrotime(); print mysql_num_rows($result).'<br>'; print $end-$start; |
謝謝!
轉載于:https://www.cnblogs.com/zl0372/p/sphinx_13.html
創作挑戰賽新人創作獎勵來咯,堅持創作打卡瓜分現金大獎總結
以上是生活随笔為你收集整理的Sphinx全文检索引擎测试的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: Java中的Enum(枚举)用法介绍
- 下一篇: 什么是事件冒泡?如何用jquery/js