Is Java's indexOf function more efficient than Rabin-Karp? Search efficiency of text
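A few weeks ago I asked a question on Stack Overflow about how to create an efficient algorithm for searching for a pattern in a large body of text. Right now I am using the String function indexOf to do the search. One suggestion was to use Rabin-Karp as an alternative. I wrote a small test program, shown below, to test an implementation of Rabin-Karp.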
import java.util.Calendar;

public static void main(String[] args) {
    String test = "Mary had a little lamb whose fleece was white as snow";
    String p = "was";

    // Time 200,000 calls to String.indexOf
    long start = Calendar.getInstance().getTimeInMillis();
    for (int x = 0; x < 200000; x++)
        test.indexOf(p);
    long end = Calendar.getInstance().getTimeInMillis();
    end = end - start;
    System.out.println("Standard Java Time->" + end);

    // Time 200,000 calls to the Rabin-Karp search on the same text
    RabinKarp searcher = new RabinKarp("was");
    start = Calendar.getInstance().getTimeInMillis();
    for (int x = 0; x < 200000; x++)
        searcher.search(test);
    end = Calendar.getInstance().getTimeInMillis();
    end = end - start;
    System.out.println("Rabin Karp time->" + end);
}
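Timing loops like the one above can be distorted by JIT warm-up and by the compiler discarding calls whose results are never used. As a rough sketch only (the class name SearchTiming and the helper timeNanos are hypothetical, not part of the original test), the same measurement could be taken with System.nanoTime, a warm-up pass, and a running sum that keeps the results live:

```java
import java.util.function.IntSupplier;

public class SearchTiming {
    // Accumulates search results so the calls cannot be treated as dead code.
    public static long sink;

    // Hypothetical helper: runs `search` `iterations` times, returns elapsed nanoseconds.
    public static long timeNanos(IntSupplier search, int iterations) {
        long total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += search.getAsInt();
        }
        long elapsed = System.nanoTime() - start;
        sink += total;
        return elapsed;
    }

    public static void main(String[] args) {
        String test = "Mary had a little lamb whose fleece was white as snow";
        String p = "was";
        IntSupplier javaSearch = () -> test.indexOf(p);

        timeNanos(javaSearch, 200000);               // warm-up pass, timing ignored
        long nanos = timeNanos(javaSearch, 200000);  // measured pass
        System.out.println("Standard Java Time (ms)->" + nanos / 1_000_000.0);
    }
}
```

Any other search can be timed the same way by passing a different lambda, for example one that calls searcher.search(test).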
Here is the implementation of Rabin-Karp that I am using:
import java.math.BigInteger;
import java.util.Random;

public class RabinKarp {
    private String pat;      // the pattern (needed only for the Las Vegas check)
    private long patHash;    // pattern hash value
    private int M;           // pattern length
    private long Q;          // a large prime, small enough to avoid long overflow
    private int R;           // radix
    private long RM;         // R^(M-1) % Q
    // Cached hash of the first M characters of the text. It is static and never
    // reset, so it is only valid while every search() call receives the same text.
    static private long dochash = -1L;

    public RabinKarp(int R, char[] pattern) {
        throw new RuntimeException("Operation not supported yet");
    }

    public RabinKarp(String pat) {
        this.pat = pat;      // save pattern (needed only for the Las Vegas check)
        R = 256;
        M = pat.length();
        Q = longRandomPrime();
        // precompute R^(M-1) % Q for use in removing the leading digit
        RM = 1;
        for (int i = 1; i <= M - 1; i++)
            RM = (R * RM) % Q;
        patHash = hash(pat, M);
    }

    // Compute hash for key[0..M-1].
    private long hash(String key, int M) {
        long h = 0;
        for (int j = 0; j < M; j++)
            h = (R * h + key.charAt(j)) % Q;
        return h;
    }

    // Las Vegas version: does pat[] match txt[i..i+M-1] ?
    private boolean check(String txt, int i) {
        for (int j = 0; j < M; j++)
            if (pat.charAt(j) != txt.charAt(i + j))
                return false;
        return true;
    }

    // Return the offset of the first exact match of the pattern in txt, or -1.
    public int search(String txt) {
        int N = txt.length();
        if (N < M)
            return -1;
        long txtHash;
        if (dochash == -1L) {
            txtHash = hash(txt, M);
            dochash = txtHash;
        } else
            txtHash = dochash;

        // check for match at offset 0
        if ((patHash == txtHash) && check(txt, 0))
            return 0;

        // check for hash match; if hash match, check for exact match
        for (int i = M; i < N; i++) {
            // Remove leading digit, add trailing digit, check for match.
            txtHash = (txtHash + Q - RM * txt.charAt(i - M) % Q) % Q;
            txtHash = (txtHash * R + txt.charAt(i)) % Q;

            // match
            int offset = i - M + 1;
            if ((patHash == txtHash) && check(txt, offset))
                return offset;
        }

        // no match
        return -1; // was N
    }
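
    // a random 31-bit prime; probablePrime is needed here because
    // new BigInteger(31, rnd) yields a random number that is usually not prime
    private static long longRandomPrime() {
        BigInteger prime = BigInteger.probablePrime(31, new Random());
        return prime.longValue();
    }
}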
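The core of the search loop above is the rolling update: subtract the leading character's contribution, multiply by the radix, and add the new trailing character. The following sketch checks that this update gives the same value as rehashing each window from scratch. It is an illustration only: the class name RollingHashCheck and its helpers are hypothetical, and it assumes the fixed prime 2^31 - 1 in place of a random one so that runs are repeatable.

```java
public class RollingHashCheck {
    public static final int R = 256;            // radix, as in the class above
    public static final long Q = 2147483647L;   // assumption: fixed prime 2^31 - 1

    // R^(m-1) % Q, the weight of the leading character in an m-character window
    public static long leadWeight(int m) {
        long rm = 1;
        for (int i = 1; i <= m - 1; i++)
            rm = (R * rm) % Q;
        return rm;
    }

    // Horner's-rule hash of s[from .. from+m-1]
    public static long hash(String s, int from, int m) {
        long h = 0;
        for (int j = 0; j < m; j++)
            h = (R * h + s.charAt(from + j)) % Q;
        return h;
    }

    // One rolling step: drop s[i-m] from the front, append s[i] at the back
    public static long roll(long h, String s, int i, int m, long rm) {
        h = (h + Q - rm * s.charAt(i - m) % Q) % Q;
        return (h * R + s.charAt(i)) % Q;
    }

    public static void main(String[] args) {
        String txt = "fleece was white";
        int m = 3;
        long rm = leadWeight(m);
        long h = hash(txt, 0, m);
        for (int i = m; i < txt.length(); i++) {
            h = roll(h, txt, i, m, rm);
            long recomputed = hash(txt, i - m + 1, m);
            System.out.println((i - m + 1) + ": rolled=" + h + " recomputed=" + recomputed);
        }
    }
}
```

Each printed line should show the rolled and recomputed values agreeing, which is what lets the search avoid rehashing M characters at every offset.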
The Rabin-Karp implementation works, in that it returns the correct offset of the string I am looking for. What surprised me, though, were the timing statistics from running the test program. Here they are:
Standard Java Time->39
Rabin Karp time->409
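This is really surprising. Not only is Rabin-Karp (at least as implemented here) not faster than the standard Java indexOf String function, it is slower by an order of magnitude. I don't know what is wrong (if anything). Does anyone have thoughts on this?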
Thanks,
Elliott
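One factor worth checking, offered as a hypothesis rather than a diagnosis: for every text character, the Rabin-Karp loop above performs two multiplications and three modulo operations on long values, whereas a straightforward scan usually needs a single character comparison per offset. The sketch below is a plain brute-force search written for illustration. It is not the JDK's source, and the names NaiveSearch and naiveIndexOf are hypothetical. It counts character comparisons so that the per-character work can be set against the hash arithmetic.

```java
public class NaiveSearch {
    // Character comparisons performed by the most recent naiveIndexOf call.
    public static long comparisons;

    // Brute-force search: try each offset, compare left to right, stop at the first mismatch.
    public static int naiveIndexOf(String txt, String pat) {
        comparisons = 0;
        int n = txt.length(), m = pat.length();
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            while (j < m) {
                comparisons++;
                if (txt.charAt(i + j) != pat.charAt(j)) break;
                j++;
            }
            if (j == m) return i;   // all m characters matched at offset i
        }
        return -1;
    }

    public static void main(String[] args) {
        String test = "Mary had a little lamb whose fleece was white as snow";
        int offset = naiveIndexOf(test, "was");
        System.out.println("offset=" + offset
                + " comparisons=" + comparisons
                + " agrees with indexOf: " + (offset == test.indexOf("was")));
    }
}
```

On a short text with a short pattern, most offsets are rejected after one comparison, so the brute-force scan does very little work per character. Rabin-Karp's advantage is in its worst-case behavior on long patterns with many partial matches, which this test input does not exercise.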