Java Regular Expressions Explained

Published: 2024/8/1

Java provides a powerful regular expression API in the java.util.regex package. This tutorial explains how to use it.

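Regular Expressions

A regular expression is a textual pattern used to search text. In other words, you search a text for occurrences of the pattern. For example, you could use a regular expression to search a web page for email addresses or hyperlinks.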

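Regular Expression Example

Below is a simple Java regular expression example that searches a text for the string http://

import java.util.regex.Pattern;

String text = "This is the text to be searched "
        + "for occurrences of the http:// pattern.";
String pattern = ".*http://.*";
// Pattern.matches() returns true only if the whole text matches the pattern
boolean matches = Pattern.matches(pattern, text);
System.out.println("matches = " + matches);

The example does not check whether the http:// it finds is part of a valid hyperlink, for instance one with a domain name and suffix (.com, .net and so on). It only checks whether the string http:// occurs in the text.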

The Java 6 Regular Expression API

This tutorial covers the regular expression API as of Java 6.

Pattern (java.util.regex.Pattern)

The class java.util.regex.Pattern, Pattern for short, is the main entry point into the Java regular expression API. Whenever you need to work with regular expressions, you start with the Pattern class.

Pattern.matches()

The most direct way to check whether a regular expression matches a text is to call the static method Pattern.matches(). For example:

String text = "This is the text to be searched "
        + "for occurrences of the pattern.";
String pattern = ".*is.*";
boolean matches = Pattern.matches(pattern, text);
System.out.println("matches = " + matches);

The code above checks whether the character sequence "is" occurs in the variable text, allowing zero or more characters before and after it (specified by .*).

Pattern.matches() is fine when you only need to check a pattern against a text once and the default settings of the Pattern class are appropriate.

If you need to find multiple occurrences, extract the different matched texts, or use non-default settings, obtain a Pattern instance via Pattern.compile().
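To make the difference concrete, here is a minimal sketch that contrasts Pattern.matches(), which must match the whole text, with a compiled Pattern whose Matcher is asked for every occurrence. The class and helper names (MatchesVersusFind, wholeTextMatches, countOccurrences) are invented for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchesVersusFind {
    // Hypothetical helper: true only if the ENTIRE text matches the regex.
    static boolean wholeTextMatches(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    // Hypothetical helper: counts non-overlapping occurrences via a compiled Pattern.
    static int countOccurrences(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "This is the text to be searched";
        System.out.println(wholeTextMatches("is", text));     // false
        System.out.println(wholeTextMatches(".*is.*", text)); // true
        System.out.println(countOccurrences("is", text));     // 2
    }
}
```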

Pattern.compile()

If you need to match a regular expression against a text multiple times, create a Pattern object with Pattern.compile(). For example:

String text = "This is the text to be searched "
        + "for occurrences of the http:// pattern.";
String patternString = ".*http://.*";
Pattern pattern = Pattern.compile(patternString);

You can also pass a special flag to the compile() method:

Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);

The Pattern class defines a number of flags (int constants) that control how a Pattern matches. The flag in the code above makes matching ignore case.
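Because the flags are int constants, several of them can be combined with the bitwise OR operator. The sketch below shows this with CASE_INSENSITIVE and DOTALL (which lets . match line terminators). The class FlagsDemo and its helper matchesWithFlags are hypothetical names used only here.

```java
import java.util.regex.Pattern;

class FlagsDemo {
    // Hypothetical helper: whole-text match using the given flags (0 means no flags).
    static boolean matchesWithFlags(String regex, int flags, String text) {
        return Pattern.compile(regex, flags).matcher(text).matches();
    }

    public static void main(String[] args) {
        String regex = ".*http://.*";
        String text = "See\nHTTP://example.com";

        // No flags: the case differs and . cannot cross the line break.
        System.out.println(matchesWithFlags(regex, 0, text)); // false

        // Ignoring case alone is not enough, the first .* still stops at the line break.
        System.out.println(matchesWithFlags(regex, Pattern.CASE_INSENSITIVE, text)); // false

        // Both flags combined with bitwise OR.
        System.out.println(matchesWithFlags(regex,
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL, text)); // true
    }
}
```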

Pattern.matcher()

Once you have a Pattern object, you can obtain a Matcher from it. A Matcher instance is used to match the pattern against a text. For example:

Matcher matcher = pattern.matcher(text);

The Matcher class has a matches() method that checks whether the text matches the pattern. Here is a complete example using a Matcher:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "This is the text to be searched "
        + "for occurrences of the http:// pattern.";
String patternString = ".*http://.*";
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(text);
boolean matches = matcher.matches();
System.out.println("matches = " + matches);

Pattern.split()

The split() method of the Pattern class uses the regular expression as a delimiter and splits a text into an array of Strings. Example:

String text = "A sep Text sep With sep Many sep Separators";
String patternString = "sep";
Pattern pattern = Pattern.compile(patternString);
String[] split = pattern.split(text);
System.out.println("split.length = " + split.length);
for (String element : split) {
    System.out.println("element = " + element);
}

This example splits the text into an array of 5 strings.
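The delimiter does not have to be a fixed string. As a small sketch, the pattern below treats a comma with optional surrounding whitespace as the separator, and also uses the split(CharSequence, int) overload to cap the number of pieces. SplitDemo, splitAll and splitAtMost are names chosen for this example only.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // A comma, optionally surrounded by whitespace.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Hypothetical helper: split on every delimiter.
    static String[] splitAll(String text) {
        return COMMA.split(text);
    }

    // Hypothetical helper: produce at most 'limit' pieces; the last piece keeps the rest.
    static String[] splitAtMost(String text, int limit) {
        return COMMA.split(text, limit);
    }

    public static void main(String[] args) {
        String text = "a , b,c ,  d";
        System.out.println(Arrays.toString(splitAll(text)));       // [a, b, c, d]
        System.out.println(Arrays.toString(splitAtMost(text, 2))); // [a, b,c ,  d]
    }
}
```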

Pattern.pattern()

The pattern() method of the Pattern class returns the regular expression the Pattern object was created from. Example:

String patternString = "sep";
Pattern pattern = Pattern.compile(patternString);
String pattern2 = pattern.pattern();

In the code above pattern2 has the value sep, the same as the patternString variable.

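Matcher (java.util.regex.Matcher)

The java.util.regex.Matcher class is used to search a text for multiple occurrences of a regular expression. A Matcher can also be used to match the same regular expression against several texts.

Matcher has many useful methods; see the official JavaDoc for the details. Only the core methods are covered here.

The following code shows how to use a Matcher:

String text = "This is the text to be searched "
        + "for occurrences of the http:// pattern.";
String patternString = ".*http://.*";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);
boolean matches = matcher.matches();

First a Pattern is created, then a Matcher is obtained from it, and then matches() is called. It returns true if the pattern matches and false if it does not.

A Matcher can do more than this.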

Creating a Matcher

A Matcher is created with the matcher() method of a Pattern.

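matches()

The matches() method of the Matcher class matches the regular expression against the whole text:

boolean matches = matcher.matches();

If the text matches the regular expression, matches() returns true; otherwise it returns false.

matches() cannot be used to find multiple occurrences of a regular expression. For that, use the find(), start() and end() methods.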

lookingAt()

lookingAt() is similar to matches(). The main difference is that lookingAt() matches the regular expression against the beginning of the text, whereas matches() matches it against the entire text. In other words, if the regular expression matches the beginning of the text but not the whole text, lookingAt() returns true and matches() returns false. Example:

String text = "This is the text to be searched "
        + "for occurrences of the http:// pattern.";
String patternString = "This is the";
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(text);
System.out.println("lookingAt = " + matcher.lookingAt());
System.out.println("matches = " + matcher.matches());

This example matches the regular expression "this is the" against the beginning of the text and against the whole text. The method that matches against the beginning, lookingAt(), returns true.

The method that matches against the whole text, matches(), returns false, because the text contains additional characters while the regular expression requires the text to be exactly "this is the", with nothing before or after it.
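As a side-by-side sketch, the helper below runs matches(), lookingAt() and find() with the same pattern and text so the three anchoring behaviours can be compared directly. AnchoringDemo and describe are illustrative names, not part of the regex API.

```java
import java.util.regex.Pattern;

class AnchoringDemo {
    // Hypothetical helper: reports how the three Matcher methods judge the same input.
    static String describe(String regex, String text) {
        Pattern pattern = Pattern.compile(regex);
        boolean whole = pattern.matcher(text).matches();     // entire text
        boolean prefix = pattern.matcher(text).lookingAt();  // beginning of text
        boolean anywhere = pattern.matcher(text).find();     // any position
        return "matches=" + whole + ", lookingAt=" + prefix + ", find=" + anywhere;
    }

    public static void main(String[] args) {
        String text = "This is the text";
        System.out.println(describe("This", text));             // matches=false, lookingAt=true, find=true
        System.out.println(describe("is the", text));           // matches=false, lookingAt=false, find=true
        System.out.println(describe("This is the text", text)); // matches=true, lookingAt=true, find=true
    }
}
```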

find() + start() + end()

find() searches the text for occurrences of the regular expression. The text is the one passed to Pattern.matcher(text) when the Matcher was created. If the text contains several matches, the first call to find() locates the first one, and each subsequent call locates the next.

start() and end() return the indexes in the text where the current match begins and ends. More precisely, end() returns the index of the character just after the last matched character, so the return values of start() and end() can be passed directly to String.substring().

String text = "This is the text which is to be searched "
        + "for occurrences of the word 'is'.";
String patternString = "is";
Pattern pattern = Pattern.compile(patternString);
Matcher matcher = pattern.matcher(text);
int count = 0;
while (matcher.find()) {
    count++;
    System.out.println("found: " + count + " : "
            + matcher.start() + " - " + matcher.end());
}

This example finds the pattern "is" 4 times in the text. The output is:

found: 1 : 2 - 4
found: 2 : 5 - 7
found: 3 : 23 - 25
found: 4 : 70 - 72
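Since end() is exclusive, the two indexes slot straight into String.substring(). A minimal sketch, with FindPositionsDemo and extractMatches as made-up names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindPositionsDemo {
    // Hypothetical helper: collects every matched substring using start() and end().
    static List<String> extractMatches(String regex, String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            // end() points one past the last matched character, as substring() expects.
            result.add(text.substring(matcher.start(), matcher.end()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extractMatches("\\d+", "order 12, item 345, qty 6")); // [12, 345, 6]
    }
}
```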

reset()

reset() resets the matching state kept inside the Matcher. Once find() has started matching, the Matcher records how far into the text it has searched. Calling reset() makes the search start from the beginning of the text again.

There is also a reset(CharSequence) method. It resets the Matcher and takes a new string as its argument, which replaces the string the Matcher was originally created with.
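The following sketch walks through both variants on one Matcher: it consumes a match, resets to start over, and then resets with a new text. ResetDemo and countRemaining are hypothetical names for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetDemo {
    // Hypothetical helper: counts the matches that find() still reports.
    static int countRemaining(Matcher matcher) {
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Matcher matcher = Pattern.compile("is").matcher("This is it");

        matcher.find();                              // first match, inside "This"
        System.out.println(matcher.start());         // 2
        System.out.println(countRemaining(matcher)); // 1, only the standalone "is" is left

        matcher.reset();                             // search the same text from the start
        System.out.println(countRemaining(matcher)); // 2

        matcher.reset("is is is");                   // same pattern, new text
        System.out.println(countRemaining(matcher)); // 3
    }
}
```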

group()

Suppose you want to search a text for URLs and extract the ones you find. You could do that with start() and end(), but the group() method makes it easier.

Groups are marked with parentheses in a regular expression, for example:

(John)

This regular expression matches John. The parentheses are not part of the text to match; they define a group. Once the regular expression has matched a text, you can access the part matched inside the group.

Use the group(int groupNo) method to access a group. A regular expression can have several groups, each marked by a pair of parentheses. To get the text matched by a particular group, pass the group's number to group(int groupNo).

group(0) is the text matched by the whole regular expression. Groups marked with parentheses are numbered starting from 1.

String text = "John writes about this, and John writes about that,"
        + " and John writes about everything. ";
String patternString1 = "(John)";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println("found: " + matcher.group(1));
}

The code above searches the text for the word John. From each match it extracts group 1, the part marked by the parentheses. The output is:

found: John
found: John
found: John

Multiple Groups

As mentioned above, a regular expression can have several groups, for example:

(John) (.+?) 

This expression matches the text John followed by a space, then one or more characters, and finally another space. The trailing space is easy to miss.

Some characters in this expression have a special meaning. The dot . means any character. The + means one or more times, so .+ means any character, one or more times. The ? after it makes the quantifier match as little text as possible.

The complete code:

String text = "John writes about this, and John Doe writes about that,"
        + " and John Wayne writes about everything.";
String patternString1 = "(John) (.+?) ";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println("found: " + matcher.group(1) + " " + matcher.group(2));
}

Note how the groups are referenced in the code. The output is:

found: John writes
found: John Doe
found: John Wayne
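Groups are handy whenever a match has several parts you want separately. As a sketch under invented assumptions (the key=value input format, the class GroupDemo and the helper parsePairs are all made up here), group 1 captures a key and group 2 its value:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Group 1 is the key, group 2 is the value.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=(\\w+)");

    // Hypothetical helper: collects key=value pairs in the order they appear.
    static Map<String, String> parsePairs(String text) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(text);
        while (matcher.find()) {
            pairs.put(matcher.group(1), matcher.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parsePairs("host=localhost port=8080")); // {host=localhost, port=8080}
    }
}
```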

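Nested Groups

Groups in a regular expression can be nested, for example:

((John) (.+?)) 

This is the expression from the previous example, now wrapped in one large group. (There is a space at the end of the expression.)

With nested groups, group numbers are assigned in the order of the opening parentheses. In the example above, group 1 is the large group, group 2 is the group containing John, and group 3 is the group containing .+?. Knowing this matters when you reference groups through group(int groupNo).

The following code shows how to use nested groups:

String text = "John writes about this, and John Doe writes about that,"
        + " and John Wayne writes about everything.";
String patternString1 = "((John) (.+?)) ";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
while (matcher.find()) {
    System.out.println("found: <" + matcher.group(1)
            + "> <" + matcher.group(2)
            + "> <" + matcher.group(3) + ">");
}

The output is:

found: <John writes> <John> <writes>
found: <John Doe> <John> <Doe>
found: <John Wayne> <John> <Wayne>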
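To see the numbering rule at work, the sketch below uses Matcher.groupCount() to list every group of the first match, starting with group 0 (the whole match). NestedGroupDemo and groupsOfFirstMatch are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NestedGroupDemo {
    // Hypothetical helper: returns group 0..groupCount() of the first match, or an empty list.
    static List<String> groupsOfFirstMatch(String regex, String text) {
        List<String> groups = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        if (matcher.find()) {
            // groupCount() excludes group 0, hence the <= in the loop condition.
            for (int i = 0; i <= matcher.groupCount(); i++) {
                groups.add(matcher.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Group 0 includes the trailing space required by the pattern.
        System.out.println(groupsOfFirstMatch("((John) (.+?)) ", "John Doe writes"));
        // [John Doe , John Doe, John, Doe]
    }
}
```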

replaceAll() + replaceFirst()

replaceAll() and replaceFirst() replace parts of the text the Matcher is searching. replaceAll() replaces every match of the regular expression; replaceFirst() replaces only the first match.

The Matcher is reset before the replacement is done, so matching starts from the beginning of the text.

Example:

String text = "John writes about this, and John Doe writes about that,"
        + " and John Wayne writes about everything.";
String patternString1 = "((John) (.+?)) ";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
String replaceAll = matcher.replaceAll("Joe Blocks ");
System.out.println("replaceAll = " + replaceAll);
String replaceFirst = matcher.replaceFirst("Joe Blocks ");
System.out.println("replaceFirst = " + replaceFirst);

The output is:

replaceAll = Joe Blocks about this, and Joe Blocks writes about that,
    and Joe Blocks writes about everything.
replaceFirst = Joe Blocks about this, and John Doe writes about that,
    and John Wayne writes about everything.

The line breaks and indentation in the output were added for readability.

Note that in the first string every occurrence of John followed by a word has been replaced with Joe Blocks. In the second string only the first occurrence has been replaced.
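The replacement string may also refer back to captured groups with $1, $2 and so on, which is standard behaviour of Matcher.replaceAll() and replaceFirst(). A short sketch, where the class ReplaceWithGroupsDemo and its helpers swapAll and swapFirst are invented for the example:

```java
import java.util.regex.Pattern;

class ReplaceWithGroupsDemo {
    // Group 1 is the first name John, group 2 is the word that follows it.
    private static final Pattern NAME = Pattern.compile("(John) (\\w+)");

    // Hypothetical helper: rewrites every "John X" as "X, John".
    static String swapAll(String text) {
        return NAME.matcher(text).replaceAll("$2, $1");
    }

    // Hypothetical helper: rewrites only the first "John X".
    static String swapFirst(String text) {
        return NAME.matcher(text).replaceFirst("$2, $1");
    }

    public static void main(String[] args) {
        String text = "John Doe and John Wayne";
        System.out.println(swapAll(text));   // Doe, John and Wayne, John
        System.out.println(swapFirst(text)); // Doe, John and John Wayne
    }
}
```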

appendReplacement() + appendTail()

appendReplacement() and appendTail() are used to replace matched phrases in the input text while appending the result to a StringBuffer.

When find() has located a match, you can call appendReplacement(). It copies the input text from the end of the previous match up to the start of the current match into the StringBuffer, and then appends the replacement in place of the matched text.

appendReplacement() keeps track of how much of the input has been copied into the StringBuffer, so you can keep calling find() until there are no more matches.

After the last match, one part of the input text has still not been copied into the StringBuffer: the text from the end of the last match to the end of the input. Calling appendTail() copies this remainder into the StringBuffer.

String text = "John writes about this, and John Doe writes about that,"
        + " and John Wayne writes about everything.";
String patternString1 = "((John) (.+?)) ";
Pattern pattern = Pattern.compile(patternString1);
Matcher matcher = pattern.matcher(text);
StringBuffer stringBuffer = new StringBuffer();
while (matcher.find()) {
    matcher.appendReplacement(stringBuffer, "Joe Blocks ");
    System.out.println(stringBuffer.toString());
}
matcher.appendTail(stringBuffer);
System.out.println(stringBuffer.toString());

Note that appendReplacement() is called inside the while loop and appendTail() is called after the loop has finished. The output is:

Joe Blocks
Joe Blocks about this, and Joe Blocks
Joe Blocks about this, and Joe Blocks writes about that, and Joe Blocks
Joe Blocks about this, and Joe Blocks writes about that, and Joe Blocks writes about everything.

Java Regular Expression Syntax

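To use regular expressions effectively you need to know their syntax. The syntax is complex and allows very advanced expressions; mastering the rules takes a lot of practice.

This part goes through the basics of the regular expression syntax by example. The focus is on the core concepts you need in order to use regular expressions, without going into every detail. For a full explanation, see the Pattern class in the JavaDoc.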

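Basic Syntax

Before getting to the more advanced features, here is a quick tour of the basic regular expression syntax.

Characters

A literal character is the most commonly used expression in regular expressions. It simply matches that exact character. For example:

John

This simple expression matches the text John in an input text.

You can use any English letter in an expression. You can also refer to a character by its octal, hexadecimal or Unicode code. For example:

\0101
\x41
\u0041

All three expressions denote the uppercase letter A. The first uses the octal code (101), the second the hexadecimal code (41) and the third the Unicode code (0041).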
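A quick way to check the three notations is to match each of them against the string "A". In Java source the backslash itself must be escaped inside a string literal, so each expression is written with a doubled backslash. CharEscapeDemo and matchesCapitalA are hypothetical names for this sketch.

```java
import java.util.regex.Pattern;

class CharEscapeDemo {
    // Hypothetical helper: true if the given regex matches exactly the one-letter text "A".
    static boolean matchesCapitalA(String regex) {
        return Pattern.matches(regex, "A");
    }

    public static void main(String[] args) {
        System.out.println(matchesCapitalA("\\0101"));  // octal form, true
        System.out.println(matchesCapitalA("\\x41"));   // hexadecimal form, true
        System.out.println(matchesCapitalA("\\u0041")); // Unicode form, true
        System.out.println(matchesCapitalA("\\x42"));   // hexadecimal code of B, false
    }
}
```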

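Character Classes

A character class is a construct that matches one character against a set of characters rather than against a single one. In other words, a character class matches a single character in the input text, which may be any of the characters the class allows. For example, to match a, b or c, the expression is:

[abc]

A character class is written inside a pair of square brackets []. The brackets themselves are not part of what is matched.

Character classes are useful for many things. For example, to match the word John where the first letter may be an uppercase or a lowercase J:

[Jj]ohn

The character class [Jj] matches J or j, and the remaining ohn matches the characters ohn exactly.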
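A minimal sketch of the [Jj]ohn expression in use; CharacterClassDemo and isJohn are names made up for the example.

```java
import java.util.regex.Pattern;

class CharacterClassDemo {
    // [Jj] accepts exactly one character, either J or j; "ohn" must follow literally.
    private static final Pattern JOHN = Pattern.compile("[Jj]ohn");

    // Hypothetical helper: true if the whole text is John or john.
    static boolean isJohn(String text) {
        return JOHN.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isJohn("John"));  // true
        System.out.println(isJohn("john"));  // true
        System.out.println(isJohn("Bohn"));  // false
        System.out.println(isJohn("JJohn")); // false, the class matches one character only
    }
}
```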

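Predefined Character Classes

Regular expressions come with a number of predefined character classes. For example, \d means any digit, \s means any whitespace character, and \w means any word character.

Predefined character classes do not have to be enclosed in square brackets, but they can be combined inside them:

\d
[\d\s]

The first matches any digit; the second matches any digit or whitespace character.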
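Put together, \d followed by [\d\s] describes a two-character text: a digit, then a digit or a whitespace character. The sketch below tries a few inputs. PredefinedClassDemo and matchesDigitThenDigitOrSpace are hypothetical names.

```java
import java.util.regex.Pattern;

class PredefinedClassDemo {
    // A digit, followed by one character that is a digit or whitespace.
    private static final Pattern TWO_CHARS = Pattern.compile("\\d[\\d\\s]");

    // Hypothetical helper: whole-text match against the two-character pattern.
    static boolean matchesDigitThenDigitOrSpace(String text) {
        return TWO_CHARS.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesDigitThenDigitOrSpace("42")); // true
        System.out.println(matchesDigitThenDigitOrSpace("4 ")); // true
        System.out.println(matchesDigitThenDigitOrSpace("4a")); // false
        System.out.println(matchesDigitThenDigitOrSpace("a4")); // false
    }
}
```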

A complete list of the predefined character classes is given at the end of this article.

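Boundary Matchers

Regular expressions can also match boundaries, such as word boundaries or the beginning and end of the text. For example, \b matches a word boundary, ^ matches the beginning of a line, and $ matches the end of a line. By default ^ and $ apply to the input as a whole; with the Pattern.MULTILINE flag they apply to each line.

^This is a single line$

This expression matches a line that contains exactly the text This is a single line. Note the start-of-line and end-of-line markers: no text may appear before or after it, only the beginning and end of the line.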
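The sketch below tries both kinds of boundary: ^ and $ around a whole line, and \b (the word-boundary matcher in java.util.regex) around a word. The names BoundaryDemo, isExactLine and countWholeWord are assumptions made for this example; Pattern.quote() is used so the word is treated literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    private static final Pattern LINE = Pattern.compile("^This is a single line$");

    // Hypothetical helper: true if the input consists of exactly that line.
    static boolean isExactLine(String text) {
        return LINE.matcher(text).find();
    }

    // Hypothetical helper: counts occurrences of 'word' that stand alone as a word.
    static int countWholeWord(String word, String text) {
        Matcher matcher = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(isExactLine("This is a single line"));    // true
        System.out.println(isExactLine("> This is a single line"));  // false
        System.out.println(countWholeWord("is", "This is his list")); // 1
    }
}
```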

A complete list of the boundary matchers is given at the end of this article.

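Quantifiers

Quantifiers match an expression that occurs several times. For example, the following expression matches the letter A zero or more times.

A*

The quantifier * means zero or more times, + means one or more times, and ? means zero or one time. There are other quantifiers as well; see the list later in this article.

Quantifiers come in three modes: reluctant, greedy and possessive. A reluctant quantifier matches as little text as possible. A greedy quantifier matches as much text as possible. A possessive quantifier matches as much text as possible, even if that makes the rest of the expression fail to match.

The following shows the difference between the reluctant, greedy and possessive modes. Assume this text:

John went for a walk, and John fell down, and John hurt his knee.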

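An expression in reluctant mode:

John.*?

This expression matches John followed by zero or more characters. The . means any character and the * means zero or more times. The ? after the * makes the * reluctant.

In reluctant mode the quantifier matches as few characters as possible, which here is zero characters. The expression therefore matches just the word John, which occurs 3 times in the input text.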

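Changed to greedy mode, the expression is:

John.*

In greedy mode the quantifier matches as many characters as possible. The expression now matches the first John and, being greedy, all the characters that follow it. As a result there is only one match.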

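Finally, the possessive mode:

John.*+hurt

A + after the * makes it a possessive quantifier.

This expression finds no match in the input text, even though the text contains both John and hurt. Why? Because .*+ is possessive. A greedy quantifier matches as much as it can but gives characters back so that the whole expression can match. A possessive quantifier also matches as much as it can, but never gives anything back, regardless of whether the rest of the expression can still match.

.*+ consumes every character after the first John, which leaves nothing for the hurt at the end of the expression to match. In greedy mode there would be one match. The greedy expression is:

John.*hurt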
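The match counts described above can be checked with a small sketch that runs each expression over the sample text with find(). QuantifierModeDemo and countMatches are illustrative names, not part of the API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierModeDemo {
    // Hypothetical helper: number of matches find() reports for the regex in the text.
    static int countMatches(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "John went for a walk, and John fell down, and John hurt his knee.";

        System.out.println(countMatches("John.*?", text));     // reluctant: 3
        System.out.println(countMatches("John.*", text));      // greedy: 1
        System.out.println(countMatches("John.*+hurt", text)); // possessive: 0
        System.out.println(countMatches("John.*hurt", text));  // greedy with backtracking: 1
    }
}
```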

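Logical Operators

Regular expressions support a small number of logical operations (and, or, not).

The and operation is the default: the expression John means J and o and h and n, in that order.

The or operation must be stated explicitly with |. For example, the expression John|hurt means John or hurt.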
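As a brief sketch of the or operator, the code below collects every match of John|hurt from the sample sentence used earlier. AlternationDemo and findAll are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationDemo {
    // Hypothetical helper: returns every match of the regex, in order of appearance.
    static List<String> findAll(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group()); // group() with no argument is the whole match
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "John went for a walk, and John fell down, and John hurt his knee.";
        System.out.println(findAll("John|hurt", text)); // [John, John, John, hurt]
    }
}
```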

Characters

Character Classes

Predefined Character Classes

Boundary Matchers

Quantifiers
