Python Notes, Part 4: Advanced Programming, Regular Expressions, 3. Regular Expressions in Depth

Published: 2025/7/14


1. re.split

Splits a string wherever the regular expression matches.

import re

str1 = "Thomas is a good man"
# " +" matches one or more spaces
print(re.split(r" +", str1))
# ['Thomas', 'is', 'a', 'good', 'man']
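As a further sketch of re.split, the snippet below uses sample strings of my own (not from the original notes) to show three things the basic example does not: splitting on several separator characters, the maxsplit parameter, and the effect of a capturing group in the pattern.

```python
import re

text = "Thomas   is a,good;man"

# Split on runs of spaces, commas, or semicolons.
words = re.split(r"[ ,;]+", text)
print(words)
# ['Thomas', 'is', 'a', 'good', 'man']

# maxsplit limits the number of splits; the remainder stays in the last item.
head_and_rest = re.split(r" +", "Thomas is a good man", maxsplit=2)
print(head_and_rest)
# ['Thomas', 'is', 'a good man']

# A capturing group in the pattern keeps the separators in the result.
with_separators = re.split(r"( +)", "a  b c")
print(with_separators)
# ['a', '  ', 'b', ' ', 'c']
```

Keeping the separators is handy when the original string has to be reassembled later with "".join().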

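2. The finditer function:

Prototype: re.finditer(pattern, string, flags=0)

Parameters:

    pattern: the regular expression to match

    string: the string to search

    flags: flags that control how the pattern is matched; they modify the behavior of pattern:

        re.I: ignore case

        re.L: locale-dependent matching (in Python 3 this flag is only allowed with bytes patterns)

        re.M: multi-line matching; affects ^ and $

        re.S: make . (dot) match every character, including the newline

        re.U: interpret characters according to the Unicode character set; affects \w \W \b \B (already the default for str patterns in Python 3)

        re.X: verbose mode; whitespace and comments inside the pattern are ignored, so it can be laid out more readably

Function: like findall, it scans the whole string, but it returns an iterator that yields match objects instead of a list.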
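The flags listed above are easier to compare side by side. The following is a minimal sketch with sample strings chosen for illustration; it uses findall, search, and compile because the flags behave the same way in every re function that accepts them.

```python
import re

# re.I: case-insensitive matching.
ignore_case = re.findall(r"thomas", "Thomas THOMAS thomas", flags=re.I)
print(ignore_case)
# ['Thomas', 'THOMAS', 'thomas']

# re.M: ^ and $ also match at the start and end of each line.
line_starts = re.findall(r"^\w+", "first line\nsecond line", flags=re.M)
print(line_starts)
# ['first', 'second']

# re.S: "." also matches the newline character.
without_s = re.search(r"a.b", "a\nb")
with_s = re.search(r"a.b", "a\nb", flags=re.S)
print(without_s is None, with_s is not None)
# True True

# re.X: whitespace and comments inside the pattern are ignored.
phone = re.compile(r"""
    (\d{3})   # area code
    -
    (\d{8})   # local number
""", flags=re.X)
phone_groups = phone.match("010-53247654").groups()
print(phone_groups)
# ('010', '53247654')

# Flags can be combined with the | operator.
combined = re.findall(r"^thomas", "Thomas\nTHOMAS", flags=re.I | re.M)
print(combined)
# ['Thomas', 'THOMAS']
```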

import re

str2 = "Thomas is a good man! Thomas is a nice man! Thomas is a handsome man"
d = re.finditer(r"(Thomas)", str2)
# Pull matches from the iterator one at a time until it is exhausted
while True:
    try:
        m = next(d)
        print(m)
    except StopIteration:
        break
# <re.Match object; span=(0, 6), match='Thomas'>
# <re.Match object; span=(22, 28), match='Thomas'>
# <re.Match object; span=(44, 50), match='Thomas'>
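Since finditer returns an ordinary iterator, a for loop or a comprehension can consume it without handling StopIteration by hand. The sketch below reuses the sample sentence; the second pattern, a (\w+) man, is my own addition for illustration.

```python
import re

text = "Thomas is a good man! Thomas is a nice man! Thomas is a handsome man"

# Each item produced by finditer is a match object, so the position and
# the matched text are both available without a second search.
positions = [(m.start(), m.end(), m.group()) for m in re.finditer(r"Thomas", text)]
print(positions)
# [(0, 6, 'Thomas'), (22, 28, 'Thomas'), (44, 50, 'Thomas')]

# Groups work on every match object the iterator yields.
adjectives = [m.group(1) for m in re.finditer(r"a (\w+) man", text)]
print(adjectives)
# ['good', 'nice', 'handsome']
```

Compared with findall, this form is preferable when the match positions are needed, or when the input is large and the matches should be processed one at a time.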

3. re.sub() / re.subn()：

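Prototype:
sub(pattern, repl, string, count=0, flags=0)
subn(pattern, repl, string, count=0, flags=0)
Parameters:
pattern: the regular expression to match
repl: the replacement string
string: the target string
count: the maximum number of replacements; the default 0 replaces every match
flags: flags that control how the pattern is matched; they modify the behavior of pattern:
re.I: ignore case
re.L: locale-dependent matching (in Python 3 this flag is only allowed with bytes patterns)
re.M: multi-line matching; affects ^ and $
re.S: make . match every character, including the newline
re.U: interpret characters according to the Unicode character set; affects \w \W \b \B (already the default for str patterns in Python 3)
re.X: verbose mode; whitespace and comments inside the pattern are ignored, so it can be laid out more readably
Function: finds the parts of the target string that match the regular expression and replaces them with the given string. The number of replacements can be limited; if it is not specified, every match is replaced.
Difference: sub returns the resulting string; subn returns a tuple whose first element is the resulting string and whose second element is the number of replacements made.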

import re

# Replacing with sub
str4 = "Thomas is a good good good man"
res = re.sub(r"(good)", "nice", str4)
print(res)
print(type(res))
# Thomas is a nice nice nice man
# <class 'str'>

# Limiting the number of replacements
str5 = "Thomas is a good good good man"
res1 = re.sub(r"(good)", "nice", str5, count=2)
print(res1)
print(type(res1))
# Thomas is a nice nice good man
# <class 'str'>

# Replacing with subn
str6 = "Thomas is a good good good man"
res2 = re.subn(r"(good)", "nice", str6)
print(res2)
print(type(res2))
# ('Thomas is a nice nice nice man', 3)
# <class 'tuple'>
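The replacement argument does not have to be a fixed string. As a sketch going slightly beyond the notes above: it may contain backreferences to captured groups, or it may be a function that is called once per match. The helper name shout is hypothetical and used only for illustration.

```python
import re

# A backreference (\1, \2, ...) in the replacement reuses captured text.
swapped = re.sub(r"(\d{3})-(\d{8})", r"\2-\1", "010-53247654")
print(swapped)
# 53247654-010

# The replacement may also be a function that receives the match object
# and returns the replacement text. (shout is a hypothetical helper.)
def shout(match):
    return match.group(0).upper()

shouted = re.sub(r"good", shout, "Thomas is a good good good man", count=2)
print(shouted)
# Thomas is a GOOD GOOD good man

# subn returns a tuple that unpacks into the new string and the count.
new_text, n = re.subn(r"good", "nice", "Thomas is a good good good man")
print(new_text, n)
# Thomas is a nice nice nice man 3
```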

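4. Groups:

Besides simply testing whether a string matches, regular expressions can also extract substrings. Parentheses () mark a group, and the text matched by each group is what gets extracted.

group(): returns one group, selected by its number (or name)

groups(): returns all groups as a tuple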

import re

# Inspecting the groups
str7 = "010-53247654"
m = re.match(r"(\d{3})-(\d{8})", str7)
# Retrieve each group by number; group(0) is always the entire match
print(m.group(0))
print(m.group(1))
print(m.group(2))
# 010-53247654
# 010
# 53247654

# View all groups at once; nested groups are listed from the outermost inward
m1 = re.match(r"((\d{3})-(\d{8}))", str7)
print(m1.groups())
# ('010-53247654', '010', '53247654')

# Naming the groups
m2 = re.match(r"(?P<first>\d{3})-(?P<second>\d{8})", str7)
print(m2.group("first"))
print(m2.group("second"))
# 010
# 53247654

Note: as the last example shows, the (?P<name>...) syntax gives a group a name, so it can be retrieved by name as well as by number.
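Named groups pair naturally with a few other match-object methods. The sketch below uses group names of my own choosing (area, number) and also shows the check that is needed because re.match returns None when nothing matches.

```python
import re

pattern = r"(?P<area>\d{3})-(?P<number>\d{8})"

m = re.match(pattern, "010-53247654")

# groupdict() collects every named group into a dictionary.
as_dict = m.groupdict()
print(as_dict)
# {'area': '010', 'number': '53247654'}

# group() accepts several indices at once and then returns a tuple.
by_index = m.group(1, 2)
print(by_index)
# ('010', '53247654')

# span() reports where a group matched in the original string.
number_span = m.span("number")
print(number_span)
# (4, 12)

# re.match returns None when the pattern does not match, so check first.
no_match = re.match(pattern, "not a phone number")
area = no_match.group("area") if no_match else None
print(area)
# None
```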

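5. Compiling:

Whenever we use a regular expression, the re module does two things:

First, it compiles the regular expression; if the expression itself is invalid, an error is raised.

Second, it uses the compiled regular expression to match the target.

re.compile(pattern, flags=0) performs the first step ahead of time and returns a pattern object that can be reused.

pattern: the regular expression to compile

flags: same as above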

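import re

pat = r"^1(([34578]\d)|(47))\d{8}$"
# Module-level function: the pattern is compiled on the fly, then matched
print(re.match(pat, "13600000000"))
# Compile once, then match with the resulting pattern object
re_telephon = re.compile(pat)
print(re_telephon.match("13600000000"))
# <re.Match object; span=(0, 11), match='13600000000'>
# <re.Match object; span=(0, 11), match='13600000000'>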
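The two steps described above can be observed separately. In this sketch the phone pattern is a simplified variant of my own (the (47) alternative is dropped because [34578]\d already covers it), and the candidate strings are made-up sample data.

```python
import re

# Step 1 happens once: compile the pattern into a reusable object.
# (Simplified variant of the pattern above; an assumption for illustration.)
re_mobile = re.compile(r"^1[34578]\d{9}$")

# Step 2 can then be repeated for many inputs without recompiling.
candidates = ["13600000000", "12600000000", "1360000000"]
valid = [s for s in candidates if re_mobile.match(s)]
print(valid)
# ['13600000000']

# An invalid pattern fails at compile time with re.error,
# before any matching is attempted.
try:
    re.compile(r"(\d{3}")  # missing closing parenthesis
    compile_error = None
except re.error as exc:
    compile_error = exc
print(compile_error is not None)
# True
```

Compiling up front is mostly a readability and reuse choice: it gives the pattern a name and surfaces syntax errors early.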

Reposted from: https://www.cnblogs.com/noah0532/p/10907000.html
