Python 3 regular expressions and nested tables: matching nested structures with regular expressions in Python

Published: 2025/3/19 · python · 35 · 豆豆

喵喵時光機

falsetru's nested parser, which I have slightly modified to accept arbitrary regular-expression patterns for the delimiters and the item separator, is faster and simpler than my original re.Scanner solution:

    import re

    def parse_nested(text, left=r'[(]', right=r'[)]', sep=r','):
        """ https://stackoverflow.com/a/17141899/190597 (falsetru) """
        # Split on delimiters and separators, keeping them as tokens
        pat = r'({}|{}|{})'.format(left, right, sep)
        tokens = re.split(pat, text)
        stack = [[]]
        for x in tokens:
            # Skip empty strings and item separators
            if not x or re.match(sep, x):
                continue
            if re.match(left, x):
                # Nest a new list inside the current list
                current = []
                stack[-1].append(current)
                stack.append(current)
            elif re.match(right, x):
                stack.pop()
                if not stack:
                    raise ValueError('error: opening bracket is missing')
            else:
                stack[-1].append(x)
        if len(stack) > 1:
            print(stack)
            raise ValueError('error: closing bracket is missing')
        return stack.pop()

    text = "a {{c1::group {{c2::containing::HINT}} a few}} {{c3::words}} or three"

    print(parse_nested(text, r'\s*{{', r'}}\s*'))

Output:

    ['a', ['c1::group', ['c2::containing::HINT'], 'a few'], ['c3::words'], 'or three']

Nested structures cannot be matched with Python regular expressions alone, but it is quite easy to build a basic parser that handles nested structures using re.Scanner (an undocumented class in the re module):

    import re

    class Node(list):
        def __init__(self, parent=None):
            self.parent = parent

    class NestedParser:
        def __init__(self, left=r'\(', right=r'\)'):
            self.scanner = re.Scanner([
                (left, self.left),
                (right, self.right),
                (r"\s+", None),  # skip whitespace between tokens
                # Anything else, up to the next delimiter or end of input
                (".+?(?=(%s|%s|$))" % (right, left), self.other),
            ])
            self.result = Node()
            self.current = self.result

        def parse(self, content):
            self.scanner.scan(content)
            return self.result

        def left(self, scanner, token):
            # Opening delimiter: descend into a new child node
            new = Node(self.current)
            self.current.append(new)
            self.current = new

        def right(self, scanner, token):
            # Closing delimiter: return to the parent node
            self.current = self.current.parent

        def other(self, scanner, token):
            self.current.append(token.strip())

It can be used like this:

    p = NestedParser()
    print(p.parse("((a+b)*(c-d))"))
    # [[['a+b'], '*', ['c-d']]]

    p = NestedParser()
    print(p.parse("( (a ( ( c ) b ) ) ( d ) e )"))
    # [[['a', [['c'], 'b']], ['d'], 'e']]

By default, NestedParser matches nested parentheses. You can pass other regular expressions to match other nested patterns, such as square brackets, []. For example:

    p = NestedParser(r'\[', r'\]')
    print(p.parse("Lorem ipsum dolor sit amet [@a xxx yyy [@b xxx yyy [@c xxx yyy]]] lorem ipsum sit amet"))
    # ['Lorem ipsum dolor sit amet', ['@a xxx yyy', ['@b xxx yyy', ['@c xxx yyy']]],
    # 'lorem ipsum sit amet']

Multi-character delimiters such as markup-style tags work too (the tag name below is illustrative):

    p = NestedParser('<foo>', '</foo>')
    print(p.parse("<foo>BAR<foo>BAZ</foo></foo>"))
    # [['BAR', ['BAZ']]]

Of course, pyparsing can do far more than the code above. For this single purpose, however, NestedParser is about 5 times faster on small strings:

    In [27]: import pyparsing as pp

    In [28]: data = "( (a ( ( c ) b ) ) ( d ) e )"

    In [32]: %timeit pp.nestedExpr().parseString(data).asList()
    1000 loops, best of 3: 1.09 ms per loop

    In [33]: %timeit NestedParser().parse(data)
    1000 loops, best of 3: 234 us per loop

and about 28 times faster on larger strings:

    In [44]: %timeit pp.nestedExpr().parseString('({})'.format(data*10000)).asList()
    1 loops, best of 3: 8.27 s per loop

    In [45]: %timeit NestedParser().parse('({})'.format(data*10000))
    1 loops, best of 3: 297 ms per loop
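The re.Scanner-based parser above does not check that delimiters are balanced: a missing closing delimiter goes unnoticed, and a stray one moves the parser above its root node. One possible guard, sketched here under the assumption that the input should be rejected before parsing, is to count nesting depth with the same delimiter patterns. The helper name check_balanced is hypothetical and is not part of the original answer.

```python
import re

def check_balanced(text, left=r'\(', right=r'\)'):
    """Hypothetical helper: raise ValueError if delimiters are unbalanced."""
    # Try the opening pattern first, then the closing pattern,
    # mirroring the order of the scanner rules.
    pattern = re.compile('(?P<left>{})|(?P<right>{})'.format(left, right))
    depth = 0
    for match in pattern.finditer(text):
        if match.group('left') is not None:
            depth += 1
        else:
            depth -= 1
            if depth < 0:
                raise ValueError(
                    'closing delimiter at offset {} has no opener'.format(match.start()))
    if depth:
        raise ValueError('{} opening delimiter(s) never closed'.format(depth))
    return True

print(check_balanced("((a+b)*(c-d))"))
print(check_balanced("[@a [@b]]", r'\[', r'\]'))
```

Calling check_balanced with the same left and right patterns before handing the text to the parser turns both failure modes into an explicit ValueError.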
