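Writing clean, testable, high-quality code in Python
Introduction
Writing software is one of the most complex tasks a person can undertake. Brian Kernighan, co-author of the AWK programming language and of "K and R C", summed up the true nature of software development in the book Software Tools when he said, "Controlling complexity is the essence of software development." The harsh reality of real-world software development is that software is often built with intentional or accidental complexity, and that developers often disregard maintainability, testability, and quality. The end result of this unfortunate situation is software that becomes increasingly difficult and expensive to maintain and that fails occasionally, sometimes dramatically.
The first step in writing high-quality code is to reconsider the entire process by which an individual or a team develops software. In failed or troubled software projects, software is often developed in an undisciplined way: developers focus on solving the problem by whatever means necessary. In successful software projects, developers think not only about how to solve the problem at hand but also about the process involved in solving it.
Successful software developers run tests in a way that is easy to automate, so that they can continually prove that the software works. They understand the danger of unnecessary complexity. They follow their methodology rigorously, review their work carefully at every stage, and look for opportunities to refactor. They constantly think about how to keep their software testable, readable, and maintainable. Although the designers of the Python language and the Python community place a high value on writing clean, maintainable code, it is still quite easy to do the opposite. This article tackles that problem and discusses how to write clean, testable, high-quality code in Python.
The best way to demonstrate this development style is to solve a hypothetical problem. Suppose you are a back-end web developer at a company that lets users post reviews, and you need a way to display and highlight small snippets of those reviews. One way to solve this problem is to write one large function that takes a snippet of text and a query parameter and returns a snippet, limited to a fixed number of characters, with the query terms highlighted. All of the logic needed to solve the problem lives in one giant function, and you simply rerun the script until you get the result you want. The code would most likely be structured like the example below, typically accompanied by print or logging statements and an interactive shell.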
def my_mega_function(snippet, query):
    """This takes a snippet of text, and a query parameter and returns """
    # Logic goes here, and often runs on for several hundred lines
    # There are often deeply nested conditional statements and loops
    # Function could reach several hundred, if not thousands of lines
    result = None  # placeholder for whatever the logic above would compute
    return result
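With dynamic languages such as Python, Perl, or Ruby, it is easy for a developer to focus purely on the problem itself, often exploring interactively until a seemingly correct result appears, and then declare the task finished. Unfortunately, convenient and tempting as this approach is, it often creates a dangerous illusion of success. The danger lies mainly in not having designed a testable solution and in not having kept the complexity of the software under control.
How do you know this function works? It worked the last time you ran it during development, so you trust that it is valid, but can you be sure there is no subtle error in its logic or syntax? What happens if the code needs to change? Will it still work? How will you know that it still works? What if another developer has to maintain and modify the code? How will that developer know that a change does not break something? How hard will it be for that developer to understand what the code does?
Put simply, without tests you do not know whether the software works. If validity is assumed rather than proven throughout development, you may end up with code that appears to work but that nobody can be sure will run correctly. This is a terrible position to be in; I have written software like this, and I have helped debug software written this way. Fortunately, it is easy to avoid. Tests should be written first (as in test-driven development); otherwise the code tends to drift away from its goal while the logic is being written. Writing tests first produces modular, extensible code that is easy to test, understand, and maintain. An experienced developer can easily tell whether software was written with testing in mind throughout. To an expert eye, the software itself looks very different.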
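As a minimal sketch of the test-first idea, the example below states the expected behavior of a small helper as unit tests and then supplies just enough code to satisfy them. The names truncate_snippet, TestTruncateSnippet, and run_tests are hypothetical and are not part of the article's highlight module; they only illustrate the workflow.

```python
import unittest


def truncate_snippet(text, max_characters=175):
    """Return text cut to at most max_characters without splitting a word.

    Hypothetical helper used only to illustrate writing tests first.
    """
    if len(text) <= max_characters:
        return text
    cut = text[:max_characters]
    # Back up to the last space so that a word is not cut in half.
    head, sep, _ = cut.rpartition(" ")
    return head if sep else cut


class TestTruncateSnippet(unittest.TestCase):
    """In a test-first workflow these cases exist before the function body."""

    def test_short_text_is_unchanged(self):
        self.assertEqual(truncate_snippet("deep dish pizza", 100),
                         "deep dish pizza")

    def test_long_text_is_cut_at_a_word_boundary(self):
        self.assertEqual(truncate_snippet("deep dish pizza", 12),
                         "deep dish")

    def test_single_long_word_is_cut_hard(self):
        self.assertEqual(truncate_snippet("cornmeal", 4), "corn")


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        TestTruncateSnippet)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() after every change is the small-scale version of continually proving that the code still works.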
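You do not have to take my word for it, nor do you have to study the code directly: there are other ways to see the difference between the two styles clearly. The first is to actually measure how many lines of code are exercised by tests. Nose is a popular extension to Python's unit testing framework that makes it easy to run a batch of tests automatically and supports plugins, such as one that measures code coverage. Measure code coverage during development and it quickly becomes obvious that code made of giant functions with deeply nested logic, built in a non-generalized way, can almost never reach 100% test coverage.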
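To make the idea of line coverage concrete, here is a rough sketch that records which lines of a function actually run for a given input, using the standard library's sys.settrace. It is not how nose or its coverage plugin is implemented; executed_lines and classify are hypothetical names chosen for this illustration.

```python
import sys


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


def executed_lines(func, *args):
    """Return the line offsets (relative to the def line) that ran."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        # Only follow frames that execute the function under study.
        if frame.f_code is not code:
            return None
        if event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        # Restore whatever trace function was active before.
        sys.settrace(previous)
    return hit


positive = executed_lines(classify, 5)
negative = executed_lines(classify, -1)
# Offset 2 is the `return "negative"` line; only a negative input reaches it.
print(2 in positive, 2 in negative)  # prints: False True
```

A single call leaves one branch unexecuted, which is exactly the kind of gap a coverage report exposes; the more nested branches a function has, the more inputs are needed to reach every line.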
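The second way to measure the difference is to use static analysis tools. Several popular Python tools give developers a range of metrics, from general code quality metrics to specific ones such as duplicated code or complexity. You can measure the cyclomatic complexity of your code with pygenie or pymetrics (see Resources).
Here is an example of the result of running pygenie against fairly simple, "clean" code:
Cyclomatic complexity output from pygenie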
% python pygenie.py complexity --verbose highlight.py
File: /Users/ngift/Documents/src/highlight.py
Type Name                                                      Complexity
----------------------------------------------------------------------------------------
M    HighlightDocumentOperations._create_snippit               3
M    HighlightDocumentOperations._reconstruct_document_string  3
M    HighlightDocumentOperations._doc_to_sentences             2
M    HighlightDocumentOperations._querystring_to_dict          2
M    HighlightDocumentOperations._word_frequency_sort          2
M    HighlightDocumentOperations.highlight_doc                 2
X    /Users/ngift/Documents/src/highlight.py                   1
C    HighlightDocumentOperations                               1
M    HighlightDocumentOperations.__init__                      1
M    HighlightDocumentOperations._custom_highlight_tag         1
M    HighlightDocumentOperations._score_sentences              1
M    HighlightDocumentOperations._multiple_string_replace      1
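What is cyclomatic complexity?
Cyclomatic complexity is a software metric introduced by Thomas J. McCabe in 1976 to determine the complexity of a program. It measures the number of linearly independent paths, or branches, through the source code. According to McCabe, the complexity of a method should ideally stay below 10. This is because studies of human memory have shown that short-term memory can hold only about 7 items, plus or minus 2.
If a developer writes code with 50 linearly independent paths, picturing what happens in the method takes more than about 5 times the capacity of short-term memory. Simple methods stay within the limits of short-term memory, so they are easier to work with, and they have been shown to contain fewer errors. A 2008 study by Enerjy found a strong correlation between cyclomatic complexity and the number of defects: classes with a complexity of 11 had a 0.28 probability of being fault-prone, rising to 0.98 for classes with a complexity of 74.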
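The metric can be approximated by counting decision points. The sketch below walks a module's syntax tree with the standard ast module and adds one for each branching construct in a function. It is an approximation under simplifying assumptions, not pygenie's or pymetrics' actual algorithm: the choice of node types is this sketch's own, and nested functions are counted toward their enclosing function as well. The names approximate_complexity and SAMPLE are hypothetical.

```python
import ast

# Constructs treated as one decision point each (an assumption of this sketch).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension)


def approximate_complexity(source):
    """Return {function name: approximate cyclomatic complexity}."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1  # a function with no branches has exactly one path
            for child in ast.walk(node):
                if isinstance(child, _BRANCH_NODES):
                    score += 1
                elif isinstance(child, ast.BoolOp):
                    # "a and b and c" adds two extra short-circuit paths
                    score += len(child.values) - 1
            scores[node.name] = score
    return scores


SAMPLE = '''
def flat(x):
    return x + 1

def branchy(x):
    if x > 0 and x < 10:
        for i in range(x):
            if i % 2:
                return i
    return None
'''

print(approximate_complexity(SAMPLE))  # {'flat': 1, 'branchy': 5}
```

Even in this toy case, one short function with a compound condition, a loop, and a nested test already reaches 5, which shows how quickly a several-hundred-line function can pass 10.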
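As the pygenie output above shows, every method is extremely simple, with a complexity below 10, in line with McCabe's guideline. In my career I have seen giant functions, written without tests, with a complexity above 140 and more than 1,200 lines long. Needless to say, such code is practically impossible to test. In fact there is no way even to confirm that it works, and it cannot be refactored. Had the author kept testing in mind and written the same logic while maintaining 100% test coverage, complexity that high could not have arisen.
Now let's look at a complete source code example, together with its unit tests and functional tests, to see it in action and to see why such code can be called clean. By strict metrics, a reasonable definition of "clean" is code that meets the following requirements: close to 100% test coverage; a cyclomatic complexity below 10 for every class and method; and a pylint score close to 10.0. The example below uses nose to run the unit tests and doctests on the highlight module with coverage reporting: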
% nosetests -v --with-coverage --cover-package=highlight --with-doctest \
    --cover-erase --exe
Doctest: highlight.HighlightDocumentOperations._custom_highlight_tag ... ok
test_functional.test_snippit_algorithm ... ok
test_custom_highlight_tag (test_highlight.TestHighlight) ... ok
Consumes the generator, and then verifies the result[0] ... ok
Verifies highlighted text is what we expect ... ok
test_multi_string_replace (test_highlight.TestHighlight) ... ok
Verifies the yielded results are what is expected ... ok
Name        Stmts   Exec  Cover   Missing
-----------------------------------------
highlight      71     71   100%
----------------------------------------------------------------------
Ran 7 tests in 4.223s
OK
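As shown above, the nosetests command was run with several options, and test coverage for the highlight.py script is 100%. The only option worth pointing out is --cover-package=highlight, which tells nose to show the coverage report only for the named module. This is a very effective way to limit the coverage report to the module or package you want to look at. You can download the source code for this article and comment out some of the tests to watch the coverage reporting at work.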
#!/usr/bin/python
# -*- coding: utf-8 -*-
"""
:mod:`highlight` -- Highlight Methods
=====================================

.. module:: highlight
   :platform: Unix, Windows
   :synopsis: highlight document snippets that match a query.
.. moduleauthor:: Noah Gift

Requirements::
    1.  You will need to install the nltk library to run this code.
        http://www.nltk.org/download
    2.  You will need to download the data for the nltk:
        See http://www.nltk.org/data::

        import nltk
        nltk.download()

"""

import re
import logging

import nltk

#Globals
logging.basicConfig()
LOG = logging.getLogger("highlight")
LOG.setLevel(logging.INFO)

class HighlightDocumentOperations(object):

    """Highlight Operations for a Document"""

    def __init__(self, document=None, query=None):
        """
        Kwargs:
            document (str):
            query (str):

        """
        self._document = document
        self._query = query

    @staticmethod
    def _custom_highlight_tag(phrase,
                              start="<strong>",
                              end="</strong>"):
        """Injects an open and close highlight tag around a word

        Args:
            phrase (str) - A word or phrase.
        Kwargs:
            start (str) - An opening tag.  Defaults to <strong>
            end (str) - A closing tag.  Defaults to </strong>
        Returns:
            (str) word or phrase with custom opening and closing tags

        >>> h = HighlightDocumentOperations()
        >>> h._custom_highlight_tag("foo")
        '<strong>foo</strong>'

        """
        tagged_phrase = "{0}{1}{2}".format(start, phrase, end)
        return tagged_phrase

    def _doc_to_sentences(self):
        """Takes a string document and converts it into a list of sentences

        Unfortunately, this approach might be a tad naive for production
        because some segments that are split on a period are really an
        abbreviation, and to make things even more complicated, an
        abbreviation can also be the end of a sentence::

            http://nltk.googlecode.com/svn/trunk/doc/book/ch03.html

        Returns:
            (generator) A generator object of a tokenized sentence tuple,
            with the list position of sentence as the first portion of
            the tuple, such as:  (0, "This was the first sentence")

        """
        tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
        sentences = tokenizer.tokenize(self._document)
        for sentence in enumerate(sentences):
            yield sentence

    @staticmethod
    def _score_sentences(sentence, querydict):
        """Creates a scoring system for each sentence by substitution analysis

        Tokenizes each sentence, counts characters
        in sentence, and pass it back as nested tuple

        Returns:
            (tuple) - (score (int), (count (int), position (int),
                       raw sentence (str)))

        """
        position, sentence = sentence
        count = len(sentence)
        regex = re.compile('|'.join(map(re.escape, querydict)))
        score = len(re.findall(regex, sentence))
        processed_score = (score, (count, position, sentence))
        return processed_score

    def _querystring_to_dict(self, split_token="+"):
        """Converts query parameters into a dictionary

        Returns:
            (dict)- dparams,  a dictionary of query parameters

        """
        params = self._query.split(split_token)
        dparams = dict([(key, self._custom_highlight_tag(key))
                        for key in params])
        return dparams

    @staticmethod
    def _word_frequency_sort(sentences):
        """Sorts sentences by score frequency, yields sorted result

        This will yield the highest score count items first.

        Args:
            sentences (list) - a nested tuple inside of list

            [(0, (90, 3, "The crust/dough was just way too effin' dry for me.
            Yes, I know what 'cornmeal' is, thanks."))]

        """
        sentences.sort()
        while sentences:
            yield sentences.pop()

    def _create_snippit(self, sentences, max_characters=175):
        """Creates a snippet from a sentence while keeping it under max_chars

        Returns a sorted list with max characters.  The sort is an attempt
        to rebuild the original document structure as close as possible,
        with the new sorting by scoring and the limitation of max_chars.

        Args:
            sentences (list) - scored sentences to turn into a snippit
            max_characters (int) - optional max characters of snippit

        Returns:
            snippit (list) - returns a sorted list with a nested tuple that
            has the first index holding the original position of the list::

            [(0, (90, 3, "The crust/dough was just way too effin' dry for me.
            Yes, I know what 'cornmeal' is, thanks."))]

        """
        snippit = []
        total = 0
        for sentence in self._word_frequency_sort(sentences):
            # Pass the sentence as a lazy %-style logging argument
            LOG.debug("Creating snippit: %s", sentence)
            score, (count, position, raw_sentence) = sentence
            total += count
            if total < max_characters:
                #position now gets converted to index 0 for sorting later
                snippit.append(((position), score, count, raw_sentence))

        #try to reassemble document by original order by doing a simple sort
        snippit.sort()
        return snippit

    @staticmethod
    def _multiple_string_replace(string_to_replace, dict_patterns):
        """Performs a multiple replace in a string with dict pattern.

        Borrowed from Python Cookbook.

        Args:
            string_to_replace (str) - String to be multi-replaced
            dict_patterns (dict) - A dict full of patterns

        Returns:
            (str) - Multiple replaced string.

        """
        regex = re.compile('|'.join(map(re.escape, dict_patterns)))

        def one_xlat(match):
            """Closure that is called repeatedly during multi-substitution.

            Args:
                match (SRE_Match object)
            Returns:
                partial string substitution (str)

            """
            return dict_patterns[match.group(0)]

        return regex.sub(one_xlat, string_to_replace)

    def _reconstruct_document_string(self, snippit, querydict):
        """Reconstructs string snippit, build tags, and return string

        A helper function for highlight_doc.

        Args:
            snippit (list) - A list of nested tuples, containing
            this pattern::

            [(0, (90, 3, "The crust/dough was just way too effin' dry for me.
            Yes, I know what 'cornmeal' is, thanks."))]

            querydict (dict) - A dict full of patterns

        Returns:
            (str) The most relevant snippet with the query terms highlighted.

        """
        snip = []
        for entry in snippit:
            score = entry[1]
            sent = entry[3]
            #if we have matches, now do the multi-replace
            if score:
                sent = self._multiple_string_replace(sent,
                                                     querydict)
            snip.append(sent)
        highlighted_snip = " ".join(snip)
        return highlighted_snip

    def highlight_doc(self):
        """Finds the most relevant snippit with the query terms highlighted

        Returns:
            (str) The most relevant snippet with the query terms highlighted.

        """
        #tokenize to sentences, and convert query to a dict
        sentences = self._doc_to_sentences()
        querydict = self._querystring_to_dict()

        #process and score sentences
        scored_sentences = []
        for sentence in sentences:
            scored = self._score_sentences(sentence, querydict)
            scored_sentences.append(scored)

        #fit into max characters, and sort by original position
        snippit = self._create_snippit(scored_sentences)
        #assemble back into string
        highlighted_snip = self._reconstruct_document_string(snippit,
                                                             querydict)
        return highlighted_snip
#!/usr/bin/python
# -*- coding: utf-8 -*-
"""
Tests this query searches a document, highlights a snippit and returns it
http://www.example.com/search?find_desc=deep+dish+pizza&ns=1&rpp=10&find_loc=\
San+Francisco%2C+CA

Contains both unit and functional tests.
"""

import unittest

from highlight import HighlightDocumentOperations


class TestHighlight(unittest.TestCase):

    def setUp(self):
        self.document = """
Review for their take-out only.
Tried their large Classic (sausage, mushroom, peppers and onions) deep dish;\
and their large Pesto Chicken thin crust pizzas.
Pizza = I've had better. The crust/dough was just way too effin' dry for me.\
Yes, I know what 'cornmeal' is, thanks. But it's way too dry.\
I'm not talking about the bottom of the pizza...I'm talking about the dough \
that's in between the sauce and bottom of the pie...it was like cardboard, sorry!
Wings = spicy and good. Bleu cheese dressing only...hmmm, but no alternative\
of ranch dressing, at all. Service = friendly enough at the counters.
Decor = freakin' dark. I'm not sure how people can see their food.
Parking = a real pain. Good luck.
"""
        self.query = "deep+dish+pizza"
        self.hdo = HighlightDocumentOperations(self.document, self.query)

    def test_custom_highlight_tag(self):
        actual = self.hdo._custom_highlight_tag("foo",
                                                start="[BAR]",
                                                end="[ENDBAR]")
        expected = "[BAR]foo[ENDBAR]"
        self.assertEqual(actual, expected)

    def test_query_string_to_dict(self):
        """Verifies the yielded results are what is expected"""
        result = self.hdo._querystring_to_dict()
        # Each query term maps to itself wrapped in the default tags
        expected = {"deep": "<strong>deep</strong>",
                    "dish": "<strong>dish</strong>",
                    "pizza": "<strong>pizza</strong>"}
        self.assertEqual(result, expected)

    def test_multi_string_replace(self):
        query = """pizza = I've had better"""
        expected = """<strong>pizza</strong> = I've had better"""
        query_dict = self.hdo._querystring_to_dict()
        result = self.hdo._multiple_string_replace(query, query_dict)
        self.assertEqual(expected, result)

    def test_doc_to_sentences(self):
        """Consumes the generator, and then verifies the result[0]"""
        results = []
        expected = (0, '\nReview for their take-out only.')
        for sentence in self.hdo._doc_to_sentences():
            results.append(sentence)
        self.assertEqual(results[0], expected)

    def test_highlight(self):
        """Verifies highlighted text is what we expect"""
        expected = """Tried their large Classic (sausage, mushroom, peppers and onions)\
 <strong>deep</strong> <strong>dish</strong>;and their large Pesto Chicken thin crust \
<strong>pizza</strong>s."""
        actual = self.hdo.highlight_doc()
        self.assertEqual(expected, actual)

    def tearDown(self):
        del self.query
        del self.hdo
        del self.document


if __name__ == '__main__':
    unittest.main()
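To run the code example above, you need to download the Natural Language Toolkit source and follow the instructions for downloading the nltk data. Because this article is about how the code was created and tested rather than about the code itself, it does not explain in detail what the code does. Finally, we run the static analysis tool pylint against the source code: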
% pylint highlight.py
No config file found, using default configuration
************* Module highlight
E: 89:HighlightDocumentOperations._doc_to_sentences: Instance of 'unicode' has no
'tokenize' member (but some types could not be inferred)
E: 89:HighlightDocumentOperations._doc_to_sentences: Instance of 'ContextFreeGrammar'
has no 'tokenize' member (but some types could not be inferred)
W:108:HighlightDocumentOperations._score_sentences: Used builtin function 'map'
W:192:HighlightDocumentOperations._multiple_string_replace: Used builtin function 'map'
R: 34:HighlightDocumentOperations: Too few public methods (1/2)
Report
======
69 statements analysed.
Global evaluation
-----------------
Your code has been rated at 8.12/10 (previous run: 8.12/10)
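The code scores 8.12 out of 10, and the tool points out several defects. pylint is configurable, and you will most likely need to configure it to suit your project; the official pylint documentation covers the details (see Resources). In this example, the two errors reported at line 89 originate in the external library nltk, and the two warnings can be eliminated by changing pylint's configuration. As a rule you do not want to allow pylint errors in your source code, but at times, as in the example above, you may have to make a pragmatic decision. It is not a perfect tool, but I have found it very useful in practice.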
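Besides changing the configuration, another way to address the two warnings about the builtin map is to stop calling it. The sketch below is a standalone variant of the article's _multiple_string_replace that builds the pattern with a generator expression instead. The function name multiple_string_replace is hypothetical, and whether a particular pylint version reports anything for either form depends on its version and configuration.

```python
import re


def multiple_string_replace(text, patterns):
    """Replace every key of patterns found in text with its value.

    Standalone illustration; same idea as the class method above, but the
    alternation is built with a generator expression rather than map().
    """
    regex = re.compile("|".join(re.escape(key) for key in patterns))
    return regex.sub(lambda match: patterns[match.group(0)], text)


tags = {"deep": "<strong>deep</strong>", "pizza": "<strong>pizza</strong>"}
print(multiple_string_replace("deep dish pizza", tags))
# <strong>deep</strong> dish <strong>pizza</strong>
```

The behavior is the same for a non-empty dictionary of patterns, so the existing unit tests would still apply to a change like this.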
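Conclusion
In this article we explored how the way you think about testing shapes the structure of software, and why a lack of test-oriented thinking can be fatal to a project. We presented a complete code example, including functional and unit tests, ran code coverage analysis on it with nose, and ran two static analysis tools, pylint and pygenie. One thing we did not get to is how to automate this process through some form of continuous integration testing. Fortunately, this is easy to do with Hudson, an open source continuous integration system written in Java. I encourage you to consult the Hudson documentation (see Resources) and try setting up automated testing for a project that runs all of your tests, including static code analysis.
Finally, testing is not a panacea, and neither are static analysis tools. Software development is hard work. To have a chance at success, we must always keep the real goal in mind: not just to solve the problem, but to create something that can be proven to work. If you agree, it follows that overly complex code, an arrogant attitude toward design, and a lack of respect for the power of Python all directly stand in the way of that goal.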