mergesort_Mergesort算法的功能方法
mergesort
by Joe Chasinga
通過喬·查辛加(Joe Chasinga)
Mergesort算法的功能方法 (A functional approach to mergesort algorithm)
Algorithms are often difficult for people to understand. I believe that this is because they are most often programmed or explained in a language that encourages thinking in procedures or instructions which are not intuitive.
人們通常很難理解算法。 我相信這是因為它們通常是用一種語言來編程或解釋的,這種語言會鼓勵人們思考不直觀的程序或指令。
Very often the meat of an algorithm (how you solve a particular problem logically without computer coding) looks very simple and understandable when described graphically. Surprisingly, however, it often does not translate well into code written in languages like Python, Java, or C++. Therefore it becomes much more difficult to understand.
當(dāng)以圖形方式進(jìn)行描述時,算法的精髓(在沒有計算機(jī)編碼的情況下如何以邏輯方式解決特定問題的方法)通常看起來非常簡單易懂。 但是令人驚訝的是,它通常不能很好地轉(zhuǎn)換為用Python,Java或C ++等語言編寫的代碼。 因此,變得更加難以理解。
In other words, the algorithmic concept doesn’t map directly to how the code should be written and read.
換句話說,算法概念并不直接映射到應(yīng)如何編寫和讀取代碼 。
為什么算法這么難編碼? (Why are algorithms so difficult to code?)
Well, we could blame it on the inner workings of early electro-mechanic computers. The early inventors of some of the most used programming languages today could never get rid of those features. Or perhaps they couldn’t help leaving their fingerprints on their inventions. Once you understand computers that well, there’s no undoing that.
好吧,我們可以將其歸咎于早期機(jī)電計算機(jī)的內(nèi)部運(yùn)作。 當(dāng)今一些最常用的編程語言的早期發(fā)明者永遠(yuǎn)都不會擺脫這些功能。 也許他們不由自主地留下了自己的指紋。 一旦您對計算機(jī)有足夠的了解,就無法撤消它。
To make matters worse, on top of already micro-managing languages, somebody had to invent an API for better micro-management. They called it object-oriented programming (OOP), and added the concept of classes to programming — but I think modules and functions could handle the same things just fine, thank you very much.
更糟糕的是,除了已經(jīng)存在的微管理語言之外,還必須有人發(fā)明一個API來更好地進(jìn)行微管理。 他們稱其為面向?qū)ο缶幊?OOP),并在編程中添加了類的概念-但我認(rèn)為模塊和函數(shù)可以很好地處理相同的事情,非常感謝。
C++ didn’t make C any better, but it did pave a way by inspiring more descendants of OOP. And all together, all these things make abstract algorithmic thinking hard for the aforementioned reasons.
C ++并沒有使C變得更好,但它確實通過啟發(fā)更多的OOP子孫鋪平了道路。 綜上所述,由于上述原因,所有這些事情使得抽象算法的思考變得困難。
案例研究:合并排序 (The case study: merge sort)
For our discussion, we will use a merge sort algorithm as a specimen. Take a look at the diagram below. If you can count and put together jigsaw puzzles, then you can probably understand how it works in a few minutes.
對于我們的討論,我們將使用合并排序算法作為樣本。 看一下下圖。 如果您可以數(shù)數(shù)并把拼圖拼在一起,那么您可能會在幾分鐘內(nèi)了解它的工作原理。
The key steps of producing a merge sort are few and simple. In fact, I can explain it using my daughter’s number blocks (helpful to follow these by going back to the animated diagram for reference):
產(chǎn)生合并排序的關(guān)鍵步驟很少而且很簡單。 實際上,我可以使用女兒的數(shù)字塊來解釋它(可以通過返回動畫圖作為參考來幫助遵循這些規(guī)則):
- First, we need to keep subdividing a list of numbers (or letters, or any type of sortable values) by half until we end up with many single-element lists. A list with one element is technically sorted. This is called trivially sorted. 首先,我們需要將數(shù)字列表(或字母或任何類型的可排序值)細(xì)分為一半,直到得到許多單元素列表為止。 具有一個元素的列表在技術(shù)上進(jìn)行了排序。 這稱為瑣碎排序。
- Then, we create a new empty list in which we could start re-arranging the elements and putting them one by one according to which one is less/smaller than the other. 然后,我們創(chuàng)建一個新的空列表,在該列表中,我們可以開始重新排列元素,并根據(jù)其中一個元素的大小小于另一個元素將它們一一放置。
- Then we need to “merge” each pair of lists back together, effectively reversing the subdivision steps. But this time, at every step of the way, we have to make sure that the smaller element in the pair in question is being put into the empty list first. 然后,我們需要將每對列表“合并”在一起,以有效地逆轉(zhuǎn)細(xì)分步驟。 但是這一次,我們必須確保將有關(guān)對中的較小元素首先放入空列表中。
For the sake of the argument, we will try to map out the above processes by making each one a subroutine (function in normal speak). The meatiest part of this algorithm is the merging, so let’s start with that first.
出于爭論的目的,我們將通過使每個子程序成為一個子例程(通常來說是函數(shù))來嘗試映射上述過程。 該算法最重要的部分是合并,因此讓我們首先開始。
def merge(a, b): out = []while (len(a) > 0 and len(b) > 0): if (a[0] <= b[0]): out.append(a[0]) del a[0] else: out.append(b[0]) del b[0]while (len(a) > 0): out.append(a[0]) del a[0] while (len(b) > 0): out.append(b[0]) del b[0]return outGo on and spend some time looking it over. You might notice that with imperative Python code, it is designed to be spoken out and then understood. It is very understandable in English, but not in logic.
繼續(xù)并花一些時間查看一下。 您可能會注意到,使用命令式Python代碼,可以說出來然后理解它。 用英語很容易理解,但是在邏輯上卻不是。
我們的第一次嘗試 (Our first attempt)
Here is one attempt (that you could possibly use in a whiteboarding session):
這是一種嘗試(您可以在白板會話中使用):
To merge list a and b, we’ll have to first create an empty list named out for clarity (because in Python we can’t be sure it will really be “out” in the end). Then, as long as (or while) both lists are not empty, we’ll keep putting the head of both lists to a fight-off. Whichever is less than or equal to the opponent wins and gets to enter out first. The loser will have to stay and wait there for the new contestant down the line. The rematches continue on until the first while loop breaks.
要合并列表a和b ,我們必須先創(chuàng)建一個名為空單out的透明度(因為在Python我們不能肯定這將真正成為“走出去”到底)。 然后,只要(或同時)兩個列表都不為空,我們將繼續(xù)將兩個列表的開頭進(jìn)行辯論。 無論是小于或等于給對手獲勝,并得到進(jìn)入out第一位。 失敗者將不得不留下來,在那里等待新的參賽者。 重新比賽繼續(xù)進(jìn)行,直到第一個while循環(huán)中斷。
Now, at some point either a or b will be empty, leaving the other with one or more elements hanging. Without any contestants left in the other list, the two while loops make sure to fast track those poor elements into out so both list are exhausted. Then, when that’s all done, we return out.
現(xiàn)在,在某個時間點(diǎn)a或b將為空,而另一個則懸空一個或多個元素。 在其他列表中沒有任何競爭者的情況下,兩個while循環(huán)可確保快速將那些不良元素快速找out從而使兩個列表都用盡。 然后,當(dāng)這一切都完成后,我們返回out 。
And this is the test cases for merge:
這是合并的測試用例:
assert(merge([1], [2]) == [1, 2])assert(merge([2], [1]) == [1, 2])assert(merge([4, 1], [3, 0, 2]) == [3, 0, 2, 4, 1])I hope at this point it is clear to you why we end up with the result in the last case. If it isn’t, try drawing on a whiteboard or a piece of paper and simulating the explanation.
我希望在這一點(diǎn)上您很清楚為什么我們在最后一種情況下會得到結(jié)果。 如果不是,請嘗試在白板或紙上繪圖并模擬說明。
分而治之 (Divide and Conquer)
Now we will carry on with the subdivision part. This process is also known as partitioning or, in somewhat grander language, Divide and Conquer (by the way, the definition in politics is equally interesting).
現(xiàn)在,我們將繼續(xù)進(jìn)行細(xì)分部分。 這個過程也被稱為分區(qū),或者用某種更為宏大的語言來說,稱為分而治之 (順便說一句, 政治中的定義同樣有趣 )。
Basically, if anything is hard to conquer or understand, you should break it down until it becomes smaller and more easily understood. Do that until the parts are unbreakable and repeat the process with the rest.
基本上,如果很難克服或理解任何內(nèi)容,則應(yīng)分解它,直到變得更小且更容易理解為止。 這樣做直到零件牢不可破,然后對其余零件重復(fù)該過程。
def half(arr): mid = len(arr) / 2 return arr[:mid], arr[mid:]What the half routine does is find the middle index, slice the input list into roughly equal sublists, and return both as a pair. It only needs to do this once, since the parent function will eventually call it recursively.
half例程的工作是找到中間索引,將輸入列表切成大致相等的子列表,然后將兩個都成對返回。 它只需要執(zhí)行一次,因為父函數(shù)最終將遞歸調(diào)用它。
Since we have the pieces, now we just need to put them together into a coherent scheme. This is where the water gets murky, because recursions are involved.
既然有了這些片段,那么現(xiàn)在我們只需要將它們放到一個連貫的方案中即可。 這是水變得暗淡的地方,因為涉及到遞歸。
Before going into more code, let me explain why recursions and imperative programming languages like Python do not fit together so well. I won’t go into the topic of optimization, because that does not concern today’s discussion and is not as interesting.
在討論更多代碼之前,讓我解釋一下為什么遞歸和命令式編程語言(如Python)不能很好地融合在一起。 我不會討論優(yōu)化的主題,因為它與今天的討論無關(guān),也不那么有趣。
One distinct irony here is that, even in a language with iterative looping like Python, it still cannot entirely avoid recursion (it might get away without recursion, but I’m sure that would make it even more bizarre). Recursion is a territory where iterative constructs, such as for and while loops, become utterly useless.
這里有一個明顯的諷刺意味是,即使在像Python這樣的具有迭代循環(huán)的語言中,它仍然不能完全避免遞歸(如果沒有遞歸,它可能會消失,但是我敢肯定,這會使它變得更加離奇)。 遞歸是迭代構(gòu)造(例如for和while循環(huán))變得完全無用的領(lǐng)域。
Moreover, recursion is not natural in Python. It does not feel natural and transparent, but rather feels quite half-baked the way its lambda is. An attempt at voicing over recursions in Python would be like, “then we do this recursively and just pray it all works out and hits the base case in the end so it doesn’t spiral into the infinite darkness of stack overflow.” Wow, right?
而且,遞歸在Python中并不自然。 它感覺不到自然和透明,而是感覺像其lambda一樣半生半熟。 在Python中為遞歸發(fā)聲的嘗試就像是,“然后我們遞歸地執(zhí)行此操作,然后祈禱一切都解決了,并最終擊中了基礎(chǔ)情況,這樣它就不會陷入無限的堆棧溢出黑暗之中。” 哇對吧
So here is the mergesort function:
所以這是mergesort函數(shù):
def mergesort(arr):if (len(arr) <= 1): return arrleft, right = half(arr) L = mergesort(left) R = mergesort(right)return merge(L, R)Apparently, this is really clean and easy to read. But it isn’t clear what happens after the call to half , at least semantically. Because of this “non-nativity” to recursion, recursive calls are very opaque and obstructive to educational endeavors.
顯然,這確實很干凈而且易于閱讀。 但是,至少在語義上,調(diào)用half之后會發(fā)生什么尚不清楚。 由于對遞歸的這種“非本土化”,因此遞歸調(diào)用是非常不透明的,并且阻礙了教育工作。
The only way to visualize this mergesort process is probably to track the changes in the sublists in every step:
可視化此mergesort過程的唯一方法可能是在每個步驟中跟蹤子列表中的更改:
input: [0, 3, 1, 3, 2, 6, 5]A alias for mergesort / halfB alias for merge## subdividing by half ...A([0, 3, 1, 3, 2, 6, 5]) A([0, 3, 1]) A([3, 2, 6, 5]) A([0]) A([3, 1]) A([3, 2]) A([6, 5]) A([]) A([0]) A([3]) A([1]) A([3]) A([2]) A([6]) A([5])## base case reached, start merging ... B([], [0]) B([3], [1]) B([3], [2]) B([6], [5]) B([0], [1, 3]) B([2, 3], [5, 6]) B([0, 1, 3], [2, 3, 5, 6]) B([0, 1, 2, 3, 3, 5, 6])output: [0, 1, 2, 3, 3, 5, 6]On an asymptotic side note, dividing and conquering almost always incurs a logarithmic runtime. When you keep dividing a collection into N sub-collections, whether it contains 10 or 100,000,000 items, the number of steps taken in the latter case increases at the rate of log base N.
從漸近的角度來看,劃分和征服幾乎總是會導(dǎo)致對數(shù)運(yùn)行時間。 當(dāng)您繼續(xù)將一個集合劃分為N個子集合(無論它包含10還是100,000,000個項目)時,在后一種情況下采取的步驟數(shù)以對數(shù)N的速率增加。
For instance, it takes about 3 steps to keep dividing 10 by 2 until it gets as close to 1 as it can (or exactly 3 steps to reach 1 in integer division). But it takes only about 26 steps to do the same and divide 100,000,000 by 2 until you reach 1. This fact can be expressed as follows:
例如,保持10除以2大約需要3個步驟,直到它盡可能接近1(或者整數(shù)除以1恰好需要3個步驟)。 但是,僅需大約26個步驟即可完成,將100,000,000除以2,直到達(dá)到1。這一事實可以表示為:
2^3.321928 = 102^6.643856 = 100...2^26.575425 = 100000000orlog base 2 of 100000000 = 26.575425The takeaway here is that we had to visualize the recursive processes in order to understand the inner workings of the algorithm — even though it looked so trivial in the animated diagram.
這里的要點(diǎn)是,我們必須可視化遞歸過程,以了解算法的內(nèi)部工作原理,即使它在動畫圖中看起來微不足道。
Why is there a divide between the conceptual processes of the algorithm itself and the code that instructs the computer to compute such processes?
為什么算法本身的概念過程與指示計算機(jī)計算此類過程的代碼之間存在鴻溝?
It’s because in a way, by using imperative languages, we are in fact still mentally enslaved by the machines.
這是因為在某種程度上,通過使用命令式語言,我們實際上仍然在精神上被機(jī)器奴役。
深入研究代碼 (Diving deeper into the code)
“There’s a difference between knowing the path and walking the path.”
“知道路徑和走路徑是有區(qū)別的。”
― Morpheus, The Matrix
― Morpheus,矩陣
Programming is hard, we all know that. And understanding programming in a really deep way is even harder on your soul (and your career). But I would argue that, like Morpheus said, sometimes walking the path is all that matters. Being able to see clearly is one of most rewarding things in programming.
編程很難,我們都知道。 而且,以深刻的方式理解編程對您的靈魂(以及您的職業(yè)生涯)更加困難。 但是我想像莫非斯所說,有時走這條路很重要。 能夠清楚地看到是編程中最有意義的事情之一。
In functional programming, the programmer (you) gets the front seat in seeing how data change recursively. This means that you have the ability to decide how the data of a certain form should be transformed to the data of another based on the snapshot of how it looks. This isn’t unlike how we have visualized the mergesort process. Let me give you a preview.
在函數(shù)式編程中,程序員(您)在看待數(shù)據(jù)如何遞歸更改方面處于首位。 這意味著您可以根據(jù)其外觀快照決定將某種形式的數(shù)據(jù)轉(zhuǎn)換為另一種形式的數(shù)據(jù)的能力。 這與我們可視化mergesort過程的方式?jīng)]有什么不同。 讓我給你預(yù)覽。
Let’s say you want to create a base case in Python. In it, you want to return the list in question when it has only one element, and an empty list when there’s two elements. So you’d need to write something like this:
假設(shè)您要在Python中創(chuàng)建一個基本案例。 在其中,您要在有一個元素的情況下返回有問題的列表,而在有兩個元素的情況下返回一個空列表。 因此,您需要編寫如下內(nèi)容:
if (len(arr) == 1): return arrelif (len(arr) == 2): return []Or to make this worse but more interesting, you could try to access the first element by index 0 and the second element by index 1 and get ready to handle IndexError exception.
為了使這種情況變得更糟但更有趣,您可以嘗試通過索引0訪問第一個元素,并通過索引1訪問第二個元素,并準(zhǔn)備處理IndexError異常。
In a functional language like Erlang — which is what I’ll be using in this article for its dynamic type system like Python — you more or less would do something like this:
在像Erlang這樣的功能語言中-這就是我將在本文中為其動態(tài)類型系統(tǒng)(如Python)使用的語言-您或多或少會執(zhí)行以下操作:
case Arr of [_] -> Arr; [_,_] -> []end.This gives you a clearer view of the state of the data. Once it’s trained enough, it requires much less cognitive power to read and comprehend what the code does than len(arr) . Just keep in mind: a programmer who doesn’t speak English might ask, “what is len?” Then you get distracted by the literal meaning of the function instead of the value of that expression.
這使您可以更清楚地了解數(shù)據(jù)狀態(tài)。 一旦經(jīng)過足夠的培訓(xùn),與len(arr)相比,它需要更少的認(rèn)知能力來閱讀和理解代碼的作用。 請記住:不說英語的程序員可能會問:“倫是什么?” 然后,您會因函數(shù)的字面意思而不是該表達(dá)式的值而分心。
However, this comes with a price: you don’t have the luxury of a looping construct. A language like Erlang is recursion-native. Almost every meaningful Erlang program will make use of rigorous recursive function calls. And that’s why it is mapped more closely to the algorithmic concepts which usually consist of recursion.
但是,這是有代價的:您沒有循環(huán)構(gòu)造的奢侈。 像Erlang這樣的語言是遞歸本機(jī)的。 幾乎每個有意義的Erlang程序都將使用嚴(yán)格的遞歸函數(shù)調(diào)用。 這就是為什么將它更緊密地映射到通常由遞歸組成的算法概念的原因。
Let’s try to retrace our steps in producing mergesort, but this time in Erlang, starting with the merge function.
讓我們嘗試追溯生成合并排序的步驟,但是這次是在Erlang中,從merge功能開始。
merge([], [], Acc) -> Acc;merge([], [H | T], Acc) -> [H | merge([], T, Acc)];merge([H | T], [], Acc) -> [H | merge(T, [], Acc)];merge([Ha | Ta], [Hb | Tb], Acc) -> case Ha =< Hb of true -> [Ha | merge(Ta, [Hb | Tb], Acc)]; false -> [Hb | merge([Ha | Ta], Tb, Acc)] end.What an abomination! Definitely not an improvement in terms of readability, you think. Yes, Erlang admittedly won’t win any prizes for beautiful language. In fact, many functional languages can look like gibberish to the untrained eyes.
真是可惡! 您認(rèn)為絕對不是可讀性上的改進(jìn)。 是的,Erlang不會因為美麗的語言而贏得任何獎項。 實際上,許多功能語言對于未經(jīng)訓(xùn)練的人來說看起來像胡言亂語。
But let’s give it a chance. We will go through each step like we did before, and perhaps in the end some of us will see the light. But before we go on, for those of you who are not familiar with Erlang, these are some points worth noting:
但是,讓我們有機(jī)會。 我們將像以前一樣經(jīng)歷每個步驟,也許最終我們中的某些人會看到曙光。 但是在我們繼續(xù)之前,對于那些不熟悉Erlang的人來說,以下幾點(diǎn)值得注意:
Each block of merge is considered a function clause of the same function. They are separated by ;. When an expression ends in Erlang, it ends with a period (.). It’s a convention to separate a function into several clauses for different cases. For instance, merge([], [], Acc) -> Acc; clause maps the case where the first two arguments are empty lists to the value of the last argument.
每個merge塊都被視為同一函數(shù)的一個函數(shù)子句。 他們被分開; 。 當(dāng)表達(dá)式以Erlang結(jié)尾時,它以句點(diǎn)( . )結(jié)尾。 按照慣例,將函數(shù)分成幾個子句以適應(yīng)不同的情況。 例如, merge([], [], Acc) -> A cc; 子句將前兩個參數(shù)為空列表的情況映射到最后一個參數(shù)的值。
Arity plays an important role in Erlang. Two functions with the same name and arity are considered the same function. Otherwise, they aren’t. For example, merge/1 and merge/3 (how functions and their arity are addressed in Erlang) are two different functions.
Arity在Erlang中扮演重要角色。 具有相同名稱和別名的兩個功能被視為相同功能。 否則,事實并非如此。 例如, merge/1和merge/3 (如何在Erlang中解決功能及其Arity)是兩個不同的功能。
Erlang uses rigorous pattern matching (This is used in many other functional languages, but especially in Erlang). Since values in pure functional languages are immutable, it is safe to bind variables in a similar shape of data to the existing one with a matched shape. Here is a trivial example:
Erlang使用嚴(yán)格的模式匹配 (在許多其他功能語言中使用,尤其是在Erlang中)。 由于純函數(shù)語言中的值是不可變的,因此將具有相似數(shù)據(jù)形狀的變量綁定到具有匹配形狀的現(xiàn)有變量是安全的。 這是一個簡單的示例:
Note that we will seldom talk about returning values when we work with Erlang functions, because they don’t really “return” anything per se. It maps an input value to a new value. This isn’t the same as outputting or returning it from the function. The function application itself is the value. For instance, if Add(N1, N2) -> N1+N2., Add(1, 2) is 3. There’s no way for it to return anything other than 3, hence we can say it is 3. This is why you could easily do add_one = add(1) and then add_one(2) is 3, add_one(5) is 6, and so on.
請注意,在使用Erlang函數(shù)時,我們很少談?wù)摲祷刂?#xff0c;因為它們實際上并不真正“返回”任何東西。 它將輸入值映射到新值。 這與從函數(shù)輸出或返回它不同。 功能應(yīng)用程序本身就是價值。 例如,如果Add(N1, N2) -> N1+ N2 ., Add(1, 1,2)為3。除了3之外,它無法返回其他任何東西,因此我們可以說它是3。這就是為什么您可以輕松地do add_one = add (1),并且en add_one (2)為3, add_one (5)為6,依此類推。
For those who are interested, see referential transparency. To make this point clearer and risking redundancy, here is something to think about:
對于那些感興趣的人,請參閱參考透明 。 為了使這一點(diǎn)更加清楚并冒著冗余的風(fēng)險,請考慮以下幾點(diǎn):
when f(x) is a function with one arity, and the mapping is f(x) ->; x , then it's conclusive that f(1) -> 1, f(2) -> 2, f(3.1416) -> 3.1416, and f("foo") -> "foo".
當(dāng)f(x)是具有一個Arity的函數(shù),并且映射為f(x) -> ; x,則它是at f(1) - &g t; 1, f(2的定論th t; 1, f(2 t; 1, f(2 ) -> 2, f(3.1416) -> 3.1416, and f("fo o”)->“ foo”。
This may look like a no-brainer, but in an impure function there's no such guaranteed mapping:這看起來很容易,但是在一個不純函數(shù)中,沒有這樣保證的映射:a = 1
a = 1
a = 1def add_to_a(b):
a = 1 def add_to_a(b):
a = 1def add_to_a(b): return b + a
a = 1 def add_to_a(b): return b + a
Now a might as well be anything before add_to_a gets called. Thus in Python, you could write a pure version of the above as:
現(xiàn)在, a之前很可能會成為什么add_to_a被調(diào)用。 因此,在Python中,您可以將上述代碼的純文本編寫為:
def add(a, b):
def add(a, b):
def add(a, b): return a + b
def add(a, b): return a + b
or lambda a, b: a + b .
或lambda a, b: a + b 。
Now it’s time to bumble into the unknown.
現(xiàn)在是時候進(jìn)入未知世界了。
與Erlang一起前進(jìn) (Forging ahead with Erlang)
merge([], [], Acc) -> Acc;The first clause of the merge/3 function means that when the first two arguments are empty lists, map the entire expression to (or “return”) the third argument Acc.
merge/3函數(shù)的第一子句意味著,當(dāng)前兩個參數(shù)為空列表時,將整個表達(dá)式映射到(或“返回”)第三個參數(shù)Acc 。
Interestingly, in a pure function, there’s no way of retaining and mutating state outside of itself. We can only work with what we have received as inputs into the function, transform it, then feed the new state into another function’s argument (most often this is another recursive call to itself).
有趣的是,在一個純函數(shù)中,沒有辦法在其外部保持和改變狀態(tài)。 我們只能使用我們作為函數(shù)輸入收到的東西,對其進(jìn)行轉(zhuǎn)換,然后將新狀態(tài)饋入另一個函數(shù)的參數(shù)中(通常這是對自身的另一個遞歸調(diào)用)。
Here, Acc stands for accumulator, which you can think of as a state container. In the case of merge/3, Acc is a list that starts empty. But as the recursive calls get on, it accumulates values from the first two lists using the logic we program (which we will talk about next).
在這里, Acc代表累加器,您可以將其視為狀態(tài)容器。 在merge/3的情況下, Acc是一個以空開頭的列表。 但是隨著遞歸調(diào)用的進(jìn)行,它使用我們編程的邏輯(將在下面討論)從前兩個列表中累積值。
This process of exhausting a value to build up another value is collectively known as reduction. Therefore, in this case it we can conclude that since the first two lists are exhausted (empty), Acc must be ripe for pick up.
用盡一個值來建立另一個值的過程統(tǒng)稱為減少。 因此,在這種情況下,我們可以得出結(jié)論,由于前兩個列表已用盡(為空),因此Acc必須可以使用。
merge([], [H | T], Acc) -> [H | merge([], T, Acc)];The second clause matches the case when the first list is already empty, but there’s still at least one more element in the second list. [H | T] means a list has a head element H which cons onto another list T. In Erlang, a list is a linked list, and the head has a pointer to the rest of the list. So a list of [1, 2, 3, 4] can be thought of as:
第二個子句與第一個列表已經(jīng)為空的情況匹配,但是第二個列表中至少還有一個元素。 [H | T] [H | T]表示一個列表的頭元素H 約束在另一個列表 T 。 在Erlang中,列表是一個鏈接列表,并且頭部具有指向列表其余部分的指針。 因此, [1, 2, 3, 4]可以認(rèn)為是:
%% match A, B, C, and D to 1, 2, 3, and 4, respectively[A | [B | [C | [D | []]]]] = [1, 2, 3, 4].In this case, as you can see in the conning example, T can just be an empty tail list. So in this second case, we map it to a value of a new list in which the H element of the second list is conned onto the recursive result of calling merge/3 when T is the second argument.
在這種情況下,如您在精簡示例中所看到的, T只能是一個空的尾列表。 因此,在第二種情況下,我們將其映射到新列表的值,其中,當(dāng)T是第二個參數(shù)時,第二個列表的H元素被限制在調(diào)用merge/3的遞歸結(jié)果上。
merge([H | T], [], Acc) -> [H | merge(T, [], Acc)];The third case is just a flip side of the second case. It matches the case when the first list is not empty, but the second is. This clause maps to a value in a similar pattern, except it calls merge/3 with the tail of the first list as the first argument and keeps the second list empty.
第三種情況只是第二種情況的反面。 它匹配第一個列表不為空但第二個列表為空的情況。 該子句以相似的模式映射到一個值,除了它以第一個列表的尾部作為第一個參數(shù)調(diào)用merge/3并保持第二個列表為空。
merge([Ha | Ta], [Hb | Tb], Acc) -> case Ha =< Hb of true -> [Ha | merge(Ta, [Hb | Tb], Acc)]; false -> [Hb | merge([Ha | Ta], Tb, Acc)] end.Let’s begin with the meat of merge/3 first. This clause matches the case when the first and second arguments are non-empty lists. In this case, we enter a case … of clause (equivalent to switch case in other languages) to test if the head element of the first list (Ha) is less than or equal to the head element of the second list (Hb).
讓我們先從merge/3開始。 此子句與第一個和第二個自變量為非空列表時的情況匹配。 在這種情況下,我們輸入子句的case … of (相當(dāng)于其他語言的switch case),以測試第一個列表( Ha )的頭元素是否小于或等于第二個列表( Hb )的頭元素。
If that is true, we con Ha onto the resulting list of the next recursive call to merge with the tail list of the previous first list (Ta) as the new first argument. We keep the second and third arguments the same.
如果是這樣,我們將Ha在下一個遞歸調(diào)用的結(jié)果列表上,并與先前的第一個列表( Ta )的尾部列表合并為新的第一個參數(shù)。 我們保持第二個和第三個參數(shù)相同。
These clauses constitute to a single function, merge/3. You can imagine that it could have been a single function clause. We could use complex case … of and/or if conditional plus pattern-matching to weed out each case and map it to the right result. That would have made it more chaotic, when you can easily read each case the function is matching quite nicely on separate lines.
這些子句構(gòu)成單個功能merge/3 。 您可以想象它可能是單個函數(shù)子句。 我們可以使用...和/或條件匹配和模式匹配的復(fù)雜案例來剔除每種情況,并將其映射到正確的結(jié)果。 當(dāng)您可以輕松閱讀每種情況時,該函數(shù)在單獨(dú)的行上可以很好地匹配,這會使它變得更加混亂。
However, things got a little hairy for the subdividing operation, which needs two functions: half/1 and half/3.
但是,細(xì)分操作有些麻煩,它需要兩個功能: half/1和half/3 。
half([]) -> {[], []};half([X]) -> {[X], []};half([X,Y]) -> {[X], [Y]};half(L) -> Len = length(L), half(L, {0, Len}, {[], []}).half([], _, {Acc1, Acc2}) -> {lists:reverse(Acc1), lists:reverse(Acc2)};half([X], _, {Acc1, Acc2}) -> {lists:reverse(Acc1), lists:reverse([X | Acc2])};half([H|T], {Cnt, Len}, {Acc1, Acc2}) -> case Cnt >= (Len div 2) of true -> half(T, {Cnt + 1, Len}, {Acc1, [H|Acc2]}); false -> half(T, {Cnt + 1, Len}, {[H|Acc1], Acc2}) end.This is where you’ll miss Python and its destructive nature. In a pure functional language, lists are linked lists. When you work with them, there’s no looking back. There’s no logic that says “I want to divide a list in half, so I’m going to get the middle index, and slice it into two left and right portions.”
這是您會想念Python及其破壞性的地方。 在純功能語言中,列表是鏈接列表。 當(dāng)您與他們一起工作時,便不會回頭。 有沒有邏輯,說:“我要除以2的列表,所以我會得到中間指標(biāo),并切片成左右兩個部分。”
If your mind is set in working with linked lists, you’re more along the lines of “I can only go forward through the list, working with a few elements at a time. I need to create two empty lists and keep count of how many items I’ve retrieve from the source list and put into the first one so I know when it’s time to switch to another bucket. All the aforementioned needs to be passed in as arguments in the recursive calls.” Whew!
如果您決定使用鏈接列表,那么您將更像“我只能遍歷列表,一次處理幾個元素”。 我需要創(chuàng)建兩個空列表,并統(tǒng)計從源列表中檢索到的第一個條目的數(shù)量,以便我知道何時該切換到另一個存儲桶。 所有上述所有內(nèi)容都需要在遞歸調(diào)用中作為參數(shù)傳遞。” ew!
In other words, cutting a list in half can be compared to chopping a block of cheese with a knife in the middle of it. On the other hand, a functional comparison for doing so is like pouring coffee into two cups equally — you just need to know when it’s time to stop pouring into the first one and move on to the second one.
換句話說,將清單切成兩半可以比作中間用刀切成塊的干酪。 另一方面,這樣做的功能比較就像將咖啡均勻地倒入兩杯中一樣-您只需要知道何時該停止倒入第一個杯子并轉(zhuǎn)到第二個杯子。
The half/1 function, although it isn’t really necessary, is there for convenience.
half/1函數(shù)雖然不是真正必需的,但它還是為了方便起見。
half([]) -> {[], []};half([X]) -> {[X], []};half([X,Y]) -> {[X], [Y]};half(L) -> Len = length(L), half(L, {0, Len}, {[], []}).By now, you should get the sense of what each Erlang function clause is doing. The new bracket pairs here represent tuples in Erlang. Yes, we are returning a left and right value pair, like in the Python version. The half/1 function is here to handle simple, explicit base cases which don’t warrant the worthiness of passing in other arguments.
到目前為止,您應(yīng)該了解每個Erlang函數(shù)子句在做什么。 這里的新括號對表示Erlang中的元組。 是的,我們返回的是左值和右值對,就像在Python版本中一樣。 half/1函數(shù)在這里用于處理簡單的顯式基本情況,這些情況不保證傳遞其他參數(shù)的價值。
However, take note of the last case when the argument has a list with more than two elements. (Note: those with less than or equal to two elements are already handled by the first three clauses.) It simply computes the following:
但是,請注意當(dāng)參數(shù)的列表包含兩個以上元素時的最后一種情況。 (注意:少于或等于兩個元素的元素已經(jīng)由前三個子句處理。)它僅計算以下內(nèi)容:
the length of the list L and calls half/3 with L as the first argument
列表L的長度,并以L作為第一個參數(shù)調(diào)用half/3
- a pair of counter variables and list’s length, which will be used to signal the switching from list one to list two 一對計數(shù)器變量和列表的長度,用于指示從列表一切換到列表二
and of course, a pair of empty lists to fill the elements from L in.
當(dāng)然還有一對空列表來填充L in中的元素。
half/3 looks like a mess, but only to the untrained eyes. The first clause matches a pattern when the source list is drained. In this case, the second pair of counter and length won’t matter since it’s already the end. We simply know that Acc1 and Acc2 are ripe for yielding. But wait, what’s with the reversing of both?
half/3看起來像是一團(tuán)糟,但僅限于未經(jīng)訓(xùn)練的眼睛。 當(dāng)源列表耗盡時,第一個子句與模式匹配。 在這種情況下,第二對計數(shù)器和長度無關(guān)緊要,因為它已經(jīng)結(jié)束了。 我們僅知道Acc1和Acc2已經(jīng)成熟。 但是,等等,兩者的反轉(zhuǎn)又如何呢?
Appending an element to a linked list is a very slow operation. It runs O(N) times for every append, because it needs to create a new list, copy the existing one onto it, and create a pointer to the new element and assign it to the last element. It’s like redoing the whole list. Couple this with recursions and you are bound for disaster.
將元素附加到鏈表是很慢的操作。 它需要為每個追加運(yùn)行O(N)次,因為它需要創(chuàng)建一個新列表,將現(xiàn)有列表復(fù)制到該列表上,并創(chuàng)建一個指向新元素的指針并將其分配給最后一個元素。 就像重做整個列表一樣。 再加上遞歸,您將注定要遭受災(zāi)難。
The only good way to add something to a linked list is to prepend it at the head. Then all it needs to do is create a memory for that new value and give it a reference to the head of the linked list. A simple O(1) operation. So even though we could concatenate lists using ++ like [1, 2, 3] ++ [4], we rarely want to do it this way, especially with recursions.
向鏈接列表添加內(nèi)容的唯一好方法是將其放在開頭。 然后,它要做的就是為該新值創(chuàng)建一個內(nèi)存,并為其提供對鏈接列表開頭的引用。 一個簡單的O(1)操作。 因此,即使我們可以使用[1, 2, 3] ++ [4]類的++來連接列表,我們也很少想這樣做,特別是在遞歸的情況下。
The technique here is to reverse the source list first, then con an element onto it like [4 | [3, 2, 1]] , and reverse them again to get the right result. This may sound terrible, but reversing a list and reversing it back is an O(2N) operation, which is O(N). But in between, conning elements onto the list takes only O(1), so it basically costs no extra runtime.
這里的技術(shù)是先反轉(zhuǎn)源列表,然后將元素像[4 | [3, 2, 1]] [4 | [3, 2, 1]]并再次反轉(zhuǎn)它們以獲得正確的結(jié)果。 這聽起來很糟糕,但是反轉(zhuǎn)列表并將其反轉(zhuǎn)是O(2N)操作,即O(N)。 但是在這兩者之間,將元素精簡到列表僅需要O(1),因此基本上不需要額外的運(yùn)行時間。
half([H|T], {Cnt, Len}, {Acc1, Acc2}) -> case Cnt >= (Len div 2) of true -> half(T, {Cnt + 1, Len}, {Acc1, [H|Acc2]}); false -> half(T, {Cnt + 1, Len}, {[H|Acc1], Acc2}) end.Getting back to half/3. The second clause, the meat of the function, does exactly the same thing as the coffee pouring metaphor we visited earlier. Since the source list is still “emitting” data, we want to keep track of the time we have been pouring values from it into the first coffee cup Acc1.
回到half/3 。 第二個子句,即函數(shù)的實質(zhì),與我們之前訪問過的咖啡澆注隱喻的功能完全相同。 由于源列表仍在“發(fā)出”數(shù)據(jù),因此我們希望跟蹤將值從其中倒入第一個咖啡杯Acc1 。
Remember that in half/1’s last clause, we calculated the length of the original list? That is the Len variable here, and it stays the same throughout all the calls. It’s there so that we can compare Cnt counter to it divided by 2 to see if we have come to the middle of the source list and should switch to filling up Acc2 . That is where the case … of comes in.
還記得在half/1的最后一個子句中,我們計算了原始列表的長度嗎? 這就是這里的Len變量,并且在所有調(diào)用中都保持不變。 在那里,我們可以將Cnt計數(shù)器與其除以2的值進(jìn)行比較,以查看是否已到達(dá)源列表的中間,并且應(yīng)該切換為填充Acc2 。 就是這種case … of 。
Now, let’s put them all together in mergesort/1 . This should be as simple as the Python version, and can be easily compared.
現(xiàn)在,讓我們將它們放到mergesort/1 。 這應(yīng)該與Python版本一樣簡單,并且可以輕松進(jìn)行比較。
mergesort([A]) -> [A];mergesort([A, B]) -> case A =< B of true -> [A,B]; false -> [B,A] end;mergesort(L) -> {Left, Right} = half(L), merge(mergesort(Left), mergesort(Right), []).而已! (That’s it!)
At this point, either you think this is a novel and useful way of thinking about a problem, or you find it just plain confusing. But I hope you got something out of this programming approach that helps shine new light on how we can think about algorithms.
在這一點(diǎn)上,您要么認(rèn)為這是一種解決問題的新穎且有用的方法,要么就會發(fā)現(xiàn)它令人困惑。 但是,我希望您能從這種編程方法中學(xué)到一些東西,這有助于使我們對如何思考算法有新的認(rèn)識。
更新資料 (Update)
The Python implementation of merge function isn’t efficient because in each while loop the first element in the list is removed. Although this is a common pattern in functional languages like Erlang, in Python it is very costly to remove or insert an element anywhere other than the last position because unlike a list in Erlang which is a linked list which is very efficient to remove or add element at the head of the list, Python list behaves like an array which has to reposition all other elements when one is removed or added, incurring a O(n) runtime.
merge功能的Python實現(xiàn)效率不高,因為在每個while循環(huán)中,列表中的第一個元素均被刪除。 盡管在Erlang等功能語言中這是常見的模式,但是在Python中,刪除或插入除最后位置以外的其他位置的元素非常昂貴,因為與Erlang中的列表不同的是,鏈接列表非常有效地刪除或添加元素在列表的頂部,Python列表的行為就像一個數(shù)組,當(dāng)刪除或添加一個元素時,它必須重新定位所有其他元素,從而導(dǎo)致O(n)運(yùn)行時。
The better way is to sacrifice little space to define a counter variable for each list which can be incremented and used to access the current element of the source list without the need to remove the top-most element at all.
更好的方法是犧牲很少的空間為每個列表定義一個計數(shù)器變量,該計數(shù)器變量可以遞增并用于訪問源列表的當(dāng)前元素,而根本不需要刪除最頂層的元素。
def merge(a, b): out = []ai = 0 bi = 0while (ai <= len(a) - 1 and bi <= len(b) - 1): if (a[ai] <= b[bi]): out.append(a[ai]) ai += 1 else: out.append(b[bi]) bi += 1while (ai <= len(a) - 1): out.append(a[ai]) ai += 1while (bi <= len(b) - 1): out.append(b[bi]) bi += 1return out翻譯自: https://www.freecodecamp.org/news/a-functional-approach-to-merge-sort-and-algorithms-in-general-bbc12457eeb0/
mergesort
總結(jié)
以上是生活随笔為你收集整理的mergesort_Mergesort算法的功能方法的全部內(nèi)容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 梦到情人到我家找我啥意思
- 下一篇: 梦到被丧尸追是什么意思