Marketing attribution with Markov
An edited version of this article was first published on ClickZ: Marketer’s guide to data-driven marketing attribution.
Marketing attribution is a way of measuring the value of the campaigns and channels that are reaching your potential customers. The point in time when a potential customer interacts with a campaign is called a touchpoint, and a collection of touchpoints forms a buyer journey. Marketers use the results of an attribution model to understand what touchpoints have the most influence on successful buyer journeys, so that they can make more informed decisions on how to optimise investment in future marketing resources.
Buyer journeys are rarely straightforward and the paths to success can be long and winding. With so many touchpoints to consider it is difficult to distinguish between the true high and low impact interactions, which can result in an inaccurate division of credit and a false representation of marketing performance. This is why choosing the best attribution model for your business is so important.
In this post, I provide some insight into how Cloudera has used Cloudera products to build a custom, data-driven attribution model to measure the performance of our global campaigns.
Limitations of traditional models
All attribution models have their pros and cons, but one drawback the traditional models have in common is that they are rules-based. The user has to decide up front how they want the credit for sales events to be divided between the touchpoints. Traditional models include first-touch, last-touch, and linear-touch attribution.
Luckily there are more sophisticated data-driven approaches that are able to capture the intricacies of buyer journeys by modelling how touchpoints actually interact with buyers, and each other, to influence a desired sales outcome. A data-driven model provides marketers with deeper insight into the importance of campaigns and channels, driving better marketing accountability and efficiency.
Cloudera’s data-driven approach
The first attribution model we evaluated was based on the Shapley value from cooperative game theory. I covered the details of this model in a previous post. This popular (Nobel Prize-winning) model provided much more insight into channel performance than the traditional approaches, but in its most fundamental implementation it didn’t scale to handle the number of touchpoints we wanted to include. The Shapley model performed well on a relatively small number of channels, but our requirement was to perform attribution for all campaigns, which can equate to hundreds of touchpoints along a buyer’s journey.
Before investing time into scaling out the Shapley algorithm, we researched alternative methods and decided to evaluate the use of Markov models to solve the attribution problem. We used the ChannelAttribution R package for the implementation and found that it produced results similar to the Shapley model, scaled to a large number of touchpoints, and was easy to set up and use in Cloudera Data Science Workbench (CDSW).
Markov attribution models
A Markov model is a probabilistic model that represents buyer journeys as a graph, with the graph’s nodes being the touchpoints or “states”, and the graph’s connecting edges being the observed transitions between those states. For example, a buyer watches a product webinar (first state), then browses to LinkedIn (transition), where they click on an ad impression for the same product (second state).
The key ingredients of the model are the transition probabilities (the likelihood of moving between states). The number of times buyers have transitioned between two states is converted into a probability, and the complete graph can be used to measure the importance of each state and the most likely paths to success.
For example, in a sample of buyer journey data we observe that the Webinar touchpoint occurs 8 times, and buyers watched the webinar followed by clicking on the LinkedIn Ad only 3 times, so the transition probability between the two states is 3 / 8 = 0.375 (37.5%). A probability is calculated for every transition to complete the graph.
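As a rough illustration of how counts become transition probabilities, here is a minimal Python sketch. The journey data is invented to match the 3-out-of-8 example; the `Other` state and the function name `transition_probabilities` are assumptions for illustration, not part of the ChannelAttribution package.

```python
from collections import Counter, defaultdict


def transition_probabilities(journeys):
    """Estimate P(next state | current state) from observed journeys."""
    pair_counts = Counter()   # how often each (current, next) transition occurs
    state_counts = Counter()  # how often each state is left for another state
    for journey in journeys:
        for current, nxt in zip(journey, journey[1:]):
            pair_counts[(current, nxt)] += 1
            state_counts[current] += 1
    probs = defaultdict(dict)
    for (current, nxt), n in pair_counts.items():
        probs[current][nxt] = n / state_counts[current]
    return dict(probs)


# Hypothetical sample: the Webinar touchpoint occurs 8 times,
# and 3 of those are followed by a click on the LinkedIn Ad.
journeys = (
    [["Start", "Webinar", "LinkedIn Ad"]] * 3
    + [["Start", "Webinar", "Other"]] * 5
)

probs = transition_probabilities(journeys)
print(probs["Webinar"]["LinkedIn Ad"])  # 0.375
```

Each state's outgoing probabilities sum to 1, which is what lets the graph be treated as a Markov chain.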
Before we get to calculating campaign attribution, the Markov graph can tell us a couple of useful nuggets of information about our buyer journeys. From the example above you can see that the path with the highest probability of success is “Start > Webinar > Campaign Z > Success” with a total probability of 42.5% (1.0 * 0.425 * 1.0).
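The probability of a single path is just the product of the transition probabilities along it. A minimal sketch, using only the three edge values quoted for that path (the helper name `path_probability` is an assumption for illustration):

```python
import math


def path_probability(edge_probabilities):
    """Multiply the transition probabilities along one path."""
    return math.prod(edge_probabilities)


# Start > Webinar > Campaign Z > Success
best_path = path_probability([1.0, 0.425, 1.0])
print(f"{best_path:.1%}")  # 42.5%
```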
The Markov graph can also tell us the overall success rate; that is, the likelihood of a successful buyer journey given the history of all buyer journeys. The success rate is a baseline for overall marketing performance and the yardstick for measuring the effectiveness of any changes. The example Markov graph above has a success rate of 67.5%.
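The success rate can be read off a graph by adding up, over every route out of Start, the probability of ending in Success. The sketch below does this for a small hypothetical graph. The 1.0, 0.425, 0.375 and 1.0 edges come from the examples in this post; the `Fail` state and the remaining edge weights are assumptions chosen so that the result agrees with the 67.5% figure. The recursion only works for graphs without loops; real journey data, where buyers revisit touchpoints, would need a linear-algebra solve instead.

```python
# Hypothetical graph: edge weights not quoted in the post are assumptions.
GRAPH = {
    "Start": {"Webinar": 1.0},
    "Webinar": {"Campaign Z": 0.425, "LinkedIn Ad": 0.375, "Fail": 0.2},
    "Campaign Z": {"Success": 1.0},
    "LinkedIn Ad": {"Success": 2 / 3, "Fail": 1 / 3},
}


def success_rate(graph, state="Start"):
    """Probability of eventually reaching Success from `state` (acyclic graphs only)."""
    if state == "Success":
        return 1.0
    if state not in graph:  # terminal state such as "Fail"
        return 0.0
    return sum(p * success_rate(graph, nxt) for nxt, p in graph[state].items())


rate = success_rate(GRAPH)
print(f"{rate:.1%}")  # 67.5%
```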
Campaign attribution
A Markov graph can be used to measure the importance of each campaign by calculating what is known as the Removal Effect. A campaign’s effectiveness is determined by removing it from the graph and simulating buyer journeys to measure the change in success rate without it in place. Removal Effect is a proxy for weight, and it’s calculated for each campaign in the Markov graph.
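To make the idea concrete, here is a minimal sketch of Removal Effect on a small hypothetical graph. The 1.0, 0.425, 0.375 and 1.0 edges are taken from the examples in this post; the `Fail` state and the other weights are assumptions chosen to reproduce the 67.5% success rate. Note that this sketch computes the success rate exactly by recursion rather than by simulating journeys, and it assumes a graph without loops. Removing a campaign is modelled by treating every visit to it as a failed journey.

```python
# Hypothetical graph: edge weights not quoted in the post are assumptions.
GRAPH = {
    "Start": {"Webinar": 1.0},
    "Webinar": {"Campaign Z": 0.425, "LinkedIn Ad": 0.375, "Fail": 0.2},
    "Campaign Z": {"Success": 1.0},
    "LinkedIn Ad": {"Success": 2 / 3, "Fail": 1 / 3},
}


def success_rate(graph, state="Start", removed=None):
    """Probability of reaching Success, treating `removed` as a dead end."""
    if state == "Success":
        return 1.0
    if state == removed or state not in graph:
        return 0.0
    return sum(
        p * success_rate(graph, nxt, removed) for nxt, p in graph[state].items()
    )


def removal_effects(graph):
    """Fractional drop in success rate when each campaign is removed."""
    baseline = success_rate(graph)
    campaigns = [state for state in graph if state != "Start"]
    return {
        c: 1 - success_rate(graph, removed=c) / baseline for c in campaigns
    }


effects = removal_effects(GRAPH)
for campaign, effect in effects.items():
    print(f"{campaign}: {effect:.3f}")
```

In this toy graph every journey passes through the Webinar, so removing it drops the success rate to zero and its Removal Effect is 1.0.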
Using Removal Effect for marketing attribution is the final piece of the puzzle. To calculate each campaign’s attribution value we can use the following formula:
A = V * (Rt / Rv)
A = Campaign’s attribution value
V = Total value to divide. For example, the total USD value of all successful buyer journeys used as input to the Markov model
Rt = Campaign’s Removal Effect
Rv = Sum of all Removal Effect values
Let’s walk through an example. Say that during the first quarter of the fiscal year the total USD value of all successful buyer journeys is $1M. The same buyer journeys are used to build a Markov model, which calculates the Removal Effect for our Ad campaign to be 0.7 (i.e., the buyer journey success rate dropped by 70% when the Ad campaign was removed from the Markov graph). We know the Removal Effect values for every campaign observed in the input data, and for this example let’s say they sum to 2.8. By plugging the numbers into the formula we calculate the attribution value for our Ad campaign to be $250k:
$250,000 = $1,000,000 * (0.7 / 2.8)
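The same arithmetic can be sketched in a few lines of Python. The function name `attribution_values` is an assumption for illustration, and because the post only gives the Ad campaign’s Removal Effect and the overall sum, the remaining 2.1 is lumped into a single hypothetical entry.

```python
def attribution_values(total_value, removal_effects):
    """Split total_value across campaigns in proportion to Removal Effect: A = V * (Rt / Rv)."""
    total_effect = sum(removal_effects.values())  # Rv
    return {
        campaign: total_value * effect / total_effect
        for campaign, effect in removal_effects.items()
    }


# Rt = 0.7 for the Ad campaign; all other campaigns together make up
# the rest of Rv = 2.8 (hypothetical single entry for illustration).
effects = {"Ad campaign": 0.7, "All other campaigns": 2.1}

values = attribution_values(1_000_000, effects)
print(f"${values['Ad campaign']:,.0f}")  # $250,000
```

Because each campaign receives a share proportional to its Removal Effect, the attribution values always add up to the total value being divided.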
In addition to this, we calculate campaign ROI by subtracting the cost of running a campaign over the same period of time from its attribution value.
What’s nice about the ChannelAttribution R package is that it does all of this for you and even includes implementations of three of the traditional rules-based algorithms for comparison (first-touch, last-touch, and linear-touch). There’s a new Python implementation too.
Cloudera on Cloudera
We’re proud of our data practice at Cloudera. The marketing attribution application was developed by Cloudera’s Marketing and Data Centre of Excellence lines of business. It’s built on our internal Enterprise Data Hub and the Markov models run in Cloudera Data Science Workbench (CDSW).
By leveraging a data-driven attribution model we have eliminated the biases associated with traditional attribution mechanisms. We have been able to understand how various messages influence our potential customers and the variances by geography and revenue type. Now that we have solid and trusted data behind attribution, we’re confident in using the results to inform and drive our marketing mix strategy and investment decisions. And we can rely on the numbers when we partner with sales teams to drive our marketing strategies going forward.
Translated from: https://towardsdatascience.com/multi-channel-marketing-attribution-with-markov-6b744c0b119a