On Trusting the Model
In the world of deep learning, there are certain safety-critical applications where a bare prediction alone is just not very helpful. If a patient without symptoms is diagnosed with a serious and time-sensitive illness, can the doctor trust the model enough to administer immediate treatment? In diagnosis, we are currently in a phase between relying solely on human doctors and full AI autonomy: deep learning models now outperform human experts working independently, but cooperation between human experts and AI is the optimal strategy.
Human experts must gauge the certainty behind the deep learning model’s predictions if they are to provide an additional layer of judgement to the diagnosis. And to gauge the trust we can put in the model, we must be able to measure the types of uncertainty in its predictions.
Modelling Uncertainty
A deep learning model trained on an infinite amount of perfect data for an infinite amount of time would necessarily reach 100% certainty. In the real world, however, we have neither perfect data nor an infinite amount of it, and this is what causes the uncertainty of deep learning models.
We call it aleatoric uncertainty when we have less-than-perfect data: even if we had an infinite amount of it, the model would still not perform perfectly. It is the uncertainty stemming from noisy data.
And when we have high-quality data but still fall short of perfect performance, we are dealing with epistemic uncertainty: the uncertainty due to imperfect parameter values.
A measure of aleatoric uncertainty becomes more important in large-data tasks, since more data explains away epistemic uncertainty. In small datasets, however, epistemic uncertainty proves to be the greater issue, especially in biomedical settings, where we work with a small amount of well-prepared, high-quality data.
Aleatoric uncertainty can be measured by adding a term directly to the loss function, so that the model outputs both a prediction and an estimate of that prediction’s uncertainty. Epistemic uncertainty is trickier, since this uncertainty comes from the model itself: if we measured it the way we measure aleatoric uncertainty, the model would face the impossible task of predicting the imperfection of its own parameters.
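As a concrete illustration of such a loss term, here is a minimal sketch of a heteroscedastic Gaussian negative log-likelihood, one common choice for learning aleatoric uncertainty (this specific formulation is my example, not spelled out above):

```python
import numpy as np

def heteroscedastic_nll(y_true, y_pred, log_var):
    """Per-sample loss when the model outputs both a prediction (y_pred)
    and a log-variance (log_var) capturing aleatoric uncertainty.

    Predicting log(sigma^2) rather than sigma^2 keeps the variance
    positive and the optimization numerically stable."""
    precision = np.exp(-log_var)
    return 0.5 * (precision * (y_true - y_pred) ** 2 + log_var)

# Inflating the predicted variance lets the model "pay" a log_var penalty
# to discount the squared error on noisy data points.
loss_confident = heteroscedastic_nll(1.0, 3.0, log_var=0.0)
loss_hedged = heteroscedastic_nll(1.0, 3.0, log_var=2.0)
print(loss_confident, loss_hedged)
```

On noisy examples the model can lower its loss by admitting high variance, which is exactly the behavior that makes the predicted variance a usable aleatoric-uncertainty estimate.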
For the remainder of this article, I will focus on epistemic rather than aleatoric uncertainty. Both can be measured in a single model, but I find epistemic uncertainty to be far more significant in most biomedical and other safety-critical applications.
Measuring Epistemic Uncertainty
Let’s make a toy example and create some training data. Suppose we have 10 data points: the x-values are evenly spaced, and each y-value is determined by adding some random noise to x.
This is our training data; let’s fit three models (degree-10 polynomials) to it:
Three models, all fitting perfectly on the training data, and all evidently different.

In this toy example, each model fits the data perfectly. This means that for any input identical to one of the training points, the model perfectly predicts its y-value. However, if we take any x-value other than those of the training set, the predictions will be wildly off, depending on which model we use.
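Here is a minimal numpy sketch of one way to obtain three such models (my construction, not necessarily the original fitting procedure): a degree-10 polynomial has 11 coefficients, so fitting it to 10 points is underdetermined, and we can generate distinct perfect-fit models by adding different multiples of the design matrix’s null-space direction to one least-squares solution.

```python
import numpy as np

rng = np.random.default_rng(0)

# 10 evenly spaced x-values; y is x plus some random noise.
x = np.linspace(-1.0, 1.0, 10)
y = x + rng.normal(scale=0.2, size=x.shape)

# A degree-10 polynomial has 11 coefficients, so fitting it to 10 points
# is underdetermined: infinitely many polynomials interpolate the data
# exactly. We construct three of them from one least-squares solution
# plus different multiples of the Vandermonde matrix's null direction.
V = np.vander(x, 11)                        # 10x11 design matrix
base, *_ = np.linalg.lstsq(V, y, rcond=None)
null_dir = np.linalg.svd(V)[2][-1]          # satisfies V @ null_dir ~ 0

models = [base + alpha * null_dir for alpha in (0.0, 4e5, -4e5)]

# Every model reproduces the training data essentially perfectly...
for coeffs in models:
    assert np.allclose(np.polyval(coeffs, x), y, atol=1e-6)

# ...yet they disagree strongly at x-values away from the training points.
preds = [np.polyval(coeffs, 0.25) for coeffs in models]
print(preds)
```

The spread between `preds` at x = 0.25, despite identical behavior on the training set, is precisely the disagreement the next paragraphs exploit.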
This is the intuition behind measuring epistemic uncertainty: we fit three models to the same training data, and we got three evidently different models. If we give each model an input, .25 for example, the first model predicts about 20, the second about -20, and the third around -60. There is a high standard deviation between these predictions, meaning no single model accurately represents that data point.
The epistemic uncertainty can thus be defined as the standard deviation of these three predictions, since wildly different predictions from otherwise similar models suggest that each model is merely guessing at that data point.
We can execute this procedure in practice by training several (usually around 10) different models on the same training data and, at inference time, taking the standard deviation of the models’ predictions as an estimate of the epistemic uncertainty.
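The mechanics of the ensemble procedure can be sketched as follows; here each “trained model” is faked as a linear function with slightly perturbed parameters, a hypothetical stand-in for networks trained from different random initializations:

```python
import numpy as np

# Hypothetical stand-ins for an ensemble: in practice each member would
# be a network trained from a different random initialization on the same
# training data. Here each "model" is a linear function whose parameters
# differ slightly, mimicking the spread between independently trained models.
def make_member(seed):
    r = np.random.default_rng(seed)
    w = r.normal(1.0, 0.3)   # each member ends up with slightly different parameters
    b = r.normal(0.0, 0.3)
    return lambda x: w * x + b

ensemble = [make_member(seed) for seed in range(10)]

def predict_with_uncertainty(x):
    """Mean prediction plus an epistemic-uncertainty estimate
    (the standard deviation of the ensemble members' predictions)."""
    preds = np.array([member(x) for member in ensemble])
    return float(preds.mean()), float(preds.std())

mean_pred, epistemic = predict_with_uncertainty(0.25)
print(mean_pred, epistemic)
```

Only the inference-time aggregation matters here: whatever the members are, the ensemble’s mean is the prediction and the spread is the epistemic-uncertainty estimate.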
However, training 10 different models is computationally expensive and sometimes infeasible for deep learning models trained on giant datasets. Fortunately, there is a simple alternative that uses a single model to estimate epistemic uncertainty: applying dropout at inference time.
Dropout regularization, for those who are unfamiliar, is a technique that literally drops out random neurons of the network on each training batch. It is usually turned off at inference time, but if we turn it on and predict the test example 10 times, we can effectively simulate the approach of using 10 different models to estimate epistemic uncertainty, since each random dropout mask essentially yields a different model.
This approach is called Monte Carlo dropout, and it is currently the standard for estimating epistemic uncertainty. It has some drawbacks, chiefly that it requires nearly ten times more computation at inference time than a standard prediction without an uncertainty measurement. Monte Carlo dropout is therefore impractical for many real-time applications, leading to the use of often less effective but quicker methods of measuring uncertainty.
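A minimal numpy sketch of Monte Carlo dropout (using a hypothetical two-layer network with random, untrained weights, purely to show the mechanics):

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny fixed two-layer network (hypothetical weights, for illustration only).
W1 = rng.normal(size=(1, 64))
W2 = rng.normal(size=(64, 1))

def forward(x, drop_p=0.5, training=True):
    """One stochastic forward pass. Keeping dropout active at inference
    (training=True) is what turns repeated passes into an implicit ensemble."""
    h = np.maximum(0.0, x @ W1)               # ReLU hidden layer
    if training:
        mask = rng.random(h.shape) > drop_p   # randomly drop neurons
        h = h * mask / (1.0 - drop_p)         # inverted-dropout scaling
    return (h @ W2).item()

x = np.array([[0.25]])
# Monte Carlo dropout: T stochastic passes over the same input.
samples = [forward(x, training=True) for _ in range(10)]
mean_pred = float(np.mean(samples))
epistemic = float(np.std(samples))
print(mean_pred, epistemic)
```

Each pass sees a different random mask, so the 10 samples behave like predictions from 10 slightly different models, and their standard deviation serves as the epistemic-uncertainty estimate.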
Caveat
We have measured the epistemic uncertainty, but in reality, we must remain uncertain about that very uncertainty measure. Indeed, we can measure the uncertainty of our measurement by graphing the standard deviation against absolute error. If the aleatoric uncertainty remains negligible, then there should be a linear relation between our epistemic uncertainty measurement and the absolute error between the predictions and the labels of a test set on a regression task.
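This sanity check can be sketched with simulated numbers (all values here are hypothetical, standing in for a real test set’s per-example uncertainties and errors):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical per-example quantities from a regression test set: the
# measured epistemic uncertainty (prediction std) and the corresponding
# absolute error. They are simulated as noisily correlated, which is
# roughly what an imperfect-but-useful uncertainty measure looks like.
uncertainty = rng.uniform(0.1, 2.0, size=200)
abs_error = uncertainty * rng.uniform(0.5, 1.5, size=200)

# A perfect measure (with negligible aleatoric noise) would put these
# points on a straight, positively sloped line, i.e. a correlation near 1;
# a weaker positive correlation means the measure is informative but
# imperfect.
corr = np.corrcoef(uncertainty, abs_error)[0, 1]
print(round(corr, 3))
```

In practice one would scatter-plot `abs_error` against `uncertainty` and inspect how far the points stray from a straight line.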
Estimate of the epistemic uncertainty measurement’s accuracy on a regression task (SAMPL dataset)

A perfect uncertainty measurement would appear as a straight, positively sloped line, so although the uncertainty measurement evidently correlates with absolute error (which is what we want), the measurement is imperfect.
This imperfect measurement is better than nothing, but what we have really done is add an uncertain proxy to represent uncertainty. Given the positive correlation, the proxy reduces our uncertainty, but it does not eliminate it.
Implications
What does this mean for doctors who might use uncertainty measurements? For now, take the outputs of a deep learning model, whether the label predictions themselves or their uncertainty measurements, with a grain of salt. The model cannot contextualize circumstances as a human expert might, so if a doctor finds that a model predicts with high uncertainty on a case that evidently has low aleatoric uncertainty (i.e. the data is high-quality and unequivocal), he or she should doubt the prediction on grounds of epistemic uncertainty. High epistemic uncertainty means the model was not trained optimally for the context of that particular case and is simply not generalizing well.
Indeed, one study directly supports this logic by illustrating that human experts are worse than deep learning algorithms in cases where the algorithm predicts with low uncertainty, but human experts outperform the algorithm on high uncertainty cases.
Uncertainty measurements are only metrics for the purpose of human understanding; they do not aid model performance at all. The performance of human-AI systems, on the other hand, directly benefits from uncertainty measurements.
Thus, measuring model uncertainty is not just a matter of keeping humans passively informed: it directly improves the performance of the broader system.
Citations
Measuring Uncertainty
Real-time Epistemic Uncertainty
Human-augmented deep learning
And a thank you to DeepChem tutorials for originally introducing many of these concepts.
Source: https://medium.com/swlh/on-trusting-the-model-e67c94b5d205