Regularization Technique
What is a Regularization Technique?
It is a technique mainly used to overcome over-fitting during model fitting. This is done by adding a penalty as the model's complexity increases. The regularization parameter λ penalizes all the regression parameters except the intercept, so that the model generalizes over the data and avoids over-fitting (i.e. it helps keep the parameters regular, or normal). This makes the fit generalize better to unseen data.
Over-fitting means that while training the model on the training data, the model reads every observation and learns from it, becoming too complex. But when the same model is validated on the testing data, the fit becomes worse.
What does the Regularization Technique do?
The basic concept is that we do not want huge weights for the regression coefficients. The simple regression equation is y = β0 + β1x, where y is the response (dependent, or target) variable, x is the feature (independent) variable, and the β's are the regression coefficients (unknown parameters). A small change in a parameter's weight can make a large difference in the target variable, so regularization ensures that not too much weight is added: no feature is given too much weight, and the least significant features are given zero weight.
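To make the sensitivity argument concrete, here is a minimal sketch (the model and numbers are made up for illustration): the same tiny change in x moves the prediction far more when the coefficient is huge.

```python
# Illustrative only: a linear model with a huge coefficient is far more
# sensitive to a small change in the input feature.
def predict(x, b0, b1):
    # simple linear model y = b0 + b1*x
    return b0 + b1 * x

x = 2.0
dx = 0.01  # a tiny perturbation of the feature

small_w = predict(x + dx, b0=1.0, b1=2.0) - predict(x, b0=1.0, b1=2.0)
large_w = predict(x + dx, b0=1.0, b1=500.0) - predict(x, b0=1.0, b1=500.0)

print(small_w)  # ~0.02 -> the prediction barely moves
print(large_w)  # ~5.0  -> same tiny input change, huge output change
```

Penalizing large values of β1 keeps the fitted function from reacting this violently to noise in the data.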
Working of Regularization
Regularization adds a penalty for the higher terms, which decreases the importance given to those terms and makes the model less complex. Regularization equation:
Min( Σ (yi - βi*xi)^2 + λ/2 * Σ |βi|^p )
where p = 1, 2, … and i = 1, …, n. The most popular choices of p are 1 or 2. Feature selection is thus performed by regularization.
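The penalized objective above can be sketched directly in NumPy (a minimal illustration with made-up data; `lam` and `p` map to λ and p in the equation):

```python
import numpy as np

# Sketch of the regularized objective:
# sum of squared errors plus (lambda/2) * sum(|beta|^p).
def regularized_loss(y, X, beta, lam, p):
    residuals = y - X @ beta           # yi - beta*xi for each observation
    sse = np.sum(residuals ** 2)       # data-fit term
    penalty = (lam / 2) * np.sum(np.abs(beta) ** p)
    return sse + penalty

y = np.array([1.0, 2.0, 3.0])
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
beta = np.array([1.0, 2.0])            # fits this toy data exactly

print(regularized_loss(y, X, beta, lam=0.0, p=1))  # no penalty: plain SSE = 0.0
print(regularized_loss(y, X, beta, lam=1.0, p=1))  # L1 adds 0.5*(|1|+|2|) = 1.5
print(regularized_loss(y, X, beta, lam=1.0, p=2))  # L2 adds 0.5*(1+4)     = 2.5
```

Note how, for the same fitted coefficients, the penalty term grows with the size of the β's, which is exactly what discourages large weights.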
What is a Loss function?
A loss function estimates how far the estimated value is from the observed actual value, i.e. Σ(Y - f(x)). It is of two types:
L1 loss function- Which gives the sum of the absolute differences between actual and estimated values: Σ(|Yi - f(xi)|). Because of the absolute value, there is a possibility of multiple solutions.
L2 loss function- Which gives the sum of the squared differences between actual and estimated values: Σ(Yi - f(xi))^2. This gives us the least-squares value and one clear, unique solution.
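Both loss functions are one-liners in NumPy (function names and the sample values below are ours, for illustration):

```python
import numpy as np

def l1_loss(y_true, y_pred):
    # sum of |Yi - f(xi)|
    return np.sum(np.abs(y_true - y_pred))

def l2_loss(y_true, y_pred):
    # sum of (Yi - f(xi))^2
    return np.sum((y_true - y_pred) ** 2)

y_true = np.array([3.0, -0.5, 2.0])
y_pred = np.array([2.5,  0.0, 2.0])

print(l1_loss(y_true, y_pred))  # 1.0
print(l2_loss(y_true, y_pred))  # 0.5
```

Squaring in the L2 loss punishes large errors much more heavily than the L1 loss does, which is why the two lead to different behavior when used as penalties.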
What are the types of Regularization Technique?
There are two types of regularization technique:
1. Lasso Regularization / L1 Regularization- This will add the "absolute magnitude" of the coefficients as a penalty term to the loss function.
arg Min( Σ (yi - βi*xi)^2 + λ * Σ |βi| )
2. Ridge Regularization / L2 Regularization- This will add the "squared magnitude" of the coefficients as a penalty term to the loss function.
arg Min( Σ (yi - βi*xi)^2 + λ * Σ βi^2 )
If λ is zero, this reduces to the OLS (Ordinary Least Squares) method. If λ is very large, the penalty carries too much weight and leads to under-fitting. Choosing the value of λ is therefore very important.
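The effect of λ can be sketched with the closed-form ridge solution β = (XᵀX + λI)⁻¹ Xᵀy on synthetic data (a minimal illustration, not a production implementation): λ = 0 gives exactly the OLS normal-equation solution, while a huge λ shrinks the coefficients toward zero and under-fits.

```python
import numpy as np

# Closed-form ridge regression: beta = (X^T X + lambda*I)^(-1) X^T y.
def ridge_fit(X, y, lam):
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_beta = np.array([3.0, -2.0, 0.5])
y = X @ true_beta + rng.normal(scale=0.1, size=100)  # low-noise synthetic data

print(ridge_fit(X, y, lam=0.0))   # close to [3, -2, 0.5]: this is OLS
print(ridge_fit(X, y, lam=1e5))   # shrunk nearly to zero: under-fitting
```

Intermediate values of λ interpolate between these extremes, which is why λ is usually tuned (e.g. by cross-validation) rather than fixed in advance.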
LASSO will shrink the less important features' coefficients by assigning them the value zero, automatically removing the least significant variables. This helps us with variable selection. LASSO works well when there is a small number of significant variable parameters.
Ridge will add a penalty and, as a result, shrink the size of the weights. It works well even with a huge number of variable parameters, and is also used for collinearity problems.
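The contrast between the two can be illustrated with the textbook orthonormal-design case, where the lasso solution is a soft-thresholding of the OLS coefficients while ridge only rescales them (the coefficient values below are made up for the demo):

```python
import numpy as np

# For an orthonormal design, lasso soft-thresholds the OLS coefficients:
# weak coefficients are set exactly to zero.
def lasso_orthonormal(beta_ols, lam):
    return np.sign(beta_ols) * np.maximum(np.abs(beta_ols) - lam, 0.0)

# Ridge, in the same setting, merely rescales every coefficient:
# they shrink, but never reach exactly zero.
def ridge_orthonormal(beta_ols, lam):
    return beta_ols / (1.0 + lam)

beta_ols = np.array([3.0, -2.0, 0.1])  # last feature is barely significant

print(lasso_orthonormal(beta_ols, 0.5))  # third coefficient becomes exactly 0
print(ridge_orthonormal(beta_ols, 0.5))  # all shrunk, but none exactly 0
```

This is the mechanism behind LASSO's automatic variable selection and behind ridge's preference for keeping all (shrunken) features.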
Finding the optimal weights is a big challenge.
Prediction accuracy is good.
Other model-selection criteria such as AIC, BIC, cross-validation, and step-wise regression can handle over-fitting and perform feature selection well with a small set of features, but these regularization techniques are a great alternative when we are dealing with a large set of features.
Translated from: https://medium.com/swlh/regularization-technique-84779df34092