Learning Notes — sklearn Supervised Learning: Regression (a quick rundown of the basic math)
Table of Contents
- I. Mathematical Optimization
- Loss Functions
- Softmax Loss
- II. Regression Analysis
- 1. Ordinary Least Squares Regression
- 2. Ridge Regression
- 3. Logistic Regression
- 4. Lasso Regression
- III. Examples
- 1. Linear Regression
- 2. Ridge Regression
- 3. Lasso
- 4. Logistic Regression
I. Mathematical Optimization
Loss Functions
A loss function measures the discrepancy between a model's predictions and the true values. It is a non-negative real-valued function, usually written L(Y, f(X)); the smaller the loss, the better the model fits the data. The loss function is the core of the empirical risk and an important component of the structural risk. The structural risk of a model combines an empirical-risk term with a regularization term, typically written as

$$\theta^{*} = \arg\min_{\theta}\ \frac{1}{N}\sum_{i=1}^{N} L\bigl(y_{i}, f(x_{i};\theta)\bigr) + \lambda\,\Phi(\theta)$$

where the first (averaged) term is the empirical risk, L is the loss function, and Φ is the regularizer (also called the penalty term), which may be an L1 norm, an L2 norm, or some other regularization function. The whole expression says: find the value of θ that minimizes this objective.
There are five common loss functions, usually compared by plotting them against the margin y·f(x) (a small numeric sketch follows the list):
1. 0-1 loss;
2. Hinge loss: used mainly in support vector machines (SVM);
3. Cross-entropy loss (Softmax loss): used in logistic regression and Softmax classification;
4. Square loss: used mainly in ordinary least squares (OLS);
5. Exponential loss: used mainly in the AdaBoost algorithm.
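As a rough illustration (the helper functions below are only a sketch, and the exact scaling conventions vary by textbook), each of these losses can be written as a function of the margin y·f(x) for labels y ∈ {−1, +1}:

```python
import numpy as np

def zero_one_loss(margin):
    # 1 if the prediction disagrees in sign with the label (margin <= 0), else 0
    return (margin <= 0).astype(float)

def hinge_loss(margin):
    # max(0, 1 - y*f(x)), the SVM loss
    return np.maximum(0.0, 1.0 - margin)

def logistic_loss(margin):
    # log(1 + exp(-y*f(x))), the loss behind logistic regression
    return np.log1p(np.exp(-margin))

def square_loss(margin):
    # (1 - y*f(x))^2, the least-squares loss rewritten in margin form
    return (1.0 - margin) ** 2

def exponential_loss(margin):
    # exp(-y*f(x)), the loss minimized by AdaBoost
    return np.exp(-margin)

margins = np.linspace(-2, 2, 5)
for name, fn in [("0-1", zero_one_loss), ("hinge", hinge_loss),
                 ("logistic", logistic_loss), ("square", square_loss),
                 ("exponential", exponential_loss)]:
    print(name, np.round(fn(margins), 3))
```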
Softmax Loss
Logistic regression does not stop at solving for the extremum of the likelihood directly; it takes maximization as the guiding principle and derives its empirical risk as the minimization of the negative likelihood, i.e. $\max F(y, f(x)) \rightarrow \min\, -F(y, f(x))$. Viewed from the loss-function perspective, this is exactly the Softmax loss.
The standard form of the log loss is

$$L\bigl(Y, P(Y \mid X)\bigr) = -\log P(Y \mid X)$$

Its underlying idea is maximum likelihood: given the observed sample distribution, find the parameter values that are most likely (i.e. have the highest probability) of producing it; in other words, which parameters maximize the probability of observing the data at hand.
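As a minimal sketch of this idea (the tiny dataset and helper names below are only illustrative): for binary labels, the negative log-likelihood of a logistic model is exactly the log loss above, and the maximum-likelihood parameter is the one that makes it smallest.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def negative_log_likelihood(w, X, y):
    # y in {0, 1}; p = P(y=1 | x) under a logistic model with weight vector w
    p = sigmoid(X @ w)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Tiny 1-D example: compare the negative log-likelihood at a few candidate weights;
# the weight with the smallest value is the maximum-likelihood choice among them.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0, 0, 1, 1])
for w in [np.array([0.5]), np.array([1.0]), np.array([2.0])]:
    print(w, round(negative_log_likelihood(w, X, y), 4))
```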
II. Regression Analysis
Consider the generalized linear model, written compactly in matrix form as

$$\hat{y} = XW$$

where X is the sample matrix and W are the parameters to be fitted.
1. Ordinary Least Squares Regression
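For reference, ordinary least squares fits W by minimizing the residual sum of squares:

$$\min_{W}\ \lVert XW - y \rVert_2^2$$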
2. Ridge Regression
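Ridge regression adds an L2 penalty on the coefficients with regularization strength α (this is the form minimized by sklearn's `Ridge`):

$$\min_{W}\ \lVert XW - y \rVert_2^2 + \alpha \lVert W \rVert_2^2$$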
3. Logistic Regression
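Logistic regression minimizes the log loss rather than a squared error; in the L2-regularized binary form documented for sklearn's `LogisticRegression` (labels y_i ∈ {−1, +1}, inverse regularization strength C), the objective is:

$$\min_{w, c}\ \frac{1}{2} w^{T} w + C \sum_{i=1}^{n} \log\bigl(1 + \exp\bigl(-y_i (x_i^{T} w + c)\bigr)\bigr)$$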
4. Lasso Regression
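Lasso swaps the L2 penalty for an L1 penalty, which tends to drive many coefficients exactly to zero; sklearn's `Lasso` minimizes:

$$\min_{W}\ \frac{1}{2 n_{\text{samples}}} \lVert XW - y \rVert_2^2 + \alpha \lVert W \rVert_1$$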
III. Examples
1. Linear Regression
```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score

# Load the diabetes dataset
diabetes = datasets.load_diabetes()

# Use only one feature
diabetes_X = diabetes.data[:, np.newaxis, 2]

# Split the data into training/testing sets
diabetes_X_train = diabetes_X[:-20]
diabetes_X_test = diabetes_X[-20:]

# Split the targets into training/testing sets
diabetes_y_train = diabetes.target[:-20]
diabetes_y_test = diabetes.target[-20:]

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(diabetes_X_train, diabetes_y_train)

# Make predictions using the testing set
diabetes_y_pred = regr.predict(diabetes_X_test)

# The coefficients
print('Coefficients: \n', regr.coef_)
# The mean squared error
print("Mean squared error: %.2f"
      % mean_squared_error(diabetes_y_test, diabetes_y_pred))
# Explained variance score: 1 is perfect prediction
print('Variance score: %.2f' % r2_score(diabetes_y_test, diabetes_y_pred))

# Plot outputs
plt.scatter(diabetes_X_test, diabetes_y_test, color='black')
plt.plot(diabetes_X_test, diabetes_y_pred, color='blue', linewidth=3)

plt.xticks(())
plt.yticks(())

plt.show()
```
Result:
```
('Coefficients: \n', array([938.23786125]))
Mean squared error: 2548.07
Variance score: 0.47
```
2. Ridge Regression
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model

# X is the 10x10 Hilbert matrix
X = 1. / (np.arange(1, 11) + np.arange(0, 10)[:, np.newaxis])
y = np.ones(10)

# #############################################################################
# Compute paths
n_alphas = 200
alphas = np.logspace(-10, -2, n_alphas)

coefs = []
for a in alphas:
    ridge = linear_model.Ridge(alpha=a, fit_intercept=False)
    ridge.fit(X, y)
    coefs.append(ridge.coef_)

# #############################################################################
# Display results
ax = plt.gca()

ax.plot(alphas, coefs)
ax.set_xscale('log')
ax.set_xlim(ax.get_xlim()[::-1])  # reverse axis
plt.xlabel('alpha')
plt.ylabel('weights')
plt.title('Ridge coefficients as a function of the regularization')
plt.axis('tight')
plt.show()
```
Result: a plot of the ridge coefficient paths as a function of alpha (figure not reproduced here).
3. Lasso
```python
import numpy as np
import matplotlib.pyplot as plt

from sklearn.metrics import r2_score

# #############################################################################
# Generate some sparse data to play with
np.random.seed(42)

n_samples, n_features = 50, 200
X = np.random.randn(n_samples, n_features)
coef = 3 * np.random.randn(n_features)
inds = np.arange(n_features)
np.random.shuffle(inds)
coef[inds[10:]] = 0  # sparsify coef
y = np.dot(X, coef)

# add noise
y += 0.01 * np.random.normal(size=n_samples)

# Split data in train set and test set
n_samples = X.shape[0]
X_train, y_train = X[:n_samples // 2], y[:n_samples // 2]
X_test, y_test = X[n_samples // 2:], y[n_samples // 2:]

# #############################################################################
# Lasso
from sklearn.linear_model import Lasso

alpha = 0.1
lasso = Lasso(alpha=alpha)

y_pred_lasso = lasso.fit(X_train, y_train).predict(X_test)
r2_score_lasso = r2_score(y_test, y_pred_lasso)
print(lasso)
print("r^2 on test data : %f" % r2_score_lasso)

plt.plot(lasso.coef_, color='gold', linewidth=2,
         label='Lasso coefficients')
plt.plot(coef, '--', color='navy', label='original coefficients')
plt.legend(loc='best')
plt.title("Lasso R^2: %f" % r2_score_lasso)
plt.show()
```
Result:
```
Lasso(alpha=0.1)
r^2 on test data : 0.385982
```
4. Logistic Regression
```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn import datasets

# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]  # we only take the first two features.
Y = iris.target

logreg = LogisticRegression(C=1e5, solver='lbfgs', multi_class='multinomial')

# Create an instance of Logistic Regression Classifier and fit the data.
logreg.fit(X, Y)

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
h = .02  # step size in the mesh
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = logreg.predict(np.c_[xx.ravel(), yy.ravel()])

# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.figure(1, figsize=(4, 3))
plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)

# Plot also the training points
plt.scatter(X[:, 0], X[:, 1], c=Y, edgecolors='k', cmap=plt.cm.Paired)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')

plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.xticks(())
plt.yticks(())

plt.show()
```
Result: a plot of the decision regions learned on the first two iris features, with the training points overlaid (figure not reproduced here).
Summary