Logistic Regression
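One day someone asked me what logistic regression is. I have been doing data analysis for a long time and use it often, but only ever by way of import some *, without thinking about it in depth. Unfortunately, the mathematical derivations of logistic regression I found online were all wrong, including those in several classic machine learning textbooks. I spent a few days working through the derivation myself and found that the underlying mathematics is fairly involved: it relies on matrix products and matrix differentiation. (In short, logistic regression is a linear regression on the log-odds ln(p/(1-p)).)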
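The matrix-differentiation step mentioned above can be sketched as follows. The notation is my own assumption rather than the author's: X is the m x n data matrix with rows x_i, y is the vector of 0/1 labels, w is the weight vector, and α is the step size. This is one standard route to the update rule, not necessarily the derivation the author had in mind.

```latex
% Model: the log-odds are linear in the features
\ln\frac{p_i}{1-p_i} = x_i^{\top} w
\quad\Longleftrightarrow\quad
p_i = \sigma(x_i^{\top} w), \qquad \sigma(z) = \frac{1}{1+e^{-z}}

% Log-likelihood of the labels y_i \in \{0, 1\}
\ell(w) = \sum_{i=1}^{m} \Bigl[\, y_i \ln p_i + (1-y_i)\ln(1-p_i) \Bigr]

% Using \sigma'(z) = \sigma(z)\,(1-\sigma(z)):
\frac{\partial \ell}{\partial w}
  = \sum_{i=1}^{m} (y_i - p_i)\, x_i
  = X^{\top}\bigl(y - \sigma(Xw)\bigr)

% Gradient ascent step
w \leftarrow w + \alpha\, X^{\top}\bigl(y - \sigma(Xw)\bigr)
```

The last line has the same shape as the weight update in the gradAscent function, with y - σ(Xw) playing the role of error.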
    import numpy as np
    import matplotlib.pyplot as plt


    def loadDataSet():
        """Read testSet.txt: two features and an integer label per line."""
        dataMat = []; labelMat = []
        with open('testSet.txt') as fr:
            for line in fr:
                lineArr = line.strip().split()
                # Prepend a constant 1.0 so that weights[0] acts as the intercept
                dataMat.append([1.0, float(lineArr[0]), float(lineArr[1])])
                labelMat.append(int(lineArr[2]))
        return dataMat, labelMat


    def sigmoid(inX):
        return 1.0 / (1 + np.exp(-inX))


    def gradAscent(dataMatIn, classLabels):
        # np.asmatrix is used because the np.mat alias was removed in NumPy 2.0
        dataMatrix = np.asmatrix(dataMatIn)                # m x n matrix
        labelMat = np.asmatrix(classLabels).transpose()    # m x 1 column vector
        m, n = np.shape(dataMatrix)
        alpha = 0.001        # step size
        maxCycles = 5000     # number of gradient ascent iterations
        weights = np.ones((n, 1))
        for k in range(maxCycles):
            h = sigmoid(dataMatrix * weights)    # matrix product, m x 1
            error = labelMat - h                 # vector subtraction
            weights = weights + alpha * dataMatrix.transpose() * error
        return weights


    def plotBestFit(weights):
        dataMat, labelMat = loadDataSet()
        dataArr = np.array(dataMat)
        n = np.shape(dataArr)[0]
        xcord1 = []; ycord1 = []
        xcord2 = []; ycord2 = []
        for i in range(n):
            if int(labelMat[i]) == 1:
                xcord1.append(dataArr[i, 1]); ycord1.append(dataArr[i, 2])
            else:
                xcord2.append(dataArr[i, 1]); ycord2.append(dataArr[i, 2])
        fig = plt.figure()
        ax = fig.add_subplot(111)
        ax.scatter(xcord1, ycord1, s=30, c='red', marker='s')
        ax.scatter(xcord2, ycord2, s=30, c='green')
        x = np.arange(-3.0, 3.0, 0.1)
        # Boundary line: weights[0] + weights[1]*x + weights[2]*y = 0
        y = (-weights[0] - weights[1] * x) / weights[2]
        ax.plot(x, y)
        plt.xlabel('X1'); plt.ylabel('X2')
        plt.show()


    dataArr, labelMat = loadDataSet()
    weights = gradAscent(dataArr, labelMat)
    plotBestFit(weights.getA())
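As a sanity check on the update used in gradAscent, the sketch below compares the expression X^T (y - h) against a finite-difference gradient of the log-likelihood. The data are random and purely illustrative (they are not testSet.txt), and the helper names log_likelihood, analytic_gradient and numeric_gradient are hypothetical, introduced only for this check.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_likelihood(w, X, y):
    """Bernoulli log-likelihood of labels y under p = sigmoid(X @ w)."""
    p = sigmoid(X @ w)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))


def analytic_gradient(w, X, y):
    """Same expression as dataMatrix.transpose() * error in gradAscent."""
    return X.T @ (y - sigmoid(X @ w))


def numeric_gradient(w, X, y, eps=1e-6):
    """Central finite differences, one coordinate of w at a time."""
    grad = np.zeros_like(w)
    for j in range(w.size):
        step = np.zeros_like(w)
        step[j] = eps
        grad[j] = (log_likelihood(w + step, X, y)
                   - log_likelihood(w - step, X, y)) / (2 * eps)
    return grad


# Illustrative random data: a column of ones plus two features, random 0/1 labels
rng = np.random.default_rng(0)
X = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 2))])
y = rng.integers(0, 2, size=20).astype(float)
w = rng.normal(size=3) * 0.1

max_diff = np.max(np.abs(analytic_gradient(w, X, y) - numeric_gradient(w, X, y)))
print("largest difference between the two gradients:", max_diff)
```

If the two gradients agree to within numerical error, the update in gradAscent is indeed a gradient ascent step on the log-likelihood.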
The resulting weights are:
matrix([[ 9.35184677],[ 0.87401362],[-1.28891422]])
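With these weights, xw = 9.35 + 0.87x - 1.29y, where x and y are the two features (X1 and X2 in the plot). Setting 9.35 + 0.87x - 1.29y = 0 gives the classification boundary, which is a straight line. Why this equation? Logistic regression classifies using a probability of 0.5 as the threshold. Since ln(p/(1-p)) = xw, substituting p = 0.5 gives ln(1) = 0, so xw = 0.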
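The boundary argument can be checked numerically with the weights reported above. This is a minimal sketch: the helper names boundary_x2 and predict_proba and the test point x1 = 1.0 are assumptions made for illustration, not part of the original code.

```python
import numpy as np

# Weights reported in the text: [intercept, coefficient of X1, coefficient of X2]
w = np.array([9.35184677, 0.87401362, -1.28891422])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def boundary_x2(x1):
    """X2 value on the line w[0] + w[1]*x1 + w[2]*x2 = 0."""
    return (-w[0] - w[1] * x1) / w[2]


def predict_proba(x1, x2):
    """Predicted probability of class 1 at the point (x1, x2)."""
    return sigmoid(w[0] + w[1] * x1 + w[2] * x2)


x1 = 1.0                                        # hypothetical test point
x2_on_line = boundary_x2(x1)
p_on = predict_proba(x1, x2_on_line)            # on the boundary
p_below = predict_proba(x1, x2_on_line - 1.0)   # one unit below the line
p_above = predict_proba(x1, x2_on_line + 1.0)   # one unit above the line

print(p_on, p_below, p_above)
```

Because the coefficient of X2 is negative, points below the line get a probability above 0.5 (class 1) and points above it get a probability below 0.5 (class 0), while points on the line sit at the 0.5 threshold.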