5.7 Program Example: An SMO-Based SVM Model (Machine Learning Notes, Stanford, Andrew Ng)
Here we implement an SMO-based SVM model, providing both a simplified and a full version of SMO.
- Simplified SMO: no heuristic is used to choose the pair $(\alpha^{(i)}, \alpha^{(j)})$. $\alpha^{(i)}$ is always chosen from the whole sample set among the samples that violate the KKT conditions, and $\alpha^{(j)}$ is chosen at random.
- Full SMO: a heuristic is used to choose $(\alpha^{(i)}, \alpha^{(j)})$. $\alpha^{(i)}$ is chosen from the whole training set, or from the non-bound samples, among those that violate the KKT conditions, and $\alpha^{(j)}$ is chosen to maximize $|E^{(i)} - E^{(j)}|$.
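The KKT-violation test that both versions use to pick $\alpha^{(i)}$ can be sketched as a small standalone check (the helper `violates_kkt` and its toy inputs are hypothetical, not part of smo.py):

```python
# Hypothetical helper: the KKT-violation test used to pick alpha_i.
# Within tolerance tol, sample i violates the KKT conditions when
#   y_i * E_i < -tol while alpha_i < C, or
#   y_i * E_i >  tol while alpha_i > 0.
def violates_kkt(alpha_i, y_i, E_i, C, tol):
    R_i = y_i * E_i
    return (R_i < -tol and alpha_i < C) or (R_i > tol and alpha_i > 0)

# A margin-violating sample (large negative error, alpha_i still below C)
print(violates_kkt(alpha_i=0.0, y_i=1, E_i=-0.5, C=1.0, tol=1e-3))  # True
# A sample already satisfying the KKT conditions within tolerance
print(violates_kkt(alpha_i=0.0, y_i=1, E_i=0.0, C=1.0, tol=1e-3))   # False
```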
Kernel Functions
The model supports kernel functions, and provides a linear kernel and a Gaussian kernel (also known as the radial basis function, RBF, kernel) for use:
```python
# coding: utf8
# svm/smo.py

def linearKernel():
    """Linear kernel"""
    def calc(X, A):
        return X * A.T
    return calc

def rbfKernel(delta):
    """RBF kernel"""
    gamma = 1.0 / (2 * delta ** 2)
    def calc(X, A):
        return np.mat(rbf_kernel(X, A, gamma=gamma))
    return calc
```

For the RBF kernel we use the higher-performance `rbf_kernel` provided by `sklearn.metrics.pairwise`.
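As a quick sanity check of the $\gamma = 1/(2\delta^2)$ conversion in `rbfKernel`, here is a pure-NumPy stand-in for sklearn's `rbf_kernel` (the `rbf` helper and the toy points are illustrative only):

```python
import numpy as np

# Illustrative stand-in: k(x, z) = exp(-gamma * ||x - z||^2),
# with gamma = 1 / (2 * delta**2) as in rbfKernel above.
def rbf(X, Z, gamma):
    # squared Euclidean distance between every row of X and every row of Z
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

delta = 1.0
gamma = 1.0 / (2 * delta ** 2)            # 0.5
X = np.array([[0.0, 0.0], [1.0, 0.0]])
K = rbf(X, X, gamma)
print(K[0, 0])   # 1.0 (zero distance)
print(K[0, 1])   # exp(-0.5), about 0.6065
```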
Training Initialization
```python
def getSmo(X, y, C, tol, maxIter, kernel=linearKernel()):
    """SMO

    Args:
        X       training samples
        y       labels
        C       regularization parameter
        tol     tolerance
        maxIter maximum number of iterations
        kernel  kernel function
    Returns:
        trainSimple simplified training routine
        train       full training routine
        predict     prediction function
    """
    m, n = X.shape
    # Kernel matrix of the training samples
    K = kernel(X, X)
    # Cache of prediction errors, used to speed up computation
    ECache = np.zeros((m, 2))
    # ...
```

Weight Vector $w$
As we know, the model $f(x)$ depends only on the support vectors, so the weight vector $w$ can be defined as:
$$w = \sum_{i=1}^{m} \alpha^{(i)} y^{(i)} (x^{(i)})^T = \sum_{i \in S} \alpha^{(i)} y^{(i)} (x^{(i)})^T$$
where $S$ is the index set of the samples that are support vectors.
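As a quick numeric check of this sum, with toy multipliers (not from a trained model), only the samples with $\alpha^{(i)} > 0$ contribute:

```python
import numpy as np

# Toy values: alpha_2 = 0, so sample 2 is not a support vector
alphas = np.array([0.5, 0.0, 0.25])
y      = np.array([1.0, -1.0, -1.0])
X      = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [2.0, 2.0]])

S = np.nonzero(alphas > 0)[0]            # support-vector indices: [0, 2]
w = (alphas[S] * y[S]) @ X[S]            # sum over support vectors only
print(w)   # 0.5*[1,0] - 0.25*[2,2] = [0.0, -0.5]
```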
```python
# coding: utf8
# svm/smo.py

def getSmo(X, y, C, tol, maxIter, kernel=linearKernel()):
    # ...
    def w(alphas, b, supportVectorsIndex, supportVectors):
        """Compute the weight vector"""
        return (np.multiply(alphas[supportVectorsIndex], y[supportVectorsIndex]).T *
                supportVectors).T
```

Prediction Error $E^{(i)}$
$$E^{(i)} = f(x^{(i)}) - y^{(i)} = \sum_{j=1}^{m} \alpha^{(j)} y^{(j)} k(x^{(j)}, x^{(i)}) + b - y^{(i)}$$
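With toy values and a linear kernel, this error can be checked by hand (all numbers hypothetical):

```python
import numpy as np

# Toy data: two samples; the linear kernel matrix K = X X^T is the identity here
alphas = np.array([0.5, 0.25])
y      = np.array([1.0, -1.0])
X      = np.array([[1.0, 0.0],
                   [0.0, 1.0]])
b = 0.1
K = X @ X.T

i = 0
f_xi = (alphas * y) @ K[:, i] + b     # 0.5*1*1 + 0.25*(-1)*0 + 0.1 = 0.6
E_i = f_xi - y[i]                     # 0.6 - 1 = -0.4
```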
```python
# coding: utf8
# svm/smo.py

def getSmo(X, y, C, tol, maxIter, kernel=linearKernel()):
    # ...
    def E(i, alphas, b):
        """Compute the prediction error

        Args:
            i      sample index
            alphas alphas
            b      b
        Returns:
            E_i    prediction error of the i-th sample
        """
        FXi = float(np.multiply(alphas, y).T * K[:, i]) + b
        E = FXi - float(y[i])
        return E
```

Prediction Function
$$f(x) = \sum_{i \in S} \alpha^{(i)} y^{(i)} k(x^{(i)}, x) + b$$
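A toy evaluation of $f(x)$ and the resulting class prediction, assuming a linear kernel and hypothetical multipliers:

```python
import numpy as np

# Hypothetical support vectors and multipliers
alphas_S = np.array([0.5, 0.25])
y_S      = np.array([1.0, -1.0])
X_S      = np.array([[2.0, 0.0],
                     [0.0, 1.0]])
b = -0.2

x_new = np.array([1.0, 0.0])
Ks = X_S @ x_new                      # linear kernel values k(x_i, x): [2, 0]
f = (alphas_S * y_S) @ Ks + b         # 0.5*2 - 0.25*0 - 0.2 = 0.8
label = np.sign(f)                    # 1.0, i.e. the positive class
```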
```python
# coding: utf8
# svm/smo.py

def getSmo(X, y, C, tol, maxIter, kernel=linearKernel()):
    # ...
    def predict(X, alphas, b, supportVectorsIndex, supportVectors):
        """Make predictions

        Args:
            X                   prediction matrix
            alphas              alphas
            b                   b
            supportVectorsIndex indices of the support vectors
            supportVectors      support vectors
        Returns:
            predicts prediction results
        """
        Ks = kernel(supportVectors, X)
        predicts = (np.multiply(alphas[supportVectorsIndex], y[supportVectorsIndex]).T *
                    Ks + b).T
        predicts = np.sign(predicts)
        return predicts
```

Selecting $\alpha^{(i)}$
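One detail of the `select` routine below is the box constraint: the updated $\alpha^{(j)}$ must be clipped to $[L, H]$. A minimal sketch of the bounds and the clipping (standalone helpers, not part of smo.py):

```python
# The bounds depend on whether y_i and y_j have the same sign
def bounds(alpha_i, alpha_j, y_i, y_j, C):
    if y_i != y_j:
        L = max(0, alpha_j - alpha_i)
        H = min(C, C + alpha_j - alpha_i)
    else:
        L = max(0, alpha_j + alpha_i - C)
        H = min(C, alpha_j + alpha_i)
    return L, H

def clip(a, L, H):
    """Clamp the unconstrained optimum back into [L, H]"""
    return min(max(a, L), H)

L, H = bounds(0.25, 0.5, 1, -1, C=1.0)   # y_i != y_j, so L = 0.25, H = 1.0
print(clip(1.2, L, H))                   # 1.0  (clipped to H)
print(clip(0.1, L, H))                   # 0.25 (clipped to L)
```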
```python
# coding: utf8
# svm/smo.py

def getSmo(X, y, C, tol, maxIter, kernel=linearKernel()):
    # ...
    def select(i, alphas, b):
        """Select an alpha pair"""
        Ei = E(i, alphas, b)
        # Use a sample that violates the KKT conditions as alpha1
        Ri = y[i] * Ei
        if (Ri < -tol and alphas[i] < C) or \
                (Ri > tol and alphas[i] > 0):
            # Choose the second parameter heuristically
            j, Ej = selectJ(i, Ei, alphas, b)
            # get bounds
            if y[i] != y[j]:
                L = max(0, alphas[j] - alphas[i])
                H = min(C, C + alphas[j] - alphas[i])
            else:
                L = max(0, alphas[j] + alphas[i] - C)
                H = min(C, alphas[j] + alphas[i])
            if L == H:
                return 0, alphas, b
            Kii = K[i, i]
            Kjj = K[j, j]
            Kij = K[i, j]
            eta = 2.0 * Kij - Kii - Kjj
            if eta >= 0:
                return 0, alphas, b
            iOld = alphas[i].copy()
            jOld = alphas[j].copy()
            alphas[j] = jOld - y[j] * (Ei - Ej) / eta
            if alphas[j] > H:
                alphas[j] = H
            elif alphas[j] < L:
                alphas[j] = L
            if abs(alphas[j] - jOld) < tol:
                alphas[j] = jOld
                return 0, alphas, b
            alphas[i] = iOld + y[i] * y[j] * (jOld - alphas[j])
            # update ECache
            updateE(i, alphas, b)
            updateE(j, alphas, b)
            # update b
            bINew = b - Ei - y[i] * (alphas[i] - iOld) * Kii - y[j] * \
                (alphas[j] - jOld) * Kij
            bJNew = b - Ej - y[i] * (alphas[i] - iOld) * Kij - y[j] * \
                (alphas[j] - jOld) * Kjj
            if alphas[i] > 0 and alphas[i] < C:
                b = bINew
            elif alphas[j] > 0 and alphas[j] < C:
                b = bJNew
            else:
                b = (bINew + bJNew) / 2.0
            return 1, alphas, b
        else:
            return 0, alphas, b
```

Selecting $\alpha^{(j)}$
```python
# coding: utf8
# svm/smo.py

def getSmo(X, y, C, tol, maxIter, kernel=linearKernel()):
    # ...
    def selectJRand(i):
        """Randomly pick a j different from i"""
        j = i
        while j == i:
            j = int(np.random.uniform(0, m))
        return j

    def selectJ(i, Ei, alphas, b):
        """Choose alpha_j so that |E_i - E_j| is maximized"""
        maxJ = 0
        maxDist = 0
        Ej = 0
        ECache[i] = [1, Ei]
        validCaches = np.nonzero(ECache[:, 0])[0]
        if len(validCaches) > 1:
            for k in validCaches:
                if k == i:
                    continue
                Ek = E(k, alphas, b)
                dist = abs(Ei - Ek)
                if maxDist < dist:
                    Ej = Ek
                    maxJ = k
                    maxDist = dist
            return maxJ, Ej
        else:
            # Fall back to random selection
            j = selectJRand(i)
            Ej = E(j, alphas, b)
            return j, Ej
```

The full SMO training routine:
```python
# coding: utf8
# svm/smo.py

def getSmo(X, y, C, tol, maxIter, kernel=linearKernel()):
    # ...
    def train():
        """Full training routine

        Returns:
            alphas              alphas
            w                   w
            b                   b
            supportVectorsIndex indices of the support vectors
            supportVectors      support vectors
            iterCount           number of iterations
        """
        numChanged = 0
        examineAll = True
        iterCount = 0
        alphas = np.mat(np.zeros((m, 1)))
        b = 0
        # If every alpha obeyed the KKT conditions in the last pass, sweep the
        # whole training set; otherwise sweep only the non-bound alphas in (0, C)
        while (numChanged > 0 or examineAll) and (iterCount < maxIter):
            numChanged = 0
            if examineAll:
                for i in range(m):
                    changed, alphas, b = select(i, alphas, b)
                    numChanged += changed
            else:
                nonBoundIds = np.nonzero((alphas.A > 0) * (alphas.A < C))[0]
                for i in nonBoundIds:
                    changed, alphas, b = select(i, alphas, b)
                    numChanged += changed
            iterCount += 1
            if examineAll:
                examineAll = False
            elif numChanged == 0:
                examineAll = True
        supportVectorsIndex = np.nonzero(alphas.A > 0)[0]
        supportVectors = np.mat(X[supportVectorsIndex])
        return alphas, w(alphas, b, supportVectorsIndex, supportVectors), b, \
            supportVectorsIndex, supportVectors, iterCount
```

The simplified SMO training routine:
```python
# coding: utf8
# svm/smo.py

def getSmo(X, y, C, tol, maxIter, kernel=linearKernel()):
    # ...
    def trainSimple():
        """Simplified training routine

        Returns:
            alphas              alphas
            w                   w
            b                   b
            supportVectorsIndex indices of the support vectors
            supportVectors      support vectors
            iterCount           number of iterations
        """
        numChanged = 0
        iterCount = 0
        alphas = np.mat(np.zeros((m, 1)))
        b = 0
        L = 0
        H = 0
        while iterCount < maxIter:
            numChanged = 0
            for i in range(m):
                Ei = E(i, alphas, b)
                Ri = y[i] * Ei
                # Use a sample that violates the KKT conditions as alpha1
                if (Ri < -tol and alphas[i] < C) or \
                        (Ri > tol and alphas[i] > 0):
                    # Choose the second parameter at random
                    j = selectJRand(i)
                    Ej = E(j, alphas, b)
                    # get bounds
                    if y[i] != y[j]:
                        L = max(0, alphas[j] - alphas[i])
                        H = min(C, C + alphas[j] - alphas[i])
                    else:
                        L = max(0, alphas[j] + alphas[i] - C)
                        H = min(C, alphas[j] + alphas[i])
                    if L == H:
                        continue
                    Kii = K[i, i]
                    Kjj = K[j, j]
                    Kij = K[i, j]
                    eta = 2.0 * Kij - Kii - Kjj
                    if eta >= 0:
                        continue
                    iOld = alphas[i].copy()
                    jOld = alphas[j].copy()
                    alphas[j] = jOld - y[j] * (Ei - Ej) / eta
                    if alphas[j] > H:
                        alphas[j] = H
                    elif alphas[j] < L:
                        alphas[j] = L
                    if abs(alphas[j] - jOld) < tol:
                        alphas[j] = jOld
                        continue
                    alphas[i] = iOld + y[i] * y[j] * (jOld - alphas[j])
                    # update b
                    bINew = b - Ei - y[i] * (alphas[i] - iOld) * Kii - y[j] * \
                        (alphas[j] - jOld) * Kij
                    bJNew = b - Ej - y[i] * (alphas[i] - iOld) * Kij - y[j] * \
                        (alphas[j] - jOld) * Kjj
                    if alphas[i] > 0 and alphas[i] < C:
                        b = bINew
                    elif alphas[j] > 0 and alphas[j] < C:
                        b = bJNew
                    else:
                        b = (bINew + bJNew) / 2.0
                    numChanged += 1
            if numChanged == 0:
                iterCount += 1
            else:
                iterCount = 0
        supportVectorsIndex = np.nonzero(alphas.A > 0)[0]
        supportVectors = np.mat(X[supportVectorsIndex])
        return alphas, w(alphas, b, supportVectorsIndex, supportVectors), b, \
            supportVectorsIndex, supportVectors, iterCount
```

Full Code
```python
# coding: utf8
# svm/smo.py
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

"""SVM model"""


def linearKernel():
    """Linear kernel"""
    def calc(X, A):
        return X * A.T
    return calc


def rbfKernel(delta):
    """RBF kernel"""
    gamma = 1.0 / (2 * delta ** 2)
    def calc(X, A):
        return np.mat(rbf_kernel(X, A, gamma=gamma))
    return calc


def getSmo(X, y, C, tol, maxIter, kernel=linearKernel()):
    """SMO

    Args:
        X       training samples
        y       labels
        C       regularization parameter
        tol     tolerance
        maxIter maximum number of iterations
        kernel  kernel function
    Returns:
        trainSimple simplified training routine
        train       full training routine
        predict     prediction function
    """
    m, n = X.shape
    # Kernel matrix of the training samples
    K = kernel(X, X)
    # Cache of prediction errors, used to speed up computation
    ECache = np.zeros((m, 2))

    def predict(X, alphas, b, supportVectorsIndex, supportVectors):
        """Make predictions

        Args:
            X                   prediction matrix
            alphas              alphas
            b                   b
            supportVectorsIndex indices of the support vectors
            supportVectors      support vectors
        Returns:
            predicts prediction results
        """
        Ks = kernel(supportVectors, X)
        predicts = (np.multiply(alphas[supportVectorsIndex], y[supportVectorsIndex]).T *
                    Ks + b).T
        predicts = np.sign(predicts)
        return predicts

    def w(alphas, b, supportVectorsIndex, supportVectors):
        """Compute the weight vector"""
        return (np.multiply(alphas[supportVectorsIndex], y[supportVectorsIndex]).T *
                supportVectors).T

    def E(i, alphas, b):
        """Compute the prediction error of the i-th sample"""
        FXi = float(np.multiply(alphas, y).T * K[:, i]) + b
        E = FXi - float(y[i])
        return E

    def updateE(i, alphas, b):
        ECache[i] = [1, E(i, alphas, b)]

    def selectJRand(i):
        """Randomly pick a j different from i"""
        j = i
        while j == i:
            j = int(np.random.uniform(0, m))
        return j

    def selectJ(i, Ei, alphas, b):
        """Choose alpha_j so that |E_i - E_j| is maximized"""
        maxJ = 0
        maxDist = 0
        Ej = 0
        ECache[i] = [1, Ei]
        validCaches = np.nonzero(ECache[:, 0])[0]
        if len(validCaches) > 1:
            for k in validCaches:
                if k == i:
                    continue
                Ek = E(k, alphas, b)
                dist = abs(Ei - Ek)
                if maxDist < dist:
                    Ej = Ek
                    maxJ = k
                    maxDist = dist
            return maxJ, Ej
        else:
            # Fall back to random selection
            j = selectJRand(i)
            Ej = E(j, alphas, b)
            return j, Ej

    def select(i, alphas, b):
        """Select an alpha pair"""
        Ei = E(i, alphas, b)
        # Use a sample that violates the KKT conditions as alpha1
        Ri = y[i] * Ei
        if (Ri < -tol and alphas[i] < C) or \
                (Ri > tol and alphas[i] > 0):
            # Choose the second parameter heuristically
            j, Ej = selectJ(i, Ei, alphas, b)
            # get bounds
            if y[i] != y[j]:
                L = max(0, alphas[j] - alphas[i])
                H = min(C, C + alphas[j] - alphas[i])
            else:
                L = max(0, alphas[j] + alphas[i] - C)
                H = min(C, alphas[j] + alphas[i])
            if L == H:
                return 0, alphas, b
            Kii = K[i, i]
            Kjj = K[j, j]
            Kij = K[i, j]
            eta = 2.0 * Kij - Kii - Kjj
            if eta >= 0:
                return 0, alphas, b
            iOld = alphas[i].copy()
            jOld = alphas[j].copy()
            alphas[j] = jOld - y[j] * (Ei - Ej) / eta
            if alphas[j] > H:
                alphas[j] = H
            elif alphas[j] < L:
                alphas[j] = L
            if abs(alphas[j] - jOld) < tol:
                alphas[j] = jOld
                return 0, alphas, b
            alphas[i] = iOld + y[i] * y[j] * (jOld - alphas[j])
            # update ECache
            updateE(i, alphas, b)
            updateE(j, alphas, b)
            # update b
            bINew = b - Ei - y[i] * (alphas[i] - iOld) * Kii - y[j] * \
                (alphas[j] - jOld) * Kij
            bJNew = b - Ej - y[i] * (alphas[i] - iOld) * Kij - y[j] * \
                (alphas[j] - jOld) * Kjj
            if alphas[i] > 0 and alphas[i] < C:
                b = bINew
            elif alphas[j] > 0 and alphas[j] < C:
                b = bJNew
            else:
                b = (bINew + bJNew) / 2.0
            return 1, alphas, b
        else:
            return 0, alphas, b

    def train():
        """Full training routine

        Returns:
            alphas              alphas
            w                   w
            b                   b
            supportVectorsIndex indices of the support vectors
            supportVectors      support vectors
            iterCount           number of iterations
        """
        numChanged = 0
        examineAll = True
        iterCount = 0
        alphas = np.mat(np.zeros((m, 1)))
        b = 0
        # If every alpha obeyed the KKT conditions in the last pass, sweep the
        # whole training set; otherwise sweep only the non-bound alphas in (0, C)
        while (numChanged > 0 or examineAll) and (iterCount < maxIter):
            numChanged = 0
            if examineAll:
                for i in range(m):
                    changed, alphas, b = select(i, alphas, b)
                    numChanged += changed
            else:
                nonBoundIds = np.nonzero((alphas.A > 0) * (alphas.A < C))[0]
                for i in nonBoundIds:
                    changed, alphas, b = select(i, alphas, b)
                    numChanged += changed
            iterCount += 1
            if examineAll:
                examineAll = False
            elif numChanged == 0:
                examineAll = True
        supportVectorsIndex = np.nonzero(alphas.A > 0)[0]
        supportVectors = np.mat(X[supportVectorsIndex])
        return alphas, w(alphas, b, supportVectorsIndex, supportVectors), b, \
            supportVectorsIndex, supportVectors, iterCount

    def trainSimple():
        """Simplified training routine

        Returns:
            alphas              alphas
            w                   w
            b                   b
            supportVectorsIndex indices of the support vectors
            supportVectors      support vectors
            iterCount           number of iterations
        """
        numChanged = 0
        iterCount = 0
        alphas = np.mat(np.zeros((m, 1)))
        b = 0
        L = 0
        H = 0
        while iterCount < maxIter:
            numChanged = 0
            for i in range(m):
                Ei = E(i, alphas, b)
                Ri = y[i] * Ei
                # Use a sample that violates the KKT conditions as alpha1
                if (Ri < -tol and alphas[i] < C) or \
                        (Ri > tol and alphas[i] > 0):
                    # Choose the second parameter at random
                    j = selectJRand(i)
                    Ej = E(j, alphas, b)
                    # get bounds
                    if y[i] != y[j]:
                        L = max(0, alphas[j] - alphas[i])
                        H = min(C, C + alphas[j] - alphas[i])
                    else:
                        L = max(0, alphas[j] + alphas[i] - C)
                        H = min(C, alphas[j] + alphas[i])
                    if L == H:
                        continue
                    Kii = K[i, i]
                    Kjj = K[j, j]
                    Kij = K[i, j]
                    eta = 2.0 * Kij - Kii - Kjj
                    if eta >= 0:
                        continue
                    iOld = alphas[i].copy()
                    jOld = alphas[j].copy()
                    alphas[j] = jOld - y[j] * (Ei - Ej) / eta
                    if alphas[j] > H:
                        alphas[j] = H
                    elif alphas[j] < L:
                        alphas[j] = L
                    if abs(alphas[j] - jOld) < tol:
                        alphas[j] = jOld
                        continue
                    alphas[i] = iOld + y[i] * y[j] * (jOld - alphas[j])
                    # update b
                    bINew = b - Ei - y[i] * (alphas[i] - iOld) * Kii - y[j] * \
                        (alphas[j] - jOld) * Kij
                    bJNew = b - Ej - y[i] * (alphas[i] - iOld) * Kij - y[j] * \
                        (alphas[j] - jOld) * Kjj
                    if alphas[i] > 0 and alphas[i] < C:
                        b = bINew
                    elif alphas[j] > 0 and alphas[j] < C:
                        b = bJNew
                    else:
                        b = (bINew + bJNew) / 2.0
                    numChanged += 1
            if numChanged == 0:
                iterCount += 1
            else:
                iterCount = 0
        supportVectorsIndex = np.nonzero(alphas.A > 0)[0]
        supportVectors = np.mat(X[supportVectorsIndex])
        return alphas, w(alphas, b, supportVectorsIndex, supportVectors), b, \
            supportVectorsIndex, supportVectors, iterCount

    return trainSimple, train, predict
```

References
- "Machine Learning in Action"
- "Machine Learning"