Machine Learning in Action, Chapter 13: PCA
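While working through this chapter, all of the earlier programs ran without trouble. In Section 13.3, however, which uses PCA to reduce the dimensionality of the semiconductor manufacturing data, I got this error: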
numpy.linalg.linalg.LinAlgError: Array must not contain infs or NaNs

The message is clear: the data contains infinities (infs) or missing values (NaNs). So let's find the statement that raised the error:
eigVals, eigVects = np.linalg.eig(np.matrix(covMat))

The problem is evidently in the contents of covMat, so let's print it:
>>> covMat
array([[ nan, nan, nan, ..., nan, nan, nan],
       [ nan, 6436.49876891, nan, ..., nan, nan, nan],
       [ nan, nan, nan, ..., nan, nan, nan],
       ...,
       [ nan, nan, nan, ..., nan, nan, nan],
       [ nan, nan, nan, ..., nan, nan, nan],
       [ nan, nan, nan, ..., nan, nan, nan]])

covMat is full of NaN values. This is how covMat was computed:
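>>> dataMat = pca.replaceNanWithMean()
>>> meanVals = np.mean(dataMat, axis = 0)
>>> meanRemoved = dataMat - meanVals
>>> covMat = np.cov(meanRemoved, rowvar = 0)

Printing meanRemoved shows NaN values. Printing meanVals shows NaN values too, and so does dataMat itself. Sigh. So the NaN-removal step inside replaceNanWithMean() is what is not working: a single NaN left in a column makes that column's mean NaN, and that propagates through meanRemoved into covMat. I did not feel like digging into the function, so I took a simpler route and only replaced the NaNs in covMat with 0. This is a lazy shortcut, and the results may differ somewhat from those in the book, but since I am only doing this to learn, I can live with that. Here is what I changed: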
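For comparison with the shortcut, the NaNs can also be dealt with at the source, in dataMat. The following is a minimal sketch of column-mean imputation, not the book's replaceNanWithMean(); the function name `replace_nan_with_column_mean` and the sample array are assumptions made for illustration.

```python
import numpy as np

def replace_nan_with_column_mean(data):
    """Return a float copy of `data` with each NaN replaced by its column mean."""
    data = np.array(data, dtype=float)        # copy, so the caller's array is untouched
    col_means = np.nanmean(data, axis=0)      # per-column mean that ignores NaNs
    # A column that is entirely NaN has no mean (NumPy warns and returns NaN);
    # fall back to 0 for such columns. This fallback is an assumption.
    col_means = np.where(np.isnan(col_means), 0.0, col_means)
    nan_rows, nan_cols = np.nonzero(np.isnan(data))
    data[nan_rows, nan_cols] = col_means[nan_cols]
    return data

# Hypothetical sample data, not the semiconductor data set
sample = np.array([[1.0, np.nan],
                   [3.0, 4.0],
                   [np.nan, 8.0]])
cleaned = replace_nan_with_column_mean(sample)
```

With data cleaned this way, np.mean and np.cov no longer have NaNs to propagate, so covMat would not need patching afterwards.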
After covMat has been computed, I run a single statement that sets every NaN in it to 0.
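The statement itself is missing from this copy of the post. One way to write it, shown here only as an assumption about what was intended, is boolean-mask assignment with np.isnan; the small covMat below is a stand-in for the real matrix.

```python
import numpy as np

# Stand-in for the covariance matrix from the post (the real one is much larger)
covMat = np.array([[np.nan, np.nan],
                   [np.nan, 6436.49876891]])

# Overwrite every NaN entry with 0, in place; finite entries are left alone
covMat[np.isnan(covMat)] = 0
```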
Here is covMat after the change:
>>> covMat
array([[ 0. , 0. , 0. , ..., 0. , 0. , 0. ],
       [ 0. , 6436.49876891, 0. , ..., 0. , 0. , 0. ],
       [ 0. , 0. , 0. , ..., 0. , 0. , 0. ],
       ...,
       [ 0. , 0. , 0. , ..., 0. , 0. , 0. ],
       [ 0. , 0. , 0. , ..., 0. , 0. , 0. ],
       [ 0. , 0. , 0. , ..., 0. , 0. , 0. ]])

Good. Now run the eigendecomposition again:
>>> eigVals, eigVects = np.linalg.eig(np.matrix(covMat))
>>> eigVals
array([ 1.06780209e+04, 8.59736396e+03, 6.41273414e+03,
        5.02643597e+03, 3.40488093e+03, 3.50779450e+03,
        2.22725443e+03, 2.75157234e+03, 2.84136889e+02,
        4.31321293e+01, 3.88326123e+01, 3.62389854e+01,
        2.40571225e+01, 1.75260412e+01, 1.33949736e+01,
        1.07994035e+01, 7.22312565e+00, 3.51543592e+00,
        2.09186700e+00, 1.04068981e+00, 9.22874395e-01,
        8.93680570e-01, 5.61249320e-01, 3.11675322e-01,
        7.21988883e-02, 2.18386663e-02, 1.35886283e-02,
        5.46328191e-03, 2.03257991e-03, 1.33860851e-03,
        2.03863244e-04, 1.41361019e-04, 1.15731062e-04,
        9.39311390e-05, 7.57144587e-05, 5.19232822e-05,
        4.07149667e-05, 2.43725392e-05, 2.35559282e-05,
        1.72257179e-05, 7.59457610e-06, 6.94631765e-06,
        1.99023740e-06, 1.39115700e-06, 6.53622178e-07,
        3.13084316e-07, 1.32830309e-09, 9.79155640e-09,
        3.03601711e-08, 1.20546679e-07, 1.08996063e-07,
        2.10808964e-07, 1.89481992e-07, 0.00000000e+00,
        0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
        0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
        0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
        0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
        ...
        0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
        0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
        0.00000000e+00, 0.00000000e+00])

Plotting these, the result still looks pretty good, haha (or so I keep telling myself...).
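The eigenvalues above are not in strictly descending order (3.40488093e+03 comes before 3.50779450e+03), since np.linalg.eig does not guarantee any ordering. Before plotting or picking the top components it helps to sort them. The sketch below turns eigenvalues into per-component and cumulative shares of the total variance; the function name `variance_explained` and the toy eigenvalues are assumptions for illustration, not values from the post.

```python
import numpy as np

def variance_explained(eig_vals):
    """Return (ratio, cumulative) shares of total variance, largest eigenvalue first."""
    vals = np.sort(np.asarray(eig_vals, dtype=float))[::-1]  # descending order
    ratio = vals / vals.sum()          # share of the total variance per component
    cumulative = np.cumsum(ratio)      # running total of those shares
    return ratio, cumulative

# Hypothetical toy eigenvalues, deliberately unsorted
ratio, cumulative = variance_explained([4.0, 1.0, 3.0, 2.0])
```

Plotting `ratio` or `cumulative` against the component index gives the kind of variance plot usually used to decide how many principal components to keep.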