[Study Notes] Andrew Ng Machine Learning, Week 2: Linear Regression and the Octave Tutorial
Multivariate Linear Regression
Multiple Features
Before: $h_{\theta}(x)=\theta_{0}+\theta_{1} x_{1}+\theta_{2} x_{2}+\cdots+\theta_{n} x_{n}$

Define $x_{0}=1$.

Now: $h_{\theta}(x)=\theta^{T} x=\theta_{0} x_{0}+\theta_{1} x_{1}+\theta_{2} x_{2}+\cdots+\theta_{n} x_{n}$
Gradient descent for multiple variables
New algorithm $(n \geq 1)$ (multiple features):
Repeat {
$\theta_{j}:=\theta_{j}-\alpha \frac{1}{m} \sum_{i=1}^{m}\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)}$
(simultaneously update $\theta_{j}$ for $j=0,\ldots,n$)
}
Note: compute the per-example term $\left(h_{\theta}\left(x^{(i)}\right)-y^{(i)}\right) x_{j}^{(i)}$ first, then sum it over the training set.
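The update rule above can be sketched in vectorized form. The course uses Octave, but NumPy is substituted here; the toy dataset and the function name `gradient_descent` are made up for illustration.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for linear regression.

    X : (m, n+1) design matrix whose first column is all ones (x0 = 1)
    y : (m,) target vector
    Returns the learned parameter vector theta of shape (n+1,).
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        error = X @ theta - y            # h_theta(x^(i)) - y^(i) for every example
        gradient = (X.T @ error) / m     # sum the per-example terms, then average
        theta -= alpha * gradient        # simultaneous update of every theta_j
    return theta

# Tiny made-up example where y = 1 + 2*x1 exactly
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
theta = gradient_descent(X, y, alpha=0.1, num_iters=5000)
```

The matrix form `X.T @ error / m` computes all $n+1$ partial derivatives at once, which is why every $\theta_j$ is updated simultaneously.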
Gradient descent in practice I: Feature Scaling
Get every feature into approximately a $-1 \leq x_{i} \leq 1$ range.
Keeping all features on roughly the same scale lets gradient descent converge much faster.
Mean normalization: new value = (value − mean) / range, i.e. $x_i := \frac{x_i - \mu_i}{s_i}$, where $s_i$ is the range (max − min) or the standard deviation.
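A minimal sketch of mean normalization in NumPy (Octave is swapped for Python here; the house-size data is invented for illustration):

```python
import numpy as np

def mean_normalize(X):
    """Rescale each feature column to roughly [-1, 1] via (x - mean) / range."""
    mu = X.mean(axis=0)
    rng = X.max(axis=0) - X.min(axis=0)   # could also divide by the std deviation
    return (X - mu) / rng, mu, rng

# Columns: house size (sq ft), number of bedrooms -- very different scales
X = np.array([[2104.0, 3.0],
              [1600.0, 3.0],
              [2400.0, 4.0],
              [1416.0, 2.0]])
X_norm, mu, rng = mean_normalize(X)
```

After normalization every column has zero mean and stays within $[-1, 1]$; remember to apply the same `mu` and `rng` to any new input at prediction time.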
Gradient descent in practice II: Learning rate
Plot the cost function $J(\theta)$ against the number of iterations. If $J(\theta)$ does not decrease as the iterations increase, the learning rate is too large: decrease it. Try candidate values roughly 3× apart:
0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1
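The diagnostic above can be simulated: run gradient descent with several candidate learning rates and record the cost history. This is a self-contained sketch in NumPy (not Octave); the dataset and the helper names `cost`/`run_gd` are invented for illustration.

```python
import numpy as np

def cost(X, y, theta):
    """J(theta) = (1/2m) * sum of squared errors."""
    err = X @ theta - y
    return (err @ err) / (2 * len(y))

def run_gd(X, y, alpha, num_iters=50):
    """Gradient descent that records J(theta) after every iteration."""
    theta = np.zeros(X.shape[1])
    history = []
    for _ in range(num_iters):
        theta -= alpha * (X.T @ (X @ theta - y)) / len(y)
        history.append(cost(X, y, theta))
    return history

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Cost history per candidate learning rate
results = {alpha: run_gd(X, y, alpha) for alpha in [0.001, 0.01, 0.1, 1.0]}
```

On this tiny problem a larger (but still stable) $\alpha$ drives $J(\theta)$ down faster, while $\alpha = 1$ makes the cost grow each iteration, which is exactly the "decrease the learning rate" signal.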
Features and polynomial regression
$$\begin{aligned} h_{\theta}(x) &=\theta_{0}+\theta_{1} x_{1}+\theta_{2} x_{2}+\theta_{3} x_{3} \\ &=\theta_{0}+\theta_{1}(\text{size})+\theta_{2}(\text{size})^{2}+\theta_{3}(\text{size})^{3} \end{aligned}$$
where $x_{1}=(\text{size})$, $x_{2}=(\text{size})^{2}$, $x_{3}=(\text{size})^{3}$.
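Building these polynomial features makes the need for feature scaling obvious, since size, size², and size³ live on wildly different scales. A minimal NumPy sketch with made-up house sizes:

```python
import numpy as np

# Hypothetical house sizes in square feet
size = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])

# x1 = size, x2 = size^2, x3 = size^3 -- scales differ by orders of magnitude
X = np.column_stack([size, size**2, size**3])

# Normalize each column (here by its standard deviation), then prepend x0 = 1
mu, sigma = X.mean(axis=0), X.std(axis=0)
X_scaled = (X - mu) / sigma
X_design = np.column_stack([np.ones(len(size)), X_scaled])
```

After scaling, the cubic model is just multivariate linear regression over the design matrix `X_design`, so the same gradient descent update applies unchanged.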
Normal equation
Set the derivative of the cost function to zero and solve for the optimal $\theta$ directly, with no iteration needed.
From $X\theta = y$ (solved in the least-squares sense) one derives:
$\theta=\left(X^{T} X\right)^{-1} X^{T} y$
Feature scaling is not needed when using the normal equation.
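The closed-form solution is one line of linear algebra. A NumPy sketch on the same kind of toy data (Octave's `pinv(X'*X)*X'*y` would be the course's version); `np.linalg.solve` is used instead of forming the inverse explicitly, which is numerically preferable:

```python
import numpy as np

# Toy data where y = 1 + 2*x1 exactly
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Normal equation: solve (X^T X) theta = X^T y, no iteration, no learning rate
theta = np.linalg.solve(X.T @ X, X.T @ y)
```

Note there is no learning rate, no iteration count, and no feature scaling here, which is exactly the trade-off the table below summarizes.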
| Gradient descent | Normal equation |
| --- | --- |
| Must choose a learning rate $\alpha$ | No learning rate to choose |
| Needs many iterations | No iteration |
| Complexity $O(kn^2)$ | Complexity $O(n^3)$: must compute the inverse of $X^{T}X$ |
| Works well even with many features | Slow when there are many features |
Note: with the normal equation, computing the matrix inverse is expensive; once the number of features exceeds roughly 10,000, gradient descent is recommended instead.
Solving for $\theta$ above uses $\left(X^{T} X\right)^{-1}$, so we must ask whether $X^{T} X$ is invertible. The two common causes of non-invertibility are redundant features (linearly dependent columns) and too many features (more features than training examples).
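Even when $X^{T}X$ is singular, the pseudo-inverse (Octave's `pinv`, NumPy's `np.linalg.pinv`) still returns a usable least-squares solution. A sketch with a deliberately redundant feature ($x_2 = 2x_1$), using invented data:

```python
import numpy as np

# Column 3 is exactly 2 * column 2, so X^T X is singular (rank-deficient)
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 4.0],
              [1.0, 3.0, 6.0]])
y = np.array([2.0, 3.0, 4.0])

# The pseudo-inverse handles the singular matrix that a plain inverse cannot
theta = np.linalg.pinv(X.T @ X) @ X.T @ y
pred = X @ theta
```

The predictions still fit the data; the better long-term fix, though, is to remove the redundant feature (or use regularization, which comes later in the course).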
Octave Tutorial
Purpose: simple syntax and efficient numerical computation for fast prototyping.
Notes:
Summary