Andrew Ng Machine Learning Ex2
This post consists of two parts:
- Logistic regression. Problem: predict whether a student will be admitted to university.
- Regularized logistic regression. Problem: predict whether a microchip passes quality assurance.
Logistic Regression
Problem: predict whether a student will be admitted to university.
Dataset: historical data on previous applicants: the scores on two exams and the admission result (0 = not admitted, 1 = admitted).
Goal: build a classification model that estimates an applicant's probability of admission from the two exam scores.
The two exam scores are the input features.
Initialization and loading the data
Dataset: ex2data1.txt (listed at the end of this post).
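The loading step is not shown in this part of the post; a minimal sketch, assuming the same column layout as the ex2data2.txt script reproduced later in the post (two exam scores plus a 0/1 label):

```matlab
%% Initialization
clear ; close all; clc

%% Load Data
%  The first two columns contain the exam scores and the third column
%  contains the admission label (y).
data = load('ex2data1.txt');
X = data(:, [1, 2]);
y = data(:, 3);
```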
Part 1: Plotting
```matlab
%% ==================== Part 1: Plotting ====================
%  We start the exercise by first plotting the data to understand the
%  problem we are working with.

fprintf(['Plotting data with + indicating (y = 1) examples and o ' ...
         'indicating (y = 0) examples.\n']);

plotData(X, y);  % we implement this function ourselves

% Put some labels
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')

% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
```
The goal is a scatter plot whose axes are the two exam scores, with the marker color indicating the admission result: black for admitted, yellow for not admitted.
plotData.m
```matlab
function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.

% Create New Figure
figure; hold on;

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the positive and negative examples on a
%               2D plot, using the option 'k+' for the positive
%               examples and 'ko' for the negative examples.
%

pos = find(y == 1);
neg = find(y == 0);

plot(X(pos,1), X(pos,2), 'k+', 'LineWidth', 2, ...
     'MarkerSize', 7);
plot(X(neg,1), X(neg,2), 'ko', 'MarkerFaceColor', 'y', ...
     'MarkerSize', 7);

% =========================================================================

hold off;

end
```
Plot result:
Reference notes
The find function
Purpose: returns the indices of the nonzero elements (i.e., their positions in the array).
Here we use pos = find(y == 1). Since y is a column vector, find(y == 1) returns the positions where y equals 1, and pos is then used to pick the corresponding rows of X for plotting.
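A tiny illustration with made-up values:

```matlab
y = [1; 0; 1; 0; 1];   % hypothetical labels
pos = find(y == 1)     % -> [1; 3; 5], rows where y equals 1
neg = find(y == 0)     % -> [2; 4],    rows where y equals 0
% X(pos, :) then selects exactly the positive examples for plotting.
```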
The plotting calls
'MarkerFaceColor': marker fill color.
In this example, 'MarkerFaceColor','y' fills the markers with yellow.
(Source: MATLAB Help Center.)
Without 'MarkerFaceColor','y', the resulting figure would be black and white.
Part 2: Compute Cost and Gradient
```matlab
%% ============ Part 2: Compute Cost and Gradient ============
%  In this part of the exercise, you will implement the cost and gradient
%  for logistic regression. You need to complete the code in
%  costFunction.m

%  Setup the data matrix appropriately, and add ones for the intercept term
[m, n] = size(X);

% Add intercept term to x and X_test
X = [ones(m, 1) X];

% Initialize fitting parameters
initial_theta = zeros(n + 1, 1);

% Compute and display initial cost and gradient
[cost, grad] = costFunction(initial_theta, X, y);

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
fprintf('Expected gradients (approx):\n -0.1000\n -12.0092\n -11.2628\n');

% Compute and display cost and gradient with non-zero theta
test_theta = [-24; 0.2; 0.2];
[cost, grad] = costFunction(test_theta, X, y);

fprintf('\nCost at test theta: %f\n', cost);
fprintf('Expected cost (approx): 0.218\n');
fprintf('Gradient at test theta: \n');
fprintf(' %f \n', grad);
fprintf('Expected gradients (approx):\n 0.043\n 2.566\n 2.647\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
```
The sigmoid.m function
The hypothesis (prediction) function is
$$h_\theta(x) = g(\theta^T x)$$
where $g$ is the sigmoid function
$$g(z) = \frac{1}{1 + e^{-z}}$$
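The post does not list sigmoid.m itself; a minimal vectorized sketch matching this definition (it works element-wise on scalars, vectors, and matrices) would be:

```matlab
function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z element-wise.

g = 1 ./ (1 + exp(-z));

end
```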
Properties of the function:
```
sigmoid(100000)
ans = 1
sigmoid(0)
ans = 0.5000
sigmoid(-100000)
ans = 0
```
Cost function
$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right]$$
The gradient of the cost is a vector whose $j$-th element is defined as
$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$
costFunction.m
```matlab
function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%

h = sigmoid(X * theta);
first  = y .* log(h);            % first term (element-wise product)
second = (1 - y) .* log(1 - h);  % second term (also element-wise)
J = -1/m * sum(first + second);  % sum over all examples: the cost
grad = 1/m * X' * (h - y);       % vectorized gradient

% =============================================================

end
```
Part 3: Optimizing using fminunc
Next we call the built-in function fminunc.
First we set the options for fminunc: setting GradObj to 'on' tells fminunc that our function returns two values, the cost and the gradient.
Setting MaxIter to 400 tells fminunc to stop after at most 400 iterations.
@(t)(costFunction(t, X, y)) creates an anonymous function of the argument t that calls costFunction:
```matlab
[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
```
The full Part 3 script:
```matlab
%% ============= Part 3: Optimizing using fminunc =============
%  In this exercise, you will use a built-in function (fminunc) to find the
%  optimal parameters theta.

%  Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

%  Run fminunc to obtain the optimal theta
%  This function will return theta and the cost
[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

% Print theta to screen
fprintf('Cost at theta found by fminunc: %f\n', cost);
fprintf('Expected cost (approx): 0.203\n');
fprintf('theta: \n');
fprintf(' %f \n', theta);
fprintf('Expected theta (approx):\n');
fprintf(' -25.161\n 0.206\n 0.201\n');

% Plot Boundary
plotDecisionBoundary(theta, X, y);

% Put some labels
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')

% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
```
With fminunc you do not have to write the optimization loop yourself or pick a learning rate for gradient descent; you only supply a function (costFunction) that computes the cost and the gradient. fminunc converges to the optimal parameters and returns the final cost and $\theta$.
```
Cost at theta found by fminunc: 0.203498
Expected cost (approx): 0.203
theta:
 -25.161343
 0.206232
 0.201472
Expected theta (approx):
 -25.161
 0.206
 0.201
```
Next we plot the decision boundary.
For the hypothesis
$$h_\theta(x) = g(\theta_1 + \theta_2 x_2 + \theta_3 x_3)$$
the sigmoid function's decision point is at $z = 0$ (for $z > 0$ the prediction is 1, for $z < 0$ it is 0), so on the decision boundary itself we have (the variables here being $x_2$ and $x_3$):
$$\theta_1 + \theta_2 x_2 + \theta_3 x_3 = 0$$
The subscripts 1, 2, 3 are used here because MATLAB indexing starts at 1 rather than 0, which keeps the code simple.
The plotting function used here is:
plotDecisionBoundary.m
To draw a straight line we need the (x, y) coordinate pairs of points on it; the original post illustrates this with a small example plot.
For this problem, what are the corresponding point pairs? The variables here are ($x_2$, $x_3$): $x_2$ is plot_x, and $x_3$ (plot_y) is obtained by solving $\theta_1 + \theta_2 x_2 + \theta_3 x_3 = 0$ for $x_3$, as in the sketch below.
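A sketch of that step, using the plot_x / plot_y variable names from the exercise code:

```matlab
% Two x2 endpoints are enough to define a straight line.
plot_x = [min(X(:,2)) - 2,  max(X(:,2)) + 2];

% Solve theta(1) + theta(2)*x2 + theta(3)*x3 = 0 for x3.
plot_y = (-1 ./ theta(3)) .* (theta(2) .* plot_x + theta(1));

plot(plot_x, plot_y)
```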
Part 4: Predict and Accuracies
```matlab
%% ============== Part 4: Predict and Accuracies ==============
%  After learning the parameters, you'll likely want to use the model to
%  predict the outcomes on unseen data. In this part, you will use the
%  logistic regression model to predict the probability that a student
%  with score 45 on exam 1 and score 85 on exam 2 will be admitted.
%
%  Furthermore, you will compute the training and test set accuracies of
%  our model.
%
%  Your task is to complete the code in predict.m

%  Predict probability for a student with score 45 on exam 1
%  and score 85 on exam 2
prob = sigmoid([1 45 85] * theta);
fprintf(['For a student with scores 45 and 85, we predict an admission ' ...
         'probability of %f\n'], prob);
fprintf('Expected value: 0.775 +/- 0.002\n\n');

% Compute accuracy on our training set
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('Expected accuracy (approx): 89.0\n');
fprintf('\n');
```
Below is the predict.m function.
The return value p holds the predictions, each either 0 or 1: 0 means not admitted, 1 means admitted.
X has dimensions m×(n+1), where m is the number of training examples, and theta is an (n+1)×1 vector.
The vectorized form below is also the style MATLAB recommends, and it runs faster:
```matlab
% or, written out:
index = sigmoid(X*theta) >= 0.5;
p(index) = 1;
```
The full code:
```matlab
function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters.
%               You should set p to a vector of 0's and 1's
%

index = sigmoid(X*theta) >= 0.5;
p(index) = 1;

% =========================================================================

end
```
Next we compute the training accuracy.
The accuracy is the mean agreement between the predictions p and the true labels y: each example where p == y contributes 1, each mismatch contributes 0, and averaging these values gives the accuracy. For example, if 90 out of 100 examples are predicted correctly, the accuracy is (90 × 1 + 10 × 0) / 100 = 0.9.
Result
```
For a student with scores 45 and 85, we predict an admission probability of 0.776291
Expected value: 0.775 +/- 0.002

Train Accuracy: 89.000000
Expected accuracy (approx): 89.0
```
Dataset
ex2data1.txt
Regularized Logistic Regression
Problem: predict whether a microchip passes quality assurance.
Dataset: historical results of two different tests run on past microchips.
Data visualization
```matlab
%% Initialization
clear ; close all; clc

%% Load Data
%  The first two columns contain the X values and the third column
%  contains the label (y).

data = load('ex2data2.txt');
X = data(:, [1, 2]); y = data(:, 3);

plotData(X, y);

% Put some labels
hold on;

% Labels and Legend
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')

% Specified in plot order
legend('y = 1', 'y = 0')
hold off;
```
Result: a scatter plot of each chip's two test results, produced by the same plotData function as above.
The plot shows that the decision boundary is not linear, so a linear model will not separate the classes; we need polynomial features.
Part 1: Regularized Logistic Regression
To get a richer representation of each data point, we map the two features to all polynomial terms of $x_1$ and $x_2$ up to the sixth power.
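Concretely, the mapping collects every monomial $x_1^i x_2^j$ with $i + j \le 6$:

$$\text{mapFeature}(x) = \begin{bmatrix} 1 & x_1 & x_2 & x_1^2 & x_1 x_2 & x_2^2 & x_1^3 & \cdots & x_1 x_2^5 & x_2^6 \end{bmatrix}^T$$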
As can be seen above, the two features ($x_1$ and $x_2$) are transformed into a 28-dimensional ($28 \times 1$) feature vector.
A logistic regression classifier trained on this higher-dimensional feature vector has a more complex, non-linear decision boundary.
The polynomial feature mapping is implemented by the following function.
The mapFeature.m function
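The body of the function is not shown in the post; a sketch of the standard mapFeature.m distributed with the exercise:

```matlab
function out = mapFeature(X1, X2)
% MAPFEATURE Feature mapping function to polynomial features
%   MAPFEATURE(X1, X2) maps the two input features to all polynomial
%   terms of X1 and X2 up to the sixth power. Inputs X1, X2 must be
%   the same size. The first output column is all ones (intercept term).

degree = 6;
out = ones(size(X1(:,1)));
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1.^(i-j)) .* (X2.^j);
    end
end

end
```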
Next we compute the cost function and gradient for regularized logistic regression.
The cost function is:
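In LaTeX form, this is the Part 2 cost plus an L2 penalty over every parameter except the intercept:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right) + \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$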
Note that $\theta_0$ must not be regularized; in MATLAB terms this means theta(1) is excluded, since MATLAB indices start at 1.
The corresponding gradient is:
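Written out, it differs from the unregularized gradient only by the extra $\frac{\lambda}{m}\theta_j$ term, which is not applied to $\theta_0$:

$$\frac{\partial J(\theta)}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$$

$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j \qquad (j \ge 1)$$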
The costFunctionReg.m file
Its core is simply the MATLAB implementation of the two formulas above.
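The file itself is not reproduced in the post; a minimal sketch that implements the two formulas above, vectorized and excluding theta(1) from the penalty, could look like this:

```matlab
function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression
%with regularization

m = length(y);            % number of training examples
h = sigmoid(X * theta);   % hypothesis for all examples

% Cost: unregularized term plus the L2 penalty (theta(1) is excluded)
J = -1/m * sum(y .* log(h) + (1 - y) .* log(1 - h)) ...
    + lambda/(2*m) * sum(theta(2:end).^2);

% Gradient: add lambda/m * theta_j for every j except the intercept
grad = 1/m * X' * (h - y);
grad(2:end) = grad(2:end) + lambda/m * theta(2:end);

end
```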
Below is the script that calls these functions and runs regularized logistic regression.
```matlab
%% =========== Part 1: Regularized Logistic Regression ============
%  In this part, you are given a dataset with data points that are not
%  linearly separable. However, you would still like to use logistic
%  regression to classify the data points.
%
%  To do so, you introduce more features to use -- in particular, you add
%  polynomial features to our data matrix (similar to polynomial
%  regression).
%

% Add Polynomial Features

% Note that mapFeature also adds a column of ones for us, so the intercept
% term is handled
X = mapFeature(X(:,1), X(:,2));

% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);
% size(X, 2) returns the length of the second dimension of X,
% i.e., the number of elements in each row

% Set regularization parameter lambda to 1
lambda = 1;

% Compute and display initial cost and gradient for regularized logistic
% regression
[cost, grad] = costFunctionReg(initial_theta, X, y, lambda);

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros) - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.0085\n 0.0188\n 0.0001\n 0.0503\n 0.0115\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

% Compute and display cost and gradient
% with all-ones theta and lambda = 10
test_theta = ones(size(X,2),1);
[cost, grad] = costFunctionReg(test_theta, X, y, 10);

fprintf('\nCost at test theta (with lambda = 10): %f\n', cost);
fprintf('Expected cost (approx): 3.16\n');
fprintf('Gradient at test theta - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.3460\n 0.1614\n 0.1948\n 0.2269\n 0.0922\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
```
Test results:
```
Cost at initial theta (zeros): 0.693147
Expected cost (approx): 0.693
Gradient at initial theta (zeros) - first five values only:
 0.008475
 0.018788
 0.000078
 0.050345
 0.011501
Expected gradients (approx) - first five values only:
 0.0085
 0.0188
 0.0001
 0.0503
 0.0115

Program paused. Press enter to continue.

Cost at test theta (with lambda = 10): 3.164509
Expected cost (approx): 3.16
Gradient at test theta - first five values only:
 0.346045
 0.161352
 0.194796
 0.226863
 0.092186
Expected gradients (approx) - first five values only:
 0.3460
 0.1614
 0.1948
 0.2269
 0.0922

Program paused. Press enter to continue.
```
Part 2: Regularization and Accuracies
Regularization and accuracy: try different values of $\lambda$ and observe how the decision boundary and the training-set accuracy change.
```matlab
%% ============= Part 2: Regularization and Accuracies =============
%  Optional Exercise:
%  In this part, you will get to try different values of lambda and
%  see how regularization affects the decision boundary
%
%  Try the following values of lambda (0, 1, 10, 100).
%
%  How does the decision boundary change when you vary lambda? How does
%  the training set accuracy vary?
%

% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);

% Set regularization parameter lambda to 1 (you should vary this)
lambda = 1;

% Set Options
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Optimize
[theta, J, exit_flag] = ...
    fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);

% Plot Boundary
plotDecisionBoundary(theta, X, y);
hold on;
title(sprintf('lambda = %g', lambda))

% Labels and Legend
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')

legend('y = 1', 'y = 0', 'Decision boundary')
hold off;

% Compute accuracy on our training set
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('Expected accuracy (with lambda = 1): 83.1 (approx)\n');
```
Result:
The corresponding training accuracy:
```
Train Accuracy: 88.983051
Expected accuracy (with lambda = 1): 83.1 (approx)
```
and, for another choice of $\lambda$:
```
Train Accuracy: 61.016949
Expected accuracy (with lambda = 1): 83.1 (approx)
```
Here, plotDecisionBoundary.m, the function that draws the decision boundary, is as follows.
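A sketch of the plotDecisionBoundary.m supplied with the exercise: for a three-column X it draws the straight line derived earlier; otherwise it evaluates the mapped hypothesis on a grid and plots the zero contour:

```matlab
function plotDecisionBoundary(theta, X, y)
%PLOTDECISIONBOUNDARY Plots the data points X and y into a new figure with
%the decision boundary defined by theta

plotData(X(:,2:3), y);
hold on

if size(X, 2) <= 3
    % Linear boundary: only two points are needed to define a line
    plot_x = [min(X(:,2))-2,  max(X(:,2))+2];
    plot_y = (-1./theta(3)) .* (theta(2).*plot_x + theta(1));
    plot(plot_x, plot_y)
    legend('Admitted', 'Not admitted', 'Decision Boundary')
    axis([30, 100, 30, 100])
else
    % Non-linear boundary: evaluate theta' * mapFeature(u, v) on a grid
    u = linspace(-1, 1.5, 50);
    v = linspace(-1, 1.5, 50);
    z = zeros(length(u), length(v));
    for i = 1:length(u)
        for j = 1:length(v)
            z(i,j) = mapFeature(u(i), v(j)) * theta;
        end
    end
    z = z'; % transpose z before calling contour
    % Plot z = 0 as the decision boundary
    contour(u, v, z, [0, 0], 'LineWidth', 2)
end
hold off

end
```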
Dataset
The contents of ex2data2.txt are as follows: