Handwritten Digit Recognition
This post implements handwritten digit recognition using two approaches: multi-class logistic regression and a neural network.
Multi-class Classification
Data Source
There are 5000 training examples in ex3data1.mat, where each training example is a 20 pixel by 20 pixel grayscale image of the digit. Each pixel is represented by a floating point number indicating the grayscale intensity at that location. The 20 by 20 grid of pixels is “unrolled” into a 400-dimensional vector. Each of these training examples becomes a single row in our data matrix X. This gives us a 5000 by 400 matrix X where every row is a training example for a handwritten digit image.
The second part of the training set is a 5000-dimensional vector y that contains labels for the training set. To make things more compatible with Octave/MATLAB indexing, where there is no zero index, we have mapped the digit zero to the value ten. Therefore, a “0” digit is labeled as “10”, while the digits “1” to “9” are labeled as “1” to “9” in their natural order.
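The unrolling and the zero-to-ten label mapping described above can be sketched in NumPy (synthetic data stands in for ex3data1.mat; `order='F'` mirrors Octave's column-major `image(:)` unrolling):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one 20x20 grayscale digit (the real data lives in ex3data1.mat).
image = rng.random((20, 20))

# "Unroll" the 20x20 grid into a 400-dimensional vector; order='F' mirrors
# Octave's column-major image(:) unrolling.
x_row = image.reshape(400, order='F')

# Stacking 5000 such rows gives the 5000 x 400 data matrix X.
X = rng.random((5000, 400))

# Label mapping: digit 0 is stored as 10 to fit Octave's 1-based indexing.
digits = np.array([0, 1, 5, 9])
labels = np.where(digits == 0, 10, digits)
print(labels.tolist())  # [10, 1, 5, 9]
```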
Data Visualization
The data can be inspected with the displayData function.
Vectorizing Logistic Regression
Since the dataset is fairly large, vectorizing the computation makes training noticeably faster and more efficient.
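As a quick illustration of the payoff (a NumPy sketch with synthetic data; the exercise itself is in Octave), a single matrix-vector product replaces a per-example loop:

```python
import numpy as np
import time

rng = np.random.default_rng(0)
X = rng.random((5000, 400))   # same shape as the digit data matrix
theta = rng.random(400)

# Loop version: one dot product per training example.
t0 = time.perf_counter()
loop_result = np.array([x @ theta for x in X])
t_loop = time.perf_counter() - t0

# Vectorized version: one matrix-vector product for all 5000 examples.
t0 = time.perf_counter()
vec_result = X @ theta
t_vec = time.perf_counter() - t0

print(np.allclose(loop_result, vec_result))  # True: same numbers, far less work
```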
Vectorizing the Cost Function
The regularized cost function is

$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log\left(h_\theta(x^{(i)})\right)-\left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right]+\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

where $h_\theta(x) = g(\theta^T x)$ and $g(z) = \frac{1}{1+e^{-z}}$ is the sigmoid function.
Let us define $X$ and $\theta$ as

$$X=\begin{bmatrix}(x^{(1)})^T\\(x^{(2)})^T\\\vdots\\(x^{(m)})^T\end{bmatrix},\qquad \theta=\begin{bmatrix}\theta_0\\\theta_1\\\vdots\\\theta_n\end{bmatrix}$$

Then $X\theta$ stacks $\theta^T x^{(i)}$ for all $m$ examples, so the whole vector of hypotheses can be computed at once as $h = g(X\theta)$.
Vectorizing the Gradient
The gradient of the regularized cost is

$$\frac{\partial J}{\partial\theta_j}=\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}+\frac{\lambda}{m}\theta_j\qquad(j\ge 1)$$

with no regularization term for $j=0$. In vectorized form this is $\nabla J = \frac{1}{m}X^T\left(g(X\theta)-y\right)+\frac{\lambda}{m}\theta$, where the regularization term uses $\theta$ with its first element set to zero.
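The vectorized cost and gradient can be sketched in NumPy as follows (the exercise itself uses Octave; the function name here is illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lr_cost_function(theta, X, y, lam):
    """Vectorized regularized logistic-regression cost and gradient."""
    m = len(y)
    h = sigmoid(X @ theta)
    # The bias parameter theta[0] is excluded from regularization.
    temp = np.concatenate(([0.0], theta[1:]))
    J = (-y @ np.log(h) - (1 - y) @ np.log(1 - h)) / m \
        + lam / (2 * m) * (temp @ temp)
    grad = X.T @ (h - y) / m + lam / m * temp
    return J, grad

# Tiny example: 3 examples, a bias column plus one feature.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0])
theta = np.zeros(2)
J, grad = lr_cost_function(theta, X, y, 1.0)
print(round(J, 4))  # 0.6931 (= log 2 at theta = 0)
```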
Tip
Debugging Tip: Vectorizing code can sometimes be tricky. One common strategy for debugging is to print out the sizes of the matrices you are working with using the size function. For example, given a data matrix X of size 100 × 20 (100 examples, 20 features) and θ, a vector with dimensions 20×1, you can observe that Xθ is a valid multiplication operation, while θX is not. Furthermore, if you have a non-vectorized version of your code, you can compare the output of your vectorized code and non-vectorized code to make sure that they produce the same outputs.
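The same size check works in NumPy, where `.shape` plays the role of Octave's size function:

```python
import numpy as np

X = np.ones((100, 20))    # 100 examples, 20 features
theta = np.ones((20, 1))  # parameter column vector

print(X.shape, theta.shape)   # (100, 20) (20, 1)
print((X @ theta).shape)      # (100, 1): valid multiplication

try:
    theta @ X                 # (20,1) @ (100,20): inner dimensions mismatch
    print("unexpectedly valid")
except ValueError:
    print("theta @ X is not a valid multiplication")
```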
Putting this together, we can write lrCostFunction:
```matlab
function [J, grad] = lrCostFunction(theta, X, y, lambda)
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;
grad = zeros(size(theta));

temp = [0; theta(2:end)];  % drop theta(1) so it is not regularized
J = -1 * sum( y .* log(sigmoid(X*theta)) + (1 - y) .* log(1 - sigmoid(X*theta)) ) / m ...
    + lambda/(2*m) * (temp' * temp);
grad = ( X' * (sigmoid(X*theta) - y) ) / m + lambda/m * temp;
grad = grad(:);
end
```

One-vs-all Classification
In this problem we need ten classifiers, one for each digit class.
Tip
Octave/MATLAB Tip: Logical arrays in Octave/MATLAB are arrays which contain binary (0 or 1) elements. In Octave/MATLAB, evaluating the expression a == b for a vector a (of size m×1) and scalar b will return a vector of the same size as a with ones at positions where the elements of a are equal to b and zeroes where they are different. To see how this works for yourself, try the following code in Octave/MATLAB:

```matlab
a = 1:10; % Create a and b
b = 3;
a == b    % You should try different values of b here
```

From this we can write the oneVsAll function:
```matlab
function [all_theta] = oneVsAll(X, y, num_labels, lambda)
m = size(X, 1);
n = size(X, 2);
all_theta = zeros(num_labels, n + 1);  % preallocate one row of parameters per class
X = [ones(m, 1) X];

options = optimset('GradObj', 'on', 'MaxIter', 50);
initial_theta = zeros(n + 1, 1);
for c = 1:num_labels
    % Train the c-th classifier: (y == c) is 1 for class c, 0 otherwise
    all_theta(c,:) = fmincg(@(t)(lrCostFunction(t, X, (y == c), lambda)), ...
                            initial_theta, options);
end
end
```

One-vs-all Prediction
With the trained θ obtained above, we run each example through all ten classifiers, obtain the probability of each class, and take the class with the highest probability as the prediction.
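This prediction step can be sketched in NumPy (the function name is illustrative; the exercise implements it as predictOneVsAll in Octave):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_one_vs_all(all_theta, X):
    """For each example, pick the class whose classifier scores highest."""
    m = X.shape[0]
    X = np.hstack([np.ones((m, 1)), X])  # add bias column
    scores = sigmoid(X @ all_theta.T)    # m x num_labels matrix of probabilities
    # argmax is 0-based; the exercise's labels run 1..10, hence the +1.
    return np.argmax(scores, axis=1) + 1

# Toy example: 2 features, 3 classes, hand-picked weights.
all_theta = np.array([[0.0,  1.0,  0.0],   # class 1 favors feature 1
                      [0.0,  0.0,  1.0],   # class 2 favors feature 2
                      [0.0, -1.0, -1.0]])  # class 3 favors small features
X = np.array([[5.0, 0.0], [0.0, 5.0], [-5.0, -5.0]])
print(predict_one_vs_all(all_theta, X))  # [1 2 3]
```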
This yields an accuracy of about 95%.
Source code: machine-learning-ex3
Neural Network Implementation
The main steps of the neural network implementation are as follows:
- Initialize and load the data
- Compute Cost (Feedforward)
- Implement Regularization
- Randomly initialize parameters
- Implement Backpropagation
- Gradient checking
- Train the neural network
- Predict
Implementing Backpropagation
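The exercise implements this in Octave's nnCostFunction; here is a minimal NumPy sketch of the standard algorithm for one hidden layer, with a numerical gradient check (all names are assumptions, not the exercise code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_cost_grad(Theta1, Theta2, X, y_onehot, lam):
    """Feedforward cost + backpropagated gradients for a 1-hidden-layer net."""
    m = X.shape[0]
    # Feedforward
    a1 = np.hstack([np.ones((m, 1)), X])          # input + bias
    z2 = a1 @ Theta1.T
    a2 = np.hstack([np.ones((m, 1)), sigmoid(z2)])
    a3 = sigmoid(a2 @ Theta2.T)                   # h(x), m x num_labels
    # Regularized cross-entropy cost (bias columns excluded from the penalty)
    J = (-y_onehot * np.log(a3) - (1 - y_onehot) * np.log(1 - a3)).sum() / m
    J += lam / (2 * m) * ((Theta1[:, 1:] ** 2).sum() + (Theta2[:, 1:] ** 2).sum())
    # Backpropagation
    d3 = a3 - y_onehot                            # output-layer error
    d2 = (d3 @ Theta2[:, 1:]) * sigmoid(z2) * (1 - sigmoid(z2))
    Theta1_grad = d2.T @ a1 / m
    Theta2_grad = d3.T @ a2 / m
    Theta1_grad[:, 1:] += lam / m * Theta1[:, 1:]
    Theta2_grad[:, 1:] += lam / m * Theta2[:, 1:]
    return J, Theta1_grad, Theta2_grad

# Tiny gradient check (the "gradient checking" step from the list above).
rng = np.random.default_rng(1)
Theta1 = rng.normal(scale=0.1, size=(4, 3))  # 2 inputs -> 4 hidden units
Theta2 = rng.normal(scale=0.1, size=(2, 5))  # 4 hidden -> 2 outputs
X = rng.random((5, 2))
y_onehot = np.eye(2)[rng.integers(0, 2, 5)]
J, g1, g2 = nn_cost_grad(Theta1, Theta2, X, y_onehot, 1.0)
eps = 1e-5
num = np.zeros_like(Theta1)
for i in range(Theta1.shape[0]):
    for j in range(Theta1.shape[1]):
        Tp, Tm = Theta1.copy(), Theta1.copy()
        Tp[i, j] += eps
        Tm[i, j] -= eps
        num[i, j] = (nn_cost_grad(Tp, Theta2, X, y_onehot, 1.0)[0]
                     - nn_cost_grad(Tm, Theta2, X, y_onehot, 1.0)[0]) / (2 * eps)
print(np.max(np.abs(num - g1)))  # tiny: backprop matches the numerical gradient
```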
Code to train the neural network:
```matlab
options = optimset('MaxIter', 50);

% You should also try different values of lambda
lambda = 1;

% Create "short hand" for the cost function to be minimized
costFunction = @(p) nnCostFunction(p, ...
                                   input_layer_size, ...
                                   hidden_layer_size, ...
                                   num_labels, X, y, lambda);

% Now, costFunction is a function that takes in only one argument (the
% neural network parameters)
[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);
```

Prediction code:
```matlab
function p = predict(Theta1, Theta2, X)
%PREDICT Predict the label of an input given a trained neural network
%   p = PREDICT(Theta1, Theta2, X) outputs the predicted label of X given the
%   trained weights of a neural network (Theta1, Theta2)

% Useful values
m = size(X, 1);
num_labels = size(Theta2, 1);

% You need to return the following variables correctly
p = zeros(size(X, 1), 1);

h1 = sigmoid([ones(m, 1) X] * Theta1');  % hidden-layer activations
h2 = sigmoid([ones(m, 1) h1] * Theta2'); % output-layer activations
[dummy, p] = max(h2, [], 2);             % index of the largest output = predicted label
end
```

The accuracy is around 95%; it varies by roughly 1% depending on the random parameter initialization.
Source code: machine-learning-ex4
Summary
Both approaches, one-vs-all logistic regression and the neural network, reach roughly 95% accuracy on the 5000-example handwritten-digit dataset; vectorization is what keeps training them efficient.