1. Neural Networks
Feedforward and cost function
The cost function for the neural network (without regularization) is
\[ J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K} \bigg[ -y_k^{(i)}\log\big((h_{\theta}(x^{(i)}))_k\big) - (1-y_k^{(i)})\log\big(1-(h_{\theta}(x^{(i)}))_k\big) \bigg], \]
where \(h_{\theta}(x^{(i)})\) is computed as shown in Figure 2 and \(K = 10\) is the total number of possible labels. Note that \(h_{\theta}(x^{(i)})_k = a^{(3)}_k\) is the activation (output value) of the \(k\)-th output unit.
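Here the labels \(y^{(i)}\) are one-hot vectors rather than the original digit labels. For example, if the \(i\)-th training example has label 5, then
\[ y^{(i)} = \begin{pmatrix} 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}^T, \]
so the \(k = 5\) term of the inner sum contributes \(-\log\big((h_{\theta}(x^{(i)}))_5\big)\) while every other term contributes \(-\log\big(1-(h_{\theta}(x^{(i)}))_k\big)\). The implementation below performs this recoding with yd = eye(num_labels).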
Implementation: nnCostFunction.m
a1 = [ones(m, 1) X];                          % input activations with bias column
z2 = a1*Theta1';                              % weighted inputs to the hidden layer
a2 = [ones(size(z2, 1), 1) sigmoid(z2)];      % hidden activations with bias column
z3 = a2*Theta2';                              % weighted inputs to the output layer
a3 = sigmoid(z3);                             % output activations, h_theta(x)
yd = eye(num_labels);
y = yd(y,:);                                  % recode labels as one-hot row vectors
log_dif = -log(a3).*y - log(1-a3).*(1-y);     % per-example, per-class cost terms
J = sum(log_dif(:))/m;                        % average over all examples and classes
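A minimal usage sketch, assuming the exercise's usual call signature nnCostFunction(nn_params, input_layer_size, hidden_layer_size, num_labels, X, y, lambda) and that Theta1, Theta2, X and y are already in the workspace:

% Hypothetical layer sizes matching the 400-25-10 digit network described above.
input_layer_size  = 400;
hidden_layer_size = 25;
num_labels        = 10;
nn_params = [Theta1(:); Theta2(:)];           % unroll both weight matrices into one vector
J = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
                   num_labels, X, y, 0);      % lambda = 0 gives the unregularized cost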
Regularized cost function
The cost function for neural networks with regularization is given by
\[ J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K} \bigg[ -y_k^{(i)}\log\big((h_{\theta}(x^{(i)}))_k\big) - (1-y_k^{(i)})\log\big(1-(h_{\theta}(x^{(i)}))_k\big) \bigg] \]
\[ + \frac{\lambda}{2m}\bigg[\sum_{j=1}^{25}\sum_{k=1}^{400}\big(\theta_{j,k}^{(1)}\big)^2 + \sum_{j=1}^{10}\sum_{k=1}^{25}\big(\theta_{j,k}^{(2)}\big)^2\bigg]. \]
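Note that both inner sums of the penalty start at \(k = 1\), so the bias weights (the first column of each \(\Theta^{(l)}\)) are not regularized; the implementation below therefore strips that column before computing the penalty. For a general network with layer sizes \(s_1, \dots, s_L\), the same penalty term can be written as
\[ \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{j=1}^{s_{l+1}}\sum_{k=1}^{s_l}\big(\theta_{j,k}^{(l)}\big)^2. \]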
Implementation: nnCostFunction.m
a1 = [ones(m, 1) X];                          % feedforward, as in the unregularized case
z2 = a1*Theta1';
a2 = [ones(size(z2, 1), 1) sigmoid(z2)];
z3 = a2*Theta2';
a3 = sigmoid(z3);
yd = eye(num_labels);
y = yd(y,:);                                  % one-hot labels
log_dif = -log(a3).*y - log(1-a3).*(1-y);
Theta1s = Theta1(:,2:end);                    % drop the bias column: it is not regularized
Theta2s = Theta2(:,2:end);
penalty = lambda/(2*m)*(sum(Theta1s(:).^2) + sum(Theta2s(:).^2));
J = sum(log_dif(:))/m + penalty;              % unregularized cost plus weight penalty
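As a quick sanity check (using the same hypothetical call signature as in the sketch above), the regularized cost should equal the unregularized cost plus the penalty computed directly from the non-bias weights:

lambda = 1;                                   % assumed regularization strength for the check
m = size(X, 1);
J0   = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
                      num_labels, X, y, 0);
Jreg = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
                      num_labels, X, y, lambda);
penalty = lambda/(2*m)*(sum(sum(Theta1(:,2:end).^2)) + sum(sum(Theta2(:,2:end).^2)));
disp(Jreg - (J0 + penalty));                  % should be numerically zero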