\[\begin{equation} \tag{1} (\mathbf{A B})^{-1}=\mathbf{B}^{-1} \mathbf{A}^{-1} \end{equation}\]

\[\begin{equation} \tag{2} (\mathbf{A B C} \ldots)^{-1}=\ldots \mathbf{C}^{-1} \mathbf{B}^{-1} \mathbf{A}^{-1} \end{equation}\]

\[\begin{equation} \tag{3} \left(\mathbf{A}^{T}\right)^{-1}=\left(\mathbf{A}^{-1}\right)^{T} \end{equation}\]

\[\begin{equation} \tag{4} (\mathbf{A}+\mathbf{B})^{T}=\mathbf{A}^{T}+\mathbf{B}^{T} \end{equation}\]

\[\begin{equation} \tag{5} (\mathbf{A B})^{T}=\mathbf{B}^{T} \mathbf{A}^{T} \end{equation}\]

\[\begin{equation} \tag{6} (\mathbf{A B C} \ldots)^{T}=\ldots \mathbf{C}^{T} \mathbf{B}^{T} \mathbf{A}^{T} \end{equation}\]
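As a quick sanity check, identities (1) and (5) can be verified numerically in Octave; the matrices below are made up purely for illustration.

% Numerical check of (AB)^{-1} = B^{-1} A^{-1} and (AB)' = B' A' (illustrative)
A = rand(3) + 3*eye(3);          % random but safely invertible 3x3 matrices
B = rand(3) + 3*eye(3);
norm(inv(A*B) - inv(B)*inv(A))   % identity (1): ~0 up to round-off
norm((A*B)' - B'*A')             % identity (5): exactly 0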

Notes

  • Eigen-decomposition \[S = \sum_{k=1}^N \rho_k u_k u_k^T\] (see the sketch below)
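A minimal Octave sketch of this reconstruction, assuming a small made-up symmetric matrix \(S\) with eigenvalues \(\rho_k\) and eigenvectors \(u_k\):

% Rebuild a symmetric matrix from its eigen-decomposition (illustrative example)
S = [2 1; 1 3];                 % made-up symmetric matrix
[U, D] = eig(S);                % columns of U are u_k, diag(D) are rho_k
S_rec = zeros(size(S));
for k = 1:size(S, 1)
    S_rec = S_rec + D(k,k) * U(:,k) * U(:,k)';
end
norm(S - S_rec)                 % ~0 up to round-off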

1. Neural Networks

Feedforward and cost function

The cost function for the neural network (without regularization) is

\[ J(\theta) = \frac{1}{m}\sum_{i=1}^m\sum_{k=1}^K \bigg[ -y_k^{(i)}\log\big((h_{\theta}(x^{(i)}))_k\big) - (1-y_k^{(i)})\log\big(1-(h_{\theta}(x^{(i)}))_k\big) \bigg], \]

where \(h_{\theta}(x^{(i)})\) is computed by the feedforward pass shown in Figure 2 and \(K = 10\) is the total number of possible labels. Note that \(h_{\theta}(x^{(i)})_k = a^{(3)}_k\) is the activation (output value) of the \(k\)-th output unit.

Implementation: nnCostFunction.m

a1 = [ones(m, 1) X];                      % input layer with bias column
z2 = a1*Theta1';
a2 = [ones(size(z2, 1), 1) sigmoid(z2)];  % hidden layer activations with bias
z3 = a2*Theta2';
a3 = sigmoid(z3);                         % output layer: h_theta(x) for every example

yd = eye(num_labels);
y = yd(y,:);                              % map each label to a one-hot row vector

log_dif = -log(a3).*y - log(1-a3).*(1-y); % per-example, per-class cross-entropy terms
J = sum(log_dif(:))/m;                    % average over all m examples and K classes
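
The eye/indexing trick above expands each numeric label into a one-hot row; a tiny made-up example:

% One-hot expansion demo with made-up labels (not part of the assignment code)
labels = [2; 1; 3];
I = eye(3);
I(labels, :)    % rows become [0 1 0; 1 0 0; 0 0 1]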

Regularized cost function

The cost function for neural networks with regularization is given by

\[ J(\theta) = \frac{1}{m}\sum_{i=1}^m\sum_{k=1}^K \bigg[ -y_k^{(i)}\log\big((h_{\theta}(x^{(i)}))_k\big) - (1-y_k^{(i)})\log\big(1-(h_{\theta}(x^{(i)}))_k\big) \bigg] \]

\[ + \frac{\lambda}{2m}\bigg[\sum_{j=1}^{25}\sum_{k=1}^{400}(\theta_{j,k}^{(1)})^2+\sum_{j=1}^{10}\sum_{k=1}^{25}(\theta_{j,k}^{(2)})^2\bigg] \]

Implementation: nnCostFunction.m

a1 = [ones(m, 1) X];                      % input layer with bias column
z2 = a1*Theta1';
a2 = [ones(size(z2, 1), 1) sigmoid(z2)];  % hidden layer activations with bias
z3 = a2*Theta2';
a3 = sigmoid(z3);                         % output layer: h_theta(x) for every example

yd = eye(num_labels);
y = yd(y,:);                              % map each label to a one-hot row vector

log_dif = -log(a3).*y - log(1-a3).*(1-y); % unregularized cross-entropy terms

Theta1s = Theta1(:,2:end);                % drop bias columns: they are not regularized
Theta2s = Theta2(:,2:end);

penalty = lambda/(2*m)*(sum((Theta1s.*Theta1s)(:)) + sum((Theta2s.*Theta2s)(:)));
J = sum(log_dif(:))/m + penalty;
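
Assuming the usual calling convention for this exercise, where Theta1 and Theta2 are unrolled into a single nn_params vector and the layer sizes are passed in separately, the regularized cost could be evaluated like this (a sketch, not verified against the grader):

% Hypothetical usage: unroll parameters and evaluate the regularized cost
nn_params = [Theta1(:) ; Theta2(:)];
lambda = 1;
J = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, ...
                   num_labels, X, y, lambda);
fprintf('Cost with lambda = %d: %f\n', lambda, J);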

2. Backpropagation

Install Octave

Download Octave from https://ftp.gnu.org/gnu/octave/windows/ and add its installation folder to the Path environment variable.

Create a build system for Octave files in Sublime Text 3

Octave.sublime-build

{
    "cmd": ["octave-gui", "$file"],
    "shell": true   // to show plots
}

Create a shortcut for cancelling a build

Add the following line to Preferences > Key Bindings:

{ "keys": ["ctrl+shift+b"], "command": "exec", "args": {"kill": true} },