Matrix cookbook - Trace


\[\begin{equation} \tag{11} \operatorname{Tr}(\mathbf{A})=\sum_{i} A_{i i} \end{equation}\]

Let’s write the trace in a more convenient way. Multiplying \(A\) by the standard basis vector \(e_{i}\) picks out the \(i\)-th column of \(A\): \[\begin{equation} A e_{i}=\left[\begin{array}{ccc}{a_{11}} & {\cdots} & {a_{1 n}} \\ {\vdots} & {\ddots} & {\vdots} \\ {a_{n 1}} & {\cdots} & {a_{n n}}\end{array}\right]\left[\begin{array}{c}{0} \\ {\vdots} \\ {1} \\ {\vdots} \\ {0}\end{array}\right]=\left[\begin{array}{c}{a_{1 i}} \\ {\vdots} \\ {a_{n i}}\end{array}\right] \end{equation}\] where the \(1\) is in the \(i\)-th entry. Taking the inner product with \(e_{i}\) then picks out the \(i\)-th entry of that column: \[\begin{equation} \left\langle e_{i}, A e_{i}\right\rangle= e_{i}^{t} A e_{i}=A_{i i} \end{equation}\] So \(\operatorname{Tr}(\mathbf{A})=\sum_{i}A_{ii}=\sum_{i} e_{i}^{t} A e_{i}\).
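
As a quick numerical check of this identity, here is a minimal pure-Python sketch. The helper name `trace_via_basis` and the example matrix are illustrative assumptions, not part of the original text.

```python
def trace_via_basis(A):
    """Sum e_i^T A e_i over the standard basis vectors e_i (A is a square list of lists)."""
    n = len(A)
    total = 0
    for i in range(n):
        # e_i has a 1 in position i and 0 elsewhere
        e_i = [1 if k == i else 0 for k in range(n)]
        # A e_i is the i-th column of A
        Ae_i = [sum(A[r][k] * e_i[k] for k in range(n)) for r in range(n)]
        # <e_i, A e_i> is the i-th entry of that column, i.e. A[i][i]
        total += sum(e_i[r] * Ae_i[r] for r in range(n))
    return total


# Illustrative matrix (an assumption for this example)
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]

diagonal_sum = sum(A[i][i] for i in range(len(A)))
print(trace_via_basis(A), diagonal_sum)  # both are 16
```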



\[\begin{equation} \tag{12} \operatorname{Tr}(\mathbf{A})=\sum_{i} \lambda_{i}, \quad \lambda_{i}=\operatorname{eig}(\mathbf{A}) \end{equation}\]

If \(\mathbf{A}\) is diagonalizable, with eigendecomposition \(\mathbf{A}=\mathbf{Q} \mathbf{\Lambda} \mathbf{Q}^{-1}\), then by the cyclic property in equation (16): \[\begin{align} \operatorname{Tr}(\mathbf{A})&=\operatorname{Tr}(\mathbf{Q} \mathbf{\Lambda} \mathbf{Q}^{-1}) \\ &=\operatorname{Tr}(\mathbf{\Lambda} \mathbf{Q}^{-1} \mathbf{Q}) \\ &=\operatorname{Tr}(\mathbf{\Lambda}) \\ &=\sum_{i} \lambda_{i} \end{align}\]
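
The derivation can be checked on a small example by building a matrix from a chosen eigendecomposition. This is only a sketch: the matrices `Q`, `Q_inv` and `Lam` below are hand-picked assumptions (with integer entries so the arithmetic is exact), and `matmul` and `trace` are illustrative helpers.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))


# Hand-picked eigendecomposition (assumed for illustration)
Q = [[1, 1],
     [1, 2]]
Q_inv = [[2, -1],
         [-1, 1]]          # inverse of Q, since det(Q) = 1
Lam = [[3, 0],
       [0, 5]]             # eigenvalues 3 and 5 on the diagonal

A = matmul(matmul(Q, Lam), Q_inv)
print(A)                   # [[1, 2], [-4, 7]]
print(trace(A), 3 + 5)     # both are 8
```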


\[\begin{equation} \tag{13} \operatorname{Tr}(\mathbf{A})=\operatorname{Tr}\left(\mathbf{A}^{T}\right) \end{equation}\]
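
A short justification, written here as a sketch in the same notation: transposing a matrix leaves its diagonal entries unchanged, so \[\begin{equation} \operatorname{Tr}\left(\mathbf{A}^{T}\right)=\sum_{i}\left(\mathbf{A}^{T}\right)_{i i}=\sum_{i} A_{i i}=\operatorname{Tr}(\mathbf{A}) \end{equation}\]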

\[\begin{equation} \tag{14} \operatorname{Tr}(\mathbf{A B})=\operatorname{Tr}(\mathbf{B A}) \end{equation}\]

By the definition of the matrix product, \((\mathbf{A B})_{ij}=\sum_{k}A_{ik}B_{kj}\), so summing the diagonal entries gives: \[\begin{equation} \operatorname{Tr}(\mathbf{A B})=\sum_{i} \sum_{k} A_{i k} B_{k i} \end{equation}\]

On the other hand, \((\mathbf{B A})_{ij}=\sum_{k}B_{ik}A_{kj}\). So: \[\begin{equation} \operatorname{Tr}(\mathbf{B A})=\sum_{i} \sum_{k} B_{i k} A_{k i} \end{equation}\] They are the same quantity, up to renaming the indices \((i \leftrightarrow k)\).
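
The identity also holds for non-square matrices, as long as both products are defined. The following sketch checks it for an assumed \(2 \times 3\) matrix and an assumed \(3 \times 2\) matrix; the helpers `matmul` and `trace` are illustrative names, not from the original text.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[i][i] for i in range(len(X)))


# Example matrices (assumptions for illustration): A is 2x3, B is 3x2
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7, 8],
     [9, 10],
     [11, 12]]

AB = matmul(A, B)   # 2x2 matrix
BA = matmul(B, A)   # 3x3 matrix
print(trace(AB), trace(BA))  # both are 212
```

Note that \(\mathbf{A B}\) and \(\mathbf{B A}\) have different shapes here, yet their traces agree.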


\[\begin{equation} \tag{15} \operatorname{Tr}(\mathbf{A}+\mathbf{B})=\operatorname{Tr}(\mathbf{A})+\operatorname{Tr}(\mathbf{B}) \end{equation}\]
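
This follows directly from the definition in equation (11), since matrices are added entry by entry: \[\begin{equation} \operatorname{Tr}(\mathbf{A}+\mathbf{B})=\sum_{i}\left(A_{i i}+B_{i i}\right)=\sum_{i} A_{i i}+\sum_{i} B_{i i}=\operatorname{Tr}(\mathbf{A})+\operatorname{Tr}(\mathbf{B}) \end{equation}\]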

\[\begin{equation} \tag{16} \operatorname{Tr}(\mathbf{A B C})=\operatorname{Tr}(\mathbf{B C A})=\operatorname{Tr}(\mathbf{C A B}) \end{equation}\]
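
One way to see this is to apply equation (14) to a grouped product, treating \(\mathbf{A B}\) or \(\mathbf{B C}\) as a single matrix: \[\begin{align} \operatorname{Tr}(\mathbf{A B C})&=\operatorname{Tr}((\mathbf{A B}) \mathbf{C})=\operatorname{Tr}(\mathbf{C}(\mathbf{A B}))=\operatorname{Tr}(\mathbf{C A B}) \\ \operatorname{Tr}(\mathbf{A B C})&=\operatorname{Tr}(\mathbf{A}(\mathbf{B C}))=\operatorname{Tr}((\mathbf{B C}) \mathbf{A})=\operatorname{Tr}(\mathbf{B C A}) \end{align}\] Only cyclic permutations are covered by this argument; in general \(\operatorname{Tr}(\mathbf{A B C}) \neq \operatorname{Tr}(\mathbf{A C B})\).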


\[\begin{equation} \tag{17} \mathbf{a}^{T} \mathbf{a}=\operatorname{Tr}\left(\mathbf{a a}^{T}\right) \end{equation}\]

\[\begin{align} \mathbf{a a}^{T}&=\left[\begin{array}{c}{a_{1}} \\ {\vdots} \\ {a_{n}}\end{array}\right]\left[{a_{1}}, {\cdots}, {a_{n}}\right] \\ &=\left[\begin{array}{ccc}{a_{1}}^{2} & {\cdots} & {a_{1}a_{n}} \\ {\vdots} & {\ddots} & {\vdots} \\ {a_{n}a_{1}} & {\cdots} & {a_{n}}^{2}\end{array}\right] \end{align}\]

So, \[\begin{equation} \operatorname{Tr}\left(\mathbf{a a}^{T}\right) = a_{1}^{2}+\cdots+a_{n}^{2} = \mathbf{a}^{T} \mathbf{a} \end{equation}\]
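
A small numerical sketch of equation (17), using an assumed example vector: the inner product \(\mathbf{a}^{T}\mathbf{a}\) and the trace of the outer product \(\mathbf{a a}^{T}\) give the same number.

```python
# Example vector (an assumption for illustration)
a = [1, 2, 3]

# Inner product a^T a: a single number
inner = sum(x * x for x in a)

# Outer product a a^T: an n x n matrix with entries a_i * a_j
outer = [[a_i * a_j for a_j in a] for a_i in a]

# Trace of the outer product: sum of the diagonal entries a_i^2
trace_outer = sum(outer[i][i] for i in range(len(a)))

print(inner, trace_outer)  # both are 14
```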
