Matrix differentiation for dummies
Awesome summary
- The Jacobian matrix
- The Jacobian of the dot product
- The Jacobian of a linear map
- The Jacobian of Af(x)
The Jacobian matrix
Let $\psi:\mathbb{R}^n \rightarrow \mathbb{R}^m$, or equivalently:
$$\psi(\boldsymbol{x}) = \psi(x_1, \dots, x_n) = \begin{bmatrix} \psi_1(x_1, \dots, x_n) \\ \vdots \\ \psi_m(x_1, \dots, x_n) \end{bmatrix}.$$
The Jacobian of $\psi$ is defined as the following $m \times n$ matrix:
$$J_{\psi} = \frac{\partial \psi}{\partial \boldsymbol{x}} = \begin{bmatrix} \frac{\partial \psi_1}{\partial x_1} & \cdots & \frac{\partial \psi_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial \psi_m}{\partial x_1} & \cdots & \frac{\partial \psi_m}{\partial x_n} \end{bmatrix}, \qquad (J_{\psi})_{ij} = \frac{\partial \psi_i}{\partial x_j}.$$
Notice that, if $\psi:\mathbb{R} \rightarrow \mathbb{R}^m$ (i.e., $n=1$), then the Jacobian is an $m \times 1$ matrix, i.e., a column vector.
On the other hand, if $\psi:\mathbb{R}^n \rightarrow \mathbb{R}$ (i.e., $m=1$), then the Jacobian is a $1 \times n$ matrix, i.e., a row vector.
When $\psi:\mathbb{R}^n \rightarrow \mathbb{R}$, the transpose of the row vector $J_{\psi}$ is called the gradient of $\psi$ and denoted by $\nabla \psi$.
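To make the definition concrete, here is a minimal numerical sketch using JAX's `jacfwd`, which returns the full Jacobian of a vector-valued function. The example functions `psi` and `phi` and the evaluation point are arbitrary illustrative choices, not anything fixed by the text above.

```python
import jax.numpy as jnp
from jax import jacfwd

# Example psi: R^2 -> R^3 (an arbitrary illustrative choice).
def psi(x):
    return jnp.array([x[0] * x[1], jnp.sin(x[0]), x[1] ** 2])

x0 = jnp.array([1.0, 2.0])
J = jacfwd(psi)(x0)       # forward-mode Jacobian of psi at x0
print(J.shape)            # (3, 2): an m x n matrix with m = 3, n = 2

# Scalar-valued case (m = 1): the Jacobian is a single row of partials,
# and its transpose is the gradient.
def phi(x):
    return x[0] ** 2 + 3.0 * x[1]

print(jacfwd(phi)(x0))    # [2., 3.]: the partials d(phi)/dx_j at x0
```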
The Jacobian of a linear map
Let $\boldsymbol{y} = A \boldsymbol{x}$ be a linear map with $\boldsymbol{y} \in \mathbb{R}^m$, $A \in \mathbb{R}^{m \times n}$ and $\boldsymbol{x} \in \mathbb{R}^n$, or equivalently:
$$y_i = \sum_{j=1}^{n} A_{ij} x_j, \qquad i = 1, \dots, m.$$
From the definition of the Jacobian it follows immediately that $J_{ij} = \frac{\partial y_i}{\partial x_j}$, and hence:
$$J_{ij} = \frac{\partial}{\partial x_j} \sum_{k=1}^{n} A_{ik} x_k = A_{ij}, \qquad \text{i.e.,} \qquad J = A.$$
Takeaway: $\frac{\partial }{\partial \boldsymbol{x}} A \boldsymbol{x} = A$.
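As a quick sanity check of this takeaway, the sketch below uses JAX's `jacfwd` again; the matrix $A$ and the point $\boldsymbol{x}_0$ are made-up example values.

```python
import jax.numpy as jnp
from jax import jacfwd

# Illustrative A and x (made-up values just for the check).
A = jnp.array([[1.0, 2.0, 0.0],
               [0.0, -1.0, 3.0]])        # A in R^{2 x 3}
x0 = jnp.array([0.5, -1.0, 2.0])         # x in R^3

J = jacfwd(lambda x: A @ x)(x0)          # Jacobian of y = A x at x0
print(jnp.allclose(J, A))                # True: the Jacobian is A itself
```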
The Jacobian of Af(x)
Given $A f(\boldsymbol{x})$ with $A \in \mathbb{R}^{m \times n}$ and $f:\mathbb{R}^l \rightarrow \mathbb{R}^n$, or equivalently:
$$\left(A f(\boldsymbol{x})\right)_i = \sum_{j=1}^{n} A_{ij} f_j(\boldsymbol{x}), \qquad i = 1, \dots, m.$$
From the definition of the Jacobian matrix:
$$J_{ik} = \frac{\partial}{\partial x_k} \sum_{j=1}^{n} A_{ij} f_j(\boldsymbol{x}) = \sum_{j=1}^{n} A_{ij} \frac{\partial f_j(\boldsymbol{x})}{\partial x_k}, \qquad \text{i.e.,} \qquad J = A \, J_f.$$
Takeaway: $\frac{\partial }{\partial \boldsymbol{x}} A f(\boldsymbol{x}) = A \frac{\partial f(\boldsymbol{x})}{\partial \boldsymbol{x}}$.
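Likewise, an illustrative JAX check (with an arbitrary choice of $f$, $A$, and evaluation point) confirms that the Jacobian of $A f(\boldsymbol{x})$ equals $A$ times the Jacobian of $f$:

```python
import jax.numpy as jnp
from jax import jacfwd

# Illustrative f: R^2 -> R^3 and A in R^{2 x 3} (arbitrary choices).
def f(x):
    return jnp.array([x[0] * x[1], jnp.sin(x[0]), x[1] ** 2])

A = jnp.array([[1.0, 2.0, 0.0],
               [0.0, -1.0, 3.0]])
x0 = jnp.array([1.0, 2.0])

lhs = jacfwd(lambda x: A @ f(x))(x0)     # Jacobian of A f(x) at x0
rhs = A @ jacfwd(f)(x0)                  # A times the Jacobian of f at x0
print(jnp.allclose(lhs, rhs))            # True: d/dx [A f(x)] = A df/dx
```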