In array and radar signal processing, especially when co-array models are involved, one frequently encounters the vectorization operation, the Kronecker product, and the Khatri-Rao product. This article gives a brief review of these three operations and their commonly used properties.
Given an M×N matrix A=[a1,a2,…,aN], where ai denotes the i-th column of A, the vectorization of A is defined as

vec(A)=[a1T,a2T,…,aNT]T,

which is an MN×1 column vector.
Basically, the vectorization operation rearranges the elements of A into a long vector by stacking all the columns together. In a general signal processing scenario, we may have N observations, x1, x2, …, xN. We can either arrange them into a matrix X=[x1,x2,…,xN], or into a long vector vec(X), depending on the problem structure. In some cases, vectorizing a matrix may be helpful. However, the additional information given by the matrix structure (e.g., low-rankness) will be lost.
The vectorization operation itself does not have many interesting properties. One useful one relates it to the trace: for matrices A and B of the same size,

vec(A)Tvec(B)=tr(ATB),

or equivalently, tr(AB)=vec(AT)Tvec(B) whenever the product AB is square.
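As a quick numerical sanity check, the following NumPy sketch builds vec(·) explicitly (NumPy flattens in row-major order by default, so the column-major flag order='F' is needed to match the column-stacking definition) and verifies the trace identity on random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4))

# vec(A): stack the columns of A into one long column vector.
vec = lambda M: M.reshape(-1, 1, order='F')

# tr(A^T B) = vec(A)^T vec(B)
assert np.isclose(np.trace(A.T @ B), (vec(A).T @ vec(B)).item())
```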
As we will show later, the vectorization operation shows more interesting properties when combined with the Kronecker product and the Khatri-Rao product.
Given an M×N matrix A and a P×Q matrix B, the Kronecker product A⊗B is defined as the block matrix whose (i,j)-th P×Q block is aijB:

A⊗B=
[a11B a12B ⋯ a1NB
 a21B a22B ⋯ a2NB
 ⋮    ⋮    ⋱  ⋮
 aM1B aM2B ⋯ aMNB],
which is an MP×NQ matrix.
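NumPy provides this operation directly as np.kron. A small check of the shape and the block structure:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.eye(3)

K = np.kron(A, B)   # MP x NQ = 6 x 6
# The (i, j)-th P x Q block of A ⊗ B equals a_ij * B.
assert K.shape == (6, 6)
assert np.allclose(K[0:3, 3:6], A[0, 1] * B)   # block (1, 2) is 2 * I
```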
The Kronecker product has many interesting properties. First, it is distributive and associative:
- Distributivity: (a) (A+B)⊗C=A⊗C+B⊗C; (b) A⊗(B+C)=A⊗B+A⊗C.
- Associativity: (A⊗B)⊗C=A⊗(B⊗C).
The Kronecker product is also "distributive" with respect to the (Hermitian) transpose, trace, and Frobenius norm:
- Transpose: (A⊗B)T=AT⊗BT.
- Hermitian transpose: (A⊗B)H=AH⊗BH.
- Trace: tr(A⊗B)=tr(A)tr(B).
- Frobenius norm: ∥A⊗B∥F=∥A∥F∥B∥F.
One of the most important and useful properties of the Kronecker product is the product rule:
Proposition 1. Let A, B, C, D be M×N, P×Q, N×K, and Q×L, respectively. Then

(A⊗B)(C⊗D)=(AC)⊗(BD).

Proof. By the definition of the Kronecker product, the (i,j)-th P×L block of (A⊗B)(C⊗D) is given by

∑k aikB ckjD = (∑k aikckj)(BD) = [AC]ij(BD),

which is exactly the (i,j)-th block of (AC)⊗(BD). This completes the proof. ∎
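The product rule is easy to verify numerically on random matrices of compatible sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, P, Q, K, L = 2, 3, 4, 5, 6, 7
A = rng.standard_normal((M, N))
B = rng.standard_normal((P, Q))
C = rng.standard_normal((N, K))
D = rng.standard_normal((Q, L))

# (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD)
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
```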
With the product rule, one can show that the following properties also hold:
- Inverse: If A and B are nonsingular, then (A⊗B)−1=A−1⊗B−1.
- Rank: rank(A⊗B)=rank(A)rank(B).
- Determinant: Let A and B be M×M and N×N, respectively. Then det(A⊗B)=det(A)Ndet(B)M.
- Positive-definiteness: If both A and B are positive-definite, then A⊗B is also positive-definite.
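A quick numerical check of the determinant and inverse properties (with random, hence almost surely nonsingular, matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 3, 4
A = rng.standard_normal((M, M))
B = rng.standard_normal((N, N))

# det(A ⊗ B) = det(A)^N * det(B)^M
assert np.isclose(np.linalg.det(np.kron(A, B)),
                  np.linalg.det(A)**N * np.linalg.det(B)**M)

# (A ⊗ B)^{-1} = A^{-1} ⊗ B^{-1} for nonsingular A, B
assert np.allclose(np.linalg.inv(np.kron(A, B)),
                   np.kron(np.linalg.inv(A), np.linalg.inv(B)))
```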
For a more complete review of the properties of the Kronecker product, readers are referred to the Wikipedia page, Kathrin Schäcke's On the Kronecker Product, or Chapter 11 of A Matrix Handbook for Statisticians. Readers pursuing a more abstract treatment may also look into the tensor product.
The Khatri-Rao product is usually defined as the column-wise Kronecker product. In other words, given an M×N matrix A=[a1,a2,…,aN] and a P×N matrix B=[b1,b2,…,bN], the Khatri-Rao product is defined as

A⊙B=[a1⊗b1, a2⊗b2, …, aN⊗bN],
which is an MP×N matrix. The Khatri-Rao product appears frequently in the difference co-array model (e.g., for co-prime and nested arrays) and the sum co-array model (e.g., in MIMO radar). Although the Khatri-Rao product is defined via the Kronecker product, it inherits relatively few of its nice properties.
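A minimal NumPy implementation, following the column-wise definition above (SciPy also ships one as scipy.linalg.khatri_rao; the einsum version here is just to make the definition concrete):

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product: (M x N) and (P x N) -> MP x N."""
    M, N = A.shape
    P, N2 = B.shape
    assert N == N2, "A and B must have the same number of columns"
    # T[i, j, n] = A[i, n] * B[j, n]; reshaping merges (i, j) -> i*P + j,
    # so column n becomes a_n ⊗ b_n.
    return np.einsum('in,jn->ijn', A, B).reshape(M * P, N)

A = np.arange(6).reshape(2, 3)
B = np.ones((2, 3))
AB = khatri_rao(A, B)
# Each column is the Kronecker product of the corresponding columns.
assert np.allclose(AB[:, 0], np.kron(A[:, 0], B[:, 0]))
```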
A Property That Connects the Three
A handy property connecting the vectorization, the Kronecker product, and the Khatri-Rao product is given below.
Proposition 2. Let A, X, B be M×N, N×P, and P×Q, respectively. Then

vec(AXB)=(BT⊗A)vec(X).

Moreover, if X is diagonal (which requires N=P), then

vec(AXB)=(BT⊙A)diag(X),

where diag(X) is a vector representing the main diagonal of X.
Proof. Let Bi denote the i-th row of B and xi the i-th column of X. Since XB=∑i xiBi, we have

vec(AXB)=∑i vec((Axi)Bi)=∑i BiT⊗(Axi)=∑i (BiT⊗A)xi=(BT⊗A)vec(X),

where the second equality uses vec(uvT)=v⊗u, and the last equality holds because the i-th column-block of BT⊗A is BiT⊗A. The proof of the second equality in the proposition follows the same idea. ∎
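Both identities in Proposition 2 can be verified numerically with random matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
M, N, P, Q = 3, 4, 5, 2
A = rng.standard_normal((M, N))
X = rng.standard_normal((N, P))
B = rng.standard_normal((P, Q))

vec = lambda Z: Z.reshape(-1, order='F')

# vec(AXB) = (B^T ⊗ A) vec(X)
assert np.allclose(vec(A @ X @ B), np.kron(B.T, A) @ vec(X))

# Diagonal case: vec(ADB) = (B^T ⊙ A) diag(D), with ⊙ the Khatri-Rao product.
d = rng.standard_normal(N)            # here the middle matrix is N x N diagonal
D = np.diag(d)
B2 = rng.standard_normal((N, Q))
kr = np.einsum('in,jn->ijn', B2.T, A).reshape(Q * M, N)   # B2^T ⊙ A
assert np.allclose(vec(A @ D @ B2), kr @ d)
```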
Here are some examples where Proposition 2 is used.
Example 1. Consider the following optimization problem with a matrix variable X:

minX ∥AXB−C∥F2.

Applying Proposition 2, we can rewrite the above optimization problem as

minx ∥(BT⊗A)x−vec(C)∥22, where x=vec(X),

which is indeed a least-squares problem. Note that the original formulation is more compact.
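The reformulation can be solved with an off-the-shelf least-squares routine. A minimal sketch with random data (the residuals of the matrix and vectorized formulations must agree):

```python
import numpy as np

rng = np.random.default_rng(3)
M, N, P, Q = 6, 3, 4, 5
A = rng.standard_normal((M, N))
B = rng.standard_normal((P, Q))
C = rng.standard_normal((M, Q))

vec = lambda Z: Z.reshape(-1, order='F')

# min_X ||AXB - C||_F  ->  min_x ||(B^T ⊗ A) x - vec(C)||_2 with x = vec(X)
G = np.kron(B.T, A)
x, *_ = np.linalg.lstsq(G, vec(C), rcond=None)
X = x.reshape(N, P, order='F')

# The residual of the matrix problem matches that of the vectorized problem.
assert np.isclose(np.linalg.norm(A @ X @ B - C, 'fro'),
                  np.linalg.norm(G @ x - vec(C)))
```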
Example 2. Consider the DOA estimation problem using a linear array. Adopting the unconditional model with uncorrelated sources, the covariance matrix of the received snapshots is given by

R=APAH+σ2I,

where A is the array steering matrix, P is a diagonal matrix whose main diagonal consists of the source powers, and σ2 is the power of the additive noise. Vectorizing R leads to

vec(R)=(A∗⊙A)p+σ2vec(I),

where p=diag(P). We can observe that vec(R) resembles a single snapshot from a virtual array whose steering matrix is given by A∗⊙A. This idea is exploited to obtain enhanced degrees of freedom.
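The vectorized covariance model can be checked numerically. The sketch below uses a hypothetical 4-sensor uniform linear array with two uncorrelated sources (the array geometry, DOAs, and powers are illustrative values, not from the text):

```python
import numpy as np

# Hypothetical setup: 4-sensor ULA, 2 uncorrelated narrowband sources.
positions = np.arange(4)                  # sensor positions in half-wavelengths
doas = np.deg2rad([-20.0, 35.0])
A = np.exp(1j * np.pi * np.outer(positions, np.sin(doas)))   # steering matrix
p = np.array([1.0, 2.0])                  # source powers
sigma2 = 0.5                              # noise power

R = A @ np.diag(p) @ A.conj().T + sigma2 * np.eye(4)

vec = lambda Z: Z.reshape(-1, order='F')
# Khatri-Rao product A* ⊙ A (column-wise Kronecker product)
kr = np.einsum('in,jn->ijn', A.conj(), A).reshape(16, 2)

# vec(R) = (A* ⊙ A) p + sigma^2 vec(I)
assert np.allclose(vec(R), kr @ p + sigma2 * vec(np.eye(4)))
```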
Proposition 2 also leads to the following interesting corollary:
Corollary 1. Let A, B, C, D be M×N, N×P, P×Q, and Q×M, respectively. Then

tr(ABCD)=vec(AT)T(DT⊗B)vec(C).

Proof. Immediately obtained by combining Proposition 2 and the fact that tr(AB)=vec(AT)Tvec(B). ∎
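Again, the identity is easy to confirm on random matrices of compatible sizes:

```python
import numpy as np

rng = np.random.default_rng(5)
M, N, P, Q = 3, 4, 5, 2
A = rng.standard_normal((M, N))
B = rng.standard_normal((N, P))
C = rng.standard_normal((P, Q))
D = rng.standard_normal((Q, M))

vec = lambda Z: Z.reshape(-1, order='F')

# tr(ABCD) = vec(A^T)^T (D^T ⊗ B) vec(C)
lhs = np.trace(A @ B @ C @ D)
rhs = vec(A.T) @ np.kron(D.T, B) @ vec(C)
assert np.isclose(lhs, rhs)
```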
Corollary 1 can be quite useful in the derivation of the Cramér-Rao bound (CRB) for Gaussian models, as shown in the following example.
Example 3. Let x follow the complex circularly-symmetric Gaussian distribution CN(0,R(θ)), where θ∈RK denotes the parameters to be estimated. The (i,j)-th element of the Fisher information matrix (FIM) is then given by

FIMij=tr(R−1(∂R/∂θi)R−1(∂R/∂θj)).

In many cases, evaluating each element according to the above formula can be quite tedious. Let r=vec(R). Using Corollary 1 and the fact that R is Hermitian, we can rewrite the above formula as

FIMij=(∂r/∂θi)H(RT⊗R)−1(∂r/∂θj).

Collecting the partial derivatives into a Jacobian matrix,

∂r/∂θ=[∂r/∂θ1 ∂r/∂θ2 ⋯ ∂r/∂θK],

we immediately obtain a compact representation of the FIM:

FIM=(∂r/∂θ)H(RT⊗R)−1(∂r/∂θ).
In some cases, this may simplify the derivation of the corresponding CRB.
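The equivalence of the element-wise and compact FIM formulas can be verified numerically. The sketch below uses random Hermitian matrices in place of R and its derivatives (purely illustrative values, chosen only to check the algebra):

```python
import numpy as np

rng = np.random.default_rng(6)
n, K = 3, 2

def random_hermitian(n, rng):
    Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return Z + Z.conj().T

# Hypothetical Hermitian positive-definite R and Hermitian derivatives dR/dθ_k.
R = random_hermitian(n, rng) + 10 * np.eye(n)
dR = [random_hermitian(n, rng) for _ in range(K)]

# Element-wise formula: FIM_ij = tr(R^{-1} dR_i R^{-1} dR_j)
Rinv = np.linalg.inv(R)
fim_elem = np.array([[np.trace(Rinv @ dR[i] @ Rinv @ dR[j])
                      for j in range(K)] for i in range(K)])

# Compact formula: FIM = (∂r/∂θ)^H (R^T ⊗ R)^{-1} (∂r/∂θ)
vec = lambda Z: Z.reshape(-1, order='F')
J = np.column_stack([vec(D) for D in dR])         # Jacobian ∂r/∂θ, n^2 x K
fim_compact = J.conj().T @ np.linalg.inv(np.kron(R.T, R)) @ J

assert np.allclose(fim_elem, fim_compact)
```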