# Mianzhi Wang

Ph.D. in Electrical Engineering

# Vectorizing the Computation of MUSIC Spectrum

MUtiple SIgnal Classification (MUSIC) is a widely-used subspace-based direction finding algorithm in array signal processing. MUSIC works in both 1D and multi-dimensional direction finding problems. Given the noise subspace estimate, $\hat{\mathbf{E}}_\mathrm{n}$, and the search grid $\Theta$ of size $n$, the MUSIC spectrum is computed using the following formula:

$P_\mathrm{MUSIC}(\theta) = \frac{1}{\mathbf{a}^H(\theta) \hat{\mathbf{E}}_\mathrm{n} \hat{\mathbf{E}}_\mathrm{n}^H \mathbf{a}(\theta)},$
(1)

for every $\theta \in \Theta$ (for more details, check the MUSIC paper or my previous post about implementing MUSIC in JavaScript). In other words, to evaluate the full MUSIC spectrum, we need to evaluate (1) for every $\theta \in \Theta$.

Let $\mathbf{A} = [\mathbf{a}(\theta_1), \mathbf{a}(\theta_2), \ldots, \mathbf{a}(\theta_n)]$ (which is actually the "steering matrix" of the search grid $\Theta$), a straightforward implementation in Python would be:

# Assume A and En are already computed. We also omit the hat over En in the
# code.

# Output spectrum.
sp = np.zeros((n,))
for i in range(n):
sp[i] = 1.0 / (A[:,i].T.conj() @ (En @ En.T.conj()) @ A[:,i])


This implementation, while works, is not very efficient for the following reasons:

1. A for-loop is used while vectorization is possible;
2. $\hat{\mathbf{E}}_\mathrm{n} \hat{\mathbf{E}}_\mathrm{n}^H$ is repeatedly computed.

Note: This can be avoided by caching the result. However, in the vectorized implementation we do not have to and we will be dealing with a smaller matrix, which further speeds up the computation.

Now we derive the vectorized implementation. First, $\mathbf{a}^H(\theta) \hat{\mathbf{E}}_\mathrm{n} \hat{\mathbf{E}}_\mathrm{n}^H \mathbf{a}(\theta)$ is actually $\left|\hat{\mathbf{E}}_\mathrm{n}^H \mathbf{a}(\theta)\right|^2$. Let $\mathbf{v}(\theta) = \hat{\mathbf{E}}_\mathrm{n}^H \mathbf{a}(\theta)$. We have

$P_\mathrm{MUSIC}(\theta) = \frac{1}{\left|\mathbf{v}(\theta)\right|^2}.$
(2)

Given a complex vector $\mathbf{z} \in \mathbb{C}^N$, we have $|\mathbf{z}|^2 = \mathbf{z}^H \mathbf{z} = (\mathbf{z} \circ \mathbf{z}^*)^T \mathbf{1}_N$, where $\mathbf{z}^*$ is the conjugate of $\mathbf{z}$, $\circ$ denotes Hadamard product (element-wise product), and $\mathbf{1}_N$ is an $N \times 1$ vector of ones. Therefore,

$P_\mathrm{MUSIC}(\theta) = \frac{1}{(\mathbf{v}(\theta) \circ \mathbf{v}(\theta)^*)^T \mathbf{1}_K},$
(3)

where $K$ is the number of sources (the number of columns in $\hat{\mathbf{E}}_\mathrm{n}$).

Let us define $\mathbf{s}$ as

$\mathbf{s} = [(P_\mathrm{MUSIC}(\theta_1))^{-1}, (P_\mathrm{MUSIC}(\theta_2))^{-1}, \ldots, (P_\mathrm{MUSIC}(\theta_n))^{-1}].$
(4)

We have

\begin{aligned} \mathbf{s} =& \begin{bmatrix} (\mathbf{v}(\theta_1) \circ \mathbf{v}(\theta_1)^*)^T \mathbf{1}_K \\ (\mathbf{v}(\theta_2) \circ \mathbf{v}(\theta_2)^*)^T \mathbf{1}_K \\ \vdots \\ (\mathbf{v}(\theta_n) \circ \mathbf{v}(\theta_n)^*)^T \mathbf{1}_K \end{bmatrix}^T \\ =& \mathbf{1}_K^T \begin{bmatrix} \mathbf{v}(\theta_1) \circ \mathbf{v}(\theta_1)^* & \mathbf{v}(\theta_2) \circ \mathbf{v}(\theta_2)^* & \cdots & \mathbf{v}(\theta_n) \circ \mathbf{v}(\theta_n)^* \end{bmatrix} \\ =& \mathbf{1}_K^T (\mathbf{V} \circ \mathbf{V}^*), \end{aligned}
(5)

where $\mathbf{V} = [\mathbf{v}(\theta_1), \mathbf{v}(\theta_2), \ldots, \mathbf{v}(\theta_n)] = \hat{\mathbf{E}}_\mathrm{n}^H \mathbf{A}$. Once we get $\mathbf{s}$, we just need to compute the reciprocals of its elements to obtain the final MUSIC spectrum.

Now lets translate the above steps into Python codes:

• $\mathbf{V} = \hat{\mathbf{E}}_\mathrm{n}^H \mathbf{A}$: V = En.T.conj() @ A.
• $\mathbf{V} \circ \mathbf{V}^*$: V * V.conj().
• $\mathbf{s} = \mathbf{1}_K^T (\mathbf{V} \circ \mathbf{V}^*)$: np.sum(V * V.conj(), axis=0).real. Note that $\mathbf{1}^T \mathbf{X}$ just computes the column-wise sums. While $\mathbf{V} \circ \mathbf{V}^*$ is real mathematically, we need to force real output here due to floating point errors.
• Compute the reciprocals of elements in $\mathbf{s}$: use np.reciprocal().

Combining them together we get the vectorized implementation:

# Assume A and En are already computed. We also omit the hat over En in the
# code.
V = En.T.conj() @ A
sp = np.reciprocal(np.sum(V * V.conj(), axis=0).real)


Not convinced? Let us do a simple benchmarking with the following three implementations:

def music(A, En):
'''Naive implementation.'''
n = A.shape
sp = np.zeros((n,))
for i in range(n):
sp[i] = 1.0 / (A[:,i].T.conj() @ (En @ En.T.conj()) @ A[:,i]).real
return sp

def music_caching(A, En):
'''Naive implementation with caching.'''
n = A.shape
sp = np.zeros((n,))
Q = En @ En.T.conj()
for i in range(n):
sp[i] = 1.0 / (A[:,i].T.conj() @ Q @ A[:,i]).real
return sp

def music_vectorized(A, En):
'''Vectorized implementation.'''
V = En.T.conj() @ A
return np.reciprocal(np.sum(V * V.conj(), axis=0).real)


For benchmarking purpose, we just fill $\mathbf{A}$ and $\hat{\mathbf{E}}_\mathbf{n}$ with complex Gaussians:

n = 360  # search grid size
m = 20  # number of sensors
k = 6  # number of source

# Fill noise subspace and steering matrices with random numbers.
# This is for benchmark purpose only. You can swap them will real
# ones.
A = np.random.randn(m, n) + 1j * np.random.randn(m, n)
En = np.random.randn(m, k) + 1j * np.random.randn(m, k)


Benchmarking environment:

• i7-9700K CPU and 32GB RAM
• Windows 10 Pro
• Python 3.6
• numpy 1.16.4

Results:

music:            2.18 ms ± 6.91 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
music_caching:    1.09 ms ± 5.12 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
music_vectorized: 50.4 µs ± 507 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


We can see that the speed up is quite significant. This speed up can save a lots time when running Monte Carlo simulations involving computing the MUSIC spectrum.

1. R. Schmidt, "Multiple emitter location and signal parameter estimation," IEEE Transactions on Antennas and Propagation, vol. 34, no. 3, pp. 276–280, Mar. 1986.

2. Here is a reference on vectorization in MATLAB: https://www.mathworks.com/help/matlab/matlab_prog/vectorization.html. The same idea also applies to numpy.

3. Actually the speed-up not only comes from vectorizing the for-loops. To see why, try to derive the computational complexity of the three approaches.

4. If possible, consider using root-MUSIC instead of the classical spectrum-base approach. It does not require a search grid and is much faster when the search grid is dense.