This post summarizes the ESPIRiT algorithm for estimating coil sensitivities in MRI. It took me months to fully grasp the theory behind it.
A short introduction
MRI is a common imaging technique for medical diagnosis nowadays. In contrast to imaging techniques that rely on ionizing radiation, it is nearly harmless for humans. In MRI, when a patient is exposed to a strong magnetic field, hydrogen atoms align with the field. This alignment can be disturbed by a radiofrequency pulse. As the atoms return to their original alignment, they emit signals that are captured by sensors and used to create medical images.
The signals captured by a sensor c (or coil c) can be approximately formulated as:
$$y_c(k(t)) = \int x(r)\, B_c(r)\, e^{-i 2\pi k(t) \cdot r}\, dr$$
where x(r) is the unknown tissue value at location r, Bc(r) is a value associated with the receiver coil c, usually called the coil sensitivity, and k(t) is the k-space coordinate at time t. It is common to have dozens of coils in clinical settings.
This formula indicates that the signal captured by coil c is the Fourier transform of the tissue values of interest, multiplied by the sensitivity map of coil c. A concise way to write the above formula is with operators:
$$y_c = F S_c x$$
where F denotes the Fourier transform, Sc denotes the sensitivity map of coil c, and yc and x are both vectorized.
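To make the operator form concrete, here is a minimal numpy sketch of the forward model. All shapes and variable names are illustrative, and random arrays stand in for a real image and real sensitivity maps:

```python
import numpy as np

nx, ny, ncoils = 64, 64, 8
rng = np.random.default_rng(0)

x = rng.standard_normal((nx, ny))                        # tissue image (toy data)
smaps = (rng.standard_normal((ncoils, nx, ny))
         + 1j * rng.standard_normal((ncoils, nx, ny)))   # coil sensitivities S_c

# Each coil "sees" the image weighted by its own sensitivity map; the acquired
# signal is the Fourier transform of that weighted image: y_c = F S_c x.
y = np.fft.fftshift(
    np.fft.fft2(np.fft.ifftshift(smaps * x, axes=(-2, -1)), axes=(-2, -1)),
    axes=(-2, -1))
print(y.shape)  # (8, 64, 64): fully sampled multicoil k-space data
```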
Most of the time, coil sensitivities are not available beforehand; we have to estimate them from the acquired signals. ESPIRiT is such a method for estimating coil sensitivities, supported by solid theoretical analysis.
About GRAPPA
Before diving into ESPIRiT, it’s important to first understand GRAPPA, a well-established parallel imaging method used in MRI. The core of GRAPPA is to interpolate missing k-space points by linearly weighting neighboring acquired points, with the weighting coefficients determined from the calibration region.
Again, a concise way to describe GRAPPA is using operators:
$$x_c(r) = \left(P_k R_r y\right)^T g_c$$
where y represents the vectorized multicoil k-space data with unacquired points set to zero; xc(r) is the estimated value for coil c at location r (for simplicity, the vector notation is omitted, but keep in mind that r is a discrete spatial index); gc denotes the weighting coefficients for coil c; Rr is a patch operator that selects a local block of data around location r; and Pk is a sampling operator that selects only the acquired points within the patch (the subscript k indicates that there may be several sampling patterns).
We can easily construct linear equations using the data from the calibration region, with the same undersampling pattern Pk:
$$y_c^{AC} = A P_k^T g_c$$
where ycAC denotes the vectorized data in the calibration region and A is the so-called calibration matrix, whose rows are the local patches around each acquired point. In this form, gc can be solved for efficiently by standard least-squares methods.
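Below is a hedged toy sketch of this calibration step in numpy. The patch size, the R = 2 sampling pattern, and the choice of the patch centre as the target point are my own simplifying choices for illustration, not the exact setup of any particular GRAPPA implementation:

```python
import numpy as np

ncoils, kx, ky = 4, 32, 32            # calibration (ACS) region size
pk = 5                                 # patch size (square, for simplicity)
rng = np.random.default_rng(1)
acs = (rng.standard_normal((ncoils, kx, ky))
       + 1j * rng.standard_normal((ncoils, kx, ky)))    # toy calibration data

# Calibration matrix A: every row is one vectorized multicoil patch.
A = np.array([acs[:, i:i + pk, j:j + pk].ravel()
              for i in range(kx - pk + 1)
              for j in range(ky - pk + 1)])

# P_k: keep only the "acquired" positions inside a patch. Here we mimic an
# R = 2 pattern where every other k_y line is measured, so the patch centre
# (sitting on an unacquired line) must be predicted from its neighbours.
mask = np.zeros((ncoils, pk, pk), dtype=bool)
mask[:, :, 1::2] = True
A_acq = A[:, mask.ravel()]             # A P_k^T: rows restricted to acquired entries

# Targets y_c^AC: the patch-centre value of each coil.
centre = acs[:, pk // 2: kx - (pk // 2), pk // 2: ky - (pk // 2)]
Y = centre.reshape(ncoils, -1).T       # (num_patches, ncoils)

# Least-squares solve for the GRAPPA weights g_c of every coil at once.
g, *_ = np.linalg.lstsq(A_acq, Y, rcond=None)
print(g.shape)                          # (n_acquired_per_patch, ncoils)
```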
By construction, one of the columns of A is actually ycAC itself. Write this as Aec = ycAC, where ec is a vector with a 1 in the appropriate location to select the corresponding column of A. Combining the two relations, we note that:

$$A \left( P_k^T g_c - e_c \right) = 0$$

so the vector PkTgc − ec lies in the null-space of A.
The existence of a null-space implies redundancy in the row-space of A. The GRAPPA weights gc correspond to very specific null-space vectors. Instead of using this specific form, ESPIRiT analyzes the calibration matrix A directly.
Key assumptions in ESPIRiT
Since the matrix A is constructed using local patches in the calibration region, ESPIRiT assumes that all patches, whether in the calibration region or not, lie in the row-space of A:
$$V_\perp^H R_r x = 0, \quad \forall r$$
where V⊥ is the basis that spans the null-space and x is the vectorized multicoil k-space data. Note that “lying in the row-space” is equivalent to being orthogonal to the null-space.
V⊥ is computed by singular value decomposition of A:
$$A = U \Sigma V^H = U \Sigma \begin{bmatrix} V_\parallel & V_\perp \end{bmatrix}^H$$
where V∥ corresponds to non-zero singular values and V⊥ corresponds to zero singular values.
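As a sketch, the split into V∥ and V⊥ can be computed with numpy's SVD. The calibration matrix is built exactly as in the GRAPPA sketch above; the relative threshold on the singular values is a common heuristic I chose for illustration, and random toy data will not have a meaningful null-space (real ACS data does):

```python
import numpy as np

ncoils, kx, ky, pk = 4, 32, 32, 5
rng = np.random.default_rng(2)
acs = (rng.standard_normal((ncoils, kx, ky))
       + 1j * rng.standard_normal((ncoils, kx, ky)))

# One vectorized multicoil patch per row, as before.
A = np.array([acs[:, i:i + pk, j:j + pk].ravel()
              for i in range(kx - pk + 1)
              for j in range(ky - pk + 1)])

U, svals, Vh = np.linalg.svd(A, full_matrices=False)
V = Vh.conj().T

# Split V by thresholding relative to the largest singular value.
n_par = int(np.sum(svals > 1e-3 * svals[0]))
V_par, V_perp = V[:, :n_par], V[:, n_par:]   # row-space / (approximate) null-space bases
print(V_par.shape, V_perp.shape)
```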
Solving the null-space constraint in the least-squares framework
The null-space constraint can be enforced for all locations simultaneously by solving the least-squares problem

$$\arg\min_x \sum_r \left\| V_\perp^H R_r x \right\|_2^2,$$

which leads to the following equation by zeroing out the gradient:
$$\sum_r R_r^H V_\perp V_\perp^H R_r x = 0$$
Since V is unitary, we have I = V⊥V⊥H + V∥V∥H. Substituting this into the above equation, we get:
$$\sum_r R_r^H V_\parallel V_\parallel^H R_r x = \sum_r R_r^H R_r x$$
The effect of RrHRr is to restore the patch to its original location while zeroing out all other elements. The overall effect of ∑rRrHRr is therefore a diagonal matrix, where each diagonal entry equals M, the number of patches that contain the corresponding k-space point. Dividing by M, the equation becomes:
$$M^{-1} \sum_r R_r^H V_\parallel V_\parallel^H R_r x = x$$
which is equation (9) in the paper.
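To make equation (9) tangible, here is a naive (and slow) sketch that applies W by literally looping over patches. It ignores the paper's efficient implementation, and it handles image borders by counting how often each point is covered, so M varies near the edges in this toy version:

```python
import numpy as np

def apply_W(x, V_par, pk):
    """Naive application of W = M^-1 * sum_r R_r^H V_par V_par^H R_r.

    x     : (ncoils, kx, ky) multicoil k-space data
    V_par : (ncoils * pk * pk, n_par) row-space basis from the calibration SVD
    pk    : patch size (square, for simplicity)
    """
    ncoils, kx, ky = x.shape
    out = np.zeros_like(x)
    count = np.zeros((kx, ky))               # how often each point is covered (M)
    P = V_par @ V_par.conj().T               # projection onto the row-space
    for i in range(kx - pk + 1):
        for j in range(ky - pk + 1):
            patch = x[:, i:i + pk, j:j + pk].ravel()       # R_r x
            proj = (P @ patch).reshape(ncoils, pk, pk)     # V_par V_par^H R_r x
            out[:, i:i + pk, j:j + pk] += proj             # R_r^H (...)
            count[i:i + pk, j:j + pk] += 1
    return out / count                        # divide by M point-wise

# For consistent multicoil data (all patches lying in the row-space of A),
# apply_W(x, V_par, pk) should reproduce x, as equation (9) states.
```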
“Obvious” convolution
I believe the most challenging part is deriving equation (16), Gqsq=sq, from the paper. The authors provide little detail about this step, only mentioning that the operator W=M−1∑rRrHV∥V∥HRr is a matrix-valued convolution, which can be decoupled into point-wise matrix operations in the image domain. The sensitivity maps are then obtained by eigenvalue decomposition of all the Gq's, choosing only the eigenvectors corresponding to eigenvalue "=1". After that, they jump straight to showing the computation of Gq in Figure 3 without any mathematical explanation.
The intuition is that we could use the Fourier convolution theorem to get what we want, which is straightforward in the 1D case. However, since W is a high-dimensional operator (over k-space locations and channels), applying the theorem directly is not obvious. A better approach is to transform W into a form that resembles the familiar 1D case.
To simplify the derivation, we first introduce some notations:
P=V∥V∥H: P operates on patches
P(c,α),(d,β): a scalar value, where (c,α) represents the c-th channel and location α in the output (row index), and (d,β) represents the d-th channel and location β in the input (column index)
xc(r): a scalar value, where c represents the c-th channel and r represents the location r in k-space
(Rrx)(d,β): a scalar value, the d-th coil value at location β in the patch centered at location r
(PRrx)(c,α): a scalar value, the c-th coil value at location α of the output
(RrHPRrx)c(m): a scalar value, the c-th coil value at location m in the k-space
As we have discussed above, Rr extracts a patch centered at location r:
$$(R_r x)(d, \beta) = x_d(r + \beta)$$
where β represents the relative shift with respect to the central location r.
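A tiny sanity check of this indexing convention; in this toy snippet β is measured from the patch corner rather than its centre, which only shifts the bookkeeping:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal((2, 8, 8))              # (coils, kx, ky)

r = (3, 4)                                      # patch anchor location
pk = 3
patch = x[:, r[0]:r[0] + pk, r[1]:r[1] + pk]    # R_r x

d, beta = 1, (1, 2)                             # coil index and in-patch shift
assert patch[d, beta[0], beta[1]] == x[d, r[0] + beta[0], r[1] + beta[1]]
```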
The effect of P is to perform a weighted summation across the channels and locations of the input patch:

$$(P R_r x)(c, \alpha) = \sum_{d, \beta} P_{(c,\alpha),(d,\beta)}\, x_d(r + \beta)$$

Applying RrH puts each weighted patch back at its original location, and summing over all r collects, for each output point m, every pair (α, β) with r + α = m. Writing u = α − β, this yields

$$(Wx)_c(m) = M^{-1} \sum_r (R_r^H P R_r x)_c(m) = M^{-1} \sum_d \sum_u K_{c,d}(u)\, x_d(m - u)$$
where Kc,d(u)=∑α,β,α−β=uP(c,α),(d,β). Recall that u is determined by the relative shifts of α and β. What we are doing here is actually re-grouping the pairs (α,β) according to their common difference u. We don’t need to worry about how to compute Kc,d(u) right now; we only need to understand that it is a scalar value associated with c,d and the difference u.
Finally, we have something that looks like a convolution: ∑uKc,d(u)xd(m−u) is a convolution of Kc,d(m) with xd(m). It's worth noting that the range of u spans at most twice the patch size, as u is the difference of two in-patch shifts; Kc,d(m) is only meaningful within this range.
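The re-grouping of P into the kernels Kc,d(u) can be written down almost literally. This is a hedged sketch with 2D patches and naive loops; the offsets are indexed from the patch corner, and u ranges over 2·pk − 1 values per axis:

```python
import numpy as np

def kernels_from_P(P, ncoils, pk):
    """Regroup P = V_par V_par^H into convolution kernels K_{c,d}(u)."""
    # Make channels and in-patch offsets explicit: P6[(c, a1, a2), (d, b1, b2)].
    P6 = P.reshape(ncoils, pk, pk, ncoils, pk, pk)
    K = np.zeros((ncoils, ncoils, 2 * pk - 1, 2 * pk - 1), dtype=P.dtype)
    for a1 in range(pk):
        for a2 in range(pk):
            for b1 in range(pk):
                for b2 in range(pk):
                    u1, u2 = a1 - b1, a2 - b2           # u = alpha - beta
                    K[:, :, u1 + pk - 1, u2 + pk - 1] += P6[:, a1, a2, :, b1, b2]
    return K   # K[c, d] is the kernel convolved with coil d's k-space data

# Usage, continuing the SVD sketch above:
# K = kernels_from_P(V_par @ V_par.conj().T, ncoils, pk)
```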
Calculating coil sensitivities
Recall from the introduction that xc(m) = F[sc(n)x̂(n)](m), where x̂(n) denotes the image-space tissue value. With this in mind, the equation from the previous section reads:
$$(Wx)_c(m) = M^{-1} \sum_d K_{c,d}(m) * x_d(m) = x_c(m)$$
Apply the inverse Fourier transform on both sides:
$$M^{-1} \sum_d \hat{K}_{c,d}(n)\, s_d(n)\, \hat{x}(n) = s_c(n)\, \hat{x}(n)$$
where x̂(n) is the image-space tissue value and sd(n) is the d-th coil sensitivity at location n. Assuming x̂(n) ≠ 0, we can cancel it on both sides. Stacking all coils at location n into a vector s(n) and defining the matrix Gn with entries (Gn)c,d = M−1K̂c,d(n), we have:

$$G_n\, s(n) = s(n)$$
We got it! Now it's clear that the coil sensitivities at location n form an eigenvector of Gn corresponding to the eigenvalue 1. This is exactly equation (16) in the paper (different notations, but essentially the same).
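Putting the last step together, here is a hedged sketch of going from the kernels Kc,d(u) to the maps: zero-pad each kernel to the image size, inverse-Fourier-transform it to get K̂c,d(n), assemble Gn per pixel, and keep the eigenvector whose eigenvalue is closest to 1. FFT-centering and normalization conventions are glossed over here:

```python
import numpy as np

def espirit_maps(K, M, nx, ny):
    """K : (ncoils, ncoils, ku, kv) kernels K_{c,d}(u); M : normalization constant."""
    ncoils, _, ku, kv = K.shape

    # Zero-pad each kernel to the image size and inverse-Fourier-transform it,
    # giving Khat[c, d] evaluated at every image location n.
    Kpad = np.zeros((ncoils, ncoils, nx, ny), dtype=complex)
    Kpad[:, :, :ku, :kv] = K
    # The factor nx * ny undoes numpy's 1/(nx*ny) ifft scaling (a convention choice).
    Khat = np.fft.fftshift(np.fft.ifft2(Kpad, axes=(-2, -1)), axes=(-2, -1)) * (nx * ny)

    # G_n: the ncoils x ncoils matrix (1/M) * Khat[:, :, n] at every pixel n.
    G = np.moveaxis(Khat, (0, 1), (-2, -1)) / M           # (nx, ny, ncoils, ncoils)

    # G_n is Hermitian up to numerical error, so symmetrize and use eigh.
    w, v = np.linalg.eigh((G + np.conj(np.swapaxes(G, -2, -1))) / 2)
    # eigh returns ascending eigenvalues; the sensitivity estimate at each pixel
    # is the eigenvector whose eigenvalue is largest (ideally "= 1").
    maps = v[..., -1]                                     # (nx, ny, ncoils)
    return maps, w[..., -1]
```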
The remaining question is what K̂c,d(n), the inverse Fourier transform of Kc,d(m), actually is: