The authors propose an implicit representation for both the source and the target shapes: the signed Euclidean distance transform is used to model the shape of interest as the zero level set of a distance function.

Let $\Phi_S: \Omega \to \mathbb{R}$ denote the distance transform of a shape $S$. The shape defines a partition of the image domain $\Omega$: the region enclosed by $S$, denoted $[R_S]$, and the background region, denoted $[\Omega - R_S]$. Using the signed Euclidean distance transform, we have

$$ \Phi_S(x,y) = \begin{cases} 0, & \text{if}\ (x,y) \in S \\ +D((x,y),S), & \text{if}\ (x,y) \in R_S \\ -D((x,y),S), & \text{if}\ (x,y) \in [\Omega - R_S] \end{cases}. $$

where $D((x,y),S)$ denotes the minimum Euclidean distance between the image pixel location $(x,y)$ and the shape $S$. The authors note that the gradient of the embedding function $\Phi_S$ is a unit vector normal to the shape, which helps guarantee the convergence of the gradient descent algorithm. Moreover, the implicit shape representation provides additional support to the registration algorithm around the shape boundaries by imposing smoothness constraints. To keep registration efficient despite the densely defined embedding, only a narrow band around the shape in the embedding space is used as the registration sample domain.
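
As a concrete illustration, the signed distance transform and the narrow-band sample domain can be computed in a few lines with SciPy. This is a minimal sketch rather than the authors' implementation; `signed_distance_transform` and `narrow_band` are hypothetical helper names, and the band half-width is a free parameter:

```python
import numpy as np
from scipy import ndimage

def signed_distance_transform(mask):
    """Signed Euclidean distance transform of a binary shape mask.

    Positive inside the shape (R_S), negative outside (Omega - R_S),
    and approximately zero on the boundary S, matching Phi_S above.
    """
    # Distance from each foreground pixel to the nearest background pixel
    dist_in = ndimage.distance_transform_edt(mask)
    # Distance from each background pixel to the nearest foreground pixel
    dist_out = ndimage.distance_transform_edt(~mask)
    return dist_in - dist_out

def narrow_band(phi, width):
    """Boolean mask of the narrow band |Phi_S| < width around the shape,
    used to restrict the registration sample domain."""
    return np.abs(phi) < width
```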

The implicit shape representation is inherently invariant to translation and rotation. By drawing an analogy between registering distance maps of a shape at different scales and matching images of the same scene acquired with different modalities, the authors propose using mutual information for the task. Combining mutual information with the implicit shape representation yields an alignment framework that is invariant to translation, rotation, and scaling, and that is robust to transformations in arbitrary dimensions.
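
A small numerical illustration (not from the paper) of the modality analogy: scaling a shape scales the values of its distance map, so the maps of the same shape at two scales are related by an intensity mapping rather than by equal intensities, which is exactly the situation mutual information handles well:

```python
import numpy as np
from scipy import ndimage

# Distance maps of the same disk at two scales: the values scale with the
# shape, so the two maps relate like images of one scene in two modalities.
yy, xx = np.mgrid[0:200, 0:200]
disk_r20 = (xx - 100) ** 2 + (yy - 100) ** 2 < 20 ** 2
disk_r40 = (xx - 100) ** 2 + (yy - 100) ** 2 < 40 ** 2

d20 = ndimage.distance_transform_edt(disk_r20)
d40 = ndimage.distance_transform_edt(disk_r40)
print(d20.max(), d40.max())  # ~20 and ~40: doubling the shape doubles the values
```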

Let $\Phi_D$ and $\Phi_S$ represent the distance transforms of the source shape and the target shape respectively, and let their corresponding intensity images (where the intensities represent the distance values) be denoted by $f$ and $g$ respectively. Representing the sample domain as $\Omega$, let the parametric transformation be denoted by $A$, with parameters $\Theta = (\theta_1, \theta_2, \ldots, \theta_N)$. The task of global registration is then equivalent to estimating the value of $\Theta$ such that the mutual information between $f_\Omega = f(\Omega)$ and $g_\Omega^A = g(A(\Theta;\Omega))$ is maximized. The expression for this mutual information (MI) is given as $$ \text{MI}(f_\Omega, g_\Omega^A) = H\big[p^{f_\Omega}(l_1)\big] + H\big[p^{g_\Omega^A}(l_2)\big] - H\big[p^{f_\Omega,g_\Omega^A}(l_1,l_2)\big] $$ where $l_1$ and $l_2$ represent the intensities of the distance transform images in the $f_\Omega$ and $g_\Omega^A$ domains respectively, and $H$ denotes the differential entropy. $p^{f_\Omega}$ and $p^{g_\Omega^A}$ represent the intensity probability density functions in the $f_\Omega$ and $g_\Omega^A$ domains respectively, while $p^{f_\Omega,g_\Omega^A}$ represents their joint probability density.
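
For intuition, a simple histogram-based estimate of this quantity can be written as follows. The sketch is only illustrative: the paper works with continuous densities and differential entropy, and `mutual_information` is a hypothetical helper name:

```python
import numpy as np

def mutual_information(f_vals, g_vals, bins=64):
    """Histogram estimate of MI between distance values sampled from the
    fixed image f and the transformed image g over the domain Omega."""
    joint, _, _ = np.histogram2d(f_vals.ravel(), g_vals.ravel(), bins=bins)
    p_joint = joint / joint.sum()          # joint pdf p^{f,g}(l1, l2)
    p_f = p_joint.sum(axis=1)              # marginal p^{f}(l1)
    p_g = p_joint.sum(axis=0)              # marginal p^{g}(l2)
    nz = p_joint > 0                       # avoid log(0) on empty bins
    return np.sum(p_joint[nz] * np.log(p_joint[nz] / np.outer(p_f, p_g)[nz]))
```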

Substituting the formulae for the differential entropies, we can write the following energy functional, which is minimized using gradient descent: \begin{align*} E_{Global}(A(\Theta)) & = - \text{MI}(f_\Omega, g_\Omega^A) \\ & = -\int\!\!\int_{\mathbb{R}^2} p^{f_\Omega,g_\Omega^A}(l_1,l_2) \log \frac{p^{f_\Omega,g_\Omega^A}(l_1,l_2)}{p^{f_\Omega}(l_1)\,p^{g_\Omega^A}(l_2)}\, dl_1\, dl_2 \end{align*}
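
The expansion from the entropy form to this integral follows from writing each differential entropy as $H[p] = -\int p \log p$ and noting that the marginals are obtained by integrating out the joint density: $$ H\big[p^{f_\Omega}(l_1)\big] = -\int_{\mathbb{R}} p^{f_\Omega}(l_1) \log p^{f_\Omega}(l_1)\, dl_1 = -\int\!\!\int_{\mathbb{R}^2} p^{f_\Omega,g_\Omega^A}(l_1,l_2) \log p^{f_\Omega}(l_1)\, dl_1\, dl_2, $$ and analogously for $H\big[p^{g_\Omega^A}(l_2)\big]$. Summing the three entropy terms and collecting the logarithms into a single ratio gives the integrand above.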

The probability density functions in the energy functional $E_{Global}(A(\Theta))$ are approximated using a non-parametric, differentiable Gaussian kernel-based density estimation model. Using gradient descent, the parameters $\Theta = (\theta_1, \theta_2, \ldots, \theta_N)$ can be updated by computing $\frac{\partial E_{Global}}{\partial \theta_i}$.
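
A minimal sketch of these two ingredients, under the assumption of a 1-D Parzen (Gaussian kernel) estimate and, for simplicity, finite-difference partial derivatives in place of the analytic derivatives the authors use; all names are hypothetical:

```python
import numpy as np

def gaussian_kde(samples, query, h):
    """Parzen-window (Gaussian kernel) estimate of a 1-D density at `query`
    points; the joint density over (l1, l2) extends this to 2-D kernels."""
    diff = (query[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * diff**2).sum(axis=1) / (len(samples) * h * np.sqrt(2 * np.pi))

def gradient_descent_step(theta, energy, lr=1e-2, eps=1e-4):
    """One update theta_i <- theta_i - lr * dE_Global/dtheta_i, with the
    partials approximated by central finite differences for illustration."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        tp, tm = theta.copy(), theta.copy()
        tp[i] += eps
        tm[i] -= eps
        grad[i] = (energy(tp) - energy(tm)) / (2 * eps)
    return theta - lr * grad
```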

This summary was written in Fall 2018 as a part of the CMPT 880 Special Topics in AI: Medical Imaging Meets Machine Learning course.