Introduction
This paper presents an algorithm for non-rigid registration formulated as a discrete labeling problem. The authors note that the two major contemporary works for image registration had inherent flaws - Free-Form Deformation (FFD) based model was crippled by the choice of the set of control points to represent the deformation, while Demons Based Method did not penalize large displacements of pixels and was highly sensitive to local artifacts. The authors demonstrate the proposed algorithm’s superior performance for 2D and 3D registration compared to the two aforementioned algorithms.
Method
Given a source image $I$ and a floating image $J$, both of dimension $d$, let $\boldsymbol{X}$ be the continuous spatial domain of both the images. Therefore, $\boldsymbol{I}(\boldsymbol{x})$ and $\boldsymbol{J}(\boldsymbol{x})$ represent the intensity values (or feature vectors for a generalized formulation) of the two images as $\boldsymbol{x}$. Let $T$ represent the transformation which denotes the displacement vector field $D$, denoting the displacement of every point $\boldsymbol{x}$ to a new point $\boldsymbol{x} + \boldsymbol{D(x)}$.
Therefore, the registration task can be defined as finding the optimal $\boldsymbol{D^*}$ such that
$$ \boldsymbol{D^*} = \text{argmin}_D C\big(I(\boldsymbol{X}), J(\boldsymbol{X} + \boldsymbol{D})\big) + \lambda S(\boldsymbol{D}) $$ where $C$ is a image-to-image dissimilarity measure and $S$ is a smoothness regularizer. Choosing them to be the integrated absolute difference and the magnitude of first derivative terms, we have
$$ \boldsymbol{D^*} = \text{argmin}_{D} \int_{\boldsymbol{X}} \| I(\boldsymbol{x}) - J(\boldsymbol{x} + \boldsymbol{D}(\boldsymbol{x})) \| d\boldsymbol{X} + \lambda \sum_{i=1}^d \int_{X} \| \boldsymbol{D}_{x_i} \| d\boldsymbol{X} $$
For a discretized formulation, let $\boldsymbol{X}$ be discretized into pixels and subsequently replace the integrals with summations and the derivatives with finite differences, and we have
$$ \boldsymbol{D^*} = \text{argmin}_{D} \sum_{\boldsymbol{x} \in \boldsymbol{X}} \| I(\boldsymbol{x}) - J(\boldsymbol{x} + \boldsymbol{D}(\boldsymbol{x})) \| + \lambda \sum_{(\boldsymbol{x}, \boldsymbol{y}) \in N} \| \boldsymbol{D}_{\boldsymbol{x}} - \boldsymbol{D}_{\boldsymbol{y}} \| $$
where $(\boldsymbol{x}, \boldsymbol{y}) \in N$ if and only if $x$ and $y$ are adjacent pixels.
It should be noted that $\boldsymbol{D}(\boldsymbol{x}) \in \mathbb{R}^d$ is still not a discrete domain representation. In order to convert this optimization problem to a labeling problem which can then efficiently be solved by graph cuts, a second step of discretization is done to limit $\boldsymbol{D}(\boldsymbol{x})$ to a finite set.
Choosing a window $W = \{0, \pm s, \pm2s,…,\pm ws\}^d$ of dimension $d$ to discretize the continuous region $[-ws, ws]^d$ with a sampling period $s$ along all directions, we can put a restriction on how far a pixel can be displaced. Moreover, an interesting extension to this is that for $s \le 1$, displacements with sub-pixel units can also be considered.
A general formulation of the graph-cuts method is
$$ E_f = \sum_{p \in P} D_p(f_p) + \sum_{(p, q) \in N} V_{p, q} (f_p, f_q) $$ where $E_f$ is the energy functional to be minimized and $D_p(f_p)$ represents the penalty associated with assigning the label $f_p$ to pixel $p$ and $V_{p, q} (f_p, f_q)$ represents the penalty of assigning labels $f_p$ and $f_q$ to neighborhood pixels $p$ and $q$ respectively. Moreover, $P$ is the set of pixels, $N \subset P \times P$ is a neighborhood system defined on $P$, $f : P \to L$ is the labeling function where $L$ is the set of labels and $f_i \in L$ is the label of the pixel $i$ in $f$.
If we consider a 4-connected neighborhood $N$, then the optimization problem transforms into a labeling problem that can be solved by using graph-cuts via a sequence of $\alpha$-expansion moves. Previous work by Kolmogorov and Zabih [1] has shown that graph-cuts can be used to find the exact minimum of a two label problem if for every $V_{p, q}$, we have $$ V_{p, q} (0, 0) + V_{p, q} (1, 1) \leq V_{p, q} (0, 1) + V_{p, q} (1, 0) $$
References
- Kolmogorov, Vladimir, and Ramin Zabih. ”What energy functions can be minimized via graph cuts?.” IEEE Transactions on Pattern Analysis Machine Intelligence 2 (2004): 147-159.