Abstract
Parametric imaging of nuclear medicine data exploits dynamic functional images in order to reconstruct maps of kinetic parameters related to the metabolism of a specific tracer injected in the biological tissue. From a computational viewpoint, the realization of parametric images requires the pixel-wise numerical solution of compartmental inverse problems that are typically ill-posed and nonlinear. In the present paper we introduce a fast numerical optimization scheme for parametric imaging relying on a regularized version of the standard affine-scaling Trust Region method. The validation of this approach is realized in a simulation framework for brain imaging and comparison of performances is made with respect to a regularized Gauss-Newton scheme and a standard nonlinear least-squares algorithm.
1. Introduction
Positron Emission Tomography (PET) [1] utilizes an isotope produced in a cyclotron to provide dynamical images of the metabolism-based isotope accumulation in the biological tissue. PET dynamic images of the tracer distribution are obtained by applying a reconstruction algorithm to the measured radioactivity and provide a reliable estimate of the tracer-related metabolism in the tissue [2, 3].
From a technical viewpoint, compartmental analysis [4, 5, 6] allows processing these dynamic PET data in order to estimate a set of physiological kinetic parameters that explain such metabolism in a quantitative manner (specifically, these parameters express the effectiveness of the tracer in changing its functional status within the tissue). Compartmental analysis requires, first, the formulation of a forward model for the tracer concentration represented by a Cauchy problem, in which the kinetic parameters are the coefficients of the differential equations; then, the numerical solution of the corresponding nonlinear inverse problem, in which the kinetic parameters are the unknowns and the tracer concentrations in the tissue are the input data.
Relying on compartmental analysis, parametric imaging [7, 8, 9] allows the pixel-wise determination of the kinetic parameters by means of two alternative approaches. On the one hand, direct parametric imaging [10, 11] takes the raw PET sinograms as input and solves the inverse problem that relates them to the parameters; on the other hand, indirect parametric imaging [7, 8, 12, 13] is applied to the reconstructed PET images and solves the compartmental analysis problem pixel-wise. Direct approaches do not need image reconstruction methods but typically have to deal with the intertwining of spatial and temporal correlations, which makes the optimization process more complex; the same optimization is more straightforward in indirect approaches but entails a higher computational burden, because a large number of nonlinear inverse problems must be solved.
The present paper deals with indirect parametric imaging and introduces a regularized optimization method for the solution of the nonlinear ill-posed inverse problem of compartmental analysis. The idea of the method is to introduce a regularizing strategy [14] in the standard affine-scaling Trust Region method [15, 16], which allows a better reduction of the numerical instabilities induced by the presence of the experimental noise in the measured data.
From a formal viewpoint, we prove a convergence result for the regularized algorithm, which generalizes to the non-negatively constrained case the convergence analysis developed in [14] for the unconstrained problem. The numerical validation of the method is performed against synthetic data generated from an ‘ad hoc’ modification of the Hoffman Brain Phantom often used in PET and CT imaging (http://depts.washington.edu/petctdro/DROhoffman_main.html). Specifically, we mimicked a two-compartment experiment for the kinetics of [18F]fluoro-2-deoxy-D-glucose (FDG), which is the most widely used tracer in PET diagnostic and prognostic activities [17, 18, 19, 20, 21]. Using this simulation we could compare the computational effectiveness and reconstruction accuracy of the method with the performance of two frequently used indirect parametric imaging methods.
The structure of the paper is as follows. Section 2 sets up the two-compartment problem for FDG kinetics. Section 3 describes in detail the nonlinear optimization method for the solution of this problem. Section 4 illustrates the validation experiment and its results. Our conclusions are offered in Section 5.
2. Compartmental analysis of dynamic PET data
Compartmental analysis of nuclear medicine data is the mathematical framework for the quantitative assessment of tracer kinetics in the biological tissue [22, 23, 24, 25, 26, 27]. The compartmental model of a specific organ comprises compartments representing the functional states of the tracer radioactive molecules (e.g., physical location such as intravascular, extracellular, or intracellular space, or chemical state such as metabolic form or binding state), and kinetic parameters, which are the input/output tracer rates for each compartment. Figure 1 illustrates the standard two-compartment model describing the FDG metabolism in the organ under consideration [28]. This model reproduces the main steps of the FDG path in a PET experiment. First, the tracer is injected into the blood with a concentration mathematically modelled by the Input Function (IF), here assumed to be known and represented by the tracer concentration Cb in the arterial blood compartment. Then, the FDG metabolism within the tissue is characterized by two functional states: the free compartment with concentration Cf, associated with the tracer molecules outside the tissue cells, and the metabolized compartment with concentration Cm, associated with FDG molecules within the cytoplasm. Finally, the FDG kinetics is described by four rate constants connecting the model compartments: k1 and k2 describe the exchange rates between the input and free pools, and k3 and k4 describe the exchange rates underlying the phosphorylation/dephosphorylation process.
The system of Ordinary Differential Equations (ODEs) for the two-compartment model is given by (1), where t is the time variable, and ±ki, i = 1, 2, 3, 4, represent incoming and outgoing fluxes. In standard applications of compartmental analysis, the initial conditions are Cf (0) = Cm(0) = 0, meaning that the PET experiment starts at time t = 0, when there is no tracer available in the biological system. The analytical solution of (1) represents the forward model, which determines the compartment concentrations given the kinetic parameters, and takes the integral form (3), where the entries of the vector k = (k1, k2, k3, k4)T ∈ ℝ4 must be non-negative real values.
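For orientation, the description of the four rate constants above determines the usual form of the two-compartment system. The block below is a sketch written under that assumption, not a reproduction of the original display equations; the matrix symbol M and the unit vector e_1 are notation introduced here for illustration.

```latex
\begin{aligned}
\dot{C}_f(t) &= -(k_2 + k_3)\, C_f(t) + k_4\, C_m(t) + k_1\, C_b(t), \\
\dot{C}_m(t) &= k_3\, C_f(t) - k_4\, C_m(t),
\end{aligned}
\qquad C_f(0) = C_m(0) = 0 .

% With C = (C_f, C_m)^T, the variation-of-constants formula gives
C(t; k) = k_1 \int_0^t e^{(t-\tau) M}\, e_1\, C_b(\tau)\, d\tau ,
\qquad
M = \begin{pmatrix} -(k_2 + k_3) & k_4 \\ k_3 & -k_4 \end{pmatrix},
\quad
e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.
```

The integral expression follows from the linear system by the variation-of-constants formula, so it holds whenever the system has the form written above.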
The compartmental input function Cb(t) can be obtained experimentally either from serial sampling of the arterial blood or reconstructed dynamic images [29], when a large arterial pool such as the left ventricle is in the field of view for many frames, or by using reference tissue methods [30, 31]. However, PET images cannot offer enough resolution power to provide information on C(t; k). Therefore the measurement equation should be added to equation (3) to connect the compartment model to the PET data. In this equation (p, q) represents a specific image pixel, denotes the measured tracer concentration at pixel (p, q) of the organ image, C(p, q) is the formal analytic solution of (3), and V(p, q) is the fraction of tissue volume occupied by the blood. In general, the blood volume fraction depends on the pixel position, but within a homogeneous tissue it can be assumed as a known constant.
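As a rough numerical illustration of the forward problem at a single pixel, the sketch below integrates the two-compartment system with a classical fourth-order Runge-Kutta scheme and then mixes the tissue and blood signals. It assumes that the ODE system has the standard form implied by the description of k1, ..., k4, and that the measurement equation has the common form (1 − V)(Cf + Cm) + V Cb; neither formula is reproduced from the equations of this paper. The function names, the input function, and all numerical values are hypothetical.

```python
import math


def rhs(t, c, k, cb):
    """Right-hand side of the (assumed) two-compartment ODE system."""
    k1, k2, k3, k4 = k
    cf, cm = c
    dcf = k1 * cb(t) - (k2 + k3) * cf + k4 * cm
    dcm = k3 * cf - k4 * cm
    return (dcf, dcm)


def rk4_step(t, c, h, k, cb):
    """One classical Runge-Kutta step of size h."""
    s1 = rhs(t, c, k, cb)
    s2 = rhs(t + h / 2, (c[0] + h / 2 * s1[0], c[1] + h / 2 * s1[1]), k, cb)
    s3 = rhs(t + h / 2, (c[0] + h / 2 * s2[0], c[1] + h / 2 * s2[1]), k, cb)
    s4 = rhs(t + h, (c[0] + h * s3[0], c[1] + h * s3[1]), k, cb)
    return tuple(
        c[i] + h / 6 * (s1[i] + 2 * s2[i] + 2 * s3[i] + s4[i]) for i in range(2)
    )


def simulate_pixel(k, cb, times, v_blood, dt=0.01):
    """Measured concentration at increasing time points for one pixel.

    Assumed measurement model: (1 - V) * (Cf + Cm) + V * Cb.
    """
    t, c, out = 0.0, (0.0, 0.0), []  # Cf(0) = Cm(0) = 0
    for target in times:
        n = max(1, math.ceil((target - t) / dt))
        h = (target - t) / n
        for _ in range(n):
            c = rk4_step(t, c, h, k, cb)
            t += h
        t = target  # avoid drift from accumulated rounding
        out.append((1.0 - v_blood) * (c[0] + c[1]) + v_blood * cb(target))
    return out


def toy_input_function(t):
    """Hypothetical IF: linear rise followed by a single exponential decay."""
    t_peak = 0.5
    if t <= t_peak:
        return 100.0 * t / t_peak
    return 100.0 * math.exp(-0.3 * (t - t_peak))


if __name__ == "__main__":
    k_true = (0.1, 0.2, 0.05, 0.01)  # illustrative rate constants, 1/min
    curve = simulate_pixel(k_true, toy_input_function, [1.0, 5.0, 30.0, 60.0], 0.05)
    print(curve)
```

Repeating this evaluation for every pixel, each with its own parameter vector and blood volume fraction, yields a synthetic dynamic image sequence of the kind used for validation.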
In equation (4) the unknown kinetic parameters are functions of (p, q) and therefore the inverse problem represented by this equation should be solved numerically and pixel-wise. Rather coarse approximations allow a linearization of this equation [32, 33]. However, the pixel-wise solution of the exact nonlinear equation requires an effective optimization scheme for the regularization of the ill-posed nonlinear compartmental inverse problem represented by the equation and, ultimately, for the reconstruction of the four parametric images associated with k1, k2, k3, and k4.
3. Computational approaches for nonlinear ill-posed problems
The compartmental inverse problem described in the previous section is a special case of the following more general formulation. Given a set of measurements y0 of tracer concentration provided by PET images, corresponding to a finite sample of N time points t1,…, tN, we have to determine the kinetic parameters k ∈ ℝn, n ≤ N, by solving the non-negatively constrained nonlinear system
Here , and F: ℝn → ℝN is the continuously differentiable function on the right-hand side of (4). In real experiments a noisy version yδ of y0 is available, where δ is a known bound on the measurement error, with δ ≤ ||y0||. A standard approach to address equation (5) [34, 35] consists in approximating a solution k† of this nonlinear system by solving the following non-negatively constrained nonlinear least-squares problem via an iterative regularization technique with semiconvergent behaviour:
In view of the discrepancy principle [34], the iterative method is stopped at the iteration satisfying the following condition for a suitable τ > 1.
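In the notation of this section, the constrained system, the associated least-squares problem, and the discrepancy-based stopping rule are usually written as below. This is a sketch of the standard forms, assumed rather than copied from the original displays; the stopping index j_* is a symbol introduced here, and the factor 1/2 is chosen so that the gradient agrees with gj = J(kj)T (F(kj) − yδ) used later in the text.

```latex
% (5) non-negatively constrained nonlinear system
F(k) = y_0, \qquad k \ge 0 ,

% (6) non-negatively constrained nonlinear least-squares problem
\min_{k \ge 0} \; \Phi(k) = \tfrac{1}{2}\, \| F(k) - y^{\delta} \|^2 ,

% (7) discrepancy principle: stop at the first index j_* such that
\| y^{\delta} - F(k_{j_*}) \| \;\le\; \tau \delta \;<\; \| y^{\delta} - F(k_j) \| ,
\qquad 0 \le j < j_* .
```

Stopping as soon as the residual reaches the noise level is what keeps a semiconvergent iteration from fitting the noise.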
In this section, we describe a method for computing a regularized solution of problem (6); in particular, we combine the regularizing approach developed in [14] for unconstrained ill-posed problems with the affine-scaling trust-region (TR) schemes for box-constrained minimization problems [15, 36]. The key point linking these methods is Proposition 1 below, which shows that possible projection steps do not prevent the convergence of the iterative scheme. Therefore, the main contribution of this section is to show that the theoretical framework developed for the unconstrained problem [14] still holds in the non-negatively constrained case.
3.1. A regularizing affine-scaling trust-region method for non-negatively constrained nonlinear least-squares problems
For unconstrained nonlinear ill-posed least-squares problems, the state-of-the-art approaches are the regularized Levenberg-Marquardt (LM) method, proposed by Hanke [37], and its reformulation within a Trust-Region (TR) framework, proposed by Wang et al. [14] and, more recently, by Bellavia et al. [38]. As in the standard TR algorithm, the regularizing TR iteration requires computing, at each iteration, a trial step pj by minimizing the quadratic model mj (p) within a region around the current iterate kj: where Bj ≡ J(kj)T J(kj) is the Gauss-Newton approximation of the Hessian of Φ, gj ≡ ∇Φ(kj) = J(kj)T (F(kj) − yδ) and Δj denotes the TR radius; the radius is expanded or reduced depending on whether a sufficient reduction of the model is achieved, i.e., on whether the ratio between the actual reduction in the objective functional and the predicted reduction in the quadratic model exceeds some positive threshold β ∈ (0,1). The regularizing property is accomplished by requiring that the TR constraint is active at the solution, i.e., the solution pj of (8) must satisfy ||pj || = Δj, so that the associated Lagrange multiplier αj plays the role of a penalization parameter in a Tikhonov-like regularization. Indeed, given kj, the new iterate can be viewed as the solution of the penalized subproblem arising at each iteration of the LM method:
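With the definitions of Bj, gj and Δj given above, the TR subproblem and the equivalent penalized LM subproblem take the following standard forms. The block is a sketch consistent with those definitions, assuming Φ(k) = ½‖F(k) − yδ‖², and is not a verbatim copy of the original displays.

```latex
% (8) trust-region subproblem
\min_{p} \; m_j(p) = \tfrac{1}{2}\, \| F(k_j) - y^{\delta} + J(k_j)\, p \|^2
                   = \Phi(k_j) + g_j^{T} p + \tfrac{1}{2}\, p^{T} B_j\, p
\quad \text{subject to} \quad \| p \| \le \Delta_j ,

% (9) penalized (Levenberg-Marquardt) subproblem
k_{j+1} = \arg\min_{k} \; \tfrac{1}{2}\, \| F(k_j) - y^{\delta} + J(k_j)(k - k_j) \|^2
          + \tfrac{\alpha_j}{2}\, \| k - k_j \|^2 .
```

When the constraint is active, the optimality conditions of the first problem read (Bj + αj In) pj = −gj with αj ≥ 0 the Lagrange multiplier, which is exactly the normal equation of the second problem with k − kj = pj.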
This regularization technique for an unconstrained problem can be combined with the TR methods for box-constrained nonlinear least-squares problems. To this aim, we introduce a regularizing technique in the affine-scaling TR method [16, 15, 36] requiring that the TR constraint in the subproblem (8) is active at the solution. In particular, given kj > 0 and gj ≠ 0, we find the solution αj > 0 of the nonlinear equation Δj – ||p(α)|| = 0, where p(α) = (J(kj)T J(kj) + αIn)−1 J(kj)T(yδ ‒ F(kj)). By setting pj = p(αj), in order to ensure the strict feasibility of a new iterate, the i-th entry of is computed in accordance with the following rule: where Π(·) denotes the Euclidean projection onto the non-negative orthant of ℝn and t ∈ (0,1). Clearly, in view of the properties of the projection operator, .
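The computation of αj and of the step p(αj) can be illustrated with a small standard-library sketch. It relies on the fact that ‖p(α)‖ decreases monotonically as α grows, so a bisection on a logarithmic scale finds the multiplier for which the TR constraint is active. The helper names are hypothetical, and `projected_step` is only a plausible stand-in for the component-wise rule of the paper, whose exact formula is not reproduced here: components that would leave the positive orthant are pulled back by the factor t so that the new iterate stays strictly positive.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def step(jac, res, alpha):
    """p(alpha) = (J^T J + alpha I)^{-1} J^T res, with res = y_delta - F(k_j)."""
    n = len(jac[0])
    jtj = [
        [sum(row[i] * row[j] for row in jac) + (alpha if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    jtr = [sum(row[i] * r for row, r in zip(jac, res)) for i in range(n)]
    return solve(jtj, jtr)


def find_alpha(jac, res, radius, lo=1e-12, iters=200):
    """Find alpha > 0 with ||p(alpha)|| = radius by log-scale bisection."""
    if norm(step(jac, res, lo)) <= radius:
        raise ValueError("trust-region constraint is not active for this radius")
    hi = 1.0
    while norm(step(jac, res, hi)) > radius:
        hi *= 10.0
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if norm(step(jac, res, mid)) > radius:
            lo = mid
        else:
            hi = mid
    return hi


def projected_step(k, p, t=0.9):
    """Assumed step-back rule keeping k + step strictly positive (k > 0)."""
    out = []
    for ki, pi in zip(k, p):
        if ki + pi > 0.0:
            out.append(pi)
        else:
            out.append(t * (max(ki + pi, 0.0) - ki))  # equals -t * ki
    return out
```

For instance, with J equal to the 2 × 2 identity, residual (3, 4) and Δ = 1, the step is p(α) = (3, 4)/(1 + α), so the multiplier solving ‖p(α)‖ = 1 is α = 4.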
As emphasized in [15], a key point to assure the convergence of the affine-scaling TR method is that the new iterate must be able to achieve at least as much reduction in the quadratic model as the one achieved by the generalized Cauchy point , where D(k) is a diagonal matrix such that and is defined as follows with t ∈ (0,1). If and ρj > β ∈ [0.25,1), the current trial step is accepted and the next iterate is updated as , otherwise the TR radius is reduced. In particular, if , the unsatisfactory reduction of the quadratic model at with respect to the reduction obtained with the generalized Cauchy step highlights that we have to increase the effect of the regularization term by reducing the TR radius and computing a new reduced step; this vector tend to line up with gj and the new generalized Cauchy step, so that the sufficient reduction of the quadratic model is obtained. Furthermore, when ||gj || ≠ 0, after a successful iteration of the method, the TR radius can be further adjusted by increasing or reducing it within a prefixed range, accordingly to a strategy proposed in [38] (see Eq. (5.5)-(5.6)), as follows: where with q ∈ (0,1), and θ, η ∈ (0,1). The regularizing affine-scaling TR method, called in the following reg-AS-TR, is summarized in Algorithm 1; for data affected by noise, the stopping criterion is based on the discrepancy principle (7).
The convergence analysis of reg-AS-TR requires to prove Proposition 1, which is analogous to Theorem 2.1 of [14], i.e., we need to prove that the distance between kj and the exact solution k† decreases for . To this aim, we give two essential assumptions on the local properties of the nonlinear system (5), very similar to the ones used in [37, 14] to handle ill-posed problems.
A1. Given an initial guess k0 > 0, there exist ν, c > 0 such that k† ∈ Bν(k0) = {k ≥ 0: ║k – k0║ ≤ ν} and for all , k ∈ B2ν(k0) = {k ≥ 0: ║k – k0║ ≤ 2ν} the following condition holds:
A2. for noise-free data (δ = 0) and for noisy data (δ > 0) with τ > 1/q.
We highlight that, when these assumptions are not verified at the first steps of the algorithm, the initial iterations may restrict the domain so that they hold from a certain iteration j onwards. We can now state the following key proposition (for the proof see the Appendix).
Proposition 1. Let us assume that J(kj)T J(kj) + αjIn is positive definite, gj ≠ 0 and
for a suitable q ∈ (0,1), with j ≥ 0 and with when δ > 0. Moreover, let us assume that, for a suitable γδ > 1, the following condition holds for kj > 0:
Thus we have
with vj = (J(kj) J(kj)T + αjIN)−1 (yδ – F(kj)).
We remark that condition (17) with j = 0 follows directly from the assumptions A1-A2 with for noise-free data. For δ > 0, condition (17) with j = 0 is obtained with , combining the assumptions A1-A2 with the inequality which is satisfied for (see (7)). As a consequence of Proposition 1, k1 belongs to B2ν(k0) and to Bν(k†). Therefore, for the same argument above, condition (17) holds by induction for j ≥ 0 and for when δ > 0; as a consequence, the sequence ║kj – k†║ is decreasing.
Based on the above proposition and the convergence results of the affine-scaling TR methods, the same properties of the regularizing TR method for an unconstrained nonlinear least-squares problem can be easily extended to the non-negatively constrained case. Under Assumptions A1-A2 on the exact solution k†, reg-AS-TR terminates after iterations, where δ is the noise level on the data, whereas for δ = 0 or δ → 0 the sequence {kj} generated by Algorithm 1 converges to a solution of the original problem.
As a final remark, we point out that the ill-posedness and nonlinearity of the problem, together with the local properties of reg-AS-TR, imply that the effectiveness of our numerical scheme may be significantly influenced by the accuracy of both the initialization and the noise estimate. Addressing these two aspects reliably is therefore essential for the accuracy of the reconstruction results.
Algorithm 1. Regularizing affine-scaling Trust-Region (reg-AS-TR) method
4. Numerical experiments
The numerical validation of reg-AS-TR is performed using synthetic PET data generated by means of a digital phantom of the human brain. All simulations were run on a workstation equipped with an Intel Xeon QuadCore E5620 processor at 2.40 GHz and 18 GB of RAM, with the method implemented in the Matlab® R2019a environment.
4.1. Simulation setting
The starting point was the 3D Hoffman Digital Reference Object, a digital representation of the Hoffman Brain Phantom used in PET and CT imaging studies, freely available from the Imaging Research Laboratory of the Department of Radiology at the Medical Center of the University of Washington (http://depts.washington.edu/petctdro/DROhoffman_main.html).
The 3D Hoffman brain phantom is composed of 250 slices, covering the entire head, consisting of black/white images of size 256 × 256. We reduced the image size to 128 × 128 to resemble typical PET acquisitions, preserving the shape and features of the original phantom. For the sake of simplicity, we selected a middle slice including eight anatomical structures that can be subdivided into the four homogeneous functional regions in Figure 2(a): grey matter (region 1), white matter (region 2), basal ganglia (region 3), and thalamus (region 4). Then, for each region, we assigned a ground-truth set of rate constants of the two-compartment model for FDG kinetics (described in Section 2) and a specific blood volume fraction V. The numerical values of these parameters, reported in Table 1, were chosen to reproduce a realistic framework for the FDG uptake of a human brain [39, 40, 41, 42]. The ground-truth parametric images are shown in Figure 3.
In order to model the IF we implemented the following procedure [43]. We considered a mathematical function (see Eq. (2) in [43]) consisting of an increasing linear component followed by a tri-exponential decay; we fitted the free parameters of this function against measurements for 80 subjects; we selected the median estimated parameters computed over all 80 subjects (see Table 2 in [43]), a median initial distribution volume (12.7 L corresponding to 0.1683 L/kg body weight), and an Administered Activity (AA) of 350 MBq (typical of human PET acquisitions). The resulting simulated IF is shown in Figure 2(d).
The dynamic PET data were generated by solving the compartmental forward problem for each pixel of the processed Hoffman brain image. In particular, the two-compartment concentrations were evaluated by means of the integral equation (3) with the ground truth values of the compartmental parameters and the simulated IF, at 28 time frames (6 × 10 sec, 3 × 20 sec, 3 × 30 sec, 4 × 60 sec, 3 × 150 sec, 9 × 300 sec) with a time sampling typical of standard PET experiments, for a total time interval of 60 minutes. Then, the measurement equation (4) was computed to create the time concentration curves characteristic for each brain region (Figure 2(c)). The last frame of the obtained dynamic PET images is reported in Figure 2(b).
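The frame schedule quoted above can be expanded into explicit frame boundaries with a few lines of code, which also confirms that the 28 frames add up to 60 minutes. The sketch below is illustrative only; in particular, evaluating the model at mid-frame times is an assumption, since the text does not state where within each frame the concentrations are sampled.

```python
# (number of frames, frame duration in seconds), as listed in the text
FRAME_SCHEDULE = [(6, 10), (3, 20), (3, 30), (4, 60), (3, 150), (9, 300)]


def frame_times(schedule):
    """Return frame start, end and mid times (seconds) for a schedule."""
    starts, ends = [], []
    t = 0
    for count, duration in schedule:
        for _ in range(count):
            starts.append(t)
            t += duration
            ends.append(t)
    # Mid-frame sampling is an assumption made for this sketch.
    mids = [(s + e) / 2 for s, e in zip(starts, ends)]
    return starts, ends, mids


starts, ends, mids = frame_times(FRAME_SCHEDULE)
print(len(starts), "frames,", ends[-1] / 60, "minutes in total")
```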
Once the noise-free dynamic PET images were obtained, we projected the images into the sinogram space by means of the Radon transform, and we added Poisson noise to the projected data through the Matlab function poissrnd. In this way, we created ten independent identically-distributed noisy data. In addition to the noise-free IF case, we considered two further instances where the IF was perturbed by two Gaussian noise levels: , for time points ti, i = 1,…, N, where r is randomly generated from a standard normal distribution of mean 0 and standard deviation 1, and c = 0.10, 0.20 (Figure 2(d)).
4.2. Setup of the algorithms
The parametric reconstruction by means of reg-AS-TR was performed as follows.
In order to remove blurring artifacts from the images of each dataset, we applied a well-known deblurring technique based on the minimization of the Kullback-Leibler divergence with a smooth total-variation regularization term referred to as hypersurface potential; this minimization is performed by means of the Scaled Gradient Projection (SGP) method proposed in [44] (see also [45]), starting from the inverse Radon transform of the noisy sinogram data. The deblurring procedure exploits the parallel toolbox of Matlab enabling the use of GPUarray and it requires about 7 minutes overall.
The stopping criterion of reg-AS-TR is condition (19), where ϵj = ║yδ – F(kj)║, τ1 is the sample standard deviation computed at the current pixel and τ2 is a multiple of τ1, which changes when the procedure switches between boundary pixels (τ2 = 10τ1) and inner pixels (τ2 = 3τ1) of a region. In addition, if condition (19) is not satisfied, the execution terminates when stagnation occurs or the maximum number of iterations is reached.
The stopping rule implemented makes it possible to diversify the initialization procedure of reg-AS-TR. In general, the initial vector is randomly chosen in an interval determined by a priori knowledge of the physiology. However, when the current pixel is strictly inside a functional region and some neighboring pixels have already been successfully processed, the initial value is the mean of the values obtained on these neighboring pixels.
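The initialization strategy just described can be sketched as follows. The function name, the flag marking interior pixels, and the bounds are hypothetical placeholders: the actual physiological intervals are not given in the text.

```python
import random

# Hypothetical physiological bounds for (k1, k2, k3, k4); illustrative only.
BOUNDS = [(0.0, 1.0), (0.0, 1.0), (0.0, 0.5), (0.0, 0.1)]


def initial_guess(neighbor_estimates, is_interior, bounds=BOUNDS, rng=random):
    """Pick the starting point for one pixel.

    neighbor_estimates: parameter vectors of neighboring pixels that were
    already processed successfully (possibly empty).
    is_interior: True when the pixel lies strictly inside a functional region.
    """
    if is_interior and neighbor_estimates:
        n = len(neighbor_estimates)
        # Component-wise mean over the available neighbors.
        return [
            sum(est[i] for est in neighbor_estimates) / n
            for i in range(len(bounds))
        ]
    # Otherwise draw each parameter uniformly within its interval.
    return [rng.uniform(lo, hi) for lo, hi in bounds]
```

Reusing neighboring estimates gives interior pixels a starting point that is already close to the regional solution, which matters for a method whose convergence properties are local.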
The reconstruction accuracy of reg-AS-TR has been assessed by comparison with both the ground truth and the parametric images provided by a recently introduced regularized Gauss-Newton method (reg-GN) [13]. For the sake of comparison, the setup of reg-GN is consistent with the one used in that paper, i.e.:
Deblurring. The noise on the PET datasets was reduced by applying a Gaussian smoothing filter (mean 0, standard deviation 1, window 3×3) directly to the noisy PET images.
Initialization. The starting point of the kinetic parameters was chosen randomly in intervals determined by knowledge on the physiology.
Stopping criterion. The iterative scheme is stopped when the relative error between the experimental dynamic concentration and the model-predicted one is less than an appropriate threshold, or the maximum number of iterations is reached.
4.3. Results
Figure 4, Figure 5, and Figure 6 show the mean images computed over the ten reconstructions obtained by the methods reg-AS-TR, reg-GN, and by the Matlab routine lsqcurvefit implementing a standard Trust-Region-Reflective least-squares algorithm [15, 46]. We used the noise-free IF and the perturbed IF with 10% and 20% of noise, respectively. Figure 7 contains mean and standard deviation values of the kinetic parameters computed over the ten reconstructions and over each one of the four homogeneous regions, for each one of the three noise levels on the IF.
Finally, Figure 8 represents the last frame of the dynamic PET data reconstructed with the mean parametric values returned by reg-AS-TR, reg-GN, and lsqcurvefit, with respect to the noise-free, 10%-noise, and 20%-noise IFs.
5. Comments and conclusions
In general, reg-AS-TR and lsqcurvefit seem to provide similar mean reconstructions, although the uncertainties associated with lsqcurvefit are significantly larger. On the other hand, reg-GN seems to systematically underestimate the parameter values within region 1. Furthermore, as expected, for all methods the quality of the parametric reconstructions deteriorates with increasing noise levels; this is clearer in the k3 and k4 parametric images, probably because of the different sensitivities of the data with respect to the model parameters [47]. In reg-GN and lsqcurvefit, some artifacts can be observed at the edges of the homogeneous regions, especially around region 1 and region 2, whereas the effect of regularization in reg-AS-TR results in fewer artifacts while the structure of the regions is preserved. This general trend is confirmed by the error-bar plots of Figure 7. Finally, the frames in Figure 8 corresponding to reg-AS-TR show a significant improvement in image quality with respect to the other two approaches.
The mean execution time for a single parametric reconstruction differs considerably between the reconstruction methods: reg-AS-TR requires about 20 minutes, reg-GN needs between 75 and 120 minutes, with the run time increasing with the noise level on the IF (as a consequence of the stopping criterion implemented), and Matlab lsqcurvefit takes about 90 minutes. Therefore, reg-AS-TR seems to be the most efficient approach in terms of both computational time and reconstruction accuracy.
The next steps of this research will be the validation of reg-AS-TR against several experimental datasets, for both human and small-animal dynamic PET images. Further, we are going to generalize reg-AS-TR to more complex compartmental models, such as the ones for the assessment of FDG kinetics in liver [26] and kidneys [27].
Appendix
In this Appendix, we provide the proof of Proposition 1.
Let pj† = kj – kj and , with given in (10). If , we return back to the unconstrained case, for which Proposition 2.1 in [14] holds. Let assume for some j that
Let . From the properties of the projection operator [48, Proposition 2.1.3], we have in particular, Eq. (A.1) holds for k† ≥ 0 and, therefore, we obtain
Setting and , we can write the previous inequality as follows:
From the identity , the definition , the previous inequality and t ∈ (0,1), we have
Now, we recall that, in view of positive definiteness of the matrix (J(kj)T J(kj) + αjIn), the following matrix identities hold:
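The identities in question are presumably of the push-through type, which link the n × n matrix appearing in pj with the N × N matrix appearing in vj. The sketch below states them under the assumption αj > 0, writing J for J(kj) and rj = yδ − F(kj); it may differ in form from the original display.

```latex
\left( J^{T} J + \alpha_j I_n \right)^{-1} J^{T}
  = J^{T} \left( J J^{T} + \alpha_j I_N \right)^{-1} ,

I_N - J \left( J^{T} J + \alpha_j I_n \right)^{-1} J^{T}
  = \alpha_j \left( J J^{T} + \alpha_j I_N \right)^{-1} .

% Consequently, with v_j = (J J^T + \alpha_j I_N)^{-1} r_j :
p_j = J^{T} v_j , \qquad r_j - J\, p_j = \alpha_j\, v_j .
```

The first identity follows from Jᵀ(JJᵀ + αj IN) = (JᵀJ + αj In)Jᵀ; the second follows by substituting the first into the left-hand side and simplifying.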
As a consequence, setting rj = yδ – F(kj), we can write
In view of inequality (A.2), the definition of pj and the above identities, we have where the last inequality follows from the Cauchy-Schwarz inequality. Then, the q-condition and the assumption (17) yields
Footnotes
E-mail: serena.crisci{at}unife.it