Robust atrophy rate measurement in Alzheimer's disease using multi-site serial MRI: Tissue-specific intensity normalization and parameter selection
Introduction
Large multi-site clinical studies provide a powerful way to understand diseases and their treatments. In recent years, neuroimaging outcomes have increasingly been incorporated into such studies (Horn and Toga, 2009; Barkhof et al., 2009). However, information is often lacking about the robustness and variability of these outcomes in a multi-site setting. The Alzheimer's Disease Neuroimaging Initiative (ADNI) was established partly to address this issue. ADNI included subjects from over 50 sites across the U.S. and Canada, and its aims include testing the ability of serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological and imaging markers, and clinical and neuropsychological assessments to measure progression in mild cognitive impairment (MCI) and early Alzheimer's disease (AD) (Mueller et al., 2005).
The use of images from different sites and scanners brings particular challenges for image analysis algorithms, with the potential to lose sensitivity and introduce systematic errors (Stonnington et al., 2008). Increased variability in the outcome measure leads to a corresponding loss of power to detect treatment effects. For longitudinal studies the stability of image acquisition is critical but may be compromised in several ways. For MRI, variability in the outcome may be affected by: (1) image intensity variation due to subject-specific noise, noise in the electronics, and imaging gradient non-linearities (Sled et al., 1998; Lewis and Fox, 2004); (2) variability in distortion fields due to differences in subject positioning (Jovicich et al., 2006); (3) voxel size variation due to drift in the strength of the applied read-out gradient (i.e. calibration drift) (Clarkson et al., 2009); (4) imaging protocol differences between scanners and between baseline and repeat scans (due to scanner hardware and software changes during the study) (Preboske et al., 2006); and (5) differences in scanner calibration and quality control procedures (Whitwell et al., 2004). Although much effort has been put into addressing these problems, e.g. intensity inhomogeneity correction (Sled et al., 1998), distortion field correction (Jovicich et al., 2006), and voxel size correction based on a geometric phantom (Gunter et al., 2006) or image registration (Clarkson et al., 2009), intensity and geometric distortion artifacts and contrast differences still exist in the images. These errors interact in a complex manner and affect the results from different image analysis algorithms in a large multi-site clinical study. Images are often reviewed by expert raters as part of the quality control in clinical studies, so that those with unacceptable errors or artifacts can be excluded from subsequent analysis. However, the exclusion of images (and hence subjects) decreases the statistical power of the study and, more importantly, may introduce bias if the outcome values for the excluded images differ systematically from those included.
The aim of this paper is to increase the robustness and reproducibility of brain atrophy measurement in multi-site image studies. The boundary shift integral (BSI) is a semi-automated measure of regional and global cerebral atrophy rates from serial MRI which uses intra-subject image registration to give higher precision than is typically possible with manual measures (Freeborough and Fox, 1997). The BSI has been used to assess atrophy progression in clinical trials in AD (Fox et al., 2005), and in a number of natural history studies in a range of neurological disorders, including AD (Ridha et al., 2006; Freeborough and Fox, 1997), frontotemporal dementia (Chan et al., 2001), multiple sclerosis (Anderson et al., 2007) and Huntington's disease (Henley et al., 2006). The BSI estimates the changes in cerebral volume using differences in voxel intensities between two serial MRI volume scans at the boundary region of the brain. In order to accurately measure brain atrophy using BSI, the intensity of the same tissue in the baseline and repeat scans should be as similar as possible. The classic BSI technique employs intensity normalization between baseline and repeat images by dividing the intensity on each scan by the mean intensity of the interior region of the brain (consisting mainly of white matter). Where there is the possibility of tissue contrast changes over time, this is not an ideal approach because it does not take into account the intensity changes of individual tissue types in the brain, namely cerebrospinal fluid (CSF), gray matter (GM) and white matter (WM), relative to each other. Furthermore, an intensity window parameter must be chosen in the calculation of BSI, in order to correctly capture the intensity transitions associated with the brain boundary. The optimal value is largely dependent on the signal-to-noise ratio (SNR) and the image intensity of CSF and GM. Existing protocols make use of a single BSI intensity window for all the images from all the imaging sites; however, images acquired from different sites may have different tissue contrasts and SNRs, and hence different optimal BSI intensity windows. Ideally the choice of that optimal window should be automated and unbiased, and based upon the intrinsic tissue contrast and SNR in the image pairs of a particular subject produced by a particular scanner and acquisition protocol.
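To make the role of the intensity window concrete, the following minimal Python sketch computes a boundary shift integral on a flattened pair of registered, already normalized scans. It follows the clipped-difference form of the BSI given by Freeborough and Fox (1997). The function names, the flat-list data layout, and the toy intensities and window are assumptions made for this illustration and are not the processing pipeline used in the study.

```python
def clip(value, low, high):
    """Clamp an intensity into the window [low, high]."""
    return max(low, min(high, value))


def boundary_shift_integral(baseline, repeat, boundary_mask, window, voxel_volume=1.0):
    """Estimate volume loss between two registered, intensity-normalized scans.

    baseline, repeat: flat sequences of normalized voxel intensities.
    boundary_mask:    flat sequence of booleans marking brain-boundary voxels.
    window:           (i_low, i_high), chosen to span the CSF-to-GM transition.
    voxel_volume:     volume of one voxel (for example in mm^3).
    A positive result means the brain boundary has moved inwards (tissue loss),
    assuming brain tissue is brighter than CSF.
    """
    i_low, i_high = window
    if i_high <= i_low:
        raise ValueError("window upper bound must exceed lower bound")
    total = 0.0
    for b, r, on_boundary in zip(baseline, repeat, boundary_mask):
        if on_boundary:
            # Only intensity changes inside the window contribute.
            total += clip(b, i_low, i_high) - clip(r, i_low, i_high)
    return voxel_volume * total / (i_high - i_low)


# Hypothetical 1-D intensity profile across the brain edge (CSF ~0.2, GM ~0.7).
# In the repeat scan the edge has moved inwards by exactly one voxel.
baseline_profile = [0.2, 0.2, 0.7, 0.7, 0.7]
repeat_profile = [0.2, 0.2, 0.2, 0.7, 0.7]
mask = [True] * 5
loss = boundary_shift_integral(baseline_profile, repeat_profile, mask, (0.3, 0.6))
print(loss)  # 1.0, i.e. one voxel volume
```

If the window were placed outside the CSF-to-GM transition (for example entirely above the GM intensity), the clipped differences would vanish and the shift would go undetected, which is why the window needs to track the tissue contrast of each image pair.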
Few papers have explicitly addressed the problems of MR image intensity normalization and standardization. Nyúl and Udupa used a two-step approach to standardize MR image intensity to a standard intensity scale, so that specific tissue types have a similar intensity (Nyúl and Udupa, 1999). The first step (‘training step’) involved finding the parameters of the standardizing transform from a set of images. The second step (‘transformation step’) applied the learnt parameters to transform the intensity of a new image into the standardized histogram. Madabhushi and Udupa later used scale-space concepts to accurately identify principal regions used for the training step (Madabhushi and Udupa, 2006). Christensen reported the use of even-ordered derivatives of the image histogram to determine a single global scaling factor between two images (Christensen, 2003). The model of a single global scaling factor is the same as the model of intensity normalization in the classic-BSI. Weisenfeld and Warfield proposed the use of Kullback-Leibler divergence to match the intensity distribution of two images (Weisenfeld and Warfield, 2004). Since disease progression in AD will cause changes in the histogram model (changes in the relative heights and spread of the CSF/GM/WM peaks) in the repeat image, the method proposed by Weisenfeld and Warfield may introduce bias in the BSI.
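The two-step idea of training followed by transformation can be sketched in a few lines of Python. This is a simplified, landmark-based illustration in the spirit of Nyúl and Udupa (1999), not their published algorithm: the choice of percentile landmarks, the standard scale of 0 to 100, and all function names are assumptions made for the example, and the landmarks of each image are assumed to be distinct.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def landmarks(image, percentiles):
    """Histogram landmarks of one image at the requested percentiles."""
    ordered = sorted(image)
    return [percentile(ordered, q) for q in percentiles]


def train_standard_scale(images, percentiles, scale=(0.0, 100.0)):
    """Training step: map each image's first and last landmark onto the ends
    of the standard scale, then average the landmark positions over images."""
    s_min, s_max = scale
    sums = [0.0] * len(percentiles)
    for image in images:
        marks = landmarks(image, percentiles)
        span = marks[-1] - marks[0]
        for k, m in enumerate(marks):
            sums[k] += s_min + (m - marks[0]) * (s_max - s_min) / span
    return [s / len(images) for s in sums]


def standardize(image, percentiles, standard_marks):
    """Transformation step: piecewise-linear map from the image's own
    landmarks onto the learnt standard landmarks. Values outside the outer
    landmarks are extrapolated along the first or last segment."""
    marks = landmarks(image, percentiles)
    out = []
    for v in image:
        k = 0
        while k < len(marks) - 2 and v > marks[k + 1]:
            k += 1
        slope = (standard_marks[k + 1] - standard_marks[k]) / (marks[k + 1] - marks[k])
        out.append(standard_marks[k] + (v - marks[k]) * slope)
    return out


# Hypothetical data: the second "scan" is the first with a different gain and offset.
scan_a = [float(v) for v in range(101)]
scan_b = [2.0 * v + 10.0 for v in scan_a]
pcts = [1, 50, 99]
standard = train_standard_scale([scan_a, scan_b], pcts)
std_a = standardize(scan_a, pcts, standard)
std_b = standardize(scan_b, pcts, standard)
```

In this toy case the two standardized scans coincide because they differ only by a linear intensity change. When the histogram shape itself changes between time points, as is expected with atrophy, histogram-driven landmarks move with the disease, which is the source of the potential bias noted above.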
Many image processing algorithms have a set of customizable parameters to allow the users to adapt the algorithms to specific problems (e.g. biological and image quality variability) (Fennema-Notestine et al., 2006; Popovic et al., 2006). However, in a clinical trial setting, it is desirable that the image analysis is standardized (in terms of procedures and parameters), repeatable and reproducible (in terms of small intra-rater and inter-rater variabilities) (Schuster, 2007), and increasingly, regulations require that the procedure for choosing parameters be defined in advance for the trial.
In this paper, we describe two improvements for the BSI that address differences in tissue contrast and SNR over time and between scanners, namely robust intensity normalization and automatic parameter selection based on the intrinsic tissue contrast of the MR images. The aim thereby was to increase the robustness and reproducibility of the BSI in multi-site image studies. We used the large ADNI dataset to assess whether, and by how much, these modifications may reduce variability in measurements of atrophy rates and consequently reduce estimated sample sizes for a randomized trial of a putative disease-modification therapy for AD.
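The two improvements can be illustrated with a small Python sketch. The details are not given in this excerpt, so everything below is an assumption made for illustration: tissue intensities are estimated with a plain 1-D k-means into three clusters (taken to be CSF, GM and WM in order of increasing brightness, as on T1-weighted images), the repeat scan is mapped onto the baseline by a least-squares line through the three pairs of tissue means, and the window is placed one standard deviation inside the CSF and GM means. All function names and sample intensities are hypothetical.

```python
import math


def tissue_stats(values, k=3, iterations=50):
    """Plain 1-D k-means; returns [(mean, sd), ...] sorted from darkest cluster."""
    ordered = sorted(values)
    # Initialise the centres at evenly spaced quantiles of the intensities.
    centres = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
        if updated == centres:
            break
        centres = updated
    stats = []
    for g, c in zip(groups, centres):
        mean = sum(g) / len(g) if g else c
        sd = math.sqrt(sum((v - mean) ** 2 for v in g) / len(g)) if g else 0.0
        stats.append((mean, sd))
    return sorted(stats)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def normalize_repeat(baseline, repeat):
    """Map repeat intensities onto the baseline scale using tissue means."""
    base_means = [m for m, _ in tissue_stats(baseline)]
    rep_means = [m for m, _ in tissue_stats(repeat)]
    slope, intercept = fit_line(rep_means, base_means)
    return [slope * v + intercept for v in repeat]


def choose_window(scan):
    """Assumed rule: window from (CSF mean + sd) to (GM mean - sd)."""
    (csf_mean, csf_sd), (gm_mean, gm_sd), _ = tissue_stats(scan)
    return csf_mean + csf_sd, gm_mean - gm_sd


# Hypothetical brain-region intensities: CSF ~20, GM ~60, WM ~100 at baseline;
# the repeat scan has a different gain and offset.
baseline_scan = [18.0, 20.0, 22.0, 58.0, 60.0, 62.0, 98.0, 100.0, 102.0]
repeat_scan = [1.5 * v + 5.0 for v in baseline_scan]
normalized_repeat = normalize_repeat(baseline_scan, repeat_scan)
window = choose_window(baseline_scan)
```

Unlike division by a single interior-brain mean, a fit through several tissue means can absorb an offset as well as a gain change between scans, and a window derived from each image pair adapts to that pair's contrast and noise instead of relying on one fixed value for all sites.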
Methods and materials
In this section, we describe the image data, the method of computing BSI based on normalization using interior brain regions and manual selection of intensity window (referred to as ‘classic-BSI’), the improved method of computing BSI (referred to as ‘KN-BSI’), and the methods of comparison between classic-BSI and KN-BSI.
Qualitative analysis
After reviewing the 341 normalized image pairs following standard image registration and intensity normalization (classic-BSI image processing procedures), 289 (120 AD, 169 controls) image pairs (85%) were found to have image quality scores 1–3, and 52 (21 AD, 31 controls) image pairs (15%) were found to have image quality score 4. The percentages of images with quality score 4 were similar in AD subjects and controls (15% AD vs 16% controls). Fig. 2 shows an example of the intensity
Conclusions and discussion
We have described a method of brain atrophy measurement from serial MR imaging that addresses the problem of differences in tissue contrast and SNR over time and between scanners. The method involves tissue-specific intensity normalization to improve consistency over time, and automated BSI parameter selection based on image-specific brain boundary contrast to improve consistency between scanners. The method was applied to over 300 baseline and 1-year volumetric MR image pairs acquired in a
Acknowledgments
The authors would like to thank Josephine Barnes at the Dementia Research Centre, and Derek L.G. Hill and David M. Cash at IXICO for helpful discussions. We would also like to thank all the image analysts (Melanie Blair, Magda Sokolska, Elizabeth Gordon, Raivo Kittus, Laila Ahsan, Kate MacDonald) and the research associates (Casper Nielsen and Ian Malone) in the Dementia Research Centre for their help in the study. The implementation of KN-BSI uses the Insight Segmentation and Registration
References (35)
Normalization of brain magnetic resonance images using histogram even-order derivative analysis. Magn. Reson. Imaging (Sep 2003)
Comparison of phantom and registration scaling corrections using the ADNI cohort. Neuroimage (Oct 2009)
Interactive algorithms for the segmentation and quantitation of 3-D MRI brain scans. Comput. Methods Programs Biomed. (May 1997)
Reliability in multi-site structural MRI studies: effects of gradient non-linearity correction on phantom and human data. Neuroimage (Apr 2006)
Correction of differential intensity inhomogeneity in longitudinal MR images. Neuroimage (Sep 2004)
The Alzheimer's disease neuroimaging initiative. Neuroimaging Clin. N. Am. (Nov 2005)
Compensation for surface coil sensitivity variation in magnetic resonance imaging. Magn. Reson. Imaging (1988)
Common MRI acquisition non-idealities significantly impact the output of the boundary shift integral method of measuring brain atrophy on serial MRI. Neuroimage (May 2006)
Tracking atrophy progression in familial Alzheimer's disease: a serial MRI study. Lancet Neurol. (Oct 2006)
Interpreting scan data acquired from multiple scanners: a study with Alzheimer's disease. Neuroimage (Feb 2008)
Using nine degrees-of-freedom registration to correct for changes in voxel size in serial MRI studies. Magn. Reson. Imaging
Cerebral atrophy measurement in clinically isolated syndromes and relapsing remitting multiple sclerosis: a comparison of registration-based methods. J. Neuroimaging
Imaging outcomes for neuroprotection and repair in multiple sclerosis trials. Nat. Rev. Neurol.
Rates of global and regional cerebral atrophy in AD and frontotemporal dementia. Neurology
Volume changes in Alzheimer's disease and mild cognitive impairment: cognitive associations. Eur. Radiol.
Quantitative evaluation of automated skull-stripping methods applied to contemporary and legacy images: effects of diagnosis, bias correction, and slice location. Hum. Brain Mapp.
Effects of Aβ immunization (AN1792) on MRI measures of cerebral volume in Alzheimer disease. Neurology
1. Equal senior author.
2. Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (www.loni.ucla.edu/ADNI). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. ADNI investigators included (complete listing available at www.loni.ucla.edu/ADNI/Collaboration/ADNI_Citatation.shtml).