DPABI Harmonization: A Toolbox for Harmonizing Multi-site Brain Imaging for Big-data Era

Submitted by wanghanlin on

DPABI Harmonization is an open-source, user-friendly toolkit developed to address the challenge of integrating multi-site neuroimaging data in the pursuit of validity and reproducibility in research. It is designed to harmonize resting-state functional Magnetic Resonance Imaging (rs-fMRI) data across different centers, accounting for site-specific variances that can hinder the aggregation of large, diverse datasets.

DPABI Harmonization was developed with MATLAB 2020a (The MathWorks Inc., Natick, MA, US) (RRID: SCR_001622), Statistical Parametric Mapping (Ashburner, 2012) (RRID: SCR_007037), and DPABI (Yan et al., 2016) (RRID: SCR_010501), and uses Docker (https://docker.com) for the Python-based ICVAE. MATLAB 2020a or later is recommended. The required MATLAB toolboxes are the Parallel Computing Toolbox, Statistics and Machine Learning Toolbox, Optimization Toolbox, and Control System Toolbox. To use it, install DPABI or clone it from GitHub: https://github.com/Chaogan-Yan/DPABI.

 

DPABI Harmonization integrates a range of techniques, including the state-of-the-art Subsampling Maximum-mean-distance Algorithm (SMA, recommended), ComBat/CovBat, linear models, and the invariant conditional variational auto-encoder (ICVAE). It gives neuroscientists an easy-to-use and transparent harmonization workflow that keeps post-hoc analysis of multi-site studies feasible. Harmonizing a multi-site dataset takes two steps: load the dataset, making sure that site information is included, then choose the method that best fits your goals. Additional features are available to improve the efficiency, accuracy, and reproducibility of the harmonization process.
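To make the idea of a site effect concrete, here is a minimal sketch of the simplest case a linear model can handle: an additive offset per site, with no covariates. It is not the DPABI implementation, and it is far simpler than SMA, ComBat/CovBat, or ICVAE. The function name `remove_site_means` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import fmean


def remove_site_means(values, sites):
    """Shift each site's values so that every site shares the grand mean.

    values: one feature value per subject (hypothetical example data).
    sites:  the site label of each subject, in the same order as values.
    """
    if len(values) != len(sites):
        raise ValueError("values and sites must have the same length")

    grand_mean = fmean(values)

    # Group values by site to estimate each site's additive offset.
    by_site = defaultdict(list)
    for value, site in zip(values, sites):
        by_site[site].append(value)
    site_mean = {site: fmean(vals) for site, vals in by_site.items()}

    # Subtract the site mean, then restore the overall level.
    return [v - site_mean[s] + grand_mean for v, s in zip(values, sites)]


# Hypothetical toy data: site "B" reads 2 units higher than site "A".
values = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0]
sites = ["A", "A", "A", "B", "B", "B"]
harmonized = remove_site_means(values, sites)
print(harmonized)
```

This sketch ignores covariates such as age or diagnosis and any difference in variance between sites, so it would also remove real biological differences if the sites recruited different populations. That limitation is one reason to prefer the more elaborate methods listed above.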

 

It is recommended to watch the following two course videos before you start using DPABI Harmonization:

1.  Comprehensive Evaluation of Harmonization on Functional Brain Imaging for Multisite Data-fusion (https://d.rnet.co/Course/DPABIHarmonization/DPABIHarmonization.mp4)


This video is about 19 minutes long. In it, Dr. Yu-Wei Wang introduces a comprehensive evaluation of harmonization methods in brain imaging (Wang et al., 2023), which is the basis for the development of DPABI Harmonization introduced in the next video. The presenter first explains what the site effect is and then presents the study itself. The evaluation covers two aspects: site effect removal and biological information retention. The specific evaluation indicators include individual identification rate, test-retest reliability, and replicability, among others. Considering all aspects of performance, the Subsampling Maximum-mean-distance Algorithm (SMA) was the best-performing harmonization method.

 

2.  DPABI Harmonization module: Functionalities and Practice (https://d.rnet.co/Course/DPABIHarmonization/DPABIHarmonizationPractice.mp4)
 

This video is about 40 minutes long. In the first 14 minutes, Dr. Yu-Wei Wang gives a detailed introduction to the modules of the DPABI Harmonization toolbox GUI, the overall workflow, and some important notes. The rest of the video demonstrates the software in actual use. After watching this video, you will understand how to organize input files for DPABI Harmonization, which parameters to set for each harmonization method, and what output you will obtain.
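The first video separates two evaluation goals: site effect removal and biological information retention. As a rough illustration of the first goal only, the sketch below computes the share of a feature's variance that is explained by site membership (eta-squared from a one-way layout). This is an assumption made for illustration; it is not one of the indicators reported in Wang et al. (2023), and the function name `site_variance_share` and the toy numbers are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def site_variance_share(values, sites):
    """Return between-site sum of squares divided by total sum of squares."""
    if len(values) != len(sites):
        raise ValueError("values and sites must have the same length")

    grand_mean = fmean(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # Constant feature: nothing for site to explain.

    # Collect the values observed at each site.
    by_site = defaultdict(list)
    for value, site in zip(values, sites):
        by_site[site].append(value)

    # Weight each site's squared mean offset by its sample size.
    between_ss = sum(
        len(vals) * (fmean(vals) - grand_mean) ** 2
        for vals in by_site.values()
    )
    return between_ss / total_ss


# Hypothetical toy data for one feature before and after harmonization.
sites = ["A", "A", "A", "B", "B", "B"]
before = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0]
after = [2.0, 3.0, 4.0, 2.0, 3.0, 4.0]

share_before = site_variance_share(before, sites)
share_after = site_variance_share(after, sites)
print(share_before, share_after)
```

A value near zero after harmonization suggests that mean differences between sites are gone, but it says nothing about whether biological information was retained, which is why the evaluation in the video also relies on indicators such as test-retest reliability.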

 

References:

Ashburner, J. (2012). SPM: a history. NeuroImage, 62(2), 791-800. https://doi.org/10.1016/j.neuroimage.2011.10.025 

Chen, A. A., Beer, J. C., Tustison, N. J., Cook, P. A., Shinohara, R. T., Shou, H., & Alzheimer’s Disease Neuroimaging Initiative (2022). Mitigating site effects in covariance for machine learning in neuroimaging data. Human Brain Mapping, 43(4), 1179-1195. (CovBat)

Fortin, J.-P., Cullen, N., Sheline, Y. I., Taylor, W. D., Aselcioglu, I., Cook, P. A., Adams, P., Cooper, C., Fava, M., & McGrath, P. J. (2018). Harmonization of cortical thickness measurements across scanners and sites. NeuroImage, 167, 104-120. (ComBat)

Fortin, J.-P., Parker, D., Tunç, B., Watanabe, T., Elliott, M. A., Ruparel, K., Roalf, D. R., Satterthwaite, T. D., Gur, R. C., & Gur, R. E. (2017). Harmonization of multi-site diffusion tensor imaging data. NeuroImage, 161, 149-170. (ComBat)

Johnson, W. E., Li, C., & Rabinovic, A. (2007). Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics, 8(1), 118-127. (ComBat)

Moyer, D., Ver Steeg, G., Tax, C. M., & Thompson, P. M. (2020). Scanner invariant representations for diffusion MRI harmonization. Magnetic Resonance in Medicine, 84(4), 2174-2189. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7384065/pdf/MRM-84-2174.pdf (ICVAE)

Wang, Y. W., Chen, X., & Yan, C. G. (2023). Comprehensive evaluation of harmonization on functional brain imaging for multisite data-fusion. NeuroImage, 274, 120089. https://doi.org/10.1016/j.neuroimage.2023.120089

Yan, C. G., Wang, X. D., Zuo, X. N., & Zang, Y. F. (2016). DPABI: Data Processing & Analysis for (Resting-State) Brain Imaging. Neuroinformatics, 14(3), 339-351. https://doi.org/10.1007/s12021-016-9299-4

Zhou, H. H., Singh, V., Johnson, S. C., Wahba, G., & Alzheimer’s Disease Neuroimaging Initiative (2018). Statistical tests and identifiability conditions for pooling and analyzing multisite datasets. Proceedings of the National Academy of Sciences of the United States of America, 115(7), 1481-1486. (SMA)