BrainImageNet Dataset

Submitted by YAN Chao-Gan on


This dataset was used to pretrain brain MRI-based sex classifier models and to construct brain disorder classifiers with high generalizability via transfer learning (Lu et al., 2022. A practical Alzheimer’s disease classifier via brain imaging-based deep learning on 85,721 samples. Journal of Big Data).


The data was shared through the R-fMRI Maps Project (RMP).

1. Preprocessed brain imaging data for which the raw data owners permit sharing of derivatives are available through FTP:

Please use FTP software (Host:, Username: ftpdownload, Password: FTPDownload, Path: /sharing/RfMRIMaps/PaperDataSharing/Lu_2022_BrainImageNetData/).

N = 19552. Sex.csv: 1 -- Male; 0 -- Female.

Please sign the Data Use Agreement and email a scanned copy of the signed agreement to obtain the unzip password.

2. ABCD Data. After discussion with the ABCD coordinator, we will upload the derived data to NDA in the form of an NDA Study. We are still working on the procedure and will release a link once the data are uploaded.

3. UK Biobank Data. The UK Biobank coordinator asked us to return the derived data to UK Biobank, which could potentially make them available on request in the folder of the return. We are still working on the procedure and will release a link once it is available.

4. ADNI Data. We are still working with the ADNI coordinator to find a feasible way to re-share the derivatives.
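
For the FTP-shared subset in item 1, the label coding in Sex.csv (1 -- Male; 0 -- Female) can be decoded with a few lines of Python. This is a minimal sketch, not part of the released code: the column layout of Sex.csv is not described here, so the function assumes the code sits in the last column of each row and skips any row whose value is not 0 or 1 (for example a header). The in-memory demo data are hypothetical.

```python
import csv
import io
from collections import Counter

# Coding stated in the dataset description: 1 -- Male; 0 -- Female.
SEX_LABELS = {"1": "Male", "0": "Female"}


def read_sex_labels(handle, column=-1):
    """Return a list of "Male"/"Female" labels from an open CSV file.

    Assumption: the sex code is in `column` (default: last column).
    Rows whose value is not "0" or "1" (e.g. a header) are skipped.
    """
    labels = []
    for row in csv.reader(handle):
        if not row:
            continue
        code = row[column].strip()
        if code in SEX_LABELS:
            labels.append(SEX_LABELS[code])
    return labels


# Tiny in-memory stand-in for Sex.csv (hypothetical contents).
demo = io.StringIO("Sex\n1\n0\n0\n1\n1\n")
labels = read_sex_labels(demo)
counts = Counter(labels)
print(counts["Male"], counts["Female"])
```

With the real file, replace the `io.StringIO` object with `open("Sex.csv", newline="")` and adjust `column` if the layout differs from the assumption.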

Investigators and Affiliations

Bin Lu1,2, Hui-Xian Li1,2, Zhi-Kai Chang1,2, Le Li3, Ning-Xuan Chen1,2, Zhi-Chen Zhu1,2, Hui-Xia Zhou2,4, Xue-Ying Li1,5,6, Yu-Wei Wang1,2, Shi-Xian Cui1,5,6, Zhao-Yu Deng1,2, Zhen Fan7, Hong Yang8, Xiao Chen1,2, Paul M. Thompson9, Francisco Xavier Castellanos10,11, Chao-Gan Yan1,2,12,13*, for the Alzheimer’s Disease Neuroimaging Initiative**


1CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing, China; 2Department of Psychology, University of Chinese Academy of Sciences, Beijing, China; 3Center for Cognitive Science of Language, Beijing Language and Culture University, Beijing, China; 4CAS Key Laboratory of Mental Health, Institute of Psychology, Beijing, China; 5Sino-Danish College, University of Chinese Academy of Science, Beijing, China; 6Sino-Danish Center for Education and Research, Beijing, China; 7Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China; 8Department of Radiology, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, Zhejiang, China; 9Imaging Genetics Center, Mark & Mary Stevens Institute for Neuroimaging & Informatics, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA; 10Department of Child and Adolescent Psychiatry, NYU Grossman School of Medicine, New York, NY, USA; 11Nathan Kline Institute for Psychiatric Research, Orangeburg, NY, USA; 12International Big-Data Center for Depression Research, Institute of Psychology, Chinese Academy of Sciences, Beijing, China; 13Magnetic Resonance Imaging Research Center, Institute of Psychology, Chinese Academy of Sciences, Beijing, China.

Data Sharing Coordinator

Zi-Han Wang


We would like to thank Ms. Zi-Han Wang for assistance in data organizing and anonymization. Data used in the preparation of this article for training and testing the sex classifier was obtained from the Adolescent Brain Cognitive Development (ABCD) Study (, held in the NIMH Data Archive (NDA). This is a multisite, longitudinal study designed to recruit more than 10,000 children ages 9-10 and follow them over 10 years into early adulthood. The ABCD Study is supported by the National Institutes of Health and additional federal partners under award numbers U01DA041048, U01DA050989, U01DA051016, U01DA041022, U01DA051018, U01DA051037, U01DA050987, U01DA041174, U01DA041106, U01DA041117, U01DA041028, U01DA041134, U01DA050988, U01DA051039, U01DA041156, U01DA041025, U01DA041120, U01DA051038, U01DA041148, U01DA041093, U01DA041089. A full list of supporters is available at A listing of participating sites and a complete listing of the study investigators can be found at ABCD consortium investigators designed and implemented the study and/or provided data but did not necessarily participate in analysis or writing of this report. This manuscript reflects the views of the authors and may not reflect the opinions or views of the NIH or ABCD consortium investigators. This research has been conducted using the UK Biobank Resource. Data collection and sharing for the training and testing the sex and AD classifiers were funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). 
ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health ( The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. Data used in the preparation of this article were obtained from the MIRIAD database. The MIRIAD investigators did not participate in analysis or writing of this report. The MIRIAD dataset is made available through the support of the UK Alzheimer's Society (Grant RF116). The original data collection was funded through an unrestricted educational grant from GlaxoSmithKline (Grant 6GKC). 



This work was supported by the Sci-Tech Innovation 2030 - Major Project of Brain Science and Brain-inspired Intelligence Technology (grant number: 2021ZD0200600), the National Key R&D Program of China (grant number: 2017YFC1309902), the National Natural Science Foundation of China (grant numbers: 82122035, 81671774, 81630031), the 13th Five-year Informatization Plan of Chinese Academy of Sciences (grant number: XXH13505), the Key Research Program of the Chinese Academy of Sciences (grant number: ZDBS-SSW-JSC006), the Beijing Nova Program of Science and Technology (grant number: Z191100001119104), and the Scientific Foundation of Institute of Psychology, Chinese Academy of Sciences (grant number: E2CX4425YZ).


Publication Related to This Dataset

The following publication includes the data shared in this data collection:

Lu, B., Li, H. X., Chang, Z. K., Li, L., Chen, N. X., Zhu, Z. C., ... & Alzheimer’s Disease Neuroimaging Initiative. (2022). A practical Alzheimer’s disease classifier via brain imaging-based deep learning on 85,721 samples. Journal of Big Data, In Press.


Sample Size

Total: 85,911 (42,691 females). A few datasets will be uploaded later, once approval has been obtained from the administrators of the raw datasets.

Exclusion criteria: Images for which brain segmentation and spatial normalization could not be completed were excluded.


Image Acquisition

We submitted data access applications to nearly all the open-access brain imaging data archives and received permissions from the administrators of 34 datasets. Deidentified data were contributed from datasets collected with approvals from local Institutional Review Boards. The reanalysis of these data was approved by the Institutional Review Board of Institute of Psychology, Chinese Academy of Sciences. All participants had provided written informed consent at their local institution. For participants with multiple sessions of structural images, each image was considered an independent sample for data augmentation in model training. We therefore recommend cross-dataset validation when training models on these data, because allocating different scans from the same person to the training and testing sets may artifactually inflate model performance.
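
One way to follow the cross-dataset recommendation is a leave-one-dataset-out split, in which every fold holds out all samples from one source dataset. The sketch below is illustrative only and is not the authors' training code. It assumes each sample carries a dataset identifier and that a participant appears in only one dataset; the sample table, subject IDs, and dataset names are hypothetical.

```python
from collections import defaultdict


def leave_one_dataset_out(dataset_ids):
    """Yield (held_out, train_idx, test_idx), holding out one dataset per fold."""
    by_dataset = defaultdict(list)
    for idx, ds in enumerate(dataset_ids):
        by_dataset[ds].append(idx)
    for held_out in sorted(by_dataset):
        test_idx = by_dataset[held_out]
        train_idx = sorted(
            i for ds, idxs in by_dataset.items() if ds != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical sample table: (subject ID, source dataset).
# Subjects s1 and s4 each contribute two scans.
samples = [
    ("s1", "A"), ("s1", "A"), ("s2", "A"),
    ("s3", "B"),
    ("s4", "C"), ("s4", "C"),
]
dataset_ids = [ds for _, ds in samples]
folds = list(leave_one_dataset_out(dataset_ids))

for held_out, train_idx, test_idx in folds:
    train_subjects = {samples[i][0] for i in train_idx}
    test_subjects = {samples[i][0] for i in test_idx}
    # Repeated scans of one person never straddle the train/test boundary.
    assert train_subjects.isdisjoint(test_subjects)
```

Because the split is made at the dataset level, all scans of a given participant land on the same side of each fold, provided the one-dataset-per-participant assumption holds.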


MRI Preprocessing

The T1-weighted brain MRI images were segmented and normalized to acquire grey matter density (GMD) and grey matter volume (GMV) maps. Specifically, we used the voxel-based morphometry (VBM) analysis module within the Data Processing Assistant for Resting-State fMRI (DPARSF), which is based on SPM, to segment individual T1-weighted images into grey matter, white matter, and cerebrospinal fluid (CSF). The segmented images were then transformed from individual native space to MNI-152 space (a coordinate system created by the Montreal Neurological Institute) using the Diffeomorphic Anatomical Registration Through Exponentiated Lie algebra (DARTEL) tool. Two voxel-based structural metrics, GMD and GMV, were derived from this procedure. GMD is the unmodulated tissue segmentation map in MNI space. GMV is the modulated map, calculated by multiplying each voxel value in GMD by the Jacobian determinant derived from the spatial normalization step.
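
The relation between the two metrics can be written voxel-wise. The notation is introduced here for illustration and does not come from the original description: $x$ indexes a voxel in MNI space and $J(x)$ is the Jacobian determinant of the normalization deformation at that voxel.

```latex
\mathrm{GMV}(x) = \mathrm{GMD}(x) \cdot J(x)
```

Under this reading, a voxel has GMV equal to GMD where the Jacobian determinant is 1, and the two maps differ wherever the determinant departs from 1.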


Code Availability

The code for training and testing the model are openly shared at Demonstration website for classifying sex and AD is available at


