Public datasets hosted by DBIC
Recently, the worldwide neuroscience community has seen a rise in data-sharing efforts aimed at increasing the transparency and reproducibility of experimental findings. As a result, many collated and annotated functional imaging datasets have become publicly available. Access to such resources gives investigators and students an opportunity to explore and test models of brain function and cognition. However, many datasets are very large and can be challenging for individuals to download and store. To facilitate access and promote the use of publicly available data resources at DBIC, we host several datasets on the Dartmouth Research Computing file system, where they are readily available to DBIC users for analysis on the Discovery cluster.
List of hosted datasets
Contact
- For questions about existing datasets or requests to add new datasets, please contact Jamie Ford (James.C.Ford@dartmouth.edu)
Dartmouth Brain Imaging Center (DBIC) QA dataset
The Dartmouth Brain Imaging Center's QA data, converted to BIDS standard, can be accessed using the DataLad file-sharing and data-versioning utility (see: https://www.datalad.org/).
- Path to DBIC QA data on Rolando: /inbox/BIDS/dbic/QA
- Access: Open (no restrictions to rc-DBIC users)
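As a rough illustration of the DataLad route, the sketch below only assembles the `datalad clone` and `datalad get` command lines and prints them; it does not run them. It assumes you are logged in to Rolando and that the path above is itself a DataLad dataset. The destination name `dbic-QA` and the sub-path `sub-qa` are hypothetical placeholders, not names taken from the dataset.

```python
import shlex

# Path on Rolando, as listed above.
QA_SOURCE = "/inbox/BIDS/dbic/QA"


def datalad_commands(source, dest, relpath):
    """Build the two DataLad CLI calls as argument lists.

    The first clones the dataset (metadata and file stubs only);
    the second fetches actual file content and is meant to be run
    from inside the cloned directory `dest`.
    """
    clone = ["datalad", "clone", source, dest]
    get = ["datalad", "get", relpath]
    return clone, get


# "dbic-QA" and "sub-qa" are hypothetical names used for illustration.
clone_cmd, get_cmd = datalad_commands(QA_SOURCE, "dbic-QA", "sub-qa")
print(shlex.join(clone_cmd))
print(f"cd dbic-QA && {shlex.join(get_cmd)}")
```

Cloning first and fetching content on demand keeps the local copy small, since only the files you request with `datalad get` are downloaded.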
Online resources
Narratives dataset
The Narratives Dataset is a large collection of fMRI studies in natural language processing conducted at the Princeton Neuroscience Institute by the labs of Professors Ken Norman and Uri Hasson and procured by DBIC alumnus Dr. Sam Nastase. The dataset consists of normalized 3T fMRI data collected from 345 subjects as they listened to spoken-word narratives, spanning 28 unique stories and over 800 scanning sessions. This dataset is a benchmark for testing models of language processing and language comprehension.
- Path to the Narratives dataset on Discovery: /dartfs/rc/lab/D/DBIC/DBIC/archive/narratives
- Access: Open (no restrictions to rc-DBIC users)
Online resources
- Nastase, S.A., Liu, YF., Hillman, H. et al. The “Narratives” fMRI dataset for evaluating models of naturalistic language comprehension. Sci Data 8, 250 (2021)
- Narratives Dataset - DataLad Repository
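A quick way to get oriented in the hosted copy is to count its subject directories. The sketch below assumes the hosted path follows the usual BIDS layout, with `sub-*` directories directly under the dataset root; that layout is an assumption about the local copy, and `list_subjects` is a helper written for this example.

```python
from pathlib import Path

# Path on Discovery, as listed above.
NARRATIVES = Path("/dartfs/rc/lab/D/DBIC/DBIC/archive/narratives")


def list_subjects(bids_root):
    """Return the sorted names of sub-* directories directly under bids_root.

    Returns an empty list if bids_root does not exist or is not a directory.
    """
    root = Path(bids_root)
    if not root.is_dir():
        return []
    return sorted(
        p.name for p in root.iterdir()
        if p.is_dir() and p.name.startswith("sub-")
    )


subjects = list_subjects(NARRATIVES)
print(f"{len(subjects)} subject directories found under {NARRATIVES}")
```

The same helper can be pointed at any of the other BIDS-formatted datasets on this page by passing a different path.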
NATURAL SCENES DATASET
The Natural Scenes Dataset (NSD) is a large ultra-high-field 7T fMRI dataset collected at the Center for Magnetic Resonance Research at the University of Minnesota. It consists of high-resolution fMRI data collected from 8 adult subjects while they viewed thousands of images of natural scenes. Each subject was scanned in 30 to 40 sessions. The NSD is an excellent resource for testing models of visual representation and cognition.
- Path to the NSD on Discovery: /dartfs/rc/lab/D/DBIC/DBIC/archive/NSD
- Access: Data Use Agreement (DUA) and group limited access permission
- Link to DUA: https://forms.gle/eT4jHxaWwYUDEf2i9
- Group limited access permission: Email Jamie Ford
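Because access to this directory is limited by group, it can be useful to check from your own Discovery account whether the permission has taken effect. The following is a minimal sketch using only the Python standard library; `can_browse` is a helper written for this example, and the messages are illustrative.

```python
import os

# Path on Discovery, as listed above.
NSD_PATH = "/dartfs/rc/lab/D/DBIC/DBIC/archive/NSD"


def can_browse(path):
    """Return True if path is a directory the current user may list and enter."""
    return os.path.isdir(path) and os.access(path, os.R_OK | os.X_OK)


if can_browse(NSD_PATH):
    print("NSD is readable from this account.")
else:
    print("NSD is not readable; the DUA or group permission may still be pending.")
```

A negative result can also mean the file system is not mounted on the machine where the check is run, so it is worth running it on a Discovery node.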
Online resources
The Human Connectome Project
The Human Connectome Project (HCP) is a massive collaborative undertaking led by PIs Dr. David Van Essen of Washington University and Dr. Kamil Ugurbil of the University of Minnesota. The DBIC currently hosts the HCP 1200 Young Adult dataset which contains resting state fMRI data from over 1100 healthy adults ages 22-35.
- Path to HCP dataset on Discovery: /dartfs/rc/lab/D/DBIC/DBIC/archive/HCP/HCP1200
- Access: Data Use Agreement (DUA) and group limited access permission
- Link to DUA: https://balsa.wustl.edu/project?project=HCP_YA
- note: applies retroactively to HCP1200
- Group limited access permission: Email Jamie Ford
Information about the directory structure and file names can be found HERE.
Online resources
Healthy Brain Network
The Healthy Brain Network (HBN) is a large 3T fMRI dataset collected by the Child Mind Institute to study the developing human brain (ages 5-21). The full study also includes eye tracking, EEG, and rich demographic and behavioral data that can be found here. The data hosted by DBIC include only the fMRI data, which were acquired during resting state, eye-tracking calibration, and viewing of two short movie clips ("Despicable Me" and "The Present"). We used the version from "Reproducible Brain Charts (RBC)" and include the 845 subjects that passed the RBC team's quality check (up to "Release 9" by the HBN team). The paper on "Reproducible Brain Charts" can be found here. In addition to the downloaded raw BIDS data and FreeSurfer 6 outputs, custom processing derivatives generated with fMRIPrep and nb_prep are stored.
- Path to HBN dataset on Discovery: /dartfs/rc/lab/D/DBIC/DBIC/archive/HBN
- Access: Creative Commons (CC license) (open with restrictions)
- The HBN data uses a Creative Commons license with "BY-NC-SA" restrictions.
- BY (attribution): citation requirements on eventual publications
- NC (non-commercial)
- SA (share-alike): if you reshare you must keep the CC restrictions
Online resources
AMSTERDAM OPEN MRI COLLECTION (AOMIC)
The Amsterdam Open MRI Collection (AOMIC) is a collection of three independent, open-access datasets (ID1000, PIOP1, and PIOP2) collected at 3T and totaling over 1,300 unique participants (N=928, N=216, and N=226, respectively). Participants underwent T1-weighted, diffusion-weighted (DWI), and functional MRI. The functional data in the largest component, ID1000, come from 19-26 year olds watching a movie of natural scenes, while the two PIOP datasets include resting-state and task-based fMRI of university students. The tasks target emotion matching and working memory (both PIOP datasets), face perception, cognitive control, and emotion anticipation (PIOP1 only), and response inhibition (PIOP2 only). Demographic and psychometric variables are also included.
- Path to the AOMIC datasets on Discovery: /vast/labs/DBIC/datasets/Amsterdam-Open-MRI
- Access: Open (no restrictions to rc-DBIC users)
Online Resources
- Snoek, L., van der Miesen, M.M., Beemsterboer, T. et al. The Amsterdam Open MRI Collection, a set of multimodal MRI datasets for individual difference analyses. Sci Data 8, 85 (2021). https://doi.org/10.1038/s41597-021-00870-6
BISSETT SELF REGULATION DATASET
This dataset includes neuroimaging and behavioral data collected to examine the general construct of self-regulation. It consists of anatomical MRI, resting-state fMRI, and task-based fMRI (Stroop and Stop-signal) along with various self-report surveys from 103 healthy adult subjects.
- Path to dataset on Discovery: /vast/labs/DBIC/datasets/Bissett-Self-Regulation
- Access: Open (no restrictions to rc-DBIC users)
Online Resources
- Bissett, P.G., Eisenberg, I.W., Shim, S. et al. Cognitive tasks, anatomical MRI, and functional MRI data evaluating the construct of self-regulation. Sci Data 11, 809 (2024). https://doi.org/10.1038/s41597-024-03636-y
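For scripting, the paths and access levels listed on this page can be gathered into a single lookup table. The sketch below is one possible arrangement; the short keys and the access labels (`open`, `dua`, `cc-by-nc-sa`) are naming choices made for this example, while the paths are copied from the entries above.

```python
# name -> (path, access level); the QA path is on Rolando, the rest on Discovery.
DATASETS = {
    "DBIC-QA": ("/inbox/BIDS/dbic/QA", "open"),
    "Narratives": ("/dartfs/rc/lab/D/DBIC/DBIC/archive/narratives", "open"),
    "NSD": ("/dartfs/rc/lab/D/DBIC/DBIC/archive/NSD", "dua"),
    "HCP1200": ("/dartfs/rc/lab/D/DBIC/DBIC/archive/HCP/HCP1200", "dua"),
    "HBN": ("/dartfs/rc/lab/D/DBIC/DBIC/archive/HBN", "cc-by-nc-sa"),
    "AOMIC": ("/vast/labs/DBIC/datasets/Amsterdam-Open-MRI", "open"),
    "Bissett": ("/vast/labs/DBIC/datasets/Bissett-Self-Regulation", "open"),
}


def needs_dua(name):
    """Return True if the named dataset requires a Data Use Agreement."""
    return DATASETS[name][1] == "dua"


restricted = sorted(name for name in DATASETS if needs_dua(name))
print("Datasets requiring a DUA and group permission:", ", ".join(restricted))
```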
