Name | Description | Data type | Data access | Data download | Latest update |
---|---|---|---|---|---|
Oncological disease | |||||
CPTAC [48] https://cptac-data-portal.georgetown.edu/cptacPublic/ | The CPTAC Data Portal is the NCI’s largest public repository of comprehensive proteogenomic sequence datasets | MS proteomic and phosphoproteomic data and gene expression | Open/controlled user account (open use studies) or request access by data use application (controlled use studies) | Web based, web client and programmatic | 2018 |
TCGA | The Cancer Genome Atlas is a large cancer genomics data collection covering 43 projects with normal controls. Patient outcomes, treatment details, pathology, and expert analyses are also provided when available. Many subjects possess corresponding imaging data on The Cancer Imaging Archive (TCIA) | Gene expression, DNA methylation, germline and somatic mutations, clinical data | Open/controlled user account (open use studies) or request access by data use application (controlled use studies) | Web based, web client and programmatic | 2018 |
ICGC [50] https://dcc.icgc.org/ | The International Cancer Genome Consortium archives a large number of datasets with molecular data from more than 20,000 donors, including the Pan-Cancer Analysis of Whole Genomes (PCAWG) study | Germline and somatic mutations, gene expression, DNA methylation | Open/controlled user account (open use studies) or request access by data use application (controlled use studies) | Web based, web client and programmatic | 2018 |
TCIA | The Cancer Imaging Archive collects medical cancer images accessible for public download. Data include 78 collections and different image modalities. Many subjects possess corresponding genomics data on the GDC (formerly TCGA) | Medical images in DICOM format, clinical data | Open/controlled user account (open use studies) or request access by data use application (controlled use studies) | Web based, web client and programmatic | 2018 |
Neurological and neurodegenerative disorders | |||||
1000 Functional Connectomes Project/INDI International NeuroImaging Data-sharing Initiative [52] https://www.nitrc.org/projects/fcon_1000/ | It provides the broader imaging community with complete access to large-scale functional imaging datasets, both prospective and retrospective | Imaging and clinical data | NITRC account for some public datasets and some controlled datasets | Amazon Web Services S3 and Cyberduck web client and command line | 2018 |
LONI Database (The Laboratory of Neuroimaging at University of Southern California) [53] https://loni.usc.edu/about_loni | Repository for sharing and long-term preservation of neuroimaging and biomedical research data, especially on neurological, neurodegenerative and psychiatric diseases. Ongoing studies include ADNI, ENIGMA, GAAIN and PPMI | Clinical, imaging (MRI, PET, MRA, DTI and other imaging modalities), genetic and behavioral data from multisite longitudinal studies | Open-use data require account-controlled access by Image and Data Archive (IDA) request; otherwise a data use application request | Web-based Image and Data Archive (IDA)* | 2018 |
LRRK2 Cohort Consortium (The Michael J. Fox Foundation (MJFF) for Parkinson’s Research) [54] https://www.michaeljfox.org/page.html?lrrk2-cohort-consortium | The LRRK2 Cohort Consortium (LCC) comprises three closed studies: the LRRK2 Cross-sectional Study, the LRRK2 Longitudinal Study and the 23andMe Blood Collection Study | Clinical data and biospecimens (blood, urine and cerebrospinal fluid) from PD and control volunteers | Account controlled access data | LONI (IDA) repository* | 2018 |
National Institute of Neurological Disorders and Stroke/The Michael J. Fox Foundation (MJFF) for Parkinson’s Research BioFIND [55] http://biofind.loni.usc.edu/ | BioFIND is a cross-sectional clinical study designed to discover new Parkinson’s disease biomarkers | Clinical data and biospecimens (blood, urine and cerebrospinal fluid) from PD and control volunteers | Account controlled access data | LONI (IDA) repository* | 2018 |
The National Institute of Mental Health (NIMH)/NIMH Repository and Genomic Resources (RGR) [56] https://www.nimhgenetics.org/ | The NIMH Repository is an infrastructure for sharing data collected by hundreds of research projects concerning clinical and genetic analysis of mental health disorders (e.g. schizophrenia, bipolar disorder, depression, Alzheimer’s disease, autism, obsessive–compulsive disorder, etc.). For instance, the National Database for Autism Research (NDAR) website is the primary point of entry for autism research | Imaging, genetic and clinical data | NIMH account approval | Web-based and web client; Open Database License (ODbL) | 2018 |
The National Institute of Neurological Disorders and Stroke (NINDS) [57] https://www.ninds.nih.gov/, https://pdbp.ninds.nih.gov/ | The NINDS is divided into basic, clinical and translational research projects that advance the study of neurological disorders for both academic and industry investigators. One dataset is the Parkinson’s Disease Biomarkers Program Data Management Resource (PDBP DMR) | Gene expression, clinical data | NINDS account approval | Web-based and web client; Open Database License (ODbL) | 2018 |
The National Institute on Aging (NIA)/AMP-AD Knowledge Portal Accelerating Medicines Partnership-Alzheimer’s Disease [58] https://www.synapse.org/#!Synapse:syn2580853/wiki/409840 | The AMP-AD Knowledge Portal is the NIA-designated repository for distribution of data from multiple NIA-supported programs on Alzheimer’s disease | Various types of molecular data from human, cell-based and animal model biosamples | Account controlled access data | Synapse web browser and web client | 2018 |
The National Institute on Aging Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS) [59] https://www.niagads.org/ | NIAGADS provides access to publicly available summary statistics datasets for Alzheimer’s disease and related neuropathologies | Multi-omic GWAS, whole genome (WGS) and whole exome (WES), expression, RNA-Seq and ChIP-Seq analyses | Open to investigators who return secondary analysis data to the database | Web-based (NIAGADS genome browser) and web client; Open Database License (ODbL) | 2018 |
Cardiovascular disease | |||||
Cardiac Atlas Project [60] http://www.cardiacatlas.org/ | Multi-center cardiac MRI datasets with the most robust manual contours, defined by the consensus of 7 independent expert readers from 7 world-class core labs. Datasets relate to 6 different studies | Imaging (MRI data) and clinical data | Controlled CAP data access request | Web client | 2018 |
National Heart, Lung, and Blood Institute (BioLINCC) [61] https://biolincc.nhlbi.nih.gov/home/ | NHLBI is the NIH center devoted to research, training, and education on heart, lung, blood and sleep disorders. It provides teaching datasets and public use datasets | Clinical data and sometimes corresponding biospecimens | Open and controlled data on request | Web-based user interface (BioLINCC) | 2018 |
The Cardiovascular Research Grid (CVRG) [62] http://cvrgrid.org/ | The CardioVascular Research Grid (CVRG) project is supported by the National Heart Lung & Blood Institute for creating an infrastructure for sharing cardiovascular data and data analysis tools | Imaging (ex vivo DWI and in vivo heart CT) and clinical data | Open/Controlled | Web-based | 2018 |
The Qatar Cardiovascular Biorepository (QCBio) [63] http://www.qcbio.org/ | Cases include patients needing percutaneous intervention for symptomatic coronary heart disease (CHD) or admitted with an acute coronary syndrome (myocardial infarction or unstable angina). Controls are individuals identified from the Hamad Medical Corp. blood bank who have no history of CHD. The goal of QCBio is to archive plasma and DNA of 1000 Qatari patients with coronary heart disease and 1000 controls, who are matched on age, sex and ethnicity | Biospecimens (plasma and DNA) and clinical data | Open to Qatari investigators and controlled access data for others | Web-based and web client | 2018 |
Vascular Diseases Biorepository [63] https://www.mayo.edu/research/labs/atherosclerosis-lipid-genomics/research-projects/vascular-diseases-biorepository | Biorepository for common vascular diseases, including peripheral artery disease (PAD), aortic aneurysm, carotid artery stenosis (CAD) and fibromuscular dysplasia. These samples are linked with demographic information, conventional cardiovascular risk factors, and comorbidities ascertained from Mayo Clinic’s electronic health record using EHR-based electronic phenotyping algorithms | Biospecimens (DNA, serum and plasma) and clinical data | Open/controlled | Web-based and web client | 2018 |
Multiple diseases | |||||
DAA [64] http://ageing-map.org/atlas/ | The Digital Ageing Atlas is a portal of age-related changes covering different biological levels. It integrates these data into an interactive portal that serves as the first centralised collection of human ageing changes and pathologies | Gene expression and proteomic, psychological and pathological age-related data | Publicly available by DAA account approval | DAA account approval for open | 2017 |
dbGaP | The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the data and results from studies that have investigated the interaction of genotype and phenotype in humans. Over 150 NCI studies are registered in dbGaP | Genome wide studies and clinical data | Open/controlled NCBI account approval | Web client and programmatic | 2018 |
EGA | The European Genome-phenome Archive collects human biomedical data across Europe. It allows authorised users to search sequenced material, patient samples stored in biobanks, and patients’ illnesses, treatments and outcomes | Imaging, gene expression, genome wide studies and clinical data | Controlled data use application request, then EGA account approval | Web client and programmatic | 2018 |
GEO | Gene Expression Omnibus provides multi-level datasets (4348 in total) related to cancer and other diseases | Gene expression, genome wide studies and clinical data | Most data are publicly available, sometimes data use on request | Web client and programmatic | 2018 |
HAGR | The Human Ageing Genomic Resources (HAGR) is a collection of databases and tools designed to help researchers study the genetics of human ageing using modern approaches such as functional genomics, network analyses, systems biology and evolutionary analyses | Gene expression and clinical data | Publicly available raw data, processed data on request | Web based download (zip, csv files) | 2018 |
JGA [69] https://www.ddbj.nig.ac.jp/jga/index-e.html | The Japanese Genotype-phenotype Archive is a service for archiving and sharing all types of individual-level genetic and de-identified phenotypic data | Imaging, gene expression, genome wide studies and clinical data | NBDC Human Database approval | Web client and programmatic | 2018 |
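
When choosing among the repositories listed above, it can help to treat the table as structured records and filter on data type and download mode. The following is a minimal sketch under stated assumptions: the `Repository` class, the `CATALOG` list and the `select` function are hypothetical names introduced here for illustration, and only four rows of the table are transcribed. It does not contact any repository or use any repository API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Repository:
    """One row of the table (hypothetical structure, for illustration only)."""
    name: str
    disease_area: str
    data_types: Tuple[str, ...]
    download: str


# A few rows transcribed by hand from the table; not an exhaustive catalog.
CATALOG: List[Repository] = [
    Repository("CPTAC", "Oncological disease",
               ("proteomic", "phosphoproteomic", "gene expression"),
               "Web based, web client and programmatic"),
    Repository("ICGC", "Oncological disease",
               ("mutations", "gene expression", "DNA methylation"),
               "Web based, web client and programmatic"),
    Repository("Cardiac Atlas Project", "Cardiovascular disease",
               ("imaging", "clinical"),
               "Web client"),
    Repository("JGA", "Multiple diseases",
               ("imaging", "gene expression", "genome wide studies", "clinical"),
               "Web client and programmatic"),
]


def select(catalog: List[Repository],
           data_type: Optional[str] = None,
           programmatic_only: bool = False) -> List[Repository]:
    """Return the rows matching an optional data type and download mode."""
    hits = []
    for repo in catalog:
        # Skip rows that do not list the requested data type.
        if data_type is not None and data_type not in repo.data_types:
            continue
        # Skip rows whose "Data download" cell does not mention programmatic access.
        if programmatic_only and "programmatic" not in repo.download.lower():
            continue
        hits.append(repo)
    return hits


if __name__ == "__main__":
    for repo in select(CATALOG, data_type="gene expression", programmatic_only=True):
        print(repo.name, "-", repo.disease_area)
```

Matching on the literal text of the "Data download" column keeps the sketch faithful to the table, at the cost of being sensitive to wording; a real catalog would probably encode access and download modes as explicit fields.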