Table 2 Integrated Database of oncological, neurological/neurodegenerative, cardiovascular and multiple diseases

From: Bringing radiomics into a multi-omics framework for a comprehensive genotype–phenotype characterization of oncological diseases

Name

Description

Data type

Data access

Data download

Latest update

Oncological disease

 CPTAC [48] https://cptac-data-portal.georgetown.edu/cptacPublic/

The CPTAC Data Portal is the NCI’s largest public repository of comprehensive proteogenomic sequence datasets

MS proteomic and phosphoproteomic data and gene expression

Open/controlled user account (open use studies) or request access by data use application (controlled use studies)

Web based, web client and programmatic

2018

 GDC [49] https://portal.gdc.cancer.gov/

The Cancer Genome Atlas is a large cancer genomics data collection covering 43 projects, with normal controls. Patient outcomes, treatment details, pathology, and expert analyses are also provided when available. Many subjects possess corresponding imaging data on The Cancer Imaging Archive (TCIA)

Gene expression, DNA methylation, germline and somatic mutations, clinical data

Open/controlled user account (open use studies) or request access by data use application (controlled use studies)

Web based, web client and programmatic

2018

 ICGC [50] https://dcc.icgc.org/

The International Cancer Genome Consortium archives a large number of datasets with molecular data from more than 20,000 donors, including the Pan-Cancer Analysis of Whole Genomes (PCAWG) study

Germline and somatic mutations, gene expression, DNA methylation

Open/controlled user account (open use studies) or request access by data use application (controlled use studies)

Web based, web client and programmatic

2018

 TCIA [51] https://www.cancerimagingarchive.net/

The Cancer Imaging Archive collects medical cancer images accessible for public download. Data include 78 collections and different image modalities. Many subjects possess corresponding genomics data on the GDC (ex TCGA)

Medical images in DICOM format, clinical data

Open/controlled user account (open use studies) or request access by data use application (controlled use studies)

Web based, web client and programmatic

2018

Neurological and neurodegenerative disorders

 1000 Functional Connectomes Project/International NeuroImaging Data-sharing Initiative (INDI) [52] https://www.nitrc.org/projects/fcon_1000/

It gives the broader imaging community complete access to large-scale functional imaging datasets, both prospective and retrospective

Imaging and clinical data

NITRC account for some public datasets and some controlled datasets

Amazon Web Services S3, via Cyberduck web client and command line

2018

 LONI Database (The Laboratory of Neuroimaging at University of Southern California) [53] https://loni.usc.edu/about_loni

Repository for sharing and long-term preservation of neuroimaging and biomedical research data, especially on neurological, neurodegenerative and psychiatric diseases. Ongoing studies include ADNI, ENIGMA, GAAIN and PPMI

Clinical, imaging (MRI, PET, MRA, DTI and other imaging modalities), genetic and behavioral data from multisite longitudinal studies

Open-use data require account-controlled access via an Image and Data Archive (IDA) request; otherwise a data use application is required

Web-based Image and Data Archive (IDA)*

2018

 LRRK2 Cohort Consortium (The Michael J. Fox Foundation (MJFF) for Parkinson’s Research) [54] https://www.michaeljfox.org/page.html?lrrk2-cohort-consortium

The LRRK2 Cohort Consortium (LCC) comprises three closed studies: the LRRK2 Cross-sectional Study, the LRRK2 Longitudinal Study and the 23andMe Blood Collection Study

Clinical data and biospecimens (blood, urine and cerebrospinal fluid) from PD and control volunteers

Account controlled access data

LONI (IDA) repository*

2018

 National Institute of Neurological Disorders and Stroke/The Michael J. Fox Foundation (MJFF) for Parkinson’s Research BioFIND [55] http://biofind.loni.usc.edu/

BioFIND is a cross-sectional clinical study designed to discover new Parkinson’s disease biomarkers

Clinical data and biospecimens (blood, urine and cerebrospinal fluid) from PD and control volunteers

Account controlled access data

LONI (IDA) repository*

2018

 The National Institute of Mental Health (NIMH)/NIMH Repository and Genomic resources (RGR) [56] https://www.nimhgenetics.org/, http://ndar.nih.gov/

The NIMH Repository is an infrastructure for sharing data collected by hundreds of research projects concerning the clinical and genetic analysis of mental health disorders (e.g. schizophrenia, bipolar disorder, depression, Alzheimer’s disease, autism, obsessive–compulsive disorder, etc.). For instance, the National Database for Autism Research (NDAR) website is the primary point of entry for autism research

Imaging, genetic and clinical data

NIMH account approval

Web-based and web client Open Database License (ODbL)

2018

 The National Institute of Neurological Disorders and Stroke (NINDS) [57] https://www.ninds.nih.gov/, https://pdbp.ninds.nih.gov/

The NINDS is divided into basic, clinical and translational research projects to advance the study of neurological disorders for both academic and industry investigators. One dataset is the Parkinson’s Disease Biomarkers Program Data Management Resource (PDBP DMR)

Gene expression, clinical data

NINDS account approval

Web-based and web-client Open Database License (ODbL)

2018

 The National Institute on Aging (NIA)/Accelerating Medicines Partnership-Alzheimer’s Disease (AMP-AD) Knowledge Portal [58] http://www.synapse.org/#!Synapse:syn2580853/wiki/409840

The AMP-AD Knowledge Portal is the NIA-designated repository for distribution of data from multiple NIA-supported programs on Alzheimer’s disease

Various types of molecular data from human, cell-based and animal model biosamples

Account controlled access data

Synapse web browser and web client

2018

 The National Institute on Aging Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS) [59] http://www.niagads.org/

The NIAGADS provides access to publicly available NIAGADS summary statistics datasets for Alzheimer’s Disease and related neuropathologies

Multi-omic: GWAS, whole-genome (WGS) and whole-exome (WES) sequencing, expression, RNA-Seq and ChIP-Seq analyses

Open to investigators, who return secondary analysis data to the database

Web-based (NIAGADS genome browser) and web-client Open Database License (ODbL)

2018

Cardiovascular disease

 Cardiac Atlas Project [60] http://www.cardiacatlas.org/

Multi-center cardiac MRI datasets with the most robust manual contours, defined by the consensus of 7 independent expert readers from 7 world-class core labs. The datasets relate to 6 different studies

Imaging (MRI data) and clinical data

Controlled CAP data access request

Web client

2018

 National Heart, Lung, and Blood Institute (BioLINCC) [61] https://biolincc.nhlbi.nih.gov/home/

NHLBI is the NIH center devoted to research, training, and education on heart, lung, blood and sleep disorders. It provides teaching datasets and public use datasets

Clinical data and sometimes corresponding biospecimens

Open and controlled data on request

Web-based user interface (BioLINCC)

2018

 The Cardiovascular Research Grid (CVRG) [62] http://cvrgrid.org/

The CardioVascular Research Grid (CVRG) project is supported by the National Heart Lung & Blood Institute for creating an infrastructure for sharing cardiovascular data and data analysis tools

Imaging (ex vivo DWI and in vivo heart CT) and clinical data

Open/Controlled

Web-based

2018

 The Qatar Cardiovascular Biorepository (QCBio) [63] http://www.qcbio.org/

Cases include patients needing percutaneous intervention for symptomatic coronary heart disease (CHD) or admitted with an acute coronary syndrome (myocardial infarction or unstable angina). Controls are individuals identified from the Hamad Medical Corp. blood bank who have no history of CHD.

The goal of QCBio is to archive plasma and DNA of 1000 Qatari patients with coronary heart disease and 1000 controls, who are matched on age, sex and ethnicity

Biospecimens (plasma and DNA) and clinical data

Open to Qatari investigators and controlled access data for others

Web-based and web client

2018

 Vascular Diseases Biorepository [63] http://www.mayo.edu/research/labs/atherosclerosis-lipid-genomics/research-projects/vascular-diseases-biorepository

Biorepository for common vascular diseases, including peripheral artery disease (PAD), aortic aneurysm, carotid artery stenosis (CAD) and fibromuscular dysplasia. These samples are linked with demographic information, conventional cardiovascular risk factors, and comorbidities ascertained from Mayo Clinic’s electronic health record using EHR-based electronic phenotyping algorithms

Biospecimens (DNA, serum and plasma) and clinical data

Open/controlled

Web-based and web client

2018

Multiple diseases

 DAA [64] http://ageing-map.org/atlas/

The Digital Ageing Atlas is a portal of age-related changes covering different biological levels. It integrates age-related data across these levels into an interactive portal that serves as the first centralised collection of human ageing changes and pathologies

Gene expression and proteomic, psychological and pathological age-related data

Publicly available by DAA account approval

DAA account approval for open

2017

 dbGaP [65] https://www.ncbi.nlm.nih.gov/gap/

The database of Genotypes and Phenotypes (dbGaP) was developed to archive and distribute the data and results from studies that have investigated the interaction of genotype and phenotype in humans. Over 150 NCI studies are registered in dbGaP

Genome wide studies and clinical data

Open/controlled NCBI account approval

Web client and programmatic

2018

 EGA [66] https://www.ebi.ac.uk/ega/

The European Genome-phenome Archive collects human biomedical data across Europe. It allows authorised users to search sequenced material, patient samples stored in biobanks, and patients’ illnesses, treatments and outcomes

Imaging, gene expression, genome wide studies and clinical data

Controlled data use application request, then EGA account approval

Web client and programmatic

2018

 GEO [67] https://www.ncbi.nlm.nih.gov/geo

Gene Expression Omnibus provides multiple level datasets (4348 in total) related to cancer and other diseases

Gene expression, genome wide studies and clinical data

Most data are publicly available, sometimes data use on request

Web client and programmatic

2018

 HAGR [68] http://genomics.senescence.info/

The Human Ageing Genomic Resources (HAGR) is a collection of databases and tools designed to help researchers study the genetics of human ageing using modern approaches such as functional genomics, network analyses, systems biology and evolutionary analyses

Gene expression and clinical data

Publicly available raw data, processed data on request

Web based download (zip, csv files)

2018

 JGA [69] https://www.ddbj.nig.ac.jp/jga/index-e.html

The Japanese Genotype-phenotype Archive is a service for archiving and sharing all types of individual-level genetic and de-identified phenotypic data

Imaging, gene expression, genome wide studies and clinical data

NBDC Human Database approval

Web client and programmatic

2018

* LONI (IDA) repository of multiple projects