Table 2 Data sources used for the generation of the Intelligence Network

From: Alzheimer's disease biomarker discovery using in silico literature mining and clinical validation

| Databases | Description |
| --- | --- |
| Alzheimer Disease & Frontotemporal Dementia Mutation Database | The Alzheimer Disease & Frontotemporal Dementia Mutation Database (AD&FTDMDB) aims to collect all known mutations and non-pathogenic coding variations in the genes related to Alzheimer disease (AD) and frontotemporal dementia (FTD). All data were exported and loaded into Sofia to create gene-disease assertions. |
| Diseases Database | The Diseases Database is a cross-referenced medical dictionary of diseases, medications, symptoms, signs and investigations. It was loaded into Sofia and provided assertions linking Alzheimer's disease to symptoms and signs, histopathological abnormalities, risk factors, etc. |
| Gene Ontology | The Gene Ontology (GO) project provides an ontology of defined terms representing gene product properties. The ontology covers three domains for the gene products: cellular component, molecular function, and biological process. All of GO was processed and loaded into Sofia, and the relevant assertions were then exported into the Intelligence Network (IN). |
| Genetic Association Database | The Genetic Association Database is an archive of human genetic association studies of complex diseases and disorders. All the data linking genes to diseases were processed and downloaded into Sofia, and the relevant assertions were then exported into the IN. |
| GENSAT Brain Atlas | GENSAT is a gene expression atlas of the developing and adult central nervous system of the mouse. After AD-related brain areas were identified from literature reviews, the relevant genes were exported from GENSAT, and assertions linking genes to anatomical areas were created and loaded into Sofia. |
| KEGG | KEGG (Kyoto Encyclopedia of Genes and Genomes) is a bioinformatics resource for linking genomes to life and the environment. Pathways relevant to AD were reviewed, and relevant protein-pathway assertions were generated using Sofia. |
| NCBI Gene Expression Omnibus | Gene Expression Omnibus (GEO) is a database repository of high-throughput gene expression data and hybridization arrays, chips, and microarrays. GEO was searched for AD-relevant expression data, which were downloaded from the NCBI site and loaded into Sofia. |
| OMIM | Online Mendelian Inheritance in Man (OMIM) is a database that catalogues all the known diseases with a genetic component and, where possible, links them to the relevant genes in the human genome. It also provides references for further research and tools for genomic analysis of a catalogued gene. All of OMIM Genemap was exported and loaded into Sofia; relevant AD records were used to create gene-disease assertions. |
| Telemakus knowledgebase | The Telemakus Biomarkers in Alzheimer's Disease & Mild Cognitive Impairment Knowledgebase contains information from AD and mild cognitive impairment (MCI) biomarker studies. All of the Knowledgebase was exported and loaded into Sofia as protein-disease assertions. |

| Textual Data | Description |
| --- | --- |
| PubMed | PubMed is a service of the U.S. National Library of Medicine that includes over 18 million citations from MEDLINE and other life science journals for biomedical articles back to the 1950s. PubMed contains a rich set of biomedical literature abstracts relevant to many areas. AD-relevant vocabularies within Sofia were used to build a "corpus" of AD-relevant abstracts, which were then used in assertion-generation processes to create disease-protein and disease-process links. |
| Full text papers and reports | Various full-text reviews from journals, and reports from AD websites (Alzheimer Research Forum and Essential Science Indicators), were downloaded, and text versions were loaded into Sofia for assertion generation using key AD-related vocabularies. |
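
The table describes each source being converted into typed assertions (gene-disease, protein-pathway, gene-anatomical area, and so on) before being combined into the network. The source does not describe Sofia's internal data format, so the following is only a minimal sketch of how such assertions from several sources might be pooled. The `Assertion` class, the relation names, and the functions `build_network` and `sources_supporting` are hypothetical and are not part of Sofia; the entity `GENE_A` in the usage note is a placeholder, not a result from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One subject-relation-object link plus the source it came from.

    Hypothetical structure; the paper does not specify Sofia's format.
    """
    subject: str   # e.g. a gene or protein identifier
    relation: str  # e.g. "associated_with", "expressed_in"
    obj: str       # e.g. a disease, pathway, or brain area
    source: str    # the database or text collection that supplied the link


def build_network(assertions):
    """Group assertions into an adjacency map keyed by subject."""
    network = defaultdict(set)
    for a in assertions:
        network[a.subject].add((a.relation, a.obj, a.source))
    return network


def sources_supporting(network, subject, obj):
    """Return the sorted, de-duplicated sources that link subject to obj."""
    return sorted({src for _rel, o, src in network.get(subject, ()) if o == obj})
```

For example, loading two placeholder assertions that link `GENE_A` to "Alzheimer disease" from different sources and then calling `sources_supporting(network, "GENE_A", "Alzheimer disease")` would list both sources, which is one simple way to see how many independent sources support a given link.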