Table 3. Representative databases with potential for machine learning applications in the microbiome field

| Database | Reference (URL) | Description |
| --- | --- | --- |
| BacDive | | BacDive offers data on 81,827 bacterial and archaeal strains, including 14,091 type strains, thereby covering approximately 90% of validly described species |
| GOLD | | GOLD is a World Wide Web resource for comprehensive access to information on genome and metagenome sequencing projects and their associated metadata |
| NCBI Microbial Genomes | | The Microbial Genomes resource presents public data from prokaryotic genome sequencing projects |
| EnsemblBacteria | | Ensembl Bacteria is a browser for bacterial and archaeal genomes |
| European Nucleotide Archive | | The European Nucleotide Archive (ENA) provides a comprehensive record of the world’s nucleotide sequencing information, covering raw sequencing data, sequence assembly information and functional annotation |
| DrugBank | | DrugBank is a comprehensive, structured resource of drug and molecular drug information |
| Super Natural | | Super Natural II is a database of natural products containing 325,508 natural compounds (NCs), with information on the corresponding 2D structures, physicochemical properties, predicted toxicity class and potential vendors |
| ChEMBL | | ChEMBL is a manually curated database of bioactive molecules with drug-like properties |
| ChemSpider | | ChemSpider is a free chemical structure database providing fast text and structure search access to over 100 million structures from hundreds of data sources |
| BindingDB | | BindingDB is a public, web-accessible database of measured binding affinities. It contains 41,328 entries, each with a DOI, comprising 2,259,122 binding data points for 8,516 protein targets and 977,487 small molecules |
| MicrobiomeDB | | A data-mining platform for interrogating microbiome experiments |
| UniProt | | UniProt provides the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information |
| Virtual Metabolic Human | | The VMH database captures information on human and gut microbial metabolism and links this information to hundreds of diseases and to nutritional data |
| Disbiome | | Disbiome® is a database covering microbial composition changes in different kinds of diseases, managed by Ghent University |
| eHOMD | | eHOMD provides comprehensive curated information on the bacterial species present in the human aerodigestive tract (ADT), which encompasses the upper digestive and upper respiratory tracts, including the oral cavity, pharynx, nasal passages, sinuses and esophagus |
| HMDB | | The Human Metabolome Database (HMDB) is a freely available electronic database containing detailed information about small molecule metabolites found in the human body |
| MDB | | The Microbiome Database (MDB) provides sequencing resources and metadata for ecological community samples of microorganisms, including both host-associated and environmental microbes |
| MGnify | | MGnify provides amplicon, assembly, metabarcoding, metagenome and metatranscriptome data from human and environmental biomes |
| Human Microbiome Project | | Genomic characterization of microbiota at five body sites (HMP1), and information on microbiota-human interactions in disease (iHMP) |