The COPD Knowledge Base: enabling data analysis and computational simulation in translational COPD research

Background: Previously we generated a chronic obstructive pulmonary disease (COPD) specific knowledge base (http://www.copdknowledgebase.eu) from clinical and experimental data, text-mining results and public databases. This knowledge base allowed the retrieval of specific molecular networks together with integrated clinical and experimental data.
Results: The COPDKB has now been extended to integrate over 40 public data sources on functional interaction (e.g. signal transduction, transcriptional regulation, protein-protein interaction, gene-disease association). In addition we integrated COPD-specific expression and co-morbidity networks connecting over 6 000 genes/proteins with physiological parameters and disease states. Three mathematical models describing different aspects of systemic effects of COPD were connected to clinical and experimental data. We have completely redesigned the technical architecture of the user interface and now provide HTML and web browser-based access and form-based searches. A network search enables the use of interconnecting information and the generation of disease-specific sub-networks from general knowledge. Integration with the Synergy-COPD Simulation Environment enables multi-scale integrated simulation of individual computational models while integration with a Clinical Decision Support System allows delivery into clinical practice.
Conclusions: The COPD Knowledge Base is the only publicly available knowledge resource dedicated to COPD and combining genetic information with molecular, physiological and clinical data as well as mathematical modelling. Its integrated analysis functions provide overviews about clinical trends and connections while its semantically mapped content enables complex analysis approaches. We plan to further extend the COPDKB by offering it as a repository to publish and semantically integrate data from relevant clinical trials.
The COPDKB is freely available after registration at http://www.copdknowledgebase.eu.


COPD Knowledge base user manual
To give an overview of the functionality of the Portal, we provide a short, screenshot-based guide to the information available within the COPD Knowledge base and to the usage of the user interface.
Step 1: Open the login page of the secure, authentication- and authorisation-based COPDKB (Figure 1). Please note that, because clinical data are included, access requires registration at COPDKB@clinic.ub.es.
Step 2: You are now on the Home page (Figure 2). On the left you can find a frame providing navigation to the different types of information integrated into the COPDKB.
On the main panel of the home page you can find a diagram providing an overview of the Synergy-COPD project structure, the information becoming available during the project run-time, and the workflow (see below).
In this section you can find the patients' data, an overview of the patient-related clinical data available in the COPDKB. Currently, data from three clinical studies have been integrated into the COPD knowledge portal: the Biobridge 3-week and 8-week studies as well as the PAC-COPD study.
Step 1 Clicking on "Patients" brings you to the Biobridge and PAC-COPD patients browser, where you can find pre-defined queries that give access to all or specific subsets of the clinical data. The patients browser for the Biobridge data is located directly on this page, whereas the browser for the PAC-COPD patients can be accessed by following the link "Filter PAC-COPD patients" at the top of the page (Figure 4).
BioBridge Patient Data: The BioBridge study was designed as a pilot study with a set of experimental studies aimed at testing the hypothesis that mitochondrial alterations and nitroso-redox imbalance are centrally involved in skeletal muscle dysfunction and reduced exercise capacity in patients with COPD.
According to these aims, a 3-week and an 8-week Training Project were designed to obtain the appropriate clinical, functional and biological information.
In both training projects groups of patients with COPD and healthy controls were studied in two conditions:  Oxidative and nitrosative stress-induced muscle protein modifications.
 Muscle transcriptomics by microarray analysis.

Figure 4 Patients browser
All patients of a selected sub-group are presented in a list report. In Figure 5, age, gender and associated experimental data are shown for each patient. The "Search" option filters the list, e.g. by gender (image below).
The buttons at the top of the table provide methods to filter the list or to extend further from the clinical data to molecular measurements, diseases or simple statistics.
Statistics: provides simple summary statistics as well as two-group comparisons. Select the clinical parameters of interest and, after clicking the "Statistics" button, select the patients for group one and group two. "Summary" aggregates all selected patients; "Comparison" provides a t-test between the two groups for each individual parameter (see Figure 9).
Step 2 Selecting "Molecular measurements" (Figure 3) in the "Clinical study data" section brings the user to the "Biobridge studies molecular data browser" (Figure 10), which gives an overview of the currently available experimental data (restricted to the BioBridge study). By clicking on the experiment name and selecting the view "Experiment - Experiment data" in the left panel, you obtain details about an inflammation marker measurement as presented in Figure 13: experimental conditions, measured marker and measured value.

Figure 13 Inflammation measurement details
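The "Comparison" option of the Statistics function described under Step 1 reports a t-test between two patient groups for each selected parameter. The manual does not state which t-test variant the portal uses, so the following is only a minimal offline sketch that assumes Welch's unequal-variance t-test. The parameter names and values are invented for illustration and are not taken from the BioBridge or PAC-COPD data.

```python
from math import sqrt
from statistics import mean, variance


def summary(values):
    """Simple summary statistics for one clinical parameter."""
    return {"n": len(values), "mean": mean(values), "sd": sqrt(variance(values))}


def welch_t(group1, group2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Assumption: unequal-variance (Welch) t-test; the portal may use
    a different variant.
    """
    n1, n2 = len(group1), len(group2)
    v1 = variance(group1) / n1  # squared standard error, group one
    v2 = variance(group2) / n2  # squared standard error, group two
    t = (mean(group1) - mean(group2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical exported values: parameter name -> one value per patient.
group_one = {"param_a": [45.0, 52.0, 38.0, 60.0], "param_b": [24.1, 27.3, 22.8, 25.0]}
group_two = {"param_a": [98.0, 105.0, 92.0, 101.0], "param_b": [25.2, 26.0, 23.9, 24.7]}

for name in group_one:
    # "Summary": aggregate all selected patients of both groups.
    print(name, summary(group_one[name] + group_two[name]))
    # "Comparison": one test per individual parameter.
    t, df = welch_t(group_one[name], group_two[name])
    print(name, "t =", round(t, 3), "df =", round(df, 2))
```

Converting the t statistic into a p-value requires the cumulative distribution function of Student's t distribution, which the Python standard library does not provide. A statistics package would be needed for that step.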
Step 3 Selecting "Clinical parameter centered view" in the "Clinical study data" section (Figure 3) brings the user to a list of predefined queries selecting all or certain sets of clinical parameters measured in the PAC-COPD and BioBridge studies (Figure 14).

Figure 14 Clinical parameter centered view
Example: The query "2. Parameters common to the BioBridge and PAC-COPD clinical studies" selects all parameters that were measured in both studies and displays them, along with parameter-specific information, in a result table ("Parameter report view", Figure 15). An alternative "Data matrix view" can also be selected for these parameters; it produces a data matrix reporting the values of all common parameters measured in all BioBridge and PAC-COPD patients (Figure 16). To display the data matrix for specific parameters and specific patients, first select the parameters with check marks and choose the appropriate data matrix from the drop-down menu "Data matrix" (here "2 Parameters common to PAC-COPD and BB"). In the next step the patients can be chosen (Figure 17) and the data matrix displayed (Figure 18).
The "Network Searches" section allows searching for all defined connection types that join a set of objects by a maximum of x intermediate steps. One example would be to select a list of genes, e.g. from the differential expression analysis, and then check whether these are functionally connected by searching for protein-protein interactions, gene regulation interactions, metabolic reactions and associations within the same signalling pathway that connect any of the selected genes by a maximum of 8 intermediate steps.
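The idea behind such a network search, connecting selected objects through a bounded number of intermediate steps, can be sketched as a depth-limited breadth-first search. This is only an illustration under stated assumptions and not the COPDKB implementation: the edge list, the node names (GENE_A, P1, ...) and the function names are hypothetical, and all connections are treated as undirected.

```python
from collections import deque


def build_graph(edges, kinds=None):
    """Undirected adjacency map from (node, node, kind) triples.

    If `kinds` is given, only edges of those connection types are used.
    """
    graph = {}
    for a, b, kind in edges:
        if kinds is not None and kind not in kinds:
            continue
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def connect(graph, selected, max_intermediates):
    """Map each connected pair of selected nodes to one shortest path
    that has at most `max_intermediates` nodes between its end points."""
    selected = sorted(set(selected))
    paths = {}
    for start in selected:
        parent = {start: None}
        depth = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            # A path with k intermediates has k + 1 edges: stop expanding there.
            if depth[node] == max_intermediates + 1:
                continue
            for nxt in sorted(graph.get(node, ())):
                if nxt not in parent:
                    parent[nxt] = node
                    depth[nxt] = depth[node] + 1
                    queue.append(nxt)
        for target in selected:
            if target > start and target in parent:
                path, node = [], target
                while node is not None:
                    path.append(node)
                    node = parent[node]
                paths[(start, target)] = path[::-1]
    return paths


# Hypothetical connections of different types.
EDGES = [
    ("GENE_A", "P1", "ppi"),
    ("P1", "GENE_B", "pathway"),
    ("GENE_B", "P2", "ppi"),
    ("P2", "P3", "ppi"),
    ("P3", "GENE_C", "regulation"),
]

graph = build_graph(EDGES)
for pair, path in connect(graph, ["GENE_A", "GENE_B", "GENE_C"], 2).items():
    print(pair, " -> ".join(path))
```

Raising `max_intermediates` widens the resulting sub-network, while passing `kinds` to `build_graph` restricts the search to chosen connection types, for example only protein-protein interactions.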
The first block on the "Network Searches" page lists networks based on protein-protein interactions, whereas the second block gives access to a set of precalculated protein networks based on overrepresentation analyses (Figure 22).

Figure 22 Network searches
Results of protein-protein interaction networks are presented as tables listing source and target proteins as well as their network associations, as shown in Figure 23. Clicking on a column header sorts the table.
Clicking on the button next to the headline shows the graph of the interaction network, which can also be opened in a Java-based graph editor (Figure 24).
An overview description is available for each model; clicking on a model name takes you to it (Figure 27 and Figure 28).
Following the link "COPD associated genes" reveals a list of genes associated with COPD in public databases or by literature mining. Information related to these genes in the COPD Knowledge base can be found by selecting the various views in the left panel (Figure 31). Please note that the view "Gene - full functional information (slow)" might take up to several minutes to load.
Further information related to these genes or proteins can then be found by selecting the genes/proteins of interest with check marks and choosing the appropriate menu items from the drop-down menus "Data matrix" or "Change type". Gene/protein-related information from public sources (such as PPI and pathways) as well as information from the Biobridge and PAC-COPD studies (such as gene expression) can be retrieved (Figure 36 and Figure 37).