Effective visualization of integrated knowledge and data to enable informed decisions in drug development and translational medicine
© Brynne et al.; licensee BioMed Central Ltd. 2013
Received: 18 December 2012
Accepted: 18 September 2013
Published: 8 October 2013
Integrative understanding of preclinical and clinical data is imperative to enable informed decisions and reduce the attrition rate during drug development. The volume and variety of data generated during drug development have increased tremendously. A new information model and visualization tool were developed to effectively utilize all available data and current knowledge. The Knowledge Plot integrates preclinical, clinical, efficacy and safety data by adding two concepts: knowledge from the different disciplines and protein binding.
Internal and publicly available data were gathered and processed to allow flexible and interactive visualizations. The exposure was expressed as the unbound concentration of the compound, and the treatment effect was normalized and scaled by including expert opinion on what a biologically meaningful treatment effect would be.
The Knowledge Plot has been applied both retrospectively and prospectively in project teams in a number of different therapeutic areas, resulting in closer collaboration between multiple disciplines discussing both preclinical and clinical data. The Plot allows head-to-head comparisons of compounds and was used to support Candidate Drug selections and differentiation from comparators and competitors, back translation of clinical data, understanding the predictability of preclinical models and assays, reviewing drift in primary endpoints over the years, and evaluating or benchmarking compounds in due diligence by comparing multiple attributes.
The Knowledge Plot concept allows flexible integration and visualization of relevant data for interpretation in order to enable scientific and informed decision-making in various stages of drug development. The concept can be used for communication, decision-making, knowledge management, and as a forward and back translational tool that will result in an improved understanding of the competitive edge for a particular project or disease area portfolio. In addition, it builds up a knowledge and translational continuum, which in turn will reduce the attrition rate and costs of clinical development by identifying poor candidates early.
Keywords: Data integration; Preclinical data; Clinical data; Informatics; Visualization; Decision-making; Translational medicine; Drug development; Knowledge management
Translational Medicine is the discipline focusing on improving drug discovery and development by bridging the gap between basic research, clinical development and clinical practice. The key is to identify and quantify biomarkers that characterize the efficacy and safety profiles at different stages of drug development. The goal is to build up the knowledge of a translational continuum from bed to bench and vice versa, i.e. forward and back translation. The translation of the pharmacodynamic drug action between species is a fundamental process in order to confidently select drug candidates that will demonstrate the biological and translational hypothesis in clinical development, and therefore also reduce the attrition rate [2–4].
Within all development phases, it is imperative to visualize data to be able to explore and integrate biomarkers from preclinical and clinical studies for multiple compounds (for benchmarking, differentiation and to compare forerunners) side-by-side for informed decisions. This requires aggregation of a large amount of data and a holistic scientific understanding of all biomarkers. The visualization and integration of different biomarkers and endpoints across a large variety of studies are tedious processes and require input and collaboration between experts from multiple disciplines in the organization. A platform that addresses this requires seamless access to data from clinical trials and preclinical studies. Furthermore, it needs to encompass a framework for harmonizing the interpretation of different types of data, gathered from various species, patient populations and therapeutic areas. The platform should, both technically and organizationally, allow use and reuse of data retrieved from internal and external sources as well as outputs from pharmacokinetic and pharmacodynamic modeling and simulations (PKS™). It should also address both individual and aggregated data within and across projects. Export of data to other applications for visualization and integration is key to ensure flexibility.
A number of commercial informatics tools for translational research purposes (e.g. D360, TranSMART) are currently available. They allow searching, analyzing and sharing data from multiple sources, data types and scales, and enable export of data to other applications. The TranSMART Knowledge Management Platform combines a data repository with intuitive search capabilities and analysis tools. However, it is based on a gene-centric approach that supports hypothesis development from a phenotypic perspective.
Today, pre-defined integration and visualization of data to answer key questions are performed within the tight project timelines. The volume and variety of data generated during drug development have increased tremendously. The Napiergram is widely used within Pharma to get an overview of preclinical and clinical data, by presenting exposure ranges of unbound drug concentrations across assay systems and biomarkers. There is a clear need to facilitate the tedious work of visualizing and managing data for forward and back translation of compounds, as well as compound comparison at each milestone.
This paper describes the Knowledge Plot, a new translational framework for effective and flexible integration of preclinical and clinical data using a project-centric approach. The Plot utilizes current knowledge and desired effect levels for each biomarker and opens up for a transparent discussion and holistic understanding. The principles of how to construct a Knowledge Plot will be outlined, as well as retrospective and prospective use cases, demonstration of the translational continuum, and how it is imperative to use real-time data integration.
The Knowledge Plot
Table 1. Examples of effect formulas
Treatment Effect Definition | Explanation | Unit of reference value
 | Unscaled (e.g. occupancy, %) | 
m - mbl | Change from baseline (raw) | Same unit as endpoint
m - mctrl | Change from control (raw) | Same unit as endpoint
100*((m - mbl) / mbl) | Change from baseline (%) | 
100*((m - mctrl) / mctrl) | Change from control (%) | 
100*(Nsbj,grp/Ngrp - Nsbj,ctrl/Nctrl) | Diff in % Event (percentage points) | % (percentage points)
The Treatment Effect Index (TEI) should be transparent with a clear rationale, which is agreed within the project team or skill network or dictated by a governance body. The TEI and the corresponding rationale are documented in the database in order to support a common understanding between experts of different endpoints. There are similarities between the previously published Clinical Utility Index [8–10] and the TEI presented in this paper. Hence, a comparison is outlined in the Discussion section. Regarding the interpretation of the TEI: a value of 0 means no effect of treatment; a value between 0 and 100 means that the effect goes in the desired direction but has not reached the Meaningful Effect; a negative value means that the effect goes in the undesired direction; and a value above 100 means that the effect is greater than the Meaningful Effect.
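The interpretation rules above can be illustrated with a minimal Python sketch. The source states only that the Treatment Effect is normalized by the reference value, so the formula used here (TEI = 100 × Treatment Effect / Meaningful Effect, with the reference value carrying the sign of the desired direction) is an assumption, and the function names are hypothetical.

```python
def treatment_effect_index(treatment_effect: float, meaningful_effect: float) -> float:
    """Scale a treatment effect so that the Meaningful Effect maps to 100.

    Assumption (not stated explicitly in the paper):
    TEI = 100 * treatment_effect / meaningful_effect, where the reference
    value is signed in the desired direction (e.g. -10 for a desired
    decrease of 10 units).
    """
    if meaningful_effect == 0:
        raise ValueError("the Meaningful Effect (reference value) must be non-zero")
    return 100.0 * treatment_effect / meaningful_effect


def interpret_tei(tei: float) -> str:
    """Map a TEI value onto the interpretation categories given in the text."""
    if tei < 0:
        return "undesired direction"
    if tei == 0:
        return "no effect"
    if tei < 100:
        return "desired direction, below Meaningful Effect"
    if tei == 100:
        return "equal to Meaningful Effect"
    return "above Meaningful Effect"
```

For example, with a desired decrease of 10 units as reference value, an observed change of -5 units gives a TEI of 50, whereas an increase of 3 units gives a negative TEI.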
Study meta data.
Study information describing the study design, compounds, species/population, study code and other data outlined only in study documentation.
Terminology is generally consistent within a study, but across studies and across development programs the terminology is out of sync. Encoding all the various entities in a controlled fashion is especially important for species, endpoints and compounds.
Categorization of endpoints.
Grouping and clustering of endpoints in a hierarchical manner allows comparison across compounds and species at different levels. In our approach the levels correspond to the granularity of the key question.
At the top level: Is it a safety or efficacy endpoint?
Intermediate level: Which domains does the effect derive from? (Vital signs, biomarkers, adverse events etc.)
At the lowest level: What are the effects? (Increased blood pressure, occurrence of specific liver signals, number of subjects experiencing at least one episode of dizziness etc.).
Specifications of the Treatment Effect Index.
The specifications of how the Treatment Effect Index is calculated for each endpoint may vary between species, target and type of disease etc. These descriptions outline how the placebo/vehicle or baseline controlled response is derived (Treatment Effect Definition) and what a meaningful clinical or biological effect (Reference value) is on this scale. Table 1 contains a list of commonly used Treatment Effect Definitions and their associated brief explanations. The set of options for transforming endpoint measurements into a Treatment Effect accounts for the different study designs.
Unbound fraction (Plasma protein binding).
The unbound fraction is specific for the species/strain and the compound. The factor allows comparison of the exposure levels across species by using the unbound concentrations rather than the total plasma concentration of the compound.
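The Treatment Effect Definitions listed in Table 1 could be sketched in Python as follows. The reading of the symbols is an assumption, since the source does not define them: `m` is taken as the summarized endpoint value in the treatment group, `m_bl` as the baseline value, `m_ctrl` as the control (placebo/vehicle) value, and the `n_*` arguments as subject counts with an event and group sizes. Only formulas that appear in the table are included.

```python
def change_from_baseline(m: float, m_bl: float) -> float:
    """Change from baseline (raw): m - mbl."""
    return m - m_bl


def change_from_control(m: float, m_ctrl: float) -> float:
    """Change from control (raw): m - mctrl."""
    return m - m_ctrl


def pct_change_from_baseline(m: float, m_bl: float) -> float:
    """Change from baseline (%): 100 * (m - mbl) / mbl."""
    return 100.0 * (m - m_bl) / m_bl


def pct_change_from_control(m: float, m_ctrl: float) -> float:
    """Change from control (%): 100 * (m - mctrl) / mctrl."""
    return 100.0 * (m - m_ctrl) / m_ctrl


def diff_in_pct_event(n_sbj_grp: int, n_grp: int, n_sbj_ctrl: int, n_ctrl: int) -> float:
    """Difference in % event, in percentage points:
    100 * (Nsbj,grp / Ngrp - Nsbj,ctrl / Nctrl)."""
    return 100.0 * (n_sbj_grp / n_grp - n_sbj_ctrl / n_ctrl)
```

Which definition applies to a given endpoint depends on the study design, for instance whether a baseline measurement or a concurrent control group is available.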
A data mart is built row-wise by combining maximum exposure and endpoint values by treatment group (in an advanced mode the rows are further divided into time points and visits). Study meta information, unbound fraction (protein binding), treatment information, and scaling details are then added to each row. The unbound concentration is calculated by multiplying the free fraction by the mean total concentration in each treatment group. In this paper maximum plasma concentration (Cmax) and steady-state plasma concentration (Css) are used, but other measures reflecting the exposure (e.g. area under the curve (AUC)) can be used if that is more relevant for the actual disease area or compound properties. Output from pharmacokinetic-pharmacodynamic modeling is used when there is a lag-time between maximal exposure and effect (hysteresis). Pharmacokinetic data from satellite animals are used when no other data are available. As an example, a list of variables that were collected to build a Knowledge Plot is given in Additional file 1: Table S1. For vocabularies, MedDRA was used to describe adverse events, whilst internally developed vocabularies and data structures were used for the majority of other data types. The Treatment Effect is calculated for each endpoint and time point using the Treatment Effect Definition that corresponds to the study design. The Treatment Effect is then normalized into the Treatment Effect Index by using the reference value.
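As a rough illustration of how one data mart row might be assembled, the sketch below computes the unbound concentration as the free fraction multiplied by the mean total concentration, as described above. The field names, the `DataMartRow` structure and the TEI formula (100 × Treatment Effect / reference value) are assumptions for illustration and do not reproduce the variables in Additional file 1.

```python
from dataclasses import dataclass


@dataclass
class DataMartRow:
    # Hypothetical, simplified row layout (one row per treatment group and endpoint).
    study: str
    compound: str
    species: str
    endpoint: str
    treatment_group: str
    total_conc: float        # mean total plasma concentration (e.g. Cmax or Css)
    unbound_conc: float      # free fraction * mean total concentration
    treatment_effect: float  # from the Treatment Effect Definition for the study design
    tei: float               # Treatment Effect Index (assumed scaling, see build_row)


def unbound_concentration(fraction_unbound: float, mean_total_conc: float) -> float:
    """Unbound concentration = free fraction * mean total concentration."""
    if not 0.0 <= fraction_unbound <= 1.0:
        raise ValueError("the unbound fraction must lie between 0 and 1")
    return fraction_unbound * mean_total_conc


def build_row(study: str, compound: str, species: str, endpoint: str,
              treatment_group: str, mean_total_conc: float,
              fraction_unbound: float, treatment_effect: float,
              reference_value: float) -> DataMartRow:
    """Combine exposure, effect and scaling details into one row."""
    if reference_value == 0:
        raise ValueError("the reference value must be non-zero")
    return DataMartRow(
        study=study,
        compound=compound,
        species=species,
        endpoint=endpoint,
        treatment_group=treatment_group,
        total_conc=mean_total_conc,
        unbound_conc=unbound_concentration(fraction_unbound, mean_total_conc),
        treatment_effect=treatment_effect,
        # Assumed normalization: the reference value maps to a TEI of 100.
        tei=100.0 * treatment_effect / reference_value,
    )
```

Because the unbound fraction is specific to species and compound, the same total concentration can translate into different unbound exposures in, say, rat and human rows.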
All available data from a project/target are exported into a data mart and loaded into the visualization tool. A tool that allows interactive visualizations is preferable, as it makes it possible to switch between time, total and unbound concentration on the horizontal axis, as well as between raw data, the un-normalized treatment effect and the Treatment Effect Index on the vertical axis. Thus, it is possible to visualize a certain endpoint using either raw or transformed data, or to apply the Treatment Effect Index when exploring the time or concentration relationships. The dose group information, Treatment Effect Index rationale and other study meta information are stored in the database and easily accessible. A scatter plot is the recommended graph type, where data points can be connected by a line for each endpoint and study. Furthermore, an interactive tool allows filtering out subsets of data to highlight selected endpoints, compounds and studies of interest. In addition, a tool that can color, shape, and size the points depending on their attributes is desirable, as is the possibility to draw trellis plots. Graphical templates can be optimized for each disease area or target of interest to enable instant visualization of data.
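The axis switching and filtering described above can be sketched independently of any particular visualization tool. The following Python function prepares one line-connected series per endpoint and study from data mart rows; the row keys (`time`, `total_conc`, `unbound_conc`, `raw_value`, `treatment_effect`, `tei`) and the function name are hypothetical, and the output would still have to be passed to a plotting library.

```python
from collections import defaultdict

# Hypothetical mapping from axis choices to row keys.
X_AXES = {"time": "time", "total": "total_conc", "unbound": "unbound_conc"}
Y_AXES = {"raw": "raw_value", "effect": "treatment_effect", "tei": "tei"}


def plot_series(rows, x_axis="unbound", y_axis="tei", compounds=None):
    """Return {(endpoint, study): [(x, y), ...]} with points sorted by x.

    rows      -- iterable of dicts, one per data mart row
    x_axis    -- "time", "total" or "unbound"
    y_axis    -- "raw", "effect" or "tei"
    compounds -- optional set of compound names to keep (filter)
    """
    x_key = X_AXES[x_axis]
    y_key = Y_AXES[y_axis]
    series = defaultdict(list)
    for row in rows:
        if compounds is not None and row["compound"] not in compounds:
            continue
        series[(row["endpoint"], row["study"])].append((row[x_key], row[y_key]))
    # Sorting by x lets the points of each series be connected by a line.
    return {key: sorted(points) for key, points in series.items()}
```

Switching the view then amounts to calling the function again with a different `x_axis` or `y_axis` on the same rows.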
The Knowledge Plot uses two concepts, knowledge (cf. TEI) and plasma protein binding. The concept of adding knowledge about each endpoint enables comparison of efficacy and safety data generated in the same species. By also adding protein binding for each species, comparisons between species, populations and subpopulations are possible, and the number of ways to visualize and integrate data becomes innumerable. The Knowledge Plot can integrate studies performed on non-equivalent doses and transparently reveal different receptor densities or pathophysiological mechanisms in different species, as well as interference of safety pharmacology/toxicology endpoints with efficacy endpoints. Trellis plots are powerful when dissecting data in different ways and comparing compound profiles. Data from different compounds can be plotted either on top of each other in the same graph or side-by-side. The flexibility enables a relative comparison of information/knowledge within or between compounds by utilizing data from all development stages. Data are visualized on the same scale and use the same unit, enabling visual inspection to compare drug profiles.
The value of real-time forward and back translation to enable informed decisions has recently been demonstrated. Both retrospective and prospective integration of data resulted in improved cross-functional work and increased transparency of the large amount of compound and project information. Exploitation and interpretation of preclinical and clinical data supported improved awareness of face validity and predictability of animal models and in vitro assays. Retrospective documentation of all data and information available at each milestone for decision-making clearly identified in which preclinical and clinical studies the biomarkers and endpoints had reached a meaningful effect or a threshold. Another finding was hepatotoxicity in phase II, with no pre-warnings from preclinical data or early clinical studies, which clearly demonstrated that additional biomarkers are needed. The Knowledge Plot concept was also used to evaluate or benchmark compounds in due diligence (unpublished data). The prospective pilot initiative also highlighted the requirements on data and infrastructure for aligning a number of existing translational data platforms across the preclinical and clinical domains. It also exposed the need to develop an enterprise solution in comparison to the methods used in the pilot projects, which were tailored to the individual project needs. In addition to unbound and total concentration in plasma, drug exposures in CSF and brain were compared with efficacy biomarkers for a number of compounds and species in the prospective study (unpublished data).
The Knowledge Plot approach helps project teams to handle enormous amounts of information in a flexible way with limited efforts and without getting information overload. The Knowledge Plot visualizes and integrates available data and knowledge in a transparent way by enabling a holistic (cross domain) interpretation by taking all attributes into consideration. As of today we have about 15 different targets, 50 compounds, 10 species, 17 populations/subpopulations, 260 studies and 100 endpoints in our database and we continue to build the knowledge bank as new and existing compounds progress in the pipeline.
In this paper, the basic version of the Knowledge Plot has been presented to illustrate the main idea and how it can be implemented to support business needs. The concept can be, and has been, expanded in many different ways. Some characteristics that have been incorporated into the Knowledge Plot are: confidence in each observation (confidence intervals, standard error, and number of observations the summary is based on), individual data, and time dimension. Others have proposed a scoring system for the assessment of biomarkers and their translatability in early drug projects to estimate risks in project and portfolio decisions. This scoring system applies weights to data, using scores between 1 and 5 (e.g. 5 = more validated data, more clinically relevant data), to prevent weak data sets from having the same impact as strong data sets. This implies that therapeutic areas with high translational risk will already up front call for the need to identify more reliable biomarkers. Such a weighting procedure can be a complement in the decision-making process.
The use of expert knowledge to define a desired Meaningful Effect is the fundamental principle that distinguishes the Knowledge Plot from methods published by others. One example is the utility function that is a component in the Clinical Utility Index (CUI) [8–10]. The CUI is a weighted sum of the utilities of all the attributes that are considered to be important for decision-making in the actual situation. Both the CUI and each of the individual utilities take values in the (0, 1) range, whereas the TEI can take any value, though values in the (0, 100) range are most common. In order to add a new summary index, in parallel to what is presented in this paper, it only needs to be converted to the (0, 100) scale. The Clinical Utility Index can be used as an alternative, or be included as a complement, to the Treatment Effect Index.
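A small Python sketch of this conversion is given below. It computes a CUI as a weighted sum of attribute utilities and rescales it linearly from (0, 1) to (0, 100). The requirement that the weights sum to 1, the linear rescaling and the function names are assumptions made for illustration; the cited CUI publications should be consulted for the actual utility functions.

```python
import math


def clinical_utility_index(utilities: dict, weights: dict) -> float:
    """Weighted sum of attribute utilities, each utility in the 0-1 range.

    Assumption: the weights are non-negative and sum to 1, so that the
    resulting CUI also stays in the 0-1 range.
    """
    if set(utilities) != set(weights):
        raise ValueError("utilities and weights must cover the same attributes")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights are assumed to sum to 1")
    if any(not 0.0 <= u <= 1.0 for u in utilities.values()):
        raise ValueError("each utility must lie between 0 and 1")
    return sum(weights[name] * utilities[name] for name in utilities)


def to_knowledge_plot_scale(index_value: float) -> float:
    """Linearly rescale an index from the 0-1 range to the 0-100 range."""
    return 100.0 * index_value
```

With utilities of 0.8 for efficacy and 0.5 for safety and equal weights, the CUI is 0.65, which corresponds to 65 on the scale used in the Knowledge Plot.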
There are some important considerations, in particular regarding how an index such as the Treatment Effect Index is derived. One is that the random variation in the control group that is used to normalize against will result in even larger random variation in the normalized summary index, in particular when the control group is small in relation to the variation of the endpoint. Another concern is how to standardize the placebo/vehicle controlled treatment response. The Knowledge Plot uses standardization against the meaningful effect, which initially can be difficult to define for explorative endpoints. However, for such endpoints the relative difference between compounds can still be identified as long as the same meaningful effect value is used. Note that the reference value should be updated when more knowledge about the explorative endpoint is gained. Instead of using a reference value of a meaningful effect to derive an index, it has been suggested to use the variance, or the standard deviation, to standardize the treatment effect to a normalized summary index, e.g. the Clinical Utility Index [8–10]. There are pros and cons with all methods. The method that suits the particular situation should be selected depending on how complex data and information are to be interpreted with visualization techniques. The main reason for using the Treatment Effect Index is that it includes the expert opinion on the biological/medical knowledge, identifying at which level an endpoint will give benefit to the patient. The index is based on currently available treatments, so that an index value of 100 means a biologically meaningful treatment effect that has a realizable business value. A Napiergram can be derived by collapsing the vertical axis in the Knowledge Plot (see Figures 5 and 6, compound B).
Data standards and terminologies are very important in all data integration initiatives. Controlled vocabularies are especially important in order to compare across species, endpoints and studies. There are several available options that can be used (e.g. MeSH, SNOMED, MedDRA, NCI terminology), as well as data structures and information models where the CDISC suite (SEND and SDTM) or a Triple Store solution may be considered. In general, the terminology used in clinical development is more consistent than the terminology in preclinical research.
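The relation to the Napiergram can be illustrated with a short Python sketch that collapses the vertical axis: for each endpoint, only the range of unbound concentrations covered by the data is kept, and the effect values are discarded. The row keys and the function name are hypothetical, and this is one plausible reading of "collapsing", not the procedure used to produce the figures.

```python
def collapse_to_exposure_ranges(rows):
    """Return {endpoint: (min_unbound_conc, max_unbound_conc)}.

    rows -- iterable of dicts with at least the keys "endpoint" and
            "unbound_conc" (hypothetical data mart column names).
    The effect values (vertical axis) are ignored on purpose.
    """
    ranges = {}
    for row in rows:
        endpoint = row["endpoint"]
        conc = row["unbound_conc"]
        low, high = ranges.get(endpoint, (conc, conc))
        ranges[endpoint] = (min(low, conc), max(high, conc))
    return ranges
```

Each resulting range could then be drawn as a horizontal bar on a common unbound concentration axis, one bar per assay system or biomarker.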
Evaluating multiple attributes in a prospective manner and utilization of current information and knowledge are necessary procedures in order to optimize the productivity in future drug development. Visualization, data mining and knowledge management are all critical capabilities in this process. Currently, several stand-alone computational tools are used in a manual environment and data often exist in disconnected databases. Integration of data is a complex and difficult endeavor. Thus, a common computational infrastructure will remove many of the inherent road blocks. In addition, cultural changes have to take place to ensure effective sharing and integration of data between various functions, such as Drug metabolism and pharmacokinetics (DMPK), Pharmacology, Safety and Clinical, and external sources. The principles listed here are key to achieving shorter development timelines and will allow more time to be spent on building a Translational and Knowledge continuum that can support all drug development stages. Incorporation of scientific knowledge allows the organization to make informed and transparent investment decisions with respect to individual projects as well as the complete portfolio. Furthermore, a framework like the one presented in this paper serves as a common language for the spectrum of subject matter experts that constitute a modern drug development team. Each subject matter expert is accountable for the interpretation within their respective domain (i.e. the Meaningful Effect), but that same person can also easily accept the data-driven interpretations from all the other domains, which include subject matter expertise from other functions. Ultimately, we have seen that the Knowledge Plot catalyzes the build-up of confidence and trust among the team members, where the discussion has moved from one experiment at a time to what the experiment is worth in the context of all available information and knowledge.
The organizational effects are difficult to quantify, but informed decision-making and knowledge management are central for large research and development organizations. In this context the Knowledge Plot plays a central role and will be of great value for everyone that embraces its principles.
The Knowledge Plot allows a transparent head-to-head comparison of data across multiple domains. It harmonizes and simplifies the interpretation and enables scientific and informed decision-making in various stages of drug development. Furthermore, the Knowledge Plot visualizes the translational and knowledge continuum, which in turn will reduce the attrition rate and the costs of clinical development by spotting poor candidates early. It provides a quick overview of what has been done with a molecule and uncovers hidden patterns by comparing the molecule with previous similar molecules and by showing how its efficacy and safety profiles compare with those of potential competitors and comparators. Exhaustive information, holistic understanding and integration of all current knowledge are prerequisites for effective decision-making. Thus, the Knowledge Plot is a valuable communication, decision-making and knowledge management tool that will result in an improved understanding of the competitive edge for a particular project or disease area portfolio.
- Drolet BC, Lorenzi NM: Translational research: understanding the continuum from bench to bedside. Transl Res. 2010, 157: 1-5.
- Gabrielsson J, Dolgos H, Gillberg P-G, Bredberg U, Benthem B, Duker G: Early integration of pharmacokinetic and dynamic reasoning is essential if lead compounds are to be developed optimally: strategic considerations. Drug Discov Today. 2009, 14: 358-372.
- Morgan P, Van Der Graaf PH, Arrowsmith J, Feltner DE, Drummond KS, Wegner CD, Street SDA: Can the flow of medicines be improved? fundamental pharmacokinetic and pharmacological principles toward improving phase II survival. Drug Discov Today. 2009, 17: 419-424.
- Visser SAG, Aurell M, Jones RDO, Schuck V, Egnell A-C, Peters S, Brynne L, Yates JWT, Jansson-Löfmark R, Tan B, Cooke M, Barry ST, Hughes A, Bredberg U: Model based drug discovery – implementation and impact. Drug Discov Today. 2013, 18: 764-775. 10.1016/j.drudis.2013.05.012.
- Szalma S, Koka V, Khasanova T, Peraksils ED: Effective knowledge management in translational medicine. J Trans Med. 2010, 8: 68. 10.1186/1479-5876-8-68.
- Napier C, Wallis R: The napiergram: a tool for visualising efficacy and safety data. J Pharm and Toxicological Methods. 2010, 62 (2): e12.
- Korsan B, Dykstra K, Pullman W: Transparent trade-offs: a clinical utility index (CUI) openly evaluates a product’s attributes - and chance of success. Pharmaceutical Executive. 2005.
- Poland B, Hodge FL, Khan A, Clemen RT, Wagner JA, Dykstra K, Krishna R: The clinical utility index as a practical multiattribute approach to drug development decisions. Clin Pharmacol Ther. 2009, 86: 105-108. 10.1038/clpt.2009.71.
- Ouellet D: Benefit-risk assessment: the use of clinical utility index. Expert Opin Drug Saf. 2010, 9: 289-300. 10.1517/14740330903499265.
- Seltzer B, Zolnouni P, Nunez M, Goldman R, Kumar D, Ieni J, Richardson S: Efficacy of donepezil in early-stage alzheimer disease. Arch Neurol. 2004, 61: 1852-1856. 10.1001/archneur.61.12.1852.
- Adam J, Ades AE, Aslan T, Barnett D, Brain E, Claxton K, Cookson R, Duncan F, Eccleston C, Ewings P, Feest T, Forbes A, Geddes J, Goulston J, Hands L, Haxby E: Alzheimer’s disease - donepezil, rivastigmine, galantamine and memantine (review) - final appraisal document: national institute for health and clinical excellence. NICE Guidelines. 2006, 1-63. http://www.nice.org.uk/page.aspx?o=322952
- Maher-Edwards G, Dixon R, Hunter J, Gold M, Hopton G, Hunter J, Williams P: Efficacy and tolerability of SB-742457, a novel 5HT6 receptor antagonist, and donepezil in subjects with mild-to moderate alzheimer’s disease (AD). Alzheimer’s & Dementia: J Alzheimer’s Association. 2008, 4 (4): T772-T773.
- Birks J, Harvey RJ: Donepezil for dementia due to alzheimer’s disease. Cochrane Database Syst Rev. 2006, 25: CD001190.
- Gold M, Alderton A, Zvartau-Hind M, Ritchie S, Saunders A, Craft S, Landreth G, Linnamägi U, Sawchak S: Effects of rosiglitazone as monotherapy in APOE4-stratified subjects with mild-to-moderate Alzheimer's disease. The Journal of the Alzheimer's Association. 2009, 5 (4): 86.
- Wehling M: Translational medicine: can it really facilitate the transition of research 'from bench to bedside’?. Eur J Clin Pharmacol. 2006, 62: 91-95.
- Wehling M: Assessing the translatability of drug projects: what needs to be scored to predict success?. Nat Rev Drug Discov. 2009, 8: 541-546. 10.1038/nrd2898.
- Wendler A, Wehling M: Translatability scoring in drug development: eight case studies. J Trans Med. 2012, 10: 39-49. 10.1186/1479-5876-10-39.
- Ruch P, Gobeill J, Lovis C, Geissbühler A: Automatic medical encoding with SNOMED categories. BMC Med Inform Decis Mak. 2008, 8: S1-S6. 10.1186/1472-6947-8-S1-S1.
- Sioutos N, de Coronado S, Haber MW, Hartel FW, Shaiu WL, Wright LW: NCI Thesaurus: a semantic model integrating cancer-related clinical and molecular information. J Biomed Inform. 2007, 40: 30-43. 10.1016/j.jbi.2006.02.013.
- Souza T, Kush R, Evans JP: Global clinical data interchange standards are here!. Drug Discov Today. 2007, 12: 174-181. 10.1016/j.drudis.2006.12.012.
- Krishna R, Schaefer HG, Bjerrum OJ: Effective integration of system biology, biomarkers, biosimulation, and modeling in streamlining drug development. J Clin Pharmacol. 2007, 47: 738-43. 10.1177/0091270007300746.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.