
Adapting modeling and simulation credibility standards to computational systems biology


Computational models are increasingly used in high-impact decision making in science, engineering, and medicine. The National Aeronautics and Space Administration (NASA) uses computational models to perform complex experiments that are otherwise prohibitively expensive or require a microgravity environment. Similarly, the Food and Drug Administration (FDA) and European Medicines Agency (EMA) have begun accepting models and simulations as forms of evidence for pharmaceutical and medical device approval. It is crucial that computational models meet a standard of credibility when they are used in high-stakes decision making. For this reason, institutes including NASA, the FDA, and the EMA have developed standards to promote and assess the credibility of computational models and simulations. However, due to the breadth of models these institutes assess, these credibility standards are mostly qualitative and avoid making specific recommendations. On the other hand, modeling and simulation in systems biology is a narrower domain and several standards are already in place. As systems biology models increase in complexity and influence, the development of a credibility assessment system is crucial. Here we review existing standards in systems biology and credibility standards in other science, engineering, and medical fields, and propose the development of a credibility standard for systems biology models.


As computing power rapidly increases, computational models become more intricate and an increasingly important tool for scientific discovery. In systems biology, where the amount of available data has also expanded, computational modeling has become an important tool to study, explain, and predict behavior of biological systems. The scale of biological models ranges from subcellular components [1] to entire ecosystems [2]. Modeling paradigms include mechanistic models, rule-based systems, Boolean networks, and agent-based models [3]. This review will focus on mechanistic models of subcellular processes.

The Food and Drug Administration (FDA) defines model credibility as “the trust, established through the collection of evidence, in the predictive capability of a computational model for a context of use” [4]. Model credibility is important in systems biology as models are used to guide experiments or to optimize patient treatment. This is particularly important given the increasing scale and intricacy of models. Reproducibility, the ability to recreate a model and data de novo and obtain the same result [5], is directly connected to credibility, but even reproducibility remains a challenge. It was recently discovered that 49% of published models undergoing the review and curation process for the BioModels [6] database were not reproducible, primarily due to missing materials necessary for simulation, the unavailability of the model and code in public databases, and a lack of documentation [7]. With some extra effort, an additional 12% of the published models could be reproduced. A model that cannot be reproduced is not credible.

Due to the increasing importance of computational models in scientific discovery, the National Aeronautics and Space Administration (NASA), the FDA, and other regulatory bodies have developed standards to assess the credibility of models [4, 8, 9]. These standards are somewhat vague and generally qualitative to accommodate the broad scope of models in these fields. However, mechanistic models in systems biology are relatively narrow in scope and are supported by a variety of standards for model encoding, annotation, simulation, and dissemination potentially enabling the development of a credibility standard for mechanistic systems biology models.

In this review, we discuss current systems biology modeling standards that could aid in the development of credibility standards, examine existing credibility standards in other scientific fields, and propose that current standards in systems biology and other fields could support the development of a credibility standard for mechanistic systems biology models.

Current standards in systems biology

Klipp et al. describe standards as agreed-upon formats used to enhance information exchange and mutual understanding [10]. In the field of systems biology, standards are a means to share information about experiments, models, data formats, nomenclature, and graphical representations of biochemical systems. Standardized means of information exchange improve model reuse, expandability, and integration as well as allowing communication between tools. In a survey of 125 systems biologists, most thought of standards as essential to their field, primarily for the purpose of reproducing and checking simulation results, both essential aspects of credibility [10].

A multitude of standards exist in systems biology for processes from annotation to dissemination. Although there is currently no widely used standard for model credibility, the development of this standard is likely to depend on existing systems biology standards, just as standards for model simulation are dependent on standards for model encoding. This section will summarize current standards relevant to model credibility including standards for ontology, encoding, simulating, and disseminating models. Although standards also exist for graphical representation of systems biology models (SBGN) [11] and representation of simulation results (SBRML) [12], these will not be discussed here as they are less relevant to the future implementation of model credibility standards.

Model representation

Having a commonly understood language for describing a model is essential for exchange, reproducibility, and credibility. Without a common language to describe models, they cannot be simulated across different platforms or freely shared. For this reason, systems biology model representation has become standardized using the XML-based languages SBML [10, 13], CellML [14], and BioPAX [15]. NeuroML [16], similar to SBML and CellML, is used to represent neuronal models, but is beyond the scope of this review.


The most widely used model format is SBML (Systems Biology Markup Language) [10, 13, 17, 18]. SBML is an XML-based language for encoding mathematical models that reproduce biological processes, particularly biochemical reaction networks, gene regulation, metabolism, and signaling networks [17, 19]. SBML encodes critical biological process data such as species, compartments, reactions, and other properties (such as concentrations, volumes, stoichiometry, and rate laws) in a standardized format. Annotations can also be stored in the SBML format. With its support by over 200 third-party tools and its ability to easily convert to other model formats, SBML is the de facto language for systems biology models [13, 19].

SBML models are composed of entities, such as species, located in containers, that can be acted upon by processes that create, destroy, or modify them [20]. Other elements allow for the definition of parameters, initial conditions, variables, and mathematical relationships. The SBML language is structured as a series of upwardly compatible levels, with higher levels incorporating more powerful features. Versions describe the refinement of levels. Most recently, SBML level 3 introduced a modular architecture consisting of a set of fixed features, SBML level 3 core, and a scheme for adding packages that augment the core functionality. This allows for extensive customization of the language while enabling reuse of key features. Currently, eight packages are part of the SBML 3 standard. These packages extend the capability of SBML, for example by enabling descriptions of uncertainties in terms of distributions [21], allowing for the encoding, exchange, and annotation of constraint-based models [22], and rendering visual diagrams [23], among many others [20].
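As an illustration of the entity, container, and process structure described above, the following minimal sketch parses a hand-written SBML Level 3 fragment using only the Python standard library. The model (`toy_model`, species `A` and `B`, reaction `J1`) is hypothetical and the kinetic law is omitted for brevity. The fragment has not been checked with an SBML validator, and in practice a dedicated library such as libSBML would normally be used.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 core.
NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Hypothetical one-reaction model (A -> B), written by hand for illustration.
SBML_DOC = """
<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" size="1" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell" initialConcentration="10"
               hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false"/>
      <species id="B" compartment="cell" initialConcentration="0"
               hasOnlySubstanceUnits="false" boundaryCondition="false" constant="false"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="J1" reversible="false" fast="false">
        <listOfReactants>
          <speciesReference species="A" stoichiometry="1" constant="true"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="B" stoichiometry="1" constant="true"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def summarize(sbml_text):
    """Return (species -> initial concentration, reaction -> (reactants, products))."""
    root = ET.fromstring(sbml_text.strip())
    model = root.find("sbml:model", NS)
    species = {
        s.get("id"): float(s.get("initialConcentration"))
        for s in model.findall("sbml:listOfSpecies/sbml:species", NS)
    }
    reactions = {}
    for r in model.findall("sbml:listOfReactions/sbml:reaction", NS):
        reactants = [ref.get("species")
                     for ref in r.findall("sbml:listOfReactants/sbml:speciesReference", NS)]
        products = [ref.get("species")
                    for ref in r.findall("sbml:listOfProducts/sbml:speciesReference", NS)]
        reactions[r.get("id")] = (reactants, products)
    return species, reactions


species, reactions = summarize(SBML_DOC)
print(species)
print(reactions)
```

Because the structure is standardized, the same few lines of parsing would apply to any model that follows the format, which is what makes tool interoperability possible.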


Similar to SBML but broader in scope, CellML is also an XML-based language for reproducing mathematical models of any kind, including biochemical reaction networks [24]. CellML models do not encode biological information explicitly in the model, but instead consist of mathematical formulations of biological processes [25]. This feature of the CellML language increases flexibility enabling the description of a wide variety of biological processes. CellML models are composed of several interconnected components [26] with each component containing at least one variable that is associated with physical units. This enables CellML processors to automatically check equations for dimensional consistency.
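The idea behind automatic dimensional checking can be sketched in a few lines. This is not the algorithm used by any particular CellML processor. It is a simplified illustration in which units are represented as dictionaries of base-unit exponents, and all names (`multiply`, `divide`, `CONCENTRATION`, and so on) are hypothetical.

```python
def multiply(u, v):
    """Multiply two units by adding the exponents of their base units."""
    out = dict(u)
    for dim, exp in v.items():
        out[dim] = out.get(dim, 0) + exp
    # Drop base units whose exponents cancel to zero.
    return {d: e for d, e in out.items() if e != 0}


def divide(u, v):
    """Divide unit u by unit v."""
    return multiply(u, {d: -e for d, e in v.items()})


# Units attached to the variables of a first-order decay, d[A]/dt = -k * [A].
CONCENTRATION = {"mole": 1, "litre": -1}
TIME = {"second": 1}
RATE_CONSTANT = {"second": -1}

lhs = divide(CONCENTRATION, TIME)               # units of d[A]/dt
rhs = multiply(RATE_CONSTANT, CONCENTRATION)    # units of k * [A]
consistent = lhs == rhs

# A mistyped equation, d[A]/dt = -k * k, fails the same check.
bad_rhs = multiply(RATE_CONSTANT, RATE_CONSTANT)
inconsistent = lhs != bad_rhs

print(consistent, inconsistent)
```

A check of this kind catches a class of modeling errors before any simulation is run, which is one way an encoding standard can contribute directly to credibility.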

Both CellML and SBML use almost identical mathematical expressions in MathML, an international standard for encoding mathematical expression using XML [27]. CellML explicitly encodes all mathematics, such as ODEs [28]. It is more versatile than SBML, capable of describing any type of mathematical model. SBML defines reaction rates, which can be used to build rate rules and ODEs [29]. There is more third party support for SBML and it is a semantically richer language compared to CellML [13, 28].
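The relationship between reaction rates and ODEs can be made concrete with a small sketch: each species' rate of change is the stoichiometry-weighted sum of the reaction rates. The two-reaction network (A -> B -> C), the mass-action rate laws, and the parameter values below are hypothetical and chosen only for illustration.

```python
# Stoichiometry of each species in reactions J1 (A -> B) and J2 (B -> C).
STOICH = {
    "A": [-1, 0],
    "B": [1, -1],
    "C": [0, 1],
}


def rates(x, k):
    """Reaction rates for J1 and J2, assuming mass-action kinetics."""
    return [k["k1"] * x["A"], k["k2"] * x["B"]]


def derivatives(x, k):
    """Assemble dx/dt for every species as sum_j N[species][j] * v[j]."""
    v = rates(x, k)
    return {s: sum(n * vj for n, vj in zip(row, v)) for s, row in STOICH.items()}


state = {"A": 10.0, "B": 2.0, "C": 0.0}
params = {"k1": 0.5, "k2": 0.1}
dxdt = derivatives(state, params)
print(dxdt)
```

In a language that encodes ODEs directly, the right-hand sides computed by `derivatives` would be written out explicitly, whereas a reaction-based encoding stores only the stoichiometry and the rate laws and leaves the assembly to the simulator.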


BioPAX (Biological Pathway Exchange) is an ontology, a formal system for describing knowledge, that structures biological pathway data so that it is more easily processed by computer software [15]. It describes the biological semantics of metabolic, signaling, molecular, gene-regulatory, and genetic interaction networks [15]. Whereas SBML and CellML focus on quantitative modelling and dynamic simulation, BioPAX concentrates primarily on qualitative processes and visualization [15, 30].

BioPAX contains one superclass, Entity. Within the Entity superclass, there are two main classes: PhysicalEntity and Interaction. PhysicalEntity describes molecules, including proteins, complexes, and DNA, while the Interaction class defines reactions and relationships between instances of the PhysicalEntity class. Interactions can be either Control or Conversion, both of which are divided into several more detailed subclasses [15, 30]. Like SBML, BioPAX is released level-wise with level 1 describing interactions, level 2 supporting signaling pathways and molecular interactions, and level 3 enabling the description of gene-regulatory networks and genetic interactions.


As models grow more numerous and complex, there is an increasing need for a standardized encoding format to search, compare, and integrate them. While standards such as SBML and CellML provide information on the mathematical structure of a model, there is no information as to what variables and mathematical expressions represent. Simple textual descriptions of these representations are subject to errors and ambiguity and require text-mining for computational interpretation [31]. Standardized metadata annotations address these issues by capturing the biological meaning of a model’s components and describing its simulation, provenance, and layout information for visualization. The use of annotations improves model interoperability, reusability, comparability and comprehension [32]. Annotations are enabled by systems biology specific ontologies [25] which define a common vocabulary and set of rules to unambiguously represent information [33].

To avoid accounting for a variety of annotation formats and approaches, standard annotation protocols are necessary [32]. However, despite the numerous standards and tools, annotation remains a challenge. For example, the ChEBI database [34] has approximately 1,000 annotations for glucose. While more than one entry for each annotation can serve a purpose (some users may prefer to be more abstract in their annotations), this adds to the challenge of defining the purpose of a model and, therefore, its credibility. Additionally, annotations can be obsolete, inappropriate, or incorrect, or can provide insufficient information. Evaluating the quality of annotations would be essential in any credibility assessment for systems biology models. Some tools already exist for this purpose, such as SBMate [35], a Python package that automatically assesses coverage, consistency, and specificity of semantic annotations in systems biology models.


MIRIAM (Minimum Information Requested in the Annotation of Biochemical Models) [36] was developed to encourage the standardized annotation of computational models by providing guidelines for annotation. The MIRIAM guidelines suggest that model metadata clearly reference the relevant documentation (e.g. a journal article), that the documentation and encoded model have a high degree of correspondence, and that the model be encoded in a machine-readable format (such as SBML or CellML). Annotations should also include the name of the model, the citation for its corresponding journal article, the contact information of the creators and date of creation, as well as a statement about the terms of distribution. Additionally, models should have accurate annotations that unambiguously link model components to corresponding structures in existing open access bioinformatics resources. The referenced information should be described using a triplet (data collection, collection-specific identifier, optional qualifier) and expressed as a Uniform Resource Identifier (URI), a unique sequence of characters that identifies a resource used by web technologies [37]. The optional qualifier field is used to describe relationships between the model constituents and the piece of knowledge with language such as “has a”, “is a version of”, “is homologous to”, etc.
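The triplet described above can be sketched as a small data structure. The class name `MiriamReference`, the identifiers.org URL layout, and the example entry (ChEBI identifier `CHEBI:17234`, used here for glucose) are assumptions made for illustration and should be checked against the relevant registry before use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MiriamReference:
    """Hypothetical container for a MIRIAM-style annotation triplet."""
    collection: str                  # data collection, e.g. "chebi"
    identifier: str                  # collection-specific identifier
    qualifier: Optional[str] = None  # optional relationship, e.g. "isVersionOf"

    def uri(self) -> str:
        # Assumed URL layout: resolver, then collection, then identifier.
        return f"https://identifiers.org/{self.collection}/{self.identifier}"


# Illustrative annotation: a model species that "is a version of" glucose.
glucose_ref = MiriamReference("chebi", "CHEBI:17234", qualifier="isVersionOf")
print(glucose_ref.uri(), glucose_ref.qualifier)
```

Keeping the qualifier separate from the URI mirrors the guideline: the URI says what resource is referenced, while the qualifier says how the model component relates to it.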

Systems biology ontology (SBO)

SBO (Systems Biology Ontology) describes entities used in computational modeling [31, 38]. It defines a set of interrelated concepts used to specify the types of components in a model and their relationships to one another. Annotation with SBO terms allows for unambiguous and explicit understanding of the meaning of model components and enables mapping between elements of different models encoded in different formats [31]. Both SBML and CellML support annotation with SBO terms. SBML elements contain an optional sboTerm attribute [25, 29, 31].


In order to harmonize the metadata annotations across models encoded in various formats, Gennari et al., with the consensus of the COMBINE (Computational Modeling in Biology Network) community, developed a specification for encoding annotations in Open Modeling Exchange (OMEX)-formatted archives. The specification describes standards for model component annotations as well as for annotation at the model level and archive level [32]. It describes annotation best practices, addresses annotation issues such as composite annotations and annotating tabular data and physical units, and provides a list of ontologies relevant to systems biology. Implementation of these specifications is aided by LibOMexMeta, a software library supporting reading, writing, and editing of model annotations. It uses the Resource Description Framework (RDF) [39], an XML-based standard format for data exchange on the web, for representing annotations. It also makes use of several standard knowledge resources describing biology and biological processes such as ChEBI [34], a dictionary of small chemical compounds, and UniProt [40], a database of protein sequence and functional information.

Annotation in CellML and SBML

Both CellML and SBML have their own annotation protocols based on RDF [25]. The CellML language uses its own ontology for model annotation, a necessity due to the flexibility of the language [24]. The CellML Metadata Specification was developed in parallel with the CellML language [25]. The CellMLBiophysical/OWL ontology is composed of two categories: physical and biological [25]. The physical ontology describes physical quantitative information and concepts captured in the model’s mathematical expressions. It is subdivided into processes, such as enzyme kinetics, ionic current, and rate constants, and physical entities, such as area, concentration, volume, and stoichiometry. The biological ontology provides descriptions of processes, entities, the role of an entity in relation to a process, and the specific location of the entity in a biological system. Bioprocesses are divided into three subclasses: biochemical reactions, transport, and complex assembly. Biological entities include proteins, small molecules, and complexes. The biological roles subclass is composed of modifiers, reactants, and products.

SBML also facilitates MIRIAM-compliant annotation using RDF [39, 41]. Annotations use qualifier elements [42] embedded in the XML form of RDF [43]. Each annotation is a single RDF triple consisting of the model component to annotate (subject), the relationship between the model component and the annotation term (predicate), and a term which describes the meaning of the component (object). These terms come from defined ontologies, such as SBO [38]. RDF annotation is supported by the software libraries libSBML [44] and JSBML [45].
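The subject, predicate, and object structure of such an annotation can be seen by extracting triples from a small RDF/XML fragment. The fragment below is hand-written for illustration: the `metaid` value `meta_glc` and the annotated resource are hypothetical, and the parsing is a simplified sketch using the Python standard library rather than libSBML or JSBML.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"

# Hypothetical annotation attached to a species whose metaid is "meta_glc".
ANNOTATION = f"""
<annotation>
  <rdf:RDF xmlns:rdf="{RDF}" xmlns:bqbiol="{BQBIOL}">
    <rdf:Description rdf:about="#meta_glc">
      <bqbiol:is>
        <rdf:Bag>
          <rdf:li rdf:resource="https://identifiers.org/chebi/CHEBI:17234"/>
        </rdf:Bag>
      </bqbiol:is>
    </rdf:Description>
  </rdf:RDF>
</annotation>
"""


def extract_triples(xml_text):
    """Return (subject, predicate, object) tuples found in the annotation."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for description in root.iter(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for qualifier in description:
            # ElementTree reports tags as "{namespace}localname".
            predicate = qualifier.tag
            for item in qualifier.iter(f"{{{RDF}}}li"):
                triples.append((subject, predicate, item.get(f"{{{RDF}}}resource")))
    return triples


triples = extract_triples(ANNOTATION)
print(triples)
```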

Simulation and parameter estimation

Information about a model alone is insufficient to enable efficient reuse. A variety of advanced numerical algorithms and complex modeling workflows make the reproduction of simulations challenging. Many modelers reproduce simulations by reading the simulation description in the corresponding publication [46]. This is time consuming and error prone and the published description of a simulation is often incomplete or incorrect. For these reasons, it is essential to define and include information necessary to perform all simulations.


Guidelines for the Minimum Information About a Simulation Experiment (MIASE) were introduced to specify what information should be provided in order to correctly reproduce and interpret a simulation [46]. MIASE is a set of rules that fall into three categories: information about the model used in the simulation experiment must be listed in a way that enables reproduction of the experiment; all information necessary to run any step of the experiment must be provided; all information needed to post-process data and compare results must be included. Along with MIRIAM [36] guidelines, MIASE compliance guarantees that the simulation experiment is true to the intention of the original authors and is reproducible.


KiSAO (Kinetic Simulation Algorithm Ontology) is an ontology used to describe and structure existing simulation algorithms [31, 47]. It consists of three main branches, each with several subbranches. The first branch is kinetic simulation algorithm characteristics, such as the type of system behavior or type of solution. The second is the kinetic simulation algorithm itself, such as Gillespie or accelerated stochastic simulation. The third branch is kinetic simulation algorithm parameters, which describe error and granularity, among other characteristics.


Simulation Experiment Description Markup Language (SED-ML) is a software-independent, XML-based format for encoding descriptions of simulation experiments and results [48, 49]. To help modelers comply with MIASE rules, SED-ML encodes the details of simulation procedures, including which datasets and models to use, which modifications to apply to models, which simulations to run on each model, and how to post-process, report, and present results [46]. Each algorithm mentioned in a SED-ML file must be identified by a KiSAO term [31]. PhraSED-ML was developed to enable modelers to encode human-readable SED-ML elements without the use of specialized software [50].
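The requirement that every algorithm carry a KiSAO term lends itself to an automated check. The sketch below parses a hand-written SED-ML fragment and reports simulations without a well-formed KiSAO identifier, as well as tasks whose references do not resolve. The file name `toy_model.xml`, the element identifiers, and the choice of `KISAO:0000019` (assumed here to denote a CVODE-style integrator) are illustrative, and the fragment has not been validated against the SED-ML schema.

```python
import re
import xml.etree.ElementTree as ET

SEDML_NS = "http://sed-ml.org/sed-ml/level1/version3"
KISAO_PATTERN = re.compile(r"^KISAO:\d{7}$")

# Hypothetical experiment: one model, one time course, one task.
SEDML_DOC = f"""
<sedML xmlns="{SEDML_NS}" level="1" version="3">
  <listOfModels>
    <model id="model1" language="urn:sedml:language:sbml" source="toy_model.xml"/>
  </listOfModels>
  <listOfSimulations>
    <uniformTimeCourse id="sim1" initialTime="0" outputStartTime="0"
                       outputEndTime="10" numberOfPoints="100">
      <algorithm kisaoID="KISAO:0000019"/>
    </uniformTimeCourse>
  </listOfSimulations>
  <listOfTasks>
    <task id="task1" modelReference="model1" simulationReference="sim1"/>
  </listOfTasks>
</sedML>
"""


def check_sedml(text):
    """Return a list of human-readable problems (empty if none are found)."""
    ns = {"s": SEDML_NS}
    root = ET.fromstring(text.strip())
    problems = []
    model_ids = {m.get("id") for m in root.findall("s:listOfModels/s:model", ns)}
    sim_ids = set()
    for sim in root.findall("s:listOfSimulations/*", ns):
        sim_ids.add(sim.get("id"))
        algorithm = sim.find("s:algorithm", ns)
        kisao_id = algorithm.get("kisaoID", "") if algorithm is not None else ""
        if not KISAO_PATTERN.match(kisao_id):
            problems.append(f"simulation {sim.get('id')}: missing or malformed KiSAO term")
    for task in root.findall("s:listOfTasks/s:task", ns):
        if task.get("modelReference") not in model_ids:
            problems.append(f"task {task.get('id')}: unknown model reference")
        if task.get("simulationReference") not in sim_ids:
            problems.append(f"task {task.get('id')}: unknown simulation reference")
    return problems


print(check_sedml(SEDML_DOC))
```

The check only confirms that the identifier has the expected shape. Confirming that the term exists and names the intended algorithm would require a lookup in the ontology itself.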


Parameter estimation is common in modeling and simulation, which often requires running multiple simulations to scan the suitability of several parameter sets. Although many parameter estimation toolboxes exist, they each use their own input formats. The lack of a standardized format makes it difficult to switch between tools, hindering reproducibility [51]. PEtab is a parameter estimation problem definition format consisting of several files containing information necessary for parameter estimation, including the model (in SBML format), experimental conditions, observables, measurements, parameters, and optional visualization files [51]. A final PEtab problem file links all other files to form a single, reusable, parameter estimation problem. Following the success of PEtab, parameter estimation functionality was added to SED-ML [49].
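The role of the problem file as the link between the other files can be illustrated with a short sketch. The dictionary below mirrors what is assumed to be the layout of a version 1 PEtab problem file, written as a Python dictionary to avoid a YAML dependency. All file names are hypothetical, and the helper functions are not part of any PEtab library.

```python
# Assumed layout of a PEtab (format version 1) problem file, as a dictionary.
PETAB_PROBLEM = {
    "format_version": 1,
    "parameter_file": "parameters.tsv",
    "problems": [
        {
            "sbml_files": ["model.xml"],
            "condition_files": ["conditions.tsv"],
            "observable_files": ["observables.tsv"],
            "measurement_files": ["measurements.tsv"],
            "visualization_files": [],  # optional
        }
    ],
}

FILE_KEYS = (
    "sbml_files",
    "condition_files",
    "observable_files",
    "measurement_files",
    "visualization_files",
)


def referenced_files(problem):
    """List every file the problem definition points to, in declaration order."""
    files = [problem["parameter_file"]]
    for sub_problem in problem["problems"]:
        for key in FILE_KEYS:
            files.extend(sub_problem.get(key, []))
    return files


def missing_files(problem, available):
    """Return referenced files that are absent from the collection `available`."""
    return [name for name in referenced_files(problem) if name not in available]


print(referenced_files(PETAB_PROBLEM))
print(missing_files(PETAB_PROBLEM, {"model.xml", "parameters.tsv"}))
```

A completeness check of this kind is a simple precondition for reproducing a parameter estimation result: if any referenced file is missing, the problem cannot be reconstructed.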


Model reproducibility best practices describe dissemination as an essential part of reproducibility [52]. Sharing all model artifacts and documentation on open-source repositories allows independent researchers to reproduce, reuse, and understand the model. Several guidelines and archive formats have been developed to ensure that all relevant information necessary to reproduce a modeling result is easily accessible to the public.

MIRIAM curation guidelines

In addition to annotation guidelines, MIRIAM also provides guidelines for model curation, the process of collecting and verifying models. The aim of the MIRIAM guidelines is to ensure that a model is properly associated with a reference description (e.g. a journal article) and that it is consistent with that reference description, meaning that it reflects the biological process listed in the reference description. The model must be encoded in a public, machine-readable format such as SBML or CellML and comply with the associated encoding standard. The encoded model must be simulatable, including quantitative values for initial conditions, parameters, and kinetic expressions, and must reproduce relevant results when simulated [36].


More recently, the FAIR guidelines were published to improve the ability of computers to Find, Access, Interoperate, and Reuse models [53] with minimal human interaction. FAIR defines characteristics that data resources should possess to assist with discovery and reuse by third parties. Unlike most data management and archival guidelines, FAIR is a set of high-level, domain-independent guidelines that can be applied to a variety of digital assets. Each element of the FAIR principles is independent.

For a model to be “findable,” it should be easy to find for both humans and computers. This requires describing and annotating data and metadata with unique identifiers that are registered or indexed in a searchable resource. Once the user finds the relevant model, it should be accessible: data and metadata should be retrievable by their identifiers using a standard communications protocol. Metadata should remain accessible even when data are no longer available. Interoperability refers to the integration with other data and the ability to operate with various applications and workflows. This is enabled by the use of broadly applicable languages for model representation and annotation. As the ultimate goal of FAIR is to enable the reuse of data, the guidelines dictate that data and metadata should be associated with detailed provenance, meet domain-specific community standards (such as the COMBINE archive format described below), and be released with a clear and accessible data usage license.

COMBINE archives

COMBINE (COmputational Modelling in BIology NEtwork) is a formal entity that coordinates standards in systems biology. To assist in this coordination, the COMBINE Archive, a MIRIAM-compliant system for sharing groups of documents regarding a model, was developed [54]. The archive is encoded in OMEX (Open Modeling Exchange format) and the archive itself is a “ZIP” file. A COMBINE archive can contain files in several different standard formats including SBML, SBOL, and SED-ML, among others. Additionally, every COMBINE Archive contains a file titled manifest.xml that lists all the files comprising the archive and describes their locations. An archive may also contain a metadata file, ideally conforming to MIRIAM and MIASE guidelines. The inclusion of all necessary protocols and data needed to implement a model enables distribution of models via a single file, encouraging reuse and improving reproducibility [55].
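Because a COMBINE archive is a ZIP file with a manifest, its structure can be sketched with the Python standard library alone. The example below builds a small archive in memory and reads the manifest back. The file names `model.xml` and `experiment.sedml` and their placeholder contents are hypothetical, and the manifest is a hand-written approximation of the OMEX layout rather than output from a COMBINE tool.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MANIFEST_NS = "http://identifiers.org/combine.specifications/omex-manifest"
SPEC = "http://identifiers.org/combine.specifications"

# Hand-written manifest listing the archive itself and each file it contains.
MANIFEST = f"""<omexManifest xmlns="{MANIFEST_NS}">
  <content location="." format="{SPEC}/omex"/>
  <content location="./manifest.xml" format="{SPEC}/omex-manifest"/>
  <content location="./model.xml" format="{SPEC}/sbml"/>
  <content location="./experiment.sedml" format="{SPEC}/sed-ml" master="true"/>
</omexManifest>
"""


def build_archive():
    """Create an in-memory archive with placeholder model and experiment files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("manifest.xml", MANIFEST)
        archive.writestr("model.xml", "<sbml/>")          # placeholder content
        archive.writestr("experiment.sedml", "<sedML/>")  # placeholder content
    return buffer.getvalue()


def list_contents(archive_bytes):
    """Return (location, short format name) pairs declared in the manifest."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        root = ET.fromstring(archive.read("manifest.xml"))
    return [
        (entry.get("location"), entry.get("format").rsplit("/", 1)[-1])
        for entry in root.findall(f"{{{MANIFEST_NS}}}content")
    ]


contents = list_contents(build_archive())
print(contents)
```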

Credibility guidelines in systems biology

Although no standard for model credibility in systems biology exists, general guidelines aimed at improving the trustworthiness of models have been developed by the Committee on Credible Practice of Modeling and Simulation in Healthcare, a group formed by the U.S. National Institutes of Health [56]. The purpose of these guidelines is to encourage the credible use of modeling and simulation in healthcare and translational research. The term “credible” was defined as “dependable, with a desired certainty level to guide research or support decision-making within a prescribed application domain and intended use; establishing reproducibility and accountability.” These guidelines are qualitative, share many components with best practices for reproducibility, and are intended to cover a variety of modeling approaches and applications within the biomedical context (Fig. 1).

The credibility of a model should be evaluated within the model’s context of use [56]. To this end, the guidelines recommend using contextually appropriate data and evaluating the model (performing verification, validation, uncertainty quantification and sensitivity analysis) with respect to the context in which the model will be used. Any limitations should be listed explicitly.

Borrowing from software engineering best practices, the guidelines also recommend the use of version control to track model and simulation development as well as extensive documentation of simulation code, model mark-up, scope and intended use. Models should also include guides for developers and users and conform to domain-specific standards [56].

Different simulation strategies should be tested to ensure that the results and conclusions are similar across various tools and methods. All modeling components such as software, models, and results should be reviewed by third-party users and developers and disseminated widely.

Fig. 1. The Committee on Credible Practice of Modeling and Simulation in Healthcare 10 rules of model credibility

Qualitative credibility assessment in other modeling fields

NASA and the FDA also have a keen interest in producing well-documented and credible models for the purpose of making critical decisions. However, modeling and simulation tasks in these institutions are far broader compared to systems biology. NASA models range from the analysis of individual parts to orbits and spacecraft while models submitted to the FDA include medical devices and pharmacokinetics. Due to the wide variety of modeling tasks relevant to NASA and the FDA, credibility guidelines in these institutions are general, largely qualitative, and do not prescribe specific tests.

NASA credibility assessment scale

After the loss of the Columbia Space Shuttle and its seven crew members in 2003, NASA significantly increased its focus on quantitative and credible models. The misuse of an existing model and the reliance on engineers’ judgment led to the false conclusion that shuttle reentry would not be affected by a small hole in the heat shield caused by a debris strike during takeoff [57, 58]. The lack of quantifiable uncertainty and risk analysis in the report to management ultimately led to the shuttle’s disintegration [59]. Since then, NASA has developed extensive modeling and simulation standards including the Credibility Assessment Scale (CAS) [8] (Fig. 2).

Each model credibility standard described here emphasizes that assessments be made within a specific context of use: the specific role and scope of the model and the specific question of interest that the model is intended to help answer [4]. The judgment error that ultimately led to the Columbia Space Shuttle disaster was partially due to the use of modeling software far outside its intended context of use, leading to incorrect predictions and over-reliance on engineers’ judgment [57]. In addition to specifying the scope and question of interest, the context of use should also describe how model outputs will be used to answer the question of interest and whether other information, such as bench-testing, will be used in conjunction with the model to answer the question of interest [4]. The standards described here, from various institutions such as NASA and the FDA, all specify that credibility is to be evaluated within a specific context of use.

Fig. 2. Categories of the NASA Credibility Assessment Scale (CAS)

NASA’s CAS is intended to help a decision-maker evaluate the credibility of specific modeling and simulation results and to identify aspects of the results that most influence credibility [57, 60]. The credibility assessment process can be viewed as a two-part process: first the modeler conveys an assessment of the results, then a decision-maker infers the credibility of these results. The CAS standard consists of eight factors grouped into three categories [8]: development, operations, and management. Each of the eight factors is scored on a scale of 0–4 with guidelines for each numeric score. These factors were selected because they were considered to be the most essential, sufficiently independent of one another, and objectively assessable. While the primary concern is the score for each individual factor, the secondary concern is the score of the overall model, which is the minimum score of the eight factors.

The model and simulation (M&S) development category consists of two factors: verification and validation [8]. Scoring of these factors assesses the correctness of the model implementation, the numerical error and uncertainty, and the extent to which the M&S result matches reference data. If numerical errors for important features are “small” and if results agree with real-world data, the highest score of 4 is awarded for these factors.

The second category, M&S operations, consists of three factors: input pedigree, results uncertainty, and results robustness [8]. Input pedigree describes the level of trust in the input data, with input data that accurately reflects real-world data receiving the highest score. The results uncertainty factor earns the highest score if non-deterministic numerical analysis is performed. High scores for results robustness are achieved by including sensitivity analysis for most parameters and identifying key sensitivities.

Model and simulation management, the third category, is less technical, containing the factors use history, M&S management, and people qualification [8]. Use history receives the highest score if the model has previously been used successfully and meets de facto standards. For example, a model used for finite element analysis (FEA) would be required to meet FEA standards and codes for the type of object being modeled. M&S management refers to the maintenance and improvement of the model, with continual process improvement receiving the highest score of 4. The people qualification factor assesses the experience and qualifications of those constructing, maintaining, and using the model, with personnel who have extensive experience with the model and best practices scoring the highest.

Although these categories were chosen, in part, due to their ability to be objectively assessed, there is still a significant subjective component of the scoring process. It is acknowledged that different decision makers may assign different degrees of credibility to the same model and different decisions may require different levels of credibility. The CAS serves as a template to assess and clearly communicate risks to decision-makers. Additionally, it can be useful in measuring model development progress or in identifying areas where improvement is most needed [57].
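The roll-up rule of the CAS, in which the overall score is the minimum of the eight factor scores, can be expressed in a few lines. The sketch below follows the factor grouping described in the text. The example scores are invented for illustration and do not come from any real assessment, and the function names are hypothetical.

```python
# The eight CAS factors, grouped into the three categories described in the text.
CAS_FACTORS = {
    "development": ["verification", "validation"],
    "operations": ["input pedigree", "results uncertainty", "results robustness"],
    "management": ["use history", "M&S management", "people qualification"],
}


def overall_score(scores):
    """Return the overall CAS score: the minimum of the eight factor scores."""
    expected = {factor for group in CAS_FACTORS.values() for factor in group}
    if set(scores) != expected:
        raise ValueError("scores must cover exactly the eight CAS factors")
    for factor, score in scores.items():
        if not 0 <= score <= 4:
            raise ValueError(f"{factor}: score must be between 0 and 4")
    return min(scores.values())


def weakest_factors(scores):
    """Return the factors that determine the overall score."""
    lowest = overall_score(scores)
    return sorted(factor for factor, score in scores.items() if score == lowest)


# Invented example scores for a hypothetical model.
example_scores = {
    "verification": 3,
    "validation": 2,
    "input pedigree": 4,
    "results uncertainty": 1,
    "results robustness": 2,
    "use history": 3,
    "M&S management": 2,
    "people qualification": 4,
}

print(overall_score(example_scores), weakest_factors(example_scores))
```

Taking the minimum rather than the mean means that a single weak factor cannot be masked by strong scores elsewhere, which is consistent with using the scale to identify where improvement is most needed.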

Credibility standards for medical models

In addition to systems biology, computational models are also becoming essential tools in biomedical applications such as drug discovery [61], pharmacokinetics [62], and medical devices [63]. Credibility is essential in biomedical modeling, particularly in cases where models influence patient treatment or regulatory approval of a device or drug. Both the FDA and the European Medicines Agency have developed standards guiding model credibility for the purposes of regulatory approval. As with NASA, these guidelines are broad and qualitative due to the broad scope of biomedical modeling. Before the FDA began formalizing guidelines for model credibility, the American Society of Mechanical Engineers (ASME) issued Verification and Validation (V&V) 40 for assessing credibility of computational models in medical device applications [64]. However, this standard assumes the ability to perform traditional validation activities such as comparing model predictions to well-controlled validation experiments [4], a task which can be infeasible with some biomedical models. Many models used in regulatory submissions are supported by multiple sources of evidence beyond traditional validation experiments, including clinical trials and population-level validation. Recognizing this fact, FDA modeling credibility guidelines expand on ASME V&V 40 concepts to provide a more general framework for assessing a wider variety of models.

ASME V&V 40

The ASME developed the V&V 40 standard in 2012 as a means of describing verification and validation activities in the modeling and simulation of medical devices [64]. Like the NASA CAS, V&V 40 focuses on context of use, model risk, and the establishment of credibility goals prior to any credibility assessment. The context of use defines the specific role of the model in addressing the question of interest. Model risk is then assessed based on the possibility that the model may lead to incorrect conclusions resulting in adverse outcomes. After the establishment of credibility goals, verification and validation take place.

Of particular relevance to modeling in systems biology are the descriptions of code and calculation verification found in V&V 40. Verification seeks to determine if the model is built correctly. More specifically, code verification aims to identify any errors in the source code and numerical algorithms. This can be done by comparing output from a model to benchmark problems with known solutions [64]. Calculation verification estimates the error in the output of a model due to numerical methods. Output errors can include discretization errors, rounding errors, numerical solver errors, or user errors. Calculation verification is complete when it is demonstrated that errors in the numerical solution are minimized to the point that they are not corrupting the numerical results [64].
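As a minimal sketch of these two activities, the Python below uses a hypothetical first-order decay model, dx/dt = -k x, which is not taken from V&V 40. A simple forward-Euler integrator is compared against the known analytic solution (a benchmark in the sense of code verification), and the step size is repeatedly halved to see how the discretization error shrinks (in the spirit of calculation verification). The function names, rate constant, and step counts are illustrative assumptions.

```python
import math

def euler_decay(x0, k, t_end, n_steps):
    """Integrate dx/dt = -k*x with forward Euler and return x(t_end)."""
    dt = t_end / n_steps
    x = x0
    for _ in range(n_steps):
        x += dt * (-k * x)
    return x

def exact_decay(x0, k, t_end):
    """Analytic benchmark solution x(t) = x0 * exp(-k*t)."""
    return x0 * math.exp(-k * t_end)

def discretization_errors(x0=1.0, k=0.5, t_end=2.0, steps=(20, 40, 80, 160)):
    """Absolute error against the benchmark for successively finer step sizes."""
    ref = exact_decay(x0, k, t_end)
    return [abs(euler_decay(x0, k, t_end, n) - ref) for n in steps]

# Assumed toy settings; a real model would use its own solver and benchmarks.
errors = discretization_errors()

# Observed order of accuracy: log2 of the error ratio each time the step is halved.
# Forward Euler is a first-order method, so these values should be close to 1.
orders = [math.log2(coarse / fine) for coarse, fine in zip(errors, errors[1:])]
```

If the observed order departed clearly from the theoretical order of the method, that would point to an implementation error rather than to ordinary discretization error.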

Validation assesses how well the computational model represents reality. Validation activities might include comparing the model’s results to in vitro or in vivo benchmark experiments to check that its behavior reproduces the biological features of the real phenomenon. Validation also includes uncertainty quantification and sensitivity analysis. Uncertainty quantification refers to the estimation of how stochastic error in the input propagates into the model’s output. Sensitivity analysis is a post-hoc examination of the results of the uncertainty quantification to evaluate which elements most influence output variability [64].
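The following sketch illustrates one common way to carry out these two steps; it is an assumption for illustration, not a procedure prescribed by V&V 40. A toy Michaelis-Menten rate law stands in for the model, the two parameters are sampled from assumed normal distributions (Monte Carlo uncertainty propagation), and the squared correlation between each input and the output is then used as a simple post-hoc sensitivity measure.

```python
import math
import random
import statistics

def rate(vmax, km, substrate=1.0):
    """Michaelis-Menten rate law, used here only as a toy model output."""
    return vmax * substrate / (km + substrate)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def propagate(n=5000, seed=0):
    """Monte Carlo propagation of assumed input uncertainty to the output."""
    rng = random.Random(seed)
    # Assumed (hypothetical) input uncertainty: Vmax ~ N(10, 1), Km ~ N(2, 0.1).
    vmax = [rng.gauss(10.0, 1.0) for _ in range(n)]
    km = [rng.gauss(2.0, 0.1) for _ in range(n)]
    out = [rate(v, k) for v, k in zip(vmax, km)]
    return {
        "mean": statistics.fmean(out),
        "stdev": statistics.stdev(out),
        # Post-hoc sensitivity: squared correlation of each input with the output.
        "r_squared": {"vmax": pearson(vmax, out) ** 2, "km": pearson(km, out) ** 2},
    }

result = propagate()
```

With these assumed distributions, most of the output variability is associated with Vmax, which is the kind of conclusion a sensitivity analysis is meant to surface.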

Unlike the NASA CAS, V&V 40 does not describe the quality of evidence needed to prove a model credible, and it lacks the objective scoring system needed to implement “cut-offs” between credible and non-credible models or to compare the credibility of multiple models.

FDA guidance on computational model credibility in medical devices

Based on the V&V 40 standard, the FDA released guidance on assessing credibility for models of medical devices [4]. This guidance expands V&V 40 to include other forms of credibility evidence beyond traditional verification and validation exercises. Applicable to physics-based, mechanistic, or other first-principles-based models of medical devices, these guidelines consist of ten categories broadly divided into code verification, calculation verification, and validation. The code verification category is taken directly from V&V 40, described previously.

The calculation verification guideline extends V&V 40 by detailing several methods to verify that the model produces the intended output. For example, the model results can be compared with the same data used to calibrate the model parameters. Broader evidence in support of the model, perhaps without a specific context of use, is also acceptable. A model can also be verified using in vitro or in vivo experiments, either within the context of use or under conditions supporting a different context of use. These techniques can also be used as validation evidence.

Validation assesses the model’s ability to reproduce real-world behavior. In addition to the methods described for calculation verification, validation can also include population-based evidence, that is, statistical comparisons of model predictions to population-level data such as the results of a clinical trial. Credibility is also supported by emergent model behavior (the ability of a model to reproduce real-world phenomena that were not pre-specified or explicitly modeled) and by general model plausibility (model assumptions, input parameters, and other characteristics being deemed reasonable based on scientific knowledge of the system modeled).

Unlike the NASA Credibility Assessment Scale, these FDA guidelines are sets of nonbinding recommendations. Additionally, no scoring or suggested quality measures of FDA credibility factors are included, making quantitative analysis of credibility impossible.

EMA guidelines for PBPK models

Of particular relevance to the field of systems biology is the European Medicines Agency’s (EMA) Guideline on the Reporting of Physiologically Based Pharmacokinetic (PBPK) Modeling and Simulation, issued in 2018 [9, 65]. PBPK models are mathematical models that simulate the concentration of a drug over time in tissues and blood. With the rise in regulatory submissions that include PBPK models that rely on specialized software programs, this guideline provides detailed advice on what to include in a PBPK modeling report.

The standard dictates the necessary information to describe and justify model parameters. Like the FDA standard, modelers are required to submit any assumptions made when assigning parameters and to document the sources of any literature-based parameters. Additionally, modelers must perform a sensitivity analysis for parameters that are key to the model (those that significantly influence the outcome) and list any parameters that are uncertain.
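To make the idea of identifying key parameters concrete, the sketch below computes normalized local sensitivity coefficients for a one-compartment intravenous bolus model, C(t) = (dose/V) exp(-(CL/V) t). This model is far simpler than a PBPK model, and the parameter values, the time point, and the central-difference approach are all assumptions chosen for illustration; the guideline does not prescribe this particular method.

```python
import math

# Hypothetical dose, clearance, and volume of distribution (arbitrary units).
BASE = {"dose": 100.0, "cl": 5.0, "v": 50.0}

def concentration(t, dose, cl, v):
    """One-compartment IV bolus model: C(t) = dose/V * exp(-CL/V * t)."""
    return dose / v * math.exp(-cl / v * t)

def normalized_sensitivity(name, t=4.0, rel_step=1e-4):
    """Central-difference estimate of (dC/dp) * (p / C) at time t.

    A value of 1 means a 1% change in the parameter changes C by about 1%.
    """
    up = dict(BASE, **{name: BASE[name] * (1 + rel_step)})
    down = dict(BASE, **{name: BASE[name] * (1 - rel_step)})
    centre = concentration(t, **BASE)
    return (concentration(t, **up) - concentration(t, **down)) / (2 * rel_step * centre)

sensitivities = {name: normalized_sensitivity(name) for name in BASE}
```

For this toy model the coefficients can be checked analytically: 1 for the dose, -(CL/V)t for clearance, and (CL/V)t - 1 for volume. Parameters with coefficients of large magnitude would be candidates for the "key" parameters that need closer justification.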

The submission must include the simulation results as well as the files used to generate the final simulations in both tabular and executable format. This requirement mirrors reproducibility standards already in place for systems biology, namely the COMBINE Archive standard, and is also described in best practices for reproducible systems biology modeling [52].

The predictive performance of the model, that is, its ability to recapitulate observed pharmacokinetics, must also be evaluated. This requirement is also mentioned in the FDA guidelines.

Lastly, the guideline requires a discussion of uncertainty and confidence in the model. Although described more qualitatively in the EMA standard, this requirement is shared by NASA’s CAS, V&V 40, FDA credibility guidelines, and best practices for reproducible modeling in systems biology [52].

Current tools for systems biology model testing

Although there is no credibility standard in systems biology modeling, some tools provide automated model testing. These tools were not developed explicitly to assess credibility, but many of the factors they test for could be considered aspects of credibility. Future model credibility assessments could aspire to the level of quantification and automation these tools offer.


MEMOTE (MEtabolic MOdel TEsts) is an open-source Python software package that automatically tests and scores genome-scale metabolic models [66]. MEMOTE offers a web interface and a command line interface through which SBML files can be uploaded, analyzed, and ultimately scored. The tests check that a model is annotated according to the MIRIAM standard, that components are described using SBO terms, and that the model is properly constructed using the relevant SBML package, SBML-FBC [66, 67]. Basic tests check for the presence of relevant components, charge information, and metabolite formulas. Biomass tests check that biomass precursors are produced and that the growth rate is non-zero. Stoichiometry tests check for inconsistency, erroneously produced energy metabolites, and permanently blocked reactions. After testing, MEMOTE outputs a numeric score indicating the extent to which a model conforms to these standards.
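To give a flavor of what an automated test on formulas and charges can look like, the sketch below checks whether a single reaction is balanced in atoms and charge. It is not MEMOTE's implementation and does not use its API; the data layout, function names, and the example hexokinase-like reaction are assumptions made for illustration.

```python
import re
from collections import Counter

def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no brackets or hydrates)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def imbalance(reaction, formulas, charges):
    """Return (net atom counts, net charge); ({}, 0) means the reaction is balanced.

    `reaction` maps metabolite id -> stoichiometric coefficient
    (negative for substrates, positive for products).
    """
    atoms = Counter()
    charge = 0
    for met, coeff in reaction.items():
        for element, n in parse_formula(formulas[met]).items():
            atoms[element] += coeff * n
        charge += coeff * charges[met]
    return {el: n for el, n in atoms.items() if n != 0}, charge

# Illustrative metabolite data (charged forms), assumed for this example.
formulas = {"glc": "C6H12O6", "atp": "C10H12N5O13P3", "g6p": "C6H11O9P",
            "adp": "C10H12N5O10P2", "h": "H"}
charges = {"glc": 0, "atp": -4, "g6p": -2, "adp": -3, "h": 1}

balanced = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
missing_proton = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1}

balanced_result = imbalance(balanced, formulas, charges)
unbalanced_result = imbalance(missing_proton, formulas, charges)
```

A score could then be derived from, for example, the fraction of reactions that pass such checks, which is the general idea behind summarizing many individual tests as a single number.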

MEMOTE is designed to assess genome-scale metabolic models and largely includes tests that are specific to this model subset. Although a high MEMOTE score is likely to be indicative of model quality and reproducibility, it is not an assessment of credibility. A credible model will likely have a good MEMOTE score, but a good MEMOTE score does not necessarily indicate a credible model. However, the quantitative and automated nature of MEMOTE allows for quickly gauging model quality, comparing models, and the iterative improvement of metabolic models.

FROG analysis

Similar in spirit to MEMOTE is FROG analysis, an ensemble of analyses for constraint-based models recently developed by the COMBINE community to generate standardized, numerically reproducible reference datasets [68]. Results from constraint-based models are often communicated as flux values, and there are often multiple solutions for a single model. As such, individual flux solutions cannot be used to gauge reproducibility. The COMBINE community outlined a list of outputs and results of flux balance analysis that are numerically reproducible and can be used for curation, known as FROG reports. FROG reports can be used in the BioModels [6, 69] curation process to assess reproducibility.

FROG analysis consists of Flux variability analysis, Reaction deletion, Objective function values, and Gene deletion fluxes. Flux variability analysis tests that the maximum and minimum fluxes are reproducible across different software tools. The objective function value for a defined set of bounds should be reproducible. The systematic deletion of all reactions or all genes, one at a time, should provide comparable reference results. Currently, four tools support the generation of FROG reports. Web-based tools include fbc_curation [68], CBMPy MOdel Curator [70] (both of which are also available as command line tools), and FLUXER [71]. fbc_curation_matlab is a command-line tool and exports results in COMBINE archive format [72].

Unlike MEMOTE, FROG analysis produces a report in lieu of a single numerical score.
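Because the outcome is a report rather than a score, assessing reproducibility amounts to comparing the numbers produced by different tools within a tolerance. The sketch below shows that comparison for two hypothetical reports. The dictionary layout, key names, values, and tolerances are assumptions for illustration and do not reproduce the actual FROG report file format.

```python
import math

def compare_reports(ref, other, rel_tol=1e-6, abs_tol=1e-9):
    """Return the sorted keys whose values are missing or differ beyond tolerance."""
    mismatches = []
    for key in sorted(set(ref) | set(other)):
        a, b = ref.get(key), other.get(key)
        if a is None or b is None:
            mismatches.append(key)  # entry present in only one report
        elif not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append(key)  # numerically different results
    return mismatches

# Made-up results from two hypothetical tools run on the same toy model.
report_a = {"objective": 1.0, "fva_min:R1": 0.0, "fva_max:R1": 2.5,
            "reaction_deletion:R2": 0.4}
report_b = {"objective": 1.0 + 1e-9, "fva_min:R1": 1e-12, "fva_max:R1": 2.5,
            "reaction_deletion:R2": 0.0}

mismatches = compare_reports(report_a, report_b)
```

The absolute tolerance matters here because many fluxes are exactly zero in one tool and a tiny non-zero number in another; a purely relative comparison would flag those as differences.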


When standards are established, tests can be developed to assess the extent to which a model conforms to them. With the development of standardized quantitative metrics (as opposed to qualitative guidelines such as those discussed previously), models can be constructed to meet minimum quality requirements, lending credibility to those models and allowing for easy comparison across models [73].

The difficulty in developing these quantitative metrics is that the characteristics of an ideal bio-model must be known and expressed concisely. Existing standards in systems biology seek to address the first point by outlining what information is necessary to completely define and reproduce a model, as well as the format in which that information is to be presented. However, a model could meet all existing standards and still not be credible. For example, a model could be fully defined in SBML with extensive annotations, be reproducible, and be properly formatted for dissemination with SED-ML files describing all simulations. Despite meeting these standards, this hypothetical model could produce negative concentrations during simulation, clearly indicating that the model is not credible. Additional metrics and standards must be established to adequately assess credibility. These metrics might include the relative concentration of floating species or the shape of response curves.

Hellerstein et al. note that several issues in biomedical modeling are analogous to problems faced in software development and propose that software development best practices might be translated to improve modeling in systems biology [74]. Of particular interest is software testing, which can be considered a form of credibility assessment. These tests aim to ensure the correctness, reliability, and availability of software, all characteristics that are also essential in systems biology model credibility.

Software tests can be divided into two categories, which may also be applicable in systems biology modeling: (i) black-box testing and (ii) white-box testing. Black-box software testing assesses the behavior of the code and does not deal with implementation. For systems biology model credibility assessment, black-box credibility indicators might be that the model accurately predicts observed data. White-box testing evaluates the internal workings of a software project or model. The absence of errors, such as undefined parameters, typos, or unused species, might serve as white-box credibility indicators.
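The two kinds of indicator can be sketched as small automated checks. The example below is illustrative only: the dictionary-based model layout, the function names, and the 20% tolerance are assumptions, and a practical implementation would read SBML through an established library rather than parse rate laws with a regular expression.

```python
import re

def white_box_issues(model):
    """Static checks on model structure; no simulation is required."""
    referenced = set()
    for rxn in model["reactions"].values():
        referenced.update(rxn["reactants"])
        referenced.update(rxn["products"])
        # Identifiers in the rate law (function names would need whitelisting).
        referenced.update(re.findall(r"\b[A-Za-z_]\w*", rxn["rate"]))
    defined = set(model["species"]) | set(model["parameters"])
    return {
        "undefined_symbols": sorted(referenced - defined),
        "unused_species": sorted(set(model["species"]) - referenced),
    }

def black_box_issues(trajectory, observed, rel_tol=0.2):
    """Behavioral checks that look only at simulation output.

    Both arguments map species id -> list of values at the same time points.
    """
    negative = sorted(s for s, values in trajectory.items() if min(values) < 0)
    poor_fit = sorted(
        s for s, obs in observed.items()
        if any(abs(p - o) > rel_tol * abs(o) for p, o in zip(trajectory[s], obs))
    )
    return {"negative_concentrations": negative, "poor_fit": poor_fit}

# Hypothetical toy model: species X is never used and parameter k2 is undefined.
toy_model = {
    "species": {"S": 10.0, "P": 0.0, "X": 1.0},
    "parameters": {"k1": 0.3},
    "reactions": {
        "J1": {"reactants": ["S"], "products": ["P"], "rate": "k1 * S"},
        "J2": {"reactants": ["P"], "products": [], "rate": "k2 * P"},
    },
}
structure_report = white_box_issues(toy_model)

# Made-up simulation output and observations for the behavioral checks.
simulated = {"S": [10.0, 5.0, -0.1], "P": [0.0, 5.0, 10.1]}
measured = {"P": [0.0, 4.8, 9.0]}
behavior_report = black_box_issues(simulated, measured)
```

Here the white-box check flags the undefined parameter and the unused species, while the black-box check flags the negative concentration of S even though the prediction for P agrees with the made-up observations.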


Although many reproducibility standards are in use to simplify assessing reproducibility, there are no standards and scoring systems for model credibility in systems biology. Unlike institutions such as NASA and the FDA, which deal with models spanning a broad scope of applications and scales, systems biology is focused on the modeling of cellular processes. This narrow scope, combined with the variety of standards already in use, makes systems biology models well-suited for a credibility standard.

A quantitative credibility scoring system would be particularly useful: it would enable comparison of the credibility of different models and guide the development of more credible models. Credibility metrics could be published alongside models to indicate the trustworthiness of results and allow users to make informed decisions about reusing models.

Systems such as MEMOTE demonstrate that model standards and model quality indicators can be scored automatically and quantitatively, enabling iterative improvement during the development phase. More challenging is further developing standards to express the characteristics, both quantitative and qualitative, that make a model credible. Current modeling standards in other scientific fields emphasize assessing credibility in the model’s context of use. This poses a challenge for automating credibility assessment in systems biology modeling, as more or less rigor may be required to achieve a sufficiently credible model depending on the intended use of the model. It may prove useful to develop a manual, semi-quantitative scoring system, such as NASA’s Credibility Assessment Scale, before attempting to implement a fully quantitative and perhaps automated credibility scoring system for systems biology models.

Availability of data and materials

Not applicable.


  1. Chengyuan Wang, Si Li, Ademiloye Adesola S, Perumal Nithiarasu. Biomechanics of cells and subcellular components: a comprehensive review of computational models and applications. Int J Numer Methods Biomed Eng. 2021.

  2. Hassell James M, Newbold Tim, Dobson Andrew P, Linton Yvonne-Marie, Franklinos Lydia H V, Zimmerman Dawn, Lohan Katrina M Pagenkopp. Towards an ecosystem model of infectious disease. Nat Ecol Evol. 2021;5(7):907–18.

  3. Bartocci Ezio, Lió Pietro. Computational modeling, formal analysis, and tools for systems biology. PLOS Comput Biol. 2016;12(1): e1004591.

  4. Assessing the credibility of computational modeling and simulation in medical device submissions: Draft guidance for industry and food and drug administration staff. U.S. Food and Drug Administration, 2021.

  5. Janis Shin, Veronica Porubsky, James Carothers, Sauro Herbert M. Standards, dissemination, and best practices in systems biology. Current Opin Biotechnol. 2023;81: 102922.

  6. ...Malik-Sheriff Rahuman S, Mihai Glont, Nguyen Tung VN, Krishna Tiwari, Roberts Matthew G, Xavier Ashley Vu, Manh T, Jinghao Men, Matthieu Maire, Sarubini Kananathan, Fairbanks Emma L, Meyer Johannes P, Chinmay Arankalle, Varusai Thawfeek M, Vincent Knight-Schrijver, Li Lu, DuñasRocaCorina Dass Gaurhari, Keating Sarah M, Park Young M, Nicola Buso, Nicolas Rodriguez, Michael Hucka, Henning Hermjakob. BioModels - 15 years of sharing computational models in life science. Nucl Acids Res. 2020;48(D1):D407–15.

  7. Krishna Tiwari, Sarubini Kananathan, Roberts Matthew G, Meyer Johannes P, Sharif Shohan Mohammad Umer, Ashley Xavier, Matthieu Maire, Ahmad Zyoud, Jinghao Men, Szeyi Ng, Nguyen Tung VN, Mihai Glont, Henning Hermjakob, Malik-Sheriff Rahuman S. Reproducibility in systems biology modelling. Mol Syst Biol. 2021;17(2): e9982.

  8. Babula Maria, Bertch William, Green Lawrence, Hale Joseph, Mosier Gary, Steele Martin, Woods Jody. NASA Standard for Models and Simulations: Credibility Assessment Scale. In 47th AIAA Aerospace Sciences Meeting including The New Horizons Forum and Aerospace Exposition, Orlando, Florida. American Institute of Aeronautics and Astronautics. 2007.

  9. Shepard T, Scott G, Cole S, Nordmark A, Bouzom F. Physiologically based models in regulatory submissions: Output from the ABPI/MHRA forum on physiologically based modeling and simulation. CPT Pharmacometrics Syst Pharmacol. 2015;4(4):221–5.

  10. Klipp Edda, Liebermeister Wolfram, Helbig Anselm, Kowald Axel, Schaber Jörg. Standards in computational systems biology; 2007.

  11. Novère N, Hucka M, Mi H. The systems biology graphical notation. Nature Biotechnology29. 2000.

  12. Dada, Spasić, Paton, Mendes. SBRML: a markup language for associating systems biology data with models. Bioinformatics. 2010;26.

  13. Machado Daniel, Costa Rafael S, Rocha Miguel, Ferreira Eugénio C, Tidor Bruce. Modeling formalisms in systems biology. AMB Express. 2011;1:45.

  14. Cuellar Autumn A, Lloyd Catherine M, Nielsen Poul F, Bullivant David P, Nickerson David P, Hunter Peter J. An overview of CellML 1.1, a biological model description language. Simulation. 2003;79(12):740–7.

  15. ...Demir Emek, Cary Michael P, Paley Suzanne, Fukuda Ken, Lemer Christian, Vastrik Imre, Guanming Wu, D’Eustachio Peter, Schaefer Carl, Luciano Joanne, Schacherer Frank, Martinez-Flores Irma, Zhenjun Hu, Jimenez-Jacinto Veronica, Joshi-Tope Geeta, Kandasamy Kumaran, Lopez-Fuentes Alejandra C, Mi Huaiyu, Pichler Elgar, Rodchenkov Igor, Splendiani Andrea, Tkachev Sasha, Zucker Jeremy, Gopinath Gopal, Rajasimha Harsha, Ramakrishnan Ranjani, Shah Imran, Syed Mustafa, Anwar Nadia, Babur Ozgun, Blinov Michael, Brauner Erik, Corwin Dan, Donaldson Sylva, Gibbons Frank, Goldberg Robert, Hornbeck Peter, Luna Augustin, Murray-Rust Peter, Neumann Eric, Reubenacker Oliver, Samwald Matthias, van Iersel Martijn, Wimalaratne Sarala, Allen Keith, Braun Burk, Whirl-Carrillo Michelle, Dahlquist Kam, Finney Andrew, Gillespie Marc, Glass Elizabeth, Gong Li, Haw Robin, Honig Michael, Hubaut Olivier, Kane David, Krupa Shiva, Kutmon Martina, Leonard Julie, Marks Debbie, Merberg David, Petri Victoria, Pico Alex, Ravenscroft Dean, Ren Liya, Shah Nigam, Sunshine Margot, Tang Rebecca, Whaley Ryan, Letovksy Stan, Buetow Kenneth H, Rzhetsky Andrey, Schachter Vincent, Sobral Bruno S, Dogrusoz Ugur, McWeeney Shannon, Aladjem Mirit, Birney Ewan, Collado-Vides Julio, Goto Susumu, Hucka Michael, Le Novère Nicolas, Maltsev Natalia, Pandey Akhilesh, Thomas Paul, Wingender Edgar, Karp Peter D, Sander Chris, Bader Gary D. BioPAX - A community standard for pathway data sharing. Nat Biotechnol. 2010;28(9):935–42.

  16. Gleeson Padraig, Crook Sharon, Cannon Robert C, Hines Michael L, Billings Guy O, Farinella Matteo, Morse Thomas M, Davison Andrew P, Ray Subhasis, Bhalla Upinder S, Barnes Simon R, Dimitrova Yoana D, Silver R Angus. NeuroML: a language for describing data driven models of neurons and networks with a high degree of biological detail. PLoS Computat Biol. 2010;6(6):e1000815.

  17. Hucka M, Bolouri H, Finney A, Sauro HM, Doyle JC, Kitano H, Arkin AP, Bornstein BJ, et al. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003;19:524.

  18. Finney A, Hucka M. Systems biology markup language: level 2 and beyond. Biochem Soc Trans. 2003;31(6):1472–3.

  19. Michael Kohl. Standards databases and modeling tools in systems biology. In: Hamacher Michael, Eisenacher Martin, Stephan Christian, editors. Data Mining in Proteomics. Totowa: Humana Press; 2011.

  20. ...Keating Sarah M, Dagmar Waltemath, Matthias König, Fengkai Zhang, Andreas Dräger, Claudine Chaouiya, Bergmann Frank T, Andrew Finney, Gillespie Colin S, Tomáš Helikar, Stefan Hoops, Malik-Sheriff Rahuman S, Moodie Stuart L, Moraru Ion I, Myers Chris J, Aurélien Naldi, Olivier Brett G, Sven Sahle, Schaff James C, Smith Lucian P, Swat Maciej J, Denis Thieffry, Leandro Watanabe, Wilkinson Darren J, Blinov Michael L, Kimberly Begley, Faeder James R, Gómez Harold F, Hamm Thomas M, Yuichiro Inagaki, Wolfram Liebermeister, Lister Allyson L, Daniel Lucio, Eric Mjolsness, Proctor Carole J, Karthik Raman, Nicolas Rodriguez, Shaffer Clifford A, Shapiro Bruce E, Joerg Stelling, Neil Swainston, Naoki Tanimura, John Wagner, Martin Meier-Schellersheim, Sauro Herbert M, Bernhard Palsson, Hamid Bolouri, Hiroaki Kitano, Akira Funahashi, Henning Hermjakob, Doyle John C, Michael Hucka, Adams Richard R, Allen Nicholas A, Angermann Bastian R, Marco Antoniotti, Bader Gary D, Jan Červený, Mélanie Courtot, Cox Chris D, Dalle Pezze Piero, Emek Demir, Denney William S, Harish Dharuri, Julien Dorier, Dirk Drasdo, Ali Ebrahim, Johannes Eichner, Johan Elf, Lukas Endler, Evelo Chris T, Christoph Flamm, Fleming Ronan MT, Martina Fröhlich, Mihai Glont, Emanuel Gonçalves, Martin Golebiewski, Hovakim Grabski, Alex Gutteridge, Damon Hachmeister, Harris Leonard A, Heavner Benjamin D, Ron Henkel, Hlavacek William S, Bin Hu, Hyduke Daniel R, Hidde Jong, Nick Juty, Karp Peter D, Karr Jonathan R, Kell Douglas B, Roland Keller, Ilya Kiselev, Steffen Klamt, Edda Klipp, Christian Knüpfer, Fedor Kolpakov, Falko Krause, Martina Kutmon, Camille Laibe, Conor Lawless, Li Lu, Loew Leslie M, Rainer Machne, Yukiko Matsuoka, Pedro Mendes, Huaiyu Mi, Florian Mittag, Monteiro Pedro T, Nath Natarajan Kedar, Nielsen Poul MF, Tramy Nguyen, Alida Palmisano, Jean-Baptiste Pettit, Thomas Pfau, Phair Robert D, Tomas Radivoyevitch, Rohwer Johann M, Ruebenacker Oliver A, Julio Saez-Rodriguez, Martin Scharm, Henning Schmidt, Falk Schreiber, Michael Schubert, Roman Schulte, Sealfon Stuart C, Kieran Smallbone, Sylvain Soliman, Stefan Melanie I, Sullivan Devin P, Koichi Takahashi, Bas Teusink, David Tolnay, Ibrahim Vazirabad, Axel Kamp, Ulrike Wittig, Clemens Wrzodek, Finja Wrzodek, Ioannis Xenarios, Anna Zhukova, Zucker Jeremy. SBML level 3: an extensible format for the exchange and reuse of biological models. Mol Syst Biol. 2020;16(8):e9110.

  21. Smith Lucian P, Moodie Stuart L, Bergmann Frank T, Colin Gillespie, Keating Sarah M, Matthias König, Myers Chris J, Swat Maciek J, Wilkinson Darren J, Michael Hucka. Systems biology markup language (SBML) level 3 package: distributions, version 1, release 1. J Integr Bioinform. 2020;17:2–3.

  22. Olivier Brett G, Bergmann Frank T. SBML level 3 package: Flux balance constraints version 2. J Integr Bioinform. 2018.

  23. Gauges Ralph, Rost Ursula, Sahle Sven, Wengler Katja, Bergmann Frank T. The systems biology markup language (SBML) level 3 package: Layout, version 1 core. J Integr Bioinform. 2015;12(2):550–602.

  24. Beard Daniel A, Britten Randall, Cooling Mike T, Garny Alan, Halstead Matt DB, Hunter Peter J, Lawson James, Lloyd Catherine M, Marsh Justin, Miller Andrew, Nickerson David P, Nielsen Poul MF, Nomura Taishin, Subramanium Shankar, Wimalaratne Sarala M, Yu Tommy. CellML metadata standards, associated tools and repositories. Philos Trans Royal Soc Math Phys Eng Sci. 2009;367(1895):1845–67.

  25. Wimalaratne SM, Halstead MDB, Lloyd CM, Crampin EJ, Nielsen PF. Biophysical annotation and representation of CellML models. Bioinformatics. 2009;25(17):2263–70.

  26. Mesiti Marco, Ruiz Ernesto Jiménez, Sanz Ismael, Llavori Rafael Berlanga, Valentini Giorgio, Perlasca Paolo, Manset David. Data integration issues and opportunities in biological XML data management. In Open and Novel Issues in XML Database Applications, pp. 263–286. IGI Global. 2009.

  27. Caprotti O, Carlisle D. OpenMath and MathML: semantic markup for mathematics. XRDS Crossroads ACM Magazine Stud. 1999;6(2):11–4.

  28. Smith LP, Butterworth E, Bassingthwaighte JB, Sauro HM. SBML and cellML translation in antimony and JSim. Bioinformatics. 2013;30(7):903–7.

  29. Hucka Michael, Bergmann Frank T, Dräger Andreas, Hoops Stefan, Keating Sarah M, Le Novère Nicolas, Myers Chris J, Olivier Brett G, Sahle Sven, Schaff James C, Smith Lucian P, Waltemath Dagmar, Wilkinson Darren J. Systems biology markup language (SBML) level 2 version 5: structures and facilities for model definitions. J Integr Bioinform. 2015;12(2):731–901.

  30. Büchel Finja, Wrzodek Clemens, Mittag Florian, Dräger Andreas, Eichner Johannes, Rodriguez Nicolas, Le Novère Nicolas, Zell Andreas. Qualitative translation of relations from BioPAX to SBML qual. Bioinformatics. 2012;28(20):2648–53.

  31. Courtot Juty N, Knupfer C, Waltemath D, Zhukova A, Drager A, Dumontier M, Finney A, Golebiewski M. Controlled vocabularies and semantics in systems biology. FEBS Lett. 2011;587:2832.

  32. Gennari John H, König Matthias, Misirli Goksel, Neal Maxwell L, Nickerson David P, Waltemath Dagmar. OMEX metadata specification (version 1.2). J Integr Bioinform. 2021;18(3).

  33. Noy Natasha. Ontology development 101: A guide to creating your first ontology. 2001.

  34. Hastings Janna, Owen Gareth, Dekker Adriano, Ennis Marcus, Kale Namrata, Muthukrishnan Venkatesh, Turner Steve, Swainston Neil, Mendes Pedro, Steinbeck Christoph. ChEBI in 2016: improved services and an expanding collection of metabolites. Nucl Acids Res. 2015;44(D1):D1214–9.

  35. Shin Woosub, Hellerstein Joseph L, Munarko Yuda, Neal Maxwell L, Nickerson David P, Rampadarath Anand K, Sauro Herbert M, Gennari John H. SBMate: A framework for evaluating quality of annotations in systems biology models. bioRxiv. 2021.

  36. Le Novère Nicolas, Andrew Finney, Michael Hucka, Bhalla Upinder S, Fabien Campagne, Julio Collado-Vides, Crampin Edmund J, Matt Halstead, Edda Klipp, Pedro Mendes, Poul Nielsen, Herbert Sauro, Bruce Shapiro, Snoep Jacky L, Spence Hugh D, Wanner Barry L. Minimum information requested in the annotation of biochemical models (MIRIAM). Nat Biotechnol. 2005;23(12):1509–15.

  37. Berners-Lee Tim, Fielding Roy, Masinter Larry. Uniform resource identifier (uri): Generic syntax. Technical report 2005.

  38. Juty N. Systems biology ontology: Update. Nature Precedings, 2010.

  39. Decker S, Melnik S, van Harmelen F, Fensel D, Klein M, Broekstra J, Erdmann M, Horrocks I. The semantic web: the roles of XML and RDF. IEEE Internet Comput. 2000;4(5):63–73.

  40. Apweiler R. UniProt: the universal protein knowledgebase. Nucl Acids Res. 2004;32(90001):D115–9.

  41. Swainston N, Mendes P. libAnnotationSBML: a library for exploiting SBML annotations. Bioinformatics. 2009;25(17):2292–3.

  42. Le Novere N. BioModels database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems. Nucl Acids Res. 2006;34(90001):D689–91.

  43. Hucka Michael, Bergmann Frank T, Chaouiya Claudine, Dräger Andreas, Hoops Stefan, Keating Sarah M, König Matthias, Novère Nicolas Le, Myers Chris J, Olivier Brett G, Sahle Sven, Schaff James C, Sheriff Rahuman, Smith Lucian P, Waltemath Dagmar, Wilkinson Darren J, Zhang Fengkai. The systems biology markup language (SBML): Language specification for level 3 version 2 core release 2. J Integr Bioinform. 2019;16(2).

  44. Bornstein BJ, Keating SM, Jouraku A, Hucka M. LibSBML: an API library for SBML. Bioinformatics. 2008.

  45. ...Rodriguez Nicolas, Thomas Alex, Watanabe Leandro, Vazirabad Ibrahim Y, Kofia Victor, Gómez Harold F, Mittag Florian, Matthes Jakob, Rudolph Jan, Wrzodek Finja, Netz Eugen, Diamantikos Alexander, Eichner Johannes, Keller Roland, Wrzodek Clemens, Fröhlich Sebastian, Lewis Nathan E, Myers Chris J, Novère Nicolas Le, Palsson Bernhard Ø, Hucka Michael, Dräger Andreas. JSBML 1.0: providing a smorgasbord of options to encode systems biology models. Bioinformatics. 2015;31(20):3383–6.

  46. ...Waltemath Dagmar, Adams Richard, Beard Daniel A, Bergmann Frank T, Bhalla Upinder S, Britten Randall, Chelliah Vijayalakshmi, Cooling Michael T, Cooper Jonathan, Crampin Edmund J, Garny Alan, Hoops Stefan, Hucka Michael, Hunter Peter, Klipp Edda, Laibe Camille, Miller Andrew K, Moraru Ion, Nickerson David, Nielsen Poul, Nikolski Macha, Sahle Sven, Sauro Herbert M, Schmidt Henning, Snoep Jacky L, Tolle Dominic, Wolkenhauer Olaf, Le Novère Nicolas. Minimum information about a simulation experiment (MIASE). PLoS Comput Biol. 2011;7(4): e1001122.

  47. Zhukova Anna, Waltemath Dagmar, Juty Nick, Laibe Camille, Novère Nicolas Le. Kinetic simulation algorithm ontology. Nature Precedings. 2011.

  48. Bergmann Frank T, Cooper Jonathan, König Matthias, Moraru Ion, Nickerson David, Le Novère Nicolas, Olivier Brett G, Sahle Sven, Smith Lucian, Waltemath Dagmar. Simulation experiment description markup language (SED-ML) level 1 version 3 (L1V3). J Integr Bioinform. 2018;15(1):20170086.

  49. Smith Lucian P, Bergmann Frank T, Alan Garny, Tomáš Helikar, Jonathan Karr, David Nickerson, Herbert Sauro, Dagmar Waltemath, Matthias König. The simulation experiment description markup language (SED-ML): language specification for level 1 version 4. J Integr Bioinform. 2021.

  50. Choi Kiri, Smith Lucian P, Medley J Kyle, Sauro Herbert M. phraSED-ML: a paraphrased, human-readable adaptation of SED-ML. J Bioinform Comput Biol. 2016;14(6):1650035.

  51. ...Schmiester Leonard, Schälte Yannik, Bergmann Frank T, Camba Tacio, Dudkin Erika, Egert Janine, Fröhlich Fabian, Fuhrmann Lara, Hauber Adrian L, Kemmer Svenja, Lakrisenko Polina, Loos Carolin, Merkt Simon, Müller Wolfgang, Pathirana Dilan, Raimúndez Elba, Refisch Lukas, Rosenblatt Marcus, Stapor Paul L, Städter Philipp, Wang Dantong, Wieland Franz-Georg, Banga Julio R, Timmer Jens, Villaverde Alejandro F, Sahle Sven, Kreutz Clemens, Hasenauer Jan, Weindl Daniel. PEtab—interoperable specification of parameter estimation problems in systems biology. PLOS Comput Biol. 2021;17(1): e1008646.

  52. Porubsky Veronica L, Goldberg Arthur P, Rampadarath Anand K, Nickerson David P, Karr Jonathan R, Sauro Herbert M. Best practices for making reproducible biochemical models. Cell Syst. 2020;11(2):109–20.

  53. Wilkinson Mark D, Dumontier Michel, Aalbersberg IJsbrand Jan, Appleton Gabrielle, Axton Myles, Baak Arie, Blomberg Niklas, Boiten Jan-Willem, da Silva Santos Luiz Bonino, Bourne Philip E, Bouwman Jildau, Brookes Anthony J, Clark Tim, Crosas Mercè, Dillo Ingrid, Dumon Olivier, Edmunds Scott, Evelo Chris T, Finkers Richard, Gonzalez-Beltran Alejandra, Gray Alasdair JG, Groth Paul, Goble Carole, Grethe Jeffrey S, Heringa Jaap, ’t Hoen Peter AC, Hooft Rob, Kuhn Tobias, Kok Ruben, Kok Joost, Lusher Scott J, Martone Maryann E, Mons Albert, Packer Abel L, Persson Bengt, Rocca-Serra Philippe, Roos Marco, van Schaik Rene, Sansone Susanna-Assunta, Schultes Erik, Sengstag Thierry, Slater Ted, Strawn George, Swertz Morris A, Thompson Mark, van der Lei Johan, van Mulligen Erik, Velterop Jan, Waagmeester Andra, Wittenburg Peter, Wolstencroft Katherine, Zhao Jun, Mons Barend. The FAIR guiding principles for scientific data management and stewardship. Sci Data. 2016.

  54. Schreiber Falk, Sommer Björn, Czauderna Tobias, Golebiewski Martin, Gorochowski Thomas E, Hucka Michael, Keating Sarah M, König Matthias, Myers Chris, Nickerson David, Waltemath Dagmar. Specifications of standards in systems and synthetic biology: status and developments in 2020. J Integr Bioinform. 2020;17(2–3):20200022.

  55. Bergmann Frank T, Adams Richard, Moodie Stuart, Cooper Jonathan, Glont Mihai, Golebiewski Martin, Hucka Michael, Laibe Camille, Miller Andrew K, Nickerson David P, Olivier Brett G, Rodriguez Nicolas, Sauro Herbert M, Scharm Martin, Soiland-Reyes Stian, Waltemath Dagmar, Yvon Florent, Le Novère Nicolas. COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project. BMC Bioinform. 2014;15(1):369.

  56. Erdemir Ahmet, Mulugeta Lealem, Ku Joy P, Drach Andrew, Horner Marc, Morrison Tina M, Peng Grace C Y, Vadigepalli Rajanikanth, Lytton William W, Myers Jerry G. Credible practice of modeling and simulation in healthcare: ten rules from a multidisciplinary perspective. J Transl Med. 2020;18(1):369.

  57. Blattnig Steve R, Luckring James M, Morrison Joseph H, Sylvester Andre J, Tripathi Ram K, Zang Thomas A. NASA standard for models and simulations: Philosophy and requirements overview. J Aircr. 2013;50(1):20–8.

  58. Howell Elizabeth, Dobrijevic Daisy. Columbia disaster: what happened and what NASA learned, 2021.

  59. Niewoehner Robert, Steidle Craig, Johnson Eric. The loss of the Space Shuttle Columbia: Portaging the leadership lessons with a critical thinking model. American Society for Engineering Education, 2008.

  60. Blattnig Steve, Green Lawrence, Luckring James, Morrison Joseph, Tripathi Ram, Zang Thomas. Towards a Credibility Assessment of Models and Simulations. In 49th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, Schaumburg, IL. American Institute of Aeronautics and Astronautics. 2008.

  61. Chen Xing, Yan Chenggang Clarence, Zhang Xiaotian, Zhang Xu, Dai Feng, Yin Jian, Zhang Yongdong. Drug–target interaction prediction: databases, web servers and computational models. Briefings Bioinform. 2015;17(4):696–712.

  62. Kuh Hyo-Jeong, Jang Seong Hoon, Wientjes M Guillaume, Au Jess LS. Computational model of intracellular pharmacokinetics of paclitaxel. J Pharmacol Experim Ther. 2000;293(3):740–61.

  63. Kung Ethan, Farahmand Masoud, Gupta Akash. A hybrid experimental-computational modeling framework for cardiovascular device testing. J Biomechan Eng. 2019.

  64. Viceconti Marco, Juárez Miguel A, Curreli Cristina, Pennisi Marzio, Russo Giulia, Pappalardo Francesco. Credibility of in silico trial technologies-a theoretical framing. IEEE J Biomed Health Inform. 2020;24(1):4–13.

  65. Viceconti Marco, Pappalardo Francesco, Rodriguez Blanca, Horner Marc, Bischoff Jeff, Musuamba-Tshinanu Flora. In silico trials: Verification, validation and uncertainty quantification of predictive models used in the regulatory evaluation of biomedical products. Methods. 2021.

  66. ...Lieven Christian, Beber Moritz E, Olivier Brett G, Bergmann Frank T, Ataman Meric, Babaei Parizad, Bartell Jennifer A, Blank Lars M, Chauhan Siddharth, Correia Kevin, Diener Christian, Dräger Andreas, Ebert Birgitta E, Edirisinghe Janaka N, Faria José P, Feist Adam M, Fengos Georgios, Fleming Ronan M T, García-Jiménez Beatriz, Hatzimanikatis Vassily, van Helvoirt Wout, Henry Christopher S, Hermjakob Henning, Herrgård Markus J, Kaafarani Ali, Kim Hyun Uk, King Zachary, Klamt Steffen, Klipp Edda, Koehorst Jasper J, König Matthias, Lakshmanan Meiyappan, Lee Dong-Yup, Lee Sang Yup, Lee Sunjae, Lewis Nathan E, Liu Filipe, Ma Hongwu, Machado Daniel, Mahadevan Radhakrishnan, Maia Paulo, Mardinoglu Adil, Medlock Gregory L, Monk Jonathan M, Nielsen Jens, Nielsen Lars Keld, Nogales Juan, Nookaew Intawat, Palsson Bernhard O, Papin Jason A, Patil Kiran R, Poolman Mark, Price Nathan D, Resendis-Antonio Osbaldo, Richelle Anne, Rocha Isabel, Sánchez Benjamín J, Schaap Peter J, Malik-Sheriff Rahuman S, Shoaie Saeed, Sonnenschein Nikolaus, Teusink Bas, Vilaça Paulo, Vik Jon Olav, Wodke Judith A H, Xavier Joana C, Yuan Qianqian, Zakhartsev Maksim, Zhang Cheng. MEMOTE for standardized genome-scale metabolic model testing. Nat Biotechnol. 2020;38(3):272–6.

  67. Olivier Brett G, Bergmann Frank T. SBML level 3 package: flux balance constraints version 2. J Integr Bioinform. 2018.

  68. König Matthias. sbmlsim: SBML simulation made easy, 2021.

  69. Glont Mihai, Nguyen Tung V N, Graesslin Martin, Hälke Robert, Ali Raza, Schramm Jochen, Wimalaratne Sarala M, Kothamachu Varun B, Rodriguez Nicolas, Swat Maciej J, Eils Jurgen, Eils Roland, Laibe Camille, Malik-Sheriff Rahuman S, Chelliah Vijayalakshmi, Le Novère Nicolas, Hermjakob Henning. BioModels: expanding horizons to include more modelling approaches and formats. Nucl Acids Res. 2018;46(D1):D1248–53.

  70. Olivier Brett, Gottstein Willi, Molenaar Douwe, Teusink Bas. CBMPy release 0.8.2, 2021.

  71. Hari Archana, Lobo Daniel. Fluxer: a web application to compute, analyze and visualize genome-scale metabolic flux networks. Nucl Acids Res. 2020;48(W1):W427–35.

  72. Raman Karthik. MATLAB/COBRA helper for FROG analysis of FBC models, 2022.

  73. Kaddi Chanchala, Oden Erica D, Quo Chang F, Wang May D. Exploration of quantitative scoring metrics to compare systems biology modeling approaches. In 2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE, August 2007.

  74. Hellerstein Joseph L, Gu Stanley, Choi Kiri, Sauro Herbert M. Recent advances in biomedical simulations: a manifesto for model engineering. F1000 Res. 2019;8:261.


Not applicable.


This work was supported by National Institute of Biomedical Imaging and Bioengineering (NIBIB) award P41GM109824 and National Science Foundation award 1933453. The content expressed here is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, the National Science Foundation, or the University of Washington.

Author information

Authors and Affiliations



LTT: Original draft preparation, review, and editing. LPS: Writing - review and editing. JLH: Writing - review and editing. HMS: Conceptualization, supervision, funding acquisition.

Corresponding author

Correspondence to Lillian T. Tatka.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors have approved the content of this manuscript for submission, and the manuscript is not under consideration for publication elsewhere.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Tatka, L.T., Smith, L.P., Hellerstein, J.L. et al. Adapting modeling and simulation credibility standards to computational systems biology. J Transl Med 21, 501 (2023).
