A workflow-driven approach to integrate generic software modules in a Trusted Third Party
© Bialke et al. 2015
Received: 9 February 2015
Accepted: 25 May 2015
Published: 4 June 2015
Cohort studies and registries rely on massive amounts of personal medical data. Therefore, data protection and information security as well as ethical aspects gain in importance and need to be considered as early as possible during the establishment of a study. Resulting legal and ethical obligations require a precise implementation of appropriate technical and organisational measures for a Trusted Third Party.
This paper defines and organises a consistent workflow management to realize a Trusted Third Party. In particular, it focuses on the technical implementation of a Trusted Third Party Dispatcher that provides basic functionalities (including identity management, pseudonym administration and informed consent management) and the measures required to meet study-specific conditions of cohort studies and registries. For this purpose, several independent open source software modules developed and provided by the MOSAIC project are used. This technical concept offers the necessary flexibility and extensibility to address legal and ethical requirements of individual scenarios.
The developed concept for a Trusted Third Party Dispatcher allows mapping single process steps as well as individual requirements and characteristics of particular studies to workflows, which in turn can be combined to model complex Trusted Third Party processes. The uniformity of this approach permits unrestricted re-combination of the available functionalities (depending on the applied software modules) for various research projects.
The proposed approach for the technical implementation of an independent Trusted Third Party reduces the effort for scenario-specific implementations as well as for maintenance. The applicability and the efficacy of the concept for a workflow-driven Trusted Third Party were confirmed during the establishment of several nationwide studies (e.g. German Centre for Cardiovascular Research and the National Cohort).
Keywords: Medical data management, Data protection, Informed consent, Pseudonyms, Record linkage
Epidemiological research in the context of cohort studies and registries becomes increasingly cooperative and often requires multi-site acquisition of extensive medical data. As a consequence research becomes more and more networked regarding communication, information exchange and cross-coordination between participating research institutions, laboratories and imaging facilities.
For these reasons, legal aspects of data security and information protection significantly gain in importance. This concerns the written informed consent of potential participants, which is mandatory for acquiring medical data for research purposes from an ethical point of view. On the national level, legal principles like data avoidance and frugality [§3a of the German Federal Data Protection Act (Bundesdatenschutzgesetz, BDSG)] as well as requirements for the separation of identifying data from further personal data (§40 BDSG) need to be accounted for. International legislation includes the “Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data” (Council of Europe), the “EU legal framework on the protection of personal data” (European Commission) and the “Declaration of Helsinki” (World Medical Association). The resulting legal and ethical obligations require effective solutions realizing all necessary measures for data protection and IT security.
In Germany, the Technology, Methods and Infrastructure for Networked Medical Research (TMF) provided a guideline proposing a Trusted Third Party (TTP) to address typical challenges in data protection and ethics. Following the TMF specification, a TTP requires an informational separation of powers by separating person identifying information (PII) and medical information from a technical as well as from an organizational perspective. This includes an electronic identity management and should be supplemented by a secure pseudonymisation mechanism. Following this definition, a TTP is described as a combination of technical as well as organisational measures and shall comply with fundamental principles according to data protection rules for IT solutions. Moreover, the guideline demands the TTP to be legally, staff-wise and spatially autonomous and independent.
It is important that the employees of a TTP (the data trustee) do not depend on the institutions which are providing or processing the research data. In particular, the employees need to be independent in terms of their contracts, incomes, duties, work hours and other operational aspects from all scientists of the project that they support. This can be realized either in a separate legal organisation or on a contract level. According to TMF guidelines and legal reports as well as the Federal Data Protection Act (cf. §28 BDSG), the processing of data on a contract level prevents a sufficient informational separation of powers. From an organisational perspective, the TTP requires a functional transfer (transfer of full responsibilities for data processing) in order to be independent of instructions from the initiators of a research project.
This paper focuses on the technical implementation of a TTP. The goal is to define and organise a consistent workflow management within the TTP. By increasing reusability across individual TTP scenarios, the workflow approach is intended to reduce the effort for implementation and maintenance.
Assembling a modular Trusted Third Party
The spectrum of tasks of a data trustee includes the management of identities, informed consents and the generation of pseudonyms. Additionally, the data trustee supports the matching of personal data from population registries and further external data sources.
An identity management is required to manage participants and assigned participant identities. It includes probabilistic matching algorithms for an efficient and fault-tolerant record linkage. Furthermore, it comprises the provision and management of appropriate pseudonyms for each set of identities. Especially in prospective cohort studies and registries, variations in the identifying data of a participant (a so-called identity), e.g. different spellings of a participant’s name, need to be stored.
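The fault-tolerant matching described above can be illustrated with a minimal sketch. It is not the algorithm used by any of the tools discussed in this paper: the field names, weights and thresholds are assumptions chosen for illustration, and plain string similarity from Python's standard library stands in for a calibrated probabilistic model.

```python
from difflib import SequenceMatcher

# Hypothetical field weights and thresholds; real values would be calibrated per project.
WEIGHTS = {"first_name": 0.3, "last_name": 0.4, "birth_date": 0.3}
MATCH_THRESHOLD = 0.95     # at or above: treated as the same participant
POSSIBLE_THRESHOLD = 0.75  # in between: left for manual review by the data trustee


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(idat_a: dict, idat_b: dict) -> float:
    """Weighted sum of the per-field similarities of two identities."""
    return sum(w * similarity(idat_a[f], idat_b[f]) for f, w in WEIGHTS.items())


def classify(idat_a: dict, idat_b: dict) -> str:
    """Sort a pair of identities into match, possible match or no match."""
    score = match_score(idat_a, idat_b)
    if score >= MATCH_THRESHOLD:
        return "match"
    if score >= POSSIBLE_THRESHOLD:
        return "possible_match"
    return "no_match"
```

With these assumed settings, two records that differ only in the spelling of the last name (e.g. "Meier" and "Meyer") fall between the two thresholds and would be left for interactive resolution, while the spelling variant can be stored as an additional identity of the same participant.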
To ensure compliance with the principles of informational self-determination, the participant has to be able to consent to several aspects of data processing. Within the TTP, the management of informed consents includes the provision of patient information documents, the consent itself and a monitoring of various types of revocations. For digital processing, informed consent documents are represented as modular, examinable policies and modules and are combined with additional data like electronic signatures, dates and organisational information. This modular informed consent allows for verifiable as well as up-to-date statements on whether, for example, the processing of a participant’s data, the secondary use of collected data or the specimen collection is legitimate or not.
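A modular consent of this kind can be sketched as a small data structure. The class names, policy names and the date-based revocation rule below are illustrative assumptions and do not reflect the data model of any specific consent service.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Module:
    """A re-usable consent module that bundles one or more policies."""
    name: str
    policies: tuple


@dataclass
class Consent:
    participant_id: str
    signed_on: date
    accepted_modules: list                                 # modules the participant agreed to
    revoked_policies: dict = field(default_factory=dict)   # policy name -> revocation date

    def revoke(self, policy: str, on: date) -> None:
        """Record a revocation of a single policy."""
        self.revoked_policies[policy] = on

    def is_permitted(self, policy: str, on: date) -> bool:
        """True if `policy` is covered by an accepted module and not revoked by `on`."""
        if on < self.signed_on:
            return False
        covered = any(policy in m.policies for m in self.accepted_modules)
        revoked_on = self.revoked_policies.get(policy)
        return covered and (revoked_on is None or on < revoked_on)
```

Because each check is made against a single policy and a date, the question of whether secondary use or specimen collection was legitimate at a given point in time can be answered automatically, including after a partial revocation.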
The efficient generation and administration of pseudonyms within the TTP is a key functionality when medical scientific data needs to be processed and permanently stored. In order to provide scientific data for research projects and secondary use, the data has to be pseudonymised secondarily or be anonymised. In some cases an anonymisation is not applicable. Follow-up investigations, the communication of incidental findings or the linkage of secondary data require the pseudonymisation to be reversible in order to retrieve the corresponding participants for further contact.
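The difference between reversible pseudonymisation and anonymisation can be made concrete with a short sketch. It assumes an in-memory mapping table and random hexadecimal pseudonyms; the class and method names are hypothetical, and a production service would add persistent storage, access control and a configurable pseudonym format.

```python
import secrets


class PseudonymStore:
    """Domain-specific, reversible pseudonyms kept in a lookup table."""

    def __init__(self):
        self._forward = {}  # (domain, original value) -> pseudonym
        self._reverse = {}  # (domain, pseudonym) -> original value

    def get_or_create(self, domain: str, original: str) -> str:
        """Return the existing pseudonym for `original` or create a new one."""
        key = (domain, original)
        if key not in self._forward:
            pseudonym = secrets.token_hex(8)
            while (domain, pseudonym) in self._reverse:  # avoid collisions within a domain
                pseudonym = secrets.token_hex(8)
            self._forward[key] = pseudonym
            self._reverse[(domain, pseudonym)] = original
        return self._forward[key]

    def resolve(self, domain: str, pseudonym: str) -> str:
        """De-pseudonymise, e.g. to re-contact a participant about an incidental finding."""
        return self._reverse[(domain, pseudonym)]

    def anonymise(self, domain: str, original: str) -> None:
        """Delete the mapping so that the pseudonym can no longer be resolved."""
        pseudonym = self._forward.pop((domain, original))
        del self._reverse[(domain, pseudonym)]
```

Keeping the mapping per domain means that the same participant receives unrelated pseudonyms in different contexts, and deleting a mapping turns the affected pseudonymised data into anonymised data from the perspective of this store.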
For the implementation of the independent TTP, several open source software modules are used. Following the basic concepts and processes described by the TMF, the MOSAIC project (funded by the German Research Foundation (HO 1937/2-1)) has developed a set of practical tools to address data protection challenges and to provide support for the implementation of a data management in epidemiologic research projects. These free software tools (E-PIX, gICS, gPAS) facilitate the principles of “privacy by design” and use uniform technical standards. Moreover, these tools provide a service-oriented architecture and consistent graphical user interfaces.
The E-PIX (Enterprise Patient Identifier Cross Referencing) allows a precise identity management and supports the data trustee in distinguishing participants sustainably based on their identifying data (IDAT). It follows the principles of a Master Person Index. This ensures that a participant exists only once in the linkage database based on demographic information. The completely service-based software module generates a unique identifier for every managed participant and allows solving ambiguous matching cases interactively using a web-based graphical interface. The equally modular solution gPAS (generic Pseudonym Administration Service) adopts similar technical approaches and provides domain-specific pseudonym creation, de-pseudonymisation and anonymisation functionalities. The utilisation of gICS (generic Informed Consent Service) completes the set of TTP tools. It facilitates the management of digital informed consent documents and allows automatable checks for consent validity and revocations. Modular informed consents are defined based on examinable policies and re-usable modules.
The simultaneous use of the MOSAIC software modules E-PIX, gICS and gPAS allows implementing basic requirements of an independent TTP. The administration of participant identities, informed consents and pseudonyms can be performed using graphical web interfaces. However, due to their modular design there is no direct communication among these components. In order to realize more complex workflows, a manual intervention of the data trustee is necessary in many tasks. For example, if a new participant is recruited, it is necessary to assign a unique identifier based on the participant’s IDAT (identity management), to pseudonymise this unique identifier (pseudonym administration) and to return the generated pseudonym in order to start capturing the medical data within the study site.
Automating the communication between the software modules E-PIX, gPAS and gICS as far as possible through well-defined workflows reduces the number of necessary manual interventions of the data trustee. Only a small number of crucial decisions remains where human interaction cannot be replaced (e.g. to evaluate and resolve possible matches).
Extending flexibility to support individual scenarios
The required communication between the previously described TTP services depends on the workflow of a specific cohort or registry and, hence, individual characteristics may differ from the typical scenario. In order to flexibly orchestrate the particular TTP services and to coordinate the corresponding communication between the services, a dispatcher has been developed. The TTP Dispatcher represents the conceptual continuation of a request dispatcher, which was introduced in the GANI_MED project.
Flexibility through workflows
In terms of a TTP, a workflow technically describes a sequence of (parallel) processes and operations, starting with an input and ending with a defined outcome. Workflows are used to control and process the necessary calls to the connected software modules E-PIX, gPAS and gICS. They are divided into two groups. Basic workflows represent common tasks of a data trustee and are of relevance in most project scenarios, e.g. checking if a participant already exists in the management system or generating pseudonyms for a list of participant identifiers. Project-specific workflows describe all necessary individual processes and operations beyond these, for example all steps required to automatically generate a pseudonym when a new participant is created in a study site based on the participant’s IDAT and a valid informed consent. The separation of basic and project-specific workflows allows a consistent approach for several implementations of the TTP Dispatcher. This architecture supports portability to other research projects, reduces maintenance effort and improves the sustainability of a TTP implementation.
The technical description of each workflow is performed using Apache Camel. Based on Enterprise Integration Patterns, routes can be defined using a domain-specific language. Each route comes with at least two end-points (source and target, e.g. a simple file, a web service or an internal process), which expect an input (e.g. objects, messages) and return a result. These end-points are linked using a message channel and basic elements of the Apache Camel syntax.
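The route concept (a source end-point, intermediate processing steps and a target end-point connected by a message channel) can be illustrated independently of the actual framework. The following sketch is written in plain Python and does not use or reproduce the Apache Camel API; the `Route` class and its method names are assumptions made for illustration only.

```python
from typing import Callable, Iterable, List, Optional

Message = dict
Processor = Callable[[Message], Message]


class Route:
    """A message channel from one source end-point to one target end-point."""

    def __init__(self, source: Iterable[Message]):
        self._source = source                      # source end-point: yields incoming messages
        self._steps: List[Processor] = []          # processing steps applied in order
        self._target: Optional[Callable[[Message], None]] = None

    def process(self, step: Processor) -> "Route":
        """Append a processing step; returns the route to allow chaining."""
        self._steps.append(step)
        return self

    def to(self, target: Callable[[Message], None]) -> "Route":
        """Set the target end-point that receives each processed message."""
        self._target = target
        return self

    def run(self) -> None:
        """Pass every message from the source through all steps to the target."""
        if self._target is None:
            raise ValueError("a route needs a target end-point")
        for message in self._source:
            for step in self._steps:
                message = step(message)
            self._target(message)
```

A route is then declared by chaining, for instance `Route(incoming).process(check_participant).to(outbox.append).run()`, where `incoming`, `check_participant` and `outbox` are placeholders for a real source, a real service call and a real target.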
Overview of basic Trusted Third Party workflows
Generate MPI ID for given IDAT using E-PIX-service
Check if a participant with given IDAT already exists in the E-PIX-database
Get a pseudonym for a given identifier (e.g. MPI ID) and vice versa using the gPAS-service
Add a new informed consent (based on a template containing several modules and policies) for the given identifier using the gICS-service
Check if an informed consent for the given identifier exists in the gICS-database
Query a list of policies and their consented state for a given informed consent identifier using the gICS-service
Add a document scan to a previously defined informed consent using the gICS-service
Update a participant’s IDAT already existing in the E-PIX-database
Retrieve a participant’s IDAT from the E-PIX-database identified by its MPI ID
Sequential workflow combining get_mpi and get_id_from_id
Sequential workflow combining get_id_from_id and get_participant_by_mpi_id
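The two sequential workflows listed last can be sketched as simple function composition. The workflow names get_mpi, get_id_from_id and get_participant_by_mpi_id are taken from the overview above; their bodies here are in-memory stand-ins rather than calls to the real services, and the sequential identifiers as well as the way get_id_from_id detects the direction of the lookup are assumptions for illustration.

```python
from functools import reduce
from itertools import count

# In-memory stand-ins for the identity and pseudonym services (illustration only).
_mpi_counter, _psn_counter = count(1), count(1)
_mpi_by_idat, _idat_by_mpi = {}, {}
_psn_by_id, _id_by_psn = {}, {}


def get_mpi(idat: tuple) -> str:
    """Return the MPI ID for the given IDAT, creating one if none exists."""
    if idat not in _mpi_by_idat:
        mpi_id = f"MPI-{next(_mpi_counter):04d}"  # sequential IDs for readability only
        _mpi_by_idat[idat] = mpi_id
        _idat_by_mpi[mpi_id] = idat
    return _mpi_by_idat[idat]


def get_id_from_id(identifier: str) -> str:
    """Map an identifier to its pseudonym, or a known pseudonym back to its identifier."""
    if identifier in _id_by_psn:
        return _id_by_psn[identifier]
    if identifier not in _psn_by_id:
        pseudonym = f"PSN-{next(_psn_counter):04d}"
        _psn_by_id[identifier] = pseudonym
        _id_by_psn[pseudonym] = identifier
    return _psn_by_id[identifier]


def get_participant_by_mpi_id(mpi_id: str) -> tuple:
    """Return the stored IDAT for an MPI ID."""
    return _idat_by_mpi[mpi_id]


def sequence(*workflows):
    """Combine workflows so that each one receives the result of the previous one."""
    return lambda value: reduce(lambda acc, workflow: workflow(acc), workflows, value)


# The two combined workflows from the overview.
idat_to_pseudonym = sequence(get_mpi, get_id_from_id)
pseudonym_to_idat = sequence(get_id_from_id, get_participant_by_mpi_id)
```

Since every workflow takes one input and returns one result, combined workflows are themselves workflows and can be combined further, which is the property that allows basic workflows to be re-used in project-specific ones.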
An essential part of the technical establishment of a Trusted Third Party is the implementation of the required dispatcher functionalities. In the past, the necessary individual implementations for a cohort study or registry required up to 6 months of work.
In case of an error, the workflow processing is interrupted. Among other information, the error message and the error origin are documented and returned to the respective study site.
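The described error behaviour (interrupt processing, record the error message and its origin, return both to the caller) might be sketched as follows. The result structure and the step names are hypothetical and are not taken from the dispatcher implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence, Tuple


@dataclass
class WorkflowResult:
    ok: bool
    value: Any = None
    error_message: Optional[str] = None
    error_origin: Optional[str] = None  # name of the step that failed


def run_workflow(steps: Sequence[Tuple[str, Callable[[Any], Any]]], payload: Any) -> WorkflowResult:
    """Run named steps in order; stop at the first error and report where it occurred."""
    for name, step in steps:
        try:
            payload = step(payload)
        except Exception as exc:
            # Processing is interrupted: later steps are not executed.
            return WorkflowResult(ok=False, error_message=str(exc), error_origin=name)
    return WorkflowResult(ok=True, value=payload)
```

Returning the failing step's name together with the message lets the requesting study site see which part of the process rejected the request, for example a consent check rather than the identity lookup.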
As the example demonstrates, a consistent workflow management allows available functionalities (E-PIX, gICS, gPAS) to be linked easily by reusing and combining predefined workflows. Thus the individual character of cohort studies and registries can be represented in a straightforward manner. Moreover, study-specific processes can largely be automated, and manual intervention of the data trustee can be substantially reduced.
The components and workflows of a TTP vary according to their specific context. For example, the Central Clinical Cancer Registry in Mecklenburg-Western Pomerania does not require a consent management and the German National Cohort uses a specific pseudonymisation for different sites and data categories (e.g. MRI, bio samples, web forms).
Advantages and disadvantages of the developed Trusted Third Party Dispatcher
Advantages:
- Support for automation reduces susceptibility to errors and accelerates internal TTP processes
- Integrated audit-and-trail-mechanisms ensure traceability and transparency of participating systems
- Modular and adaptable workflows improve portability and re-usability
- Multi-client capability to manage large multi-site projects
- Interoperability of service components
Disadvantages:
- Initial configuration of the TTP Dispatcher requires professional IT support to set up mandatory databases and the application server
- Changes and updates in TTP Dispatcher core functions involve a determined update management including tests in the respective project, study or registry
Following the described approach for the technical implementation of a Trusted Third Party supports compliance with project-specific legal data protection requirements in cohort studies and registries. However, in order to exhaustively fulfil security, data protection, ethical and legal requirements, additional measures are necessary. Among others, these include the institution of a data trustee, several dedicated rules, access controls for non-employees, separated network infrastructures and full client-capability on a technical and organisational level, resulting in the separated storage of participant identifying data for each supported study and registry. Moreover, regular internal and external audits have to be conducted, involving both the institutional and the federal data protection officers.
The aim of the described TTP Dispatcher approach differs significantly from that of existing IT platforms supporting clinical research, such as EHR4CR. The proposed TTP approach focuses exclusively on the management of participant identifying data and related technical and organisational measures. Medical data is not processed within the TTP. EHR4CR, in contrast, focuses on broad support for all steps of a clinical trial process. This includes the provision of information about new and running trials, several tools for data managers and investigators, the provision of query engines, recruitment software, an identity and access management, as well as a security framework. Unlike the proposed TTP approach, the EHR4CR IT platform stores aggregated medical information, and patient identifying data does not leave the clinical context.
The concept of the TTP Dispatcher has already been implemented successfully in the German Centre for Cardiovascular Research (DZHK) and the German National Cohort. The resulting pattern can flexibly be adopted and easily be extended for reuse in future cohort studies and registries. The established TTP solutions are compatible with the legal requirements of the “Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data” (Council of Europe), the “Declaration of Helsinki” (World Medical Association), the “EU legal framework on the protection of personal data” (European Commission) as well as the “Treaty of Lisbon” (European Union), and they are aligned with the previously mentioned recommendations of the TMF data protection concepts for medical research.
During the recruitment of participants for cohort studies and registries, particularly the acquisition, processing and storage of personal health data necessitate both compliance with ethical standards and stringent policies for data protection. For Germany, the resulting requirements for data management are compiled comprehensively in the guideline provided by the TMF. Conformity is usually achieved by the implementation of a Trusted Third Party (TTP). However, the individual TTP implementation for different studies is associated with considerable technical effort that can be prohibitive in smaller studies or in institutions without a professional IT department.
This paper demonstrates how generic software modules developed and provided by the MOSAIC project [8, 10, 11] can be deployed in order to meet essential TTP requirements. The concept of a workflow-driven dispatcher is introduced combining these modules in structured workflows, allowing for a free combination of separate functionalities. Single process steps can be easily implemented by concatenating corresponding function calls and mapping them to workflows. The combination of multiple workflows enables an efficient conception and implementation of highly complex working procedures. Simultaneously the necessary effort for customisation is reduced to a minimum.
The proposed approach for the technical implementation of a TTP provides the necessary flexibility, portability and reusability for application in cohort studies and registries. This is achieved by mapping the individual requirements and characteristics of a particular study to pre-defined workflows. Reusability additionally benefits from the encapsulation of module logic and a uniform interface for all modules, which avoids study-specific modifications of individual modules or functionalities. The generic software modules connected with the workflow approach presented in this paper can easily be adapted to accommodate national and international requirements in terms of informed consent, identity management, pseudonymisation, data linkage and data transfer.
However, specification of a uniform interface for essential functionalities and parameters accounting for established standards and methods within the scientific community must still be considered time-consuming and labour-intensive. Future work will focus on further facilitating the establishment of an independent TTP, including workflow visualization, a generic module configuration independent from the deployed services, a graphical configuration tool for the configuration of the dispatcher and an extended central role-and-rights-management.
- DZHK: Deutsches Zentrum für Herz-Kreislauf-Forschung (German Centre for Cardiovascular Research)
- E-PIX: Enterprise Patient Identifier Cross-referencing
- gICS: generic Informed Consent Administration Service
- gPAS: generic Pseudonym Administration Service
- MPI ID: unique identifier following concepts of a Master Patient Index
- TMF: Technology, Methods and Infrastructure for Networked Medical Research
- TTP: Trusted Third Party
PP and TB were involved in the conception and design process of the TTP Dispatcher. MB, PP, TB, TW, CH, JP and WH drafted the manuscript. PP, TB, CH and WH revised it critically. PP and TB were responsible for the set-up of the IT infrastructure. All authors read and approved the final manuscript.
The MOSAIC-Project is funded by the German Research Foundation (DFG) as a part of the research grant programme “Information infrastructure for research data” (grant number HO 1937/2-1).
Compliance with ethical guidelines
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Council of Europe. Convention for the protection of individuals with regard to automatic processing of personal data. In ETS No.108 1981 Strasbourg
- European Commission (2012) Official website of the European Commission. http://ec.europa.eu/justice/data-protection/document/review2012/com_2012_11_de.pdf. Accessed 02 Oct 2014
- World Medical Association (WMA) (2008) WMA Declaration of Helsinki—Ethical Principles for Medical Research Involving Human Subjects. In 59th WMA General Assembly 2008 Korea
- Pommerening K, Drepper J, Helbing K, Ganslandt T, Müller T, Speer R et al (2014) Generic data protection concepts for medical research networks 2.0 (Leitfaden zum Datenschutz in medizinischen Forschungsprojekten. Generische Lösungen der TMF 2.0). Berlin. TMF e.V. 2014
- Dierks C (2008) Legal evaluation of an electronical data trustee ship of the TMF (Rechtsgutachen zur elektronischen Datentreuhänderschaft der TMF, TMF-Produkt P052011). Berlin. 2008
- The MOSAIC-Project (2014) Mosaic-Project Website. http://mosaic-greifswald.de. Accessed 1 Jun 2014
- Schaar P (2010) Privacy by design. Identity Inf Soc 3(2):267–274. doi:10.1007/s12394-010-0055-x
- The MOSAIC-Project (2014) ID-Management with E-PIX. https://mosaic-greifswald.de/werkzeuge-und-vorlagen/id-management-e-pix.html. Accessed 19 Sep 2014
- Lenson C (1998) Building a successful enterprise master patient index: a case study. Top Health Inf Manag 19(1):66–71
- Geidel L, Bahls T, Hoffmann W (2013) A generic pseudonymization tool as a module of Central Data Management for medical research data (Ein generisches Pseudonymisierungswerkzeug als Modul des Zentralen Datenmanagements medizinischer Forschungsdaten). In: Löffler M, Riedel-Heller S (eds) Abstractband 8th Annual Conference of the German Society for Epidemiology (DGEpi) e.V. and 1st International LIFE Symposium (Abstractband 8. Jahrestagung der Deutschen Gesellschaft für Epidemiologie und 1. Internationales LIFE Symposium). Leipzig, pp 245–246
- Bahls T, Liedtke W, Geidel L, Langanke M (2015) Ethics meets IT: aspects and elements of computer-based informed consent processing. In: Fischer T, Langanke M, Marschall P, Michl S (eds) Individualized medicine: ethical, economical and historical perspectives. Springer, Cham, pp 209–229. http://www.springer.com/biomed/book/978-3-319-11718-8
- Grabe H, Assel H, Bahls T, Dörr M, Endlich K, Endlich N et al (2014) Cohort profile: Greifswald approach to individualized medicine (GANI_MED). J Transl Med 12:144. doi:10.1186/1479-5876-12-144
- Lablans M, Borg A, Ückert F (2015) A RESTful interface to pseudonymization services in modern web applications. BMC Med Inform Decis Mak 15:2. doi:10.1186/s12911-014-0123-5
- The Apache Software Foundation (2014) Apache Camel Homepage. http://camel.apache.org/. Accessed 20 Oct 2014
- The Apache Software Foundation (2014) Enterprise Integration Patterns. http://camel.apache.org/enterprise-integration-patterns.html. Accessed 20 Oct 2014
- ZKKR-MV (2014) Central Clinical Cancer Registry in MV (Zentrales klinisches Krebsregister Mecklenburg-Vorpommern). http://web1-zkkr.zkkr.med.uni-greifswald.de/. Accessed 29 Sep 2014
- The National Cohort (Nationale Kohorte e.V.) (2015) http://www.nationale-kohorte.de/content/9.2-DS-Konzept-Treuhandstelle-NAKO-V1-01-2015-01-11.pdf. Accessed 20 May 2015
- Conference of the Federal and State Data Protection Officers—Workinggroup for technical and organisational data protection issues (2012) Guide to client-capability (Technische und organisatorische Anforderungen an die Trennung von automatisierten Verfahren bei der Benutzung einer gemeinsamen Infrastruktur—Orientierungshilfe Mandantenfähigkeit). http://www.baden-wuerttemberg.datenschutz.de/wp-content/uploads/2013/04/Mandantenfähigkeit.pdf. Accessed 24 Apr 2015
- Doods J, Bache R, McGilchrist M, Daniel C, Dugas M, Fritz F (2014) Piloting the EHR4CR feasibility platform across Europe. Methods Inf Med 53(4):264–268. doi:10.3414/ME13-01-0134
- German Centre for Cardiovascular Research (DZHK) (2014) dzhk.de. http://dzhk.de/. Accessed 11 Mar 2014
- Conference of the Representatives of the Governments of the Member States. Treaty of Lisbon Amending the Treaty on European Union and the Treaty Establishing the European Community. In Official Journal of the European Union (2007/C 306/01) 2007 Lisbon. pp 1–228