EHR4CR Consortium Page 1 of 50
Electronic Health Records for Clinical Research
Deliverable 7.4
Final report from the pilot site evaluations and the status of local interfaces
Version 2.0
Final
29/02/2016
Project acronym: EHR4CR
Project full title: Electronic Health Records for Clinical Research
Grant agreement no.: 115189
Budget: 16 million EURO
Start: 01.03.2011 - End: 28.02.2015 followed by 1 year extension
Website: www.ehr4cr.eu
The EHR4CR project is partially
funded by the IMI JU programme
Coordinator:
Managing Entity:
Document description
Deliverable no: 7.4
Deliverable title: Final report from the pilot site evaluations and the status of local interfaces
Status: Final
Version: 2.0 Date: 29/02/2016
Security: EHR4CR Consortium
Editors: Fleur Fritz, Benjamin Trinczek, Justin Doods, Iñaki Soto-Rey, Philipp Bruland, James
Cunningham, Mark McGilchrist, Christian Lovis, Colin McCowan, Hans-Ulrich
Prokosch, Scott Askin, Elena Bolanos, Helen Townsend, Eric Zapletal, Sebastian
Mate, Marc Cuggia, Bolaji Coker, Andreas Schmidt, Andy Sykes.
Document history
Date Revision Author Changes
22/10/2014 0.0 Fleur Fritz Initial draft
08/12/2014 0.1 Fleur Fritz, Benjamin Trinczek, Justin Doods, Iñaki Soto-Rey, Philipp Bruland, James Cunningham, Mark McGilchrist Description of work and results
19/12/2014 0.2 Christian Lovis, Colin McCowan, Hans-Ulrich Prokosch, Scott Askin, Elena Bolanos, Helen Townsend, Eric Zapletal, Sebastian Mate, Marc Cuggia, Bolaji Coker Site updates and changes
19/01/2015 0.3 Mark McGilchrist, James Cunningham, Inaki Soto Rey, Justin Doods, Philipp Bruland, Fleur Fritz Result updates
28/01/2015 1.0 Andreas Schmidt, Andy Sykes Comments and corrections
29/02/2016 2.0 Martin Dugas, Dipak Kalra Minor updates
Table of Contents
1. Executive Summary ......................................................................................................................... 5
2. Deliverable Description ................................................................................................................... 7
2.1. Task 7.1 Local Interfaces with EHR4CR platform ................................................................... 7
2.2. Task 7.2 Protocol Feasibility ................................................................................................... 7
2.3. Task 7.3 Patient Recruitment ................................................................................................. 7
2.4. Task 7.4 Clinical Trial Execution & Task 7.5 Serious Adverse Event Reporting ...................... 8
2.5. Output of the deliverable ...................................................................................................... 8
3. Organization of Work and Results ................................................................................................. 10
3.1. Local Interfaces – Site Readiness ......................................................................................... 10
3.1.1 Method ............................................................................................................................ 10
3.1.2 Data access and available contexts ................................................................................. 11
3.1.3 Authorisations required and obtained ............................................................................ 11
3.1.4 Precautionary measures taken by sites ........................................................................... 12
3.1.5 Technical approaches to accessing data.......................................................................... 12
3.1.6 Transforming data ........................................................................................................... 14
3.1.7 Coding and measurement units ...................................................................................... 15
3.1.8 Installation and configuration ......................................................................................... 17
3.2. Protocol Feasibility – Effectiveness & Efficiency Evaluation ................................................ 18
3.2.1 Effectiveness Results ....................................................................................................... 18
3.2.2 Efficiency Results ............................................................................................................. 19
3.2.3 Analysis of the results ...................................................................................................... 20
3.3. Protocol Feasibility – Scalability Evaluation ......................................................................... 20
3.4. Protocol Feasibility – Usability Evaluation ........................................................................... 22
3.4.1 Results.............................................................................................................................. 24
3.5. Patient Recruitment – Data Inventory ................................................................................. 30
3.6. Patient Recruitment – Evaluation ........................................................................................ 31
3.6.1 Task #1 (Installation and configuration of the PRS components at the sites) ................. 32
3.6.2 Task #2 (Approach and seeking confirmation of the respective Principal Investigator(s) (PI) at the participating sites for the respective trial(s) that have already been chosen in year 3) ............ 32
3.6.3 Tasks #3 (Adjustment of the evaluation protocol for each site participating in the
evaluation for the respective trial(s)) & #4 (Approach and requesting approval of the local ethics
committee/institutional review board at the participating sites for the respective trial(s)) ........ 34
3.6.4 Tasks #5 (Review of simplified eligibility criteria (EC) of the trials used for the
evaluation, and, if necessary, re-simplification of these EC) & #6 (Check availability of data items
correlating to the EC within the central terminology, and, if necessary, coordinate insertion of
missing items with Work package 4 team) ................................................................................... 35
3.6.5 Task #7 (Extraction, Transformation and Loading (ETL) of EHR data correlating to these
data items) ..................................................................................................................................... 36
3.6.6 Tasks #8 (Creation of (database) queries for the trials utilized in the evaluation within
the central workbench), #9 (Distribution of queries to the participating sites) & #10 (Execution
of queries, collection of necessary numbers and screening lists at the participating sites) ......... 37
3.6.7 Task #11 (Comparison between screening list from standard method with candidate list
from EHR4CR systems) .................................................................................................................. 39
3.7. Clinical Trial Execution & Serious Adverse Event Reporting – Data Inventory .................... 40
4. Appendix ........................................................................................................................................ 42
4.1. Scalability Evaluation ........................................................................................................... 42
4.2. Usability Evaluation .............................................................................................................. 42
4.3. PRS Evaluation...................................................................................................................... 43
4.3.1 PRS testing template from last year's deliverable .......................................... 43
4.3.2 PRS results document from AP-HP for the EUCLID study ................................................ 44
4.3.3 PRS results document from AP-HP for the GetGoal Duo-2 study ................................... 45
4.4. PRS Data Inventory .............................................................................................................. 46
4.5. CTE Data Inventory .............................................................................................................. 49
1. Executive Summary
This document describes the Work Package 7 (WP7) deliverables for the fourth year within the
Electronic Health Records for Clinical Research (EHR4CR) project. The EHR4CR project aims to create
a platform to reuse data from electronic health records for clinical research. More information can
be found at: http://www.ehr4cr.eu/.
The overall objective of WP7 is to demonstrate the functionality of the tools and services provided by
the platform (Work Packages 3-6) and to evaluate the EHR4CR platform in the areas of clinical study
design, execution and SAE reporting with a specific focus towards a set of mutually acceptable
medical domains agreed on by the demonstrator sites and EFPIA partners in accordance with Work
Package 1.
WP7 pilots the platform at 11 different data provider sites. The piloting is divided into three
scenarios: protocol feasibility, patient identification and recruitment, clinical trial execution including
serious adverse event reporting. The fourth year of the project concentrates on all three scenarios: protocol feasibility is to be evaluated, patient identification and recruitment is to be installed and also evaluated, and clinical trial execution is to be analysed. For the third (last) scenario, it was decided to adjust the project objectives to a prototype installation established by WPG2 and the CRF data element analyses by WP7.
The key results, challenges and proposed mitigation steps for year four are highlighted below and
described in detail in each section.
Key results:
Access to real data at the data provider sites, including mapping of local terminology to the central terminology
PFS evaluation results
PFS usability evaluation in conjunction with WP1
Platform scalability testing
Updated version of Data Inventory for PRS
Local Integration of Patient Recruitment Services (follow-up from year 3)
PRS evaluation results
Clinical trial execution (CTE) data inventory and validation
Challenges:
Due to several delays in the availability of the Protocol Feasibility (PFS) and Patient Recruitment (PRS) platforms, not all of the initially planned tests and evaluations could be performed. The PFS scenario was therefore tested only once, and the PRS could only be tested retrospectively.
Proposed mitigation:
Some tests have been performed only at selected sites, and both scenarios were tested with a slightly reduced scope. More tests were planned for year 5. However, the project experienced a significant and unexpected budget shortfall in the fifth year, which forced Work Package 7 to close down early and removed the opportunity to undertake any further evaluation work on the PRS or on the prototype implementations of CTE.
2. Deliverable Description
The fourth year deliverable D7.4 for WP7 is: “Final report from the pilot site evaluations and the
status of local interfaces”.
This deliverable focuses on activities related to the last scenario of CTE and serious adverse event
reporting. However, the first two scenarios are also still being worked on and are covered by Tasks 7.1 to 7.3. To achieve the deliverable, it was divided into tasks with specific activities that
are described below.
2.1. Task 7.1 Local Interfaces with EHR4CR platform
The year four activity for this task is: “Mapping of local data items to pivot representation (v2); local
adaptation of uniform access layer (v2); local integration of Clinical Trial Data Capture Services”.
This task aims at preparing the local data provider sites to install the necessary components of the
EHR4CR platform and make data available to be queried for the selected clinical trials within the
project. This not only involves identifying and preparing the data through Extract – Transform – Load
(ETL) processes but also obtaining approvals (e.g. data privacy, health authority) and making sure
that data privacy and security are respected.
The main outputs are therefore a collection of data elements called the data inventory and a checklist recording each site's readiness to run the EHR4CR platform against the specific data warehouses at the local sites.
2.2. Task 7.2 Protocol Feasibility
The year four activity for this task is: “Further improvement of the efficiency of the trial feasibility
demonstrators based on EFPIA evaluation (based on evaluation concept from Work Package 1)”.
This task focuses on the evaluation of the protocol feasibility component of the EHR4CR platform. For
the evaluation, in comparison with conventional methods, a specific test plan and evaluation
protocol was designed and carried out at selected data provider sites with a subset of real clinical
trials. The system was evaluated with respect to efficiency and accuracy. Furthermore, several kinds
of user acceptance tests with respect to scalability and usability have been carried out.
2.3. Task 7.3 Patient Recruitment
The year four activity for this task is: “The effect of EHR4CR platform on recruitment and enrolment
rates in hospitals will be analysed according to EFPIA criteria”.
This task focuses on the scenario “Patient Identification and Recruitment”, in which the central and
locally developed and installed services shall be used to identify potentially eligible patients. The
evaluation of these services is designed to compare the process and output of identifying potentially
eligible patients with the current methodology, as described and agreed upon in the evaluation
protocol (template) that has been written in the previous year together with EFPIA.
2.4. Task 7.4 Clinical Trial Execution & Task 7.5 Serious Adverse Event Reporting
The year four activities for those tasks are: “The ability of EHR4CR platform to query EHRs in order to
enrich CRFs with EHR data will be tested and the ability to transmit data from several hospitals to the
sponsor’s CDMS. EHR4CR will support the execution by the demonstrators of integration profiles
similar to IHE Retrieve Form for Data capture (RFD). The completeness, quality of data and the
timeliness achieved through the EHR4CR platform will be compared to current methods (using paper
CRF or eCRF without EHR integration)”
and:
“The ability of EHR4CR platform to query EHRs in order to enrich drug safety reporting forms with EHR
data will be tested as well as the ability to transmit data from several hospitals to the relevant
authorities. EHR4CR will support the execution by the pilots of integration profiles similar to IHE RFD.
The completeness, quality of data and the timeliness achieved through the EHR4CR platform will be
compared with current methods”.
Since no fully functional pilot for CTE was available, it was decided to redirect these tasks towards a comprehensive data inventory for the third scenario. The inventory focuses on the
determination of common data elements in clinical trials as well as their availability and
completeness within sites’ EHR systems. In addition, the serious adverse event reporting process was
removed from the list of deliverables since the SAE reporting from clinical trials is done by the
sponsor companies through the pharmacovigilance systems. However, the relevant data elements
were collected for the CTE data inventory.
2.5. Output of the deliverable
Based on the described tasks the output of the fourth year deliverable consists of:
Site Readiness
o Status of clinical data warehouses and end-point installations at the sites
o Mapping and ETL
Protocol Feasibility
o Efficiency and effectiveness evaluation
o Scalability Evaluation
o Usability evaluation
Patient Recruitment
o Data Inventory
o PRS evaluation protocol template for institutional review boards (IRB)
o Evaluation
Clinical Trial Execution & Serious Adverse Event Reporting
o Data Inventory
The participating pilot sites have the following abbreviations:
University College London UCL
Kings College London KCL
University of Dundee UNIVDUN
Université de Rennes 1 U936
Westfälische Wilhelms-Universität Münster WWU
Friedrich-Alexander-Universität Erlangen-Nürnberg FAU
Hôpitaux universitaires de Genève HUG
Assistance Publique Hôpitaux de Paris AP-HP
The University of Manchester UoM
Medical University of Warsaw – POLCRIN MUW
University of Glasgow UoG
3. Organization of Work and Results
In this main section the organization of work within the work package and the respective results
representing the outputs of the WP7 deliverable are described. More detailed information about
each deliverable part can be found in the Appendix 4 as well as in the referenced SharePoint
documents.
3.1. Local Interfaces – Site Readiness
Sites are required to perform a number of duties in regard to their membership of the EHR4CR
network. They must provide the necessary interfaces to their data (obtained from one of the
providers) and ensure that their data, systems and staff are ready to participate. This section – Local
interfaces and site readiness – discusses the nature of these interfaces at each site and the
requirements of the data and staff, and assesses whether existing sites are ready.
3.1.1 Method
The information in this section was obtained through one-to-one interviews with representatives from each site, conducted between 1st and 3rd December 2014. Ten of the eleven sites were successfully interviewed; one site (Rennes) was not, and no results will be presented for that site. Those responding included Münster (WWU), Erlangen (FAU), Glasgow (UoG), Geneva (HUG), Manchester (UoM), AP-HP (Paris), Warsaw (MUW), Dundee (UNIVDUN), Kings College London (KCL) and University College London (UCL). Table 3.1.1 lists those participating in the interviews.
Site Name(s)
WWU Justin Doods, Inaki Soto Rey, Benjamin Trinczek
FAU Sebastian Mate, Thomas Ganslandt
UoG Kevin Ross
HUG Dina Vishnyakova
UoM James Cunningham
AP-HP Eric Zapletal
MUW Cezary Szmigielski, Marcin Rozek, Slawomir Majewski
UNIVDUN Mark McGilchrist
KCL Bolaji Coker
UCL Dionisio Acosta
Table 3.1.1: Interview participants
Interviews lasted between 40 and 50 minutes and the following issues were discussed:
Access to data
Context in which data can be used
Authorisation required and obtained
Precautionary measures
ETL
Data quality and provenance
Coding and units
Installation and configuration
3.1.2 Data access and available contexts
In the EHR4CR project, data were partitioned into 7 distinct categories:
Demographics, Diagnosis, Procedures, Administration (prescribing), Laboratory, Findings, Pathology
Sites were asked whether they contributed these kinds of data (table 3.1.2) and which clinical
domains they thought the data could support for clinical trials (table 3.1.3).
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
Demo Y Y Y Y Y Y Y Y Y Y
Diagnosis Y Y Y Y Y Y Y Y Y(r) Y
Procedure Y Y Y Y Y Y Y(r) Y Y(r) Y
Admin/Rx Y Y(r) Y(r) Y Y Y X Y Y(r) N
Lab Y Y(r) Y Y Y Y Y Y(r) Y N
Findings Y(r) N(r) N N Y Y ? Y(r) Y(r) Y(r)
Pathology ? Y(r) N N ? Y ? N N Y(r)
Table 3.1.2: Availability of data types by site. Notes: R – restrictions
WWU2 FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
Diabetes Y Y Y (r) Y2 Y Y Y Y
CV Y Y Y Y2 Y Y Y Y
Oncology Y Y N Y2 N Y Y N Y(r)1
Respiratory Y Y N ? N Y Y N (r)
Inflammatory Y Y N Y2 N Y N N (r)
Neurology Y Y N Y2 N N Y N (r)
Renal Y Y N Y (r) N (r) Y
Table 3.1.3: Supported clinical domains. Y – yes, Y (r) – yes, but some restrictions, N – No, N (r) –
generally no, but some things may be possible. 1 – Breast Cancer (UCL). 2 – domain is subject to
authorisation.
3.1.3 Authorisations required and obtained
The data of section 3.1.2 can only be leveraged for EHR4CR with the appropriate authorisations in
place. EHR4CR sought general authorisation for PFS and specific authorisations for PRS. However,
many sites also imposed study-specific authorisations for PFS limiting their ability to contribute to the
overall platform. A summary of the authorisation status of the sites is given in table 3.1.4.
Authorisations were obtained from data protections officers, administrators and ethics committees
where appropriate.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
General - Y Y X Y X Y Y Y X
Study Y (10) - - Y (1) - Y (3) - - - Y (2?)
Expires Y (P) Y (P) Y (P) Y (S) Y (P) X ? Y (P) ? Y (P)
Table 3.1.4: Authorisation status by site. Notes: P – EHR4CR project, S – study, Y (#) – number of studies
3.1.4 Precautionary measures taken by sites
Sites took precautionary measures with the data to reduce its identifiability when accessed by the
platform. These measures included altering the patient date of birth by up to 3 months in a random
manner, and altering date and time of events (diagnoses, findings, etc.) by a random amount of up to
one year on a patient-by-patient basis. All events within a patient record were shifted by the same
time increment; event sequencing and inter-event durations were not altered, but patient age at
events may have been slightly altered. Absolute dates are, however, altered, and this can have implications for queries constructed at the central workbench.
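The per-patient date-shifting scheme described above can be sketched in a few lines. This is a minimal illustration under the stated rules (one fixed random offset per patient, reused for every event so that sequencing and inter-event durations survive); all function and variable names are invented and do not come from any site's actual implementation.

```python
import random
from datetime import date, timedelta

def build_shift_table(patient_ids, max_days=365, seed=None):
    """Assign each patient one random offset of up to a year; reusing it
    for every event preserves sequencing and inter-event durations."""
    rng = random.Random(seed)
    return {pid: timedelta(days=rng.randint(-max_days, max_days))
            for pid in patient_ids}

def shift_event_date(shift_table, patient_id, event_date):
    """Shift an absolute event date by the patient's fixed offset."""
    return event_date + shift_table[patient_id]

# All events within one patient record move by the same amount:
shifts = build_shift_table(["p1"], seed=42)
d1 = shift_event_date(shifts, "p1", date(2014, 1, 10))
d2 = shift_event_date(shifts, "p1", date(2014, 3, 10))
assert (d2 - d1).days == 59  # the 59-day gap between the events survives
```

The absolute dates in the output are deliberately wrong, which is exactly why queries involving calendar dates built at the central workbench need care.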
In terms of the ambitions of the PFS platform a general authorisation for domain data is desirable.
Four sites did not offer this level of flexibility and provided study-specific authorisations instead. The
latter hinder the objectives of the PFS use case where exploration is of prime importance and takes
place prior to any study-specific agreements being put in place.
All authorisations, with the exception of AP-HP's, will expire at the end of the project period on 28th February 2015. (Warsaw and KCL do not have an explicit position on the termination of authorisations, but their position is likely to be similar to that of most other sites.)
3.1.5 Technical approaches to accessing data
Each site must extract data from one or more systems and transform it structurally for incorporation
into the native EHR4CR clinical data warehouse (CDW), or an i2b2 CDW. One site, AP-HP, has a pre-
existing i2b2 warehouse for local hospital use and it is used as the target for EHR4CR requests.
Another site, Erlangen, also has an IBM Cognos-based hospital data warehouse as the source for data
extraction. Other sites must access one or more systems (many in some cases) to extract the
necessary data for the site project warehouse. The described topologies for data extraction are
shown schematically in figure 3.1.1.
Figure 3.1.1: Possible topologies when extracting data for hospital sites to the EHR4CR CDW
Extractions may be performed by either EHR4CR staff or local hospital or IT staff depending on the
relationship established between the two parties. The topologies and staff arrangements for data
extraction for each site are shown in table 3.1.5.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
Topology β α α β α β β α α α
EHR4CR staff Y Y - - - - - Y Y Y
Hospital staff - - Y Y Y Y Y - Y -
Table 3.1.5: Extraction topology and staff data access arrangements by site
Methods for extracting data include CSV files (obtained through SQL scripts), SQL database backups,
or through direct access with a tool such as Talend Open Studio. Extraction methods for each site are
given in table 3.1.6.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
csv - - Y Y Y - Y - -
backup - - - - - Y - - - Y
tool Talend IBM Cognos - - - Talend - Y (1) SQL SQL
Table 3.1.6: Extraction method by site. Note: 1 – proprietary tool, to be replaced by Talend at some point.
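For the CSV route, the site-side mechanics can be as small as spooling a query result to a file. The sketch below is illustrative only: the table, its columns, and the use of SQLite stand in for a site's real clinical source system and RDBMS driver.

```python
import csv
import sqlite3  # illustrative stand-in for the site's RDBMS driver

def extract_to_csv(conn, query, out_path):
    """Run an extraction query and spool the result set to a CSV file."""
    cur = conn.execute(query)
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cur.description])  # header row
        writer.writerows(cur)  # one CSV row per result row

# Hypothetical source table, purely for demonstration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lab_result (patient_id TEXT, loinc TEXT, value REAL)")
conn.execute("INSERT INTO lab_result VALUES ('p1', '2345-7', 5.4)")
extract_to_csv(conn, "SELECT * FROM lab_result", "lab_result.csv")
```

A full extract simply reruns the query; an incremental variant would add a predicate on a last-modified timestamp, matching the full/incremental modes summarised in table 3.1.7.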
Each site has options in regard to the mode of extraction - full or incremental – and the frequency of
extraction. These are summarised by site in table 3.1.7.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
Mode F F/I F F F F I F F F
Frequency od od Inf od od w m od od od
Table 3.1.7: Mode and frequency of data extraction by site. Notes: F – full, I – incremental, od – on demand, inf – infrequent, /2d – every two days, m – monthly, w - weekly
3.1.6 Transforming data
Data, once extracted, must be transformed to a new structure – either native EHR4CR or i2b2. (This
may also include the generation of new terminology codes where necessary.) However, as a rule, no
code mappings to the central terminology (CT) take place at this stage; the final warehouse always
uses site terminology coding, whether locally defined or standard.
Staff use proprietary or off-the-shelf products to achieve this structural change, e.g. Talend.
Proprietary methods include multiple SQL scripts or 3GL applications written in Java for example.
During this phase, filtering/transformation for data quality purposes may take place. In this process,
records may be discarded if they do not meet defined data quality standards, while other records
may have data values substituted by equivalent or more appropriate values. No data quality
standards have been defined for the EHR4CR CDW.
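Because no EHR4CR-wide data quality standards were defined, such filtering is necessarily site-specific. The sketch below shows only the general shape of a discard/substitute pass during transformation; both the rules and the field names are invented for illustration.

```python
def clean_records(records):
    """Apply illustrative site-defined quality rules during transformation:
    discard records that fail hard checks, substitute repairable values."""
    kept = []
    for rec in records:
        # Hard rule: a record without a patient identifier is unusable.
        if not rec.get("patient_id"):
            continue
        # Soft rule: normalise an obviously mis-scaled lab value
        # (e.g. haemoglobin reported in g/L rather than g/dL).
        if rec.get("unit") == "g/L":
            rec = {**rec, "value": rec["value"] / 10.0, "unit": "g/dL"}
        kept.append(rec)
    return kept

raw = [
    {"patient_id": "p1", "value": 132.0, "unit": "g/L"},
    {"patient_id": None, "value": 13.5, "unit": "g/dL"},  # discarded
]
assert clean_records(raw) == [{"patient_id": "p1", "value": 13.2, "unit": "g/dL"}]
```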
Particular care must be taken to ensure compliance with the Blue Model, i.e. those rules that the
end-point software uses when processing data.
In correcting, or discarding data, sites may establish the provenance of the data from hospital
warehouse IT staff, hospital systems IT staff, or clinical staff such as physicians or nurses.
Table 3.1.8 shows the extent to which sites impose quality control measures when processing the
data during ETL.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
DQ - I2b2 - Some Y Y (loss) - Y - -
Blue Model - I2b2 - - Y - - Y - -
Provenance - some - Some - - Some Y - -
Table 3.1.8: Imposition of data quality measures by site. Notes: i2b2 – some measures are imposed by the use of i2b2. Where quality measures are imposed no specific site details are provided as the individual site situations are complex.
3.1.7 Coding and measurement units
During the transformation phase a list of site terminology codes is generated, which is retained
within the native or i2b2 warehouses. These terminology codes, whether local, national or
international, must be mapped to the chosen central terminology (which itself consists of specific
national or international codes defined by WP4.) Mappings are created as new data becomes
available to a site or when the CT expands for new studies. EHR4CR has not provided tools to support
this process so far, but these are due in the near future. Therefore, at the moment, sites must choose
their own tools for this purpose. In many cases the process is highly manual (M), sometimes more
automated (A).
The output of this process must be a file that can be read by the Terminology Services (currently a Continuity of Care Document; previously CSV). Sites have had varying degrees of
success in establishing these mappings.
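As a concrete but simplified picture of the mapping output, the older CSV route might look as follows. The column names are invented for illustration, and the current Terminology Services format is a Continuity of Care Document rather than CSV.

```python
import csv

def write_mapping_csv(mappings, out_path):
    """Write local-to-central code mappings as CSV rows.
    The column layout is illustrative, not the Terminology Services schema."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["local_code", "local_system",
                         "central_code", "central_system"])
        writer.writerows(mappings)

# Example: a local laboratory code mapped (manually) to LOINC
write_mapping_csv(
    [("GLU-S", "LOCAL-LAB", "2345-7", "LOINC")],
    "lab_mappings.csv",
)
```

Whether a row like this is produced manually or automatically is exactly the distinction recorded per site in table 3.1.9.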
Mappings are performed in two possible contexts: 1) the central terminology, or 2) the eligibility
criteria for a PRS study, the latter perhaps requiring the expansion of the CT. Table 3.1.9 shows the
method of mapping employed by sites – manual or automatic – when mapping between various
coding systems. The central terminology uses the following coding systems (terminologies) and are
the target for all mapping processes:
Diagnosis (D) ICD10 (WHO)
Procedure (P) SCT (SNOMED-CT)
Administration (Rx) ATC (WHO)
Laboratory (L) LOINC
Demographics (M) SCT
Units (U) UCUM
Findings (F) SCT
Pathology (Y) PathLex
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL** UCL U936
ICD10 (D) Auto Auto (r) Auto Auto Auto Auto (r) Auto Auto
Local (D)
UMLS (D) Auto
SCT (P) Auto
CCAM (P) man
CHAP (P) man
OPCS4 (P) man man Semi
OPS (P) man man (r)
ICD9 (P) Auto(L)
Local (P)
ATC (Rx) Auto Auto Auto
BNF (Rx) man (r) Auto
Local (Rx) man man (r)
LOINC (L) Auto(r) Auto(r)
Local (L) man man (r) man (r) man (r) man man semi
SCT (M)
Local (M) man man man man man man man man
UMLS (M) Auto(r)
UCUM (U)
Local (U) man man man man man man man
SCT (F) Auto(r)
Read (F) ?
Local (F) man man man man man man man Auto
PathLex (Y)
UMLS (Y) Auto (r)
Table 3.1.9: Mapping methods employed by sites. For example, Dundee maps laboratory data semi-manually between local coding and LOINC as the Central Terminology. Erlangen already uses LOINC for its laboratory data and automatically maps. Notes: purple background indicates central terminology for the given data category (D, P, Rx, L, M, U, F and Y), Man – manual, r – restrictions apply, semi - tools have been used to assist manual operations, L – licence decision required before MUW will proceed to map. ** KCL has not yet employed any mapping techniques against their data.
The degree to which mapping operations are finished, and the final coverage obtained for the central terminology and the various PRS studies, has been reported by the sites and is given in Table 3.1.10. It should be noted that, as the CT expands, mapping operations will almost never be finished (<100%). The figures reported are from the last time the sites checked their mappings.
Study WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL3 U936
CT F 60% ? 60% ?5 75% ? ?
6 99% 0% ?
C 97% 52%4 40% ?
5 50% 60% ?
6 52%
1 0% ?
GGD2 F 100% 100%
C 95% 90%2
EUCLID F 100% 100%
C 93% 94%
Proselica F 100% 100%
C 100% 50%7
Bayer 15141 F 0%
C 0%
EINSTEIN F 100%
C 100%
KATHERINE F 100% 100%
C ?100% 100%
OCTAVE F ?0 ?100%
C ?0 ?60%
PASSAGE F ?100%
C ?100%
TSAT F PIx
C PIx
Table 3.1.10: Notes: CT – central terminology, GGD2 - GetGoal Duo-2, F – Finished, C - Coverage. Notes: 1 – no mappings for Pathology, some findings and EHR4CR specific elements. 2 – Medication daily dose, lab ranges LLN/UL, some specific drugs not available. 3 – UCL expected to do two studies, but these had not been finalised at time of writing, and it is expected they will require new terms in the central terminology. 4 – Excludes mappings where there are no data in the warehouse. 5 – PRS studies only. 6 – Awaiting confirmation of terminology licence arrangements. 7 – Awaiting an update to warehouse which will improve mapping completeness. PIx – PI did not wish to proceed. ? – no information available.
3.1.8 Installation and configuration
Sites must devote resources (money, hardware, software, staff time) to the installation and
configuration of the platform. The EHR4CR installation covers 5 platform components: PFS and PRS
end-points, terminology services, local workbench, audit facility, and a local warehouse, either
EHR4CR or i2b2, hosted by an RDBMS on dedicated hardware or a virtual machine on shared
hardware. Table 3.1.11 shows the current setup for each of the sites.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
Hardware - - 2 - - Y Y (db) Y (db) - -
VM 3 2 - 2 2 - 1 (ep) 4 Y 2
OS Linux Ubuntu Window7 Ubuntu Linux/WSvr Linux Centos Window7 Ubuntu+Win2008 Linux
RDBMS PG/my Oracle 11 MSSQL 2012 Oracle MSSQL Oracle 11 MSSQL MSSQL Oracle Express 11g Postgres
CDW v1.3.7 I2b2 v1.3.7 I2b2 v1.3.8 I2b2 v1.3.8 v1.3.6 V1.3.7 V1.3.6
Java JDK 1.7 JDK 1.6/7 JDK 1.7 ? JDK 1.7 JDK 1.7 ? JDK 1.7 JDK 1.7 JDK 1.7
PFS EP 3.1.1 3.1.1 James Y Y Y (r) Y Y Modified Y
TS Y Y Y Y Y Y Y Y Y Y
Audit Y Y X X SQL logs X SQL logs X N SQL logs
PFS status OK OK OK OK OK (Prod) OK (r) OK OK OK OK (r)
PRS EP Y Y (i) - Y (i) Y (i) Y - Y (i) - -
LWB Y Y (i) - Y (i) Y (i) Y - Y (i) - -
PRS status OK OK - OK OK OK - OK? - -
Table 3.1.11: Current installation and configuration by site. Notes: # – number of machines, PG – Postgres, my – MySQL, i – v3.1.1 installer, r – restrictions apply, db – database, ep – endpoint.
Installation was a variable experience for sites and the following points were noted in the interviews:
WWU Installation now stable, but not well tested. Need to bring manual up to date.
FAU Documentation is good, but there are many things to know related to the specifics of the environment. Communications are still a problem. End-point code and Blue model appear resilient.
UoG Not an easy process. Tests appear ok, but failures do not appear to be reported back to the central workbench properly.
HUG Documentation is excellent. Local tests using the central workbench appear to be ok.
UoM Production environment is mostly ok. Central workbench interface could be improved.
AP-HP PFS and PRS endpoints now stable. PFS is using local (site) validation for validating access to real data.
MUW Better documentation is needed. Only succeeded after help from other sites. Local checks using central workbench seem ok.
UNIVDUN PFS endpoint seems ok. PRS with installer (v3.1.1) appears ok. Testing between PFS and PRS suggests there are issues to be corrected.
UCL PFS installation is stable. Some test queries did not report answers. PRS installer not tested yet as only recently made available.
It should be noted that sites performed their installations at different points in the software
development cycle.
3.2. Protocol Feasibility – Effectiveness & Efficiency Evaluation
The evaluation of the feasibility scenario focuses on demonstrating the improvement in accuracy (effectiveness) and the time saved (efficiency) when using the EHR4CR platform instead of the current manual, questionnaire-based protocol feasibility (PF) process. The demonstration compares two simulations of the patient cohort identification phase of ten clinical trials (the same ten trials that were used to build the EHR4CR PF scenario): one using the EHR4CR platform and one following the current methodology. These numbers were compared against a gold standard based on a manual check of patient records. Because checking all patient records was impractical, 100 patient record sets were randomly extracted and the results extrapolated using the Wilson score confidence interval at the 95% confidence level.
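As an illustration of the extrapolation step, the sketch below computes a Wilson score interval for the proportion of eligible patients in a 100-record sample and scales it to the clinic population. The function names are our own, and the sketch assumes a simple binomial sampling model rather than the project's actual analysis script.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (95% level when z=1.96)."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - half, centre + half)


def extrapolate(successes: int, sample: int, population: int) -> tuple[int, int, int]:
    """Scale the sampled proportion and its Wilson bounds to the clinic population."""
    lo, hi = wilson_interval(successes, sample)
    mean = round(successes / sample * population)
    return (mean, round(lo * population), round(hi * population))
```

For example, 3 eligible patients found in 100 sampled records from a clinic of 1411 patients would extrapolate to a mean of 42 with asymmetric lower and upper bounds, the same form as the intervals reported in table 3.2.1.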
3.2.1 Effectiveness Results
Overall results at UKM

Study | EFPIA partner | Current process | EHR4CR PF system | Gold standard mean [LB – HB] | Patients per clinic
NCT00439725 | Bayer | 30 | 0 | 42 [14 – 120] | 1411
NCT00345839 | Amgen | 200 | 0 | 74 [13 – 406] | 7439
NCT00627640 | Merck | 300 | 0 | 112 [30 – 393] | 5607
NCT00894387 | Novartis | 25 | 0 | 75 [29 – 186] | 1885
NCT00638690 | Janssen | 50 | 0 | 31 [8 – 108] | 1540
NCT00626548 | AstraZeneca | 200 | 174 | 216 [131 – 341] | 1540
NCT00715624 | Sanofi | 12 | 18 | 0 [0 – 96] | 2575
NCT01018173 | Roche | 340 | 566 | 876 [655 – 1126] | 2575
NCT01468987 | Eli Lilly | 110 | 8 | 257 [142 – 449] | 2575
NCT00490139 | GSK | 10 | 0 | 22 [8 – 54] | 546

Table 3.2.1: Overview of the results obtained by the current and EHR4CR-supported processes for each of the trials evaluated at UKM, with comparison against the gold standard.
Overall results at AP-HP

Study | EFPIA partner | Current process | EHR4CR PF system | Gold standard mean [LB – HB] | Patients per clinic
NCT00439725 | Bayer | 20 | 205 | 494 [288 – 816] | 4116
NCT00638690 | Janssen | 25 | 5 | 86 [15 – 470] | 8626
NCT00626548 | AstraZeneca | 250 | 695 | 1035 [603 – 1709] | 8626

Table 3.2.2: Overview of the results obtained by the current and EHR4CR-supported processes for each of the trials evaluated at AP-HP, with comparison against the gold standard.
3.2.2 Efficiency Results
The creation, execution and visualization of a query using the EHR4CR PF system required between 5 and 25 minutes, depending on the complexity of the query (5 minutes for a query with 3 criteria, 25 minutes for one with 26 criteria).
The time required to receive a response from the PIs varied depending on whether the questionnaire was sent by the EFPIA representative or directly by the site. In the former case, the response was received within seven calendar days, whereas in the latter it took between 30 and 90 days. This difference may be explained by a lack of commercial interest among the PIs answering questionnaires sent by the sites, leaving the first measurement (seven days) as the only valid estimate.
3.2.3 Analysis of the results
The results of the effectiveness and efficiency evaluation of the EHR4CR PFS show that the protocol design process can be significantly enhanced by the EHR4CR system, which provides patient counts across a large number of sites within a short period of time. The evaluation also demonstrates that these counts strongly depend on the availability of structured electronic health data at the EHR4CR data provider sites: only when structured data is stored in the EHR4CR site data warehouses are the patient counts generated by the EHR4CR PF system accurate (they are normally zero otherwise).
3.3. Protocol Feasibility – Scalability Evaluation
The EHR4CR PFS system is intended to work on large-scale patient databases, with the number of
records potentially in the hundreds of millions. For this reason a scalability evaluation of the PFS
system was undertaken, where the ability of the system to respond in reasonable time to requests
against large databases was tested. This section outlines the approach that was taken for these tests.
The approach to testing the system’s ability to handle large-scale data was to distribute a series of datasets, based on the record sets of 200, 2,000, 20,000 and 200,000 patients respectively, to a selection of pilot sites whose IT environments reflected the range of environments supported by EHR4CR. A series of queries was run against these datasets and the time taken for the system to respond was recorded. The hypothesis, based on analysis of the ‘Blue Model’ [Bache, R., Miles, S., Taweel, A., 2013. An adaptable architecture for patient cohort identification from diverse data sources. J Am Med Inform Assoc. 2013 Dec;20(e2):e327-33] algorithms, was that the time taken to process queries should scale linearly with the size of the dataset. The recorded times were analysed to test this hypothesis.
Given the stringent requirements on maintaining the privacy and confidentiality of electronic patient records across the European Union and within the regulatory domains of the EHR4CR pilot sites, testing the system against real patient data is often infeasible. Furthermore, formal testing of the system’s ability to handle large volumes of data required the same data to be hosted at each of the participating pilot sites. An approach to providing test data was therefore required.
Given the availability within EHR4CR of 200 manually anonymised records of diabetes patients (the “Dundee 200”), it was decided to base the test data on these. The alternative approach of producing purely random data was rejected because such data fails to represent realistic data, and tests run against it might not account for issues in processing queries that stem from features only present in real or realistic data. A tool was therefore produced which took source data as input and scaled it up to a required size (e.g. taking in the records of 200 patients as input and producing as output the records of up to 200,000 patients), whilst altering the particular details of each source record to avoid uniformity in the scaled data.
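A minimal sketch of such a scale-up tool is shown below. The record fields (patient_id, birth_year, hba1c) and the perturbation ranges are illustrative assumptions of ours, not the tool actually used in the project.

```python
import copy
import random


def scale_up(source: list[dict], target_size: int, seed: int = 42) -> list[dict]:
    """Replicate anonymised source patient records up to target_size,
    perturbing each copy so the scaled warehouse does not contain
    identical rows (avoiding uniformity in the synthetic data)."""
    rng = random.Random(seed)
    scaled = []
    for i in range(target_size):
        rec = copy.deepcopy(source[i % len(source)])
        rec["patient_id"] = f"SYN{i:06d}"            # fresh synthetic identifier
        rec["birth_year"] += rng.randint(-2, 2)      # shift age slightly
        rec["hba1c"] = round(rec["hba1c"] * rng.uniform(0.95, 1.05), 2)
        scaled.append(rec)
    return scaled
```

Seeding the generator keeps the synthetic warehouse reproducible, which matters when the same dataset must be hosted at every participating pilot site.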
Three queries were authored on the PFS platform and distributed to participating pilot sites. These
queries were designed to represent a basic query, a more complex inclusion/exclusion query and a
query incorporating temporal constraints. Appendix 4.3 contains the specification of the queries
used. These queries were distributed to sites using the following environments:
Operating system | Hosting environment | RAM | Database system | Warehouse
Windows 7 | Physical machine | 8 GB | MS SQL Server 2012 | Native
Windows Server 2012 | VM | 4 GB | MS SQL Server 2012 | Native
Windows 7 Enterprise | Physical machine | 4 GB | PostgreSQL 9.3 | Native
Linux (CentOS 6.6) | VM | 48 GB | Oracle 11g | i2b2
The tests measured a baseline time for each query performed against the 200-patient dataset. Times were then recorded in the endpoint log files for each subsequent dataset size. Given the variation in absolute timings between the systems, results were normalised to a factor increase over the baseline measure. Theoretical analysis indicated that the endpoint algorithm should run in O(n) (linear) time in the size of the database being queried. The expected result was therefore that each 10x increase in database size would correspond to a 10x increase in response time for the query.
The table below shows the average normalised increase in time across the systems as the multiple of the time for the previous database size (e.g. 200 to 2,000, 2,000 to 20,000 and 20,000 to 200,000).

Normalised increase in time | 200–2,000 patients | 2,000–20,000 patients | 20,000–200,000 patients
Query 1 | 8.3 | 4.2 | 3.0
Query 2 | 3.0 | 9.6 | 9.0
Query 3 | 3.0 | 9.0 | 12.03
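The normalisation step can be sketched as follows; the timing values are reconstructed from the Query 1 row above for illustration, and the function name is our own.

```python
def normalised_increases(times: dict[int, float]) -> list[float]:
    """Factor increase in response time between successive dataset sizes.

    `times` maps dataset size (number of patients) to the normalised
    response time for that size (the baseline size has time 1.0)."""
    sizes = sorted(times)
    return [round(times[big] / times[small], 1)
            for small, big in zip(sizes, sizes[1:])]


# Query 1: normalised times rebuilt from the per-step factors reported above.
query1 = {200: 1.0, 2_000: 8.3, 20_000: 8.3 * 4.2, 200_000: 8.3 * 4.2 * 3.0}
factors = normalised_increases(query1)  # one factor per 10x size step
```

Under the O(n) hypothesis each factor would be close to 10; the observed values scatter around that, which the report summarises as roughly linear scaling.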
The test against the PostgreSQL database failed for the larger database sizes. Analysis showed that this stemmed from a configuration issue with the software, which caused memory problems independent of the endpoint software itself. At the time of publication of this report the issue was being investigated and fixed.
The results indicate that the time taken to process queries increases roughly linearly with the number of records in the system, and thus that the PFS platform offers an acceptable level of performance with respect to response times.
3.4. Protocol Feasibility – Usability Evaluation
Two main objectives were agreed upon for the EHR4CR Usability evaluation:
a) To evaluate the user satisfaction of the EHR4CR query builder (QB)
b) To assess if the amount of provided training is sufficient/adequate
A document with information about the test was created (appendix 4.4.1) so that EFPIA partners could contact colleagues who had no previous knowledge of the EHR4CR PF system and were interested in testing the QB. After a date for the test was confirmed, user accounts were created within the system and delivered to the testers. Prior to testing, they also received a document (appendix 4.4.2) containing a manual for the platform and the description of the three tasks they would need to complete. The three tasks differed in complexity: a very simple first task to become familiar with the QB, a second task based on a realistic feasibility query, and a third task with a complex query that exercised all the functionality of the QB. The documentation given to the testers was complemented by a video showing how to use the QB and a link to an electronic survey with questions about the tasks, the training, the QB, and demographics. In the electronic survey, users were asked to upload screenshots of the results of query creation and execution after completing each task. These were used to evaluate the users’ success rate in creating the three queries that formed the three tasks (see table 3.4.a).
Criterion | Query 1 | Query 2 | Query 3
Inclusion – Gender | Female | – | –
Inclusion – Age | >50 years | >18 years | >18 years
Inclusion – Diagnosis | – | Non-insulin-dependent diabetes mellitus | Heart failure (at most 3 years before the query)
Inclusion – Lab values | – | Body Mass Index 25<X<40 | Left ventricular ejection fraction <40% (at most 14 months before the query)
Inclusion – Lab values | – | Haemoglobin A1c >7.5% | Systolic blood pressure >2.0 mmHg (2 times, separated by at least 2 months)
Exclusion – Diagnosis | – | Acute myocardial infarction | Cardiomyopathy in the puerperium (between 0 and 5 months before the heart failure)
Exclusion – Diagnosis | – | – | Acute myocardial infarction (between 5 and 30 months before now)
Exclusion – Diagnosis | – | – | Percutaneous coronary intervention (between 5 and 30 months before now)
Exclusion – Diagnosis | – | – | Operation on heart (between 5 and 30 months before now)
Exclusion – Treatment | – | Using insulin and analogues | Vasodilators used in cardiac diseases
Exclusion – Treatment | – | – | Phosphodiesterase inhibitors
Exclusion – Treatment | – | – | Cardiac stimulants
Table 3.4.a – Inclusion and exclusion criteria of the queries corresponding to the three tasks that comprise the user satisfaction test; temporal constraints are given in parentheses.
To measure user satisfaction, we used a standardized usability questionnaire, the System Usability Scale (SUS), to which we added questions about the testers’ demographics and computing skills.
To assess the adequacy of the provided training, we compared the screenshots provided by the users with the correct queries built by one of the EHR4CR experts. In addition, we analysed the responses to the questionnaire section on training satisfaction and suggestions.
3.4.1 Results
A total of 16 participants completed the questionnaire for the user satisfaction evaluation. Their demographic data can be seen in the following table (see table 3.4.1a).
Variable | Number of testers | Answers
Current job group: feasibility manager | 7 | 43.75%
Current job group: data manager | 1 | 6.25%
Current job group: trial manager | 2 | 12.50%
Current job group: other (e.g. head of clinical operations, enrolment specialist, clinical operations portfolio manager) | 6 | 37.50%
Work experience (years)* | 16 | 3.01 (1.680)
Gender: male | 7 | 43.75%
Gender: female | 8 | 50.00%
Gender: (no answer) | 1 | 6.25%
Age (years)* | 14 | 43.57 (5.827)
Age: (no answer) | 2 |
Native language: English | 12 | 75.00%
Native language: German/Swiss German | 2 | 12.50%
Native language: Polish | 1 | 6.25%
Native language: (no answer) | 1 | 6.25%
Difficulties regarding English: never, English is my native language | 11 | 68.75%
Difficulties regarding English: never, English is not my native language | 2 | 12.50%
Difficulties regarding English: rarely | 2 | 12.50%
Difficulties regarding English: (no answer) | 1 | 6.25%
Usage of similar systems in the past: no | 13 | 81.25%
Usage of similar systems in the past: yes | 3 | 18.75%
Experience with feasibility studies: little experience | 3 | 18.75%
Experience with feasibility studies: some experience | 4 | 25.00%
Experience with feasibility studies: much experience | 9 | 56.25%
Computer skills: average | 2 | 12.50%
Computer skills: good | 8 | 50.00%
Computer skills: excellent | 6 | 37.50%
Knowledge of Boolean algebra: none | 3 | 18.75%
Knowledge of Boolean algebra: little | 4 | 25.00%
Knowledge of Boolean algebra: average | 3 | 18.75%
Knowledge of Boolean algebra: good | 5 | 31.25%
Knowledge of Boolean algebra: excellent | 1 | 6.25%
Table 3.4.1a – Summarized number and row percentage per category of the participant demographics; * = for “work experience” and “age”, mean and standard deviation were calculated; n=16 participants.
After each of the three tasks, the testers answered questions about the task (see table 3.4.1b). The table shows that the first and second tasks were rated well, but the third was rated as noticeably harder and less satisfactory.
Item | Task 1 mean (SD) | Task 2 mean (SD) | Task 3 mean (SD) | T1–T2 p | T1–T3 p | T2–T3 p
Task difficulty | 3.94 (0.929) | 3.75 (1.000) | 2.63 (1.088) | 0.582 | 0.006* | 0.005*
Satisfaction with the ease of completing the task | 3.81 (0.911) | 4.06 (0.574) | 2.75 (0.856) | 0.271 | 0.007* | 0.001*
Satisfaction with the amount of time it took to complete the task | 3.88 (0.885) | 3.75 (0.856) | 2.94 (0.680) | 0.755 | 0.017* | 0.005*
Satisfaction with the functionality provided | 3.69 (0.873) | 3.81 (0.655) | 2.88 (0.957) | 0.557 | 0.010* | 0.002*
Table 3.4.1b – Mean ratings of task difficulty and satisfaction (5-point rating scale), standard deviations and p-values of the Wilcoxon test; * = significant at the p < 0.05 level; n=16 participants.
Alongside the opinions about the tasks, a free-text question asked testers to report errors found during the creation and execution of the queries, or to suggest possible modifications to the query builder (see table 3.4.1c).
Task | Missing functionality reported | User | Expert review
Task 1 | criterion of >49 years was selected but appears as >=49 years | user01 | not important; mistake in specification of query, not tool
Task 1 | no ability to execute results for all countries, only for UK / no response when clicking on all countries | user19 | medium importance; probably a problem with available sites, not with the tool
Task 1 | the query in eclectic format didn’t show up and look similar to the screenshot in the training manual | user06 | low importance; feature only used for testing, will probably be removed for ‘real world’ version of tool
Task 2 | more & less than selections appear transformed into more/less than OR EQUAL to | user01 | not important; mistake in specification of query, not tool
Task 3 | the sequence of building the query is not clear; the system seems to require it in reverse (i.e. the parameters of time to be entered before the diagnosis) | user18, user19 | important; comment on usability though unspecific
Task 3 | entering exclusion criteria (e.g. 3.3.4 EC02) is cumbersome | user06 | important; comment on usability though unspecific
Task 3 | no visible option how to add a range of 5–30 months, the range always began at 0 months | user01 | medium importance; option is there when user selects ‘between’ rather than ‘more than’ or ‘less than’; user interface issue or poor documentation
Task 3 | no way to clear just one component from the query, “clear” clears all components / to change a particular part of the inclusion or exclusion criteria you have to delete the whole; it would be better to delete parts | user06, user21 | medium/low importance; true, but the individual inclusion sections are never hugely complex, so deleting all is not too bad
Task 3 | the “before now” button didn’t work several times | user19 | important; this was not seen reproduced though
Task 3 | the run function and eclectic format were not possible; the computer crashed when running the query or generating the eclectic format / a “does not compute” message appeared when trying to generate the eclectic format | user21, user01 | important; such a ‘crash’ was not reproduced elsewhere
Task 3 | system feedback that the query had been saved, but it doesn’t appear to have been | user01 | important; true in terms of lack of feedback, but the query is always saved
Task 3 | if you want to check a specific value of a criterion (e.g. whether left ventricular ejection fraction was correctly entered and you want to check it later) you are not able to see it by clicking on the symbols | user21 | important; this information appears at the bottom of the screen and not where the user would originally look, and may need scrolling; user interface issue
Table 3.4.1c – Responses to the open-ended question “What function or feature do you miss for this task?”, with expert review of the usability issues; n=16 participants.
After completion of the three tasks, overall satisfaction was assessed through the SUS questionnaire. The results show a total score of 55.86, which means that the application is acceptable but does not reach the desired “good” level of satisfaction. We suspect that the difficulty of the third task had a negative impact on user satisfaction and may have influenced the responses to the SUS.
SUS item | N (valid) | Mean | SD
I think that I would like to use the Query Builder frequently. | 16 | 3.63 | 0.885
I [did not find] the Query Builder unnecessarily complex.* | 16 | 3.06 | 0.929
I thought the Query Builder was easy to use. | 16 | 3.38 | 0.719
I think that I [would not] need assistance to be able to use the Query Builder.* | 16 | 2.94 | 0.998
I found the various functions in the Query Builder were well integrated. | 15 | 3.07 | 0.961
I [did not think] there was too much inconsistency in the Query Builder.* | 15 | 3.33 | 1.047
I would imagine that most people would learn to use the Query Builder very quickly. | 16 | 3.25 | 1.000
I [did not find] the Query Builder very cumbersome to use.* | 16 | 3.19 | 0.834
I felt very confident using the Query Builder. | 16 | 3.06 | 0.854
I [did not need] to learn a lot of things before I could get going with the Query Builder.* | 16 | 3.00 | 1.033
Overall SUS score | 15 | 55.86 | 15.37
Table 3.4.1d – Mean ratings (5-point scale from 1 “strongly disagree” to 5 “strongly agree”), standard deviations, and overall SUS score. Items marked with an asterisk (*) were reverse coded; n=16 participants.
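For reference, the standard SUS scoring rule that turns the ten 1–5 item responses into a 0–100 score can be sketched as follows. This is a generic illustration of the published SUS procedure, not the project’s own analysis script.

```python
def sus_score(responses: list[int]) -> float:
    """Standard SUS scoring: ten items rated 1-5. Odd-numbered items
    contribute (rating - 1), even-numbered (reverse-worded) items
    contribute (5 - rating); the sum is scaled to 0-100 by * 2.5."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("expected ten ratings on a 1-5 scale")
    total = sum(r - 1 if i % 2 == 0 else 5 - r
                for i, r in enumerate(responses))
    return total * 2.5
```

A neutral response of 3 on every item yields 50.0, so the observed mean of 55.86 sits only slightly above neutrality.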
The responses regarding the quality of the training show a good degree of satisfaction among the testers (see table 3.4.1e). Some, however, expressed the wish for a personal trainer who could provide information or answer questions on-site. Some testers also reported that the training manual contained insufficient information to complete task 3.
Item | N (valid) | Mean | SD
The topics covered by the training were relevant for the tasks. | 15 | 4.20 | 0.414
The time allotted for the training was sufficient. | 13 | 3.54 | 0.877
The content of the training was well organized and easy to follow. | 15 | 3.93 | 0.458
The materials distributed were helpful. | 15 | 4.13 | 0.352
The speed of the training video was appropriate. | 14 | 3.64 | 0.842
The amount of information was sufficient for solving the tasks. | 15 | 3.27 | 1.033
This training experience will be useful in my work. | 15 | 3.40 | 0.737
Overall, I am satisfied with the training. | 15 | 3.53 | 0.640
Table 3.4.1e – Mean ratings of the quality of the training (5-point scale from 1 “strongly disagree” to 5 “strongly agree”) and standard deviations; n=16 participants.
The analysis of the screenshots of the queries built and the results of their execution (see table 3.4.1f) shows that ten out of thirteen testers successfully completed tasks 1 and 2, whereas only four replicated this result for task 3. A likely reason is that query 2 is based on a realistic feasibility query, while query 3 was designed purely to exercise the full functionality of the query builder and contains many temporal constraints with which the testers were not familiar.
User ID Task 1 Task 2 Task 3
User 1 S S F
User 2 P P
User 3 S S F
User 4 F S P
User 5 S F
User 6 S F
User 7 S S S
User 8 S S P
User 9 S S F
User 10 S S S
User 11 S S S
User 12 S P P
User 13 S S S
User 14 F F
Table 3.4.1f – S=Success: the task was completed entirely successfully. P=Partial success: the user made no more than one severe mistake (wrong use of the eligibility criteria) or no more than two minor mistakes (wrong use of the temporal constraints). F=Fail: the user made more than one severe mistake or more than two minor mistakes.
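The grading scheme above can be encoded directly. The sketch below is our own reading of the rule, assuming per-task counts of severe and minor mistakes as input:

```python
def classify_task(severe: int, minor: int) -> str:
    """Grade a completed task by mistake counts, following the S/P/F
    scheme of table 3.4.1f: S = no mistakes at all; P = no more than
    one severe mistake (wrong eligibility criteria) and no more than
    two minor mistakes (wrong temporal constraints); F = anything worse."""
    if severe == 0 and minor == 0:
        return "S"
    if severe <= 1 and minor <= 2:
        return "P"
    return "F"
```

For example, a user who misplaced one temporal constraint but was otherwise correct would be graded P, while two wrongly placed eligibility criteria would be graded F.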
3.5. Patient Recruitment – Data Inventory
The creation of the PRS data inventory was described in last year’s deliverable D7.3. Since then the inventory has been refined, and Unified Medical Language System (UMLS) codes were added to each element.
When identifying codes for the elements it was discovered that some described the same concept (e.g. ‘platelets blood’ and ‘platelets count’). Redundant elements were removed, so the revised data inventory now contains 150 data elements. The number of elements per data group is shown in table 3.5.1. The whole data inventory is provided in appendix 4.1.
Data group | Total | Example
Demographics | 5 | Gender
Diagnosis | 5 | Code
Findings | 25 | Systolic blood pressure
Laboratory findings | 81 | HbA1c
Medical device | 1 | Type
Medical history | 10 | Smoking status
Medication | 9 | Route
Patient characteristics | 1 | Day-night cycles
Procedure | 3 | Procedure date/time
Scores & classification | 10 | Expanded Disability Status Scale (EDSS) score
Table 3.5.1 – Number of data elements per data group in the revised data inventory.
3.6. Patient Recruitment – Evaluation
The basic evaluation design for this scenario has been developed and described in year 3. To perform
the evaluation, a set of several tasks has to be conducted by different project partners. Some of the
tasks can be worked on in parallel, while others depend on the outcome of their predecessors. Since
the participating sites have slightly different settings and are in different states of the evaluation, this
report will list the current situation for the detailed tasks:
1. Installation and configuration of the PRS components at the sites;
2. Approach and seeking confirmation of the respective Principal Investigator(s) (PI) at the
participating sites for the respective trial(s) that have already been chosen in year 3;
3. Adjustment of the evaluation protocol for each site participating in the evaluation for the
respective trial(s);
4. Approach and requesting approval of the local ethics committee/institutional review board
at the participating sites for the respective trial(s);
5. Review of simplified eligibility criteria (EC) of the trials used for the evaluation, and, if
necessary, re-simplification of these EC;
6. Check availability of data items correlating to the EC within the central terminology, and, if
necessary, coordinate insertion of missing items with Work package 4 team;
7. Extraction, Transformation and Loading (ETL) of EHR data correlating to these data items;
8. Creation of (database) queries for the trials utilized in the evaluation within the central
workbench;
9. Distribution of queries to the participating sites;
10. Execution of queries, collection of necessary numbers and screening lists at the participating
sites;
11. Comparison between the screening list from the standard method and the candidate list from the EHR4CR system.
3.6.1 Task #1 (Installation and configuration of the PRS components at the sites)
The evaluation by WP7 was delayed because WPG2 was unable to keep to the planned timescales for the roll-out of the PRS platform, owing to difficulties in the development of the software. When it was eventually deployed, there were difficulties installing it at all of the sites.
This task is a crucial prerequisite for the whole evaluation: without successfully installed and configured PRS components, no patients can be found as described in the specification.
At Dundee the PRS components have been installed successfully and run. However, the platform does not report counts consistent with the PFS, and there are usability issues (query runtimes). Until these are resolved, Dundee will not be able to perform comparisons with actual study recruitment. The issue was reported in the project-internal ticketing system JIRA.
3.6.2 Task #2 (Approach and seeking confirmation of the respective Principal Investigator(s) (PI) at the participating sites for the respective trial(s) that have already been chosen in year 3)
While trying to install and configure the PRS components over several iterations, the sites were asked to approach the PIs of the respective trials, explain the evaluation design, and ask about their willingness to cooperate. PIs willing to participate would provide their local screening list, screen patients identified by the EHR4CR system, and provide an overview of screened, eligible and additionally identified patients as described in the evaluation protocol. The template was part of last year’s deliverable and has been added to appendix 4.3.1 for reference.
The status of approaching the local PIs is as follows:
Site | Status
University College London (UCL) | No appropriate study to utilize for this evaluation could be found. However, UCL is pro-actively seeking to contribute to this evaluation by identifying trials at the clinic (inter-institutional, multi-drug, etc.) for retrospective evaluation. PI approval obtained, subject to ethics approvals being obtained.
Kings College London (KCL) | PI approval obtained for study “Bayer 15141”
University of Dundee (UNIVDUN) | PI approval obtained for studies “AZ EUCLID” and “Sanofi GetGoal Duo-2”
Université de Rennes 1 (U936) | PI approval obtained for study “Roche_WA25204_09092013”. Will be done as a simulation as the study did not start.
Westfälische Wilhelms-Universität Münster (WWU) | PI approval obtained for studies “NVS OCTAVE”, “PASSAGE” and “Roche KATHERINE”; PI approvals denied for studies “Amgen MM Bone Study” and “NCT01816295”
Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) | PI approval obtained for studies “Bayer EINSTEIN Junior”, “Roche KATHERINE” and “Sanofi EFC11785 (Proselica)”
Hôpitaux universitaires de Genève (HUG) | PI contacted and approval obtained, but the NVS OCTAVE trial has been cancelled at HUG.
Assistance Publique Hôpitaux de Paris (AP-HP) | PI approval obtained for studies “AZ EUCLID” and “Sanofi GetGoal Duo-2”
The University of Manchester (UoM) | Does not participate in any of the trials identified for the evaluation
Medical University of Warsaw – POLCRIN (MUW) | Does not participate in any of the trials identified for the evaluation
University of Glasgow (UoG) | PI approval obtained for study “Sanofi EFC11785 (Proselica)”
Table 3.6.1 – PI participation status overview
No appropriate study to utilize for this evaluation could be found for two of the sites (UoM and MUW); these sites are therefore not listed in the following evaluation steps.
PI approvals could not be obtained for some of the studies at some sites (e.g. “Amgen MM Bone Study” at WWU). Whenever a missing approval affected a trial conducted at only that site, the trial was dropped from the evaluation. In most cases, however, we were able to gain approval from the local PIs, which indicates that local clinicians are willing to test new methods for identifying potentially eligible patients and to cooperate in the evaluation of such methods.
FAU will not proceed with the Sanofi PROSELICA study because the required patient population is treated not at Erlangen University Hospital but at a collaborating hospital with a separate IT infrastructure and only limited usage of the Erlangen University Hospital EHR system.
At HUG, the local PI agreed to participate in the PRS evaluation; however, since the sponsor pulled the study at this site, the evaluation will not take place.
UoG is one of the sites of the “Sanofi EFC11785 (Proselica)” study; however, recruitment at this site finished in 2013, leaving only the option of a retrospective evaluation at UoG.
3.6.3 Tasks #3 (Adjustment of the evaluation protocol for each site participating in the evaluation for the respective trial(s)) & #4 (Approach and requesting approval of the local ethics committee/institutional review board at the participating sites for the respective trial(s))
Sites that successfully gained approval from their local PIs adjusted the evaluation protocol (template) according to the evaluated trials and approached their local ethics committee/institutional review board. The ethics committees’ responses were as follows:
Site | Status
University College London (UCL) | A project-wide approval is being sought, and no obstacles are anticipated.
Kings College London (KCL) | The study has ethical approval and we have been given Caldicott approval to do the evaluation.
University of Dundee (UNIVDUN) | The gold standard test will not be performed at this site, so no ethics approval was sought.
Université de Rennes 1 (U936) | The study did not actually start in Rennes but will be simulated.
Westfälische Wilhelms-Universität Münster (WWU) | Ethics committee approved the evaluation with studies “NVS OCTAVE”, “PASSAGE” and “Roche KATHERINE”
Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) | Ethics committee approved the evaluation with studies “Roche KATHERINE”, “Bayer EINSTEIN JUNIOR” and “Sanofi EFC11785 (PROSELICA)”
Hôpitaux universitaires de Genève (HUG) | Ethics committee approved, but the trial has been pulled at HUG.
Assistance Publique Hôpitaux de Paris (AP-HP) | Ethics committee approved the evaluation with studies “AZ EUCLID” and “Sanofi GetGoal Duo-2”
University of Glasgow (UoG) | Local Privacy Advisory Committee (the group responsible for evaluations without patient contact) approved the evaluation with study “Sanofi EFC11785 (PROSELICA)”
Table 3.6.2 – Ethics status overview
In summary, whenever a local PI was willing to cooperate in the EHR4CR PRS evaluation, the local ethics committee approved the conduct of the evaluation as well.
3.6.4 Tasks #5 (Review of simplified eligibility criteria (EC) of the trials used for the evaluation, and, if necessary, re-simplification of these EC) & #6 (Check availability of data items correlating to the EC within the central terminology, and, if necessary, coordinate insertion of missing items with Work package 4 team)
The EC of the trials utilized for this evaluation had already been simplified. However, the
simplification took place at the beginning of the project, with a focus on protocol feasibility rather
than patient identification and recruitment. Hence, the result of the simplification was more a list of
necessary EC than EC suitable for screening patients. The team also gained experience throughout
the project. We therefore reviewed the simplified EC of the utilized trials with a focus on patient
identification and recruitment and re-simplified the EC where necessary.
Based on the simplified criteria, the central terminology was checked to ensure that all data items
necessary to build proper queries are available in it. This task is of special importance for trials
conducted at multiple sites, because the sites must use the very same query and thus agree on the
usage and interpretation of data items. The following table lists the status of the central terminology
check for each of the utilized studies:
Study (participating sites): Central terminology check status
AZ EUCLID (AP-HP, UNIVDUN): Central terminology checked by AP-HP; UNIVDUN is still checking the EC and the mappings to the central terminology.
Bayer EINSTEIN Junior (FAU): Simplification was optimized; terminology checked (no additions necessary).
Bayer 15141 (KCL): Terminology check ongoing.
NVS OCTAVE (WWU, HUG): Terminology not checked yet.
NVS CFTY720D2406 (WWU): Terminology not checked yet.
Roche KATHERINE (FAU, WWU): Simplification was optimized; terminology checked, and missing terms were added to the terminology by WP4.
Sanofi GetGoal Duo-2 (AP-HP, UNIVDUN): EC and mappings to the central terminology are being checked. AP-HP analysed the EC and provided the analysis to UNIVDUN; drug dosage could not be used in the eligibility criteria.
Sanofi EFC11785 (Proselica) (FAU, UoG): Simplification was optimized; terminology checked, and missing items reported to WP4.
UCL-internal study (UCL): Not started yet, because no local trial has been identified.
Rennes-internal study (U936): Rennes will not perform the evaluation with the EHR4CR system but with its local recruitment system, and therefore did not review the criteria against the central terminology.
Table 3.6.3 – Status of reviewing and aligning study EC with codes available in the central terminology
For some of the studies it was not checked whether their EC have corresponding codes in the central
terminology. The main reason is that the people who check the availability of the data items are the
same people involved in installing the PRS systems and performing the ETL of patient data, and they
could not take on the additional terminology check. This task revealed that the EC and data items in
the central terminology must be reviewed for patient identification and recruitment, because some
items could not be found. The results of this task have already led, and will continue to lead, to an
improvement of the central terminology, making it suitable for both PFS and PRS EC.
3.6.5 Task #7 (Extraction, Transformation and Loading (ETL) of EHR data correlating to these data items)
Having identified the data items necessary to query for potentially eligible patients, the sites had to
check and, if necessary, perform the Extraction, Transformation and Loading (ETL), at least for the
studies evaluated at their sites. The following table shows the current status of the sites' ETL.
Site: ETL status
University College London (UCL): The PRS scenario at UCL uses the same warehouse as PFS, so no special ETL for the PRS studies is necessary. ETL will be performed on demand once local studies have been selected; additional mappings might be required.
Kings College London (KCL): ETL tools are currently being tested on the live database for data extraction for "Bayer 15141".
University of Dundee (UNIVDUN): The PRS scenario at UNIVDUN uses the same warehouse as PFS, so no special ETL for the PRS studies is necessary.
Université de Rennes 1 (U936): The local ethics committee did not approve connecting any source of real patient data to the EHR4CR infrastructure, and no sufficient security warranty was provided by EHR4CR to convince the IT department. The local recruitment system will therefore be used.
Westfälische Wilhelms-Universität Münster (WWU): ETL done for "Roche KATHERINE" and "Novartis PASSAGE"; ETL started for "Novartis OCTAVE".
Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU): ETL is in progress for "Roche KATHERINE" and has not yet started for "Bayer EINSTEIN JUNIOR" and "Sanofi EFC11785 (Proselica)".
Hôpitaux universitaires de Genève (HUG): ETL is done; data for the period 06/2012 to 07/2013 were extracted.
Assistance Publique Hôpitaux de Paris (AP-HP): ETL checked for "AZ EUCLID" and "Sanofi GetGoal Duo-2"; no adjustment necessary.
University of Glasgow (UoG): ETL is currently ongoing.
Table 3.6.4 – Status of ETL
AP-HP used the hospital clinical data warehouse, where the items were already stored. The ETL
process was not changed, but an additional mapping was used to take the new items into account.
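The kind of additional mapping layer described here can be sketched as a simple lookup from local warehouse item codes to central-terminology concepts; this is an illustrative assumption about the mechanism, and all codes below are invented except the ATC code B01AF01 (rivaroxaban):

```python
from typing import Optional

# Sketch (assumed mechanism, not the actual AP-HP implementation): map local
# warehouse item codes to central-terminology concepts. Codes are invented
# except ATC B01AF01 (rivaroxaban).
local_to_central = {
    "LAB:GLU_S": "central:glucose_serum",   # hypothetical local lab code
    "MED:RIVA": "ATC:B01AF01",              # hypothetical local drug code
}

def map_item(local_code: str) -> Optional[str]:
    # Unmapped items stay out of the ETL and are reported (e.g. to WP4).
    return local_to_central.get(local_code)
```

Items that return no mapping are exactly the "missing items" the sites reported to WP4 during the terminology check.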
3.6.6 Tasks #8 (Creation of (database) queries for the trials utilized in the evaluation within the central workbench), #9 (Distribution of queries to the participating sites) & #10 (Execution of queries, collection of necessary numbers and screening lists at the participating sites)
AP-HP chose the AstraZeneca EUCLID and Sanofi GetGoal Duo-2 clinical trials to evaluate the
EHR4CR recruitment scenario. The whole workflow of the PRS scenario was tested during the
AP-HP local evaluation:
- The central terminology items used by the two studies were mapped to the AP-HP Clinical Data Warehouse items.
- The two corresponding queries were written in the central workbench.
- For the Sanofi GetGoal Duo-2 study, as daily drug dosage was not available in the query workbench, only the presence of the drugs was used. For AstraZeneca EUCLID, as some logical operations were not available in the central workbench, the exclusion criterion {"ICD-10 code I49.5 (Sick Sinus Syndrome)" AND NOT "Pacemaker"} was transformed into the inclusion criterion {"No ICD-10 code I49.5 (Sick Sinus Syndrome)" OR "Pacemaker"}.
- The queries in the central workbench were submitted to the AP-HP local workbench and validated by the AP-HP Data Relation Manager. A PI user was assigned to each study.
- The PI locally launched the two queries in order to get a list of patients for each study.
- The eligibility status of each patient of the EUCLID study was checked. For the Sanofi GetGoal Duo-2 study, this process is not yet finished.
- Throughout the AP-HP recruitment evaluation, the screening dashboard was available in the central workbench.
AP-HP was able to create a query in the central workbench that covers a selected set of eligibility
criteria, send it to the local PRS component(s) of their site, execute it at least once with the local PRS
components and gather a list of potentially eligible patients.
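The exclusion-to-inclusion transformation used for the EUCLID query is an application of De Morgan's law; a minimal sketch with boolean patient attributes (attribute names invented for illustration):

```python
# Sketch: De Morgan transformation of an exclusion criterion into an
# equivalent inclusion criterion, as done for the AZ EUCLID query.
def excluded(has_i49_5: bool, has_pacemaker: bool) -> bool:
    # Original exclusion criterion: "ICD-10 I49.5" AND NOT "Pacemaker"
    return has_i49_5 and not has_pacemaker

def included(has_i49_5: bool, has_pacemaker: bool) -> bool:
    # Equivalent inclusion criterion: "No ICD-10 I49.5" OR "Pacemaker"
    return (not has_i49_5) or has_pacemaker
```

A patient is eligible exactly when the exclusion criterion does not hold, so the two formulations agree for every combination of attribute values.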
WWU and FAU were able to create the query for the KATHERINE study in the central workbench and
send it to their local components. For WWU, the execution returned 0 patients.
For FAU, the electronic query returned 13 possibly eligible patients for the Roche KATHERINE study.
At that time, the screening log of this study contained 10 enrolled patients, 5 of whom were also
identified by the electronic query.
A closer examination still has to determine whether the other 8 suggested patients are truly eligible
for the study or whether they are false positives. However, the 5 correctly identified patients
demonstrate the usefulness of electronic patient recruitment based on routine care data, particularly
for studies with a low recruitment rate such as KATHERINE.
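The comparison between the electronic candidate list and the manual screening log is a set-intersection exercise; a minimal sketch with invented pseudonymized IDs chosen to reproduce the FAU KATHERINE counts (13 candidates, 10 enrolled, 5 in both):

```python
# Sketch: set-based comparison of the electronic candidate list with the
# screening log. The pseudonymized IDs are invented for illustration.
electronic = {f"P{i:02d}" for i in range(1, 14)}        # 13 query candidates
screening_log = {"P01", "P02", "P03", "P04", "P05",     # 5 also found electronically
                 "P90", "P91", "P92", "P93", "P94"}     # 5 found only manually

overlap = electronic & screening_log          # enrolled patients the query also found
recall = len(overlap) / len(screening_log)    # share of enrolled patients found
```

With these counts, the query recovers half of the enrolled patients; deciding whether the remaining candidates are false positives requires the manual record check described in Task #11.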
Due to current mapping problems with the EHR4CR platform, FAU executed the query on its i2b2
platform. FAU is confident that the query can also be executed successfully on the EHR4CR platform
once these mapping problems are sorted out. FAU actually expects a lower false-positive rate on the
EHR4CR platform, because no temporal constraints were imposed in the i2b2 query (probably
returning more "false" patients).
WWU created the query for the PASSAGE study at the central workbench but was not able to send it
to their local components due to ongoing technical issues with the platform.
Other sites and studies were not able to complete these tasks, because prerequisite steps were not
finished or encountered problems. The main reason is that the software, especially the local PRS
components, had to be installed, tested and reported on in several iterations, which led to massive
delays. The additional tasks (e.g. the check of the central terminology) were delayed as well. Since a
successful evaluation depends on both the availability of the PRS components and the completion of
these additional tasks, most sites were not able to achieve both at the same time. However, all
participating sites are still working on both topics and trying to conduct the evaluation.
3.6.7 Task #11 (Comparison between screening list from standard method with candidate list from EHR4CR systems)
No site was able to conduct the whole evaluation, because no site was yet ready to perform the
manual check of patient records. Partial results have been collected and are described below.
AP-HP completed the evaluation of the AstraZeneca EUCLID study in the context of the EHR4CR
PRS scenario on 8 October 2014 and of the Sanofi GetGoal Duo-2 study on 15 January 2015. The
comparison between the traditional recruitment process and the EHR4CR recruitment process is
shown in Table 3.6.7.1 for EUCLID and in Table 3.6.7.2 for GetGoal Duo-2. The official results
documents from AP-HP can be found in Appendix 4.3.
Patient Counts                                 Traditional Recruitment    EHR4CR Platform
Unique Clinically Validated                    53                         2
Identical Clinically Validated (both methods)  0
Table 3.6.7.1 - AP-HP PRS results for the AstraZeneca study (EUCLID)
Patient Counts                                 Traditional Recruitment    EHR4CR Platform
Unique Clinically Validated                    2                          4
Identical Clinically Validated (both methods)  0
Table 3.6.7.2 - AP-HP PRS results for the Sanofi study (GetGoal Duo2)
For WWU, the KATHERINE query identified 0 patients on the EHR4CR platform. When contacted for
further investigation, the PI explained that the study had been cancelled at the site because no
patients had tumour tissue left after surgery, which was a requirement for inclusion.
3.7. Clinical Trial Execution & Serious Adverse Event Reporting – Data Inventory
Due to delays in platform availability and the subsequent PFS and PRS evaluations, the execution of
this task was shifted to the fourth year. Another reason was the decision, after discussion, to combine
the two tasks into a single one by including SAE reporting in the Clinical Trial Execution scenario.
For the CTE and SAE reporting scenario, a data inventory based on the most common data elements
in clinical trials was established. Complete trial CRFs from the EFPIA partners were collected and
categorized by disease domain. The number of studies received per company was as follows:
Amgen: 1; AstraZeneca: 2; Bayer: 4; GSK: 7; J&J: 5; Merck: 0; Lilly: 0; Novartis: 5; Roche: 0;
Sanofi: 1. Table 3.7.1 shows the number of trial CRFs per disease domain.
Disease domain   Number of Trial CRFs
Oncology         3
Diabetes         3
Cardiovascular   4
Renal            1
Respiratory      10
Infections       1
Psychiatric      1
Ophthalmology    1
Neuroscience     1
Table 3.7.1 – Number of Trial CRFs per disease domain used for the fourth Data Inventory
A data standards catalogue from Amgen and Lilly containing the basic form domains in clinical trials
was also included, as was a frequency analysis of the most-used data elements performed by Bayer
and Novartis. Because the CRFs were received in multiple different formats, all forms and data
elements had to be mapped into a central database schema. As a second step, the form domains
were harmonized by each delivering EFPIA partner. During this step, all data elements were
normalized so that identical elements could be identified across all trials. Based on this
normalization, frequency analyses were performed. To identify the relevance of data elements, the
top 24 form domains were selected and, for each domain, a list of data elements sorted by
occurrence was created. Each list was sent to at least two EFPIA partners for review, who stated the
priority and category of each element: whether the element is relevant, and whether it is a study-
administrative or a clinical value. This information is necessary to identify the data elements most
likely to be found in an electronic health record system. During this process, naming errors were
corrected and SDTM variable labels were assigned where necessary and possible. The collation of
the data elements in all selected form domains was performed during a face-to-face meeting, in
which the naming and relevance of the data elements were again discussed and agreed.
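The normalization and frequency-ranking step can be sketched as follows; the element names and the normalization rule are invented simplifications of the actual harmonization:

```python
from collections import Counter

# Sketch: normalize data-element names across CRFs and rank them by
# occurrence across trials. Element names are invented for illustration.
crfs = [
    ["Date of Birth", "SEX", "Systolic BP"],
    ["date_of_birth", "Sex", "Heart Rate"],
    ["Sex", "Systolic BP"],
]

def normalize(name: str) -> str:
    # Simplified normalization: lower-case and unify separators so that
    # variant spellings count as the same element.
    return name.lower().replace("_", " ").strip()

counts = Counter(normalize(e) for crf in crfs for e in crf)
ranked = [element for element, _ in counts.most_common()]   # most frequent first
```

The real harmonization was done per form domain and reviewed by EFPIA partners; this sketch only illustrates why normalization must precede the frequency analysis.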
The Data Inventory consists of the following form domains, with the number of data elements in
brackets: Concomitant Medication (9), Demographics (4), Disease Characteristics (2), Disposition (2),
ECG Findings (9), Laboratory Data (6), Common Lab Data Analytes (57), Medical History (4), Patient
Reported Outcome / Questionnaires (3), Substance Use (8), Surgery (4), Tumor Response (6) and
Vital Signs (8). For the SAE reporting scenario, data elements of the Adverse Events domain were
included in this data inventory. The resulting data elements were compared with the previous PRS
data inventory, highlighting which elements were already present and which are novel. UMLS
terminology concept codes and a short description were assigned to each data element.
After the identification of the common data elements in clinical trials, the complete element list was
sent to the sites so that they could perform data exports from their local systems. As in the previous
scenarios, availability and completeness were assessed. Availability is determined by locating a data
element within the site's EHR database; completeness is the number of patients with a uniquely
documented value for that element divided by the total number of patients admitted in 2013.
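The completeness metric just defined can be written as a small computation; patient IDs and element names below are invented for illustration:

```python
# Sketch of the completeness metric: patients with at least one documented
# value for an element, divided by all patients admitted in the reference
# year (2013). Patient IDs and element names are invented.
admitted_2013 = {"p1", "p2", "p3", "p4"}
documented = {
    "Date of Birth": {"p1", "p2", "p3", "p4"},
    "HbA1c": {"p2", "p4"},
}

def completeness(element: str) -> float:
    with_value = documented.get(element, set()) & admitted_2013
    return len(with_value) / len(admitted_2013)
```

An element that is available but rarely documented (here "HbA1c", 50%) is thus distinguished from one documented for every admitted patient ("Date of Birth", 100%).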
In the extension of the PRS Data Inventory, a total of 133 data elements were identified: 83 novel
ones and 50 that already existed in the previous inventory.
The complete CTE Data Inventory, including the availability and completeness results for each data
element per site, can be found in Appendix 4.5. At this point, the results from Rennes, Glasgow and
Manchester are missing. Column D shows the average completeness across the sites (columns E-O);
completeness is given as a percentage. Missing results are coloured purple; available data elements
have a white background, and unavailable ones a black background.
4. Appendix
4.1. Scalability Evaluation
Outlined below are the three queries used in the scalability testing.
Query 1
gender() in {[SNOMED Clinical Terms:248152002,"Female"]} and
born() at least 60 year before now
Query 2
born() at least 18 year before now and
last procedure([SNOMED Clinical Terms:64915003,"Operation on heart"]) and
last vitalsign([SNOMED Clinical Terms:50373000,"Body height measure"]) in range(>=1.6) unit([ucum:m,"meter"]) and
last vitalsign([SNOMED Clinical Terms:27113001,"Body weight"]) in range(<=100.0) unit([ucum:kg,"kilogram"]) and
not last medication([ATC:A10BG02,"rosiglitazone"])
Query 3
born() at least 18 year before now and
last vitalsign([SNOMED Clinical Terms:271649006,"Systolic blood pressure"]) in range(>=140.0) unit([ucum:mm[Hg],"millimeter Mercury column"]) and
not last medication([ATC:A10BG02,"rosiglitazone"]) at most 5 year before now and
last procedure([SNOMED Clinical Terms:64915003,"Operation on heart"]) at most 9 year before now
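As an illustration only (not the actual EHR4CR query engine), the semantics of Query 1 (female and born at least 60 years before now) can be approximated in Python against invented patient records; the reference date is an assumption:

```python
from datetime import date

# Sketch: approximating Query 1 against invented patient records.
# 248152002 is the SNOMED CT code for "Female" used in the query above.
patients = [
    {"id": "p1", "gender": "248152002", "born": date(1950, 6, 1)},
    {"id": "p2", "gender": "248152002", "born": date(1980, 2, 1)},
]

def years_before(born: date, now: date) -> int:
    # Whole years elapsed between the birth date and the reference date.
    return (now.year - born.year) - ((now.month, now.day) < (born.month, born.day))

now = date(2014, 10, 8)   # assumed reference date for illustration
matches = [p["id"] for p in patients
           if p["gender"] == "248152002" and years_before(p["born"], now) >= 60]
```

Only the patient born in 1950 satisfies both criteria on the assumed reference date.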
4.2. Usability Evaluation
Appendix 4.2.1 - Information note for the testers
EHR4CR Usability Evaluation Information note.pdf
Appendix 4.2.2 - Manual + Test Script:
EHR4CR Usability Evaluation of the Query Builder v2.5.docx
4.3. PRS Evaluation
4.3.1 PRS testing template from last year's deliverable
PRS_testing_protocol_template.docx
4.3.2 PRS results document from AP-HP for the EUCLID study
EHR4CR Patient Identification and Recruitment (PIR)
Evaluation
Study name: EUCLID
Study Identifier: NCT:01732822
Site: APHP
Date: 08/10/2014
Authorized Site User: Dr Yannick GIRARDEAU
Signature:
Patient Counts                                 Traditional Recruitment    EHR4CR Platform
Unique Clinically Validated                    53                         2
Identical Clinically Validated (both methods)  0
Unique Clinically Validated Patient Counts indicates the total number of patients uniquely identified
with the specified method based on pseudonymized case numbers as determined by the treating
physician or authorized site user.
Identical Clinically Validated Patient Counts indicates the total number of identical patients
identified with two or three methods based on pseudonymized case numbers as determined by the
treating physician or authorized site user.
PRS Results document from AP-HP for the AstraZeneca EUCLID study
4.3.3 PRS results document from AP-HP for the GetGoal Duo-2 study
EHR4CR Patient Identification and Recruitment (PIR)
Evaluation
Study name: GetGoal Duo-2
Study Identifier: NCT:01768559
Site: APHP
Date: 15/01/2015
Authorized Site User: Dr Yannick GIRARDEAU
Signature:
Patient Counts                                 Traditional Recruitment    EHR4CR Platform
Unique Clinically Validated                    2                          4
Identical Clinically Validated (both methods)  0
Unique Clinically Validated Patient Counts indicates the total number of patients uniquely identified
with the specified method based on pseudonymized case numbers as determined by the treating
physician or authorized site user.
Identical Clinically Validated Patient Counts indicates the total number of identical patients
identified with two or three methods based on pseudonymized case numbers as determined by the
treating physician or authorized site user.
PRS Results document from AP-HP for the Sanofi GetGoal Duo2 study
4.4. PRS Data Inventory
Data Group Data Item
Demographics Admission date
Demographics Case Status
Demographics Date of Birth
Demographics discharge date
Demographics Gender
Diagnosis COPD exacerbation
Diagnosis Diagnosis Code
Diagnosis Diagnosis Date
Diagnosis Diagnosis text
Diagnosis Histologically confirmed diagnosis
Findings Blood pressure diastolic
Findings Blood pressure systolic
Findings Body Mass Index (BMI)
Findings Cardiac function
Findings CT
Findings Date/Time of Finding
Findings ECG
Findings Focal lesion
Findings Forced expiratory volume in 1 second / forced vital capacity (FEV1/FVC) ratio
Findings Gallium scan
Findings Height
Findings HR
Findings hypertension
Findings Infection
Findings Intraocular pressure
Findings Lytic bone lesion
Findings MRI
Findings Oxygen therapy
Findings Percent predicted forced expiratory volume in 1 second (FEV1)
Findings Percentage lesion area
Findings PET scan
Findings Pulse
Findings Spirometry
Findings Temperature
Findings Weight
Laboratory findings Albumin
Laboratory Findings albumin-adjusted serum calcium
Laboratory Findings Alkaline Phosphatase
Laboratory findings Amylase
Laboratory findings ANA titer
Laboratory Findings antibodies
Laboratory findings anti-cyclic citrullinated peptide antibodies (anti-CCP)
Laboratory Findings Beta HCG in serum
Laboratory Findings biPTH
Laboratory Findings Blood Urea Nitrogen [BUN]
Laboratory Findings BNP
Laboratory Findings Ca x P
Laboratory Findings Calcitonin
Laboratory Findings Calcium in serum
Laboratory Findings calculated creatinine clearance
Laboratory Findings Cardiac troponin T
Laboratory findings C-Reactive protein (hs-CRP)
Laboratory Findings Creatinine clearance
Laboratory Findings Creatinine in serum
Laboratory Findings CRP in serum
Laboratory Findings Direct Bilirubin in serum
Laboratory findings eGFR
Laboratory Findings Eosinophils Blood
Laboratory Findings Erythrocytes
Laboratory Findings Fasting C-peptide
Laboratory findings Fasting hypertriglyceridemia
Laboratory Findings Fasting plasma glucose (in serum)
Laboratory findings Ferritin
Laboratory findings Folate
Laboratory findings Gamma GT
Laboratory Findings Glomerular Filtration Rate
Laboratory Findings Glucose in serum
Laboratory Findings Haematocrit Blood
Laboratory Findings HbA1c
Laboratory Findings HDL in serum
laboratory findings hepatitis C virus (HCV)
Laboratory findings HER2 status
Laboratory findings KRAS mutation
Laboratory Findings LDL in serum
Laboratory findings Lipase
Laboratory findings Leukocytes
Laboratory Findings Lymphocytes Blood
Laboratory Findings Mantoux Test
Laboratory Findings Measles Antibody
Laboratory Findings Monoclonal light chain in the urine protein electrophoresis
Laboratory Findings Monoclonal plasma cells in the bone marrow
Laboratory Findings Monoclonal protein in Serum
Laboratory Findings Monoclonal protein in Urine
Laboratory Findings Neutrophils Blood
Laboratory findings NRAS mutation
Laboratory Findings NTproBNP
Laboratory Findings Platelet Count
Laboratory Findings Potassium in serum
Laboratory Findings Prolactin
Laboratory Findings PSA
Laboratory findings PT
Laboratory Findings PT (INR)
Laboratory Findings PTT Blood
Laboratory findings Rheumatoid Factor
Laboratory Findings sampling Date / Time of Laboratory Finding
Laboratory Findings Serum immunoglobulin free light chain
Laboratory Findings serum immunoglobulin kappa lambda free light chain ratio
Laboratory Findings Serum monoclonal paraprotein (M Protein)
Laboratory Findings serum pregnancy test
Laboratory Findings SGOT (AST) in serum
Laboratory Findings SGPT (ALT) in serum
Laboratory Findings Sodium in Serum
Laboratory Findings Thyroid-stimulating hormone (TSH)
Laboratory Findings total bilirubin
Laboratory Findings Total Bilirubin in serum
Laboratory Findings Total Cholesterol in serum
Laboratory Findings Total Protein in serum
Laboratory findings Total testosterone level
Laboratory findings Transferrin saturation
Laboratory Findings Triglycerides
Laboratory Findings Urine monoclonal light chain protein
Laboratory Findings Urine monoclonal paraprotein (M Protein)
Laboratory findings Urine protein to creatinine ratio
Laboratory findings Varicella Antibody
Laboratory findings Vitamin B12
Laboratory findings white blood cell count
Medical device type
Medical History Alcohol Abuse
Medical History Allergies and Hypersensitivity reactions
Medical History Currently breast feeding
Medical History Currently pregnant
Medical History Diet
Medical History Libido
Medical History menopausal status
Medical History pregnancy number
Medical History Smoking Status
Medical History Substance Abuse
Medication active substance
Medication Dosage
Medication Drug Class
Medication Drug Group
Medication Drug name
Medication Medication Code
Medication Medication end date
Medication Medication start date
Medication Route
Patient Characteristics
Day-night cycles
Procedure Procedure Code
Procedure Procedure Date
Procedure Procedure Text
Scores&Classification AJCC
Scores&Classification Best-corrected visual acuity (BCVA) Score
Scores&Classification CTCAE
Scores&Classification Expanded Disability Status Scale (EDSS) score
Scores&Classification IPI
Scores&Classification modified Rankin Score
Scores&Classification NCI-Common Terminology Criteria for Adverse Events
Scores&Classification SELENA-SLEDAI Score
Scores&Classification SLE
Scores&Classification WHO
4.5. CTE Data Inventory
_Top Data Items Export Evaluation CTE v1.0.xlsx
Nr  Data Element  Domain  Average completeness  APHP  FAU  KCL  MUW  U936  UNIVDUN  UOG  UoM  WWU  UCL (breast cancer)  HUG
22 Date Of Birth Demographics 88% 100% 100% A 100% 100% 100% 100% 100%
21 Sex Demographics 87% 100% 100% A 100% 100% 100% 100% 100%
26 Diagnosis Code Disease Characteristics 53% 33% 79% N/A 100% A 80% 100% 35%
25 Date of diagnosis Disease Characteristics 49% N/A 79% A 100% A 80% 100% 35%
117 Date Of Procedure Surgery 22% 13% 15% N/A A A 32% 100% 18%
116 Procedure Name Surgery 22% 13% 15% N/A A A 32% 100% 18%
85 Platelets Laboratory 15% 46% A A A A 25% N/A 50%
38 Result Laboratory Data 15% 68% A A A A 45% 6% N/A
39 Laboratory Test Laboratory Data 15% 68% A A A A 45% 6% N/A
77 Hematocrit Laboratory 14% 49% A N/A A A 36% N/A 31%
40 Original Result Unit Laboratory Data 14% 68% A A A A 41% 6% N/A
80 MCHC (Erythrocyte Mean Corpuscular Hemoglobin Concentration)Laboratory 14% 49% A N/A A A 34% N/A 32%
55 Creatinine Laboratory 14% 48% A A A A 24% N/A 43%
42 Reference Range Upper Limit (reported in Original Unit)Laboratory Data 14% 68% A N/A A A 45% N/A N/A
89 Red Blood Count Laboratory 14% 49% A N/A A A 36% N/A 28%
98 Urine Red Blood Cells Laboratory 14% 0% A N/A A A 100% N/A 11%
120 Date Of Assessment Tumor Response 14% 11% N/A N/A N/A N/A N/A 100% N/A
121 Lesion Location Tumor Response 14% 11% N/A N/A N/A N/A N/A 100% N/A
123 Lesion Description Tumor Response 14% 11% N/A N/A N/A N/A N/A 100% N/A
79 Lymphocytes Laboratory 14% 46% A A A A 33% N/A 31%
41 Reference Range Lower Limit (reported in Original Unit)Laboratory Data 14% 68% A N/A A A 41% N/A N/A
84 Neutrophils (total) Laboratory 14% 46% A A N/A A 36% N/A 27%
128 Body Weight Vital Signs 13% 26% 2% A N/A A 80% N/A N/A
129 Height Vital Signs 13% 26% N/A A N/A A 76% N/A N/A
133 Date Of Assessment Vital Signs 13% N/A 2% A N/A A 100% N/A N/A
101 Ongoing Medical History 13% N/A N/A A N/A N/A N/A 100% N/A
102 Reported Term Medical History 13% 0% N/A N/A N/A N/A N/A 100% N/A
122 Method Of Tumor Measurement Tumor Response 13% N/A N/A N/A N/A N/A N/A 100% N/A
124 New Lesion Description Tumor Response 13% N/A N/A N/A N/A N/A N/A 100% N/A
125 Measurement Of Target Lesion Diameter Tumor Response 13% N/A N/A N/A N/A N/A N/A 100% N/A
105 Questionnaire Name PRO 12% 93% N/A N/A N/A N/A 1% N/A N/A
106 Date / Time Of Assessment PRO 12% 93% N/A N/A N/A N/A 1% N/A N/A
107 Question Name PRO 12% 93% N/A N/A N/A N/A 1% N/A N/A
78 Hemoglobin Laboratory 12% 49% A A A A 13% N/A 31%
82 MCV (Erythrocyte Mean Corpuscular Volume)Laboratory 12% 49% A N/A A A 12% N/A 32%
87 PT,INR (International Normalized Ratio of Prothrombin Time)Laboratory 11% 36% A A A A 25% N/A 30%
63 Potassium Laboratory 11% 47% A A A A 3% N/A 41%
66 SGPT/ALT Laboratory 11% 32% A A A A 27% N/A 31%
86 PT (Prothrombin time) Laboratory 11% 36% A N/A A A 24% N/A 30%
74 Basophils Laboratory 11% 46% A N/A A A 11% N/A 31%
67 Sodium Laboratory 11% 47% A A A A 0% N/A 41%
64 Protein, total Laboratory 11% 47% A A A A 23% N/A 18%
83 Monocytes Laboratory 11% 46% A N/A A A 9% N/A 31%
65 SGOT/AST Laboratory 11% 32% A A A A 23% N/A 31%
81 mean corpuscular hemoglobin Laboratory 10% 49% A N/A A A 34% N/A N/A
131 Temperature Vital Signs 10% 63% N/A N/A N/A A 16% N/A N/A
57 Glucose, unspecified Laboratory 10% 40% A A A A 3% N/A 35%
12 Start Date Concomitant Medication 10% 33% 3% A N/A A 21% N/A 21%
19 Drug Name Concomitant Medication 10% 33% 3% N/A N/A A 21% N/A 20%
75 Eosinophils Laboratory 10% 46% A N/A A A 0% N/A 30%
48 Bilirubin, total Laboratory 9% 30% A A A A 22% N/A 24%
90 WBC Laboratory 9% 75% A A A A 0% N/A N/A
13 Route Of Administration Concomitant Medication 9% 33% N/A A N/A A 20% N/A 21%
14 Frequency Concomitant Medication 9% 33% N/A A N/A A 20% N/A 21%
18 Dose Unit Concomitant Medication 9% 33% N/A A N/A A 21% N/A 20%
17 Dose Per Administration Concomitant Medication 9% 33% N/A A N/A N/A 20% N/A 20%
47 Bilirubin, indirect Laboratory 9% 30% A N/A A N/A 18% N/A 24%
45 Alkaline phosphatase Laboratory 9% 47% A A A A 1% N/A 24%
51 Calcium Laboratory 9% 38% A A A A 10% N/A 21%
52 Chloride Laboratory 9% 47% A N/A N/A A 8% N/A 13%
126 Systolic Blood Pressure Vital Signs 9% 46% N/A A N/A A 22% N/A N/A
127 Diastolic Blood Pressure Vital Signs 9% 46% N/A A N/A A 22% N/A N/A
88 Partial Thromboplastin Time Laboratory 8% 1% A N/A A N/A 34% N/A 30%
43 Date / Time Sample Was Taken Laboratory Data 8% 48% A A A A 7% 6% N/A
54 Creatine Kinase (CK, CPK) Laboratory 7% 15% A A A A 32% N/A 11%
46 Bilirubin direct Laboratory 7% 30% A N/A A N/A 1% N/A 24%
44 Albumin Laboratory 7% 7% A A A A 16% 7% 24%
20 Total Daily Dose Concomitant Medication 7% 33% N/A A N/A N/A 21% N/A N/A
56 GGT Laboratory 6% 30% A A A A 20% N/A N/A
37 QTCF Interval ECG 6% N/A N/A N/A N/A A 45% N/A N/A
72 TSH Laboratory 5% 13% A A A A 15% N/A 15%
71 Troponin T Laboratory 5% 19% A N/A N/A A 23% N/A N/A
16 Stop Date Concomitant Medication 5% 33% 3% N/A N/A N/A 7% N/A N/A
76 Erythrocyte Sedimentation Rate Laboratory 5% 3% A N/A A N/A 35% N/A N/A
53 Cholesterol, total Laboratory 5% 13% A A A A 14% N/A 11%
92 Urine Glucose Laboratory 5% 1% A N/A A A 2% N/A 35%
58 Glycated Haemoglobin / Hemoglobin A1C Laboratory 4% 10% A A A A 19% N/A 6%
97 Urine Protein Laboratory 4% 15% A N/A A A 4% N/A 15%
49 Blood Urea Nitrogen Laboratory 4% 32% A N/A A N/A 0% N/A N/A
50 Brain Natriuretic Peptide Laboratory 4% 6% A N/A N/A N/A 20% N/A 5%
69 Triglycerides Laboratory 4% 13% A A A A 4% N/A 12%
62 Phosphorus, Inorg Laboratory 3% N/A A N/A A N/A 28% N/A N/A
130 Pulse Vital Signs 3% 5% N/A N/A N/A A 21% N/A N/A
73 Uric acid Laboratory 3% 15% A N/A A A 10% N/A 0%
91 Urine Bilirubin Laboratory 3% N/A A N/A A A 0% N/A 25%
95 Urine Nitrites Laboratory 3% N/A A N/A A A 8% N/A 15%
93 Urine Ketones Laboratory 2% N/A A N/A A A 3% N/A 17%
96 Urine pH Laboratory 2% N/A A N/A A A 3% N/A 15%
61 N-terminal probrain natriuretic peptide Laboratory 2% 6% A N/A A N/A 10% N/A N/A
100 Urine Urobilinogen Laboratory 2% 0% A N/A A A 0% N/A 15%
94 Urine Leucocytes Laboratory 2% N/A A N/A A A 0% N/A 15%
68 Total T4 Laboratory 1% 2% A N/A N/A N/A 8% N/A 1%
32 ECG Clinical Findings ECG 1% 4% N/A N/A N/A A 7% N/A N/A
60 Magnesium Laboratory 1% N/A A A A A 2% N/A 8%
108 What Is The History Of Smoking Use For This SubjectSubstance Use 1% 2% N/A A N/A N/A 7% A N/A
59 LDH Laboratory 1% 5% A N/A A A 4% N/A N/A
28 Disposition Start Date Disposition 1% N/A N/A N/A N/A A 8% N/A N/A
113 Alcohol Consumption Substance Use 1% 8% N/A A N/A A 0% N/A N/A
70 Troponin I Laboratory 1% 0% A N/A A A 0% N/A 7%
33 QRS Interval / Complex ECG 1% N/A N/A N/A N/A A 7% N/A N/A
35 ECG Heart Rate ECG 1% N/A N/A N/A N/A A 7% N/A N/A
36 QTCB Interval ECG 1% N/A N/A N/A N/A A 7% N/A N/A
30 Sinus Rhythm ECG 1% N/A N/A N/A N/A A 5% N/A N/A
110 Cigarettes Smoked Per Day Substance Use 0% 2% N/A N/A N/A A 1% N/A N/A
8 Date Of Death Adverse Events 0% 1% N/A A N/A A 0% A 1%
112 Number Of Pack Years Substance Use 0% 2% N/A N/A N/A A N/A N/A N/A
34 QT Interval ECG 0% N/A N/A N/A N/A A 1% N/A N/A
31 PR Interval ECG 0% N/A N/A N/A N/A A 1% N/A N/A
10 Time Of Death Adverse Events 0% N/A N/A A N/A N/A 0% N/A N/A
29 Electrocardiogram Date / Time ECG 0% N/A N/A N/A N/A A 0% N/A N/A
109 Last Smoked Substance Use 0% N/A N/A N/A N/A A 0% N/A N/A
1 Start Date / Time Adverse Events 0% N/A N/A A N/A A 0% N/A N/A
2 Outcome Adverse Events 0% N/A N/A A N/A A 0% N/A N/A
3 Verbatim Description Adverse Events 0% N/A N/A A N/A A 0% N/A N/A
4 End Date / Time Adverse Events 0% N/A N/A N/A N/A A 0% N/A N/A
5 Severity of Adverse Event Adverse Events 0% N/A N/A A N/A N/A 0% N/A N/A
6 Seriousness of Adverse Event Adverse Events 0% N/A N/A N/A N/A N/A 0% N/A N/A
7 Action(s) taken Adverse Events 0% N/A N/A N/A N/A N/A 0% N/A N/A
9 Cause Of Death Adverse Events 0% N/A N/A A N/A A 0% N/A N/A
11 In Case Of Death, Autopsy Report Adverse Events 0% N/A N/A N/A N/A N/A 0% N/A N/A
15 Reason Concomitant Medication 0% N/A N/A N/A N/A N/A N/A N/A N/A
23 Ethnicity Demographics 0% 0% N/A A N/A A N/A N/A N/A
24 Race Demographics 0% N/A N/A N/A N/A N/A N/A N/A N/A
27 Disposition Category Disposition 0% N/A N/A N/A N/A A 0% N/A N/A
99 Urine Specific Gravity Laboratory 0% N/A A N/A A N/A N/A N/A N/A
103 Event End Date Time Medical History 0% N/A N/A N/A A N/A N/A N/A N/A
104 Event Start Date Time Medical History 0% N/A N/A N/A A N/A N/A A N/A
111 Years Smoked Substance Use 0% N/A N/A N/A N/A A N/A N/A N/A
114 Substance Use Start Date Time Substance Use 0% N/A N/A N/A N/A A N/A N/A N/A
115 Substance Use End Date Time Substance Use 0% N/A N/A N/A N/A A N/A N/A N/A
118 Indication Surgery 0% N/A N/A N/A N/A N/A N/A N/A N/A
119 Planned Date Of Surgery Procedure Surgery 0% N/A N/A N/A N/A N/A N/A N/A N/A
132 Position of VS Measurement Vital Signs 0% N/A N/A N/A N/A N/A N/A N/A N/A