EHR4CR Consortium Page 1 of 50
Electronic Health Records for Clinical Research
Deliverable 7.4
Final report from the pilot site evaluations and the status of local interfaces
Version 2.0
Final
29/02/2016
Project acronym: EHR4CR
Project full title: Electronic Health Records for Clinical Research
Grant agreement no.: 115189
Budget: 16 million EURO
Start: 01.03.2011 - End: 28.02.2015 followed by 1 year extension
Website: www.ehr4cr.eu
The EHR4CR project is partially
funded by the IMI JU programme
Coordinator:
Managing Entity:
Document description
Deliverable no: 7.4
Deliverable title: Final report from the pilot site evaluations and the status of local interfaces
Status: Final
Version: 2.0 Date: 29/02/2016
Security: EHR4CR Consortium
Editors: Fleur Fritz, Benjamin Trinczek, Justin Doods, Iñaki Soto-Rey, Philipp Bruland, James
Cunningham, Mark McGilchrist, Christian Lovis, Colin McCowan, Hans-Ulrich
Prokosch, Scott Askin, Elena Bolanos, Helen Townsend, Eric Zapletal, Sebastian
Mate, Marc Cuggia, Bolaji Coker, Andreas Schmidt, Andy Sykes.
Document history
Date Revision Author Changes
22/10/2014 0.0 Fleur Fritz Initial draft
08/12/2014 0.1 Fleur Fritz, Benjamin Trinczek, Justin Doods, Iñaki Soto-Rey, Philipp Bruland, James Cunningham, Mark McGilchrist Description of work and results
19/12/2014 0.2 Christian Lovis, Colin McCowan, Hans-Ulrich Prokosch, Scott Askin, Elena Bolanos, Helen Townsend, Eric Zapletal, Sebastian Mate, Marc Cuggia, Bolaji Coker Site updates and changes
19/01/2015 0.3 Mark McGilchrist, James Cunningham, Inaki Soto Rey, Justin Doods, Philipp Bruland, Fleur Fritz Result updates
28/01/2015 1.0 Andreas Schmidt, Andy Sykes Comments and corrections
29/02/2016 2.0 Martin Dugas, Dipak Kalra Minor updates
Table of Contents
1. Executive Summary ......................................................................................................................... 5
2. Deliverable Description ................................................................................................................... 7
2.1. Task 7.1 Local Interfaces with EHR4CR platform ................................................................... 7
2.2. Task 7.2 Protocol Feasibility ................................................................................................... 7
2.3. Task 7.3 Patient Recruitment ................................................................................................. 7
2.4. Task 7.4 Clinical Trial Execution & Task 7.5 Serious Adverse Event Reporting ...................... 8
2.5. Output of the deliverable ...................................................................................................... 8
3. Organization of Work and Results ................................................................................................. 10
3.1. Local Interfaces – Site Readiness ......................................................................................... 10
3.1.1 Method ............................................................................................................................ 10
3.1.2 Data access and available contexts ................................................................................. 11
3.1.3 Authorisations required and obtained ............................................................................ 11
3.1.4 Precautionary measures taken by sites ........................................................................... 12
3.1.5 Technical approaches to accessing data.......................................................................... 12
3.1.6 Transforming data ........................................................................................................... 14
3.1.7 Coding and measurement units ...................................................................................... 15
3.1.8 Installation and configuration ......................................................................................... 17
3.2. Protocol Feasibility – Effectiveness & Efficiency Evaluation ................................................ 18
3.2.1 Effectiveness Results ....................................................................................................... 18
3.2.2 Efficiency Results ............................................................................................................. 19
3.2.3 Analysis of the results ...................................................................................................... 20
3.3. Protocol Feasibility – Scalability Evaluation ......................................................................... 20
3.4. Protocol Feasibility – Usability Evaluation ........................................................................... 22
3.4.1 Results.............................................................................................................................. 24
3.5. Patient Recruitment – Data Inventory ................................................................................. 30
3.6. Patient Recruitment – Evaluation ........................................................................................ 31
3.6.1 Task #1 (Installation and configuration of the PRS components at the sites) ................. 32
3.6.2 Task #2 (Approach and seeking confirmation of the respective Principal Investigator(s) (PI) at the participating sites for the respective trial(s) that have already been chosen in year 3) ............ 32
3.6.3 Tasks #3 (Adjustment of the evaluation protocol for each site participating in the
evaluation for the respective trial(s)) & #4 (Approach and requesting approval of the local ethics
committee/institutional review board at the participating sites for the respective trial(s)) ........ 34
3.6.4 Tasks #5 (Review of simplified eligibility criteria (EC) of the trials used for the
evaluation, and, if necessary, re-simplification of these EC) & #6 (Check availability of data items
correlating to the EC within the central terminology, and, if necessary, coordinate insertion of
missing items with Work package 4 team) ................................................................................... 35
3.6.5 Task #7 (Extraction, Transformation and Loading (ETL) of EHR data correlating to these
data items) ..................................................................................................................................... 36
3.6.6 Tasks #8 (Creation of (database) queries for the trials utilized in the evaluation within
the central workbench), #9 (Distribution of queries to the participating sites) & #10 (Execution
of queries, collection of necessary numbers and screening lists at the participating sites) ......... 37
3.6.7 Task #11 (Comparison between screening list from standard method with candidate list
from EHR4CR systems) .................................................................................................................. 39
3.7. Clinical Trial Execution & Serious Adverse Event Reporting – Data Inventory .................... 40
4. Appendix ........................................................................................................................................ 42
4.1. Scalability Evaluation ........................................................................................................... 42
4.2. Usability Evaluation .............................................................................................................. 42
4.3. PRS Evaluation...................................................................................................................... 43
4.3.1 PRS testing template from last year's deliverable .......................................... 43
4.3.2 PRS results document from AP-HP for the EUCLID study ................................................ 44
4.3.3 PRS results document from AP-HP for the GetGoal Duo-2 study ................................... 45
4.4. PRS Data Inventory .............................................................................................................. 46
4.5. CTE Data Inventory .............................................................................................................. 49
1. Executive Summary
This document describes the Work Package 7 (WP7) deliverables for the fourth year within the
Electronic Health Records for Clinical Research (EHR4CR) project. The EHR4CR project aims to create
a platform to reuse data from electronic health records for clinical research. More information can
be found at: http://www.ehr4cr.eu/.
The overall objective of WP7 is to demonstrate the functionality of the tools and services provided by
the platform (Work Packages 3-6) and to evaluate the EHR4CR platform in the areas of clinical study
design, execution and SAE reporting with a specific focus towards a set of mutually acceptable
medical domains agreed on by the demonstrator sites and EFPIA partners in accordance with Work
Package 1.
WP7 pilots the platform at 11 different data provider sites. The piloting is divided into three
scenarios: protocol feasibility, patient identification and recruitment, clinical trial execution including
serious adverse event reporting. The fourth year of the project concentrates on all three scenarios: protocol feasibility is to be evaluated, patient identification and recruitment is to be installed and also evaluated, and clinical trial execution is to be analysed. For the third (last) scenario, it was decided to adjust the project objectives to a prototype installation established by WPG2 and the CRF data element analyses by WP7.
The key results, challenges and proposed mitigation steps for year four are highlighted below and
described in detail in each section.
Key results:
Access to real data at the data provider sites, including mapping of local terminology to the central terminology
PFS evaluation results
PFS usability evaluation in conjunction with WP1
Platform scalability testing
Updated version of Data Inventory for PRS
Local Integration of Patient Recruitment Services (follow-up from year 3)
PRS evaluation results
Clinical trial execution (CTE) data inventory and validation
Challenges:
Due to several delays in the availability of the Protocol Feasibility (PFS) and Patient Recruitment (PRS) platforms, not all of the initially planned tests and evaluations could be performed. The PFS scenario was therefore tested only once, and the PRS could only be tested retrospectively.
Proposed mitigation:
Some tests have been performed only at selected sites, and both scenarios were tested with a slightly reduced scope. More tests were planned for year 5. However, the project experienced a significant and unexpected budget shortfall in the fifth year, which forced Work Package 7 to close down early and removed the opportunity to undertake any further evaluation work on the PRS or on the prototype implementations of CTE.
2. Deliverable Description
The fourth year deliverable D7.4 for WP7 is: “Final report from the pilot site evaluations and the
status of local interfaces”.
This deliverable focuses on activities related to the last scenario of CTE and serious adverse event
reporting. However, the first two scenarios are also still being worked on and are covered by Tasks 7.1 to 7.3. To achieve the deliverable, it was divided into tasks with specific activities that
are described below.
2.1. Task 7.1 Local Interfaces with EHR4CR platform
The year four activity for this task is: “Mapping of local data items to pivot representation (v2); local
adaptation of uniform access layer (v2); local integration of Clinical Trial Data Capture Services”.
This task aims at preparing the local data provider sites to install the necessary components of the
EHR4CR platform and make data available to be queried for the selected clinical trials within the
project. This not only involves identifying and preparing the data through Extract – Transform – Load
(ETL) processes but also obtaining approvals (e.g. data privacy, health authority) and making sure
that data privacy and security are respected.
The main outputs are therefore a collection of data elements called the data inventory and a checklist recording each site's readiness to run the EHR4CR platform against the specific data warehouses at the local sites.
2.2. Task 7.2 Protocol Feasibility
The year four activity for this task is: “Further improvement of the efficiency of the trial feasibility
demonstrators based on EFPIA evaluation (based on evaluation concept from Work Package 1)”.
This task focuses on the evaluation of the protocol feasibility component of the EHR4CR platform. For
the evaluation, in comparison with conventional methods, a specific test plan and evaluation
protocol was designed and carried out at selected data provider sites with a subset of real clinical
trials. The system was evaluated with respect to efficiency and accuracy. Furthermore, several kinds
of user acceptance tests with respect to scalability and usability have been carried out.
2.3. Task 7.3 Patient Recruitment
The year four activity for this task is: “The effect of EHR4CR platform on recruitment and enrolment
rates in hospitals will be analysed according to EFPIA criteria”.
This task focuses on the scenario “Patient Identification and Recruitment”, in which the central and
locally developed and installed services shall be used to identify potentially eligible patients. The
evaluation of these services is designed to compare the process and output of identifying potentially
eligible patients with the current methodology, as described and agreed upon in the evaluation
protocol (template) that has been written in the previous year together with EFPIA.
2.4. Task 7.4 Clinical Trial Execution & Task 7.5 Serious Adverse Event Reporting
The year four activities for those tasks are: “The ability of EHR4CR platform to query EHRs in order to
enrich CRFs with EHR data will be tested and the ability to transmit data from several hospitals to the
sponsor’s CDMS. EHR4CR will support the execution by the demonstrators of integration profiles
similar to IHE Retrieve Form for Data capture (RFD). The completeness, quality of data and the
timeliness achieved through the EHR4CR platform will be compared to current methods (using paper
CRF or eCRF without EHR integration)”
and:
“The ability of EHR4CR platform to query EHRs in order to enrich drug safety reporting forms with EHR
data will be tested as well as the ability to transmit data from several hospitals to the relevant
authorities. EHR4CR will support the execution by the pilots of integration profiles similar to IHE RFD.
The completeness, quality of data and the timeliness achieved through the EHR4CR platform will be
compared with current methods”.
Since no fully functional pilot for CTE was available, it was decided to redirect these tasks towards a comprehensive data inventory for the third scenario. The inventory focuses on the
determination of common data elements in clinical trials as well as their availability and
completeness within sites’ EHR systems. In addition, the serious adverse event reporting process was
removed from the list of deliverables since the SAE reporting from clinical trials is done by the
sponsor companies through the pharmacovigilance systems. However, the relevant data elements
were collected for the CTE data inventory.
2.5. Output of the deliverable
Based on the described tasks the output of the fourth year deliverable consists of:
Site Readiness
o Status of clinical data warehouses and end-point installations at the sites
o Mapping and ETL
Protocol Feasibility
o Efficiency and effectiveness evaluation
o Scalability Evaluation
o Usability evaluation
Patient Recruitment
o Data Inventory
o PRS evaluation protocol template for institutional review boards (IRB)
o Evaluation
Clinical Trial Execution & Serious Adverse Event Reporting
o Data Inventory
The participating pilot sites have the following abbreviations:
University College London UCL
Kings College London KCL
University of Dundee UNIVDUN
Université de Rennes 1 U936
Westfälische Wilhelms-Universität Münster WWU
Friedrich-Alexander-Universität Erlangen-Nürnberg FAU
Hôpitaux universitaires de Genève HUG
Assistance Publique Hôpitaux de Paris AP-HP
The University of Manchester UoM
Medical University of Warsaw – POLCRIN MUW
University of Glasgow UoG
3. Organization of Work and Results
In this main section the organization of work within the work package and the respective results
representing the outputs of the WP7 deliverable are described. More detailed information about
each deliverable part can be found in the Appendix 4 as well as in the referenced SharePoint
documents.
3.1. Local Interfaces – Site Readiness
Sites are required to perform a number of duties in regard to their membership of the EHR4CR
network. They must provide the necessary interfaces to their data (obtained from one of the
providers) and ensure that their data, systems and staff are ready to participate. This section – Local
interfaces and site readiness – discusses the nature of these interfaces at each site and the
requirements of the data and staff, and assesses whether existing sites are ready.
3.1.1 Method
The information in this section was obtained through one-to-one interviews with representatives from each site, conducted between 1st and 3rd December 2014. Ten of the eleven sites were successfully interviewed; one site (Rennes) was not, and no results will be presented for that site. Those responding included Münster (WWU), Erlangen (FAU), Glasgow (UoG), Geneva (HUG), Manchester (UoM), AP-HP (Paris), Warsaw (MUW), Dundee (UNIVDUN), Kings College London (KCL) and University College London (UCL). Table 3.1.1 lists those participating in the interviews.
Site Name(s)
WWU Justin Doods, Inaki Soto Rey, Benjamin Trinczek
FAU Sebastian Mate, Thomas Ganslandt
UoG Kevin Ross
HUG Dina Vishnyakova
UoM James Cunningham
AP-HP Eric Zapletal
MUW Cezary Szmigielski, Marcin Rozek, Slawomir Majewski
UNIVDUN Mark McGilchrist
KCL Bolaji Coker
UCL Dionisio Acosta
Table 3.1.1: Interview participants
Interviews lasted between 40 and 50 minutes and the following issues were discussed:
Access to data
Context in which data can be used
Authorisation required and obtained
Precautionary measures
ETL
Data quality and provenance
Coding and units
Installation and configuration
3.1.2 Data access and available contexts
In the EHR4CR project, data were partitioned into 7 distinct categories:
Demographics, Diagnosis, Procedures, Administration (prescribing), Laboratory, Findings, Pathology
Sites were asked whether they contributed these kinds of data (table 3.1.2) and which clinical
domains they thought the data could support for clinical trials (table 3.1.3).
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
Demo Y Y Y Y Y Y Y Y Y Y
Diagnosis Y Y Y Y Y Y Y Y Y(r) Y
Procedure Y Y Y Y Y Y Y(r) Y Y(r) Y
Admin/Rx Y Y(r) Y(r) Y Y Y X Y Y(r) N
Lab Y Y(r) Y Y Y Y Y Y(r) Y N
Findings Y(r) N(r) N N Y Y ? Y(r) Y(r) Y(r)
Pathology ? Y(r) N N ? Y ? N N Y(r)
Table 3.1.2: Availability of data types by site. Notes: R – restrictions
WWU2 FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
Diabetes Y Y Y (r) Y2 Y Y Y Y
CV Y Y Y Y2 Y Y Y Y
Oncology Y Y N Y2 N Y Y N Y(r)1
Respiratory Y Y N ? N Y Y N (r)
Inflammatory Y Y N Y2 N Y N N (r)
Neurology Y Y N Y2 N N Y N (r)
Renal Y Y N Y (r) N (r) Y
Table 3.1.3: Supported clinical domains. Y – yes, Y (r) – yes, but some restrictions, N – No, N (r) –
generally no, but some things may be possible. 1 – Breast Cancer (UCL). 2 – domain is subject to
authorisation.
3.1.3 Authorisations required and obtained
The data of section 3.1.2 can only be leveraged for EHR4CR with the appropriate authorisations in
place. EHR4CR sought general authorisation for PFS and specific authorisations for PRS. However,
many sites also imposed study-specific authorisations for PFS limiting their ability to contribute to the
overall platform. A summary of the authorisation status of the sites is given in table 3.1.4.
Authorisations were obtained from data protections officers, administrators and ethics committees
where appropriate.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
General - Y Y X Y X Y Y Y X
Study Y (10) - - Y (1) - Y (3) - - - Y (2?)
Expires Y (P) Y (P) Y (P) Y (S) Y (P) X ? Y (P) ? Y (P)
Table 3.1.4: Authorisation status by site. Notes: P – EHR4CR project, S – study, Y (#) – number of studies
3.1.4 Precautionary measures taken by sites
Sites took precautionary measures with the data to reduce its identifiability when accessed by the
platform. These measures included altering the patient date of birth by up to 3 months in a random
manner, and altering date and time of events (diagnoses, findings, etc.) by a random amount of up to
one year on a patient-by-patient basis. All events within a patient record were shifted by the same
time increment; event sequencing and inter-event durations were not altered, but patient age at
events may have been slightly altered. Absolute dates are, however, altered, and this can have implications for queries constructed at the central workbench.
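The per-patient date-shifting scheme described above can be sketched in a few lines. This is a minimal illustration under the stated rules (one fixed random offset per patient, reused for every event so that sequencing and inter-event durations survive); all function and variable names are invented and do not come from any site's actual implementation.

```python
import random
from datetime import date, timedelta

def build_shift_table(patient_ids, max_days=365, seed=None):
    """Assign each patient one random offset of up to a year; reusing it
    for every event preserves sequencing and inter-event durations."""
    rng = random.Random(seed)
    return {pid: timedelta(days=rng.randint(-max_days, max_days))
            for pid in patient_ids}

def shift_event_date(shift_table, patient_id, event_date):
    """Shift an absolute event date by the patient's fixed offset."""
    return event_date + shift_table[patient_id]

# All events within one patient record move by the same amount:
shifts = build_shift_table(["p1"], seed=42)
d1 = shift_event_date(shifts, "p1", date(2014, 1, 10))
d2 = shift_event_date(shifts, "p1", date(2014, 3, 10))
assert (d2 - d1).days == 59  # the 59-day gap between the events survives
```

The absolute dates in the output are deliberately wrong, which is exactly why queries involving calendar dates built at the central workbench need care.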
In terms of the ambitions of the PFS platform a general authorisation for domain data is desirable.
Four sites did not offer this level of flexibility and provided study-specific authorisations instead. The
latter hinder the objectives of the PFS use case where exploration is of prime importance and takes
place prior to any study-specific agreements being put in place.
All authorisations, with the exception of AP-HP's, will expire at the end of the project period on 28th February 2015. (Warsaw and KCL do not have an explicit position on the termination of authorisations, but their position is likely to be similar to that of most other sites.)
3.1.5 Technical approaches to accessing data
Each site must extract data from one or more systems and transform it structurally for incorporation
into the native EHR4CR clinical data warehouse (CDW), or an i2b2 CDW. One site, AP-HP, has a pre-
existing i2b2 warehouse for local hospital use and it is used as the target for EHR4CR requests.
Another site, Erlangen, also has an IBM Cognos-based hospital data warehouse as the source for data
extraction. Other sites must access one or more systems (many in some cases) to extract the
necessary data for the site project warehouse. The described topologies for data extraction are
shown schematically in figure 3.1.1.
Figure 3.1.1: Possible topologies when extracting data for hospital sites to the EHR4CR CDW
Extractions may be performed by either EHR4CR staff or local hospital or IT staff depending on the
relationship established between the two parties. The topologies and staff arrangements for data
extraction for each site are shown in table 3.1.5.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
Topology β α α β α β β α α α
EHR4CR staff Y Y - - - - - Y Y Y
Hospital staff - - Y Y Y Y Y - Y -
Table 3.1.5: Extraction topology and staff data access arrangements by site
Methods for extracting data include CSV files (obtained through SQL scripts), SQL database backups,
or through direct access with a tool such as Talend Open Studio. Extraction methods for each site are
given in table 3.1.6.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
csv - - Y Y Y - Y - -
backup - - - - - Y - - - Y
tool Talend IBM Cognos - - - Talend - Y (1) SQL SQL
Table 3.1.6: Extraction method by site. Note: 1 – proprietary tool, to be replaced by Talend at some point.
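For the CSV route, the site-side mechanics can be as small as spooling a query result to a file. The sketch below is illustrative only: the table, its columns, and the use of SQLite stand in for a site's real clinical source system and RDBMS driver.

```python
import csv
import sqlite3  # illustrative stand-in for the site's RDBMS driver

def extract_to_csv(conn, query, out_path):
    """Run an extraction query and spool the result set to a CSV file."""
    cur = conn.execute(query)
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cur.description])  # header row
        writer.writerows(cur)  # one CSV row per result row

# Hypothetical source table, purely for demonstration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lab_result (patient_id TEXT, loinc TEXT, value REAL)")
conn.execute("INSERT INTO lab_result VALUES ('p1', '2345-7', 5.4)")
extract_to_csv(conn, "SELECT * FROM lab_result", "lab_result.csv")
```

A full extract simply reruns the query; an incremental variant would add a predicate on a last-modified timestamp, matching the full/incremental modes summarised in table 3.1.7.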
Each site has options in regard to the mode of extraction - full or incremental – and the frequency of
extraction. These are summarised by site in table 3.1.7.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
Mode F F/I F F F F I F F F
Frequency od od Inf od od w m od od od
Table 3.1.7: Mode and frequency of data extraction by site. Notes: F – full, I – incremental, od – on demand, inf – infrequent, /2d – every two days, m – monthly, w - weekly
3.1.6 Transforming data
Data, once extracted, must be transformed to a new structure – either native EHR4CR or i2b2. (This
may also include the generation of new terminology codes where necessary.) However, as a rule, no
code mappings to the central terminology (CT) take place at this stage; the final warehouse always
uses site terminology coding, whether locally defined or standard.
Staff use proprietary or off-the-shelf products to achieve this structural change, e.g. Talend.
Proprietary methods include multiple SQL scripts or 3GL applications written in Java for example.
During this phase, filtering/transformation for data quality purposes may take place. In this process,
records may be discarded if they do not meet defined data quality standards, while other records
may have data values substituted by equivalent or more appropriate values. No data quality
standards have been defined for the EHR4CR CDW.
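Because no EHR4CR-wide data quality standards were defined, such filtering is necessarily site-specific. The sketch below shows only the general shape of a discard/substitute pass during transformation; both the rules and the field names are invented for illustration.

```python
def clean_records(records):
    """Apply illustrative site-defined quality rules during transformation:
    discard records that fail hard checks, substitute repairable values."""
    kept = []
    for rec in records:
        # Hard rule: a record without a patient identifier is unusable.
        if not rec.get("patient_id"):
            continue
        # Soft rule: normalise an obviously mis-scaled lab value
        # (e.g. haemoglobin reported in g/L rather than g/dL).
        if rec.get("unit") == "g/L":
            rec = {**rec, "value": rec["value"] / 10.0, "unit": "g/dL"}
        kept.append(rec)
    return kept

raw = [
    {"patient_id": "p1", "value": 132.0, "unit": "g/L"},
    {"patient_id": None, "value": 13.5, "unit": "g/dL"},  # discarded
]
assert clean_records(raw) == [{"patient_id": "p1", "value": 13.2, "unit": "g/dL"}]
```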
Particular care must be taken to ensure compliance with the Blue Model, i.e. those rules that the
end-point software uses when processing data.
In correcting, or discarding data, sites may establish the provenance of the data from hospital
warehouse IT staff, hospital systems IT staff, or clinical staff such as physicians or nurses.
Table 3.1.8 shows the extent to which sites impose quality control measures when processing the
data during ETL.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
DQ - I2b2 - Some Y Y (loss) - Y - -
Blue Model - I2b2 - - Y - - Y - -
Provenance - some - Some - - Some Y - -
Table 3.1.8: Imposition of data quality measures by site. Notes: i2b2 – some measures are imposed by the use of i2b2. Where quality measures are imposed no specific site details are provided as the individual site situations are complex.
3.1.7 Coding and measurement units
During the transformation phase a list of site terminology codes is generated, which is retained
within the native or i2b2 warehouses. These terminology codes, whether local, national or
international, must be mapped to the chosen central terminology (which itself consists of specific
national or international codes defined by WP4.) Mappings are created as new data becomes
available to a site or when the CT expands for new studies. EHR4CR has not provided tools to support
this process so far, but these are due in the near future. Therefore, at the moment, sites must choose
their own tools for this purpose. In many cases the process is highly manual (M), sometimes more
automated (A).
The output of this process must be a file that can be read by the Terminology Services (currently a Continuity of Care Document; previously CSV). Sites have had varying degrees of
success in establishing these mappings.
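As a concrete but simplified picture of the mapping output, the older CSV route might look as follows. The column names are invented for illustration, and the current Terminology Services format is a Continuity of Care Document rather than CSV.

```python
import csv

def write_mapping_csv(mappings, out_path):
    """Write local-to-central code mappings as CSV rows.
    The column layout is illustrative, not the Terminology Services schema."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["local_code", "local_system",
                         "central_code", "central_system"])
        writer.writerows(mappings)

# Example: a local laboratory code mapped (manually) to LOINC
write_mapping_csv(
    [("GLU-S", "LOCAL-LAB", "2345-7", "LOINC")],
    "lab_mappings.csv",
)
```

Whether a row like this is produced manually or automatically is exactly the distinction recorded per site in table 3.1.9.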
Mappings are performed in two possible contexts: 1) the central terminology, or 2) the eligibility
criteria for a PRS study, the latter perhaps requiring the expansion of the CT. Table 3.1.9 shows the
method of mapping employed by sites – manual or automatic – when mapping between various
coding systems. The central terminology uses the following coding systems (terminologies) and are
the target for all mapping processes:
Diagnosis (D) ICD10 (WHO)
Procedure (P) SCT (SNOMED-CT)
Administration (Rx) ATC (WHO)
Laboratory (L) LOINC
Demographics (M) SCT
Units (U) UCUM
Findings (F) SCT
Pathology (Y) PathLex
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL** UCL U936
ICD10 (D) Auto Auto (r) Auto Auto Auto Auto (r) Auto Auto
Local (D)
UMLS (D) Auto
SCT (P) Auto
CCAM (P) man
CHAP (P) man
OPCS4 (P) man man Semi
OPS (P) man man (r)
ICD9 (P) Auto(L)
Local (P)
ATC (Rx) Auto Auto Auto
BNF (Rx) man (r) Auto
Local (Rx) man man (r)
LOINC (L) Auto(r) Auto(r)
Local (L) man man (r) man (r) man (r) man man semi
SCT (M)
Local (M) man man man man man man man man
UMLS (M) Auto(r)
UCUM (U)
Local (U) man man man man man man man
SCT (F) Auto(r)
Read (F) ?
Local (F) man man man man man man man Auto
PathLex (Y)
UMLS (Y) Auto (r)
Table 3.1.9: Mapping methods employed by sites. For example, Dundee maps laboratory data semi-manually between local coding and LOINC as the Central Terminology. Erlangen already uses LOINC for its laboratory data and automatically maps. Notes: purple background indicates central terminology for the given data category (D, P, Rx, L, M, U, F and Y), Man – manual, r – restrictions apply, semi - tools have been used to assist manual operations, L – licence decision required before MUW will proceed to map. ** KCL has not yet employed any mapping techniques against their data.
The degree to which mapping operations are finished, and the final coverage obtained for the central terminology and the various PRS studies, has been reported by the sites and is given in Table 3.1.10. It should be noted that, as the CT expands, mapping operations will almost never be finished (<100%). The figures reported are from the last time the sites checked their mappings.
Study WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL3 U936
CT F 60% ? 60% ?5 75% ? ?
6 99% 0% ?
C 97% 52%4 40% ?
5 50% 60% ?
6 52%
1 0% ?
GGD2 F 100% 100%
C 95% 90%2
EUCLID F 100% 100%
C 93% 94%
Proselica F 100% 100%
C 100% 50%7
Bayer 15141 F 0%
C 0%
EINSTEIN F 100%
C 100%
KATHERINE F 100% 100%
C ?100% 100%
OCTAVE F ?0 ?100%
C ?0 ?60%
PASSAGE F ?100%
C ?100%
TSAT F PIx
C PIx
Table 3.1.10: Notes: CT – central terminology, GGD2 - GetGoal Duo-2, F – Finished, C - Coverage. Notes: 1 – no mappings for Pathology, some findings and EHR4CR specific elements. 2 – Medication daily dose, lab ranges LLN/UL, some specific drugs not available. 3 – UCL expected to do two studies, but these had not been finalised at time of writing, and it is expected they will require new terms in the central terminology. 4 – Excludes mappings where there are no data in the warehouse. 5 – PRS studies only. 6 – Awaiting confirmation of terminology licence arrangements. 7 – Awaiting an update to warehouse which will improve mapping completeness. PIx – PI did not wish to proceed. ? – no information available.
3.1.8 Installation and configuration
Sites must devote resources (money, hardware, software, staff time) to the installation and
configuration of the platform. The EHR4CR installation covers 5 platform components: PFS and PRS
end-points, terminology services, local workbench, audit facility, and a local warehouse, either
EHR4CR or i2b2, hosted by an RDBMS on dedicated hardware or a virtual machine on shared
hardware. Table 3.1.11 shows the current setup for each of the sites.
WWU FAU UoG HUG UoM AP-HP MUW UNIVDUN KCL UCL U936
Hardware - - 2 - - Y Y (db) Y (db) - -
VM 3 2 - 2 2 - 1 (ep) 4 Y 2
OS Linux Ubuntu Window7 Ubuntu Linux/WSvr Linux Centos Window7 Ubuntu+Win2008 Linux
RDBMS PG/my Oracle 11 MSSQL 2012 Oracle MSSQL Oracle 11 MSSQL MSSQL Oracle Express 11g Postgres
CDW v1.3.7 I2b2 v1.3.7 I2b2 v1.3.8 I2b2 v1.3.8 v1.3.6 V1.3.7 V1.3.6
Java JDK 1.7 JDK 1.6/7 JDK 1.7 ? JDK 1.7 JDK 1.7 ? JDK 1.7 JDK 1.7 JDK 1.7
PFS EP 3.1.1 3.1.1 James Y Y Y (r) Y Y Modified Y
TS Y Y Y Y Y Y Y Y Y Y
Audit Y Y X X SQL logs X SQL logs X N SQL logs
PFS status OK OK OK OK OK (Prod) OK (r) OK OK OK OK (r)
PRS EP Y Y (i) - Y (i) Y (i) Y - Y (i) - -
LWB Y Y (i) - Y (i) Y (i) Y - Y (i) - -
PRS status OK OK - OK OK OK - OK? - -
Table 3.1.11: Current installation and configuration by site. Notes: # – number of machines, PG – Postgres, my – MySQL, i – v3.1.1 installer, r – restrictions apply, db – database, ep – endpoint.
Installation was a variable experience for sites and the following points were noted in the interviews:
WWU Installation now stable, but not well tested. Need to bring manual up to date.
FAU Documentation is good, but there are many things to know related to the specifics of the environment. Communications are still a problem. End-point code and Blue model appear resilient.
UoG Not an easy process. Tests appear ok, but failures do not appear to be reported back to the central workbench properly.
HUG Documentation is excellent. Local tests using the central workbench appear to be ok.
UoM Production environment is mostly ok. Central workbench interface could be improved.
AP-HP PFS and PRS endpoints now stable. PFS is using local (site) validation for validating access to real data.
MUW Better documentation is needed. Only succeeded after help from other sites. Local checks using central workbench seem ok.
UNIVDUN PFS endpoint seems ok. PRS with installer (v3.1.1) appears ok. Testing between PFS and PRS suggests there are issues to be corrected.
UCL PFS installation is stable. Some test queries did not report answers. PRS installer not tested yet as only recently made available.
It should be noted that sites performed their installations at different points in the software
development cycle.
3.2. Protocol Feasibility – Effectiveness & Efficiency Evaluation
The evaluation of the feasibility scenario focuses on demonstrating the improvement in accuracy (effectiveness) and the time saved (efficiency) when using the EHR4CR platform instead of the current manual, questionnaire-based protocol feasibility (PF) process. The demonstration compares two simulations of the patient cohort identification phase of ten clinical trials (the same ten trials that were used to build the EHR4CR PF scenario): one using the EHR4CR platform and one following the current methodology. These numbers were compared against a gold standard based on a manual check of patient records. Because checking all patient records was impractical, 100 patient record sets were randomly extracted and the results extrapolated using the Wilson score confidence interval at the 95% confidence level.
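As an illustration of the extrapolation step, the sketch below computes a Wilson score interval for the proportion of eligible patients in a 100-record sample and scales it to the clinic population. The function names are our own, and the sketch assumes a simple binomial sampling model rather than the project's actual analysis script.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (95% level when z=1.96)."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - half, centre + half)


def extrapolate(successes: int, sample: int, population: int) -> tuple[int, int, int]:
    """Scale the sampled proportion and its Wilson bounds to the clinic population."""
    lo, hi = wilson_interval(successes, sample)
    mean = round(successes / sample * population)
    return (mean, round(lo * population), round(hi * population))
```

For example, 3 eligible patients found in 100 sampled records from a clinic of 1411 patients would extrapolate to a mean of 42 with asymmetric lower and upper bounds, the same form as the intervals reported in table 3.2.1.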
3.2.1 Effectiveness Results
Overall results at UKM

Study | EFPIA partner | Current process | EHR4CR PF system | Gold standard mean [LB – HB] | Patients per clinic
NCT00439725 | Bayer | 30 | 0 | 42 [14 – 120] | 1411
NCT00345839 | Amgen | 200 | 0 | 74 [13 – 406] | 7439
NCT00627640 | Merck | 300 | 0 | 112 [30 – 393] | 5607
NCT00894387 | Novartis | 25 | 0 | 75 [29 – 186] | 1885
NCT00638690 | Janssen | 50 | 0 | 31 [8 – 108] | 1540
NCT00626548 | AstraZeneca | 200 | 174 | 216 [131 – 341] | 1540
NCT00715624 | Sanofi | 12 | 18 | 0 [0 – 96] | 2575
NCT01018173 | Roche | 340 | 566 | 876 [655 – 1126] | 2575
NCT01468987 | Eli Lilly | 110 | 8 | 257 [142 – 449] | 2575
NCT00490139 | GSK | 10 | 0 | 22 [8 – 54] | 546

Table 3.2.1: Overview of the results obtained by the current and EHR4CR-supported processes for each of the trials evaluated at UKM, with comparison against the gold standard.
Overall results at AP-HP

Study | EFPIA partner | Current process | EHR4CR PF system | Gold standard mean [LB – HB] | Patients per clinic
NCT00439725 | Bayer | 20 | 205 | 494 [288 – 816] | 4116
NCT00638690 | Janssen | 25 | 5 | 86 [15 – 470] | 8626
NCT00626548 | AstraZeneca | 250 | 695 | 1035 [603 – 1709] | 8626

Table 3.2.2: Overview of the results obtained by the current and EHR4CR-supported processes for each of the trials evaluated at AP-HP, with comparison against the gold standard.
3.2.2 Efficiency Results
The creation, execution and visualization of a query using the EHR4CR PF system required between 5 and 25 minutes, depending on the complexity of the query (5 minutes for a query with 3 criteria, 25 minutes for one with 26 criteria).
The time required to receive a response from the PIs varied depending on whether the questionnaire was sent by the EFPIA representative or directly by the site. In the former case, the response was received within seven calendar days, whereas in the latter it took between 30 and 90 days. This difference may be explained by a lack of commercial interest among the PIs answering questionnaires sent by the sites, leaving the first measurement (seven days) as the only valid estimate.
3.2.3 Analysis of the results
The results of the effectiveness and efficiency evaluation of the EHR4CR PFS show that the protocol design process can be significantly enhanced by the EHR4CR system, which provides patient counts across a large number of sites within a short period of time. The evaluation also demonstrates that these counts strongly depend on the availability of structured electronic health data at the EHR4CR data provider sites: only when structured data is stored in the EHR4CR site data warehouses are the patient counts generated by the EHR4CR PF system accurate (they are normally zero otherwise).
3.3. Protocol Feasibility – Scalability Evaluation
The EHR4CR PFS system is intended to work on large-scale patient databases, with the number of
records potentially in the hundreds of millions. For this reason a scalability evaluation of the PFS
system was undertaken, where the ability of the system to respond in reasonable time to requests
against large databases was tested. This section outlines the approach that was taken for these tests.
The approach to testing the system’s ability to handle large-scale data was to distribute a series of datasets, based on the record sets of 200, 2,000, 20,000 and 200,000 patients respectively, to a selection of pilot sites whose IT environments reflected the range of environments supported by EHR4CR. A series of queries was run against these datasets and the time taken for the system to respond was recorded. The hypothesis, based on analysis of the ‘Blue Model’ [Bache, R., Miles, S., Taweel, A., 2013. An adaptable architecture for patient cohort identification from diverse data sources. J Am Med Inform Assoc. 2013 Dec;20(e2):e327-33] algorithms, was that the time taken to process queries should scale linearly with the size of the dataset. The recorded times were analysed to test this hypothesis.
Given the stringent requirements on maintaining the privacy and confidentiality of electronic patient records across the European Union and within the regulatory domains of the EHR4CR pilot sites, testing the system against real patient data is often infeasible. Furthermore, formal testing of the system’s ability to handle large volumes of data required the same data to be hosted at each of the participating pilot sites. An approach to providing test data was therefore required.
Given the availability within EHR4CR of 200 manually anonymised records of diabetes patients (the “Dundee 200”), it was decided to base the test data on these. The alternative approach of producing purely random data was rejected because such data fails to represent realistic data, and tests run against it might not account for issues in processing queries that stem from features only present in real or realistic data. A tool was therefore produced which took source data as input and scaled it up to a required size (e.g. taking in the records of 200 patients as input and producing as output the records of up to 200,000 patients), whilst altering the particular details of each source record to avoid uniformity in the scaled data.
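A minimal sketch of such a scale-up tool is shown below. The record fields (patient_id, birth_year, hba1c) and the perturbation ranges are illustrative assumptions of ours, not the tool actually used in the project.

```python
import copy
import random


def scale_up(source: list[dict], target_size: int, seed: int = 42) -> list[dict]:
    """Replicate anonymised source patient records up to target_size,
    perturbing each copy so the scaled warehouse does not contain
    identical rows (avoiding uniformity in the synthetic data)."""
    rng = random.Random(seed)
    scaled = []
    for i in range(target_size):
        rec = copy.deepcopy(source[i % len(source)])
        rec["patient_id"] = f"SYN{i:06d}"            # fresh synthetic identifier
        rec["birth_year"] += rng.randint(-2, 2)      # shift age slightly
        rec["hba1c"] = round(rec["hba1c"] * rng.uniform(0.95, 1.05), 2)
        scaled.append(rec)
    return scaled
```

Seeding the generator keeps the synthetic warehouse reproducible, which matters when the same dataset must be hosted at every participating pilot site.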
Three queries were authored on the PFS platform and distributed to participating pilot sites. These
queries were designed to represent a basic query, a more complex inclusion/exclusion query and a
query incorporating temporal constraints. Appendix 4.3 contains the specification of the queries
used. These queries were distributed to sites using the following environments:
Operating system | Hosting environment | RAM | Database system | Warehouse
Windows 7 | Physical machine | 8 GB | MS SQL Server 2012 | Native
Windows Server 2012 | VM | 4 GB | MS SQL Server 2012 | Native
Windows 7 Enterprise | Physical machine | 4 GB | PostgreSQL 9.3 | Native
Linux (CentOS 6.6) | VM | 48 GB | Oracle 11g | i2b2
The tests measured a baseline time for each query performed against the 200-patient dataset. Times were then recorded in the endpoint log files for each subsequent dataset size. Given the variation in absolute timings between the systems, results were normalised to a factor increase over the baseline measure. Theoretical analysis indicated that the endpoint algorithm should run in O(n) (linear) time in the size of the database being queried. The expected result was therefore that each 10x increase in database size would correspond to a 10x increase in response time for the query.
The table below shows the average normalised increase in time across the systems as the multiple of the time for the previous database size (e.g. 200 to 2,000, 2,000 to 20,000 and 20,000 to 200,000).

Normalised increase in time | 200–2,000 patients | 2,000–20,000 patients | 20,000–200,000 patients
Query 1 | 8.3 | 4.2 | 3.0
Query 2 | 3.0 | 9.6 | 9.0
Query 3 | 3.0 | 9.0 | 12.03
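The normalisation step can be sketched as follows; the timing values are reconstructed from the Query 1 row above for illustration, and the function name is our own.

```python
def normalised_increases(times: dict[int, float]) -> list[float]:
    """Factor increase in response time between successive dataset sizes.

    `times` maps dataset size (number of patients) to the normalised
    response time for that size (the baseline size has time 1.0)."""
    sizes = sorted(times)
    return [round(times[big] / times[small], 1)
            for small, big in zip(sizes, sizes[1:])]


# Query 1: normalised times rebuilt from the per-step factors reported above.
query1 = {200: 1.0, 2_000: 8.3, 20_000: 8.3 * 4.2, 200_000: 8.3 * 4.2 * 3.0}
factors = normalised_increases(query1)  # one factor per 10x size step
```

Under the O(n) hypothesis each factor would be close to 10; the observed values scatter around that, which the report summarises as roughly linear scaling.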
The test against the PostgreSQL database failed for the larger database sizes. Analysis showed that this stemmed from a configuration issue with the software, which caused memory problems independent of the endpoint software itself. At the time of publication of this report the issue was being investigated and fixed.
The results indicate that the time taken to process queries increases roughly linearly with the number of records in the system, and thus that the PFS platform offers an acceptable level of performance with respect to response times.
3.4. Protocol Feasibility – Usability Evaluation
Two main objectives were agreed upon for the EHR4CR Usability evaluation:
a) To evaluate the user satisfaction of the EHR4CR query builder (QB)
b) To assess if the amount of provided training is sufficient/adequate
A document with information about the test was created (appendix 4.4.1) so that EFPIA partners could contact colleagues who had no previous knowledge of the EHR4CR PF system and were interested in testing the QB. After a date for the test was confirmed, user accounts were created within the system and delivered to the testers. Prior to testing, they also received a document (appendix 4.4.2) containing a manual for the platform and the description of the three tasks they would need to complete. The three tasks differed in complexity: a very simple first task to become familiar with the QB, a second task based on a realistic feasibility query, and a third task with a complex query that exercised all the functionality of the QB. The documentation given to the testers was complemented by a video showing how to use the QB and a link to an electronic survey with questions about the tasks, the training, the QB, and demographics. In the electronic survey, users were asked to upload screenshots of the results of query creation and execution after completing each task. These were used to evaluate the users’ success rate in creating the three queries that formed the three tasks (see table 3.4.a).
Criterion | Query 1 | Query 2 | Query 3
Inclusion – Gender | Female | – | –
Inclusion – Age | >50 years | >18 years | >18 years
Inclusion – Diagnosis | – | Non-insulin-dependent diabetes mellitus | Heart failure (at most 3 years before the query)
Inclusion – Lab values | – | Body Mass Index 25<X<40 | Left ventricular ejection fraction <40% (at most 14 months before the query)
Inclusion – Lab values | – | Haemoglobin A1c >7.5% | Systolic blood pressure >2.0 mmHg (2 times, separated by at least 2 months)
Exclusion – Diagnosis | – | Acute myocardial infarction | Cardiomyopathy in the puerperium (between 0 and 5 months before the heart failure)
Exclusion – Diagnosis | – | – | Acute myocardial infarction (between 5 and 30 months before now)
Exclusion – Diagnosis | – | – | Percutaneous coronary intervention (between 5 and 30 months before now)
Exclusion – Diagnosis | – | – | Operation on heart (between 5 and 30 months before now)
Exclusion – Treatment | – | Using insulin and analogues | Vasodilators used in cardiac diseases
Exclusion – Treatment | – | – | Phosphodiesterase inhibitors
Exclusion – Treatment | – | – | Cardiac stimulants
Table 3.4.a – Inclusion and exclusion criteria of the queries corresponding to the three tasks that comprise the user satisfaction test; temporal constraints are given in parentheses.
To measure user satisfaction, we used a standardized usability questionnaire, the System Usability Scale (SUS), to which we added questions about the testers’ demographics and computing skills.
To assess the adequacy of the provided training, we compared the screenshots provided by the users with the correct queries built by one of the EHR4CR experts. In addition, we analysed the responses to the questionnaire section on training satisfaction and suggestions.
3.4.1 Results
A total of 16 participants completed the questionnaire for the user satisfaction evaluation. Their demographic data can be seen in the following table (see table 3.4.1a).
Variable | Number of testers | Answers
Current job group: feasibility manager | 7 | 43.75%
Current job group: data manager | 1 | 6.25%
Current job group: trial manager | 2 | 12.50%
Current job group: other (e.g. head of clinical operations, enrolment specialist, clinical operations portfolio manager) | 6 | 37.50%
Work experience (years)* | 16 | 3.01 (1.680)
Gender: male | 7 | 43.75%
Gender: female | 8 | 50.00%
Gender: (no answer) | 1 | 6.25%
Age (years)* | 14 | 43.57 (5.827)
Age: (no answer) | 2 |
Native language: English | 12 | 75.00%
Native language: German/Swiss German | 2 | 12.50%
Native language: Polish | 1 | 6.25%
Native language: (no answer) | 1 | 6.25%
Difficulties regarding English: never, English is my native language | 11 | 68.75%
Difficulties regarding English: never, English is not my native language | 2 | 12.50%
Difficulties regarding English: rarely | 2 | 12.50%
Difficulties regarding English: (no answer) | 1 | 6.25%
Usage of similar systems in the past: no | 13 | 81.25%
Usage of similar systems in the past: yes | 3 | 18.75%
Experience with feasibility studies: little experience | 3 | 18.75%
Experience with feasibility studies: some experience | 4 | 25.00%
Experience with feasibility studies: much experience | 9 | 56.25%
Computer skills: average | 2 | 12.50%
Computer skills: good | 8 | 50.00%
Computer skills: excellent | 6 | 37.50%
Knowledge of Boolean algebra: none | 3 | 18.75%
Knowledge of Boolean algebra: little | 4 | 25.00%
Knowledge of Boolean algebra: average | 3 | 18.75%
Knowledge of Boolean algebra: good | 5 | 31.25%
Knowledge of Boolean algebra: excellent | 1 | 6.25%
Table 3.4.1a – Summarized number and row percentage per category of the participant demographics; * = for “work experience” and “age”, mean and standard deviation were calculated; n=16 participants.
After each of the three tasks, the testers answered questions about the task (see table 3.4.1b). The table shows that the first and second tasks were rated well, but the third was rated as noticeably harder and less satisfactory.
Item | Task 1 mean (SD) | Task 2 mean (SD) | Task 3 mean (SD) | T1–T2 p | T1–T3 p | T2–T3 p
Task difficulty | 3.94 (0.929) | 3.75 (1.000) | 2.63 (1.088) | 0.582 | 0.006* | 0.005*
Satisfaction with the ease of completing the task | 3.81 (0.911) | 4.06 (0.574) | 2.75 (0.856) | 0.271 | 0.007* | 0.001*
Satisfaction with the amount of time it took to complete the task | 3.88 (0.885) | 3.75 (0.856) | 2.94 (0.680) | 0.755 | 0.017* | 0.005*
Satisfaction with the functionality provided | 3.69 (0.873) | 3.81 (0.655) | 2.88 (0.957) | 0.557 | 0.010* | 0.002*
Table 3.4.1b – Mean ratings of task difficulty and satisfaction (5-point rating scale), standard deviations and p-values of the Wilcoxon test; * = significant at the p < 0.05 level; n=16 participants.
Alongside the opinions about the tasks, a free-text question asked testers to report errors found during the creation and execution of the queries, or to suggest possible modifications to the query builder (see table 3.4.1c).
Task | Missing functionality reported | User | Expert review
Task 1 | criterion of >49 years was selected but appears as >=49 years | user01 | not important; mistake in specification of query, not tool
Task 1 | no ability to execute results for all countries, only for UK / no response when clicking on all countries | user19 | medium importance; probably a problem with available sites, not with the tool
Task 1 | the query in eclectic format didn’t show up and look similar to the screenshot in the training manual | user06 | low importance; feature only used for testing, will probably be removed for ‘real world’ version of tool
Task 2 | more & less than selections appear transformed into more/less than OR EQUAL to | user01 | not important; mistake in specification of query, not tool
Task 3 | the sequence of building the query is not clear; the system seems to require it in reverse (i.e. the parameters of time to be entered before the diagnosis) | user18, user19 | important; comment on usability though unspecific
Task 3 | entering exclusion criteria (e.g. 3.3.4 EC02) is cumbersome | user06 | important; comment on usability though unspecific
Task 3 | no visible option how to add a range of 5–30 months, the range always began at 0 months | user01 | medium importance; option is there when user selects ‘between’ rather than ‘more than’ or ‘less than’; user interface issue or poor documentation
Task 3 | no way to clear just one component from the query, “clear” clears all components / to change a particular part of the inclusion or exclusion criteria you have to delete the whole; it would be better to delete parts | user06, user21 | medium/low importance; true, but the individual inclusion sections are never hugely complex, so deleting all is not too bad
Task 3 | the “before now” button didn’t work several times | user19 | important; this was not seen reproduced though
Task 3 | the run function and eclectic format were not possible; the computer crashed when running the query or generating the eclectic format / a “does not compute” message appeared when trying to generate the eclectic format | user21, user01 | important; such a ‘crash’ was not reproduced elsewhere
Task 3 | system feedback that the query had been saved, but it doesn’t appear to have been | user01 | important; true in terms of lack of feedback, but the query is always saved
Task 3 | if you want to check a specific value of a criterion (e.g. whether left ventricular ejection fraction was correctly entered and you want to check it later) you are not able to see it by clicking on the symbols | user21 | important; this information appears at the bottom of the screen and not where the user would originally look, and may need scrolling; user interface issue
Table 3.4.1c – Responses to the open-ended question “What function or feature do you miss for this task?”, with expert review of the usability issues; n=16 participants.
After completion of the three tasks, overall satisfaction was assessed through the SUS questionnaire. The results show a total score of 55.86, which means that the application is acceptable but does not reach the desired “good” level of satisfaction. We suspect that the difficulty of the third task had a negative impact on user satisfaction and may have influenced the responses to the SUS.
SUS item | N (valid) | Mean | SD
I think that I would like to use the Query Builder frequently. | 16 | 3.63 | 0.885
I [did not find] the Query Builder unnecessarily complex.* | 16 | 3.06 | 0.929
I thought the Query Builder was easy to use. | 16 | 3.38 | 0.719
I think that I [would not] need assistance to be able to use the Query Builder.* | 16 | 2.94 | 0.998
I found the various functions in the Query Builder were well integrated. | 15 | 3.07 | 0.961
I [did not think] there was too much inconsistency in the Query Builder.* | 15 | 3.33 | 1.047
I would imagine that most people would learn to use the Query Builder very quickly. | 16 | 3.25 | 1.000
I [did not find] the Query Builder very cumbersome to use.* | 16 | 3.19 | 0.834
I felt very confident using the Query Builder. | 16 | 3.06 | 0.854
I [did not need] to learn a lot of things before I could get going with the Query Builder.* | 16 | 3.00 | 1.033
Overall SUS score | 15 | 55.86 | 15.37
Table 3.4.1d – Mean ratings (5-point scale from 1 “strongly disagree” to 5 “strongly agree”), standard deviations, and overall SUS score. Items marked with an asterisk (*) were reverse coded; n=16 participants.
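For reference, the standard SUS scoring rule that turns the ten 1–5 item responses into a 0–100 score can be sketched as follows. This is a generic illustration of the published SUS procedure, not the project’s own analysis script.

```python
def sus_score(responses: list[int]) -> float:
    """Standard SUS scoring: ten items rated 1-5. Odd-numbered items
    contribute (rating - 1), even-numbered (reverse-worded) items
    contribute (5 - rating); the sum is scaled to 0-100 by * 2.5."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("expected ten ratings on a 1-5 scale")
    total = sum(r - 1 if i % 2 == 0 else 5 - r
                for i, r in enumerate(responses))
    return total * 2.5
```

A neutral response of 3 on every item yields 50.0, so the observed mean of 55.86 sits only slightly above neutrality.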
The responses regarding the quality of the training show a good degree of satisfaction among the testers (see table 3.4.1e). Some, however, expressed the wish for a personal trainer who could provide information or answer questions on-site. Some testers also reported that the training manual contained insufficient information to complete task 3.
Item | N (valid) | Mean | SD
The topics covered by the training were relevant for the tasks. | 15 | 4.20 | 0.414
The time allotted for the training was sufficient. | 13 | 3.54 | 0.877
The content of the training was well organized and easy to follow. | 15 | 3.93 | 0.458
The materials distributed were helpful. | 15 | 4.13 | 0.352
The speed of the training video was appropriate. | 14 | 3.64 | 0.842
The amount of information was sufficient for solving the tasks. | 15 | 3.27 | 1.033
This training experience will be useful in my work. | 15 | 3.40 | 0.737
Overall, I am satisfied with the training. | 15 | 3.53 | 0.640
Table 3.4.1e – Mean ratings of the quality of the training (5-point scale from 1 “strongly disagree” to 5 “strongly agree”) and standard deviations; n=16 participants.
The analysis of the screenshots of the queries built and the results of their execution (see table 3.4.1f) shows that ten out of thirteen testers successfully completed tasks 1 and 2, whereas only four replicated this result for task 3. A likely reason is that query 2 is based on a realistic feasibility query, while query 3 was designed purely to exercise the full functionality of the query builder and contains many temporal constraints with which the testers were not familiar.
User ID Task 1 Task 2 Task 3
User 1 S S F
User 2 P P
User 3 S S F
User 4 F S P
User 5 S F
User 6 S F
User 7 S S S
User 8 S S P
User 9 S S F
User 10 S S S
User 11 S S S
User 12 S P P
User 13 S S S
User 14 F F
Table 3.4.1f – S=Success: the task was completed entirely successfully. P=Partial success: the user made no more than one severe mistake (wrong use of the eligibility criteria) or no more than two minor mistakes (wrong use of the temporal constraints). F=Fail: the user made more than one severe mistake or more than two minor mistakes.
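The grading scheme above can be encoded directly. The sketch below is our own reading of the rule, assuming per-task counts of severe and minor mistakes as input:

```python
def classify_task(severe: int, minor: int) -> str:
    """Grade a completed task by mistake counts, following the S/P/F
    scheme of table 3.4.1f: S = no mistakes at all; P = no more than
    one severe mistake (wrong eligibility criteria) and no more than
    two minor mistakes (wrong temporal constraints); F = anything worse."""
    if severe == 0 and minor == 0:
        return "S"
    if severe <= 1 and minor <= 2:
        return "P"
    return "F"
```

For example, a user who misplaced one temporal constraint but was otherwise correct would be graded P, while two wrongly placed eligibility criteria would be graded F.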
3.5. Patient Recruitment – Data Inventory
The creation of the PRS data inventory was described in last year’s deliverable D7.3. Since then the inventory has been refined, and Unified Medical Language System (UMLS) codes were added to each element.
When identifying codes for the elements it was discovered that some described the same concept (e.g. ‘platelets blood’ and ‘platelets count’). Redundant elements were removed, so the revised data inventory now contains 150 data elements. The number of elements per data group is shown in table 3.5.1. The whole data inventory is provided in appendix 4.1.
Data group | Total | Example
Demographics | 5 | Gender
Diagnosis | 5 | Code
Findings | 25 | Systolic blood pressure
Laboratory findings | 81 | HbA1c
Medical device | 1 | Type
Medical history | 10 | Smoking status
Medication | 9 | Route
Patient characteristics | 1 | Day-night cycles
Procedure | 3 | Procedure date/time
Scores & classification | 10 | Expanded Disability Status Scale (EDSS) score
Table 3.5.1 – Number of data elements per data group in the revised data inventory.
3.6. Patient Recruitment – Evaluation
The basic evaluation design for this scenario has been developed and described in year 3. To perform
the evaluation, a set of several tasks has to be conducted by different project partners. Some of the
tasks can be worked on in parallel, while others depend on the outcome of their predecessors. Since
the participating sites have slightly different settings and are in different states of the evaluation, this
report will list the current situation for the detailed tasks:
1. Installation and configuration of the PRS components at the sites;
2. Approach and seeking confirmation of the respective Principal Investigator(s) (PI) at the
participating sites for the respective trial(s) that have already been chosen in year 3;
3. Adjustment of the evaluation protocol for each site participating in the evaluation for the
respective trial(s);
4. Approach and requesting approval of the local ethics committee/institutional review board
at the participating sites for the respective trial(s);
5. Review of simplified eligibility criteria (EC) of the trials used for the evaluation, and, if
necessary, re-simplification of these EC;
6. Check availability of data items correlating to the EC within the central terminology, and, if
necessary, coordinate insertion of missing items with Work package 4 team;
7. Extraction, Transformation and Loading (ETL) of EHR data correlating to these data items;
8. Creation of (database) queries for the trials utilized in the evaluation within the central
workbench;
9. Distribution of queries to the participating sites;
10. Execution of queries, collection of necessary numbers and screening lists at the participating
sites;
11. Comparison between the screening list from the standard method and the candidate list from the EHR4CR system.
3.6.1 Task #1 (Installation and configuration of the PRS components at the sites)
The evaluation by WP7 was delayed because WPG2 was unable to keep to the planned timescales for the roll-out of the PRS platform, owing to difficulties in the development of the software. When it was eventually deployed, there were difficulties installing it at all of the sites.
This task is a crucial prerequisite for the whole evaluation: without successfully installed and configured PRS components, no patients can be found as described in the specification.
At Dundee the PRS components have been installed successfully and run. However, the platform does not report counts consistent with the PFS, and there are usability issues (query runtimes). Until these are resolved, Dundee will not be able to perform comparisons with actual study recruitment. The issue was reported in the project-internal ticketing system JIRA.
3.6.2 Task #2 (Approach and seeking confirmation of the respective Principal Investigator(s) (PI) at the participating sites for the respective trial(s) that have already been chosen in year 3)
While trying to install and configure the PRS components over several iterations, the sites were asked to approach the PIs of the respective trials, explain the evaluation design, and ask about their willingness to cooperate. PIs willing to participate would provide their local screening list, screen patients identified by the EHR4CR system, and provide an overview of screened, eligible and additionally identified patients as described in the evaluation protocol. The template was part of last year’s deliverable and has been added to appendix 4.3.1 for reference.
The status of approaching the local PIs is as follows:
Site | Status
University College London (UCL) | No appropriate study to utilize for this evaluation could be found. However, UCL is pro-actively seeking to contribute to this evaluation by identifying trials at the clinic (inter-institutional, multi-drug, etc.) for retrospective evaluation. PI approval obtained, subject to ethics approvals being obtained.
Kings College London (KCL) | PI approval obtained for study “Bayer 15141”
University of Dundee (UNIVDUN) | PI approval obtained for studies “AZ EUCLID” and “Sanofi GetGoal Duo-2”
Université de Rennes 1 (U936) | PI approval obtained for study “Roche_WA25204_09092013”. Will be done as a simulation as the study did not start.
Westfälische Wilhelms-Universität Münster (WWU) | PI approval obtained for studies “NVS OCTAVE”, “PASSAGE” and “Roche KATHERINE”; PI approvals denied for studies “Amgen MM Bone Study” and “NCT01816295”
Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) | PI approval obtained for studies “Bayer EINSTEIN Junior”, “Roche KATHERINE” and “Sanofi EFC11785 (Proselica)”
Hôpitaux universitaires de Genève (HUG) | PI contacted and approval obtained, but the NVS OCTAVE trial has been cancelled at HUG.
Assistance Publique Hôpitaux de Paris (AP-HP) | PI approval obtained for studies “AZ EUCLID” and “Sanofi GetGoal Duo-2”
The University of Manchester (UoM) | Does not participate in any of the trials identified for the evaluation
Medical University of Warsaw – POLCRIN (MUW) | Does not participate in any of the trials identified for the evaluation
University of Glasgow (UoG) | PI approval obtained for study “Sanofi EFC11785 (Proselica)”
Table 3.6.1 – PI participation status overview
No appropriate study to utilize for this evaluation could be found for two of the sites (UoM and MUW); these sites are therefore not listed in the following evaluation steps.
PI approvals could not be obtained for some of the studies at some sites (e.g. “Amgen MM Bone Study” at WWU). Whenever a missing approval affected a trial conducted at only that site, the trial was dropped from the evaluation. In most cases, however, we were able to gain approval from the local PIs, which indicates that local clinicians are willing to test new methods for identifying potentially eligible patients and to cooperate in the evaluation of such methods.
FAU will not proceed with the Sanofi PROSELICA study because the required patient population is treated not at Erlangen University Hospital but at a collaborating hospital with a separate IT infrastructure and only limited usage of the Erlangen University Hospital EHR system.
At HUG, the local PI agreed to participate in the PRS evaluation; however, since the sponsor pulled the study at this site, the evaluation will not take place.
UoG is one of the sites of the “Sanofi EFC11785 (Proselica)” study; however, recruitment at this site finished in 2013, leaving only the option of a retrospective evaluation at UoG.
3.6.3 Tasks #3 (Adjustment of the evaluation protocol for each site participating in the evaluation for the respective trial(s)) & #4 (Approach and requesting approval of the local ethics committee/institutional review board at the participating sites for the respective trial(s))
Sites that successfully gained approval from their local PIs adjusted the evaluation protocol (template) according to the evaluated trials and approached their local ethics committee/institutional review board. The ethics committees’ responses were as follows:
Site | Status
University College London (UCL) | A project-wide approval is being sought, and no obstacles are anticipated.
Kings College London (KCL) | The study has ethical approval and we have been given Caldicott approval to do the evaluation.
University of Dundee (UNIVDUN) | The gold standard test will not be performed at this site, so no ethics approval was sought.
Université de Rennes 1 (U936) | The study did not actually start in Rennes but will be simulated.
Westfälische Wilhelms-Universität Münster (WWU) | Ethics committee approved the evaluation with studies “NVS OCTAVE”, “PASSAGE” and “Roche KATHERINE”
Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) | Ethics committee approved the evaluation with studies “Roche KATHERINE”, “Bayer EINSTEIN JUNIOR” and “Sanofi EFC11785 (PROSELICA)”
Hôpitaux universitaires de Genève (HUG) | Ethics committee approved, but the trial has been pulled at HUG.
Assistance Publique Hôpitaux de Paris (AP-HP) | Ethics committee approved the evaluation with studies “AZ EUCLID” and “Sanofi GetGoal Duo-2”
University of Glasgow (UoG) | Local Privacy Advisory Committee (the group responsible for evaluations without patient contact) approved the evaluation with study “Sanofi EFC11785 (PROSELICA)”
Table 3.6.2 – Ethics status overview
In summary, whenever a local PI was willing to cooperate in the EHR4CR PRS evaluation, the local ethics committee approved the conduct of the evaluation as well.
3.6.4 Tasks #5 (Review of simplified eligibility criteria (EC) of the trials used for the evaluation, and, if necessary, re-simplification of these EC) & #6 (Check availability of data items correlating to the EC within the central terminology, and, if necessary, coordinate insertion of missing items with Work package 4 team)
The EC of the trials utilized for this evaluation had already been simplified. However, the
simplification took place at the beginning of the project, with a focus on protocol feasibility rather
than patient identification and recruitment. Hence, the result of the simplification was more a list of
necessary EC than EC suitable for screening patients. The team also gained experience throughout
the project. We therefore reviewed the simplified EC of the utilized trials with a focus on patient
identification and recruitment and re-simplified the EC where necessary.
Based on the simplified criteria, the central terminology was checked to ensure that all data items
necessary to build proper queries are available in it. This task is of special importance for trials
conducted at multiple sites, because the sites must use the very same query and thus agree on the
usage and interpretation of data items. The following table lists the status of the central terminology
check for each of the utilized studies:
Study (participating sites): Central terminology check status
AZ EUCLID (AP-HP, UNIVDUN): Central terminology checked by AP-HP; UNIVDUN is still checking the EC and the mappings to the central terminology.
Bayer EINSTEIN Junior (FAU): Simplification was optimized; terminology checked (no additions necessary).
Bayer 15141 (KCL): Terminology check ongoing.
NVS OCTAVE (WWU, HUG): Terminology not checked yet.
NVS CFTY720D2406 (WWU): Terminology not checked yet.
Roche KATHERINE (FAU, WWU): Simplification was optimized; terminology checked, and missing terms were added to the terminology by WP4.
Sanofi GetGoal Duo-2 (AP-HP, UNIVDUN): EC and mappings to the central terminology are being checked. AP-HP analysed the EC and provided the analysis to UNIVDUN; drug dosage could not be used in the eligibility criteria.
Sanofi EFC11785 (Proselica) (FAU, UoG): Simplification was optimized; terminology checked, and missing items reported to WP4.
UCL-internal study (UCL): Not started yet, because no local trial has been identified.
Rennes-internal study (U936): Rennes will not perform the evaluation with the EHR4CR system but with its local recruitment system, and therefore did not review the criteria against the central terminology.
Table 3.6.3 – Status of reviewing and aligning study EC with codes available in the central terminology
For some of the studies it was not checked whether their EC have corresponding codes in the central
terminology. The main reason is that the people who check the availability of the data items are the
same people involved in installing the PRS systems and performing the ETL of patient data, and they
could not take on the additional terminology check. This task revealed that the EC and data items in
the central terminology must be reviewed for patient identification and recruitment, because some
items could not be found. The results of this task have already led, and will continue to lead, to an
improvement of the central terminology, making it suitable for both PFS and PRS EC.
3.6.5 Task #7 (Extraction, Transformation and Loading (ETL) of EHR data correlating to these data items)
Having identified the data items necessary to query for potentially eligible patients, the sites had to
check and, if necessary, perform the Extraction, Transformation and Loading (ETL), at least for the
studies evaluated at their sites. The following table shows the current status of the sites' ETL.
Site: ETL status
University College London (UCL): The PRS scenario at UCL uses the same warehouse as PFS, so no special ETL for the PRS studies is necessary. ETL will be performed on demand once local studies have been selected; additional mappings might be required.
Kings College London (KCL): ETL tools are currently being tested on the live database for data extraction for "Bayer 15141".
University of Dundee (UNIVDUN): The PRS scenario at UNIVDUN uses the same warehouse as PFS, so no special ETL for the PRS studies is necessary.
Université de Rennes 1 (U936): The local ethics committee did not approve connecting any source of real patient data to the EHR4CR infrastructure, and no sufficient security warranty was provided by EHR4CR to convince the IT department. The local recruitment system will therefore be used.
Westfälische Wilhelms-Universität Münster (WWU): ETL done for "Roche KATHERINE" and "Novartis PASSAGE"; ETL started for "Novartis OCTAVE".
Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU): ETL is in progress for "Roche KATHERINE" and has not yet started for "Bayer EINSTEIN JUNIOR" and "Sanofi EFC11785 (Proselica)".
Hôpitaux universitaires de Genève (HUG): ETL is done; data for the period 06/2012 to 07/2013 were extracted.
Assistance Publique Hôpitaux de Paris (AP-HP): ETL checked for "AZ EUCLID" and "Sanofi GetGoal Duo-2"; no adjustment necessary.
University of Glasgow (UoG): ETL is currently ongoing.
Table 3.6.4 – Status of ETL
AP-HP used the hospital clinical data warehouse, where the items were already stored. The ETL
process was not changed, but an additional mapping was used to take the new items into account.
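The kind of additional mapping layer described here can be sketched as a simple lookup from local warehouse item codes to central-terminology concepts; this is an illustrative assumption about the mechanism, and all codes below are invented except the ATC code B01AF01 (rivaroxaban):

```python
from typing import Optional

# Sketch (assumed mechanism, not the actual AP-HP implementation): map local
# warehouse item codes to central-terminology concepts. Codes are invented
# except ATC B01AF01 (rivaroxaban).
local_to_central = {
    "LAB:GLU_S": "central:glucose_serum",   # hypothetical local lab code
    "MED:RIVA": "ATC:B01AF01",              # hypothetical local drug code
}

def map_item(local_code: str) -> Optional[str]:
    # Unmapped items stay out of the ETL and are reported (e.g. to WP4).
    return local_to_central.get(local_code)
```

Items that return no mapping are exactly the "missing items" the sites reported to WP4 during the terminology check.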
3.6.6 Tasks #8 (Creation of (database) queries for the trials utilized in the evaluation within the central workbench), #9 (Distribution of queries to the participating sites) & #10 (Execution of queries, collection of necessary numbers and screening lists at the participating sites)
AP-HP chose the AstraZeneca EUCLID and Sanofi GetGoal Duo-2 clinical trials to evaluate the
EHR4CR recruitment scenario. The whole workflow of the PRS scenario was tested during the
AP-HP local evaluation:
- The central terminology items used by the two studies were mapped to the AP-HP Clinical Data Warehouse items.
- The two corresponding queries were written in the central workbench.
- For the Sanofi GetGoal Duo-2 study, as daily drug dosage was not available in the query workbench, only the presence of the drugs was used. For AstraZeneca EUCLID, as some logical operations were not available in the central workbench, the exclusion criterion {"ICD-10 code I49.5 (Sick Sinus Syndrome)" AND NOT "Pacemaker"} was transformed into the inclusion criterion {"No ICD-10 code I49.5 (Sick Sinus Syndrome)" OR "Pacemaker"}.
- The queries in the central workbench were submitted to the AP-HP local workbench and validated by the AP-HP Data Relation Manager. A PI user was assigned to each study.
- The PI locally launched the two queries in order to get a list of patients for each study.
- The eligibility status of each patient of the EUCLID study was checked. For the Sanofi GetGoal Duo-2 study, this process is not yet finished.
- Throughout the AP-HP recruitment evaluation, the screening dashboard was available in the central workbench.
AP-HP was able to create a query in the central workbench that covers a selected set of eligibility
criteria, send it to the local PRS component(s) of their site, execute it at least once with the local PRS
components and gather a list of potentially eligible patients.
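The exclusion-to-inclusion transformation used for the EUCLID query is an application of De Morgan's law; a minimal sketch with boolean patient attributes (attribute names invented for illustration):

```python
# Sketch: De Morgan transformation of an exclusion criterion into an
# equivalent inclusion criterion, as done for the AZ EUCLID query.
def excluded(has_i49_5: bool, has_pacemaker: bool) -> bool:
    # Original exclusion criterion: "ICD-10 I49.5" AND NOT "Pacemaker"
    return has_i49_5 and not has_pacemaker

def included(has_i49_5: bool, has_pacemaker: bool) -> bool:
    # Equivalent inclusion criterion: "No ICD-10 I49.5" OR "Pacemaker"
    return (not has_i49_5) or has_pacemaker
```

A patient is eligible exactly when the exclusion criterion does not hold, so the two formulations agree for every combination of attribute values.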
WWU and FAU were able to create the query for the KATHERINE study in the central workbench and
send it to their local components. For WWU, the execution returned 0 patients.
For FAU, the electronic query returned 13 possibly eligible patients for the Roche KATHERINE study.
At that time, the screening log of this study contained 10 enrolled patients, 5 of whom were also
identified by the electronic query.
A closer examination still has to determine whether the other 8 suggested patients are truly eligible
for the study or whether they are false positives. However, the 5 correctly identified patients
demonstrate the usefulness of electronic patient recruitment based on routine care data, particularly
for studies with a low recruitment rate such as KATHERINE.
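The comparison between the electronic candidate list and the manual screening log is a set-intersection exercise; a minimal sketch with invented pseudonymized IDs chosen to reproduce the FAU KATHERINE counts (13 candidates, 10 enrolled, 5 in both):

```python
# Sketch: set-based comparison of the electronic candidate list with the
# screening log. The pseudonymized IDs are invented for illustration.
electronic = {f"P{i:02d}" for i in range(1, 14)}        # 13 query candidates
screening_log = {"P01", "P02", "P03", "P04", "P05",     # 5 also found electronically
                 "P90", "P91", "P92", "P93", "P94"}     # 5 found only manually

overlap = electronic & screening_log          # enrolled patients the query also found
recall = len(overlap) / len(screening_log)    # share of enrolled patients found
```

With these counts, the query recovers half of the enrolled patients; deciding whether the remaining candidates are false positives requires the manual record check described in Task #11.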
Due to current mapping problems with the EHR4CR platform, FAU executed the query on its i2b2
platform. FAU is confident that the query can also be executed successfully on the EHR4CR platform
once these mapping problems are sorted out. FAU actually expects a lower false-positive rate on the
EHR4CR platform, because no temporal constraints were imposed in the i2b2 query (probably
returning more "false" patients).
WWU created the query for the PASSAGE study at the central workbench but was not able to send it
to their local components due to ongoing technical issues with the platform.
Other sites and studies were not able to complete these tasks, because prerequisite steps were not
finished or encountered problems. The main reason is that the software, especially the local PRS
components, had to be installed, tested and reported on in several iterations, which led to massive
delays. The additional tasks (e.g. the check of the central terminology) were delayed as well. Since a
successful evaluation depends on both the availability of the PRS components and the completion of
these additional tasks, most sites were not able to achieve both at the same time. However, all
participating sites are still working on both topics and trying to conduct the evaluation.
3.6.7 Task #11 (Comparison between screening list from standard method with candidate list from EHR4CR systems)
No site was able to conduct the whole evaluation, because no site was yet ready to perform the
manual check of patient records. Partial results have been collected and are described below.
AP-HP completed the evaluation of the AstraZeneca EUCLID study in the context of the EHR4CR
PRS scenario on 8 October 2014 and of the Sanofi GetGoal Duo-2 study on 15 January 2015. The
comparison between the traditional recruitment process and the EHR4CR recruitment process is
shown in Table 3.6.7.1 for EUCLID and in Table 3.6.7.2 for GetGoal Duo-2. The official results
documents from AP-HP can be found in Appendix 4.3.
Patient Counts                                 Traditional Recruitment    EHR4CR Platform
Unique Clinically Validated                    53                         2
Identical Clinically Validated (both methods)  0
Table 3.6.7.1 - AP-HP PRS results for the AstraZeneca study (EUCLID)
Patient Counts                                 Traditional Recruitment    EHR4CR Platform
Unique Clinically Validated                    2                          4
Identical Clinically Validated (both methods)  0
Table 3.6.7.2 - AP-HP PRS results for the Sanofi study (GetGoal Duo2)
For WWU, the KATHERINE query identified 0 patients on the EHR4CR platform. When contacted for
further investigation, the PI explained that the study had been cancelled at the site because no
patients had tumour tissue left after surgery, which was a requirement for inclusion.
3.7. Clinical Trial Execution & Serious Adverse Event Reporting – Data Inventory
Due to delays in platform availability and the subsequent PFS and PRS evaluations, the execution of
this task was shifted to the fourth year. Another reason was the decision, after discussion, to combine
the two tasks into a single one by including SAE reporting in the Clinical Trial Execution scenario.
For the CTE and SAE reporting scenario, a data inventory based on the most common data elements
in clinical trials was established. Complete trial CRFs from the EFPIA partners were collected and
categorized by disease domain. The number of studies received per company was as follows:
Amgen: 1; AstraZeneca: 2; Bayer: 4; GSK: 7; J&J: 5; Merck: 0; Lilly: 0; Novartis: 5; Roche: 0;
Sanofi: 1. Table 3.7.1 shows the number of trial CRFs per disease domain.
Disease domain   Number of Trial CRFs
Oncology         3
Diabetes         3
Cardiovascular   4
Renal            1
Respiratory      10
Infections       1
Psychiatric      1
Ophthalmology    1
Neuroscience     1
Table 3.7.1 – Number of Trial CRFs per disease domain used for the fourth Data Inventory
A data standards catalogue from Amgen and Lilly containing the basic form domains in clinical trials
was also included, as was a frequency analysis of the most-used data elements performed by Bayer
and Novartis. Because the CRFs were received in multiple different formats, all forms and data
elements had to be mapped into a central database schema. As a second step, the form domains
were harmonized by each delivering EFPIA partner. During this step, all data elements were
normalized so that identical elements could be identified across all trials. Based on this
normalization, frequency analyses were performed. To identify the relevance of data elements, the
top 24 form domains were selected and, for each domain, a list of data elements sorted by
occurrence was created. Each list was sent to at least two EFPIA partners for review, who stated the
priority and category of each element: whether the element is relevant, and whether it is a study-
administrative or a clinical value. This information is necessary to identify the data elements most
likely to be found in an electronic health record system. During this process, naming errors were
corrected and SDTM variable labels were assigned where necessary and possible. The collation of
the data elements in all selected form domains was performed during a face-to-face meeting, in
which the naming and relevance of the data elements were again discussed and agreed.
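The normalization and frequency-ranking step can be sketched as follows; the element names and the normalization rule are invented simplifications of the actual harmonization:

```python
from collections import Counter

# Sketch: normalize data-element names across CRFs and rank them by
# occurrence across trials. Element names are invented for illustration.
crfs = [
    ["Date of Birth", "SEX", "Systolic BP"],
    ["date_of_birth", "Sex", "Heart Rate"],
    ["Sex", "Systolic BP"],
]

def normalize(name: str) -> str:
    # Simplified normalization: lower-case and unify separators so that
    # variant spellings count as the same element.
    return name.lower().replace("_", " ").strip()

counts = Counter(normalize(e) for crf in crfs for e in crf)
ranked = [element for element, _ in counts.most_common()]   # most frequent first
```

The real harmonization was done per form domain and reviewed by EFPIA partners; this sketch only illustrates why normalization must precede the frequency analysis.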
The Data Inventory consists of the following form domains, with the number of data elements in
brackets: Concomitant Medication (9), Demographics (4), Disease Characteristics (2), Disposition (2),
ECG Findings (9), Laboratory Data (6), Common Lab Data Analytes (57), Medical History (4), Patient
Reported Outcome / Questionnaires (3), Substance Use (8), Surgery (4), Tumor Response (6) and
Vital Signs (8). For the SAE reporting scenario, data elements of the Adverse Events domain were
included in this data inventory. The resulting data elements were compared with the previous PRS
data inventory, highlighting which elements were already present and which are novel. UMLS
terminology concept codes and a short description were assigned to each data element.
After the identification of the common data elements in clinical trials, the complete element list was
sent to the sites so that they could perform data exports from their local systems. As in the previous
scenarios, availability and completeness were assessed. Availability is determined by locating a data
element within the site's EHR database; completeness is the number of patients with a uniquely
documented value for that element divided by the total number of patients admitted in 2013.
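The completeness metric just defined can be written as a small computation; patient IDs and element names below are invented for illustration:

```python
# Sketch of the completeness metric: patients with at least one documented
# value for an element, divided by all patients admitted in the reference
# year (2013). Patient IDs and element names are invented.
admitted_2013 = {"p1", "p2", "p3", "p4"}
documented = {
    "Date of Birth": {"p1", "p2", "p3", "p4"},
    "HbA1c": {"p2", "p4"},
}

def completeness(element: str) -> float:
    with_value = documented.get(element, set()) & admitted_2013
    return len(with_value) / len(admitted_2013)
```

An element that is available but rarely documented (here "HbA1c", 50%) is thus distinguished from one documented for every admitted patient ("Date of Birth", 100%).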
In the extension of the PRS Data Inventory, a total of 133 data elements were identified: 83 novel
ones and 50 that already existed in the previous inventory.
The complete CTE Data Inventory, including the availability and completeness results for each data
element per site, can be found in Appendix 4.5. At this point, the results from Rennes, Glasgow and
Manchester are missing. Column D shows the average completeness across the sites (columns E-O);
completeness is given as a percentage. Missing results are coloured purple; available data elements
have a white background, and unavailable ones a black background.
4. Appendix
4.1. Scalability Evaluation
Outlined below are the three queries used in the scalability testing.
Query 1
gender() in {[SNOMED Clinical Terms:248152002,"Female"]} and
born() at least 60 year before now
Query 2
born() at least 18 year before now and
last procedure([SNOMED Clinical Terms:64915003,"Operation on heart"]) and
last vitalsign([SNOMED Clinical Terms:50373000,"Body height measure"]) in range(>=1.6) unit([ucum:m,"meter"]) and
last vitalsign([SNOMED Clinical Terms:27113001,"Body weight"]) in range(<=100.0) unit([ucum:kg,"kilogram"]) and
not last medication([ATC:A10BG02,"rosiglitazone"])
Query 3
born() at least 18 year before now and
last vitalsign([SNOMED Clinical Terms:271649006,"Systolic blood pressure"]) in range(>=140.0) unit([ucum:mm[Hg],"millimeter Mercury column"]) and
not last medication([ATC:A10BG02,"rosiglitazone"]) at most 5 year before now and
last procedure([SNOMED Clinical Terms:64915003,"Operation on heart"]) at most 9 year before now
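As an illustration only (not the actual EHR4CR query engine), the semantics of Query 1 (female and born at least 60 years before now) can be approximated in Python against invented patient records; the reference date is an assumption:

```python
from datetime import date

# Sketch: approximating Query 1 against invented patient records.
# 248152002 is the SNOMED CT code for "Female" used in the query above.
patients = [
    {"id": "p1", "gender": "248152002", "born": date(1950, 6, 1)},
    {"id": "p2", "gender": "248152002", "born": date(1980, 2, 1)},
]

def years_before(born: date, now: date) -> int:
    # Whole years elapsed between the birth date and the reference date.
    return (now.year - born.year) - ((now.month, now.day) < (born.month, born.day))

now = date(2014, 10, 8)   # assumed reference date for illustration
matches = [p["id"] for p in patients
           if p["gender"] == "248152002" and years_before(p["born"], now) >= 60]
```

Only the patient born in 1950 satisfies both criteria on the assumed reference date.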
4.2. Usability Evaluation
Appendix 4.2.1 - Information note for the testers
EHR4CR Usability Evaluation Information note.pdf
Appendix 4.2.2 - Manual + Test Script:
EHR4CR Usability Evaluation of the Query Builder v2.5.docx
4.3. PRS Evaluation
4.3.1 PRS testing template from last year's deliverable
PRS_testing_protocol_template.docx
4.3.2 PRS results document from AP-HP for the EUCLID study
EHR4CR Patient Identification and Recruitment (PIR)
Evaluation
Study name: EUCLID
Study Identifier: NCT:01732822
Site: APHP
Date: 08/10/2014
Authorized Site User: Dr Yannick GIRARDEAU
Signature:
Patient Counts                                 Traditional Recruitment    EHR4CR Platform
Unique Clinically Validated                    53                         2
Identical Clinically Validated (both methods)  0
Unique Clinically Validated Patient Counts indicates the total number of patients uniquely identified
with the specified method based on pseudonymized case numbers as determined by the treating
physician or authorized site user.
Identical Clinically Validated Patient Counts indicates the total number of identical patients
identified with two or three methods based on pseudonymized case numbers as determined by the
treating physician or authorized site user.
PRS Results document from AP-HP for the AstraZeneca EUCLID study
4.3.3 PRS results document from AP-HP for the GetGoal Duo-2 study
EHR4CR Patient Identification and Recruitment (PIR)
Evaluation
Study name: GetGoal Duo-2
Study Identifier: NCT:01768559
Site: APHP
Date: 15/01/2015
Authorized Site User: Dr Yannick GIRARDEAU
Signature:
Patient Counts                                 Traditional Recruitment    EHR4CR Platform
Unique Clinically Validated                    2                          4
Identical Clinically Validated (both methods)  0
Unique Clinically Validated Patient Counts indicates the total number of patients uniquely identified
with the specified method based on pseudonymized case numbers as determined by the treating
physician or authorized site user.
Identical Clinically Validated Patient Counts indicates the total number of identical patients
identified with two or three methods based on pseudonymized case numbers as determined by the
treating physician or authorized site user.
PRS Results document from AP-HP for the Sanofi GetGoal Duo2 study
4.4. PRS Data Inventory
Data Group Data Item
Demographics Admission date
Demographics Case Status
Demographics Date of Birth
Demographics discharge date
Demographics Gender
Diagnosis COPD exacerbation
Diagnosis Diagnosis Code
Diagnosis Diagnosis Date
Diagnosis Diagnosis text
Diagnosis Histologically confirmed diagnosis
Findings Blood pressure diastolic
Findings Blood pressure systolic
Findings Body Mass Index (BMI)
Findings Cardiac function
Findings CT
Findings Date/Time of Finding
Findings ECG
Findings Focal lesion
Findings Forced expiratory volume in 1 second / forced vital capacity (FEV1/FVC) ratio
Findings Gallium scan
Findings Height
Findings HR
Findings hypertension
Findings Infection
Findings Intraocular pressure
Findings Lytic bone lesion
Findings MRI
Findings Oxygen therapy
Findings Percent predicted forced expiratory volume in 1 second (FEV1)
Findings Percentage lesion area
Findings PET scan
Findings Pulse
Findings Spirometry
Findings Temperature
Findings Weight
Laboratory findings Albumin
Laboratory Findings albumin-adjusted serum calcium
Laboratory Findings Alkaline Phosphatase
Laboratory findings Amylase
Laboratory findings ANA titer
Laboratory Findings antibodies
Laboratory findings anti-cyclic citrullinated peptide antibodies (anti-CCP)
Laboratory Findings Beta HCG in serum
Laboratory Findings biPTH
Laboratory Findings Blood Urea Nitrogen [BUN]
Laboratory Findings BNP
Laboratory Findings Ca x P
Laboratory Findings Calcitonin
Laboratory Findings Calcium in serum
Laboratory Findings calculated creatinine clearance
Laboratory Findings Cardiac troponin T
Laboratory findings C-Reactive protein (hs-CRP)
Laboratory Findings Creatinine clearance
Laboratory Findings Creatinine in serum
Laboratory Findings CRP in serum
Laboratory Findings Direct Bilirubin in serum
Laboratory findings eGFR
Laboratory Findings Eosinophils Blood
Laboratory Findings Erythrocytes
Laboratory Findings Fasting C-peptide
Laboratory findings Fasting hypertriglyceridemia
Laboratory Findings Fasting plasma glucose (in serum)
Laboratory findings Ferritin
Laboratory findings Folate
Laboratory findings Gamma GT
Laboratory Findings Glomerular Filtration Rate
Laboratory Findings Glucose in serum
Laboratory Findings Haematocrit Blood
Laboratory Findings HbA1c
Laboratory Findings HDL in serum
laboratory findings hepatitis C virus (HCV)
Laboratory findings HER2 status
Laboratory findings KRAS mutation
Laboratory Findings LDL in serum
Laboratory findings Lipase
Laboratory findings Leukocytes
Laboratory Findings Lymphocytes Blood
Laboratory Findings Mantoux Test
Laboratory Findings Measles Antibody
Laboratory Findings Monoclonal light chain in the urine protein electrophoresis
Laboratory Findings Monoclonal plasma cells in the bone marrow
Laboratory Findings Monoclonal protein in Serum
Laboratory Findings Monoclonal protein in Urine
Laboratory Findings Neutrophils Blood
Laboratory findings NRAS mutation
Laboratory Findings NTproBNP
Laboratory Findings Platelet Count
Laboratory Findings Potassium in serum
Laboratory Findings Prolactin
Laboratory Findings PSA
Laboratory findings PT
Laboratory Findings PT (INR)
Laboratory Findings PTT Blood
Laboratory findings Rheumatoid Factor
Laboratory Findings sampling Date / Time of Laboratory Finding
Laboratory Findings Serum immunoglobulin free light chain
Laboratory Findings serum immunoglobulin kappa lambda free light chain ratio
Laboratory Findings Serum monoclonal paraprotein (M Protein)
Laboratory Findings serum pregnancy test
Laboratory Findings SGOT (AST) in serum
Laboratory Findings SGPT (ALT) in serum
Laboratory Findings Sodium in Serum
Laboratory Findings Thyroid-stimulating hormone (TSH)
Laboratory Findings total bilirubin
Laboratory Findings Total Bilirubin in serum
Laboratory Findings Total Cholesterol in serum
Laboratory Findings Total Protein in serum
Laboratory findings Total testosterone level
Laboratory findings Transferrin saturation
Laboratory Findings Triglycerides
Laboratory Findings Urine monoclonal light chain protein
Laboratory Findings Urine monoclonal paraprotein (M Protein)
Laboratory findings Urine protein to creatinine ratio
Laboratory findings Varicella Antibody
Laboratory findings Vitamin B12
Laboratory findings white blood cell count
Medical device type
Medical History Alcohol Abuse
Medical History Allergies and Hypersensitivity reactions
Medical History Currently breast feeding
Medical History Currently pregnant
Medical History Diet
Medical History Libido
Medical History menopausal status
Medical History pregnancy number
Medical History Smoking Status
Medical History Substance Abuse
Medication active substance
Medication Dosage
Medication Drug Class
Medication Drug Group
Medication Drug name
Medication Medication Code
Medication Medication end date
Medication Medication start date
Medication Route
Patient Characteristics
Day-night cycles
Procedure Procedure Code
Procedure Procedure Date
Procedure Procedure Text
Scores&Classification AJCC
Scores&Classification Best-corrected visual acuity (BCVA) Score
Scores&Classification CTCAE
Scores&Classification Expanded Disability Status Scale (EDSS) score
Scores&Classification IPI
Scores&Classification modified Rankin Score
Scores&Classification NCI-Common Terminology Criteria for Adverse Events
Scores&Classification SELENA-SLEDAI Score
Scores&Classification SLE
Scores&Classification WHO
4.5. CTE Data Inventory
_Top Data Items Export Evaluation CTE v1.0.xlsx
Nr  Data Element  Domain  Average completeness  APHP  FAU  KCL  MUW  U936  UNIVDUN  UOG  UoM  WWU  UCL (breast cancer)  HUG
22 Date Of Birth Demographics 88% 100% 100% A 100% 100% 100% 100% 100%
21 Sex Demographics 87% 100% 100% A 100% 100% 100% 100% 100%
26 Diagnosis Code Disease Characteristics 53% 33% 79% N/A 100% A 80% 100% 35%
25 Date of diagnosis Disease Characteristics 49% N/A 79% A 100% A 80% 100% 35%
117 Date Of Procedure Surgery 22% 13% 15% N/A A A 32% 100% 18%
116 Procedure Name Surgery 22% 13% 15% N/A A A 32% 100% 18%
85 Platelets Laboratory 15% 46% A A A A 25% N/A 50%
38 Result Laboratory Data 15% 68% A A A A 45% 6% N/A
39 Laboratory Test Laboratory Data 15% 68% A A A A 45% 6% N/A
77 Hematocrit Laboratory 14% 49% A N/A A A 36% N/A 31%
40 Original Result Unit Laboratory Data 14% 68% A A A A 41% 6% N/A
80 MCHC (Erythrocyte Mean Corpuscular Hemoglobin Concentration)Laboratory 14% 49% A N/A A A 34% N/A 32%
55 Creatinine Laboratory 14% 48% A A A A 24% N/A 43%
42 Reference Range Upper Limit (reported in Original Unit)Laboratory Data 14% 68% A N/A A A 45% N/A N/A
89 Red Blood Count Laboratory 14% 49% A N/A A A 36% N/A 28%
98 Urine Red Blood Cells Laboratory 14% 0% A N/A A A 100% N/A 11%
120 Date Of Assessment Tumor Response 14% 11% N/A N/A N/A N/A N/A 100% N/A
121 Lesion Location Tumor Response 14% 11% N/A N/A N/A N/A N/A 100% N/A
123 Lesion Description Tumor Response 14% 11% N/A N/A N/A N/A N/A 100% N/A
79 Lymphocytes Laboratory 14% 46% A A A A 33% N/A 31%
41 Reference Range Lower Limit (reported in Original Unit)Laboratory Data 14% 68% A N/A A A 41% N/A N/A
84 Neutrophils (total) Laboratory 14% 46% A A N/A A 36% N/A 27%
128 Body Weight Vital Signs 13% 26% 2% A N/A A 80% N/A N/A
129 Height Vital Signs 13% 26% N/A A N/A A 76% N/A N/A
133 Date Of Assessment Vital Signs 13% N/A 2% A N/A A 100% N/A N/A
101 Ongoing Medical History 13% N/A N/A A N/A N/A N/A 100% N/A
102 Reported Term Medical History 13% 0% N/A N/A N/A N/A N/A 100% N/A
122 Method Of Tumor Measurement Tumor Response 13% N/A N/A N/A N/A N/A N/A 100% N/A
124 New Lesion Description Tumor Response 13% N/A N/A N/A N/A N/A N/A 100% N/A
125 Measurement Of Target Lesion Diameter Tumor Response 13% N/A N/A N/A N/A N/A N/A 100% N/A
105 Questionnaire Name PRO 12% 93% N/A N/A N/A N/A 1% N/A N/A
106 Date / Time Of Assessment PRO 12% 93% N/A N/A N/A N/A 1% N/A N/A
107 Question Name PRO 12% 93% N/A N/A N/A N/A 1% N/A N/A
78 Hemoglobin Laboratory 12% 49% A A A A 13% N/A 31%
82 MCV (Erythrocyte Mean Corpuscular Volume)Laboratory 12% 49% A N/A A A 12% N/A 32%
87 PT,INR (International Normalized Ratio of Prothrombin Time)Laboratory 11% 36% A A A A 25% N/A 30%
63 Potassium Laboratory 11% 47% A A A A 3% N/A 41%
66 SGPT/ALT Laboratory 11% 32% A A A A 27% N/A 31%
86 PT (Prothrombin time) Laboratory 11% 36% A N/A A A 24% N/A 30%
74 Basophils Laboratory 11% 46% A N/A A A 11% N/A 31%
67 Sodium Laboratory 11% 47% A A A A 0% N/A 41%
64 Protein, total Laboratory 11% 47% A A A A 23% N/A 18%
83 Monocytes Laboratory 11% 46% A N/A A A 9% N/A 31%
65 SGOT/AST Laboratory 11% 32% A A A A 23% N/A 31%
81 mean corpuscular hemoglobin Laboratory 10% 49% A N/A A A 34% N/A N/A
131 Temperature Vital Signs 10% 63% N/A N/A N/A A 16% N/A N/A
57 Glucose, unspecified Laboratory 10% 40% A A A A 3% N/A 35%
12 Start Date Concomitant Medication 10% 33% 3% A N/A A 21% N/A 21%
19 Drug Name Concomitant Medication 10% 33% 3% N/A N/A A 21% N/A 20%
75 Eosinophils Laboratory 10% 46% A N/A A A 0% N/A 30%
48 Bilirubin, total Laboratory 9% 30% A A A A 22% N/A 24%
90 WBC Laboratory 9% 75% A A A A 0% N/A N/A
13 Route Of Administration Concomitant Medication 9% 33% N/A A N/A A 20% N/A 21%
14 Frequency Concomitant Medication 9% 33% N/A A N/A A 20% N/A 21%
18 Dose Unit Concomitant Medication 9% 33% N/A A N/A A 21% N/A 20%
17 Dose Per Administration Concomitant Medication 9% 33% N/A A N/A N/A 20% N/A 20%
47 Bilirubin, indirect Laboratory 9% 30% A N/A A N/A 18% N/A 24%
45 Alkaline phosphatase Laboratory 9% 47% A A A A 1% N/A 24%
51 Calcium Laboratory 9% 38% A A A A 10% N/A 21%
52 Chloride Laboratory 9% 47% A N/A N/A A 8% N/A 13%
126 Systolic Blood Pressure Vital Signs 9% 46% N/A A N/A A 22% N/A N/A
127 Diastolic Blood Pressure Vital Signs 9% 46% N/A A N/A A 22% N/A N/A
88 Partial Thromboplastin Time Laboratory 8% 1% A N/A A N/A 34% N/A 30%
43 Date / Time Sample Was Taken Laboratory Data 8% 48% A A A A 7% 6% N/A
54 Creatine Kinase (CK, CPK) Laboratory 7% 15% A A A A 32% N/A 11%
46 Bilirubin direct Laboratory 7% 30% A N/A A N/A 1% N/A 24%
44 Albumin Laboratory 7% 7% A A A A 16% 7% 24%
20 Total Daily Dose Concomitant Medication 7% 33% N/A A N/A N/A 21% N/A N/A
56 GGT Laboratory 6% 30% A A A A 20% N/A N/A
37 QTCF Interval ECG 6% N/A N/A N/A N/A A 45% N/A N/A
72 TSH Laboratory 5% 13% A A A A 15% N/A 15%
71 Troponin T Laboratory 5% 19% A N/A N/A A 23% N/A N/A
16 Stop Date Concomitant Medication 5% 33% 3% N/A N/A N/A 7% N/A N/A
76 Erythrocyte Sedimentation Rate Laboratory 5% 3% A N/A A N/A 35% N/A N/A
53 Cholesterol, total Laboratory 5% 13% A A A A 14% N/A 11%
92 Urine Glucose Laboratory 5% 1% A N/A A A 2% N/A 35%
58 Glycated Haemoglobin / Hemoglobin A1C Laboratory 4% 10% A A A A 19% N/A 6%
97 Urine Protein Laboratory 4% 15% A N/A A A 4% N/A 15%
49 Blood Urea Nitrogen Laboratory 4% 32% A N/A A N/A 0% N/A N/A
50 Brain Natriuretic Peptide Laboratory 4% 6% A N/A N/A N/A 20% N/A 5%
69 Triglycerides Laboratory 4% 13% A A A A 4% N/A 12%
62 Phosphorus, Inorg Laboratory 3% N/A A N/A A N/A 28% N/A N/A
130 Pulse Vital Signs 3% 5% N/A N/A N/A A 21% N/A N/A
73 Uric acid Laboratory 3% 15% A N/A A A 10% N/A 0%
91 Urine Bilirubin Laboratory 3% N/A A N/A A A 0% N/A 25%
95 Urine Nitrites Laboratory 3% N/A A N/A A A 8% N/A 15%
93 Urine Ketones Laboratory 2% N/A A N/A A A 3% N/A 17%
96 Urine pH Laboratory 2% N/A A N/A A A 3% N/A 15%
61 N-terminal probrain natriuretic peptide Laboratory 2% 6% A N/A A N/A 10% N/A N/A
100 Urine Urobilinogen Laboratory 2% 0% A N/A A A 0% N/A 15%
94 Urine Leucocytes Laboratory 2% N/A A N/A A A 0% N/A 15%
68 Total T4 Laboratory 1% 2% A N/A N/A N/A 8% N/A 1%
32 ECG Clinical Findings ECG 1% 4% N/A N/A N/A A 7% N/A N/A
60 Magnesium Laboratory 1% N/A A A A A 2% N/A 8%
108 What Is The History Of Smoking Use For This SubjectSubstance Use 1% 2% N/A A N/A N/A 7% A N/A
59 LDH Laboratory 1% 5% A N/A A A 4% N/A N/A
28 Disposition Start Date Disposition 1% N/A N/A N/A N/A A 8% N/A N/A
113 Alcohol Consumption Substance Use 1% 8% N/A A N/A A 0% N/A N/A
70 Troponin I Laboratory 1% 0% A N/A A A 0% N/A 7%
33 QRS Interval / Complex ECG 1% N/A N/A N/A N/A A 7% N/A N/A
35 ECG Heart Rate ECG 1% N/A N/A N/A N/A A 7% N/A N/A
36 QTCB Interval ECG 1% N/A N/A N/A N/A A 7% N/A N/A
30 Sinus Rhythm ECG 1% N/A N/A N/A N/A A 5% N/A N/A
110 Cigarettes Smoked Per Day Substance Use 0% 2% N/A N/A N/A A 1% N/A N/A
8 Date Of Death Adverse Events 0% 1% N/A A N/A A 0% A 1%
112 Number Of Pack Years Substance Use 0% 2% N/A N/A N/A A N/A N/A N/A
34 QT Interval ECG 0% N/A N/A N/A N/A A 1% N/A N/A
31 PR Interval ECG 0% N/A N/A N/A N/A A 1% N/A N/A
10 Time Of Death Adverse Events 0% N/A N/A A N/A N/A 0% N/A N/A
29 Electrocardiogram Date / Time ECG 0% N/A N/A N/A N/A A 0% N/A N/A
109 Last Smoked Substance Use 0% N/A N/A N/A N/A A 0% N/A N/A
1 Start Date / Time Adverse Events 0% N/A N/A A N/A A 0% N/A N/A
2 Outcome Adverse Events 0% N/A N/A A N/A A 0% N/A N/A
3 Verbatim Description Adverse Events 0% N/A N/A A N/A A 0% N/A N/A
4 End Date / Time Adverse Events 0% N/A N/A N/A N/A A 0% N/A N/A
5 Severity of Adverse Event Adverse Events 0% N/A N/A A N/A N/A 0% N/A N/A
6 Seriousness of Adverse Event Adverse Events 0% N/A N/A N/A N/A N/A 0% N/A N/A
7 Action(s) taken Adverse Events 0% N/A N/A N/A N/A N/A 0% N/A N/A
9 Cause Of Death Adverse Events 0% N/A N/A A N/A A 0% N/A N/A
11 In Case Of Death, Autopsy Report Adverse Events 0% N/A N/A N/A N/A N/A 0% N/A N/A
15 Reason Concomitant Medication 0% N/A N/A N/A N/A N/A N/A N/A N/A
23 Ethnicity Demographics 0% 0% N/A A N/A A N/A N/A N/A
24 Race Demographics 0% N/A N/A N/A N/A N/A N/A N/A N/A
27 Disposition Category Disposition 0% N/A N/A N/A N/A A 0% N/A N/A
99 Urine Specific Gravity Laboratory 0% N/A A N/A A N/A N/A N/A N/A
103 Event End Date Time Medical History 0% N/A N/A N/A A N/A N/A N/A N/A
104 Event Start Date Time Medical History 0% N/A N/A N/A A N/A N/A A N/A
111 Years Smoked Substance Use 0% N/A N/A N/A N/A A N/A N/A N/A
114 Substance Use Start Date Time Substance Use 0% N/A N/A N/A N/A A N/A N/A N/A
115 Substance Use End Date Time Substance Use 0% N/A N/A N/A N/A A N/A N/A N/A
118 Indication Surgery 0% N/A N/A N/A N/A N/A N/A N/A N/A
119 Planned Date Of Surgery Procedure Surgery 0% N/A N/A N/A N/A N/A N/A N/A N/A
132 Position of VS Measurement Vital Signs 0% N/A N/A N/A N/A N/A N/A N/A N/A