EVALUATION OF CARDIO-METABOLIC EFFECTS OF TREATMENT WITH INCRETIN-BASED THERAPIES IN PATIENTS WITH TYPE 2 DIABETES
Olga Montvida, MSc
Student number: 9341625
Submitted in fulfilment of the requirements for the degree of Doctor of Philosophy
School of Biomedical Sciences
Institute of Health and Biomedical Innovation
Faculty of Health
Queensland University of Technology
2018
ABSTRACT
Type 2 diabetes (T2DM) is a chronic and progressive metabolic disorder with a complex and
multifactorial pathophysiology. As patients with T2DM are at increased risk of cardiovascular
(CV) complications and mortality, efficient disease management requires a holistic multi-
faceted approach to control blood glucose, blood pressure, lipids, and body weight. While
metformin has been suggested as the first-line anti-diabetic drug (ADD), given the progressive
nature of the disease, many patients eventually require intensification. International guidelines
suggest multiple options for second- and third-line ADDs, including incretin-based therapies:
dipeptidyl peptidase 4 inhibitor (DPP-4i) and glucagon-like peptide-1 receptor agonist (GLP-
1RA). While current disease management guidelines are primarily based on results from
randomised controlled trials that are conducted on a protocol-driven selective patient
population, population-level evaluation of the effectiveness and safety of such therapies in
real-world practice would guide patients and their carers in choosing the right
therapies for optimum disease management. Clinical studies have evaluated the possible
beneficial association between treatment with novel anti-diabetic therapies and CV risk factors;
however, real-world evidence on these aspects is scarce.
With a central focus on incretin-based therapies, the aims of this thesis were to explore the real-
world patterns of (1) longitudinal changes in ADD choices, (2) population-level glycaemic
control and its sustainability, and (3) the long-term cardio-metabolic risk factor burden.
Using a large database of Electronic Medical Records (EMRs) from the United States, six
pharmaco-epidemiological and three methodological studies were conducted. A number of
important findings were reported in high impact journals including Diabetes Care and
Diabetes, Obesity, and Metabolism, with one publication receiving a dedicated review in the
Nature Reviews journal.
Extensive methodological and data mining studies were performed to extract reliable data from
voluminous EMRs and to develop efficient study designs and analysis approaches. One study
was devoted to the data management of medication prescriptions, specifically to the estimation
of treatment duration at individual patient-level accounting for intensifications and alterations
with multiple therapies. Methodological challenges associated with robust identification of the
patients with T2DM were addressed in another separate study. An exploratory analysis to
investigate the mechanisms and patterns of longitudinal missing risk factor data along with a
comparative study of multiple imputation techniques for such data were conducted to account
for the uncertainty due to missing values and to ensure the generalisability of study findings.
Advanced statistical methodologies, such as “treatment effects modelling”, were applied
throughout the thesis to ensure that robust inferences were drawn in the individual studies.
It was observed that the use of incretin-based drugs has increased since their approval in 2005,
in particular the use of DPP-4i as a second-line choice. Patient profiles varied significantly by
the class of chosen ADD; for instance, GLP-1RA users were younger, had lower HbA1c levels,
and were more likely to be female, compared to other major ADD users. It was observed that
around half of the patients with T2DM did not reach glycaemic targets and that clear evidence of
therapeutic inertia persists at the population level.
Patients who intensified metformin with incretin-based drugs or thiazolidinedione were more
likely to achieve and sustain glycaemic control over 24 months of continuous treatment,
compared to those treated with sulfonylurea – the most popular intensification choice. A
separate study investigated the outcomes of intensifying GLP-1RA with insulin and reported
beneficial cardio-metabolic effects of combining these therapies. Even though the popularity
of newer therapeutic classes as second-line options was notably increasing, the longitudinal
rates of intensification with a third-line ADD were not reduced significantly at the population
level.
Neither glycaemic nor CV risk factor burden significantly improved over the last decade in
patients with T2DM, even though most patients were using multiple drugs for glucose, blood
pressure and lipid control. The long-term glycaemic burden consistently increased over time,
and more than half of the patients with a history of CV disease continued to have uncontrolled
blood pressure and lipids post-therapy initiation. Three out of five patients who were already
receiving multiple anti-diabetic and cardio-protective drugs were failing to simultaneously
control glucose and at least one CV risk factor. Compared to those who initiated second-line
ADD with sulfonylurea and insulin, patients who intensified metformin with incretin-based
therapies or thiazolidinedione were more likely to achieve simultaneous glucose and CV risk
factor control. Treatment with GLP-1RA was associated with lower rates of major adverse
macrovascular events, compared to other ADDs.
To conclude, this dissertation provides a detailed exploration of, and valuable insights into, T2DM
management in the real-world setting and highlights alarming rates of the existing cardio-metabolic
burden at the population level. Incretin-based therapies and thiazolidinedione were
found to provide higher chances of sustainable glycaemic and CV risk factor control, and
treatment with GLP-1RA appears to have a beneficial association with CV risk, compared to
other anti-diabetic treatment options. Nonetheless, proper control through timely
intensification with anti-hyperglycaemic, anti-hypertensive, and anti-dyslipidaemic therapies,
when needed, remains a key aspect of improving long-term outcomes in patients with T2DM.
KEYWORDS
Glucagon-like peptide-1 receptor agonist, dipeptidyl peptidase 4 inhibitor, incretin-based
therapy, glycaemic control, cardiovascular risk, macrovascular event, type 2 diabetes mellitus,
electronic medical records, longitudinal cohort study.
LIST OF PUBLICATIONS
The following is a list of published or submitted manuscripts that have been incorporated
into this thesis, thereby producing a thesis by publication.
Chapter 4: Olga Montvida, Ognjen Arandjelović, Edward Reiner, and Sanjoy K. Paul. Data
Mining Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic
Medical Records. Open Bioinformatics, 2017, 10:1-15. DOI:
10.2174/1875036201709010001.x.
Chapter 6: Mayukh Samanta, Olga Montvida, Joanne Tropea, and Sanjoy K. Paul. A
comparison of imputation methods for missing risk factor data from large real-world electronic
medical records for comparative effectiveness studies. (Submitted)
Chapter 7: Olga Montvida, Jonathan Shaw, John J Atherton, Francis Stringer, Sanjoy K Paul.
Long-term Trends in Antidiabetes Drug Usage in the US: Real-world Evidence in Patients
Newly Diagnosed With Type 2 Diabetes. Diabetes Care. 2017 Nov 6:dc171414. DOI:
10.2337/dc17-1414.x.
Chapter 8: Olga Montvida, Jonathan Shaw, Lawrence Blonde, Sanjoy K Paul. Long-term
sustainability of glycaemic achievements with second-line anti-diabetic therapies in patients
with type 2 diabetes: A real-world study. Diabetes, Obesity, and Metabolism. 2018;20:1722–
1731. DOI: 10.1111/dom.13288.x.
Chapter 9: Olga Montvida, Sanjoy K Paul. Cardiovascular risk factor burden and safety in
patients with type 2 diabetes receiving intensified anti-diabetic and cardio-protective therapies.
(Submitted)
The following is a list of accepted and submitted manuscripts that are highly relevant to the
work performed in this thesis and were developed throughout candidature.
Appendix A: Olga Montvida, Kerenaftali Klein, Sudhesh Kumar, Kamlesh Khunti, Sanjoy
K. Paul. Addition of or switch to insulin therapy in people treated with glucagon‐like peptide‐
1 receptor agonists: A real‐world study in 66 583 patients. Diabetes, Obesity and Metabolism.
2017 Jan 1;19(1):108-17. DOI: 10.1111/dom.12790.x.
Appendix B: Ebenezer S. Owusu Adjah*, Olga Montvida*, Julius Agbeve, Sanjoy K. Paul.
Data Mining Approach to Identify Disease Cohorts from Primary Care Electronic Medical
Records: A Case of Diabetes Mellitus. The Open Bioinformatics Journal, 2017, 10: 16-27.
DOI: 10.2174/1875036201710010016.x. *Joint first authorship.
Appendix C: Sanjoy K Paul, Jonathan Shaw, Olga Montvida, Kerenaftali Klein. Weight gain
in insulin treated patients by BMI categories at treatment initiation: New evidence from real-
world data in patients with type 2 diabetes. Diabetes, Obesity and Metabolism. 2016 Dec
1;18(12):1244-52. DOI: 10.1111/dom.12761.x.
Appendix D: Olga Montvida, Jennifer B Green, John Atherton, Sanjoy K Paul. Risk of
Pancreatic Diseases by Second-line Drug Class: Real World Evidence in 225,898 Type 2
Diabetes Patients. Diabet Med. 2018 Oct 10. doi: 10.1111/dme.13835.
The following is a list of presentations and papers in refereed conference proceedings
throughout candidature.
1. Olga Montvida, Sanjoy Paul. Cardiovascular risk factor burden and safety in patients
with type 2 diabetes receiving intensified anti-diabetic and cardio-protective therapies. QIMR
Berghofer Early Career Researcher Seminar, Brisbane, Australia, 18 May 2018.
2. Olga Montvida, Jonathan Shaw, Sanjoy Paul. Comparative assessment of glycaemic
achievements with second-line anti-diabetes therapy intensification – Real world evidence
based choices for patients and providers. Annual Meeting of the European Association for the
Study of Diabetes (EASD), Lisbon, Portugal, 11-15 September 2017
3. Sanjoy Paul, Jennifer B Green, John Atherton, Olga Montvida. Risk of Pancreatic
Diseases by Second-line Anti Diabetes Drug Class: Real World Based Evidence. Annual
Meeting of the European Association for the Study of Diabetes (EASD), Lisbon, Portugal, 11-
15 September 2017
4. Ebenezer Adjah, Olga Montvida, Kamlesh Khunti, Sanjoy Paul. Interactive changes
in cardiovascular risk factors and the long-term cardiovascular risk differ by adiposity levels
in incident type 2 diabetes patients: real world study. Annual Meeting of the European
Association for the Study of Diabetes (EASD), Lisbon, Portugal, 11-15 September 2017
5. Olga Montvida, Sanjoy Paul. Time to third-line anti-diabetes therapy intensification
in patients receiving second-line GLP-1 receptor agonist, DPP-4 inhibitor and Sulfonylurea: A
real-world study. The Australian Diabetes Society (ADS) and the Australian Diabetes
Educators Association (ADEA) Annual Scientific Meeting 2017, Perth, Australia. 30 August -
1 September 2017.
6. Olga Montvida, Sanjoy Paul. Long-term glycaemic control with incretin-based
therapies in patients with type 2 diabetes: real-world study. QIMR Berghofer Early Career
Researcher Seminar, Brisbane, Australia, 23 June 2017.
7. Sanjoy K. Paul, Brian L. Thorsted, Michael L. Wolden, Kamlesh Khunti, Olga
Montvida. Delay in Treatment Intensification Increases the Risk of Cardiovascular Events in
Patients with Type 2 Diabetes. Cardiovascular Research Showcase, Brisbane, Australia, 22
November 2016.
8. Sanjoy Paul, Jonathan Shaw, Olga Montvida, Kerenaftali Klein. Obese Patients Gain
Less Weight than Non-obese Patients when Treated with Insulin, with Similar HbA1c
Reductions: New Evidence from Real-world Data in Type 2 Diabetes. Annual Meeting of the
European Association for the Study of Diabetes (EASD), Munich, Germany, 12-16 Sep, 2016.
9. Olga Montvida, Sanjoy Paul. Addition or Switch to Insulin Therapy in People Treated
with GLP-1 Receptor Agonists: A Real World Study in 66,583 Patients. Australian Diabetes
Society and Australian Diabetes Educators Association Annual Scientific Meeting, Gold Coast,
24-26 Aug, 2016.
10. Olga Montvida, Sanjoy Paul. Real World Outcomes of Addition or Switch to Insulin
Therapy in People Treated with GLP-1 Receptor Agonist. QIMR Berghofer Student
Symposium, 15 Jul 2016.
11. Sanjoy K. Paul, Jonathan Shaw, Kerenaftali Klein, Olga Montvida. Obese T2DM
Patients Gain Less Weight with Insulin Treatment Compared with Normal and Overweight
Patients: New Evidence from Real-World Data. American Diabetes Association (ADA) 76th
Scientific Sessions, New Orleans, USA, 10-14 June 2016.
12. Olga Montvida, Kerenaftali Klein, Sanjoy K. Paul. Evaluation of the Cardio-metabolic
Effects of Treatment with Incretin-based Therapies in Patients with Type 2 Diabetes. 8th
Biennial QIMR Berghofer Student Retreat, Canungra, Australia, 17-18 September 2015.
13. Kerenaftali Klein, Olga Montvida, Sanjoy K Paul. Real World Glucose and Weight
Control in Patients Treated with GLP-1 Receptor Agonists, with Addition or Treatment Change
to Insulin. Annual Meeting of the European Association for the Study of Diabetes (EASD),
Stockholm, Sweden, 14-18 September 2015.
STATEMENT OF ORIGINAL AUTHORSHIP
The work contained in this thesis has not been previously submitted to meet requirements for
an award at Queensland University of Technology or any other higher education institution.
To the best of my knowledge and belief, the thesis contains no material previously published
or written by another person except where due reference is made.
Olga Montvida 6 Nov 2018
QUT Verified Signature
ACKNOWLEDGEMENTS
I would like to express sincerest and deepest gratitude to my principal supervisor, Professor
Sanjoy Ketan Paul. Thank you for trusting in my abilities to conduct this project from the very
beginning and for such a close mentorship and inspiration over the past three years. There was
a lot of hard work, tons of coffee, and a lot of fun. We are no Michelangelo, but the quote
reflects it perfectly: “If people knew how hard I worked to achieve my mastery, it wouldn't
seem so wonderful after all”.
Deep thanks to my associate supervisors, Professor Ross Young and Professor Louise Hafner
for supporting me and advising me from the first day of my enrolment. To all former colleagues
at QIMR Berghofer - Kerenaftali Klein, Julius Agbeve, Mayukh Samanta, Margaret Haughton,
and Gunter Hartel. You were here to help me during good and bad times, thank you for that.
To my PhD buddy Ebenezer Senyo Owusu Adjah, who never refused to share with me his
materials, thoughts, and advice.
I gratefully acknowledge the financial support received through the scholarship from
Queensland University of Technology. I also want to thank QIMR Berghofer Medical Research
Institute for giving me working space and granting a research top-up scholarship. Big thank
you to every employee working for these two institutions who ensured that I felt safe and
comfortable in a new country.
Finally, thanks to those who are always in my heart – family and friends. Your love contributed
to this dissertation much more than you think. Special thanks to my brother, who kept boosting
my self-confidence and fighting spirit.
It’s been an amazing journey. With confidence I may now say that a significant (p<0.01 😊)
development of myself as a scientist and as a person has been achieved.
TABLE OF CONTENTS
Abstract
Keywords
List of Publications
Statement of Original Authorship
Acknowledgements
List of Figures
List of Tables
List of Supplementary Material
List of Abbreviations
Chapter 1: Introduction
1.1 Diabetes Mellitus
1.2 Epidemiology of Diabetes Mellitus
1.3 Complications of Type 2 Diabetes
1.4 Treatment of Type 2 Diabetes
1.5 Incretin-based therapies
1.6 Glycaemic effects of incretin-based therapies
1.7 Cardio-metabolic effects of incretin-based therapies
1.8 Aims and Objectives
1.9 Methodological Background
1.10 Thesis structure and logics
Chapter 2: Literature Review
2.1 Clinical trials
2.2 Observational studies
2.3 Conclusions and implications
Chapter 3: Data Description
3.1 Centricity Electronic Medical Records
3.2 Medication data
3.3 Disease data
3.4 Laboratory, clinical, and anthropometric data
3.5 Ethics approval
Chapter 4: Medication Data Extraction
Chapter 5: Diabetes Mellitus Cohort
5.1 Diagnostic codes
5.2 Supervised machine learning
5.3 Final cohort
5.4 Representativeness of diabetes cohort
5.5 Type 2 diabetes cohort
Chapter 6: Imputation of Longitudinal Observation Data
Chapter 7: Trends in Anti-diabetic Drug Prescribing Patterns
Chapter 8: Glycaemic Control and Sustainability
Chapter 9: Cardio-metabolic Risk Factor Burden and Safety
Chapter 10: Discussion and Conclusions
Bibliography
Appendices
LIST OF FIGURES
Figure 3.1. Schematic representation of the data in CEMR database.
Figure 3.2. Schematic diagram of identifying list of medication keys for Liraglutide.
Figure 3.3. Schematic diagram of arranging longitudinal risk factor data.
Figure 5.1. Cohort of patients with T2DM and distribution of identified sub-types.
Figure 5.2. Selected Decision Tree algorithm.
LIST OF TABLES
Table 1.1 Incretin-based medications approved in the US and EU
Table 1.2 Possible sources of bias in Electronic Medical Record data
Table 2.1 Completed cardiovascular outcome trials for DPP-4i in patients with type 2 diabetes
Table 2.2 Completed cardiovascular outcome trials for GLP-1RA in patients with type 2 diabetes
Table 2.3 Summary of observational CV-outcome studies of treatment with incretin-based therapies
Table 3.1 Therapeutic Class and highest corresponding ATC code
Table 3.2 Diseases, ICD codes, and Weights used to compute Charlson Comorbidity Index
Table 5.1 Features Selected as Best Diabetes Predictors in CEMR
Table 5.2 Performance of Machine Learning Algorithms on the Training Dataset
Table 5.3 Characteristics of patients with diabetes in the CEMR database and in the National Diabetes Statistics report, 2015
Table 5.4 Baseline characteristics among adults with T2DM
Table 5.5 Exposure to medications any time during available follow-up among adults with T2DM
LIST OF SUPPLEMENTARY MATERIAL
Appendix A: Addition of or switch to insulin therapy in people treated with glucagon‐like
peptide‐1 receptor agonists: A real‐world study in 66 583 patients.
Appendix B: Data Mining Approach to Identify Disease Cohorts from Primary Care Electronic
Medical Records: A Case of Diabetes Mellitus.
Appendix C: Weight gain in insulin treated patients by BMI categories at treatment initiation:
New evidence from real-world data in patients with type 2 diabetes.
Appendix D: Risk of Pancreatic Diseases by Second-line Drug Class: Real World Evidence in
225,898 Type 2 Diabetes Patients.
LIST OF ABBREVIATIONS
ADA      American Diabetes Association
ADD      Anti-diabetic Drug
ATC      Anatomical Therapeutic Chemical Classification
AUC      Area Under Receiver Operating Characteristic curve
BMI      Body Mass Index
CCI      Charlson Comorbidity Index
CEMR     Centricity Electronic Medical Record
CI       Confidence Interval
CPM      Cardio-protective Medication
CPU      Central Processing Unit
CV       Cardiovascular
CVD      Cardiovascular Disease
DCCT     Diabetes Control and Complications Trial
DM       Diabetes Mellitus
DPP-4    Dipeptidyl Peptidase-4
DPP-4i   Dipeptidyl Peptidase-4 inhibitor
EMA      European Medicines Agency
EMR      Electronic Medical Record
FDA      Food and Drug Administration
GIP      Glucose-dependent Insulinotropic Polypeptide
GLP-1    Glucagon-like Peptide-1
GLP-1RA  Glucagon-Like Peptide-1 Receptor Agonist
HbA1c    Glycated Haemoglobin
HR       Hazard Ratio
ICD-10   International Classification of Diseases 10th Revision
ICD-9    International Classification of Diseases 9th Revision
IDF      International Diabetes Federation
INS      Insulin
LDL      Low-density Lipoprotein
MACE     Major Adverse Cardiovascular Event
MET      Metformin
ML       Machine Learning
NDS      National Diabetes Statistics
OR       Odds Ratio
RCT      Randomised Controlled Trial
SBP      Systolic Blood Pressure
SD       Standard Deviation
SGLT-2i  Sodium Glucose co-Transporter 2 inhibitor
SU       Sulfonylurea
T2DM     Type 2 Diabetes Mellitus
TZD      Thiazolidinedione
UKPDS    The UK Prospective Diabetes Study
Chapter 1: Introduction
1.1 DIABETES MELLITUS
Diabetes mellitus (DM) is a group of metabolic disorders characterised by defects in insulin
secretion or action that lead to increased blood glucose levels (hyperglycaemia) [1]. It is a
chronic disease with increasing prevalence, currently affecting around 9% of the global adult
population [2].
The modern aetiologic classification of DM consists of four categories: Type 1, Type 2,
Gestational, and Other Types [1, 3]. Absolute insulin deficiency resulting from autoimmune
or idiopathic β-cell destruction is usually classified as Type 1 diabetes. Type 2 diabetes mellitus
(T2DM) is generally characterised by relative (rather than absolute) insulin deficiency, and it
accounts for 90-95% of all DM cases [1, 4]. Gestational diabetes occurs in women during
gestation. Other specific types of diabetes caused by genetic defects of β-cell function, diseases
of the pancreas, and other aetiologies are grouped together. Prediabetes, or borderline diabetes,
refers to blood glucose levels that are higher than normal but do not reach the DM diagnostic
threshold [1, 5].
1.1.1 Diagnosis of Type 2 Diabetes
Early symptoms of T2DM are polyuria, polydipsia, and polyphagia. Symptoms also may
include fatigue, headaches, trouble concentrating, blurred vision, and weight loss. American
Diabetes Association guidelines recommend three tests to diagnose T2DM [1]:
Fasting Plasma Glucose ≥ 126 mg/dL [7.0 mmol/L],
2-hour Plasma Glucose ≥ 200 mg/dL [11.1 mmol/L] during Oral Glucose Tolerance Test,
Glycated Haemoglobin ≥ 6.5% [48 mmol/mol].
For the fasting plasma glucose test, fasting is defined as no caloric intake for at least 8 hours. The
oral glucose tolerance test uses a glucose load containing the equivalent of 75 g of
anhydrous glucose dissolved in water. The glycated haemoglobin (HbA1c) test reflects average
plasma glucose concentrations over approximately 3 months. HbA1c is a measurement
of the percentage of haemoglobin A molecules that have formed a stable ketoamine linkage between
the amino terminal valine residue of the beta chain and a glucose moiety [6]. The HbA1c test
should be performed using a method that is certified by the National Glycohemoglobin
Standardization Program and standardised to the Diabetes Control and Complications Trial
(DCCT) assay. A random plasma glucose of 200 mg/dL [11.1 mmol/L] or more may also be used to
diagnose T2DM in patients with classic symptoms of hyperglycaemia [1].
For all tests, it is recommended that a second test (the same or a different one) be conducted
immediately with a new blood sample to confirm the diagnosis.
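The diagnostic thresholds listed in this section can be expressed as a short rule set. The following Python sketch is illustrative only and is not part of the thesis methodology: the function names (`criteria_met`, `is_confirmed`) and the labels they return are hypothetical, the random plasma glucose cut-off is treated as 200 mg/dL or more, and confirmation is simplified to "both the initial and the repeat test reach a threshold".

```python
from typing import List, Optional

# Thresholds as listed in the text (ADA criteria, reference [1]).
FPG_MG_DL = 126.0        # fasting plasma glucose, mg/dL
OGTT_2H_MG_DL = 200.0    # 2-hour plasma glucose during OGTT, mg/dL
HBA1C_PCT = 6.5          # glycated haemoglobin, %
RANDOM_PG_MG_DL = 200.0  # random plasma glucose, mg/dL (assumed ">=")


def criteria_met(
    fpg_mg_dl: Optional[float] = None,
    ogtt_2h_mg_dl: Optional[float] = None,
    hba1c_pct: Optional[float] = None,
    random_pg_mg_dl: Optional[float] = None,
    classic_symptoms: bool = False,
) -> List[str]:
    """Return labels of the thresholds reached by one set of results.

    Missing results are passed as None and simply skipped.
    """
    met: List[str] = []
    if fpg_mg_dl is not None and fpg_mg_dl >= FPG_MG_DL:
        met.append("FPG")
    if ogtt_2h_mg_dl is not None and ogtt_2h_mg_dl >= OGTT_2H_MG_DL:
        met.append("OGTT")
    if hba1c_pct is not None and hba1c_pct >= HBA1C_PCT:
        met.append("HbA1c")
    # Random plasma glucose only counts in patients with classic symptoms.
    if (
        classic_symptoms
        and random_pg_mg_dl is not None
        and random_pg_mg_dl >= RANDOM_PG_MG_DL
    ):
        met.append("RPG")
    return met


def is_confirmed(first: List[str], second: List[str]) -> bool:
    """Simplified, hypothetical confirmation rule: the initial and the
    repeat test (same or different) each reach at least one threshold."""
    return bool(first) and bool(second)


if __name__ == "__main__":
    initial = criteria_met(fpg_mg_dl=131.0, hba1c_pct=6.2)
    repeat = criteria_met(hba1c_pct=6.7)
    print(initial, repeat, is_confirmed(initial, repeat))
```

Keeping the per-test check separate from the confirmation step mirrors the text, where a single result above threshold is not sufficient until a second sample confirms it.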
1.2 EPIDEMIOLOGY OF DIABETES MELLITUS
The global prevalence of DM has increased fourfold during the last 30 years. According to the 2017
worldwide survey conducted by the International Diabetes Federation (IDF), more than 425
million individuals (equivalent to 1 in 11 adults) have diabetes, and 1 in every 2 adults with
diabetes is undiagnosed (~212 million) [7, 8]. In the US, the age-adjusted (20-79 years)
prevalence of diabetes in 2017 was 10.8%, while about 11.5 million individuals were estimated
to have undiagnosed diabetes [8]. More than three quarters of people with DM live in low- and
middle-income countries, and most of them are 20 to 64 years old. Over a million children
and adolescents have type 1 diabetes.
About 12% of global health expenditure is spent on DM management [8]. In the US, a quarter
of total health expenditure was estimated to be spent on DM management [9]. The American
Diabetes Association estimated the cost of diagnosed DM at USD 327 billion in 2017: 72% in direct
medical costs and 28% in reduced productivity [10]. Among Australians with DM aged 20–65
years, Magliano and colleagues estimated that DM reduced productivity-adjusted life years by 12.2%
and 11.0% for men and women, respectively [11]. Bommer and colleagues modelled the economic
burden of DM under various scenarios and reported an increase in costs as a share of global
GDP from 1.8% (1.7–1.9) in 2015 to 2.2% (2.1–2.2) in 2030 [12].
In 2004, the World Health Organization (WHO) estimated the number of people with diabetes at
171 million in 2000 and forecast 366 million by 2030 [13]. In practice, these projections proved to
be severe underestimates, as by 2017 there were already 425 million people with DM. The IDF
projects the number of people with diabetes to rise to 642 million by 2040. However, this projection
may again be an underestimate, as the IDF extrapolates prevalence for countries with missing data
from various less reliable sources [7, 14].
Advances in epidemiological research on DM have led to a better understanding of the various risk
factors associated with the development of T2DM. The determinants of T2DM consist of many
contrasting and interacting genetic, epigenetic and lifestyle factors [14]. The risk of developing
T2DM increases with age, body mass index (BMI), and a sedentary lifestyle. Also, a
high-calorie diet leading to excess body fat, hypertension, and dyslipidaemia is considered to
be a major contributor to the disease burden. People with a history of diabetes in first- and
second-degree relatives have an increased risk of developing T2DM.
Ethnic minorities in the US and Australia have a higher risk of developing T2DM compared to
non-minority individuals [1]. South Asians develop diabetes earlier and at lower BMI levels,
compared to Western populations [15, 16]. In India, 72 million people were estimated to have
DM in 2017, and 123.5 million were predicted to have DM by 2040. A population-based survey
conducted in China in 2010 suggests that about 12% of the adult population had diabetes and
about 50% of the total population had pre-diabetes (defined as impaired glucose tolerance, with
2-hour oral glucose tolerance test levels of 140-199 mg/dL [7.8–11.0 mmol/L], and impaired
fasting glucose, with fasting glucose levels of 100-125 mg/dL [5.6–6.9 mmol/L]) [17].
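The bracketed SI values quoted in this chapter follow from the molar mass of glucose (about 180.16 g/mol), so that 1 mmol/L corresponds to roughly 18 mg/dL. The sketch below illustrates the conversion and a simple range check based on the cut-offs quoted above. It is a hedged illustration, not the survey's algorithm: the function names are hypothetical, and treating either an elevated fasting value or an elevated 2-hour value as sufficient for the pre-diabetes range is an assumption.

```python
GLUCOSE_MOLAR_MASS_G_PER_MOL = 180.16  # C6H12O6


def mg_dl_to_mmol_l(mg_dl: float) -> float:
    """Convert a glucose concentration from mg/dL to mmol/L.

    mg/dL -> mg/L (multiply by 10) -> mmol/L (divide by g/mol).
    """
    return mg_dl * 10.0 / GLUCOSE_MOLAR_MASS_G_PER_MOL


def glycaemic_range(fasting_mg_dl: float, ogtt_2h_mg_dl: float) -> str:
    """Place a pair of results into a range using the cut-offs in the text.

    Assumption: either test alone is enough to reach a range.
    """
    if fasting_mg_dl >= 126 or ogtt_2h_mg_dl >= 200:
        return "diabetes range"
    if fasting_mg_dl >= 100 or ogtt_2h_mg_dl >= 140:
        return "pre-diabetes range"
    return "below pre-diabetes range"


if __name__ == "__main__":
    for value in (100, 125, 126, 140, 199, 200):
        print(value, "mg/dL =", round(mg_dl_to_mmol_l(value), 1), "mmol/L")
    print(glycaemic_range(fasting_mg_dl=108, ogtt_2h_mg_dl=132))
```

Rounded to one decimal place, the converted values reproduce the bracketed figures in the text (5.6, 6.9, 7.0, 7.8, 11.0 and 11.1 mmol/L).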
The estimated number of females (20-79 years) living with DM in 2017 was 204 million.
Gestational diabetes, defined as hyperglycaemia with onset or first recognition during pregnancy,
significantly increases the risk of T2DM development in both the woman and the child.
According to IDF 2017 estimates, about 16% of live births were affected by some form of
hyperglycaemia in pregnancy, and 1 in every 7 births was affected by gestational diabetes.
Compared to women who did not have gestational diabetes, a 7-fold increased risk of developing
T2DM was observed in those who did have it [18-20]. In the children of women with gestational
diabetes, exposure to intrauterine hyperglycaemia was associated with an 8-fold higher risk of
developing diabetes/prediabetes at 19-27 years of age [21].
The American Diabetes Association estimated that people with DM have medical expenditures more
than twice as high as they would have in the absence of DM [10]. The costs of DM present an
immense problem to patients, health systems, and the community in general [9]. In the US, people
with DM spend on average USD 16,750 per year, 57% of which is attributable to diabetes
[10].
1.3 COMPLICATIONS OF TYPE 2 DIABETES
Patients with T2DM are at increased risk of developing a number of comorbidities and life-
threatening complications [22]. The short- and long-term complications associated with T2DM
are many. Traditionally, macrovascular (cardiovascular) diseases and microvascular diseases
have been considered as the primary complications associated with T2DM. While a number of
clinical trials and epidemiological outcome studies have established the significantly increased
microvascular risk in patients with T2DM [23, 24], the evidence of the long-term
macrovascular benefits of tight glucose control in patients with T2DM is less clear [25-27].
1.3.1 Microvascular complications
Microvascular complications of long-term hyperglycaemia occur due to damage to small blood
vessels leading to neuropathy, retinopathy, and nephropathy. Diabetic neuropathy is
characterised by progressive loss of nerve fibres affecting the peripheral nerves and the
autonomic neurons. It was estimated that 60-70% of people with DM develop some form of
neuropathy [28]. Diabetes-associated longstanding peripheral neuropathy increases the chance
of foot ulcer (“diabetic foot”), infection and eventual need for limb amputation [29]. Diabetic
retinopathy affects the peripheral retina and/or macula, leading to partial or total vision loss. It
was estimated that 2.6% of global blindness can be attributed to diabetes [30]. Diabetic
nephropathy is characterised by angiopathy of the capillaries in the kidney glomeruli. It is the
most common cause of kidney failure in developed countries [28]. The first indication of
nephropathy is typically microalbuminuria, which further worsens to albuminuria (at a rate of
2-3% per year), and eventually leads to renal failure [29]. The risk of microvascular
complications increases with age, DM duration, and blood glucose level. Randomised
controlled trials (RCTs), including the Diabetes Control and Complications Trial (DCCT) and the
UK Prospective Diabetes Study (UKPDS), have indicated that tight glycaemic control (HbA1c
<7% [53 mmol/mol]) in patients with diabetes reduces the risk of microvascular complications
[31-34]. Lowering HbA1c to 6% [42 mmol/mol] is associated with further reductions in the
risk of microvascular complications, although of a much smaller magnitude [1].
1.3.2 Macrovascular complications
Atherosclerosis of large vessels due to long-term hyperglycaemia leads to ischaemic heart
disease, cerebrovascular disease (stroke), and peripheral vascular disease. Cardiovascular
disease (CVD) is a leading cause of death and disability among people with T2DM [22, 35].
While patients with T2DM have an increased cardiovascular (CV) risk profile, DM itself is
considered an independent CVD risk factor [36].
In patients with T2DM, CVD develops about 14 years earlier and with greater severity, compared to
individuals without diabetes [37-39]. After controlling for traditional CV risk factors, patients
with T2DM have more than twice the risk of major CV events compared to the general
population [40]. In patients with T2DM, peripheral artery disease (occlusion of the lower-
extremity arteries) and heart failure (impaired cardiac pump function) are the most common
initial manifestations of CVD [41]. Risk of CVD increases with age and DM duration. Some
studies report that the presence of microvascular complications increases the risk of CVD as
well [28]. Several large trials (ACCORD, ADVANCE, VADT, DCCT, UKPDS) designed to
address CVD-related concerns have shown no beneficial effect of tight glucose control on
CV events [33, 42-44]. Nonetheless, follow-up data, meta-analyses, and prospective
observational studies suggest positive effects of tighter glycaemic targets on CVD risk,
especially in those with shorter DM duration and no history of severe hypoglycaemia [1, 22,
45, 46].
1.4 TREATMENT OF TYPE 2 DIABETES
Glycaemic targets:
According to International and American Diabetes Association guidelines [1, 47], adults with
T2DM are recommended to achieve HbA1c < 7% [53 mmol/mol]. Selected adults with a shorter
duration of diabetes and no significant CVD may be recommended to maintain HbA1c < 6.5%
[48 mmol/mol]. Less stringent targets (e.g. < 8% [64 mmol/mol]) may be appropriate for
patients for whom the 7% target is difficult to achieve due to extensive comorbidities, a history
of severe hypoglycaemia, and/or limited life expectancy.
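The HbA1c targets above are quoted in both percentage (NGSP) and mmol/mol (IFCC) units. A minimal sketch of the conversion between the two follows, assuming the commonly used IFCC-NGSP master equation (IFCC = 10.93 × NGSP − 23.5); the function name is hypothetical and the snippet is illustrative only.

```python
# Illustrative sketch: convert HbA1c from NGSP units (%) to IFCC units (mmol/mol).
# Assumption: the IFCC-NGSP master equation, IFCC = 10.93 * NGSP - 23.5.
def hba1c_percent_to_mmol_mol(percent: float) -> int:
    """Return HbA1c in mmol/mol, rounded to the nearest integer."""
    if percent <= 0:
        raise ValueError("HbA1c percentage must be positive")
    return round(10.93 * percent - 23.5)


if __name__ == "__main__":
    # Targets quoted in the text: 7%, 6.5% and 8%.
    for target in (7.0, 6.5, 8.0):
        print(f"{target}% = {hba1c_percent_to_mmol_mol(target)} mmol/mol")
```

Under this assumption the converted values agree with the bracketed mmol/mol figures given alongside each target.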
HbA1c testing is recommended at least twice a year for patients who meet treatment targets,
and quarterly for those who do not meet targets or whose therapy has changed.
Lifestyle modifications, including dietary considerations and physical activity, are initially
recommended to prevent or delay T2DM [1]. Metformin (MET), if not contraindicated, is
widely accepted as the first-choice anti-diabetic drug (ADD) since it does not cause weight
gain or hypoglycaemia and may improve macrovascular outcomes [48]. However, the progressive
deterioration of diabetes generally leads to the need for further treatment intensification. In
patients with a new diagnosis of T2DM, the UKPDS has shown that approximately half of
the people maintain acceptable glucose levels after 3 years of monotherapy; however, after 9
years the proportion declines to only one quarter of patients [49]. Guidelines recommend
intensifying anti-diabetic therapy when treatment targets are not met within 3-6 months of
monotherapy [1].
Successful treatment of T2DM is generally complicated by treatment-related adverse effects
(hypoglycaemia, weight gain) and the progressive nature of the disease. Many patients
eventually require therapy intensification with another drug; however, a consensus among
physicians on the choice of that drug has not been achieved [50].
The six common second-line options currently used for therapy intensification after metformin are
sulfonylurea (SU), thiazolidinedione (TZD), sodium glucose co-transporter 2 inhibitor (SGLT-
2i), glucagon-like peptide-1 receptor agonist (GLP-1RA), dipeptidyl peptidase-4 inhibitor
(DPP-4i) or insulin (INS); other drugs are recommended under specific conditions. MET, SU,
and insulin represent the ‘old agents’; TZD has been used for the last decade, especially in
Asian countries. GLP-1RA, DPP-4i, and SGLT-2i represent the ‘novel agents’. All of the
agents, used alone or in combination, are associated with different adverse events including
hypoglycaemia (SU and insulin), weight gain (SU, insulin and TZD), gastrointestinal side
effects (MET, GLP-1RA) and increased risk of fractures (TZD) [51, 52].
1.5 INCRETIN-BASED THERAPIES
Glucagon-like peptide-1 (GLP-1) and glucose-dependent insulinotropic polypeptide (GIP) are
gut-derived hormones, also called incretins, which induce insulin secretion in a glucose-
dependent manner as a response to nutrient ingestion. Additionally, GLP-1 inhibits glucagon
secretion, slows gastric emptying, and increases satiety [53]. GLP-1 and GIP are degraded
within 2-3 minutes by the enzyme dipeptidyl peptidase-4 (DPP-4).
Incretin-based therapies are represented by two classes: oral DPP-4i and subcutaneous
GLP-1RA. The former increases effective levels of incretins by targeting and inactivating DPP-4,
while the latter increases insulin release through direct action on GLP-1 receptors [54]. These
therapies have been a focus of attention over the last several years because of their unique
mechanisms of action [52, 55-59]. GLP-1RAs stimulate insulin secretion and inhibit glucagon
release in a strictly glucose-dependent manner. The pancreatic effects also include increased
beta-cell proliferation, and decreased beta-cell apoptosis [56, 60, 61].
Several GLP-1RAs have been approved for the treatment of patients with T2DM. Exenatide,
the first GLP-1RA representative, was approved in April 2005 by the US Food and Drug
Administration (FDA), and in October 2006 by the European Medicines Agency (EMA).
FDA- and EMA-approved agents are summarised in Table 1.1. GLP-1RAs differ in
structure, and may also be distinguished by duration of action: short-acting (once- or twice-daily
administration) and long-acting (once-weekly administration). The first DPP-4i was
sitagliptin, approved in October 2006 and March 2007 by the FDA and EMA, respectively [62].
DPP-4is are administered once daily, and all DPP-4is are available in combination with
metformin.
Table 1.1
Incretin-based medications approved in the US and EU
GLP-1RA class | DPP-4i class
Exenatide (Byetta®, Bydureon®) | Sitagliptin (Januvia®)
Liraglutide (Victoza®) | Vildagliptin (Galvus®)
Lixisenatide (Lyxumia®) | Saxagliptin (Onglyza®)
Albiglutide (Eperzan®, Tanzeum®) | Linagliptin (Trajenta®)
Dulaglutide (Trulicity®) | Alogliptin (Vipidia®, Nesina®)
Semaglutide (Ozempic®) |
Table source: Scheen, A.J., Cardiovascular outcome studies with incretin-based therapies: comparison between
DPP-4 inhibitors and GLP-1 receptor agonists. Diabetes Research and Clinical Practice, 2017. 127: p. 224-237
[63].
1.6 GLYCAEMIC EFFECTS OF INCRETIN-BASED THERAPIES
Incretin-based therapies have demonstrated their ability to significantly reduce glucose levels
while maintaining a low risk of hypoglycaemia in patients with T2DM.
GLP-1RAs reduce fasting plasma glucose by 1.4-3.4 mmol/L and HbA1c by 0.8-1.8% [64].
HbA1c reductions are 0.5-1.0% and 1.5-2.0% with short- and long-acting exenatide, and
around 0.8-1.5% with liraglutide [65, 66]. Short-acting agents have a greater effect on
postprandial glucose levels mainly through inhibition of gastric emptying, while long-acting
GLP-1RAs have a greater effect on fasting glucose levels mainly through their insulinotropic
and glucagonostatic actions [65, 67]. In a direct head-to-head study, patients treated with
liraglutide achieved greater reductions in HbA1c and fasting glucose, compared to those treated
with exenatide [67]. In a review of direct head-to-head trials of GLP-1RAs (n=9), Madsbad
(2016) reports greater reductions in HbA1c with liraglutide than with exenatide formulations
and albiglutide, and no differences in HbA1c reductions between liraglutide and dulaglutide
[68].
DPP-4i reduce fasting plasma glucose levels by 1.0-1.4 mmol/L and HbA1c by 0.5-1.1% [64].
HbA1c reductions with sitagliptin are 0.6-0.8%, with saxagliptin 0.4-0.8%, with linagliptin
0.5-0.7%, and with alogliptin 0.5-0.9% [65, 66]. There are no head-to-head trials comparing DPP-4i
agents [69]; however, several systematic reviews and meta-analyses reported similar efficacy
and safety of DPP-4i agents [70, 71]. Several head-to-head trials were conducted to compare
GLP-1RA representatives with DPP-4i agents, where the GLP-1RA class demonstrated greater
glucose reductions than DPP-4i [66].
1.7 CARDIO-METABOLIC EFFECTS OF INCRETIN-BASED THERAPIES
GLP-1 receptors are widely expressed throughout the human body and their presence on
coronary artery endothelial cells may benefit ischemic conditioning [72, 73]. Data from animal
models, pre-clinical and exploratory studies suggest potential CV benefits by improved
endothelial and myocardial function [74-76], improved left ventricular ejection fraction and
wall indices [55, 77], decreased levels of inflammatory markers and atherosclerosis [62, 78],
and recovery of failing and ischemic hearts [76, 79-81].
DPP-4i were shown to be weight-neutral and GLP-1RAs were shown to significantly reduce
body weight by 2-4 kg over 6 months of therapy [60, 63]. As the T2DM population has an
increased burden of obesity, an independent risk factor for CVD, incretin-based therapies
represent a favourable therapeutic option compared to agents that cause weight gain. These
therapies were also reported to decrease blood pressure (BP) [62, 78], which might be
independent of weight reductions [82, 83]. The Liraglutide Effect and Action in Diabetes trials
1-5 demonstrated systolic blood pressure reductions of 3.6-6.7 mmHg [73]. Also, incretin-based
therapies demonstrated modest improvements in total cholesterol, LDL-cholesterol,
HDL-cholesterol, and triglyceride profiles [62, 66, 67, 84]. However, a small increase in heart
rate of 2-4 beats/minute has been associated with GLP-1RA treatment [84, 85].
Nonetheless, the beneficial effects on CV risk suggested by these promising studies in animals and
humans have not yet been translated into clinical evidence. Chapter 2 presents a summary of current
evidence on the association of incretin-based therapies with CV risk.
1.8 AIMS AND OBJECTIVES
Most of the patients with T2DM require intensified treatment with multiple ADDs, apart from
medications for CV risk factor control, as the disease progresses. While metformin is
recommended as the first-line ADD, the guidelines suggest at least six possible options for
therapy intensification, where evidence is primarily drawn from RCTs. While a great number
of studies report beneficial effects of one medication brand over placebo or another brand,
patient and practitioner decisions on intensification therapy have become more complicated
than ever. At the same time, patient choices, disease management, and cardio-metabolic
outcomes significantly differ in the real-world scenario compared to RCT practices.
The aim of this project was to explore the cardio-metabolic effects of treatment with incretin-based
drugs, compared to other treatment options, in the real-world setting.
Based on large patient-level electronic medical records (EMRs) from primary and ambulatory
care systems, the objectives of this real-world-data-based pharmaco-epidemiological project
combine both comparative effectiveness and outcome studies, while addressing a host of
methodological challenges in dealing with large EMRs.
Objective 1: To develop and validate data mining techniques to extract and analyse
longitudinal prescription data from an EMR database.
Objective 2: To develop algorithms and machine learning techniques to identify disease
cohorts from an EMR database.
Objective 3: To explore temporal trends in anti-diabetic drug prescribing and intensification
patterns, along with glycaemic levels and comorbidities by class of anti-diabetes drug.
Objective 4: To explore long-term dynamics of glucose control and its sustainability in the
following treatment groups:
Group 1: Metformin plus GLP-1RA
Group 2: Metformin plus DPP-4i
Group 3: Metformin plus Insulin
Group 4: Metformin plus Sulfonylurea
Objective 5: To explore long-term burden of blood pressure, low density lipoprotein, and
triglycerides in the above-mentioned groups, and the association of such control with risk of
major adverse cardiovascular events.
SGLT-2 inhibitors, first approved in 2013, have demonstrated glycaemic and extra-glycaemic
benefits. Additionally, some class representatives demonstrated renal protection and
association with reduced risk of CV events. However, due to data constraints and limited
follow-up time, this therapy group could not be included for robust comparative analyses
outlined in Objectives 4 and 5.
1.9 METHODOLOGICAL BACKGROUND
Patient-level data from electronic medical records (EMRs) collected from 1995 to 2016 across
all states of the US during routine primary and ambulatory care were used throughout this
project. This subsection is devoted to a general description of the role, scientific value, and
limitations of evidence based on real-world data. Chapter 3 describes the data in detail.
1.9.1 Scientific value of real-world evidence
According to the FDA, real-world data is defined as any data related to patient health status
and/or delivery of health care that is routinely collected from various sources [86]. These
sources include EMRs, claims and billing systems, product and disease registries, health-
monitoring devices, and health-related applications [87]. Evidence based on analysing such
data includes studies on therapeutics, disease-related outcomes, safety, cost-effectiveness,
epidemiology, patient-care, and delivery systems.
The nature of real-world evidence is very different from that of RCTs, with important advantages
but also disadvantages relative to RCTs. RCTs are considered the gold standard for testing
hypotheses not only because baseline randomisation supports conclusions of causality, but also
because of the tight control over measurement and clinical conduct and the ease of
communicating results [88]. Tests of safety and efficacy of an intervention in an RCT are
considered to be bias free, and provide a reliable source of internal validity. However, RCTs
are often conducted with specific populations and findings may be less generalisable to broader
populations, apart from being costly and requiring a long time to complete. Real-world based
studies provide opportunities to observe health outcomes in populations that are often excluded
from RCTs, such as pregnant, older, or co-morbid patients. These studies also allow exploration
of research questions that may be unethical to test in RCTs, for instance the outcomes of a
delay or failure in treatment intensification [89, 90]. While RCTs are conducted in a specialised
environment and assess whether a treatment may work, real-world studies observe whether
interventions actually work in everyday clinical practice. For example,
Edelman and colleagues reported that HbA1c reductions observed in RCTs are much higher
than in the real-world scenario: 1.25% vs 0.52% for GLP-1RA and 0.68% vs 0.51% for DPP-4i
[91, 92]. The authors suggest poor medication adherence as a key driver of such a disconnect.
Several countries (UK, Sweden, Estonia) have implemented nationwide “birth-to-death”
EMRs for nearly every citizen, which brings a unique opportunity to observe population-level
behaviour, effectiveness of changes in health care policies, and health management costs in
addition to population-level safety, effectiveness, and health-related long-term outcome
research [93-96]. Large population level databases also provide an opportunity to bring
together benefits of EMRs and RCTs by randomising and recruiting patients from EMRs to
protocol-driven RCTs in a convenient, fast, and cost-effective manner [87].
The increasing role of real-world data in health care decisions has led the European Union to
establish a project to monitor adverse drug reactions (EU-ADR) using EMRs from the
Netherlands, Denmark, United Kingdom, and Italy [97]. Another example of combining
national registry data (US, Norway, Denmark, Sweden, Germany, UK) is the CVD-REAL study,
which was designed to assess whether positive outcomes observed in a completed RCT are also
applicable to the broader population in real-world practice [98].
1.9.2 Limitations and challenges of electronic medical records
Health-related data from EMRs reflect the complex multi-factorial relationships of everyday
clinical practice, which brings challenges in the design, analysis and interpretation of findings
[88]. Confounding and data quality limit the ability to conclude direct causation, and in this
sense, EMR-based studies should be interpreted with caution. EMRs are collected during
routine medical care and usually extensively capture demographics, medication prescriptions,
diagnoses and procedures, laboratory and anthropometric measures. However, EMR data are
prone to (1) loss of follow-up, (2) misdiagnosis, misclassification and miscoding, (3) missing
data on certain variables, (4) unreliable data on some relevant variables, and (5) biases and
confounding.
Follow-up in EMRs may be lost when a patient moves to a different location or transfers
out of a practice. While nationwide EMRs lose follow-up when a patient moves to another
country, commercial EMRs are not able to track patient records once the patient moves to a
practice that does not contribute data to the particular EMR network. Due to the nature of general
practitioner settings, some variables are recorded more often than others. For example, blood
pressure measurements are taken at almost every general practitioner encounter because of the
relative ease with which they can be taken. At the same time, information on diet and exercise,
disease activity progression, or medication dose escalations is entered into EMRs less
often. Also, patients are prone to provide unreliable data on drug abuse, smoking habits, and
alcohol consumption. Laboratory and anthropometric measures may be conducted on different
equipment and may follow different procedures. Miscommunications, errors during the data-entry
process, and non-attendance of scheduled visits are part of routine medical care and introduce
additional errors into data from EMRs.
Real-world studies are prone to various sources of bias: some reflect the nature of data
collection (e.g. specific insurance or clinic), some may be reduced with
careful study design (e.g. immortal time bias), and others may be reduced with advanced data
mining and statistical methodologies (e.g. information bias). In a recent publication, Verheij
and colleagues discuss possible sources of bias in EMR-based research and categorise them
as presented in Table 1.2 [99].
Table 1.2
Possible sources of bias in Electronic Medical Record data
Reimbursement system, pay for performance parameters
Role of general practitioner in the health care system; gatekeeping / nongatekeeping
Professional clinical guidelines
Ease of access by patients to their records
Data sharing between health care providers
Practice workload
Variations between EMR system functionalities and lay-out
Coding systems and thesauruses
Knowledge and education regarding the use of EMR systems
Data extraction tools
Data processing
Research dataset preparation
Research methodologies
Table source: Verheij, A.R., et al., Possible Sources of Bias in Primary Care Electronic Health Record Data Use
and Reuse. J Med Internet Res, 2018. 20(5): p. e185. [99]
1.10 THESIS STRUCTURE AND LOGICS
This project is designed as a series of methodological and pharmaco-epidemiological studies
that were conducted using a large database of EMRs. Chapter 2 provides a literature review on
the association of treatment with incretin-based therapies with CV risk.
Chapters 3-6 are devoted to data science. Chapter 3 introduces the database and basic data
management considerations. Chapter 4 describes the algorithm developed to extract and
aggregate medication information at individual patient-level. The information on ADD use was
obtained for all patients in the database (~34 million patients), and these data were incorporated
in the algorithm to identify a robust cohort of patients with diabetes (chapter 5). For this cohort
of patients, chapter 6 reports the patterns of missingness in the longitudinal laboratory and
anthropometric measures, and compares performance of several multiple imputation
techniques.
Chapters 3-6 describe the data groundwork that was essential in order to draw reliable clinical
inferences from voluminous and complex EMRs. These methods were part of the data
preparation for each pharmaco-epidemiological study described in chapters 7-9 and
appendices. Each of these clinical studies has its own design and methodology (data mining
and statistical), described separately within the respective chapters. Note that each chapter’s
database description is repetitive and presents a compressed version of chapter 3.
Chapter 7 explores longitudinal trends in the use of ADDs, glycaemic control, and patients’
characteristics with respect to the drug initiation order. Chapter 8 focuses on the glycaemic
control and its sustainability comparing second-line treatment options. Chapter 9 explores
cardio-metabolic risk factor burden at population level and cardio-metabolic risk factor control
by class of second-line ADD. It also explores the association between cardio-metabolic risk factor
burden and the risk of CV events. Finally, chapter 10 summarises the results, concludes the
work conducted, and discusses future directions.
Chapter 2: Literature Review
2.1 CLINICAL TRIALS
Prior to 2008, the approval of new ADDs was based on improvements in glycaemia with
detailed investigation of adverse events. The trials were usually 6 months long, where presence
of CVD was often an exclusion criterion [100]. In 2007, a meta-analysis of 43 studies reported
a significant increase in the risk of myocardial infarction in patients treated with rosiglitazone
(TZD class), and a non-significant increase in the risk of CV death [101]. This controversial
publication generated enormous public reaction, which resulted in the FDA recommending in 2008
that long-term CV safety trials be conducted, or other equivalent evidence provided, to support the
CV safety of new anti-diabetic agents. The guidance document suggested a meta-analysis of phase
2 and 3 trials to rule out CV risk as a default option, with an additional CV safety trial
needed only when the data are insufficient [100].
2.1.1 Cardiovascular outcome trials
In practice, a large dedicated CV safety trial has been conducted for every novel agent. To
date, 3 large CV outcome trials have been completed for DPP-4i agents (Table 2.1) and 4 for
GLP-1RA agents (Table 2.2).
Table 2.1
Completed cardiovascular outcome trials for DPP-4i in patients with type 2 diabetes
 | SAVOR-TIMI53 | EXAMINE | TECOS
Drug | Saxagliptin | Alogliptin | Sitagliptin
Primary Endpoint | 3-point MACE | 3-point MACE | 4-point MACE
N | 16,492 | 5,380 | 14,671
Follow-up, years | 2.1 | 1.5 | 3
Inclusion:
Minimum Age, years | 40 | 18 | 50
HbA1c, % | ≥6.5 | 6.5-11.0 | 6.5-8.0
Cardiovascular Status | Pre-existing CVD or high CV risk | Acute Coronary Syndrome 15-90 days before | Pre-existing CVD
Mean at baseline:
BMI, kg/m2 | 31.1 | 28.7 | 30.2
Age, years | 65 | 61 | 65.5
HbA1c, % | 8 | 8 | 7.2
Outcome, Hazard Ratio (95% CI):
Primary composite | 1.00 (0.89–1.12) | 0.96 (≤1.16)* | 0.98 (0.89–1.08)
Myocardial infarction | 0.95 (0.80–1.12) | 1.08 (0.88–1.33) | 0.95 (0.81–1.11)
Stroke | 1.11 (0.88–1.39) | 0.95 (≤1.14)* | 0.97 (0.79–1.19)
Heart Failure | 1.27 (1.07–1.51) | 1.07 (0.79–1.46) | 1.00 (0.83–1.20)
CV Mortality | 1.03 (0.87–1.22) | 0.85 (0.66–1.10) | 1.03 (0.89–1.19)
All-cause mortality | 1.11 (0.96–1.27) | 0.88 (0.71–1.09) | 1.01 (0.90–1.14)
*one-sided repeated CI, at an alpha level of 0.01
MACE: major cardiovascular event
3-point MACE: CV death, Myocardial Infarction, or Stroke
4-point MACE: CV death, Myocardial Infarction, Unstable Angina, or Stroke
The trials were designed to assess CV safety of novel agents over placebo in patients with
established CVD or high CV risk. With a median follow-up of 1.5-3.8 years and an average of
10,000 patients, all completed RCTs demonstrated CV safety. In the SAVOR-TIMI-53 trial,
increased rates of hospitalisation for heart failure were observed in the saxagliptin arm
compared to placebo with HR (95% CI) of 1.27 (1.07, 1.51) [102]. Notably, neither this nor
other CV safety trials included heart failure as a primary or secondary end point [103]. Upon
showing non-inferiority, secondary analyses of the LEADER trial demonstrated superiority of
liraglutide compared to placebo with HR (95% CI) for 3-point MACE (CV death, myocardial
infarction, or stroke) of 0.87 (0.78, 0.97) [104]. Notably, significantly lower HbA1c (0.4%),
body weight (2.3 kg), and systolic blood pressure (1.2 mmHg) were achieved in the liraglutide
arm compared to the placebo [63].
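The hazard ratios quoted in this section are read against their 95% confidence intervals: an interval lying entirely below 1 indicates a significantly lower risk, and one entirely above 1 a significantly higher risk. The sketch below illustrates this reading, together with an approximate z statistic recovered from the interval on the log scale. The function names are hypothetical, the z approximation assumes a symmetric normal-theory interval for log(HR), and the snippet is not the method used in the cited trials.

```python
import math


def interpret_hazard_ratio(hr: float, lower: float, upper: float) -> str:
    """Classify a hazard ratio by whether its confidence interval excludes 1."""
    if not (0 < lower <= hr <= upper):
        raise ValueError("expected 0 < lower <= hr <= upper")
    if upper < 1:
        return "lower risk"
    if lower > 1:
        return "higher risk"
    return "no significant difference"


def approximate_z(hr: float, lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Approximate z statistic for log(HR), assuming a symmetric 95% CI on the log scale."""
    standard_error = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return math.log(hr) / standard_error


if __name__ == "__main__":
    # Values quoted in the text: LEADER 3-point MACE and SAVOR-TIMI-53 heart failure.
    print(interpret_hazard_ratio(0.87, 0.78, 0.97))  # liraglutide vs placebo
    print(interpret_hazard_ratio(1.27, 1.07, 1.51))  # saxagliptin vs placebo
```

Applied to the figures above, the LEADER estimate falls in the "lower risk" category and the SAVOR-TIMI-53 heart failure estimate in the "higher risk" category, consistent with the text.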
Table 2.2
Completed cardiovascular outcome trials for GLP-1RA in patients with type 2 diabetes
 | ELIXA | LEADER | SUSTAIN-6 | EXSCEL
Drug | Lixisenatide | Liraglutide | Semaglutide | Exenatide
Primary Endpoint | 4-point MACE | 3-point MACE | 3-point MACE | 3-point MACE
N | 6,068 | 9,340 | 3,297 | 14,752
Follow-up, years | 2.1 | 3.8 | 1.9 | 3.2
Inclusion:
Minimum Age, years | 30 | 50 | 50 | 18
HbA1c, % | ≥7 | ≥7 | ≥7 | 6.5-10.0
Cardiovascular Status | Acute Coronary Syndrome 0-180 days before | Pre-existing CVD or high CV risk | Pre-existing CVD or high CV risk | Pre-existing CVD or high CV risk
Mean at baseline:
BMI, kg/m2 | 30.2 | 32.5 | 31.1 |
Age, years | 60 | 64.3 | 64.6 |
HbA1c, % | 7.7 | 8.7 | 8.7 | 8.0
Outcome, Hazard Ratio (95% CI):
Primary composite | 1.02 (0.89–1.17) | 0.87 (0.78–0.97) | 0.74 (0.58–0.95) | 0.91 (0.83–1.00)
Myocardial infarction | 1.03 (0.87–1.22) | 0.86 (0.73–1.00) | 0.74 (0.51–1.08) | 0.97 (0.85–1.10)
Stroke | 1.12 (0.79–1.58) | 0.86 (0.71–1.06) | 0.61 (0.38–0.99) | 0.85 (0.70–1.03)
Heart Failure | 0.96 (0.75–1.23) | 0.87 (0.73–1.05) | 1.11 (0.77–1.61) | 0.94 (0.78–1.13)
CV Mortality | 0.98 (0.78–1.22) | 0.78 (0.66–0.93) | 0.98 (0.65–1.48) | 0.88 (0.76–1.02)
All-cause mortality | 0.94 (0.78–1.13) | 0.85 (0.74–0.97) | 1.05 (0.74–1.50) | 0.86 (0.77–0.97)
MACE: major cardiovascular event
3-point MACE: CV death, Myocardial Infarction, or Stroke
4-point MACE: CV death, Myocardial Infarction, Unstable Angina, or Stroke
These trials provide very valuable clinical evidence of CV safety with novel agents.
Importantly, none of them was designed to demonstrate CV superiority, and only patients at
increased CV risk were recruited in these RCTs; much longer trials would be required for a
low-risk population. Also, most of the patients with T2DM were on a background of
cardio-protective and lipid-modifying drugs, and those in the placebo group were more likely to
receive other (older) agents for treatment intensification.
2.1.2 Non-cardiovascular outcome trials
Analyses of non-CV outcome trials with shorter duration that included patients with much
lower CV risk demonstrated CV safety or superiority of incretin-based drugs over comparators.
A meta-analysis of 70 trials on DPP-4i with at least 24 weeks of follow-up reported an HR (95%
CI) of 0.71 (0.59, 0.86) for major cardiovascular events (MACE) against placebo or other
comparators [105]. A meta-analysis of RCTs on patients who used GLP-1RA for a minimum
duration of 6 months reported an HR (95% CI) of 0.78 (0.54, 1.13) for MACE against placebo
or other comparators [106]. Another recent meta-analysis that included 281 RCTs on treatment
with incretin-based drugs for ≥ 12 weeks reported an odds ratio (95% CI) of 0.89 (0.80, 0.99) for
the risk of CV events favouring GLP-1RA use against placebo [107]. This meta-analysis also
reported an odds ratio (95% CI) of 0.92 (0.83, 1.01) for CV events for DPP-4i against placebo.
In an overview of reviews, Gamble and colleagues assessed the quality of systematic reviews
evaluating the safety, efficacy and effectiveness of incretin-based therapies [108]. A total of 83
pooled treatment effect estimates from 10 systematic reviews on CV outcomes were analysed,
where none received a high-quality Assessing the Methodological Quality of Systematic
Reviews (AMSTAR) score. The study reported that most of the pooled estimates suggested a
potentially decreased risk (41 of 45 for DPP-4i and 28 of 38 for GLP-1RA), while only a few
(18 of 41 for DPP-4i and 3 of 28 for GLP-1RA) were statistically
significant. The authors suggested possible overestimation in the results and possible
publication bias in the analysed reviews.
2.2 OBSERVATIONAL STUDIES
Table 2.3 presents a summary of observational studies that explored CV risk of treatment with
incretin-based therapies. Overall, conclusions are consistent with CV-outcome trial results –
treatment with incretin-based therapies does not increase and possibly reduces CV risk.
Multiple factors such as study design, available data, data management methodology and
statistical approaches make direct comparison of these studies very difficult. Patorno and
colleagues (2016) compared CV outcomes of treatment with GLP-1RAs with DPP-4i, SU, and
INS under the same study design [109]. GLP-1RA users were 1:1 matched to other treatment
groups (allowing patient overlap), and during 0.8 years of follow-up there were no significant
differences in the CV risk between the groups. Kannah and colleagues (2016) compared
MET+SU, MET+TZD, MET+DPP-4i, and MET+GLP-1RA combinations using a Cox
regression approach with propensity score as adjustment covariate [110]. While there was no
difference in the risk of overall mortality and coronary artery disease between all groups,
compared to MET+SU, patients treated with MET+DPP-4i had a higher risk of heart failure with
a HR (95% CI) of 1.10 (1.04, 1.17). Notably, the methodological approach used in this study
is generally not recommended in the statistical literature [111-113]. Zghebi and colleagues
(2016) observed a non-significant reduction in the risk of major CVD or CV death for second-line
DPP-4i users compared to second-line SU users, using a Cox regression approach
weighted by the inverse probability of treatment [114]. The same study observed a significant
CV risk reduction for second-line TZD users compared to SU users. The most recent
observational study, which compared second-line GLP-1RA and DPP-4i users with SU users after metformin,
reported a lower CV risk with incretin-based therapies: the reduction was significant
for the DPP-4i group but non-significant for the GLP-1RA group [115].
Table 2.3
Summary of observational CV-outcome studies of treatment with incretin-based therapies
| Source | Drug (cohort size) | Comparator (cohort size) | Follow-up (yr) | Conclusion |
| --- | --- | --- | --- | --- |
| Comparative effectiveness of incretin-based therapies and the risk of death and cardiovascular events in 38,233 metformin monotherapy users. Gamble et al, 2016 [115] | GLP-1RA (added to MET), 487 | Sulfonylurea (added to MET), 25,916 | 2.7 | non-significant CV risk reduction |
| | DPP-4i (added to MET), 6,213 | Sulfonylurea (added to MET), 25,916 | 2.7 | significant CV risk reduction |
| Comparative risk of major cardiovascular events associated with second-line antidiabetic treatments: a retrospective cohort study using UK primary care data linked to hospitalization and mortality records. Zghebi et al, 2016 [114] | DPP-4i (added to MET), 1,030 | Sulfonylurea (added to MET), 6,740 | 2.4 | non-significant CV risk reduction |
| Comparative Cardiovascular Safety of Glucagon-Like Peptide-1 Receptor Agonists versus Other Antidiabetic Drugs in Routine Care: a Cohort Study. Patorno et al, 2016 [109] | GLP-1RA (added to MET), 18,658 | DPP-4i (added to MET), 69,807 | 0.8 | no significant difference in CV risk |
| | GLP-1RA (added to MET), 14,466 | SU (added to MET), 114,480 | 0.8 | no significant difference in CV risk |
| | GLP-1RA (added to MET), 29,343 | Insulin (added to MET), 42,982 | 0.8 | no significant difference in CV risk |
| The association of the treatment with glucagon-like peptide-1 receptor agonist exenatide or insulin with cardiovascular outcomes in patients with type 2 diabetes: a retrospective observational study. Paul et al, 2015 [116] | GLP-1RA: Exenatide with other OAD, 2,804; Exenatide with Insulin, 7,870 | Insulin (concomitant ADD allowed), 28,551 | 3.5 | reduced CV risk |
| Risk of cardiovascular disease events in patients with type 2 diabetes prescribed the Glucagon-Like Peptide 1 (GLP-1) receptor agonist exenatide twice daily or other glucose-lowering therapies: A retrospective analysis of the lifelink database. Best et al, 2011 [117] | GLP-1RA: Exenatide (concomitant ADD allowed), 21,754 | Other ADD (concomitant ADD allowed), 391,771 | - | reduced CV risk and all-cause hospitalization |
| Association of Anti-Diabetic Medications Targeting the Glucagon-Like Peptide-1 Pathway and Heart Failure Events in Patients with Diabetes. Velez, 2015 [118] | GLP-1RA or DPP-4i (concomitant ADD allowed), 1,426 | Other ADD (concomitant ADD allowed), 2,798 | 2 | reduced risk of hospitalization for HF, all-cause hospitalization or death |
| Risk of overall mortality and cardiovascular events in patients with type 2 diabetes on dual drug therapy including metformin: A large database study from the Cleveland Clinic. Kannah et al, 2016 [110] | GLP-1RA (added to MET), 433 | SU (added to MET), 9,419 | 4 | no significant difference in HF events |
| | DPP-4i (added to MET), 1,487 | SU (added to MET), 9,419 | 4 | increased risk of HF |
| Cardiovascular safety of combination therapies with incretin-based drugs and metformin compared with a combination of metformin and sulphonylurea in type 2 diabetes mellitus - a retrospective nationwide study. Mogensen et al, 2014 [119] | GLP-1RA (added to MET), 4,345 | SU (added to MET), 25,092 | 2.3 | non-significant CV risk reduction |
| | DPP-4i (added to MET), 11,138 | SU (added to MET), 25,092 | 3 | significant CV risk reduction |
| Association Between Hospitalization for Heart Failure and Dipeptidyl Peptidase-4 Inhibitors in Patients With Type 2 Diabetes: An Observational Study. Fu et al, 2016 [120] | DPP-4i (concomitant ADD allowed), 109,278 | SU (concomitant ADD allowed), 109,278 | 0.5 | no significant difference in CV risk |
| Sitagliptin use in patients with diabetes and heart failure: a population-based retrospective cohort study. Weir et al, 2014 [121] | DPP-4i: Sitagliptin + HF (concomitant ADD allowed), 887 | Other ADD + HF (not on Sitagliptin), 6,733 | 1.4 | increased risk of hospitalization for HF |
| All-cause mortality and cardiovascular effects associated with the DPP-IV inhibitor sitagliptin compared with metformin, a retrospective cohort study on the Danish population. Scheller et al, 2014 [122] | DPP-4i: Sitagliptin (monotherapy), 1,228 | MET (monotherapy), 83,528 | 1.3 | no significant difference in CV risk |
| Dipeptidyl peptidase-4 inhibitors do not increase the risk of cardiovascular events in type 2 diabetes: a cohort study. Kim et al, 2014 [123] | DPP-4i (concomitant ADD allowed), 39,769 | Other ADD (concomitant ADD allowed), 39,769 | 0.6 | significant CV risk reduction |
| Sitagliptin and the risk of hospitalization for heart failure: a population-based study. Wang et al, 2014 [124] | DPP-4i (concomitant ADD allowed), 8,288 | Other ADD (concomitant ADD allowed), 8,288 | 1.5 | increased risk of hospitalization for HF |
| Combination therapy with metformin plus sulphonylureas versus metformin plus DPP-4 inhibitors: association with major adverse cardiovascular events and all-cause mortality. Morgan et al, 2014 [125] | DPP-4i (added to MET), 7,864 | SU (added to MET), 33,983 | - | significant CV risk reduction |
2.3 CONCLUSIONS AND IMPLICATIONS
While the completed RCTs clearly demonstrated CV safety of incretin-based therapies in the
high-risk population, there is no clear evidence of CV benefits of these therapies. Several meta-
analyses of non-CV outcome trials and observational studies supported CV safety of treatment
with incretin-based therapies in broader populations. The risk of heart failure with DPP-4i is
not completely ruled out, and will remain under more careful monitoring in the future. There
is a trend towards CV superiority of treatment with incretin-based therapies, especially with
GLP-1RA. While an RCT designed to demonstrate CV superiority of GLP-1RA or DPP-4i over
placebo or another comparator is unlikely, a large multi-national observational study with long
follow-up could provide strong evidence of comparative CV superiority of one ADD class over
others. However, no such study has been conducted for either GLP-1RA or DPP-4i to date.
Given the multi-comorbid profile of patients with T2DM, it is now more urgent to explore whether
the introduction of novel drug classes to the market has helped to reduce the glycaemic and CV risk
factor burden at the population level. This dissertation is designed to assess such trends and their
reflection on the rates of CV events in patients treated with major second-line ADDs.
Chapter 3: Data Description
3.1 CENTRICITY ELECTRONIC MEDICAL RECORDS
Data from Centricity Electronic Medical Records (CEMR) was used in this thesis.
Centricity™ is a brand of 27 healthcare IT software solutions from General Electric Healthcare,
which incorporates software for independent physician practices, academic medical centres,
hospitals and large integrated delivery networks. It refers to the systematised data collection,
storing and secure transmitting of patient health information in a digital format [126].
The Medical Quality Improvement Consortium is a rapidly growing community of over 400
CEMR customers who contribute de-identified clinical data to the CEMR database in order to
enable quality improvement, benchmarking, and population-based medical research. The
database covers over 35,000 health care providers from all US states, where ~70% are primary
care providers.
CEMR database contains patient-level information on demographics (sex, ethnicity, year of
birth) and longitudinal entries on anthropometrics, diseases, clinical observations, laboratory
results, and medications (Figure 3.1). Variables such as BMI, blood pressure, HbA1c, urine
albumin and creatinine, or lipid profiles along with dates and other relevant information are
stored in the form of a relational database.
The database extract that captured longitudinal EMRs from January 1995 until October 2014
was used at the initial stages of the project. The database extract that captured patient history
for more than 34 million individuals with a mean 3.5 years of follow-up from January 1995
until April 2016, was used to achieve the main results of this project, reported in chapters 7-9.
Figure 3.1. Schematic representation of the data in CEMR database.
Representativeness of CEMR Database
In general, the database is representative of the US population in terms of age and ethnic
subgroups; however, higher proportions of patients from north-eastern and mid-western states
are represented in the CEMR [127]. The patients’ demographic characteristics in the CEMR
database are generally similar to those of the overall US population, with a slight bias towards
older, black, female and non-Hispanic patients. The distribution of CV risk factors was found to be
similar to the prospective national health surveys [128]. The representativeness of patients with
diabetes is discussed in chapter 5. CEMR has demonstrated its usefulness for various
epidemiological and outcome studies. It has been extensively used for academic research
worldwide in the fields of diabetes [129-132], CV research [133-135], obesity [136-138],
inflammatory diseases [139-142], and other diseases [143-146].
3.2 MEDICATION DATA
Medication data are extensively recorded in the CEMR and include names, doses, the dates of
prescriptions, and the number of repeat prescriptions for the whole period of the electronic
record (or available follow-up time). CEMR also stores data from patients’ medication lists,
which include over-the-counter medications and those received from outside the EMR
network. These data contain start/stop dates and specific fields to track treatment alterations.
Dose escalation for individual medications (e.g. increasing doses of MET) is captured.
However, the data on dose titration, especially for insulin, is relatively poor in primary care
databases. Chapter 4 provides a detailed description of the available data and discusses data
related issues associated with longitudinal information on medication usage from large
relational databases such as CEMR. The “chaining” approach (described in chapter 4) was used
to extract and aggregate medication information at patient-level throughout this project.
3.2.1 Drug identification
As the database captures data from various systems since 1995, medication data were entered
in various ways including, but not limited to, Generic Product Identifier codes and the National
Drug Codes. CEMR stores the original medication name and less frequently the generic name
from the EMR source system terminology reference database, as well as normalised names for
clinical drugs (RxNorm terminology system), when possible. Therefore, several medication
name related fields exist and may include generic name, brand name, free text comments or
missing entries. The following procedure was developed to identify medication
keys (unique identifiers) of anti-diabetic and other relevant drugs:
1. Identify highest level in the Anatomical Therapeutic Chemical Classification (ATC)
System [147] for a relevant drug category (Table 3.1).
Table 3.1
Therapeutic Class and highest corresponding ATC code
| Therapeutic Class | Highest ATC code |
| --- | --- |
| Anti-obesity preparations, excluding diet products | A08 |
| Anti-diabetic drug* | A10 |
| Antihypertensive drug | C02 |
| Diuretic drug | C03 |
| Peripheral vasodilator | C04 |
| Beta blocking agent | C07 |
| Calcium channel blocker | C08 |
| Angiotensin-converting-enzyme inhibitor | C09AA and C09B |
| Angiotensin | C09CA and C09D |
| Agents acting on the renin-angiotensin system | C09 |
| HMG CoA reductase inhibitors (Statin) | C10AA and C10B |
| Lipid modifying agents | C10 |
| Antidepressant drugs | N06A |
| Non-steroidal anti-inflammatory drug | M01A and B01AC06 |
2. For each therapeutic class obtain a list of generic names browsing lower ATC
categories.
3. Search generic names in the official FDA catalogue [148], and create lists of all
approved brand names. Link brands of combination products to each generic name.
4. Combine obtained generic and brand names in the list of keywords.
5. Text-mine CEMR to obtain sets of medication keys for each therapeutic class.
6. Manually review obtained lists and exclude inappropriate keys.
An illustrative example (steps 1-5) of identifying medication keys for Liraglutide is provided in
Figure 3.2.
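The keyword search in step 5 can be sketched as a case-insensitive match over the medication name fields. This is a minimal illustration only, not the procedure used in the thesis: the record layout (`med_key`, `names`), the helper `find_medication_keys`, and the sample rows are hypothetical, and the keywords are the generic name liraglutide with its brand names Victoza and Saxenda.

```python
import re

def find_medication_keys(records, keywords):
    """Return the sorted medication keys whose name fields mention any keyword."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    keys = set()
    for rec in records:
        # A record may carry several name-related fields; some may be missing (None).
        text = " ".join(name for name in rec["names"] if name)
        if pattern.search(text):
            keys.add(rec["med_key"])
    return sorted(keys)

# Hypothetical rows standing in for medication entries in the database.
records = [
    {"med_key": 101, "names": ["VICTOZA 18 MG/3ML", None]},
    {"med_key": 102, "names": [None, "liraglutide pen injector"]},
    {"med_key": 103, "names": ["Metformin HCl 500 mg", "metformin"]},
    {"med_key": 104, "names": ["Saxenda", "liraglutide (weight management)"]},
]
candidates = find_medication_keys(records, ["liraglutide", "victoza", "saxenda"])
print(candidates)
```

A manual review (step 6) would still be needed afterwards, for example to drop key 104 when Saxenda is excluded from the GLP-1RA group.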
Therapeutic class “Anti-diabetic drug” included eleven groups: MET, SU, TZD, Alpha
glucosidase inhibitor, amylin, Dopamine receptor agonist, Meglitinides, DPP-4i, GLP-1RA,
SGLT-2i, and INS. Saxenda (brand of Liraglutide, GLP-1RA) was excluded from the GLP-
1RA group as it was approved in 2014 as weight lowering medication only [149]. Although
Welchol (Colesevelam) was approved for the treatment of T2DM, it is usually prescribed to
reduce cholesterol levels; therefore Colesevelam was not considered as ADD in this project
[150].
Angiotensin-converting-enzyme inhibitors, agents acting on the renin-angiotensin system, beta
blocking agents, and statins were considered as cardio-protective drugs.
Figure 3.2. Schematic diagram of identifying list of medication keys for Liraglutide.
3.3 DISEASE DATA
CEMR database stores patients’ disease data by means of International Classification of
Diseases 9th Revision (ICD-9) codes, International Classification of Diseases 10th Revision
(ICD-10) codes, or less frequently with SNOMED Clinical Terms (SNOMED CT) codes.
Reliability of diagnosis coding in CEMRs for various diseases has been examined in prior
studies [94, 128, 136], therefore diagnostic codes were directly used to identify presence of a
disease. Nonetheless, additional advanced techniques were applied to improve the quality of
the cohort of patients with diabetes (chapter 5).
The history of disease before baseline of a particular study and disease events during follow-
up were constructed using the date of diagnosis of diseases. “Time to event” was calculated as
time from baseline till the first available diagnosis date for a particular disease. Disease events
included CV disease, chronic kidney disease (CKD) with its stage, cancer, depression and other
relevant diseases. Patients with diagnostic codes for bariatric surgery were also identified. CVD
was defined as ischaemic heart disease (including myocardial infarction), peripheral
vascular/arterial disease, heart failure or stroke.
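As a rough sketch of the two derived quantities described above, the following splits a patient's diagnosis dates for one disease at baseline. The function names are hypothetical, and counting a diagnosis dated exactly on baseline as a follow-up event is an assumption made here for illustration.

```python
from datetime import date

def has_history(baseline, diagnosis_dates):
    """True if the disease was diagnosed at any time before baseline."""
    return any(d < baseline for d in diagnosis_dates)

def time_to_event_days(baseline, diagnosis_dates):
    """Days from baseline to the first diagnosis on or after baseline, or None."""
    during_follow_up = [d for d in diagnosis_dates if d >= baseline]
    if not during_follow_up:
        return None  # no event observed during follow-up
    return (min(during_follow_up) - baseline).days

baseline = date(2010, 1, 1)
dates = [date(2009, 6, 1), date(2010, 3, 2), date(2011, 1, 1)]
print(has_history(baseline, dates), time_to_event_days(baseline, dates))
```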
3.3.1 Charlson Comorbidity Index
While controlling for comorbidities that may affect study outcome is essential, adjusting for
a large number of possible comorbidities may be problematic from clinical and methodological
points of view [151, 152]. Rather than adjusting for the effect of each comorbidity, several
methods have been proposed to control for overall comorbidity burden [151, 153, 154]. The
Charlson comorbidity index (CCI) is the most widely used comorbidity index in the medical
literature [155, 156].
CCI was developed to predict 1-year mortality in a cohort of 604 patients admitted to a New
York teaching hospital during 1 month in 1984. The validation of CCI was performed on a
cohort of 685 breast cancer patients admitted to a Connecticut teaching hospital from 1962 to
1969 [151, 152]. Weights (1, 2, 3, or 6) for CCI score computation were created by assessing
adjusted hazard ratios for each predefined comorbidity from a Cox proportional hazards
regression model [151, 157].
Since CCI was introduced, it has been extensively validated in cohorts with different diseases.
Also, numerous adaptations of this index were developed, including adaptations for
administrative databases [152, 158, 159]. In this project, the algorithm recommended by Quan
and colleagues [160] was used to identify ICD-9 and ICD-10 codes for diseased cohorts (except
diabetes, Table 3.2). Quan and colleagues expanded ICD-9 codes of Deyo CCI [161] and
identified corresponding ICD-10 codes for each comorbidity. Multiple physicians were
actively involved through all stages of the algorithm development and reached consensus on
the final lists.
To follow the algorithm proposed by Quan and colleagues, SNOMED CT codes were
translated to ICD-10 codes using the mapping released in July 2016 by US National Library of
Medicine (version 20160301) [162]. CCI score at baseline was calculated using original
weights, as presented in Table 3.2.
Table 3.2
Diseases, ICD codes, and Weights used to compute Charlson Comorbidity Index
| Disease | ICD-9 | ICD-10 | Weight |
| --- | --- | --- | --- |
| Myocardial Infarction | 410.x, 412.x | I21.x, I22.x, I25.2 | 1 |
| Congestive heart failure | 398.91, 402.01, 402.11, 402.91, 404.01, 404.03, 404.11, 404.13, 404.91, 404.93, 425.4–425.9, 428.x | I09.9, I11.0, I13.0, I13.2, I25.5, I42.0, I42.5–I42.9, I43.x, I50.x, P29.0 | 1 |
| Peripheral vascular disease | 093.0, 437.3, 440.x, 441.x, 443.1–443.9, 447.1, 557.1, 557.9, V43.4 | I70.x, I71.x, I73.1, I73.8, I73.9, I77.1, I79.0, I79.2, K55.1, K55.8, K55.9, Z95.8, Z95.9 | 1 |
| Cerebrovascular disease | 362.34, 430.x–438.x | G45.x, G46.x, H34.0, I60.x–I69.x | 1 |
| Dementia | 290.x, 294.1, 331.2 | F00.x–F03.x, F05.1, G30.x, G31.1 | 1 |
| Chronic pulmonary disease | 416.8, 416.9, 490.x–505.x, 506.4, 508.1, 508.8 | I27.8, I27.9, J40.x–J47.x, J60.x–J67.x, J68.4, J70.1, J70.3 | 1 |
| Rheumatic disease | 446.5, 710.0–710.4, 714.0–714.2, 714.8, 725.x | M05.x, M06.x, M31.5, M32.x–M34.x, M35.1, M35.3, M36.0 | 1 |
| Peptic ulcer disease | 531.x–534.x | K25.x–K28.x | 1 |
| Mild liver disease | 070.22, 070.23, 070.32, 070.33, 070.44, 070.54, 070.6, 070.9, 570.x, 571.x, 573.3, 573.4, 573.8, 573.9, V42.7 | B18.x, K70.0–K70.3, K70.9, K71.3–K71.5, K71.7, K73.x, K74.x, K76.0, K76.2–K76.4, K76.8, K76.9, Z94.4 | 1 |
| Moderate or severe liver disease | 456.0–456.2, 572.2–572.8 | I85.0, I85.9, I86.4, I98.2, K70.4, K71.1, K72.1, K72.9, K76.5, K76.6, K76.7 | 3 |
| Diabetes without chronic complication | Identified as described in 0 | | 1 |
| Diabetes with chronic complication | 250.4–250.7 | E10.2–E10.5, E10.7, E11.2–E11.5, E11.7, E12.2–E12.5, E12.7, E13.2–E13.5, E13.7, E14.2–E14.5, E14.7 | 2 |
| Hemiplegia or paraplegia | 334.1, 342.x, 343.x, 344.0–344.6, 344.9 | G04.1, G11.4, G80.1, G80.2, G81.x, G82.x, G83.0–G83.4, G83.9 | 2 |
| Renal disease | 403.01, 403.11, 403.91, 404.02, 404.03, 404.12, 404.13, 404.92, 404.93, 582.x, 583.0–583.7, 585.x, 586.x, 588.0, V42.0, V45.1, V56.x | I12.0, I13.1, N03.2–N03.7, N05.2–N05.7, N18.x, N19.x, N25.0, Z49.0–Z49.2, Z94.0, Z99.2 | 2 |
| Cancer | 140.x–172.x, 174.x–195.8, 200.x–208.x, 238.6 | C00.x–C26.x, C30.x–C34.x, C37.x–C41.x, C43.x, C45.x–C58.x, C60.x–C76.x, C81.x–C85.x, C88.x, C90.x–C97.x | 2 |
| Metastatic solid tumour | 196.x–199.x | C77.x–C80.x | 6 |
| AIDS/HIV | 042.x–044.x | B20.x–B22.x, B24.x | 6 |
Note: original table source [160], weights in [157].
As recommended by Quan and colleagues (2005), cancer was considered as any malignancy,
including lymphoma and leukaemia, and excluding malignant neoplasm of the skin [160]. In
cases where moderate or severe liver disease was present for a patient, mild liver disease did
not contribute to the CCI score. Similarly, if a record of diabetes with chronic complications
was present, diabetes without chronic complications did not contribute to the CCI score
computation. Finally, if a patient had a record of metastatic solid tumour, cancer did not affect
the CCI score.
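The score computation with these three precedence rules can be sketched as follows. This is an illustrative sketch rather than the code used in the thesis: the condition labels and the function name `cci_score` are hypothetical, the weights are those listed in Table 3.2, and the step that maps ICD codes to conditions is assumed to have been done already.

```python
# Original Charlson weights, as listed in Table 3.2.
WEIGHTS = {
    "myocardial_infarction": 1, "congestive_heart_failure": 1,
    "peripheral_vascular_disease": 1, "cerebrovascular_disease": 1,
    "dementia": 1, "chronic_pulmonary_disease": 1, "rheumatic_disease": 1,
    "peptic_ulcer_disease": 1, "mild_liver_disease": 1,
    "moderate_severe_liver_disease": 3, "diabetes_uncomplicated": 1,
    "diabetes_complicated": 2, "hemiplegia_paraplegia": 2, "renal_disease": 2,
    "cancer": 2, "metastatic_solid_tumour": 6, "aids_hiv": 6,
}

# (more severe form, milder form): the milder form does not contribute
# to the score when the more severe form is also recorded.
PRECEDENCE = [
    ("moderate_severe_liver_disease", "mild_liver_disease"),
    ("diabetes_complicated", "diabetes_uncomplicated"),
    ("metastatic_solid_tumour", "cancer"),
]

def cci_score(conditions):
    """Sum the weights of the recorded conditions after applying precedence."""
    present = set(conditions)
    for severe, mild in PRECEDENCE:
        if severe in present:
            present.discard(mild)
    return sum(WEIGHTS[c] for c in present)

print(cci_score({"mild_liver_disease", "moderate_severe_liver_disease",
                 "diabetes_uncomplicated"}))
```

Keeping the precedence pairs in a separate list makes the three exceptions explicit and easy to audit against the text.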
3.4 LABORATORY, CLINICAL, AND ANTHROPOMETRIC DATA
Longitudinal observations on laboratory, clinical, and anthropometric data are extensively
recorded in the CEMR. These data are usually entered repeatedly throughout the whole period
of the electronic record (available follow-up) for an individual patient. The data used during
this project included: HbA1c, fasting/random blood glucose, low-density lipoprotein (LDL),
high-density lipoprotein, triglycerides, systolic blood pressure (SBP), diastolic blood pressure,
heart rate, urine microalbumin/creatinine ratio, serum creatinine, body mass index (BMI),
weight, and tobacco use status. Extensive data validation and cleaning techniques were applied
prior to data extraction and all measurements were converted to standard or most frequently
used units.
3.4.1 Arranging longitudinal measures
For individual patients, the longitudinal laboratory, clinical and anthropometric data were
arranged in six-monthly windows: ±3 months either side of the baseline of a particular study and
progressively further on (Figure 3.3). The closest risk factor measure to the middle of the
window (or average of multiple measurements if available within that window) was preserved
as the observed measure for this window. For baseline HbA1c data, the closest measurement on or
within 3 months prior to baseline was used as the baseline measurement. The six-monthly
longitudinal follow-up data for HbA1c followed the same principle described above.
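The windowing rule can be illustrated with a short sketch. It is not the thesis code: the function name, the average month length of 30.4375 days, and the half-open window boundaries are assumptions, and only the closest-to-centre rule is implemented (neither the averaging variant nor the separate baseline rule for HbA1c).

```python
from datetime import date

DAYS_PER_MONTH = 30.4375  # assumption: average calendar month length

def arrange_in_windows(baseline, measurements, width_months=6):
    """Map each window centre (months from baseline) to the value measured
    closest to that centre. Windows are [centre - 3, centre + 3) months."""
    best = {}
    for when, value in measurements:
        months = (when - baseline).days / DAYS_PER_MONTH
        if months < -width_months / 2:
            continue  # earlier than the baseline window
        index = int((months + width_months / 2) // width_months)
        centre = index * width_months
        distance = abs(months - centre)
        if centre not in best or distance < best[centre][0]:
            best[centre] = (distance, value)
    return {centre: value for centre, (_, value) in sorted(best.items())}

baseline = date(2012, 1, 1)
sbp = [(date(2011, 12, 1), 142), (date(2012, 1, 10), 138),
       (date(2012, 6, 20), 131), (date(2013, 1, 5), 129)]
windows = arrange_in_windows(baseline, sbp)
print(windows)
```

Windows with no measurement are simply absent from the result, which marks them as missing for later imputation.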
Figure 3.3. Schematic diagram of arranging longitudinal risk factor data.
Missing data (example: Figure 3.3, window “6M”) were imputed with the Multiple Imputation
Monte Carlo Markov Chain approach, after extensive assessments of the missingness patterns
and comparison of several imputation techniques, as described in chapter 6.
3.4.2 Tobacco use status
The longitudinal free text inputs are also available in the CEMR. Tobacco status included status
on any type of tobacco use: cigars, pipe, cigarettes, chewing tobacco or snuff. The majority of
records providing such information followed standard coding practice, and were in the form of
“current” / “former” / “never” smoker. Remaining records (>80,000) were classified to these 3
categories by creating classification rules upon manual review of entered free text. For
example: if description includes keywords “trying” and “quit”, classify as “current”.
Occasional smokers were classified as "current". In case of discordant same-day statuses,
priority was given to "current" status, then to "former" and lastly to "never" status. Records
indicating "never" status were disregarded if a previous record indicated "current" or "former"
status. For each patient, the last status recorded on or prior to the particular analysis baseline
was considered as tobacco use status. Nonetheless, a large number of patients with T2DM
appeared not to have a record for the tobacco use status.
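The resolution rules for already-classified records can be sketched as below. This is one possible reading of the rules, not the thesis implementation: the function name and the input layout of (date, status) pairs are hypothetical, and the free-text classification step is assumed to have been applied beforehand.

```python
from datetime import date

PRIORITY = {"current": 0, "former": 1, "never": 2}  # lower value wins

def tobacco_status_at(baseline, records):
    """Return the tobacco use status at baseline, or None if nothing is recorded."""
    by_day = {}
    for day, status in records:
        if day > baseline:
            continue  # only records on or before baseline are used
        # Discordant same-day statuses: keep the one with the highest priority.
        if day not in by_day or PRIORITY[status] < PRIORITY[by_day[day]]:
            by_day[day] = status
    result, ever_used = None, False
    for day in sorted(by_day):
        status = by_day[day]
        if status == "never" and ever_used:
            continue  # contradicts an earlier "current" or "former" record
        if status != "never":
            ever_used = True
        result = status
    return result

records = [
    (date(2010, 1, 1), "former"), (date(2011, 1, 1), "never"),
    (date(2012, 1, 1), "current"), (date(2012, 1, 1), "never"),
    (date(2013, 1, 1), "former"),
]
print(tobacco_status_at(date(2012, 6, 1), records))
```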
3.5 ETHICS APPROVAL
This thesis involved the use of existing data, where the subjects could not be identified directly
or through identifiers linked to the subjects. Thus, according to the US Department of Health
and Human Services Exemption 4 (45 CFR 46.101(b)(4)), this study is exempt from ethics
approval from an institutional review board and informed consent.
Chapter 4: Medication Data Extraction
Statement of Contribution of Co-Authors for Thesis by Published
Paper
The authors listed below have certified* that:
1. they meet the criteria for authorship in that they have participated in the conception,
execution, or interpretation, of at least that part of the publication in their field of
expertise;
2. they take public responsibility for their part of the publication, except for the
responsible author who accepts overall responsibility for the publication;
3. there are no other authors of the publication according to these criteria;
4. potential conflicts of interest have been disclosed to (a) granting bodies, (b) the editor
or publisher of journals or other publications, and (c) the head of the responsible
academic unit, and
5. they agree to the use of the publication in the student’s thesis and its publication on
the QUT’s ePrints site consistent with any limitations set by publisher requirements.
In the case of this chapter:
Olga Montvida, Ognjen Arandjelović, Edward Reiner, and Sanjoy K. Paul. Data Mining
Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic Medical
Records. The Open Bioinformatics Journal, 2017, 10:1-15. DOI: 10.2174/1875036201709010001.
| Contributor | Statement of Contribution* |
| --- | --- |
| Olga Montvida | Conceived the idea, was responsible for the primary design of the study and the methodological developments. Conducted the data extraction and statistical analyses. Developed first draft and contributed towards development of the manuscript. |
| Ognjen Arandjelović | Evaluated the methodological approach and contributed towards development of the manuscript. |
| Edward Reiner | Evaluated the methodological approach and contributed towards development of the manuscript. |
| Sanjoy K. Paul | Conceived the idea, was responsible for the primary design of the study and the methodological developments. Contributed to the statistical analyses. Developed first draft and contributed towards development of the manuscript. |
Principal Supervisor Confirmation
I have sighted email or other correspondence from all Co-authors confirming their certifying
authorship.
Name: Sanjoy Ketan Paul
Signature: QUT Verified Signature
Date: 29.06.2018
The Open Bioinformatics Journal, 2017, 10, 1-15
DOI: 10.2174/1875036201709010001
RESEARCH ARTICLE
Data Mining Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic Medical Records
Olga Montvida1,2, Ognjen Arandjelović3, Edward Reiner4 and Sanjoy K. Paul5,*
1 Clinical Trials and Biostatistics Unit, QIMR Berghofer Medical Research Institute, Brisbane, Australia
2 School of Biomedical Sciences, Institute of Health and Biomedical Innovation, Faculty of Health, Queensland University of Technology, Brisbane, Australia
3 School of Computer Science, University of St. Andrews, St. Andrews, United Kingdom
4 Smart Analyst Inc., New York, United States of America
5 Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia
Received: March 27, 2017 Revised: May 06, 2017 Accepted: May 12, 2017
Abstract:
Background:
Electronic Medical Records (EMRs) from primary/ambulatory care systems present a new and promising source of information for conducting clinical and translational research.
Objectives:
To address the methodological and computational challenges in order to extract reliable medication information from raw data which is often complex, incomplete and erroneous. To assess whether the use of specific chaining fields of medication information may additionally improve the data quality.
Methods:
Guided by a range of challenges associated with missing and internally inconsistent data, we introduce two methods for the robust extraction of patient-level medication data. First method relies on chaining fields to estimate duration of treatment (“chaining”), while second disregards chaining fields and relies on the chronology of records (“continuous”). Centricity EMR database was used to estimate treatment duration with both methods for two widely prescribed drugs among type 2 diabetes patients: insulin and glucagon-like peptide-1 receptor agonists.
Results:
At individual patient level the “chaining” approach could identify the treatment alterations longitudinally and produced more robust estimates of treatment duration for individual drugs, while the “continuous” method was unable to capture that dynamics. At population level, both methods produced similar estimates of average treatment duration, however, notable differences were observed at individual-patient level.
Conclusion:
The proposed algorithms explicitly identify and handle longitudinal erroneous or missing entries and estimate treatment duration with specific drug(s) of interest, which makes them a valuable tool for future EMR based clinical and pharmaco-epidemiological studies. To improve accuracy of real-world based studies, implementing chaining fields of medication information is recommended.
* Address correspondence to this author at the Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia; Tel: +61 3 93428433; Fax: +61 3 93428780
1875-0362/17 2017 Bentham Open
Keywords: Electronic medical records, Treatment duration, Data mining, Type 2 diabetes, Rule-based algorithm, Patient-level data aggregation.
1. INTRODUCTION
The electronic medical records (EMRs) and the administrative data from the primary/ambulatory care systems are increasingly being used in epidemiological [1 - 3], pharmaco-epidemiological [4 - 6], pharmaco-vigilance [7 - 9], clinical outcome [5, 10 - 12], health economic [13, 14] and public health related studies [15 - 18]. Analyses of large primary care based EMRs from various countries, most notably from the UK, USA and Sweden, have provided significant insight into the effectiveness of changes in health care practices/policies on overall disease and health management costs [3, 15, 19, 20], in addition to population level evidence on the safety and effectiveness of various therapies and the association of disease-related risk factors with long-term outcomes [5, 6, 18, 21 - 23]. The increasing use of such large real-world patient-level data is illustrated well by the sixfold increase in EMR based published studies since 2000 [10, 24].
In structured EMRs, especially from the primary/ambulatory care systems, comprehensive patient level data are captured on different domains simultaneously and stored in the form of a relational database [25, 26]. Representative examples include the UK Clinical Practice Research Database and the Centricity™ EMR (CEMR) database of the USA [27, 28]. The extraction, quality control and management of such voluminous longitudinal data under individual study protocols is highly methodologically and computationally involved, and challenging from data mining and analytical viewpoints [22, 29]. Data science generally considers that data preparation tasks consume about 80% of the total project timeline, leaving only 20% for the ultimate analysis itself [30, 31]. Data completeness, systematic biases, reproducibility and quality are some of the notable limitations in such databases [18, 29, 32].
Most EMR databases capture large amounts of detailed information on medications provided to individuals over time, while the specific form in which this information is stored varies from database to database [26]. It is usually possible to obtain the drug class, specific brand name within the corresponding class, prescription dates, dosage, and number of refills [32]. However, a significant number of entries for an individual prescription may be missing or contain errors. The problem with information completeness can also arise when the medication nomenclature is not correctly matched [29].
Clinical and pharmaco-epidemiological studies, which rely on the data from EMRs, are often interested in the effectiveness of specific therapies, therapeutical dynamics, treatments with concomitant medications, and durations thereof in specific disease areas. Such real-world analysis provides an extremely valuable means for the understanding of drug utilization patterns, treatment initiation periods following the diagnosis of a disease, the effectiveness of specific therapies on disease-related risk factors, and possible associations of therapies with long-term outcomes [1, 6]. These studies warrant appropriate extraction of longitudinal information on prescriptions or medications at individual patient level; inappropriate extraction of the data may result in misleading inferences [33 - 35]. Generally, pharmaco-epidemiological studies do not estimate treatment duration, but only account for the fact of one or more prescriptions for a particular drug(s) [36, 37]. Some studies calculated medication duration by subtracting the first prescription date from the last prescription date [38, 39], and only a few studies additionally considered a drug discontinued if the subsequent prescription was not refilled within the expected time of drug coverage [40, 41]. While some studies have discussed the challenges in the analysis of medication data from EMRs [18, 42], to the best of our knowledge no existing study has analysed the quality, consistency, and completeness of EMR prescription information, nor proposed a practical algorithm able to extract salient medication information from large and complex longitudinal data sets [43].
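To make the difference between the two simple duration estimates mentioned above concrete, here is a hedged sketch. It illustrates those earlier approaches only and is not the "chaining" or "continuous" algorithm of this paper; the function names, the `days_supplied` field, and the `grace_days` parameter are hypothetical.

```python
from datetime import date, timedelta

def naive_duration_days(prescriptions):
    """Last prescription date minus first prescription date."""
    dates = [when for when, _ in prescriptions]
    return (max(dates) - min(dates)).days

def first_episode_days(prescriptions, grace_days=0):
    """Length of the first treatment episode. The drug is treated as
    discontinued when the next prescription is issued after the covered
    period (plus an optional grace period) has already run out."""
    ordered = sorted(prescriptions)
    start, first_supply = ordered[0]
    covered_until = start + timedelta(days=first_supply)
    for when, days_supplied in ordered[1:]:
        if when > covered_until + timedelta(days=grace_days):
            break  # gap in coverage: the first episode ends here
        covered_until = max(covered_until, when + timedelta(days=days_supplied))
    return (covered_until - start).days

# (prescription date, days supplied): two consecutive scripts, then a long gap.
rx = [(date(2014, 1, 1), 30), (date(2014, 1, 31), 30), (date(2014, 6, 1), 30)]
print(naive_duration_days(rx), first_episode_days(rx))
```

In this example the first estimate spans the whole period including the gap, while the second stops at the end of the covered period before the gap.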
The aims of this explanatory and methodological study are (1) to discuss and analyse the most pressing challengesencountered by computer based methods in the process of extracting and aggregating longitudinal medication data fromEMRs, (2) to describe two algorithms to extract prescription information of individual therapies and to estimate thecorresponding duration of treatment, and (3) to discuss how estimates of individual medication duration are affected bythe choice of the study design. The effectiveness of algorithms is compared is on a cohort of patients with a clinicaldiagnosis of type 2 diabetes (T2DM) using a real-world EMR database collected across the USA.
2. MATERIALS AND METHODS
2.1. Centricity Electronic Medical Records
The CEMR database contains more than 40 million patients’ clinical/treatment records from 1995. CEMR represents 49 US states and a variety of ambulatory medical practices, including solo practitioners, community clinics,
Estimation of Drug Therapy Duration from EMRs The Open Bioinformatics Journal , 2017, Volume 10 3
academic medical centres, and large integrated delivery networks. The database has been extensively used for academic research worldwide [3, 37, 44 - 47]. The CEMR database consists of over 30,000 health care providers, of whom approximately 70% are primary care providers. For both insured and uninsured patients, this database contains comprehensive patient-level information on many aspects including demographic information, laboratory results, history of diseases, clinical diagnosis of symptoms/diseases, vital signs, history of medications and detailed information on the ongoing medications. For this study we used longitudinal information from January 1995 to October 2014.
2.2. Medication Data in Centricity EMR database
The medications taken by an individual (medication domain) and the prescriptions for drugs provided to the individuals by the service provider registered within the EMR system (prescription domain) are extensively documented in the database by means of three tables: medication dimension (MD), medication fact (MF) and prescription fact (PF). The MF and PF belong to the medication and prescription domains respectively. The MF may include a broader list of all medications that a patient is taking, including over-the-counter medications, herbal remedies and medications prescribed by a provider that may be outside the EMR network. MD is linked to both MF and PF. Each record in the MD contains information on an individual drug, which includes the National Drug Code (NDC) and Generic Product Identifier (GPI), as well as the four ordered attributes derived from the GPI such as generic drug names. The MD also includes the medication doses corresponding to different brands’ products, identified by a unique medication key value assigned to each record.
The entries in MF capture an individual patient’s medication prescription history and active prescriptions from all practitioners, including the service provider registered within the EMR system. It contains several special fields to track longitudinal patterns, such as the active medication flag, which indicates whether a patient was taking the drug at the moment of database extraction. The active medication list is identified by records with the value “Y” of the active flag. The chain identification (ID) values facilitate tracking of treatment alterations (including the addition of new medications) over time, together with the related chain sequence values, which track medication adjustments within the same chain ID. The initiation (‘start’) and cessation (‘stop’) dates associated with different treatments are also stored in the MF. However, we found that the corresponding values are missing with alarming frequencies: 67% of the cases for the former and 11% for the latter. Also, some of the start and stop date entries could be erroneous, such as a stop date preceding the start date. An excerpt from the MF for an individual patient is shown in Table 1.
Table 1. Snapshot of MF table – treatment intensification.

| GPI category 4 | Medication key (M) | Patient key (P) | Create date (C) | Start date (B) | Stop date (S) | Active flag (F) | Chain ID (H) | Chain seq (G) |
|---|---|---|---|---|---|---|---|---|
| METFORMIN HCL | 41467 | 288859 | 6-May-09 | 6-May-09 | | N | 307667619 | 0 |
| METFORMIN HCL | 41467 | 288859 | 11-Jun-10 | 11-Jun-10 | | N | 307667619 | 1 |
| METFORMIN HCL | 41467 | 288859 | 25-Apr-11 | 11-Jun-10 | 25-Apr-11 | N | 307667619 | 2 |
| LIRAGLUTIDE | 3347202 | 288859 | 25-Apr-11 | 25-Apr-11 | | N | 812855070 | 0 |
| LIRAGLUTIDE | 3347202 | 288859 | 10-May-11 | 10-May-11 | | N | 812855070 | 1 |
| LIRAGLUTIDE | 3347202 | 288859 | 10-May-11 | 10-May-11 | | N | 820957274 | 0 |
| LIRAGLUTIDE | 3347202 | 288859 | 14-Dec-11 | 10-May-11 | 14-Dec-11 | N | 820957274 | 1 |
| LIRAGLUTIDE | 3347202 | 288859 | 14-Dec-11 | 10-May-11 | 14-Dec-11 | N | 812855070 | 2 |
| INSULIN GLARGINE | 682327 | 288859 | 27-Feb-12 | | | N | 1092145628 | 0 |
| INSULIN ISOPHANE HUMAN | 682834 | 288859 | 27-Feb-12 | | | N | 1092145627 | 0 |
| INSULIN GLARGINE | 682327 | 288859 | 26-Sep-12 | 26-Sep-12 | | N | 1092145628 | 1 |
| INSULIN ISOPHANE HUMAN | 682834 | 288859 | 14-Nov-12 | | | N | 1092145627 | 1 |
| INSULIN GLARGINE | 682327 | 288859 | 14-Nov-12 | | | Y | 1092145628 | 2 |
| INSULIN LISPRO (HUMAN) | 682825 | 288859 | 26-Feb-14 | 26-Feb-14 | | Y | 1092145627 | 2 |
The entries in the PF capture the prescription date and the associated number of refills only for medications that have been prescribed by the responsible provider within the EMR network. The MF dataset contains a broader set of entry sources; moreover, the form of recording potentially comprises more details than the corresponding data in the PF. Nevertheless, it was determined that the PF may contain unique entries that are not stored in the MF. Therefore, the MF was considered as the primary source of medication information and the PF as a complementary one.
4 The Open Bioinformatics Journal , 2017, Volume 10 Montvida et al.
3. METHODS
In this section, we introduce a novel algorithm for mining large-scale longitudinal EMRs with the ultimate goal of estimating the duration of treatment of a particular individual with a drug(s) of interest. The first method we introduce (“chaining”) relies on chain ID and chain sequence values recorded in the MF. This feature of the approach makes it possible to account for treatments which include alternating drug use. To assess the importance and power of longitudinal chain information, we also describe a modification of the “chaining” method (“continuous”) which disregards chain ID and chain sequence values, and instead relies only on the chronology of a patient’s records of particular drug(s). In the current literature, the latter approach is used more frequently.
3.1. Data Pre-processing: Auxiliary Fields
Although erroneous entries generally cannot be identified, various types of global consistency rules may be applied to reduce the error. The chronology of events may be corrected by incorporating two additional fields: the patient’s last available follow-up date and the patient’s date of birth (DOB).
The CEMR database stores the last available follow-up date for each patient. As an initial data pre-processing step, erroneous follow-up date entries were identified and corrected using the latest record creation date across all activities within the database for the corresponding patient.
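The correction of the follow-up date can be sketched as follows. This is a minimal illustration only: the function name is hypothetical, and it assumes that a follow-up date counts as erroneous when it is missing or earlier than the patient's latest record creation date, which the text does not state explicitly.

```python
from datetime import date
from typing import Iterable, Optional


def corrected_follow_up(reported: Optional[date],
                        creation_dates: Iterable[date]) -> Optional[date]:
    """Return a last follow-up date that is never earlier than the latest record."""
    latest = max(creation_dates, default=None)
    if latest is None:
        return reported  # no activity records to correct against
    if reported is None or reported < latest:
        return latest    # assumed erroneous: use the latest record creation date
    return reported
```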
Similar to many anonymized EMRs, the exact DOB was not available within CEMR. A simple procedure was applied to approximate DOB:
1. Obtain multiple DOB estimates per patient by subtracting the reported ‘valid’ age from the record creation date for all activities within the database. CEMR groups patients older than 80 years under a single age key. The non-missing age data and the non 80+ age keys were considered as ‘valid’ age entries.
2. Approximate DOB as the minimum of all estimates from Step 1.
3. For patients without reported activities, estimate DOB from the dataset containing demographic information by subtracting the reported ‘valid’ age from the database extraction date.
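The procedure above can be sketched as follows, assuming each activity is available as a pair of record creation date and reported age in whole years. The function names and the `max_valid_age` parameter are hypothetical, and ages above 80 are treated as the grouped (not valid) key.

```python
from datetime import date
from typing import Iterable, Optional, Tuple


def years_before(d: date, years: int) -> date:
    """Shift a date back by whole years, mapping 29 February to 28 February."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def approximate_dob(records: Iterable[Tuple[date, Optional[int]]],
                    max_valid_age: int = 80) -> Optional[date]:
    """Steps 1-2: one estimate per valid record, then the minimum of all estimates."""
    estimates = [years_before(created, age)
                 for created, age in records
                 if age is not None and 0 <= age <= max_valid_age]
    return min(estimates, default=None)
```

For a patient without reported activities (Step 3), the same function could be called with a single pair made of the database extraction date and the age from the demographic dataset.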
The parameters for the mathematical formulations are identified in Table 2 below.
Table 2. Mathematical Formulation
Scalars

| Symbol | Description |
|---|---|
| $n$ | number of records in MF table |
| $k$ | number of records in PF table |
| $sd$ | standard prescription duration for individual drug |
| $mx$ | maximal number of prescription refills for individual drug |
| $u$ | number of unique patient keys in the cohort of interest |

Sets

| Symbol | Description |
|---|---|
| $PS = \{ps_1, ps_2, \ldots, ps_u\}$ | set of unique patient keys in the cohort of interest |
| $V$ | set of missing values |
| $MS$ | set of medication keys of selected drug(s) |
| $F_Y = \{f_i \mid f_i = \text{“Y”},\ i = \overline{1, n}\}$ | set of active drugs |

$MF = \{M, P, C, B, S, F, H, G\}$ dataset

| Symbol | Description |
|---|---|
| $M = (m_1, m_2, \ldots, m_n)^T$ | medication keys for drugs |
| $P = (p_1, p_2, \ldots, p_n)^T$ | patient keys |
| $C = (c_1, c_2, \ldots, c_n)^T$ | record creation dates |
| $B = (b_1, b_2, \ldots, b_n)^T$ | start dates of individual records |
| $S = (s_1, s_2, \ldots, s_n)^T$ | stop dates of individual records |
| $F = (f_1, f_2, \ldots, f_n)^T$ | active medication flag values |
| $H = (h_1, h_2, \ldots, h_n)^T$ | chain identification values |
| $G = (g_1, g_2, \ldots, g_n)^T$ | chain sequence values |

$PF = \{M, P, C, B, R\}$ dataset

| Symbol | Description |
|---|---|
| $M = (m_1, m_2, \ldots, m_k)^T$ | medication keys for individual prescriptions |
| $P = (p_1, p_2, \ldots, p_k)^T$ | patient keys |
| $C = (c_1, c_2, \ldots, c_k)^T$ | record creation dates |
| $B = (b_1, b_2, \ldots, b_k)^T$ | prescription dates |
| $R = (r_1, r_2, \ldots, r_k)^T$ | number of refills for individual prescription |
The scalars sd and mx may be defined on the basis of the standard prescription protocol for individual drugs. The default values of sd = 1 and mx = 24 were considered in our analyses.
MS may be identified by text-mining the MD dataset. For example, glucagon-like peptide-1 receptor agonist (GLP-1RA) may be identified by searching for “GLP-1 RECEPTOR AGONIST” in the second order GPI attribute field.
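A minimal sketch of this text-mining step is given below. It assumes the MD table is available as a sequence of dictionaries; the field names `medication_key` and `gpi_category_2` are hypothetical placeholders, not the actual CEMR column names.

```python
from typing import Dict, Iterable, Set


def select_medication_keys(md_rows: Iterable[Dict[str, object]],
                           phrase: str,
                           field: str = "gpi_category_2") -> Set[int]:
    """Build MS: medication keys whose GPI attribute contains the phrase."""
    needle = phrase.upper()
    return {int(row["medication_key"])
            for row in md_rows
            if needle in str(row.get(field, "")).upper()}
```

The match is case-insensitive and substring-based, so a plural or bracketed form of the class name in the attribute field would still be picked up.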
3.2. “Chaining” Method
The algorithm for the first approach to extract and aggregate data for the estimation of duration of treatment is elaborated below.
1. Merge the following to the MF dataset by patient key:

1.1) date of birth $DOB = (db_1, db_2, \ldots, db_n)^T$.

1.2) last available follow-up date $L = (l_1, l_2, \ldots, l_n)^T$.

The extended MF dataset would be of the form

$$MF_1 = \{M, P, C, B, S, F, H, G, DOB, L\}$$

2. Replace erroneous values of start dates with missing values. A start date is treated as erroneous if

$$b_i \notin V \wedge (b_i < db_i \vee b_i > s_i \vee b_i > l_i), \quad i = \overline{1, n}$$

3. Sort by patient key ascending, chain ID ascending within the same patient, chain sequence descending within the same chain ID:

$$MF_1:\ a)\ p_i \le p_{i+1},\ i = \overline{1, n}$$

$$b)\ h_i \le h_{i+1},\ \forall i: p_i = p_{i+1} \quad \text{(post a) sorting)}$$

$$c)\ g_i \ge g_{i+1},\ \forall i: h_i = h_{i+1} \wedge p_i = p_{i+1} \quad \text{(post b) sorting)}$$

4. Set initial values $p_0 = 0$, and approximate individual medication end dates $E = (e_1, e_2, \ldots, e_n)^T$ on the basis of the following rules:

4.1) if stop date is not missing, then end date equals stop date.

4.2) else, if active flag is “Y”, then end date equals last follow-up date.

4.3) else, if first unique value of patient key or first unique value of chain ID, and start date is not missing, then end date equals start date plus standard prescription duration.

4.4) else, if first unique value of patient key or first unique value of chain ID, and start date is missing, then end date equals record creation date plus standard prescription duration.

4.5) else, end date equals the create date of the previous record.

$$
\begin{aligned}
e_i ={}& \mathbb{I}_{\{b_i \notin V\}} \cdot \left(\mathbb{I}_{\{p_i \ne p_{i-1}\}} + \mathbb{I}_{\{h_i \ne h_{i-1}\}} - \mathbb{I}_{\{p_i \ne p_{i-1}\}} \cdot \mathbb{I}_{\{h_i \ne h_{i-1}\}}\right) \cdot \left(s_i \cdot \mathbb{I}_{\{s_i \notin V\}} + l_i \cdot \mathbb{I}_{\{f_i \in F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}} + (b_i + sd) \cdot \mathbb{I}_{\{f_i \notin F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}}\right) \\
&+ \mathbb{I}_{\{b_i \in V\}} \cdot \left(\mathbb{I}_{\{p_i \ne p_{i-1}\}} + \mathbb{I}_{\{h_i \ne h_{i-1}\}} - \mathbb{I}_{\{p_i \ne p_{i-1}\}} \cdot \mathbb{I}_{\{h_i \ne h_{i-1}\}}\right) \cdot \left(s_i \cdot \mathbb{I}_{\{s_i \notin V\}} + l_i \cdot \mathbb{I}_{\{f_i \in F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}} + (c_i + sd) \cdot \mathbb{I}_{\{f_i \notin F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}}\right) \\
&+ \mathbb{I}_{\{p_i = p_{i-1}\}} \cdot \mathbb{I}_{\{h_i = h_{i-1}\}} \cdot \left(l_i \cdot \mathbb{I}_{\{f_i \in F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}} + s_i \cdot \mathbb{I}_{\{s_i \notin V\}} + c_{i-1} \cdot \mathbb{I}_{\{f_i \notin F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}}\right),
\end{aligned}
$$

where $\mathbb{I}_{\{\cdot\}}$ is an indicator function:

$$\mathbb{I}_{\{a = b\}} = \begin{cases} 1, & \text{if } a = b \\ 0, & \text{else} \end{cases} \qquad \mathbb{I}_{\{a \in b\}} = \begin{cases} 1, & \text{if } a \in b \\ 0, & \text{else} \end{cases}$$

5. Replace values of end dates that fall outside the follow-up interval with the last follow-up date:

$$e_i = e_i \cdot \mathbb{I}_{\{e_i \le l_i\}} + l_i \cdot \mathbb{I}_{\{e_i > l_i\}}$$

6. Delete records if start date is missing and create date is greater than stop date. Reduce the dataset to the set of patients from the cohort of interest and to the set of keys of selected drug(s):

$$MF_2 = \{MF_1 : p_i \in PS \wedge m_i \in MS \wedge \neg(b_i \in V \wedge c_i > e_i),\ i = \overline{1, n}\}$$

7. Merge the following to the PF set by patient key:

7.1) date of birth $DOB = (db_1, db_2, \ldots, db_k)^T$.

7.2) last available follow-up date within the database $L = (l_1, l_2, \ldots, l_k)^T$.

The extended PF dataset would take the following form:

$$PF_1 = \{M, P, C, B, R, DOB, L\}$$

8. Replace erroneous prescription dates with missing values. A prescription date is treated as erroneous if

$$b_i \notin V \wedge (b_i < db_i \vee b_i > l_i), \quad i = \overline{1, k}$$

9. If the number of refills is greater than the pre-defined maximal number of possible refills, or negative, or missing, replace it with zero:

$$r_i = r_i \cdot \mathbb{I}_{\{r_i \le mx\}} \cdot \mathbb{I}_{\{r_i \ge 0\}} \cdot \mathbb{I}_{\{r_i \notin V\}}, \quad i = \overline{1, k}$$

10. Calculate end dates $E = (e_1, e_2, \ldots, e_k)^T$ by the following rules:

10.1) if prescription date is not missing, then end date equals standard duration multiplied by the number of refills plus one, added to the prescription date.

10.2) if prescription date is missing, then end date equals standard duration multiplied by the number of refills plus one, added to the record creation date.

$$e_i = (b_i + (r_i + 1) \cdot sd) \cdot \mathbb{I}_{\{b_i \notin V\}} + (c_i + (r_i + 1) \cdot sd) \cdot \mathbb{I}_{\{b_i \in V\}}$$

11. Update end dates as described in Step 5:

$$e_i = e_i \cdot \mathbb{I}_{\{e_i \le l_i\}} + l_i \cdot \mathbb{I}_{\{e_i > l_i\}}, \quad i = \overline{1, k}$$

12. Reduce $PF_1$ to the set of patients from the cohort of interest, to the set of patients not in $MF_2$, and to the set of keys of selected drug(s):

$$PF_2 = \{PF_1 : p_i \in PS \wedge m_i \in MS \wedge (p_i \notin P \subset MF_2),\ i = \overline{1, k}\}$$

13. Append both datasets by the following values: patient key, record creation date, start / prescription date and end date. Assume that the new dataset MP contains $n'$ records:

$$MF_3 = \{P, C, B, E\} \subset MF_2$$

$$PF_3 = \{P, C, B, E\} \subset PF_2$$

$$MP = MF_3 \cup PF_3$$
14. Calculate the number (cn) of distinct record creation dates for each patient, and treat missing start dates by the following rules:

14.1) if cn is equal to one, then delete the record.

14.2) if cn is greater than one, replace the missing start date with the record creation date.

15. Sort by patient key ascending, start date ascending within the same patient key:

$$MP:\ a)\ p_i \le p_{i+1},\ i = \overline{1, n'}$$

$$b)\ b_i \le b_{i+1},\ \forall i: p_i = p_{i+1} \quad \text{(post a) sorting)}$$

16. For each unique patient key $ps_j \in PS$, $j = \overline{1, u}$, reduce MP to the set $FN_j$ containing only $p_i = ps_j$, $i = \overline{1, n'}$. Assume that the obtained dataset $FN_j$ has $n''$ rows. Set $e_0 = 0$ and calculate the selected medication duration for the patient, avoiding double calculations of overlapping intervals:

$$FN_j:\ d_j = \sum_{i=1}^{n''} \left((e_i - b_i) \cdot \mathbb{I}_{\{b_i \ge e_{i-1}\}} + (e_i - e_{i-1}) \cdot \mathbb{I}_{\{b_i < e_{i-1}\}} \cdot \mathbb{I}_{\{e_i \ge e_{i-1}\}}\right)$$

17. Use medication duration $D = (d_1, d_2, \ldots, d_u)^T$ to conduct further research.

3.3. “Continuous” Method

1. Repeat steps 1 and 2 from the “chaining” method, then perform step 6, and treat missing values in $MF_2$ as described in step 14. Assume that the obtained dataset $MF_2$ has $\tilde{n}$ instances.

2. Create stop date status variable $ST = (st_1, st_2, \ldots, st_{\tilde{n}})^T$ on the basis of the following rules:

2.1) if active flag is “Y” and stop date is missing, then stop date status equals 2.

2.2) if stop date is not missing, then stop date status equals 1.

2.3) else 0.

$$st_i = 2 \cdot \mathbb{I}_{\{f_i \in F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}} + 1 \cdot \mathbb{I}_{\{s_i \notin V\}} + 0 \cdot \mathbb{I}_{\{f_i \notin F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}}$$

$$MF_3 = \{M, P, C, B, S, F, H, G, DOB, L, ST\}$$

3. Sort $MF_3$ by patient key ascending, start date descending within the same patient key, stop date status ascending within the same start dates of the same patient:

$$MF_3:\ a)\ p_i \le p_{i+1},\ i = \overline{1, \tilde{n}}$$

$$b)\ b_i \ge b_{i+1},\ \forall i: p_i = p_{i+1} \quad \text{(post a) sorting)}$$

$$c)\ st_i \le st_{i+1},\ \forall i: b_i = b_{i+1} \wedge p_i = p_{i+1} \quad \text{(post b) sorting)}$$

4. Set initial value $p_0 = 0$ and approximate individual medication end dates $E = (e_1, e_2, \ldots, e_{\tilde{n}})^T$:

4.1) if stop date is not missing, then end date equals stop date.

4.2) else, if active flag is “Y”, then end date equals last follow-up date.

4.3) else, if first unique patient key, then end date equals start date plus standard duration.

4.4) else, end date equals the start date of the previous record.

$$
\begin{aligned}
e_i ={}& \mathbb{I}_{\{p_i \ne p_{i-1}\}} \cdot \left(l_i \cdot \mathbb{I}_{\{f_i \in F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}} + s_i \cdot \mathbb{I}_{\{s_i \notin V\}} + (b_i + sd) \cdot \mathbb{I}_{\{f_i \notin F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}}\right) \\
&+ \mathbb{I}_{\{p_i = p_{i-1}\}} \cdot \left(l_i \cdot \mathbb{I}_{\{f_i \in F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}} + s_i \cdot \mathbb{I}_{\{s_i \notin V\}} + b_{i-1} \cdot \mathbb{I}_{\{f_i \notin F_Y\}} \cdot \mathbb{I}_{\{s_i \in V\}}\right)
\end{aligned}
$$

5. Perform step 5 from the “chaining” method, and steps 7-11.

6. Reduce $PF_1$ to the set of patients from the cohort of interest, to the set of patients not in $MF_3$, and to the set of keys of selected drug(s):

$$PF_2 = \{PF_1 : p_i \in PS \wedge m_i \in MS \wedge (p_i \notin P \subset MF_3),\ i = \overline{1, k}\}$$

7. Treat missing values in $PF_2$ as described in step 14 of the “chaining” method.

8. Append both datasets by the following values: patient key, record creation date, start / prescription date and end date. Assume that the new dataset MP contains $n'$ records:

$$MF_4 = \{P, C, B, E\} \subset MF_3$$

$$PF_3 = \{P, C, B, E\} \subset PF_2$$

$$MP = MF_4 \cup PF_3$$

9. Perform steps 15-17 from the “chaining” method.

4. REMARKS

Identified erroneous entries are declared as missing in Steps 2, 8, and 9 of the “chaining” method. In Step 14, the algorithm counts the number of unique creation dates for the selected drug(s) at patient level. If the obtained number is greater than one, then missing start dates are replaced with record creation dates. In this way, a patient is considered to take a particular drug if the medication records were entered in a systematic manner; otherwise the records with missing start dates are disregarded.

As an example, the prescription scenario for anti-diabetes drugs for a patient with type 2 diabetes is presented in Table 1. The treatment was initiated with metformin (METFORMIN HCL) on the 6th of May 2009 and continued until the 25th of April 2011, when a switch to GLP-1RA (LIRAGLUTIDE) was made. With a stop date for GLP-1RA recorded on the 14th of December 2011, the data show a gap in the treatment until the 26th of September 2012, when insulin therapy began. However, a patient with diabetes using GLP-1RA is unlikely to have had a nine-month-long gap in the treatment. Indeed, careful data examination leads to the conclusion that insulin treatment started on the 27th of February 2012, as would be estimated by the algorithm.

As mentioned earlier, the MF was considered as the primary data source; thus, if at least one record for the selected drug(s) at patient level is present in the MF, then both methods disregard entries in the PF. However, if there is no available data in the MF table, the methods append data from the PF.

Assessment of the first marketing date for a particular drug is an example of an additional global consistency audit that is omitted in the methods’ description. For instance, any start date of GLP-1RA drugs must not be prior to April 2005, the date when the first representative (Exenatide) was approved.
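Such a marketing-date audit could be sketched as below. The lookup table and function name are hypothetical; only the April 2005 threshold for GLP-1RA comes from the text, and the choice of the first day of the month is an assumption. Following the convention of Steps 2 and 8, an implausible start date is declared missing rather than corrected.

```python
from datetime import date
from typing import Dict, Optional

# Earliest plausible start date per drug class (day of month is an assumption).
FIRST_MARKETED: Dict[str, date] = {"GLP-1RA": date(2005, 4, 1)}


def audit_start_date(drug_class: str, start: Optional[date]) -> Optional[date]:
    """Return None (missing) when the start date precedes first marketing."""
    earliest = FIRST_MARKETED.get(drug_class)
    if start is not None and earliest is not None and start < earliest:
        return None
    return start
```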
5. RESULTS
To evaluate the performance of the described methods, we chose to focus on the estimation of the duration of treatment with two widely used anti-diabetic drugs, namely GLP-1RA and insulin. In the CEMR database 1,861,560 patients were identified as having been diagnosed with type 2 diabetes mellitus, as inferred from the assigned ICD-9 codes.
5.1. Case Study 1
As the first case study, we consider a randomly selected patient from the CEMR database, whose relevant treatment details are shown in Table 3. The treatments with EXENATIDE and INSULIN GLARGINE started on the 18th of June 2007. The treatment with EXENATIDE was terminated on the 7th of January 2008, while INSULIN therapy continued until the last recorded follow-up date on the 24th of January 2008 (notice that the treatment is flagged as active, “Y”). In this case, the “chaining” and “continuous” methods produce the same estimates for the durations of the two treatments. Specifically, the estimates corresponding to insulin and GLP-1RA are 7.2 and 6.7 months, respectively.
Table 3. Snapshot of MF table – combining therapies. Patient’s last follow-up date was identified as 24 January 2008.

| GPI category 4 | Medication key (M) | Patient key (P) | Create date (C) | Start date (B) | Stop date (S) | Active flag (F) | Chain ID (H) | Chain seq (G) | End date (“chaining”) | End date (“continuous”) |
|---|---|---|---|---|---|---|---|---|---|---|
| INSULIN GLARGINE | 682327 | 15219411 | 18-Jun-07 | 18-Jun-07 | | N | 136664321 | 0 | 20-Jun-07 | 20-Jun-07 |
| EXENATIDE | 12670645 | 15219411 | 18-Jun-07 | 18-Jun-07 | | N | 136664552 | 0 | 17-Oct-07 | 15-Oct-07 |
| INSULIN GLARGINE | 1096062 | 15219411 | 20-Jun-07 | 20-Jun-07 | | N | 136664321 | 1 | 7-Jan-08 | 7-Jan-08 |
| EXENATIDE | 12670548 | 15219411 | 17-Oct-07 | 15-Oct-07 | | N | 136664552 | 1 | 7-Jan-08 | 7-Jan-08 |
| INSULIN GLARGINE | 1096062 | 15219411 | 7-Jan-08 | 7-Jan-08 | | Y | 136664321 | 2 | 24-Jan-08 | 24-Jan-08 |
| EXENATIDE | 12670548 | 15219411 | 7-Jan-08 | 7-Jan-08 | 7-Jan-08 | N | 136664552 | 2 | 7-Jan-08 | 7-Jan-08 |
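The “chaining” end dates in Table 3 can be reproduced with a short sketch of rules 4.1-4.5 and Step 5 of the “chaining” method. This is an illustration for a single patient, not the authors' implementation: the `Rec` class and function name are hypothetical, sd = 1 is assumed to mean a 30-day month, and only the upper bound of the follow-up interval is enforced.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

SD = timedelta(days=30)  # assumption: sd = 1 read as one 30-day month


@dataclass
class Rec:
    drug: str
    chain_id: int
    chain_seq: int
    created: date
    start: Optional[date]
    stop: Optional[date] = None
    active: bool = False
    end: Optional[date] = None


def chaining_end_dates(recs: List[Rec], last_follow_up: date) -> List[Rec]:
    """Apply rules 4.1-4.5 and the Step 5 cap to one patient's records."""
    # Step 3: chain ID ascending, chain sequence descending within a chain.
    ordered = sorted(recs, key=lambda r: (r.chain_id, -r.chain_seq))
    prev: Optional[Rec] = None
    for r in ordered:
        first_in_chain = prev is None or prev.chain_id != r.chain_id
        if r.stop is not None:            # rule 4.1
            end = r.stop
        elif r.active:                    # rule 4.2
            end = last_follow_up
        elif first_in_chain:              # rules 4.3 and 4.4
            end = (r.start or r.created) + SD
        else:                             # rule 4.5
            end = prev.created
        r.end = min(end, last_follow_up)  # Step 5, upper bound only
        prev = r
    return ordered


# Rows of Table 3 (stop date and active flag given only where recorded).
table3 = [
    Rec("INSULIN GLARGINE", 136664321, 0, date(2007, 6, 18), date(2007, 6, 18)),
    Rec("EXENATIDE", 136664552, 0, date(2007, 6, 18), date(2007, 6, 18)),
    Rec("INSULIN GLARGINE", 136664321, 1, date(2007, 6, 20), date(2007, 6, 20)),
    Rec("EXENATIDE", 136664552, 1, date(2007, 10, 17), date(2007, 10, 15)),
    Rec("INSULIN GLARGINE", 136664321, 2, date(2008, 1, 7), date(2008, 1, 7), active=True),
    Rec("EXENATIDE", 136664552, 2, date(2008, 1, 7), date(2008, 1, 7), stop=date(2008, 1, 7)),
]
ends = {(r.chain_id, r.chain_seq): r.end
        for r in chaining_end_dates(table3, last_follow_up=date(2008, 1, 24))}
```

Because records are visited in descending chain sequence, the "previous record" of rule 4.5 is the chronologically later adjustment within the same chain, which is why the first EXENATIDE record ends on the create date of the second one.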
5.2. Case Study 2
As an insightful case study, we consider a patient whose relevant treatment details are shown in Table 4. Since all of the records shown have the same chain ID, it can be concluded that in the period from the 23rd of April 2010 until the 13th of March 2013 the patient was alternating between two therapies, namely GLP-1RA (EXENATIDE) and insulin (INSULIN GLARGINE). This example illustrates the importance of chain ID information, as readily corroborated by comparing the predicted therapy end dates using the “chaining” and “continuous” methods (per-record estimates are shown in the two rightmost columns of Table 4). Because the latter disregards chain ID information, it implicitly assumes that EXENATIDE was taken continuously from the 23rd of April 2010 until the 27th of April 2011, with the last prescription date being the 28th of March 2011. However, treatment with EXENATIDE was terminated on the 29th of December 2010, when a switch to insulin was made. Treatment with insulin continued until the 28th of March 2011, when a switch back to EXENATIDE occurred. This complex and frequent pattern of therapy alteration leads to vastly different treatment duration estimates when chain ID information is used (“chaining”) and when it is not (“continuous”). For example, in this particular case, the “chaining” approach estimates the total duration of insulin/EXENATIDE treatment to be 5.7/28.9 months, compared to 26.5/12.1 months estimated by the “continuous” method.
Table 4. Snapshot of MF table – switching between therapies. Patient’s last follow-up date was identified as 13 March 2013.

| GPI category 4 | Medication key (M) | Patient key (P) | Create date (C) | Start date (B) | Stop date (S) | Active flag (F) | Chain ID (H) | Chain seq (G) | End date (“chaining”) | End date (“continuous”) |
|---|---|---|---|---|---|---|---|---|---|---|
| EXENATIDE | 1523512 | 64832053 | 23-Apr-10 | 23-Apr-10 | | N | 1002923273 | 0 | 29-Dec-10 | 28-Mar-11 |
| INSULIN GLARGINE | 682327 | 64832053 | 29-Dec-10 | 29-Dec-10 | | N | 1002923273 | 1 | 06-Jan-11 | 06-Jan-11 |
| INSULIN GLARGINE | 682327 | 64832053 | 06-Jan-11 | 06-Jan-11 | | N | 1002923273 | 2 | 28-Mar-11 | 18-Dec-12 |
| EXENATIDE | 1523512 | 64832053 | 28-Mar-11 | 28-Mar-11 | | N | 1002923273 | 3 | 18-Dec-12 | 27-Apr-11 |
| INSULIN GLARGINE | 682327 | 64832053 | 18-Dec-12 | 18-Dec-12 | | N | 1002923273 | 4 | 13-Mar-13 | 13-Mar-13 |
| INSULIN GLARGINE | 682327 | 64832053 | 13-Mar-13 | 13-Mar-13 | | Y | 1002923273 | 5 | 13-Mar-13 | 13-Mar-13 |
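The effect of the two sets of end dates on total treatment duration can be illustrated by applying the duration formula of Step 16 of the “chaining” method to the rows of Table 4. The sketch below is illustrative only: the function and variable names are hypothetical, durations are counted in days, and `None` stands in for the initial value $e_0 = 0$.

```python
from datetime import date
from typing import Dict, Iterable, List, Tuple

Interval = Tuple[date, date]


def total_days(intervals: Iterable[Interval]) -> int:
    """Step 16: sum interval lengths without double counting overlaps."""
    total = 0
    prev_end = None  # plays the role of e_0
    for start, end in sorted(intervals):
        if prev_end is None or start >= prev_end:
            total += (end - start).days      # no overlap with previous record
        elif end >= prev_end:
            total += (end - prev_end).days   # count only the non-overlapping tail
        prev_end = end
    return total


# (drug, start date, end date by "chaining", end date by "continuous"), from Table 4
table4 = [
    ("EXENATIDE", date(2010, 4, 23), date(2010, 12, 29), date(2011, 3, 28)),
    ("INSULIN GLARGINE", date(2010, 12, 29), date(2011, 1, 6), date(2011, 1, 6)),
    ("INSULIN GLARGINE", date(2011, 1, 6), date(2011, 3, 28), date(2012, 12, 18)),
    ("EXENATIDE", date(2011, 3, 28), date(2012, 12, 18), date(2011, 4, 27)),
    ("INSULIN GLARGINE", date(2012, 12, 18), date(2013, 3, 13), date(2013, 3, 13)),
    ("INSULIN GLARGINE", date(2013, 3, 13), date(2013, 3, 13), date(2013, 3, 13)),
]


def days_by_drug(end_column: int) -> Dict[str, int]:
    """Group Table 4 rows by drug and total the days for one end-date column."""
    grouped: Dict[str, List[Interval]] = {}
    for row in table4:
        grouped.setdefault(row[0], []).append((row[1], row[end_column]))
    return {drug: total_days(ivs) for drug, ivs in grouped.items()}


chaining = days_by_drug(2)
continuous = days_by_drug(3)
```

With the “chaining” end dates EXENATIDE accumulates far more days than insulin, while the “continuous” end dates reverse the picture; dividing the day counts by an average month length gives values close to the monthly figures quoted in Case Study 2.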
5.3. General Analysis
Given our focus on GLP-1RA and insulin, to facilitate further analysis, from the cohort of all T2DM patients we selected those who at any point in their medical history received treatment with either of the two drugs of interest. Text mining of drug names in the MD table revealed various insulin regimens as well as related devices (e.g. insulin syringe). We found that approximately 30% of the patients in the T2DM cohort received at least one prescription for an insulin drug. Interestingly, a large number of patients (~25,000) were found to have received prescriptions for insulin devices but not for insulin therapy itself. Further exploration of these patients revealed that the average duration of use of these devices in this patient group was 21 months (Table 5), strongly suggesting that there was an accompanying insulin therapy which was not recorded in the stored EMRs. This conclusion is further corroborated by the finding that the mean glycated haemoglobin (HbA1c) level for these patients was 7.8% on the date of the first record associated with the device.
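Table 5. Summary statistics on the estimated duration in months of treatment with specific medications in T2DM cohort (n=1,861,560) by “chaining” and “continuous” methods, and the difference in the estimated duration between “chaining” and “continuous” methods.

All patients

| Drug group | Method | n (%) | Mean (sd) | (min, max) | Median (IQR) |
|---|---|---|---|---|---|
| Insulin + device | “Chaining” | 588923 (32) | 32.5 (35) | (0, 657.8) | 21.6 (6.5, 46.8) |
| Insulin + device | “Continuous” | 591441 (32) | 32.7 (34.9) | (0, 657.8) | 21.8 (6.3, 47.4) |
| Insulin + device | “Chaining” - “continuous” | 588923 (32) | -0.2 (4.8) | (-167.8, 183.4) | 0 (0, 0) |
| Insulin only | “Chaining” | 563293 (30) | 32.0 (34.9) | (0, 657.8) | 20.8 (6.1, 45.8) |
| Insulin only | “Continuous” | 566014 (30) | 32.2 (34.8) | (0, 657.8) | 21.0 (6, 46.5) |
| Insulin only | “Chaining” - “continuous” | 563293 (30) | -0.3 (5) | (-167.8, 176.9) | 0 (0, 0) |
| No insulin, but device | “Chaining” | 25536 (1) | 21.2 (21.5) | (0, 196.8) | 14.3 (4.8, 30.9) |
| No insulin, but device | “Continuous” | 25386 (1) | 21.2 (21.9) | (0, 190.7) | 14.1 (4.4, 31.1) |
| No insulin, but device | “Chaining” - “continuous” | 24910 (1) | -0.2 (5) | (-131.8, 183.4) | 0 (0, 0) |
| GLP1RA | “Chaining” | 113416 (6) | 18.3 (19.4) | (0, 110.7) | 11.7 (3.9, 26) |
| GLP1RA | “Continuous” | 114316 (6) | 19.2 (21.0) | (0, 111.7) | 11.7 (3.5, 27.4) |
| GLP1RA | “Chaining” - “continuous” | 113416 (6) | -1.0 (7.6) | (-103.9, 95.4) | 0 (0, 0) |
| Exenatide | “Chaining” | 73326 (4) | 18.8 (20.2) | (0, 110.7) | 11.6 (3.9, 26.5) |
| Exenatide | “Continuous” | 74060 (4) | 18.8 (21.4) | (0, 111.2) | 10.6 (3.1, 26.7) |
| Exenatide | “Chaining” - “continuous” | 73326 (4) | -0.2 (8.4) | (-97.0, 95.4) | 0 (0, 0) |
| Liraglutide | “Chaining” | 56406 (3) | 12.5 (11.9) | (0, 56.2) | 8.6 (3, 19) |
| Liraglutide | “Continuous” | 56907 (3) | 12.7 (12.4) | (0, 56.2) | 8.3 (2.5, 19.5) |
| Liraglutide | “Chaining” - “continuous” | 56406 (3) | -0.3 (4.0) | (-49.5, 47.5) | 0 (0, 0) |
| Albiglutide | “Chaining” | 14 (0) | 1.3 (0.5) | (1, 2.4) | 1 (1, 1.9) |
| Albiglutide | “Continuous” | 15 (0) | 1.3 (0.5) | (1, 2.4) | 1.0 (1, 1.9) |
| Albiglutide | “Chaining” - “continuous” | 14 (0) | 0 (0) | (0, 0) | 0 (0, 0) |

In patients with treatment duration ≥2 months

| Drug group | Method | n (%) | Mean (sd) | (min, max) | Median (IQR) |
|---|---|---|---|---|---|
| Insulin + device | “Chaining” | 518000 (28) | 36.8 (35.2) | (2, 657.8) | 26.4 (11.1, 51.6) |
| Insulin + device | “Continuous” | 518318 (28) | 37.1 (35.0) | (2, 657.8) | 26.8 (11.2, 52.3) |
| Insulin + device | “Chaining” - “continuous” | 516808 (28) | -0.3 (4.9) | (-167.8, 176.9) | 0 (0, 0) |
| Insulin only | “Chaining” | 492992 (26) | 36.4 (35.2) | (2, 657.8) | 25.8 (10.7, 50.8) |
| Insulin only | “Continuous” | 493494 (27) | 36.7 (35.1) | (2, 657.8) | 26.3 (10.9, 51.6) |
| Insulin only | “Chaining” - “continuous” | 491847 (26) | -0.4 (5.2) | (-167.8, 176.9) | 0 (0, 0) |
| No insulin, but device | “Chaining” | 22085 (1) | 24.3 (21.5) | (2, 196.8) | 17.8 (8, 34.1) |
| No insulin, but device | “Continuous” | 21628 (1) | 24.7 (21.9) | (2, 190.7) | 18.0 (8, 34.8) |
| No insulin, but device | “Chaining” - “continuous” | 21342 (1) | -0.5 (4.1) | (-131.8, 65.3) | 0 (0, 0) |
| GLP1RA | “Chaining” | 96458 (5) | 21.3 (19.6) | (2, 110.7) | 14.9 (6.8, 29.3) |
| GLP1RA | “Continuous” | 94972 (5) | 22.9 (21.3) | (2, 111.7) | 15.7 (6.9, 31.8) |
| GLP1RA | “Chaining” - “continuous” | 94372 (5) | -1.5 (7.8) | (-103.9, 95.4) | 0 (0, 0) |
| Exenatide | “Chaining” | 62538 (3) | 21.8 (20.4) | (2, 110.7) | 14.7 (6.6, 30.4) |
| Exenatide | “Continuous” | 60228 (3) | 22.9 (21.7) | (2, 111.2) | 15.0 (6.5, 32.1) |
| Exenatide | “Chaining” - “continuous” | 59812 (3) | -0.8 (8.0) | (-97.0, 95.4) | 0 (0, 0) |
| Liraglutide | “Chaining” | 45432 (2) | 15.3 (11.6) | (2, 56.2) | 12 (5.8, 22.1) |
| Liraglutide | “Continuous” | 44344 (2) | 16 (12.2) | (2, 56.2) | 12.5 (5.9, 23.4) |
| Liraglutide | “Chaining” - “continuous” | 43991 (2) | -0.6 (3.9) | (-49.5, 43.9) | 0 (0, 0) |
| Albiglutide | “Chaining” | 2 (0) | 2.2 (0.2) | (2.1, 2.4) | 2.2 (2.1, 2.4) |
| Albiglutide | “Continuous” | 2 (0) | 2.2 (0.2) | (2.1, 2.4) | 2.2 (2.1, 2.4) |
| Albiglutide | “Chaining” - “continuous” | 2 (0) | 0 (0) | (0, 0) | 0 (0, 0) |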
The number of patients receiving insulin and GLP-1RA, and the corresponding treatment duration estimates (in months) produced by our algorithms (“chaining” and “continuous”), are summarized in Table 5. Different insulin regimens were treated jointly, as we found that any finer level of detail is poorly recorded in the database. As regards GLP-1RA treatment, only three different GLP-1RA drugs (namely, Exenatide, Liraglutide, and Albiglutide) have been used. As Albiglutide was new to the market (introduced in 2014), only limited data were available for Albiglutide treatment.
The estimate of the proportion of patients identified as having received specific individual drugs was found to be very similar using both the “chaining” approach and the non-chain ID based alternative “continuous” approach, as shown in Table 5. The corresponding values of the key statistics, namely the mean, standard deviation (SD), median, and the interquartile range (IQR), of the respective estimates of the duration of treatment with individual drugs were also similar. The average differences in the estimated duration of treatment with insulin only and GLP-1RA drugs were 0.3 month and 1 month respectively. There were no differences at the median levels. Separate analyses for patients with a minimum of 2 months of treatment duration with individual therapies gave similar results. However, it is important to note that although the cumulative statistics of the estimated treatment durations with different therapies were not significantly different, we did find notable differences in the minimum and maximum duration estimates for specific patient subgroups, as evident from Table 5.
6. DISCUSSION
In this work we addressed a number of challenging data mining related issues while extracting patient-level longitudinal information on prescription patterns and medication usage from large relational databases (our data set comprises more than a billion records). There are several key contributions of note. Firstly, we identified the specific challenges which automatic methods must deal with in the processing of this complex voluminous data. We corroborated our arguments using analysis of real-world EMRs and discussed the importance and the implications of being able to handle erroneous and incomplete longitudinal information. Secondly, we introduced two methods for the estimation of the duration of treatment with specific drug(s) in the presence of the aforementioned challenges. The sequentially ordered case-by-case rules that we developed were presented mathematically. To the best of our knowledge, no robust algorithmic approach has yet been reported to evaluate treatment duration with individual medications in a multiple treatment scenario [22, 27].
We have described two algorithmic approaches to estimate treatment duration on the individual record level. The first method (“chaining”) relies on specific chaining fields of medication information, while the second approach (“continuous”) does not use chain-related information and employs only chronological record information instead. Our results on the large Centricity EMR database show that the two approaches do not produce significantly different results on average at population level. However, when examined in detail, the “chaining” method could identify the treatment alterations longitudinally and was shown to be more robust at individual patient level. Furthermore, treatment duration estimates from the “continuous” approach are more sensitive to the set of selected medications. The difference between methods is particularly prominent in studies involving multiple drugs as opposed to single drug therapies, or focusing on the order of treatment initiation [48, 49].
Our study highlighted the potential risk of underestimating the duration of treatment when EMR data is used directly, due to erroneous or incomplete data emerging from omissions in the data entry process, appointments missed by patients, typographical errors, or numerous other causes. Both proposed algorithms robustly handle these challenges whenever possible, estimating values of the missing or erroneous entries. Importantly, being rule based, the decisions of our algorithms are readily interpretable by humans and lend themselves to effortless use by medical professionals not necessarily proficient in data mining and related disciplines. Both approaches use the two fact datasets available in the Centricity EMRs; however, the algorithms are easily adjusted if only one dataset is available.
CONCLUSION
This study discusses the challenges in exploring the prescription / medication patterns for individual patients in large primary / ambulatory care electronic databases, and introduces two algorithmic approaches for robust estimation of treatment duration with individual drug(s). We have demonstrated that using the chaining fields of medication information additionally improves the quality of estimates. Given the importance of extracting medication information appropriately in pharmaco-epidemiological studies based on real-world data, the proposed algorithms have the potential to significantly contribute to the analytical quality of future EMR-based clinical and epidemiological studies.
LIST OF ABBREVIATIONS
EMR = Electronic Medical Records
CEMR = Centricity Electronic Medical Records
T2DM = Type 2 Diabetes
MD = Medication Dimension
MF = Medication Fact
PF = Prescription Fact
GPI = Generic Product Identifier
ID = Identification
DOB = Date of Birth
SD = Standard Deviation
IQR = Interquartile Range
GLP-1RA = Glucagon-Like Peptide-1 Receptor Agonist
ETHICS APPROVAL AND CONSENT TO PARTICIPATE
Not applicable.
HUMAN AND ANIMAL RIGHTS
No animals/humans were used for studies that are the basis of this research.
CONSENT FOR PUBLICATION
Not applicable.
CONFLICT OF INTEREST
Sanjoy K. Paul (SKP) has acted as a consultant and/or speaker for Novartis, GI Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi Pharmaceutical and Amylin Pharmaceuticals LLC. He has received grants in support of investigator and investigator-initiated clinical studies from Merck, Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis and Pfizer. Olga Montvida (OM) and Ognjen Arandjelovic (OA) have no conflict of interest to declare. Edward Reiner (ER) was an employee of Quintiles and was responsible for the strategic development of the Centricity EMR database.
ACKNOWLEDGEMENTS
Olga Montvida (OM) and Sanjoy K. Paul (SKP) conceived the idea and were responsible for the primary design of the study and the methodological developments. Ognjen Arandjelovic (OA) and Edward Reiner (ER) evaluated the methodological approach. Olga Montvida (OM) conducted the data extraction and statistical analyses. The first draft of the manuscript was developed by Sanjoy K. Paul (SKP) and Olga Montvida (OM), and all authors contributed to the finalization of the manuscript. Sanjoy K. Paul (SKP) had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Melbourne EpiCentre gratefully acknowledges the support from the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS) initiative through Therapeutic Innovation Australia. No separate funding was obtained for this study.
REFERENCES
[1] Paul SK, Klein K, Thorsted BL, Wolden ML, Khunti K. Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovasc Diabetol 2015; 14: 100. [http://dx.doi.org/10.1186/s12933-015-0260-x] [PMID: 26249018]
[2] Bhatnagar P, Wickramasinghe K, Williams J, Rayner M, Townsend N. The epidemiology of cardiovascular disease in the UK 2014. Heart 2015; 101(15): 1182-9. [http://dx.doi.org/10.1136/heartjnl-2015-307516] [PMID: 26041770]
[3] Crawford AG, Cote C, Couto J, et al. Comparison of GE Centricity Electronic Medical Record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Popul Health Manag 2010; 13(3): 139-50. [http://dx.doi.org/10.1089/pop.2009.0036] [PMID: 20568974]
[4] Wettermark B, Zoëga H, Furu K, et al. The Nordic prescription databases as a resource for pharmacoepidemiological research--a literature review. Pharmacoepidemiol Drug Saf 2013; 22(7): 691-9. [http://dx.doi.org/10.1002/pds.3457] [PMID: 23703712]
[5] Lau EC, Mowat FS, Kelsh MA, et al. Use of electronic medical records (EMR) for oncology outcomes research: assessing the comparability of EMR information to patient registry and health claims data. Clin Epidemiol 2011; 3: 259-72. [PMID: 22135501]
[6] Paul SK, Klein K, Maggs D, Best JH. The association of the treatment with glucagon-like peptide-1 receptor agonist exenatide or insulin with cardiovascular outcomes in patients with type 2 diabetes: A retrospective observational study. Cardiovasc Diabetol 2015; 14: 10. [http://dx.doi.org/10.1186/s12933-015-0178-3] [PMID: 25616979]
[7] Nadkarni PM. Drug safety surveillance using de-identified EMR and claims data: issues and challenges. J Am Med Inform Assoc 2010; 17(6): 671-4. [http://dx.doi.org/10.1136/jamia.2010.008607] [PMID: 20962129]
[8] Liu M, McPeek Hinz ER, Matheny ME, et al. Comparative analysis of pharmacovigilance methods in the detection of adverse drug reactions using electronic medical records. J Am Med Inform Assoc 2013; 20(3): 420-6. [http://dx.doi.org/10.1136/amiajnl-2012-001119] [PMID: 23161894]
[9] Coloma PM, Trifirò G, Patadia V, Sturkenboom M. Postmarketing safety surveillance: where does signal detection using electronic healthcare records fit into the big picture? Drug Saf 2013; 36(3): 183-97. [http://dx.doi.org/10.1007/s40264-013-0018-x] [PMID: 23377696]
[10] Lin J, Jiao T, Biskupiak JE, McAdam-Marx C. Application of electronic medical record data for health outcomes research: a review of recent literature. Expert Rev Pharmacoecon Outcomes Res 2013; 13(2): 191-200. [http://dx.doi.org/10.1586/erp.13.7] [PMID: 23570430]
[11] Belletti D, Zacker C, Mullins CD. Perspectives on electronic medical records adoption: Electronic Medical Records (EMR) in outcomes research. Patient Relat Outcome Meas 2010; 1: 29-37. [http://dx.doi.org/10.2147/PROM.S8896] [PMID: 22915950]
[12] Khunti K, Davies M, Majeed A, Thorsted BL, Wolden ML, Paul SK. Hypoglycemia and risk of cardiovascular disease and all-cause mortality in insulin-treated people with type 1 and type 2 diabetes: a cohort study. Diabetes Care 2015; 38(2): 316-22. [http://dx.doi.org/10.2337/dc14-0920] [PMID: 25492401]
[13] Canavan C, West J, Card T. Calculating Total Health Service Utilisation and Costs from Routinely Collected Electronic Health Records Using the Example of Patients with Irritable Bowel Syndrome Before and After Their First Gastroenterology Appointment. Pharmacoeconomics 2016; 34(2): 181-94. [PMID: 26497004]
[14] Bessou A, Guelfucci F, Aballea S, Toumi M, Poole C. Comparison of comorbidity measures to predict economic outcomes in a large UK primary care database. Value Health 2015; 18(7): A691. [http://dx.doi.org/10.1016/j.jval.2015.09.2565]
[15] Birkhead GS, Klompas M, Shah NR. Uses of electronic health records for public health surveillance to advance public health. Annu Rev Public Health 2015; 36: 345-59. [http://dx.doi.org/10.1146/annurev-publhealth-031914-122747] [PMID: 25581157]
[16] Paul MM, Greene CM, Newton-Dame R, et al. The state of population health surveillance using electronic health records: a narrative review. Popul Health Manag 2015; 18(3): 209-16. [http://dx.doi.org/10.1089/pop.2014.0093] [PMID: 25608033]
[17] Kukafka R, Ancker JS, Chan C, et al. Redesigning electronic health record systems to support public health. J Biomed Inform 2007; 40(4): 398-409. [http://dx.doi.org/10.1016/j.jbi.2007.07.001] [PMID: 17632039]
[18] Menachemi N, Collum TH. Benefits and drawbacks of electronic health record systems. Risk Manag Healthc Policy 2011; 4: 47-55. [http://dx.doi.org/10.2147/RMHP.S12985] [PMID: 22312227]
[19] Crapo J. Big data in healthcare: separating the hype from the reality. HealthCatalyst 2015; p. 5.
[20] Grabenbauer L, Skinner A, Windle J. Electronic Health Record Adoption - Maybe It’s not about the Money: Physician Super-Users, Electronic Health Records and Patient Care. Appl Clin Inform 2011; 2(4): 460-71. [http://dx.doi.org/10.4338/ACI-2011-05-RA-0033] [PMID: 23616888]
[21] Paul SK, Klein K, Majeed A, Khunti K. Association of smoking and concomitant metformin use with cardiovascular events and mortality in people newly diagnosed with type 2 diabetes. J Diabetes 2016; 8(3): 354-62. [http://dx.doi.org/10.1111/1753-0407.12302] [PMID: 25929583]
[22] Gaitanou P, Garoufallou E, Balatsoukas P. The effectiveness of big data in health care: a systematic review. Commun Comput Inf Sci 2014; 141-53. [http://dx.doi.org/10.1007/978-3-319-13674-5_14]
[23] Svensson MK, Cederholm J, Eliasson B, Zethelius B, Gudbjörnsdottir S. Albuminuria and renal function as predictors of cardiovascular events and mortality in a general population of patients with type 2 diabetes: a nationwide observational study from the Swedish National Diabetes Register. Diab Vasc Dis Res 2013; 10(6): 520-9. [http://dx.doi.org/10.1177/1479164113500798] [PMID: 24002670]
[24] Dean BB, Lam J, Natoli JL, Butler Q, Aguilar D, Nordyke RJ. Review: use of electronic medical records for health outcomes research: a literature review. Med Care Res Rev 2009; 66(6): 611-38. [http://dx.doi.org/10.1177/1077558709332440] [PMID: 19279318]
[25] Wei WQ, Denny JC. Extracting research-quality phenotypes from electronic health records to support precision medicine. Genome Med 2015; 7(1): 41. [http://dx.doi.org/10.1186/s13073-015-0166-y] [PMID: 25937834]
[26] Denny JC. Chapter 13: Mining electronic health records in the genomics era. PLOS Comput Biol 2012; 8(12): e1002823. [http://dx.doi.org/10.1371/journal.pcbi.1002823] [PMID: 23300414]
[27] Torre C, Martins AP. Overview of Pharmacoepidemiological Databases in the Assessment of Medicines Under real-life Conditions. In: Lunet N, Eds. Epidemiology-current perspective on Research and practical Intech open publishers contributers 2012; pp. 131-54. [http://dx.doi.org/10.5772/35318]
[28] Centricity Electronic Medical Record Brochure. GE Healthcare 2011.
[29] Lin J, Jiao T, Biskupiak JE, McAdam-Marx C. Application of electronic medical record data for health outcomes research: a review of recent literature. Expert Rev Pharmacoecon Outcomes Res 2013; 13(2): 191-200. [http://dx.doi.org/10.1586/erp.13.7] [PMID: 23570430]
[30] Jermyn P, Dixon M, Read BJ. Preparing clean views of data for data mining. ERCIM Work on Database Res 1999; pp. 1-15.
[31] Zhang S, Zhang C, Yang Q. Data preparation for data mining. Appl Artif Intell 2003; 17(5-6): 375-81. [http://dx.doi.org/10.1080/713827180]
[32] Jensen PB, Jensen LJ, Brunak S. Mining electronic health records: towards better research applications and clinical care. Nat Rev Genet 2012; 13(6): 395-405. [http://dx.doi.org/10.1038/nrg3208] [PMID: 22549152]
[33] Benchimol EI, Smeeth L, Guttmann A, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med 2015; 12(10): e1001885. [http://dx.doi.org/10.1371/journal.pmed.1001885] [PMID: 26440803]
[34] PLOS Medicine Editors. From checklists to tools: Lowering the barrier to better research reporting. PLoS Med 2015; 12(11): e1001910. [http://dx.doi.org/10.1371/journal.pmed.1001910] [PMID: 26600090]
[35] Yao L, Zhang Y, Li Y, Sanseau P, Agarwal P. Electronic health records: Implications for drug discovery. Drug Discov Today 2011; 16(13-14): 594-9. [http://dx.doi.org/10.1016/j.drudis.2011.05.009] [PMID: 21624499]
[36] Hall GC, McMahon AD, Dain M-P, Home PD. A comparison of duration of first prescribed insulin therapy in uncontrolled type 2 diabetes. Diabetes Res Clin Pract 2011; 94(3): 442-8. [http://dx.doi.org/10.1016/j.diabres.2011.09.003] [PMID: 21963105]
[37] Hansen RA, Farley JF, Maciejewski ML, Ye X, Qian C, Powers B. Real-world utilization patterns and outcomes of colesevelam HCl in the GE electronic medical record. BMC Endocr Disord 2013; 13(1): 24. [http://dx.doi.org/10.1186/1472-6823-13-24] [PMID: 23866087]
[38] Hippisley-Cox J, Coupland C. 2016.
[39] Fardet L, Petersen I, Nazareth I. Prevalence of long-term oral glucocorticoid prescriptions in the UK over the past 20 years. Rheumatology (Oxford) 2011; 50(11): 1982-90. [http://dx.doi.org/10.1093/rheumatology/ker017] [PMID: 21393338]
[40] Davis KL, Tangirala M, Meyers JL, Wei W. Real-world comparative outcomes of US type 2 diabetes patients initiating analog basal insulin therapy. Curr Med Res Opin 2013; 29(9): 1083-91. [http://dx.doi.org/10.1185/03007995.2013.811403] [PMID: 23734906]
[41] Xie L, Wei W, Pan C, Du J, Baser O. A real-world study of patients with type 2 diabetes initiating basal insulins via disposable pens. Adv Ther 2011; 28(11): 1000-11. [http://dx.doi.org/10.1007/s12325-011-0074-5] [PMID: 22038703]
[42] Deléger L, Grouin C, Zweigenbaum P. Extracting medical information from narrative patient records: the case of medication-related information. J Am Med Inform Assoc 2010; 17(5): 555-8. [http://dx.doi.org/10.1136/jamia.2010.003962] [PMID: 20819863]
[43] Etminan M. Reporting guidelines for pharmacoepidemiological studies are urgently needed. BMJ 2014; 349: g5511. [http://dx.doi.org/10.1136/bmj.g5511] [PMID: 25231185]
[44] Kamal KM, Chopra I, Elliott JP, Mattei TJ. Use of electronic medical records for clinical research in the management of type 2 diabetes. Res Social Adm Pharm 2014; 10(6): 877-84. [http://dx.doi.org/10.1016/j.sapharm.2014.01.001] [PMID: 24556384]
[45] Herrin J, da Graca B, Nicewander D, et al. The effectiveness of implementing an electronic health record on diabetes care and outcomes. Health Serv Res 2012; 47(4): 1522-40. [http://dx.doi.org/10.1111/j.1475-6773.2011.01370.x] [PMID: 22250953]
[46] Holt TA, Stables D, Hippisley-Cox J, O’Hanlon S, Majeed A. Identifying undiagnosed diabetes: cross-sectional survey of 3.6 million patients’ electronic records. Br J Gen Pract 2008; 58(548): 192-6. [http://dx.doi.org/10.3399/bjgp08X277302] [PMID: 18318973]
[47] Davis KL, Tangirala M, Meyers JL, Wei W. Real-world comparative outcomes of US type 2 diabetes patients initiating analog basal insulin therapy. Curr Med Res Opin 2013; 29(9): 1083-91. [http://dx.doi.org/10.1185/03007995.2013.811403] [PMID: 23734906]
[48] Paul SK, Klein K, Majeed A, Khunti K. Association of smoking and concomitant use of metformin with cardiovascular events and mortality in people newly diagnosed with type 2 diabetes. J Diabetes 2015; 8(3): 354-62. [PMID: 25929583]
[49] Paul SK, Klein K, Maggs D, Best JH. The association of the treatment with glucagon-like peptide-1 receptor agonist exenatide or insulin with cardiovascular outcomes in patients with type 2 diabetes: a retrospective observational study. Cardiovasc Diabetol 2015; 14(1): 10. [http://dx.doi.org/10.1186/s12933-015-0178-3] [PMID: 25616979]
© 2017 Montvida et al.
This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: (https://creativecommons.org/licenses/by/4.0/legalcode). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Chapter 5: Diabetes Mellitus Cohort
Using diagnostic codes to identify a diseased cohort is the standard approach in EMR-based
studies [163, 164]. As mentioned in chapter 3, such codes have been shown to be a reliable
tool for identifying diseased cohorts in various databases, including CEMR. Nonetheless, inaccurate
coding and incomplete data entry are part of everyday practice, and EMRs reflect
this [165]. With regard to DM, identifying cohorts by diagnostic codes raises the
following concerns: (1) unknown disease subtype from high-level diagnostic codes, (2)
longitudinally overlapping codes of different subtypes for individual patients, (3) absence of
diagnostic codes for patients with DM (false negatives), and (4) presence of DM codes for
those who are not diseased (false positives). The accuracy of disease cohorts identified with
diagnostic codes, theoretically, may be improved by implementing advanced algorithms that
robustly utilise available patient-level information.
Appendix B provides more details on the aforementioned challenges and existing methods
trying to overcome them. A systematic review summarising methods on identification of
disease cohorts from various clinical databases has been conducted by Shivade and colleagues
[166]. While some studies applied restrictive rules to cohorts identified by diagnostic codes,
several studies applied and compared machine learning (ML) techniques. For example, Tapak
and colleagues [167] reported the support vector machine as the best classifier, while Mani and
colleagues [168] reported decision trees as the best approach to identify patients with
diabetes.
As patients with T2DM are the primary focus of this dissertation, the aim was to achieve the
highest possible quality of the T2DM cohort. To overcome the aforementioned challenges,
diagnostic codes were initially used to identify a cohort of patients with any DM (subsection
5.1). Next, machine learning (ML) approaches that seek to identify all patients in the database
who are highly likely to be diabetic were compared (subsection 5.2). Afterwards, the cohorts
identified by the one selected ML algorithm and by diagnostic codes were combined. To
obtain the final cohort, additional clinically guided rules were incorporated (subsection 5.3).
Next, basic characteristics of the obtained DM cohort were compared to national reports
(subsection 5.4). Finally, basic characteristics were explored in adults identified to have T2DM
(subsection 5.5).
5.1 DIAGNOSTIC CODES
Patients with diagnostic codes for any type of diabetes were identified from CEMR. After
quality assessments and data corrections, 111,303 and 2,160,098 patients were identified to
have type 1 and type 2 diabetes, respectively (Figure 5.1). Patients with records for gestational
diabetes and no record of type 1 or type 2 diabetes were treated separately (n=89,562). Patients
were categorised as “unspecific” diabetes type (n=118,557) if they had only high-level diagnostic
codes (e.g. “ICD-9: 250 Diabetes Mellitus”), unspecific codes (e.g. “ICD-9: 362.07 Diabetic
macular edema”), or longitudinally overlapping codes for type 1 and type 2. Among the
aforementioned 2,479,520 patients, 59% (n=1,468,246) had a record of ADD use with a duration
longer than 6 months, and only 38% (n=935,652) had a record of 2 elevated glucose measures
within one year.
Figure 5.1. Cohort of patients with T2DM and distribution of identified sub-types.
5.2 SUPERVISED MACHINE LEARNING
5.2.1 Training datasets
A training dataset of 150,000 patients, containing equal numbers of positive and negative
representatives, was extracted. Positive representatives were randomly chosen from those with
a diagnostic code for DM, and negative representatives from those without a code for DM. All
representatives were chosen from patients with at least 1 year of available follow-up and
non-missing sex and age. To check for bias arising from the random selection of negative
representatives, two additional training datasets were created, each combining a disjoint set of
negative representatives with the same positive representatives as in the main training set.
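The sampling scheme described above can be sketched as follows. This is a minimal illustration, not the extraction code used in the thesis: the function name `build_training_sets`, its parameters, and the use of plain patient identifiers are assumptions made for the example. In the thesis setting, `n_per_class` would be 75,000 and `n_sets` would be 3.

```python
import random


def build_training_sets(positive_ids, negative_ids, n_per_class, n_sets=3, seed=0):
    """Return n_sets training sets that share positives but have disjoint negatives.

    positive_ids / negative_ids are hypothetical collections of patient
    identifiers with and without a DM diagnostic code, respectively.
    """
    positive_ids = sorted(positive_ids)
    negative_ids = sorted(negative_ids)
    if n_per_class > len(positive_ids):
        raise ValueError("not enough positive patients")
    if n_sets * n_per_class > len(negative_ids):
        raise ValueError("not enough negative patients for disjoint samples")

    rng = random.Random(seed)
    positives = rng.sample(positive_ids, n_per_class)
    # One draw without replacement, then split into consecutive slices:
    # the negative groups cannot overlap by construction.
    negatives = rng.sample(negative_ids, n_sets * n_per_class)
    return [
        {
            "positive": positives,
            "negative": negatives[i * n_per_class:(i + 1) * n_per_class],
        }
        for i in range(n_sets)
    ]
```

Training every classifier on each of the returned sets and comparing the results is one way to check whether the random choice of negatives influences performance.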
5.2.2 Feature selection
For patients with DM (identified by diagnostic codes, n=2,479,520), data on medication
prescriptions, diseases, and laboratory measurements were analysed. It was found that 66% of
these patients had a diagnosis of “Essential hypertension”, and 62% had a diagnosis of
“Disorders of lipoid metabolism”. Antidiabetic, antihypertensive, and antilipidemic drugs
were used by 82%, 70%, and 66% of patients, respectively. Non-narcotic and
opioid analgesics were used by 53% and 37% of patients, respectively. Beta blockers, ulcer drugs,
diuretic drugs, and antidepressants were used by 40%, 40%, 38%, and 34% of patients,
respectively. The most frequent observations and laboratory tests for diabetic patients were:
weight, blood pressure, pulse, height, body mass index, creatinine (serum), urea nitrogen
(blood), calcium (serum), alanine aminotransferase (serum), aspartate aminotransferase
(serum), sodium (serum), HbA1c (blood), bilirubin (serum), and cholesterol (serum).
The results obtained were combined with clinical considerations and guidelines [169], yielding 11
potential disease predictors. Scheme-independent (or classifier-independent)
attribute subset selection algorithms were applied to determine the best predictors. Using
10-fold cross-validation, bi-directional, forward, and backward greedy search techniques [170]
agreed on the 4 features shown in Table 5.1.
Table 5.1
Features Selected as Best Diabetes Predictors in CEMR
Feature description                                               Feature type
Two measurements within 1 year of: HbA1c ≥ 6.5%, or fasting       Binary
  blood glucose ≥ 126 mg/dL [7.0 mmol/L], or random blood
  glucose ≥ 200 mg/dL [11.1 mmol/L]
Anti-diabetic drug duration ≥ 6 months                            Binary
Average body mass index                                           Continuous
Ischemic heart disease, heart failure or stroke                   Binary
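As an illustration of how the first binary feature in Table 5.1 might be derived from raw laboratory records, the sketch below flags a patient with two elevated glycaemic measurements within one year. The thresholds come from the table; everything else is an assumption made for the example: the test labels, the `(date, test, value)` record layout, the 365-day window, and the choice to count two elevated results on the same date as one measurement.

```python
from datetime import date

# Thresholds from Table 5.1; the dictionary keys are hypothetical test labels.
THRESHOLDS = {
    "hba1c_pct": 6.5,
    "fasting_glucose_mgdl": 126.0,
    "random_glucose_mgdl": 200.0,
}


def has_two_elevated_within_year(measurements, window_days=365):
    """measurements: iterable of (date, test_name, value) tuples for one patient."""
    # Distinct dates on which at least one measurement met its threshold.
    elevated_dates = sorted({
        day
        for day, test, value in measurements
        if test in THRESHOLDS and value >= THRESHOLDS[test]
    })
    # Neighbouring dates are the closest pairs, so checking them is sufficient.
    return any(
        (later - earlier).days <= window_days
        for earlier, later in zip(elevated_dates, elevated_dates[1:])
    )


example = [
    (date(2015, 1, 10), "hba1c_pct", 7.1),
    (date(2015, 9, 1), "fasting_glucose_mgdl", 130.0),
]
print(has_two_elevated_within_year(example))  # True
```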
5.2.3 Classification algorithm selection
Using only the four selected predictors, the performance of six classification algorithms was
compared on the training sets. Sensitivity (true positive rate), specificity (true negative rate), and
area under the receiver operating characteristic curve (AUC) were calculated as the average of
10-repeat 10-fold cross-validations. Central processing unit (CPU) time of training and percent of
correctly classified instances were also recorded. The classifiers compared were: Naïve Bayes
[171], Logistic regression [172], Support Vector Machine [173], Multilayer Perceptron [174],
Decision Tree with J48 modification [175], and One Rule [176].
While the One Rule algorithm performed significantly worse, performance of the other
algorithms was similar (Table 5.2). Among them, the false positive rate was the same, but
Support Vector Machine and Decision Tree algorithms produced fewer false negatives. Given
the higher AUC and smaller CPU time, the Decision Tree algorithm was chosen as the final
machine learning approach. Absence of bias arising from selecting random negative instances
was confirmed by almost identical performance of all algorithms on three training sets.
Table 5.2
Performance of Machine Learning Algorithms on the Training Dataset
Classifier               Percent correct   True positive rate   True negative rate   AUC    CPU time
Naïve Bayes              90.04             0.96                 0.84                 0.94   0.05
Logistic Regression      90.12             0.96                 0.84                 0.94   2.26
Multilayer Perceptron    90.15             0.96                 0.84                 0.94   49.23
Support Vector Machine   90.17             0.97                 0.84                 0.90   157.71
J48 Decision Tree        90.17             0.97                 0.84                 0.91   0.52
One Rule                 86.6              0.98                 0.76                 0.87   0.09
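For readers less familiar with these measures, the sketch below shows how sensitivity, specificity, and the proportion of correctly classified instances follow from a confusion matrix. The counts are hypothetical: they were chosen only so that, on a balanced set of 75,000 positives and 75,000 negatives, the rounded rates match the Naïve Bayes row of Table 5.2. They are not counts reported in the thesis.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # fraction correctly classified
    return sensitivity, specificity, accuracy


# Hypothetical counts on a balanced 150,000-patient training set.
sens, spec, acc = classification_metrics(tp=72_000, fn=3_000, tn=63_000, fp=12_000)
print(round(sens, 2), round(spec, 2), round(acc, 2))  # 0.96 0.84 0.9
```

On a balanced dataset the accuracy is the mean of sensitivity and specificity, which is consistent with the rows of Table 5.2 (for example, (0.96 + 0.84) / 2 = 0.90).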
5.3 FINAL COHORT
The selected J48 algorithm (Figure 5.2) was applied to all patients in the CEMR with at least
1 year of follow-up and non-missing sex and age, and identified 2,023,956 patients who are highly
likely to have diabetes. Of them, 78% (n=1,580,867) had a diagnostic code for diabetes.
Figure 5.2. Selected Decision Tree algorithm.
As errors during the data entry process in a real-world setting usually result in a smaller
number of false positive patients and a larger number of false negatives identified by
diagnostic codes, cohorts identified by the ML approach and by diagnostic codes were
combined (n=2,922,609). Minimizing false negative instances was ensured by careful design
in further studies (e.g. inclusion of patients who initiated pharmacological therapy with MET,
and added second-line ADD). Patients who were identified by ML and not by diagnostic codes
(n=443,089) were categorised as “unspecific” diabetes type. As a final step, the following rules
were applied to distinguish diabetes types amongst all the “unspecific” cases:
1. if duration of non-insulin ADD ≥ 2 months, then type 2;
2. otherwise, if age at first available diagnosis date ≤ 18 years and insulin initiated
within 1 year, then type 1;
3. otherwise, if age at first available diagnosis date > 18 years and insulin initiated
within 3 months, then type 1;
4. otherwise, consider that patient does not have diabetes.
Steps 1-3 were unable to identify the type of diabetes for 29,288 patients, who were excluded
from the cohort. The final diabetes cohort consisted of 2,893,321 patients, with 178,805 /
2,624,954 / 89,562 patients identified to have type 1 / type 2 / gestational diabetes (Figure 5.1).
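The rule cascade applied to the “unspecific” cases can be sketched as a small function. This is an illustrative reading of the rules, not the code used in the thesis: the function and parameter names are hypothetical, and the sketch assumes that “insulin initiated within” is measured in months from the first available diagnosis date and that a patient who never started insulin is represented by `None`.

```python
from typing import Optional


def assign_diabetes_type(
    non_insulin_add_months: float,
    age_at_first_diagnosis: float,
    months_to_insulin: Optional[float],
) -> Optional[str]:
    """Return 'type 2', 'type 1', or None when no rule identifies a type."""
    # Rule 1: sustained use of a non-insulin anti-diabetic drug.
    if non_insulin_add_months >= 2:
        return "type 2"
    # Rules 2 and 3 both require insulin to have been initiated.
    if months_to_insulin is None:
        return None
    # Rule 2 allows 1 year for patients aged 18 or younger; rule 3 allows 3 months otherwise.
    limit_months = 12 if age_at_first_diagnosis <= 18 else 3
    if months_to_insulin <= limit_months:
        return "type 1"
    # Rule 4: the patient is considered not to have diabetes.
    return None


print(assign_diabetes_type(6, 55, None))  # type 2
print(assign_diabetes_type(0, 12, 8))     # type 1
print(assign_diabetes_type(0, 40, 8))     # None
```

Because the rules are evaluated in order, a patient with at least 2 months of non-insulin ADD use is assigned type 2 regardless of age or insulin timing.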
5.4 REPRESENTATIVENESS OF DIABETES COHORT
Among all patients who were active in the CEMR during 2015 and were older than 18 years,
11.6% were identified to have any type of diabetes. This estimate is very close to that of the US
National Diabetes Statistics (NDS) report, which estimated that 12.2% of the adult population had
diabetes in 2015 [177]. The NDS reported an almost equal gender distribution (Table 5.3), while a
higher proportion of women (55%) was captured in the CEMR. Calculating the age of patients
in 2015, CEMR appeared to have younger patients than the NDS report. Body weight of adults
with diabetes in the CEMR was found to reflect the NDS report well, with the majority of
patients (90%) being overweight or obese (Table 5.3).
Table 5.3
Characteristics of patients with diabetes in the CEMR database and in the National Diabetes Statistics report, 2015
                      CEMR, %   NDS, %
Gender (p=0.4)
  Male                45        51
  Female              55        49
Age, years (p=0.5)
  18-44               13        9
  45-65               39        37
  ≥65                 47        55
BMI, kg/m2 (p=0.9)
  <25                 10        13
  25-30               25        26
  30-40               46        44
  ≥40                 19        18
Other estimates in the NDS report are hard to compare directly with CEMR due to
methodological considerations. Although a large number of patients with diabetes do not have
a record of tobacco use in the CEMR, 17% of those with a recorded tobacco status were current
smokers. This estimate is very close to the 16% of current smokers reported
in the NDS. The NDS estimated that 74% of adults with DM had SBP ≥ 140 mmHg or DBP ≥ 90
mmHg, or were on prescription medication for high blood pressure. This estimate in the
CEMR ranged between 70% and 82%, depending on the definition of medication for high
blood pressure and the timeline chosen. According to the NDS report, 58% of adults aged 21-
75 with no self-reported CVD but who were eligible for statin therapy were on a lipid-lowering
medication (data source: 2011–2014 National Health and Nutrition Examination Survey).
Rough estimates from CEMR for this figure were 60-65% for adults 21-75 in 2015.
5.5 TYPE 2 DIABETES COHORT
Among 2,624,954 patients identified to have T2DM, 2,596,630 were at least 18 years old at
the time of the diabetes diagnosis. Basic characteristics at the time of diagnosis, along with
existing diseases for these adults, are presented in Table 5.4. In this cohort, 53% / 47% are
females / males, with a mean (SD) follow-up of 3.9 (4.8) years. The majority of patients are White
Caucasians, and the mean (SD) age is 59 (13) years. At the time of T2DM diagnosis, 39% / 45%
of females / males had HbA1c ≥ 8%, and 68% / 62% of females / males were obese. Blood
pressure and lipid profiles were very similar across females and males (Table 5.4). At the time
of the T2DM diagnosis, 23% of males and 15% of females were diagnosed with CVD. CCI
was around 1.5 for both genders; however, more females had a record of depression diagnosis:
11% against 6% for males.
Table 5.4
Baseline characteristics among adults with T2DM
                                 Female             Male               All
N                                1,381,075          1,215,555          2,596,630
Follow-up, years α               4.0 (4.8)          3.9 (4.9)          3.9 (4.8)
Follow-up, years β               2.8 (0.9, 5.6)     2.7 (0.9, 5.5)     2.7 (0.9, 5.5)
Follow-up ≥ 6 months γ           1,148,017 (83)     991,743 (82)       2,139,760 (82)
Follow-up ≥ 12 months γ          1,021,250 (74)     882,575 (73)       1,903,825 (73)
Follow-up ≥ 24 months γ          815,233 (59)       703,941 (58)       1,519,174 (59)
Age, years α                     58 (14)            60 (12)            59 (13)
Age ≥ 70 years γ                 319,075 (23)       293,480 (24)       612,555 (24)
Ethnicity γ
  White                          852,115 (62)       799,320 (66)       1,651,435 (64)
  Black                          185,040 (13)       110,491 (9)        295,531 (11)
  Asian                          24,718 (2)         22,753 (2)         47,471 (2)
Tobacco use status γ
  Current                        82,218 (6)         85,101 (7)         167,319 (6)
  Former                         113,714 (8)        151,353 (12)       265,067 (10)
  Never                          306,603 (22)       197,396 (16)       503,999 (19)
  Unknown                        878,540 (64)       781,705 (64)       1,660,245 (64)
HbA1c, % α                       8.2 (1.8)          8.4 (1.9)          8.3 (1.9)
HbA1c ≥ 7.5% γ                   156,850 (51)       173,302 (57)       330,152 (54)
HbA1c ≥ 8% γ                     121,476 (39)       138,013 (45)       259,489 (42)
Weight, kg α                     90 (24)            102 (24)           96 (25)
BMI, kg/m2 α                     34.6 (8.5)         32.8 (7.1)         33.8 (7.9)
Obese γ                          707,335 (68)       557,226 (62)       1,264,561 (65)
SBP, mmHg α                      131 (17)           132 (17)           132 (17)
SBP ≥ 140 mmHg γ                 299,248 (28)       270,470 (29)       569,718 (29)
DBP, mmHg α                      77 (10)            78 (10)            77 (10)
Heart rate, bpm α                79 (12)            77 (12)            78 (12)
LDL, mg/dL α                     109 (38)           101 (37)           105 (38)
HDL, mg/dL α                     49 (14)            41 (12)            45 (14)
Triglycerides, mg/dL β           139 (101, 188)     139 (99, 191)      139 (100, 190)
Present diseases γ
  CVD                            204,114 (15)       279,284 (23)       483,398 (19)
  Heart Failure                  47,427 (3)         56,471 (5)         103,898 (4)
  Myocardial Infarction          18,019 (1)         32,515 (3)         50,534 (2)
  Stroke                         55,988 (4)         55,631 (5)         111,619 (4)
  Chronic Kidney Disease         45,140 (3)         50,929 (4)         96,069 (4)
  Rheumatoid Arthritis           20,742 (2)         8,113 (1)          28,855 (1)
  Cancer                         48,105 (3)         47,882 (4)         95,987 (4)
  Depression                     156,623 (11)       69,138 (6)         225,761 (9)
Charlson Comorbidity Index α     1.48 (0.95)        1.53 (1.03)        1.51 (0.99)
α mean (SD), β median (IQR), γ n (%)
In adult patients with T2DM, exposure to various medications at any time during follow-up is
presented in Table 5.5. The majority of patients (61%) were prescribed metformin, while a
quarter of patients eventually received insulin. Chapter 7 provides a detailed exploration of
longitudinal prescription patterns of ADDs in patients with T2DM. Around 80% of patients
used a cardio-protective medication (CPM), while 64% / 71% of females / males received
lipid-modifying drugs at some time during follow-up.
Table 5.5
Exposure to medications any time during available follow-up among adults with T2DM
N (%)                 Female             Male               All
Metformin             842,806 (61)       736,378 (61)       1,579,184 (61)
Sulfonylurea          405,132 (29)       428,792 (35)       833,924 (32)
Thiazolidinedione     151,198 (11)       164,740 (14)       315,938 (12)
Insulin               351,106 (25)       330,066 (27)       681,172 (26)
GLP-1RA               90,030 (7)         67,750 (6)         157,780 (6)
DPP-4i                193,388 (14)       188,045 (15)       381,433 (15)
SGLT-2i               39,574 (3)         42,320 (3)         81,894 (3)
Lipid modifying       880,364 (64)       860,370 (71)       1,740,734 (67)
Statin                800,753 (58)       792,066 (65)       1,592,819 (61)
CPM*                  1,094,662 (79)     1,024,434 (84)     2,119,096 (82)
Diuretic              666,838 (48)       519,525 (43)       1,186,363 (46)
Antihypertensive      116,478 (8)        119,395 (10)       235,873 (9)
Antidepressant        575,425 (42)       323,365 (27)       898,790 (35)
Anti-obesity          39,942 (3)         12,503 (1)         52,445 (2)
*CPM: beta blocker, statin, angiotensin-converting-enzyme inhibitor, or angiotensin II receptor blocker
Chapter 6: Imputation of Longitudinal
Observation Data
Statement of Contribution of Co-Authors for Thesis by Published
Paper
The authors listed below have certified* that:
1. they meet the criteria for authorship in that they have participated in the conception,
execution, or interpretation, of at least that part of the publication in their field of
expertise;
2. they take public responsibility for their part of the publication, except for the
responsible author who accepts overall responsibility for the publication;
3. there are no other authors of the publication according to these criteria;
4. potential conflicts of interest have been disclosed to (a) granting bodies, (b) the editor
or publisher of journals or other publications, and (c) the head of the responsible
academic unit, and
5. they agree to the use of the publication in the student’s thesis and its publication on
the QUT’s ePrints site consistent with any limitations set by publisher requirements.
In the case of this chapter:
Mayukh Samanta, Olga Montvida, Joanne Tropea, and Sanjoy K. Paul. A comparison of
imputation methods for missing risk factor data from large real-world electronic medical
records for comparative effectiveness studies.
Contributor Statement of Contribution*
Olga Montvida Conducted data extraction and contributed towards
manuscript development.
Mayukh Samanta Conceived the idea and was responsible for the primary
design of the study. Conducted statistical analyses.
Developed first draft and contributed towards development
of the manuscript.
Joanne Tropea Contributed towards development of the manuscript.
Sanjoy K. Paul Conceived the idea and was responsible for the primary
design of the study. Contributed to the statistical analyses.
Developed first draft and contributed towards development
of the manuscript.
Principal Supervisor Confirmation
I have sighted email or other correspondence from all Co-authors confirming their certifying
authorship.
Sanjoy Ketan Paul 29.06.2018
Name Signature Date
Title: A Comparison of Imputation Methods for Missing Risk Factor Data from
Large Real-world Electronic Medical Records for Comparative Effectiveness
Studies
Mayukh Samanta, PhD1, Olga Montvida1,2, Joanne Tropea3 and Sanjoy K. Paul, PhD3
1Statistics Unit, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston,
QLD 4006, Australia
2School of Biomedical Sciences, Faculty of Health, Queensland University of Technology, Brisbane, Australia
3Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia
Correspondence to:
Prof. Sanjoy Ketan Paul
Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, The Royal Melbourne Hospital - City Campus | 7 East, Main Building, Grattan Street, Parkville Victoria 3050, Australia
Email: [email protected]
ORCID Researcher ID: F-8199-2010
Word Count Abstract: 244
Word Count Main Text: 2969
Number of Tables + Figures: 3 + 2
Supplementary Figures: 1
ABSTRACT
Background: Evaluation of appropriate methodologies for imputation of missing risk factor
or outcome data from electronic medical records (EMRs) is crucial but lacking for comparative
effectiveness studies. Robust imputation of missing data relies on the understanding of the
predictors of missingness in the risk factor data, especially in patients with chronic diseases.
These two aspects have not been explored simultaneously to support methodological
developments in clinical epidemiological studies with real-world data.
Methods: Using disease-biomarker data (glycated haemoglobin, HbA1c) from a large EMR
database of patients with diabetes, exploratory analyses were conducted to ascertain the
possible predictors of missingness. Three approaches based on the multiple imputation technique,
namely two-fold multiple imputation, multiple imputation by chained equations, and multiple
imputation with Markov Chain Monte Carlo, were evaluated in terms of their robustness in
imputing missing data. The value of using imputed data for drawing robust inferences on the
comparative effectiveness of two anti-diabetes therapies was compared with that of
complete-case analyses.
Results: Older patients and patients with higher disease-severity were less likely to have
missing HbA1c data longitudinally over 12 months, while gender and pre-existing
comorbidities were not associated with the likelihood of missingness. No significant
differences in the distributions of follow-up imputed data with the three methods were
observed.
Conclusion: While complete case analyses were prone to bias by indication, the use of three
multiple imputation techniques for a large proportion of missing primary outcome data under
unknown patterns of missingness appeared to be valid and able to provide consistent and
reliable clinical inferences.
Key Words: Missing data; Imputation; Multiple Imputation; Electronic Medical Records;
Real-world data
INTRODUCTION
Recent advances in the design and implementation of large electronic medical records (EMRs)
from national primary/ambulatory care databases have created new opportunities in clinical
and epidemiological studies [1, 2]. These databases have been extensively used to evaluate the
risk factor changes in patients with different clinical conditions [3-6]. However, one of the
critical problems with EMR data, as with all longitudinal observational data, is the issue of
missing data [7-10].
The data entry in EMRs depends on the nature and level of engagement between the individual
and the clinical service provider. For example, a patient with diabetes would be advised to get
blood tests done every 6 months for the assessment of various risk factors including glucose
level and lipids. However, given the severity of the disease state and the nature of anti-diabetes
drug (ADD) titration, the patient might need blood tests done more frequently. In primary care
settings, it is hypothesized that younger patients and those with lower risk profiles are less
likely to get blood tests done. The missing data may also arise simply because a patient failed
to attend the scheduled consultations. These aspects complicate the assertion about the nature
of missing data in EMRs, making it difficult to appropriately differentiate between random and
non-random missingness patterns. Although the problem of having significant proportions of
missing data in longitudinal studies can be minimised through careful design, it is almost
unavoidable in most clinical and epidemiological studies [11-13].
The inference drawn from a clinical or epidemiological study may be compromised when
individuals have missing data on health indicators, and inadequate handling of the missing data
can lead to substantial bias in the inferences drawn [12, 13]. Hence, understanding the
mechanisms behind the missing data is crucial before imputing them.
In practice, incomplete data are typically treated as missing at random (MAR) even
if they may not be [12, 14]. In most EMRs, some variables would be expected to partially
explain some of the variation in missingness, which supports imputation under a MAR setting
[12]. A previous study reported that standard imputation of EMR data that are not
missing at random (NMAR), performed without an explicit NMAR model, might produce biased
estimates, although the bias might not be large [15].
Multiple imputation of missing data, compared to single imputation, accounts for the
statistical uncertainty in the missing values. Multiple imputation can lead to consistent,
asymptotically normal and efficient estimates for a dataset with a MAR missingness pattern, which
makes it very attractive [10, 16-18]. Several statistical and machine learning methods, including
multiple imputation techniques, have been used to deal with the complex problem of missing
data [7-9, 19]. There is a strong body of literature on the methodological and application aspects
of multiple imputation of missing values [10, 14, 18-21]. Carpenter and Kenward (2013)
described the theoretical justifications and computational aspects of various approaches within
the multiple imputation platform [22]. However, studies addressing the fundamental
aspects of missingness patterns in risk factor data from EMRs and the practical implications of
such missingness for comparative effectiveness studies are scarce [20, 23-25].
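To make concrete how multiple imputation carries the uncertainty of the missing values into a final estimate, the following minimal sketch pools results from several imputed datasets using Rubin's rules. The function name and the example numbers are illustrative assumptions and are not taken from this study.

```python
import math

def pool_rubin(estimates, variances):
    """Pool point estimates and their variances from m imputed datasets (Rubin's rules)."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = sum(estimates) / m                               # pooled point estimate
    w = sum(variances) / m                                   # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)   # between-imputation variance
    t = w + (1 + 1 / m) * b                                  # total variance
    return q_bar, t, math.sqrt(t)

# Hypothetical change-in-HbA1c estimates from five imputed datasets
est = [-1.04, -1.06, -1.05, -1.03, -1.07]
var = [0.0002] * 5
pooled, total_var, se = pool_rubin(est, var)
```

The between-imputation term is what a single imputation cannot provide: it inflates the total variance whenever the imputed datasets disagree.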
Using nationally representative EMRs from the primary care system, the aims of this study were
to: (1) evaluate the association of different patient-level characteristics with the likelihood of
missingness of the risk factor or outcome data, and (2) investigate the performance of three
multiple imputation techniques for the imputation of missing longitudinal clinical risk factor
data in the context of evaluating comparative therapeutic effectiveness. Three multiple
imputation techniques for missing disease biomarker data (glycated haemoglobin, HbA1c)
were compared in patients with type 2 diabetes (T2DM), treated with two different ADDs under
a new-user design. The robustness of clinical inferences drawn with imputed data and with
complete cases (CC) was explored for comparison of the effectiveness of the therapies at the
population level.
METHODS
Data
The Centricity Electronic Medical Record (CEMR) database of the USA represents a variety of ambulatory and
primary care medical practices. Over 35,000 physicians and other providers from all US states
contribute to the CEMR, with approximately 75% being primary care providers. The database
is generally representative of the USA population; diabetes prevalence (7.1% diabetic patients
identified by diagnostic codes) is similar to National Diabetes Statistics (6.7% diagnosed
diabetes in 2014) [26]. CEMR has been used extensively for academic research worldwide [27-
29].
For more than 34 million individuals, longitudinal EMRs were available from 1995 until April
2016. This database contains comprehensive patient-level information on demographic,
anthropometric, clinical and laboratory variables including age, sex, ethnicity, and longitudinal
measures of HbA1c. Medication data include brand names and doses for individual
medications prescribed, along with start/stop dates and specific fields to track treatment
alterations. This dataset also contains patient reported medications, including prescriptions
received outside the EMR network and over-the-counter medications. A robust methodology
for extraction and assessment of longitudinal patient-level medication data from the CEMR
database has been recently described by the authors [30].
Study population
The T2DM study cohort was selected on the basis of the following conditions: (1) valid
diagnosis of T2DM, (2) 18 - 80 years old at the date of treatment initiation (baseline), (3) no
missing data on age and sex, and (4) valid baseline HbA1c measure. We focused on two ADDs:
dipeptidyl peptidase-4 inhibitor (DPP-4) and glucagon-like peptide-1 receptor agonist (GLP-1RA)
when added to first-line metformin. The numbers of patients with a minimum of 12/24
months of follow-up post initiation of DPP-4 and GLP-1RA were 38,483/23,859 and
8,977/5,312 respectively. These patients were receiving the respective therapies for a minimum
of one year. HbA1c measures at baseline, 6, 12, 18, and 24 months were obtained as the nearest
measure within 3 months either side of the time point.
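The rule for assigning a measurement to a follow-up time point can be sketched as below. This is a minimal illustration and not the extraction code used for the CEMR data: the function name, the inclusive window boundaries and the example records are assumptions.

```python
def nearest_measure(measurements, target_month, window=3.0):
    """Return the value measured closest to target_month within +/- window months, else None.

    measurements: iterable of (months_since_baseline, value) pairs.
    """
    in_window = [(abs(t - target_month), value)
                 for t, value in measurements
                 if abs(t - target_month) <= window]
    if not in_window:
        return None  # treated as missing at this time point
    return min(in_window, key=lambda pair: pair[0])[1]

# Hypothetical patient: HbA1c (%) recorded at irregular times (months since baseline)
records = [(0.0, 8.4), (4.5, 7.6), (7.0, 7.2), (16.0, 7.0)]
grid = {month: nearest_measure(records, month) for month in (0, 6, 12, 18, 24)}
```

In this example the patient has no measurement within 3 months of the 12- and 24-month time points, so those entries are missing and would be candidates for imputation.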
Multivariate Imputation by Chained Equations (MICE)
MICE is an increasingly popular method of dealing with missing data in epidemiological and
clinical research. This iterative imputation approach imputes multiple variables by using
chained equations under the assumption that missing data are MAR [31]. This method creates
multiple imputations for missing multivariate data by Gibbs sampling. An advantage of
MICE is that it can handle arbitrary missing data patterns as well as variables of different types.
For imputation with continuous variables, linear regression models or predictive mean
matching are used while logistic regression and polytomous models are needed for
dichotomous and categorical variables respectively [32, 33]. MICE is also commonly known
as fully conditional specification (FCS) and sequential regression multivariate imputation [34].
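The chained-equations idea can be illustrated with the simplified sketch below, which cycles through the incomplete columns and re-imputes each one from a linear regression on the others. It is an assumption-laden toy, not the software used in this study: it handles continuous variables only, and it adds residual noise to the predictions without drawing the regression parameters from their posterior distribution, as a proper implementation would. All names and the simulated data are hypothetical.

```python
import numpy as np

def chained_equations(data, n_iter=10, seed=0):
    """One imputed copy of `data` (2-D float array, NaN = missing) via chained linear regressions."""
    rng = np.random.default_rng(seed)
    x = data.copy()
    missing = np.isnan(x)
    # Start by filling each column's gaps with its observed mean.
    col_means = np.nanmean(x, axis=0)
    x[missing] = np.take(col_means, np.where(missing)[1])
    for _ in range(n_iter):
        for j in range(x.shape[1]):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            # Regress column j on all other columns, using rows where j was observed.
            others = np.delete(x, j, axis=1)
            design = np.column_stack([np.ones(len(x)), others])
            beta, *_ = np.linalg.lstsq(design[~miss_j], x[~miss_j, j], rcond=None)
            resid = x[~miss_j, j] - design[~miss_j] @ beta
            dof = max((~miss_j).sum() - design.shape[1], 1)
            sigma = np.sqrt(resid @ resid / dof)
            # Replace the missing entries with the prediction plus random noise.
            x[miss_j, j] = design[miss_j] @ beta + rng.normal(0.0, sigma, miss_j.sum())
    return x

# Simulated example: age fully observed, about 30% of follow-up HbA1c values missing
rng = np.random.default_rng(1)
age = rng.normal(58, 12, 200)
hba1c_6m = 7.0 + 0.01 * (age - 58) + rng.normal(0, 1.0, 200)
toy = np.column_stack([age, hba1c_6m])
toy[rng.random(200) < 0.3, 1] = np.nan
imputations = [chained_equations(toy, seed=s) for s in range(5)]
```

Running the function with different seeds yields several completed datasets, each of which would be analysed separately before pooling.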
Two-fold Multiple Imputation
There are two phases in the two-fold method of imputation: the filled-in phase, followed by
the imputation phase. In the filled-in phase, the missing values for all variables are filled in
sequentially over the variables taken one at a time, which specifies separate univariate
imputation models for each variable with missing data conditional on all other variables [35].
At the imputation stage, the missing values are imputed using a specified method and covariates
at each iteration. Two-fold multiple imputation imputes missing values at each time point
conditional on observed measures within a small time window using FCS or Chained Equations
[35, 36]. Usually, this method uses information from that time point where the imputation is
conducted and from immediately adjacent time points. A distinct advantage of this method is
that it can handle both time-dependent and time-independent variables as well as allowing users
to specify the time window. This method also reduces the issues of collinearity and overfitting.
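The role of the time window can be sketched as follows. The snippet shows only which time points would contribute predictors when imputing at a given time point; it does not implement the full two-fold algorithm, and the function name and the definition of window width (number of adjacent time points on each side) are assumptions.

```python
def window_time_points(time_points, target, width=1):
    """Time points whose measurements inform imputation at `target`.

    width counts adjacent time points on each side; width=1 uses only immediate neighbours.
    """
    idx = time_points.index(target)
    lo = max(idx - width, 0)
    hi = min(idx + width, len(time_points) - 1)
    return [t for t in time_points[lo:hi + 1] if t != target]

months = [0, 6, 12, 18, 24]
predictors = {t: window_time_points(months, t) for t in months}
wider = window_time_points(months, 12, width=2)
```

With a width of 1, the 12-month value is imputed from the 6- and 18-month measures only; widening the window to 2 also brings in baseline and 24 months, at the cost of a larger imputation model.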
Multiple imputation with Bayesian Markov Chain Monte Carlo (MCMC)
The multiple imputation platform incorporates both parametric and non-parametric approaches.
Parametric approaches include improper imputation, approximate proper imputation, and Bayesian
(proper) imputation, which usually uses Markov Chain Monte Carlo (MCMC) methods to
obtain the posterior distribution [22]. In the context of arbitrary missing data patterns, the
MCMC method is often used; it creates multiple imputations by using simulations from a
Bayesian predictive distribution for normal data. We used multiple imputation with Bayesian
iterative MCMC procedures, which can be used whether the pattern of missing data is
monotone or non-monotone [22].
Statistical Methods
We imputed the missing HbA1c (measured in %) measurements at 6-, 12-, 18- and 24-month
follow-up using these three methods and compared the results with CC analysis. CC analysis
considered only the non-missing HbA1c at the same time points. In all three imputation
methods, imputed values were adjusted for age, sex, and addition of any third-line ADD within
2 years of follow-up.
Basic statistics were presented by number (percentage), mean (SD), mean (95% CI) or median
(first quartile, third quartile) separately for the two treatment groups, as appropriate. Both the
unadjusted and adjusted change in HbA1c (%) at 6 and 12 months by the two treatment groups
were evaluated, the adjustment factors being age at treatment initiation, sex, diabetes duration
at treatment initiation, and time to second-line ADD. Among patients with baseline HbA1c ≥
7.5%, logistic regression was used to evaluate the odds of reducing HbA1c below 7% (glucose
management target in patients with T2DM) at 6 and 12 months of follow-up in the GLP-1RA
group compared to the DPP-4 group. Treatment status is usually not randomized in
observational data which implies that the outcome and treatment are not necessarily
independent. To address this issue, we applied a treatment effects model adjusting for age, sex,
diabetes duration at treatment initiation, and time to second-line ADD, so that treatment and
outcome are independent conditional on those covariates [37].
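As a simplified illustration of the odds comparison described above, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2x2 table. This is not the adjusted logistic regression or the treatment effects model used in the study; the function name and the counts are hypothetical.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a, b: comparison group reached / did not reach the target;
    c, d: the same counts for the reference group.
    """
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper

# Hypothetical counts, not taken from the study
or_, lo, hi = odds_ratio_wald(a=120, b=80, c=300, d=300)
```

A logistic regression with treatment group as the only covariate reproduces this odds ratio; the adjusted analyses additionally condition on the covariates listed above.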
To evaluate the factors that could be associated with missingness of
HbA1c measures at follow-up, the likelihood of missingness of HbA1c at 6 and 12 months of
follow-up from baseline was estimated for each treatment group using logistic regression,
adjusting for age, sex, pre-existing cardiovascular disease (CVD) or pre-existing chronic
kidney disease (CKD), baseline HbA1c ≥ 7.5% and use of other medications. Instead of
conventional age groups, we used age quartiles, denoted by Q1 (18-50 years), Q2
(50-58 years), Q3 (58-66 years) and Q4 (66-80 years).
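Grouping by age quartile can be sketched as below. The handling of ages that fall exactly on a cut point is an assumption, since the quartile ranges listed above share their boundaries; the function name and the example ages are hypothetical.

```python
import numpy as np

def age_quartile_labels(ages):
    """Label each age Q1..Q4 using the sample quartiles as cut points."""
    ages = np.asarray(ages, dtype=float)
    cuts = np.quantile(ages, [0.25, 0.5, 0.75])
    # side="left": an age equal to a cut point falls in the lower quartile group
    groups = np.searchsorted(cuts, ages, side="left")
    return np.array(["Q1", "Q2", "Q3", "Q4"])[groups], cuts

ages = [34, 45, 50, 52, 55, 58, 60, 63, 66, 70, 75, 80]
labels, cuts = age_quartile_labels(ages)
```

The resulting labels can then enter the missingness model as a categorical covariate with Q1 as the reference group.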
RESULTS
The basic characteristics of the study cohort are presented in Table 1. The mean (SD) age was
58 (12) and 54 (11) years, 49% and 35% were male, and 71% and 78% of the patients were
White Caucasian in the DPP-4 and GLP-1RA groups, respectively. Median (Q1, Q3) HbA1c at
baseline in patients with a minimum of 12 months of DPP-4 and GLP-1RA treatment was
7.5 (6.8, 8.8) and 7.1 (6.5, 8.3), respectively.
There were no missing data on HbA1c at baseline by design. The proportions of missing
HbA1c data for patients with a minimum 12 and 24 months of treatment are presented for every
6 months of follow-up in Table 1. Among patients with a minimum treatment duration of 12
months, proportions of missing HbA1c ranged from 28% to 32%. Similar missing proportions
(28-34%) were observed at 24 months follow-up in patients with a minimum of 24 months of
treatment.
The possible association of various patient characteristics with the likelihood of missing
HbA1c in the study cohort at 6 and 12 months of follow-up is presented in Table 2. Age at
treatment initiation had a significant influence on the likelihood of missing HbA1c. Among
patients treated with DPP-4, compared to younger patients (age quartile Q1), the odds of a
missing HbA1c measure in older patients (Q2 to Q4) were 16% to 30% lower at 6 months and
21% to 34% lower at 12 months of follow-up. Similar results (20% to 38% lower
odds of missingness) were observed in patients treated with GLP-1RA at 6 months.
Sex and pre-existing CVD or CKD did not have any influence on the likelihood of missingness
of HbA1c at 6 or 12 months follow-up. Patients with HbA1c ≥ 7.5% at baseline had 12% (95% CI
of OR: 0.83, 0.93) and 14% (95% CI of OR: 0.77, 0.97) lower odds of missing data at 6 months
follow-up in the DPP-4 and GLP-1RA groups, respectively. However, this association was no
longer apparent at 12 months follow-up.
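The percentages quoted above follow directly from the odds ratios in Table 2, as the short worked example below shows. The helper function is hypothetical; the two odds ratios are those reported in Table 2 for baseline HbA1c ≥ 7.5% at 6 months.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage change in odds implied by an odds ratio (negative = lower odds)."""
    return (odds_ratio - 1.0) * 100.0

# Odds ratios for missing HbA1c at 6 months with baseline HbA1c >= 7.5% (Table 2)
dpp4 = percent_change_in_odds(0.88)
glp1ra = percent_change_in_odds(0.86)
```

These are reductions in the odds of missingness; they correspond to reductions in probability of similar size only when missingness is uncommon.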
There was no difference in the distributions of follow-up imputed HbA1c data between the three
imputation methods and the complete case analyses (Table 3 and Figure 1). The estimates of
unadjusted and adjusted changes in HbA1c during follow-up were also similar with all
imputation approaches, and there was no difference in these estimates with the CC analyses
(Table 3 and Figure 2).
Among patients with HbA1c ≥ 7.5% at baseline, the proportions of patients identified to have
reduced HbA1c to ≤ 7% at 6 and 12 months follow-up were similar using all imputation
approaches (Table 3). While making clinical inference on the likelihood of reducing HbA1c
below 7% in the GLP-1RA group, compared to those treated with DPP-4, there was no
disagreement among the three imputation approaches, and this inference was also in line with
the analysis of complete cases.
Figure 1 shows that there was no difference in distribution of HbA1c at 6 months and 12 months
among the three imputation approaches for patients treated with DPP-4. In patients treated with
GLP-1RA, at 6 months of follow-up, although MICE indicated a slightly leptokurtic
distribution due to its higher variability (SD = 1.4), there was no difference at 12 months. The
density plots (Figure 2) for the change in HbA1c in the DPP-4 group
indicated no obvious difference between the three imputation methods.
However, in the GLP-1RA group, the density plots obtained using MICE were leptokurtic at
both 6 and 12 months of follow-up. Supplementary Figure 1 showed that patients treated with
DPP-4 had a higher mean HbA1c level at baseline and maintained it during follow-up
compared to patients treated with GLP-1RA. No significant difference in the trajectories of
HbA1c over 24 months of follow-up was observed between CC analyses and the three
imputation techniques for both DPP-4 and GLP-1RA.
DISCUSSION
A novel component of this study is the investigation of the association of the likelihood of
missingness of follow-up risk factor measures (HbA1c) with patients’ demographic and clinical
characteristics (age, sex, pre-existing comorbidities and disease severity (baseline HbA1c)).
The results clearly indicated that missingness in the follow-up risk factor data is less likely
in older patients, irrespective of the drug they are taking for glycaemic control. We also
observed that patients with higher disease severity (HbA1c ≥ 7.5% at baseline) are more likely
to visit their GP or primary care provider at 6 months post treatment intensification/therapy
titration, an association that disappears over longer follow-up time, likely as a result of the
effectiveness of the therapies in terms of better risk factor control. This extensive assessment
clearly informs on the random and non-random patterns of missingness. However, it is very
difficult to identify and
distinguish random and non-random missing patterns in EMRs, and model them accordingly
to obtain robust inference(s) through imputations.
Another novel component of our study is the comparative assessment of the usability and
robustness of using imputed data for making clinical inferences in comparative effectiveness
studies at the population level using large real-world EMRs. We observed that the inferences
drawn on the risk factor changes in this pharmaco-epidemiological study are similar between
CC and imputed data based analyses. More importantly, the clinical contexts of evaluating the
effectiveness of the therapies, using continuous measures of risk factors or clinical
categorisation of the therapeutic achievements, were well supported with confidence in making
robust inferences using different methods of imputation. The EMR database presents a
formidable challenge in that the missing data have an intermittent (non-monotone) pattern of
missingness over time and are NMAR, so approaches such as CC analysis produce biased and
statistically inefficient results [38].
As expected in the primary care based EMR, a large number of patients had missing HbA1c in
the 6-monthly follow-up data over 24 months. We observed that, in estimating changes in
HbA1c at 6 and 12 months from baseline, MICE produced marginally lower estimates than
the other two methods and slightly leptokurtic density plots. In almost all instances,
Two-fold and MCMC performed similarly. Its simplified approach of imputing
missing values at a given time by using values at nearby times makes Two-fold imputation a
more attractive technique in the context of EMR databases. This automatically reduces the
complexity of the imputation models, as well as collinearity and overfitting issues [35]. However,
measurements further away in time may sometimes provide information that is independent of
that from the adjacent time points. In these circumstances, Two-fold imputation with small
time window widths should be used with care; a greater time window width
may solve this problem.
The MAR approach does not assume that the missing observations are a random sample from the
same sampling distribution as the observed values; rather, missingness may depend on the
observed data [19]. In our study, the
distributions of the imputed data obtained using the three imputation methodologies were
similar to those of the CC data because all three techniques share the underlying theory
of multiple imputation and are based upon the MAR assumption. The
missing outcome measure data in this context also raise the issue of some kind of indication
bias: patients with better glycaemic control are less well represented in the follow-up
outcome measure data. In this case, any analysis on the CC is highly likely to bias the result
towards those who are doing poorly in terms of glycaemic control, as observed in this study.
Kim (2004) showed that under the regression model, the bias of the multiple imputation
variance estimator decreases with large sample size [39]. We have a large sample size in our
study, hence we had an almost unbiased variance estimator for the imputed data. The use
of robust statistical techniques on imputed data is highly likely
to produce robust and reliable clinical inferences, compared to those based on the CC analyses.
In this context, the use of all three multiple imputation techniques (MICE, Two-fold and
MCMC) to impute for a relatively large proportion of missing primary outcome data, under
unknown patterns of missingness, appears to be valid and able to provide consistent and reliable
clinical inferences.
ACKNOWLEDGEMENTS
University of Melbourne gratefully acknowledges the support from the Australian
Government’s National Collaborative Research Infrastructure Strategy (NCRIS) initiative
through Therapeutic Innovation Australia. No separate funding was obtained for this study.
OM acknowledges the PhD scholarship from Queensland University of Technology,
Australia, and her co-supervisors Prof. Ross Young and Prof. Louise Hafner of the same
University.
CONFLICT OF INTEREST
SKP has acted as a consultant and/or speaker for Novartis, GI Dynamics, Roche, AstraZeneca,
Guangzhou Zhongyi Pharmaceutical and Amylin Pharmaceuticals LLC. He has received grants
in support of investigator and investigator-initiated clinical studies from Merck, Novo Nordisk,
AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis and Pfizer. MS, OM and JT
have no conflict of interest to declare.
REFERENCES
1. Sagreiya, H. and R.B. Altman, The utility of general purpose versus specialty clinical databases for research: warfarin dose estimation from extracted clinical variables. Journal of biomedical informatics, 2010. 43(5): p. 747-751.
2. Shivade, C., et al., A review of approaches to identifying patient phenotype cohorts using electronic health records. Journal of the American Medical Informatics Association, 2014. 21(2): p. 221-230.
3. Montvida, O., et al., Addition of or switch to insulin therapy in people treated with glucagon-like peptide-1 receptor agonists: A real-world study in 66 583 patients. Diabetes, Obesity and Metabolism, 2017. 19(1): p. 108-117.
4. Paul, S.K., et al., Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovascular Diabetology, 2015. 14(1): p. 100.
5. Badve, S.V., et al., The Association between Body Mass Index and Mortality in Incident Dialysis Patients. PLoS One, 2014. 9(12): p. e114897.
6. Thomas, G., et al., Obesity paradox in people newly diagnosed with type 2 diabetes with and without prior cardiovascular disease. Diabetes Obes Metab, 2013. 16(4): p. 317-25.
7. Jerez, J.M., et al., Missing data imputation using statistical and machine learning methods in a real breast cancer problem. Artificial Intelligence in Medicine, 2010. 50(2): p. 105-115.
8. Biering, K., N.H. Hjollund, and M. Frydenberg, Using multiple imputation to deal with missing data and attrition in longitudinal studies with repeated measures of patient-reported outcomes. Clin Epidemiol, 2015. 7: p. 91-106.
9. Thomas, G., K. Klein, and S. Paul, Statistical challenges in analysing large longitudinal patient-level data: The danger of misleading clinical inferences with imputed data. Journal of the Indian Society of Agricultural Statistics, 2014. 68(2): p. 39-54.
10. Sterne, J.A.C., et al., Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ, 2009. 338: p. 157-160.
11. Little, R.J., et al., The Prevention and Treatment of Missing Data in Clinical Trials. New England Journal of Medicine, 2012. 367(14): p. 1355-1360.
12. Wells, B.J., et al., Strategies for handling missing data in electronic health record derived data. EGEMS (Wash DC), 2013. 1(3): p. 1035.
13. Madden, J.M., et al., Missing clinical and behavioral health data in a large electronic health record (EHR) system. J Am Med Inform Assoc, 2016. 23(6): p. 1143-1149.
14. Mackinnon, A., The use and reporting of multiple imputation in medical research - a review. J Intern Med, 2010. 268(6): p. 586-93.
15. Lin, J.H. and P.J. Haug, Exploiting missing clinical data in Bayesian network modeling for predicting medical problems. J Biomed Inform, 2008. 41(1): p. 1-14.
16. Marston, L., et al., Issues in multiple imputation of missing data for large general practice clinical databases. Pharmacoepidemiol Drug Saf, 2010. 19(6): p. 618-26.
17. Rubin, D.B. and N. Schenker, Multiple imputation in health-care databases: an overview and some applications. Stat Med, 1991. 10(4): p. 585-98.
18. Spratt, M., et al., Strategies for multiple imputation in longitudinal studies. Am J Epidemiol, 2010. 172(4): p. 478-87.
19. Rubin, D.B., Inference and missing data. Biometrika, 1976. 63(3): p. 581-592.
20. Bounthavong, M., J.H. Watanabe, and K.M. Sullivan, Approach to addressing missing data for electronic medical records and pharmacy claims data research. Pharmacotherapy, 2015. 35(4): p. 380-7.
21. Yuan, Y.C., Multiple imputation for missing data: concepts and new developments (version 9.0). 2010, SAS Institute Inc.: Rockville.
22. Carpenter, J. and M. Kenward, Multiple Imputation and its Application. 2013, Wiley.
23. Wells, B.J., et al., Strategies for Handling Missing Data in Electronic Health Record Derived Data. eGEMs, 2013. 1(3): p. 1035.
24. Madden, J.M., et al., Missing clinical and behavioral health data in a large electronic health record (EHR) system. J Am Med Inform Assoc, 2016. 23(6): p. 1143-1149.
25. Montvida, O., et al., Addition of or switch to insulin therapy in people treated with glucagon-like peptide-1 receptor agonists: A real-world study in 66 583 patients. Diabetes Obes Metab, 2017. 19(1): p. 108-117.
26. Centers for Disease Control and Prevention, National diabetes statistics report: estimates of diabetes and its burden in the United States, 2014. Atlanta, GA: US Department of Health and Human Services, 2014.
27. Crawford, A.G., et al., Comparison of GE Centricity Electronic Medical Record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Popul Health Manag, 2010. 13(3): p. 139-50.
28. Brixner, D., et al., Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value in health, 2007. 10(s1): p. S29-S36.
29. Paul, S.K., et al., Weight gain in insulin treated patients by BMI categories at treatment initiation: New evidence from real-world data in patients with type 2 diabetes. Diabetes, Obesity and Metabolism, 2016.
30. Montvida, O., et al., Data Mining Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic Medical Records. Open Bioinformatics Journal, 2017. 10: p. 1-15.
31. van Buuren, S. and K. Groothuis-Oudshoorn, mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 2011. 45(3): p. 1-67.
32. White, I.R., R. Daniel, and P. Royston, Avoiding bias due to perfect prediction in multiple imputation of incomplete categorical variables. Comput Stat Data Anal, 2010. 54(10): p. 2267-2275.
33. van Buuren, S., Multiple imputation of discrete and continuous data by fully conditional specification. Stat Methods Med Res, 2007. 16(3): p. 219-42.
34. Raghunathan, T.E., et al., A Multivariate Technique for Multiply Imputing Missing Values Using a Sequence of Regression Models. Survey Methodology, 2001. 27(1): p. 85-95.
35. Welch, C.A., et al., Evaluation of two-fold fully conditional specification multiple imputation for longitudinal electronic health record data. Statistics in Medicine, 2014. 33(21): p. 3725-3737.
36. Welch, C., J. Bartlett, and I. Petersen, Application of multiple imputation using the two-fold fully conditional specification algorithm in longitudinal clinical data. Stata J, 2014. 14(2): p. 418-431.
37. Cattaneo, M.D., Efficient semiparametric estimation of multi-valued treatment effects under ignorability. Journal of Econometrics, 2010. 155(2): p. 138-154.
38. Little, R.J.A. and D.B. Rubin, Statistical Analysis with Missing Data. Second ed. 2002: Wiley-Interscience.
39. Kim, J.K., Finite sample properties of multiple imputation estimators. The Annals of Statistics, 2004. 32(2): p. 766-783.
Table 1: Basic statistics and missingness of HbA1c (%) by treatment group, for patients with a
minimum of 12 and 24 months of treatment duration.
 | Minimum 12-months treatment duration | | Minimum 24-months treatment duration |
 | DPP-4 | GLP1-RA | DPP-4 | GLP1-RA
N | 38483 | 8977 | 23859 | 5312
Age in years𝜙 | 58 (12) | 54 (11) | 58 (12) | 54 (11)
Maleη | 18721 (49) | 3136 (35) | 11672 (49) | 1834 (35)
Ethnicityη
  Asian | 828 (2) | 100 (1) | 541 (2) | 65 (1)
  Black | 4393 (11) | 711 (8) | 2682 (11) | 412 (8)
  Indian (American) | 483 (1) | 47 (1) | 343 (1) | 27 (1)
  Native Hawaiian | 67 (0) | 19 (0) | 49 (0) | 10 (0)
  Unknown | 5507 (14) | 1192 (13) | 3343 (14) | 667 (13)
  White | 27205 (71) | 6908 (77) | 16901 (71) | 4131 (78)
HbA1c(%)γ
  Baseline | 8.1 (6.5, 18.8) | 7.7 (6.5, 16.5) | 8.1 (6.5, 18.8) | 7.7 (6.5, 16.5)
  6 months | 7.1 (5, 17.7) | 6.9 (5, 15.9) | 7.1 (5, 17.7) | 6.9 (5, 15.9)
  12 months | 7.2 (5, 17.9) | 7.0 (5, 15.6) | 7.2 (5, 17.9) | 7.0 (5, 15.6)
  18 months | - | - | 7.2 (5, 17.5) | 7.1 (5, 17.4)
  24 months | - | - | 7.3 (5, 17.9) | 7.1 (5, 16.7)
HbA1c(%)ω
  Baseline | 7.5 (6.8, 8.8) | 7.1 (6.5, 8.3) | 7.5 (6.8, 8.8) | 7.1 (6.5, 8.3)
  6 months | 6.8 (6.3, 7.5) | 6.6 (6, 7.4) | 6.8 (6.3, 7.5) | 6.6 (6, 7.4)
  12 months | 6.9 (6.3, 7.7) | 6.6 (6, 7.5) | 6.9 (6.3, 7.7) | 6.6 (6, 7.5)
  18 months | - | - | 6.9 (6.3, 7.7) | 6.7 (6.1, 7.6)
  24 months | - | - | 6.9 (6.3, 7.8) | 6.8 (6.1, 7.6)
HbA1c(%)η – Missingness
  Baseline | 0 | 0 | 0 | 0
  6 months | 6622 (28) | 1553 (31) | 4110 (28) | 883 (29)
  12 months | 6904 (30) | 1643 (32) | 4093 (28) | 945 (31)
  18 months | - | - | 4518 (31) | 1032 (33)
  24 months | - | - | 4680 (32) | 1050 (34)
Unless stated otherwise; α: Mean (95% CI); 𝜙: Mean (SD); η: N (%); ω: Median (IQR); γ: Mean (Min, Max)
Table 2: Odds ratio (95% CI) and p-values for likelihood of missingness of HbA1c (%) measure at 6
and 12 months of follow-up in patients treated with DPP-4 and GLP1-RA adjusted for age quartiles
(Q1, Q2, Q3, Q4), sex, pre-existing CVD and CKD and patients with baseline HbA1c ≥ 7.5%.
 | 6 months follow-up | | | | 12 months follow-up | | |
 | DPP-4 | p-value | GLP1-RA | p-value | DPP-4 | p-value | GLP1-RA | p-value
Age Quartiles
  Q2 | 0.84 (0.78, 0.90) | <0.001 | 0.80 (0.70, 0.91) | <0.001 | 0.79 (0.74, 0.85) | <0.001 | 0.69 (0.60, 0.78) | <0.001
  Q3 | 0.75 (0.70, 0.81) | <0.001 | 0.68 (0.58, 0.80) | <0.001 | 0.74 (0.69, 0.80) | <0.001 | 0.61 (0.52, 0.71) | <0.001
  Q4 | 0.70 (0.64, 0.76) | <0.001 | 0.69 (0.55, 0.88) | <0.001 | 0.66 (0.60, 0.72) | <0.001 | 0.62 (0.50, 0.78) | <0.001
Male vs Female | 1.02 (0.97, 1.08) | 0.48 | 1.05 (0.94, 1.19) | 0.39 | 1.03 (0.98, 1.09) | 0.28 | 0.87 (0.78, 0.98) | 0.024
Cardiovascular Disease (CVD) | 1.06 (0.98, 1.14) | 0.16 | 1.03 (0.86, 1.23) | 0.76 | 1.08 (1.00, 1.16) | 0.06 | 1.13 (0.95, 1.35) | 0.17
Chronic Kidney Disease (CKD) | 1.01 (0.88, 1.16) | 0.88 | 1.02 (0.70, 1.50) | 0.30 | 0.92 (0.80, 1.06) | 0.26 | 0.99 (0.68, 1.43) | 0.95
HbA1c ≥ 7.5% at Baseline | 0.88 (0.83, 0.93) | <0.001 | 0.86 (0.77, 0.97) | 0.011 | 1.00 (0.94, 1.05) | 0.81 | 0.94 (0.84, 1.05) | 0.24
Table 3: (i) Mean (SD) for HbA1c (%) at 6 and 12 months by treatment group for complete
cases and for data imputed by three imputation methods; (ii) Change in HbA1c (%) at 6 and 12
months from baseline by treatment group for unadjusted analyses and adjusted for age, gender,
baseline HbA1c (%), diabetes duration, and time to second-line ADD, as measured on complete
cases and on imputed data; (iii) Among patients with HbA1c ≥ 7.5% at the index date, the
proportions of patients who reduced HbA1c to ≤ 7% at 6 and 12 months of follow-up, by
treatment group, for complete cases and imputed data; (iv) Odds ratio for HbA1c ≤ 7% in
patients treated with GLP-1RA compared to the DPP-4 group, adjusted for baseline HbA1c (%),
age, gender, diabetes duration, time to second-line ADD, and whether a third-line ADD was
started within 6 (or 12) months.
| | 6 months follow-up: DPP-4 | 6 months follow-up: GLP1-RA | 12 months follow-up: DPP-4 | 12 months follow-up: GLP1-RA |
|---|---|---|---|---|
| HbA1c (%) | | | | |
| CC | 7.1 (1.3) | 6.9 (1.3) | 7.2 (1.4) | 7.0 (1.4) |
| MICE | 7.1 (1.3) | 6.9 (1.4) | 7.2 (1.3) | 7.0 (1.4) |
| Two-fold | 7.1 (1.3) | 6.9 (1.3) | 7.2 (1.3) | 7.0 (1.4) |
| Bayesian MCMC | 7.1 (1.3) | 6.9 (1.3) | 7.2 (1.3) | 7.0 (1.4) |
| Change in HbA1c (%)α – Unadjusted | | | | |
| CC | -1.05 (-1.07, -1.02) | -0.89 (-0.93, -0.84) | -0.91 (-0.94, -0.89) | -0.73 (-0.77, -0.68) |
| MICE | -1.00 (-1.02, -0.98) | -0.84 (-0.87, -0.80) | -0.91 (-0.93, -0.89) | -0.71 (-0.75, -0.69) |
| Two-fold | -1.00 (-1.02, -0.98) | -0.84 (-0.87, -0.80) | -0.92 (-0.94, -0.9) | -0.72 (-0.76, -0.69) |
| Bayesian MCMC | -1.00 (-1.02, -0.98) | -0.85 (-0.88, -0.81) | -0.92 (-0.94, -0.9) | -0.72 (-0.76, -0.69) |
| Change in HbA1c (%)α – Adjusted | | | | |
| CC | -1.14 (-1.16, -1.11) | -0.98 (-1.02, -0.94) | -1.00 (-1.03, -0.98) | -0.81 (-0.85, -0.76) |
| MICE | -1.09 (-1.11, -1.07) | -0.93 (-0.96, -0.89) | -1.00 (-1.02, -0.98) | -0.79 (-0.83, -0.75) |
| Two-fold | -1.09 (-1.11, -1.07) | -0.92 (-0.95, -0.89) | -1.01 (-1.03, -0.99) | -0.80 (-0.84, -0.77) |
| Bayesian MCMC | -1.09 (-1.11, -1.07) | -0.93 (-0.96, -0.89) | -1.00 (-1.02, -0.98) | -0.80 (-0.84, -0.77) |
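| | 6 months follow-up: DPP-4 | 6 months follow-up: GLP1-RA | 12 months follow-up: DPP-4 | 12 months follow-up: GLP1-RA |
|---|---|---|---|---|
| Patients with HbA1c ≥ 7.5% at Baseline and ≤ 7% at follow-up | | | | |
| CC | 5193 (48) | 1043 (49) | 4768 (46) | 956 (48) |
| MICE | 6142 (45) | 1312 (47) | 5673 (43) | 1238 (45) |
| Two-fold | 6221 (45) | 1294 (46) | 5883 (43) | 1226 (45) |
| Bayesian MCMC | 6112 (45) | 1311 (47) | 5774 (43) | 1239 (45) |
| Odds Ratio - Adjusted α | | | | |
| CC | Ref | 1.03 (1, 1.06) | Ref | 1.02 (0.99, 1.04) |
| MICE | Ref | 1.03 (1, 1.06) | Ref | 1.02 (1, 1.04) |
| Two-fold | Ref | 1.03 (1, 1.06) | Ref | 1.01 (1, 1.04) |
| Bayesian MCMC | Ref | 1.03 (1, 1.05) | Ref | 1 (0.99, 1.03) |
Unless stated otherwise; α: Mean (95% CI); 𝜙: Mean (SD); η: N (%); ω: Median (IQR); γ: Mean (Min, Max)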
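The table reports a single estimate per imputation method, which implies that results from several imputed data sets were combined. The excerpt does not state how this pooling was done, so the following is only a minimal sketch of the standard approach (Rubin's rules), with hypothetical per-imputation estimates and variances rather than values from the study.

```python
import math

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates -- point estimate from each imputed data set
    variances -- squared standard error from each imputed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    q_bar = sum(estimates) / m                      # pooled point estimate
    within = sum(variances) / m                     # average within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between          # total variance
    return q_bar, math.sqrt(total)

# Hypothetical HbA1c changes (%) from three imputed data sets.
est, se = pool_rubin([-1.0, -1.1, -0.9], [0.01, 0.01, 0.01])
print(f"pooled change = {est:.2f}, SE = {se:.3f}")
```

The between-imputation term is what widens the interval relative to a complete-case analysis when the imputed values disagree with each other.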
Figure 1: Distribution of HbA1c (%) at 6 months and 12 months for DPP-4 and GLP-1RA
respectively for complete case, MICE, Two-fold and MCMC imputation.
Figure 2: Distribution of change in HbA1c (%) (∆HbA1c) at 6 months and 12 months for
DPP-4 and GLP-1RA respectively for complete case, MICE, Two-fold and MCMC
imputation.
Supplementary Figure 1: Trajectory plot for mean (95% CI) HbA1c (%) at baseline and follow-
up for two treatment groups for complete case and on imputed data by three imputation
methods.
Chapter 7: Trends in Anti-diabetic Drug
Prescribing Patterns
Statement of Contribution of Co-Authors for Thesis by Published
Paper
The authors listed below have certified* that:
1. they meet the criteria for authorship in that they have participated in the conception,
execution, or interpretation, of at least that part of the publication in their field of
expertise;
2. they take public responsibility for their part of the publication, except for the
responsible author who accepts overall responsibility for the publication;
3. there are no other authors of the publication according to these criteria;
4. potential conflicts of interest have been disclosed to (a) granting bodies, (b) the editor
or publisher of journals or other publications, and (c) the head of the responsible
academic unit, and
5. they agree to the use of the publication in the student’s thesis and its publication on
the QUT’s ePrints site consistent with any limitations set by publisher requirements.
In the case of this chapter:
Olga Montvida, Jonathan Shaw, John J Atherton, Francis Stringer, Sanjoy K Paul. Long-term
Trends in Antidiabetes Drug Usage in the US: Real-world Evidence in Patients Newly
Diagnosed With Type 2 Diabetes. Diabetes Care. 2017 Nov 6:dc171414.
Contributor Statement of Contribution*
Olga Montvida Conceived the idea and was responsible for the primary
design of the study. Conducted the data extraction and
statistical analyses. Developed first draft and contributed
towards development of the manuscript.
Jonathan Shaw Contributed significantly in the study design and manuscript
development.
John J Atherton Contributed significantly in the study design and manuscript
development.
Francis Stringer Contributed in the interpretation of the results and
manuscript development.
QUT Verified Signature 29.06.2018
Sanjoy K. Paul Conceived the idea and was responsible for the primary
design of the study. Contributed to the statistical analyses.
Developed first draft and contributed towards development
of the manuscript.
Principal Supervisor Confirmation
I have sighted email or other correspondence from all Co-authors confirming their certifying
authorship.
Sanjoy Ketan Paul QUT Verified Signature 29.06.2018
Name Signature Date
Long-term Trends in Antidiabetes Drug Usage in the U.S.: Real-world Evidence in Patients Newly Diagnosed With Type 2 Diabetes
Diabetes Care 2018;41:69–78 | https://doi.org/10.2337/dc17-1414
OBJECTIVE
To explore temporal trends in antidiabetes drug (ADD) prescribing and intensification patterns, along with glycemic levels and comorbidities, and possible benefits of novel ADDs in delaying the need for insulin initiation in patients diagnosed with type 2 diabetes.
RESEARCH DESIGN AND METHODS
Patients with type 2 diabetes aged 18–80 years, who initiated any ADD, were selected (n = 1,023,340) from the U.S. Centricity Electronic Medical Records. Those who initiated second-line ADD after first-line metformin were identified (subcohort 1, n = 357,482); the third-line therapy choices were further explored.
RESULTS
From 2005 to 2016, first-line use increased for metformin (60–77%) and decreased for sulfonylureas (20–8%). During a mean follow-up of 3.4 years postmetformin, 48% initiated a second ADD at a mean HbA1c of 8.4%. In subcohort 1, although sulfonylurea usage as second-line treatment decreased (60–46%), it remained the most popular second ADD choice. Use increased for insulin (7–17%) and dipeptidyl peptidase-4 inhibitors (DPP-4i) (0.4–21%). The rates of intensification with insulin and sulfonylureas did not decline over the last 10 years. The restricted mean time to insulin initiation was marginally longer in second-line DPP-4i (7.1 years) and in the glucagon-like peptide 1 receptor agonist group (6.6 years) compared with sulfonylurea (6.3 years, P < 0.05).
CONCLUSIONS
Most patients initiate second-line therapy at elevated HbA1c levels, with highly heterogeneous clinical characteristics across ADD classes. Despite the introduction of newer therapies, sulfonylureas remained the most popular second-line agent, and the rates of intensification with sulfonylureas and insulin remained consistent over time. The incretin-based therapies were associated with a small delay in the need for therapy intensification compared with sulfonylureas.
A broad choice of “old” and “new” antidiabetes drugs (ADDs) is available, which differ not only in their mechanisms of action but also in their glycemic and extraglycemic effects (1). While treatment guidelines for type 2 diabetes are regularly updated based on new evidence, real-world prescription trends may also be driven by other factors, such as medication costs, side effect profile, and provider and patient preferences.
1Statistics Unit, QIMR Berghofer Medical Research Institute, Brisbane, Australia
2Faculty of Health, School of Biomedical Sciences, Queensland University of Technology, Brisbane, Australia
3Baker Heart and Diabetes Institute, Melbourne, Australia
4Cardiology Department, Royal Brisbane and Women’s Hospital, and University of Queensland School of Medicine, Brisbane, Australia
5Model Answers Pty Ltd, Brisbane, Australia
6Melbourne EpiCentre, University of Melbourne, Melbourne, Australia
Corresponding author: Sanjoy K. Paul, [email protected].
Received 14 July 2017 and accepted 25 September 2017.
This article contains Supplementary Data online at http://care.diabetesjournals.org/lookup/suppl/doi:10.2337/dc17-1414/-/DC1.
This article is featured in a podcast available at http://www.diabetesjournals.org/content/diabetes-core-update-podcasts.
© 2017 by the American Diabetes Association. Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at http://www.diabetesjournals.org/content/license.
Olga Montvida,1,2 Jonathan Shaw,3
John J. Atherton,4 Frances Stringer,5 and
Sanjoy K. Paul1,6
Diabetes Care Volume 41, January 2018 69
EPIDEMIOLOGY/HEALTH SERVICES RESEARCH
With the development of new classes of antidiabetes therapies since 2005, including incretin-based drugs and sodium–glucose cotransporter 2 inhibitors (SGLT2i), the paradigm of therapy options for patients with highly heterogeneous glycemic and cardiovascular risk factors has changed significantly. However, the way in which this has occurred in real-world practice, especially in the trade-off between older and new classes of ADDs as initial and intensification therapy options, has not been studied thoroughly.
The newer ADDs have been shown to be associated with significantly lower risk of hypoglycemia compared with the sulfonylureas (SU) and insulin (INS) (2). The weight neutrality or benefits of weight reductions have also been well established for new therapies including the incretins (3,4). Given the glycemic and extraglycemic benefits of these agents, one would expect a fall in the use of SU or INS as intensification therapies. However, studies evaluating the possible benefits of using newer ADDs in terms of delaying the need for INS are scarce (5,6). In this context, understanding the changing patterns of therapy initiation and intensifications with second- and third-line therapies, in conjunction with the heterogeneous patients’ characteristics, is a fundamental background requirement.
Cohen et al. (7) explored ADD-prescribing patterns in the U.S. from 1997 to 2000 and reported decreasing use of SU and increasing trends in metformin (MET) and thiazolidinedione (TZD) prescription over time. Utilizing a claims database, Desai et al. (8) reported an increasing proportional share of MET and decreasing prescriptions for TZD between 2006 and 2008. One of the reasons for the reduction in the use of TZD was the safety concerns (9–12). While Berkowitz et al. (13) evaluated treatment initiations with MET, SU, and dipeptidyl peptidase-4 inhibitors (DPP-4i) between 2009 to 2013 in the U.S., utilization patterns of other ADDs and the changing utilization trend over time were not explored (14). Lipska et al. (15) have evaluated the temporal trend in the use of ADDs from 2006 to 2013 using claims data from the U.S. A small number of studies have explored the clinical characteristics of patients according to the type of ADD prescribed, but only over relatively short time periods, and did not evaluate treatment intensification with second- or third-line ADDs over a long period of time (16,17).
To the best of our knowledge, the progressive changes in the proportional distributions across all new and old ADDs, and the patterns and determinants of therapy intensification with second- and third-line ADDs, have not been explored comprehensively in any study. With recognition of the growing disease burden and increasing volumes of dispensed medications (18–21), the primary aim of this study was to provide a comprehensive up-to-date exploration of the treatment pattern changes for type 2 diabetes in the U.S. using the nationally representative Centricity Electronic Medical Records (CEMR) from primary and secondary ambulatory care systems. Specifically, the aims were to 1) explore temporal changes in prescribing patterns from 2005 to 2016 with respect to the drug initiation order, 2) explore therapy intensification with second and third ADDs, 3) explore patient characteristics including risk factors and comorbidities according to ADD therapy prescribed, 4) explore the temporal patterns in the rates of intensification with SU and INS, and 5) evaluate whether use of incretin-based therapies as second-line therapy delays the need for intensification with third-line ADDs and with INS any time during follow-up.
RESEARCH DESIGN AND METHODS
Data Source
CEMR represents a variety of ambulatory and primary care medical practices, including solo practitioners, community clinics, academic medical centers, and large integrated delivery networks in the U.S. More than 34 million individuals’ longitudinal electronic medical records (EMRs) were available from 1995 to April 2016. More than 35,000 physicians and other providers from all U.S. states contribute to the CEMR, of whom ~75% are primary care providers. The database is generally representative of the U.S. population, with a diabetes prevalence of 7.1% (identified by diagnostic codes) that is similar to the national diabetes prevalence of 6.7% (diagnosed diabetes in 2014) (14). The CEMR has been used extensively for academic research worldwide (3,22,23).
This database contains comprehensive patient-level information on demographic, anthropometric, clinical, and laboratory variables including age, sex, ethnicity, and longitudinal measures of BMI, blood pressure, glycated hemoglobin (HbA1c), and lipids. All disease events along with dates are coded with ICD-9, ICD-10, or SNOMED CT codes. Medication data include brand names and doses for individual medications prescribed, along with start/stop dates and specific fields to track treatment alterations. This data set also contains patient-reported medications, including prescriptions received outside the EMR network and over-the-counter medications.
Methods
Eleven antidiabetes therapeutic classes were considered in this study: MET, SU, TZD, α-glucosidase inhibitors (AGI), amylin, dopamine receptor agonists (DOPRA), meglitinides (MEG), DPP-4i, glucagon-like peptide 1 receptor agonists (GLP-1RA), SGLT2i, and INS. For each patient, these ADDs were arranged chronologically according to the initiation dates. Same-day initiations (including combination therapies) were prioritized in the order as listed above, with highest order priority assigned to MET and lowest to INS. Additions or switches were defined by comparing stop dates and start dates of corresponding therapies. Details on the medication data structure, associated data-mining challenges, and description of an algorithm applied to extract and aggregate patient-level medication data from CEMR have recently been published (24).
For convenience, AGI, amylin, DOPRA, and MEG were combined into the “other” category. Saxenda (a version of liraglutide) was excluded from the GLP-1RA list, as it was approved in 2014 for weight lowering and not as an ADD (25). Although Welchol (colesevelam) was approved for the treatment of type 2 diabetes, it was mainly prescribed to reduce cholesterol levels; therefore, we did not include colesevelam in our analyses (18).
Patients with diabetes were identified on the basis of diagnostic codes; those with a diagnosis of type 1 diabetes or only gestational diabetes mellitus were identified and excluded. For identified patients with type 2 diabetes, the following inclusion criteria were applied: 1) age at diagnosis ≥18 and <80 years, 2) diagnosis date strictly after first registered activity in the database, 3) diagnosis date on or after 1 January 2005, and 4) initiation of any ADD.
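The chronological ordering with same-day prioritization described in the Methods can be sketched as follows. This is an illustration only, not the published extraction algorithm (24): the record layout (`(drug_class, start_date)` pairs) and the function name are hypothetical, and the priority list simply follows the order in which the eleven classes are named above.

```python
from datetime import date

# Same-day priority in the order the classes are listed: MET highest, INS lowest.
CLASS_PRIORITY = ["MET", "SU", "TZD", "AGI", "amylin", "DOPRA", "MEG",
                  "DPP-4i", "GLP-1RA", "SGLT2i", "INS"]
RANK = {name: i for i, name in enumerate(CLASS_PRIORITY)}

def order_initiations(records):
    """Return drug classes in line-of-therapy order for one patient.

    records -- hypothetical list of (drug_class, start_date) pairs;
               only the earliest start of each class is kept.
    """
    first_start = {}
    for drug_class, start in records:
        if drug_class not in first_start or start < first_start[drug_class]:
            first_start[drug_class] = start
    # Sort by initiation date, breaking same-day ties by class priority.
    return sorted(first_start, key=lambda c: (first_start[c], RANK[c]))

records = [
    ("INS", date(2010, 3, 1)),
    ("MET", date(2010, 3, 1)),    # same day as INS, so MET is ranked first
    ("DPP-4i", date(2012, 6, 15)),
    ("MET", date(2011, 1, 10)),   # later MET record, not a new initiation
]
print(order_initiations(records))
```

Deciding whether a later drug is an addition or a switch would additionally need the stop dates, which this sketch leaves out.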
Demographic variables included sex, age, and ethnicity. HbA1c values on the date of diagnosis and first-, second-, and third-line ADD therapy initiation were obtained as the closest observations within a 3-month window. Body weight, BMI, systolic/diastolic blood pressure (SBP/DBP), lipids (LDL, HDL, and triglycerides), and heart rate were calculated as the average of available measurements within a 3-month window of the diagnosis or ADD initiation date. Obesity was defined as BMI ≥30 kg/m2.
The presence of comorbidities prior to the first and second drug initiation was explored. Cardiovascular disease (CVD) was defined as ischemic heart disease (including myocardial infarction), peripheral vascular disease, heart failure, or stroke. Cancer was defined as any malignancy except malignant neoplasm of skin. Charlson Comorbidity Index was defined and calculated following the algorithm described by Quan et al. (26).
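The two windowing rules above (closest HbA1c observation, and the average of other measurements) can be sketched as below. The text does not say whether the 3-month window is one-sided or symmetric, so this sketch assumes a symmetric window of ±91 days; the function names and the sample values are hypothetical.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=91)  # assumption: "3-month window" read as +/- 91 days

def closest_value(measurements, index_date, window=WINDOW):
    """Value of the measurement closest in time to index_date, or None."""
    in_window = [(abs(d - index_date), v) for d, v in measurements
                 if abs(d - index_date) <= window]
    if not in_window:
        return None
    return min(in_window, key=lambda pair: pair[0])[1]

def window_mean(measurements, index_date, window=WINDOW):
    """Mean of all measurements inside the window, or None."""
    values = [v for d, v in measurements if abs(d - index_date) <= window]
    return sum(values) / len(values) if values else None

index_date = date(2010, 3, 1)  # e.g. date of first ADD initiation
hba1c = [(date(2010, 1, 5), 8.6), (date(2010, 3, 20), 8.1), (date(2010, 9, 1), 7.2)]
weights = [(date(2010, 2, 1), 100.0), (date(2010, 4, 1), 98.0), (date(2011, 1, 1), 90.0)]

print(closest_value(hba1c, index_date))   # HbA1c taken as the nearest observation
print(window_mean(weights, index_date))   # weight taken as the in-window average
```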
Statistical Methods
The characteristics of patients were summarized by ADD classes, at first prescription and at second ADD initiation when added to MET. Separate analyses were conducted to explore the pattern of addition or switch to third ADD by major classes of second-line ADDs. Study variables were summarized as number (%), mean (SD), or median (first quartile [Q1], third quartile [Q3]) as appropriate. In patients who had a second-line ADD added after MET, and had at least 1 year of follow-up post–second-line initiation, the “restricted mean survival time” estimation approach was used to compare the mean time to the third ADD/INS initiation among major second-line ADD groups. This method computes survival time as time to third ADD/INS if initiated, and otherwise as time to the end of follow-up (date of patient’s last available record within the database). Standard life table methods were used to estimate rates per 100 person-years (95% CI) of second-line ADD, INS, and SU initiations in patients with a minimum of 1 year follow-up post–MET initiation.
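For readers unfamiliar with the restricted mean survival time, a common way to estimate it is as the area under the Kaplan-Meier curve up to a truncation time. The sketch below follows that standard definition; it is not taken from the authors' analysis code, and the truncation time `tau` and the sample data are hypothetical.

```python
def rmst(times, events, tau):
    """Restricted mean survival time: area under the Kaplan-Meier curve on [0, tau].

    times  -- follow-up time per patient (e.g. years since second-line initiation)
    events -- 1 if the third ADD/INS was initiated at that time, 0 if censored
              at the end of follow-up
    tau    -- truncation time (hypothetical choice, same unit as times)
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, last_t, area = 1.0, 0.0, 0.0
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        d = c = 0                      # events and total patients leaving at time t
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            c += 1
            i += 1
        area += surv * (t - last_t)    # rectangle under the current step
        if d:
            surv *= 1 - d / n_at_risk  # Kaplan-Meier step down at an event time
        n_at_risk -= c
        last_t = t
    area += surv * (tau - last_t)      # final rectangle up to tau
    return area

# Hypothetical follow-up (years) for five patients; 0 marks censoring.
times = [0.5, 1.2, 2.0, 3.1, 4.0]
events = [1, 0, 1, 0, 0]
print(rmst(times, events, tau=3.0))
```

Comparing this quantity between second-line groups gives a difference in mean event-free time, which is how figures such as "7.1 years versus 6.3 years" can be read.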
RESULTS
From the 2,624,954 patients identified with type 2 diabetes, 2,590,853 were aged between ≥18 and <80 years at the date of diagnosis with mean/median 3.9/2.7 years of follow-up. Of these patients, 1,305,686 (50%) were newly diagnosed after 1 January 2005, while 1,023,340 patients initiated any ADD (study cohort) (Supplementary Fig. 1) during the available mean/median 3.4/2.8 years of follow-up time. In the study cohort, 46% were male, mean (SD) age was 58 years (13), and 68% were white Caucasians, 12% blacks, and 2% Asians (Table 1).
First ADD
Prescription Pattern Changes Over Time
Figure 1A presents the proportional distribution of the first-line ADD by year of initiation. The proportional share of MET as the first choice increased consistently from 60% in 2005 to 77% in 2016 (first quarter of the year). SU’s share declined from 20 to 8%, while INS’s share ranged from 8 to 10%. Starting at 11% in 2005, TZD’s proportional share dropped progressively to 0.7% in 2016. Other drugs were chosen as first-line in 3% of cases or less.
Patients’ Characteristics
In the study cohort of 1,023,340 patients, the distribution of prescription patterns for individual ADDs at any time from January 2005 to April 2016 and as the first ADD are presented in Table 1. The demographic and clinical characteristics of the patients, along with the prevalence of comorbidities at the time of first ADD initiation, are also presented in Table 1.
In the study cohort, 79% received MET any time during the recorded follow-up, and 72% received MET as the first ADD. The mean time to initiation of MET as the first ADD and the available follow-up time since initiation were 3.7 months and 3.3 years, respectively. The proportions of patients with HbA1c level ≥7.5% (58 mmol/mol) and 8% (64 mmol/mol) at MET initiation were 48% and 37%, respectively. Those who initiated with GLP-1RA, DPP-4i, TZD, and SU had a similar mean HbA1c of 8.0, 7.7, 7.8, and 8.0%, respectively (64, 61, 62, and 64 mmol/mol). INS was initiated at an average HbA1c of 8.9% (74 mmol/mol), with 71% and 59% having HbA1c ≥7.5% (58 mmol/mol) and 8% (64 mmol/mol), respectively.
Patients who initiated treatment with MET were younger (mean age 57 years, with 19% ≥70 years) than those who initiated with SU (mean age 64 years, 43% ≥70 years), with INS (mean 60 years, 29% ≥70 years), with TZD (mean 62 years, 32% ≥70 years), or with DPP-4i (mean 64 years, 39% ≥70 years). Those who had GLP-1RA and SGLT2i as the first ADD were younger, more likely to be white Caucasian, female, and obese, as compared with those who initiated with MET, DPP-4i, INS, TZD, or SU.
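The paired HbA1c values in percent and mmol/mol quoted throughout the Results are consistent with the standard NGSP-to-IFCC master equation. The article itself does not state the conversion, so the short sketch below is offered only as a convenience for checking the pairs; the function name is hypothetical.

```python
def hba1c_percent_to_mmol_mol(percent):
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol)."""
    return 10.93 * percent - 23.50

# Thresholds and means quoted in the text.
for pct in (7.5, 8.0, 8.9):
    print(pct, round(hba1c_percent_to_mmol_mol(pct)))
```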
Comorbidities
The proportions of patients with CVD, chronic kidney disease (CKD), cancer, or depression at first ADD initiation were 19%, 4%, 4%, and 11%, respectively. Those patients initiating therapy with INS had a significantly higher prevalence of CVD (27%, P < 0.01), CKD (11%, P < 0.01), and higher Charlson Comorbidity Index with mean (SD) of 1.84 (1.31), compared with those initiating with MET, DPP-4i, GLP-1RA, or TZD (Table 1 for comparative estimates).
Discontinuation of First ADD
Among patients with at least 1 year of follow-up (n = 813,826), the proportions of patients discontinuing the first-line ADD within 1 year by individual ADDs are presented in Table 1. While only 8% of patients discontinued MET within a year, 20%, 17%, and 25% of patients discontinued GLP-1RA, DPP-4i, and SGLT2i within a year, respectively.
Second ADD
Among 740,478 patients who initiated therapy with MET, 357,482 (48%, subcohort 1) (Supplementary Fig. 1) initiated a second ADD, with an annual mean rate of 10.7 initiations per 100 person-years (minimum 10.2, maximum 14.0) during a mean 3.3 years of available follow-up, at an average HbA1c level of 8.4% (68 mmol/mol), with 60% and 48% having HbA1c ≥7.5% (58 mmol/mol) and 8.0% (64 mmol/mol), respectively. The proportional share of second-line ADD (post-MET) over time is presented in Fig. 1B. The demographic and clinical characteristics of the patients along with the time to second ADD, and the prevalence of comorbidities at the time of second drug initiation, are presented in Table 2.
Although the proportional share of SU as a second-line therapy gradually decreased from 60 to 46% over time (Fig. 1B), it remained the most popular choice (53%) of therapy intensification post–MET initiation across the whole time period. SU was initiated as second-line ADD at an average HbA1c level of 8.4% (68 mmol/mol), with 62% and 49% having HbA1c ≥7.5% (58 mmol/mol) and 8.0% (64 mmol/mol), respectively (Table 2). Among patients with a second ADD and a minimum 1 year of follow-up, only
care.diabetesjournals.org Montvida and Associates 71
Table 1. Patient characteristics at the time of first ADD initiation, by drug class in the study cohort (N = 1,023,340)
| | GLP-1RA | SGLT2i | MET | INS | TZD | DPP-4i | SU | Other§ | All |
|---|---|---|---|---|---|---|---|---|---|
| Any time during follow-up | | | | | | | | | |
| n (% of N) | 68,522 (7) | 39,549 (4) | 808,518 (79) | 270,432 (26) | 109,754 (11) | 182,457 (18) | 354,367 (35) | 25,358 (2) | 1,023,340 (100) |
| Time to first prescription, months* | 22.32 (27.23) | 39.64 (33.48) | 5.3 (14.16) | 13.03 (22.54) | 8.93 (17.72) | 19.57 (25.56) | 11.09 (20.25) | 14.97 (23.56) | - |
| At first ADD | | | | | | | | | |
| n2 (% of N) | 9,494 (1) | 1,935 (0.2) | 740,478 (72) | 93,078 (9) | 28,004 (3) | 25,005 (2) | 116,435 (11) | 8,911 (1) | - |
| Time to first prescription, months* | 3.82 (11.92) | 7.94 (18.68) | 3.73 (11.78) | 3.74 (11.06) | 2.33 (8.2) | 4.98 (13.65) | 3.84 (11.51) | 2.45 (9.66) | 3.73 (11.66) |
| Follow-up from ADD initiation, years* | 3.22 (2.5) | 0.98 (0.62) | 3.31 (2.54) | 2.94 (2.45) | 5.06 (3.02) | 2.79 (2.1) | 3.72 (2.7) | 3.86 (2.71) | 3.36 (2.58) |
| Follow-up ≥1 year from first ADD initiation, n3 (% of n2) | 7,400 (78) | 889 (46) | 589,246 (80) | 68,385 (73) | 25,105 (90) | 19,278 (77) | 95,854 (82) | 7,669 (86) | 813,826 (80) |
| Discontinuation within 1 year, n (% of n3) | 1,516 (20) | 225 (25) | 44,485 (8) | 3,359 (5) | 4,795 (19) | 3,243 (17) | 9,765 (10) | 1,367 (18) | 68,755 (8) |
| Age (years)* | 55 (12) | 56 (11) | 57 (13) | 60 (13) | 62 (11) | 64 (11) | 64 (11) | 58 (16) | 58 (13) |
| Age ≥70 years, n (% of n2) | 1,217 (13) | 223 (12) | 144,210 (19) | 26,790 (29) | 8,838 (32) | 9,741 (39) | 50,280 (43) | 2,925 (33) | 244,224 (24) |
| Male, n (% of n2) | 3,205 (34) | 849 (44) | 332,206 (45) | 44,016 (47) | 14,075 (50) | 10,968 (44) | 59,208 (51) | 3,339 (37) | 467,866 (46) |
| White Caucasian, n (% of n2) | 7,005 (74) | 1,430 (74) | 512,521 (69) | 58,396 (63) | 18,342 (65) | 17,082 (68) | 77,533 (67) | 5,812 (65) | 698,121 (68) |
| Black, n (% of n2) | 899 (9) | 218 (11) | 83,767 (11) | 14,089 (15) | 2,795 (10) | 2,957 (12) | 13,988 (12) | 944 (11) | 119,657 (12) |
| HbA1c, %* | 8 (1.6) | 8.1 (1.7) | 8.1 (1.8) | 8.9 (2.1) | 7.8 (1.6) | 7.7 (1.5) | 8 (1.6) | 7.8 (1.5) | 8.2 (1.8) |
| HbA1c, mmol/mol‡ | 64 | 65 | 65 | 74 | 62 | 61 | 64 | 62 | 66 |
| HbA1c ≥7.5% (58 mmol/mol), n (% of n2) | 831 (47) | 295 (51) | 108,114 (48) | 19,756 (71) | 2,249 (42) | 2,410 (40) | 15,109 (49) | 509 (45) | 149,273 (50) |
| HbA1c ≥8% (64 mmol/mol), n (% of n2) | 609 (34) | 209 (36) | 82,914 (37) | 16,284 (59) | 1,565 (29) | 1,656 (28) | 10,900 (36) | 348 (31) | 114,485 (39) |
| Weight, kg* | 107.6 (26.6) | 103.5 (24.8) | 98.2 (24.9) | 95.3 (25.3) | 96.7 (25.2) | 92.8 (24) | 93.6 (23.7) | 87.6 (24.6) | 97.3 (24.9) |
| BMI, kg/m2* | 38.1 (8.5) | 36.2 (8) | 34.6 (7.9) | 33.6 (8.3) | 34 (7.9) | 33 (7.6) | 33.1 (7.6) | 31.5 (8) | 34.3 (7.9) |
| Obese, n (% of n2) | 6,460 (85) | 1,281 (78) | 427,651 (70) | 43,030 (64) | 12,954 (66) | 12,106 (62) | 53,029 (62) | 3,476 (51) | 559,987 (68) |
| SBP, mmHg* | 129 (15) | 129 (14) | 131 (15) | 131 (18) | 131 (16) | 130 (16) | 132 (17) | 126 (17) | 131 (16) |
| SBP ≥140 mmHg, n (% of n2) | 1,538 (21) | 345 (22) | 153,062 (25) | 20,320 (29) | 5,400 (27) | 4,960 (25) | 26,435 (30) | 1,313 (19) | 213,373 (26) |
| DBP, mmHg* | 78 (10) | 78 (9) | 78 (10) | 75 (11) | 75 (10) | 75 (10) | 75 (10) | 74 (10) | 77 (10) |
| Heart rate, bpm* | 79 (11) | 79 (12) | 78 (12) | 78 (12) | 75 (12) | 76 (12) | 76 (12) | 76 (11) | 78 (12) |
| LDL, mg/dL* | 103 (37) | 106 (39) | 106 (37) | 98 (40) | 100 (38) | 99 (37) | 98 (37) | 98 (37) | 104 (37) |
| HDL, mg/dL* | 45 (13) | 45 (14) | 44 (13) | 44 (15) | 46 (14) | 45 (14) | 44 (14) | 48 (15) | 44 (13) |
| Triglycerides, mg/dL† | 140 (102, 187) | 144 (104, 195) | 143 (104, 193) | 133 (93, 184) | 131 (94, 182) | 141 (102, 188) | 142 (103, 192) | 123 (88, 174) | 142 (103, 192) |
| CVD, n (% of n2) | 1,422 (15) | 331 (17) | 118,342 (16) | 25,051 (27) | 5,446 (19) | 6,182 (25) | 31,293 (27) | 1,815 (20) | 189,882 (19) |
| CKD, n (% of n2) | 377 (4) | 47 (2) | 12,590 (2) | 10,329 (11) | 1,820 (6) | 2,371 (9) | 10,988 (9) | 643 (7) | 39,165 (4) |
| Cancer, n (% of n2) | 289 (3) | 64 (3) | 30,195 (4) | 3,636 (4) | 1,374 (5) | 1,340 (5) | 6,551 (6) | 451 (5) | 43,900 (4) |
| Depression, n (% of n2) | 1,266 (13) | 213 (11) | 88,673 (12) | 7,834 (8) | 2,210 (8) | 2,327 (9) | 8,931 (8) | 845 (9) | 112,299 (11) |
| Charlson Comorbidity Index* | 1.47 (0.9) | 1.45 (0.91) | 1.44 (0.89) | 1.84 (1.31) | 1.57 (1.06) | 1.76 (1.22) | 1.77 (1.23) | 1.66 (1.16) | 1.53 (1.0) |
bpm, beats per minute; n2, number of study cohort patients prescribed each drug class as a first ADD; n3, number of n2 patients with ≥1 year follow-up after first ADD initiation. *Mean (SD); †median (interquartile range); ‡mean; §other: amylin, DOPRA, AGI, or MEG.
72 Antidiabetes Treatment Patterns in the U.S. Diabetes Care Volume 41, January 2018
Figure 1. A: Proportional share of the first ADD by year of initiation in the study cohort. B: Proportional share of the second ADD by year of initiation in subcohort 1 and key studies listed. C: In patients with a minimum of 1 year of follow-up post-MET, annual rates (95% CI) of SU and INS initiations per 100 person-years. Subcohort 1: initiated second ADD and had MET as first-line treatment. *Other: amylin, DOPRA, AGI, or MEG. EMPA REG, BI 10773 (Empagliflozin) Cardiovascular Outcome Event Trial in Type 2 Diabetes Mellitus Patients; EXAMINE, Examination of Cardiovascular Outcomes with Alogliptin versus Standard of Care; FDA, U.S. Food and Drug Administration; LEADER, Liraglutide Effect and Action in Diabetes: Evaluation of cardiovascular outcome Results; PROactive, Prospective Pioglitazone Clinical Trial in Macrovascular Events; RECORD, Rosiglitazone Evaluated for Cardiovascular Outcomes in Oral Agent Combination Therapy for Type 2 Diabetes; SAVOR-TIMI, Saxagliptin Assessment of Vascular Outcomes Recorded in Patients with Diabetes Mellitus–Thrombolysis in Myocardial Infarction; TECOS, Trial Evaluating Cardiovascular Outcomes with Sitagliptin; UKPDS, UK Prospective Diabetes Study.
Table 2. Patient characteristics at the time of the second ADD initiation, by drug class added in subcohort 1|| (N = 357,482)
| | GLP-1RA | SGLT2i | INS | TZD | DPP-4i | SU | Other§ | All |
|---|---|---|---|---|---|---|---|---|
| n1 (% of N) | 15,448 (4) | 5,971 (2) | 49,939 (14) | 33,021 (9) | 61,508 (17) | 187,819 (53) | 3,776 (1) | 357,482 (100) |
| Time from first to second ADD (months)* | 11.1 (18.58) | 18.52 (23.73) | 5.74 (14.58) | 4.09 (11.19) | 11.15 (18.95) | 7.38 (16.08) | 7.35 (15.92) | 7.84 (16.5) |
| Follow-up from second ADD initiation (years)* | 2.97 (2.44) | 0.95 (0.66) | 2.71 (2.26) | 4.77 (2.98) | 2.68 (2.01) | 3.34 (2.56) | 3.59 (2.65) | 3.22 (2.53) |
| Follow-up ≥1 year from second ADD initiation, n2 (% of n1) | 11,431 (74) | 2,558 (43) | 36,337 (73) | 28,841 (87) | 46,822 (76) | 149,109 (79) | 3,090 (82) | 278,188 (78) |
| Discontinuation within 1 year, n (% of n2) | 2,407 (21) | 643 (25) | 2,537 (7) | 5,913 (21) | 8,234 (18) | 15,569 (10) | 724 (23) | 36,027 (13) |
| Age (years)* | 53 (12) | 54 (11) | 57 (13) | 58 (11) | 58 (12) | 60 (12) | 61 (12) | 59 (12) |
| Age ≥70 years, n (% of n1) | 1,203 (8) | 435 (7) | 9,576 (19) | 6,450 (20) | 11,595 (19) | 47,416 (25) | 1,116 (30) | 77,791 (22) |
| Male, n (% of n1) | 5,305 (34) | 2,909 (49) | 23,366 (47) | 17,301 (52) | 29,463 (48) | 97,730 (52) | 1,664 (44) | 177,738 (50) |
| White Caucasian, n (% of n1) | 11,698 (76) | 4,622 (77) | 33,215 (67) | 22,710 (69) | 43,076 (70) | 130,418 (69) | 2,575 (68) | 248,314 (69) |
| HbA1c, %* | 7.8 (1.6) | 8.1 (1.8) | 9.3 (2.3) | 7.9 (1.7) | 8.2 (1.7) | 8.4 (1.8) | 7.5 (1.6) | 8.4 (1.9) |
| HbA1c, mmol/mol‡ | 62 | 65 | 78 | 63 | 66 | 68 | 58 | 68 |
| HbA1c ≥7.5% (58 mmol/mol), n (% of n1) | 3,890 (44) | 2,102 (59) | 19,083 (73) | 8,150 (46) | 20,960 (57) | 65,980 (62) | 587 (41) | 118,063 (60) |
| HbA1c ≥8% (64 mmol/mol), n (% of n1) | 2,913 (33) | 1,581 (44) | 16,871 (64) | 6,106 (34) | 15,768 (43) | 51,841 (49) | 418 (29) | 93,499 (48) |
| Weight, kg* | 108.1 (25.9) | 105.1 (25.2) | 99.8 (26) | 99.9 (24.3) | 98 (24.2) | 98 (24.6) | 94 (26.1) | 98.9 (24.9) |
| BMI, kg/m2* | 38.1 (8.2) | 36.2 (7.8) | 35 (8.3) | 34.8 (7.8) | 34.3 (7.6) | 34.3 (7.7) | 33.3 (7.9) | 34.6 (7.8) |
| Obese, n (% of n1) | 12,429 (86) | 4,387 (79) | 32,472 (71) | 20,920 (71) | 39,574 (69) | 117,232 (68) | 1,961 (61) | 222,627 (70) |
| SBP, mmHg* | 128 (14) | 129 (14) | 131 (16) | 130 (15) | 130 (14) | 132 (15) | 129 (15) | 131 (15) |
| SBP ≥140 mmHg, n (% of n1) | 2,565 (18) | 1,098 (20) | 11,642 (25) | 6,993 (24) | 12,446 (22) | 46,191 (27) | 703 (22) | 79,837 (25) |
| DBP, mmHg* | 78 (9) | 79 (9) | 76 (10) | 77 (9) | 78 (9) | 77 (10) | 75 (9) | 77 (9) |
| Heart rate, bpm* | 81 (11) | 80 (12) | 80 (12) | 78 (11) | 79 (11) | 79 (12) | 78 (12) | 79 (12) |
| LDL, mg/dL* | 96 (34) | 79 (9) | 98 (37) | 97 (35) | 98 (35) | 97 (35) | 75 (9) | 97 (35) |
| HDL, mg/dL* | 44 (12) | 43 (12) | 43 (13) | 45 (13) | 44 (12) | 43 (12) | 47 (15) | 43 (12) |
| Triglycerides, mg/dL† | 150 (109, 200) | 156 (115, 207) | 143 (103, 196) | 139 (100, 190) | 148 (109, 197) | 149 (109, 199) | 135 (97, 185) | 147 (107, 197) |
| CVD, n (% of n1) | 2,134 (14) | 1,004 (17) | 11,781 (24) | 5,894 (18) | 12,212 (20) | 41,220 (22) | 876 (23) | 75,121 (21) |
| CKD, n (% of n1) | 301 (2) | 128 (2) | 1,686 (3) | 836 (3) | 2,090 (3) | 6,806 (4) | 140 (4) | 11,987 (3) |
| Cancer, n (% of n1) | 606 (4) | 264 (4) | 2,554 (5) | 1,350 (4) | 3,359 (5) | 9,472 (5) | 237 (6) | 17,842 (5) |
| Depression, n (% of n1) | 2,979 (19) | 1,123 (19) | 7,294 (15) | 3,710 (11) | 9,073 (15) | 23,433 (12) | 465 (12) | 48,077 (13) |
| Charlson Comorbidity Index* | 1.52 (0.9) | 1.580.0(1) | 1.71 (1.16) | 1.48 (0.92) | 1.63 (1.08) | 1.63 (1.08) | 1.69 (1.11) | 1.62 (1.07) |
bpm, beats per minute; n1, number of subcohort 1 patients prescribed each drug class as a second ADD; n2, number of n1 patients with ≥1 year follow-up after second ADD initiation. *Mean (SD); †median (interquartile range); ‡mean; §other: amylin, DOPRA, AGI, or MEG; ||subcohort 1: initiated second ADD and had MET as first-line treatment.
10% discontinued SU within 1 year compared with significantly higher discontinuation proportions in other second-line non-INS ADDs.
The proportional share of DPP-4i as a therapy intensification option post–MET initiation sharply increased from 0.4% in 2006 (approved in October 2006) to 20% in 2016 (Fig. 1B). DPP-4i were initiated at an average HbA1c of 8.2% (66 mmol/mol), with 57% and 43% having HbA1c ≥7.5% (58 mmol/mol) and ≥8.0% (64 mmol/mol), respectively. While 18% discontinued DPP-4i within a year of initiation, the proportions of patients discontinuing second-line GLP-1RA, TZD, or SGLT2i within a year were higher.
The proportional share of patients receiving GLP-1RA as a second ADD increased from 3% in 2006 to 7% in 2016. Initiation of GLP-1RA occurred at relatively lower HbA1c levels of 7.8% (62 mmol/mol) and at the highest BMI levels among second-line ADD groups. Twenty-one percent of patients discontinued GLP-1RA therapy within a year of commencing it as a second ADD. After approval of the first SGLT2i in 2013, the proportional share of those receiving it as a second ADD reached 7% in 2016. One-quarter of patients discontinued SGLT2i therapy within a year of adding it as second-line ADD. The proportional share of patients receiving TZD as a second-line therapy dropped from 30 to 4% (Fig. 1B), with 21% of patients discontinuing therapy within 1 year.
The proportional share of patients receiving INS as a second ADD post–MET initiation has consistently increased from 7% in 2005 to 17% in 2016 (Fig. 1B). The intensification with INS occurred at a 9.3% (78 mmol/mol) average HbA1c level, with 73% and 64% having HbA1c ≥7.5% (58 mmol/mol) and ≥8.0% (64 mmol/mol), respectively. Only 7% of patients discontinued INS within 1 year of initiation.
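The paired HbA1c values quoted above (percent and mmol/mol) can be cross-checked with the standard NGSP-to-IFCC master equation, IFCC = 10.929 × (NGSP − 2.15). The following is a minimal sketch in Python; the function name `ngsp_to_ifcc` is illustrative and not part of the study's analysis code.

```python
def ngsp_to_ifcc(hba1c_percent):
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol).

    Uses the NGSP/IFCC master equation, rounded to the nearest
    whole mmol/mol as is usual in clinical reporting.
    """
    return round(10.929 * (hba1c_percent - 2.15))


# Thresholds and group means quoted in the text (percent values).
for pct in (7.5, 7.8, 8.0, 8.2, 9.3):
    print(f"{pct}% -> {ngsp_to_ifcc(pct)} mmol/mol")
```

Running the loop reproduces the pairs reported in the text, for example 8.0% with 64 mmol/mol and 9.3% with 78 mmol/mol.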
Third ADD
Among patients in subcohort 1, 78% had at least 1 year of follow-up from the second ADD initiation (subcohort 2; n = 278,188). Of these patients, 144,106 (52%) initiated a third ADD, with an annual mean rate of 12.6 initiations per 100 person-years (minimum 11.4, maximum 14.9) during a mean follow-up of 4 years post–second-line initiation. Table 3 presents treatment intensification patterns by the major second-line ADDs. Most of the patients (84% [n = 121,559]) added a third drug on top of the second ADD, while 16% (n = 22,547) ceased the second ADD and switched to a third ADD. Addition of the third drug occurred at higher HbA1c levels (8.5% [69 mmol/mol]) compared with switching (8.2% [66 mmol/mol]).
Among patients with SU as the second ADD, 49% (n = 73,776) added and 6% (n = 8,204) switched to a third drug during a mean follow-up of 4.1 years. The most popular third ADD addition was DPP-4i (34% of those who added a third ADD), followed by INS (28%) and TZD (26%). Among those who switched, almost one-half (49%) switched to INS, while 30% and 8% switched to DPP-4i and GLP-1RA, respectively.
SU, DPP-4i, and GLP-1RA were added to INS in 32%, 26%, and 22% of patients (from those who added a third ADD), respectively. Only 3% of patients ceased INS therapy to switch to another ADD during a mean 3.6 years of follow-up. In the second-line DPP-4i group (n = 46,822), 40% added and 11% switched to a third drug during a mean 3.4 years of follow-up. The most popular third ADD addition was SU (40% of those who added a third ADD), followed by INS (29%) and GLP-1RA (9%). Of those who switched from DPP-4i, one-half of the patients moved to SU, followed by INS and GLP-1RA (17% each).
Among those who had a GLP-1RA as the second ADD, 52% added INS (of those who added a third ADD) and 18% switched to INS (of those who switched to a third ADD) during a mean 3.9 years of follow-up; 11% added and 34% switched to DPP-4i. In the TZD group, 43% added and 22% switched to a third ADD during 5.4 years of follow-up. Among those who switched, 45% chose SU while 35% moved to DPP-4i.
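The added and switched counts reported for each second-line group can be turned into an add-to-switch ratio with simple arithmetic. The sketch below is illustrative only: the dictionary names are assumptions, and the counts are the "added" (n1) and "switched" (n2) values listed in Table 3.

```python
# Patients who added a third ADD on top of the second-line drug (n1 in Table 3).
added = {"SU": 73776, "DPP-4i": 18881, "GLP-1RA": 4522}

# Patients who stopped the second-line drug and switched to a third ADD (n2 in Table 3).
switched = {"SU": 8204, "DPP-4i": 4959, "GLP-1RA": 1420}

# Ratio of addition to switching, rounded to one decimal place.
ratios = {drug: round(added[drug] / switched[drug], 1) for drug in added}

for drug, ratio in ratios.items():
    print(f"{drug}: {ratio}")
```

A ratio well above 1 means that patients in that group mostly kept their second-line drug and layered a third agent on top of it rather than replacing it.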
Temporal Changes in Rates of Intensification With SU and INS
Among patients with first-line MET and a minimum 1 year of follow-up, the annual rates per 100 person-years of INS/SU initiation (irrespective of order of therapy intensification) are presented in Fig. 1C. The rates did not significantly decline from 2005 to 2014.
Do Novel ADDs Help Delay the Need for Therapy Intensification?
The Kaplan-Meier analyses, based on restricted mean years to adding or moving to a third ADD, in major second-line ADD groups are presented in Table 3. The mean time to intensification with a third ADD was marginally longer in incretin groups (DPP-4i 4.1 years [95% CI 4.1, 4.2] and GLP-1RA 4.2 years [4.1, 4.3]) compared with that in patients with SU as the second-line ADD (3.9 years [3.8, 3.9]; P = 0.04). The restricted mean times to intensification with INS any time during follow-up were 6.3, 7.1, and 6.6 years in the SU, DPP-4i, and GLP-1RA groups, respectively (all comparative P < 0.05).
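A restricted mean time to an event is the area under the Kaplan-Meier curve up to a chosen horizon. The sketch below shows one way this quantity could be computed in plain Python. It is not the authors' analysis code: the function name `km_rmst`, the horizon `tau`, and the toy follow-up data are all assumptions made for illustration.

```python
def km_rmst(times, events, tau):
    """Restricted mean event-free time up to horizon tau.

    times  -- follow-up time for each patient (e.g. years)
    events -- 1/True if the event (e.g. third ADD initiation) was observed,
              0/False if the patient was censored at that time
    tau    -- restriction horizon; the Kaplan-Meier curve is integrated on [0, tau]
    """
    # Distinct times at which at least one event occurred, up to tau.
    event_times = sorted({t for t, e in zip(times, events) if e and t <= tau})
    surv, area, prev = 1.0, 0.0, 0.0
    for t in event_times:
        # Area of the flat step of the curve between the previous event time and t.
        area += surv * (t - prev)
        at_risk = sum(1 for x in times if x >= t)
        n_events = sum(1 for x, e in zip(times, events) if e and x == t)
        # Kaplan-Meier product-limit update at time t.
        surv *= 1.0 - n_events / at_risk
        prev = t
    # Last step, from the final event time to the horizon.
    return area + surv * (tau - prev)


# Hypothetical toy cohort: years to third ADD, with one censored patient.
years = [2, 3, 5]
observed = [1, 0, 1]
print(km_rmst(years, observed, tau=5))
```

Without censoring, the result equals the plain average of min(time, tau). With censoring, the Kaplan-Meier weights adjust for patients whose follow-up ended before the event.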
CONCLUSIONS
This longitudinal exploratory study of a large cohort of patients with type 2 diabetes observed between 2005 and 2016 from primary and ambulatory care systems in the U.S. provides 1) a detailed account of glycemic states, clinical characteristics, and comorbidities at first-line and second-line therapy initiation by different drug classes, as well as new insights into 2) the changes in the choice of first- and second-line ADDs over the last 10 years, 3) patterns of therapy intensification with third-line ADDs and with INS, separately for major second-line ADDs, 4) changes in the annual rates of therapy intensification with SU and INS over time, and 5) possible benefits of using newer novel antidiabetes therapies in terms of delaying the need for third-line therapy intensification, including the need for initiating INS.
With 3.4 years of mean follow-up in more than one million patients with a diagnosis of type 2 diabetes from 2005, this study provides robust and detailed information on the changing clinical practices for the management of type 2 diabetes in a real-world setting. We are not aware of any study that simultaneously evaluated the changing prescribing patterns of old and new ADDs as first-line therapy and as intensification options at various levels of glycemia and comorbidities.
The proportional share of MET as the first-line therapy choice has increased from 60 to 77%, while that for SU has decreased from 20 to 8%, over the last decade. However, SU continue to be the most popular second-line therapy intensification option, although with a declining share (from 60 to 46% over the last decade). The discontinuation rate of SU was found to be the lowest among non-INS second-line ADDs. Among those who intensified with a third-line therapy, the ratio of addition to switching to third ADD
care.diabetesjournals.org Montvida and Associates 75
Table 3. Intensification of major second-line therapies in subcohort 2‡ (N = 278,188)
 | GLP-1RA | INS | TZD | DPP-4i | SU | All
--- | --- | --- | --- | --- | --- | ---
N | 11,431 | 36,337 | 28,841 | 46,822 | 149,109 | 278,188
Follow-up from second ADD initiation, years* | 3.85 (2.24) | 3.56 (2.09) | 5.40 (2.66) | 3.37 (1.81) | 4.09 (2.35) | 4.02 (2.33)
Initiated third ADD, n (% of N) | 5,942 (52) | 10,677 (29) | 18,788 (65) | 23,840 (51) | 81,980 (55) | 144,106
Initiated INS, n (% of N) | 3,285 (29) | - | 8,223 (29) | 9,633 (21) | 45,293 (30) | 67,812
Restricted mean time to a third ADD, years§ | 4.23 (4.14, 4.32) | 6.15 (6.09, 6.21) | 3.53 (3.49, 3.58) | 4.12 (4.07, 4.17) | 3.91 (3.88, 3.93) | 4.18 (4.17, 4.20)
Restricted mean time to INS, years§ | 6.58 (6.49, 6.67) | - | 6.82 (6.78, 6.87) | 7.14 (7.09, 7.18) | 6.26 (6.23, 6.28) | 6.51 (6.49, 6.53)
Added third ADD | | | | | |
n1 (% of N) | 4,522 (40) | 9,675 (27) | 12,481 (43) | 18,881 (40) | 73,776 (49) | 121,559 (44)
HbA1c, %* | 8.2 (1.7) | 8.9 (2) | 8.1 (1.8) | 8.5 (1.8) | 8.6 (1.7) | 8.5 (1.8)
HbA1c, mmol/mol† | 66 | 74 | 65 | 69 | 70 | 69
HbA1c ≥7.5% (58 mmol/mol), n (% of n1) | 1,798 (61) | 4,953 (74) | 4,102 (55) | 8,968 (69) | 32,611 (72) | 53,100 (69)
HbA1c ≥8% (64 mmol/mol), n (% of n1) | 1,364 (47) | 4,160 (62) | 3,101 (42) | 6,974 (54) | 26,231 (58) | 42,330 (55)
GLP-1RA as third ADD, n (% of n1) | - | 2,125 (22) | 1,532 (12) | 1,643 (9) | 5,662 (8) | 11,230 (9)
INS as third ADD, n (% of n1) | 2,356 (52) | - | 3,338 (27) | 5,472 (29) | 20,483 (28) | 32,447 (27)
TZD as third ADD, n (% of n1) | 241 (5) | 688 (7) | - | 1,425 (8) | 19,010 (26) | 21,472 (18)
DPP-4i as third ADD, n (% of n1) | 481 (11) | 2,488 (26) | 3,804 (30) | - | 25,216 (34) | 32,696 (27)
SU as third ADD, n (% of n1) | 865 (19) | 3,090 (32) | 3,048 (24) | 7,556 (40) | - | 14,844 (12)
SGLT2i as third ADD, n (% of n1) | 521 (12) | 938 (10) | 138 (1) | 2,399 (13) | 2,077 (3) | 6,095 (5)
Switched to third ADD | | | | | |
n2 (% of N) | 1,420 (12) | 1,002 (3) | 6,307 (22) | 4,959 (11) | 8,204 (6) | 22,547 (8)
HbA1c, %* | 7.9 (1.6) | 8.2 (1.8) | 7.8 (1.7) | 8.1 (1.7) | 8.7 (2) | 8.2 (1.8)
HbA1c, mmol/mol† | 63 | 66 | 62 | 65 | 72 | 66
HbA1c ≥7.5% (58 mmol/mol), n (% of n2) | 470 (54) | 377 (60) | 1,846 (49) | 1,940 (59) | 3,380 (68) | 8,231 (59)
HbA1c ≥8% (64 mmol/mol), n (% of n2) | 335 (38) | 280 (44) | 1,290 (34) | 1,451 (44) | 2,780 (56) | 6,286 (45)
GLP-1RA as third ADD, n (% of n2) | - | 84 (8) | 383 (6) | 842 (17) | 677 (8) | 2,065 (9)
INS as third ADD, n (% of n2) | 262 (18) | - | 703 (11) | 849 (17) | 4,012 (49) | 5,931 (26)
TZD as third ADD, n (% of n2) | 61 (4) | 48 (5) | - | 273 (6) | 510 (6) | 924 (4)
DPP-4i as third ADD, n (% of n2) | 488 (34) | 269 (27) | 2,199 (35) | - | 2,436 (30) | 5,561 (25)
SU as third ADD, n (% of n2) | 390 (27) | 521 (52) | 2,829 (45) | 2,456 (50) | - | 6,450 (29)
SGLT2i as third ADD, n (% of n2) | 205 (14) | 66 (7) | 109 (2) | 464 (9) | 373 (5) | 1,228 (5)
n1, number of subcohort 2 patients for each drug class who added a third ADD; n2, number of subcohort 2 patients for each drug class who switched to a third ADD. *Mean (SD); †mean; §mean (95% CI). ‡Subcohort 2: those with MET as first-line treatment, who initiated second ADD, with a minimum of 1 year of follow-up post–second drug initiation.
was highest in the SU group (9.0), followed by DPP-4i (3.8) and GLP-1RA (3.2), during >4 years of mean follow-up post–second-line ADD initiation.
We observed that second-line SU users initiate a third ADD marginally sooner compared with incretin users. A study based on EMR data from the U.K. reported the opposite results, with an average of 1.6 and 2.4 years to third ADD initiation in the 3,080 and 15,508 patients treated with MET + DPP-4i and MET + SU, respectively (5).
The proportions of patients who added INS were similar between patients who had a DPP-4i and an SU as the second ADD. However, among those who switched to a third ADD, only 17% of patients in the DPP-4i group switched to INS, compared with almost 50% in the SU group. We also observed that the mean time to INS initiation was significantly shorter for second-line SU users (6.3 years) than in the DPP-4i group (7.1 years). This finding is similar to a study (2015) based on 3,864 matched pairs of patients treated with DPP-4i or SU when added to MET, where Inzucchi et al. (6) reported that those treated with DPP-4i were significantly less likely to initiate INS compared with those treated with SU.
We observed an increasing proportional share of INS as a second-line therapy option over the last 10 years, despite the availability of novel therapies that were found to have similar or better glycemic efficacy in clinical trials. Also, the annual rates of intensification with INS remained similar over the last decade. In a similar study, Lipska et al. (15) observed that the overall rate of severe hypoglycemia did not reduce from 2006 to 2013. This may reflect pressure to achieve glycemic targets rapidly and an increasing recognition that for people with very poor glycemic control, INS may be the only drug likely to achieve targets.
Compared with rates for older ADDs,
high discontinuation rates of new therapeutic classes, particularly of DPP-4i, are surprising. The higher cost of these newer drugs may be relevant and may also contribute to the fairly low rates of initiation of these drugs overall. More studies, utilizing additional data sources, are needed to specifically test hypotheses for the differences in initiation, adherence, and persistence between drug classes.
The HbA1c level at pharmacological therapy initiation was found to be 8.2% (66 mmol/mol), with 50% having HbA1c ≥7.5% (58 mmol/mol). The HbA1c levels at first-line ADD initiation were similar across all ADDs, except in those who initiated with INS, whose average HbA1c was 8.9% (74 mmol/mol). Although the mean time to second ADD post–MET initiation was only 8 months, it occurred at a high HbA1c level of 8.4% (68 mmol/mol), with 60% and 48% of patients having HbA1c ≥7.5% (58 mmol/mol) and ≥8% (64 mmol/mol), respectively. Among those with a minimum of 1 year of follow-up post–second ADD, ~52% intensified with a third ADD at an average HbA1c level of 8.5% (69 mmol/mol). These findings reflect the continued glycemic risk burden in patients with type 2 diabetes (27–30). While the persistent therapeutic inertia (28) and the long-term consequences of therapeutic inertia (27,29) for glycemic control in primary care systems have been evaluated, exploration of the glycemic state at therapy initiation and intensification by ADD classes is scarce. Our study provides a detailed account of the glycemic state in people with type 2 diabetes at therapy initiation and intensifications during a reasonable follow-up period in primary and ambulatory care settings.
The main strength of this study is the availability of data from the patients’ medication lists that included prescribed medications within the EMR network and also medication information that could be prescribed outside of the EMR, as well as data on glycemic control and comorbidities. The CEMR database tracks longitudinal treatment adjustments and contains comprehensive clinical information, which is usually not available in claims databases.
The limitations of this study include the
nonavailability of complete and reliable data on 1) medication adherence and side effects, 2) diet and exercise, 3) socioeconomic status, and 4) insurance type. We did not include dosage changes or brands’ distribution in our analyses. The findings of this study should be interpreted with caution: EMR data are in general biased toward unhealthy populations and commercially insured individuals, white Caucasians are overrepresented in the CEMR, and the results are subject to limited follow-up.
Less popular ADDs such as MEG, AGI, DOPRA, and amylin were included in our study for multiple reasons: first, to assess utilization data of such medications, as these drugs are usually omitted, and second, to ensure market shares of other drugs are not overestimated. We observed that only 39,549 patients with type 2 diabetes were using SGLT2i during the available follow-up period, which is not surprising, given that they were first approved in 2013.
To conclude, while we have observed a significant increase in the use of MET as the first-line therapy over the last 10 years, the second- and third-line therapy intensification choices are highly heterogeneous. While increasing popularity of “new” drugs, especially DPP-4i and SGLT2i, was observed as the second and third drug choices, SU remain the most popular therapy intensification choice and have a lower discontinuation rate compared with other non-INS ADDs. The proportional share of INS as a second-line therapy choice has also increased significantly over the last decade. Incretin-based therapies were found to delay the need for therapy intensification only marginally compared with other ADDs. Contrary to the guidelines for proactive glycemic management, pharmacological therapy initiation and the intensifications occurred at very high levels of HbA1c, with 48% of patients having HbA1c ≥8.0% (64 mmol/mol) at second-line therapy initiation.
Acknowledgments. O.M. acknowledges her cosupervisors Ross Young and Louise Hafner of Queensland University of Technology.
Funding. J.J.A. and S.K.P. acknowledge project grant support provided by the Royal Brisbane and Women’s Hospital Foundation. O.M. acknowledges a PhD scholarship from Queensland University of Technology. J.S. is supported by a National Health and Medical Research Council Research Fellowship. Melbourne EpiCentre received support from the National Health and Medical Research Council and the Australian Government’s National Collaborative Research Infrastructure Strategy initiative through Therapeutic Innovation Australia.
Duality of Interest. J.S. has received speaker honoraria, consultancy fees, and/or travel sponsorship from AstraZeneca, Boehringer Ingelheim, Lilly, Sanofi, Mylan, Novo Nordisk, Merck Sharp & Dohme, and Novartis. J.J.A. has received speaker honoraria, consultancy fees, and/or travel sponsorship from AstraZeneca, Boehringer Ingelheim, Lilly, and Novartis. S.K.P. has acted as a consultant and/or speaker for Novartis, GI Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi Pharmaceutical, and Amylin Pharmaceuticals. S.K.P. has received grants in support of investigator and investigator-initiated clinical studies from Merck, Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi, and Pfizer. No other potential conflicts of interest relevant to this article were reported.
Author Contributions. O.M. and S.K.P. were responsible for the primary design of the study. J.S. and J.J.A. contributed significantly in the study
design. O.M. conducted the data extraction. O.M. and S.K.P. jointly conducted the statistical analyses. F.S. contributed in the interpretation of the results. The first draft of the manuscript was developed by O.M. and S.K.P., and all authors contributed to the finalization of the manuscript. S.K.P. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
References
1. American Diabetes Association. Standards of Medical Care in Diabetes-2017: summary of revisions. Diabetes Care 2017;40(Suppl. 1):S4–S5
2. Paul SK, Maggs D, Klein K, Best JH. Dynamic risk factors associated with non-severe hypoglycemia in patients treated with insulin glargine or exenatide once weekly. J Diabetes 2015;7:60–67
3. Paul SK, Shaw JE, Montvida O, Klein K. Weight gain in insulin-treated patients by body mass index category at treatment initiation: new evidence from real-world data in patients with type 2 diabetes. Diabetes Obes Metab 2016;18:1244–1252
4. Waldrop G, Zhong J, Peters M, et al. Incretin-based therapy in type 2 diabetes: an evidence based systematic review and meta-analysis. J Diabetes Complications. 28 August 2016 [Epub ahead of print]. https://doi.org/10.1016/j.jdiacomp.2016.08.018
5. Mamza J, Mehta R, Donnelly R, Idris I. Important differences in the durability of glycaemic response among second-line treatment options when added to metformin in type 2 diabetes: a retrospective cohort study. Ann Med 2016;48:224–234
6. Inzucchi SE, Tunceli K, Qiu Y, et al. Progression to insulin therapy among patients with type 2 diabetes treated with sitagliptin or sulphonylurea plus metformin dual therapy. Diabetes Obes Metab 2015;17:956–964
7. Cohen FJ, Neslusan CA, Conklin JE, Song X. Recent antihyperglycemic prescribing trends for U.S. privately insured patients with type 2 diabetes. Diabetes Care 2003;26:1847–1851
8. Desai NR, Shrank WH, Fischer MA, et al. Patterns of medication initiation in newly diagnosed diabetes mellitus: quality and cost implications. Am J Med 2012;125:302.e1-7
9. Tanne JH. FDA places “black box” warning on antidiabetes drugs. BMJ 2007;334:1237
10. Nissen SE, Wolski K. Effect of rosiglitazone on the risk of myocardial infarction and death from cardiovascular causes. N Engl J Med 2007;356:2457–2471
11. Woodcock J, Sharfstein JM, Hamburg M. Regulatory action on rosiglitazone by the U.S. Food and Drug Administration. N Engl J Med 2010;363:1489–1491
12. Home PD, Pocock SJ, Beck-Nielsen H, et al.; RECORD Study Team. Rosiglitazone evaluated for cardiovascular outcomes in oral agent combination therapy for type 2 diabetes (RECORD): a multicentre, randomised, open-label trial. Lancet 2009;373:2125–2135
13. Berkowitz SA, Krumme AA, Avorn J, et al. Initial choice of oral glucose-lowering medication for diabetes mellitus: a patient-centered comparative effectiveness study. JAMA Intern Med 2014;174:1955–1962
14. Segal JB, Maruthur NM. Initial therapy for diabetes mellitus. JAMA Intern Med 2014;174:1962–1963
15. Lipska KJ, Yao X, Herrin J, et al. Trends in drug utilization, glycemic control, and rates of severe hypoglycemia, 2006–2013. Diabetes Care 2017;40:468–475
16. Pantalone KM, Hobbs TM, Wells BJ, et al. Clinical characteristics, complications, comorbidities and treatment patterns among patients with type 2 diabetes mellitus in a large integrated health system. BMJ Open Diabetes Res Care 2015;3:e000093
17. Raebel MA, Xu S, Goodrich GK, et al. Initial antihyperglycemic drug therapy among 241 327 adults with newly identified diabetes from 2005 through 2010: a surveillance, prevention, and management of diabetes mellitus (SUPREME-DM) study. Ann Pharmacother 2013;47:1280–1291
18. Hampp C, Borders-Hemphill V, Moeny DG, Wysowski DK. Use of antidiabetic drugs in the U.S., 2003-2012. Diabetes Care 2014;37:1367–1374
19. Centers for Disease Control and Prevention. National Diabetes Statistics Report: Estimates of Diabetes and Its Burden in the United States, 2014. Atlanta, GA, U.S. Department of Health and Human Services, 2014
20. International Diabetes Federation. IDF Diabetes Atlas. 7th ed. Brussels, Belgium, 2015
21. Higgins V, Piercy J, Roughley A, et al. Trends in medication use in patients with type 2 diabetes mellitus: a long-term view of real-world treatment between 2000 and 2015. Diabetes Metab Syndr Obes 2016;9:371–380
22. Crawford AG, Cote C, Couto J, et al. Comparison of GE Centricity Electronic Medical Record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Popul Health Manag 2010;13:139–150
23. Brixner D, Said Q, Kirkness C, Oberg B, Ben-Joseph R, Oderda G. Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value Health 2007;10:S29–S36
24. Montvida O, Arandjelovic O, Reiner E, Paul SK. Data mining approach to estimate the duration of drug therapy from longitudinal electronic medical records. Open Bioinform J 2017;10:1–15
25. U.S. Food and Drug Administration. FDA approves weight-management drug [article online], 2014. Available from https://wayback.archive-it.org/7993/20170111160832/http://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/ucm427913.htm. Accessed 23 December 2014
26. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care 2005;43:1130–1139
27. Paul SK, Klein K, Thorsted BL, Wolden ML, Khunti K. Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovasc Diabetol 2015;14:100
28. Khunti K, Nikolajsen A, Thorsted BL, Andersen M, Davies MJ, Paul SK. Clinical inertia with regard to intensifying therapy in people with type 2 diabetes treated with basal insulin. Diabetes Obes Metab 2016;18:401–409
29. Paul SK, Klein K, Thorsted BL, Wolden ML, Khunti K. Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovasc Diabetol 2015;14:100
30. Montvida O, Klein K, Kumar S, Khunti K, Paul SK. Addition of or switch to insulin therapy in people treated with glucagon-like peptide-1 receptor agonists: a real-world study in 66 583 patients. Diabetes Obes Metab 2017;19:108–117
Chapter 8: Glycaemic Control and Sustainability
Statement of Contribution of Co-Authors for Thesis by Published Paper
The authors listed below have certified* that:
1. they meet the criteria for authorship in that they have participated in the conception,
execution, or interpretation, of at least that part of the publication in their field of
expertise;
2. they take public responsibility for their part of the publication, except for the
responsible author who accepts overall responsibility for the publication;
3. there are no other authors of the publication according to these criteria;
4. potential conflicts of interest have been disclosed to (a) granting bodies, (b) the editor
or publisher of journals or other publications, and (c) the head of the responsible
academic unit, and
5. they agree to the use of the publication in the student’s thesis and its publication on
the QUT’s ePrints site consistent with any limitations set by publisher requirements.
In the case of this chapter:
Olga Montvida, Jonathan Shaw, Lawrence Blonde, Sanjoy K Paul. Long-term sustainability
of glycaemic achievements with second-line anti-diabetic therapies in patients with type 2
diabetes: A real-world study. Accepted at Diabetes, Obesity, and Metabolism.
Contributor | Statement of Contribution*
--- | ---
Olga Montvida | Conceived the idea and was responsible for the primary design of the study. Conducted the data extraction and statistical analyses. Developed first draft and contributed towards development of the manuscript.
Jonathan Shaw | Contributed in the study design and manuscript development.
Lawrence Blonde | Contributed in the manuscript development.
Sanjoy K. Paul | Conceived the idea, and was responsible for the primary design of the study. Contributed to the statistical analyses. Developed first draft and contributed towards development of the manuscript.
29.06.2018QUT Verified
Signature
Principal Supervisor Confirmation
I have sighted email or other correspondence from all Co-authors confirming their certifying
authorship.
Sanjoy Ketan Paul 29.06.2018
Name Signature Date
QUT Verified Signature
ORIGINAL ARTICLE
Long-term sustainability of glycaemic achievements with second-line antidiabetic therapies in patients with type 2 diabetes: A real-world study
Olga Montvida MSc1,2 | Jonathan E. Shaw MD3 | Lawrence Blonde MD4 |
Sanjoy K. Paul PhD1,5
1Statistics Unit, QIMR Berghofer Medical
Research Institute, Brisbane, Australia
2School of Biomedical Sciences, Faculty of
Health, Queensland University of Technology,
Brisbane, Australia
3Baker Heart and Diabetes Institute,
Melbourne, Australia
4Ochsner Diabetes Clinical Research Unit,
Department of Endocrinology, Frank Riddick
Diabetes Institute, Ochsner Medical Center,
New Orleans, Louisiana
5Melbourne EpiCentre, University of
Melbourne and Melbourne Health,
Melbourne, Australia
Correspondence
Sanjoy K. Paul PhD, The Royal Melbourne
Hospital, City Campus, 7 East, Main Building,
Grattan Street, Parkville, Victoria 3050,
Australia.
Email: [email protected]
Funding information
No separate funding was obtained for this
study.
Aims: To inform patients and their carers about both the probability of reducing glycated haemoglobin (HbA1c) to clinically desirable levels and the sustainability of such control over 2 years with major second-line antidiabetic therapies, in individual risk scenarios, with and without third-line intensification.
Materials and Methods: From US Centricity Electronic Medical Records, 163 081 patients with type 2 diabetes aged 18 to 80 years, who had initiated metformin, intensified their treatment with dipeptidyl peptidase-4 (DPP-4) inhibitors, glucagon-like peptide-1 (GLP-1) receptor agonists (GLP-1RAs), sulphonylureas (SUs), insulin or thiazolidinediones (TZDs), and continued second-line treatment for ≥6 months, were selected. Treatment groups were balanced with regard to baseline characteristics, and glycaemic achievements were estimated using logistic regression analysis.
Results: With HbA1c concentrations of 58–63.9 mmol/mol (7.5–7.9%) at second-line treatment initiation, the adjusted probabilities of achieving HbA1c <53 mmol/mol (<7%) at 6 months were 32%, 38%, 39%, 26% and 38% in the SU, DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respectively, while with baseline HbA1c concentrations of 64–75 mmol/mol (8–9%), the corresponding probabilities of reducing HbA1c to <58 mmol/mol (<7.5%) were 38%, 44%, 40%, 34% and 42%, respectively. In these baseline HbA1c categories, the adjusted probabilities of sustaining HbA1c achievements over 2 years were higher in the GLP-1RA and TZD groups, compared with the SU and insulin groups (P < .01). With baseline HbA1c concentrations of 75.1–108 mmol/mol (9.1–12%) 38% of patients achieved an HbA1c concentration <58 mmol/mol (<7.5%) at 6 months. The adjusted probability of sustaining this control over 2 years was higher in the incretin and TZD groups (range 62%-75%), while insulin and SUs offered lower chances of sustainable control (range 54%-56%).
Conclusions: Patients treated with second-line incretins and TZDs had a significantly higher probability of achieving and sustaining glycaemic control over 2 years without further intensification, compared with those treated with SUs or insulin.
KEYWORDS
antidiabetic drug, glycaemic control, therapeutic choice
1 | INTRODUCTION
Metformin is recommended as a first-line pharmacological treatment for patients with type 2 diabetes; however, most patients eventually require therapy intensification with multiple antidiabetic drugs (ADDs) to achieve glycaemic control.1–3 For second-line treatment intensification, the American Diabetes Association recommends sulphonylureas (SUs), thiazolidinediones (TZDs), dipeptidyl peptidase
Received: 21 January 2018 Revised: 28 February 2018 Accepted: 7 March 2018
DOI: 10.1111/dom.13288
Diabetes Obes Metab. 2018;1–10. wileyonlinelibrary.com/journal/dom © 2018 John Wiley & Sons Ltd 1
(DPP-4) inhibitors, sodium-glucose co-transporter-2 (SGLT2) inhibitors, glucagon-like peptide-1 receptor agonists (GLP-1RAs) or insulin. Other drugs are recommended under specific conditions.1 The temporal patterns of the changes in the second-line ADD choices over the last decade in the United States have recently been explored by Montvida et al.4
While clinicians' and patients' decisions with regard to add-on agents have become more complicated,5,6 few studies have directly compared the glycaemic effectiveness of second-line therapies.7–11 A recent network meta-analysis reported similar glycaemic achievements with all second-line ADDs when added to metformin.9 Analogous results were discussed in two observational studies using electronic medical records (EMRs).7,8 In 7009 patients from Germany, Rathmann et al7 reported an unadjusted mean glycated haemoglobin (HbA1c) reduction of 0.7% to 1.1% after 6 months of treatment with major second-line ADDs, including insulin. A study from Denmark in 4734 patients by Thomsen et al8 reported median reductions of 0.8% to 1.3% at 12 months for non-insulin drugs (from baseline HbA1c of 60–64 mmol/mol [7.6–8.0%]) and 2.4% for insulin-treated patients (from 81 mmol/mol [9.6%]).8
Given the increasing complexity and challenges with regard to multiple risk factor management in patients with type 2 diabetes, and the availability of a number of new and older classes of ADDs, a population-level assessment of the likelihood of short- and long-term glycaemic achievements and their sustainability with the use of different second-line ADDs would be of great value. With evaluation of a reasonably large number of patients from primary and ambulatory care systems, probabilistic estimates of sustainable glycaemic achievements with different second-line ADDs for different risk paradigms would empower clinicians and their patients to make more informed therapeutic choices. To the best of our knowledge, no study has evaluated early glycaemic control and its sustainability among groups of patients with different HbA1c concentrations at the time of post-metformin second-line ADD intensification.
The newer classes of ADDs, including GLP-1RAs and SGLT2 inhibitors, potentially have both extra glycaemic benefits, such as weight and blood pressure reductions, and possible associations with reduced risk of cardiovascular diseases. These therapies are costlier in comparison to the older ADDs; however, if the newer drugs have longer-lasting benefits with regard to glycaemic control than older ADDs, then over time, a lower rate of third-line therapy intensification across the whole population would be expected, as the use of newer drugs increases. In this context, evaluation of whether second-line therapy intensification with newer drug classes has been helpful in reducing the need for third-line therapy intensification over time, would be of great interest, but there is a paucity of population-level studies addressing this question. A modelling study by Zhang et al12 reported a marginally shorter time to insulin treatment in patients treated with incretin-based therapies (GLP-1RAs and DPP-4 inhibitors) compared with those treated with SUs, but we are not aware of any population-level study that has evaluated the possible delay in the need for third-line therapy intensification in patients who choose incretin-based therapies as second-line therapy.
Taking into account the heterogeneous HbA1c levels among
patients at second-line treatment initiation, the main aims of the
present study were: to inform clinicians and patients on the likelihood
of reducing HbA1c to a clinically desirable level over 6, 12 and
24 months of treatment with major second-line ADDs when added
to metformin; to estimate the probability of sustaining early glycae-
mic control over 24 months of therapy continuation with and without
the need for third-line ADD addition; and to determine whether the
availability of newer ADDs has reduced the need for intensification
with third-line therapy at the population-level over time.
2 | MATERIALS AND METHODS
2.1 | Data source
The US Centricity Electronic Medical Record (CEMR) database was
used in the present study. The CEMR represents >35 000 solo practi-
tioners, community clinics, academic medical centres and large inte-
grated delivery networks across all of the United States. Patients in
the database are generally representative of the US population, with
a diabetes prevalence (7.1%, as identified by diagnostic codes) that is
similar to National Diabetes Statistics (6.7% with diagnosed diabetes
in 2014).13 The CEMR database has been extensively used for aca-
demic research worldwide.14–16
Research involved existing data, from which the patients could
not be identified either directly or through identifiers linked to the
patients. Under US Department of Health and Human Services
Exemption 4 [45 CFR 46.101(b)(4)], this study was therefore
exempt from ethics approval from an institutional review board and
informed consent.
For more than 34 million individuals, longitudinal EMRs were
available from 1995 until April 2016, with comprehensive patient-
level information on demographics, anthropometric, clinical and labo-
ratory variables. Medication data included brand names and doses for
individual medications prescribed, along with start/stop dates and
specific fields to track treatment alterations. The dataset also con-
tained patient-reported medications, including prescriptions received
outside the EMR network and over-the-counter medications.
2.2 | Study design
To obtain data on the first-, second-, and third-line ADDs for each
patient with type 2 diabetes, the following drug classes were
arranged chronologically according to the initial prescription dates:
metformin, SUs, TZDs, alpha glucosidase inhibitors, amylin, dopamine
receptor agonists, meglitinide, DPP-4 inhibitors, GLP-1RAs, SGLT2
inhibitors and insulin. Same-day initiations (including combination
therapies) were prioritized in the order as listed above, from highest
to lowest. A robust methodology for extraction and assessment of
longitudinal patient-level medication data from the CEMR database
has recently been described by the authors.17
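The ordering rule described above can be sketched as follows. This is a minimal illustration, not the authors' extraction code: the function name and the example patient are hypothetical, and only the priority order is taken from the class list in the text.

```python
from datetime import date

# Priority order copied from the class list in the text (highest first).
PRIORITY = [
    "metformin", "SU", "TZD", "alpha glucosidase inhibitor", "amylin",
    "dopamine receptor agonist", "meglitinide", "DPP-4 inhibitor",
    "GLP-1RA", "SGLT2 inhibitor", "insulin",
]
RANK = {name: i for i, name in enumerate(PRIORITY)}


def therapy_lines(first_rx):
    """Return drug classes ordered by first prescription date.

    first_rx maps drug class -> date of the initial prescription.
    Same-day initiations are ordered by the priority list above.
    """
    return sorted(first_rx, key=lambda drug: (first_rx[drug], RANK[drug]))


# Hypothetical patient: metformin and an SU started on the same day.
example = {
    "insulin": date(2012, 3, 1),
    "SU": date(2010, 5, 20),
    "metformin": date(2010, 5, 20),
}
lines = therapy_lines(example)
print(lines)  # ['metformin', 'SU', 'insulin']
```

Sorting on the pair (date, rank) means the priority list only matters when two classes share an initiation date, which is the behaviour the text describes.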
The study cohort included patients with: (1) age at diagnosis ≥18
and <80 years; (2) a diagnosis date strictly after first registered activ-
ity in the CEMR database; (3) a diagnosis date on or after January
1, 2005; (4) initiation of antidiabetic therapy with metformin; (5) initia-
tion of second-line ADD with SU, TZD, DPP-4 inhibitors, GLP-1RAs
or insulin; (6) available HbA1c measure at second-line ADD initiation
(baseline); and (7) second-line therapy duration ≥6 months. Additional
restrictions on the duration of second-line therapy were applied:
≥12 months (sub-cohort 1) and ≥24 months (sub-cohort 2).
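The seven inclusion criteria and the two sub-cohort restrictions can be expressed as a single filter. The sketch below assumes a hypothetical per-patient record; the field names are illustrative and do not reflect the CEMR schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Second-line classes admitted to the cohort, as listed in criterion (5).
ELIGIBLE_SECOND_LINE = {"SU", "TZD", "DPP-4 inhibitor", "GLP-1RA", "insulin"}


@dataclass
class Patient:
    # Hypothetical record layout; field names are illustrative only.
    age_at_diagnosis: float
    diagnosis_date: date
    first_activity_date: date
    first_line: str
    second_line: Optional[str]
    baseline_hba1c: Optional[float]  # %, at second-line initiation
    second_line_months: float        # duration of second-line therapy


def in_cohort(p: Patient, min_months: float = 6) -> bool:
    """Apply criteria (1)-(7); min_months=12 or 24 gives sub-cohort 1 or 2."""
    return (
        18 <= p.age_at_diagnosis < 80
        and p.diagnosis_date > p.first_activity_date
        and p.diagnosis_date >= date(2005, 1, 1)
        and p.first_line == "metformin"
        and p.second_line in ELIGIBLE_SECOND_LINE
        and p.baseline_hba1c is not None
        and p.second_line_months >= min_months
    )


patient = Patient(
    age_at_diagnosis=56,
    diagnosis_date=date(2009, 6, 1),
    first_activity_date=date(2007, 2, 15),
    first_line="metformin",
    second_line="GLP-1RA",
    baseline_hba1c=8.1,
    second_line_months=14,
)
print(in_cohort(patient), in_cohort(patient, 12), in_cohort(patient, 24))
```

With 14 months of second-line therapy, this hypothetical patient falls in the study cohort and sub-cohort 1 but not in sub-cohort 2.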
Baseline body weight, body mass index (BMI), systolic/diastolic
blood pressure and lipids were calculated as the average of available
measurements within the 3 months before and 3 months after initia-
tion of therapy. HbA1c values at baseline, 6, 12, 18 and 24 months
were obtained as the nearest measure within 3 months either side of
the time point. Under condition of at least two non-missing HbA1c
measures over 24 months, the missing data were imputed using a
Markov chain Monte Carlo method, adjusting for age, diabetes dura-
tion and usage of concomitant ADDs.18 The following baseline
HbA1c categories were then created: (1) 53.0–63.9 mmol/mol (7.0–
7.9%); (2) 64.0–75.0 mmol/mol (8.0–9.0%); (3) 75.1–108.0 mmol/mol
(9.1–12%); (4) >108 mmol/mol (>12%).
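The nearest-measure rule and the baseline categories can be sketched as below. Two assumptions are made for illustration: measurement times are expressed in months from baseline, and HbA1c is recorded in % to one decimal place, so the category boundaries leave no gaps. The function names are hypothetical, and the imputation step is not reproduced.

```python
def nearest_measure(measures, target_month, window=3.0):
    """Return the HbA1c value closest to target_month, or None.

    measures is a list of (months_from_baseline, hba1c_percent) pairs;
    only measures within `window` months either side are considered.
    """
    in_window = [m for m in measures if abs(m[0] - target_month) <= window]
    if not in_window:
        return None
    return min(in_window, key=lambda m: abs(m[0] - target_month))[1]


def hba1c_category(pct):
    """Map a baseline HbA1c (%) to the four categories in the text."""
    if pct < 7.0:
        return None  # below the lowest category defined in the text
    if pct < 8.0:
        return "7.0-7.9%"
    if pct <= 9.0:
        return "8.0-9.0%"
    if pct <= 12.0:
        return "9.1-12%"
    return ">12%"


# Hypothetical measurement history for one patient.
measures = [(-1.0, 8.6), (0.5, 8.3), (5.0, 7.4), (13.5, 7.1)]
series = {m: nearest_measure(measures, m) for m in (0, 6, 12, 18, 24)}
print(series)
print(hba1c_category(series[0]))
```

Time points with no measure inside the window come back as `None`; in the study these are the values that would be candidates for imputation.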
The presence of comorbidities prior to baseline was assessed
using the relevant disease identification codes. The Charlson comor-
bidity index (CCI) score was calculated using the algorithm described
by Quan et al.19 Cardiovascular disease was defined as ischaemic
heart disease, peripheral vascular disease, heart failure or stroke. Can-
cer was defined as any malignancy except malignant neoplasm
of skin.
2.3 | Statistical methods
Baseline characteristics were summarized as number (%), mean
(SD) or median (first quartile, third quartile) as appropriate. Patterns
of intensification with third-line ADDs were summarized according to
second-line ADDs in the study cohort, sub-cohort 1 and sub-cohort
2. Among patients with ≥2 years of follow-up in the study cohort,
the proportions (95% confidence interval [CI]) of those who initiated
a third ADD within 2 years of baseline were calculated according to
year of second-line treatment initiation.
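A proportion with a 95% CI per calendar year can be computed as sketched below. The text does not state which interval method was used; the Wilson score interval is chosen here as an assumption, and the yearly counts are hypothetical.

```python
from math import sqrt


def wilson_ci(events, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = events / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical counts: year -> (initiated third ADD within 2 years, eligible).
by_year = {2008: (520, 1000), 2011: (450, 1000), 2014: (480, 1000)}

for year, (events, n) in sorted(by_year.items()):
    low, high = wilson_ci(events, n)
    print(f"{year}: {events / n:.1%} (95% CI {low:.1%}, {high:.1%})")
```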
Propensity scores for multiple treatment levels20 were calculated
within each HbA1c category to account for heterogeneous baseline
characteristics among second-line ADD groups. Inverse probability of
these exposure weights21,22 was used to balance second-line treat-
ment groups with regard to age, sex, baseline HbA1c and baseline
CCI score. In patients without a history of cardiovascular disease,
chronic kidney disease or cancer at baseline, the probabilities (95%
CIs) of achieving glycaemic control (HbA1c <53 or 58 mmol/mol (7 or
7.5%)) at 6, 12 and 24 months after second-line treatment initiation
were estimated in the study cohort, sub-cohort 1 and sub-cohort
2, respectively. Three outcomes were assessed with multinomial
logistic regression: (1) no glycaemic control achievement at corre-
sponding time point; (2) glycaemic control achievement with a third
ADD addition within the analysis time window; and (3) glycaemic
achievement without a third ADD addition within the analysis time
window. Analyses were conducted by balancing the data as described
above, with additional covariate adjustments for age, sex, and time
from metformin to second-line treatment, separately for the HbA1c
categories of 58–63.9 mmol/mol (7.5–7.9%); 64–75 mmol/mol
(8–9%); and 75.1–108 mmol/mol (9.1–12%).
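The balancing idea behind inverse probability weighting can be illustrated with a deliberately simplified example. The study estimated multinomial propensity scores from age, sex, baseline HbA1c and CCI score; the sketch below instead uses the observed treatment frequency within a single hypothetical stratum variable, which is an assumption made only to keep the example self-contained.

```python
from collections import Counter, defaultdict


def ip_weights(rows):
    """rows: list of (stratum, treatment). Returns one weight per row.

    The propensity P(treatment | stratum) is estimated by the observed
    frequency within each stratum; the weight is its inverse.
    """
    counts = defaultdict(Counter)
    for stratum, treatment in rows:
        counts[stratum][treatment] += 1
    weights = []
    for stratum, treatment in rows:
        propensity = counts[stratum][treatment] / sum(counts[stratum].values())
        weights.append(1.0 / propensity)
    return weights


def weighted_share(rows, weights, treatment, stratum):
    """Weighted fraction of a treatment group that belongs to a stratum."""
    total = sum(w for (s, t), w in zip(rows, weights) if t == treatment)
    inside = sum(w for (s, t), w in zip(rows, weights)
                 if t == treatment and s == stratum)
    return inside / total


# Hypothetical data: younger patients are more likely to receive a GLP-1RA.
rows = ([("younger", "SU")] * 2 + [("younger", "GLP-1RA")] * 6
        + [("older", "SU")] * 8 + [("older", "GLP-1RA")] * 4)
weights = ip_weights(rows)

for drug in ("SU", "GLP-1RA"):
    print(drug, round(weighted_share(rows, weights, drug, "younger"), 3))
```

Before weighting, 20% of the SU group and 60% of the GLP-1RA group are younger; after weighting, both groups match the overall share of 40%, which is the sense in which the weights balance the treatment groups.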
In patients with baseline HbA1c concentrations 58–63.9 mmol/
mol (7.5–7.9%) who achieved an HbA1c target of 53 mmol/mol (7%)
at 6 months without addition of a third ADD, the probabilities of sus-
taining HbA1c control over 24 months were estimated after the bal-
ancing and adjustments described above. Similarly, in patients with
baseline HbA1c of 64–75 mmol/mol (8–9%) who achieved HbA1c of
<58 mmol/mol (<7.5%) at 6 months without addition of a third ADD,
the adjusted probabilities of sustaining HbA1c control over
24 months were estimated. Finally, in patients with baseline HbA1c
concentration of 75.1–108 mmol/mol (9.1–12%) who achieved
HbA1c <58 mmol/mol (<7.5%) at 6 months with or without third-line
treatment intensification, the adjusted probabilities of sustaining
HbA1c control (irrespective of third-line ADD status) over 24 months
were estimated. The assessment of achieving HbA1c <53 mmol/mol
(<7%) in this category was considered clinically unrealistic.
Sensitivity analyses included an intention-to-treat evaluation and
separate assessment in patients with comorbidities at baseline.
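The three-level outcome and the sustainability measure described in this section can be sketched as follows. This is an unadjusted illustration on hypothetical patients; it assumes that sustainability is assessed independently at 12 and 24 months, which the text does not spell out, and it omits the weighting and covariate adjustment.

```python
def outcome(hba1c, target, third_add_month, month):
    """Classify one patient at one time point into the three outcomes."""
    if hba1c >= target:
        return "no control"
    if third_add_month is not None and third_add_month <= month:
        return "control with third ADD"
    return "control without third ADD"


def sustained_without_third(patients, target, months=(12, 24)):
    """Among 6-month achievers without a third ADD, the fraction still
    controlled without a third ADD at each later time point."""
    wanted = "control without third ADD"
    achievers = [p for p in patients
                 if outcome(p["hba1c"][6], target, p["third_add"], 6) == wanted]
    if not achievers:
        raise ValueError("no patients achieved control at 6 months")
    return {
        m: sum(outcome(p["hba1c"][m], target, p["third_add"], m) == wanted
               for p in achievers) / len(achievers)
        for m in months
    }


# Hypothetical patients: HbA1c (%) by month, and month of third ADD (or None).
patients = [
    {"hba1c": {6: 6.8, 12: 6.9, 24: 6.7}, "third_add": None},
    {"hba1c": {6: 6.9, 12: 7.2, 24: 7.4}, "third_add": None},
    {"hba1c": {6: 6.5, 12: 6.6, 24: 6.8}, "third_add": 20},
    {"hba1c": {6: 7.3, 12: 6.9, 24: 6.8}, "third_add": None},
]
result = sustained_without_third(patients, target=7.0)
print(result)
```

Three of the four hypothetical patients achieve the target at 6 months without a third ADD; two of those three sustain it at 12 months and one at 24 months.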
3 | RESULTS
From 2 624 954 identified patients with type 2 diabetes, 195 720
initiated second-line ADD after metformin and had available HbA1c
measurements (Figure S1). Of these, 85%, 79%, 77%, 83% and 83%
in the SU, DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respec-
tively, continued therapy for at least 6 months. The study cohort
included 90 572, 29 308, 6696, 21 827 and 14 678 patients in the
SU, DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respectively
(Table 1). On average, the progression to a second ADD occurred
9 months after metformin initiation. Available follow-up years from
baseline were 4.0, 3.2, 3.7, 3.5 and 5.6 years in the SU, DPP-4 inhibi-
tor, GLP-1RA, insulin and TZD groups, respectively, and 84% of
patients continued therapy for at least 1 year. The distributions of
age, sex, BMI and comorbidities at baseline were significantly different
among the second-line ADDs (Table 1).
The distribution of HbA1c categories at baseline was heteroge-
neous among the treatment groups (Table 1). With a mean
(SD) cohort HbA1c level of 8.4 (1.9)% (68 mmol/mol) at second-line
therapy initiation, the proportions of patients with baseline HbA1c
<64 mmol/mol (<8%) were 52%, 58%, 67%, 36% and 66% in the SU,
DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respectively.
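The paired units quoted throughout (for example 8.4% and 68 mmol/mol) are consistent with the standard NGSP-to-IFCC master equation, mmol/mol = 10.929 × (% − 2.15). The text does not state the conversion used, so the sketch below is an assumption, and the function name is illustrative.

```python
def percent_to_mmol_per_mol(pct):
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol)."""
    return 10.929 * (pct - 2.15)


# Thresholds and means quoted in the text, in %.
values = (7.0, 7.5, 8.0, 8.4, 9.0, 12.0)
converted = [round(percent_to_mmol_per_mol(v)) for v in values]
print(dict(zip(values, converted)))
```

Because the relationship is linear with an offset, a reduction of 1 percentage point corresponds to about 10.9 mmol/mol regardless of the starting value.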
3.1 | Treatment intensification with a third drug
Overall, 52% in the cohort had a third ADD prescribed (either in addi-
tion to or as a switch from a second ADD) during the available
follow-up. On average, the progression to a third ADD occurred at
15 months after second-line treatment initiation (Table 2). Of those
who initiated a third drug, 88% added it to dual therapy (ranging from
70% in the insulin group to 94% in the GLP-1RA group), while only
12% ceased the second ADD and switched to a third agent.
By study design, patients who switched to a third agent within
6, 12 or 24 months were not included in the study cohort, sub-cohort
1 or sub-cohort 2, respectively. During 6 months of therapy post
baseline, 27%, 21%, 26%, 12% and 29% of patients in the SU, DPP-4
inhibitor, GLP-1RA, insulin and TZD groups, respectively, added a
third-line therapy (Table 2). Insulin was the most popular third-line
ADD, followed by DPP-4 inhibitors. Of those who added a third drug,
insulin was chosen by 26%, 36%, 69% and 32% of patients in the SU,
DPP-4 inhibitor, GLP-1RA and TZD groups, respectively (Table 2).
Among those who continued the second-line therapy for 12 months
(sub-cohort 1) and for 24 months (sub-cohort 2), 30% and 39% added
a third-line therapy respectively.
3.2 | Temporal pattern of initiating third-line ADDs
Irrespective of the class of second-line ADD, the proportions of
patients who initiated a third ADD within 2 years of baseline are
shown in Figure 1A (“All”), stratified by calendar year of second-line
initiation. Figure 1 also shows those who intensified treatment with a
third ADD, excluding TZD as second-line group (“All without TZD”)
because a large proportion of patients ceased TZD treatment as a
result of cardiovascular safety concerns23–25 and not necessarily
TABLE 1 Characteristics of patients at initiation of second-line antidiabetic drug

| Characteristic | Metformin + SU | Metformin + DPP-4 inhibitor | Metformin + GLP-1RA | Metformin + insulin | Metformin + TZD | All |
|---|---|---|---|---|---|---|
| N | 90 572 | 29 308 | 6696 | 21 827 | 14 678 | 163 081 |
| Mean (SD) age, years | 59 (12) | 57 (12) | 53 (11) | 56 (13) | 57 (11) | 57 (12) |
| Men, n (%) | 46 005 (51) | 14 330 (49) | 2354 (35) | 9858 (45) | 7782 (53) | 80 329 (49) |
| White ethnicity, n (%) | 63 338 (70) | 20 366 (69) | 5100 (76) | 14 267 (65) | 10 256 (70) | 113 327 (69) |
| Black ethnicity, n (%) | 11 703 (13) | 3618 (12) | 616 (9) | 3690 (17) | 1434 (10) | 21 061 (13) |
| Mean (SD) time from metformin to second drug, mo | 8.9 (16.8) | 12.8 (19.3) | 12.1 (18.3) | 5.8 (14.2) | 4.8 (11.7) | 9.0 (16.8) |
| Mean (SD) follow-up from baseline, years | 4.03 (2.49) | 3.22 (1.95) | 3.66 (2.39) | 3.46 (2.24) | 5.57 (2.80) | 3.93 (2.47) |
| Mean (SD) therapy duration from baseline, mo | 38.3 (26.3) | 29.6 (20.1) | 28.4 (21.0) | 37.9 (26.0) | 35.8 (26.1) | 36.0 (25.3) |
| Therapy duration ≥12 mo, n (%) | 77 779 (86) | 23 327 (80) | 5061 (76) | 18 729 (86) | 12 040 (82) | 136 936 (84) |
| Therapy duration ≥24 mo, n (%) | 56 324 (62) | 14 746 (50) | 3090 (46) | 13 472 (62) | 8297 (57) | 95 929 (59) |
| Mean (SD) HbA1c, % | 8.4 (1.8) | 8.2 (1.7) | 7.8 (1.6) | 9.3 (2.3) | 7.9 (1.7) | 8.4 (1.9) |
| Mean HbA1c, mmol/mol | 68 | 66 | 62 | 78 | 63 | 68 |
| HbA1c category, n (%) | | | | | | |
| 53–63.9 mmol/mol (7–7.9%) | 26 493 (29) | 10 112 (35) | 1953 (29) | 4034 (18) | 4139 (28) | 46 731 (29) |
| 64–75 mmol/mol (8–9%) | 18 701 (21) | 5726 (20) | 1027 (15) | 3838 (18) | 2295 (16) | 31 587 (19) |
| 75.1–108 mmol/mol (9.1–12%) | 20 148 (22) | 5373 (18) | 989 (15) | 7432 (34) | 2183 (15) | 36 125 (22) |
| >108 mmol/mol (>12%) | 4695 (5) | 1227 (4) | 166 (2) | 2798 (13) | 504 (3) | 9390 (6) |
| Mean (SD) weight, kg | 98.3 (24.5) | 98.9 (24.2) | 109.5 (25.9) | 99.8 (26.2) | 100.2 (23.9) | 99.3 (24.8) |
| Mean (SD) BMI, kg/m2 | 34.5 (7.7) | 34.5 (7.6) | 38.5 (8.1) | 35.2 (8.4) | 34.8 (7.6) | 34.8 (7.8) |
| BMI category [a], n (%) | | | | | | |
| Normal | 5803 (7) | 1841 (6) | 89 (1) | 1712 (8) | 827 (6) | 10 272 (6) |
| Overweight | 20 477 (23) | 6567 (23) | 669 (10) | 4217 (20) | 3130 (22) | 35 060 (22) |
| Grade 1 | 25 568 (29) | 8570 (30) | 1661 (25) | 5697 (27) | 4029 (29) | 45 525 (29) |
| Grade ≥2 | 35 853 (41) | 11 788 (41) | 4144 (63) | 9587 (45) | 6012 (43) | 67 384 (43) |
| Mean (SD) SBP, mm Hg | 131 (15) | 129 (13) | 128 (13) | 130 (15) | 130 (14) | 130 (14) |
| SBP ≥140 mm Hg, n (%) | 22 164 (25) | 5807 (20) | 1084 (17) | 5022 (23) | 3088 (22) | 37 165 (23) |
| Mean (SD) LDL cholesterol, mg/dL | 98 (34) | 98 (35) | 95 (34) | 99 (37) | 97 (34) | 98 (35) |
| Mean (SD) HDL cholesterol, mg/dL | 43 (12) | 44 (12) | 44 (12) | 43 (13) | 45 (12) | 43 (12) |
| Median (IQR) triglycerides, mg/dL | 150 (109, 199) | 147 (108, 196) | 150 (109, 200) | 144 (103, 197) | 139 (101, 190) | 147 (107, 198) |
| CVD, CKD or cancer, n (%) | 23 281 (26) | 7223 (25) | 1205 (18) | 5870 (27) | 2982 (20) | 40 561 (25) |
| CVD, n (%) | 18 031 (20) | 5406 (18) | 852 (13) | 4675 (21) | 2276 (16) | 31 240 (19) |
| CKD, n (%) | 3750 (4) | 1205 (4) | 151 (2) | 811 (4) | 431 (3) | 6348 (4) |
| Cancer, n (%) | 4469 (5) | 1628 (6) | 285 (4) | 1103 (5) | 552 (4) | 8037 (5) |
| Neuropathy, n (%) | 7153 (8) | 2080 (7) | 519 (8) | 2305 (11) | 879 (6) | 12 936 (8) |
| Retinopathy, n (%) | 1329 (1) | 288 (1) | 76 (1) | 535 (2) | 166 (1) | 2394 (1) |
| Depression, n (%) | 14 925 (16) | 5427 (19) | 1576 (24) | 4200 (19) | 2145 (15) | 28 273 (17) |
| Mean (SD) CCI | 1.7 (1.1) | 1.7 (1.1) | 1.6 (0.9) | 1.8 (1.2) | 1.5 (0.9) | 1.7 (1.1) |

Abbreviations: BMI, body mass index; CCI, Charlson comorbidity index; CVD, cardiovascular disease; CKD, chronic kidney disease; DBP, diastolic blood pressure; DPP-4, dipeptidyl peptidase-4; GLP-1RA, glucagon-like peptide-1 receptor agonist; HbA1c, glycated haemoglobin; IQR, interquartile range; SBP, systolic blood pressure; SU, sulphonylurea; TZD, thiazolidinedione.
[a] BMI category: normal: <25 kg/m2; overweight: ≥25 and <30 kg/m2; Grade 1: ≥30 and <35 kg/m2; Grade ≥2: ≥35 kg/m2.
TABLE 2 Third-line anti-diabetic drug usage in the study cohort and two sub-cohorts [a]

| | Statistic | Metformin + SU | Metformin + DPP-4 inhibitor | Metformin + GLP-1RA | Metformin + insulin | Metformin + TZD | All |
|---|---|---|---|---|---|---|---|
| Study cohort | N | 90 572 | 29 308 | 6696 | 21 827 | 14 678 | 163 081 |
| Initiated third drug | n (% from N) | 49 255 (54) | 15 248 (52) | 3513 (52) | 7275 (33) | 10 006 (68) | 85 297 (52) |
| Time from second to third drug, mo | Mean (SD) | 14.3 (19.5) | 14.3 (16.0) | 13.2 (17.8) | 17.7 (19.3) | 18.4 (23.1) | 15.0 (19.4) |
| Added third drug within 6 mo | n1 (% from N) | 24 600 (27) | 6053 (21) | 1725 (26) | 2627 (12) | 4260 (29) | 39 265 (24) |
| • Most popular third drug | Name; n (% from n1) | TZD; 8107 (33) | Insulin; 2200 (36) | Insulin; 1189 (69) | SU; 888 (34) | Insulin; 1352 (32) | Insulin; 11 054 (28) |
| • Second most popular third drug | Name; n (% from n1) | DPP-4 inhibitor; 7455 (30) | SU; 2073 (34) | SU; 193 (11) | DPP-4 inhibitor; 703 (27) | DPP-4 inhibitor; 1236 (29) | DPP-4 inhibitor; 9499 (24) |
| Sub-cohort 1 | N2 | 77 779 | 23 327 | 5061 | 18 729 | 12 040 | 136 936 |
| Added third drug within 6 mo | n (% from N2) | 20 990 (27) | 4581 (20) | 1300 (26) | 2220 (12) | 3450 (29) | 32 541 (24) |
| Added third drug within 6 to 12 mo | n2 (% from N2) | 4265 (5) | 1860 (8) | 293 (6) | 1076 (6) | 682 (6) | 8176 (6) |
| • Most popular third drug | Name; n (% from n2) | DPP-4 inhibitor; 1737 (41) | SU; 975 (52) | SU; 104 (35) | SU; 336 (31) | SU; 340 (50) | DPP-4 inhibitor; 2217 (27) |
| • Second most popular third drug | Name; n (% from n2) | Insulin; 1074 (25) | Insulin; 267 (14) | Insulin; 62 (21) | GLP-1RA; 269 (25) | DPP-4 inhibitor; 160 (23) | SU; 1755 (21) |
| Sub-cohort 2 | N3 | 56 324 | 14 746 | 3090 | 13 472 | 8297 | 95 929 |
| Added third drug within 6 mo | n (% from N3) | 15 074 (27) | 2549 (17) | 800 (26) | 1521 (11) | 2309 (28) | 22 253 (23) |
| Added third drug within 6 to 12 mo | n (% from N3) | 2867 (5) | 1124 (8) | 168 (5) | 756 (6) | 471 (6) | 5386 (6) |
| Added third drug within 12 to 24 mo | n3 (% from N3) | 5302 (9) | 1833 (12) | 319 (10) | 1070 (8) | 645 (8) | 9169 (10) |
| • Most popular third drug | Name; n (% from n3) | DPP-4 inhibitor; 2356 (44) | SU; 959 (52) | SU; 113 (35) | SU; 301 (28) | SU; 297 (46) | DPP-4 inhibitor; 2876 (31) |
| • Second most popular third drug | Name; n (% from n3) | Insulin; 1225 (23) | SGLT2; 274 (15) | Insulin; 63 (20) | DPP-4 inhibitor; 269 (25) | DPP-4 inhibitor; 201 (31) | SU; 1670 (18) |

Abbreviations: DPP-4, dipeptidyl peptidase-4; GLP-1RA, glucagon-like peptide-1 receptor agonist; SU, sulphonylurea; TZD, thiazolidinedione.
[a] Duration of second-line agent ≥6 months/≥12 months/≥24 months in the study cohort/sub-cohort 1/sub-cohort 2, respectively.
FIGURE 1 Among patients who had at least 2 years of follow-up in the study cohort, the proportion (95% confidence interval) of patients who
initiated third antidiabetic drug (ADD) within 2 years of second ADD: A, irrespective of HbA1c level at second-line initiation; B, among those with HbA1c of 64.0–75.0 mmol/mol (8–9%) at second-line initiation; and C, among those with HbA1c of 75.1–108.0 mmol/mol (9.1–12%) at second-line initiation. HbA1c, glycated haemoglobin; INS, insulin; TZD, thiazolidinedione
because of efficacy issues. The figure also provides the data exclud-
ing those who had a TZD or insulin as second-line (“All without
TZD & insulin”) to explore the possible change in intensification rate
with non-insulin ADDs over time, accounting for decreasing popular-
ity of TZDs. Figure 1B and C focus on those who had baseline HbA1c
of 64–75 mmol/mol (8–9%) and 75.1–108 mmol/mol (9.1–12%),
respectively. Figure 1 shows that between 2007 and 2014 the pro-
portion of patients initiating a third ADD, within 2 years of adding
the second ADD, decreased; however, this decline started to reverse
in 2014, especially among those whose HbA1c was 75.1–108 mmol/
mol (9.1–12%) at initiation of the second ADD.
3.3 | Glycaemic achievements and sustainability
At 6 months, the mean unadjusted HbA1c reductions were 0.8%,
0.8%, 0.7%, 1.0% and 0.8% in the SU, DPP-4 inhibitor, GLP-1RA,
insulin and TZD groups, respectively. The mean adjusted reductions
at 6 months were 0.8%, 1.0%, 1.1%, 0.7% and 1.0% in the respective
treatment groups (significant for all groups, P < .01).
3.3.1 | Baseline HbA1c group 58–63.9 mmol/mol (7.5–7.9%)
Among patients with HbA1c concentrations of 58–63.9 mmol/mol
(7.5–7.9%) at baseline, 44%, 47%, 57%, 31% and 57% of patients in
the SU, DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respec-
tively, achieved HbA1c <53 mmol/mol (<7%) at 6 months without
third-line treatment addition. The corresponding adjusted probabili-
ties were 32%, 38%, 39%, 26% and 38% in the second-line treatment
groups (P < .01 for all groups [Figure 2A]); however, the probabilities
of reducing HbA1c below the target 53 mmol/mol (7%) without
third-line ADD intensification declined by 5%, 5%, 6%, 2% and 1% at
12 months and by 9%, 8%, 15%, 5% and 7% at 24 months in the SU,
DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respectively.
Among those who reduced HbA1c to <53 mmol/mol (<7%) with-
out a third ADD at 6 months, 68% and 58% of patients sustained this
glycaemic achievement at 12 and 24 months, respectively. The prob-
ability of sustaining this glycaemic achievement was higher and simi-
lar in the GLP-1RA and TZD groups at 12 months (range of 95% CI
of probability: 76%, 79%), compared with other second-line therapy
options (Figure 2B). While the probability of sustaining this glycaemic
control declined significantly by 24 months, GLP-1RAs, DPP-4 inhibi-
tors and TZDs provided significantly higher chances of sustainability
(range of 95% CI of probability: 53%, 58%) compared with patients
treated with insulin or SUs (range of 95% CI of probability:
46%, 50%).
3.3.2 | Baseline HbA1c 64–75 mmol/mol (8–9%)
Among patients with baseline HbA1c concentrations of 64–75
mmol/mol (8–9%), 55%, 58%, 66%, 41% and 67% of patients in the
SU, DPP-4 inhibitor, GLP-1RA, insulin and TZD groups achieved
HbA1c <58 mmol/mol (<7.5%) at 6 months without third-line ADD
addition, and the corresponding adjusted probabilities were 38%,
44%, 40%, 34% and 42%, respectively (Figure 2C). The probabilities
of this glycaemic achievement declined significantly by at least 5%
across all treatment groups at 12 months, and by at least 8% at
24 months.
Among those who achieved an HbA1c concentration <58 mmol/
mol (<7.5%) without a third ADD at 6 months, 76% and 67% sus-
tained this glycaemic achievement at 12 and 24 months, respectively,
without requiring third-line treatment intensification. The probability
of sustaining this glycaemic achievement was significantly higher in
the GLP-1RA and TZD groups at 12 months (range of 95% CI of
probability: 76%, 79%), compared with other second-line ADDs
(Figure 2D; P < .01). While the probability of sustaining this glycae-
mic control declined significantly by 24 months of therapy across all
groups, patients treated with insulin had the lowest probability of
sustaining the glycaemic control.
3.3.3 | Baseline HbA1c 75.1–108 mmol/mol (9.1–12%)
In the patients with 75.1–108 mmol/mol (9.1–12%) baseline HbA1c,
29%, 36% and 45% added a third ADD within 6, 12 and 24 months
of baseline, respectively. Irrespective of third ADD status, 37%, 45%,
38%, 21% and 43% of patients in the SU, DPP-4 inhibitor, GLP-1RA,
insulin and TZD groups, respectively, achieved HbA1c <58 mmol/mol
(<7.5%) at 6 months, with corresponding probabilities of 36%, 45%,
38%, 33% and 43% (Figure 2E). The probability of achieving an
HbA1c concentration <58 mmol/mol (<7.5%) at 24 months reduced
by 4% for insulin users, did not change in the SU and DPP-4 inhibitor
groups, and increased by 8% and 9% in the second-line GLP-1RA and
TZD groups (all P < .01). Among those who achieved an HbA1c con-
centration <58 mmol/mol (<7.5%) at 6 months, 72% and 58%,
respectively, sustained this glycaemic achievement at 12 and
24 months, irrespective of third-line treatment intensification status.
The probability of sustaining glycaemic control at <58 mmol/mol
(<7.5%) over 12 and 24 months of treatment was significantly higher
in the incretin and TZD groups, while insulin and SU offered lower
chances of sustainable control (Figure 2F).
3.3.4 | Baseline HbA1c >108 mmol/mol (>12%)
In patients with baseline HbA1c >108 mmol/mol (>12%), the proba-
bility of reducing HbA1c by at least 2% increased over time: 82% at
2 years of insulin therapy, and ~90% for other second-line choices.
The probabilities of reducing HbA1c by at least 1.5% in this baseline
HbA1c group were not significantly different among the ADD groups
over 2 years (results not shown).
3.4 | Sensitivity analyses
An intention-to-treat approach obtained similar results to those of
the main analyses. Patients with cardiovascular disease, chronic kid-
ney disease or cancer at baseline had marginally higher probabilities
of glycaemic achievements in all treatment groups, compared with
those without comorbidities (results not shown).
4 | DISCUSSION
The novelty of the present pharmaco-epidemiological study, with
real-world population-level data, is its evaluation of short- and long-
term glycaemic control with post-metformin major second-line ADDs,
and the comparison of the sustainability of such glycaemic goals over
FIGURE 2 At 6, 12, and 24 months of second-line initiation, adjusted probability (95% confidence interval) of A, reducing glycated haemoglobin
(HbA1c) below 53 mmol/mol (7%) without adding third anti-diabetic drug (ADD), from baseline HbA1c of 58–63.9 mmol/mol (7.5–7.9%); B, sustaining 6-month achievement without adding a third ADD; C, reducing HbA1c below 58 mmol/mol (7.5%) without adding a third ADD, from baseline HbA1c of 64–75 mmol/mol (8–9%); D, sustaining 6-month achievement without adding a third ADD; E, reducing HbA1c below 58 mmol/mol (7.5%) (irrespective of third ADD), from baseline HbA1c of 75.1–108 mmol/mol (9.1–12%); and F, sustaining 6-month achievement (irrespective of third ADD). DPP-4, dipeptidyl peptidase-4; GLP-1RA, glucagon-like peptide-1 receptor agonist; INS, insulin; MET, metformin; SU, sulphonylurea; TZD, thiazolidinedione
24 months of continuous treatment. Among patients with HbA1c
concentrations of 58–63.9 mmol/mol (7.5–7.9%) at second-line ADD
initiation, the probabilities of achieving an HbA1c concentration of
<53 mmol/mol (<7%) without adding a third-line ADD at 6 and
12 months were significantly higher in the incretin and TZD groups,
compared with the insulin and SU groups. Treatment with incretins
or TZDs also offered a significantly higher probability of sustaining
this glycaemic achievement over 24 months of treatment without the
need for further therapy intensification. Among those who initiated a
second-line ADD at HbA1c levels of 64–75 mmol/mol (8–9%), DPP-
4 inhibitors and TZDs offered significantly higher and similar chances
of reducing HbA1c to <58 mmol/mol (<7.5%) over 24 months of
therapy without adding a third ADD, compared with other second-
line groups. GLP-1RAs and TZDs offered the highest chances of sus-
taining this control over 24 months, while treatment with SUs, insulin
and DPP-4 inhibitors provided significantly lower sustainability
chances.
In this real-world study, we observed similar performance by
DPP-4 inhibitors and GLP-1RAs in terms of the probability of reduc-
ing HbA1c to a clinically desirable glycaemic target over 24 months
of therapy, when added to metformin. In terms of sustaining the gly-
caemic achievements over 12 months, GLP-1RAs appear to offer
higher chances among patients with HbA1c <75 mmol/mol (<9%) at
second-line initiation (~76%-79% probability), compared with DPP-4
inhibitors (~68%-73% probability); however, this difference disap-
pears at 24 months of therapy. While SUs as second-line therapy
offer a higher probability of achieving desirable glycaemic control
across all HbA1c categories (<108 mmol/mol (<12%)) compared with
insulin over 2 years, the probability of sustaining the early glycaemic
achievement appears to be similar between these two therapy
options. We have seen that, across all HbA1c categories, treatment
with second-line TZDs provided better or similar glycaemic achieve-
ments and sustainability, compared with other therapy options. This
result supports the study by Mamza et al,26 who reported that treat-
ment with post-metformin TZD provides the most durable glycaemic
response compared with second-line SU and DPP-4 inhibitor treat-
ment. Recent results of the TOSCA.IT trial, providing cardiovascular
safety reassurance with pioglitazone, taken in conjunction with our
results may increase the popularity of TZDs as a therapeutic
option.27
Compared with sulphonylurea add-on to metformin, Thomsen
et al8 reported a higher likelihood of achieving HbA1c below 53
mmol/mol (7%) at 6 months for second-line GLP-1RA users (Relative
Risk (95% CI) of 1.10 (1.01, 1.19)), and lower likelihoods for DPP-4
inhibitor (Relative Risk (95% CI) 0.94 (0.89, 0.99)) and insulin users
(Relative Risk (95% CI) 0.88 (0.77, 0.99)). Our results are closer to
those of the study conducted by Rathmann et al,7 who reported odds
ratios (with SU as reference) of achieving HbA1c below 53 mmol/mol
(7%) of 1.2, 1.4, 1.7 and 0.7 for second-line DPP-4 inhibitors, GLP-
1RAs, TZDs and insulin, respectively.
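Relative risks and odds ratios are not on the same scale when the outcome is common, which matters when the two sets of published estimates above are set side by side. As a rough aid, a widely used approximation converts an odds ratio to a relative risk given the outcome probability in the reference group: RR = OR / (1 − p0 + p0 × OR). The sketch below is illustrative only; the reference-group probability `p0` is a hypothetical value, not one reported by either study.

```python
def odds_ratio_to_relative_risk(odds_ratio, p0):
    """Approximate RR from OR, given outcome probability p0 in the
    reference group (0 < p0 < 1)."""
    if not 0 < p0 < 1:
        raise ValueError("p0 must lie strictly between 0 and 1")
    return odds_ratio / (1 - p0 + p0 * odds_ratio)


# Hypothetical reference-group probability of reaching target: 40%.
p0 = 0.40
for label, odds_ratio in (("DPP-4 inhibitor", 1.2), ("GLP-1RA", 1.4),
                          ("TZD", 1.7), ("insulin", 0.7)):
    rr = odds_ratio_to_relative_risk(odds_ratio, p0)
    print(f"{label}: OR {odds_ratio} ~ RR {rr:.2f}")
```

The conversion pulls every estimate towards 1, so an odds ratio will always look further from the null than the corresponding relative risk when the outcome is common.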
Our findings are also in line with a study that used data from the
National Health and Nutrition Examination Survey, which reported
that only half of patients achieve HbA1c below 53 mmol/mol (7%).28
Furthermore, in patients with HbA1c <75 mmol/mol (<9%) at
second-line initiation, we observed that only 30% maintained
glycaemic control after 2 years of continuous treatment without fur-
ther intensification with a third ADD.
Comparatively poor performance of insulin as a second-line
agent may be surprising, as randomized controlled trial data show
that insulin can achieve at least as much HbA1c reduction as other
agents. A possible reason for this is that insulin is often chosen when
there are multiple comorbidities, and in such patients, the HbA1c tar-
get may be higher, and many other potential third-line ADDs may be
contra-indicated. In addition, the insulin dose may be inadequately
titrated because of adverse effects, such as hypoglycaemia and
weight gain, as well as inadequate healthcare professional support for
the regular titration of insulin doses. More work needs to be carried
out to determine how best to translate the clinical trial efficacy of
insulin into clinical practice effectiveness.
We observed that the proportion of patients who intensify treat-
ment with a third ADD has decreased only moderately during the last
decade, despite the increasing availability of newer agents. Lipska
et al29 reported that overall glycaemic control in the United States
did not change between 2006 and 2013.
A strength of the present study is the availability of data from
patients' medication lists that include prescribed medications within
the EMR network and also medications that could be prescribed out-
side of the EMR. Furthermore, the CEMR database tracks longitudinal
treatment adjustments, and contains comprehensive clinical informa-
tion, which is usually not available in claims databases. In addition, we
used advanced data mining and statistical methods. Given unequal
probabilities of receiving particular second-line agents in the real-
world scenario, we modelled treatment assignment with multinomial
propensity scores, and then assessed the adjusted outcomes of the
study.
The limitations of this study include the non-availability of data
on: (1) adherence and side-effects; (2) diet and exercise; (3) socio-
economic status; and (4) insurance type. Carls and colleagues31
highlighted alarmingly low rates of medication adherence as the main
cause of the disconnect between results of real-world studies and
clinical trials. Importantly, in the present study, we focused only on
those who continued the second-line therapy for a minimum of
6 months. Montvida et al4 recently reported higher discontinuation
rates for incretins, compared with older treatment alternatives.
To conclude, incretin-based therapies and TZDs offered a higher
probability of long-term glycaemic achievements and of sustaining
these, compared with SUs and insulin for metformin-treated patients
with type 2 diabetes. While the results of a large randomized
controlled trial (GRADE) comparing glycaemic efficacy of major
second-line therapies are not expected before 2020, the present
study provides much-needed information to patients and clinicians
with regard to the probability of sustainable glycaemic control with
different therapy options.30
ACKNOWLEDGMENTS
Melbourne EpiCentre gratefully acknowledges the support from the
Australian Government Department of Education's National Collabo-
rative Research Infrastructure Strategy (NCRIS) initiative through
Therapeutic Innovation Australia. O.M. acknowledges the PhD
scholarship from Queensland University of Technology, Australia, and
her co-supervisors Prof. Ross Young and Prof. Louise Hafner of the
same University. No separate funding was obtained for this study.
Conflict of interest
S.K.P. has acted as a consultant and/or speaker for Novartis, GI
Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi Pharmaceutical
and Amylin Pharmaceuticals LLC. He has received grants in support
of investigator and investigator-initiated clinical studies from Merck,
Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals,
Sanofi-Aventis and Pfizer. O.M. has no conflict of interest to declare.
J.E.S. has received honoraria or grant support from Merck Sharp and
Dohme, Novo Nordisk, Eli Lilly, AstraZeneca, Sanofi-Aventis, Mylan
Pharmaceuticals and Boehringer Ingelheim.
Author contributions
O.M. and S.K.P. were responsible for the primary design of the study.
O.M. conducted the data extraction. O.M. and S.K.P. jointly conducted
the statistical analyses. The first draft of the manuscript was devel-
oped by O.M. and S.K.P., and all authors contributed to the finaliza-
tion of the manuscript. S.K.P. had full access to all the data in the
study and takes responsibility for the integrity of the data and the
accuracy of the data analysis.
ORCID
Jonathan E. Shaw http://orcid.org/0000-0002-6187-2203
Lawrence Blonde http://orcid.org/0000-0003-0492-6698
Sanjoy K. Paul http://orcid.org/0000-0003-0848-7194
REFERENCES
1. American Diabetes Association. Standards of medical care in diabetes–2017: summary of revisions. Diabetes Care. 2017;40:S4-S5.
2. Turner RC, Cull CA, Frighi V, Holman RR, Group UPDS. Glycemic control with diet, sulfonylurea, metformin, or insulin in patients with type 2 diabetes mellitus: progressive requirement for multiple therapies (UKPDS 49). JAMA. 1999;281:2005-2012.
3. Garber AJ, Abrahamson MJ, Barzilay JI, et al. Consensus statement by the American Association of Clinical Endocrinologists and American College of Endocrinology on the comprehensive type 2 diabetes management algorithm–2017 executive summary. Endocr Pract. 2017;23:207-238.
4. Montvida O, Shaw J, Atherton JJ, Stringer F, Paul SK. Long-term trends in antidiabetes drug usage in the US: real-world evidence in patients newly diagnosed with type 2 diabetes. Diabetes Care. 2018;41:69-78.
5. Giugliano D, Maiorino MI, Bellastella G, Esposito K. Comment on Edelman and Polonsky. Type 2 diabetes in the real world: the elusive nature of glycemic control. Diabetes Care. 2017;40:1425-1432. Diabetes Care 2018;41:e17.
6. McCarthy MI. Painting a new picture of personalised medicine for diabetes. Diabetologia. 2017;60:793-799.
7. Rathmann W, Bongaerts B, Kostev K. Change in glycated haemoglobin levels after initiating second-line therapy in type 2 diabetes: a primary care database study. Diabetes Obes Metab. 2016;18:840-843.
8. Thomsen RW, Baggesen LM, Søgaard M, et al. Early glycaemic control in metformin users receiving their first add-on therapy: a population-based study of 4,734 people with type 2 diabetes. Diabetologia. 2015;58:2247-2253.
9. Palmer SC, Mavridis D, Nicolucci A, et al. Comparison of clinical outcomes and adverse events associated with glucose-lowering drugs in patients with type 2 diabetes: a meta-analysis. JAMA. 2016;316:313-324.
10. Maruthur NM, Tseng E, Hutfless S, et al. Diabetes medications as monotherapy or metformin-based combination therapy for type 2 diabetes: a systematic review and meta-analysis. Ann Intern Med. 2016;164:740-751.
11. Bennett WL, Maruthur NM, Singh S, et al. Comparative effectiveness and safety of medications for type 2 diabetes: an update including new drugs and 2-drug combinations. Ann Intern Med. 2011;154:602-613.
12. Zhang Y, McCoy RG, Mason JE, Smith SA, Shah ND, Denton BT. Second-line agents for glycemic control for type 2 diabetes: are newer agents better? Diabetes Care. 2014;37:1338-1345.
13. Centers for Disease Control and Prevention. National Diabetes Statistics Report: estimates of diabetes and its burden in the United States. Atlanta, GA: US Department of Health and Human Services, 2014.
14. Crawford AG, Cote C, Couto J, et al. Comparison of GE Centricity electronic medical record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Popul Health Manag. 2010;13:139-150.
15. Brixner D, Said Q, Kirkness C, Oberg B, Ben-Joseph R, Oderda G. Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value in Health. 2007;10(s1).
16. Paul SK, Shaw J, Montvida O, Klein K. Weight gain in insulin treated patients by BMI categories at treatment initiation: new evidence from real-world data in patients with type 2 diabetes. Diabetes Obes Metab. 2016;18(12):1244-1252.
17. Montvida O, Arandjelović O, Reiner E, Paul SK. Data mining approach to estimate the duration of drug therapy from longitudinal electronic medical records. Open Bioinforma J. 2017;10:1-15.
18. Thomas G, Klein K, Paul S. Statistical challenges in analysing large longitudinal patient-level data: the danger of misleading clinical inferences with imputed data. J Indian Soc Agric Stat. 2014;68:39-54.
19. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43:1130-1139.
20. Ridgeway G, McCaffrey DF, Morral AR, Burgette LF, Griffin BA. Toolkit for Weighting and Analysis of Nonequivalent Groups: A Tutorial for the R TWANG Package. Santa Monica, CA: RAND Corporation, 2014. https://www.rand.org/pubs/tools/TL136z1.html.
21. Lunceford JK, Davidian M. Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Stat Med. 2004;23:2937-2960.
22. McCaffrey DF, Ridgeway G, Morral AR. Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychol Methods. 2004;9:403.
23. Woodcock J, Sharfstein JM, Hamburg M. Regulatory action on rosiglitazone by the US Food and Drug Administration. N Engl J Med. 2010;363:1489-1491.
24. Tanne JH. FDA places "black box" warning on antidiabetes drugs. BMJ. 2007;334:1237.
25. Nissen SE, Wolski K. Effect of rosiglitazone on the risk of myocardial infarction and death from cardiovascular causes. N Engl J Med. 2007;356:2457-2471.
26. Mamza J, Mehta R, Donnelly R, Idris I. Important differences in the durability of glycaemic response among second-line treatment options when added to metformin in type 2 diabetes: a retrospective cohort study. Ann Med. 2016;48:224-234.
27. Vaccaro O, Masulli M, Nicolucci A, Bonora E, Del Prato S, Maggioni AP, Rivellese AA, Squatrito S, Giorda CB, Sesti G, Mocarelli P. Effects on the incidence of cardiovascular events of the addition of pioglitazone versus sulfonylureas in patients with type 2 diabetes inadequately controlled with metformin (TOSCA.IT): a randomised, multicentre trial. Lancet Diabetes Endocrinol. 2017;5:887-897.
28. Edelman SV, Polonsky WH. Type 2 diabetes in the real world: the elusive nature of glycemic control. Diabetes Care. 2017;40:1425-1432.
29. Lipska KJ, Yao X, Herrin J, et al. Trends in drug utilization, glycemic control, and rates of severe hypoglycemia, 2006–2013. Diabetes Care. 2017;40:468-475.
30. Nathan DM, Buse JB, Kahn SE, et al. Rationale and design of the glycemia reduction approaches in diabetes: a comparative effectiveness study (GRADE). Diabetes Care. 2013;36:2254-2261.
31. Carls GS, Tuttle E, Tan R-D, et al. Understanding the gap between efficacy in randomized controlled trials and effectiveness in real-world use of GLP-1 RA and DPP-4 therapies in patients with type 2 diabetes. Diabetes Care. 2017;40:1469-1478.
SUPPORTING INFORMATION
Additional Supporting Information may be found online in the sup-
porting information tab for this article.
How to cite this article: Montvida O, Shaw JE, Blonde L,
Paul SK. Long-term sustainability of glycaemic achievements
with second-line antidiabetic therapies in patients with type 2
diabetes: A real-world study. Diabetes Obes Metab. 2018;
1–10. https://doi.org/10.1111/dom.13288
Chapter 9: Cardio-metabolic Risk Factor
Burden and Safety
Statement of Contribution of Co-Authors for Thesis by Published
Paper
The authors listed below have certified* that:
1. they meet the criteria for authorship in that they have participated in the conception,
execution, or interpretation, of at least that part of the publication in their field of
expertise;
2. they take public responsibility for their part of the publication, except for the
responsible author who accepts overall responsibility for the publication;
3. there are no other authors of the publication according to these criteria;
4. potential conflicts of interest have been disclosed to (a) granting bodies, (b) the editor
or publisher of journals or other publications, and (c) the head of the responsible
academic unit, and
5. they agree to the use of the publication in the student’s thesis and its publication on
the QUT’s ePrints site consistent with any limitations set by publisher requirements.
In the case of this chapter:
Olga Montvida, Xiaoling Cai, Sanjoy K Paul. Cardio-metabolic risk factor burden and safety
in patients with type 2 diabetes receiving intensified anti-diabetic and cardio-protective
therapies.
Contributor Statement of Contribution*
Olga Montvida Conceived the idea and was responsible for the primary
design of the study. Conducted the data extraction and
statistical analyses. Developed first draft and contributed
towards development of the manuscript.
Xiaoling Cai Contributed to the manuscript development.
Sanjoy K. Paul Conceived the idea, and was responsible for the primary
design of the study. Contributed to the statistical analyses.
Developed first draft and contributed towards development
of the manuscript.
29.06.2018 QUT Verified Signature
Principal Supervisor Confirmation
I have sighted email or other correspondence from all Co-authors confirming their certifying
authorship.
Sanjoy Ketan Paul 29.06.2018
Name Signature Date
ABSTRACT
Background: Individualized treatment of patients with type 2 diabetes requires detailed
evaluation of risk factor dynamics at population level. The aim was to evaluate persistent
glycaemic and cardiovascular (CV) risk factor burden over 2 years post treatment
intensification (TI).
Methods: From US Centricity Electronic Medical Records, 276,884 patients with type 2
diabetes who intensified metformin were selected. SBP ≥ 130 / 140 mmHg and LDL ≥ 70 /
100 mg/dL were defined as uncontrolled for those with / without a history of CV disease
(CVD) at TI. Triglycerides (Trig) ≥ 150 mg/dL and HbA1c ≥ 7.5% were defined as
uncontrolled. Longitudinal measures over 2 years post TI were used to define continuously
uncontrolled patients.
Findings: With a mean follow-up of 3.7 years, patients were 59 years old on average, 70% were
obese, and 22% had a history of CVD; 60 / 30 / 50 / 48% had uncontrolled HbA1c / SBP / LDL /
Trig at TI; 81% and 69% were receiving therapies for blood pressure and lipid control, respectively.
The proportion of patients with consistently uncontrolled HbA1c increased from 31% in 2005
to 41% in 2014. Among those on lipid-modifying drugs, 41% and 37% had consistently high
LDL and Trig over 2 years. Being on blood pressure control therapies, 29% had continuously
uncontrolled SBP. Among patients receiving cardio-protective therapies, 62% failed to
achieve control in HbA1c + LDL, 62% in HbA1c + Trig, and 55% in HbA1c + SBP over 2
years post TI. Rates per 1000 person-years of major adverse cardiovascular events were lower
among those who intensified metformin with GLP-1RA, compared to other therapies.
Interpretation: Among patients on multiple therapies for risk factor control, more than a
third had uncontrolled HbA1c, lipid and SBP levels, and 3 out of 5 had uncontrolled HbA1c
and at least one CV risk factor over 2 years post TI.
INTRODUCTION
Cardiovascular (CV) disease in patients with type 2 diabetes has received much attention
during the last decade and continues to do so, being the most common cause of death and
comorbidity among patients with diabetes 1,2. The efficient management of these patients
requires a multi-faceted approach to holistically control hyperglycaemia and CV risk factors
such as blood pressure, body weight, and lipids 3,4. A recent review by Khunti and colleagues
discusses current evidence on the CV benefits of early control of glucose, lipids, and blood
pressure 5.
While American and international guidelines consistently stress the importance of
cardio-metabolic risk factor control, population-level control has not improved during the last
decade in the US 6-8. Using data from the National Health and Nutrition Examination Survey,
Carls and colleagues reported that 57% of patients with diabetes achieved HbA1c < 7% during
2003-2006, compared with only 51% during 2011-2014 8. Similarly, in privately insured and
Medicare Advantage patients with type 2 diabetes, Lipska and colleagues reported a decline in
the proportion of patients with HbA1c < 7% from 56% in 2006 to 54% in 2013 7. Ali and
colleagues reported that only 14% of patients with diabetes had simultaneous control of
glucose, blood pressure, cholesterol and non-smoking status during 1999-2010 in the US 6.
Another study of 530,747 patients from the Diabetes Collaborative Registry reported that 83%
and 81% of patients had hypertension and hyperlipidaemia, respectively 3.
A significant proportion of patients with type 2 diabetes eventually intensify first-line
metformin in addition to using multiple cardio-protective medications; nonetheless, poor
cardio-metabolic risk factor control is common in these patients. While previous studies used
general populations of individuals with diabetes, to the best of our knowledge, no study has
holistically explored the patterns of risk factor control post therapy intensification at the
population level. In this context, assessment of those who continuously fail to control risk
factors would help to understand whether increasing evidence of the benefits of early control
and the introduction of newer classes of anti-diabetic drugs (ADDs) have helped to improve
population health during the last decade.
In patients with type 2 diabetes identified from the electronic medical records (EMRs) of US
primary and secondary ambulatory care systems, the aims of this study were to provide an
up-to-date exploration of the population-level (1) glycaemic and CV risk factor control at the
time of metformin intensification; (2) simultaneous control of glycaemic and CV risk factors
post metformin intensification; (3) persistent glycaemic and CV risk factor burden in those who
are using anti-diabetic and cardio-protective therapies; and (4) rates of major adverse
cardiovascular events by different second-line ADD choices.
RESEARCH DESIGN AND METHODS
Data Source
The Centricity EMR (CEMR) database was used in this study. It represents more than 35,000
solo practitioners, community clinics, academic medical centres, and large integrated delivery
networks across all US states. Patients in the database are generally representative of the
US population: among those who were active in the CEMR during 2015 and were older than
18 years, 11.6% were identified as having any type of diabetes. This estimate is very close
to that of the US National Diabetes Statistics (NDS) report, which estimated that 12.2% of the
adult population had diabetes in 2015 9. The database has been extensively used for academic
research worldwide 10-12.
For more than 34 million individuals, longitudinal EMRs were available from 1995 until
April 2016, with comprehensive patient-level information on demographic, anthropometric,
clinical and laboratory variables.
Study Design
For each identified patient with type 2 diabetes, the following drug classes were arranged
chronologically according to the initial prescription dates: metformin, sulfonylurea (SU),
thiazolidinedione (TZD), alpha glucosidase inhibitor, amylin, dopamine receptor agonist,
meglitinide, dipeptidyl peptidase-4 inhibitor (DPP-4i), glucagon like peptide 1 receptor
agonist (GLP-1RA), sodium glucose cotransporter inhibitor, and insulin (INS). Next, data
on each patient’s first-, second-, and third-line ADDs were derived. A robust
methodology for extraction and assessment of longitudinal patient-level medication data from
the Centricity EMRs has been recently described by the authors 13.
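The chronological arrangement of drug classes described above can be illustrated with a minimal sketch. The record layout (pairs of drug class and prescription date), the function name `therapy_lines`, and the alphabetical tie-breaking rule are assumptions made for illustration only; the authors' actual extraction methodology is the one described in their cited work.

```python
from datetime import date


def therapy_lines(prescriptions):
    """Return drug classes ordered by their first prescription date.

    `prescriptions` is an iterable of (drug_class, prescription_date) pairs;
    this layout is a hypothetical simplification of the EMR medication data.
    """
    first_seen = {}
    for drug_class, rx_date in prescriptions:
        # Keep only the earliest prescription date for each drug class.
        if drug_class not in first_seen or rx_date < first_seen[drug_class]:
            first_seen[drug_class] = rx_date
    # Sort classes chronologically; ties are broken alphabetically (assumption).
    ordered = sorted(first_seen.items(), key=lambda item: (item[1], item[0]))
    return [drug_class for drug_class, _ in ordered]


# Hypothetical patient: metformin first, then SU, then insulin.
records = [
    ("SU", date(2011, 3, 1)),
    ("metformin", date(2009, 6, 15)),
    ("metformin", date(2010, 1, 10)),
    ("INS", date(2014, 8, 20)),
]
first_line, second_line, third_line = therapy_lines(records)[:3]
```

In this sketch the position of a class in the returned list is its line of therapy, so the second element would be the second-line ADD used to define baseline.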
The main study cohort included patients with: (1) age at diagnosis ≥18 and <80 years, (2)
diagnosis date strictly after the first registered activity in the EMR database, (3) diagnosis
date on or after January 1, 2005, (4) anti-diabetic therapy initiated with metformin, (5)
second-line ADD initiated with SU, TZD, DPP-4i, GLP-1RA or INS, (6) an available HbA1c, systolic
blood pressure (SBP), low density lipoprotein (LDL), or triglycerides measure at second-line
ADD initiation (baseline), (7) second-line therapy duration of at least three months, and (8)
follow-up from baseline of at least six months. Two sub-cohorts were defined by longer minimum
follow-up: 12 months (sub-cohort 1) and 24 months (sub-cohort 2).
HbA1c measures at baseline, 6, 12, 18, and 24 months were obtained as the nearest measure
within 3 months either side of the time point. Baseline and longitudinal body weight, SBP,
and lipids were calculated as the average of available measures within 3 months either side of
the time point. Provided that at least two non-missing follow-up measures were available over
24 months, missing data were imputed using a Markov Chain Monte Carlo method adjusting for age,
diabetes duration and usage of concomitant ADDs 14.
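The two windowing rules above (nearest measure for HbA1c, average of available measures for body weight, SBP, and lipids) can be sketched as follows. This is a minimal illustration, not the authors' code: the 91-day approximation of "3 months", the function names, and the example values are assumptions, and the imputation step is not shown.

```python
from datetime import date, timedelta

WINDOW_DAYS = 91  # "3 months either side", approximated as 91 days (assumption)


def _in_window(measures, target):
    """Return the (date, value) pairs taken within the window around `target`."""
    return [(d, v) for d, v in measures if abs((d - target).days) <= WINDOW_DAYS]


def nearest_measure(measures, target):
    """Value closest in time to `target` (rule used for HbA1c); None if missing."""
    candidates = _in_window(measures, target)
    if not candidates:
        return None
    return min(candidates, key=lambda m: abs((m[0] - target).days))[1]


def mean_measure(measures, target):
    """Average of all values in the window (rule used for weight, SBP, lipids)."""
    values = [v for _, v in _in_window(measures, target)]
    return sum(values) / len(values) if values else None


# Hypothetical patient data around the 6-month time point.
baseline = date(2012, 1, 1)
six_months = baseline + timedelta(days=182)  # 6 months approximated as 182 days
hba1c = [(date(2012, 5, 20), 8.1), (date(2012, 7, 10), 7.4), (date(2013, 1, 5), 7.0)]
sbp = [(date(2012, 6, 1), 138.0), (date(2012, 8, 1), 132.0)]

hba1c_6m = nearest_measure(hba1c, six_months)
sbp_6m = mean_measure(sbp, six_months)
```

A time point with no measure inside the window returns `None`, which corresponds to the missing values that the study subsequently imputed.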
The presence of comorbidities prior to baseline was assessed by relevant disease
identification codes. Cardiovascular disease (CVD) was defined as ischaemic heart disease,
peripheral vascular disease, heart failure (HF), or stroke. A three-point major adverse
cardiovascular event (MACE) was defined as the presence of HF, myocardial infarction (MI), or
stroke.
Lipid modifying agents included all FDA-approved drugs with the highest Anatomical
Therapeutic Chemical (ATC) classification code of C10. Blood pressure lowering drugs
were defined by the ATC codes C02-C04 and C07-C09 (including diuretics and
vasodilators).
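As an illustration of the ATC-based definitions above, the following sketch maps an ATC code to a cardio-protective drug group by its prefix. The function name and return labels are hypothetical; only the code ranges come from the text.

```python
LIPID_PREFIXES = ("C10",)
BP_PREFIXES = ("C02", "C03", "C04", "C07", "C08", "C09")


def cardio_protective_class(atc_code):
    """Map an ATC code to 'lipid', 'blood_pressure', or None (hypothetical labels)."""
    code = atc_code.strip().upper()
    if code.startswith(LIPID_PREFIXES):
        return "lipid"
    if code.startswith(BP_PREFIXES):
        return "blood_pressure"
    return None


# Example codes: atorvastatin (C10AA05), enalapril (C09AA02), metformin (A10BA02).
examples = {code: cardio_protective_class(code) for code in ("C10AA05", "C09AA02", "A10BA02")}
```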
SBP ≥ 130/ 140 mmHg for those with/ without CVD history at baseline was defined as
uncontrolled. Similarly, LDL ≥ 70/ 100 mg/dL for those with/ without CVD history at
baseline was defined as uncontrolled. Triglycerides ≥ 150 mg/dL and HbA1c ≥ 7.5% were
defined as uncontrolled.
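The thresholds above can be expressed as a small function, sketched here under the assumption that each patient has one value per risk factor at a given time point. The function and key names are hypothetical.

```python
def uncontrolled_flags(hba1c, sbp, ldl, trig, cvd_history):
    """Flag each risk factor as uncontrolled using the study thresholds.

    hba1c in %, sbp in mmHg, ldl and trig in mg/dL; cvd_history is True for
    patients with a history of CVD at baseline.
    """
    sbp_cut = 130 if cvd_history else 140  # stricter SBP target with CVD history
    ldl_cut = 70 if cvd_history else 100   # stricter LDL target with CVD history
    return {
        "hba1c": hba1c >= 7.5,
        "sbp": sbp >= sbp_cut,
        "ldl": ldl >= ldl_cut,
        "trig": trig >= 150,
    }


# The same hypothetical measurements are classified differently by CVD history.
with_cvd = uncontrolled_flags(7.2, 135, 85, 140, cvd_history=True)
without_cvd = uncontrolled_flags(7.2, 135, 85, 140, cvd_history=False)
```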
Statistical Methods
Baseline characteristics were summarised as number (%), mean (SD) or median (first
quartile, third quartile). Longitudinal failure to control risk factors (individual and pairwise)
and risk factor burden were calculated irrespective of baseline control status. Failure to
control LDL and triglycerides at 6, 12, 24 months was calculated in those who were using a
lipid modifying drug prior to 6, 12, 24 months of baseline, respectively. Similarly, failure to
control SBP was calculated only in those who were using a blood pressure lowering drug
prior to 6, 12, 24 months of baseline. Sub-cohort 1 and sub-cohort 2 were used for one- and
two-year analyses, respectively. Pairwise failure to simultaneously control HbA1c plus (1)
LDL, (2) triglycerides, and (3) SBP was summarised as proportion (95% CI) at 6, 12, and 24
months post baseline. The probability (95% CI) of failure to control both risk factors by
second-line ADD group was calculated using a “Treatment Effects” modelling approach 15-17.
Second-line treatment groups were balanced on baseline risk factor measurements. The probit
model for the likelihood of failure was adjusted for sex, duration of diabetes, baseline age
and body weight.
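The unadjusted pairwise summary, a proportion with a 95% CI, can be sketched as below. Two points are assumptions rather than statements from the text: "failure to simultaneously control" is read as at least one of the two risk factors being uncontrolled, and the Wilson score interval is used because the text does not name a CI method. The adjusted treatment-effects model is not reproduced here.

```python
import math


def pairwise_failure(hba1c_uncontrolled, other_uncontrolled):
    """True unless both risk factors are controlled (interpretation, see lead-in)."""
    return hba1c_uncontrolled or other_uncontrolled


def wilson_ci(failures, n, z=1.96):
    """Proportion of failures with a Wilson score interval (assumed CI method)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = failures / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, centre - half, centre + half


# Hypothetical (HbA1c uncontrolled, LDL uncontrolled) flags for five patients.
flags = [(True, False), (False, False), (False, True), (True, True), (False, False)]
n_failed = sum(pairwise_failure(a, b) for a, b in flags)
proportion, lower, upper = wilson_ci(n_failed, len(flags))
```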
Risk factor two-year burden was defined as uncontrolled measures (at 6 months OR at 12
months) AND (at 18 months OR at 24 months) for patients in sub-cohort 2. Two-year burden
for LDL and triglycerides was calculated among those who were using a lipid modifying drug
prior to 12 months of baseline. Two-year burden for SBP was calculated among those who
were using a blood pressure lowering drug prior to 12 months of baseline. Proportions of
continuously uncontrolled patients (two-year burden) were summarized by the year of
second-line ADD initiation and by the class of second-line ADD. Standard life-table methods
were used to estimate rates per 1000 person years (95% CI) of MACE by class of second-line
ADDs.
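The two-year burden definition, (uncontrolled at 6 OR 12 months) AND (uncontrolled at 18 OR 24 months), translates directly into a boolean rule. The sketch below also summarises the proportion of continuously uncontrolled patients by group, for example by year of second-line ADD initiation. The data layout and function names are hypothetical.

```python
from collections import defaultdict


def two_year_burden(unc_6, unc_12, unc_18, unc_24):
    """Continuously uncontrolled: (6 or 12 months) and (18 or 24 months)."""
    return (unc_6 or unc_12) and (unc_18 or unc_24)


def burden_by_group(patients):
    """Proportion with two-year burden per group.

    `patients` is an iterable of (group, flags) pairs, where `flags` maps the
    month (6, 12, 18, 24) to True if the risk factor was uncontrolled.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [with burden, total]
    for group, flags in patients:
        counts[group][1] += 1
        if two_year_burden(flags[6], flags[12], flags[18], flags[24]):
            counts[group][0] += 1
    return {group: burdened / total for group, (burdened, total) in counts.items()}


# Hypothetical patients grouped by year of second-line ADD initiation.
cohort = [
    (2005, {6: True, 12: False, 18: False, 24: True}),
    (2005, {6: False, 12: False, 18: True, 24: True}),
    (2014, {6: True, 12: True, 18: True, 24: False}),
]
burden = burden_by_group(cohort)
```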
RESULTS
From 2,624,954 identified patients with T2DM, 276,884 met the inclusion criteria
(Supplementary Figure 1, Table 1). With a mean follow-up of 3.7 years, 89% of the cohort had at
least one year of follow-up. The majority of patients were obese (n=187,936, 70%),
and 60,317 (22%) had a history of CVD on or prior to baseline. Those with a history of CVD
were older (mean: 64 years) and more likely to be male (61%) than those without a history of
CVD (mean: 57 years; 46% male). With a mean (SD) HbA1c of 8.4 (1.9)% at the time of
second ADD initiation, 54/ 61% of patients with/ without a history of CVD had HbA1c
≥7.5% respectively. With a mean (SD) LDL of 97 (35) mg/dL, 67% of those with a history of
CVD had LDL ≥70 mg/dL, while 46% of those without CVD history had LDL ≥100 mg/dL.
Baseline triglycerides were ≥ 150 mg/dL in 48% of patients. In sub-cohort 1, among those
with/ without a history of CVD, 90/ 74% were using a lipid modifying drug prior to or within 1
year of baseline (data not shown). With a mean (SD) SBP of 131 (15) mmHg, 50% of those
with a history of CVD had SBP ≥130 mmHg, while 25% of those without CVD history had
SBP ≥140 mmHg. In sub-cohort 1, among those with/ without a history of CVD, 97/ 84%
were using a blood pressure lowering drug prior to or within 1 year of baseline (data not
shown).
Individual Risk Factor Failure
Irrespective of baseline control, 37, 39, and 42% of patients failed to achieve HbA1c below
7.5% at 6, 12, and 24 months post intensification with second-line ADD (Table 2). The
proportions of those who failed to control HbA1c were lower for those with a history of CVD
at baseline (32-38%), compared to those without a history of CVD at baseline (38-42%, data
not shown). Among patients who were using a lipid modifying drug, 43% had uncontrolled
LDL over two years post baseline (Table 2), whereas 64/ 36% of those with/ without a
history of CVD failed to achieve LDL < 70/ 100 mg/dL (data not shown). Among patients
who were using a lipid modifying drug, 46% had uncontrolled triglycerides over two years
post baseline (Table 2); the proportions were similar among those with/ without a history of
CVD at baseline. Among patients who were using a blood pressure lowering drug, 30%
failed to control SBP during two years post intensification with second-line ADD, whereas 49/
24% of those with/ without a history of CVD failed to achieve SBP < 130/ 140 mmHg over
2 years.
Among patients with baseline HbA1c ≥7.5 and ≤ 9%, 43/ 46/ 48% failed to achieve HbA1c <
7.5% at 6/ 12/ 24 months, irrespective of additional therapy intensification (Supplementary
Figure 2). Among those who were using a lipid modifying drug and had uncontrolled LDL at
baseline, the proportions of those who were uncontrolled at 6/12/ 24 months were 71 /65/
60%. Similarly, more than 60% continued to have uncontrolled triglycerides, among those
who were uncontrolled at baseline. Among those who were using a blood pressure lowering
drug and had uncontrolled SBP at baseline, the proportions of those who were uncontrolled at
6/12/ 24 months were 60/ 55/ 51% (Supplementary Figure 2).
Pairwise Risk Factor Control
Among patients who were using a lipid modifying drug, apart from being on intensified ADD
by design, around 62% failed to simultaneously control HbA1c+LDL over two years post
second-line ADD initiation (Table 2), whereas around 75 / 58% of those with / without a
history of CVD failed to control both risk factors simultaneously (data not shown). The
adjusted probability (95% CI) of failing to simultaneously control both risk factors at 24
months was the lowest in those who initiated second-line with GLP-1RA [0.55 (0.53, 0.58)]
and TZD [0.55 (0.54, 0.57)], followed by DPP-4i [0.59 (0.58, 0.60)], and significantly higher
for SU [0.65 (0.64, 0.65)] and INS [0.69 (0.68, 0.70)] users (Table 3).
Among patients who were using a lipid modifying drug, around 62% failed to simultaneously
control HbA1c+Triglycerides over two years post second-line ADD initiation (Table 2). The
adjusted probability (95% CI) of failing to simultaneously control both risk factors at 24
months was the lowest in those who initiated second-line with TZD [0.52 (0.50, 0.53)],
followed by DPP-4i [0.56 (0.55, 0.57)] and GLP-1RA [0.56 (0.52, 0.60)], and significantly
higher for SU [0.63 (0.62, 0.64)] and insulin [0.65 (0.63, 0.66)] users (Table 3).
Among patients who were using a blood pressure lowering drug, 53/ 55/ 57% failed to
simultaneously control HbA1c+SBP at 6/ 12/ 24 months post second-line ADD initiation
(Table 2). Among those with/ without a history of CVD, 64-67%/ 50-54% of patients
failed to control both risk factors simultaneously (data not shown). The adjusted probability
(95% CI) of failing to simultaneously control both risk factors at 24 months was lower in
those who initiated second-line with TZD [0.48 (0.47, 0.50)], GLP-1RA [0.49 (0.47, 0.51)],
or DPP-4i [0.51 (0.50, 0.52)], compared to those who initiated with SU [0.60 (0.59, 0.60)] or
insulin [0.64 (0.63, 0.65)] (Table 3).
Continued Risk Factor Burden
Among those with follow-up for at least 24 months, 35% had continuously uncontrolled
HbA1c (≥ 7.5%). The two-year burden increased from 31% for those who intensified
first-line therapy in 2005 to 41% for those who intensified therapy in 2014 (Figure 1A).
Two-year burden increased from 28 to 36% and from 32 to 42% for those with and without a
history of CVD at baseline, respectively (Figure 1B and C). The proportions of those with
continuously uncontrolled HbA1c were lower among those who initiated second-line with
GLP-1RA (95% CI: 24-26%) and TZD (95% CI: 23-24%), followed by DPP-4i (95% CI: 28-30%), and
significantly higher for SU (95% CI: 39-40%) and INS (95% CI: 50-51%) (Figure 1D).
Among those who initiated a lipid modifying drug prior to 12 months of baseline and had at
least two years of follow-up, 41% had continuously uncontrolled LDL (Figure 1A). Among
those with/ without a history of CVD at baseline, 65/ 33% had continuously uncontrolled
LDL over 2 years post baseline (Figure 1B and C). When stratified by CVD status at baseline,
the proportions of those with two-year LDL burden were similar among second-line ADD
classes (Figure 1 E-F).
Among those who initiated a lipid modifying drug prior to 12 months of baseline and had at
least two years of follow-up, 37% had continuously uncontrolled triglycerides (≥
150 mg/dL) (Figure 1A-C). Among those without CVD history at baseline, the proportion of
those with continuously uncontrolled triglycerides was lower in the TZD (95% CI: 31-33%) and
INS (95% CI: 33-36%) groups, compared to other second-line ADDs (95% CIs: 37-41%)
(Figure 1E). Among those with CVD history at baseline, the lowest proportion of those with
continuously uncontrolled triglycerides was in the TZD group (95% CI: 28-32%), compared to
other second-line ADDs (95% CIs: 34-43%) (Figure 1F).
Among those who initiated a blood pressure lowering drug prior to 12 months of baseline and
had at least two years of follow-up, 27-33% had continuously uncontrolled SBP (Figure 1A).
Among those with/ without a history of CVD at baseline, 51/ 21% had continuously
uncontrolled SBP over 2 years post baseline (Figure 1B and C). Those who initiated second-line
ADD with GLP-1RA had the lowest two-year SBP burden (95% CI: 18-20%), followed
by DPP-4i (95% CI: 23-24%), TZD (95% CI: 25-26%), SU and INS (95% CIs: 30-32%)
(Figure 1D-F).
Cardiovascular Risk
Individual and composite rates per 1000 person-years of HF, MI, and stroke, along with the
number of events, are presented in Table 4. In the primary prevention group (no history of
CVD at baseline), 18,438 3-point MACE events occurred during available follow-up. The lowest
rate per 1000 person-years was observed in those who initiated second-line with GLP-1RA (95% CI: 13-15),
followed by the DPP-4i and TZD groups (95% CIs: 19-21), SU (95% CI: 26-27), and INS (95%
CI: 31-34). This pattern was preserved in individual analyses for HF, MI, and stroke as well.
In the secondary prevention group (history of CVD at baseline), 15,323 3-point MACE events
occurred during available follow-up. The lowest rate per 1000 person-years was observed in
those who initiated second-line with GLP-1RA (95% CI: 53-66), followed by the DPP-4i and TZD
groups (95% CIs: 66-80), SU (95% CI: 86-90), and INS (95% CI: 100-108). This pattern was
preserved in individual analyses for HF, MI, and stroke as well (Table 4).
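For readers unfamiliar with event rates per 1000 person-years, the sketch below shows how such a rate and an approximate 95% CI can be computed from an event count and the total follow-up time. It uses a normal approximation on the log scale, which is an assumption chosen for simplicity and not the life-table method applied in the study; the numbers are hypothetical and are not taken from Table 4.

```python
import math


def rate_per_1000_py(events, person_years, z=1.96):
    """Event rate per 1000 person-years with an approximate 95% CI.

    The CI uses a normal approximation on the log rate, with standard error
    1 / sqrt(events); this is an illustrative choice, not the study's method.
    """
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = 1000 * events / person_years
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor


# Hypothetical example: 140 events over 10,000 person-years of follow-up.
rate, lower, upper = rate_per_1000_py(140, 10_000)
```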
CONCLUSIONS
In this longitudinal exploratory study of a large cohort of patients with type 2 diabetes from
primary and ambulatory care systems of the USA, we observed that (1) irrespective of
baseline control, more than 40% of patients did not meet the 7.5% HbA1c target two years after
metformin intensification; (2) long-term glycaemic burden has increased over the last decade;
(3) around a third of patients have consistently uncontrolled lipids and SBP even though they
are using cardio-protective drugs; and (4) treatment with GLP-1RA was associated with lower
rates of major adverse cardiovascular events.
The results of this study clearly demonstrate persistent glycaemic and CV risk factor burden
among patients who are using multiple medications for glucose, lipid, and blood pressure
control. Three out of five patients who are already receiving intensified treatment are failing
to simultaneously control glucose level and at least one CV risk factor. Furthermore, the
proportions of patients who fail to control CV risk factors are not decreasing over time, and
glycaemic burden has increased during the last decade.
Population ageing, therapy non-adherence and inadequate treatment intensification when
needed (therapeutic inertia) may explain the observed patterns. While statin prescribing rates
are increasing, using US Medical Expenditure Panel Survey data, Salami and colleagues
reported that use of high intensity statins was only 18-20% in patients with diabetes and no
atherosclerotic cardiovascular disease 18. Similarly, Abdallah and colleagues reported that
among 1,300 patients with diabetes, 88% were prescribed statins at the time of hospital
discharge for acute myocardial infarction, whereas only 22% were prescribed intensive statin
therapy 19. Further studies investigating intensification patterns for lipid and blood pressure
control and long-term consequences of not intensifying therapy when needed are required in
patients with diabetes.
We observed lower probabilities of simultaneous failure to control HbA1c and
LDL, triglycerides, or SBP in those who initiated second-line with GLP-1RA, DPP-4i, and
TZD, compared to those who were treated with SU and INS. A recent study by Montvida
and colleagues showed that incretin-based therapies and TZD provide a higher chance of
sustainable glycaemic control compared to SU and INS 20. While two-year glycaemic burden
differed significantly by ADD class, those who were treated with TZD had lower
triglycerides burden and those on GLP-1RA had lower SBP burden. The composite and
individual rates of HF, MI, and stroke were lowest for those who initiated second-line with
GLP-1RA, compared to other ADDs, in cohorts with and without a history of CVD.
In general, the CEMR database is representative of the US population in terms of age and ethnic
subgroups; however, higher proportions of patients from north-eastern and mid-western
states are represented in the CEMR 21. The distribution of CV risk factors was found to be
similar to that in prospective national health surveys 22. The large cohort size, with an
average of 3.7 years of follow-up post metformin intensification, supports the reliability of
the estimates reported in the present study. Drug use data available from patients’ medication
lists, along with prescribing information and the robust data mining methodologies applied, add value to this
study 13,23. Nonetheless, the findings should be interpreted with caution: EMR data are in
general biased towards unhealthy populations and commercially insured individuals, White
Caucasians are over-represented in the CEMR, and the results are subject to limited
follow-up.
To conclude, we observed alarmingly poor population-level glycaemic and CV risk
factor control, and the burden has not reduced during the last decade. While treatment
guidelines and clinician and population education are constantly improving, the CV burden and
associated costs of diabetes management are unlikely to reduce in the near future.
ACKNOWLEDGEMENTS
OM and SKP were responsible for the primary design of the study. JHS contributed
significantly in the study design. OM conducted the data extraction. OM and SKP jointly
conducted the statistical analyses. The first draft of the manuscript was developed by OM and
SKP, and all authors contributed to the finalization of the manuscript. SKP had full access to
all the data in the study and takes responsibility for the integrity of the data and the accuracy
of the data analysis.
Melbourne EpiCentre gratefully acknowledges the support from the National Health and
Medical Research Council and the Australian Government’s National Collaborative Research
Infrastructure Strategy (NCRIS) initiative through Therapeutic Innovation Australia. OM
acknowledges the PhD scholarship from Queensland University of Technology, Australia,
and her co-supervisors Prof. Ross Young and Prof. Louise Hafner of the same University.
JHS is supported by a National Health and Medical Research Council Research Fellowship.
No separate funding was obtained for this study.
Declaration of interests
SKP has acted as a consultant and/or speaker for Novartis, GI Dynamics, Roche,
AstraZeneca, Guangzhou Zhongyi Pharmaceutical and Amylin Pharmaceuticals LLC. He has
received grants in support of investigator and investigator-initiated clinical studies from
Merck, Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis and
Pfizer. OM has no conflict of interest to declare. JHS has received speaker honoraria,
consultancy fees and/or travel sponsorship from AstraZeneca, Boehringer Ingelheim, Lilly,
Sanofi, Mylan, Novo Nordisk, Merck Sharp and Dohme and Novartis.
REFERENCES
1. Fox CS, Coady S, Sorlie PD, et al. Increasing cardiovascular disease burden due to diabetes mellitus: the Framingham Heart Study. Circulation 2007; 115(12): 1544-50.
2. Turnbull FM, Abraira C, Anderson RJ, et al. Intensive glucose control and macrovascular outcomes in type 2 diabetes. Diabetologia 2009; 52(11): 2288-98.
3. Arnold SV, Kosiborod M, Wang J, Fenici P, Gannedahl G, LoCasale RJ. Burden of Cardio-Renal-Metabolic Conditions in Adults with Type 2 Diabetes within the Diabetes Collaborative Registry. Diabetes, Obesity and Metabolism 2018.
4. American Diabetes Association. Standards of Medical Care in Diabetes—2018. Diabetes Care 2018; 41(Supplement 1): S4.
5. Khunti K, Kosiborod M, Ray KK. Legacy benefits of blood glucose, blood pressure and lipid control in people with diabetes and cardiovascular disease: Time to overcome multifactorial therapeutic inertia? Diabetes, Obesity and Metabolism 2018.
6. Ali MK, Bullard KM, Saaddine JB, Cowie CC, Imperatore G, Gregg EW. Achievement of goals in US diabetes care, 1999–2010. New England Journal of Medicine 2013; 368(17): 1613-24.
7. Lipska KJ, Yao X, Herrin J, et al. Trends in drug utilization, glycemic control, and rates of severe hypoglycemia, 2006–2013. Diabetes Care 2017; 40(4): 468-75.
8. Carls G, Huynh J, Tuttle E, Yee J, Edelman SV. Achievement of glycated hemoglobin goals in the US remains unchanged through 2014. Diabetes Therapy 2017; 8(4): 863-73.
9. Centers for Disease Control and Prevention. National diabetes statistics report: estimates of diabetes and its burden in the United States, 2018. Atlanta, GA: US Department of Health and Human Services 2018.
10. Crawford AG, Cote C, Couto J, et al. Comparison of GE Centricity Electronic Medical Record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Population Health Management 2010; 13(3): 139-50.
11. Brixner D, Said Q, Kirkness C, Oberg B, Ben-Joseph R, Oderda G. Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value in Health 2007; 10(s1): S29-S36.
12. Paul SK, Shaw J, Montvida O, Klein K. Weight gain in insulin treated patients by BMI categories at treatment initiation: New evidence from real-world data in patients with type 2 diabetes. Diabetes, Obesity and Metabolism 2016.
13. Montvida O, Arandjelović O, Reiner E, Paul SK. Data Mining Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic Medical Records. Open Bioinformatics Journal 2017; 10: 1-15.
14. Thomas G, Klein K, Paul S. Statistical challenges in analysing large longitudinal patient-level data: the danger of misleading clinical inferences with imputed data. J Indian Soc Agric Stat 2014; 68: 39-54.
15. Rotnitzky A, Robins JM. Inverse probability weighting in survival analysis. Encyclopedia of Biostatistics 2005.
16. Austin PC, Stuart EA. The performance of inverse probability of treatment weighting and full matching on the propensity score in the presence of model misspecification when estimating the effect of treatment on survival outcomes. Statistical Methods in Medical Research 2015: 0962280215584401.
17. Austin PC. Variance estimation when using inverse probability of treatment weighting (IPTW) with survival analysis. Statistics in Medicine 2016; 35(30): 5642-55.
18. Salami JA, Warraich H, Valero-Elizondo J, et al. National trends in statin use and expenditures in the US adult population from 2002 to 2013: insights from the Medical Expenditure Panel Survey. JAMA Cardiology 2017; 2(1): 56-65.
19. Abdallah MS, Kosiborod M, Tang F, et al. Patterns and predictors of intensive statin therapy among patients with diabetes mellitus after acute myocardial infarction. American Journal of Cardiology 2014; 113(8): 1267-72.
20. Montvida O, Shaw J, Blonde L, Paul SK. Long-term sustainability of glycaemic achievements with second-line anti-diabetic therapies in patients with type 2 diabetes: A real-world study. Diabetes, Obesity and Metabolism 2018.
21. Brixner D, McAdam-Marx C, Ye X, et al. Six-month outcomes on A1C and cardiovascular risk factors in patients with type 2 diabetes treated with exenatide in an ambulatory care setting. Diabetes, Obesity and Metabolism 2009; 11(12): 1122-30.
22. Brixner D, Said Q, Kirkness C, Oberg B, Ben-Joseph R, Oderda G. Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value in Health 2007; 10: S29-S36.
23. Owusu Adjah ES, Montvida O, Agbeve J, Paul SK. Data Mining Approach to Identify Disease Cohorts from Primary Care Electronic Medical Records: A Case of Diabetes Mellitus. The Open Bioinformatics Journal 2017; 10(1).
Table 1: Cohort characteristics at the time of second-line anti-diabetic drug initiation.
                                           All               No history of CVD   History of CVD
N                                          276,884           216,567             60,317
Age, years*                                59 (12)           57 (12)             64 (9)
Male†                                      136,918 (49)      99,907 (46)         37,011 (61)
White†                                     194,758 (70)      149,180 (69)        45,578 (76)
Black†                                     32,671 (12)       27,274 (13)         5,397 (9)
Time from metformin initiation, months*    7.5 (15.7)        7.1 (15.2)          9.0 (17.4)
Follow-up, years*                          3.7 (2.4)         3.7 (2.5)           3.6 (2.4)
Follow-up ≥ 12 months†                     247,223 (89)      193,092 (89)        54,131 (90)
Follow-up ≥ 24 months†                     191,883 (69)      149,833 (69)        42,050 (70)
Therapy duration, months*                  33 (25)           33 (25)             33 (24)
HbA1c, %*                                  8.4 (1.9)         8.5 (1.9)           8.1 (1.7)
HbA1c ≥ 7.5%†                              102,624 (60)      84,835 (61)         17,789 (54)
Weight, kg*                                99 (25)           100 (25)            97 (23)
BMI, kg/m2*                                35 (8)            35 (8)              33 (7)
BMI < 25 kg/m2†                            18,819 (7)        13,735 (7)          5,084 (9)
BMI ≥ 25 and < 30 kg/m2†                   60,575 (23)       44,963 (22)         15,612 (27)
BMI ≥ 30 kg/m2†                            187,936 (70)      150,067 (72)        37,869 (65)
SBP, mmHg*                                 131 (15)          131 (15)            130 (16)
Uncontrolled SBP†§                         82,837 (30)       53,168 (25)         29,669 (50)
DBP, mmHg*                                 77 (9)            78 (9)              75 (9)
LDL, mg/dL*                                97 (35)           100 (35)            87 (34)
Uncontrolled LDL†||                        71,424 (50)       51,077 (46)         20,347 (67)
HDL, mg/dL*                                43 (12)           44 (12)             42 (12)
Triglycerides, mg/dL‡                      147 (107, 197)    148 (107, 198)      146 (107, 195)
Triglycerides ≥ 150 mg/dL†                 54,640 (48)       43,240 (49)         11,400 (48)
Chronic kidney disease†                    9,602 (3)         5,793 (3)           3,809 (6)
Cancer†                                    13,750 (5)        9,951 (5)           3,799 (6)
Depression†                                38,444 (14)       29,996 (14)         8,448 (14)
Charlson Comorbidity Index*                1.6 (1.1)         1.4 (0.9)           2.4 (1.4)
Any lipid modifying drug†                  188,272 (68)      137,391 (63)        50,881 (84)
Statin†                                    168,485 (61)      121,287 (56)        47,198 (78)
Blood pressure lowering drug†              224,086 (81)      167,177 (77)        56,909 (94)
*mean (sd); †n (%); ‡median (IQR);
§Uncontrolled SBP: ≥ 130/140 mmHg for those with/without a history of cardiovascular disease;
||Uncontrolled LDL: ≥ 70/100 mg/dL for those with/without a history of cardiovascular disease.
Table 2: Proportions (95% CI) of those who failed to control individual risk factors* and proportions (95% CI) of those who failed to control two risk factors simultaneously at 6, 12, and 24 months post second-line anti-diabetic drug initiation.
                            6 months       12 months      24 months
Individual Failure
  HbA1c                     37 (36, 37)    39 (39, 39)    42 (41, 42)
  LDL                       43 (43, 43)    43 (43, 43)    42 (41, 42)
  Triglycerides             46 (45, 46)    46 (45, 46)    45 (44, 45)
  SBP                       31 (30, 31)    31 (30, 31)    30 (30, 30)
Simultaneous Failure
  HbA1c + LDL               61 (60, 61)    62 (62, 62)    63 (62, 63)
  HbA1c + Triglycerides     61 (61, 61)    62 (62, 62)    63 (62, 63)
  HbA1c + SBP               53 (53, 54)    55 (55, 55)    57 (57, 57)
*Control: HbA1c < 7.5%; triglycerides < 150 mg/dL; SBP < 130/140 mmHg for those with/without a history of cardiovascular disease; LDL < 70/100 mg/dL for those with/without a history of cardiovascular disease. LDL, triglycerides and SBP proportions are calculated among users of lipid modifying and blood pressure lowering drugs.
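The control definitions in the footnote above can be written as a small rule set. The following is a minimal Python sketch, not the code used in the study: the function name `is_controlled` and the measure labels are illustrative assumptions, while the thresholds are the ones stated in the footnote.

```python
# Illustrative sketch only. The function name and measure labels are hypothetical;
# the thresholds are those given in the table footnote.

def is_controlled(measure: str, value: float, has_cvd: bool) -> bool:
    """Return True if a single risk factor value meets its control definition."""
    if measure == "hba1c":           # HbA1c in %
        return value < 7.5
    if measure == "triglycerides":   # mg/dL
        return value < 150
    if measure == "sbp":             # mmHg; stricter target with a history of CVD
        return value < (130 if has_cvd else 140)
    if measure == "ldl":             # mg/dL; stricter target with a history of CVD
        return value < (70 if has_cvd else 100)
    raise ValueError(f"unknown measure: {measure}")
```

Under this sketch the same SBP reading of 135 mmHg counts as controlled for a patient without a history of cardiovascular disease and as uncontrolled for a patient with one, which is why the CVD status is a required argument.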
Table 3: By groups of second-line anti-diabetic drug, adjusted probability (95% CI) of failure to simultaneously control* both risk factors.
                            6 months             12 months            24 months
HbA1c and LDL
  MET+SU                    0.62 (0.62, 0.63)    0.64 (0.64, 0.65)    0.65 (0.64, 0.65)
  MET+TZD                   0.57 (0.56, 0.58)    0.55 (0.54, 0.56)    0.55 (0.54, 0.57)
  MET+INS                   0.66 (0.65, 0.67)    0.67 (0.66, 0.68)    0.69 (0.68, 0.70)
  MET+GLP-1RA               0.56 (0.54, 0.58)    0.54 (0.52, 0.56)    0.55 (0.53, 0.58)
  MET+DPP4                  0.58 (0.57, 0.59)    0.58 (0.58, 0.59)    0.59 (0.58, 0.60)
HbA1c and Triglycerides
  MET+SU                    0.60 (0.59, 0.61)    0.62 (0.61, 0.62)    0.63 (0.62, 0.64)
  MET+TZD                   0.53 (0.51, 0.54)    0.52 (0.50, 0.53)    0.52 (0.50, 0.53)
  MET+INS                   0.62 (0.61, 0.64)    0.65 (0.63, 0.66)    0.65 (0.63, 0.66)
  MET+GLP-1RA               0.54 (0.51, 0.57)    0.54 (0.50, 0.57)    0.56 (0.52, 0.60)
  MET+DPP4                  0.54 (0.53, 0.55)    0.56 (0.54, 0.57)    0.56 (0.55, 0.57)
HbA1c and SBP
  MET+SU                    0.55 (0.55, 0.56)    0.57 (0.57, 0.58)    0.60 (0.59, 0.60)
  MET+TZD                   0.48 (0.47, 0.49)    0.47 (0.46, 0.48)    0.48 (0.47, 0.50)
  MET+INS                   0.59 (0.58, 0.60)    0.61 (0.60, 0.62)    0.64 (0.63, 0.65)
  MET+GLP-1RA               0.47 (0.45, 0.49)    0.48 (0.46, 0.49)    0.49 (0.47, 0.51)
  MET+DPP4                  0.49 (0.48, 0.49)    0.49 (0.48, 0.50)    0.51 (0.50, 0.52)
*Control: HbA1c < 7.5%; triglycerides < 150 mg/dL; SBP < 130/140 mmHg for those with/without a history of cardiovascular disease; LDL < 70/100 mg/dL for those with/without a history of cardiovascular disease. LDL, triglycerides and SBP proportions are calculated among users of lipid modifying and blood pressure lowering drugs. Second-line treatment groups were balanced on baseline risk factor measurements; analyses were adjusted for sex, duration of diabetes, baseline age and body weight.
Figure 1: Proportion of continuously uncontrolled patients over two years post second-line
anti-diabetic drug initiation, by the year of drug initiation (A-C) and by drug class (D-F).
*Uncontrolled measures (at 6 months OR at 12 months) AND (at 18 months OR at 24 months). LDL,
triglycerides and SBP proportions are calculated among those who were using lipid modifying and
blood pressure lowering drugs prior to or within 12 months of second-line anti-diabetic drug
initiation.
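The Figure 1 footnote defines a continuously uncontrolled patient by a logical rule over four time points: uncontrolled at 6 or 12 months, and also uncontrolled at 18 or 24 months. A minimal Python sketch of that rule follows. The function and argument names are illustrative assumptions, and the sketch assumes each flag is already known; how missing measurements were handled is not stated in the footnote.

```python
# Illustrative sketch only. Names are hypothetical; the rule is the one in the
# Figure 1 footnote: (6 months OR 12 months) AND (18 months OR 24 months).

def continuously_uncontrolled(
    uncontrolled_6m: bool,
    uncontrolled_12m: bool,
    uncontrolled_18m: bool,
    uncontrolled_24m: bool,
) -> bool:
    """Return True if the patient is uncontrolled in both follow-up windows."""
    first_year = uncontrolled_6m or uncontrolled_12m
    second_year = uncontrolled_18m or uncontrolled_24m
    return first_year and second_year
```

A patient who is uncontrolled at 6 and 12 months but controlled at both 18 and 24 months is therefore not counted as continuously uncontrolled.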
Table 4: By groups of second-line anti-diabetic drug and by CVD status at the time of second-line anti-diabetic drug initiation, number of failures and rates (95% CIs) per 1000 person years of heart failure, myocardial infarction, stroke and their composite.
                    Without baseline CVD              With baseline CVD
                    Failures   Rate                   Failures   Rate
Heart Failure, Myocardial Infarction, or Stroke
  MET+SU            10,781     26.1 (25.7, 26.6)      9,157      87.9 (86.1, 89.7)
  MET+TZD           2,165      20.2 (19.4, 21.1)      1,416      69.1 (65.6, 72.8)
  MET+INS           2,809      32.4 (31.2, 33.6)      2,491      103.5 (99.5, 107.6)
  MET+GLP-1RA       480        14.1 (12.9, 15.4)      307        59.4 (53.1, 66.4)
  MET+DPP4          2,203      19.3 (18.5, 20.2)      1,952      76.3 (73.0, 79.7)
Heart Failure
  MET+SU            4,835      11.3 (11.0, 11.6)      4,666      39.9 (38.8, 41.0)
  MET+TZD           916        8.2 (7.7, 8.8)         610        26.0 (24.0, 28.1)
  MET+INS           1,357      15.0 (14.2, 15.8)      1,336      49.4 (46.8, 52.1)
  MET+GLP-1RA       179        5.1 (4.4, 6.0)         134        24.0 (20.3, 28.4)
  MET+DPP4          791        6.8 (6.3, 7.3)         898        32.2 (30.1, 34.4)
Myocardial Infarction
  MET+SU            1,429      3.3 (3.1, 3.4)         1,511      12.2 (11.6, 12.8)
  MET+TZD           284        2.5 (2.2, 2.8)         238        9.7 (8.6, 11.0)
  MET+INS           366        4.0 (3.6, 4.4)         386        13.4 (12.1, 14.8)
  MET+GLP-1RA       63         1.8 (1.4, 2.3)         49         8.4 (6.4, 11.2)
  MET+DPP4          229        1.9 (1.7, 2.2)         307        10.6 (9.5, 11.9)
Stroke
  MET+SU            5,925      13.9 (13.6, 14.3)      4,548      39.4 (38.3, 40.6)
  MET+TZD           1,232      11.2 (10.6, 11.8)      815        36.3 (33.8, 38.8)
  MET+INS           1,415      15.7 (14.9, 16.6)      1,210      44.8 (42.4, 47.4)
  MET+GLP-1RA       276        8.0 (7.1, 9.0)         167        30.2 (26.0, 35.2)
  MET+DPP4          1,318      11.4 (10.8, 12.0)      1,011      36.8 (34.6, 39.1)
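As a reading aid for Table 4, a rate per 1000 person years is the number of failures divided by the accumulated person years, multiplied by 1000. The Python sketch below illustrates this under two loudly stated assumptions. First, person years are not reported in the table, so the value 34,043 is back-calculated here from the composite-outcome row for MET+GLP-1RA without baseline CVD (480 failures, rate 14.1) and is hypothetical. Second, the table does not say how the confidence intervals were obtained; the sketch uses a common normal approximation on the log scale, which may differ from the authors' method.

```python
import math

# Illustrative sketch only. The person-years input and the confidence interval
# method (normal approximation on the log rate) are assumptions, not taken
# from the source.

def rate_per_1000(events: int, person_years: float, z: float = 1.96):
    """Return (rate, lower, upper) per 1000 person years."""
    rate = events / person_years * 1000
    half_width = z / math.sqrt(events)   # approximate standard error of log(rate), times z
    lower = rate * math.exp(-half_width)
    upper = rate * math.exp(half_width)
    return rate, lower, upper

# Hypothetical person years, back-calculated from 480 failures at 14.1 per 1000.
rate, lower, upper = rate_per_1000(480, 34_043)
print(round(rate, 1), round(lower, 1), round(upper, 1))
```

With these inputs the sketch gives values that round to 14.1 (12.9, 15.4), in line with the corresponding table row, although agreement on one row does not establish which interval method was actually used.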
Supplementary Figure 1: Flowchart of study cohorts
Patients with non-missing sex and age (n=34,299,123)
Diabetes Mellitus (n=2,893,321)
Type 2 Diabetes (n=2,624,954)
Age at diagnosis ≥ 18 and < 80 (n=2,590,853)
Diabetes diagnosis after entry to the EMR (n=1,412,938)
Diabetes diagnosis on or after Jan 1, 2005 (n=1,305,686)
Metformin as first-line (n=740,478)
Initiated second-line (n=347,735)
    SU (n=187,819)
    TZD (n=33,021)
    INS (n=49,939)
    GLP-1RA (n=15,448)
    DPP-4i (n=61,508)
Baseline HbA1c, SBP, Triglycerides, or LDL (n=322,630)
STUDY COHORT: Therapy duration ≥ 3 months and follow-up ≥ 6 months (n=276,884)
    History of CVD (n=60,317)
    No history of CVD (n=216,567)
STUDY SUB-COHORT 1: Follow-up at least 12 months (n=247,223)
    History of CVD (n=54,131)
    No history of CVD (n=193,092)
STUDY SUB-COHORT 2: Follow-up at least 24 months (n=191,883)
    History of CVD (n=42,050)
    No history of CVD (n=149,833)
Supplementary Figure 2: Among uncontrolled* patients at baseline, proportion (95% CI)
of those who failed to control† individual risk factors at 6, 12, and 24 months post
second-line anti-diabetic drug initiation.
*Uncontrolled: HbA1c ≥ 7.5 and ≤ 9%; triglycerides ≥ 150 mg/dL; SBP ≥ 130/140 mmHg for those with/without a history of cardiovascular disease; LDL ≥ 70/100 mg/dL for those with/without a history of cardiovascular disease.
†Control: HbA1c < 7.5%; triglycerides < 150 mg/dL; SBP < 130/140 mmHg for those with/without a history of cardiovascular disease; LDL < 70/100 mg/dL for those with/without a history of cardiovascular disease. LDL, triglycerides and SBP proportions are calculated among users of lipid modifying and blood pressure lowering drugs.
Chapter 10: Discussion and Conclusions
The novelty of this research project lies in a holistic evaluation of a large representative EMR
database to conduct much-needed pharmaco-epidemiological comparative effectiveness and
outcome studies in patients with T2DM treated with older and newer classes of anti-diabetic
therapies, while also addressing the methodological issues that affect our ability to draw robust
inferences from EMR-based epidemiological studies. This thesis provides a detailed
exploration of the real-world cardio-metabolic effects of treatment with incretin-based therapies
in patients with T2DM. Six pharmaco-epidemiological and three methodological studies were
conducted over three years, and multiple important findings were reported in prestigious
high-impact research journals.
Using large EMRs for clinical studies is a comparatively recent and rapidly developing
research direction that requires specialists in health informatics to have both outstanding data
management skills and a deep understanding of the clinical question being studied. The three
methodological studies conducted as part of this thesis have direct implications for the quality
of the data used in the dissertation’s clinical analyses, and can potentially improve data quality
in future EMR-based clinical and pharmaco-epidemiological research, leading to more
reliable inferences from individual studies.
The first methodological study (Chapter 4) focused on the analytical challenges associated with
the dynamics of prescriptions with different drugs, and developed two algorithms to estimate
the duration of treatment with specific drugs of interest. These approaches were compared and
tested on their ability to capture interchanges between therapies during the course of treatment.
The proposed algorithm was shown to be a reliable and effective tool to extract and aggregate
information on medication data at an individual patient level, which makes it a valuable tool
for use in future research.
The second methodological study (Chapter 5 and Appendix B) discussed data mining
challenges associated with the extraction of disease cohorts, some of which are not
straightforward and may go unnoticed or be detected only at the late stages of statistical
analyses. Diagnostic codes, clinically guided algorithms and machine learning approaches
were employed together to select a robust cohort of patients with diabetes
for observational studies with reduced selection bias.
Missing data is a pervasive problem with all prospective and retrospective observational studies
(including EMRs), posing challenges in terms of efficient design and analyses of clinical
effectiveness studies. Compared to the missing data in RCTs, the patterns and mechanisms
behind the missing risk factor or outcome data in EMRs are very complex and difficult to
ascertain. Robust imputation of missing data relies on the understanding of the predictors of
missingness in the risk factor data, especially in patients with chronic diseases. The third
methodological study (Chapter 6) compared three approaches based on the Multiple Imputation
technique in terms of their robustness in imputing data in patients with diabetes. A novel
component of this study is the investigation of how the likelihood of missing follow-up risk
factor measures (HbA1c) relates to patients’ demographic and clinical characteristics (age, sex,
pre-existing comorbidities and disease severity). While all three imputation techniques were able
to provide consistent and reliable clinical inferences under unknown patterns of missingness,
this study demonstrated that complete case analyses were prone to bias by indication and
highlighted the importance of imputing missing risk factor data.
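For readers unfamiliar with Multiple Imputation, the step that turns several imputed datasets into one inference is usually the pooling of per-dataset estimates with Rubin's rules. The Python sketch below shows that standard pooling step only. It is not the code or the specific imputation approaches compared in Chapter 6, and the function name `pool_rubin` and the input numbers in the usage line are illustrative assumptions.

```python
from statistics import mean, variance

# Illustrative sketch of Rubin's rules for pooling m imputed-data analyses.
# The function name and example inputs are hypothetical.

def pool_rubin(estimates, variances):
    """Pool point estimates and their variances across m imputed datasets.

    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from at least 2 imputations")
    pooled = mean(estimates)             # average of the m point estimates
    within = mean(variances)             # average within-imputation variance
    between = variance(estimates)        # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total

pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what a complete case analysis or a single imputation omits: it carries the extra uncertainty due to the data being missing.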
This dissertation provides detailed explorations of the population-level disease management
patterns. Firstly, it provides detailed assessments of changes in the choices of first-, second-,
and third-line ADDs over the last 10 years (Chapter 7). It also explores the glycaemic state,
clinical characteristics and comorbidities at the time of first-line and second-line therapy
initiation by ADD classes. This dissertation clearly demonstrates that the therapeutic inertia
problem exists at the population level, with 50-60% of patients having HbA1c above 7.5% at
first- and second-line therapy initiation. The long-term consequences for glycaemic and CV risk
of not intensifying treatment when needed were shown by Paul and Khunti [90, 178, 179]. In this context,
the study that explored the combination of GLP-1RA and insulin (Appendix A) is of high
importance. It was demonstrated that patients who are not adequately controlled on GLP-1RA
would benefit from an early addition of insulin (compared to switching) in terms of long-term
cardio-metabolic outcomes. In fact, those who intensified therapy with insulin later ended
up at the same high HbA1c level 2 years after initiating therapy with GLP-1RA. The study
that reported that obese patients who initiate insulin do not gain body weight (Appendix C)
brings additional reassurance to these patients, as most patients who initiate GLP-1RA
are obese, with a mean body weight of 109 kg (Chapters 7-9).
Due to complex study designs, results of RCTs are difficult to compare, and individual patient
choices on therapy intensification become very complex. While only one large RCT has been
designed to compare the glycaemic outcomes of treatment with major ADDs (GRADE,
completion expected in 2020), the study presented in Chapter 8 provides much-needed estimates
of adjusted probabilities to achieve clinically desirable glycaemic control with major second-
line ADDs, and the probabilities to sustain such glycaemic achievements over 2 years, with
and without the need for third-line therapy intensification. A highly valuable contribution of
this study is the assessment of glycaemic control sustainability in patients treated with major
second-line ADDs. Notably, the sustainability of achieved control would not be ethical to
assess using an RCT. Similarly, the long-term glycaemic and CV risk factor burden in patients
with T2DM who are already using intensified anti-diabetic and cardio-protective therapies had
not been reported to date (Chapter 9). With more than a third of patients having consistently
uncontrolled HbA1c, lipid and SBP levels, 3 out of 5 have uncontrolled HbA1c and at least
one CV risk factor over 2 years post intensification. The results reported in Chapter 9 provide
an explanation for the non-declining rates of CV events among patients with T2DM – an issue
that is of major concern for health and government authorities globally.
Extensions of this dissertation include both methodological and clinical directions. The data
quality and linkages of registry data are improving with time, and it becomes possible to
estimate a patient's adherence to prescribed medications in the real-world setting. Such
calculations are methodologically challenging and will provide only rough estimates.
Nonetheless, such studies are essential in order to understand population level patterns of
adherence - given the increased cardio-metabolic burden described in this dissertation and other
studies. Much needed methodological studies include exploring the variability of study
outcomes under different study designs and statistical methodologies. For example, while
assessing supplementary hypotheses for this dissertation, it was observed that exposure to
insulin is associated with an increased risk of acute pancreatitis in patients with T2DM.
However, when events that occurred during the first 6 months after baseline were excluded, the
risk was no longer significant (Appendix D).
Exploration of efficient designs for pharmaco-epidemiological effectiveness studies in chronic
diseases is an important future direction, along with research towards improving statistical
techniques for advanced analyses to account for the longitudinal non-linear risk factor
interactions in real-world scenarios. While the “STrengthening the Reporting of OBservational
studies in Epidemiology” (STROBE) statement guides the reporting of methodological aspects
of observational studies, there are to date no standardised protocols for conducting such studies
[180].
The pharmaco-epidemiological extension directions include assessment of cardio-metabolic
outcomes of combining incretin-based therapies and SGLT-2i, the latest ADD class. This
dissertation was not designed to explore glycaemic and CV outcomes of treatment with SGLT-2i
alone, nor to assess outcomes of combining incretin-based therapies with SGLT-2i, due
to very limited data in the initial CEMR extract. As shown in Chapter 7, the popularity
of DPP-4i and SGLT-2i is increasing dramatically; assessments of novel drug
combinations are therefore of particular interest. Chapter 7 also reports higher discontinuation rates of
novel therapies compared to the older alternatives, which may be attributable to side-effects
and also to higher medication costs. More detailed assessments of the underlying reasons for
medication cessation and the long-term outcomes of such discontinuations also present an
opportunity for future studies. As reported in this dissertation, even though most patients
with T2DM are using lipid modifying and blood pressure lowering drugs, many do not meet
LDL, triglyceride, or SBP targets. There is a paucity of studies assessing outcomes of
therapeutic inertia associated with not intensifying lipid or blood pressure lowering therapies
when needed. The numerical estimates of the complications associated with therapeutic inertia
could motivate clinicians and patients towards more pro-active / aggressive CV risk factor
management. The direct extension of the studies reported in chapters 8-9 is the development
of an algorithm that would estimate the probabilities of achieving and sustaining cardio-
metabolic risk factor control with different glucose, lipid, and blood pressure lowering drug
intensification options under the given (current) risk factor profile of an individual patient. The
algorithm could be implemented as an open-source tool or integrated into the existing EMR
systems. Such a patient-centred approach would help health care professionals to make therapy
intensification choices in the most informed manner to maximise long-term cardio-metabolic
benefits while reducing diabetes related complications in a cost-effective way. The tool would
also help to involve and engage a patient in the decision making process without the need to
assess a huge amount of clinical literature. Finally, using this tool, health economists could
evaluate cost-benefits related to the therapy choice and risk-factor control at national levels.
Challenges and limitations of EMR-based studies were discussed in the introductory chapter
(Subsection 1.9.2), whereas particular concerns associated with individual studies were
discussed in each chapter separately. The representativeness of the CEMR database and of
patients with diabetes is discussed in Subsections 3.1 and 5.4, respectively. In general, it was
observed that the reported population-level processes are comparable to studies using national
survey data. Compared to surveys, EMR data do not suffer from selective nonresponse,
response and recall biases [99], but are biased towards unhealthy populations and commercially
insured individuals. Comparative analyses conducted during this dissertation were carefully
balanced on baseline characteristics between treatment groups and appropriately adjusted for
various confounders including demographics, existing comorbidities, and longitudinal clinical
and medication data. Nonetheless, the non-availability of data to control for patients’
socio-economic status, diet and exercise complicates direct causal interpretation.
To summarise, this thesis highlights the existing glycaemic and cardiovascular risk factor
burden at the population level. Treatment with incretin-based therapies and thiazolidinedione
provides higher chances of achieving and sustaining glycaemic and CV risk factor control,
compared to sulfonylurea and insulin. A residual benefit on the risk of major adverse
cardiovascular events was observed among patients treated with GLP-1RA compared to other
major ADD choices. Nonetheless, patient-centred disease management to holistically control
for glycaemic and cardiovascular risk factors remains a key aspect to improve long-term
outcomes in patients with T2DM.
Bibliography
1. American Diabetes Association, Standards of Medical Care in Diabetes—2018. Diabetes Care, 2018. 41(Supplement 1): p. S4.
2. International Diabetes Federation, IDF Diabetes Atlas 6th edition. 2014, Brussel, Belgium: International Diabetes Federation.
3. World Health Organization, Definition, diagnosis and classification of diabetes mellitus and its complications: report of a WHO consultation. Part 1, Diagnosis and classification of diabetes mellitus. 1999.
4. Lloyd-Jones, D., et al., Executive summary: Heart disease and stroke statistics-2010 update: A report from the american heart association. Circulation, 2010. 121(7): p. e46-e215.
5. American Diabetes Association, Expert Committee on the Diagnosis and Classification of Diabetes Mellitus. Report of the expert committee on the diagnosis and classification of diabetes mellitus. Diabetes Care, 2003. 26(1): p. S5-S20.
6. Benjamin, M., Miller-Keane Encyclopedia and Dictionary of Medicine. Nursing and Allied Health. Philadelphia: Saunders, 1997.
7. Zimmet, P.Z., Diabetes and its drivers: the largest epidemic in human history? Clin Diabetes Endocrinol, 2017. 3: p. 1.
8. International Diabetes Federation, IDF Diabetes Atlas, 8 edn. Brussels, Belgium, 2017.
9. Riddle, M.C. and W.H. Herman, The Cost of Diabetes Care—An Elephant in the Room. Diabetes Care, 2018. 41(5): p. 929-932.
10. American Diabetes Association, Economic Costs of Diabetes in the US in 2017. Diabetes Care, 2018. 41(5): p. 917-928.
11. Magliano, D.J., et al., The productivity burden of diabetes at a population level. Diabetes care, 2018. 41(5): p. 979-984.
12. Bommer, C., et al., Global economic burden of diabetes in adults: projections from 2015 to 2030. Diabetes care, 2018. 41(5): p. 963-970.
13. Wild, S., et al., Global prevalence of diabetes: estimates for the year 2000 and projections for 2030. Diabetes Care, 2004. 27(5): p. 1047-53.
14. Zheng, Y., S.H. Ley, and F.B. Hu, Global aetiology and epidemiology of type 2 diabetes mellitus and its complications. Nat Rev Endocrinol, 2018. 14(2): p. 88-98.
15. Hu, F.B., Globalization of diabetes: the role of diet, lifestyle, and genes. Diabetes care, 2011. 34(6): p. 1249-1257.
16. Paul, S.K., et al., Comparison of body mass index at diagnosis of diabetes in a multi-ethnic population: A case-control study with matched non-diabetic controls. Diabetes Obes Metab, 2017. 19(7): p. 1014-1023.
17. Xu, Y., et al., Prevalence and control of diabetes in Chinese adults. Jama, 2013. 310(9): p. 948-59.
18. Rayanagoudar, G., et al., Quantification of the type 2 diabetes risk in women with gestational diabetes: a systematic review and meta-analysis of 95,750 women. Diabetologia, 2016. 59(7): p. 1403-1411.
19. Wendland, E.M., et al., Gestational diabetes and pregnancy outcomes--a systematic review of the World Health Organization (WHO) and the International Association of Diabetes in Pregnancy Study Groups (IADPSG) diagnostic criteria. BMC Pregnancy Childbirth, 2012. 12: p. 23.
20. Bellamy, L., et al., Type 2 diabetes mellitus after gestational diabetes: a systematic review and meta-analysis. Lancet, 2009. 373(9677): p. 1773-9.
21. Clausen, T.D., et al., High prevalence of type 2 diabetes and pre-diabetes in adult offspring of women with gestational diabetes mellitus or type 1 diabetes: the role of intrauterine hyperglycemia. Diabetes Care, 2008. 31(2): p. 340-6.
22. Turnbull, F.M., et al., Intensive glucose control and macrovascular outcomes in type 2 diabetes. Diabetologia, 2009. 52(11): p. 2288-98.
23. Stratton, I.M., et al., Association of glycaemia with macrovascular and microvascular complications of type 2 diabetes (UKPDS 35): prospective observational study. Bmj, 2000. 321(7258): p. 405-12.
24. Zoungas, S., et al., Effects of intensive glucose control on microvascular outcomes in patients with type 2 diabetes: a meta-analysis of individual participant data from randomised controlled trials. Lancet Diabetes Endocrinol, 2017. 5(6): p. 431-437.
25. Mannucci, E., et al., Is Glucose Control Important for Prevention of Cardiovascular Disease in Diabetes? Diabetes Care, 2013. 36(Suppl 2): p. S259-S263.
26. Holman, R.R., et al., 10-year follow-up of intensive glucose control in type 2 diabetes. NEJM, 2008. 359(15): p. 1577-89.
27. Holman, R.R., et al., Long-term follow-up after tight control of blood pressure in type 2 diabetes. NEJM, 2008. 359(15): p. 1565-76.
28. Zimmet, P., Preventing diabetic complications: a primary care perspective. Diabetes research and clinical practice, 2009. 84(2): p. 107-116.
29. Cade, W.T., Diabetes-related microvascular and macrovascular diseases in the physical therapy setting. Physical therapy, 2008. 88(11): p. 1322-1335.
30. Bourne, R.R., et al., Causes of vision loss worldwide, 1990–2010: a systematic analysis. The lancet global health, 2013. 1(6): p. e339-e349.
31. UK Prospective Diabetes Study (UKPDS) Group, Effect of intensive blood-glucose control with metformin on complications in overweight patients with type 2 diabetes (UKPDS 34). The Lancet, 1998. 352(9131): p. 854-865.
32. Patel, A., et al., Intensive blood glucose control and vascular outcomes in patients with type 2 diabetes. New England Journal of Medicine, 2008. 358(24): p. 2560-2572.
33. Skyler, J.S., et al., Intensive Glycemic Control and the Prevention of Cardiovascular Events: Implications of the ACCORD, ADVANCE, and VA Diabetes Trials. A Position Statement of the American Diabetes Association and a Scientific Statement of the American College of Cardiology Foundation and the American Heart Association. Journal of the American College of Cardiology, 2009. 53(3): p. 298-304.
34. de Boer, I.H., et al., Long-term renal outcomes of patients with type 1 diabetes mellitus and microalbuminuria: an analysis of the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications cohort. Arch Intern Med, 2011. 171(5): p. 412-20.
35. Fox, C.S., et al., Increasing cardiovascular disease burden due to diabetes mellitus: the Framingham Heart Study. Circulation, 2007. 115(12): p. 1544-50.
36. American Heart Association, Diabetes mellitus: a major risk factor for cardiovascular disease. 1999: American Heart Association.
37. Booth, G.L., et al., Relation between age and cardiovascular disease in men and women with diabetes compared with non-diabetic people: a population-based retrospective cohort study. Lancet, 2006. 368(9529): p. 29-36.
38. Booth, G.L., et al., Recent trends in cardiovascular complications among men and women with and without diabetes. Diabetes Care, 2006. 29(1): p. 32-7.
39. Benjamin, E.J., et al., Heart Disease and Stroke Statistics-2017 Update: A Report From the American Heart Association. Circulation, 2017. 135(10): p. e146-e603.
40. Goff, D.C., et al., Prevention of cardiovascular disease in persons with type 2 diabetes mellitus: current knowledge and rationale for the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial. American Journal of Cardiology, 2007. 99(12): p. S4-S20.
41. Shah, A.D., et al., Type 2 diabetes and incidence of cardiovascular diseases: a cohort study in 1·9 million people. The Lancet Diabetes & Endocrinology, 2015. 3(2): p. 105-113.
42. Del Prato, S., Megatrials in type 2 diabetes. From excitement to frustration? Diabetologia, 2009. 52(7): p. 1219-26.
43. Ismail-Beigi, F., et al., Effect of intensive treatment of hyperglycaemia on microvascular outcomes in type 2 diabetes: an analysis of the ACCORD randomised trial. The Lancet, 2010. 376(9739): p. 419-430.
44. Fox, C.S., et al., Update on prevention of cardiovascular disease in adults with type 2 diabetes mellitus in light of recent evidence: A scientific statement from the American Heart Association and the American Diabetes Association. Circulation, 2015. 132(8): p. 691-718.
45. Khunti, K., M. Kosiborod, and K.K. Ray, Legacy benefits of blood glucose, blood pressure and lipid control in people with diabetes and cardiovascular disease: Time to overcome multifactorial therapeutic inertia? Diabetes, Obesity and Metabolism, 2018.
46. Holman, R.R., et al., 10-year follow-up of intensive glucose control in type 2 diabetes. N Engl J Med, 2008. 359(15): p. 1577-89.
47. Diabetes Atlas, International diabetes federation. Press Release, Cape Town, South Africa, 2006. 4.
48. Inzucchi, S.E., et al., Management of hyperglycemia in type 2 diabetes: a patient-centered approach: position statement of the American Diabetes Association (ADA) and the European Association for the Study of Diabetes (EASD). Diabetes Care, 2012. 35(6): p. 1364-79.
49. Turner, R.C., et al., Glycemic control with diet, sulfonylurea, metformin, or insulin in patients with type 2 diabetes mellitus: progressive requirement for multiple therapies (UKPDS 49). UK Prospective Diabetes Study (UKPDS) Group. JAMA, 1999. 281(21): p. 2005-12.
50. Nathan, D.M., et al., Medical Management of Hyperglycemia in Type 2 Diabetes: A Consensus Algorithm for the Initiation and Adjustment of Therapy. Diabetes Care, 2009. 32(1): p. 193-203.
51. American Diabetes Association, Standards of Medical Care in Diabetes—2011. Diabetes Care, 2011. 34(Supplement 1): p. S11-S61.
52. Adler, A.I., et al., Newer agents for blood glucose control in type 2 diabetes: summary of NICE guidance. BMJ, 2009. 338.
53. Ross, S.A. and J.-M. Ekoé, Incretin agents in type 2 diabetes. Canadian Family Physician, 2010. 56(7): p. 639-648.
54. Hædersdal, S., et al. The Role of Glucagon in the Pathophysiology and Treatment of Type 2 Diabetes. in Mayo Clinic Proceedings. 2018. Elsevier.
55. Drucker, D.J. and A.B. Goldfine, Cardiovascular safety and diabetes drug development. The Lancet, 2011. 377(9770): p. 977-979.
56. Garber, A.J., Novel GLP-1 receptor agonists for diabetes. Expert Opinion on Investigational Drugs, 2012. 21(1): p. 45-57.
57. Holst, J.J., The physiology of glucagon-like peptide 1. Physiological Reviews, 2007. 87(4): p. 1409-1439.
58. Nauck, M.A., et al., Incretin-based therapies: viewpoints on the way to consensus. Diabetes Care, 2009. 32(Suppl 2): p. S223-231.
59. Stonehouse, A.H., T. Darsow, and D.G. Maggs, Incretin-Based Therapies. Journal of Diabetes, 2011: p. In press.
60. Baggio, L.L. and D.J. Drucker, Biology of Incretins: GLP-1 and GIP. Gastroenterology, 2007. 132(6): p. 2131-2157.
61. Charbonnel, B. and B. Cariou, Pharmacological management of type 2 diabetes: the potential of incretin-based therapies. Diabetes, Obesity and Metabolism, 2011. 13(2): p. 99-117.
62. Smilowitz, N.R., R. Donnino, and A. Schwartzbard, Glucagon-Like Peptide-1 Receptor Agonists for Diabetes Mellitus: A Role in Cardiovascular Disease. Circulation, 2014. 129(22): p. 2305-2312.
63. Scheen, A.J., Cardiovascular outcome studies with incretin-based therapies: comparison between DPP-4 inhibitors and GLP-1 receptor agonists. Diabetes research and clinical practice, 2017. 127: p. 224-237.
64. Nauck, M.A., et al., Incretin-based therapies: viewpoints on the way to consensus. Diabetes care, 2009. 32(suppl 2): p. S223-S231.
65. Neumiller, J.J., Incretin-based therapies. Medical Clinics, 2015. 99(1): p. 107-129.
66. Stonehouse, A.H., T. Darsow, and D.G. Maggs, Incretin‐based therapies. Journal of diabetes, 2012. 4(1): p. 55-67.
67. Gough, S., Handbook of Incretin-based Therapies in Type 2 Diabetes. 2016: Springer.
68. Madsbad, S., Review of head‐to‐head comparisons of glucagon‐like peptide‐1 receptor agonists. Diabetes, Obesity and Metabolism, 2016. 18(4): p. 317-332.
69. Munir, K.M. and E.M. Lamos, Diabetes type 2 management: what are the differences between DPP-4 inhibitors and how do you choose? 2017, Taylor & Francis.
70. Craddy, P., H.-J. Palin, and K.I. Johnson, Comparative Effectiveness of Dipeptidylpeptidase-4 Inhibitors in Type 2 Diabetes: A Systematic Review and Mixed Treatment Comparison. Diabetes Therapy, 2014. 5(1): p. 1-41.
71. Keshavarz, K., et al., Linagliptin versus sitagliptin in patients with type 2 diabetes mellitus: a network meta-analysis of randomized clinical trials. DARU Journal of Pharmaceutical Sciences, 2017. 25(1): p. 23.
72. Sivertsen, J., et al., The effect of glucagon-like peptide 1 on cardiovascular risk. Nat Rev Cardiol, 2012. 9(4): p. 209-22.
73. Mora, P.F. and E.L. Johnson, Cardiovascular Outcome Trials Of The Incretin-Based Therapies: What Do We Know So Far? Endocr Pract, 2017. 23(1): p. 89-99.
74. Ha, S.J., et al., Preventive Effects of Exenatide on Endothelial Dysfunction Induced by Ischemia-Reperfusion Injury via KATP Channels. Arteriosclerosis, Thrombosis, and Vascular Biology, 2012. 32(2): p. 474-480.
75. Nikolaidis, L.A., et al., Effects of Glucagon-Like Peptide-1 in Patients With Acute Myocardial Infarction and Left Ventricular Dysfunction After Successful Reperfusion. Circulation, 2004. 109(8): p. 962-965.
76. Ban, K., et al., Cardioprotective and Vasodilatory Actions of Glucagon-Like Peptide 1 Receptor Are Mediated Through Both Glucagon-Like Peptide 1 Receptor–Dependent and –Independent Pathways. Circulation, 2008. 117(18): p. 2340-2350.
77. Chilton, R., et al., Cardiovascular Comorbidities of Type 2 Diabetes Mellitus: Defining the Potential of Glucagonlike peptide–1-Based Therapies. The American Journal of Medicine, 2011. 124(1, Supplement): p. S35-S53.
78. Song, X., et al., Anti-atherosclerotic effects of the glucagon-like peptide-1 (GLP-1) based therapies in patients with type 2 Diabetes Mellitus: A meta-analysis. Scientific reports, 2015. 5.
79. Noyan-Ashraf, M.H., et al., GLP-1R Agonist Liraglutide Activates Cytoprotective Pathways and Improves Outcomes After Experimental Myocardial Infarction in Mice. Diabetes, 2009. 58(4): p. 975-983.
80. Bunck, M.C., et al., One-year treatment with exenatide vs. Insulin Glargine: Effects on postprandial glycemia, lipid profiles, and oxidative stress. Atherosclerosis, 2010. 212(1): p. 223-229.
81. Schwartz, E.A., et al., Exenatide suppresses postprandial elevations in lipids and lipoproteins in individuals with impaired glucose tolerance and recent onset type 2 diabetes mellitus. Atherosclerosis, 2010. 212(1): p. 217-222.
82. Buse, J.B., et al., Switching to Once-Daily Liraglutide From Twice-Daily Exenatide Further Improves Glycemic Control in Patients With Type 2 Diabetes Using Oral Agents. Diabetes Care, 2010. 33(6): p. 1300-1303.
83. Vilsbøll, T., et al., Effects of glucagon-like peptide-1 receptor agonists on weight loss: systematic review and meta-analyses of randomised controlled trials. BMJ, 2012. 344.
84. Sun, F., et al., Effect of glucagon-like peptide-1 receptor agonists on lipid profiles among type 2 diabetes: a systematic review and network meta-analysis. Clin Ther, 2015. 37(1): p. 225-241.e8.
85. Katout, M., et al., Effect of GLP-1 mimetics on blood pressure and relationship to weight loss and glycemia lowering: results of a systematic meta-analysis and meta-regression. Am J Hypertens, 2014. 27(1): p. 130-9.
86. Food and Drug Administration. Use of Real-World Evidence To Support Regulatory Decision-Making for Medical Devices; Guidance for Industry and Food and Drug Administration Staff. 2017; Available from: https://www.fda.gov/downloads/medicaldevices/deviceregulationandguidance/guidancedocuments/ucm513027.pdf.
87. Sherman, R.E., et al., Real-world evidence—what is it and what can it tell us. N Engl J Med, 2016. 375(23): p. 2293-2297.
88. Franklin, J.M. and S. Schneeweiss, When and how can real world data analyses substitute for randomized controlled trials? Clinical Pharmacology & Therapeutics, 2017. 102(6): p. 924-933.
89. Paul, S.K., et al., Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovascular diabetology, 2015. 14(1): p. 100.
90. Khunti, K., et al., Clinical inertia with regard to intensifying therapy in people with type 2 diabetes treated with basal insulin. Diabetes, Obesity and Metabolism, 2016.
91. Giugliano, D., et al., Comment on Edelman and Polonsky. Type 2 Diabetes in the Real World: The Elusive Nature of Glycemic Control. Diabetes Care 2017;40:1425–1432. Diabetes Care, 2018. 41(2): p. e17.
92. Edelman, S.V. and W.H. Polonsky, Type 2 Diabetes in the Real World: The Elusive Nature of Glycemic Control. Diabetes Care, 2017. 40(11): p. 1425-1432.
93. Crapo, J., Big Data in Healthcare: Separating The Hype From The Reality. HealthCatalyst, 2015: p. 5.
94. Crawford, A.G., et al., Comparison of GE Centricity Electronic Medical Record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Popul Health Manag, 2010. 13(3): p. 139-50.
95. Grabenbauer, L., A. Skinner, and J. Windle, Electronic Health Record Adoption - Maybe It's not about the Money: Physician Super-Users, Electronic Health Records and Patient Care. Appl Clin Inform, 2011. 2(4): p. 460-71.
96. Birkhead, G.S., M. Klompas, and N.R. Shah, Uses of electronic health records for public health surveillance to advance public health. Annu Rev Public Health, 2015. 36: p. 345-59.
97. Coloma, P.M., et al., Combining electronic healthcare databases in Europe to allow for large‐scale drug safety monitoring: the EU‐ADR Project. Pharmacoepidemiology and drug safety, 2011. 20(1): p. 1-11.
98. Kosiborod, M., et al., Lower Risk of Heart Failure and Death in Patients Initiated on Sodium-Glucose Cotransporter-2 Inhibitors Versus Other Glucose-Lowering Drugs: The CVD-REAL Study (Comparative Effectiveness of Cardiovascular Outcomes in New Users of Sodium-Glucose Cotransporter-2 Inhibitors). Circulation, 2017. 136(3): p. 249-259.
99. Verheij, A.R., et al., Possible Sources of Bias in Primary Care Electronic Health Record Data Use and Reuse. J Med Internet Res, 2018. 20(5): p. e185.
100. Regier, E.E., M.V. Venkat, and K.L. Close, More than 7 years of hindsight: revisiting the FDA’s 2008 guidance on cardiovascular outcomes trials for Type 2 diabetes medications. Clinical Diabetes, 2016. 34(4): p. 173-180.
101. Nissen, S.E. and K. Wolski, Effect of rosiglitazone on the risk of myocardial infarction and death from cardiovascular causes. New England Journal of Medicine, 2007. 356(24): p. 2457-2471.
102. Scirica, B.M., et al., Saxagliptin and cardiovascular outcomes in patients with type 2 diabetes mellitus. New England Journal of Medicine, 2013. 369(14): p. 1317-1326.
103. McMurray, J.J.V., et al., Heart failure: a cardiovascular outcome in diabetes that can no longer be ignored. The Lancet Diabetes & Endocrinology, 2015. 2(10): p. 843-851.
104. Marso, S.P., et al., Liraglutide and cardiovascular outcomes in type 2 diabetes. New England Journal of Medicine, 2016. 375(4): p. 311-322.
105. Monami, M., et al., Dipeptidyl peptidase‐4 inhibitors and cardiovascular risk: a meta‐analysis of randomized clinical trials. Diabetes, Obesity and Metabolism, 2013. 15(2): p. 112-120.
106. Monami, M., et al., Effects of glucagon‐like peptide‐1 receptor agonists on cardiovascular risk: a meta‐analysis of randomized clinical trials. Diabetes, Obesity and Metabolism, 2014. 16(1): p. 38-47.
107. Wu, S., et al., The cardiovascular effect of incretin-based therapies among type 2 diabetes: a systematic review and network meta-analysis. Expert opinion on drug safety, 2018: p. 1-7.
108. Gamble, J.M., et al., Incretin-based medications for type 2 diabetes: an overview of reviews. Diabetes Obes Metab, 2015. 17(7): p. 649-58.
109. Patorno, E., et al., Comparative Cardiovascular Safety of Glucagon-Like Peptide-1 Receptor Agonists versus Other Antidiabetic Drugs in Routine Care: a Cohort Study. Diabetes, Obesity and Metabolism, 2016: p. n/a-n/a.
110. Kannan, S., et al., Risk of overall mortality and cardiovascular events in patients with type 2 diabetes on dual drug therapy including metformin: A large database study from the Cleveland Clinic. J Diabetes, 2016. 8(2): p. 279-85.
111. d’Agostino, R.B., Tutorial in biostatistics: propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Stat Med, 1998. 17(19): p. 2265-2281.
112. Rubin, D.B., Using multivariate matched sampling and regression adjustment to control bias in observational studies. ETS Research Report Series, 1978. 1978(2).
113. Hade, E.M. and B. Lu, Bias associated with using the estimated propensity score as a regression covariate. Statistics in medicine, 2014. 33(1): p. 74-87.
114. Zghebi, S., et al., Comparative risk of major cardiovascular events associated with second‐line antidiabetic treatments: a retrospective cohort study using UK primary care data linked to hospitalization and mortality records. Diabetes, Obesity and Metabolism, 2016. 18(9): p. 916-924.
115. Gamble, J.-M., et al., Comparative effectiveness of incretin-based therapies and the risk of death and cardiovascular events in 38,233 metformin monotherapy users. Medicine, 2016. 95(26).
116. Paul, S.K., et al., The association of the treatment with glucagon-like peptide-1 receptor agonist exenatide or insulin with cardiovascular outcomes in patients with type 2 diabetes: A retrospective observational study. Cardiovascular Diabetology, 2015. 14(1).
117. Best, J.H., et al., Risk of cardiovascular disease events in patients with type 2 diabetes prescribed the Glucagon-Like Peptide 1 (GLP-1) receptor agonist exenatide twice daily or other glucose-lowering therapies: A retrospective analysis of the lifelink database. Diabetes Care, 2011. 34(1): p. 90-95.
118. Velez, M., et al., Association of antidiabetic medications targeting the glucagon-like peptide 1 pathway and heart failure events in patients with diabetes. J Card Fail, 2015. 21(1): p. 2-8.
119. Mogensen, U.M., et al., Cardiovascular safety of combination therapies with incretin-based drugs and metformin compared with a combination of metformin and sulphonylurea in type 2 diabetes mellitus - a retrospective nationwide study. Diabetes Obes Metab, 2014.
120. Fu, A.Z., et al., Association Between Hospitalization for Heart Failure and Dipeptidyl Peptidase-4 Inhibitors in Patients With Type 2 Diabetes: An Observational Study. Diabetes care, 2016: p. dc150764.
121. Weir, D.L., et al., Sitagliptin use in patients with diabetes and heart failure: a population-based retrospective cohort study. JACC Heart Fail, 2014. 2(6): p. 573-82.
122. Scheller, N.M., et al., All-cause mortality and cardiovascular effects associated with the DPP-IV inhibitor sitagliptin compared with metformin, a retrospective cohort study on the Danish population. Diabetes Obes Metab, 2014. 16(3): p. 231-6.
123. Kim, S.C., et al., Dipeptidyl peptidase-4 inhibitors do not increase the risk of cardiovascular events in type 2 diabetes: a cohort study. Acta Diabetol, 2014. 51(6): p. 1015-23.
124. Wang, K.L., et al., Sitagliptin and the risk of hospitalization for heart failure: a population-based study. Int J Cardiol, 2014. 177(1): p. 86-90.
125. Morgan, C.L., et al., Combination therapy with metformin plus sulphonylureas versus metformin plus DPP-4 inhibitors: association with major adverse cardiovascular events and all-cause mortality. Diabetes Obes Metab, 2014. 16(10): p. 977-83.
126. GE Healthcare, Centricity Electronic Medical Record Brochure. 2011.
127. Brixner, D., et al., Six‐month outcomes on A1C and cardiovascular risk factors in patients with type 2 diabetes treated with exenatide in an ambulatory care setting. Diabetes, Obesity and Metabolism, 2009. 11(12): p. 1122-1130.
128. Brixner, D., et al., Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value in health, 2007. 10: p. S29-S36.
129. Inzucchi, S., et al., Progression to insulin therapy among patients with type 2 diabetes treated with sitagliptin or sulphonylurea plus metformin dual therapy. Diabetes, Obesity and Metabolism, 2015. 17(10): p. 956-964.
130. Levin, P., et al., Therapeutically interchangeable? A study of real‐world outcomes associated with switching basal insulin analogues among US patients with type 2 diabetes mellitus using electronic medical records data. Diabetes, Obesity and Metabolism, 2015. 17(3): p. 245-253.
131. Chitnis, A.S., et al., Clinical effectiveness of liraglutide across body mass index in patients with type 2 diabetes in the United States: a retrospective cohort study. Advances in therapy, 2014. 31(9): p. 986-999.
132. Davis, K.L., et al., Real-world comparative outcomes of US type 2 diabetes patients initiating analog basal insulin therapy. Current medical research and opinion, 2013. 29(9): p. 1083-1091.
133. Ashton, V., et al., LDL-C levels in US patients at high cardiovascular risk receiving rosuvastatin monotherapy. Clinical therapeutics, 2014. 36(5): p. 792-799.
134. Chopra, I. and K.M. Kamal, Factors associated with therapeutic goal attainment in patients with concomitant hypertension and dyslipidemia. Hospital Practice, 2014. 42(2): p. 77-88.
135. Saseen, J.J., et al., Maintaining goal blood pressures after switching from olmesartan to other angiotensin receptor blockers. The Journal of Clinical Hypertension, 2013. 15(12): p. 888-892.
136. Crawford, A.G., et al., Prevalence of obesity, type II diabetes mellitus, hyperlipidemia, and hypertension in the United States: findings from the GE Centricity Electronic Medical Record database. Population health management, 2010. 13(3): p. 151-161.
137. Brixner, D., et al., Evaluation of cardiovascular risk factors, events, and costs across four BMI categories. Obesity, 2013. 21(6): p. 1284-1292.
138. DerSarkissian, M., et al., Maintenance of weight loss or stability in subjects with obesity: a retrospective longitudinal analysis of a real-world population. Current Medical Research and Opinion, 2017. 33(6): p. 1105-1110.
139. Tandon, N., et al., PSY64 GE Centricity® Electronic Medical Records Study: Comorbidities And Biologic Experience Among Patients Receiving Golimumab. Value in Health, 2011. 14(3): p. A70-A71.
140. Wang, J., et al., New diagnosis of hypertension among celecoxib and nonselective NSAID users: a population-based cohort study. Annals of Pharmacotherapy, 2007. 41(6): p. 937-943.
141. Rajagopalan, V., et al., SAT0069 Performance of the Framingham Cardiovascular Risk Prediction Model with and without C-Reactive Protein or Erythrocyte Sedimentation Rate in RA: Analysis of US Electronic Medical Records Database. Annals of the Rheumatic Diseases, 2014. 73(Suppl 2): p. 615-615.
142. Paul, S.K., et al. Effectiveness of biologic and non-biologic antirheumatic drugs on anaemia markers in 153,788 patients with rheumatoid arthritis: new evidence from real-world data. in Seminars in arthritis and rheumatism. 2018. Elsevier.
143. Marelli, C., et al., Statins and risk of cancer: a retrospective cohort analysis of 45,857 matched pairs from an electronic medical records database of 11 million adult Americans. Journal of the American College of Cardiology, 2011. 58(5): p. 530-537.
144. Talal, A., et al., Absolute and relative contraindications to pegylated‐interferon or ribavirin in the US general patient population with chronic hepatitis C: results from a US database of over 45 000 HCV‐infected, evaluated patients. Alimentary pharmacology & therapeutics, 2013. 37(4): p. 473-481.
145. Unni, S., et al., Hypertension control and antihypertensive therapy in patients with chronic kidney disease. American journal of hypertension, 2014. 28(6): p. 814-822.
146. Patel, A., et al., Care Provision and Prescribing Practices of Physicians Treating Children and Adolescents With ADHD. Psychiatric Services, 2017: p. appi.ps.201600130.
147. World Health Organization Collaborating Centre for Drug Statistics Methodology. ATC. 2011; Available from: https://www.whocc.no/atc/structure_and_principles/.
148. US Food and Drug Administration. FDA Approved Drug Products. 2017; Available from: https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm.
149. US Food and Drug Administration, FDA approves weight-management drug Saxenda. 2014.
150. Hampp, C., et al., Use of antidiabetic drugs in the US, 2003–2012. Diabetes care, 2014. 37(5): p. 1367-1374.
151. Yurkovich, M., et al., A systematic review identifies valid comorbidity indices derived from administrative health data. Journal of Clinical Epidemiology, 2015. 68(1): p. 3-14.
152. Needham, D.M., et al., A systematic review of the Charlson comorbidity index using Canadian administrative databases: a perspective on risk adjustment in critical care research. Journal of critical care, 2005. 20(1): p. 12-19.
153. Elixhauser, A., et al., Comorbidity measures for use with administrative data. Medical care, 1998. 36(1): p. 8-27.
154. Von Korff, M., E.H. Wagner, and K. Saunders, A chronic disease score from automated pharmacy data. Journal of clinical epidemiology, 1992. 45(2): p. 197-203.
155. Khan, N.F., et al., Adaptation and validation of the Charlson Index for Read/OXMIS coded databases. BMC family practice, 2010. 11(1): p. 1.
156. Bannay, A., et al., The best use of the charlson comorbidity index with electronic health care database to predict mortality. Medical care, 2016. 54(2): p. 188-194.
157. Charlson, M.E., et al., A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. Journal of chronic diseases, 1987. 40(5): p. 373-383.
158. Quan, H., et al., Updating and validating the Charlson comorbidity index and score for risk adjustment in hospital discharge abstracts using data from 6 countries. American journal of epidemiology, 2011. 173(6): p. 676-682.
159. Denti, L., et al., Validity of the modified Charlson comorbidity index as predictor of short-term outcome in older stroke patients. Journal of Stroke and Cerebrovascular Diseases, 2015. 24(2): p. 330-336.
160. Quan, H., et al., Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Medical care, 2005: p. 1130-1139.
161. Deyo, R.A., D.C. Cherkin, and M.A. Ciol, Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. Journal of clinical epidemiology, 1992. 45(6): p. 613-619.
162. U.S. National Library of Medicine. SNOMED CT to ICD-10-CM Map. 2016; Available from: https://www.nlm.nih.gov/research/umls/mapping_projects/snomedct_to_icd10cm.html.
163. Kamal, K.M., et al., Use of electronic medical records for clinical research in the management of type 2 diabetes. Res Social Adm Pharm, 2014. 10(6): p. 877-84.
164. Su, C.C., et al., Risk of diabetes in patients with rheumatoid arthritis: a 12-year retrospective cohort study. J Rheumatol, 2013. 40(9): p. 1513-8.
165. Seidu, S., et al., Prevalence and characteristics in coding, classification and diagnosis of diabetes in primary care. Postgraduate medical journal, 2014. 90(1059): p. 13-17.
166. Shivade, C., et al., A review of approaches to identifying patient phenotype cohorts using electronic health records. Journal of the American Medical Informatics Association, 2014. 21(2): p. 221-230.
167. Tapak, L., et al., Real-Data Comparison of Data Mining Methods in Prediction of Diabetes in Iran. Healthcare Informatics Research, 2013. 19(3): p. 177-185.
168. Mani, S., et al., Type 2 diabetes risk forecasting from EMR data using machine learning. AMIA Annu Symp Proc, 2012. 2012: p. 606-15.
169. American Diabetes Association, Standards of Medical Care in Diabetes—2017: Summary of Revisions. Diabetes Care, 2017. 40(Supplement 1): p. S4-S5.
170. Witten, I.H., et al., Data Mining: Practical machine learning tools and techniques. 2016: Morgan Kaufmann.
171. Leung, K.M., Naive bayesian classifier. Polytechnic University Department of Computer Science/Finance and Risk Engineering, 2007.
172. Ng, A.Y. and M.I. Jordan. On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. in Advances in neural information processing systems. 2002.
173. Cristianini, N. and J. Shawe-Taylor, An introduction to support vector machines and other kernel-based learning methods. 2000: Cambridge university press.
174. Murtagh, F., Multilayer perceptrons for classification and regression. Neurocomputing, 1991. 2(5): p. 183-197.
175. Bhargava, N., et al., Decision tree analysis on j48 algorithm for data mining. Proceedings of International Journal of Advanced Research in Computer Science and Software Engineering, 2013. 3(6).
176. Holte, R.C., Very simple classification rules perform well on most commonly used datasets. Machine learning, 1993. 11(1): p. 63-90.
177. Centers for Disease Control and Prevention, National diabetes statistics report: estimates of diabetes and its burden in the United States, 2018. Atlanta, GA: US Department of Health and Human Services, 2018.
178. Paul, S., J. Shaw, and K. Klein. Therapeutic inertia for glycaemic and blood pressure control in patients with type 2 diabetes mellitus and the cardiovascular consequences. in Diabetologia. 2015. Springer 233 Spring St, New York, NY 10013 USA.
179. Mata‐Cases, M., et al., Therapeutic inertia in patients treated with two or more antidiabetics in primary care: Factors predicting intensification of treatment. Diabetes, Obesity and Metabolism, 2018. 20(1): p. 103-112.
180. Von Elm, E., et al., The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: guidelines for reporting observational studies. International journal of surgery, 2014. 12(12): p. 1495-1499.
APPENDIX A
ORIGINAL ARTICLE
Addition of or switch to insulin therapy in people treated with glucagon-like peptide-1 receptor agonists: A real-world study in 66 583 patients
Olga Montvida MSc1,2 | Kerenaftali Klein PhD1 | Sudhesh Kumar MD3 |
Kamlesh Khunti PhD4 | Sanjoy K. Paul PhD1
1Clinical Trials and Biostatistics Unit, QIMR
Berghofer Medical Research Institute,
Brisbane, Australia
2School of Biomedical Sciences, Institute of
Health and Biomedical Innovation, Faculty of
Health, Queensland University of Technology,
Brisbane, Australia
3Warwick Medical School, University of
Warwick, and University Hospitals Coventry
and Warwickshire, Coventry, UK
4Diabetes Research Centre, Leicester Diabetes
Centre, University of Leicester, Leicester, UK
Corresponding Author: Sanjoy K. Paul, PhD,
Clinical Trials and Biostatistics Unit, QIMR
Berghofer Medical Research Institute,
300 Herston Road, Herston, QLD 4006
Brisbane, Australia (Sanjoy.
Funding Information
No separate funding was obtained for this
study.
Background: Real-world outcomes of adding or switching to insulin therapy in patients with
type 2 diabetes (T2DM) treated with glucagon-like peptide-1 receptor agonist (GLP-1RA) who
have inadequately controlled hyperglycaemia are not known.
Materials and methods: Patients with T2DM (n = 66 583) with a minimum of 6 months of
GLP-1RA treatment and without previous insulin treatment were selected. Those who added
insulin (n = 39 599) or switched to insulin after GLP-1RA cessation (n = 4706) were identified.
Adjusted changes in glycated haemoglobin (HbA1c), weight, systolic blood pressure (SBP) and
LDL cholesterol were estimated over 24 months of follow-up.
Results: Among those who continued with GLP-1RA treatment without adding or switching to
insulin, the highest adjusted mean HbA1c change was achieved within 6 months, with no
further glycaemic benefits observed during 24 months of follow-up. Addition of insulin within
6 months of GLP-1RA initiation was associated with 18% higher odds of achieving HbA1c <7%
at 24 months, compared with adding insulin later. At 24 months, those who added insulin
reduced HbA1c significantly by 0.55%, while no glycaemic benefit was observed in those who
switched to insulin. Irrespective of intensification with insulin, weight, SBP and LDL cholesterol
were significantly reduced by 3 kg, 3 mm Hg and 0.2 mmol/L, respectively, over 24 months.
Conclusions: Significant delay in intensification of treatment by addition of insulin is observed
in patients with T2DM inadequately controlled with GLP-1RA. Earlier addition of insulin is
associated with better glycaemic control, while switching to insulin is not clinically beneficial
during 2 years of treatment. Non-responding patients on GLP-1RA would benefit from adding
insulin therapy, rather than switching to insulin.
KEYWORDS
GLP-1 analogue, insulin therapy, pharmaco-epidemiology, type 2 diabetes
1 | INTRODUCTION
People with diabetes are at increased risk of developing disabling and
life-threatening health problems, including microvascular and macro-
vascular complications.1,2 Good control of hyperglycaemia and the
associated risk factors in type 2 diabetes (T2DM) has been shown to
reduce the risk of these complications.1,3 Thus anti-hyperglycaemic
treatment strategies should ideally also address the management of
cardiovascular risk factors, including body weight, blood pressure and
lipids.4 Novel antihyperglycaemic glucagon-like peptide-1 receptor
agonist (GLP-1RA) therapies, including exenatide (EXE) and liraglutide
(LIRA), have the potential to address these challenges.5–9
The combination of GLP-1RA and insulin treatments represents a
promising glycaemic management strategy because of the comple-
mentary mechanisms of actions of these therapies.10 Both therapies
affect body weight, but in opposite directions; while significant
weight reductions have been observed in patients treated with GLP-
1RA, insulin is known to significantly increase body weight.11,12
Received: 11 June 2016 Revised: 9 September 2016 Accepted: 11 September 2016
DOI 10.1111/dom.12790
Diabetes Obes Metab; 9999: •• wileyonlinelibrary.com/journal/dom © 2016 John Wiley & Sons Ltd 1
A meta-analysis of clinical trials conducted by Eng et al.13 showed
that the combination of a GLP-1RA with basal insulin resulted in
robust glycaemic control without increased risk of hypoglycaemia or
weight gain. The effectiveness of adding GLP-1RA to basal insulin
therapy on glucose and weight control in patients with T2DM has
also been evaluated in a number of observational and real-world
data-based studies.10,14–20 The intensification of insulin therapy by
addition of GLP-1RA, rather than adding mealtime insulin, has been
shown to be an attractive therapeutic option,13,21 and is now recom-
mended in international guidelines: “The available data now suggest
that either a GLP-1 receptor agonist or prandial insulin could be used
in this setting [in patients not achieving target glycated haemoglobin
(HbA1c)], with the former arguably safer, at least for short-term
outcomes.”21
A significant number of patients treated with GLP-1RAs also
intensify therapy by adding insulin, or switching to insulin therapy,
primarily because of sub-optimal glycaemic control; however, to the
best of our knowledge, only three observational studies (n = 44 to
n = 432) have evaluated the effectiveness of adding insulin to GLP-
1RA therapy.20 Moreover, these studies did not explore the HbA1c
trajectories over time to understand the longitudinal patterns of
glycaemic failure. Furthermore, no real-world study, to the best of
our knowledge, has explored the dynamics of changes in glycaemic
and cardiovascular risk factors from the time of GLP-1RA initiation
through to the transition phase of adding or switching to insulin ther-
apy. The real-world patterns of adding or switching to insulin therapy
from initial GLP-1RA treatment are also not well understood. Further,
amongst patients with sub-optimal glycaemic control who intensify
GLP-1RA therapy by addition of insulin, it is not known whether
earlier intensification is beneficial for sustainable glucose control.
The aims of the present longitudinal cohort study were to evalu-
ate, from the time of initiation of GLP-1RA therapy in patients with
T2DM: (1) changes in HbA1c, body weight, blood pressure and LDL
cholesterol over 6, 12 and 24 months of follow-up; (2) possible bene-
fits of adding or switching to insulin treatment; and (3) the likelihood
of clinically meaningful HbA1c reduction in those who intensified
GLP-1RA treatment with insulin earlier, compared with those who
added insulin later.
2 | MATERIALS AND METHODS
2.1 | Data source
The Centricity Electronic Medical Record (CEMR) database was used
for this study. The CEMR represents a variety of ambulatory medical
practices from 49 US states, including solo practitioners, community
clinics, academic medical centres and large integrated delivery net-
works. The CEMR database consists of >35 000 physicians and other
providers, of whom ~75% are primary care providers. The database
has been extensively used for academic research worldwide.5,22–24
The CEMR database contains detailed prescription information
with dates of prescription, including information on medications that
were purchased over the counter or prescribed outside of the EMR
network. The main medication information data set stores individual
records in the form of start/stop dates, along with several specific
fields to track treatment adjustments and alterations over time.
From >2.4 million patients with a confirmed diagnosis of T2DM,
a cohort of 134 268 patients who received treatment with GLP-1RAs
between April 2005 and October 2014 was identified. The final study
cohort of 66 583 patients was selected on the basis of the following
criteria: (1) diagnosis of diabetes from January 1990; (2) age
≥18 years at the diagnosis of diabetes; (3) no insulin therapy before
the initiation of GLP-1RA treatment; (4) no missing data on age, sex,
HbA1c and body weight at GLP-1RA initiation; and (5) minimum
6 months of continuous treatment with GLP-1RA from the first
recorded prescription date.
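The five selection criteria above can be read as a conjunction of per-patient filters. The following Python sketch illustrates that reading only. The `Patient` record, its field names and the `eligible` helper are hypothetical and do not reflect the CEMR schema or the authors' code, and "6 months" is approximated as 183 days.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional

MIN_TREATMENT_DAYS = 183  # assumption: "6 months" approximated as 183 days


@dataclass(frozen=True)
class Patient:
    # Hypothetical record layout for illustration; not the CEMR schema.
    diagnosis_date: date
    birth_date: Optional[date]
    sex: Optional[str]
    glp1ra_start: date
    glp1ra_stop: Optional[date]          # None means still on treatment
    first_insulin_date: Optional[date]   # None means never on insulin
    baseline_hba1c: Optional[float]
    baseline_weight_kg: Optional[float]


def age_in_years(birth: date, on: date) -> int:
    """Completed years of age on a given date."""
    return on.year - birth.year - ((on.month, on.day) < (birth.month, birth.day))


def eligible(p: Patient, data_end: date) -> bool:
    """Return True only if all of criteria (1) to (5) from the text hold."""
    if p.diagnosis_date < date(1990, 1, 1):                       # (1)
        return False
    if p.birth_date is None or p.sex is None:                     # (4) age, sex
        return False
    if age_in_years(p.birth_date, p.diagnosis_date) < 18:         # (2)
        return False
    if (p.first_insulin_date is not None
            and p.first_insulin_date < p.glp1ra_start):           # (3)
        return False
    if p.baseline_hba1c is None or p.baseline_weight_kg is None:  # (4) HbA1c, weight
        return False
    # Patients without a recorded stop date are followed to the end of the data.
    stop = p.glp1ra_stop if p.glp1ra_stop is not None else data_end
    return (stop - p.glp1ra_start).days >= MIN_TREATMENT_DAYS     # (5)


# Illustrative patient (invented values, not study data).
example = Patient(
    diagnosis_date=date(2005, 3, 1), birth_date=date(1960, 5, 20), sex="F",
    glp1ra_start=date(2010, 1, 1), glp1ra_stop=date(2011, 1, 1),
    first_insulin_date=None, baseline_hba1c=8.2, baseline_weight_kg=109.0,
)
```

Here `eligible(example, date(2014, 10, 31))` is true, while `replace(example, first_insulin_date=date(2009, 6, 1))` fails criterion (3) because insulin precedes GLP-1RA initiation.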
In the final study cohort, those treated with EXE and those trea-
ted with LIRA were identified. Those who added insulin therapy to
existing GLP-1RA treatment (GLP-1RA + INS), and those who
switched to insulin therapy on cessation of GLP-1RA treatment (GLP-1RA
→ INS), were identified by comparing insulin initiation dates and
GLP-1RA cessation dates. Time to addition of/switch to insulin ther-
apy was calculated as the time difference between GLP-1RA and
insulin initiation dates. Insulin therapy was identified by any insulin
regimen (basal, biphasic or prandial). The majority of the patients in
the EXE group (94%) were treated with a twice-daily EXE regimen.
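The add-versus-switch grouping described above can be sketched in Python as a date comparison. The exact rule is not given in the text, so the code assumes that insulin started before a recorded GLP-1RA cessation date (or with no cessation date) counts as addition, and insulin started on or after cessation counts as a switch. The function names and the average month length are likewise assumptions.

```python
from datetime import date
from typing import Optional

DAYS_PER_MONTH = 30.4375  # assumption: average month length (365.25 / 12)


def classify_transition(glp1ra_stop: Optional[date],
                        insulin_start: Optional[date]) -> str:
    """Assign a treatment group from GLP-1RA cessation and insulin start dates."""
    if insulin_start is None:
        return "GLP-1RA only"
    # Assumed rule: insulin begun while GLP-1RA is ongoing is an addition.
    if glp1ra_stop is None or insulin_start < glp1ra_stop:
        return "GLP-1RA + INS"
    # Otherwise insulin began on or after GLP-1RA cessation: a switch.
    return "GLP-1RA -> INS"


def months_to_insulin(glp1ra_start: date, insulin_start: date) -> float:
    """Time from GLP-1RA initiation to insulin initiation, in months."""
    return (insulin_start - glp1ra_start).days / DAYS_PER_MONTH
```

A different tie-breaking rule (for example, allowing a grace period between cessation and insulin start) would shift patients between the two groups, so this boundary is the main design choice in such a sketch.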
Demographic, clinical and laboratory information included age,
sex, ethnicity and longitudinal measures of body weight, body mass
index, systolic blood pressure (SBP), diastolic blood pressure, HbA1c,
and lipids. Clinical and laboratory data at GLP-1RA treatment initiation
(index date) were included on the basis of a 3-month window,
on or prior to the index date. Follow-up clinical and laboratory mea-
sures were arranged longitudinally on the basis of non-overlapping
6-monthly windows which were defined progressively from the time
of GLP-1RA treatment initiation. Complete information on antihyper-
glycaemic agents, antihypertensive agents, cardio-protective medica-
tions (CPMs), weight-lowering and anti-depressant drugs was
obtained, along with dates of prescriptions. The CPMs included sta-
tins, angiotensin-converting enzyme inhibitors and angiotensin II
receptor blockers. The status of different medication intakes was
defined by whether it was taken during GLP-1RA treatment, and the
treatment durations for such medications were estimated.
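The windowing of measurements described above can be illustrated with a short Python sketch. It assumes a baseline window of 91 days on or before the index date and follow-up windows of about 182.6 days each, labelled by their upper bound (6, 12, 18 and 24 months). These day counts and the function name `window_label` are assumptions for illustration, as the text does not specify how month boundaries were computed.

```python
import math
from datetime import date
from typing import Optional

BASELINE_DAYS = 91          # assumption: "3 months" before index date
WINDOW_DAYS = 365.25 / 2    # assumption: one 6-month window in days
MAX_WINDOWS = 4             # follow-up limited to 24 months


def window_label(index_date: date, measured_on: date) -> Optional[int]:
    """Return 0 for a baseline measurement, 6/12/18/24 for the follow-up
    window containing the measurement, or None if it lies outside both."""
    days = (measured_on - index_date).days
    if -BASELINE_DAYS <= days <= 0:
        return 0
    if days > 0:
        # Non-overlapping windows: (0, 6], (6, 12], (12, 18], (18, 24] months.
        k = math.ceil(days / WINDOW_DAYS)
        if k <= MAX_WINDOWS:
            return 6 * k
    return None
```

When several measurements fall in one window, a further rule is needed (for example, taking the one closest to the window end); that choice is not stated in the text and is left out here.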
2.2 | Statistical methods
While complete data on HbA1c and body weight were available at
index date (by design), the proportion of patients with missing longi-
tudinal data on body weight, SBP, LDL cholesterol and HbA1c over
24 months of follow-up ranged from 9% to 19%. The missing longitu-
dinal follow-up data were imputed using the multiple imputation
approach, with adjustments for age at index date and use of oral anti-
hyperglycaemic drugs during follow-up. All primary analyses were
conducted using the imputed data, with additional analyses con-
ducted for sensitivity assessment based on complete cases.
The mean (95% confidence interval [CI]) changes in HbA1c, body
weight, SBP and LDL cholesterol at 6, 12 and 24 months from index
date were estimated using multivariate regression models. Risk factor
changes were adjusted for age at index date, sex, ethnicity and con-
comitant antihyperglycaemic, antihypertensive and weight-lowering
treatments, weighted by the respective baseline measures, as appropri-
ate. Separate analyses for HbA1c and weight changes were conducted
for patients who continued to receive only GLP-1RA treatment over
6, 12 and 24 months, and for those who added or switched to insulin
during follow-up. Robust estimates of the CIs were obtained.
For patients with HbA1c >7.5% at the time of GLP-1RA initia-
tion, the proportions of patients who achieved HbA1c below 7% at
6, 12 and 24 months of treatment were evaluated for all groups. Pro-
portions of those who achieved weight loss ≥5% from initial body
weight at 6, 12 and 24 months after GLP-1RA initiation were also
calculated. The characteristics of the patients who added or switched
to insulin were presented at the index date and at the time of transi-
tion to insulin.
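The two proportion outcomes defined above reduce to simple counting, as in this Python sketch. It takes the baseline threshold as HbA1c ≥7.5%, following the row labels of Table 2, and the function names and input layout (pairs of baseline and follow-up values) are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def proportion_reaching_target(pairs: Iterable[Tuple[float, float]],
                               baseline_min: float = 7.5,
                               target: float = 7.0) -> Optional[float]:
    """Among patients with baseline HbA1c >= baseline_min, return the
    fraction whose follow-up HbA1c is below target (None if nobody qualifies)."""
    qualifying = [(b, f) for b, f in pairs if b >= baseline_min]
    if not qualifying:
        return None
    reached = sum(1 for _, f in qualifying if f < target)
    return reached / len(qualifying)


def lost_at_least_5_percent(baseline_kg: float, followup_kg: float) -> bool:
    """True if weight fell by at least 5% of the baseline body weight."""
    return (baseline_kg - followup_kg) / baseline_kg >= 0.05
```

The denominator is restricted to patients above the baseline threshold, which is why the percentages in Table 2 are not fractions of the full cohort.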
3 | RESULTS
In the cohort of 66 583 patients with minimum 6 months of treat-
ment with GLP-1RA, the mean (standard deviation) age was
56 (11) years, 28 959 (43%) were male, 45 291 (68%) were white,
51 719 (87%) were obese, 3404 (26%)/858 (7%) had micro-/macro-
albuminuria, and 17 415 (26%) had a history of hypertension at the
time of initiation of GLP-1RA treatment (Table 1). The use of differ-
ent medications during GLP-1RA therapy, along with their durations
of treatment, are shown in Table 1.
3.1 | Glycaemic control
With mean HbA1c of 8.2% at index date, among those who continued
with GLP-1RA treatment without adding or switching to insulin, the
highest adjusted mean HbA1c change was achieved within 6 months
(−0.73%; 95% CI −0.73, −0.71), with no further glycaemic benefits at
12 months (−0.65%; 95% CI −0.65, −0.62) or 24 months of follow-up
(−0.59%; 95% CI −0.60, −0.58; Table 2; Figure 1A; all P < 0.01). Among
patients with HbA1c ≥7.5% at index date who did not add or switch
to insulin and who continued with GLP-1RA treatment only for
12 and 24 months (n = 14 682 and n = 6825, respectively), 26% achieved
HbA1c levels below 7% at each of these time points (Table 2).
Among those who added insulin during follow-up (GLP-1RA +
INS, n = 39 599), the mean HbA1c values at index date and at the
time of adding insulin were 8.3% and 8.8%, respectively. The median
time to intensification with insulin was 3 months. Among these
patients, 84% and 71% had HbA1c levels above 7.5% and 8%,
respectively, at insulin initiation. Those who added insulin within
6 months of GLP-1RA initiation achieved significantly greater
(P < 0.001) adjusted HbA1c reduction at 24 months of follow-up
(−0.58%; 95% CI −0.61, −0.57), compared with those who added
insulin after 12 months (−0.41%; 95% CI −0.43, −0.40; Figure 1A).
Those who added insulin within 6 months of GLP-1RA initiation had
18% higher odds (odds ratio 1.18; 95% CI 1.09, 1.28; P < 0.001) of
achieving HbA1c below 7% at 24 months of follow-up, compared with
those who added insulin treatment later.
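As a consistency check on the reported odds ratio, the standard error and test statistic can be recovered from the confidence limits. This Python sketch assumes a Wald-type interval that is symmetric on the log-odds scale; the paper does not state how the interval was computed, so the result is only an approximate cross-check and `wald_from_ci` is a hypothetical helper.

```python
import math


def wald_from_ci(odds_ratio: float, lower: float, upper: float,
                 z_crit: float = 1.959964):
    """Recover SE of log(OR), the z statistic and a two-sided normal P value
    from an odds ratio and its 95% CI, assuming a Wald interval."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided P under the normal model
    return se, z, p


se, z, p = wald_from_ci(1.18, 1.09, 1.28)
```

Under this assumption the standard error of the log odds ratio is about 0.041 and the z statistic about 4.0, which agrees with the reported P < 0.001.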
The 6-monthly trajectories (mean, 95% CI) of HbA1c levels for
those who switched to insulin therapy within 24 months (n = 2483;
mean time to insulin 14 months; Table 3) are shown in Figure 1B. In
these patients, the mean HbA1c increased to 9.3% at the time of
switching to insulin from a mean HbA1c of 8.5% at GLP-1RA initiation,
and 80% of them had HbA1c above 8% at insulin initiation (Table 3).
Notably, these patients did not achieve better glycaemic control at
24 months compared with their glycaemic status at index date
(Figure 1B,C). In contrast, patients who added insulin within 24 months
of follow-up (n = 36 113; mean time to insulin 3 months; Table 3)
experienced a significant HbA1c reduction of 0.55% (95% CI 0.54,
0.57) at 24 months compared with the index date (Figure 1C). The
adjusted means (95% CI) of change in HbA1c and body weight at
6, 12 and 24 months after GLP-1RA initiation, for those who ceased
GLP-1RA after 6 months of initiation and switched to insulin between
6 and 12, 12 and 18, and 18 and 24 months, are shown in Table S1.
3.2 | Weight change
With a baseline body weight of 109 kg (Table 1), patients with a minimum
of 12 months of treatment with GLP-1RA had a significant
adjusted weight reduction (mean reduction 2.5 kg [95% CI 2.50, 2.51]),
and 24% reduced their body weight by ≥5% (Table 4). Among those
who continued GLP-1RA treatment only (without addition or switch
to insulin) for 24 months, the average weight reduction from index
date was 3.31 kg (95% CI 3.30, 3.32), and a third of patients achieved
a weight loss of ≥5%. Patients who added insulin achieved marginally
higher weight reduction (adjusted) at 12 and 24 months (2.93 and
3.40 kg, respectively; Table 4), compared with those who did not add
or switch to insulin therapy (2.50 and 3.31 kg, respectively).
3.3 | Associations of glucose and weight loss
Among patients with a minimum of 12 months of GLP-1RA treat-
ment, 78% and 67% had reductions in HbA1c and body weight,
respectively, from the index date, while 53% had reductions in both
body weight and HbA1c (similar in patients treated with LIRA and
EXE; Figure 1D). At 12 months of follow-up, 8% and 7% of patients
in the EXE and LIRA groups, respectively, had reductions in neither
HbA1c nor weight.
3.4 | Cardiovascular risk factors
With a mean 129 mm Hg SBP level at index date, only 24% had SBP
≥140 mm Hg. The adjusted average reduction in SBP was ~3 mm Hg
consistently over 6, 12 and 24 months of follow-up, and was similar
across EXE and LIRA groups (Tables 1 and 4). Among those who
switched to insulin, the mean SBP levels at index date, at the time of
moving to insulin, and at 24 months of follow-up remained stable at
130 mm Hg.
In all, 92% of patients in the study cohort were on lipid-lowering
therapy. The average reduction in LDL cholesterol was ≥0.18 mmol/L
consistently during 6, 12 and 24 months of follow-up (range of CI of
reduction: 0.17-0.24 mmol/L; Table 4), starting with a baseline LDL
cholesterol level of 2.43 mmol/L. Among patients who did not
receive any statin (n = 15 949), mean reductions in LDL cholesterol
at 6, 12 and 24 months were 0.15 mmol/L, 0.14 mmol/L and
0.17 mmol/L, respectively (all P < 0.001). Among those who switched
TABLE 1 Basic statistics on study variables at the time of initiation of exenatide or liraglutide for patients who continued glucagon-like peptide-1 receptor agonist treatment for at least 6 months, those who added insulin treatment during the follow-up, and those who switched to insulin treatment during the follow-up
Variable | ALL | EXE | LIRA | GLP-1RA + INS | GLP-1RA → INS
N | 66 583 | 44 523 | 22 060 | 39 599 | 4706
Age at GLP-1RA initiation, y | 56 (11) | 56 (11) | 56 (11) | 56 (11) | 56 (11)
Men, n (%) | 28 959 (43) | 18 917 (42) | 10 042 (46) | 17 531 (44) | 2130 (45)
Ethnicity, n (%)
White | 45 291 (68) | 29 500 (66) | 15 791 (72) | 27 231 (69) | 3311 (70)
Black | 5021 (8) | 3118 (7) | 1903 (9) | 2971 (8) | 358 (8)
Hispanic | 1465 (2) | 943 (2) | 522 (2) | 1023 (3) | 89 (2)
Asian | 534 (1) | 326 (1) | 208 (1) | 312 (1) | 30 (1)
HbA1c at diagnosis of diabetes, % | 8 (1.4) | 8 (1.4) | 8.1 (1.5) | 8.1 (1.5) | 8.2 (1.5)
HbA1c at GLP-1RA initiation, % | 8.2 (1.3) | 8.1 (1.3) | 8.3 (1.4) | 8.3 (1.4) | 8.4 (1.3)
Median (IQR) HbA1c at GLP-1RA initiation, % | 7.8 (7, 8.8) | 7.8 (7, 8.7) | 7.9 (7.1, 8.9) | 8 (7.1, 9.0) | 8.2 (7.4, 9.0)
HbA1c ≥ 7% at GLP-1RA initiation, n (%) | 60 351 (91) | 40 180 (90) | 20 171 (91) | 36 130 (91) | 4388 (93)
HbA1c ≥ 7.5% at GLP-1RA initiation, n (%) | 41 045 (62) | 26 700 (60) | 14 345 (65) | 25 691 (65) | 3388 (72)
HbA1c ≥ 8% at GLP-1RA initiation, n (%) | 30 599 (46) | 19 777 (44) | 10 822 (49) | 19 859 (50) | 2628 (56)
Weight at GLP-1RA initiation, kg | 109 (25) | 110 (25) | 109 (25) | 110 (25) | 108 (24)
BMI at GLP-1RA initiation, kg/m2 | 38 (8) | 38 (8) | 38 (8) | 38 (8) | 37 (8)
Obese at GLP-1RA initiation, n (%) | 57 927 (87) | 38 753 (87) | 18 971 (86) | 34 847 (88) | 4000 (85)
SBP at GLP-1RA initiation, mm Hg | 129 (16) | 129 (16) | 129 (16) | 129 (16) | 130 (16)
SBP ≥ 140 mm Hg at GLP-1RA initiation, n (%) | 15 728 (24) | 10 628 (24) | 5100 (23) | 9487 (24) | 1184 (25)
DBP at GLP-1RA initiation, mm Hg | 77 (10) | 77 (10) | 77 (10) | 77 (10) | 77 (10)
LDL cholesterol at GLP-1RA initiation, mmol/L | 2.43 (0.72) | 2.44 (0.73) | 2.42 (0.72) | 2.42 (0.74) | 2.47 (0.74)
LDL cholesterol ≥ 3.37 mmol/L at GLP-1RA initiation, n (%) | 5780 (9) | 3951 (9) | 1829 (8) | 3652 (9) | 450 (10)
HDL cholesterol at GLP-1RA initiation, mmol/L | 1.10 (0.31) | 1.11 (0.30) | 1.10 (0.31) | 1.10 (0.30) | 1.09 (0.29)
Median (IQR) triglycerides at GLP-1RA initiation, mmol/L | 1.69 (1.23, 2.28) | 1.69 (1.23, 2.27) | 1.71 (1.24, 2.31) | 1.71 (1.24, 2.29) | 1.75 (1.25, 2.35)
Triglyceride ≥ 1.69 mmol/L at GLP-1RA initiation, n (%) | 15 060 (51) | 9920 (50) | 5140 (51) | 10 107 (51) | 967 (53)
Micro-albuminuria, n (%) | 3404 (26) | 2126 (25) | 1278 (29) | 2481 (26) | 167 (26)
Macro-albuminuria, n (%) | 858 (7) | 532 (6) | 326 (7) | 597 (6) | 49 (8)
Hypertension, n (%) | 17 415 (26) | 12 707 (29) | 4708 (21) | 11 020 (28) | 1340 (28)
Metformin taken during the GLP-1RA treatment, n (%) | 56 035 (84) | 37 645 (85) | 18 390 (83) | 33 837 (85) | 4129 (88)
Median (IQR) metformin duration, months | 52.7 (28.2, 84.3) | 62.6 (34.9, 91.9) | 36.6 (20.8, 61.1) | 55.9 (30.3, 87.5) | 71.8 (43.6, 97.6)
Sulphonylurea taken during the GLP-1RA treatment, n (%) | 38 003 (57) | 26 719 (60) | 11 284 (51) | 23 723 (60) | 3583 (76)
Median (IQR) sulphonylurea duration, months | 32.5 (15, 60.3) | 36.8 (17, 66) | 25.4 (12.2, 45.8) | 33.5 (15.2, 61.9) | 39.8 (20.2, 67.9)
Antihypertensive taken during the GLP-1RA treatment, n (%) | 53 821 (81) | 36 610 (82) | 17 211 (78) | 32 655 (82) | 4032 (86)
Median (IQR) antihypertensive duration, months | 46.5 (23.8, 80.1) | 54.7 (28.4, 86.9) | 33.2 (17.5, 59.8) | 48.7 (25.2, 82.3) | 61.4 (33.9, 91.8)
CPM taken during the GLP-1RA treatment, n (%) | 61 145 (92) | 41 273 (93) | 19 872 (90) | 36 861 (93) | 4462 (95)
Median (IQR) CPM duration, months | 51.1 (26.6, 84.9) | 59.9 (32.7, 91.6) | 35.7 (19.4, 64.5) | 53.8 (28.5, 87.5) | 68.3 (39, 96.3)
Weight-lowering drugs taken during the GLP-1RA treatment, n (%) | 4591 (7) | 3297 (7) | 1294 (6) | 2831 (7) | 328 (7)
Median (IQR) weight-lowering duration, months | 12.6 (5.1, 28.2) | 13.1 (5.2, 29.4) | 11.9 (4.9, 25.5) | 12.1 (4.8, 27.5) | 12.6 (5.3, 28.7)
Anti-depressants taken during the GLP-1RA treatment, n (%) | 28 133 (42) | 19 865 (45) | 8268 (37) | 16 950 (43) | 2296 (49)
Median (IQR) anti-depressants duration, months | 35.6 (15.4, 68.6) | 40.6 (17.4, 74.7) | 27.2 (12.6, 51.4) | 36.2 (15.5, 70.1) | 43.4 (19.5, 78.9)
BMI, body mass index; CPM, cardio-protective medication; DBP, diastolic blood pressure; EXE, exenatide; GLP-1RA, glucagon-like peptide-1 receptor agonist; HbA1c, glycated haemoglobin; INS, insulin; IQR, interquartile range; LIRA, liraglutide; SBP, systolic blood pressure.
Values are mean (standard deviation) unless stated otherwise.
to insulin, the mean LDL cholesterol levels at index date, at the time
of moving to insulin, and at 24 months of follow-up were 2.47, 2.42
and 2.38 mmol/L, respectively.
4 | DISCUSSION
The present longitudinal cohort study of 66 583 patients with T2DM
treated with GLP-1RA suggests that: (1) significant HbA1c reductions
may be obtained within 6 months from GLP-1RA treatment initiation,
with no further glycaemic benefits likely over 24 months of follow-
up; (2) earlier intensification with insulin therapy by 6 months (when
added to GLP-1RA) is associated with 18% higher odds of lowering
HbA1c below 7% within 2 years of treatment; and (3) adding insulin,
rather than switching to it, is associated with significantly lower glucose
levels over the long term. We have also observed a clear
indication of therapeutic inertia among patients who failed to
respond to GLP-1RA therapy.
This study presents real-world evidence of significant reductions
in HbA1c, body weight, SBP and LDL cholesterol over 2 years of
follow-up in patients treated with GLP-1RAs. Initiation of GLP-1RA
treatment at lower HbA1c levels was associated with better glucose
control over 2 years of follow-up. The observed HbA1c reductions
were consistent with previous findings.15–19 While glycaemic
achievements observed within 6 months of GLP-1RA treatment initiation
were higher in clinical trials, it is recognized that effectiveness
studies based on real-world data generally provide lower
estimates of glycaemic reduction and of treatment effects.
TABLE 2 Adjusted mean (95% confidence interval) of change in glycated haemoglobin (HbA1c) at 6, 12 and 24 months after glucagon-like peptide-1 receptor agonist (GLP-1RA) initiation, for those who took a GLP-1RA for at least 6, 12 and 24 months, stratified by whether patients continued on GLP-1RA treatment only or added insulin therapy, and, for patients with HbA1c levels above 7.5% at GLP-1RA initiation, number and proportion of those whose HbA1c reduced below 7% at 6, 12 and 24 months after GLP-1RA initiation stratified by whether patients took GLP-1RA only or added insulin therapy
On GLP-1RA for | ≥6 months | ≥6 months | ≥6 months | ≥12 months | ≥12 months | ≥12 months | ≥24 months | ≥24 months | ≥24 months
Group | All | EXE | LIRA | All | EXE | LIRA | All | EXE | LIRA
N | 66 583 | 44 523 | 22 060 | 50 109 | 35 085 | 15 024 | 28 422 | 22 111 | 6311
Δ HbA1c at 6 months, mean (95% CI)
GLP-1RA only | −0.73 (−0.73, −0.71) | −0.70 (−0.71, −0.70) | −0.80 (−0.80, −0.79) | −0.75 (−0.76, −0.75) | −0.72 (−0.73, −0.72) | −0.83 (−0.81, −0.83) | −0.75 (−0.75, −0.74) | −0.73 (−0.73, −0.72) | −0.85 (−0.86, −0.85)
GLP-1RA + INS | −0.83 (−0.84, −0.83) | −0.75 (−0.77, −0.75) | −0.95 (−0.96, −0.94) | −0.82 (−0.82, −0.81) | −0.75 (−0.76, −0.75) | −0.95 (−0.96, −0.95) | −0.81 (−0.81, −0.80) | −0.77 (−0.77, −0.76) | −0.93 (−0.93, −0.92)
HbA1c < 7% at 6 months for those whose HbA1c was ≥7.5% at GLP-1RA initiation, n (%)
GLP-1RA only | 5410 (25) | 3672 (24) | 1738 (27) | 3965 (27) | 2826 (26) | 1139 (29) | 2018 (30) | 1603 (29) | 415 (34)
GLP-1RA + INS | 4156 (21) | 2236 (19) | 1920 (24) | 3451 (22) | 1987 (20) | 1464 (25) | 2338 (24) | 1594 (22) | 744 (28)
Δ HbA1c at 12 months, mean (95% CI)
GLP-1RA only | - | - | - | −0.65 (−0.65, −0.62) | −0.62 (−0.62, −0.61) | −0.71 (−0.72, −0.71) | −0.67 (−0.68, −0.67) | −0.65 (−0.67, −0.65) | −0.74 (−0.74, −0.72)
GLP-1RA + INS | - | - | - | −0.73 (−0.73, −0.72) | −0.67 (−0.67, −0.66) | −0.85 (−0.86, −0.85) | −0.74 (−0.75, −0.74) | −0.71 (−0.71, −0.70) | −0.84 (−0.84, −0.83)
HbA1c < 7% at 12 months for those whose HbA1c was ≥7.5% at GLP-1RA initiation, n (%)
GLP-1RA only | - | - | - | 3829 (26) | 2770 (26) | 1059 (27) | 1960 (29) | 1577 (28) | 383 (32)
GLP-1RA + INS | - | - | - | 3462 (22) | 2083 (21) | 1379 (24) | 2401 (24) | 1679 (23) | 722 (27)
Δ HbA1c at 24 months, mean (95% CI)
GLP-1RA only | - | - | - | - | - | - | −0.59 (−0.60, −0.58) | −0.58 (−0.58, −0.57) | −0.63 (−0.64, −0.63)
GLP-1RA + INS | - | - | - | - | - | - | −0.65 (−0.66, −0.65) | −0.63 (−0.64, −0.63) | −0.70 (−0.71, −0.70)
HbA1c < 7% at 24 months for those whose HbA1c was ≥7.5% at GLP-1RA initiation, n (%)
GLP-1RA only | - | - | - | - | - | - | 1806 (26) | 1469 (26) | 337 (28)
GLP-1RA + INS | - | - | - | - | - | - | 2247 (23) | 1621 (23) | 626 (23)
CI, confidence interval; EXE, exenatide; GLP-1RA, glucagon-like peptide-1 receptor agonist; HbA1c, glycated haemoglobin; INS, insulin; LIRA, liraglutide.
FIGURE 1 A, Mean (95% confidence interval [CI]) of longitudinal glycated haemoglobin (HbA1c) measurements by whether patient added insulin within 6 months/6 to 12 months/12 to 18 months/18 to 24 months from the GLP-1RA initiation, or remained on GLP-1RA treatment only. B, Mean (95% CI) of longitudinal HbA1c measurements by whether patient switched to insulin within 6 to 12 months/12 to 18 months/18 to 24 months from the GLP-1RA initiation. C, Mean (95% CI) of longitudinal HbA1c measurements by whether patient added or switched to insulin within 24 months from GLP-1RA initiation. D, Scatterplot of HbA1c change and body weight change at 6 and 12 months for patients treated with exenatide and liraglutide without adding or switching to insulin. The perpendicular dotted lines present the mean time to addition or switching to insulin by categories of timeline.
The mean HbA1c reduction in patients treated with LIRA was margin-
ally higher than in those treated with EXE, although the proportions
of patients who had reductions in HbA1c to below 7% over 12 and
24 months of follow-up (ranging from 26% to 28%) were similar for
each of these two therapies.
With regard to initial response to GLP-1RA within 6 months of
follow-up, patients who switched to insulin (by design after 6 months
of GLP-1RA treatment) experienced a significant rise in the glycaemic
level at different points of follow-up, as shown in Figure 1B. For
example, those who switched to insulin within 18 to 24 months after
the index date, clearly experienced rising HbA1c levels consistently
above 8% after 6 months of treatment with GLP-1RA. Furthermore,
we observed that although switching to insulin prevented any further
rise in HbA1c, no significant glycaemic reductions were achieved by
the end of the 24-month follow-up period, compared with the index
date (Figure 1B,C); however, those patients who added insulin had
significantly better glycaemic control by the end of the follow-up
period. After adjusting for the HbA1c levels at index date and at the
time of insulin initiation, those who added insulin achieved a signifi-
cantly higher HbA1c reduction at 24 months, by 0.48% (95% CI 0.47,
0.50%), compared with those who switched to insulin (Figure 1C).
This study showed that addition of insulin and switch to insulin
occurred at elevated HbA1c levels of 8.8% and 9.3%, respectively,
with a significant proportion of patients having HbA1c above 8%/
8.5% (71%/58% in the GLP-1RA + INS group and 80%/68% in the
GLP-1RA → INS group). This clearly raises the issue of therapeutic
inertia.25,26 Given the high glycaemic burden in this population, the
time to intensification of therapy requires further evaluation, in con-
junction with the factors that might prevent early intensification with
insulin therapy, including fear of weight gain and hypoglycaemia.
Notably, the distributions of body weight were similar between those
who switched to, or added insulin, and those who remained on GLP-
1RA treatment only (Table 1). Moreover, observed adjusted weight
reductions in patients who added insulin were consistently and mar-
ginally greater than in those treated with GLP-1RA only during the
follow-up period (Table 4). Our findings in terms of weight loss are
consistent with a recent study reporting no weight gain after initia-
tion of insulin in obese patients with T2DM27 and the systematic
review by Balena et al.20
The limitations of this study include non-availability of complete
and reliable data on: (1) medication adherence; (2) dose adjustments
in insulin-treated patients; (3) diet and exercise; (4) socio-economic
status; and (5) potential residual confounders. The selection of
patients with a minimum 6 months’ treatment with GLP-1RA, exclud-
ing those who initiated insulin therapy earlier, could lead to a poten-
tial selection bias; however, the large analysis cohort from the
validated CEMR database used in the study should be considered as
a representative sample, and as such, provides a reliable picture of
TABLE 3 Glycated haemoglobin levels at glucagon-like peptide-1 receptor agonist (GLP-1RA) initiation and at insulin initiation, and the time to insulin therapy, for those who added insulin to existing GLP-1RA (GLP-1RA + INS) and those who switched to insulin treatment (GLP-1RA → INS) within 2 years of follow-up
Group | GLP-1RA + INS | GLP-1RA + INS | GLP-1RA + INS | GLP-1RA → INS | GLP-1RA → INS | GLP-1RA → INS
Variable | All | EXE | LIRA | All | EXE | LIRA
N | 36 113 | 22 703 | 13 410 | 2483 | 1856 | 627
HbA1c at GLP-1RA initiation, % | 8.3 (1.4) | 8.3 (1.4) | 8.4 (1.4) | 8.5 (1.4) | 8.5 (1.4) | 8.5 (1.4)
Median (IQR) HbA1c at GLP-1RA initiation, % | 8 (7.1, 9) | 7.9 (7.1, 9) | 8.1 (7.2, 9.1) | 8.3 (7.5, 9.2) | 8.3 (7.5, 9.2) | 8.3 (7.6, 9.3)
HbA1c ≥ 7% at GLP-1RA initiation, n (%) | 33 032 (91) | 20 649 (91) | 12 383 (92) | 2351 (95) | 1761 (95) | 590 (94)
HbA1c ≥ 7.5% at GLP-1RA initiation, n (%) | 23 736 (66) | 14 507 (64) | 9229 (69) | 1892 (76) | 1404 (76) | 488 (78)
HbA1c ≥ 8% at GLP-1RA initiation, n (%) | 18 436 (51) | 11 202 (49) | 7234 (54) | 1489 (60) | 1119 (60) | 370 (59)
HbA1c at insulin initiation, % | 8.8 (1.3) | 8.8 (1.3) | 8.8 (1.3) | 9.3 (1.6) | 9.2 (1.5) | 9.5 (1.7)
Median (IQR) HbA1c at insulin initiation, % | 8.7 (7.8, 9.4) | 8.7 (7.8, 9.4) | 8.8 (7.9, 9.5) | 9.1 (8.2, 10) | 9 (8.1, 10) | 9.1 (8.4, 10.3)
HbA1c ≥ 7% at insulin initiation, n (%) | 36 045 (100) | 22 652 (100) | 13 393 (100) | 2482 (100) | 1855 (100) | 627 (100)
HbA1c ≥ 7.5% at insulin initiation, n (%) | 30 267 (84) | 18 848 (83) | 11 419 (85) | 2209 (89) | 1634 (88) | 575 (92)
HbA1c ≥ 8% at insulin initiation, n (%) | 25 649 (71) | 15 897 (70) | 9752 (73) | 1985 (80) | 1466 (79) | 519 (83)
Median (IQR) Δ HbA1c (insulin - GLP-1RA), % | 0.46 (0.45, 0.47) | 0.49 (0.47, 0.50) | 0.42 (0.4, 0.44) | 0.73 (0.67, 0.79) | 0.67 (0.60, 0.73) | 0.91 (0.80, 1.03)
Mean (range) time to insulin initiation, months | 3 (0, 24) | 3 (0, 24) | 2 (0, 24) | 14 (6, 24) | 14 (6, 24) | 14 (6, 24)
Median (IQR) time to insulin initiation, months | 0 (0, 3) | 0 (0, 4) | 0 (0, 2) | 13 (10, 18) | 14 (10, 19) | 12 (9, 18)
EXE, exenatide; GLP-1RA, glucagon-like peptide-1 receptor agonist; HbA1c, glycated haemoglobin; INS, insulin; IQR, interquartile range; LIRA, liraglutide.
Statistics are mean (standard deviation) unless stated otherwise.
TABLE 4 Adjusted mean (95% confidence interval [CI]) of change in body weight at 6, 12 and 24 months after glucagon-like peptide-1 receptor agonist (GLP-1RA) initiation, for those who took a GLP-1RA for at least 6, 12 and 24 months, stratified by whether patients continued on GLP-1RA treatment only or added insulin therapy, number and proportion of those who lost ≥5% body weight during follow-up after GLP-1RA initiation, and adjusted mean (95% CI) of changes in SBP and LDL cholesterol at 6, 12 and 24 months after GLP-1RA initiation
On GLP-1RA for | ≥6 mo | ≥6 mo | ≥6 mo | ≥12 mo | ≥12 mo | ≥12 mo | ≥24 mo | ≥24 mo | ≥24 mo
Group | All | EXE | LIRA | All | EXE | LIRA | All | EXE | LIRA
Δ Weight at 6 months, adjusted mean (95% CI)
GLP-1RA only | −1.87 (−1.88, −1.87) | −1.90 (−1.91, −1.90) | −1.80 (−1.81, −1.80) | −1.96 (−1.97, −1.96) | −2.00 (−2.00, −1.99) | −1.86 (−1.87, −1.85) | −2.24 (−2.24, −2.23) | −2.23 (−2.24, −2.23) | −2.25 (−2.26, −2.23)
GLP-1RA + INS | −2.32 (−2.32, −2.31) | −2.20 (−2.20, −2.19) | −2.51 (−2.52, −2.51) | −2.41 (−2.41, −2.40) | −2.26 (−2.26, −2.25) | −2.68 (−2.68, −2.67) | −2.50 (−2.50, −2.49) | −2.38 (−2.39, −2.37) | −2.84 (−2.85, −2.83)
Weight loss ≥5% at 6 months, n (%)
GLP-1RA only | 5539 (15) | 3957 (15) | 1582 (15) | 3978 (15) | 2980 (15) | 998 (15) | 2050 (16) | 1684 (16) | 366 (17)
GLP-1RA + INS | 5241 (18) | 3085 (17) | 2156 (19) | 4413 (18) | 2729 (17) | 1684 (20) | 2985 (19) | 2124 (18) | 861 (21)
Δ Weight at 12 months, adjusted mean (95% CI)
GLP-1RA only | - | - | - | −2.50 (−2.51, −2.50) | −2.54 (−2.55, −2.54) | −2.39 (−2.39, −2.38) | −2.83 (−2.84, −2.82) | −2.82 (−2.83, −2.82) | −2.87 (−2.88, −2.85)
GLP-1RA + INS | - | - | - | −2.93 (−2.93, −2.92) | −2.80 (−2.81, −2.80) | −3.16 (−3.16, −3.15) | −3.08 (−3.08, −3.07) | −3.00 (−3.00, −2.99) | −3.31 (−3.32, −3.30)
Weight loss ≥5% at 12 months, n (%)
GLP-1RA only | - | - | - | 6150 (24) | 4644 (24) | 1506 (23) | 3204 (25) | 2662 (26) | 542 (25)
GLP-1RA + INS | - | - | - | 6404 (26) | 4062 (26) | 2342 (28) | 4314 (27) | 3132 (27) | 1182 (29)
Δ Weight at 24 months, adjusted mean (95% CI)
GLP-1RA only | - | - | - | - | - | - | −3.31 (−3.32, −3.30) | −3.3 (−3.31, −3.29) | −3.36 (−3.38, −3.35)
GLP-1RA + INS | - | - | - | - | - | - | −3.40 (−3.41, −3.40) | −3.32 (−3.33, −3.31) | −3.63 (−3.64, −3.62)
Weight loss ≥5% at 24 months, n (%)
GLP-1RA only | - | - | - | - | - | - | 3884 (31) | 3210 (31) | 674 (31)
GLP-1RA + INS | - | - | - | - | - | - | 5104 (32) | 3747 (32) | 1357 (33)
Blood pressure changes, adjusted mean (95% CI)
Δ SBP at 6 months | −2.82 (−2.82, −2.81) | −2.78 (−2.78, −2.77) | −2.90 (−2.91, −2.89) | −2.95 (−2.96, −2.95) | −2.91 (−2.91, −2.90) | −3.05 (−3.06, −3.04) | −2.91 (−2.91, −2.90) | −2.87 (−2.87, −2.86) | −3.05 (−3.07, −3.04)
Δ SBP at 12 months | - | - | - | −2.79 (−2.79, −2.78) | −2.79 (−2.79, −2.78) | −2.79 (−2.80, −2.78) | −2.85 (−2.86, −2.85) | −2.78 (−2.78, −2.77) | −3.13 (−3.14, −3.11)
Δ SBP at 24 months | - | - | - | - | - | - | −2.64 (−2.64, −2.63) | −2.69 (−2.70, −2.69) | −2.44 (−2.46, −2.42)
LDL changes, adjusted mean (95% CI)
Δ LDL cholesterol at 6 months | −0.19 (−0.20, −0.19) | −0.19 (−0.20, −0.19) | −0.19 (−0.19, −0.18) | −0.19 (−0.19, −0.18) | −0.19 (−0.20, −0.19) | −0.19 (−0.19, −0.18) | −0.19 (−0.21, −0.18) | −0.18 (−0.18, −0.17) | −0.20 (−0.21, −0.19)
Δ LDL cholesterol at 12 months | - | - | - | −0.18 (−0.19, −0.18) | −0.18 (−0.19, −0.18) | −0.19 (−0.19, −0.18) | −0.19 (−0.20, −0.19) | −0.18 (−0.18, −0.17) | −0.20 (−0.21, −0.18)
Δ LDL cholesterol at 24 months | - | - | - | - | - | - | −0.23 (−0.24, −0.23) | −0.23 (−0.24, −0.23) | −0.23 (−0.23, −0.22)
CI, confidence interval; EXE, exenatide; GLP-1RA, glucagon-like peptide-1 receptor agonist; HbA1c, glycated haemoglobin; INS, insulin; IQR, interquartile range; LIRA, liraglutide; SBP, systolic blood pressure.
the state of risk factor management in routine practice. Complete risk factor data were available at index date, and imputations were conducted for only 9% to 19% missing longitudinal data. The results from complete case analyses and imputed data were very similar. Finally, the careful new-user design with a reasonable exposure time of 2 years, and appropriate adjustments for confounders are the primary strengths of the study.
In conclusion, this novel real-world study provides evidence of significant delays in intensification of treatment in patients with T2DM treated with a GLP-1RA. Among HbA1c non-responders, early addition of insulin with GLP-1RA therapy within 6 months resulted in better and sustainable glycaemic control over 2 years. The findings from the present study suggest that, in people requiring treatment intensification on GLP-1RA, the preferred option should be addition of insulin rather than stopping GLP-1RA and switching to insulin therapy.
ACKNOWLEDGMENTS
We gratefully acknowledge the support for the QIMR Berghofer Institute from the Australian Government Department of Education’s National Collaborative Research Infrastructure Strategy (NCRIS) initiative through Therapeutic Innovation Australia. No separate funding was obtained for this study. O. M. acknowledges the support from her associate supervisors Prof. Ross Young and Prof. Louise Hafner. K. Klein acknowledges support from the National Institute for Health Research Collaboration for Leadership in Applied Health Research and Care – East Midlands (NIHR CLAHRC – EM), and the NIHR Leicester Loughborough Diet, Lifestyle and Physical Activity Biomedical Research Unit.
Conflict of interest
S. K. P. has acted as a consultant and/or speaker for Novartis, GI Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi Pharmaceutical and Amylin Pharmaceuticals LLC. He has received grants in support of investigator and investigator-initiated clinical studies from Merck, Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis and Pfizer. K. Khunti has acted as a consultant and speaker for Novartis, Novo Nordisk, Sanofi-Aventis, Lilly, Merck Sharp & Dohme, Janssen, AstraZeneca and Boehringer Ingelheim. He has received grants in support of investigator and investigator-initiated trials from Novartis, Novo Nordisk, Sanofi-Aventis, Lilly, Pfizer, Boehringer Ingelheim, Merck Sharp & Dohme, Janssen and Roche, and funds for research and honoraria for speaking at meetings, and has served on advisory boards for Lilly, Sanofi-Aventis, Merck Sharp & Dohme, Novo Nordisk, Boehringer Ingelheim, Janssen and AstraZeneca. S. K. has received research grants and been on advisory boards for Novo Nordisk. O. M. and K. Klein have no conflict of interest to declare.
Author contributions
S. K. P. conceived the idea, and S. K. P. and O. M. were responsible for the primary design of the study. The design concept was discussed and agreed with S. K. and K. Khunti. O. M. and K. Klein conducted the data extraction. S. K. P., K. Klein and O. M. jointly conducted the statistical analyses. S. K. and K. Khunti worked on the analysis plan along with S. K. P. The first draft of the manuscript was developed by S. K. P. and O. M., and all authors contributed to the finalization of the manuscript. S. K. P. had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
REFERENCES
1. Turnbull FM, Abraira C, Anderson RJ, et al. Intensive glucose controland macrovascular outcomes in type 2 diabetes. Diabetologia.2009;52(11):2288-2298.
2. Shah AD, Langenberg C, Rapsomaniki E, et al. Type 2 diabetes andincidence of cardiovascular diseases: a cohort study in 1 � 9 millionpeople. Lancet Diabetes Endocrinol. 2015;3(2):105-113.
3. Fox CS, Golden SH, Anderson C, et al. Update on prevention of cardi-ovascular disease in adults with type 2 diabetes mellitus in light ofrecent evidence: a scientific statement from the American HeartAssociation and the American Diabetes Association. Circulation.2015;132(8):691-718.
4. American Diabetes Association. Standards of medical care indiabetes—2015. Diabetes Care. 2015;38(suppl 1):S49-S57.
5. Paul SK, Klein K, Maggs D, Best JH. The association of the treatmentwith glucagon-like peptide-1 receptor agonist exenatide or insulinwith cardiovascular outcomes in patients with type 2 diabetes: a ret-rospective observational study. Cardiovasc Diabetol. 2015;14:10,doi:10.1186/s12933-015-0178-3.
6. Smilowitz NR, Donnino R, Schwartzbard A. Glucagon-like peptide-1receptor agonists for diabetes mellitus: a role in cardiovascular dis-ease. Circulation. 2014;129(22):2305-2312.
7. Drucker DJ, Goldfine AB. Cardiovascular safety and diabetes drugdevelopment. Lancet. 2011;377(9770):977-979.
8. Garber AJ. Novel GLP-1 receptor agonists for diabetes. Expert OpinInvestig Drugs. 2012;21(1):45-57.
9. Monami M, Dicembrini I, Nardini C, Fiordelli I, Mannucci E. Effects ofglucagon-like peptide-1 receptor agonists on cardiovascular risk: ameta-analysis of randomized clinical trials. Diabetes Obes Metab.2014;16(1):38-47.
10. Vora J. Combining incretin-based therapies with insulin: realizing thepotential in type 2 diabetes. Diabetes Care. 2013;36(suppl 2):S226-S232.
11. Shaefer CF, Reid TS, Dailey G, et al. Weight change in patients withtype 2 diabetes starting basal insulin therapy: correlates and impacton outcomes. Postgrad Med. 2014;126(6):93-105.
12. Balkau B, Home PD, Vincent M, Marre M, Freemantle N. Factorsassociated with weight gain in people with type 2 diabetes startingon insulin. Diabetes Care. 2014;37(8):2108-2113.
13. Eng C, Kramer CK, Zinman B, Retnakaran R. Glucagon-like peptide-1receptor agonist and basal insulin combination treatment for themanagement of type 2 diabetes: a systematic review and meta-analy-sis. Lancet. 2014;384(9961):2228-2234.
14. Lee WC, Dekoven M, Bouchard J, Massoudi M, Langer J. Improvedreal-world glycaemic outcomes with liraglutide versus other incretin-based therapies in type 2 diabetes. Diabetes Obes Metab.2014;16(9):819-826.
15. Li Q, Chitnis A, Hammer M, Langer J. Real-world clinical and eco-nomic outcomes of liraglutide versus sitagliptin in patients withtype 2 diabetes mellitus in the United States. Diabetes Ther.2014;5(2):579-590.
16. Rigato M, Avogaro A, Fadini GP. Effects of dose escalating liraglutidefrom 1.2 to 1.8 mg in clinical practice: a case-control study.J Endocrinol Invest. 2015;38(12):1357-1363.
17. Gautier JF, Martinez L, Penfornis A, et al. Effectiveness and persist-ence with liraglutide among patients with type 2 diabetes in routineclinical practice–EVIDENCE: a prospective, 2-year follow-up, observa-tional, post-marketing study. Adv Ther. 2015;32(9):838-853.
MONTVIDA ET AL. 9
174
18. Ryder B, Thong K. Findings from the Association of British ClinicalDiabetologists (ABCD) nationwide exenatide and liraglutide audits.Hot topics in diabetes. 2012;5:49-61.
19. Thong KY, Jose B, Sukumar N, et al. Safety, efficacy and tolerabilityof exenatide in combination with insulin in the Association of BritishClinical Diabetologists nationwide exenatide audit. Diabetes ObesMetab. 2011;13(8):703-710.
20. Balena R, Hensley IE, Miller S, Barnett AH. Combination therapy withGLP-1 receptor agonists and basal insulin: a systematic review of theliterature. Diabetes Obes Metab. 2013;15(6):485-502.
21. Inzucchi SE, Bergenstal RM, Buse JB, et al. Management of hypergly-cemia in type 2 diabetes, 2015: a patient-centered approach: updateto a position statement of the American Diabetes Association and theEuropean Association for the Study of Diabetes. Diabetes Care.2015;38(1):140-149.
22. Kamal KM, Chopra I, Elliott JP, Mattei TJ. Use of electronic medicalrecords for clinical research in the management of type 2 diabetes.Res Social Adm Pharm. 2014;10(6):877-884.
23. Davis KL, Tangirala M, Meyers JL, Wei W. Real-world comparativeoutcomes of US type 2 diabetes patients initiating analog basal insulintherapy. Curr Med Res Opin. 2013;29(9):1083-1091.
24. Crawford AG, Cote C, Couto J, et al. Comparison of GE centricityelectronic medical record database and National Ambulatory MedicalCare Survey findings on the prevalence of major conditions in theUnited States. Popul Health Manag. 2010;13(3):139-150.
25. Khunti K, Nikolajsen A, Thorsted BL, Andersen M, Davies MJ,Paul SK. Clinical inertia with regard to intensifying therapy in people
with type 2 diabetes treated with basal insulin. Diabetes Obes Metab.2016;18(4):401-409.
26. Paul SK, Klein K, Thorsted BL, Wolden ML, Khunti K. Delay intreatment intensification increases the risks of cardiovascularevents in patients with type 2 diabetes. Cardiovasc Diabetol.2015;14(1):100.
27. Paul SK, Shaw J, Montvida O, Klein K. Weight gain in insulin treatedpatients by BMI categories at treatment initiation: new evidence fromreal-world data in patients with type 2 diabetes. Diabetes Obes Metab.2016, doi:10.1111/dom.12761. [Epub ahead of print].
SUPPORTING INFORMATION
Additional Supporting Information may be found online in the supporting information tab for this article.
How to cite this article: Montvida O, Klein K, Kumar S, Khunti K and Paul SK. Addition of or switch to insulin therapy in people treated with glucagon-like peptide-1 receptor agonists: A real-world study in 66 583 patients, Diabetes Obes Metab, 2016. doi: 10.1111/dom.12790
NATURE REVIEWS | ENDOCRINOLOGY www.nature.com/nrendo
NEWS & VIEWS
© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.
DIABETES
Incretin mimetics and insulin – closing the gap to normoglycaemia
Michael A. Nauck and Juris J. Meier
Treatment of type 2 diabetes mellitus with GLP1 receptor agonists can result in long-term glycaemic control or can fail over time, in which case insulin can be used as an alternative or as an additive treatment. New research shows that the latter is more likely to achieve glycaemic targets than the former.
Refers to Montvida, O. et al. Addition or switch to insulin therapy in people treated with GLP-1 receptor agonists: a real world study in 66,583 patients. Diabetes Obes. Metab. http://dx.doi.org/10.1111/dom.12790 (2016)
Using data obtained from the Centricity Electronic Medical Record database, which documents patient and treatment data from >35,000 USA-based physicians or other health-care providers, Montvida et al.1 present data on changes in HbA1c levels, body weight, systolic blood pressure and LDL cholesterol levels in patients with type 2 diabetes mellitus (T2DM) who received treatment with glucagon-like peptide 1 receptor agonists (GLP1-RAs; specifically, exenatide and liraglutide). Patient data were examined between 6 months and 24 months after the initiation of GLP1-RA treatment in ~67,000 individuals with T2DM. After a minimum of 6 months of treatment with GLP1-RAs, 33.5% of patients continued with GLP1-RA treatment alone for at least 24 months, 7.1% of patients were switched to insulin treatment and simultaneously discontinued GLP1-RA treatment and 59.9% of patients received insulin (of any preparation) in addition to GLP1-RA treatment.
In patients initially receiving GLP1-RA treatment and continuing with this treatment alone for up to 24 months, the greatest reduction in HbA1c levels was observed within 6 months of starting treatment, with negligible changes observed after 6 months. Patients in whom GLP1-RA treatment was discontinued and replaced with insulin treatment did not show improved glycaemic control by switching treatment. However, if insulin was added to GLP1-RA treatment, a net improvement in glycaemic control was observed. This net improvement was greater if insulin was added earlier rather than later, relative to the time point at which GLP1-RA treatment alone failed to provide sufficient glycaemic control1.
GLP1-RAs have been approved as glucose-lowering treatments for patients with T2DM. Usually, these agonists are added to one or more oral glucose-lowering agents (for example, metformin or sulfonylureas), as alternatives to insulin treatment. However, GLP1-RAs can also be used with insulin treatment, for which long-acting insulin preparations such as insulin glargine2, insulin detemir3 or insulin degludec4 are most often used. This combination can be achieved by adding GLP1-RAs to pre-existing insulin treatment2, by adding insulin to pre-existing GLP1-RA treatment3 or by introducing both treatments at the same time; in the latter case, fixed-dose combinations can be used, such as IDegLira (insulin degludec plus liraglutide)4 or LixiLan (insulin glargine plus lixisenatide)5. Prospective, randomized controlled trials have demonstrated achievement of HbA1c levels <7% in 60–85% of patients using these approaches2–5.
The findings of Montvida et al.1 confirm that results obtained from randomized, prospective trials can be replicated in the ‘real world’, as the database used contained results from diagnostic procedures (such as measuring HbA1c levels, body weight and LDL cholesterol levels) and the prescription of medication (including the date of initiation and the duration of continuous use of drugs, such as GLP1-RAs or insulin treatment). At least qualitatively, the results of Montvida et al.1 confirm the findings reported in clinical trials2–5 and in meta-analyses based on such trials6,7. Although about one-third of patients can achieve HbA1c concentrations in the target range with GLP1-RAs alone and will maintain treatment with this therapy for a long period of time, such long-term treatment efficacy is not observed in other patients. Furthermore, the necessity for treatment intensification can occur earlier or later during the course of treatment with GLP1-RAs. In head-to-head trials comparing GLP1-RAs with insulin (mainly basal insulin) treatment, both approaches lead to similar degrees of glycaemic control when used as separate treatment strategies8. Small but statistically significant differences in efficacy favouring GLP1-RAs over insulin treatments have been noted. This effect seems to persist even in patients with high baseline HbA1c levels, in whom one might intuitively assume a better response to insulin therapy than to GLP1-RAs9. As the clinical efficacy of GLP1-RA treatment or insulin regimens is similar, the lack of net changes in glycaemic control when switching from one to the other, as shown by Montvida et al., is not surprising1.
An unexpected result of the analysis by Montvida et al.1 is that adding insulin treatment to GLP1-RA treatment does not increase body weight; rather, weight loss was greater in patients adding insulin to their regimen. In studies directly comparing GLP1-RAs and insulin as single injectable agents and in combination4,5, weight reduction for the combination treatment was attenuated compared with the GLP1-RA treatment alone, with insulin treatment alone promoting weight gain. Whether body weight was assessed by using calibrated scales or less reliable information provided by the patient is not known, which points to some limitations of such ‘real world’ approaches.
Another interesting and novel aspect of the study by Montvida et al.1 is the change in HbA1c levels before intensification of the therapy was considered necessary. If a patient responded well to GLP1-RA treatment, the reduction in HbA1c levels was evident after 6 months of treatment, with little further change in HbA1c levels with time. However, treatment intensification (either a switch to insulin treatment or the addition of insulin treatment to GLP1-RA treatment) was preceded by an elevation in HbA1c levels that was usually only evident during the 6-month period before the requirement for treatment change. Thus, as with the trajectories of measures of insulin secretory capacity and insulin resistance before the manifestation of T2DM10, short-term dynamics in important determinants of glycaemic control are also evident during the transition from satisfactory glycaemic control with a glucose-lowering medication (in this case GLP1-RAs) to treatment failure, indicating the need for a rapid response when implementing a treatment intensification. Increased granularity, enabling the examination of these relationships at a higher temporal resolution, is not possible with the present data set, but should be an area of interest for future research.
[Figure 1 graphic not reproduced. Recoverable labels: panels ‘Traditional approach’, ‘Novel approach 1’ and ‘Novel approach 2’; treatment steps from oral glucose-lowering drugs through ‘Add basal insulin’, ‘Add prandial insulin’, ‘Add a GLP1 analogue’, ‘Add short-acting insulin at one meal’ and ‘Add short-acting insulin at all meals’; annotations ‘Target achievement in clinical trials’, ‘Real-world target achievement’, ‘Current target achievement’, ‘Potential for target achievement’, ‘Residual glycaemic burden’, ‘Future unknown strategies’ and ‘Normoglycaemia’.]
One might also ask whether clinical inertia (that is, an undue delay in treatment intensification) is at work in these circumstances, or whether the hesitation to escalate treatment rather reflects physicians’ concerns about hypoglycaemia, other treatment-related adverse effects or monetary considerations. In accordance with these concerns, long-term adherence to GLP1-RAs is still suboptimal, probably owing to treatment-emergent adverse effects.
The study by Montvida et al.1 also shows that even with the combination of GLP1-RAs and insulin, the two drug classes considered to be most effective in terms of reducing HbA1c levels, only a minority of patients achieve an HbA1c target of <7%, in contrast with the higher efficacy reported in current short-term clinical trials. Addition of insulin to GLP1-RA treatment is, therefore, suggested to be more effective than switching from a GLP1-RA to insulin in reducing HbA1c levels, thus offering one effective way to narrow the gap to near-normoglycaemia. However, despite these clear advances in the treatment of T2DM, a large proportion of patients still fail to reach normoglycaemia with current glucose-lowering strategies (FIG. 1). Continued efforts will be needed to develop novel treatment strategies, reduce treatment-related adverse effects, optimize treatment adherence and refine current combination strategies.
Michael A. Nauck and Juris J. Meier are at the Division of Diabetology, Department of Medicine, St Josef-Hospital, Ruhr University Bochum, Gudrunstraße 56, D-44791, Bochum, Germany.
[email protected]; [email protected]
doi:10.1038/nrendo.2016.180 Published online 10 Nov 2016
1. Montvida, O., Klein, K., Kumar, S., Khunti, K. & Paul, S. K. Addition or switch to insulin therapy in people treated with GLP‑1 receptor agonists: a real world study in 66,583 patients. Diabetes Obes. Metab. http://dx.doi.org/10.1111/dom.12790 (2016).
2. Buse, J. B. et al. Use of twice‑daily exenatide in basal insulin‑treated patients with type 2 diabetes: a randomized, controlled trial. Ann. Intern. Med. 154, 103–112 (2011).
3. DeVries, J. H. et al. Sequential intensification of metformin treatment in type 2 diabetes with liraglutide followed by randomized addition of basal insulin prompted by A1C targets. Diabetes Care 35, 1446–1454 (2012).
4. Gough, S. C. et al. Efficacy and safety of a fixed‑ratio combination of insulin degludec and liraglutide (IDegLira) compared with its components given alone: results of a phase 3, open‑label, randomised, 26‑week, treat‑to‑target trial in insulin‑naive patients with type 2 diabetes. Lancet Diabetes Endocrinol. 2, 885–893 (2014).
5. Rosenstock, J. et al. Benefits of LixiLan, a titratable fixed‑ratio combination of insulin glargine plus lixisenatide, versus insulin glargine and lixisenatide monocomponents in type 2 diabetes inadequately controlled with oral agents: the LixiLan‑O randomized trial. Diabetes Care http://dx.doi.org/10.2337/dc16‑0917 (2016).
6. Balena, R., Hensley, I. E., Miller, S. & Barnett, A. H. Combination therapy with GLP‑1 receptor agonists and basal insulin: a systematic review of the literature. Diabetes Obes. Metab. 15, 485–502 (2013).
7. Eng, C., Kramer, C. K., Zinman, B. & Retnakaran, R. Glucagon‑like peptide‑1 receptor agonist and basal insulin combination treatment for the management of type 2 diabetes: a systematic review and meta‑analysis. Lancet 384, 2228–2234 (2014).
8. Abdul‑Ghani, M. A., Williams, K., Kanat, M., Altuntas, Y. & DeFronzo, R. A. Insulin versus GLP‑1 analogues in poorly controlled type 2 diabetic subjects on oral therapy: a meta‑analysis. J. Endocrinol. Invest. 36, 168–173 (2013).
9. Buse, J. B. et al. Is insulin the most effective injectable antihyperglycaemic therapy? Diabetes Obes. Metab. 17, 145–151 (2015).
10. Tabak, A. G. et al. Trajectories of glycaemia, insulin sensitivity, and insulin secretion before diagnosis of type 2 diabetes: an analysis from the Whitehall II study. Lancet 373, 2215–2221 (2009).
Competing interests statement
M.A.N. declares that he has received personal fees, grants, non-financial support or other support from AstraZeneca, Berlin Chemie-AG, Boehringer Ingelheim, Eli Lilly, GlaxoSmithKline, Hoffmann La Roche, Intarcia Therapeuticals, Janssen Global Services, Medscape LLC, Merck Sharp & Dohme, Novartis, Novo Nordisk, Sanofi-Aventis and Versartis. J.J.M. declares that he has received grants or personal fees from Astra Zeneca, Berlin-Chemie, Boehringer-Ingelheim, Eli Lilly, MSD, NovoNordisk, Sanofi and Servier.
Figure 1 | Glycaemic target achievement resulting from different treatment approaches. The achievement of glycaemic targets in clinical trials and in real world studies of patients with type 2 diabetes mellitus (T2DM) differs with conventional and novel treatment approaches, and at different stages of treatment escalation. The lack of achievement of normoglycaemia in all patients illustrates the need for further advances in the treatment of T2DM. The gap between the target achievement with current combination treatment approaches and normoglycaemia is called the ‘residual glycaemic burden’.
APPENDIX B
The Open Bioinformatics Journal, 2017, 10, 16-27
1875-0362/17 2017 Bentham Open
DOI: 10.2174/1875036201710010016
RESEARCH ARTICLE
Data Mining Approach to Identify Disease Cohorts from Primary Care Electronic Medical Records: A Case of Diabetes Mellitus
Ebenezer S. Owusu Adjah1,2, Olga Montvida1,3, Julius Agbeve1 and Sanjoy K. Paul4,*
1QIMR Berghofer Medical Research Institute, Brisbane, Australia
2Faculty of Medicine, The University of Queensland, Brisbane, Australia
3School of Biomedical Sciences, Institute of Health and Biomedical Innovation, Faculty of Health, Queensland University of Technology, Brisbane, Australia
4Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia
Received: August 17, 2017 Revised: November 28, 2017 Accepted: November 29, 2017
Abstract:
Background:
Identification of diseased patients from primary care based electronic medical records (EMRs) has methodological challenges that may impact epidemiologic inferences.
Objective:
To compare deterministic clinically guided selection algorithms with probabilistic machine learning (ML) methodologies for their ability to identify patients with type 2 diabetes mellitus (T2DM) from large population-based EMRs in a nationally representative primary care database.
Methods:
Four cohorts of patients with T2DM were defined by a deterministic approach based on disease codes. The database was mined for a set of best predictors of T2DM, and the performance of six ML algorithms was compared based on cross-validated true positive rate, true negative rate, and area under the receiver operating characteristic curve.
Results:
In the database of 11,018,025 research-suitable individuals, 379,657 (3.4%) were coded as having T2DM. The Logistic Regression classifier was selected as the best ML algorithm and resulted in a cohort of 383,330 patients with potential T2DM. Eighty-three percent (83%) of this cohort had a T2DM code, and 16% of the patients with a T2DM code were not included in this ML cohort. Of those in the ML cohort without a disease code, 52% had at least one measure of elevated glucose level and 22% had received at least one prescription for antidiabetic medication.
Conclusion:
Deterministic cohort selection based on disease coding potentially introduces a significant misclassification problem. ML techniques allow testing for potential disease predictors and, given meaningful data input, are able to identify diseased cohorts in a holistic way.
Keywords: Electronic Medical Records, Primary Care Database, Machine Learning Algorithm, Diabetes, Type 2 Diabetes, Cohort Identification.
* Address correspondence to this author at the Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia; Tel: +61-3-93428433; E-mail: [email protected]
1. INTRODUCTION
Recent advances in the design and implementation of large patient-level electronic medical records (EMRs) from national primary care databases have created opportunities in clinical, epidemiological and public health research [1, 2]. In a typical primary or ambulatory care setting, large volumes of data are generated as patients go through various phases of treatment. Individual patients’ longitudinal data on demographics, lifestyle, disease and treatment history, clinical and laboratory parameters, hospitalization statistics, and clinical events are typically organized and stored in the form of a relational database. Such databases present unique challenges in terms of efficient and effective extraction of data for various investigative interests [3]. One of the challenging aspects in this context is the identification of disease cohorts for retrospective or prospective clinical epidemiological studies [4, 5].
Diagnostic codes, such as the International Classification of Diseases (ICD) codes or Read codes [6], are generally used to identify disease cohorts from EMRs [4]. The reliability of diagnosis coding for various diseases has been extensively examined for many primary care databases, including The Health Improvement Network (THIN) database from the United Kingdom [7-9]. However, there are four specific issues in relation to identifying cohorts by diagnostic codes: (1) differentiating between disease subtypes from high-level codes, (2) overlapping codes of disease subtypes longitudinally at individual patient level, (3) absence of codes for diseased patients (false negatives), and (4) presence of disease-specific codes for patients without the specific disease (false positives).
With regard to diabetes mellitus (DM), identification and appropriate classification of different types of diabetes in primary care databases are particularly challenging [5, 10-13]. These challenges relate mostly to inaccurate coding leading to misclassification, misdiagnosis, and undiagnosed diabetes [12]. Algorithms based on laboratory, clinical, and medication data have thus been proposed as tools for distinguishing between type 1 diabetes mellitus (T1DM) and type 2 diabetes mellitus (T2DM) [10, 14-16]. However, the overall accuracy and reliability of derived disease cohorts based on diagnostic codes can be improved by implementing advanced machine learning (ML) or statistical data mining techniques and clinically guided cohort selection algorithms that robustly capture comprehensive patient-level information available in the EMRs [4, 5, 12, 13].
Shivade and colleagues (2014) have conducted a systematic review of various techniques used for the identification of different disease cohorts from different sources of clinical databases [2]. Some of these proposed algorithms have been criticized for their appropriateness in the context of other studies [17]. While several studies compared or applied ML techniques to identify T2DM patients, to the best of our knowledge, there is no study that employed an extensive assessment of diagnostic codes, deterministic clinical selection algorithms, and ML algorithms simultaneously to identify T2DM cohorts from primary care EMRs.
The aims of this exploratory methodological study were to (1) explore technical challenges in the extraction of disease cohorts, (2) compare the ability of different clinically guided cohort selection algorithms to identify the disease cohorts, and (3) compare the disease cohorts identified by ML algorithms and clinically guided cohort selection algorithms using a large nationally representative primary care database from the UK.
2. MATERIALS AND METHODS
In this section, we introduce the primary care database, describe the challenges in identifying a cohort of patients with a specific disease (i.e. T2DM), and explain the clinically guided cohort selection algorithms and the data mining and computational processes leading to the comparison of different supervised ML techniques.
2.1. Data Source
Data from The Health Improvement Network (THIN), a patient-level primary care database from the UK, were used in this study. THIN is an ongoing primary care database of medical records of anonymized patients from general practitioners, covers over 600 UK general practices, and has been linked to the hospital episode statistics (HES) and other statistics from the National Bureau of Statistics. Longitudinal patient-level records have been collected since 1990, and the current version of the database holds more than 13 million individual patient records. The patients included in this database are representative of the UK population by age, gender, medical conditions and death rates adjusted for demographics and social deprivation. The accuracy and completeness of the THIN database have been described previously [18, 19]. The THIN database is considered one of the most comprehensive patient-level databases available globally, and has been extensively used by researchers and government bodies for clinical, epidemiological and public health related studies [20]. The database contains extensive information on individuals’ demographic, clinical, laboratory, medications and event history data. The study protocol was approved by the Independent Scientific Review Committee for the THIN database (Protocol Number: 15THIN030) and the Institutional Review Board of QIMR Berghofer Medical Research Institute.
2.2. Challenges in Identifying Disease Cohort
THIN uses the UK’s standard Read code classification system, which is useful for hierarchical classification of patients’ specific circumstances and lifestyles, thereby enhancing scalability and retrieval [6]. However, the Read coding system is complex, as a disease or an encounter with a general practitioner can be coded in several ways, including use of existing codes or by creating new user-defined codes [21]. In this way, considerable variation and inconsistency are introduced into the coding system, as observed in the case of DM [11, 14, 22].
2.2.1. Differentiating Between Disease Subtypes
Typically, many diabetes-related codes are available for a single patient, some of which are high-level codes (e.g. C10 - “Diabetes mellitus”) or disease-related codes that are unspecific in the description of the diabetes type (e.g. C106.12 - “Diabetes mellitus with neuropathy”). Common practice has been to exclude any high-level codes [14, 23], which may lead to underestimation of the disease cohort. When it is impossible to identify the disease subtype (type 1 or type 2 diabetes) from the diagnostic codes, data on surrogate markers (such as glutamic acid decarboxylase) could be useful, but such information is not available in the THIN database. Nevertheless, combinations of available biomarkers (such as age, weight or HbA1c) and medication prescriptions have been used to distinguish types of diabetes in some studies [10, 14].
2.2.2. Longitudinally Overlapping Disease Subtypes
Patients may have different disease subtypes coded longitudinally as a result of data entry errors or biological progression of the disease. While the former can lead to any combination of subtypes, the latter may result in developing T1DM from T2DM or T2DM from gestational diabetes. To distinguish between contradictory codes, longitudinal exploratory techniques were applied in some studies [5]. Also, the techniques described above that deal with unspecific codes may be considered. To address the issue of longitudinally contradictory diagnostic codes, the following approach was adopted to distinguish between T1DM and T2DM.
i. Use of Read codes that uniquely distinguish between T1DM and T2DM.
ii. In patients with unspecific codes, or longitudinally overlapping subtypes, the following is used:
    a. If oral antidiabetic drug (ADD) is taken ≥ 2 months, then T2DM.
    b. Otherwise, if age at first available diagnosis date ≤ 18 years and insulin initiated within 1 year, then T1DM.
    c. Otherwise, if age at first available diagnosis date > 18 years and insulin initiated within 3 months, then T1DM.
    d. Else T2DM.
iii. Patients with codes for gestational diabetes and other forms of diabetes were not included in this study.
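The decision rule above can be sketched in code. The following is a minimal illustration only, not the authors' implementation: the `DiabetesRecord` fields are hypothetical per-patient summaries, and treating "within 1 year" as at most 12 months (and "within 3 months" as at most 3 months, boundaries inclusive) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiabetesRecord:
    """Hypothetical per-patient summary; field names are illustrative."""
    specific_code: Optional[str]        # "T1DM", "T2DM", or None if unspecific/contradictory
    oral_add_months: float              # duration of oral ADD use, in months
    age_at_diagnosis: float             # age at first available diagnosis date, in years
    months_to_insulin: Optional[float]  # months from diagnosis to insulin start; None if never


def classify_subtype(rec: DiabetesRecord) -> str:
    # Step i: a Read code that uniquely identifies the subtype is accepted as is.
    if rec.specific_code in ("T1DM", "T2DM"):
        return rec.specific_code
    # Step ii.a: oral ADD taken for at least 2 months.
    if rec.oral_add_months >= 2:
        return "T2DM"
    insulin = rec.months_to_insulin
    # Step ii.b: diagnosed at 18 years or younger, insulin within 1 year (assumed <= 12 months).
    if rec.age_at_diagnosis <= 18 and insulin is not None and insulin <= 12:
        return "T1DM"
    # Step ii.c: diagnosed after 18 years, insulin within 3 months (assumed <= 3 months).
    if rec.age_at_diagnosis > 18 and insulin is not None and insulin <= 3:
        return "T1DM"
    # Step ii.d: everything else.
    return "T2DM"
```

The order of the checks matters: the oral ADD criterion is evaluated before the insulin criteria, so a patient on oral therapy for two months or longer is classified as T2DM regardless of insulin timing.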
2.2.3. Absence of Codes for Patients with Disease and Presence of Codes for Patients without Disease
Data entry errors such as omissions, typing and communication errors, and patients’ temporary loss to follow-up in EMRs usually result in a relatively small number of false positive and a larger number of false negative patients identified by diagnostic codes. Earlier studies have addressed this complex issue by employing deterministic or probabilistic algorithms [2, 15, 16]. We further focus on this challenging aspect by comparing deterministic (clinically guided) and probabilistic (ML) cohort identification approaches.
2.3. Clinically Guided Cohort Selection Algorithms
Four separate cohorts were created by applying logical, clinically guided algorithms that select patients from those who have at least one record of a Read code for T2DM (Fig. 1). Specifically, the T2DM cohorts were selected on the basis of available records for:
i. Selection algorithm 1: T2DM Read code (Cohort 1);
ii. Selection algorithm 2: Lifestyle modification intervention + T2DM Read code (Cohort 2);
iii. Selection algorithm 3: At least one prescription for antidiabetic medication + lifestyle modification intervention + T2DM Read code (Cohort 3);
iv. Selection algorithm 4: At least one prescription for antidiabetic medication or lifestyle modification intervention + T2DM Read code (Cohort 4).
Selection algorithm 1: T2DM Read code only. Selection algorithm 2: T2DM Read code + lifestyle modification advice. Selection algorithm 3: T2DM Read code + antidiabetic medication + lifestyle modification advice. Selection algorithm 4: T2DM Read code + (antidiabetic medication or lifestyle modification advice).
Fig. (1). Flow chart for the selection of type 2 diabetes (T2DM) cohorts by clinically guided algorithms.
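The four selection algorithms reduce to Boolean conditions on three patient-level flags. The sketch below illustrates this under the assumption that each flag has already been derived from the records; the function and argument names are hypothetical.

```python
def selection_cohorts(has_t2dm_code: bool,
                      has_lifestyle_advice: bool,
                      has_add_prescription: bool) -> set:
    """Return the cohort numbers (1-4) that a patient falls into."""
    cohorts = set()
    # Every selection algorithm requires at least one T2DM Read code.
    if not has_t2dm_code:
        return cohorts
    cohorts.add(1)                                    # algorithm 1: Read code only
    if has_lifestyle_advice:
        cohorts.add(2)                                # algorithm 2: code + lifestyle advice
    if has_add_prescription and has_lifestyle_advice:
        cohorts.add(3)                                # algorithm 3: code + medication + advice
    if has_add_prescription or has_lifestyle_advice:
        cohorts.add(4)                                # algorithm 4: code + (medication or advice)
    return cohorts
```

By construction, Cohort 3 is a subset of Cohort 2, which is a subset of Cohort 4, which is a subset of Cohort 1. This nesting is consistent with the ordering of the cohort sizes reported in Fig. 1.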
2.4. Supervised Machine Learning Techniques
The process of selecting the most appropriate probabilistic algorithm to identify patients with T2DM is described below.
2.4.1. Feature Selection
The THIN database was mined to detect the most frequent medications, comorbidities, laboratory and anthropometric measurements among patients with T2DM identified on the basis of Read codes. The resulting 280 variables were combined with current clinical considerations, practices and guidelines for T2DM management [24], and 11 potential disease predictors were obtained through an iterative process (Table 1). The Correlation-based Feature Selection (CFS) algorithm was applied to determine the best of these predictors [25, 26]. This scheme-independent attribute subset selection approach is particularly useful when attributes are correlated with one another and with the class attribute. Bi-directional, forward and backward greedy search methods were applied using 10-fold cross-validation [27], and they all agreed on the same seven features described in Table 1.
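To make the CFS idea concrete, the sketch below scores a feature subset with the CFS merit heuristic, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where k is the subset size, r_cf the mean feature-class correlation and r_ff the mean feature-feature correlation, and wraps it in a forward greedy search. It is an illustration only: using absolute Pearson correlation as the association measure is a simplifying assumption, the function names are hypothetical, and the cross-validation wrapper used in the study is omitted.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(features, target, subset):
    """CFS merit of a subset: rewards class correlation, penalises redundancy."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def forward_select(features, target):
    """Greedy forward search: add the best feature until the merit stops improving."""
    selected, best = [], 0.0
    remaining = set(features)
    while remaining:
        merit, name = max((cfs_merit(features, target, selected + [f]), f)
                          for f in sorted(remaining))
        if merit <= best:
            break
        selected.append(name)
        remaining.remove(name)
        best = merit
    return selected, best
```

Because the denominator grows with the mean inter-feature correlation, a feature that duplicates one already selected does not raise the merit and is therefore not added.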
2.4.2. Training Dataset
From the 11,018,025 patients in the THIN database, a training dataset of 150,000 instances, containing equal numbers of positive and negative representatives, was extracted. Positive instances were randomly selected from patients with (1) an available T2DM Read code, (2) at least one year of follow-up, and (3) an age of 18-90 years at the time of T2DM diagnosis. Negative instances were also randomly selected from those without a Read code for any subtype of DM and with at least one year of follow-up (Fig. (2), training set).

[Fig. (1) flow chart content: All patients with valid record (n=11,018,025) were split into individuals with any type of DM (n=530,948) and those with no record of DM (n=10,487,077). After excluding type 1 diabetes (n=46,238), gestational diabetes (n=15,814), prediabetes (n=86,800) and other types (n=2,439), the T2DM cohorts were: Selection algorithm 1, n=379,657 (mean age 60 years, 55% male); Selection algorithm 2, n=243,597 (mean age 59 years, 55% male); Selection algorithm 3, n=197,326 (mean age 58 years, 56% male); Selection algorithm 4, n=346,993 (mean age 60 years, 55% male).]
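The balanced sampling step can be sketched as follows. This is an illustration under stated assumptions: the eligibility filters (Read code, follow-up, age) are taken to have been applied already, and the function name, the `seed` parameter and the use of integer patient identifiers are hypothetical.

```python
import random


def build_training_set(positive_ids, negative_ids, n_per_class, seed=0):
    """Draw equal numbers of positive and negative patients without replacement."""
    rng = random.Random(seed)                       # fixed seed for a reproducible draw
    pos = rng.sample(list(positive_ids), n_per_class)
    neg = rng.sample(list(negative_ids), n_per_class)
    labelled = [(pid, 1) for pid in pos] + [(pid, 0) for pid in neg]
    rng.shuffle(labelled)                           # avoid ordering the set by class
    return labelled
```

In the study's setting, `n_per_class` would be 75,000, giving the 150,000-instance training set; repeating the draw with a different seed corresponds to checking the results on another randomly selected training dataset.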
Table 1. Features selected as best T2DM predictors.
No. | Feature Name | Feature Type | Selected for ML
--- | --- | --- | ---
1 | Two measurements of HbA1c > 6% or fasting blood glucose > 7 mmol/l or random blood glucose > 11.1 mmol/l within 1 year | Binary | Yes
2 | Any antidiabetic drug prescriptions for at least 6 months | Binary | Yes
3 | Average BMI | Continuous | Yes
4 | Hypertension diagnosis or antihypertensive drug use for 6 months or more or beta blocker prescription for 6 months or more | Binary | Yes
5 | Chronic kidney diagnosis | Binary | Yes
6 | Retinopathy or neuropathy diagnosis | Binary | Yes
7 | Average systolic blood pressure | Continuous | Yes
8 | Lifestyle modification advice | Binary | No
9 | Average HbA1c | Continuous | No
10 | Average random glucose | Continuous | No
11 | Heart failure or myocardial infarction or stroke or coronary artery disease | Binary | No
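Feature 1 in Table 1 is a derived binary flag, and one way it could be computed is sketched below. This is not the authors' code: the measurement tuple layout, the `THRESHOLDS` names, the 365-day window, and the choice to let the two elevated measurements be of different types are all assumptions made for illustration.

```python
from datetime import date

# Thresholds from Table 1: HbA1c in %, fasting and random blood glucose in mmol/l.
THRESHOLDS = {"hba1c": 6.0, "fbg": 7.0, "rbg": 11.1}


def has_two_elevated_within_year(measurements, window_days=365):
    """measurements: iterable of (date, kind, value), with kind in THRESHOLDS."""
    # Keep only the dates of measurements above their threshold, in time order.
    elevated = sorted(d for d, kind, value in measurements
                      if value > THRESHOLDS[kind])
    # If any two elevated dates lie within the window, some consecutive pair does too.
    return any((later - earlier).days <= window_days
               for earlier, later in zip(elevated, elevated[1:]))
```

For example, an HbA1c of 6.5% in January 2010 followed by a fasting glucose of 7.4 mmol/l in June 2010 sets the flag, whereas two elevated values more than a year apart do not.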
Fig. (2). Flowchart of creating dataset for machine learning training, and of dataset for predicting diabetes status.
2.4.3. Classification Algorithm Selection
Keeping the selected subset of 7 robust predictors of T2DM, six classification algorithms were applied to the training set. Ten repeats of 10-fold cross-validation were applied to calculate the true positive rate (sensitivity), true negative rate (specificity), and area under the receiver operating characteristic curve (AUC). The percentage of correctly classified instances and the central processing unit (CPU) time required for training the algorithms were also derived. The algorithms compared were: Naïve Bayes [28, 29], Logistic regression [30], Support Vector Machine (SVM) [31, 32], Multilayer Perceptron (MP) [33], Decision Tree with J48 modification [34], and One Rule [35].

[Fig. (2) flow chart content: From all patients with valid record (n=11,018,025), patients with no record of DM (n=10,487,077) were restricted to follow-up ≥ 1 year (n=9,587,202), from whom 75,000 negative instances were randomly selected. After excluding type 1 diabetes (n=46,238), gestational diabetes (n=15,814), prediabetes (n=86,800) and other types (n=2,439), patients with T2DM (Cohort 1, n=379,657) were restricted to follow-up ≥ 1 year and age at diagnosis 18-90 years (n=350,201), from whom 75,000 positive instances were randomly selected. Training set: n=150,000. Prediction set: n=9,937,403.]
The One Rule algorithm performed significantly worse than the others. Apart from differences in CPU time, the performance of the other algorithms was similar. Among them, Naïve Bayes had lower sensitivity, misclassifying approximately 500 additional patients compared to the other approaches. AUC was smaller for SVM and J48, while SVM and MP required significantly higher CPU time (Table 2). Interestingly, neither body mass index nor blood pressure contributed significantly to any model. Logistic regression was selected as the most appropriate model for predicting T2DM. The model obtained from the full training dataset was applied to all THIN database patients with no record of a Read code for a diabetes diagnosis other than T2DM, and with available follow-up of at least one year (Fig. (2), prediction set).
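For reference, the performance measures compared across the classifiers can be computed from labels, predictions and scores as sketched below. The function names are illustrative, the code assumes both classes are present, and the AUC is obtained through its rank interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative), which is an equivalent formulation rather than necessarily the routine used in the study.

```python
def confusion_rates(y_true, y_pred):
    """True positive rate, true negative rate and percent correct for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "tpr": tp / (tp + fn),                       # sensitivity
        "tnr": tn / (tn + fp),                       # specificity
        "percent_correct": 100.0 * (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of instances, so for a training set of the size used here a sort-based implementation would be preferable; the version above is kept for clarity.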
Table 2. Performance of machine learning algorithms on the training dataset.
Measure | Naïve Bayes | Logistic Regression | Multilayer Perceptron | Support Vector Machine | J48 Decision Tree | One Rule
--- | --- | --- | --- | --- | --- | ---
Percent correct | 95.6 | 95.9 | 95.9 | 95.9 | 95.9 | 91.7
TPR | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99
TNR | 0.93 | 0.93 | 0.93 | 0.93 | 0.93 | 0.84
AUC | 0.98 | 0.98 | 0.98 | 0.96 | 0.96 | 0.92
CPU time | 0.09 | 3.36 | 68.03 | 191.9 | 1.78 | 0.21

TPR: True Positive Rate; TNR: True Negative Rate; AUC: Area Under receiver operating characteristic Curve; CPU: Central Processing Unit.
3. RESULTS
The distributions of basic characteristics of patients identified by all four clinically guided algorithms and the ML algorithm were similar (Table 3). Clinically guided algorithms 1-4 and the ML algorithm resulted in cohorts of 379,657; 243,597; 197,326; 346,993; and 383,330 patients with T2DM, respectively. For patients identified by the ML algorithm who did not have a Read code, the first available date of entry of the significant predictors was used as their date of diagnosis. At the time of diabetes diagnosis, identified patients were on average 60 years old and weighed 86 kg, and 55% were male. The proportions of those who had two elevated glucose level measurements within one year were 75, 86, 90, 79, and 82% in cohorts identified by selection algorithms 1-4 and ML, respectively. With a median of 11 years of follow-up post diagnosis, the proportions of those who received at least one prescription for antidiabetic medication were 79, 81, 100, 87, and 75% in cohorts identified by selection algorithms 1-4 and ML, respectively.
Among the cohort of T2DM patients identified by the ML algorithm, 319,979 (83% of 383,330) patients had a Read code for T2DM (Table 4). It is worth noting that 59,678 (16% of 379,657) patients with a record of a T2DM Read code were not selected by the ML approach. Almost a fifth (17% of 383,330) of the patients in the ML cohort were without a record of a T2DM Read code. Of them, 52% had at least one measure of elevated glucose level and 22% had received at least one prescription for antidiabetic medication (Table 4).
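The overlap figures quoted in this paragraph can be re-derived from the cohort sizes. The short check below is only a worked arithmetic sketch; it uses the Table 4 count of 319,979 ML-identified patients with a Read code, and the variable names are illustrative.

```python
ml_cohort = 383_330         # patients identified by the ML algorithm
ml_with_code = 319_979      # of those, patients with a T2DM Read code (Table 4)
read_code_cohort = 379_657  # patients with a T2DM Read code (selection algorithm 1)

ml_without_code = ml_cohort - ml_with_code            # ML-identified, no Read code
code_not_selected = read_code_cohort - ml_with_code   # Read code, not selected by ML


def pct(part, whole):
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


print(ml_without_code, pct(ml_without_code, ml_cohort))             # 63351 17
print(code_not_selected, pct(code_not_selected, read_code_cohort))  # 59678 16
print(pct(ml_with_code, ml_cohort))                                 # 83
```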
In order to assess the proportion of patients that remain undetected by the algorithms used in this study, a complement cohort-specific analysis was performed (data not shown). Among patients not selected by ML as T2DM, only 884 patients had at least two elevated glucose measurements (HbA1c > 6% or fasting blood glucose > 7 mmol/l or random blood glucose > 11.1 mmol/l) within 1 year, compared to 32,039, 106,671, 137,796, and 42,583 patients not selected by selection algorithms 1-4.
Table 3. Baseline characteristics of T2DM patients identified by selection algorithms and logistic regression classifier (ML).
Characteristic | Selection Algorithm 1 | Selection Algorithm 2 | Selection Algorithm 3 | Selection Algorithm 4 | ML
--- | --- | --- | --- | --- | ---
Patients, n | 379,657 | 243,597 | 197,326 | 346,993 | 383,330
Age at diagnosis (years) α | 60 (15) | 59 (14) | 58 (14) | 60 (15) | 59 (15)
Age at diagnosis (years) * | 61 (50,71) | 60 (50,69) | 58 (49,67) | 60 (50,70) | 60 (50,70)
≤40 | 32,644 (9) | 19,761 (8) | 17,969 (9) | 29,701 (9) | 71,752 (19)
41-50 | 62,656 (17) | 43,872 (18) | 39,289 (20) | 59,608 (17) | 58,813 (15)
51-60 | 90,464 (24) | 62,610 (26) | 54,006 (27) | 85,587 (25) | 84,277 (22)
61+ | 193,893 (51) | 117,354 (48) | 86,062 (44) | 172,097 (50) | 168,488 (44)
Male # | 208,155 (55) | 134,393 (55) | 110,178 (56) | 191,107 (55) | 200,447 (52)
At least one prescription # | 300,722 (79) | 197,326 (81) | 197,326 (100) | 300,722 (87) | 287,095 (75)
Prescription duration ≥ 6 months # | 243,064 (64) | 171,800 (71) | 171,800 (87) | 243,064 (70) | 254,255 (66)
RBG (mmol/l) α § | 11.5 (5.1) | 11.4 (5.1) | 12.1 (5.3) | 11.6 (5.2) | 11.3 (5.2)
RBG (mmol/l) α ‡ | 9.5 (3.4) | 9.4 (3.3) | 9.9 (3.4) | 9.6 (3.4) | 9.1 (3.5)
FBG (mmol/l) α § | 8.4 (2.3) | 8.4 (2.3) | 8.9 (2.4) | 8.5 (2.3) | 8.3 (2.3)
FBG (mmol/l) α ‡ | 7.8 (2.1) | 7.7 (2.0) | 8.0 (2.1) | 7.8 (2.1) | 7.5 (2.1)
HbA1c (%) α § | 8.4 (2.1) | 8.4 (2.1) | 8.7 (2.2) | 8.5 (2.2) | 8.3 (2.1)
HbA1c (%) α ‡ | 7.5 (1.4) | 7.5 (1.3) | 7.7 (1.3) | 7.5 (1.4) | 7.4 (1.3)
Composite measure # ‡ | 283,419 (75) | 208,787 (86) | 177,689 (90) | 272,875 (79) | 314,574 (82)
Weight (kg) α § | 89.4 (20.8) | 90.3 (21.0) | 91.1 (21.1) | 89.6 (20.9) | 89.3 (21.0)
Weight (kg) α ‡ | 85.0 (19.8) | 86.6 (19.9) | 87.6 (20.0) | 85.5 (19.8) | 86.1 (20.6)
BMI (kg/m2) α § | 31.6 (6.7) | 32.0 (6.7) | 32.2 (6.7) | 31.7 (6.7) | 31.7 (6.8)
BMI (kg/m2) α ‡ | 30.2 (6.1) | 30.7 (6.1) | 31.0 (6.2) | 30.4 (6.1) | 30.7 (6.7)
Normal weight # | 22,311 (12) | 15,821 (11) | 12,339 (11) | 21,108 (12) | 24,453 (13)
Overweight # | 58,447 (32) | 44,283 (32) | 35,289 (31) | 55,885 (32) | 61,846 (32)
Grade 1 obese # | 52,465 (29) | 41,323 (30) | 33,669 (30) | 50,423 (29) | 55,684 (29)
Grade 2 obese # | 27,168 (15) | 22,163 (16) | 18,497 (16) | 26,336 (15) | 29,178 (15)
Any CVD # | 106,523 (28) | 67,011 (28) | 51,905 (26) | 96,147 (28) | 93,703 (24)
CKD # | 10,547 (3) | 8,035 (3) | 4,609 (2) | 9,445 (3) | 12,404 (3)
Cancer # | 24,159 (6) | 15,998 (7) | 11,084 (6) | 21,536 (6) | 22,112 (6)
Hypertension # | 149,752 (39) | 104,916 (43) | 79,193 (40) | 137,440 (40) | 140,341 (37)
Follow-up (years) * | 11 (6,17) | 10 (6,15) | 11 (6,16) | 11 (6,17) | 10 (5,16)

Legend: Selection algorithm 1: Read code only; Selection algorithm 2: Read code and lifestyle modification advice; Selection algorithm 3: Read code and medication and lifestyle modification advice; Selection algorithm 4: Read code and (medication or lifestyle modification advice); ML: Machine learned cohort; RBG: random blood glucose; FBG: fasting blood glucose; Composite measure: fasting blood glucose > 7 mmol/l or random blood glucose > 11.1 mmol/l or HbA1c > 6; BMI: Body Mass Index (kg/m2); Normal: (18.5-24.99); Overweight: (25-29.99); Grade 1 obese: (30-34.99); Grade 2 obese: (35-39.99); α: mean (SD); #: n (%); *: median (Q1,Q3); CKD: Chronic kidney disease; Any CVD: any cardiovascular disease, defined as occurrence of angina, MI, coronary heart disease (CHD), HF, stroke, and peripheral artery disease (PAD) on or before diagnosis of T2DM; §: measured at diagnosis; ‡: an average over all available measurements.
Table 4. Baseline characteristics and distribution of glycaemic markers among patients identified by ML.
Machine Learned T2DM Cohort (n=383,330)

Characteristic | With Read Code | Without Read Code
--- | --- | ---
Patients # | 319,979 (83) | 63,351 (17)
Age at diagnosis (years) α | 60 (14) | 54 (24)
Age at diagnosis (years) * | 60 (50, 70) | 56 (33, 73)
≤ 40 | 25,645 (8) | 46,107 (73)
41-50 | 56,583 (18) | 2,230 (4)
51-60 | 81,262 (25) | 3,015 (5)
61+ | 156,489 (49) | 11,999 (19)
Male # | 176,568 (55) | 23,879 (38)
At least one prescription # | 273,272 (85) | 13,823 (22)
Prescription duration ≥ 6 months # | 241,517 (76) | 12,738 (20)
RBG > 11.1 mmol/l # | 101,135 (32) | 1,471 (2)
FBG > 7 mmol/l # | 50,446 (16) | 1,695 (3)
HbA1c > 6% # | 274,565 (86) | 29,793 (47)
Composite measure # | 274,565 (86) | 29,793 (47)

Legend: RBG: random blood glucose; FBG: fasting blood glucose; Composite measure: fasting blood glucose > 7 mmol/l or random blood glucose > 11.1 mmol/l or HbA1c > 6; *: median (Q1,Q3); #: n (%); α: mean (SD).
4. DISCUSSION
In this study we addressed a number of problems encountered by computer-based methods in the complex task of identifying a disease cohort from large EMR databases. Specifically, (1) we have defined and discussed common technical challenges in differentiating diabetes subtypes, (2) combining clinical, medication and morbidity information with database patterns, we selected a set of best predictors as inputs to ML algorithms that can be used to identify patients with T2DM in the absence of any disease code, and (3) we compared T2DM cohorts identified by clinically guided selection algorithms and an ML algorithm. The results of this study are of particular interest to researchers who work with the THIN database; however, the methods explored in this study are generalizable to any EMR with a different disease coding system.
Although we have seen no difference in distributions of basic characteristics among cohorts obtained by deterministic and probabilistic approaches, ML algorithms were found to be superior. With the use of selected features, we could confirm that 83% of the patients identified by the ML algorithm had a Read code for T2DM (Table 4). Those without a Read code had comparably high risk as identified by the significant predictors. While 25% of patients with a Read code and 21% of patients with a Read code + (medication or lifestyle advice) for T2DM did not have at least two elevated measures of blood glucose within one year, only 18% of the ML-identified cohort did not have such measures. Among patients without an elevated composite glucose measure, 69% of those defined by Read code and 41% of those defined by ML did not receive an ADD for at least 6 months. It is important to note that patients without a Read code for diabetes are much less likely to have two elevated blood glucose measures within one year unless they were known to be diabetic or pre-diabetic.
Five of the six ML algorithms demonstrated similar performance in the training-testing data sets. The logistic regression approach was chosen as the best classifier for the THIN database; however, different feature patterns within other EMRs could potentially lead to better performance of other ML techniques in predicting a T2DM cohort. Tapak and colleagues [36] reported SVM as the better classifier, while Mani and colleagues [37] reported that decision trees outperformed other ML algorithms. In this context it is important to mention that ML algorithms cannot operate without meaningful input data (the “Garbage in, garbage out” principle). Although the use of different datasets makes direct comparisons difficult, a critical part of the ML steps is feature engineering or selection. Some recent studies have used large sets of variables associated with diabetes with the aim of enhancing predictive accuracy [38, 39]. However, this may be limited by the inclusion of irrelevant and redundant variables, and by model overfitting in cases where the number of observations is less than the number of variables. While earlier studies were primarily based on clinically guided feature selection, we adopted a more holistic approach, initially identifying data-driven candidates as potential predictors of T2DM from the whole database. Combining clinical knowledge and data-driven candidate predictors, we ensured selection of the most robust set of 7 predictors. Although the selected features were not surprising, we have seen that BMI, lifestyle modification advice and hypertension did not contribute to the models, while microvascular complications did.
We have compared the performance of six classification algorithms on a set of 150,000 instances, which was reconfirmed to be large enough by assessing the performance curves of several incremental classifiers. Nevertheless, the training dataset was small compared to the whole database; therefore, in order to ensure that our results are not prone to selection bias, we performed the same analyses on 2 other randomly selected training datasets and obtained almost identical results.
Unlike most ML applications, which focus on training to ensure the best fit for future predictions, in this study we have used various techniques to correct the available labelling, with the ultimate goal of improving the quality of the diseased cohort (type 2 diabetes). It would be of great interest to compare ML error, rule-based error, and human error in terms of predicting disease from available data. For this task, a “gold standard” dataset would consist of random patients whose true disease state was reconfirmed by approaching both clinician and patient. We were not able to conduct this task, as the THIN database contains de-identified patient-level data, which is true for all large EMR databases that are used for research purposes. The THIN database also does not have data on surrogate markers that could improve the quality of the cohort identification algorithms. Miscoding between type 1 and type 2 diabetes in primary care databases is not uncommon [40, 41]. It is important to mention that ML techniques may poorly distinguish between disease subtypes without incorporating additional classification rules. We have excluded patients with other diabetes Read codes from the dataset on which our ML algorithm was applied. Furthermore, for patients identified as T2DM without Read codes, the ML techniques are not able to provide an exact diagnosis date, therefore requiring incorporation of additional techniques.
CONCLUSION
Careful investigation of diagnostic code patterns within the databases is essential prior to conducting analyses on the disease cohort. Direct extraction of a disease cohort using diagnostic codes may lead to inclusion of falsely diagnosed patients and omission of patients with the true disease state. Rule-based techniques represent a conservative approach, which minimizes only false positive cases. ML techniques that minimize both false positive and false negative cases represent a more robust approach. However, ML techniques rely heavily on meaningful input and use diagnostic codes for training purposes. Combining human expertise and machine power represents the best strategy, one that allows hypotheses on potential disease predictors to be tested, lowers human intervention, and reduces the burden of selection bias.
LIST OF ABBREVIATIONS
ADD = Antidiabetic Drug
AUC = Area Under the Curve
BMI = Body Mass Index
CHD = Coronary Heart Disease
CPU = Central Processing Unit
CVD = Cardiovascular Disease
DM = Diabetes Mellitus
EMR = Electronic Medical Record
FBG = Fasting Blood Glucose
GP = General Practitioner
HbA1c = Glycated Haemoglobin
HES = Hospital Episode Statistics
HF = Heart Failure
ICD = International Classification of Diseases
MI = Myocardial Infarction
ML = Machine Learning
MP = Multilayer Perceptron
PAD = Peripheral Artery Disease
RBG = Random Blood Glucose
SD = Standard Deviation
SVM = Support Vector Machine
T1DM = Type 1 Diabetes Mellitus
T2DM = Type 2 Diabetes Mellitus
THIN = The Health Improvement Network
TNR = True Negative Rate
TPR = True Positive Rate
UK = United Kingdom
ETHICS APPROVAL AND CONSENT TO PARTICIPATE
The study protocol was approved by the Independent Scientific Review Committee for the THIN database (Protocol Number: 15THIN030) and the Institutional Review Board of QIMR Berghofer Medical Research Institute.
HUMAN AND ANIMAL RIGHTS
No animals/humans were used for the studies that are the basis of this research.
CONSENT FOR PUBLICATION
Not applicable.
CONFLICT OF INTEREST
Sanjoy K. Paul has acted as a consultant and/or speaker for Novartis, GI Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi Pharmaceutical and Amylin Pharmaceuticals LLC. He has received grants in support of investigator and investigator-initiated clinical studies from Merck, Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis and Pfizer. Ebenezer S. Owusu Adjah, Olga Montvida, and Julius Agbeve have no conflict of interest to declare.
ACKNOWLEDGEMENTS
Sanjoy K. Paul conceived the idea and was responsible for the primary design of the study. Ebenezer S. Owusu Adjah and Olga Montvida significantly contributed to the study design. Julius Agbeve conducted the primary raw data extraction. Ebenezer S. Owusu Adjah and Olga Montvida jointly conducted the data extraction, data manipulation and statistical analyses, and developed the first draft of the manuscript. Ebenezer S. Owusu Adjah, Olga Montvida, Sanjoy K. Paul, and Julius Agbeve contributed to the finalization of the manuscript. Sanjoy K. Paul had full access to all the data in the study and is the guarantor, taking responsibility for the integrity of the data and the accuracy of the data analysis. Ebenezer S. Owusu Adjah was supported by the QIMR Berghofer International Ph.D. Scholarship and The University of Queensland International Scholarship. Olga Montvida was supported by the Queensland University of Technology International Scholarship. No separate funding was obtained for this study. Melbourne EpiCentre gratefully acknowledges the support from the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS) initiative through Therapeutic Innovation Australia and the research project funding from the National Health and Medical Research Council of Australia (Project Number: GNT1063477). Olga Montvida acknowledges the support from her associate supervisors Prof. Ross Young and Prof. Louise Hafner.
REFERENCES
[1] Sagreiya H, Altman RB. The utility of general purpose versus specialty clinical databases for research: Warfarin dose estimation from extracted clinical variables. J Biomed Inform 2010; 43(5): 747-51. [http://dx.doi.org/10.1016/j.jbi.2010.03.014] [PMID: 20363365]
[2] Shivade C, Raghavan P, Fosler-Lussier E, et al. A review of approaches to identifying patient phenotype cohorts using electronic health records. J Am Med Inform Assoc 2014; 21(2): 221-30. [http://dx.doi.org/10.1136/amiajnl-2013-001935] [PMID: 24201027]
[3] Tate AR, Beloff N, Al-Radwan B, et al. Exploiting the potential of large databases of electronic health records for research using rapid search algorithms and an intuitive query interface. J Am Med Inform Assoc 2014; 21(2): 292-8. [http://dx.doi.org/10.1136/amiajnl-2013-001847] [PMID: 24272162]
[4] Kandula S, Zeng-Treitler Q, Chen L, Salomon WL, Bray BE. A bootstrapping algorithm to improve cohort identification using structured data. J Biomed Inform 2011; 44(Suppl. 1): S63-8. [http://dx.doi.org/10.1016/j.jbi.2011.10.013] [PMID: 22079803]
[5] Sadek AR, Van Vlymen J, Khunti K, De Lusignan S. Automated identification of miscoded and misclassified cases of diabetes from computer records. Diabet Med 2012; 29(3): 410-4. [http://dx.doi.org/10.1111/j.1464-5491.2011.03457.x] [PMID: 21916978]
[6] Read J. The Read clinical classification (Read codes). Br Homeopath J 1991; 80(1): 14-20. [http://dx.doi.org/10.1016/S0007-0785(05)80418-1]
[7] Herrett E, Thomas SL, Schoonen WM, Smeeth L, Hall AJ. Validation and validity of diagnoses in the General Practice Research Database: A systematic review. Br J Clin Pharmacol 2010; 69(1): 4-14. [http://dx.doi.org/10.1111/j.1365-2125.2009.03537.x] [PMID: 20078607]
[8] Hammad TA, Margulis AV, Ding Y, Strazzeri MM, Epperly H. Determining the predictive value of Read codes to identify congenital cardiac malformations in the UK Clinical Practice Research Datalink. Pharmacoepidemiol Drug Saf 2013; 22(11): 1233-8. [http://dx.doi.org/10.1002/pds.3511] [PMID: 24002995]
[9] Khan NF, Harrison SE, Rose PW. Validity of diagnostic coding within the General Practice Research Database: A systematic review. Br J Gen Pract 2010; 60(572): e128-36. [http://dx.doi.org/10.3399/bjgp10X483562]
[10] Stone MA, Camosso-Stefinovic J, Wilkinson J, de Lusignan S, Hattersley AT, Khunti K. Incorrect and incomplete coding and classification of diabetes: A systematic review. Diabet Med 2010; 27(5): 491-7. [http://dx.doi.org/10.1111/j.1464-5491.2009.02920.x] [PMID: 20536944]
[11] De Lusignan S, Sadek K, McDonald H, et al. Call for consistent coding in diabetes mellitus using the Royal College of General Practitioners and NHS pragmatic classification of diabetes. Inform Prim Care 2012; 20(2): 103-13. [PMID: 23710775]
[12] Seidu S, Davies MJ, Mostafa S, de Lusignan S, Khunti K. Prevalence and characteristics in coding, classification and diagnosis of diabetes in primary care. Postgrad Med J 2014; 90(1059): 13-7. [http://dx.doi.org/10.1136/postgradmedj-2013-132068] [PMID: 24225940]
[13] De Lusignan S, Liaw S-T, Dedman D, Khunti K, Sadek K, Jones S. An algorithm to improve diagnostic accuracy in diabetes in computerised problem orientated medical records (POMR) compared with an established algorithm developed in episode orientated records (EOMR). J Innov Health Inform 2015; 22(2): 255-64. [http://dx.doi.org/10.14236/jhi.v22i2.79] [PMID: 26245239]
[14] De Lusignan S, Khunti K, Belsey J, et al. A method of identifying and correcting miscoding, misclassification and misdiagnosis in diabetes: A pilot and validation study of routinely collected data. Diabet Med 2010; 27(2): 203-9. [http://dx.doi.org/10.1111/j.1464-5491.2009.02917.x] [PMID: 20546265]
[15] Holt TA, Gunnarsson CL, Cload PA, Ross SD. Identification of undiagnosed diabetes and quality of diabetes care in the United States: Cross-sectional study of 11.5 million primary care electronic records. CMAJ Open 2014; 2(4): E248-55. [http://dx.doi.org/10.9778/cmajo.20130095] [PMID: 25485250]
[16] Holt TA, Stables D, Hippisley-Cox J, O’Hanlon S, Majeed A. Identifying undiagnosed diabetes: cross-sectional survey of 3.6 million patients’ electronic records. Br J Gen Pract 2008; 58(548): 192-6. [http://dx.doi.org/10.3399/bjgp08X277302] [PMID: 18318973]
[17] Magliano DJ, Zimmet P, Shaw J. US trends for diabetes prevalence among adults. JAMA 2016; 315(7): 705. [http://dx.doi.org/10.1001/jama.2015.16455] [PMID: 26881376]
[18] Blak BT, Thompson M, Dattani H, Bourke A. Generalisability of The Health Improvement Network (THIN) database: Demographics, chronic disease prevalence and mortality rates. Inform Prim Care 2011; 19(4): 251-5. [PMID: 22828580]
[19] Denburg MR, Haynes K, Shults J, Lewis JD, Leonard MB. Validation of The Health Improvement Network (THIN) database for epidemiologic studies of chronic kidney disease. Pharmacoepidemiol Drug Saf 2011; 20(11): 1138-49. [http://dx.doi.org/10.1002/pds.2203] [PMID: 22020900]
[20] IMS Health Incorporated. The Health Improvement Network (THIN) database. London: IMS Health Incorporated 2017. Available at: http://www.csdmruk.imshealth.com/index.html
[21] Gray J, Orr D, Majeed A. Use of Read codes in diabetes management in a south London primary care group: Implications for establishing disease registers. BMJ 2003; 326(7399): 1130. [http://dx.doi.org/10.1136/bmj.326.7399.1130] [PMID: 12763987]
[22] Rollason W, Khunti K, De Lusignan S. Variation in the recording of diabetes diagnostic data in primary care computer systems: Implications for the quality of care. Inform Prim Care 2009; 17(2): 113-9. [PMID: 19807953]
[23] Lycett D, Nichols L, Ryan R, et al. The association between smoking cessation and glycaemic control in patients with type 2 diabetes: A THIN database cohort study. Lancet Diabetes Endocrinol 2015; 3(6): 423-30. [http://dx.doi.org/10.1016/S2213-8587(15)00082-0] [PMID: 25935880]
[24] American Diabetes Association. Standards of Medical Care in Diabetes-2015. Diabetes Care 2015; 38(Suppl. 1): S4. [http://dx.doi.org/10.2337/dc15-S003]
[25] Hall MA. Correlation-based feature selection for machine learning. PhD dissertation. Hamilton, NZ: University of Waikato, 1999.
[26] Senliol B, Gulgezen G, Yu L, Cataltepe Z. Fast Correlation Based Filter (FCBF) with a different search strategy. Computer and Information Sciences, 2008. ISCIS'08. 23rd International Symposium. Istanbul, Turkey: IEEE, 2008.
[27] Witten IH, Frank E. Data Mining: Practical machine learning tools and techniques. Burlington, MA: Morgan Kaufmann 2005.
[28] Friedman N, Geiger D, Goldszmidt M. Bayesian Network Classifiers. Mach Learn 1997; 29(2): 131-63.
[29] John GH, Langley P, Eds. Estimating continuous distributions in Bayesian classifiers. Proceedings of the Eleventh conference on Uncertainty in artificial intelligence. Burlington, MA: Morgan Kaufmann Publishers Inc. 338-45.
[30] Schmidt M, Roux NL, Bach F. Minimizing finite sums with the stochastic average gradient. Math Program 2017; 162(1-2): 83-112.
[31] Cortes C, Vapnik V. Support-vector networks. Mach Learn 1995; 20(3): 273-97. [http://dx.doi.org/10.1007/BF00994018]
[32] Wu T-F, Lin C-J, Weng RC. Probability estimates for multi-class classification by pairwise coupling. J Mach Learn Res 2004; 5: 975-1005.
[33] Ruck DW, Rogers SK, Kabrisky M. Feature selection using a multilayer perceptron. J Neural Netw Comput 1990; 2(2): 40-8.
[34] Loh W-Y. Improving the precision of classification trees. Ann Appl Stat 2009; 3(4): 1710-37. [http://dx.doi.org/10.1214/09-AOAS260]
[35] Holte RC. Very simple classification rules perform well on most commonly used datasets. Mach Learn 1993; 11(1): 63-90. [http://dx.doi.org/10.1023/A:1022631118932]
[36] Tapak L, Mahjub H, Hamidi O, Poorolajal J. Real-data comparison of data mining methods in prediction of diabetes in Iran. Healthc InformRes 2013; 19(3): 177-85.
189
Cohort Identification from Primary Care Database The Open Bioinformatics Journal , 2017, Volume 10 27
[http://dx.doi.org/10.4258/hir.2013.19.3.177] [PMID: 24175116]
[37] Mani S, Chen Y, Elasy T, Clayton W, Denny J. Type 2 diabetes risk forecasting from EMR data using machine learning. AMIA Annu SympProc 2012. 606-15.
[38] Zheng T, Xie W, Xu L, et al. A machine learning-based framework to identify type 2 diabetes through electronic health records. Int J MedInform 2017; 97: 120-7.[http://dx.doi.org/10.1016/j.ijmedinf.2016.09.014] [PMID: 27919371]
[39] Razavian N, Blecker S, Schmidt AM, Smith-McLallen A, Nigam S, Sontag D. Population-level prediction of type 2 diabetes from claims dataand analysis of risk factors. Big Data 2015; 3(4): 277-87.[http://dx.doi.org/10.1089/big.2015.0020] [PMID: 27441408]
[40] Thomas G, Klein K, Paul S. Statistical challenges in analysing large longitudinal patient-level data: The danger of misleading clinicalinferences with imputed data. J Indian Soc Agric Stat 2014; 68(2): 39-54.
[41] Khunti K, Davies M, Majeed A, Thorsted BL, Wolden ML, Paul SK. Hypoglycemia and risk of cardiovascular disease and All-causemortality in insulin-treated people with type 1 and type 2 diabetes: A cohort study. Diabetes Care 2015; 38(2): 316-22.[PMID: 25492401]
© 2017 Owusu Adjah et al.
This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: (https://creativecommons.org/licenses/by/4.0/legalcode). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
APPENDIX C
ORIGINAL ARTICLE
Weight gain in insulin-treated patients by body mass index category at treatment initiation: new evidence from real-world data in patients with type 2 diabetes
S. K. Paul PhD1 | J. E. Shaw MSc2 | O. Montvida MSc1,3 | K. Klein PhD1
1Clinical Trials and Biostatistics Unit, QIMR
Berghofer Medical Research Institute,
Brisbane, Australia
2Baker IDI Heart and Diabetes Institute,
Melbourne, Australia
3School of Biomedical Sciences, Faculty of
Health, Institute of Health and Biomedical
Innovation, Queensland University of
Technology, Brisbane, Australia
Corresponding Author: Prof. S. K. Paul, Clinical
Trials and Biostatistics Unit, QIMR Berghofer
Medical Research Institute, Brisbane,
Queensland 4006, Australia
Funding information
The QIMR Berghofer Medical Research
Institute gratefully acknowledges the support
from the National Health and Medical
Research Council and the Australian
Government’s National Collaborative Research
Infrastructure Strategy (NCRIS) initiative
through Therapeutic Innovation Australia.
Aims: To evaluate, in patients with type 2 diabetes (T2DM) treated with insulin, the extent of
weight gain over 2 years of insulin treatment, and the dynamics of weight gain in relation to
glycaemic achievements over time according to adiposity levels at insulin initiation.
Materials and methods: Patients with T2DM (n = 155 917), who commenced insulin therapy
and continued it for at least 6 months, were selected from a large database of electronic medi-
cal records in the USA. Longitudinal changes in body weight and glycated haemoglobin (HbA1c)
according to body mass index (BMI) category were estimated.
Results: Patients had a mean age of 59 years, a mean HbA1c level of 9.5%, and a mean BMI of
35 kg/m2 at insulin initiation. The HbA1c levels at insulin initiation were significantly lower
(9.2-9.4%) in the obese patients than in patients with normal body weight (10.0%); however,
the proportions of patients with HbA1c >7.5% or >8.0% were similar across the BMI cate-
gories. The adjusted weight gain fell progressively with increasing baseline BMI category over
6, 12 and 24 months (p < .01). The adjusted changes in HbA1c were similar across BMI cate-
gories. A 1% decrease in HbA1c was associated with progressively less weight gain as pretreat-
ment BMI rose, ranging from a 1.24 kg gain in those with a BMI <25 kg/m2 to a 0.32 kg loss in
those with a BMI > 40 kg/m2.
Conclusions: During 24 months of insulin treatment, obese patients gained significantly less
body weight than normal-weight and overweight patients, while achieving clinically similar
glycaemic benefits. These data provide reassurance with regard to the use of insulin in
obese patients.
KEYWORDS
body mass index, glycaemic control, insulin initiation, type 2 diabetes, weight change
1 | INTRODUCTION
Type 2 diabetes (T2DM) is a progressive disease in which β-cell
function continues to decline over time, leading to the need for
insulin treatment in a significant proportion of patients. Although
studies suggest that early initiation of insulin supplementation may
alter the progressive course of T2DM, insulin initiation is often
delayed, mainly because of patients’ hesitation and physician
barriers.1–7 This may significantly increase the risk of developing dia-
betic complications.7,8 Fear of weight gain is one of the common
reasons for delaying insulin therapy, and concerns about potential
weight gain are usually greatest for those who are already
obese.8–11
While many studies have reported that significant weight gain is
associated with insulin therapy, no study, to the best of our knowl-
edge, has explored the weight gain with insulin therapy according to
adiposity levels at the time of insulin initiation. If it were the case that
insulin-related weight gain declines with increasing pretreatment adi-
posity, then significant reassurance could be provided to obese
patients, and timely insulin therapy could be initiated.
In the present longitudinal study, using data from a comprehen-
sive electronic medical record database, the aims were to evaluate
Received: 2 June 2016 Revised: 31 July 2016 Accepted: 1 August 2016
DOI 10.1111/dom.12761
1244 © 2016 John Wiley & Sons Ltd wileyonlinelibrary.com/journal/dom Diabetes Obes Metab December 2016; 18: 1244–1252
the following, according to pretreatment body mass index (BMI) cate-
gory: (1) change in body weight during the 2 years after initiation of
insulin treatment; (2) glycaemic control in these patients in relation to
weight change; and (3) weight gain associated with a 1 percentage
point improvement in glycated haemoglobin (HbA1c).
2 | MATERIALS AND METHODS
2.1 | Data source
The General Electric Centricity Electronic Medical Records
(CEMR) contain more than 40 million patients’ clinical/treatment
records from 1995 to January 2015. The CEMR represents 49 US
states and includes data from >35 000 healthcare providers, of
which ~70% are primary care providers. The CEMR database stores
data on medication prescriptions within the electronic medical
records network, and also information on medications that may be
used over the counter or prescribed outside of the electronic medi-
cal records network. This database includes insured and uninsured
patients, and has been extensively used for academic research
worldwide.12–16
From more than 1.6 million patients with T2DM, a cohort of
432 287 patients, who received at least one prescription of insulin on
or after diagnosis of T2DM, was selected. Identification of T2DM
was based on the International Classification of Diseases (ICD)-9
codes and at least two prescriptions for any antidiabetes drugs within
6 months of diagnosis of T2DM. Patients with incomplete description
of the disease and with any record of type 1 diabetes longitudinally
were excluded. Inclusion criteria were no missing data on age, sex or
ethnicity at diagnosis of T2DM, age at insulin initiation between
18 and 75 years, first prescription date of insulin on or after 1 January
1995, no missing data on body weight and HbA1c at and within
3 months of the first date of insulin prescription, and no prescription
of glucagon-like peptide-1 (GLP-1) receptor agonists during the
follow-up period. The sizes of the cohorts that had 6, 12 and
24 months of insulin treatment duration were 155 917, 151 220
(sub-cohort 1) and 144 857 (sub-cohort 2), respectively.
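The selection criteria above can be read as a single eligibility check per patient. The following is a minimal Python sketch under stated assumptions: the `Patient` record and its field names are hypothetical (they are not the CEMR schema), the age bounds are taken as inclusive, and any GLP-1 receptor agonist prescription during follow-up is treated as an exclusion, in line with the statement in the Discussion that such patients were excluded.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Patient:
    """Hypothetical per-patient summary; field names are illustrative only."""
    age_at_insulin: Optional[float]
    sex: Optional[str]
    ethnicity: Optional[str]
    first_insulin_date: date
    baseline_weight_kg: Optional[float]
    baseline_hba1c: Optional[float]
    any_glp1ra: bool          # any GLP-1 receptor agonist prescription in follow-up
    any_type1_record: bool    # any record of type 1 diabetes
    insulin_months: float     # duration of insulin treatment


def eligible(p: Patient, min_insulin_months: float = 6) -> bool:
    """Return True if the patient meets the criteria described in the text."""
    if p.any_type1_record or p.any_glp1ra:
        return False
    required = (p.age_at_insulin, p.sex, p.ethnicity,
                p.baseline_weight_kg, p.baseline_hba1c)
    if any(value is None for value in required):
        return False
    if not 18 <= p.age_at_insulin <= 75:  # assumption: bounds are inclusive
        return False
    if p.first_insulin_date < date(1995, 1, 1):
        return False
    return p.insulin_months >= min_insulin_months
```

Calling `eligible(p, min_insulin_months=12)` or `eligible(p, min_insulin_months=24)` would correspond to sub-cohorts 1 and 2, respectively.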
Patients’ BMI at insulin initiation was categorized as: BMI < 25
kg/m2 (normal weight); BMI ≥ 25 and <30 kg/m2 (overweight);
BMI ≥ 30 and <35 kg/m2 (Grade 1 obesity); BMI ≥ 35 and <40 kg/
m2 (Grade 2 obesity); and BMI ≥ 40 kg/m2 (Grade 3 obesity).
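The BMI categories defined above can be expressed as a small lookup. This is an illustrative Python sketch only; the function name is hypothetical, and the cut-points are those stated in the text.

```python
def bmi_category(bmi: float) -> str:
    """Map a BMI value (kg/m2) at insulin initiation to its category label."""
    if bmi <= 0:
        raise ValueError("BMI must be positive")
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    if bmi < 35:
        return "Grade 1 obesity"
    if bmi < 40:
        return "Grade 2 obesity"
    return "Grade 3 obesity"
```

Each lower bound is inclusive and each upper bound exclusive, so a BMI of exactly 30 kg/m2 falls in Grade 1 obesity.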
Baseline data included age, sex, ethnicity, body weight, BMI and
blood pressure at the time of diagnosis of diabetes and at the time of
insulin initiation (index date). Longitudinal clinical and laboratory mea-
sures were arranged on the basis of 6-monthly windows, progres-
sively from 6 months before the index date, and only the latest
measurement within each window was preserved. For example, the
latest HbA1c value measured >6 and ≤12 months after the index
date was kept as HbA1c at 12 months. Complete information on
antidiabetes drugs, antihypertensive and cardioprotective medica-
tions over time was obtained, along with dates of prescriptions. For
antidiabetes drugs, information was extracted on any medication that
was prescribed after diagnosis of diabetes and after the index date.
The treatment duration with individual medications was calculated.
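The 6-monthly windowing rule described above (keep only the latest measurement in each window) might be sketched as follows in Python. The function names are hypothetical, the average month length of 30.4375 days is an assumption, and so is the treatment of the baseline window as the 6 months up to and including the index date.

```python
import math
from datetime import date

DAYS_PER_MONTH = 30.4375  # assumption: average calendar month length


def window_label(index_date: date, measured_on: date, width_months: int = 6) -> int:
    """Upper bound (months from index date) of the window holding a measurement.

    A value measured >6 and <=12 months after the index date gets label 12.
    """
    months = (measured_on - index_date).days / DAYS_PER_MONTH
    return math.ceil(months / width_months) * width_months


def latest_per_window(index_date, measurements, first=0, last=24):
    """Keep the latest value in each window; measurements are (date, value) pairs."""
    kept = {}
    for measured_on, value in measurements:
        label = window_label(index_date, measured_on)
        if label < first or label > last:
            continue  # outside the windows of interest
        if label not in kept or measured_on > kept[label][0]:
            kept[label] = (measured_on, value)
    return {label: kept[label][1] for label in sorted(kept)}
```

For example, two HbA1c values recorded 8 and 11 months after the index date both fall in the 12-month window, and only the later one is retained.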
2.2 | Statistical methods
The proportions of patients with missing data on body weight
and HbA1c between 6 and 24 months of follow-up ranged from
9% to 16% and from 8% to 17%, respectively. The missing data were
imputed using a multiple imputation technique, with adjustments for
age at insulin initiation, diabetes duration at insulin initiation and
usage of oral antidiabetes drugs (OAD) during follow-up. All primary
analyses were conducted using the imputed weight and HbA1c data,
with additional analyses based on complete cases for sensitivity
analyses.
The mean [95% confidence interval (CI)] changes in body weight
and HbA1c at 6, 12 and 24 months for the main study cohort and
the two sub-cohorts were obtained using weighted multivariate
regression models, adjusting for age at index date, sex, diabetes dura-
tion at index date, OAD usage, and distribution of weight or HbA1c
at index date, separately for BMI categories at index date. The
regression models for weight and HbA1c change were weighted by
baseline weight and HbA1c respectively. The mean (95% CI) values
for the possible marginal contribution of sulphonylurea usage to
weight and HbA1c changes were estimated using the same regres-
sion models, as appropriate. Separate sensitivity analyses were con-
ducted using data from patients with a minimum of 2 years’ diabetes
duration at index date, to verify the consistency of the findings on
weight change according to BMI category at index date. Additional
sensitivity analyses included evaluations of changes in body weight
and HbA1c, with further adjustments for the insulin regimen at index
date, and among those who did not undergo any bariatric surgery
procedure before insulin initiation or during follow-up (n = 153 788).
To evaluate the possible association of a 1% reduction in HbA1c
by insulin ± other antidiabetes drugs with weight gain over 6, 12 and
24 months of insulin treatment, multivariate factorial regression mod-
els were fitted. For example, to evaluate the possible association of a
1% reduction in HbA1c at 12 months of treatment, the fitted
model was:
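\[
\begin{aligned}
\left(\text{weight}_{12\,\text{m}} - \text{weight}_{\text{index date}}\right) \sim f\big\{ & \text{age}_{\text{index date}} + \text{sex} + \text{diabetes duration}_{\text{index date}} \\
& + \text{use of any OAD on index date or during 12 months of follow-up} \\
& + \left(\text{BMI categories}_{\text{index date}}\right) \times \left(\text{HbA1c}_{12\,\text{m}} - \text{HbA1c}_{\text{index date}}\right) \big\},
\end{aligned}
\]
weighted by $\text{weight}_{\text{index date}}$.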
The regression-based approach described above was also used to
evaluate the possible differences in the patterns of association of
HbA1c change with weight change in people with different BMI
levels at insulin initiation.
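To illustrate how a category-specific association between HbA1c change and weight change can be estimated, the following Python sketch fits a weighted least-squares line within each BMI category, weighting by baseline weight. It is a deliberately simplified stand-in for the factorial model above: it omits the adjustments for age, sex, diabetes duration and OAD use, and the record keys (`bmi_category`, `hba1c_change`, `weight_change`, `baseline_weight`) are hypothetical.

```python
def weighted_slope(x, y, w):
    """Weighted least-squares slope of y on x (one predictor plus intercept)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


def gain_per_1pct_reduction(records):
    """Weight change (kg) associated with a 1-point HbA1c reduction, by category."""
    by_category = {}
    for r in records:
        by_category.setdefault(r["bmi_category"], []).append(r)
    result = {}
    for category, rows in by_category.items():
        slope = weighted_slope(
            [r["hba1c_change"] for r in rows],
            [r["weight_change"] for r in rows],
            [r["baseline_weight"] for r in rows],
        )
        # A 1-point reduction is an HbA1c change of -1, hence the sign flip.
        result[category] = -slope
    return result
```

A positive value indicates weight gain per 1% HbA1c reduction and a negative value indicates weight loss, which is the sign convention used in Table 3.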
3 | RESULTS
The demographic, clinical, laboratory and medications data at the
time of insulin initiation are shown in Table 1, for the whole study
cohort (n = 155 917) and the two sub-cohorts defined on the basis
of minimum insulin treatment duration of 12 months (n = 151 220)
and 24 months (n = 144 857). In the whole cohort, patients had a
mean [standard deviation (s.d.)] age of 59 (11) years, 48% were male,
54% were white, and the median diabetes duration at index date was
~2 years. The mean HbA1c at insulin initiation (9.5%) and the propor-
tions of patients with HbA1c >7.5% and >8% were similar in all
cohorts.
Table 2 shows the baseline characteristics according to BMI cate-
gory. Female and white patients were more likely to have a higher
BMI at index date. The diabetes duration was similar among the BMI
groups. The mean HbA1c levels at index date were significantly lower
(9.2-9.4%) in the obese patients than in patients with normal body
weight (10.0%); however, the proportions of patients with HbA1c
>7.5% and >8.0% were similar across the BMI categories. The obese
patients were more likely to be on concomitant metformin and/or
sulphonylurea therapies than patients with normal weight.
3.1 | Weight change
Weight change over 24 months is shown in Figure 1 and Table 3.
The weight gain was significantly and consistently lower in patients
with a higher BMI, compared with that in patients with normal
body weight. In Grade 1 and Grade 2 obese patients, the adjusted
mean weight gain over 6 and 12 months of insulin treatment ran-
ged between 0.1 and 0.9 kg (Table 3), combining the main cohort
and sub-cohort 1. In Grade 3 obese patients, the adjusted reduc-
tions in body weight were 0.7, 1.1 and 2.2 kg over 6, 12 and
24 months of insulin treatment, respectively. The adjusted mean
weight gain in the normal-weight patients ranged between 2 and
TABLE 1 Baseline characteristics of main cohort and two sub-cohorts
Main cohort (insulin treatment ≥ 6 months) | Sub-cohort 1 (insulin treatment ≥ 12 months) | Sub-cohort 2 (insulin treatment ≥ 24 months)
N 155 917 151 220 144 857
Male, n (%) 75 038 (48) 72 797 (48) 69 788 (48)
Ethnicity, n (%)
White 83 441 (54) 80 840 (54) 77 431 (54)
Black 21 658 (14) 21 086 (14) 20 274 (14)
Hispanic 6180 (4) 6004 (4) 5740 (4)
Asian 2789 (2) 2713 (2) 2616 (2)
Other/Unknown 41 489 (27) 40 577 (27) 38 796 (27)
Mean (s.d.) age at insulin initiation, years 59 (11) 59 (11) 59 (11)
Diabetes duration
Median (Q1, Q3), years 1.9 (0.2, 2.6) 1.9 (0.2, 2.6) 2.0 (0.2, 2.7)
<2 years, n (%) 102 038 (72) 98 911 (72) 94 655 (71)
2-5 years, n (%) 25 754 (18) 25 020 (18) 24 027 (18)
>5 years, n (%) 14 848 (10) 14 431 (10) 13 856 (11)
Mean (s.d.) weight, kg 98.8 (26.0) 98.6 (26.0) 98.6 (25.9)
Mean (s.d.) BMI, kg/m2 34.8 (8.4) 34.8 (8.5) 34.8 (8.4)
BMI category1, n (%)
<25 kg/m2 13 880 (9) 13 585 (9) 13 153 (9)
≥25 and <30 kg/m2 32 047 (21) 31 254 (21) 30 106 (21)
≥30 and < 35 kg/m2 43 274 (28) 41 916 (28) 40 153 (28)
≥35 and <40 kg/m2 32 445 (21) 31 385 (21) 29 940 (21)
≥40 kg/m2 34 271 (22) 33 080 (22) 31 505 (22)
HbA1c
Mean (s.d.) HbA1c, % 9.5 (2.3) 9.5 (2.3) 9.5 (2.2)
Median (Q1, Q3) HbA1c, % 9.2 (7.7, 11.0) 9.2 (7.7, 10.9) 9.2 (7.7, 10.9)
HbA1c ≥ 7.5%, n (%) 123 802 (79) 120 111 (79) 114 965 (79)
HbA1c ≥ 8%, n (%) 111 368 (71) 108 052 (72) 103 379 (71)
Antidiabetes medication, n (%)
Metformin 108 377 (70) 105 084 (70) 100 531 (69)
Sulphonylurea 74 492 (48) 72 305 (48) 69 214 (48)
Insulin type at initiation, n (%)
Basal 99 299 (65) 96 343 (65) 92 212 (65)
Biphasic 21 136 (14) 20 548 (14) 19 843 (14)
Prandial 31 314 (21) 30 335 (21) 29 044 (21)
1 BMI categories at insulin initiation: <25 kg/m2 (normal weight); ≥25 and <30 kg/m2 (overweight); ≥30 and <35 kg/m2 (Grade 1 obesity); ≥35 and <40 kg/m2 (Grade 2 obesity); and ≥40 kg/m2 (Grade 3 obesity).
4 kg over 6-24 months of insulin treatment. In normal-weight and
overweight patients, the mean weight gain was significantly higher
than in obese patients. As evident from Figure 1C, normal-weight
and overweight patients continuously gained weight over 2 years
of follow-up, while declining body weight trajectories were
observed in obese patients after 12 months of insulin initiation.
The proportions of patients with weight gain of ≥5 kg were sig-
nificantly lower in obese groups (16% and 19%) than in normal-
weight patients (28% and 37%) during 12 and 24 months of insulin
treatment (Table 3). Sulphonylurea treatment only marginally contrib-
uted to the weight gain (0.17-0.27 kg over 6-24 months of insulin
treatment).
Approximately 28% of patients (n = 53 879) had a minimum of
2 years’ diabetes duration at insulin initiation (Table S1). The patterns
of weight change by BMI category in this subset of patients were
similar to those observed in all patients across the insulin treatment
duration categories. The observed weight changes in different BMI
categories were similar after adjustments for insulin regimen, and also
for those who did not undergo any bariatric procedure(s).
3.2 | Glycaemic control
The 6-monthly longitudinal measures [mean (95% CI)] of HbA1c over
24 months from index date are shown in Figure 1B. The adjusted
changes in HbA1c at 6, 12 and 24 months over the BMI categories
are shown in Table 4. Starting with a significantly higher HbA1c of
10% at the index date (compared with overweight and obese
patients), the normal-weight patients had a mean reduction in HbA1c
of ~1.4% over 6-24 months of insulin treatment. The overweight,
Grade 1 and Grade 2 obese patients had similar glycaemic achieve-
ments over the follow-up time (1.0-1.3%). Although the HbA1c
reductions in Grade 3 obese patients were statistically significantly
lower compared with the other groups of overweight and obese
patients, these were clinically marginal differences.
In the main study cohort, the proportions of patients on metfor-
min, sulphonylureas or both medications were 70%, 48% and 39%,
respectively. Among normal-weight patients with a minimum of
24 months of insulin treatment (13 153/144 857 patients), the pro-
portions of patients receiving metformin, sulphonylurea or both were
TABLE 2 Study variables and concomitant medication usage at insulin initiation according to BMI category in patients with a minimum
24 months of insulin treatment (sub-cohort 2)
BMI category: <25 kg/m2 | ≥25 and <30 kg/m2 | ≥30 and <35 kg/m2 | ≥35 and <40 kg/m2 | ≥40 kg/m2
N 13 153 30 106 40 153 29 940 31 505
Male, n (%) 7126 (54) 16 925 (56) 20 847 (52) 13 592 (45) 11 298 (36)
Ethnicity, n (%)
White 6200 (47) 14 959 (50) 21 491 (54) 16 727 (56) 18 054 (57)
Black 1851 (14) 4244 (14) 5414 (14) 3967 (13) 4798 (15)
Hispanic 584 (4) 1577 (5) 1641 (4) 1051 (4) 887 (3)
Asian 639 (5) 891 (3) 603 (2) 267 (1) 216 (1)
Other/Unknown 3879 (30) 8435 (28) 11 004 (28) 7928 (27) 7550 (24)
Mean (s.d.) age at insulin initiation, years 58 (12) 60 (11) 59 (11) 59 (11) 57 (12)
Diabetes duration, years
Median (Q1, Q3) 1.6 (0.1, 1.9) 1.8 (0.1, 2.4) 1.9 (0.1, 2.4) 1.8 (0.1, 2.5) 1.8 (0.1, 2.5)
<2 years, n (%) 9223 (76) 19 800 (71) 26 050 (71) 19 281 (71) 20 301 (71)
2–5 years, n (%) 1927 (16) 5004 (18) 6697 (18) 5076 (19) 5323 (19)
>5 years, n (%) 1015 (8) 2952 (11) 3937 (11) 2892 (11) 3060 (11)
Mean (s.d.) weight, kg 64.4 (10.5) 79.8 (11.1) 93.8 (13.9) 106.4 (15.0) 130.2 (23.1)
Mean (s.d.) BMI, kg/m2 22.5 (2.2) 27.7 (1.4) 32.6 (1.4) 37.3 (1.4) 46.8 (6.9)
HbA1c
Mean (s.d.), % 10.0 (2.8) 9.7 (2.4) 9.4 (2.3) 9.3 (2.2) 9.2 (2.1)
Median (Q1, Q3), % 9.6 (7.7, 12.0) 9.4 (7.8, 11.3) 9.2 (7.7, 10.9) 9.0 (7.7, 10.7) 9.0 (7.6, 10.5)
HbA1c ≥ 7.5%, n (%) 10 400 (79) 24 289 (81) 32 015 (80) 23 686 (79) 24 575 (78)
HbA1c ≥ 8%, n (%) 9 462 (72) 22 045 (73) 28 766 (72) 21 188 (71) 21 918 (70)
Antidiabetes medication, n (%)
Metformin 7892 (60) 20 685 (69) 27 973 (70) 21 328 (71) 22 653 (72)
Sulphonylurea 5416 (41) 14 339 (48) 19 378 (48) 14 683 (49) 15 398 (49)
Metformin + Sulphonylurea 4390 (33) 11 896 (40) 15 945 (40) 12 074 (40) 12 636 (40)
Insulin type at initiation, n (%)
Basal 8164 (64) 19 503 (66) 25 757 (66) 19 018 (65) 19 770 (64)
Biphasic 1756 (14) 4114 (14) 5449 (14) 4125 (14) 4399 (14)
Prandial 2901 (23) 5755 (20) 7847 (20) 6025 (21) 6516 (21)
60%, 41% and 33%, respectively. Usage of these
drugs, alone or in combination, was similar among overweight and obese
patients, and was significantly higher than the usage
observed among the normal-weight patients (Table 2). The adjusted
marginal HbA1c reductions, associated with metformin treatment,
were 0.37%, 0.38% and 0.31% at 6, 12 and 24 months of follow-up.
The adjusted changes in HbA1c level associated with sulphonylurea
were marginal (Table 4). These estimates were similar across all BMI
categories (results are not presented).
3.3 | Association of weight change with improvement in glycaemic control
The adjusted estimates (regression coefficients) and 95% CIs of
weight gain associated with 1% reduction in HbA1c by BMI cate-
gories over 6, 12 and 24 months of insulin treatment among patients
in sub-cohort 2 are shown in Table 3 and Figure 1D. The estimated
weight gains were significantly lower among obese patients than
among normal- and overweight patients. While a 1% HbA1c reduc-
tion was associated with weight gains of 0.92 and 1.24 kg among
normal-weight patients at 12 and 24 months of insulin treatment,
weight gain was 0.13-0.46 kg among Grade 1 and Grade 2 obese
patients during the same follow-up period. Among Grade 3 obese
patients, a 1% HbA1c reduction was not associated with any increase
in weight.
3.4 | Pattern of association of weight change with glycaemic control by BMI category
Figure 1D shows a difference in the association between HbA1c
reduction and weight gain by BMI category. For example, at
24 months of treatment, taking the slope of 1.24 kg (95% CI 1.18, 1.31)
per 1% reduction in HbA1c for normal-weight patients as the reference,
the slopes for the three obesity categories each differed significantly
from the reference (p < .01), and also differed significantly from one
another. The slopes in the overweight, Grade 1 and Grade 2 categories
were lower than the reference by 0.43, 0.78 and 1.11 kg,
respectively (p < .01), and the slope in Grade 3 obese patients was
lower than the reference by 1.56 kg. The pattern
of association of longitudinal changes in body weight with HbA1c
was similar for different ethnic groups.
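The slope differences quoted above follow directly from the 24-month estimates in the last column of Table 3. As a short worked check in Python (the variable names are illustrative; the values are the Table 3 point estimates):

```python
# Weight change (kg) per 1% HbA1c reduction at 24 months, from Table 3.
slopes_24m = {
    "normal weight": 1.24,
    "overweight": 0.81,
    "Grade 1 obesity": 0.46,
    "Grade 2 obesity": 0.13,
    "Grade 3 obesity": -0.32,
}

reference = slopes_24m["normal weight"]

# Difference of each category's slope from the normal-weight reference.
differences = {
    category: round(reference - slope, 2)
    for category, slope in slopes_24m.items()
    if category != "normal weight"
}

for category, diff in differences.items():
    print(f"{category}: {diff:.2f} kg lower than normal weight")
```

The computed differences are 0.43, 0.78, 1.11 and 1.56 kg for the overweight, Grade 1, Grade 2 and Grade 3 categories, matching the figures reported in the text.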
As a sensitivity analysis, all analyses (described above) were also
conducted in patients with complete data on weight and HbA1c
FIGURE 1 A, Mean (95% CI) of longitudinal measures of body weight (kg) over 2 years from the time of insulin initiation by BMI category at
index date; B, mean (95% CI) of longitudinal measures of HbA1c over 2 years from the time of insulin initiation by BMI category at index date; C, adjusted mean (95% CI) of change in body weight at 6, 12 and 24 months of treatment with insulin by BMI category at index date; D, adjusted mean (95% CI) change in body weight at 6, 12 and 24 months of treatment with insulin, associated with 1% reduction in HbA1c at these time points.
measures at 6, 12 and 24 months of follow-up. No difference in the
estimates or inferences was observed between complete case ana-
lyses and analyses based on the imputed data.
4 | DISCUSSION
The present exploratory clinical study, based on large-scale longitudi-
nal real-world data, clearly suggests that: (1) weight gain associated
with insulin treatment is significantly lower in obese patients with
T2DM compared with that observed in patients with normal body
weight; (2) the significantly lower weight gain in obese patients is
consistent over 6, 12 and 24 months of treatment with insulin,
adjusted for various factors including the use of concomitant antidia-
betes medications; (3) the glycaemic control over 24 months of treat-
ment with insulin is similar among patients with different BMI levels;
and (4) the weight gain associated with a 1% reduction in HbA1c falls
progressively as pretreatment BMI increases.
Our finding that obese patients gain significantly less weight with
insulin treatment is robust, and is supported by consistent estimates
of weight changes according to different BMI categories over 6-
24 months of insulin treatment (Figure 1C). Separate analyses to
explore possible confounders, through evaluation of various charac-
teristics according to BMI category in patients with a minimum of
2 years’ insulin treatment (Table 2), provide a basis to support this
robust finding. This is further supported by consistent findings in
patients with a minimum of 2 years’ diabetes duration before insulin
initiation (Table S1). It is likely that some of the normal-weight
patients (BMI <25 kg/m2), and perhaps some with a BMI 25-30 kg/
m2, had lost weight before commencing insulin, which may account
for some of their weight gain after insulin initiation; however, we still
see progressive reductions in body weight gain with progressive
increases in baseline BMI >30 kg/m2.
TABLE 3 Adjusted weight change over 24 months of insulin treatment initiation, by baseline BMI category
Main cohort (n = 155 917) | Sub-cohort 1 (n = 151 220) | Sub-cohort 2 (n = 144 857) | Sub-cohort 2 (n = 144 857)
First three columns: mean (95% CI) weight change1, kg. Last column: mean (95% CI) weight change associated with a 1% reduction in HbA1c2, kg.
At 6 months
BMI < 25 kg/m2 2.0 (2.0, 2.1) 2.0 (2.0, 2.1) 2.0 (2.0, 2.1) 0.66 (0.61, 0.70)
BMI ≥ 25 and <30 kg/m2 1.2 (1.1, 1.2) 1.2 (1.1, 1.2) 1.1 (1.1, 1.2) 0.44 (0.41, 0.47)
BMI ≥ 30 and <35 kg/m2 0.6 (0.6, 0.7) 0.6 (0.6, 0.7) 0.6 (0.6, 0.7) 0.29 (0.26, 0.31)
BMI ≥ 35 and <40 kg/m2 0.1 (0.1, 0.2) 0.1 (0.1, 0.2) 0.1 (0.1, 0.2) 0.11 (0.08, 0.15)
BMI ≥40 kg/m2 −0.7 (−0.7, −0.6) −0.7 (−0.7, −0.6) −0.7 (−0.7, −0.6) −0.06 (−0.02, 0.0)
At 12 months
BMI < 25 kg/m2 3.0 (3.0, 3.1) 3.0 (3.0, 3.1) 0.92 (0.86, 0.97)
BMI ≥ 25 and <30 kg/m2 1.8 (1.7, 1.8) 1.7 (1.7, 1.8) 0.61 (0.58, 0.65)
BMI ≥ 30 and <35 kg/m2 0.9 (0.9, 1.0) 0.9 (0.9, 1.0) 0.39 (0.36, 0.43)
BMI ≥ 35 and <40 kg/m2 0.2 (0.2, 0.3) 0.2 (0.2, 0.3) 0.15 (0.11, 0.19)
BMI ≥ 40 kg/m2 −1.1 (−1.2, −1.1) −1.1 (−1.2, −1.1) −0.18 (−0.22, −0.14)
At 24 months
BMI < 25 kg/m2 3.9 (3.8, 3.9) 1.24 (1.18, 1.31)
BMI ≥ 25 and <30 kg/m2 2.0 (1.9, 2.0) 0.81 (0.76, 0.85)
BMI ≥ 30 and <35 kg/m2 0.7 (0.7, 0.8) 0.46 (0.41, 0.50)
BMI ≥ 35 and <40 kg/m2 −0.2 (−0.3, −0.2) 0.13 (0.08, 0.18)
BMI ≥ 40 kg/m2 −2.2 (−2.2, −2.1) −0.32 (−0.37, −0.27)
Percentage of patients who increased body weight ≥ 5 kg
BMI < 25 kg/m2 13 28 37
BMI ≥ 25 and < 30 kg/m2 12 19 24
BMI ≥ 30 and < 35 kg/m2 11 17 19
BMI ≥ 35 and < 40 kg/m2 11 16 18
BMI ≥ 40 kg/m2 11 17 18
Mean (95% CI) of marginal contribution towards changes in weight, kg
Sulphonylurea treatment 0.17 (0.11, 0.22) 0.25 (0.17, 0.32) 0.27 (0.18, 0.36)
1 Adjusted for sex, diabetes duration and metformin and sulphonylurea usage, according to BMI categories at insulin initiation.
2 Adjusted for sex, diabetes duration, concomitant antidiabetic medication usage, and baseline HbA1c, according to BMI category at insulin initiation.
BMI categories at insulin initiation: <25 kg/m2 (normal weight); ≥25 and <30 kg/m2 (overweight); ≥30 and <35 kg/m2 (Grade 1 obesity); ≥35 and <40 kg/m2 (Grade 2 obesity); and ≥40 kg/m2 (Grade 3 obesity).
The level of HbA1c reached over time was clinically similar in the
different BMI groups, despite the fact that obese patients had signifi-
cantly lower HbA1c levels (9.2-9.4%) at insulin initiation compared
with those with normal body weight or who were overweight (9.7-
10.0%; Table 2). Thus, patients arrive at approximately the same
HbA1c level, independently of the starting value. This might suggest
that the lower weight gain seen in obese patients was attributable to
the use of less intensive insulin therapy. Although we did not have
insulin dose data, we nevertheless showed that, even when corrected
for the same HbA1c reduction, weight gain remained smaller in the
obese group.
Based on a cohort of 2179 patients with a median diabetes dura-
tion of ~9 years, Balkau et al. reported 1.78 kg of average weight
gain (unadjusted) over 1 year of treatment with insulin, with 24% of
patients experiencing weight gain of ≥5 kg, and a significant inverse
association of baseline BMI with weight gain.17 The adjusted mean
weight gain in the present study cohort with a minimum of 1 year of
treatment with insulin (cohort 2) ranged between 0.2 and 3.49 kg
among patients with BMI < 40 kg/m2 (combining sub-cohorts 1 and
2). A marginal weight loss was observed in patients with Grade 3 obe-
sity. The proportion of patients gaining ≥5 kg body weight at 1 year
was similar across obese patients (16-17%; Table 3), while 28% of
patients with normal body weight gained ≥5 kg. Compared with
patients with normal weight, patients in the Grade 1, 2 and 3 obesity
categories were 48%, 51% and 50% less likely, respectively, to
increase body weight by >5 kg.
We observed that the cost of glycaemic control in terms of
weight gain is marginal in Grade 1 and Grade 2 obese patients.
Among the Grade 3 obese patients, a 1% reduction in HbA1c was
associated with a decrease in weight of ~0.3 kg after 24 months of
insulin treatment. Balkau et al.17 reported that high baseline HbA1c
level, insulin dose requirements and lower baseline BMI were inde-
pendently associated with greater weight gain. In a meta-analysis,
Pontiroli et al.10 found that intensity of treatment, insulin dose, final
HbA1c level, change in HbA1c level and frequency of hypoglycaemia
were significantly associated with weight increase as well as type of
insulin regimen; however, these studies did not evaluate the possible
association of glycaemic control with weight change in patients with
different levels of adiposity at the time of insulin initiation. The pres-
ent study also provides new information on the significant differences
in the patterns of possible association between glycaemic control and
weight change in insulin-treated patients by BMI category.
Electronic databases present challenges in terms of accuracy
and completeness of the required data. The limitations of this
TABLE 4 Adjusted HbA1c change over 24 months of insulin treatment initiation, by baseline BMI category
Main cohort (n = 155 917) | Sub-cohort 1 (n = 151 220) | Sub-cohort 2 (n = 144 857)
Mean (95% CI) HbA1c change (%)1
At 6 months
<25 kg/m2 −1.4 (−1.4, −1.3) −1.4 (−1.4, −1.3) −1.3 (−1.3, −1.2)
≥25 and <30 kg/m2 −1.3 (−1.3, −1.2) −1.3 (−1.3, −1.2) −1.2 (−1.2, −1.1)
≥30 and <35 kg/m2 −1.1 (−1.2, −1.1) −1.1 (−1.2, −1.1) −1.1 (−1.2, −1.1)
≥35 and <40 kg/m2 −1.1 (−1.2, −1.1) −1.1 (−1.1, −1.0) −1.1 (−1.1, −1.0)
≥40 kg/m2 −1.0 (−1.0, −0.9) −1.0 (−1.0, −0.9) −1.0 (−1.0, −0.9)
At 12 months
<25 kg/m2 −1.3 (−1.4, −1.3) −1.3 (−1.3, −1.2)
≥25 and <30 kg/m2 −1.2 (−1.2, −1.1) −1.2 (−1.2, −1.1)
≥30 and <35 kg/m2 −1.1 (−1.1, −1.0) −1.1 (−1.1, −1.0)
≥35 and <40 kg/m2 −1.0 (−1.0, −0.9) −1.0 (−1.1, −1.0)
≥40 kg/m2 −1.0 (−1.0, −0.9) −1.0 (−1.0, −0.9)
At 24 months
<25 kg/m2 −1.4 (−1.4, −1.3)
≥25 and <30 kg/m2 −1.2 (−1.2, −1.1)
≥30 and <35 kg/m2 −1.1 (−1.1, −1.0)
≥35 and <40 kg/m2 −1.0 (−1.1, −1.0)
≥40 kg/m2 −1.0 (−1.0, −0.9)
Metformin treatment
Yes −1.2 (−1.3, −1.2) −1.2 (−1.3, −1.2) −1.2 (−1.2, −1.1)
No −0.9 (−0.9, −0.8) −0.9 (−0.9, −0.8) −0.9 (−0.9, −0.8)
Mean (95% CI) of marginal contribution towards changes in HbA1c, %
Sulphonylurea treatment 0.03 (0.02, 0.03) 0.3 (0.1, 0.5) 0.7 (0.5, 0.9)
1 Adjusted for sex, diabetes duration and metformin and sulphonylurea usage. Estimates are provided by BMI categories at insulin initiation. BMI categories at insulin initiation: <25 kg/m2 (normal weight); ≥25 and <30 kg/m2 (overweight); ≥30 and <35 kg/m2 (Grade 1 obesity); ≥35 and <40 kg/m2 (Grade 2 obesity); and ≥40 kg/m2 (Grade 3 obesity).
1250 PAUL ET AL.
198
study include non-availability of complete and reliable data on:
(1) medication adherence; (2) diet and exercise; (3) socio-economic
status; and (4) potential residual confounders. We believe the non-
availability of longitudinal insulin doses would not affect the
robustness of our findings, as the main interest was to evaluate
the observed weight change and glycaemic control at a population
level, reflecting the primary/ambulatory care disease risk factor
management. Our analysis of weight change in relation to change
in HbA1c confirms that the weight gain “cost” of achieving any
given improvement in glycaemic control is, in fact, less with
increase in pretreatment BMI.
Although we excluded patients treated with GLP-1 receptor ago-
nists, it is not possible to know to what extent the lower weight gain
with increasing BMI is attributable to pharmacological differences in
the effects of insulin or to other attempts to lose weight in the obese
groups. Patients may well intensify lifestyle efforts on commencing
insulin. Whether or not this happens more in the more obese is diffi-
cult to ascertain because of the non-availability of longitudinal life-
style intervention data. Thus, these results should not be interpreted
as indicating that lifestyle efforts to control weight gain are not nec-
essary for obese patients initiating insulin. Nevertheless, they do indi-
cate that within the facilities available in routine care, weight gain can
readily be limited when initiating insulin therapy in obese patients
with T2DM.
A large analysis cohort from the validated CEMR database
should be considered as a representative sample, and as such, pro-
vides a good picture of the state of weight and glycaemic control
in routine practice. We had complete data on weight and HbA1c
measured within 3 months of insulin initiation, and the 6-monthly
follow-up measures of weight and HbA1c were imputed for only
8-16% of missing cases. The results from complete case analyses
and imputed data were very similar. Finally, a careful new-user
design with a reasonable exposure time of 2 years and appropriate
adjustments for various aspects are the primary strengths of the
study.
In conclusion, we observed that, over 24 months of treatment
with insulin, obese patients gained significantly less weight than
normal- and overweight patients, while achieving clinically similar gly-
caemic benefits. These findings should provide important reassurance
that, among obese patients with T2DM in routine clinical practice,
meaningful improvements in glycaemic control can be achieved with
only small increases in weight.
ACKNOWLEDGMENTS
The QIMR Berghofer Medical Research Institute gratefully acknowl-
edges the support from the National Health and Medical Research
Council and the Australian Government’s National Collaborative
Research Infrastructure Strategy (NCRIS) initiative through Therapeu-
tic Innovation Australia.
Conflict of interest
S. K. P. has acted as a consultant and/or speaker for Novartis, GI
Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi Pharmaceutical
and Amylin Pharmaceuticals LLC. He has received grants in support
of investigator and investigator-initiated clinical studies from Merck,
Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals,
Sanofi-Aventis and Pfizer. J. S. has received honoraria for consultancies and
lectures from Novartis, Novo Nordisk, AstraZeneca, Sanofi, Merck
Sharp and Dohme, Abbott, Janssen Cilag and Takeda. O. M. and
K. K. have no conflict of interest to declare.
Author contributions
S. K. P. conceived the idea and was responsible for the primary
design of the study. J. S. and K. K. significantly contributed to the
study design. K. K. conducted the data extraction, and S. K. P., O. M.
and K. K. jointly conducted the statistical analyses. The first draft of
the manuscript was developed by S. K. P., and all authors contributed
to the finalization of the manuscript. S. K. P. had full access to all the
data in the study and is the guarantor, taking responsibility for the
integrity of the data and the accuracy of the data analysis.
REFERENCES
1. Ishii H, Iwamoto Y, Tajima N. An exploration of barriers to insulin initiation for physicians in Japan: findings from the Diabetes Attitudes, Wishes and Needs (DAWN) JAPAN study. PLoS One. 2012;7:e36361.
2. Weng J, Li Y, Xu W, et al. Effect of intensive insulin therapy on beta-cell function and glycaemic control in patients with newly diagnosed type 2 diabetes: a multicentre randomised parallel-group trial. Lancet. 2008;371:1753–1760.
3. Alvarsson M, Sundkvist G, Lager I, et al. Effects of insulin vs. glibenclamide in recently diagnosed patients with type 2 diabetes: a 4-year follow-up. Diabetes Obes Metab. 2008;10:421–429.
4. Wang Z, York NW, Nichols CG, Remedi MS. Pancreatic beta cell dedifferentiation in diabetes and redifferentiation following insulin therapy. Cell Metab. 2014;19:872–882.
5. Khunti K, Davies M, Majeed A, Thorsted BL, Wolden ML, Paul SK. Hypoglycemia and risk of cardiovascular disease and all-cause mortality in insulin-treated people with type 1 and type 2 diabetes: a cohort study. Diabetes Care. 2015;38:316–322.
6. Russell-Jones D, Khan R. Insulin-associated weight gain in diabetes – causes, effects and coping strategies. Diabetes Obes Metab. 2007;9:799–812.
7. Paul SK, Klein K, Thorsted BL, Wolden ML, Khunti K. Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovasc Diabetol. 2015;14:1–10.
8. Khunti K, Nikolajsen A, Thorsted BL, Andersen M, Davies MJ, Paul SK. Clinical inertia in intensifying therapy among people with type 2 diabetes treated with basal insulin. Diabetes Obes Metab. 2016;18:401–409.
9. Wang H, Ni YF, Li HZ, Yang S, Feng B. Effects of insulin monotherapy on body weight, composition, and fat distribution in newly diagnosed patients with type 2 diabetes mellitus. J Diabetes. 2013;5:146–148.
10. Pontiroli AE, Miele L, Morabito A. Increase of body weight during the first year of intensive insulin treatment in type 2 diabetes: systematic review and meta-analysis. Diabetes Obes Metab. 2011;13:1008–1019.
11. Paul S, Thorsted BL, Wolden M, Klein K, Khunti K. Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovasc Diabetol. 2015;14:1.
12. Kamal KM, Chopra I, Elliott JP, Mattei TJ. Use of electronic medical records for clinical research in the management of type 2 diabetes. Res Social Adm Pharm. 2014;10:877–884.
13. Herrin J, da Graca B, Nicewander D, et al. The effectiveness of implementing an electronic health record on diabetes care and outcomes. Health Serv Res. 2012;47:1522–1540.
14. Hansen RA, Farley JF, Maciejewski ML, Ye X, Qian C, Powers B. Real-world utilization patterns and outcomes of colesevelam HCL in the GE electronic medical record. BMC Endocr Disord. 2013;13:24.
15. Levin P, Wei W, Miao R, et al. Therapeutically interchangeable? A study of real-world outcomes associated with switching basal insulin analogues among US patients with type 2 diabetes mellitus using electronic medical records data. Diabetes Obes Metab. 2015;17:245–253.
16. Davis KL, Tangirala M, Meyers JL, Wei W. Real-world comparative outcomes of US type 2 diabetes patients initiating analog basal insulin therapy. Curr Med Res Opin. 2013;29:1083–1091.
17. Balkau B, Home PD, Vincent M, Marre M, Freemantle N. Factors associated with weight gain in people with type 2 diabetes starting on insulin. Diabetes Care. 2014;37:2108–2113.
SUPPORTING INFORMATION
Additional Supporting Information may be found online in the sup-
porting information tab for this article.
How to cite this article: Paul SK, Shaw J, Montvida O and
Klein K. Weight gain in insulin-treated patients by body mass
index category at treatment initiation: new evidence from
real-world data in patients with type 2 diabetes, Diabetes
Obes Metab 2016, 18, 1244–1252. DOI:10.1111/dom.12761
APPENDIX D
Research Article: Treatment
Treatment with incretins does not increase the
risk of pancreatic diseases compared to older
anti-hyperglycaemic drugs, when added to metformin:
real world evidence in people with Type 2 diabetes
O. Montvida1,2, J. B. Green3, J. Atherton4 and S. K. Paul1,5
1Statistics Unit, QIMR Berghofer Medical Research Institute, 2School of Biomedical Sciences, Queensland University of Technology, Brisbane, Australia, 3Division of
Endocrinology and Duke Clinical Research Institute, Duke University Medical Center, Durham, NC, USA, 4Cardiology Department, Royal Brisbane and Women’s
Hospital and University of Queensland School of Medicine, Brisbane and 5Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne,
Australia
Accepted 9 October 2018
Abstract
Aims In people with metformin-treated diabetes, to evaluate the risk of acute pancreatitis, pancreatic cancer and other
diseases of the pancreas post second-line anti-hyperglycaemic agent initiation.
Methods People with Type 2 diabetes diagnosed after 2004 who received metformin plus a dipeptidyl peptidase-4
inhibitor (DPP-4i, n = 50 095), glucagon-like peptide-1 receptor agonist (GLP-1RA, n = 12 654), sulfonylurea
(n = 110 747), thiazolidinedione (n = 17 597) or insulin (n = 34 805) for at least 3 months were identified in the US
Centricity Electronic Medical Records. Time to developing acute pancreatitis, other diseases of the pancreas and
pancreatic cancer was estimated, balancing and adjusting anti-hyperglycaemic drug groups for appropriate confounders.
Results In the DPP-4i group, the adjusted mean time to acute pancreatitis was 2.63 [95% confidence intervals (CI)
2.38, 2.88] years; time to pancreatic cancer was 2.70 (2.19, 3.21) years; and time to other diseases of the pancreas was
2.73 (2.33, 3.12) years. Compared with DPP-4i, the insulin group developed acute pancreatitis 0.48 years (P < 0.01)
earlier and the GLP-1RA group developed pancreatic cancer 3 years later (P < 0.01). However, when people with an
event within 6 months of treatment initiation were excluded, the difference between the insulin and DPP-4i groups in
time to acute pancreatitis was no longer significant. No other
significant differences were observed between groups.
Conclusions No significant differences in the risk of developing pancreatic diseases in those treated with various
anti-hyperglycaemic drug classes were found.
Diabet. Med. 00, 1–8 (2018)
Introduction
Glucagon-like peptide-1 receptor agonists (GLP-1RAs) and
dipeptidyl peptidase-4 inhibitors (DPP-4i) represent incretin-
based therapeutic drug classes used to treat Type 2 diabetes.
These drugs have demonstrated efficacy in reducing blood
glucose levels with low risk of hypoglycaemia [1–3]. Treat-
ment with GLP-1RAs is associated with favourable changes
in metabolic measurements such as body weight, and some
agents in the class have been shown to reduce the risk of
cardiovascular events [3–5]. However, some recent clinical
observational studies have raised questions as to the possible
association of treatment with incretin-based therapies, par-
ticularly with DPP-4i, and the risk of acute pancreatitis or
pancreatic cancer [6–17].
Although a number of cohort studies and meta-analyses
reported no association between incretin-based therapies and
risk of acute pancreatitis and pancreatic cancer [9,12,13,
15,16,18], other studies have reported an increased risk of
acute pancreatitis with such agents [8,11,14,17]. In a meta-
analysis based on pooled data from the SAVOR-TIMI 53,
EXAMINE and TECOS trials, Tkáč and Raz [8] reported a
significant increase in the incidence of acute pancreatitis
[odds ratio (OR): 1.79; 95% confidence intervals (CI) 1.13,
2.82] in people treated with DPP-4i when compared with
placebo. The observed increase in the absolute risk of acute
pancreatitis with DPP-4i therapy was 0.13%. In a cohort
study based on real-world primary care data from the UK,
Knapen and colleagues [17] reported a 1.5-fold increased risk
of any pancreatitis in incretin-based therapy users compared
with other non-insulin anti-hyperglycaemic drug users.
However, another study by Knapen and colleagues [16],
based on the same database, reported no association of
incretin-based therapies with the risk of pancreatic cancer.
Correspondence to: Sanjoy Ketan Paul. E-mail: [email protected]
© 2018 Diabetes UK
Diabetic Medicine
DOI: 10.1111/dme.13835
Previously published cohort studies have generally assessed
pancreatic risk by comparing rates of pancreatic diseases in
users of incretin-based therapies with rates in users of any non-
incretin based anti-hyperglycaemic regimen. However, no
prior study of adequate size and duration has holistically
evaluated the risks of acute pancreatitis and pancreatic cancer
with incretin-based therapy compared with use of other
specific anti-hyperglycaemic drug classes. Furthermore, pre-
vious publications based on real-world data, in which baseline
risks differ significantly and are modified over time by
contrasting confounders, may not have utilized optimal
analytical approaches to assess risk. Using extensive person-level
longitudinal data from ambulatory and primary care
systems in the USA, this exploratory outcome study aimed
to evaluate the rates and risks of developing acute
pancreatitis, other diseases of the pancreas or pancreatic
cancer in people with metformin-treated Type 2 diabetes who
initiated second-line anti-hyperglycaemic therapy with a
DPP-4i, GLP-1RA, sulfonylurea, thiazolidinedione or insulin.
Materials and methods
Data source
Centricity Electronic Medical Record (CEMR) of the USA
represents a variety of ambulatory and primary care medical
practices, including solo practitioners, community clinics,
academic medical centres and large integrated delivery
networks. Over 35 000 physicians and other providers from
all US states contribute to the CEMR, wherein ~ 75% are
primary care providers. The database is generally represen-
tative of the US population: the diabetes prevalence (7.1% of
people with diabetes identified by diagnostic codes) is similar
to the US National Diabetes Statistics (6.7% diagnosed
diabetes in 2014) [19]. CEMR has been used extensively for
academic research worldwide [20,21].
For more than 34 million individuals, longitudinal EMRs
were available from 1995 to April 2016. This database
contains comprehensive person-level information on demo-
graphics, anthropometric, clinical and laboratory variables
including age, sex, ethnicity, smoking status, and longitudi-
nal measures of body weight, BMI, blood pressure, HbA1c,
full lipid profiles, urine albumin and creatinine, and serum
creatinine.
Medication data include brand names and doses for
individual medications prescribed (RxNorm), along with
start/stop dates and specific fields to track treatment alterations.
This data set also contains self-reported medications,
including prescriptions received outside the EMR network and
over-the-counter medications. All disease events along with
dates are coded with International Classification of Diseases
(ICD)-9, ICD-10 or SNOMED Clinical Terms (CT) codes.
Study design and study data
All individuals with a diagnosis of Type 2 diabetes were
included in this study with the conditions of no missing data
for age, sex or ethnicity; age ≥ 18 and < 80 years at the
diagnosis of Type 2 diabetes; and date of diagnosis of Type 2
diabetes after EMR registration date and after 1 January
2005. All those included also first began anti-hyperglycaemic
therapy with metformin, followed by second-line additional
treatment with a DPP-4i, GLP-1RA, insulin, thiazolidine-
dione or sulfonylurea for ≥ 3 months. Users of second-line
insulin, thiazolidinedione or sulfonylurea who had ever
received a DPP-4i or GLP-1RA were excluded, as were those
with other diseases of the pancreas or any type of cancer that
occurred prior to initiation of second-line anti-hyperglycae-
mic drug (index date).
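The inclusion and exclusion rules above can be read as a single eligibility predicate. The following is a minimal sketch under stated assumptions: the `Patient` record, its field names and the drug labels are hypothetical and are not CEMR column names, and the 1 January 2005 boundary is treated as inclusive, which is an assumption about how the cut-off was applied.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


# Hypothetical record layout: field names are illustrative, not CEMR columns.
@dataclass(frozen=True)
class Patient:
    age_at_diagnosis: Optional[float]
    sex: Optional[str]
    ethnicity: Optional[str]
    diagnosis_date: date
    emr_registration_date: date
    first_line: str
    second_line: Optional[str]
    second_line_months: float
    ever_incretin: bool             # ever received a DPP-4i or GLP-1RA
    prior_pancreas_or_cancer: bool  # pancreatic disease or any cancer before index date


SECOND_LINE = {"DPP-4i", "GLP-1RA", "insulin", "thiazolidinedione", "sulfonylurea"}
NON_INCRETIN = {"insulin", "thiazolidinedione", "sulfonylurea"}
EARLIEST_DIAGNOSIS = date(2005, 1, 1)  # assumption: boundary date is included


def is_eligible(p: Patient) -> bool:
    """Return True if the record passes every inclusion/exclusion rule."""
    if p.age_at_diagnosis is None or p.sex is None or p.ethnicity is None:
        return False
    if not (18 <= p.age_at_diagnosis < 80):
        return False
    # Diagnosis must follow EMR registration and fall in the study period.
    if p.diagnosis_date <= p.emr_registration_date:
        return False
    if p.diagnosis_date < EARLIEST_DIAGNOSIS:
        return False
    if p.first_line != "metformin":
        return False
    # Second-line drug from one of the five classes, taken for >= 3 months.
    if p.second_line not in SECOND_LINE or p.second_line_months < 3:
        return False
    # Non-incretin comparators must never have received an incretin.
    if p.second_line in NON_INCRETIN and p.ever_incretin:
        return False
    return not p.prior_pancreas_or_cancer


example = Patient(55.0, "F", "White", date(2008, 5, 1), date(2006, 1, 1),
                  "metformin", "sulfonylurea", 12.0, False, False)
print(is_eligible(example))                               # True
print(is_eligible(replace(example, ever_incretin=True)))  # False
```

Writing the rules as one predicate makes the order of exclusions explicit, although the counts in a flow chart depend on applying them step by step.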
Baseline (index date) data included age, sex, ethnicity,
body weight, BMI and blood pressure at the time of second
anti-hyperglycaemic drug initiation. Baseline HbA1c was
obtained as the closest observation to second drug initiation
within a [�3, +3] month window. Body weight, BMI, SBP
and lipids were calculated as the average of available
measurements within [�3, +3] months of baseline. Obesity
was defined as BMI ≥ 30 kg/m2.
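The two baseline rules described above (closest observation for HbA1c, window average for the other measures) could be sketched as follows. This is an illustration only: the function names are hypothetical, and approximating "3 months" as 91 days is an assumption, since the exact window arithmetic is not stated.

```python
from datetime import date
from typing import List, Optional, Tuple

Measurement = Tuple[date, float]  # (measurement date, value)
WINDOW_DAYS = 91  # assumption: "3 months" approximated as 91 days


def _in_window(index_date: date, measurements: List[Measurement],
               window_days: int) -> List[Tuple[int, float]]:
    """Return (absolute day offset, value) pairs that fall inside the window."""
    return [(abs((d - index_date).days), v) for d, v in measurements
            if abs((d - index_date).days) <= window_days]


def closest_in_window(index_date: date, measurements: List[Measurement],
                      window_days: int = WINDOW_DAYS) -> Optional[float]:
    """Value measured closest to the index date, or None if none in window."""
    candidates = _in_window(index_date, measurements, window_days)
    if not candidates:
        return None
    return min(candidates, key=lambda pair: pair[0])[1]


def mean_in_window(index_date: date, measurements: List[Measurement],
                   window_days: int = WINDOW_DAYS) -> Optional[float]:
    """Average of all values inside the window, or None if none in window."""
    values = [v for _, v in _in_window(index_date, measurements, window_days)]
    return sum(values) / len(values) if values else None


def is_obese(bmi: float) -> bool:
    """Obesity as defined in the text: BMI >= 30 kg/m2."""
    return bmi >= 30.0


index = date(2010, 6, 1)
hba1c = [(date(2010, 1, 1), 9.0), (date(2010, 5, 20), 8.1), (date(2010, 7, 15), 7.6)]
weight = [(date(2010, 4, 1), 100.0), (date(2010, 6, 15), 98.0), (date(2011, 1, 1), 90.0)]
print(closest_in_window(index, hba1c))  # 8.1
print(mean_in_window(index, weight))    # 99.0
```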
The presence of comorbidities prior/post index date and
the time to such events were also obtained. Acute pancreati-
tis, other diseases of the pancreas, cancer, cardiovascular
disease, chronic kidney disease and hypertension were
identified. Cancer was defined as any malignant neoplasm
or carcinoma in situ. Cancer of the pancreas was additionally
separated. Other diseases of the pancreas included specified
(e.g. pancreatic cyst) and unspecified diseases of the pancreas
with appropriate clinical codes. Cardiovascular disease was
defined as ischaemic heart disease (including myocardial
infarction), peripheral vascular/artery disease, heart failure
or stroke.
What’s new?
• Association of treatment with incretin-based therapies
and the risk of pancreatic diseases remains controversial.
However, no study has explored the comparative
safety of different anti-hyperglycaemic drugs in this
context.
• This study provides a holistic population-level comparative
outcome evaluation of the risk of pancreatic
diseases from the time of receiving different second-line
anti-hyperglycaemic drugs post metformin.
• Although treatment with incretin-based therapies was
not found to be associated with an increased risk of
pancreatic diseases, people treated with insulin experienced
a higher risk of such diseases.
Tobacco use status included data on the use of cigars, pipe,
cigarettes, chewing tobacco, snuff and smokeless tobacco.
Occasional smokers were classified as ‘current’. In case of
discordant same-day statuses, priority was given to ‘current’,
then to ‘former’ and lastly to ‘never’ status. The last status
recorded prior to the index date was taken as the tobacco use
status. Complete information on anti-hyperglycaemic drugs,
along with non-steroidal anti-inflammatory drugs, lipid-modifying
drugs, anti-hypertensive and cardioprotective
medications, was obtained. Cardioprotective medications
included beta-blocking agents, angiotensin-converting
enzyme inhibitors, angiotensin II antagonists and statins.
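The tobacco-status rule described above amounts to a small tie-breaking procedure. A minimal sketch follows, assuming status records arrive as (date, label) pairs with hypothetical labels 'current', 'former', 'never' and 'occasional', and assuming "prior to index date" means strictly before it.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

# Lower number = higher priority when statuses recorded on one day disagree.
PRIORITY = {"current": 0, "former": 1, "never": 2}


def normalise(status: str) -> str:
    """Occasional smokers are classified as current users."""
    return "current" if status == "occasional" else status


def tobacco_status_at_index(records: Iterable[Tuple[date, str]],
                            index_date: date) -> Optional[str]:
    """Last status recorded before the index date; None means unknown."""
    prior = [(d, normalise(s)) for d, s in records
             if d < index_date and normalise(s) in PRIORITY]
    if not prior:
        return None
    last_day = max(d for d, _ in prior)
    same_day = [s for d, s in prior if d == last_day]
    # Discordant same-day statuses: current beats former beats never.
    return min(same_day, key=lambda s: PRIORITY[s])


records = [
    (date(2009, 1, 1), "never"),
    (date(2010, 3, 1), "former"),
    (date(2010, 3, 1), "occasional"),
    (date(2011, 1, 1), "never"),  # after the index date, so ignored
]
print(tobacco_status_at_index(records, date(2010, 6, 1)))  # current
```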
Ethical approval
Research involved existing data, in which individuals could
not be identified directly or through identifiers linked to
them. Thus, according to the US Department of Health and
Human Services Exemption 4 [45 CFR 46.101(b)(4)], this study
is exempt from institutional review board ethics approval
and from the requirement for informed consent.
Statistical methods
All analyses were performed by class of second-line anti-
hyperglycaemic drugs. Basic statistics were presented using
number (%), mean (SD) or median [interquartile range
(IQR)], as appropriate. The event rates per 1000 person-years
(95% CI) were estimated for acute pancreatitis, other
diseases of the pancreas and pancreatic cancer using the standard
life-table method.
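As a rough illustration of what a rate per 1000 person-years expresses, the sketch below computes a crude person-time rate with a Wald-type confidence interval on the log scale. This is a simplification swapped in for illustration, not the life-table method used in the study; the function name and the example inputs are hypothetical.

```python
import math
from typing import Tuple


def rate_per_1000_person_years(events: int, person_years: float,
                               z: float = 1.96) -> Tuple[float, float, float]:
    """Crude event rate per 1000 person-years with an approximate 95% CI.

    The interval assumes a Poisson event count and is built on the log
    scale, where the standard error of log(rate) is 1 / sqrt(events).
    """
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = 1000.0 * events / person_years
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor


# Hypothetical inputs: 100 events observed over 50 000 person-years.
rate, lower, upper = rate_per_1000_person_years(100, 50_000.0)
print(f"{rate:.2f} ({lower:.2f}, {upper:.2f})")
```

Because the interval is symmetric on the log scale, the point estimate is the geometric mean of its two limits, which gives a quick consistency check on a published rate and interval.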
In the presence of significant differences in risk factors
between comparative treatment groups in observational
studies, standard Cox regression survival models after
propensity score adjustments are often used. Estimation of
hazard ratios is useful for population effects when they are
constant, which occurs when the treatment enters linearly
and the distribution of the outcome has a proportional-hazards
form [22]. However, decisions on therapeutic
introductions and modifications are not linear, nor do they
conform to a proportional-hazards form in the context of risk
of an event. Given the observational nature of the study, with
a high likelihood of inherent differences in the comparator
treatment groups, we used a ‘treatment effect’ modelling
approach [23–25]. A parametric gamma time-to-event
model with inverse probability-weighted regression adjustment
for the confounders was used to estimate the adjusted
mean (95% CI) time to event for the reference treatment
group (DPP-4i) and the adjusted difference (95% CI) in time
to event for each comparator treatment group.
The robustness of choosing the gamma distribution
was tested on the basis of information criteria
estimates. Risk analyses were balanced on age and
follow-up time by treatment group, and were adjusted for
age, sex, smoking status, BMI and diabetes duration,
following the weighted propensity-score approach. Survival
time was computed as time to event (acute pancreatitis, other
diseases of the pancreas or pancreatic cancer) if an event
occurred, otherwise as time to the end of follow-up (date of a
person’s last available record within the database). The
robustness of risk modelling with balancing factors was
evaluated by estimating the weighted standardized differences
in these factors by treatment group.
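The survival-time definition in the paragraph above (time to event if one occurred, otherwise time to the last available record) can be sketched as below. The function name and the 365.25-day year are assumptions made for illustration; the gamma model and the weighting are not reproduced here.

```python
from datetime import date
from typing import Optional, Tuple

DAYS_PER_YEAR = 365.25  # assumption: average year length


def survival_time_years(index_date: date, last_record_date: date,
                        event_date: Optional[date] = None) -> Tuple[float, bool]:
    """Return (follow-up in years, event indicator).

    If an event occurred between the index date and the last record,
    follow-up ends at the event; otherwise the person is censored at
    the date of the last available record.
    """
    if event_date is not None and index_date <= event_date <= last_record_date:
        return (event_date - index_date).days / DAYS_PER_YEAR, True
    return (last_record_date - index_date).days / DAYS_PER_YEAR, False


index = date(2010, 1, 1)
last_record = date(2014, 1, 1)
print(survival_time_years(index, last_record, date(2012, 1, 1)))  # event at about 2 years
print(survival_time_years(index, last_record))                    # censored: (4.0, False)
```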
Two sensitivity analyses were conducted to evaluate the
robustness of the risk analyses in two sub-cohorts: (i) in all
people from the study cohort excluding those with acute
pancreatitis, other diseases of the pancreas or any type of
cancer within 6 months of the index date (sub-cohort 1); and
(ii) in all people from the study cohort with non-missing
baseline HbA1c (sub-cohort 2). Sub-cohort 2 was addition-
ally balanced on HbA1c and body weight for risk analyses.
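The two sensitivity sub-cohorts can be expressed as simple membership rules. The sketch below assumes a hypothetical per-person record holding the months from index date to the first excluding event and the baseline HbA1c; treating an event at exactly 6 months as falling "within 6 months" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical per-person summary; field names are illustrative only.
@dataclass
class CohortMember:
    # Months from index date to first acute pancreatitis, other disease of
    # the pancreas or any cancer; None if no such event was recorded.
    months_to_first_event: Optional[float]
    baseline_hba1c: Optional[float]


def in_sub_cohort_1(m: CohortMember, washout_months: float = 6.0) -> bool:
    """Exclude people with an event within the first 6 months after index."""
    return m.months_to_first_event is None or m.months_to_first_event > washout_months


def in_sub_cohort_2(m: CohortMember) -> bool:
    """Keep only people with a non-missing baseline HbA1c."""
    return m.baseline_hba1c is not None


people = [
    CohortMember(None, 7.9),   # no event, HbA1c available
    CohortMember(3.0, 8.4),    # early event: excluded from sub-cohort 1
    CohortMember(20.0, None),  # late event, HbA1c missing
]
print([in_sub_cohort_1(p) for p in people])  # [True, False, True]
print([in_sub_cohort_2(p) for p in people])  # [True, True, False]
```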
Results
From 2 624 954 people identified as having Type 2 diabetes,
225 898 met the inclusion criteria for the study (Fig. 1). At
the index date, participants had a mean (SD) age of 59
(12) years, 49% were men, 69% had White European
ancestry, and had an overall mean follow-up time of
3.2 years. Anti-hyperglycaemic drug groups as defined
included 22% (n = 50 095) using DPP-4i; 6%
(n = 12 654) using GLP-1RA; 15% (n = 34 805) using
insulin; 49% (n = 110 747) using sulfonylurea; and 8%
(n = 17 597) using thiazolidinedione (Table 1). Follow-up
time was similar across most treatment groups, except for
the thiazolidinedione group, which had a
longer mean follow-up of 4.6 years. The distributions of age,
diabetes duration and HbA1c were significantly different
between groups, as expected. The proportions of people
adding or moving to a third-line anti-hyperglycaemic drug
were similar in those receiving incretin-based therapy (47%
in both the DPP-4i and GLP-1RA groups), although other
groups had significantly lower proportions. The non-incretin
groups could not have received incretin-based therapies
during follow-up, by design.
Risk of acute pancreatitis
Only 1049 (0.46%) people developed acute pancreatitis
during the mean 3.2 years of follow-up (Table 1). The rates
per 1000 person-years of acute pancreatitis were similar in
the DPP-4i (1.38; 95% CI 1.21, 1.59), GLP-1RA (1.49; 1.16,
1.92) and sulfonylurea (1.45; 1.33, 1.58) groups, whereas
those treated with insulin had a significantly higher acute
pancreatitis rate (2.01; 1.75, 2.31) and those treated with
thiazolidinedione had a significantly lower acute pancreatitis
rate (0.89; 0.70, 1.12) compared with the DPP-4i group
(Table 2). The adjusted mean (95% CI) time to acute
pancreatitis in people treated with DPP-4i was 2.63 (2.38,
2.88) years. Those treated with insulin were likely to develop
acute pancreatitis 0.48 years (P < 0.01) earlier. The adjusted
mean time to acute pancreatitis was similar in the other
treatment groups.
Risk of cancer of the pancreas
In the cohort, 357 (0.16%) people developed cancer of the
pancreas, and there was no significant difference in the rate of
pancreatic cancer per 1000 person-years among the treatment
groups. Among those with acute pancreatitis, 17 (2%)
developed pancreatic cancer, of whom four and two individuals
belonged to the DPP-4i and GLP-1RA groups, respectively.
The adjusted mean (95% CI) time to outcome in the DPP-4i
group was 2.70 (2.19, 3.21) years, with no significant
differences in time to event in the insulin, sulfonylurea and
thiazolidinedione groups (Table 2). However, people treated
with GLP-1RA were likely to develop pancreatic cancer
~3 years later (P < 0.01) compared with the DPP-4i group.
Risk of other diseases of the pancreas
Only 0.33% (n = 752) of the cohort experienced other
diseases of the pancreas during follow-up. Among those with
Patients with non-missing sex and age (n = 34 299 123)
Diabetes mellitus (n = 2 893 321)
Type 2 diabetes (n = 2 624 954)
Age at diagnosis ≥ 18 and < 80 (n = 2 590 853)
Diabetes diagnoses after entry to the EMR (n = 1 412 938)
Diabetes diagnosis on or after Jan 1 2005 (n = 1 305 686)
Metformin as first line (n = 740 478)
Initiated second line (n = 357 482)
  DPP-4i (n = 61 508); GLP-1RA (n = 15 448); INS (n = 49 939); TZD (n = 33 021); SU (n = 187 819)
No record of acute pancreatitis, or other disease of pancreas, or any type of cancer prior to second-line initiation (n = 320 754)
  DPP-4i (n = 56 327); GLP-1RA (n = 14 498); INS (n = 45 936); TZD (n = 30 856); SU (n = 173 137)
On treatment for at least 3 months (n = 289 434)
  DPP-4i (n = 50 095); GLP-1RA (n = 12 654); INS (n = 40 846); TZD (n = 28 337); SU (n = 157 502)
MAIN COHORT: No DPP-4 or GLP-1RA ever taken in TZD, INS, or SU groups (n = 225 898)
  DPP-4i (n = 50 095); GLP-1RA (n = 12 654); INS (n = 34 805); TZD (n = 17 597); SU (n = 110 747)
SUB-COHORT 1: No record of acute pancreatitis, or other disease of pancreas, or any type of cancer 6 months post second-line initiation (n = 221 882)
  DPP-4i (n = 49 419); GLP-1RA (n = 12 518); INS (n = 34 098); TZD (n = 17 294); SU (n = 108 553)
SUB-COHORT 2: Non-missing HbA1c at the time of second-line initiation (n = 131 482)
  DPP-4i (n = 31 618); GLP-1RA (n = 7 580); INS (n = 18 924); TZD (n = 9 274); SU (n = 64 086)
FIGURE 1 Flow-chart of the study cohort. EMR, electronic medical record; DPP-4i, dipeptidyl peptidase-4 inhibitor; GLP-1RA, glucagon-like
peptide-1 receptor agonist; INS, insulin; SU, sulfonylurea; TZD, thiazolidinedione.
Table 1 Basic characteristics at the time of second-line anti-hyperglycaemic therapy initiation (index date)
Characteristic | Dipeptidyl peptidase-4 inhibitor | Glucagon-like peptide-1 receptor agonist | Insulin | Sulfonylurea | Thiazolidinedione | Total
N | 50 095 | 12 654 | 34 805 | 110 747 | 17 597 | 225 898
Age, years* | 58 (12) | 53 (12) | 57 (13) | 60 (12) | 59 (12) | 59 (12)
Men† | 24 034 (48) | 4346 (34) | 16 302 (47) | 57 876 (52) | 9174 (52) | 111 732 (49)
White European ancestry† | 34 989 (70) | 9613 (76) | 23 229 (67) | 76 430 (69) | 11 791 (67) | 156 052 (69)
Black† | 5852 (12) | 1083 (9) | 5083 (15) | 12 971 (12) | 1581 (9) | 26 570 (12)
Current tobacco use† | 4872 (10) | 979 (8) | 3929 (11) | 9980 (9) | 797 (5) | 20 557 (9)
Former tobacco use† | 8086 (16) | 1989 (16) | 5729 (16) | 16 790 (15) | 1232 (7) | 33 826 (15)
Never tobacco use† | 15 265 (30) | 3839 (30) | 8828 (25) | 26 797 (24) | 2482 (14) | 57 211 (25)
Tobacco use – unknown status† | 21 872 (44) | 5847 (46) | 16 319 (47) | 57 180 (52) | 13 086 (74) | 114 304 (51)
Diabetes duration prior to index date, months* | 14.35 (21.47) | 13.50 (20.85) | 7.76 (16.61) | 11.13 (19.92) | 6.34 (14.27) | 11.09 (19.64)
Treatment duration, months* | 26.22 (20.07) | 24.85 (20.34) | 31.15 (24.77) | 31.59 (24.94) | 32.49 (26.13) | 30.03 (23.91)
Follow-up, years‡ | 2.54 (1.3, 4.11) | 2.67 (1.25, 4.67) | 2.27 (1.13, 3.99) | 2.67 (1.31, 4.6) | 4.33 (1.99, 6.79) | 2.66 (1.3, 4.54)
Follow-up, years* | 2.91 (1.96) | 3.24 (2.39) | 2.81 (2.13) | 3.24 (2.37) | 4.56 (2.89) | 3.20 (2.34)
HbA1c, mmol/mol‡ | 61 (52, 74) | 56 (48, 68) | 74 (56, 93) | 63 (53, 77) | 55 (48, 68) | 62 (52, 78)
HbA1c, %‡ | 7.7 (6.9, 8.9) | 7.3 (6.5, 8.4) | 8.9 (7.3, 10.7) | 7.9 (7.0, 9.2) | 7.2 (6.5, 8.4) | 7.8 (6.9, 9.3)
SBP, mmHg* | 130 (14) | 128 (13) | 131 (16) | 132 (16) | 130 (15) | 131 (15)
LDL, mmol/l* | 2.53 (0.91) | 2.48 (0.88) | 2.53 (0.96) | 2.53 (0.91) | 2.51 (0.88) | 2.53 (0.91)
Triglycerides, mmol/l‡ | 1.66 (1.22, 2.21) | 1.68 (1.22, 2.24) | 1.61 (1.15, 2.20) | 1.67 (1.22, 2.24) | 1.55 (1.12, 2.11) | 1.65 (1.21, 2.21)
Weight, kg* | 98 (24) | 108 (26) | 100 (26) | 97 (25) | 99 (24) | 99 (25)
BMI, kg/m²* | 34.4 (7.6) | 38.2 (83) | 35.2 (8.3) | 34.1 (7.7) | 34.6 (7.9) | 34.6 (7.9)
Obese† | 32 567 (70) | 10 201 (87) | 22 600 (71) | 67 899 (68) | 10 763 (70) | 144 030 (70)
Hypertension† | 28 063 (56) | 6675 (53) | 17 477 (50) | 60 207 (54) | 8434 (48) | 120 856 (54)
Cardiovascular disease† | 8796 (18) | 1531 (12) | 7745 (22) | 22 995 (21) | 2958 (17) | 44 025 (19)
Chronic kidney disease† | 1525 (3) | 229 (2) | 1129 (3) | 3910 (4) | 447 (3) | 7240 (3)
Received third anti-hyperglycaemic agent† | 23 318 (47) | 5986 (47) | 4482 (13) | 34 905 (32) | 7015 (40) | 75 706 (34)
Received cardio-protective medication† | 46 395 (93) | 11 336 (90) | 31 862 (92) | 103 553 (94) | 16 390 (93) | 209 536 (93)
Received non-steroidal anti-inflammatory drugs | 37 265 (74) | 9269 (73) | 25 121 (72) | 82 963 (75) | 13 207 (75) | 167 825 (74)
Received anti-hypertensive† | 3613 (7) | 824 (7) | 3821 (11) | 10 518 (10) | 1627 (9) | 20 403 (9)
Sub-cohort 1† | 49 419 (99) | 12 518 (99) | 34 098 (98) | 108 553 (98) | 17 294 (98) | 221 882 (98)
Sub-cohort 2† | 31 618 (63) | 7580 (60) | 18 924 (54) | 64 086 (58) | 9274 (53) | 131 482 (58)
Values are given as *mean (SD), †n (%) or ‡median (IQR).
Sub-cohort 1, patients from the study cohort excluding those with acute pancreatitis, other disease of pancreas or any type of cancer within 6 months of index date.
Sub-cohort 2, patients from the study cohort with non-missing HbA1c measure at index date.
acute pancreatitis, 101 (10%) also had other diseases of the
pancreas. Except for the insulin group, the rate per 1000
person-years for other diseases of the pancreas was similar
across treatment groups (Table 2). The mean (95% CI) time
to develop other diseases of the pancreas in DPP-4i group
was 2.73 (2.33, 3.12) years, with no significant difference
from other treatment groups.
In all survival regression models, balance on the defined
confounders was achieved among the treatment groups. The
weighted standardized differences in confounding factors
across the treatment groups were similar while evaluating the
risk of acute pancreatitis, other diseases of the pancreas and
pancreatic cancer separately (Table S1). The sensitivity
analyses with sub-cohorts 1 and 2 showed similar estimates
of time to events. However, after removing those who
developed acute pancreatitis or other diseases of the pancreas
or any cancer within 6 months of the index date (sub-cohort
1), the adjusted time to acute pancreatitis among people
treated with insulin no longer differed from that in the other
groups, in contrast to the finding in the primary cohort.
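A weighted standardized difference, as used above to check covariate balance, compares weighted group means on the scale of a pooled standard deviation. The sketch below shows one common form of the calculation; it is an assumption that this matches the exact variant reported in Table S1, and the function names and toy values are hypothetical.

```python
import math
from typing import Sequence


def weighted_mean(x: Sequence[float], w: Sequence[float]) -> float:
    """Mean of x with observation weights w."""
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)


def weighted_var(x: Sequence[float], w: Sequence[float]) -> float:
    """Weighted variance around the weighted mean (no small-sample correction)."""
    m = weighted_mean(x, w)
    return sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w)) / sum(w)


def weighted_std_diff(x_group: Sequence[float], w_group: Sequence[float],
                      x_ref: Sequence[float], w_ref: Sequence[float]) -> float:
    """Difference in weighted means divided by the pooled weighted SD."""
    pooled_sd = math.sqrt(
        (weighted_var(x_group, w_group) + weighted_var(x_ref, w_ref)) / 2.0)
    return (weighted_mean(x_group, w_group) - weighted_mean(x_ref, w_ref)) / pooled_sd


# Toy ages with unit weights (in practice, inverse probability weights).
print(weighted_std_diff([60.0, 70.0], [1.0, 1.0], [50.0, 60.0], [1.0, 1.0]))  # 2.0
print(weighted_std_diff([50.0, 60.0], [1.0, 1.0], [50.0, 60.0], [2.0, 2.0]))  # 0.0
```

Values close to zero after weighting indicate that the covariate is similarly distributed in the compared groups.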
Discussion
A potential relationship between the use of incretin-based
anti-hyperglycaemic drugs and adverse pancreatic outcomes
has been suggested by various pre-clinical and clinical
studies. In particular, meta-analysis of data obtained from
cardiovascular safety trials of several DPP-4i medications
suggests that exposure to those medications significantly
increased the risk of acute pancreatitis compared with
placebo, although the absolute increase in risk was small
[8]. Interestingly, although people with Type 2 diabetes and
obesity are known to be at increased risk for pancreatitis and
pancreatic cancer, rates of those complications have been
low in recent clinical trials. Low rates of these outcomes of
interest, as well as the likely extended duration of exposure
and follow-up needed to ascertain a relationship between
anti-hyperglycaemic drug exposure and the development of
malignancy, pose significant challenges in determining the
pancreatic safety of drugs commonly used in diabetes
management.
In this retrospective, longitudinal, real-world study of a
large cohort of people with metformin-treated Type 2
diabetes, we analysed the effects of second-line anti-
hyperglycaemic drugs upon rates and times to clinically
important pancreatic complications. Our analyses have
shown that the rate of acute pancreatitis was higher in
the group treated with insulin, lower in the thiazolidine-
dione group, and similar in the groups receiving GLP-1RA
or sulfonylurea therapy when compared with the group
treated with DPP-4i. Time to development of pancreatic
cancer was longer in the GLP-1RA group compared with
DPP-4i users, but did not differ significantly between the
other anti-hyperglycaemic drug groups compared with DPP-
4i. Rates of any other disease of the pancreas were not
higher among people who received additional therapy with
a DPP-4i compared with other classes of anti-hyperglycae-
mic drugs.
The increased risk of pancreatitis in insulin users is perhaps
not surprising, because insulin users often tend to have a
greater burden of comorbidities and risks for adverse
outcomes that cannot be fully adjusted for in a retrospective
analysis. However, time to acute pancreatitis was no longer
significantly different between the insulin and DPP-4i groups
after removing individuals who developed acute pancreatitis
or other diseases of the pancreas or any cancer within
6 months of the index date (sub-cohort 1). The lower rate
per 1000 person-years of acute pancreatitis noted in
thiazolidinedione users is perhaps more unexpected. Interestingly,
studies in rodent models suggest that rosiglitazone
exposure may limit the severity of pancreatitis and shorten
recovery from pancreatic inflammation in the setting of
induced pancreatic injury [26,27]. However, the thiazolidinedione
class is associated with a number of adverse side
effects that have significantly limited clinical use of these
medications [28]. The findings of this analysis provide
reassurance to prescribers and users of DPP-4i that these
medications do not significantly increase the risk of adverse
pancreatic outcomes compared with other commonly prescribed
second-line therapies.

Table 2 Event rates (95% CI) per 1000 person-years; adjusted mean time to events (95% CI) in dipeptidyl peptidase-4 (DPP-4) group and adjusted difference in time to events in other treatment groups with DPP-4 inhibitor as a reference

                                       Acute pancreatitis       Pancreatic cancer       Other disease of pancreas
                                       (95% CI)                 (95% CI)                (95% CI)
DPP-4
  Rate per 1000 person-years           1.38 (1.21, 1.59)        0.46 (0.36, 0.59)       0.93 (0.78, 1.10)
  Mean time to event (years)           2.63 (2.38, 2.88)        2.70 (2.19, 3.21)       2.73 (2.33, 3.12)
Glucagon-like peptide-1 receptor agonist
  Rate per 1000 person-years           1.49 (1.16, 1.92)        0.17 (0.08, 0.36)       0.78 (0.55, 1.10)
  Time difference (years)              -0.18 (-0.72, 0.37)      3.00 (0.84, 5.16)*      0.52 (-0.60, 1.65)
Insulin
  Rate per 1000 person-years           2.01 (1.75, 2.31)        0.59 (0.46, 0.77)       1.48 (1.26, 1.75)
  Time difference (years)              -0.48 (-0.90, -0.06)*    -0.70 (-1.56, 0.17)     -0.49 (-1.01, 0.03)
Sulfonylurea
  Rate per 1000 person-years           1.45 (1.33, 1.58)        0.55 (0.47, 0.63)       1.04 (0.94, 1.15)
  Time difference (years)              -0.01 (-0.51, 0.50)      -0.57 (-1.26, 0.11)     -0.43 (-1.13, 0.28)
Thiazolidinedione
  Rate per 1000 person-years           0.89 (0.70, 1.12)        0.36 (0.25, 0.52)       0.85 (0.67, 1.08)
  Time difference (years)              -0.25 (-0.56, 0.05)      -0.09 (-0.74, 0.56)     -0.28 (-0.74, 0.18)

*P < 0.01.
†Analyses were balanced on age and the follow-up time by treatment groups, and were adjusted for age, sex, BMI, smoking status and diabetes duration.
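
The event rates in Table 2 are expressed per 1000 person-years with 95% confidence intervals. As a rough illustration of how a figure of that form can be obtained, the sketch below computes a crude rate and an approximate interval on the log scale. The function name, the event count and the person-time are hypothetical, and the interval method (a Wald interval for a Poisson rate) is an assumption; the article does not state which method the authors used.

```python
from math import exp, sqrt


def rate_per_1000_py(events, person_years, z=1.96):
    """Crude event rate per 1000 person-years with an approximate 95% CI.

    Assumes the event count is Poisson distributed, so the standard error
    of log(rate) is 1 / sqrt(events). This is an illustrative choice, not
    necessarily the method used in the article.
    """
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = 1000.0 * events / person_years
    factor = exp(z / sqrt(events))  # multiplicative margin on the rate scale
    return rate, rate / factor, rate * factor


# Hypothetical counts, chosen only to show the calculation.
rate, lower, upper = rate_per_1000_py(events=100, person_years=50_000)
print(f"{rate:.2f} ({lower:.2f}, {upper:.2f}) per 1000 person-years")
```

Because the interval is built on the log scale it is asymmetric around the point estimate, which is consistent with the shape of the intervals reported in Table 2.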
Strengths of this analysis include the large number
of people who met the inclusion criteria for analysis, the
robust amount of data collected and the reasonable duration
of follow-up after addition of the second-line anti-hypergly-
caemic drug. Although the overall numbers of people expe-
riencing acute pancreatitis and pancreatic cancer were small,
the numbers exceed those available for inclusion in the
previously cited meta-analysis [8]. The unique and novel
aspects of this study include the careful choice of study cohort
without any history of pancreatic diseases or cancer. This
approach reduces the likelihood of events attributable to pre-
existing conditions such as biliary disease, structural pancre-
atic disorders or an autoimmune/genetic predisposition to
pancreatic diseases. The analyses also include a holistic
evaluation of the risks associated with use of different anti-
hyperglycaemic drugs rather than a one drug vs. all approach;
a detailed evaluation of the treatment patterns, ensuring non-
exposure to incretin-based therapies in other comparator
treatment groups; and careful choice of statistical methodol-
ogy to evaluate the risk while robustly addressing the
challenging issues of imbalances in important risk factors
and confounders across treatment groups. Our finding that
people treated with a DPP-4i are not at higher risk of
pancreatic diseases is robust, supported by the sensitivity
analyses in a large number of people with Type 2 diabetes.
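
Balance between treatment groups is summarised in the supporting information as weighted standardized differences (Table S1). A minimal sketch of that diagnostic for one continuous covariate follows. The function names, the simple weighted-variance formula and the example values are assumptions made for illustration; they are not taken from the article.

```python
from math import sqrt


def weighted_mean(x, w):
    """Weighted mean of values x with non-negative weights w."""
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)


def weighted_var(x, w):
    """Simple weighted variance (weights normalised by their sum)."""
    m = weighted_mean(x, w)
    return sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w)) / sum(w)


def weighted_std_diff(x_a, w_a, x_b, w_b):
    """Weighted standardized difference between groups A and B.

    Difference in weighted means divided by the square root of the
    average of the two weighted variances.
    """
    pooled_sd = sqrt((weighted_var(x_a, w_a) + weighted_var(x_b, w_b)) / 2)
    return (weighted_mean(x_a, w_a) - weighted_mean(x_b, w_b)) / pooled_sd


# Hypothetical ages in two treatment groups, before and after weighting.
unweighted = weighted_std_diff([60, 70], [1, 1], [50, 60], [1, 1])
weighted = weighted_std_diff([60, 70], [1, 1], [50, 60], [1, 3])
print(round(unweighted, 3), round(weighted, 3))
```

In this toy example, up-weighting the older person in the second group moves its weighted mean age towards that of the first group, so the standardized difference shrinks. Values close to zero indicate that a covariate is well balanced after weighting.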
Electronic databases do present challenges in terms of the
accuracy and completeness of the required data. As a result,
limitations of this study include non-availability of complete
and reliable longitudinal data related to medication adherence,
tobacco and alcohol consumption, socio-economic status and
potential residual confounders. In particular, the inability to
quantify alcohol exposure does not permit adjustment or
balancing for this known risk factor for pancreatitis. The
analyses were not adjusted for conditions such as hypertriglyc-
eridaemia, hypercalcaemia or non-anti-hyperglycaemic drug
exposures that have been associated with pancreatitis; how-
ever, these are considered responsible for only a small
percentage of overall cases of acute pancreatitis [29]. Further-
more, other known risk factors for pancreatic cancer including
dietary composition, inactivity or family history/genetic pre-
disposition are not available in routinely collected electronic
health records. There may also have been inherent medical,
socio-economic or other differences that affected the types of
anti-hyperglycaemic drug prescribed. These differences could
not be readily determined or adjusted for in the analyses, and
the impact of these individual characteristics upon pancreatic
outcomes is unknown. Future pharmaco-epidemiologic studies
with longer-term follow-up and more robust medication
exposure and outcomes ascertainment will further complement
our understanding of the pancreatic safety of
anti-hyperglycaemic therapies.
Funding sources
None.
Competing interests
SKP has acted as a consultant and/or speaker for Novartis,
GI Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi
Pharmaceutical and Amylin Pharmaceuticals LLC. He has
received grants in support of investigator and investigator-
initiated clinical studies from Merck, Novo Nordisk,
AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis
and Pfizer. JA has received honoraria for consultancies
and lectures from Novartis, Novo Nordisk, AstraZeneca,
Sanofi, Merck Sharp and Dohme, Abbott, Janssen
Cilag and Takeda. OM has no conflict of interest to declare.
JG has received research grants from AstraZeneca, Boehrin-
ger Ingelheim, GlaxoSmithKline, Intarcia and Sanofi, and
personal fees for consultative work from AstraZeneca,
Daiichi, Merck Sharp & Dohme, Novo Nordisk and
Boehringer-Ingelheim.
Acknowledgements
SKP, OM and JA acknowledge a grant provided by the Royal
Brisbane and Women’s Hospital Foundation. Melbourne EpiCentre
gratefully acknowledges support from the National
Health and Medical Research Council and the Australian
Government’s National Collaborative Research Infrastruc-
ture Strategy (NCRIS) initiative through Therapeutic Inno-
vation Australia. The authors are grateful to all contributors
in the CEMR database. OM gratefully acknowledges the
PhD scholarship from Queensland University of Technology,
Australia, and her co-supervisors Prof. Ross Young and Prof.
Louise Hafner of the same university.
Author contributions
SKP conceived the idea and was responsible for the primary
design of the study. OM, JA and JG contributed to the study
design. OM conducted the data extraction, and SKP and OM
jointly conducted the statistical analyses. The first draft of
the manuscript was developed by OM and SKP, and all
authors contributed to the finalisation of the manuscript.
SKP had full access to all the data in the study and is the
guarantor, taking responsibility for the integrity of the data
and the accuracy of the data analysis. Aggregated data are
available upon request.
References
1 Deacon CF, Mannucci E, Ahrén B. Glycaemic efficacy of glucagon-like
peptide-1 receptor agonists and dipeptidyl peptidase-4 inhibitors
as add-on therapy to metformin in subjects with type 2 diabetes
—a review and meta analysis. Diabetes Obes Metab 2012; 14: 762–767.
2 Paul SK, Agbeve J, Maggs D, Best JH. Comparison of trajectories of
self-monitored glucose levels by hypoglycemia status over 52 weeks
of treatment with insulin glargine or exenatide once weekly. J
Diabetes 2016; 8: 148–157.
3 American Diabetes Association. Standards of Medical Care in
Diabetes—2018. Diabetes Care 2018; 41(Suppl 1): S4.
4 Paul SK, Klein K, Maggs D, Best JH. The association of the
treatment with glucagon-like peptide-1 receptor agonist exenatide
or insulin with cardiovascular outcomes in patients with type 2
diabetes: a retrospective observational study. Cardiovasc Diabetol
2015; 14: 10.
5 Wu S, Cipriani A, Yang Z, Yang J, Cai T, Xu Y et al. The
cardiovascular effect of incretin-based therapies among type 2
diabetes: a systematic review and network meta-analysis. Expert
Opin Drug Saf 2018; 17: 243–249.
6 Azoulay L, Filion KB, Platt RW, Dahl M, Dormuth CR, Clemens
KK et al. Incretin based drugs and the risk of pancreatic cancer:
international multicentre cohort study. BMJ 2016; 352: i581.
7 Azoulay L. Incretin-based drugs and adverse pancreatic events:
almost a decade later and uncertainty remains. Diabetes Care 2015;
38: 951–953.
8 Tkáč I, Raz I. Combined analysis of three large interventional trials
with gliptins indicates increased incidence of acute pancreatitis in
patients with type 2 diabetes. Diabetes Care 2017; 40: 284–286.
9 Thomsen RW, Pedersen L, Møller N, Kahlert J, Beck-Nielsen H,
Sørensen HT. Incretin-based therapy and risk of acute pancreatitis:
a nationwide population-based case-control study. Diabetes Care
2015; 38: 1089–1098.
10 Faillie J-L, Azoulay L, Patenaude V, Hillaire-Buys D, Suissa S.
Incretin based drugs and risk of acute pancreatitis in patients with
type 2 diabetes: cohort study. BMJ 2014; 348: g2780.
11 Roshanov PS, Dennis BB. Incretin-based therapies are associated
with acute pancreatitis: meta-analysis of large randomized controlled
trials. Diabetes Res Clin Pract 2015; 110: e13–e17.
12 Wang T, Wang F, Gou Z, Tang H, Li C, Shi L et al. Using real-world
data to evaluate the association of incretin-based therapies
with risk of acute pancreatitis: a meta-analysis of 1 324 515
patients from observational studies. Diabetes Obes Metab 2015;
17: 32–41.
13 Li L, Shen J, Bala MM, Busse JW, Ebrahim S, Vandvik PO et al.
Incretin treatment and risk of pancreatitis in patients with type 2
diabetes mellitus: systematic review and meta-analysis of randomised
and non-randomised studies. BMJ 2014; 348: g2366.
14 Chou H-C, Chen W-W, Hsiao F-Y. Acute pancreatitis in patients
with type 2 diabetes mellitus treated with dipeptidyl peptidase-4
inhibitors: a population-based nested case-control study. Drug Saf
2014; 37: 521–528.
15 Chang C-H, Lin J-W, Chen S-T, Lai M-S, Chuang L-M, Chang
Y-C. Dipeptidyl peptidase-4 inhibitor use is not associated with
acute pancreatitis in high-risk type 2 diabetic patients: a nationwide
cohort study. Medicine 2016; 95: e2603.
16 Knapen L, van Dalem J, Keulemans Y, van Erp N, Bazelier M, De
Bruin M et al. Use of incretin agents and risk of pancreatic cancer: a
population-based cohort study. Diabetes Obes Metab 2016; 18:
258–265.
17 Knapen LM, de Jong RG, Driessen JH, Keulemans YC, van Erp
NP, De Bruin ML et al. The use of incretin agents and risk of acute
and chronic pancreatitis: a population-based cohort study. Diabetes
Obes Metab 2017; 19: 401–411.
18 Chen H, Zhou X, Chen T, Liu B, Jin W, Gu H et al. Incretin-based
therapy and risk of pancreatic cancer in patients with Type 2
diabetes mellitus: a meta-analysis of randomized controlled trials.
Diabetes Ther 2016; 7: 725–742.
19 Centers for Disease Control and Prevention. National Diabetes
Statistics Report: Estimates of Diabetes and its Burden in the
United States, 2014. Atlanta, GA: US Department of Health and
Human Services, 2014.
20 Crawford AG, Cote C, Couto J, Daskiran M, Gunnarsson C, Haas
K et al. Comparison of GE Centricity Electronic Medical Record
database and National Ambulatory Medical Care Survey findings
on the prevalence of major conditions in the United States. Popul
Health Manag 2010; 13: 139–150.
21 Paul SK, Shaw J, Montvida O, Klein K. Weight gain in insulin-treated
patients by body mass index category at treatment
initiation: new evidence from real-world data in patients with type
2 diabetes. Diabetes Obes Metab 2016; 18: 1244–1252.
22 ElHafeez SA, Torino C, D’Arrigo G, Bolignano D, Provenzano F,
Mattace-Raso F et al. An overview on standard statistical methods
for assessing exposure–outcome link in survival analysis (Part II):
the Kaplan-Meier analysis and the Cox regression method. Aging
Clin Exp Res 2012; 24: 203–206.
23 Rotnitzky A, Robins JM. Inverse probability weighting in survival
analysis. Encyclopedia of Biostatistics. Chichester: Wiley, 2005.
24 Austin PC, Stuart EA. The performance of inverse probability of
treatment weighting and full matching on the propensity score in
the presence of model misspecification when estimating the effect of
treatment on survival outcomes. Stat Methods Med Res 2017; 26:
1654–1670.
25 Austin PC. Variance estimation when using inverse probability of
treatment weighting (IPTW) with survival analysis. Stat Med 2016;
35: 5642–5655.
26 Pini M, Rhodes DH, Castellanos KJ, Cabay RJ, Grady EF,
Fantuzzi G. Rosiglitazone improves survival and hastens recovery
from pancreatic inflammation in obese mice. PLoS One 2012; 7:
e40944.
27 Wan H, Yuan Y, Liu J, Chen G. Pioglitazone, a PPAR-γ activator,
attenuates the severity of cerulein-induced acute pancreatitis by
modulating early growth response-1 transcription factor. Transl
Res 2012; 160: 153–161.
28 Woodcock J, Sharfstein JM, Hamburg M. Regulatory action on
rosiglitazone by the US Food and Drug Administration. N Engl J
Med 2010; 363: 1489–1491.
29 Forsmark CE, Vege SS, Wilcox CM. Acute pancreatitis. N Engl J
Med 2016; 375: 1972–1981.
Supporting Information
Additional supporting information may be found online in
the Supporting Information section at the end of the article.
Table S1. Weighted standardized differences in balanced
groups.