EVALUATION OF CARDIO-METABOLIC EFFECTS OF TREATMENT WITH INCRETIN-BASED THERAPIES IN PATIENTS WITH TYPE 2 DIABETES

Olga Montvida, MSc

Student number: 9341625

Submitted in fulfilment of the requirements for the degree of Doctor of Philosophy

School of Biomedical Sciences

Institute of Health and Biomedical Innovation

Faculty of Health

Queensland University of Technology

2018

ABSTRACT

Type 2 diabetes (T2DM) is a chronic and progressive metabolic disorder with a complex and

multifactorial pathophysiology. As patients with T2DM are at increased risk of cardiovascular

(CV) complications and mortality, efficient disease management requires a holistic multi-

faceted approach to control blood glucose, blood pressure, lipids, and body weight. While

metformin has been suggested as the first-line anti-diabetic drug (ADD), given the progressive

nature of the disease, many patients eventually require intensification. International guidelines

suggest multiple options for second- and third-line ADDs, including incretin-based therapies:

dipeptidyl peptidase 4 inhibitor (DPP-4i) and glucagon-like peptide-1 receptor agonist (GLP-

1RA). While current disease management guidelines are primarily based on results from

randomised controlled trials that are conducted on a protocol-driven selective patient

population, the population-level evaluation of the effectiveness and safety of such therapies in

real-world practice would guide patients and their carers in choosing the right

therapies for optimum disease management. Clinical studies have evaluated the possible

beneficial association of treatment with novel anti-diabetic therapies with CV risk factors;

however, the real-world evidence on such aspects is scarce.

With a central focus on incretin-based therapies, the aims of this thesis were to explore the real-

world patterns of (1) longitudinal changes in ADD choices, (2) population-level glycaemic

control and its sustainability, and (3) the long-term cardio-metabolic risk factor burden.

Using a large database of Electronic Medical Records (EMRs) from the United States, six

pharmaco-epidemiological and three methodological studies were conducted. A number of

important findings were reported in high-impact journals, including Diabetes Care and

Diabetes, Obesity, and Metabolism, with one publication receiving a dedicated review in the

Nature Reviews journal.

Extensive methodological and data mining studies were performed to extract reliable data from

voluminous EMRs and to develop efficient study designs and analysis approaches. One study

was devoted to the data management of medication prescriptions, specifically to the estimation

of treatment duration at the individual patient level, accounting for intensifications and alterations

with multiple therapies. Methodological challenges associated with robust identification of the

patients with T2DM were addressed in another separate study. An exploratory analysis to

investigate the mechanisms and patterns of longitudinal missing risk factor data along with a

comparative study of multiple imputation techniques for such data were conducted to account

for the uncertainty due to missing values and to ensure the generalisability of study findings.

Advanced statistical methodologies, such as “treatment effects modelling”, were applied

throughout the thesis to ensure that robust inferences were drawn in the individual studies.

It was observed that the use of incretin-based drugs has increased since their approval in 2005,

in particular the use of DPP-4i as a second-line choice. Patient profiles significantly varied by

the class of chosen ADD; for instance, GLP-1RA users were younger, had lower HbA1c levels,

and were more likely to be female, compared to other major ADD users. It was observed that

around half of the patients with T2DM do not reach glycaemic targets and clear evidence of

therapeutic inertia persists at the population level.

Patients who intensified metformin with incretin-based drugs or thiazolidinedione were more

likely to achieve and sustain glycaemic control over 24 months of continuous treatment,

compared to those treated with sulfonylurea – the most popular intensification choice. A

separate study investigated the outcomes of intensifying GLP-1RA with insulin and reported

beneficial cardio-metabolic effects of combining these therapies. Even though the popularity

of newer therapeutic classes as second-line options was notably increasing, the longitudinal

rates of intensification with a third-line ADD were not reduced significantly at the population

level.

Neither glycaemic nor CV risk factor burden significantly improved over the last decade in

patients with T2DM, even though most patients were using multiple drugs for glucose, blood

pressure and lipid control. The long-term glycaemic burden consistently increased over time,

and more than half of the patients with a history of CV disease continued to have uncontrolled

blood pressure and lipids post-therapy initiation. Three out of five patients who were already

receiving multiple anti-diabetic and cardio-protective drugs were failing to simultaneously

control glucose and at least one CV risk factor. Compared to those who initiated second-line

ADD with sulfonylurea and insulin, patients who intensified metformin with incretin-based

therapies or thiazolidinedione were more likely to achieve simultaneous glucose and CV risk

factor control. Treatment with GLP-1RA was associated with lower rates of major adverse

macrovascular events, compared to other ADDs.

To conclude, this dissertation provides a detailed exploration of, and valuable insights into, T2DM

management in the real-world setting and highlights alarming rates of the existing cardio-

metabolic burden at the population level. Incretin-based therapies and thiazolidinedione were

found to provide higher chances of sustainable glycaemic and CV risk factor control, and

treatment with GLP-1RA appears to have a beneficial association with CV risk, compared to

other anti-diabetic treatment options. Nonetheless, proper control in terms of timely

intensification with anti-hyperglycaemic, anti-hypertensive, and anti-dyslipidemic therapies

when needed remains key to improving long-term outcomes in patients with T2DM.

KEYWORDS

Glucagon-like peptide-1 receptor agonist, dipeptidyl peptidase 4 inhibitor, incretin-based

therapy, glycaemic control, cardiovascular risk, macrovascular event, type 2 diabetes mellitus,

electronic medical records, longitudinal cohort study.

LIST OF PUBLICATIONS

The following is a list of published or submitted manuscripts that have been incorporated

into this thesis, thereby producing a thesis by publication.

Chapter 4: Olga Montvida, Ognjen Arandjelović, Edward Reiner, and Sanjoy K. Paul. Data

Mining Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic

Medical Records. The Open Bioinformatics Journal, 2017, 10:1-15. DOI:

10.2174/1875036201709010001.x.

Chapter 6: Mayukh Samanta, Olga Montvida, Joanne Tropea, and Sanjoy K. Paul. A

comparison of imputation methods for missing risk factor data from large real-world electronic

medical records for comparative effectiveness studies. (Submitted)

Chapter 7: Olga Montvida, Jonathan Shaw, John J Atherton, Francis Stringer, Sanjoy K Paul.

Long-term Trends in Antidiabetes Drug Usage in the US: Real-world Evidence in Patients

Newly Diagnosed With Type 2 Diabetes. Diabetes Care. 2017 Nov 6:dc171414. DOI:

10.2337/dc17-1414.x.

Chapter 8: Olga Montvida, Jonathan Shaw, Lawrence Blonde, Sanjoy K Paul. Long-term

sustainability of glycaemic achievements with second-line anti-diabetic therapies in patients

with type 2 diabetes: A real-world study. Diabetes, Obesity, and Metabolism. 2018;20:1722–

1731. DOI: 10.1111/dom.13288.x.

Chapter 9: Olga Montvida, Sanjoy K Paul. Cardiovascular risk factor burden and safety in

patients with type 2 diabetes receiving intensified anti-diabetic and cardio-protective therapies.

(Submitted)

The following is a list of accepted and submitted manuscripts that are highly relevant to the

work performed in this thesis and were developed throughout candidature.

Appendix A: Olga Montvida, Kerenaftali Klein, Sudhesh Kumar, Kamlesh Khunti, Sanjoy

K. Paul. Addition of or switch to insulin therapy in people treated with glucagon‐like peptide‐

1 receptor agonists: A real‐world study in 66 583 patients. Diabetes, Obesity and Metabolism.

2017 Jan 1;19(1):108-17. DOI: 10.1111/dom.12790.x.

Appendix B: Ebenezer S. Owusu Adjah*, Olga Montvida*, Julius Agbeve, Sanjoy K. Paul.

Data Mining Approach to Identify Disease Cohorts from Primary Care Electronic Medical

Records: A Case of Diabetes Mellitus. The Open Bioinformatics Journal, 2017, 10: 16-27.

DOI: 10.2174/1875036201710010016.x. *Joint first authorship.

Appendix C: Sanjoy K Paul, Jonathan Shaw, Olga Montvida, Kerenaftali Klein. Weight gain

in insulin treated patients by BMI categories at treatment initiation: New evidence from real-

world data in patients with type 2 diabetes. Diabetes, Obesity and Metabolism. 2016 Dec

1;18(12):1244-52. DOI: 10.1111/dom.12761.x.

Appendix D: Olga Montvida, Jennifer B Green, John Atherton, Sanjoy K Paul. Risk of

Pancreatic Diseases by Second-line Drug Class: Real World Evidence in 225,898 Type 2

Diabetes Patients. Diabet Med. 2018 Oct 10. doi: 10.1111/dme.13835.

The following is a list of presentations and papers in refereed conference proceedings

throughout candidature.

1. Olga Montvida, Sanjoy Paul. Cardiovascular risk factor burden and safety in patients

with type 2 diabetes receiving intensified anti-diabetic and cardio-protective therapies. QIMR

Berghofer Early Career Researcher Seminar, Brisbane, Australia, 18 May 2018.

2. Olga Montvida, Jonathan Shaw, Sanjoy Paul. Comparative assessment of glycaemic

achievements with second-line anti-diabetes therapy intensification – Real world evidence

based choices for patients and providers. Annual Meeting of the European Association for the

Study of Diabetes (EASD), Lisbon, Portugal, 11-15 September 2017.

3. Sanjoy Paul, Jennifer B Green, John Atherton, Olga Montvida. Risk of Pancreatic

Diseases by Second-line Anti Diabetes Drug Class: Real World Based Evidence. Annual

Meeting of the European Association for the Study of Diabetes (EASD), Lisbon, Portugal, 11-

15 September 2017.

4. Ebenezer Adjah, Olga Montvida, Kamlesh Khunti, Sanjoy Paul. Interactive changes

in cardiovascular risk factors and the long-term cardiovascular risk differ by adiposity levels

in incident type 2 diabetes patients: real world study. Annual Meeting of the European

Association for the Study of Diabetes (EASD), Lisbon, Portugal, 11-15 September 2017.

5. Olga Montvida, Sanjoy Paul. Time to third-line anti-diabetes therapy intensification

in patients receiving second-line GLP-1 receptor agonist, DPP-4 inhibitor and Sulfonylurea: A

real-world study. The Australian Diabetes Society (ADS) and the Australian Diabetes

Educators Association (ADEA) Annual Scientific Meeting 2017, Perth, Australia. 30 August -

1 September 2017.

6. Olga Montvida, Sanjoy Paul. Long-term glycaemic control with incretin-based

therapies in patients with type 2 diabetes: real-world study. QIMR Berghofer Early Career

Researcher Seminar, Brisbane, Australia, 23 June 2017.

7. Sanjoy K. Paul, Brian L. Thorsted, Michael L. Wolden, Kamlesh Khunti, Olga

Montvida. Delay in Treatment Intensification Increases the Risk of Cardiovascular Events in

Patients with Type 2 Diabetes. Cardiovascular Research Showcase, Brisbane, Australia, 22

November 2016.

8. Sanjoy Paul, Jonathan Shaw, Olga Montvida, Kerenaftali Klein. Obese Patients Gain

Less Weight than Non-obese Patients when Treated with Insulin, with Similar HbA1c

Reductions: New Evidence from Real-world Data in Type 2 Diabetes. Annual Meeting of the

European Association for the Study of Diabetes (EASD), Munich, Germany, 12-16 Sep, 2016.

9. Olga Montvida, Sanjoy Paul. Addition or Switch to Insulin Therapy in People Treated

with GLP-1 Receptor Agonists: A Real World Study in 66,583 Patients. Australian Diabetes

Society and Australian Diabetes Educators Association Annual Scientific Meeting, Gold Coast,

24-26 Aug, 2016.

10. Olga Montvida, Sanjoy Paul. Real World Outcomes of Addition or Switch to Insulin

Therapy in People Treated with GLP-1 Receptor Agonist. QIMR Berghofer Student

Symposium, 15 Jul 2016.

11. Sanjoy K. Paul, Jonathan Shaw, Kerenaftali Klein, Olga Montvida. Obese T2DM

Patients Gain Less Weight with Insulin Treatment Compared with Normal and Overweight

Patients: New Evidence from Real-World Data. American Diabetes Association (ADA) 76th

Scientific Sessions, New Orleans, USA, 10-14 June 2016.

12. Olga Montvida, Kerenaftali Klein, Sanjoy K. Paul. Evaluation of the Cardio-metabolic

Effects of Treatment with Incretin-based Therapies in Patients with Type 2 Diabetes. 8th

Biennial QIMR Berghofer Student Retreat, Canungra, Australia, 17-18 September 2015.

13. Kerenaftali Klein, Olga Montvida, Sanjoy K Paul. Real World Glucose and Weight

Control in Patients Treated with GLP-1 Receptor Agonists, with Addition or Treatment Change

to Insulin. Annual Meeting of the European Association for the Study of Diabetes (EASD),

Stockholm, Sweden, 14-18 September 2015.

STATEMENT OF ORIGINAL AUTHORSHIP

The work contained in this thesis has not been previously submitted to meet requirements for

an award at Queensland University of Technology or any other higher education institution.

To the best of my knowledge and belief, the thesis contains no material previously published

or written by another person except where due reference is made.

Olga Montvida 6 Nov 2018

QUT Verified Signature

ACKNOWLEDGEMENTS

I would like to express my sincerest and deepest gratitude to my principal supervisor, Professor

Sanjoy Ketan Paul. Thank you for trusting in my abilities to conduct this project from the very

beginning and for such a close mentorship and inspiration over the past three years. There was

a lot of hard work, tons of coffee, and a lot of fun. We are no Michelangelo, but the quote

reflects it perfectly: “If people knew how hard I worked to achieve my mastery, it wouldn't

seem so wonderful after all”.

Deep thanks to my associate supervisors, Professor Ross Young and Professor Louise Hafner

for supporting me and advising me from the first day of my enrolment. To all former colleagues

at QIMR Berghofer - Kerenaftali Klein, Julius Agbeve, Mayukh Samanta, Margaret Haughton,

and Gunter Hartel. You were here to help me during good and bad times, thank you for that.

To my PhD buddy Ebenezer Senyo Owusu Adjah, who never refused to share with me his

materials, thoughts, and advice.

I gratefully acknowledge the financial support received through the scholarship from

Queensland University of Technology. I also want to thank QIMR Berghofer Medical Research

Institute for giving me working space and granting a research top-up scholarship. Big thank

you to every employee working for these two institutions who ensured that I felt safe and

comfortable in a new country.

Finally, thanks to those who are always in my heart – family and friends. Your love contributed

to this dissertation much more than you think. Special thanks to my brother, who kept boosting

my self-confidence and fighting spirit.

It’s been an amazing journey. With confidence I may now say that a significant (p<0.01 😊)

development of myself as a scientist and as a person has been achieved.

TABLE OF CONTENTS

Abstract
Keywords
List of Publications
Statement of Original Authorship
Acknowledgements
List of Figures
List of Tables
List of Supplementary Material
List of Abbreviations

Chapter 1: Introduction
1.1 Diabetes Mellitus
1.2 Epidemiology of Diabetes Mellitus
1.3 Complications of Type 2 Diabetes
1.4 Treatment of Type 2 Diabetes
1.5 Incretin-based therapies
1.6 Glycaemic effects of incretin-based therapies
1.7 Cardio-metabolic effects of incretin-based therapies
1.8 Aims and Objectives
1.9 Methodological Background
1.10 Thesis structure and logics

Chapter 2: Literature Review
2.1 Clinical trials
2.2 Observational studies
2.3 Conclusions and implications

Chapter 3: Data Description
3.1 Centricity Electronic Medical Records
3.2 Medication data
3.3 Disease data
3.4 Laboratory, clinical, and anthropometric data
3.5 Ethics approval

Chapter 4: Medication Data Extraction

Chapter 5: Diabetes Mellitus Cohort
5.1 Diagnostic codes
5.2 Supervised machine learning
5.3 Final cohort
5.4 Representativeness of diabetes cohort
5.5 Type 2 diabetes cohort

Chapter 6: Imputation of Longitudinal Observation Data

Chapter 7: Trends in Anti-diabetic Drug Prescribing Patterns

Chapter 8: Glycaemic Control and Sustainability

Chapter 9: Cardio-metabolic Risk Factor Burden and Safety

Chapter 10: Discussion and Conclusions

Bibliography

Appendices

LIST OF FIGURES

Figure 3.1. Schematic representation of the data in CEMR database.

Figure 3.2. Schematic diagram of identifying list of medication keys for Liraglutide.

Figure 3.3. Schematic diagram of arranging longitudinal risk factor data.

Figure 5.1. Cohort of patients with T2DM and distribution of identified sub-types.

Figure 5.2. Selected Decision Tree algorithm.

LIST OF TABLES

Table 1.1 Incretin-based medications approved in the US and EU

Table 1.2 Possible sources of bias in Electronic Medical Record data

Table 2.1 Completed cardiovascular outcome trials for DPP-4i in patients with type 2 diabetes

Table 2.2 Completed cardiovascular outcome trials for GLP-1RA in patients with type 2 diabetes

Table 2.3 Summary of observational CV-outcome studies of treatment with incretin-based therapies

Table 3.1 Therapeutic Class and highest corresponding ATC code

Table 3.2 Diseases, ICD codes, and Weights used to compute Charlson Comorbidity Index

Table 5.1 Features Selected as Best Diabetes Predictors in CEMR

Table 5.2 Performance of Machine Learning Algorithms on the Training Dataset

Table 5.3 Characteristics of patients with diabetes in the CEMR database and in the National Diabetes Statistics report, 2015

Table 5.4 Baseline characteristics among adults with T2DM

Table 5.5 Exposure to medications any time during available follow-up among adults with T2DM

LIST OF SUPPLEMENTARY MATERIAL

Appendix A: Addition of or switch to insulin therapy in people treated with glucagon‐like

peptide‐1 receptor agonists: A real‐world study in 66 583 patients.

Appendix B: Data Mining Approach to Identify Disease Cohorts from Primary Care Electronic

Medical Records: A Case of Diabetes Mellitus.

Appendix C: Weight gain in insulin treated patients by BMI categories at treatment initiation:

New evidence from real-world data in patients with type 2 diabetes.

Appendix D: Risk of Pancreatic Diseases by Second-line Drug Class: Real World Evidence in

225,898 Type 2 Diabetes Patients.

LIST OF ABBREVIATIONS

ADA  American Diabetes Association
ADD  Anti-diabetic Drug
ATC  Anatomical Therapeutic Chemical Classification
AUC  Area Under the Receiver Operating Characteristic Curve
BMI  Body Mass Index
CCI  Charlson Comorbidity Index
CEMR  Centricity Electronic Medical Record
CI  Confidence Interval
CPM  Cardio-protective Medication
CPU  Central Processing Unit
CV  Cardiovascular
CVD  Cardiovascular Disease
DCCT  Diabetes Control and Complications Trial
DM  Diabetes Mellitus
DPP-4  Dipeptidyl Peptidase-4
DPP-4i  Dipeptidyl Peptidase-4 inhibitor
EMA  European Medicines Agency
EMR  Electronic Medical Record
FDA  Food and Drug Administration
GIP  Glucose-dependent Insulinotropic Polypeptide
GLP-1  Glucagon-like Peptide-1
GLP-1RA  Glucagon-like Peptide-1 Receptor Agonist
HbA1c  Glycated Haemoglobin
HR  Hazard Ratio
ICD-10  International Classification of Diseases 10th Revision
ICD-9  International Classification of Diseases 9th Revision
IDF  International Diabetes Federation
INS  Insulin
LDL  Low-density Lipoprotein
MACE  Major Adverse Cardiovascular Event
MET  Metformin
ML  Machine Learning
NDS  National Diabetes Statistics
OR  Odds Ratio
RCT  Randomised Controlled Trial
SBP  Systolic Blood Pressure
SD  Standard Deviation
SGLT-2i  Sodium Glucose co-Transporter 2 inhibitor
SU  Sulfonylurea
T2DM  Type 2 Diabetes Mellitus
TZD  Thiazolidinedione
UKPDS  The UK Prospective Diabetes Study

Chapter 1: Introduction

1.1 DIABETES MELLITUS

Diabetes mellitus (DM) is a group of metabolic disorders characterised by defects in insulin

secretion or action that lead to increased blood glucose levels (hyperglycaemia) [1]. It is a

chronic disease with increasing prevalence, currently affecting around 9% of the global adult

population [2].

Modern aetiologic classification of DM consists of four categories: Type 1, Type 2,

Gestational, and Other Types [1, 3]. Absolute insulin deficiency resulting from autoimmune

or idiopathic β-cell destruction is usually classified as Type 1 diabetes. Type 2 diabetes mellitus

(T2DM) is generally characterised by relative (rather than absolute) insulin deficiency, and it

accounts for 90-95% of all DM cases [1, 4]. Gestational diabetes occurs in women during

gestation. Other specific types of diabetes caused by genetic defects of β-cell function, diseases

of the pancreas, and other aetiologies are grouped together. Prediabetes, or borderline diabetes,

refers to blood glucose levels that are higher than normal but do not reach the DM diagnostic

threshold [1, 5].

1.1.1 Diagnosis of Type 2 Diabetes

Early symptoms of T2DM are polyuria, polydipsia, and polyphagia. Symptoms may also

include fatigue, headaches, trouble concentrating, blurred vision, and weight loss. The American

Diabetes Association guidelines recommend three tests to diagnose T2DM [1]:

Fasting Plasma Glucose ≥ 126 mg/dL [7.0 mmol/L],

2-hour Plasma Glucose ≥ 200 mg/dL [11.1 mmol/L] during Oral Glucose Tolerance Test,

Glycated Haemoglobin ≥ 6.5% [48 mmol/mol].
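
The paired units quoted for each threshold can be cross-checked with a short calculation. The sketch below is illustrative only: the conversion constants (the molar mass of glucose and the NGSP-to-IFCC master equation for HbA1c) are standard values assumed here and are not stated in this thesis, and the function names are hypothetical.

```python
# Illustrative unit conversions. The constants are standard values that are
# assumed here; they do not come from the thesis text.
GLUCOSE_MG_PER_MMOL = 18.016  # molar mass of glucose (180.16 g/mol) divided by 10


def glucose_mgdl_to_mmoll(mg_dl: float) -> float:
    """Convert a plasma glucose concentration from mg/dL to mmol/L."""
    return mg_dl / GLUCOSE_MG_PER_MMOL


def hba1c_percent_to_mmolmol(percent: float) -> float:
    """Convert HbA1c from NGSP/DCCT units (%) to IFCC units (mmol/mol)."""
    return 10.929 * (percent - 2.15)


if __name__ == "__main__":
    for mg_dl in (126, 200):
        print(f"{mg_dl} mg/dL = {glucose_mgdl_to_mmoll(mg_dl):.1f} mmol/L")
    print(f"6.5% = {hba1c_percent_to_mmolmol(6.5):.0f} mmol/mol")
```

Rounded to the precision used in the list, the converted values agree with the bracketed figures (7.0 mmol/L, 11.1 mmol/L, and 48 mmol/mol).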

For the fasting plasma glucose test, fasting is defined as no caloric intake for at least 8 hours. The oral

glucose tolerance test uses a glucose load containing the equivalent of 75 g of anhydrous glucose

dissolved in water. The glycated haemoglobin (HbA1c) test reflects average plasma glucose concentrations

over approximately 3 months. HbA1c is a measurement

of the percentage of haemoglobin A molecules that formed a stable ketoamine linkage between

the amino terminal valine residue of the beta chain and a glucose moiety [6]. The method that

is certified by the National Glycohemoglobin Standardization Program and standardised to the

Diabetes Control and Complications Trial (DCCT) assay should be used to perform the HbA1c

test. A random plasma glucose of 200 mg/dL [11.1 mmol/L] or more may also be used to

diagnose T2DM in patients with classic symptoms of hyperglycaemia [1].

For all tests, an immediate repeat test (the same or a different one) on a new blood sample is

recommended to confirm the diagnosis.
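
As a minimal sketch of how the three listed thresholds could be applied to recorded test results, the function below reports which tests reach the diagnostic cut-off. It is an illustration under stated assumptions, not the cohort identification method used in this thesis: the function and argument names are hypothetical, only the three listed tests are covered, and a positive result would still need the confirmatory repeat test described above.

```python
from typing import List, Optional

# Thresholds as listed in the text (ADA criteria, cited as [1]).
FPG_MGDL = 126.0      # fasting plasma glucose, mg/dL
OGTT_2H_MGDL = 200.0  # 2-hour plasma glucose during the OGTT, mg/dL
HBA1C_PERCENT = 6.5   # glycated haemoglobin, %


def diagnostic_criteria_met(
    fpg_mgdl: Optional[float] = None,
    ogtt_2h_mgdl: Optional[float] = None,
    hba1c_percent: Optional[float] = None,
) -> List[str]:
    """Return the names of the tests whose result reaches the threshold.

    An argument left as None is treated as "test not performed".
    """
    met = []
    if fpg_mgdl is not None and fpg_mgdl >= FPG_MGDL:
        met.append("FPG")
    if ogtt_2h_mgdl is not None and ogtt_2h_mgdl >= OGTT_2H_MGDL:
        met.append("OGTT")
    if hba1c_percent is not None and hba1c_percent >= HBA1C_PERCENT:
        met.append("HbA1c")
    return met


if __name__ == "__main__":
    # Hypothetical patient: fasting glucose above threshold, HbA1c just below.
    print(diagnostic_criteria_met(fpg_mgdl=131, hba1c_percent=6.4))
```

Returning the list of positive tests, rather than a single yes/no flag, keeps visible which measurement triggered the finding and therefore which test might be repeated for confirmation.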

1.2 EPIDEMIOLOGY OF DIABETES MELLITUS

The global prevalence of DM has increased fourfold during the last 30 years. According to the 2017

worldwide survey conducted by the International Diabetes Federation (IDF), more than 425

million individuals (equivalent to 1 in 11 adults) have diabetes, and 1 in every 2 adults with

diabetes is undiagnosed (~212 million) [7, 8]. In the US, the age-adjusted (20-79 years)

prevalence of diabetes in 2017 was 10.8%, while about 11.5 million individuals were estimated

to have undiagnosed diabetes [8]. More than three quarters of people with DM live in low- and

middle-income countries, and most of them are 20 to 64 years old. Over a million children

and adolescents are suffering from type 1 diabetes.

About 12% of global health expenditure is spent on DM management [8]. In the US, a quarter

of total health expenditures was estimated to be spent on DM management [9]. The American

Diabetes Association estimated the cost of diagnosed DM at USD 327 billion in 2017: 72% in direct

medical costs and 28% in reduced productivity [10]. Among Australians with DM aged 20–65

years, Magliano and colleagues estimated that DM reduced productivity-adjusted life years by 12.2%

and 11.0% for men and women, respectively [11]. Bommer and colleagues modelled the economic

burden of DM under various scenarios and reported an increase in costs as a share of global

GDP from 1.8% (1.7–1.9) in 2015 to 2.2% (2.1–2.2) in 2030 [12].

In 2004 the World Health Organisation (WHO) estimated the prevalence of diabetes in 2000 at

171 million and forecast 366 million cases by 2030 [13]. In practice, this forecast proved to be a

severe underestimate, as in 2017 there were already 425 million people with DM. The IDF

projects the number of people with diabetes to rise to 642 million by 2040. However, this

projection may again be an underestimate, as the IDF extrapolates prevalence for countries

with missing data from various less reliable sources [7,

14].
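
The gap between the WHO forecast and the later counts can be made concrete by comparing implied compound annual growth rates. The sketch below is a rough illustration only: it uses the figures quoted in this section, the function name is hypothetical, and it assumes the WHO and IDF counts are directly comparable even though they come from different sources and may cover different age ranges.

```python
def annual_growth_rate(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by two counts `years` apart."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Counts in millions, as quoted in the text.
    who_forecast = annual_growth_rate(171, 366, 2030 - 2000)  # WHO: 2000 to 2030
    observed = annual_growth_rate(171, 425, 2017 - 2000)      # WHO 2000 to IDF 2017
    idf_forecast = annual_growth_rate(425, 642, 2040 - 2017)  # IDF: 2017 to 2040

    print(f"WHO forecast 2000-2030: {who_forecast:.1%} per year")
    print(f"Observed     2000-2017: {observed:.1%} per year")
    print(f"IDF forecast 2017-2040: {idf_forecast:.1%} per year")
```

Under these assumptions, the growth implied between 2000 and 2017 is roughly twice the rate built into the WHO forecast, which is consistent with the statement that the forecast was an underestimate.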

Advances in epidemiological research on DM have led to a better understanding of the various risk

factors associated with the development of T2DM. The determinants of T2DM consist of many

contrasting and interacting genetic, epigenetic, and lifestyle factors [14]. The risk of developing

T2DM increases with age, body mass index (BMI), and a sedentary lifestyle. A high-calorie diet

leading to excess body fat, hypertension, and dyslipidaemia is also considered to

be a major contributor to the disease burden. People with a history of diabetes in first- and

second-degree relatives have an increased risk of developing T2DM.

Ethnic minorities in the US and Australia have a higher risk of developing T2DM compared to

non-minority individuals [1]. South Asians develop diabetes earlier and at lower BMI levels

compared to Western populations [15, 16]. In India, 72 million people were estimated to have

DM in 2017, and 123.5 million were predicted to have DM by 2040. A population-based survey

conducted in China in 2010 suggests that about 12% of the adult population had diabetes and

about 50% of the total population had prediabetes (defined as 2-hour oral glucose tolerance levels of

140-199 mg/dL [7.8–11.0 mmol/L], and impaired fasting glucose, defined as fasting glucose

levels of 100-125 mg/dL [5.6–6.9 mmol/L]) [17].

The estimated number of females (20-79 years) living with DM in 2017 was 204 million.

Gestational diabetes, defined as hyperglycaemia onset or first recognition during pregnancy,

significantly increases the risk of T2DM development in both the woman and the child.

According to IDF 2017 estimates, about 16% of live births were affected by some form of hyperglycaemia

in pregnancy, and 1 in every 7 births was affected by gestational diabetes. Compared to women

who did not have gestational diabetes, a 7-fold increased risk of developing T2DM was

observed in those who did have it [18-20]. In the children of women with gestational diabetes,

exposure to intrauterine hyperglycaemia was associated with an 8-fold higher risk of developing

diabetes/prediabetes at 19-27 years of age [21].

The American Diabetes Association estimated that medical expenditures for people with DM are

more than twice as high as they would be in the absence of DM [10]. The costs of DM present an

immense problem to patients, health systems, and the community in general [9]. In the US, people

with DM spend on average USD 16,750 per year, 57% of which is attributable to diabetes

[10].

1.3 COMPLICATIONS OF TYPE 2 DIABETES

Patients with T2DM are at increased risk of developing a number of comorbidities and life-

threatening complications [22]. The short- and long-term complications associated with T2DM

are many. Traditionally, macrovascular (cardiovascular) diseases and microvascular diseases

have been considered as the primary complications associated with T2DM. While a number of

clinical trials and epidemiological outcome studies have established the significantly increased

microvascular risk in patients with T2DM [23, 24], the evidence of the long-term

macrovascular benefits of tight glucose control in patients with T2DM is less clear [25-27].

1.3.1 Microvascular complications

Microvascular complications of long-term hyperglycaemia occur due to damage to small blood

vessels leading to neuropathy, retinopathy, and nephropathy. Diabetic neuropathy is

characterised by progressive loss of nerve fibres affecting the peripheral nerves and the

autonomic neurons. It was estimated that 60-70% of people with DM develop some form of

neuropathy [28]. Diabetes-associated longstanding peripheral neuropathy increases the chance

of foot ulcer (“diabetic foot”), infection and eventual need for limb amputation [29]. Diabetic

retinopathy affects the peripheral retina and/or macula, leading to partial or total vision loss. It

was estimated that 2.6% of global blindness can be attributed to diabetes [30]. Diabetic

nephropathy is characterised by angiopathy of the capillaries in the kidney glomeruli. It is the

most common cause of kidney failure in developed countries [28]. The first indication of

nephropathy is typically microalbuminuria, which further worsens to albuminuria (at a rate of

2-3% per year), and eventually leads to renal failure [29]. The risk of microvascular

complications increases with age, DM duration, and blood glucose level. Randomised

controlled trials (RCTs), including the Diabetes Control and Complications Trial (DCCT) and the

UK Prospective Diabetes Study (UKPDS), have indicated that tight glycaemic control (HbA1c

<7% [53 mmol/mol]) in patients with diabetes reduces the risk of microvascular complications

[31-34]. Lowering HbA1c to 6% [42 mmol/mol] is associated with further reductions in the

risk of microvascular complications, although to a much smaller extent [1].

1.3.2 Macrovascular complications

Atherosclerosis of large vessels due to long-term hyperglycaemia leads to ischaemic heart

disease, cerebrovascular disease (stroke), and peripheral vascular disease. Cardiovascular

disease (CVD) is a leading cause of death and disability among people with T2DM [22, 35].

While patients with T2DM have an increased cardiovascular (CV) risk profile, DM is also

considered an independent CVD risk factor [36].

In patients with T2DM, CVD develops about 14 years earlier and with greater severity, compared to

individuals without diabetes [37-39]. After controlling for traditional CV risk factors, patients

with T2DM have more than twice the risk of major CV events compared to the general

population [40]. In patients with T2DM, peripheral artery disease (occlusion of the lower-

extremity arteries) and heart failure (impaired cardiac pump function) are the most common

initial manifestations of CVD [41]. Risk of CVD increases with age and DM duration. Some

studies report that the presence of microvascular complications increases the risk of CVD as

well [28]. Several large trials (ACCORD, ADVANCE, VADT, DCCT, UKPDS) designed to

address CVD-related concerns have shown no beneficial effect of tight glucose control on

CV events [33, 42-44]. Nonetheless, follow-up data, meta-analyses, and prospective

observational studies suggest positive effects of tighter glycaemic targets on CVD risk,

especially in those with shorter DM duration and no history of severe hypoglycaemia [1, 22,

45, 46].

1.4 TREATMENT OF TYPE 2 DIABETES

Glycaemic targets:

According to International and American Diabetes Association guidelines [1, 47], adults with

T2DM are recommended to achieve HbA1c < 7% [53 mmol/mol]. Selected adults with shorter

duration of diabetes and no significant CVD may be recommended to maintain HbA1c < 6.5%

[48 mmol/mol]. Less stringent targets (e.g. < 8% [64 mmol/mol]) may be appropriate for

patients for whom the 7% target is difficult to achieve due to extensive comorbidities, history

of severe hypoglycaemia, and/or limited life expectancy.
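The paired HbA1c values quoted in this chapter (percent, with mmol/mol in brackets) follow from the standard linear conversion between NGSP and IFCC units. The sketch below assumes the commonly used master equation, IFCC = 10.93 × NGSP − 23.50; the equation and the function name are not given in the guidelines cited here and are stated as assumptions.

```python
def hba1c_percent_to_mmol_mol(percent: float) -> int:
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol).

    Assumes the NGSP/IFCC master equation: IFCC = 10.93 * NGSP - 23.50,
    rounded to the nearest whole mmol/mol.
    """
    return round(10.93 * percent - 23.50)


# Targets mentioned in the text, converted for comparison with the bracketed values.
targets_percent = [6.0, 6.5, 7.0, 8.0]
targets_mmol_mol = [hba1c_percent_to_mmol_mol(p) for p in targets_percent]
```

Under this assumption the targets of 6%, 6.5%, 7% and 8% map to the same mmol/mol values as those shown in brackets in the text.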

HbA1c testing is recommended at least twice a year for patients who meet treatment targets,

and quarterly for those who do not meet targets or whose therapy has changed.

Lifestyle modifications including dietary considerations and physical activity are initially

recommended to prevent or delay T2DM [1]. Metformin (MET), if not contraindicated, is

widely accepted as the first-choice anti-diabetic drug (ADD) since it does not cause weight

gain or hypoglycaemia and may improve macrovascular outcomes [48]. However, progressive

deterioration of diabetes generally leads to the need for further treatment intensification. In

patients with newly diagnosed T2DM, the UKPDS has shown that approximately half of

the people maintain acceptable glucose levels after 3 years of monotherapy; however, after 9

years the proportion declines to only one quarter of patients [49]. Guidelines recommend

intensifying anti-diabetic therapy when treatment targets are not met within 3-6 months of

monotherapy [1].

Successful treatment of T2DM is generally complicated by treatment-related adverse effects

(hypoglycaemia, weight gain) and the progressive nature of the disease. Many patients

eventually require therapy intensification with another drug; however, no consensus on the

choice of agent has been reached among physicians [50].

The six current common post-metformin second-line therapy intensification options are

sulfonylurea (SU), thiazolidinedione (TZD), sodium glucose co-transporter 2 inhibitor (SGLT-

2i), glucagon-like peptide-1 receptor agonist (GLP-1RA), dipeptidyl peptidase-4 inhibitor

(DPP-4i) or insulin (INS); other drugs are recommended under specific conditions. MET, SU,

and insulin represent the ‘old agents’; TZD has been used for the last decade, especially in

Asian countries. GLP-1RA, DPP-4i, and SGLT-2i represent the ‘novel agents’. All of the

agents, used alone or in combination, are associated with different adverse events including

hypoglycaemia (SU and insulin), weight gain (SU, insulin and TZD), gastrointestinal side

effects (MET, GLP-1RA) and increased risk of fractures (TZD) [51, 52].

1.5 INCRETIN-BASED THERAPIES

Glucagon-like peptide-1 (GLP-1) and glucose-dependent insulinotropic polypeptide (GIP) are

gut-derived hormones, also called incretins, which induce insulin secretion in a glucose-

dependent manner as a response to nutrient ingestion. Additionally, GLP-1 inhibits glucagon

secretion, slows gastric emptying, and increases satiety [53]. GLP-1 and GIP are degraded

within 2-3 minutes by the enzyme dipeptidyl peptidase-4 (DPP-4).

Incretin-based therapies are represented by two classes: oral DPP-4i and subcutaneous GLP-

1RA. The former increases effective levels of incretins by targeting and inactivating DPP-4,

while the latter increases insulin release through direct action on GLP-1 receptors [54]. These

therapies have been in focus during the last several years because of their unique

mechanisms of action [52, 55-59]. GLP-1RAs stimulate insulin secretion and inhibit glucagon

release in a strictly glucose-dependent manner. The pancreatic effects also include increased

beta-cell proliferation, and decreased beta-cell apoptosis [56, 60, 61].

Several GLP-1RAs have been approved for the treatment of patients with T2DM. Exenatide,

the first GLP-1RA representative, was approved in April 2005 by the US Food and Drug

Administration (FDA), and in October 2006 by the European Medicines Agency (EMA).

FDA and EMA approved agents are summarised in Table 1.1. GLP-1RAs differ in

structure, and may also be distinguished by duration of action: short-acting (once- or

twice-daily administration) and long-acting (once-weekly administration). The first DPP-4i was

sitagliptin, approved in October 2006 and March 2007 by the FDA and EMA, respectively [62].

DPP-4is are administered once daily, and all DPP-4is are available in combination with

metformin.

Table 1.1

Incretin-based medications approved in the US and EU

GLP-1RA class                       DPP-4i class
Exenatide (Byetta®, Bydureon®)      Sitagliptin (Januvia®)
Liraglutide (Victoza®)              Vildagliptin (Galvus®)
Lixisenatide (Lyxumia®)             Saxagliptin (Onglyza®)
Albiglutide (Eperzan®, Tanzeum®)    Linagliptin (Trajenta®)
Dulaglutide (Trulicity®)            Alogliptin (Vipidia®, Nesina®)
Semaglutide (Ozempic®)

Table source: Scheen, A.J., Cardiovascular outcome studies with incretin-based therapies: comparison between

DPP-4 inhibitors and GLP-1 receptor agonists. Diabetes Research and Clinical Practice, 2017. 127: p. 224-237

[63].

1.6 GLYCAEMIC EFFECTS OF INCRETIN-BASED THERAPIES

Incretin-based therapies have demonstrated their ability to significantly reduce glucose levels

while maintaining a low risk of hypoglycaemia in patients with T2DM.

GLP-1RAs reduce fasting plasma glucose by 1.4-3.4 mmol/L and HbA1c by 0.8-1.8% [64].

HbA1c reductions are 0.5-1.0% and 1.5-2.0% with short- and long-acting exenatide, respectively, and

around 0.8-1.5% with liraglutide [65, 66]. Short-acting agents have a greater effect on

postprandial glucose levels mainly through inhibition of gastric emptying, while long-acting

GLP-1RAs have a greater effect on fasting glucose levels mainly through their insulinotropic

and glucagonostatic actions [65, 67]. In a direct head-to-head study, patients treated with

liraglutide achieved greater reductions in HbA1c and fasting glucose, compared to those treated

with exenatide [67]. In a review of direct head-to-head trials on GLP-1RAs (n=9), Madsbad

(2016) reported greater reductions of HbA1c with liraglutide than with exenatide formulations

and albiglutide, and no differences in HbA1c reductions between liraglutide and dulaglutide

[68].

DPP-4i reduce fasting plasma glucose levels by 1.0-1.4 mmol/L and HbA1c by 0.5-1.1% [64].

HbA1c reductions with sitagliptin are 0.6-0.8%, with saxagliptin 0.4-0.8%, with linagliptin

0.5-0.7%, and with alogliptin 0.5-0.9% [65, 66]. There are no head-to-head trials comparing DPP-4i

agents [69]; however, several systematic reviews and meta-analyses reported similar efficacy

and safety of DPP-4i agents [70, 71]. Several head-to-head trials were conducted to compare

GLP-1RA representatives with DPP-4i agents, in which the GLP-1RA class demonstrated greater

glucose reductions than DPP-4i [66].

1.7 CARDIO-METABOLIC EFFECTS OF INCRETIN-BASED THERAPIES

GLP-1 receptors are widely expressed throughout the human body and their presence on

coronary artery endothelial cells may benefit ischemic conditioning [72, 73]. Data from animal

models, pre-clinical and exploratory studies suggest potential CV benefits by improved

endothelial and myocardial function [74-76], improved left ventricular ejection fraction and

wall indices [55, 77], decreased levels of inflammatory markers and atherosclerosis [62, 78],

and recovery of failing and ischemic hearts [76, 79-81].

DPP-4i were shown to be weight-neutral and GLP-1RAs were shown to significantly reduce

body weight by 2-4 kg over 6 months of therapy [60, 63]. As the T2DM population has an

increased burden of obesity, an independent risk factor for CVD, incretin-based therapies

represent a favourable therapeutic option compared to agents that cause weight gain. These

therapies were also reported to decrease blood pressure (BP) [62, 78], which might be

independent of weight reductions [82, 83]. The Liraglutide Effect and Action in Diabetes trials

1-5 demonstrated systolic blood pressure reductions of 3.6-6.7 mmHg [73]. Also, incretin-

based therapies demonstrated modest improvements in total cholesterol, LDL-cholesterol,

HDL-cholesterol, and triglyceride profiles [62, 66, 67, 84]. However, a small increase in heart

rate of 2-4 beats/minute has been associated with GLP-1RA treatment [84, 85].

Nonetheless, the beneficial effects on CV risk suggested by these promising studies in animals and humans

have not yet been translated into clinical evidence. Chapter 2 presents a summary of current evidence

on the association of incretin-based therapies with CV risk.

1.8 AIMS AND OBJECTIVES

Most of the patients with T2DM require intensified treatment with multiple ADDs, apart from

medications for CV risk factor control, as the disease progresses. While metformin is

recommended as the first-line ADD, the guidelines suggest at least six possible options for

therapy intensification, where evidence is primarily drawn from RCTs. While a great number

of studies report beneficial effects of one medication brand over placebo or another brand,

patient and practitioner decisions on intensification therapy have become more complicated

than ever. At the same time, patient choices, disease management, and cardio-metabolic

outcomes significantly differ in the real-world scenario compared to RCT practices.

The aim of this project was to explore the cardio-metabolic effects of treatment with

incretin-based drugs, compared to other treatment options, in the real-world setting.

Based on large patient-level electronic medical records (EMRs) from primary and ambulatory

care systems, the objectives of this real-world data based pharmaco-epidemiological project

combine both comparative effectiveness and outcome studies, while addressing a host of

methodological challenges in dealing with large EMRs.

Objective 1: To develop and validate data mining techniques to extract and analyse

longitudinal prescription data from an EMR database.

Objective 2: To develop algorithms and machine learning techniques to identify disease

cohorts from an EMR database.

Objective 3: To explore temporal trends in anti-diabetic drug prescribing and intensification

patterns, along with glycaemic levels and comorbidities by class of anti-diabetes drug.

Objective 4: To explore long-term dynamics of glucose control and its sustainability in the

following treatment groups:

Group 1: Metformin plus GLP-1RA

Group 2: Metformin plus DPP-4i

Group 3: Metformin plus Insulin

Group 4: Metformin plus Sulfonylurea

Objective 5: To explore long-term burden of blood pressure, low density lipoprotein, and

triglycerides in the above-mentioned groups, and the association of such control with risk of

major adverse cardiovascular events.

SGLT-2 inhibitors, first approved in 2013, have demonstrated glycaemic and extra-glycaemic

benefits. Additionally, some class representatives demonstrated renal protection and

association with reduced risk of CV events. However, due to data constraints and limited

follow-up time, this therapy group could not be included for robust comparative analyses

outlined in Objectives 4 and 5.

1.9 METHODOLOGICAL BACKGROUND

Patient-level data from electronic medical records (EMRs) collected from 1995 till 2016 across

all states of the US during routine primary and ambulatory care were used throughout this

project. This subsection is devoted to a general description of the role, scientific value, and

limitations of evidence based on real-world data. Chapter 3 describes the data in detail.

1.9.1 Scientific value of real-world evidence

According to the FDA, real-world data is defined as any data related to patient health status

and/or delivery of health care that is routinely collected from various sources [86]. These

sources include EMRs, claims and billing systems, product and disease registries, health-

monitoring devices, and health-related applications [87]. Evidence based on analysing such

data includes studies on therapeutics, disease-related outcomes, safety, cost-effectiveness,

epidemiology, patient-care, and delivery systems.

The nature of real-world evidence is very different from that of RCTs, with important advantages

as well as disadvantages relative to RCTs. RCTs are considered as the gold standard for testing

hypotheses not only because baseline randomisation supports conclusions of causality, but also

due to the ability of tight control over measurement and clinical conduct and ease of

communicating results [88]. Tests of safety and efficacy of an intervention in an RCT are

considered to be bias free, and provide a reliable source of internal validity. However, RCTs

are often conducted with specific populations and findings may be less generalisable to broader

populations, apart from being costly and requiring a long time to complete. Real-world based

studies provide opportunities to observe health outcomes in populations that are often excluded

from RCTs, such as pregnant, older, or co-morbid patients. These studies also allow exploration

of research questions that may be unethical for testing in RCTs, for instance the outcomes of a

delay/failure in treatment intensification [89, 90]. While RCTs are conducted in a specialised

environment and assess whether a treatment may work, real-world studies observe whether

interventions, as actually used, work in everyday clinical practice. For example,

Edelman and colleagues reported that HbA1c reductions observed in RCTs are much higher

than in the real-world scenario: 1.25% vs 0.52% for GLP-1RA and 0.68% vs 0.51% for DPP-

4i [91, 92]. The authors suggest poor medication adherence as a key driver of such a disconnect.

Several countries (UK, Sweden, Estonia) have implemented nationwide “birth-to-death”

EMRs for nearly every citizen, which brings a unique opportunity to observe population-level

behaviour, effectiveness of changes in health care policies, and health management costs in

addition to population-level safety, effectiveness, and health-related long-term outcome

research [93-96]. Large population level databases also provide an opportunity to bring

together benefits of EMRs and RCTs by randomising and recruiting patients from EMRs to

protocol-driven RCTs in a convenient, fast, and cost-effective manner [87].

The increasing role of real-world data in health care decisions has led the European Union to

establish a project to monitor adverse drug reactions (EU-ADR) using EMRs from the

Netherlands, Denmark, United Kingdom, and Italy [97]. Another example of combining

national registry data (US, Norway, Denmark, Sweden, Germany, UK) is the CVD-REAL study,

which was designed to assess whether positive outcomes observed in a completed RCT are also

applicable to a broader population in real-world practice [98].

1.9.2 Limitations and challenges of electronic medical records

Health-related data from EMRs reflect the complex multi-factorial relationships of everyday

clinical practice, which brings challenges in the design, analysis, and interpretation of findings

[88]. Confounding and data quality limit the ability to conclude direct causation, and in this

sense, the EMR-based studies should be interpreted with caution. EMRs are collected during

routine medical care and usually extensively capture demographics, medication prescriptions,

diagnoses and procedures, laboratory and anthropometric measures. However, EMR data are

prone to (1) loss to follow-up, (2) misdiagnosis, misclassification and miscoding, (3) missing

data on certain variables, (4) unreliable data on some relevant variables, and (5) biases and

confounding.

Follow-up in the EMRs may be lost when a patient moves to a different location or transfers

out of a practice. While nation-wide EMRs lose follow-up when a patient moves to another

country, commercial EMRs are not able to track a patient's records once he/she moves to a

practice that does not contribute data to a particular EMR network. Due to the nature of general

practitioner settings, some variables are recorded more often than others. For example, blood

pressure measurements are taken at almost every general practitioner encounter because of the

relative ease with which blood pressure can be measured. At the same time, information on diet and exercise,

disease activity progression, or medication dose escalations is entered into the EMRs less

often. Also, patients are prone to provide unreliable data on drug abuse, smoking habits, and

alcohol consumption. Laboratory and anthropometric measures may be conducted on different

equipment and may follow different procedures. Miscommunications, errors during the data-entry

process, and non-attendance of scheduled visits are part of routine medical care, which introduces

additional errors into data from EMRs.
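The uneven recording frequency described above can be quantified before any analysis by computing, for each variable, the share of encounters in which it is missing. The following is a minimal sketch on hypothetical data; the record layout, variable names, and values are invented for illustration and do not reflect the structure of the database used in this thesis.

```python
from typing import Dict, Iterable, List, Optional

# One encounter per record; None marks a value that was not recorded.
Record = Dict[str, Optional[float]]


def missing_fraction(records: List[Record], variables: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of records in which each variable is missing."""
    if not records:
        raise ValueError("no records supplied")
    total = len(records)
    return {
        var: sum(1 for rec in records if rec.get(var) is None) / total
        for var in variables
    }


# Hypothetical encounters: blood pressure recorded every time, others less often.
visits: List[Record] = [
    {"sbp": 132.0, "hba1c": 7.4, "weight": None},
    {"sbp": 128.0, "hba1c": None, "weight": None},
    {"sbp": 141.0, "hba1c": None, "weight": 88.0},
    {"sbp": 135.0, "hba1c": 7.1, "weight": None},
]

result = missing_fraction(visits, ["sbp", "hba1c", "weight"])
```

A per-variable summary of this kind is only a first step; it does not distinguish values that were never measured from values that were measured but not entered.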

Real-world studies are prone to various sources of bias: some reflect the nature of data

collection (e.g. specific insurance or clinic), some may be reduced with

careful study design (e.g. immortal time bias), and others may be reduced with advanced data

mining and statistical methodologies (e.g. information bias). In a recent publication, Verheij

and colleagues discuss possible sources of bias in EMR-based research and categorise them

as presented in Table 1.2 [99].

Table 1.2

Possible sources of bias in Electronic Medical Record data

Reimbursement system, pay for performance parameters

Role of general practitioner in the health care system; gatekeeping / nongatekeeping

Professional clinical guidelines

Ease of access by patients to their records

Data sharing between health care providers

Practice workload

Variations between EMR system functionalities and lay-out

Coding systems and thesauruses

Knowledge and education regarding the use of EMR systems

Data extraction tools

Data processing

Research dataset preparation

Research methodologies

Table source: Verheij, A.R., et al., Possible Sources of Bias in Primary Care Electronic Health Record Data Use

and Reuse. J Med Internet Res, 2018. 20(5): p. e185. [99]

1.10 THESIS STRUCTURE AND LOGICS

This project is designed as a series of methodological and pharmaco-epidemiological studies

that were conducted using a large database of EMRs. Chapter 2 provides a literature review on

the association of treatment with incretin-based therapies with CV risk.

Chapters 3-6 are devoted to data science. Chapter 3 introduces the database and basic data

management considerations. Chapter 4 describes the algorithm developed to extract and

aggregate medication information at individual patient-level. The information on ADD use was

obtained for all patients in the database (~34 million patients), and these data were incorporated

in the algorithm to identify a robust cohort of patients with diabetes (chapter 5). For this cohort

of patients, chapter 6 reports the patterns of missingness in the longitudinal laboratory and

anthropometric measures, and compares performance of several multiple imputation

techniques.

Chapters 3-6 describe the data groundwork that was essential in order to draw reliable clinical

inferences from voluminous and complex EMRs. These methods were part of the data

preparation for each pharmaco-epidemiological study described in chapters 7-9 and

appendices. Each of these clinical studies has its own design and methodology (data mining

and statistical), described separately within the respective chapters. Note that each chapter’s

database description is repetitive and presents a compressed version of chapter 3.

Chapter 7 explores longitudinal trends in the use of ADDs, glycaemic control, and patients’

characteristics with respect to the drug initiation order. Chapter 8 focuses on the glycaemic

control and its sustainability comparing second-line treatment options. Chapter 9 explores

cardio-metabolic risk factor burden at population level and cardio-metabolic risk factor control

by class of second-line ADD. It also explores the association of cardio-metabolic risk factor burden

with the risk of CV events. Finally, chapter 10 summarises the results, draws conclusions from the

work conducted, and discusses future directions.

Chapter 2: Literature Review

2.1 CLINICAL TRIALS

Prior to 2008, the approval of new ADDs was based on improvements in glycaemia with

detailed investigation of adverse events. The trials were usually 6 months long, where presence

of CVD was often an exclusion criterion [100]. In 2007, a meta-analysis of 43 studies reported

a significant increase in the risk of myocardial infarction in patients treated with rosiglitazone

(TZD class), and a non-significant increase in the risk of CV death [101]. This controversial

publication generated enormous public reaction, which in 2008 resulted in the FDA recommending

that long-term CV safety trials or other equivalent evidence be provided to support the CV safety of

new anti-diabetic agents. The guidance document suggested a meta-analysis of phase

2 and 3 trials to rule out CV risk as a default option, and the need for additional CV safety trial

only when the data are insufficient [100].

2.1.1 Cardiovascular outcome trials

In practice, a large dedicated CV safety trial has been conducted for every novel agent. To

date, three large CV outcome trials have been completed for DPP-4i agents (Table 2.1) and four for

GLP-1RA agents (Table 2.2).

Table 2.1

Completed cardiovascular outcome trials for DPP-4i in patients with type 2 diabetes

Trial                     SAVOR-TIMI53                      EXAMINE                           TECOS
Drug                      Saxagliptin                       Alogliptin                        Sitagliptin
Primary Endpoint          3-point MACE                      3-point MACE                      4-point MACE
N                         16,492                            5,380                             14,671
Follow-up, years          2.1                               1.5                               3
Inclusion:
  Minimum Age, years      40                                18                                50
  HbA1c, %                ≥6.5                              6.5-11.0                          6.5-8.0
  Cardiovascular Status   Pre-existing CVD or high CV risk  Acute Coronary Syndrome           Pre-existing CVD
                                                            15-90 days before
Mean at baseline:
  BMI, kg/m2              31.1                              28.7                              30.2
  Age, years              65                                61                                65.5
  HbA1c, %                8                                 8                                 7.2
Outcome, Hazard Ratio (95% CI):
  Primary composite       1.00 (0.89–1.12)                  0.96 (≤1.16)*                     0.98 (0.89–1.08)
  Myocardial infarction   0.95 (0.80–1.12)                  1.08 (0.88–1.33)                  0.95 (0.81–1.11)
  Stroke                  1.11 (0.88–1.39)                  0.95 (≤1.14)*                     0.97 (0.79–1.19)
  Heart Failure           1.27 (1.07–1.51)                  1.07 (0.79–1.46)                  1.00 (0.83–1.20)
  CV Mortality            1.03 (0.87–1.22)                  0.85 (0.66–1.10)                  1.03 (0.89–1.19)
  All-cause mortality     1.11 (0.96–1.27)                  0.88 (0.71–1.09)                  1.01 (0.90–1.14)

*one-sided repeated CI, at an alpha level of 0.01

MACE: major cardiovascular event

3-point MACE: CV death, Myocardial Infarction, or Stroke

4-point MACE: CV death, Myocardial Infarction, Unstable Angina, or Stroke

The trials were designed to assess CV safety of novel agents over placebo in patients with

established CVD or high CV risk. With median follow-up of 1.5-3.8 years and an average of about

10,000 patients, all completed RCTs demonstrated CV safety. In the SAVOR-TIMI-53 trial,

increased rates of hospitalisation for heart failure were observed in the saxagliptin arm

compared to placebo with HR (95% CI) of 1.27 (1.07, 1.51) [102]. Notably, neither this nor

other CV safety trials included heart failure as a primary or secondary end point [103]. Upon

showing non-inferiority, secondary analyses of the LEADER trial demonstrated superiority of

liraglutide compared to placebo with HR (95% CI) for 3-point MACE (CV death, myocardial

infarction, or stroke) of 0.87 (0.78, 0.97) [104]. Notably, significantly lower HbA1c (0.4%),

body weight (2.3 kg), and systolic blood pressure (1.2 mmHg) were achieved in the liraglutide

arm compared to placebo [63].
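The way the hazard ratios quoted above are read can be made explicit with a small sketch. It recovers the standard error of the log hazard ratio from a reported two-sided 95% CI, derives an approximate p-value under a normal approximation, and classifies the interval. The function names are illustrative, and the non-inferiority margin of 1.3 is an assumption based on the 2008 FDA guidance as commonly summarised; it is not stated in this chapter.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_standard_error(lower: float, upper: float) -> float:
    """Standard error of log(HR), recovered from a two-sided 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def two_sided_p_value(hr: float, lower: float, upper: float) -> float:
    """Approximate two-sided p-value for HR = 1 (normal approximation on the log scale)."""
    z = math.log(hr) / log_hr_standard_error(lower, upper)
    return math.erfc(abs(z) / math.sqrt(2))


def interpret(lower: float, upper: float, margin: float = 1.3) -> str:
    """Classify a 95% CI; `margin` is an assumed non-inferiority bound."""
    if upper < 1.0:
        return "superior"
    if lower > 1.0:
        return "increased risk"
    if upper < margin:
        return "non-inferior"
    return "inconclusive"


# Values quoted in the text and in Table 2.1.
savor_heart_failure = interpret(1.07, 1.51)   # saxagliptin, heart failure
leader_mace = interpret(0.78, 0.97)           # liraglutide, 3-point MACE
tecos_mace = interpret(0.89, 1.08)            # sitagliptin, primary composite
```

The one-sided repeated intervals reported for EXAMINE cannot be handled this way, since only an upper bound is available.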

Table 2.2

Completed cardiovascular outcome trials for GLP-1RA in patients with type 2 diabetes

Trial                     ELIXA                     LEADER                    SUSTAIN-6                 EXSCEL
Drug                      Lixisenatide              Liraglutide               Semaglutide               Exenatide
Primary Endpoint          4-point MACE              3-point MACE              3-point MACE              3-point MACE
N                         6,068                     9,340                     3,297                     14,752
Follow-up, years          2.1                       3.8                       1.9                       3.2
Inclusion:
  Minimum Age, years      30                        50                        50                        18
  HbA1c, %                ≥7                        ≥7                        ≥7                        6.5-10.0
  Cardiovascular Status   Acute Coronary Syndrome   Pre-existing CVD or       Pre-existing CVD or       Pre-existing CVD or
                          0-180 days before         high CV risk              high CV risk              high CV risk
Mean at baseline:
  BMI, kg/m2              30.2                      32.5                      31.1
  Age, years              60                        64.3                      64.6
  HbA1c, %                7.7                       8.7                       8.7                       8.0
Outcome, Hazard Ratio (95% CI):
  Primary composite       1.02 (0.89–1.17)          0.87 (0.78–0.97)          0.74 (0.58–0.95)          0.91 (0.83–1.00)
  Myocardial infarction   1.03 (0.87–1.22)          0.86 (0.73–1.00)          0.74 (0.51–1.08)          0.97 (0.85–1.10)
  Stroke                  1.12 (0.79–1.58)          0.86 (0.71–1.06)          0.61 (0.38–0.99)          0.85 (0.70–1.03)
  Heart Failure           0.96 (0.75–1.23)          0.87 (0.73–1.05)          1.11 (0.77–1.61)          0.94 (0.78–1.13)
  CV Mortality            0.98 (0.78–1.22)          0.78 (0.66–0.93)          0.98 (0.65–1.48)          0.88 (0.76–1.02)
  All-cause mortality     0.94 (0.78–1.13)          0.85 (0.74–0.97)          1.05 (0.74–1.50)          0.86 (0.77–0.97)

MACE: major cardiovascular event

3-point MACE: CV death, Myocardial Infarction, or Stroke

4-point MACE: CV death, Myocardial Infarction, Unstable Angina, or Stroke

These trials provide very valuable clinical evidence of CV safety with novel agents.

Importantly, none of them was designed to demonstrate CV superiority, and only patients at

increased CV risk were recruited in these RCTs; much longer trials would be required for a

low-risk population. Also, most of the patients with T2DM were on a background of cardio-

protective and lipid modifying drugs, and those in the placebo group were more likely to

receive other (older) agents for treatment intensification.

2.1.2 Non-cardiovascular outcome trials

Analyses of non-CV outcome trials with shorter duration that included patients with much

lower CV risk demonstrated CV safety or superiority of incretin-based drugs over comparators.

A meta-analysis of 70 trials on DPP-4i with at least 24 weeks of follow-up reported a HR (95%

CI) of 0.71 (0.59, 0.86) for major cardiovascular events (MACE) against placebo or other

comparators [105]. A meta-analysis of RCTs on patients who used GLP-1RA for a minimum

duration of 6 months reported a HR (95% CI) of 0.78 (0.54, 1.13) for MACE against placebo

or other comparators [106]. Another recent meta-analysis that included 281 RCTs on treatment

with incretin-based drugs for ≥ 12 weeks reported an odds ratio (95% CI) of 0.89 (0.80, 0.99) for

the risk of CV events favouring GLP-1RA use against placebo [107]. This meta-analysis also

reported an odds ratio (95% CI) of 0.92 (0.83, 1.01) for CV events for DPP-4i against placebo.
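Pooled estimates such as those above are typically obtained by combining study-level ratios on the log scale. The sketch below shows one common technique, fixed-effect inverse-variance pooling, and is not necessarily the model used in the cited meta-analyses, which may have applied random-effects or other methods. The function name and the input values are hypothetical.

```python
import math
from typing import List, Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

# Each study is given as (ratio, lower 95% bound, upper 95% bound).
Estimate = Tuple[float, float, float]


def pool_fixed_effect(estimates: List[Estimate]) -> Estimate:
    """Fixed-effect inverse-variance pooling of ratio estimates on the log scale."""
    weighted_sum = 0.0
    weight_total = 0.0
    for ratio, lower, upper in estimates:
        # Standard error of the log ratio, recovered from the reported CI.
        se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(ratio)
        weight_total += weight
    pooled_log = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )


# Hypothetical example: two identical studies, each with ratio 0.90 (0.81, 1.00).
pooled_ratio, pooled_lower, pooled_upper = pool_fixed_effect(
    [(0.90, 0.81, 1.00), (0.90, 0.81, 1.00)]
)
```

Pooling two identical studies leaves the point estimate unchanged and narrows the interval, which is the expected behaviour of inverse-variance weighting. Pooling trials with different endpoint definitions, such as 3-point and 4-point MACE, would require additional justification.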

In an overview of reviews, Gamble and colleagues assessed the quality of systematic reviews

evaluating the safety, efficacy and effectiveness of incretin-based therapies [108]. A total of 83

pooled treatment effect estimates from 10 systematic reviews on CV outcomes were analysed,

where none of the reviews received a high-quality Assessing the Methodological Quality of Systematic

Reviews (AMSTAR) score. The study reported that most of the pooled estimates suggested a potential

decreased risk (41 of 45 for DPP-4i and 28 of 38 for GLP-1RA), while only a few (18 of 41 for

DPP-4i and 3 of 28 for GLP-1RA) pooled treatment effect estimates were statistically

significant. The authors suggested possible overestimation in the results and possible

publication bias in the analysed reviews.

2.2 OBSERVATIONAL STUDIES

Table 2.3 presents a summary of observational studies that explored CV risk of treatment with

incretin-based therapies. Overall, conclusions are consistent with CV-outcome trial results –

treatment with incretin-based therapies does not increase and possibly reduces CV risk.

Multiple factors such as study design, available data, data management methodology and

statistical approaches make direct comparison of these studies very difficult. Patorno and

colleagues (2016) compared CV outcomes of treatment with GLP-1RAs with DPP-4i, SU, and

INS under the same study design [109]. GLP-1RA users were 1:1 matched to other treatment

groups (allowing patient overlap), and during 0.8 years of follow-up there were no significant

differences in the CV risk between the groups. Kannah and colleagues (2016) compared

MET+SU, MET+TZD, MET+DPP-4i, and MET+GLP-1RA combinations using a Cox

regression approach with propensity score as adjustment covariate [110]. While there was no

difference in the risk of overall mortality and coronary artery disease between all groups,

compared to MET+SU, patients treated with MET+DPP-4i had a higher risk of heart failure with

a HR (95% CI) of 1.10 (1.04, 1.17). Notably, the methodological approach used in this study

is generally not recommended in the statistical literature [111-113]. Zghebi and colleagues

(2016) used a Cox regression approach weighted by the inverse probability of treatment and

observed a non-significant reduction in the risk of major CVD or CV death for second-line

DPP-4i users compared to second-line SU users [114]. The same study observed a significant

CV risk reduction for second-line TZD users compared to SU users. The most recent

observational study, which compared second-line GLP-1RA and DPP-4i users with SU users

after metformin, reported a lower CV risk with incretin-based therapies; the reduction was

significant for the DPP-4i group but non-significant for the GLP-1RA group [115].
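To illustrate the weighting idea mentioned above, the following is a minimal sketch of inverse probability of treatment weights. It assumes that propensity scores have already been estimated elsewhere; it is not the implementation used in any of the cited studies, and the function name and the stabilisation option are illustrative assumptions.

```python
def iptw_weights(treated, propensity, stabilised=True):
    """Inverse probability of treatment weights.

    treated:    list of 0/1 treatment indicators, one per patient
    propensity: estimated P(treated = 1 | covariates), one per patient
    """
    p_treated = sum(treated) / len(treated)  # marginal treatment probability
    weights = []
    for t, ps in zip(treated, propensity):
        if not 0.0 < ps < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        # Treated patients are weighted by 1/ps, untreated by 1/(1 - ps).
        w = 1.0 / ps if t == 1 else 1.0 / (1.0 - ps)
        if stabilised:
            # Stabilised weights multiply by the marginal probability of the received treatment.
            w *= p_treated if t == 1 else (1.0 - p_treated)
        weights.append(w)
    return weights

# Hypothetical example: four patients with pre-computed propensity scores.
print(iptw_weights([1, 1, 0, 0], [0.8, 0.4, 0.5, 0.2]))
```

In a weighted Cox regression these weights would be supplied to the model as case weights, whereas the covariate-adjustment approach enters the propensity score as a regressor.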


Table 2.3

Summary of observational CV-outcome studies of treatment with incretin-based therapies

Source | Drug / Cohort size | Comparator / Cohort size | Follow-up (yr) | Conclusion

Comparative effectiveness of incretin-based therapies and the risk of death and cardiovascular events in 38,233 metformin monotherapy users. Gamble et al, 2016 [115]

GLP-1RA (added to MET) 487

Sulfonylurea (added to MET) 25,916

2.7

non-significant CV risk reduction

DPP-4i (added to MET) 6,213


significant CV risk reduction

Comparative risk of major cardiovascular events associated with second-line antidiabetic treatments: a retrospective cohort study using UK primary care data linked to hospitalization and mortality records, Zghebi et al, 2016 [114]



2.4 non-significant CV risk reduction

Comparative Cardiovascular Safety of Glucagon-Like Peptide-1 Receptor Agonists versus Other Antidiabetic Drugs in Routine Care: a Cohort Study. Patorno et al, 2016 [109]

GLP-1RA (added to MET) 18,658


0.8

no significant difference in CV risk


SU (added to MET) 114,480



Insulin (added to MET) 42,982


The association of the treatment with glucagon-like peptide-1 receptor agonist exenatide or insulin with cardiovascular outcomes in patients with type 2 diabetes: a retrospective observational study. Paul et al, 2015 [116]

GLP-1RA: Exenatide with other OAD 2,804 Exenatide with Insulin 7,870

Insulin (concomitant ADD allowed) 28,551

3.5 reduced CV risk

Risk of cardiovascular disease events in patients with type 2 diabetes prescribed the Glucagon-Like Peptide 1 (GLP-1) receptor agonist exenatide twice daily or other glucose-lowering therapies: A retrospective analysis of the lifelink database. Best et al, 2011 [117]

GLP-1RA: Exenatide (concomitant ADD allowed) 21,754

Other ADD (concomitant ADD allowed) 391,771

- reduced CV risk and all-cause hospitalization

Association of Anti-Diabetic Medications Targeting the Glucagon-Like Peptide-1 Pathway and Heart Failure Events in Patients with Diabetes. Velez, 2015 [118]

GLP-1RA or DPP-4i (concomitant ADD allowed) 1,426


2

reduced risk of hospitalization for HF, all-cause hospitalization or death

Risk of overall mortality and cardiovascular events in patients with type 2 diabetes on dual drug therapy including metformin: A large database study from the Cleveland Clinic. Kannah et al, 2016 [110]

GLP-1RA (added to MET) 433


4

no significant difference in HF events



increased risk of HF


Cardiovascular safety of combination therapies with incretin-based drugs and metformin compared with a combination of metformin and sulphonylurea in type 2 diabetes mellitus - a retrospective nationwide study, Mogensen et al, 2014 [119]



2.3 non-significant CV risk reduction



3 significant CV risk reduction

Association Between Hospitalization for Heart Failure and Dipeptidyl Peptidase-4 Inhibitors in Patients With Type 2 Diabetes: An Observational Study. Fu et al, 2016 [120]

DPP-4i (concomitant ADD allowed) 109,278

SU (concomitant ADD allowed) 109,278

0.5 no significant difference in CV risk

Sitagliptin use in patients with diabetes and heart failure: a population-based retrospective cohort study. Weir et al, 2014 [121]

DPP-4i: Sitagliptin + HF (concomitant ADD allowed) 887

Other ADD + HF (not on Sitagliptin) 6,733

1.4 increased risk of hospitalization for HF

All-cause mortality and cardiovascular effects associated with the DPP-IV inhibitor sitagliptin compared with metformin, a retrospective cohort study on the Danish population. Scheller et al, 2014 [122]

DPP-4i: Sitagliptin (monotherapy) 1,228

MET (monotherapy) 83,528

1.3 no significant difference in CV risk

Dipeptidyl peptidase-4 inhibitors do not increase the risk of cardiovascular events in type 2 diabetes: a cohort study. Kim et al, 2014 [123]



0.6 significant CV risk reduction

Sitagliptin and the risk of hospitalization for heart failure: a population-based study. Wang et al, 2014 [124]



1.5 increased risk of hospitalization for HF

Combination therapy with metformin plus sulphonylureas versus metformin plus DPP-4 inhibitors: association with major adverse cardiovascular events and all-cause mortality. Morgan et al, 2014 [125]



- significant CV risk reduction

2.3 CONCLUSIONS AND IMPLICATIONS

While the completed RCTs clearly demonstrated CV safety of incretin-based therapies in the

high-risk population, there is no clear evidence of CV benefits of these therapies. Several meta-

analyses of non-CV outcome trials and observational studies supported CV safety of treatment

with incretin-based therapies in broader populations. The risk of heart failure with DPP-4i is

not completely ruled out, and will remain under more careful monitoring in the future. There

is a trend towards CV superiority of treatment with incretin-based therapies, especially with

GLP-1RA. While an RCT designed to demonstrate CV superiority of GLP-1RA or DPP-4i over

placebo or other comparator is unlikely, a large multi-national observational study with long

follow-up could provide strong evidence of comparative CV superiority of one ADD class over

others. However, no such study has been conducted for either GLP-1RA or DPP-4i to date.

Given the multi-comorbid profile of patients with T2DM, it is now more urgent to explore whether

the introduction of novel drug classes to the market has helped to reduce glycaemic and CV risk

factor burden at the population level. This dissertation is designed to assess such trends and their

reflection on the rates of CV events in patients treated with major second-line ADDs.


Chapter 3: Data Description

3.1 CENTRICITY ELECTRONIC MEDICAL RECORDS

Data from Centricity Electronic Medical Records (CEMR) was used in this thesis.

Centricity™ is a brand of 27 healthcare IT software solutions from General Electric Healthcare,

which incorporates software for independent physician practices, academic medical centres,

hospitals and large integrated delivery networks. It refers to the systematised data collection,

storing and secure transmitting of patient health information in a digital format [126].

The Medical Quality Improvement Consortium is a rapidly growing community of over 400

CEMR customers who contribute de-identified clinical data to the CEMR database in order to

enable quality improvement, benchmarking, and population-based medical research. The

database covers over 35,000 health care providers from all US states, where ~70% are primary

care providers.

CEMR database contains patient-level information on demographics (sex, ethnicity, year of

birth) and longitudinal entries on anthropometrics, diseases, clinical observations, laboratory

results, and medications (Figure 3.1). Variables such as BMI, blood pressure, HbA1c, urine

albumin and creatinine, or lipid profiles along with dates and other relevant information are

stored in the form of a relational database.

The database extract that captured longitudinal EMRs from January 1995 until October 2014

was used at the initial stages of the project. The database extract that captured patient history

for more than 34 million individuals with a mean 3.5 years of follow-up from January 1995

until April 2016, was used to achieve the main results of this project, reported in chapters 7-9.


Figure 3.1. Schematic representation of the data in CEMR database.

Representativeness of CEMR Database

In general, the database is representative of the US population in terms of age and ethnic

subgroups; however, higher proportions of patients from north-eastern and mid-western states

are represented in the CEMR [127]. The patients’ demographic characteristics in the CEMR

database are generally similar to those of the overall US population, with a slight bias towards

older, black, female and non-Hispanic patients. The distribution of CV risk factors was found to be

similar to the prospective national health surveys [128]. The representativeness of patients with

diabetes is discussed in chapter 5. CEMR has demonstrated its usefulness for various

epidemiological and outcome studies. It has been extensively used for academic research

worldwide in the fields of diabetes [129-132], CV research [133-135], obesity [136-138],

inflammatory diseases [139-142], and other diseases [143-146].

3.2 MEDICATION DATA

Medication data are extensively recorded in the CEMR and include names, doses, the dates of

prescriptions, and the number of repeat prescriptions for the whole period of the electronic

record (or available follow-up time). CEMR also stores data from patients’ medication list,

which includes over-the-counter medications and those received from outside the EMR

network. This data contains start/stop dates and specific fields to track treatment alterations.


Dose escalation for individual medications (e.g. increasing doses of MET) is captured.

However, the data on dose titration, especially for insulin, is relatively poor in primary care

databases. Chapter 4 provides a detailed description of the available data and discusses data

related issues associated with longitudinal information on medication usage from large

relational databases such as CEMR. The “chaining” approach (described in chapter 4) was used

to extract and aggregate medication information at patient-level throughout this project.

3.2.1 Drug identification

As the database captures data from various systems since 1995, medication data were entered

in various ways including, but not limited to, Generic Product Identifier codes and the National

Drug Codes. CEMR stores the original medication name and less frequently the generic name

from the EMR source system terminology reference database, as well as normalised names for

clinical drugs (RxNorm terminology system), when possible. Therefore, several medication

name related fields exist and may include generic name, brand name, free text comments or

missing entries. The procedure described below was developed to identify medication

keys (unique identifiers) of anti-diabetic and other relevant drugs:

1. Identify highest level in the Anatomical Therapeutic Chemical Classification (ATC)

System [147] for a relevant drug category (Table 3.1).

Table 3.1

Therapeutic Class and highest corresponding ATC code

Therapeutic Class | Highest ATC code
Anti-obesity preparations, excluding diet products | A08
Anti-diabetic drug* | A10
Antihypertensive drug | C02
Diuretic drug | C03
Peripheral vasodilator | C04
Beta blocking agent | C07
Calcium channel blocker | C08
Angiotensin-converting-enzyme inhibitor | C09AA and C09B
Angiotensin | C09CA and C09D
Agents acting on the renin-angiotensin system | C09
HMG CoA reductase inhibitors (Statin) | C10AA and C10B
Lipid modifying agents | C10
Antidepressant drugs | N06A
Non-steroidal anti-inflammatory drug | M01A and B01AC06


2. For each therapeutic class obtain a list of generic names browsing lower ATC

categories.

3. Search generic names in the official FDA catalogue [148], and create lists of all

approved brand names. Link brands of combination products to each generic name.

4. Combine obtained generic and brand names in the list of keywords.

5. Text-mine CEMR to obtain sets of medication keys for each therapeutic class.

6. Manually review obtained lists and exclude inappropriate keys.

An illustrative example (steps 1-5) of identifying medication keys for Liraglutide is provided in

Figure 3.2.

Therapeutic class “Anti-diabetic drug” included eleven groups: MET, SU, TZD, Alpha

glucosidase inhibitor, amylin, Dopamine receptor agonist, Meglitinides, DPP-4i, GLP-1RA,

SGLT-2i, and INS. Saxenda (brand of Liraglutide, GLP-1RA) was excluded from the GLP-

1RA group as it was approved in 2014 as a weight-lowering medication only [149]. Although

Welchol (Colesevelam) was approved for the treatment of T2DM, it is usually prescribed to

reduce cholesterol levels; therefore Colesevelam was not considered as ADD in this project

[150].
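The keyword-based steps of the procedure (steps 4-5) and the brand exclusion described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the project: the record layout (a medication key with a list of name-related fields), the example rows and the function names are hypothetical, and the manual review in step 6 would still be required.

```python
import re

def build_keyword_pattern(keywords):
    """Compile one case-insensitive pattern that matches any keyword as a whole word."""
    escaped = sorted((re.escape(k) for k in keywords), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)

def find_medication_keys(records, keywords, excluded_brands=()):
    """Return the medication keys whose name fields mention any keyword.

    records: iterable of (medication_key, name_fields); name_fields is a list of
             strings or None (generic name, brand name, free-text comment).
    """
    include = build_keyword_pattern(keywords)
    exclude = build_keyword_pattern(excluded_brands) if excluded_brands else None
    keys = set()
    for med_key, name_fields in records:
        text = " ".join(field for field in name_fields if field)  # skip missing entries
        if include.search(text) and not (exclude and exclude.search(text)):
            keys.add(med_key)
    return keys

# Hypothetical medication rows, for illustration only.
records = [
    (101, ["VICTOZA 18 MG/3ML PEN", "liraglutide", None]),
    (102, ["Saxenda 3 mg pen", None, None]),
    (103, [None, None, "pt started liraglutide last month"]),
    (104, ["METFORMIN HCL 500 MG", "metformin", None]),
]

glp1ra_keys = find_medication_keys(
    records,
    keywords=["liraglutide", "Victoza", "Saxenda"],
    excluded_brands=["Saxenda"],
)
print(sorted(glp1ra_keys))
```

Matching on all available name fields reflects the fact that generic names, brand names and free-text comments may each be present or missing for a given record.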

Angiotensin-converting-enzyme inhibitors, agents acting on the renin-angiotensin system, beta

blocking agents, and statins were considered as cardio-protective drugs.


Figure 3.2. Schematic diagram of identifying list of medication keys for Liraglutide.


3.3 DISEASE DATA

CEMR database stores patients’ disease data by means of International Classification of

Diseases 9th Revision (ICD-9) codes, International Classification of Diseases 10th Revision

(ICD-10) codes, or less frequently with SNOMED Clinical Terms (SNOMED CT) codes.

Reliability of diagnosis coding in CEMRs for various diseases has been examined in prior

studies [94, 128, 136], therefore diagnostic codes were directly used to identify presence of a

disease. Nonetheless, additional advanced techniques were applied to improve the quality of

the cohort of patients with diabetes (chapter 5).

The history of disease before baseline of a particular study and disease events during follow-

up were constructed using the date of diagnosis of diseases. “Time to event” was calculated as

time from baseline till the first available diagnosis date for a particular disease. Disease events

included CV disease, chronic kidney disease (CKD) with its stage, cancer, depression and other

relevant diseases. Patients with diagnostic codes for bariatric surgery were also identified. CVD

was defined as ischaemic heart disease (including myocardial infarction), peripheral

vascular/arterial disease, heart failure or stroke.
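The separation of disease history from follow-up events, and the calculation of time to event, can be sketched as below. The function name is illustrative, and the treatment of a diagnosis recorded on the baseline date itself (counted here as history) is an assumption, since the text does not specify it.

```python
from datetime import date

def history_and_time_to_event(diagnosis_dates, baseline):
    """Return (had_history, time_to_event_days) for one patient and one disease.

    had_history:        True if any diagnosis was recorded on or before baseline
                        (assumption: same-day diagnoses count as history).
    time_to_event_days: days from baseline to the first diagnosis after baseline,
                        or None if no diagnosis was recorded during follow-up.
    """
    had_history = any(d <= baseline for d in diagnosis_dates)
    follow_up = [d for d in diagnosis_dates if d > baseline]
    time_to_event = (min(follow_up) - baseline).days if follow_up else None
    return had_history, time_to_event

# Hypothetical heart failure diagnosis dates for one patient.
baseline = date(2011, 1, 1)
dates = [date(2010, 1, 1), date(2012, 6, 1), date(2013, 3, 15)]
print(history_and_time_to_event(dates, baseline))
```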

3.3.1 Charlson Comorbidity Index

While controlling for comorbidities that may affect study outcome is essential, adjusting for

a large number of possible comorbidities may be problematic from clinical and methodological

points of view [151, 152]. Rather than adjusting for the effect of each comorbidity, several

methods have been proposed to control for overall comorbidity burden [151, 153, 154]. The

Charlson comorbidity index (CCI) is the most widely used comorbidity index in the medical

literature [155, 156].

CCI was developed to predict 1-year mortality in a cohort of 604 patients admitted to a New

York teaching hospital during 1 month in 1984. The validation of CCI was performed on a

cohort of 685 breast cancer patients admitted to a Connecticut teaching hospital from 1962 to

1969 [151, 152]. Weights (1, 2, 3, or 6) for CCI score computation were created by assessing

adjusted hazard ratios for each predefined comorbidity from a Cox proportional hazards

regression model [151, 157].

Since CCI was introduced, it has been extensively validated in cohorts with different diseases.

Also, numerous adaptations of this index were developed, including adaptations for

administrative databases [152, 158, 159]. In this project, the algorithm recommended by Quan

and colleagues [160] was used to identify ICD-9 and ICD-10 codes for diseased cohorts (except


diabetes, Table 3.2). Quan and colleagues expanded ICD-9 codes of Deyo CCI [161] and

identified corresponding ICD-10 codes for each comorbidity. Multiple physicians were

actively involved through all stages of the algorithm development and reached consensus on

the final lists.

To follow the algorithm proposed by Quan and colleagues, SNOMED CT codes were

translated to ICD-10 codes using the mapping released in July 2016 by US National Library of

Medicine (version 20160301) [162]. CCI score at baseline was calculated using original

weights, as presented in Table 3.2.

Table 3.2

Diseases, ICD codes, and Weights used to compute Charlson Comorbidity Index

Disease | ICD-9 | ICD-10 | Weight
Myocardial Infarction | 410.x, 412.x | I21.x, I22.x, I25.2 | 1
Congestive heart failure | 398.91, 402.01, 402.11, 402.91, 404.01, 404.03, 404.11, 404.13, 404.91, 404.93, 425.4–425.9, 428.x | I09.9, I11.0, I13.0, I13.2, I25.5, I42.0, I42.5–I42.9, I43.x, I50.x, P29.0 | 1
Peripheral vascular disease | 093.0, 437.3, 440.x, 441.x, 443.1–443.9, 447.1, 557.1, 557.9, V43.4 | I70.x, I71.x, I73.1, I73.8, I73.9, I77.1, I79.0, I79.2, K55.1, K55.8, K55.9, Z95.8, Z95.9 | 1
Cerebrovascular disease | 362.34, 430.x–438.x | G45.x, G46.x, H34.0, I60.x–I69.x | 1
Dementia | 290.x, 294.1, 331.2 | F00.x–F03.x, F05.1, G30.x, G31.1 | 1
Chronic pulmonary disease | 416.8, 416.9, 490.x–505.x, 506.4, 508.1, 508.8 | I27.8, I27.9, J40.x–J47.x, J60.x–J67.x, J68.4, J70.1, J70.3 | 1
Rheumatic disease | 446.5, 710.0–710.4, 714.0–714.2, 714.8, 725.x | M05.x, M06.x, M31.5, M32.x–M34.x, M35.1, M35.3, M36.0 | 1
Peptic ulcer disease | 531.x–534.x | K25.x–K28.x | 1
Mild liver disease | 070.22, 070.23, 070.32, 070.33, 070.44, 070.54, 070.6, 070.9, 570.x, 571.x, 573.3, 573.4, 573.8, 573.9, V42.7 | B18.x, K70.0–K70.3, K70.9, K71.3–K71.5, K71.7, K73.x, K74.x, K76.0, K76.2–K76.4, K76.8, K76.9, Z94.4 | 1
Moderate or severe liver disease | 456.0–456.2, 572.2–572.8 | I85.0, I85.9, I86.4, I98.2, K70.4, K71.1, K72.1, K72.9, K76.5, K76.6, K76.7 | 3
Diabetes without chronic complication | Identified as described in 0 | Identified as described in 0 | 1
Diabetes with chronic complication | 250.4–250.7 | E10.2–E10.5, E10.7, E11.2–E11.5, E11.7, E12.2–E12.5, E12.7, E13.2–E13.5, E13.7, E14.2–E14.5, E14.7 | 2
Hemiplegia or paraplegia | 334.1, 342.x, 343.x, 344.0–344.6, 344.9 | G04.1, G11.4, G80.1, G80.2, G81.x, G82.x, G83.0–G83.4, G83.9 | 2
Renal disease | 403.01, 403.11, 403.91, 404.02, 404.03, 404.12, 404.13, 404.92, 404.93, 582.x, 583.0–583.7, 585.x, 586.x, 588.0, V42.0, V45.1, V56.x | I12.0, I13.1, N03.2–N03.7, N05.2–N05.7, N18.x, N19.x, N25.0, Z49.0–Z49.2, Z94.0, Z99.2 | 2
Cancer | 140.x–172.x, 174.x–195.8, 200.x–208.x, 238.6 | C00.x–C26.x, C30.x–C34.x, C37.x–C41.x, C43.x, C45.x–C58.x, C60.x–C76.x, C81.x–C85.x, C88.x, C90.x–C97.x | 2
Metastatic solid tumour | 196.x–199.x | C77.x–C80.x | 6
AIDS/HIV | 042.x–044.x | B20.x–B22.x, B24.x | 6

Note: original table source [160], weights in [157].

As recommended by Quan and colleagues (2005), cancer was considered as any malignancy,

including lymphoma and leukaemia, and excluding malignant neoplasm of the skin [160]. In

cases where moderate or severe liver disease was present for a patient, mild liver disease did

not contribute to the CCI score. Similarly, if a record of diabetes with chronic complications

was present, diabetes without chronic complications did not contribute to the CCI score

computation. Finally, if a patient had a record of metastatic solid tumour, cancer did not affect

the CCI score.
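The score computation and the three precedence rules described above can be sketched as follows. The weights are those listed in Table 3.2; the condition labels and the function name are illustrative assumptions, and the mapping from ICD codes to conditions is assumed to have been done beforehand.

```python
# Weights as listed in Table 3.2; the dictionary keys are illustrative labels.
CCI_WEIGHTS = {
    "myocardial_infarction": 1,
    "congestive_heart_failure": 1,
    "peripheral_vascular_disease": 1,
    "cerebrovascular_disease": 1,
    "dementia": 1,
    "chronic_pulmonary_disease": 1,
    "rheumatic_disease": 1,
    "peptic_ulcer_disease": 1,
    "mild_liver_disease": 1,
    "moderate_severe_liver_disease": 3,
    "diabetes_without_complication": 1,
    "diabetes_with_complication": 2,
    "hemiplegia_paraplegia": 2,
    "renal_disease": 2,
    "cancer": 2,
    "metastatic_solid_tumour": 6,
    "aids_hiv": 6,
}

# (milder condition, more severe condition): the milder one does not
# contribute to the score when the more severe one is also recorded.
PRECEDENCE = [
    ("mild_liver_disease", "moderate_severe_liver_disease"),
    ("diabetes_without_complication", "diabetes_with_complication"),
    ("cancer", "metastatic_solid_tumour"),
]

def charlson_score(conditions):
    """Compute the CCI score from the set of conditions recorded at baseline."""
    present = set(conditions)
    unknown = present - set(CCI_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown conditions: {sorted(unknown)}")
    for milder, severer in PRECEDENCE:
        if severer in present:
            present.discard(milder)
    return sum(CCI_WEIGHTS[c] for c in present)

# Hypothetical patient: both liver disease records and a cancer record.
print(charlson_score({"mild_liver_disease", "moderate_severe_liver_disease", "cancer"}))
```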

3.4 LABORATORY, CLINICAL, AND ANTHROPOMETRIC DATA

Longitudinal observations on laboratory, clinical, and anthropometric data are extensively

recorded in the CEMR. These data are usually entered repeatedly throughout the whole period

of the electronic record (available follow-up) for an individual patient. The data used during

this project included: HbA1c, fasting/random blood glucose, low-density lipoprotein (LDL),

high-density lipoprotein, triglycerides, systolic blood pressure (SBP), diastolic blood pressure,

heart rate, urine microalbumin/creatinine ratio, serum creatinine, body mass index (BMI),

weight, and tobacco use status. Extensive data validation and cleaning techniques were applied

prior to data extraction and all measurements were converted to standard or most frequently

used units.


3.4.1 Arranging longitudinal measures

For individual patients, the longitudinal laboratory, clinical and anthropometric data were

arranged in six-monthly windows: ±3 months on either side of the baseline of a particular study, and

progressively further on (Figure 3.3). The risk factor measure closest to the middle of the

window (or the average of multiple measurements if available within that window) was preserved

as the observed measure for this window. For baseline HbA1c data, the closest measurement on or

within 3 months prior to baseline was used for the baseline measurement. The six-monthly

longitudinal follow-up data for HbA1c followed the same principle described above.
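The windowing principle can be sketched as below. This is a simplified illustration rather than the project code: a month is approximated by an average length in days, both the "closest to the middle" and the "average" options are offered because the text allows either, and the special baseline rule for HbA1c is not implemented. All names are illustrative.

```python
from datetime import date
from statistics import mean

DAYS_PER_MONTH = 30.4375  # assumption: average month length instead of calendar months

def window_index(measure_date, baseline, width_months=6):
    """0 for baseline ±3 months, 1 for the window centred on 6 months, and so on."""
    months = (measure_date - baseline).days / DAYS_PER_MONTH
    return int((months + width_months / 2) // width_months)

def arrange_in_windows(measurements, baseline, n_windows, width_months=6, use_mean=False):
    """measurements: list of (date, value). Returns one value (or None) per window."""
    buckets = {k: [] for k in range(n_windows)}
    for measure_date, value in measurements:
        k = window_index(measure_date, baseline, width_months)
        if k in buckets:
            centre_days = k * width_months * DAYS_PER_MONTH
            distance = abs((measure_date - baseline).days - centre_days)
            buckets[k].append((distance, value))
    arranged = []
    for k in range(n_windows):
        if not buckets[k]:
            arranged.append(None)  # left missing here; imputation is a separate step
        elif use_mean:
            arranged.append(mean(v for _, v in buckets[k]))
        else:
            arranged.append(min(buckets[k], key=lambda pair: pair[0])[1])
    return arranged

# Hypothetical HbA1c measurements (%) for one patient.
baseline = date(2012, 1, 1)
hba1c = [(date(2011, 12, 1), 8.0), (date(2012, 1, 15), 7.6), (date(2013, 1, 5), 7.0)]
print(arrange_in_windows(hba1c, baseline, n_windows=3))
```

Windows without any measurement are returned as missing, which corresponds to the situation handled by imputation.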

Figure 3.3. Schematic diagram of arranging longitudinal risk factor data.

Missing data (example: Figure 3.3, window “6M”) were imputed with the Multiple Imputation

Monte Carlo Markov Chain approach, after extensive assessments of the missingness patterns

and comparison of several imputation techniques, as described in chapter 6.

3.4.2 Tobacco use status

The longitudinal free text inputs are also available in the CEMR. Tobacco status included status

on any type of tobacco use: cigars, pipe, cigarettes, chewing tobacco or snuff. The majority of

records providing such information followed standard coding practice, and were in the form of

“current” / “former” / “never” smoker. Remaining records (>80,000) were classified to these 3

categories by creating classification rules upon manual review of entered free text. For

example: if description includes keywords “trying” and “quit”, classify as “current”.

Occasional smokers were classified as "current". In case of discordant same-day statuses,

priority was given to "current" status, then to "former" and lastly to "never" status. Records

indicating "never" status were disregarded if an earlier record indicated "current" or "former"

status. For each patient, the last status recorded on or prior to the particular analysis baseline

was considered as tobacco use status. Nonetheless, a large number of patients with T2DM

appeared not to have a record for the tobacco use status.
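The rules described above can be sketched as follows. Only the single free-text rule quoted in the text is taken from the source; the keyword used for occasional smokers, the record layout and the function names are illustrative assumptions, and the full rule set developed from manual review is not reproduced here.

```python
from datetime import date

PRIORITY = {"current": 0, "former": 1, "never": 2}  # lower value wins on the same day

def classify_free_text(text):
    """Map a free-text entry to a status; returns None when no rule applies."""
    t = text.lower()
    if "trying" in t and "quit" in t:
        return "current"  # rule quoted in the text
    if "occasional" in t:
        return "current"  # assumption: keyword used here to detect occasional smokers
    return None

def tobacco_status_at_baseline(records, baseline):
    """records: list of (date, status). Returns the status at baseline or None."""
    by_day = {}
    for record_date, status in records:
        if record_date > baseline:
            continue  # only records on or prior to baseline are used
        if record_date not in by_day or PRIORITY[status] < PRIORITY[by_day[record_date]]:
            by_day[record_date] = status
    status_at_baseline = None
    ever_used = False
    for record_date in sorted(by_day):
        status = by_day[record_date]
        if status == "never" and ever_used:
            continue  # "never" is disregarded after an earlier "current" or "former"
        status_at_baseline = status
        ever_used = ever_used or status in ("current", "former")
    return status_at_baseline

# Hypothetical records for one patient.
records = [
    (date(2010, 1, 1), "current"),
    (date(2011, 1, 1), "never"),
    (date(2013, 1, 1), "former"),
]
print(tobacco_status_at_baseline(records, date(2012, 1, 1)))
```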


3.5 ETHICS APPROVAL

This thesis involved the use of existing data, where the subjects could not be identified directly

or through identifiers linked to the subjects. Thus, according to the US Department of Health

and Human Services Exemption 4 (45 CFR 46.101(b)(4)), this study is exempt from ethics

approval from an institutional review board and informed consent.


Chapter 4: Medication Data Extraction

Statement of Contribution of Co-Authors for Thesis by Published Paper

The authors listed below have certified* that:

1. they meet the criteria for authorship in that they have participated in the conception,

execution, or interpretation, of at least that part of the publication in their field of

expertise;

2. they take public responsibility for their part of the publication, except for the

responsible author who accepts overall responsibility for the publication;

3. there are no other authors of the publication according to these criteria;

4. potential conflicts of interest have been disclosed to (a) granting bodies, (b) the editor

or publisher of journals or other publications, and (c) the head of the responsible

academic unit, and

5. they agree to the use of the publication in the student’s thesis and its publication on

the QUT’s ePrints site consistent with any limitations set by publisher requirements.

In the case of this chapter:

Olga Montvida, Ognjen Arandjelović, Edward Reiner, and Sanjoy K. Paul. Data Mining

Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic Medical

Records. Open Bioinformatics, 2017, 10:1-15. DOI: 10.2174/1875036201709010001.x.

Contributor Statement of Contribution*

Olga Montvida Conceived the idea, was responsible for the primary design

of the study and the methodological developments.

Conducted the data extraction and statistical analyses.

Developed first draft and contributed towards development

of the manuscript.

Ognjen Arandjelović Evaluated the methodological approach and contributed

towards development of the manuscript.

Edward Reiner Evaluated the methodological approach and contributed


Sanjoy K. Paul Conceived the idea, was responsible for the primary design

of the study and the methodological developments.

Contributed to the statistical analyses. Developed first draft

and contributed towards development of the manuscript.


29.06.2018QUT Verified

Signature

Principal Supervisor Confirmation

I have sighted email or other correspondence from all Co-authors confirming their certifying

authorship.

Sanjoy Ketan Paul 29.06.2018

Name Signature Date




The Open Bioinformatics Journal , 2017, 10, 1-15 1


DOI: 10.2174/1875036201709010001

RESEARCH ARTICLE

Data Mining Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic Medical Records

Olga Montvida1,2, Ognjen Arandjelović3, Edward Reiner4 and Sanjoy K. Paul5,*

1 Clinical Trials and Biostatistics Unit, QIMR Berghofer Medical Research Institute, Brisbane, Australia
2 School of Biomedical Sciences, Institute of Health and Biomedical Innovation, Faculty of Health, Queensland University of Technology, Brisbane, Australia
3 School of Computer Science, University of St. Andrews, St. Andrews, United Kingdom
4 Smart Analyst Inc., New York, United States of America
5 Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia

Received: March 27, 2017 Revised: May 06, 2017 Accepted: May 12, 2017

Abstract:

Background:

Electronic Medical Records (EMRs) from primary/ambulatory care systems present a new and promising source of information for conducting clinical and translational research.

Objectives:

To address the methodological and computational challenges in order to extract reliable medication information from raw data which is often complex, incomplete and erroneous. To assess whether the use of specific chaining fields of medication information may additionally improve the data quality.

Methods:

Guided by a range of challenges associated with missing and internally inconsistent data, we introduce two methods for the robust extraction of patient-level medication data. The first method relies on chaining fields to estimate duration of treatment (“chaining”), while the second disregards chaining fields and relies on the chronology of records (“continuous”). The Centricity EMR database was used to estimate treatment duration with both methods for two widely prescribed drugs among type 2 diabetes patients: insulin and glucagon-like peptide-1 receptor agonists.

Results:

At individual patient level the “chaining” approach could identify the treatment alterations longitudinally and produced more robust estimates of treatment duration for individual drugs, while the “continuous” method was unable to capture those dynamics. At population level, both methods produced similar estimates of average treatment duration; however, notable differences were observed at individual-patient level.

Conclusion:

The proposed algorithms explicitly identify and handle longitudinal erroneous or missing entries and estimate treatment duration with specific drug(s) of interest, which makes them a valuable tool for future EMR based clinical and pharmaco-epidemiological studies. To improve accuracy of real-world based studies, implementing chaining fields of medication information is recommended.

* Address correspondence to this author at the Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia; Tel: +613 93428433; Fax: +61 3 93428780; E-mails: [email protected]; [email protected]

1875-0362/17 2017 Bentham Open


Keywords: Electronic medical records, Treatment duration, Data mining, Type 2 diabetes, Rule-based algorithm, Patient-level data aggregation.

1. INTRODUCTION

The electronic medical records (EMRs) and the administrative data from the primary/ambulatory care systems are increasingly being used in epidemiological [1 - 3], pharmaco-epidemiological [4 - 6], pharmaco-vigilance [7 - 9], clinical outcome [5, 10 - 12], health economic [13, 14] and public health related studies [15 - 18]. Analyses of large primary care based EMRs from various countries, most notably from the UK, USA and Sweden, have provided significant insight into the effectiveness of changes in health care practices/policies on overall disease and health management costs [3, 15, 19, 20], in addition to population level evidence on the safety and effectiveness of various therapies and the association of disease-related risk factors with long-term outcomes [5, 6, 18, 21 - 23]. Increasing use of such large real-world patient-level data is illustrated well by the sixfold increase in EMR based published studies since 2000 [10, 24].

In structured EMRs, especially from the primary/ambulatory care systems, comprehensive patient level data are captured on different domains simultaneously and stored in the form of relational database [25, 26]. Representative examples include the UK Clinical Practice Research Database and Centricity™ EMR (CEMR) database of USA [27, 28]. The extraction, quality control and management of such voluminous longitudinal data under individual study protocols is highly methodologically and computationally involved, and challenging from data mining and analytical viewpoints [22, 29]. Data science generally considers that data preparation tasks consume about 80% of total project timeline leaving only 20% for ultimate analysis itself [30, 31]. Data completeness, systematic biases, reproducibility and quality are some of the notable limitations in such databases [18, 29, 32].

Most EMR databases capture large amounts of detailed information on medications provided to individuals over time, while the specific form in which this information is stored varies from database to database [26]. It is usually possible to obtain the drug class, specific brand name within the corresponding class, prescription dates, dosage, and number of refills [32]. However, a significant number of entries for an individual prescription may be missing or contain errors. The problem with information completeness can also arise when the medication nomenclature is not correctly matched [29].

Clinical and pharmaco-epidemiological studies, which rely on the data from EMRs, are often interested in the effectiveness of specific therapies, therapeutical dynamics, treatments with concomitant medications, and durations thereof in specific disease areas. Such real-world analysis provides an extremely valuable means for the understanding of drug utilization patterns, treatment initiation periods following the diagnosis of a disease, the effectiveness of specific therapies on disease-related risk factors, and possible associations of therapies with long-term outcomes [1, 6]. These studies warrant appropriate extraction of longitudinal information on prescriptions or medications at individual patient level; inappropriate extraction of the data may result in misleading inferences being reported [33 - 35]. Generally, pharmaco-epidemiological studies do not estimate treatment duration, but only account for the fact of one or more prescriptions for a particular drug(s) [36, 37]. Some studies calculated medication duration by subtracting the first prescription date from the last prescription date [38, 39], and only a few studies additionally considered a drug being discontinued if the subsequent prescription was not refilled within the expected time of drug coverage [40, 41]. While some studies have discussed the challenges in the analysis of medication data from EMRs [18, 42], to the best of our knowledge no existing study has analysed the quality, consistency, and completeness of EMR prescription information, nor proposed a practical algorithm able to extract salient medication information from large and complex longitudinal data sets [43].
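The two duration calculations from the earlier literature mentioned above (last minus first prescription date, and discontinuation when a refill does not arrive within the expected coverage) can be contrasted with a short sketch. This illustrates those simpler approaches only; it is not the "chaining" or "continuous" algorithm proposed in this study, and the record layout, the grace period parameter and the function names are hypothetical.

```python
from datetime import date, timedelta

def naive_duration_days(prescription_dates):
    """Duration as last prescription date minus first prescription date."""
    return (max(prescription_dates) - min(prescription_dates)).days

def gap_based_duration_days(prescriptions, grace_days=0):
    """Duration of the first continuous treatment episode.

    prescriptions: list of (issue_date, days_covered).
    The drug is considered discontinued when the next prescription is issued
    after the current coverage (plus an optional grace period) has run out.
    """
    ordered = sorted(prescriptions)
    start, days_covered = ordered[0]
    end = start + timedelta(days=days_covered)
    for issue_date, days_covered in ordered[1:]:
        if issue_date > end + timedelta(days=grace_days):
            break  # gap in coverage: treat as discontinuation
        end = max(end, issue_date + timedelta(days=days_covered))
    return (end - start).days

# Hypothetical prescriptions: two consecutive 30-day supplies, then a long gap.
prescriptions = [
    (date(2014, 1, 1), 30),
    (date(2014, 1, 29), 30),
    (date(2014, 6, 1), 30),
]
print(naive_duration_days([d for d, _ in prescriptions]))
print(gap_based_duration_days(prescriptions))
```

The naive calculation counts the months without coverage as treated time, whereas the gap-based calculation stops at the end of the first continuous episode.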

The aims of this explanatory and methodological study are (1) to discuss and analyse the most pressing challenges encountered by computer-based methods in the process of extracting and aggregating longitudinal medication data from EMRs, (2) to describe two algorithms to extract prescription information on individual therapies and to estimate the corresponding duration of treatment, and (3) to discuss how estimates of individual medication duration are affected by the choice of the study design. The effectiveness of the algorithms is compared on a cohort of patients with a clinical diagnosis of type 2 diabetes (T2DM) using a real-world EMR database collected across the USA.

2. MATERIALS AND METHODS

2.1. Centricity Electronic Medical Records

The Centricity Electronic Medical Records (CEMR) database contains the clinical and treatment records of more than 40 million patients, dating from 1995. CEMR represents 49 US states and a variety of ambulatory medical practices, including solo practitioners, community clinics, academic medical centres, and large integrated delivery networks. The database has been extensively used for academic research worldwide [3, 37, 44 - 47]. The CEMR database covers over 30,000 health care providers, of whom approximately 70% are primary care providers. For both insured and uninsured patients, this database contains comprehensive patient-level information on many aspects, including demographic information, laboratory results, history of diseases, clinical diagnosis of symptoms/diseases, vital signs, history of medications, and detailed information on ongoing medications. For this study we used longitudinal information from January 1995 to October 2014.

Estimation of Drug Therapy Duration from EMRs The Open Bioinformatics Journal , 2017, Volume 10 3

2.2. Medication Data in Centricity EMR database

The medications taken by an individual (medication domain) and the prescriptions for drugs provided to the individuals by the service provider registered within the EMR system (prescription domain) are extensively documented in the database by means of three tables: medication dimension (MD), medication fact (MF), and prescription fact (PF). The MF and PF belong to the medication and prescription domains respectively. The MF may include a broader list of all medications that a patient is taking, including over-the-counter medications, herbal remedies, and medications prescribed by a provider that may be outside the EMR network. MD is linked to both MF and PF. Each record in the MD contains information on an individual drug, which includes the National Drug Code (NDC) and Generic Product Identifier (GPI), as well as the four ordered attributes derived from the GPI, such as generic drug names. The MD also includes the medication doses corresponding to different brands’ products, identified by a unique medication key value assigned to each record.

The entries in the MF capture an individual patient’s medication prescription history and active prescriptions from all practitioners, including the service provider registered within the EMR system. The MF contains several special fields to track longitudinal patterns, such as the active medication flag, which indicates whether a patient was taking the drug at the moment of database extraction. The active medication list is identified by records with the value “Y” of the active flag. The chain identification (ID) values facilitate tracking of treatment alterations (including the addition of new medications) over time, together with the related chain sequence values, which track medication adjustments within the same chain ID. The initiation (‘start’) and cessation (‘stop’) dates associated with different treatments are also stored in the MF. However, we found that the corresponding values are missing with alarming frequencies: 67% of the cases for the former and 11% for the latter. Also, some of the start and stop date entries could be erroneous, such as a stop date preceding the start date. An excerpt from the MF for an individual patient is shown in Table 1.

Table 1. Snapshot of MF table – treatment intensification.

| GPI category 4 | Medication key (M) | Patient key (P) | Create date (C) | Start date (B) | Stop date (S) | Active flag (F) | Chain ID (H) | Chain seq (G) |
|---|---|---|---|---|---|---|---|---|
| METFORMIN HCL | 41467 | 288859 | 6-May-09 | 6-May-09 | | N | 307667619 | 0 |
| METFORMIN HCL | 41467 | 288859 | 11-Jun-10 | 11-Jun-10 | | N | 307667619 | 1 |
| METFORMIN HCL | 41467 | 288859 | 25-Apr-11 | 11-Jun-10 | 25-Apr-11 | N | 307667619 | 2 |
| LIRAGLUTIDE | 3347202 | 288859 | 25-Apr-11 | 25-Apr-11 | | N | 812855070 | 0 |
| LIRAGLUTIDE | 3347202 | 288859 | 10-May-11 | 10-May-11 | | N | 812855070 | 1 |
| LIRAGLUTIDE | 3347202 | 288859 | 10-May-11 | 10-May-11 | | N | 820957274 | 0 |
| LIRAGLUTIDE | 3347202 | 288859 | 14-Dec-11 | 10-May-11 | 14-Dec-11 | N | 820957274 | 1 |
| LIRAGLUTIDE | 3347202 | 288859 | 14-Dec-11 | 10-May-11 | 14-Dec-11 | N | 812855070 | 2 |
| INSULIN GLARGINE | 682327 | 288859 | 27-Feb-12 | | | N | 1092145628 | 0 |
| INSULIN ISOPHANE HUMAN | 682834 | 288859 | 27-Feb-12 | | | N | 1092145627 | 0 |
| INSULIN GLARGINE | 682327 | 288859 | 26-Sep-12 | 26-Sep-12 | | N | 1092145628 | 1 |
| INSULIN ISOPHANE HUMAN | 682834 | 288859 | 14-Nov-12 | | | N | 1092145627 | 1 |
| INSULIN GLARGINE | 682327 | 288859 | 14-Nov-12 | | | Y | 1092145628 | 2 |
| INSULIN LISPRO (HUMAN) | 682825 | 288859 | 26-Feb-14 | 26-Feb-14 | | Y | 1092145627 | 2 |

The entries in the PF capture the prescription date and the associated number of refills only for medications that have been prescribed by the responsible provider within the EMR network. The MF dataset contains a broader set of entry sources; moreover, its form of recording potentially comprises more details than the corresponding data in the PF. Nevertheless, it was determined that the PF may contain unique entries that are not stored in the MF. Therefore, the MF was considered the primary source of medication information and the PF a complementary one.



3. METHODS

In this section, we introduce a novel algorithmic approach for mining large-scale longitudinal EMRs with the ultimate goal of estimating the duration of treatment of a particular individual with a drug(s) of interest. The first method we introduce (“chaining”) relies on the chain ID and chain sequence values recorded in the MF. This feature of the approach makes it possible to account for treatments which include alternative drug use. To assess the importance and power of longitudinal chain information, we also describe a modification of the “chaining” method (“continuous”) which disregards chain ID and chain sequence values, and instead relies only on the chronology of a patient’s records for the particular drug(s). In the current literature, the latter approach is used more frequently.

3.1. Data Pre-processing: Auxiliary Fields

Although erroneous entries generally cannot be identified, various types of global consistency rules may be applied to reduce the error. The chronology of events may be corrected by incorporating two additional fields: the patient’s last available follow-up date and the patient’s date of birth (DOB).

The CEMR database stores the last available follow-up date for each patient. As an initial data pre-processing step, erroneous follow-up date entries were identified and replaced by the latest record creation date among all activities within the database for the corresponding patient.

As in many anonymized EMRs, the exact DOB was not available within CEMR. A simple procedure was applied to approximate the DOB:

1. Obtain multiple DOB estimates per patient by subtracting the reported ‘valid’ age from the record creation date for all activities within the database. CEMR groups patients older than 80 years under a single age key. The non-missing age data and the non-80+ age keys were considered ‘valid’ age entries.
2. Approximate the DOB as the minimum of all estimates from Step 1.
3. For patients without reported activities, estimate the DOB from the dataset containing demographic information by subtracting the reported ‘valid’ age from the database extraction date.
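The three DOB steps above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the tuple layout of an activity, the function names, and the assumption that ages are stored as whole years are all hypothetical.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

# Hypothetical activity layout: (record creation date, age in years or None, is 80+ age key)
Activity = Tuple[date, Optional[int], bool]


def subtract_years(d: date, years: int) -> date:
    """Shift a date back by whole years, mapping 29 February to 28 February if needed."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def approximate_dob(
    activities: Iterable[Activity],
    demographic_age: Optional[int] = None,
    extraction_date: Optional[date] = None,
) -> Optional[date]:
    # Step 1: one estimate per activity with a 'valid' age (non-missing, not the 80+ key).
    estimates = [
        subtract_years(created, age)
        for created, age, is_80_plus in activities
        if age is not None and not is_80_plus
    ]
    # Step 2: the approximate DOB is the minimum of all estimates.
    if estimates:
        return min(estimates)
    # Step 3: no usable activities, so fall back to the demographic record.
    if demographic_age is not None and extraction_date is not None:
        return subtract_years(extraction_date, demographic_age)
    return None


example = [
    (date(2010, 5, 1), 60, False),
    (date(2011, 3, 1), 61, False),
    (date(2012, 1, 1), None, False),  # missing age, ignored
]
dob = approximate_dob(example)
```

Taking the minimum is the earliest date of birth that is consistent with every reported age, which is why later records with a higher age can only move the estimate backwards.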

The parameters for the mathematical formulations are identified in Table 2 below.

Table 2. Mathematical Formulation

| Symbol | Description |
|---|---|
| Scalars | |
| $n$ | number of records in MF table |
| $k$ | number of records in PF table |
| $sd$ | standard prescription duration for individual drug |
| $mx$ | maximal number of prescription refills for individual drug |
| $u$ | number of unique patient keys in the cohort of interest |
| Sets | |
| $PS = \{ps_1, ps_2, \ldots, ps_u\}$ | set of unique patient keys in the cohort of interest |
| $V$ | set of missing values |
| $MS$ | set of medication keys of selected drug(s) |
| $FY = \{f_i \mid f_i = \text{"Y"},\ i = \overline{1, n}\}$ | set of active drugs |
| $MF = \{M, P, C, B, S, F, H, G\}$ dataset | |
| $M = (m_1, m_2, \ldots, m_n)^T$ | medication keys for drugs |
| $P = (p_1, p_2, \ldots, p_n)^T$ | patient keys |
| $C = (c_1, c_2, \ldots, c_n)^T$ | record creation dates |
| $B = (b_1, b_2, \ldots, b_n)^T$ | start dates of individual records |
| $S = (s_1, s_2, \ldots, s_n)^T$ | stop dates of individual records |
| $F = (f_1, f_2, \ldots, f_n)^T$ | active medication flag values |
| $H = (h_1, h_2, \ldots, h_n)^T$ | chain identification values |
| $G = (g_1, g_2, \ldots, g_n)^T$ | chain sequence values |
| $PF = \{M, P, C, B, R\}$ dataset | |
| $M = (m_1, m_2, \ldots, m_k)^T$ | medication keys for individual prescriptions |
| $P = (p_1, p_2, \ldots, p_k)^T$ | patient keys |
| $C = (c_1, c_2, \ldots, c_k)^T$ | record creation date |
| $B = (b_1, b_2, \ldots, b_k)^T$ | prescription dates |
| $R = (r_1, r_2, \ldots, r_k)^T$ | number of refills for individual prescription |

The scalars sd and mx may be defined on the basis of the standard prescription protocol for individual drugs. The default values of sd = 1 and mx = 24 were considered in our analyses.

MS may be identified by text-mining the MD dataset. For example, glucagon-like peptide-1 receptor agonists (GLP-1RA) may be identified by searching for “GLP-1 RECEPTOR AGONIST” in the second-order GPI attribute field.
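As a rough illustration of how the set MS could be assembled, the sketch below filters MD rows by a case-insensitive substring search. The dictionary layout and the field names `medication_key` and `gpi_attribute_2` are hypothetical, and the attribute values in the sample rows are illustrative only; the actual MD schema is not described at this level of detail in the text.

```python
from typing import Dict, Iterable, Optional, Set

MDRow = Dict[str, Optional[object]]


def select_medication_keys(
    md_rows: Iterable[MDRow],
    search_term: str,
    attribute: str = "gpi_attribute_2",  # hypothetical name for the second-order GPI attribute
) -> Set[int]:
    """Return the medication keys whose attribute field contains the search term."""
    needle = search_term.upper()
    return {
        int(row["medication_key"])
        for row in md_rows
        if needle in str(row.get(attribute) or "").upper()
    }


# Illustrative rows only; the keys are taken from the MF excerpts, the attribute text is assumed.
md_rows = [
    {"medication_key": 3347202, "gpi_attribute_2": "GLP-1 Receptor Agonist", "name": "LIRAGLUTIDE"},
    {"medication_key": 1523512, "gpi_attribute_2": "GLP-1 RECEPTOR AGONIST", "name": "EXENATIDE"},
    {"medication_key": 41467, "gpi_attribute_2": "BIGUANIDES", "name": "METFORMIN HCL"},
    {"medication_key": 682327, "gpi_attribute_2": None, "name": "INSULIN GLARGINE"},
]
glp1ra_keys = select_medication_keys(md_rows, "GLP-1 RECEPTOR AGONIST")
```

Normalising case and treating a missing attribute as an empty string keeps the search tolerant of the inconsistent data entry discussed earlier.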

3.2. “Chaining” Method

The algorithm for the first approach to extracting and aggregating data for the estimation of the duration of treatment is elaborated below.

1. Merge the following to the MF dataset by patient key:

1.1) date of birth $DOB = (db_1, db_2, \ldots, db_n)^T$.

1.2) last available follow-up date $L = (l_1, l_2, \ldots, l_n)^T$. The extended MF dataset would be of the form

$$MF_1 = \{M, P, C, B, S, F, H, G, DOB, L\}$$

2. Replace erroneous values of start dates ($b_i \notin V \wedge (b_i < db_i \vee b_i > s_i \vee b_i > l_i)$, $i = \overline{1, n}$) with missing values.

3. Sort by patient key ascending, chain ID ascending within the same patient, and chain sequence descending within the same chain ID:

$$MF_1:\quad a)\ p_i \le p_{i+1},\ i = \overline{1, n}$$

$$b)\ h_i \le h_{i+1},\ \forall i: p_i = p_{i+1} \quad \text{(post a) sorting)}$$

$$c)\ g_i \ge g_{i+1},\ \forall i: h_i = h_{i+1} \wedge p_i = p_{i+1} \quad \text{(post b) sorting)}$$

4. Set the initial value $p_0 = 0$, and approximate individual medication end dates $E = (e_1, e_2, \ldots, e_n)^T$ on the basis of the following rules:

4.1) if the stop date is not missing, then the end date equals the stop date.

4.2) else, if the active flag is “Y”, then the end date equals the last follow-up date.

4.3) else, if it is the first unique value of patient key or the first unique value of chain ID, and the start date is not missing, then the end date equals the start date plus the standard prescription duration.

4.4) else, if it is the first unique value of patient key or the first unique value of chain ID, and the start date is missing, then the end date equals the record creation date plus the standard prescription duration.

4.5) else, the end date equals the create date of the previous record.

$$
\begin{aligned}
e_i ={} & \mathbb{I}\{b_i \notin V\} \cdot \left(\mathbb{I}\{p_i \ne p_{i-1}\} + \mathbb{I}\{h_i \ne h_{i-1}\} - \mathbb{I}\{p_i \ne p_{i-1}\} \cdot \mathbb{I}\{h_i \ne h_{i-1}\}\right) \\
& \cdot \left(s_i \cdot \mathbb{I}\{s_i \notin V\} + l_i \cdot \mathbb{I}\{f_i \in FY\} \cdot \mathbb{I}\{s_i \in V\} + (b_i + sd) \cdot \mathbb{I}\{f_i \notin FY\} \cdot \mathbb{I}\{s_i \in V\}\right) \\
& + \mathbb{I}\{b_i \in V\} \cdot \left(\mathbb{I}\{p_i \ne p_{i-1}\} + \mathbb{I}\{h_i \ne h_{i-1}\} - \mathbb{I}\{p_i \ne p_{i-1}\} \cdot \mathbb{I}\{h_i \ne h_{i-1}\}\right) \\
& \cdot \left(s_i \cdot \mathbb{I}\{s_i \notin V\} + l_i \cdot \mathbb{I}\{f_i \in FY\} \cdot \mathbb{I}\{s_i \in V\} + (c_i + sd) \cdot \mathbb{I}\{f_i \notin FY\} \cdot \mathbb{I}\{s_i \in V\}\right) \\
& + \mathbb{I}\{p_i = p_{i-1}\} \cdot \mathbb{I}\{h_i = h_{i-1}\} \cdot \left(l_i \cdot \mathbb{I}\{f_i \in FY\} \cdot \mathbb{I}\{s_i \in V\} + s_i \cdot \mathbb{I}\{s_i \notin V\} + c_{i-1} \cdot \mathbb{I}\{f_i \notin FY\} \cdot \mathbb{I}\{s_i \in V\}\right),
\end{aligned}
$$

where $\mathbb{I}\{\cdot\}$ is an indicator function:

$$
\mathbb{I}\{a = b\} = \begin{cases} 1, & \text{if } a = b \\ 0, & \text{else} \end{cases}
\qquad
\mathbb{I}\{a \in b\} = \begin{cases} 1, & \text{if } a \in b \\ 0, & \text{else} \end{cases}
$$

5. Replace values of end dates that fall outside the follow-up interval with the last follow-up date:

$$e_i = e_i \cdot \mathbb{I}\{e_i \le l_i\} + l_i \cdot \mathbb{I}\{e_i > l_i\}$$

6. Delete records if the start date is missing and the create date is greater than the stop date. Reduce the dataset to the set of patients from the cohort of interest and to the set of keys of the selected drug(s):

$$MF_2 = \{MF_1 : p_i \in PS \wedge m_i \in MS \wedge \neg(b_i \in V \wedge c_i > e_i),\ i = \overline{1, n}\}$$

7. Merge the following to the PF set by patient key:

7.1) date of birth $DOB = (db_1, db_2, \ldots, db_k)^T$.

7.2) last available follow-up date within the database $L = (l_1, l_2, \ldots, l_k)^T$. The extended PF dataset would take the following form:

$$PF_1 = \{M, P, C, B, R, DOB, L\}$$

8. Replace erroneous prescription dates ($b_i \notin V \wedge (b_i < db_i \vee b_i > l_i)$, $i = \overline{1, k}$) with missing values.

9. If the number of refills is greater than the pre-defined maximal number of possible refills, or negative, or missing, replace it with zero:

$$r_i = r_i \cdot \mathbb{I}\{r_i < mx\} \cdot \mathbb{I}\{r_i \ge 0\} \cdot \mathbb{I}\{r_i \notin V\},\ i = \overline{1, k}$$

10. Calculate end dates $E = (e_1, e_2, \ldots, e_k)^T$ by the following rules:

10.1) if the prescription date is not missing, then the end date equals the prescription date plus the standard duration multiplied by the number of refills plus one.

10.2) if the prescription date is missing, then the end date equals the record creation date plus the standard duration multiplied by the number of refills plus one.

$$e_i = (b_i + (r_i + 1) \cdot sd) \cdot \mathbb{I}\{b_i \notin V\} + (c_i + (r_i + 1) \cdot sd) \cdot \mathbb{I}\{b_i \in V\}$$

11. Update end dates as described in Step 5:

$$e_i = e_i \cdot \mathbb{I}\{e_i \le l_i\} + l_i \cdot \mathbb{I}\{e_i > l_i\},\ i = \overline{1, k}$$

12. Reduce PF1 to the set of patients from the cohort of interest, to the set of patients not in MF2, and to the set of keys of the selected drug(s):

$$PF_2 = \{PF_1 : p_i \in PS \wedge m_i \in MS \wedge (p_i \notin P \subset MF_2),\ i = \overline{1, k}\}$$

13. Append both datasets by the following values: patient key, record creation date, start/prescription date, and end date. Assume that the new dataset MP contains n' records.


The datasets appended in step 13 are

$$MF_3 = \{P, C, B, E\} \subset MF_2, \qquad PF_3 = \{P, C, B, E\} \subset PF_2, \qquad MP = MF_3 \cup PF_3$$

14. Calculate the number (cn) of distinct record creation dates for each patient, and treat missing start dates by the following rules:

14.1) if cn is equal to one, then delete the record.

14.2) if cn is greater than one, replace the missing start date with the record creation date.

15. Sort by patient key ascending, and by start date ascending within the same patient key:

$$MP:\quad a)\ p_i \le p_{i+1},\ i = \overline{1, n'}$$

$$b)\ b_i \le b_{i+1},\ \forall i: p_i = p_{i+1} \quad \text{(post a) sorting)}$$

16. For each unique patient key $ps_j \in PS$, $j = \overline{1, u}$, reduce MP to the set $FN_j$ containing only the records with $p_i = ps_j$, $i = \overline{1, n'}$. Assume that the obtained dataset $FN_j$ has n'' rows. Set $e_0 = 0$ and calculate the selected medication duration for the patient, avoiding double counting of overlapping intervals:

$$FN_j:\quad d_j = \sum_{i=1}^{n''} \left( (e_i - b_i) \cdot \mathbb{I}\{b_i \ge e_{i-1}\} + (e_i - e_{i-1}) \cdot \mathbb{I}\{b_i < e_{i-1}\} \cdot \mathbb{I}\{e_i \ge e_{i-1}\} \right)$$

17. Use the medication durations $D = (d_1, d_2, \ldots, d_u)^T$ to conduct further research.

3.3. “Continuous” Method

1. Repeat steps 1 and 2 from the “chaining” method, then perform step 6, and treat missing values in MF2 as described in step 14. Assume that the obtained dataset MF2 has $\tilde{n}$ instances.

2. Create the stop date status variable $ST = (st_1, st_2, \ldots, st_{\tilde{n}})^T$ on the basis of the following rules:

2.1) if the active flag is “Y” and the stop date is missing, then the stop date status equals 2.

2.2) if the stop date is not missing, then the stop date status equals 1.

2.3) else 0.

3. Sort MF3 by patient key ascending, by start date descending within the same patient key, and by stop date status ascending within the same start dates of the same patient:

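In step 2, the stop date status is

$$st_i = 2 \cdot \mathbb{I}\{f_i \in FY\} \cdot \mathbb{I}\{s_i \in V\} + 1 \cdot \mathbb{I}\{s_i \notin V\} + 0 \cdot \mathbb{I}\{f_i \notin FY\} \cdot \mathbb{I}\{s_i \in V\}$$

and in step 3 the dataset

$$MF_3 = \{M, P, C, B, S, F, H, G, DOB, L, ST\}$$

is sorted such that

$$MF_3:\quad a)\ p_i \le p_{i+1},\ i = \overline{1, \tilde{n}}$$

$$b)\ b_i \ge b_{i+1},\ \forall i: p_i = p_{i+1} \quad \text{(post a) sorting)}$$

$$c)\ st_i \le st_{i+1},\ \forall i: b_i = b_{i+1} \wedge p_i = p_{i+1} \quad \text{(post b) sorting)}$$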


4. Set the initial value $p_0 = 0$ and approximate individual medication end dates $E = (e_1, e_2, \ldots, e_{\tilde{n}})^T$:

4.1) if the stop date is not missing, then the end date equals the stop date.

4.2) else, if the active flag is “Y”, then the end date equals the last follow-up date.

4.3) else, if it is the first unique patient key, then the end date equals the start date plus the standard duration.

4.4) else, the end date equals the start date of the previous record.

$$
\begin{aligned}
e_i ={} & \mathbb{I}\{p_i \ne p_{i-1}\} \cdot \left(l_i \cdot \mathbb{I}\{f_i \in FY\} \cdot \mathbb{I}\{s_i \in V\} + s_i \cdot \mathbb{I}\{s_i \notin V\} + (b_i + sd) \cdot \mathbb{I}\{f_i \notin FY\} \cdot \mathbb{I}\{s_i \in V\}\right) \\
& + \mathbb{I}\{p_i = p_{i-1}\} \cdot \left(l_i \cdot \mathbb{I}\{f_i \in FY\} \cdot \mathbb{I}\{s_i \in V\} + s_i \cdot \mathbb{I}\{s_i \notin V\} + b_{i-1} \cdot \mathbb{I}\{f_i \notin FY\} \cdot \mathbb{I}\{s_i \in V\}\right)
\end{aligned}
$$

5. Perform step 5 from the “chaining” method, and steps 7-11.

6. Reduce PF1 to the set of patients from the cohort of interest, to the set of patients not in MF3, and to the set of keys of the selected drug(s):

$$PF_2 = \{PF_1 : p_i \in PS \wedge m_i \in MS \wedge (p_i \notin P \subset MF_3),\ i = \overline{1, k}\}$$

7. Treat missing values in PF2 as described in step 14 of the “chaining” method.

8. Append both datasets by the following values: patient key, record creation date, start/prescription date, and end date. Assume that the new dataset MP contains $\tilde{n}'$ records:

$$MF_4 = \{P, C, B, E\} \subset MF_3, \qquad PF_3 = \{P, C, B, E\} \subset PF_2, \qquad MP = MF_4 \cup PF_3$$

9. Perform steps 15-17 from the “chaining” method.

4. REMARKS

Identified erroneous entries are declared as missing in Steps 2, 8, and 9 of the “chaining” method. In Step 14, the algorithm counts the number of unique creation dates for the selected drug(s) at patient level. If the obtained number is greater than one, then missing start dates are replaced with record creation dates. In this way, a patient is considered to be taking a particular drug if the medication records were entered in a systematic manner; otherwise the records with missing start dates are disregarded.

As an example, the prescription scenario for anti-diabetes drugs for a patient with type 2 diabetes is presented in Table 1. The treatment was initiated with metformin (METFORMIN HCL) on the 6th of May 2009 and continued until the 25th of April 2011, when a switch to GLP-1RA (LIRAGLUTIDE) was made. With a stop date for GLP-1RA recorded on the 14th of December 2011, the data show a gap in the treatment until the 26th of September 2012, when insulin therapy began. However, a patient with diabetes using GLP-1RA is unlikely to have had a nine-month-long gap in treatment. Indeed, careful data examination leads to the conclusion that insulin treatment started on the 27th of February 2012, as would be estimated by the algorithm.

As mentioned earlier, the MF was considered the primary data source; thus, if at least one record for the selected drug(s) at patient level is present in the MF, then both methods disregard entries in the PF. However, if there is no available data in the MF table, the methods append data from the PF.

Assessment of the first marketing date for a particular drug is an example of an additional global consistency audit that is omitted in the methods’ description. For instance, any start date of GLP-1RA drugs must not be prior to April 2005, the date when the first representative (Exenatide) was approved.
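The end-date rules of step 4 of the “chaining” method, followed by the truncation of step 5, can be sketched for a single patient as below. This is an illustrative reading of the rules, not the authors' code: the `MFRecord` class and function name are hypothetical, a single last follow-up date is passed in because only one patient is handled, and the standard duration sd = 1 is assumed to mean one month of 30 days. The sample records are those of the single chain shown in Table 4.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class MFRecord:  # hypothetical container for one MF row
    patient: int
    chain_id: int
    chain_seq: int
    create: date
    start: Optional[date]
    stop: Optional[date]
    active: bool


def chaining_end_dates(
    records: Iterable[MFRecord], last_followup: date, sd_days: int = 30
) -> Dict[Tuple[int, int], date]:
    """Map (chain ID, chain sequence) to the estimated end date."""
    # Step 3: patient ascending, chain ID ascending, chain sequence descending.
    ordered = sorted(records, key=lambda r: (r.patient, r.chain_id, -r.chain_seq))
    ends: Dict[Tuple[int, int], date] = {}
    prev: Optional[MFRecord] = None
    for r in ordered:
        first = prev is None or (prev.patient, prev.chain_id) != (r.patient, r.chain_id)
        if r.stop is not None:                     # rule 4.1
            end = r.stop
        elif r.active:                             # rule 4.2
            end = last_followup
        elif first:                                # rules 4.3 and 4.4
            end = (r.start or r.create) + timedelta(days=sd_days)
        else:                                      # rule 4.5
            end = prev.create
        ends[(r.chain_id, r.chain_seq)] = min(end, last_followup)  # step 5
        prev = r
    return ends


CHAIN = 1002923273
table4 = [
    MFRecord(64832053, CHAIN, 0, date(2010, 4, 23), date(2010, 4, 23), None, False),
    MFRecord(64832053, CHAIN, 1, date(2010, 12, 29), date(2010, 12, 29), None, False),
    MFRecord(64832053, CHAIN, 2, date(2011, 1, 6), date(2011, 1, 6), None, False),
    MFRecord(64832053, CHAIN, 3, date(2011, 3, 28), date(2011, 3, 28), None, False),
    MFRecord(64832053, CHAIN, 4, date(2012, 12, 18), date(2012, 12, 18), None, False),
    MFRecord(64832053, CHAIN, 5, date(2013, 3, 13), date(2013, 3, 13), None, True),
]
ends = chaining_end_dates(table4, last_followup=date(2013, 3, 13))
```

Because the records are visited in descending chain sequence, the "previous record" in rule 4.5 is the next adjustment in the chain, so each record ends when its successor was created.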

5. RESULTS

To evaluate the performance of the described methods, we chose to focus on the estimation of the duration of treatment with two widely used anti-diabetic drugs, namely GLP-1RA and insulin. In the CEMR database, 1,861,560 patients were identified as having been diagnosed with type 2 diabetes mellitus, as inferred from the assigned ICD-9 codes.

5.1. Case Study 1

As the first case study, we consider a randomly selected patient from the CEMR database, whose relevant treatment details are shown in Table 3. The treatments with EXENATIDE and INSULIN GLARGINE started on the 18th of June 2007. The treatment with EXENATIDE was terminated on the 7th of January 2008, while INSULIN therapy continued until the last recorded follow-up date on the 24th of January 2008 (notice that the treatment is flagged as active, “Y”). In this case, the “chaining” and “continuous” methods produce the same estimates for the durations of the two treatments. Specifically, the estimates corresponding to insulin and GLP-1RA are 7.2 and 6.7 months, respectively.

Table 3. Snapshot of MF table – combining therapies. Patient’s last follow-up date was identified as 24 January 2008.

| GPI category 4 | Medication key (M) | Patient key (P) | Create date (C) | Start date (B) | Stop date (S) | Active flag (F) | Chain ID (H) | Chain seq (G) | End date (“chaining”) | End date (“continuous”) |
|---|---|---|---|---|---|---|---|---|---|---|
| INSULIN GLARGINE | 682327 | 15219411 | 18-Jun-07 | 18-Jun-07 | | N | 136664321 | 0 | 20-Jun-07 | 20-Jun-07 |
| EXENATIDE | 12670645 | 15219411 | 18-Jun-07 | 18-Jun-07 | | N | 136664552 | 0 | 17-Oct-07 | 15-Oct-07 |
| INSULIN GLARGINE | 1096062 | 15219411 | 20-Jun-07 | 20-Jun-07 | | N | 136664321 | 1 | 7-Jan-08 | 7-Jan-08 |
| EXENATIDE | 12670548 | 15219411 | 17-Oct-07 | 15-Oct-07 | | N | 136664552 | 1 | 7-Jan-08 | 7-Jan-08 |
| INSULIN GLARGINE | 1096062 | 15219411 | 7-Jan-08 | 7-Jan-08 | | Y | 136664321 | 2 | 24-Jan-08 | 24-Jan-08 |
| EXENATIDE | 12670548 | 15219411 | 7-Jan-08 | 7-Jan-08 | 7-Jan-08 | N | 136664552 | 2 | 7-Jan-08 | 7-Jan-08 |
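The reported durations can be checked with a short sketch of the overlap-aware summation of step 16, applied to the start dates and “chaining” end dates of Table 3. Two points are assumptions rather than statements from the text: the function tracks the latest end date seen so far (which coincides with the previous record's end date when end dates are non-decreasing), and days are converted to months using 30.4375 days per month, since the conversion used in the study is not stated.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

Interval = Tuple[date, date]  # (start date, estimated end date)


def treatment_duration_days(intervals: Iterable[Interval]) -> int:
    """Total days covered by the intervals, counting overlapping periods once."""
    total = 0
    latest_end: Optional[date] = None
    for start, end in sorted(intervals):          # step 15: start date ascending
        if end < start:
            continue                              # inconsistent record, skipped in this sketch
        if latest_end is None or start >= latest_end:
            total += (end - start).days           # no overlap with earlier coverage
            latest_end = end
        elif end > latest_end:
            total += (end - latest_end).days      # count only the part not yet covered
            latest_end = end
    return total


DAYS_PER_MONTH = 30.4375  # assumed conversion factor

insulin = [
    (date(2007, 6, 18), date(2007, 6, 20)),
    (date(2007, 6, 20), date(2008, 1, 7)),
    (date(2008, 1, 7), date(2008, 1, 24)),
]
exenatide = [
    (date(2007, 6, 18), date(2007, 10, 17)),
    (date(2007, 10, 15), date(2008, 1, 7)),
    (date(2008, 1, 7), date(2008, 1, 7)),
]
insulin_days = treatment_duration_days(insulin)
exenatide_days = treatment_duration_days(exenatide)
insulin_months = round(insulin_days / DAYS_PER_MONTH, 1)
exenatide_months = round(exenatide_days / DAYS_PER_MONTH, 1)
```

Under these assumptions the sketch gives 220 days (7.2 months) for insulin and 203 days (6.7 months) for EXENATIDE. The two-day overlap between the first and second EXENATIDE records (15 to 17 October 2007) is counted only once.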

5.2. Case Study 2

As an insightful case study, we consider a patient whose relevant treatment details are shown in Table 4. Since all of the records shown have the same chain ID, it can be concluded that in the period from the 23rd of April 2010 until the 13th of March 2013 the patient was alternating between two therapies, namely GLP-1RA (EXENATIDE) and insulin (INSULIN GLARGINE). This example illustrates the importance of chain ID information, as readily corroborated by comparing the predicted therapy end dates using the “chaining” and “continuous” methods (per-record estimates are shown in the two rightmost columns of Table 4). Because the latter disregards chain ID information, it implicitly assumes that EXENATIDE was taken continuously from the 23rd of April 2010 until the 27th of April 2011, with the last prescription date being the 28th of March 2011. However, treatment with EXENATIDE was terminated on the 29th of December 2010, when a switch to insulin was made. Treatment with insulin continued until the 28th of March 2011, when a switch back to EXENATIDE occurred. This complex and frequent pattern of therapy alteration leads to vastly different treatment duration estimates when chain ID information is used (“chaining”) and when it is not (“continuous”). For example, in this particular case, the per-record end dates in Table 4 give a total duration of insulin/EXENATIDE treatment of 26.5/12.1 months for the “continuous” approach, compared to 5.7/28.9 months for the “chaining” method.

Table 4. Snapshot of MF table – switching between therapies. Patient’s last follow-up date was identified as 13 March 2013.

| GPI category 4 | Medication key (M) | Patient key (P) | Create date (C) | Start date (B) | Stop date (S) | Active flag (F) | Chain ID (H) | Chain seq (G) | End date (“chaining”) | End date (“continuous”) |
|---|---|---|---|---|---|---|---|---|---|---|
| EXENATIDE | 1523512 | 64832053 | 23-Apr-10 | 23-Apr-10 | | N | 1002923273 | 0 | 29-Dec-10 | 28-Mar-11 |
| INSULIN GLARGINE | 682327 | 64832053 | 29-Dec-10 | 29-Dec-10 | | N | 1002923273 | 1 | 06-Jan-11 | 06-Jan-11 |
| INSULIN GLARGINE | 682327 | 64832053 | 06-Jan-11 | 06-Jan-11 | | N | 1002923273 | 2 | 28-Mar-11 | 18-Dec-12 |
| EXENATIDE | 1523512 | 64832053 | 28-Mar-11 | 28-Mar-11 | | N | 1002923273 | 3 | 18-Dec-12 | 27-Apr-11 |
| INSULIN GLARGINE | 682327 | 64832053 | 18-Dec-12 | 18-Dec-12 | | N | 1002923273 | 4 | 13-Mar-13 | 13-Mar-13 |
| INSULIN GLARGINE | 682327 | 64832053 | 13-Mar-13 | 13-Mar-13 | | Y | 1002923273 | 5 | 13-Mar-13 | 13-Mar-13 |
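The “continuous” column of Table 4 can be reproduced with the sketch below, which applies the stop date status ordering and the end-date rules of the “continuous” method to the records of one drug for one patient. The record class and function name are hypothetical, start dates are assumed to be already filled in, and sd = 1 is assumed to mean 30 days, which is consistent with the 27-Apr-11 entry in the table.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class DrugRecord:  # hypothetical container: one record of the selected drug
    start: date
    stop: Optional[date]
    active: bool


def stop_status(r: DrugRecord) -> int:
    """2 if active with missing stop date, 1 if stop date present, else 0."""
    if r.stop is not None:
        return 1
    return 2 if r.active else 0


def continuous_end_dates(
    records: Iterable[DrugRecord], last_followup: date, sd_days: int = 30
) -> List[Tuple[date, date]]:
    """Return (start, end) pairs in chronological order for one patient and drug."""
    # Start date descending, stop date status ascending within equal start dates.
    ordered = sorted(records, key=lambda r: (-r.start.toordinal(), stop_status(r)))
    result: List[Tuple[date, date]] = []
    prev: Optional[DrugRecord] = None
    for r in ordered:
        if r.stop is not None:
            end = r.stop
        elif r.active:
            end = last_followup
        elif prev is None:                        # most recent record of the patient
            end = r.start + timedelta(days=sd_days)
        else:
            end = prev.start                      # start date of the previous (later) record
        result.append((r.start, min(end, last_followup)))
        prev = r
    return list(reversed(result))


LAST = date(2013, 3, 13)
exenatide = [DrugRecord(date(2010, 4, 23), None, False),
             DrugRecord(date(2011, 3, 28), None, False)]
insulin = [DrugRecord(date(2010, 12, 29), None, False),
           DrugRecord(date(2011, 1, 6), None, False),
           DrugRecord(date(2012, 12, 18), None, False),
           DrugRecord(date(2013, 3, 13), None, True)]
exenatide_ends = continuous_end_dates(exenatide, LAST)
insulin_ends = continuous_end_dates(insulin, LAST)
```

Since only the records of the selected drug are visible to this method, the first EXENATIDE record is extended up to the next EXENATIDE start date, even though the chain shows a switch to insulin in between.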

5.3. General Analysis

Given our focus on GLP-1RA and insulin, to facilitate further analysis we selected, from the cohort of all T2DM patients, those who at any point in their medical history received treatment with either of the two drugs of interest. Text mining of drug names in the MD table revealed various insulin regimens as well as related devices (e.g. insulin syringes). We found that approximately 30% of the patients in the T2DM cohort received at least one prescription for an insulin drug. Interestingly, a large number of patients (~25,000) were found to have received prescriptions for insulin devices but not for insulin therapy itself. Further exploration of these patients revealed that the average duration of use of these devices in this patient group was 21 months (Table 5), strongly suggesting that there was an accompanying insulin therapy which was not recorded in the stored EMRs. This conclusion is further corroborated by the finding that the mean glycated haemoglobin (HbA1c) level for these patients was 7.8% on the date of the first record associated with the device.

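Table 5. Summary statistics on the estimated duration in months of treatment with specific medications in T2DM cohort (n=1,861,560) by “chaining” and “continuous” methods, and the difference in the estimated duration between “chaining” and “continuous” methods.

| Medication | Method | n (%) | Mean (sd) | (min, max) | Median (IQR) |
|---|---|---|---|---|---|
| Insulin + device | “Chaining” | 588923 (32) | 32.5 (35) | (0, 657.8) | 21.6 (6.5, 46.8) |
| Insulin + device | “Continuous” | 591441 (32) | 32.7 (34.9) | (0, 657.8) | 21.8 (6.3, 47.4) |
| Insulin + device | “Chaining” - “continuous” | 588923 (32) | -0.2 (4.8) | (-167.8, 183.4) | 0 (0, 0) |
| Insulin only | “Chaining” | 563293 (30) | 32.0 (34.9) | (0, 657.8) | 20.8 (6.1, 45.8) |
| Insulin only | “Continuous” | 566014 (30) | 32.2 (34.8) | (0, 657.8) | 21.0 (6, 46.5) |
| Insulin only | “Chaining” - “continuous” | 563293 (30) | -0.3 (5) | (-167.8, 176.9) | 0 (0, 0) |
| no Insulin, but device | “Chaining” | 25536 (1) | 21.2 (21.5) | (0, 196.8) | 14.3 (4.8, 30.9) |
| no Insulin, but device | “Continuous” | 25386 (1) | 21.2 (21.9) | (0, 190.7) | 14.1 (4.4, 31.1) |
| no Insulin, but device | “Chaining” - “continuous” | 24910 (1) | -0.2 (5) | (-131.8, 183.4) | 0 (0, 0) |
| GLP1RA | “Chaining” | 113416 (6) | 18.3 (19.4) | (0, 110.7) | 11.7 (3.9, 26) |
| GLP1RA | “Continuous” | 114316 (6) | 19.2 (21.0) | (0, 111.7) | 11.7 (3.5, 27.4) |
| GLP1RA | “Chaining” - “continuous” | 113416 (6) | -1.0 (7.6) | (-103.9, 95.4) | 0 (0, 0) |
| Exenatide | “Chaining” | 73326 (4) | 18.8 (20.2) | (0, 110.7) | 11.6 (3.9, 26.5) |
| Exenatide | “Continuous” | 74060 (4) | 18.8 (21.4) | (0, 111.2) | 10.6 (3.1, 26.7) |
| Exenatide | “Chaining” - “continuous” | 73326 (4) | -0.2 (8.4) | (-97.0, 95.4) | 0 (0, 0) |
| Liraglutide | “Chaining” | 56406 (3) | 12.5 (11.9) | (0, 56.2) | 8.6 (3, 19) |
| Liraglutide | “Continuous” | 56907 (3) | 12.7 (12.4) | (0, 56.2) | 8.3 (2.5, 19.5) |
| Liraglutide | “Chaining” - “continuous” | 56406 (3) | -0.3 (4.0) | (-49.5, 47.5) | 0 (0, 0) |
| Albiglutide | “Chaining” | 14 (0) | 1.3 (0.5) | (1, 2.4) | 1 (1, 1.9) |
| Albiglutide | “Continuous” | 15 (0) | 1.3 (0.5) | (1, 2.4) | 1.0 (1, 1.9) |
| Albiglutide | “Chaining” - “continuous” | 14 (0) | 0 (0) | (0, 0) | 0 (0, 0) |

In patients with treatment duration ≥2 months:

| Medication | Method | n (%) | Mean (sd) | (min, max) | Median (IQR) |
|---|---|---|---|---|---|
| Insulin + device | “Chaining” | 518000 (28) | 36.8 (35.2) | (2, 657.8) | 26.4 (11.1, 51.6) |
| Insulin + device | “Continuous” | 518318 (28) | 37.1 (35.0) | (2, 657.8) | 26.8 (11.2, 52.3) |
| Insulin + device | “Chaining” - “continuous” | 516808 (28) | -0.3 (4.9) | (-167.8, 176.9) | 0 (0, 0) |
| Insulin only | “Chaining” | 492992 (26) | 36.4 (35.2) | (2, 657.8) | 25.8 (10.7, 50.8) |
| Insulin only | “Continuous” | 493494 (27) | 36.7 (35.1) | (2, 657.8) | 26.3 (10.9, 51.6) |
| Insulin only | “Chaining” - “continuous” | 491847 (26) | -0.4 (5.2) | (-167.8, 176.9) | 0 (0, 0) |
| no Insulin, but device | “Chaining” | 22085 (1) | 24.3 (21.5) | (2, 196.8) | 17.8 (8, 34.1) |
| no Insulin, but device | “Continuous” | 21628 (1) | 24.7 (21.9) | (2, 190.7) | 18.0 (8, 34.8) |
| no Insulin, but device | “Chaining” - “continuous” | 21342 (1) | -0.5 (4.1) | (-131.8, 65.3) | 0 (0, 0) |
| GLP1RA | “Chaining” | 96458 (5) | 21.3 (19.6) | (2, 110.7) | 14.9 (6.8, 29.3) |
| GLP1RA | “Continuous” | 94972 (5) | 22.9 (21.3) | (2, 111.7) | 15.7 (6.9, 31.8) |
| GLP1RA | “Chaining” - “continuous” | 94372 (5) | -1.5 (7.8) | (-103.9, 95.4) | 0 (0, 0) |
| Exenatide | “Chaining” | 62538 (3) | 21.8 (20.4) | (2, 110.7) | 14.7 (6.6, 30.4) |
| Exenatide | “Continuous” | 60228 (3) | 22.9 (21.7) | (2, 111.2) | 15.0 (6.5, 32.1) |
| Exenatide | “Chaining” - “continuous” | 59812 (3) | -0.8 (8.0) | (-97.0, 95.4) | 0 (0, 0) |
| Liraglutide | “Chaining” | 45432 (2) | 15.3 (11.6) | (2, 56.2) | 12 (5.8, 22.1) |
| Liraglutide | “Continuous” | 44344 (2) | 16 (12.2) | (2, 56.2) | 12.5 (5.9, 23.4) |
| Liraglutide | “Chaining” - “continuous” | 43991 (2) | -0.6 (3.9) | (-49.5, 43.9) | 0 (0, 0) |
| Albiglutide | “Chaining” | 2 (0) | 2.2 (0.2) | (2.1, 2.4) | 2.2 (2.1, 2.4) |
| Albiglutide | “Continuous” | 2 (0) | 2.2 (0.2) | (2.1, 2.4) | 2.2 (2.1, 2.4) |
| Albiglutide | “Chaining” - “continuous” | 2 (0) | 0 (0) | (0, 0) | 0 (0, 0) |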

The number of patients receiving insulin and GLP-1RA, and the corresponding treatment duration estimates (in months) produced by our algorithms (“chaining” and “continuous”), are summarized in Table 5. Different insulin regimens were treated jointly, as we found that any finer level of detail is poorly recorded in the database. As regards GLP-1RA treatment, only three different GLP-1RA drugs (namely Exenatide, Liraglutide, and Albiglutide) have been used. Because Albiglutide was new to the market (introduced in 2014), only limited data were available for Albiglutide treatment.

The estimated proportion of patients identified as having received specific individual drugs was found to be very similar using the “chaining” approach and the non-chain-ID-based alternative “continuous” approach, as shown in Table 5. The corresponding values of the key statistics, namely the mean, standard deviation (SD), median, and interquartile range (IQR), of the respective estimates of the duration of treatment with individual drugs were also similar. The average differences in the estimated duration of treatment with insulin only and with GLP-1RA drugs were 0.3 months and 1 month respectively. There were no differences at the median levels. Separate analyses for patients with a minimum of 2 months of treatment duration with individual therapies revealed the same results. However, it is important to note that although the cumulative statistics of the estimated treatment durations with different therapies were not significantly different, we did find notable differences in the minimum and maximum duration estimates for specific patient subgroups, as evident from Table 5.

6. DISCUSSION

In this work we addressed a number of challenging data-mining issues that arise when extracting patient-level longitudinal information on prescription patterns and medication usage from large relational databases (our data set comprises more than a billion records). There are several key contributions of note. Firstly, we identified the specific challenges which automatic methods must deal with in the processing of this complex, voluminous data. We corroborated our arguments using analysis of real-world EMRs and discussed the importance and the implications of being able to handle erroneous and incomplete longitudinal information. Secondly, we introduced two methods for the estimation of the duration of treatment with specific drug(s) in the presence of the aforementioned challenges. The sequentially ordered, case-by-case rules that we developed were presented mathematically. To the best of our knowledge, no robust algorithmic approach has yet been reported for evaluating treatment duration with individual medications in a multiple-treatment scenario [22, 27].

We have described two algorithmic approaches to estimate treatment duration at the individual record level. The first method (“chaining”) relies on specific chaining fields of medication information, while the second approach (“continuous”) does not use chain-related information and employs only chronological record information instead. Our results on the large Centricity EMR database show that the two approaches do not produce significantly different results on average at population level. However, when examined in detail, the “chaining” method could identify treatment alterations longitudinally and was shown to be more robust at individual patient level. Furthermore, treatment duration estimates from the “continuous” approach are more sensitive to the set of selected medications. The difference between the methods is particularly prominent in studies involving multiple drugs, as opposed to single drug therapies, or focusing on the order of treatment initiation [48, 49].

Our study highlighted the potential risk of underestimating the duration of treatment when EMR data are used directly, due to erroneous or incomplete data arising from omissions in the data entry process, appointments missed by patients, typographical errors, and numerous other causes. Both proposed algorithms robustly handle these challenges whenever possible, estimating values of the missing or erroneous entries. Importantly, being rule based, the decisions of our algorithms are readily interpretable by humans and lend themselves to effortless use by medical professionals not necessarily proficient in data mining and related disciplines. Both approaches use the two fact datasets available in the Centricity EMRs; however, the algorithms are easily adjusted when only one dataset is available.

CONCLUSION

This study discusses the challenges in exploring the prescription/medication patterns for individual patients in large primary/ambulatory care electronic databases, and introduces two algorithmic approaches for robust estimation of treatment duration with individual drug(s). We have demonstrated that using the chaining fields of medication information additionally improves the quality of the estimates. Given the importance of extracting medication information appropriately in pharmaco-epidemiological studies based on real-world data, the proposed algorithms have the potential to contribute significantly to the analytical quality of future EMR-based clinical and epidemiological studies.


EMR = Electronic Medical Records

CEMR = Centricity Electronic Medical Records


T2DM = Type 2 Diabetes

MD = Medication Dimension

MF = Medication Fact

PF = Prescription Fact

GPI = Generic Product Identifier

ID = Identification

DOB = Date of Birth

SD = Standard Deviation

IQR = Interquartile Range

GLP-1RA = Glucagon-Like Peptide-1 Receptor Agonist

ETHICS APPROVAL AND CONSENT TO PARTICIPATE

Not applicable.

HUMAN AND ANIMAL RIGHTS

No animals/humans were used for studies that are the basis of this research.

CONSENT FOR PUBLICATION

Not applicable.

CONFLICT OF INTEREST

Sanjoy K. Paul (SKP) has acted as a consultant and/or speaker for Novartis, GI Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi Pharmaceutical, and Amylin Pharmaceuticals LLC. He has received grants in support of investigator and investigator-initiated clinical studies from Merck, Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis, and Pfizer. Olga Montvida (OM) and Ognjen Arandjelovic (OA) have no conflict of interest to declare. Edward Reiner (ER) was an employee of Quintiles and was responsible for the strategic development of the Centricity EMR database.

ACKNOWLEDGEMENTS

Olga Montvida (OM) and Sanjoy K. Paul (SKP) conceived the idea and were responsible for the primary design of the study and the methodological developments. Ognjen Arandjelovic (OA) and Edward Reiner (ER) evaluated the methodological approach. Olga Montvida (OM) conducted the data extraction and statistical analyses. The first draft of the manuscript was developed by Sanjoy K. Paul (SKP) and Olga Montvida (OM), and all authors contributed to the finalization of the manuscript. Sanjoy K. Paul (SKP) had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Melbourne EpiCentre gratefully acknowledges the support from the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS) initiative through Therapeutic Innovation Australia. No separate funding was obtained for this study.

REFERENCES

[1] Paul SK, Klein K, Thorsted BL, Wolden ML, Khunti K. Delay in treatment intensification increases the risks of cardiovascular events inpatients with type 2 diabetes. Cardiovasc Diabetol 2015; 14: 100.[http://dx.doi.org/10.1186/s12933-015-0260-x] [PMID: 26249018]

[2] Bhatnagar P, Wickramasinghe K, Williams J, Rayner M, Townsend N. The epidemiology of cardiovascular disease in the UK 2014. Heart2015; 101(15): 1182-9.[http://dx.doi.org/10.1136/heartjnl-2015-307516] [PMID: 26041770]

[3] Crawford AG, Cote C, Couto J, et al. Comparison of GE Centricity Electronic Medical Record database and National Ambulatory MedicalCare Survey findings on the prevalence of major conditions in the United States. Popul Health Manag 2010; 13(3): 139-50.[http://dx.doi.org/10.1089/pop.2009.0036] [PMID: 20568974]

[4] Wettermark B, Zoëga H, Furu K, et al. The Nordic prescription databases as a resource for pharmacoepidemiological research--a literaturereview. Pharmacoepidemiol Drug Saf 2013; 22(7): 691-9.[http://dx.doi.org/10.1002/pds.3457] [PMID: 23703712]

61

http://dx.doi.org/10.1186/s12933-015-0260-x

http://www.ncbi.nlm.nih.gov/pubmed/26249018

http://dx.doi.org/10.1136/heartjnl-2015-307516


http://dx.doi.org/10.1089/pop.2009.0036


http://dx.doi.org/10.1002/pds.3457



[5] Lau EC, Mowat FS, Kelsh MA, et al. Use of electronic medical records (EMR) for oncology outcomes research: assessing the comparabilityof EMR information to patient registry and health claims data. Clin Epidemiol 2011; 3: 259-72.[PMID: 22135501]

[6] Paul SK, Klein K, Maggs D, Best JH. The association of the treatment with glucagon-like peptide-1 receptor agonist exenatide or insulin withcardiovascular outcomes in patients with type 2 diabetes: A retrospective observational study. Cardiovasc Diabetol 2015; 14: 10.[http://dx.doi.org/10.1186/s12933-015-0178-3] [PMID: 25616979]

[7] Nadkarni PM. Drug safety surveillance using de-identified EMR and claims data: issues and challenges. J Am Med Inform Assoc 2010; 17(6):671-4.[http://dx.doi.org/10.1136/jamia.2010.008607] [PMID: 20962129]

[8] Liu M, McPeek Hinz ER, Matheny ME, et al. Comparative analysis of pharmacovigilance methods in the detection of adverse drug reactionsusing electronic medical records. J Am Med Inform Assoc 2013; 20(3): 420-6.[http://dx.doi.org/10.1136/amiajnl-2012-001119] [PMID: 23161894]

[9] Coloma PM, Trifirò G, Patadia V, Sturkenboom M. Postmarketing safety surveillance: where does signal detection using electronic healthcarerecords fit into the big picture? Drug Saf 2013; 36(3): 183-97.[http://dx.doi.org/10.1007/s40264-013-0018-x] [PMID: 23377696]

[10] Lin J, Jiao T, Biskupiak JE, McAdam-Marx C. Application of electronic medical record data for health outcomes research: a review of recentliterature. Expert Rev Pharmacoecon Outcomes Res 2013; 13(2): 191-200.[http://dx.doi.org/10.1586/erp.13.7] [PMID: 23570430]

[11] Belletti D, Zacker C, Mullins CD. Perspectives on electronic medical records adoption: Electronic Medical Records (EMR) in outcomesresearch. Patient Relat Outcome Meas 2010; 1: 29-37.[http://dx.doi.org/10.2147/PROM.S8896] [PMID: 22915950]

[12] Khunti K, Davies M, Majeed A, Thorsted BL, Wolden ML, Paul SK. Hypoglycemia and risk of cardiovascular disease and all-cause mortalityin insulin-treated people with type 1 and type 2 diabetes: a cohort study. Diabetes Care 2015; 38(2): 316-22.[http://dx.doi.org/10.2337/dc14-0920] [PMID: 25492401]

[13] Canavan C, West J, Card T. Calculating Total Health Service Utilisation and Costs from Routinely Collected Electronic Health Records Usingthe Example of Patients with Irritable Bowel Syndrome Before and After Their First Gastroenterology Appointment. Pharmacoeconomics2016; 34(2): 181-94.[PMID: 26497004]

[14] Bessou A, Guelfucci F, Aballea S, Toumi M, Poole C. Comparison of comorbidity measures to predict economic outcomes in a large UKprimary care database. Value Health. 2015; 18(7): A691.[http://dx.doi.org/10.1016/j.jval.2015.09.2565]

[15] Birkhead GS, Klompas M, Shah NR. Uses of electronic health records for public health surveillance to advance public health. Annu RevPublic Health 2015; 36: 345-59.[http://dx.doi.org/10.1146/annurev-publhealth-031914-122747] [PMID: 25581157]

[16] Paul MM, Greene CM, Newton-Dame R, et al. The state of population health surveillance using electronic health records: a narrative review.Popul Health Manag 2015; 18(3): 209-16.[http://dx.doi.org/10.1089/pop.2014.0093] [PMID: 25608033]

[17] Kukafka R, Ancker JS, Chan C, et al. Redesigning electronic health record systems to support public health. J Biomed Inform 2007; 40(4):398-409.[http://dx.doi.org/10.1016/j.jbi.2007.07.001] [PMID: 17632039]

[18] Menachemi N, Collum TH. Benefits and drawbacks of electronic health record systems. Risk Manag Healthc Policy 2011; 4: 47-55.[http://dx.doi.org/10.2147/RMHP.S12985] [PMID: 22312227]

[19] Crapo J. Big data in gealthcare: separating the hype from the reality. HealthCatalyst 2015; p. 5.

[20] Grabenbauer L, Skinner A, Windle J. Electronic Health Record Adoption - Maybe It’s not about the Money: Physician Super-Users,Electronic Health Records and Patient Care. Appl Clin Inform 2011; 2(4): 460-71.[http://dx.doi.org/10.4338/ACI-2011-05-RA-0033] [PMID: 23616888]

[21] Paul SK, Klein K, Majeed A, Khunti K. Association of smoking and concomitant metformin use with cardiovascular events and mortality inpeople newly diagnosed with type 2 diabetes. J Diabetes 2016; 8(3): 354-62.[http://dx.doi.org/10.1111/1753-0407.12302] [PMID: 25929583]

[22] Gaitanou P, Garoufallou E, Balatsoukas P. The effectiveness of big data in health care: a systematic review. Commun Comput Inf Sci 2014;141-53.[http://dx.doi.org/10.1007/978-3-319-13674-5_14]

[23] Svensson MK, Cederholm J, Eliasson B, Zethelius B, Gudbjörnsdottir S. Albuminuria and renal function as predictors of cardiovascularevents and mortality in a general population of patients with type 2 diabetes: a nationwide observational study from the Swedish NationalDiabetes Register. Diab Vasc Dis Res 2013; 10(6): 520-9.[http://dx.doi.org/10.1177/1479164113500798] [PMID: 24002670]

[24] Dean BB, Lam J, Natoli JL, Butler Q, Aguilar D, Nordyke RJ. Review: use of electronic medical records for health outcomes research: a

62


http://dx.doi.org/10.1186/s12933-015-0178-3


http://dx.doi.org/10.1136/jamia.2010.008607


http://dx.doi.org/10.1136/amiajnl-2012-001119


http://dx.doi.org/10.1007/s40264-013-0018-x


http://dx.doi.org/10.1586/erp.13.7


http://dx.doi.org/10.2147/PROM.S8896


http://dx.doi.org/10.2337/dc14-0920



http://dx.doi.org/10.1016/j.jval.2015.09.2565

http://dx.doi.org/10.1146/annurev-publhealth-031914-122747


http://dx.doi.org/10.1089/pop.2014.0093


http://dx.doi.org/10.1016/j.jbi.2007.07.001


http://dx.doi.org/10.2147/RMHP.S12985


http://dx.doi.org/10.4338/ACI-2011-05-RA-0033


http://dx.doi.org/10.1111/1753-0407.12302


http://dx.doi.org/10.1007/978-3-319-13674-5_14

http://dx.doi.org/10.1177/1479164113500798



literature review. Med Care Res Rev 2009; 66(6): 611-38.[http://dx.doi.org/10.1177/1077558709332440] [PMID: 19279318]

[25] Wei WQ, Denny JC. Extracting research-quality phenotypes from electronic health records to support precision medicine. Genome Med 2015;7(1): 41.[http://dx.doi.org/10.1186/s13073-015-0166-y] [PMID: 25937834]

[26] Denny JC. Chapter 13: Mining electronic health records in the genomics era. PLOS Comput Biol 2012; 8(12): e1002823.[http://dx.doi.org/10.1371/journal.pcbi.1002823] [PMID: 23300414]

[27] Torre C, Martins AP. Overview of Pharmacoepidemiological Databases in the Assessment of Medicines Under real-life Conditions. In: LunetN, Eds. Epidemiolgy-current perspective on Research and practical Intech open publishers contributers 2012; pp.131-54.[http://dx.doi.org/10.5772/35318]

[28] Centricity Electronic Medical Record Brochure. GE Healthcare 2011.

[29] Lin J, Jiao T, Biskupiak JE, McAdam-Marx C. Application of electronic medical record data for health outcomes research: a review of recentliterature. Expert Rev Pharmacoecon Outcomes Res 2013; 13(2): 191-200.[http://dx.doi.org/10.1586/erp.13.7] [PMID: 23570430]

[30] Jermyn P, Dixon M, Read BJ. Preparing clean views of data for data mining. ERCIM Work on Database Res 1999; pp. 1-15.

[31] Zhang S, Zhang C, Yang Q. Data preparation for data mining. Appl Artif Intell 2003; 17(5-6): 375-81.[http://dx.doi.org/10.1080/713827180]

[32] Jensen PB, Jensen LJ, Brunak S. Mining electronic health records: towards better research applications and clinical care. Nat Rev Genet 2012;13(6): 395-405.[http://dx.doi.org/10.1038/nrg3208] [PMID: 22549152]

[33] Benchimol EI, Smeeth L, Guttmann A, et al. The REporting of studies Conducted using Observational Routinely-collected health Data(RECORD) statement. PLoS Med 2015; 12(10): e1001885.[http://dx.doi.org/10.1371/journal.pmed.1001885] [PMID: 26440803]

[34] PLOS Medicine Editors. From checklists to tools: Lowering the barrier to better research reporting. PLoS Med 2015; 12(11): e1001910.[http://dx.doi.org/10.1371/journal.pmed.1001910] [PMID: 26600090]

[35] Yao L, Zhang Y, Li Y, Sanseau P, Agarwal P. Electronic health records: Implications for drug discovery. Drug Discov Today 2011;16(13-14): 594-9.[http://dx.doi.org/10.1016/j.drudis.2011.05.009] [PMID: 21624499]

[36] Hall GC, McMahon AD, Dain M-P, Home PD. A comparison of duration of first prescribed insulin therapy in uncontrolled type 2 diabetes.Diabetes Res Clin Pract 2011; 94(3): 442-8.[http://dx.doi.org/10.1016/j.diabres.2011.09.003] [PMID: 21963105]

[37] Hansen RA, Farley JF, Maciejewski ML, Ye X, Qian C, Powers B. Real-world utilization patterns and outcomes of colesevelam hcl in the geelectronic medical record. BMC Endocr Disord 2013; 13(1): 24.[http://dx.doi.org/10.1186/1472-6823-13-24] [PMID: 23866087]

[38] Hippisley-Cox J, Coupland C. 2016.

[39] Fardet L, Petersen I, Nazareth I. Prevalence of long-term oral glucocorticoid prescriptions in the UK over the past 20 years. Rheumatology(Oxford) 2011; 50(11): 1982-90.[http://dx.doi.org/10.1093/rheumatology/ker017] [PMID: 21393338]

[40] Davis KL, Tangirala M, Meyers JL, Wei W. Real-world comparative outcomes of US type 2 diabetes patients initiating analog basal insulintherapy. Curr Med Res Opin 2013; 29(9): 1083-91.[http://dx.doi.org/10.1185/03007995.2013.811403] [PMID: 23734906]

[41] Xie L, Wei W, Pan C, Du J, Baser O. A real-world study of patients with type 2 diabetes initiating basal insulins via disposable pens. AdvTher 2011; 28(11): 1000-11.[http://dx.doi.org/10.1007/s12325-011-0074-5] [PMID: 22038703]

[42] Deléger L, Grouin C, Zweigenbaum P. Extracting medical information from narrative patient records: the case of medication-relatedinformation. J Am Med Inform Assoc 2010; 17(5): 555-8.[http://dx.doi.org/10.1136/jamia.2010.003962] [PMID: 20819863]

[43] Etminan M. Reporting guidelines for pharmacoepidemiological studies are urgently needed. BMJ 2014; 349: g5511.[http://dx.doi.org/10.1136/bmj.g5511] [PMID: 25231185]

[44] Kamal KM, Chopra I, Elliott JP, Mattei TJ. Use of electronic medical records for clinical research in the management of type 2 diabetes. ResSocial Adm Pharm 2014; 10(6): 877-84.[http://dx.doi.org/10.1016/j.sapharm.2014.01.001] [PMID: 24556384]

[45] Herrin J, da Graca B, Nicewander D, et al. The effectiveness of implementing an electronic health record on diabetes care and outcomes.Health Serv Res 2012; 47(4): 1522-40.[http://dx.doi.org/10.1111/j.1475-6773.2011.01370.x] [PMID: 22250953]

[46] Holt TA, Stables D, Hippisley-Cox J, O’Hanlon S, Majeed A. Identifying undiagnosed diabetes: cross-sectional survey of 3.6 million patients’

63

http://dx.doi.org/10.1177/1077558709332440


http://dx.doi.org/10.1186/s13073-015-0166-y


http://dx.doi.org/10.1371/journal.pcbi.1002823


http://dx.doi.org/10.5772/35318

http://dx.doi.org/10.1586/erp.13.7


http://dx.doi.org/10.1080/713827180

http://dx.doi.org/10.1038/nrg3208


http://dx.doi.org/10.1371/journal.pmed.1001885


http://dx.doi.org/10.1371/journal.pmed.1001910


http://dx.doi.org/10.1016/j.drudis.2011.05.009


http://dx.doi.org/10.1016/j.diabres.2011.09.003


http://dx.doi.org/10.1186/1472-6823-13-24


http://dx.doi.org/10.1093/rheumatology/ker017


http://dx.doi.org/10.1185/03007995.2013.811403


http://dx.doi.org/10.1007/s12325-011-0074-5


http://dx.doi.org/10.1136/jamia.2010.003962


http://dx.doi.org/10.1136/bmj.g5511


http://dx.doi.org/10.1016/j.sapharm.2014.01.001


http://dx.doi.org/10.1111/j.1475-6773.2011.01370.x



electronic records. Br J Gen Pract 2008; 58(548): 192-6.[http://dx.doi.org/10.3399/bjgp08X277302] [PMID: 18318973]

[47] Davis KL, Tangirala M, Meyers JL, Wei W. Real-world comparative outcomes of US type 2 diabetes patients initiating analog basal insulintherapy. Curr Med Res Opin 2013; 29(9): 1083-91.[http://dx.doi.org/10.1185/03007995.2013.811403] [PMID: 23734906]

[48] Paul SK, Klein K, Majeed A, Khunti K. Association of smoking and concomitant use of metformin with cardiovascular events and mortalityin people newly diagnosed with type 2 diabetes. J Diabetes 2015; 8(3): 354-62.[PMID: 25929583]

[49] Paul SK, Klein K, Maggs D, Best JH. The association of the treatment with glucagon-like peptide-1 receptor agonist exenatide or insulin withcardiovascular outcomes in patients with type 2 diabetes: a retrospective observational study. Cardiovasc Diabetol 2015; 14(1): 10.[http://dx.doi.org/10.1186/s12933-015-0178-3] [PMID: 25616979]

© 2017 Montvida et al.

This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: (https://creativecommons.org/licenses/by/4.0/legalcode). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


Chapter 5: Diabetes Mellitus Cohort

Using diagnostic codes to identify a diseased cohort is the standard approach in EMR-based studies [163, 164]. As mentioned in Chapter 3, such codes have been shown to be a reliable tool for identifying diseased cohorts in various databases, including CEMR. Nonetheless, inaccurate coding and incomplete data entry are part of everyday practice, and EMRs reflect this [165]. With regard to DM, identifying cohorts by diagnostic codes raises the following concerns: (1) unknown disease subtype from high-level diagnostic codes, (2) longitudinally overlapping codes of different subtypes for individual patients, (3) absence of diagnostic codes for patients with DM (false negatives), and (4) presence of DM codes for those who are not diseased (false positives). The accuracy of disease cohorts identified with diagnostic codes may, in theory, be improved by implementing advanced algorithms that robustly utilise available patient-level information.

Appendix B provides more details on the aforementioned challenges and on existing methods that try to overcome them. A systematic review summarising methods for identifying disease cohorts from various clinical databases has been conducted by Shivade and colleagues [166]. While some studies applied restrictive rules to cohorts identified by diagnostic codes, several studies applied and compared machine learning (ML) techniques. For example, Tapak and colleagues [167] reported the support vector machine as the best classifier, while Mani and colleagues [168] reported decision trees as the best approach to identify patients with diabetes.

As patients with T2DM are the primary focus of this dissertation, the aim was to obtain a T2DM cohort of the highest possible quality. To overcome the aforementioned challenges, diagnostic codes were initially used to identify a cohort of patients with any DM (subsection 5.1). Next, machine learning (ML) approaches that seek to identify all patients in the database who are highly likely to be diabetic were compared (subsection 5.2). Afterwards, the cohorts identified by one selected ML algorithm and by diagnostic codes were combined. To obtain the final cohort, additional clinically guided rules were incorporated (subsection 5.3). Next, basic characteristics of the obtained DM cohort were compared to national reports (subsection 5.4). Finally, basic characteristics were explored in adults identified to have T2DM (subsection 5.5).

5.1 DIAGNOSTIC CODES

Patients with diagnostic codes for any type of diabetes were identified from CEMR. After quality assessments and data corrections, 111,303 and 2,160,098 patients were identified to have type 1 and type 2 diabetes, respectively (Figure 5.1). Patients with records for gestational diabetes and no record of type 1 or type 2 diabetes were treated separately (n=89,562). The remaining patients were categorised as “unspecific” diabetes type (n=118,557) due to high-level diagnostic codes (e.g. “ICD-9: 250 Diabetes Mellitus”), unspecific codes (e.g. “ICD-9: 362.07 Diabetic macular edema”), or longitudinally overlapping codes for type 1 and type 2. Among the aforementioned 2,479,520 patients, 59% (n=1,468,246) had a record of ADD use lasting longer than 6 months, and only 38% (n=935,652) had a record of 2 elevated glucose measures within one year.

Figure 5.1. Cohort of patients with T2DM and distribution of identified sub-types.

5.2 SUPERVISED MACHINE LEARNING

5.2.1 Training datasets

A training dataset of 150,000 patients, containing equal numbers of positive and negative representatives, was extracted. Positive representatives were randomly chosen from those with a diagnostic code for DM, and negative representatives from those without a code for DM. All representatives were chosen from patients with at least 1 year of available follow-up and non-missing sex and age. To check for bias arising from the random selection of negative representatives, two additional training datasets were created, each including the same positive representatives as the main training set and a disjoint set of negative representatives.
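
The sampling scheme described above can be sketched as follows. This is a minimal illustration rather than the code used in the thesis: the function name, the seed, and the plain lists of patient IDs are assumptions, and the eligibility criteria (at least 1 year of follow-up, non-missing sex and age) are assumed to have been applied before the call.

```python
import random


def build_training_sets(coded_ids, uncoded_ids, n_per_class, n_sets=3, seed=0):
    """Return n_sets balanced training sets sharing positives, with disjoint negatives.

    coded_ids / uncoded_ids: eligible patient IDs with / without a DM diagnostic
    code (hypothetical inputs; eligibility is assumed to be checked upstream).
    """
    rng = random.Random(seed)
    # One random draw of positives, reused in every training set.
    positives = rng.sample(sorted(coded_ids), n_per_class)
    # A single draw without replacement, cut into slices, guarantees that the
    # negative sets do not overlap.
    pool = rng.sample(sorted(uncoded_ids), n_per_class * n_sets)
    return [
        {"positive": positives,
         "negative": pool[i * n_per_class:(i + 1) * n_per_class]}
        for i in range(n_sets)
    ]


# Toy usage; the thesis used 75,000 patients per class (150,000 in total).
training_sets = build_training_sets(range(100), range(100, 400), n_per_class=20)
```

Near-identical classifier performance across the three sets would then suggest that the particular random draw of negatives does not drive the results.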

5.2.2 Feature selection

For patients with DM (identified by diagnostic codes, n=2,479,520), data on medication prescriptions, diseases, and laboratory measurements were analysed. It was found that 66% of these patients had a diagnosis of “Essential hypertension”, and 62% had a diagnosis of “Disorders of lipoid metabolism”. Antidiabetic, antihypertensive, and antilipidemic drugs were used by 82%, 70%, and 66% of patients, respectively. Non-narcotic and opioid analgesics were used by 53% and 37% of patients, respectively. Beta blockers, ulcer drugs, diuretic drugs, and antidepressants were used by 40%, 40%, 38%, and 34% of patients, respectively. The most frequent observations and laboratory tests for diabetic patients were: weight, blood pressure, pulse, height, body mass index, creatinine (serum), urea nitrogen (blood), calcium (serum), alanine aminotransferase (serum), aspartate aminotransferase (serum), sodium (serum), HbA1c (blood), bilirubin (serum), and cholesterol (serum).

The results obtained were combined with clinical considerations and guidelines [169], yielding 11 potential disease predictors. Scheme-independent (or classifier-independent) attribute subset selection algorithms were applied to determine the best predictors. Using 10-fold cross-validation, bi-directional, forward, and backward greedy search techniques [170] agreed on the 4 features shown in Table 5.1.
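
A greedy forward search over attribute subsets can be sketched as below. The thesis does not name the subset evaluator that was used, so the correlation-based merit here (in the style of correlation-based feature selection) is an assumption chosen for illustration; the toy feature names and data are hypothetical, and the cross-validation and the backward and bi-directional searches are omitted for brevity.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(subset, features, target):
    """Merit that rewards correlation with the class and penalises redundancy."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def forward_select(features, target, min_gain=1e-9):
    """Add the best-scoring feature repeatedly until the merit stops improving."""
    selected, best = [], 0.0
    while True:
        scored = [(cfs_merit(selected + [f], features, target), f)
                  for f in features if f not in selected]
        if not scored:
            return selected
        merit, feature = max(scored, key=lambda item: item[0])
        if merit <= best + min_gain:
            return selected
        selected.append(feature)
        best = merit


# Hypothetical toy data: one informative flag, an exact copy of it, and a
# second informative flag that is uncorrelated with the first.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "elevated_glucose": [0, 0, 0, 1, 1, 1, 1, 0],
    "elevated_glucose_copy": [0, 0, 0, 1, 1, 1, 1, 0],
    "long_add_use": [0, 1, 0, 0, 1, 0, 1, 1],
}
selected = forward_select(features, target)
```

In this toy case the redundant copy is never added, because it raises the feature-to-feature correlation without adding information about the class.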

Table 5.1

Features Selected as Best Diabetes Predictors in CEMR

| Feature description | Feature type |
|---|---|
| Two measurements of: HbA1c ≥ 6.5%, or fasting blood glucose ≥ 126 mg/dL [7.0 mmol/L], or random blood glucose ≥ 200 mg/dL [11.1 mmol/L], within 1 year | Binary |
| Anti-diabetic drug duration ≥ 6 months | Binary |
| Average body mass index | Continuous |
| Ischemic heart disease, heart failure or stroke | Binary |
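
As an illustration of how the first binary feature in Table 5.1 might be derived from raw measurements, consider the sketch below. The record layout, the measurement-type labels, the 365-day window, and the decision to count any two elevated records (including two on the same day) are assumptions made here, not details given in the thesis.

```python
from datetime import date

# Thresholds from Table 5.1; the dictionary keys are hypothetical labels.
THRESHOLDS = {
    "hba1c_pct": 6.5,
    "fasting_glucose_mgdl": 126.0,
    "random_glucose_mgdl": 200.0,
}


def has_two_elevated_within_year(measurements, window_days=365):
    """True if any two elevated measurements lie within window_days of each other.

    measurements: iterable of (date, kind, value) tuples for one patient.
    """
    elevated = sorted(d for d, kind, value in measurements
                      if value >= THRESHOLDS[kind])
    # After sorting, checking neighbouring dates is enough: if any pair is
    # within the window, some adjacent pair is too.
    return any((later - earlier).days <= window_days
               for earlier, later in zip(elevated, elevated[1:]))


patient = [
    (date(2014, 1, 10), "hba1c_pct", 7.1),
    (date(2014, 3, 1), "fasting_glucose_mgdl", 110.0),
    (date(2014, 11, 20), "random_glucose_mgdl", 215.0),
]
flag = has_two_elevated_within_year(patient)
```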

5.2.3 Classification algorithm selection

Keeping only the four selected predictors, the performance of six classification algorithms was compared on the training sets. Sensitivity (true positive rate), specificity (true negative rate), and area under the receiver operating characteristic curve (AUC) were calculated as the average of 10-repeat 10-fold cross-validations. Central processing unit (CPU) time of training and the percentage of correctly classified instances were also recorded. The classifiers compared were: Naïve Bayes [171], Logistic Regression [172], Support Vector Machine [173], Multilayer Perceptron [174], Decision Tree with J48 modification [175], and One Rule [176].

While the One Rule algorithm performed significantly worse, the performance of the other algorithms was similar (Table 5.2). Among them, the false positive rate was the same, but the Support Vector Machine and Decision Tree algorithms produced fewer false negatives. Given its higher AUC and much shorter CPU time compared with the Support Vector Machine, the Decision Tree algorithm was chosen as the final machine learning approach. Absence of bias arising from the random selection of negative instances was confirmed by the almost identical performance of all algorithms on the three training sets.
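
For reference, the three performance measures used in this comparison can be computed from labels, predictions, and scores as in the sketch below. It is a generic illustration on toy data, not the evaluation code used in the thesis; the AUC is computed through its rank interpretation (the probability that a randomly chosen positive scores higher than a randomly chosen negative), and the 0.5 decision threshold is an assumption.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (true positive rate, true negative rate) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example with a 0.5 decision threshold (an assumption for illustration).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
```

The pairwise AUC is quadratic in the number of patients, so for cohorts of the size considered here a rank-based implementation would be preferable in practice.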

Table 5.2

Performance of Machine Learning Algorithms on the Training Dataset

| | Naïve Bayes | Logistic Regression | Multilayer Perceptron | Support Vector Machine | J48 Decision Tree | One Rule |
|---|---|---|---|---|---|---|
| Percent correct | 90.04 | 90.12 | 90.15 | 90.17 | 90.17 | 86.6 |
| True positive rate | 0.96 | 0.96 | 0.96 | 0.97 | 0.97 | 0.98 |
| True negative rate | 0.84 | 0.84 | 0.84 | 0.84 | 0.84 | 0.76 |
| AUC | 0.94 | 0.94 | 0.94 | 0.90 | 0.91 | 0.87 |
| CPU time | 0.05 | 2.26 | 49.23 | 157.71 | 0.52 | 0.09 |

5.3 FINAL COHORT

The selected J48 algorithm (Figure 5.2) was applied to all patients in the CEMR with at least 1 year of follow-up and non-missing sex and age, and identified 2,023,956 patients who are highly likely to have diabetes. Of them, 78% (n=1,580,867) had a diagnostic code for diabetes.
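
The size of the union of the ML-identified and code-identified cohorts follows from simple set arithmetic. The sketch below only re-derives counts reported in this chapter (the 2,479,520 patients identified by diagnostic codes in subsection 5.1 and the two counts above); the variable names are illustrative.

```python
coded = 2_479_520        # patients with a diagnostic code for diabetes (subsection 5.1)
ml_flagged = 2_023_956   # patients flagged by the J48 decision tree
both = 1_580_867         # flagged by the tree and carrying a diagnostic code

ml_only = ml_flagged - both           # flagged by ML without any diagnostic code
combined = coded + ml_only            # size of the union of the two cohorts
share_with_code = both / ml_flagged   # proportion of ML-flagged patients with a code
```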

Figure 5.2. Selected Decision Tree algorithm.

As errors during the data entry process in a real-world setting usually result in a smaller number of false positive patients and a larger number of false negatives among those identified by diagnostic codes, the cohorts identified by the ML approach and by diagnostic codes were combined (n=2,922,609). Minimizing false negative instances was ensured by careful design in further studies (e.g. inclusion of patients who initiated pharmacological therapy with metformin (MET) and added a second-line ADD). Patients who were identified by ML and not by diagnostic codes (n=443,089) were categorised as “unspecific” diabetes type. As a final step, the following rules were applied to distinguish diabetes types amongst all the “unspecific” cases:

1. if duration of non-insulin ADD ≥ 2 months, then type 2;

2. otherwise, if age at first available diagnosis date ≤ 18 years and insulin initiated within 1 year, then type 1;

3. otherwise, if age at first available diagnosis date > 18 years and insulin initiated within 3 months, then type 1;

4. otherwise, consider that patient does not have diabetes.

Steps 1-3 were unable to identify the type of diabetes for 29,288 patients, who were excluded from the cohort. The final diabetes cohort consisted of 2,893,321 patients, with 178,805 / 2,624,954 / 89,562 patients identified to have type 1 / type 2 / gestational diabetes (Figure 5.1).
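
The rule cascade applied to the “unspecific” cases can be written as a small function. This is a sketch under stated assumptions: the argument names are hypothetical, "within" is read as an inclusive bound, and insulin initiation is assumed to be measured in months from the first available diagnosis date, which the text does not state explicitly.

```python
def assign_diabetes_type(non_insulin_add_months, age_at_diagnosis, months_to_insulin):
    """Apply the four rules in order and return the assigned category.

    non_insulin_add_months: duration of non-insulin ADD treatment, in months
    age_at_diagnosis: age in years at the first available diagnosis date
    months_to_insulin: months from diagnosis to insulin initiation, or None
                       if insulin was never initiated (assumed reference point)
    """
    # Rule 1: sustained non-insulin ADD use indicates type 2.
    if non_insulin_add_months >= 2:
        return "type 2"
    if months_to_insulin is not None:
        # Rule 2: young at diagnosis and insulin started within 1 year.
        if age_at_diagnosis <= 18 and months_to_insulin <= 12:
            return "type 1"
        # Rule 3: adult at diagnosis and insulin started within 3 months.
        if age_at_diagnosis > 18 and months_to_insulin <= 3:
            return "type 1"
    # Rule 4: no rule matched, so the patient is treated as not having diabetes.
    return "no diabetes"


example = assign_diabetes_type(non_insulin_add_months=0, age_at_diagnosis=40,
                               months_to_insulin=2)
```

Because the rules are evaluated in order, a patient with at least 2 months of non-insulin ADD use is classified as type 2 regardless of age or insulin timing.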

5.4 REPRESENTATIVENESS OF DIABETES COHORT

Among all patients who were active in the CEMR during 2015 and were older than 18 years, 11.6% were identified to have any type of diabetes. This estimate is very close to that of the US National Diabetes Statistics (NDS) report, which estimated that 12.2% of the adult population had diabetes in 2015 [177]. The NDS reported an almost equal gender distribution (Table 5.3), while a higher proportion of women (55%) was captured in the CEMR. Based on the age of patients in 2015, the CEMR appeared to have younger patients than the NDS report. Body weight of adults with diabetes in the CEMR was found to reflect the NDS report well, with the majority of patients (90%) being overweight or obese (Table 5.3).

Table 5.3

Characteristics of patients with diabetes in the CEMR database and in the National Diabetes Statistics report, 2015

| Characteristic | CEMR, % | NDS, % |
|---|---|---|
| Gender (p=0.4) | | |
| Male | 45 | 51 |
| Female | 55 | 49 |
| Age, years (p=0.5) | | |
| 18-44 | 13 | 9 |
| 45-65 | 39 | 37 |
| ≥65 | 47 | 55 |
| BMI, kg/m2 (p=0.9) | | |
| <25 | 10 | 13 |
| 25-30 | 25 | 26 |
| 30-40 | 46 | 44 |
| ≥40 | 19 | 18 |
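
The thesis does not state which test produced the p-values in Table 5.3. One plausible reading, assumed here purely for illustration, is a Pearson chi-square test of homogeneity applied to the two percentage columns treated as counts; under that assumption the sketch below returns values that round to the reported p-values. The function names are hypothetical, and the survival function uses closed forms that hold for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_k (x/2)^(k - 1/2) / Gamma(k + 1/2)
    total = 0.0
    term = math.sqrt(half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (k + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def chi_square_test(table):
    """Pearson chi-square test for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df)


# Rows: CEMR and NDS percentages from Table 5.3, treated as counts (assumption).
gender = [[45, 55], [51, 49]]
age = [[13, 39, 47], [9, 37, 55]]
bmi = [[10, 25, 46, 19], [13, 26, 44, 18]]
p_values = [chi_square_test(t)[2] for t in (gender, age, bmi)]
```

Treating percentages as counts fixes the effective sample size at about 100 per column, so such p-values describe the similarity of the two distributions rather than a test at the true cohort sizes.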

Other estimates in the NDS report are hard to compare directly with the CEMR due to methodological considerations. Although a large number of patients with diabetes do not have a record of tobacco use in the CEMR, 17% of those with a recorded tobacco status were current smokers. This estimate is very close to the 16% of current smokers reported in the NDS. The NDS estimated that 74% of adults with DM had SBP ≥ 140 mmHg, or DBP ≥ 90 mmHg, or were on prescription medication for high blood pressure. This estimate in the CEMR ranged between 70% and 82%, depending on the definition of medication for high blood pressure and the timeline chosen. According to the NDS report, 58% of adults aged 21-75 with no self-reported CVD but who were eligible for statin therapy were on a lipid-lowering medication (data source: 2011–2014 National Health and Nutrition Examination Survey). Rough estimates from the CEMR for this figure were 60-65% for adults aged 21-75 in 2015.

5.5 TYPE 2 DIABETES COHORT

Among the 2,624,954 patients identified to have T2DM, 2,596,630 were at least 18 years old at the time of the diabetes diagnosis. Basic characteristics at the time of diagnosis, along with existing diseases, are presented for these adults in Table 5.4. In this cohort, 53% / 47% are females / males, with a mean (SD) follow-up of 3.9 (4.8) years. The majority of patients are White Caucasians, and the mean (SD) age is 59 (13) years. At the time of T2DM diagnosis, 39% / 45% of females / males had HbA1c ≥ 8%, and 68% / 62% of females / males were obese. Blood pressure and lipid profiles were very similar across females and males (Table 5.4). At the time of the T2DM diagnosis, 23% of males and 15% of females were diagnosed with CVD. The Charlson Comorbidity Index (CCI) was around 1.5 for both genders; however, more females had a record of a depression diagnosis: 11% against 6% for males.

Table 5.4

Baseline characteristics among adults with T2DM

| | Female | Male | All |
|---|---|---|---|
| N | 1,381,075 | 1,215,555 | 2,596,630 |
| Follow-up, years α | 4.0 (4.8) | 3.9 (4.9) | 3.9 (4.8) |
| Follow-up, years β | 2.8 (0.9, 5.6) | 2.7 (0.9, 5.5) | 2.7 (0.9, 5.5) |
| Follow-up ≥ 6 months γ | 1,148,017 (83) | 991,743 (82) | 2,139,760 (82) |
| Follow-up ≥ 12 months γ | 1,021,250 (74) | 882,575 (73) | 1,903,825 (73) |
| Follow-up ≥ 24 months γ | 815,233 (59) | 703,941 (58) | 1,519,174 (59) |
| Age, years α | 58 (14) | 60 (12) | 59 (13) |
| Age ≥ 70 years γ | 319,075 (23) | 293,480 (24) | 612,555 (24) |
| Ethnicity γ | | | |
| White, n (%) | 852,115 (62) | 799,320 (66) | 1,651,435 (64) |
| Black, n (%) | 185,040 (13) | 110,491 (9) | 295,531 (11) |
| Asian, n (%) | 24,718 (2) | 22,753 (2) | 47,471 (2) |
| Tobacco use status γ | | | |
| Current | 82,218 (6) | 85,101 (7) | 167,319 (6) |
| Former | 113,714 (8) | 151,353 (12) | 265,067 (10) |
| Never | 306,603 (22) | 197,396 (16) | 503,999 (19) |
| Unknown | 878,540 (64) | 781,705 (64) | 1,660,245 (64) |
| HbA1c, % α | 8.2 (1.8) | 8.4 (1.9) | 8.3 (1.9) |
| HbA1c ≥ 7.5% γ | 156,850 (51) | 173,302 (57) | 330,152 (54) |
| HbA1c ≥ 8% γ | 121,476 (39) | 138,013 (45) | 259,489 (42) |
| Weight, kg α | 90 (24) | 102 (24) | 96 (25) |
| BMI, kg/m2 α | 34.6 (8.5) | 32.8 (7.1) | 33.8 (7.9) |
| Obese γ | 707,335 (68) | 557,226 (62) | 1,264,561 (65) |
| SBP, mmHg α | 131 (17) | 132 (17) | 132 (17) |
| SBP ≥ 140 mmHg γ | 299,248 (28) | 270,470 (29) | 569,718 (29) |
| DBP, mmHg α | 77 (10) | 78 (10) | 77 (10) |
| Heart rate, bpm α | 79 (12) | 77 (12) | 78 (12) |
| LDL, mg/dL α | 109 (38) | 101 (37) | 105 (38) |
| HDL, mg/dL α | 49 (14) | 41 (12) | 45 (14) |
| Triglycerides, mg/dL β | 139 (101, 188) | 139 (99, 191) | 139 (100, 190) |
| Present diseases γ | | | |
| CVD | 204,114 (15) | 279,284 (23) | 483,398 (19) |
| Heart Failure | 47,427 (3) | 56,471 (5) | 103,898 (4) |
| Myocardial Infarction | 18,019 (1) | 32,515 (3) | 50,534 (2) |
| Stroke | 55,988 (4) | 55,631 (5) | 111,619 (4) |
| Chronic Kidney Disease | 45,140 (3) | 50,929 (4) | 96,069 (4) |
| Rheumatoid Arthritis | 20,742 (2) | 8,113 (1) | 28,855 (1) |
| Cancer | 48,105 (3) | 47,882 (4) | 95,987 (4) |
| Depression | 156,623 (11) | 69,138 (6) | 225,761 (9) |
| Charlson Comorbidity Index α | 1.48 (0.95) | 1.53 (1.03) | 1.51 (0.99) |

α mean (SD), β median (IQR), γ n (%)

In adult patients with T2DM, exposure to various medications at any time during follow-up is presented in Table 5.5. The majority of patients (61%) were prescribed metformin, while a quarter of patients eventually received insulin. Chapter 7 provides a detailed exploration of longitudinal prescription patterns of ADDs in patients with T2DM. Around 80% of patients were using a cardio-protective medication (CPM), while 64% / 71% of females / males received lipid-modifying drugs at some time during follow-up.

Table 5.5

Exposure to medications any time during available follow-up among adults with T2DM

| N (%) | Female | Male | All |
|---|---|---|---|
| Metformin | 842,806 (61) | 736,378 (61) | 1,579,184 (61) |
| Sulfonylurea | 405,132 (29) | 428,792 (35) | 833,924 (32) |
| Thiazolidinedione | 151,198 (11) | 164,740 (14) | 315,938 (12) |
| Insulin | 351,106 (25) | 330,066 (27) | 681,172 (26) |
| GLP-1RA | 90,030 (7) | 67,750 (6) | 157,780 (6) |
| DPP-4i | 193,388 (14) | 188,045 (15) | 381,433 (15) |
| SGLT-2i | 39,574 (3) | 42,320 (3) | 81,894 (3) |
| Lipid modifying | 880,364 (64) | 860,370 (71) | 1,740,734 (67) |
| Statin | 800,753 (58) | 792,066 (65) | 1,592,819 (61) |
| CPM* | 1,094,662 (79) | 1,024,434 (84) | 2,119,096 (82) |
| Diuretic | 666,838 (48) | 519,525 (43) | 1,186,363 (46) |
| Antihypertensive | 116,478 (8) | 119,395 (10) | 235,873 (9) |
| Antidepressant | 575,425 (42) | 323,365 (27) | 898,790 (35) |
| Anti-obesity | 39,942 (3) | 12,503 (1) | 52,445 (2) |

*CPM: beta blocker, statin, angiotensin-converting-enzyme inhibitor, or angiotensin II receptor blocker

Chapter 6: Imputation of Longitudinal Observation Data


Mayukh Samanta, Olga Montvida, Joanne Tropea, and Sanjoy K. Paul. A comparison of

imputation methods for missing risk factor data from large real-world electronic medical

records for comparative effectiveness studies.


Olga Montvida Conducted data extraction and contributed towards manuscript development.

Mayukh Samanta Conceived the idea and was responsible for the primary design of the study. Conducted statistical analyses.


of the manuscript.

Joanne Tropea Contributed towards development of the manuscript.

Sanjoy K. Paul Conceived the idea and was responsible for the primary design of the study. Contributed to the statistical analyses.

Title: A Comparison of Imputation Methods for Missing Risk Factor Data from Large Real-world Electronic Medical Records for Comparative Effectiveness Studies

Mayukh Samanta, PhD1, Olga Montvida1,2, Joanne Tropea3 and Sanjoy K. Paul, PhD3

1Statistics Unit, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston, QLD 4006, Australia

2School of Biomedical Sciences, Faculty of Health, Queensland University of Technology, Brisbane, Australia

3Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia

Correspondence to:

Prof. Sanjoy Ketan Paul

Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, The Royal Melbourne Hospital - City Campus | 7 East, Main Building, Grattan Street, Parkville Victoria 3050, Australia

Email: [email protected]

ORCID Researcher ID: F-8199-2010

Word Count Abstract: 244

Word Count Main Text: 2969

Number of Tables + Figures: 3 +2

Supplementary Figures: 1


ABSTRACT

Background: Evaluation of appropriate methodologies for imputation of missing risk factor

or outcome data from electronic medical records (EMRs) is crucial but lacking for comparative

effectiveness studies. Robust imputation of missing data relies on the understanding of the

predictors of missingness in the risk factor data, especially in patients with chronic diseases.

These two aspects have not been explored simultaneously to support methodological

developments in clinical epidemiological studies with real-world data.

Methods: Using disease-biomarker data (glycated haemoglobin, HbA1c) from a large EMR database in patients with diabetes, exploratory analyses were conducted to ascertain the possible predictors of missingness. Three approaches based on the multiple imputation technique, namely two-fold multiple imputation, multiple imputation by chained equations, and multiple imputation with Markov Chain Monte Carlo, were evaluated in terms of their robustness in imputing missing data. The value of using imputed data for drawing robust inferences on the comparative effectiveness of two anti-diabetes therapies was compared with the complete-case analyses.

Results: Older patients and patients with higher disease-severity were less likely to have

missing HbA1c data longitudinally over 12 months, while gender and pre-existing

comorbidities were not associated with the likelihood of missingness. No significant

differences in the distributions of follow-up imputed data with the three methods were

observed.

Conclusion: While complete case analyses were prone to bias by indication, the use of three multiple imputation techniques for a large proportion of missing primary outcome data under unknown patterns of missingness appeared to be valid and able to provide consistent and reliable clinical inferences.


Key Words: Missing data; Imputation; Multiple Imputation; Electronic Medical Records;

Real-world data


INTRODUCTION

Recent advances in the design and implementation of large electronic medical records (EMRs)

from national primary/ambulatory care databases have created new opportunities in clinical

and epidemiological studies [1, 2]. These databases have been extensively used to evaluate the

risk factor changes in patients with different clinical conditions [3-6]. However, one of the

critical problems with EMR data, as with all longitudinal observational data, is the issue of

missing data [7-10].

The data entry in EMRs depends on the nature and level of engagement between the individual

and the clinical service provider. For example, a patient with diabetes would be advised to get

blood tests done every 6 months for the assessment of various risk factors including glucose

level and lipids. However, given the severity of the disease state and the nature of anti-diabetes

drug (ADD) titration, the patient might need blood tests done more frequently. In primary care

settings, it is hypothesized that younger patients and those with lower risk profiles are less

likely to get blood tests done. The missing data may also arise simply because a patient failed

to attend the scheduled consultations. These aspects complicate the assertion about the nature

of missing data in EMRs, making it difficult to appropriately differentiate between random and

non-random missingness patterns. Although the problem of having significant proportions of

missing data in longitudinal studies can be minimised through careful design, it is almost

unavoidable in most clinical and epidemiological studies [11-13].

The inference drawn from a clinical or epidemiological study may be compromised when

individuals have missing data on health indicators, and inadequate handling of the missing data

can lead to substantial bias in the inferences drawn [12, 13]. Hence, before investigating and

imputing for the missing data, understanding the mechanisms behind the missing data is

crucial. In practice, incomplete data are typically considered as missing at random (MAR) even

if they may not be [12, 14]. In most EMRs, some variables would be expected to partially explain some of the variation in missingness, which supports imputation under a MAR setting [12]. A previous study reported that standard imputation of missing EMR data that are not missing at random (NMAR), without an NMAR model, might produce biased estimates, although the bias might not be large [15].

Multiple imputation for missing data, compared to a single imputation, accounts for the statistical uncertainty in the missing values. Multiple imputation can lead to consistent, asymptotically normal and efficient estimates for a dataset with a MAR missing pattern, which makes it very attractive [10, 16-18]. Several statistical and machine learning methods, including multiple imputation techniques, have been used to deal with the complex problem of missing data [7-9, 19]. There is a strong body of literature on the methodological and application aspects of multiple imputation of missing values [10, 14, 18-21]. Carpenter and Kenward (2013) described the theoretical justifications and computational aspects of various approaches within the multiple imputation platform [22]. However, studies addressing the fundamental aspects of missingness patterns in risk factor data from EMRs, and the practical implications of such missingness when conducting comparative effectiveness studies, are scarce [20, 23-25].
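How multiple imputation carries the uncertainty of the missing values into a final estimate can be illustrated with Rubin's rules for pooling. The sketch below is not taken from the manuscript; the function name and the three per-imputation estimates are hypothetical and serve only to show the arithmetic: the pooled variance is the average within-imputation variance plus an inflated between-imputation variance.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their variances with Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical HbA1c changes (%) estimated on m = 3 imputed datasets.
estimates = [-1.0, -1.2, -0.8]
variances = [0.04, 0.04, 0.04]
pooled, total = pool_rubin(estimates, variances)
print(round(pooled, 3), round(total, 4))
```

A single imputation would report only the within-imputation variance (0.04 here); the between-imputation term is what reflects the uncertainty about the imputed values.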

Using nationally representative EMR from the primary care system, the aims of this study were

to: (1) evaluate the association of different patient-level characteristics with the likelihood of

missingness of the risk factor or outcome data, and (2) investigate the performance of three

multiple imputation techniques for the imputation for missing longitudinal clinical risk factor

data in the context of evaluating comparative therapeutic effectiveness. Three multiple

imputation techniques for missing disease biomarker data (glycated haemoglobin, HbA1c)

were compared in patients with type 2 diabetes (T2DM), treated with two different ADDs under

a new user design setup. The robustness of drawing clinical inferences with imputed data and

complete case (CC) was explored for comparison of effectiveness of the therapies at population

level.


METHODS

Data

The Centricity Electronic Medical Record (CEMR) database of the USA represents a variety of ambulatory and

primary care medical practices. Over 35,000 physicians and other providers from all US states

contribute to the CEMR, with approximately 75% being primary care providers. The database

is generally representative of the USA population; diabetes prevalence (7.1% diabetic patients

identified by diagnostic codes) is similar to National Diabetes Statistics (6.7% diagnosed

diabetes in 2014) [26]. CEMR has been used extensively for academic research worldwide [27-

29].

For more than 34 million individuals, longitudinal EMRs were available from 1995 until April

2016. This database contains comprehensive patient-level information on demographics,

anthropometric, clinical and laboratory variables including age, sex, ethnicity, and longitudinal

measures of HbA1c. Medication data includes brand names and doses for individual

medications prescribed, along with start/ stop dates and specific fields to track treatment

alterations. This dataset also contains patient reported medications, including prescriptions

received outside the EMR network and over-the-counter medications. A robust methodology

for extraction and assessment of longitudinal patient-level medication data from the CEMR

database has been recently described by the authors [30].

Study population

The T2DM study cohort was selected on the basis of the following conditions: (1) valid

diagnosis of T2DM, (2) 18 - 80 years old at the date of treatment initiation (baseline), (3) no

missing data on age and sex, and (4) valid baseline HbA1c measure. We focused on two ADDs: dipeptidyl peptidase-4 inhibitor (DPP-4) and glucagon-like peptide-1 receptor agonist (GLP-1RA), when added to first-line metformin. The numbers of patients with a minimum of 12/24 months of follow-up post initiation of DPP-4 and GLP-1RA were 38,483/23,859 and 8,977/5,312 respectively. These patients were receiving the respective therapies for a minimum of one year. HbA1c measures at baseline, 6, 12, 18, and 24 months were obtained as the nearest measure within 3 months either side of the time point.
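The rule for assigning an HbA1c value to a follow-up time point can be sketched as follows. This is only an illustration under stated assumptions: "3 months" is approximated as 91 days, and the function name, record layout and patient data are hypothetical rather than the extraction code used for the study.

```python
from datetime import date

WINDOW_DAYS = 91  # assumption: "3 months" approximated as 91 days


def nearest_measure(measurements, target, window_days=WINDOW_DAYS):
    """Return the value measured closest to `target` within the window, else None.

    `measurements` is a list of (date, value) pairs for one patient.
    """
    in_window = [
        (abs((when - target).days), value)
        for when, value in measurements
        if abs((when - target).days) <= window_days
    ]
    if not in_window:
        return None  # no measure in the window: missing at this time point
    return min(in_window, key=lambda pair: pair[0])[1]


# Hypothetical patient with a 6-month time point on 15 July 2014.
records = [(date(2014, 6, 1), 7.4), (date(2014, 7, 20), 7.1), (date(2015, 3, 1), 6.9)]
six_months = date(2014, 7, 15)
print(nearest_measure(records, six_months))  # 7.1
```

A patient with no measure inside the window receives `None`, which corresponds to a missing value at that time point in the analyses that follow.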

Multivariate Imputation by Chained Equations (MICE)

MICE is an increasingly popular method of dealing with missing data in epidemiological and

clinical research. This iterative imputation approach imputes multiple variables by using

chained equations under the assumption that missing data are MAR [31]. This method creates

multiple imputations for missing multivariate data by Gibbs sampling. An advantage of

MICE is that it can handle arbitrary missing data patterns as well as variables of different types.

For imputation with continuous variables, linear regression models or predictive mean

matching are used while logistic regression and polytomous models are needed for

dichotomous and categorical variables respectively [32, 33]. MICE is also commonly known

as fully conditional specification (FCS) and sequential regression multivariate imputation [34].
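The chained-equations idea can be sketched for two continuous variables, each imputed from the other by linear regression with a random residual draw. This is a deliberately simplified illustration and not the software used in the study: it does not draw the regression parameters from their posterior distribution, so it is not a proper multiple imputation, and all names and data values are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual SD)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    return a, b, (rss / max(n - 2, 1)) ** 0.5


def chained_imputation(rows, n_iter=10, seed=0):
    """Impute None entries in two-column rows by alternating regressions."""
    rng = random.Random(seed)
    data = [list(r) for r in rows]
    missing = [[v is None for v in r] for r in rows]
    # Step 1: start from the observed column mean for every missing entry.
    for j in (0, 1):
        observed = [r[j] for r in rows if r[j] is not None]
        fill = sum(observed) / len(observed)
        for r in data:
            if r[j] is None:
                r[j] = fill
    # Step 2: cycle over the columns, re-imputing each one from the other.
    for _ in range(n_iter):
        for j in (0, 1):
            k = 1 - j
            train = [r for r, m in zip(data, missing) if not m[j]]
            a, b, sd = fit_line([r[k] for r in train], [r[j] for r in train])
            for r, m in zip(data, missing):
                if m[j]:
                    r[j] = a + b * r[k] + rng.gauss(0, sd)
    return data


# Hypothetical (baseline, 6-month) HbA1c pairs; None marks a missing value.
rows = [(8.1, 7.2), (7.5, None), (9.0, 7.9), (6.9, 6.5), (None, 7.0), (8.4, 7.4)]
imputed_sets = [chained_imputation(rows, seed=s) for s in range(5)]
```

Each seed yields one completed dataset; the analysis would then be run on every completed dataset and the results pooled.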

Two-fold Multiple Imputation

There are two phases in the two-fold method of imputation: the filled-in phase, followed by the imputation phase. At the filled-in phase, the missing values for all variables are filled in sequentially over the variables taken one at a time, which specifies separate univariate imputation models for each variable with missing data conditional on all other variables [35].

At the imputation stage, the missing values are imputed using a specified method and covariates

at each iteration. Two-fold multiple imputation imputes missing values at each time point

conditional on observed measures within a small time window using FCS or Chained Equations

[35, 36]. Usually, this method uses information from the time point at which the imputation is conducted and from the immediately adjacent time points. A distinct advantage of this method is that it can handle both time-dependent and time-independent variables, while allowing users to specify the time window. This method also reduces the issues of collinearity and overfitting.
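The role of the time window can be illustrated with a small helper that lists which time points would contribute to the imputation model at a given visit. The function name and the convention that the width counts adjacent time points on each side are assumptions made for this sketch only.

```python
def window_time_points(time_points, target, width=1):
    """Return the time points within `width` positions of `target`, inclusive."""
    idx = time_points.index(target)
    lo = max(0, idx - width)
    hi = min(len(time_points), idx + width + 1)
    return time_points[lo:hi]


visits = [0, 6, 12, 18, 24]  # months from baseline
print(window_time_points(visits, 12))           # [6, 12, 18]
print(window_time_points(visits, 0))            # [0, 6]
print(window_time_points(visits, 12, width=2))  # [0, 6, 12, 18, 24]
```

With a width of one, the 12-month value is imputed from the 6-, 12- and 18-month data only, which keeps the number of predictors in each model small.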

Multiple imputation with Bayesian Markov Chain Monte Carlo (MCMC)

The multiple imputation platform incorporates both parametric and non-parametric approaches. Parametric approaches include improper imputation, approximate proper imputation, and Bayesian (proper) imputation, which usually uses Markov Chain Monte Carlo (MCMC) methods to obtain the posterior distribution [22]. In the context of arbitrary missing data patterns, the MCMC method is often used; it creates multiple imputations by using simulations from a Bayesian predictive distribution for normal data. We used multiple imputation with Bayesian iterative MCMC procedures, which can be used when the pattern of missing data is either monotone or non-monotone [22].

Statistical Methods

We imputed the missing HbA1c (measured in %) measurements at 6-, 12-, 18- and 24-month

follow-up using these three methods and compared the results with CC analysis. CC analysis

considered only the non-missing HbA1c at the same time points. In all three imputation

methods, imputed values were adjusted for age, sex, and addition of any third-line ADD within

2 years of follow-up.

Basic statistics were presented by number (percentage), mean (SD), mean (95% CI) or median

(first quartile, third quartile) separately for the two treatment groups, as appropriate. Both the

unadjusted and adjusted change in HbA1c (%) at 6 and 12 months by the two treatment groups

were evaluated, the adjustment factors being age at treatment initiation, sex, diabetes duration

at treatment initiation, and time to second-line ADD. Among patients with baseline HbA1c ≥ 7.5%, logistic regression was used to evaluate the odds of reducing HbA1c below 7% (the glucose management target in patients with T2DM) at 6 and 12 months of follow-up in the GLP-1RA group compared to the DPP-4 group. Treatment status is usually not randomized in observational data, which implies that the outcome and treatment are not necessarily independent. To address this issue, we applied a treatment effects model adjusting for age, sex, diabetes duration at treatment initiation, and time to second-line ADD, so that treatment and outcome are independent conditional on those covariates [37].
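As a point of reference for the odds ratios reported later, the unadjusted odds ratio and its Wald confidence interval can be computed directly from a 2x2 table. The sketch below is not the adjusted logistic regression or the treatment effects model used in the study; the function name and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a, b: treated patients with and without the outcome
    c, d: comparator patients with and without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: 60 of 100 treated and 50 of 100 comparator patients
# reach HbA1c below 7%.
odds_ratio, lower, upper = odds_ratio_ci(60, 40, 50, 50)
print(odds_ratio)  # 1.5
```

An adjusted analysis replaces this crude contrast with a regression coefficient for treatment, but the interpretation of the resulting odds ratio is the same.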

To evaluate the factors that could be associated with missingness of HbA1c measures at follow-up, the likelihood of missingness of HbA1c at 6 and 12 months of follow-up from baseline was estimated for each treatment group using logistic regression, adjusting for age, sex, pre-existing cardiovascular disease (CVD) or pre-existing chronic kidney disease (CKD), baseline HbA1c ≥ 7.5% and use of other medications. Instead of conventional age groups, we considered quartiles of age, denoted by Q1 (18-50 years), Q2 (50-58 years), Q3 (58-66 years) and Q4 (66-80 years).
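The two ingredients of this analysis, the age quartile and the missingness indicator, can be sketched as below. The quartile limits quoted above share their end points, so the sketch assumes that each lower limit is inclusive; that convention, the function names and the patient data are assumptions for illustration only.

```python
from collections import defaultdict


def age_quartile(age):
    """Map age at treatment initiation to Q1-Q4 (lower limits assumed inclusive)."""
    if not 18 <= age <= 80:
        raise ValueError("age outside the 18-80 year inclusion range")
    if age < 50:
        return "Q1"
    if age < 58:
        return "Q2"
    if age < 66:
        return "Q3"
    return "Q4"


def missing_proportion_by_quartile(patients):
    """Proportion of patients with a missing follow-up HbA1c in each age quartile."""
    counts = defaultdict(lambda: [0, 0])  # quartile -> [missing, total]
    for age, hba1c in patients:
        quartile = age_quartile(age)
        counts[quartile][0] += hba1c is None
        counts[quartile][1] += 1
    return {q: miss / total for q, (miss, total) in sorted(counts.items())}


# Hypothetical (age, 6-month HbA1c) pairs; None marks a missing measure.
patients = [(45, None), (47, 7.2), (55, 7.0), (60, None), (70, 6.8), (72, 6.9)]
print(missing_proportion_by_quartile(patients))
```

In the study the binary missingness indicator is the outcome of a logistic regression; the crude proportions above are only the descriptive counterpart.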

RESULTS

The basic characteristics of the study cohort are presented in Table 1. The mean (SD) age was

58 (12) and 54 (11) years, 49% and 35% were male, and 71% and 78% of the patients were White Caucasian in the DPP-4 and GLP-1RA groups respectively. Median (Q1, Q3) HbA1c at baseline in patients with a minimum of 12 months of DPP-4 and GLP-1RA treatment was 7.5 (6.8, 8.8) and 7.1 (6.5, 8.3) respectively.

There were no missing data on HbA1c at baseline by design. The proportions of missing

HbA1c data for patients with a minimum 12 and 24 months of treatment are presented for every

6 months of follow-up in Table 1. Among patients with a minimum treatment duration of 12

months, proportions of missing HbA1c ranged from 28% to 32%. Similar missing proportions (28-34%) were observed over 24 months of follow-up in patients with a minimum of 24 months of treatment.

The possible association of various patient characteristics with the likelihood of missing

HbA1c in the study cohort at 6 and 12 months of follow-up is presented in Table 2. Age at

treatment initiation had significant influence on the likelihood of missing HbA1c. Among

patients treated with DPP-4, compared to younger patients (age quartile Q1), the odds of a missing HbA1c measure in older patients (Q2 to Q4) were lower by 16% to 30% and by 21% to 34% at 6 and 12 months of follow-up respectively. Similar results (20% to 38% lower odds) were observed in patients treated with GLP-1RA at 6 months.

Sex and pre-existing CVD or CKD did not have any influence on the likelihood of missingness

of HbA1c at 6 or 12 months of follow-up. Patients with HbA1c ≥ 7.5% at baseline had 12% (OR 95% CI: 0.83, 0.93) and 14% (OR 95% CI: 0.77, 0.97) lower odds of missing data at 6 months of follow-up in the DPP-4 and GLP-1RA groups respectively. However, this association was no longer apparent at 12 months of follow-up.
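The percentages quoted in this section follow directly from the odds ratios in Table 2, and the conversion can be written out explicitly. The helper name below is hypothetical; note that the result is a change in odds, not in probability.

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio to a rounded percentage change in odds."""
    return round((odds_ratio - 1) * 100)


# Odds ratios for missingness at 6 months, as reported in Table 2.
print(percent_change_in_odds(0.88))  # -12: baseline HbA1c >= 7.5%, DPP-4
print(percent_change_in_odds(0.86))  # -14: baseline HbA1c >= 7.5%, GLP-1RA
print(percent_change_in_odds(0.70))  # -30: age quartile Q4 vs Q1, DPP-4
```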

There was no difference in the distributions of follow-up imputed HbA1c data with the three

imputation methods and the complete case analyses (Table 3 and Figure 1). The estimates of

unadjusted and adjusted changes in HbA1c during follow-up were also similar with all

imputation approaches, and there was no difference in these estimates with the CC analyses

(Table 3 and Figure 2).

Among patients with HbA1c ≥ 7.5% at baseline, the proportions of patients identified to have

reduced HbA1c to ≤ 7% at 6 and 12 months of follow-up were similar using all imputation approaches (Table 3). While making clinical inference on the likelihood of reducing HbA1c below 7% in the GLP-1RA group, compared to those treated with DPP-4, there was no disagreement among the three imputation approaches, and this inference was also in line with the analysis of complete cases.

Figure 1 shows that there was no difference in distribution of HbA1c at 6 months and 12 months

among the three imputation approaches for patients treated with DPP-4. In patients treated with

GLP-1RA, at 6 months of follow-up, although MICE indicated a slightly leptokurtic

distribution due to its higher variability (SD = 1.4), there was no difference at 12 months. The density plots (Figure 2) for the change in HbA1c in the two treatment groups indicated no obvious difference between the three imputation methods in the DPP-4 group.

However, in the GLP-1RA group, the density plots obtained using MICE were leptokurtic at

both 6 and 12 months of follow-up. Supplementary Figure 1 showed that patients treated with

DPP-4 had a higher mean HbA1c level at baseline and maintained it during follow-up

compared to patients treated with GLP-1RA. No significant difference in the trajectories of

HbA1c over 24 months of follow-up was observed between CC analyses and the three

imputation techniques for both DPP-4 and GLP-1RA.
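The term leptokurtic used above refers to a distribution with positive excess kurtosis, that is, a sharper peak and heavier tails than the normal distribution. A minimal sketch of the moment-based calculation is given below; the function name and the two toy samples are hypothetical and unrelated to the study data.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: fourth central moment / variance^2 - 3."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3


peaked = [-2, 0, 0, 0, 0, 0, 0, 2]  # most mass at the centre, a few extremes
flat = [-1, -1, 1, 1]               # no central peak
print(excess_kurtosis(peaked))  # 1.0  (leptokurtic)
print(excess_kurtosis(flat))    # -2.0 (platykurtic)
```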

DISCUSSION

A novel component of this study is the investigation of the association of the likelihood of missingness of follow-up risk factor measures (HbA1c) with patients’ demographic and clinical characteristics (age, sex, pre-existing comorbidities and disease severity (baseline HbA1c)). The results clearly indicated that missingness in the follow-up risk factor data is less likely in older patients, irrespective of the drug they are taking for glycaemic control. We also observed that patients with higher disease severity (HbA1c above 7.5% at baseline) are more likely to visit their GP or primary care provider at 6 months post treatment intensification/therapy titration. This association disappears over longer follow-up, which is likely to be a result of the effectiveness of the therapies in terms of better risk factor control. This extensive assessment informs on the random and non-random patterns of missingness. However, it is very difficult to identify and distinguish random and non-random missing patterns in EMRs, and to model them accordingly to obtain robust inference(s) through imputations.

Another novel component of our study is the comparative assessment of the usability and

robustness of using imputed data for making clinical inferences in comparative effectiveness

studies at the population level using large real-world EMRs. We observed that the inferences

drawn on the risk factor changes in this pharmaco-epidemiological study are similar between

CC and imputed data based analyses. More importantly, the clinical contexts of evaluating the

effectiveness of the therapies, using continuous measures of risk factors or clinical

categorisation of the therapeutic achievements, were well supported with confidence in making

robust inferences using different methods of imputation. The EMR database presents a formidable challenge in that the “missing data” have an intermittent pattern of missingness over time (non-monotone) and are NMAR, so approaches such as CC produce biased and statistically inefficient results [38].

As expected in the primary care based EMR, a large number of patients had missing HbA1c in

the 6-monthly follow-up data over 24 months. We observed that, in estimating changes in HbA1c at 6 and 12 months from baseline, MICE produced marginally lower estimates than the other two methods and a slightly leptokurtic density plot. In almost all instances, Two-fold and MCMC performed similarly. Its simplified approach of imputing missing values at a given time by using values at nearby times makes Two-fold imputation a more attractive technique in the context of EMR databases. This automatically reduces the complexity of the imputation models, and the collinearity and overfitting issues [35]. Furthermore, measurements further away in time may sometimes provide independent information compared to the adjacent time points. In these circumstances, we have to be careful in using Two-fold imputation with small time window widths. Possibly a greater time window width may solve this problem.


The MAR approach assumes that the missing observations are not a simple random sample from the same sampling distribution as the observed values; rather, the probability of missingness depends only on the observed data [19]. In our study, the distributions of the imputed data obtained using these three imputation methodologies were similar to those of the CC data because all three techniques share the underlying theory of multiple imputation and are based upon the MAR assumption. The missing outcome measure data in this context also raise the issue of a form of indication bias: patients with better glycaemic control are less represented in the follow-up outcome measure data. In this case, any analysis of the CC is highly likely to bias the result towards those who are doing poorly in terms of glycaemic control, as observed in this study.

Kim (2004) showed that under the regression model, the bias of the multiple imputation

variance estimator decreases with large sample size [39]. We have a large sample size in our study, hence we had an almost unbiased variance estimator for the imputed data. Clearly, robust statistical analytical techniques applied to imputed data are highly likely to produce more robust and reliable clinical inferences than those based on the CC analyses.

In this context, the use of all three multiple imputation techniques (MICE, Two-fold and

MCMC) to impute for a relatively large proportion of missing primary outcome data, under

unknown patterns of missingness, appears to be valid and able to provide consistent and reliable

clinical inferences.

ACKNOWLEDGEMENTS

University of Melbourne gratefully acknowledges the support from the Australian

Government’s National Collaborative Research Infrastructure Strategy (NCRIS) initiative

through Therapeutic Innovation Australia. No separate funding was obtained for this study.

OM acknowledges the PhD scholarship from Queensland University of Technology,

Australia, and her co-supervisors Prof. Ross Young and Prof. Louise Hafner of the same

University.



SKP has acted as a consultant and/or speaker for Novartis, GI Dynamics, Roche, AstraZeneca,

Guangzhou Zhongyi Pharmaceutical and Amylin Pharmaceuticals LLC. He has received grants

in support of investigator and investigator-initiated clinical studies from Merck, Novo Nordisk,

AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis and Pfizer. MS, OM and JT

have no conflict of interest to declare.


REFERENCES

1. Sagreiya, H. and R.B. Altman, The utility of general purpose versus specialty clinical databases for research: warfarin dose estimation from extracted clinical variables. Journal of biomedical informatics, 2010. 43(5): p. 747-751.

2. Shivade, C., et al., A review of approaches to identifying patient phenotype cohorts using electronic health records. Journal of the American Medical Informatics Association, 2014. 21(2): p. 221-230.

3. Montvida, O., et al., Addition of or switch to insulin therapy in people treated with glucagon-like peptide-1 receptor agonists: A real-world study in 66 583 patients. Diabetes, Obesity and Metabolism, 2016: p. n/a-n/a.

4. Paul, S.K., et al., Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovascular Diabetology, 2015. 14(1): p. 100.

5. Badve, S.V., et al., The Association between Body Mass Index and Mortality in Incident Dialysis Patients. PLoS One, 2014. 9(12): p. e114897.

6. Thomas, G., et al., Obesity paradox in people newly diagnosed with type 2 diabetes with and without prior cardiovascular disease. Diabetes Obes Metab, 2013. 16(4): p. 317-25.

7. Jerez, J.M., et al., Missing data imputation using statistical and machine learning methods in a real breast cancer problem. Artificial Intelligence in Medicine, 2010. 50(2): p. 105-115.

8. Biering, K., N.H. Hjollund, and M. Frydenberg, Using multiple imputation to deal with missing data and attrition in longitudinal studies with repeated measures of patient-reported outcomes. Clin Epidemiol, 2015. 7: p. 91-106.

9. Thomas, G., K. Klein, and S. Paul, Statistical challenges in analysing large longitudinal patient-level data: The danger of misleading clinical inferences with imputed data. Journal of the Indian Society of Agricultural Statistics, 2014. 68(2): p. 39-54.

10. Sterne, J.A.C., et al., Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. BMJ, 2009. 338: p. 157-160.

11. Little, R.J., et al., The Prevention and Treatment of Missing Data in Clinical Trials. New England Journal of Medicine, 2012. 367(14): p. 1355-1360.

12. Wells, B.J., et al., Strategies for handling missing data in electronic health record derived data. EGEMS (Wash DC), 2013. 1(3): p. 1035.

13. Madden, J.M., et al., Missing clinical and behavioral health data in a large electronic health record (EHR) system. J Am Med Inform Assoc, 2016.

14. Mackinnon, A., The use and reporting of multiple imputation in medical research - a review. J Intern Med, 2010. 268(6): p. 586-93.

15. Lin, J.H. and P.J. Haug, Exploiting missing clinical data in Bayesian network modeling for predicting medical problems. J Biomed Inform, 2008. 41(1): p. 1-14.

16. Marston, L., et al., Issues in multiple imputation of missing data for large general practice clinical databases. Pharmacoepidemiol Drug Saf, 2010. 19(6): p. 618-26.

17. Rubin, D.B. and N. Schenker, Multiple imputation in health-care databases: an overview and some applications. Stat Med, 1991. 10(4): p. 585-98.

18. Spratt, M., et al., Strategies for multiple imputation in longitudinal studies. Am J Epidemiol, 2010. 172(4): p. 478-87.

19. Rubin, D.B., Inference and missing data. Biometrika, 1976. 63(3): p. 581-592.


20. Bounthavong, M., J.H. Watanabe, and K.M. Sullivan, Approach to addressing missing data for electronic medical records and pharmacy claims data research. Pharmacotherapy, 2015. 35(4): p. 380-7.

21. Yuan, Y.C., Multiple imputation for missing data: concepts and new developments (version 9.0). 2010, SAS Institute Inc.: Rockville.

22. Carpenter, J. and M. Kenward, Multiple Imputation and its Application. 2013, Wiley.

23. Wells, B.J., et al., Strategies for Handling Missing Data in Electronic Health Record Derived Data. eGEMs, 2013. 1(3): p. 1035.

24. Madden, J.M., et al., Missing clinical and behavioral health data in a large electronic health record (EHR) system. J Am Med Inform Assoc, 2016. 23(6): p. 1143-1149.

25. Montvida, O., et al., Addition of or switch to insulin therapy in people treated with glucagon-like peptide-1 receptor agonists: A real-world study in 66 583 patients. Diabetes Obes Metab, 2017. 19(1): p. 108-117.

26. Centers for Disease Control and Prevention, National diabetes statistics report: estimates of diabetes and its burden in the United States, 2014. Atlanta, GA: US Department of Health and Human Services, 2014.

27. Crawford, A.G., et al., Comparison of GE Centricity Electronic Medical Record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Popul Health Manag, 2010. 13(3): p. 139-50.

28. Brixner, D., et al., Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value in health, 2007. 10(s1): p. S29-S36.

29. Paul, S.K., et al., Weight gain in insulin treated patients by BMI categories at treatment initiation: New evidence from real-world data in patients with type 2 diabetes. Diabetes, Obesity and Metabolism, 2016.

30. Montvida, O., et al., Data Mining Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic Medical Records. Open Bioinformatics Journal, 2017. 10: p. 1-15.

31. van Buuren, S. and K. Groothuis-Oudshoorn, mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 2011. 45(3): p. 1-67.

32. White, I.R., R. Daniel, and P. Royston, Avoiding bias due to perfect prediction in multiple imputation of incomplete categorical variables. Comput Stat Data Anal, 2010. 54(10): p. 2267-2275.

33. van Buuren, S., Multiple imputation of discrete and continuous data by fully conditional specification. Stat Methods Med Res, 2007. 16(3): p. 219-42.

34. Raghunathan, T.E., et al., A Multivariate Technique for Multiply Imputing Missing Values Using a Sequence of Regression Model. Survey Methodology, 2001. 27(1): p. 85-95.

35. Welch, C.A., et al., Evaluation of two-fold fully conditional specification multiple imputation for longitudinal electronic health record data. Statistics in Medicine, 2014. 33(21): p. 3725-3737.

36. Welch, C., J. Bartlett, and I. Petersen, Application of multiple imputation using the two-fold fully conditional specification algorithm in longitudinal clinical data. Stata J, 2014. 14(2): p. 418-431.

37. Cattaneo, M.D., Efficient semiparametric estimation of multi-valued treatment effects under ignorability. Journal of Econometrics, 2010. 155(2): p. 138-154.

38. Little, R.J.A. and D.B. Rubin, Statistical Analysis with Missing Data. Second ed. 2002: Wiley-Interscience.

39. Kim, J.K., Finite sample properties of multiple imputation estimators. The Annals of Statistics, 2004. 32(2): p. 766-783.


Table 1: Basic statistics and missingness of HbA1c (%) by treatment group with a minimum 12 months treatment duration.

Characteristic | Minimum 12-months treatment duration: DPP-4 | Minimum 12-months treatment duration: GLP1-RA | Minimum 24-months treatment duration: DPP-4 | Minimum 24-months treatment duration: GLP1-RA
N | 38483 | 8977 | 23859 | 5312
Age in years𝜙 | 58 (12) | 54 (11) | 58 (12) | 54 (11)
Maleη | 18721 (49) | 3136 (35) | 11672 (49) | 1834 (35)
Ethnicityη | | | |
Asian | 828 (2) | 100 (1) | 541 (2) | 65 (1)
Black | 4393 (11) | 711 (8) | 2682 (11) | 412 (8)
Indian (American) | 483 (1) | 47 (1) | 343 (1) | 27 (1)
Native Hawaiian | 67 (0) | 19 (0) | 49 (0) | 10 (0)
Unknown | 5507 (14) | 1192 (13) | 3343 (14) | 667 (13)
White | 27205 (71) | 6908 (77) | 16901 (71) | 4131 (78)
HbA1c(%)γ | | | |
Baseline | 8.1 (6.5, 18.8) | 7.7 (6.5, 16.5) | 8.1 (6.5, 18.8) | 7.7 (6.5, 16.5)
6 months | 7.1 (5, 17.7) | 6.9 (5, 15.9) | 7.1 (5, 17.7) | 6.9 (5, 15.9)
12 months | 7.2 (5, 17.9) | 7.0 (5, 15.6) | 7.2 (5, 17.9) | 7.0 (5, 15.6)
18 months | - | - | 7.2 (5, 17.5) | 7.1 (5, 17.4)
24 months | - | - | 7.3 (5, 17.9) | 7.1 (5, 16.7)
HbA1c(%)ω | | | |
Baseline | 7.5 (6.8, 8.8) | 7.1 (6.5, 8.3) | 7.5 (6.8, 8.8) | 7.1 (6.5, 8.3)
6 months | 6.8 (6.3, 7.5) | 6.6 (6, 7.4) | 6.8 (6.3, 7.5) | 6.6 (6, 7.4)
12 months | 6.9 (6.3, 7.7) | 6.6 (6, 7.5) | 6.9 (6.3, 7.7) | 6.6 (6, 7.5)
18 months | - | - | 6.9 (6.3, 7.7) | 6.7 (6.1, 7.6)
24 months | - | - | 6.9 (6.3, 7.8) | 6.8 (6.1, 7.6)
HbA1c(%)η – Missingness | | | |
Baseline | 0 | 0 | 0 | 0
6 months | 6622 (28) | 1553 (31) | 4110 (28) | 883 (29)
12 months | 6904 (30) | 1643 (32) | 4093 (28) | 945 (31)
18 months | - | - | 4518 (31) | 1032 (33)
24 months | - | - | 4680 (32) | 1050 (34)

Unless stated otherwise; α: Mean (95% CI); 𝜙: Mean (SD); η: N (%); ω: Median (IQR); γ: Mean (Min, Max)

91

Pag

e 18

of

23

Table 2: Odds ratio (95% CI) and p-values for likelihood of missingness of HbA1c (%) measure at 6 and 12 months of follow-up in patients treated with DPP-4 and GLP1-RA adjusted for age quartiles (Q1, Q2, Q3, Q4), sex, pre-existing CVD and CKD and patients with baseline HbA1c ≥ 7.5%.

Covariate | 6 months follow-up: DPP-4 | p-value | 6 months follow-up: GLP1-RA | p-value | 12 months follow-up: DPP-4 | p-value | 12 months follow-up: GLP1-RA | p-value
Age Quartiles
Q2 | 0.84 (0.78, 0.90) | <0.001 | 0.80 (0.70, 0.91) | <0.001 | 0.79 (0.74, 0.85) | <0.001 | 0.69 (0.60, 0.78) | <0.001
Q3 | 0.75 (0.70, 0.81) | <0.001 | 0.68 (0.58, 0.80) | <0.001 | 0.74 (0.69, 0.80) | <0.001 | 0.61 (0.52, 0.71) | <0.001
Q4 | 0.70 (0.64, 0.76) | <0.001 | 0.69 (0.55, 0.88) | <0.001 | 0.66 (0.60, 0.72) | <0.001 | 0.62 (0.50, 0.78) | <0.001
Male vs Female | 1.02 (0.97, 1.08) | 0.48 | 1.05 (0.94, 1.19) | 0.39 | 1.03 (0.98, 1.09) | 0.28 | 0.87 (0.78, 0.98) | 0.024
Cardiovascular Disease (CVD) | 1.06 (0.98, 1.14) | 0.16 | 1.03 (0.86, 1.23) | 0.76 | 1.08 (1.00, 1.16) | 0.06 | 1.13 (0.95, 1.35) | 0.17
Chronic Kidney Disease (CKD) | 1.01 (0.88, 1.16) | 0.88 | 1.02 (0.70, 1.50) | 0.30 | 0.92 (0.80, 1.06) | 0.26 | 0.99 (0.68, 1.43) | 0.95
HbA1c ≥ 7.5% at Baseline | 0.88 (0.83, 0.93) | <0.001 | 0.86 (0.77, 0.97) | 0.011 | 1.00 (0.94, 1.05) | 0.81 | 0.94 (0.84, 1.05) | 0.24


Table 3: (i) Mean (SD) for HbA1c (%) at 6 and 12 months by treatment group for complete

case and on imputed data by three imputation methods (ii) Change in HbA1c (%) at 6 and 12

months from baseline by treatment group for unadjusted analyses and adjusted for age, gender,

baseline HbA1c (%), diabetes duration, and time to second-line ADD as measured on complete

cases and on imputed data; (iii) Among patients with HbA1c ≥ 7.5% at the index date, the

proportions of patients who reduced HbA1c to ≤ 7% at 6 and 12 months during follow-up, by

treatment groups, for complete cases and imputed data; (iv) Odds ratio for HbA1c ≤ 7% in

patients treated with GLP-1RA compared to DPP-4 group adjusted for baseline HbA1c (%),

age, gender, diabetes duration, time to second-line ADD and whether a third-line ADD was

started within 6 (or 12) months.

Method | 6 months follow-up: DPP-4 | 6 months follow-up: GLP1-RA | 12 months follow-up: DPP-4 | 12 months follow-up: GLP1-RA
HbA1c (%)
CC | 7.1 (1.3) | 6.9 (1.3) | 7.2 (1.4) | 7.0 (1.4)
MICE | 7.1 (1.3) | 6.9 (1.4) | 7.2 (1.3) | 7.0 (1.4)
Two-fold | 7.1 (1.3) | 6.9 (1.3) | 7.2 (1.3) | 7.0 (1.4)
Bayesian MCMC | 7.1 (1.3) | 6.9 (1.3) | 7.2 (1.3) | 7.0 (1.4)
Change in HbA1c (%)α – Unadjusted
CC | -1.05 (-1.07, -1.02) | -0.89 (-0.93, -0.84) | -0.91 (-0.94, -0.89) | -0.73 (-0.77, -0.68)
MICE | -1.00 (-1.02, -0.98) | -0.84 (-0.87, -0.80) | -0.91 (-0.93, -0.89) | -0.71 (-0.75, -0.69)
Two-fold | -1.00 (-1.02, -0.98) | -0.84 (-0.87, -0.80) | -0.92 (-0.94, -0.9) | -0.72 (-0.76, -0.69)
Bayesian MCMC | -1.00 (-1.02, -0.98) | -0.85 (-0.88, -0.81) | -0.92 (-0.94, -0.9) | -0.72 (-0.76, -0.69)
Change in HbA1c (%)α – Adjusted
CC | -1.14 (-1.16, -1.11) | -0.98 (-1.02, -0.94) | -1.00 (-1.03, -0.98) | -0.81 (-0.85, -0.76)
MICE | -1.09 (-1.11, -1.07) | -0.93 (-0.96, -0.89) | -1.00 (-1.02, -0.98) | -0.79 (-0.83, -0.75)
Two-fold | -1.09 (-1.11, -1.07) | -0.92 (-0.95, -0.89) | -1.01 (-1.03, -0.99) | -0.80 (-0.84, -0.77)
Bayesian MCMC | -1.09 (-1.11, -1.07) | -0.93 (-0.96, -0.89) | -1.00 (-1.02, -0.98) | -0.80 (-0.84, -0.77)
Patients with HbA1c ≥ 7.5% at Baseline and ≤ 7% at follow-upη
CC | 5193 (48) | 1043 (49) | 4768 (46) | 956 (48)
MICE | 6142 (45) | 1312 (47) | 5673 (43) | 1238 (45)
Two-fold | 6221 (45) | 1294 (46) | 5883 (43) | 1226 (45)
Bayesian MCMC | 6112 (45) | 1311 (47) | 5774 (43) | 1239 (45)
Odds Ratio - Adjusted α
CC | Ref | 1.03 (1, 1.06) | Ref | 1.02 (0.99, 1.04)
MICE | Ref | 1.03 (1, 1.06) | Ref | 1.02 (1, 1.04)
Two-fold | Ref | 1.03 (1, 1.06) | Ref | 1.01 (1, 1.04)
Bayesian MCMC | Ref | 1.03 (1, 1.05) | Ref | 1 (0.99, 1.03)

Unless stated otherwise; α: Mean (95% CI); 𝜙: Mean (SD); η: N (%); ω: Median (IQR); γ: Mean (Min, Max)


Figure 1: Distribution of HbA1c (%) at 6 months and 12 months for DPP-4 and GLP-1RA

respectively for complete case, MICE, Two-fold and MCMC imputation.


Figure 2: Distribution of change in HbA1c (%) (∆HbA1c) at 6 months and 12 months for

DPP-4 and GLP-1RA respectively for complete case, MICE, Two-fold and MCMC

imputation.


Supplementary Figure 1: Trajectory plot for mean (95% CI) HbA1c (%) at baseline and follow-

up for two treatment groups for complete case and on imputed data by three imputation

methods.


Chapter 7: Trends in Anti-diabetic Drug

Prescribing Patterns



Olga Montvida, Jonathan Shaw, John J Atherton, Francis Stringer, Sanjoy K Paul. Long-term

Trends in Antidiabetes Drug Usage in the US: Real-world Evidence in Patients Newly

Diagnosed With Type 2 Diabetes. Diabetes Care. 2017 Nov 6:dc171414.


Olga Montvida: Conceived the idea and was responsible for the primary design of the study. Conducted the data extraction and statistical analyses. Developed first draft and contributed

Jonathan Shaw: Contributed significantly to the study design and manuscript development.

John J Atherton: Contributed significantly to the study design and manuscript development.

Francis Stringer: Contributed to the interpretation of the results and manuscript development.



Signature

Sanjoy K. Paul Conceived the idea and was responsible for the primary



of the manuscript.



authorship.

Sanjoy Ketan Paul ___ 29.06.2018

Name Signature Date



Long-term Trends in Antidiabetes Drug Usage in the U.S.: Real-world Evidence in Patients Newly Diagnosed With Type 2 Diabetes

Diabetes Care 2018;41:69–78 | https://doi.org/10.2337/dc17-1414

OBJECTIVE

To explore temporal trends in antidiabetes drug (ADD) prescribing and intensification patterns, along with glycemic levels and comorbidities, and possible benefits of novel ADDs in delaying the need for insulin initiation in patients diagnosed with type 2 diabetes.

RESEARCH DESIGN AND METHODS

Patients with type 2 diabetes aged 18–80 years, who initiated any ADD, were selected (n = 1,023,340) from the U.S. Centricity Electronic Medical Records. Those who initiated second-line ADD after first-line metformin were identified (subcohort 1, n = 357,482); the third-line therapy choices were further explored.

RESULTS

From 2005 to 2016, first-line use increased for metformin (60–77%) and decreased for sulfonylureas (20–8%). During a mean follow-up of 3.4 years postmetformin, 48% initiated a second ADD at a mean HbA1c of 8.4%. In subcohort 1, although sulfonylurea usage as second-line treatment decreased (60–46%), it remained the most popular second ADD choice. Use increased for insulin (7–17%) and dipeptidyl peptidase-4 inhibitors (DPP-4i) (0.4–21%). The rates of intensification with insulin and sulfonylureas did not decline over the last 10 years. The restricted mean time to insulin initiation was marginally longer in second-line DPP-4i (7.1 years) and in the glucagon-like peptide 1 receptor agonist group (6.6 years) compared with sulfonylurea (6.3 years, P < 0.05).

CONCLUSIONS

Most patients initiate second-line therapy at elevated HbA1c levels, with highly heterogeneous clinical characteristics across ADD classes. Despite the introduction of newer therapies, sulfonylureas remained the most popular second-line agent, and the rates of intensification with sulfonylureas and insulin remained consistent over time. The incretin-based therapies were associated with a small delay in the need for therapy intensification compared with sulfonylureas.

A broad choice of “old” and “new” antidiabetes drugs (ADDs) is available, which differ not only in their mechanisms of action but also in their glycemic and extraglycemic effects (1). While treatment guidelines for type 2 diabetes are regularly updated based on new evidence, real-world prescription trends may also be driven by other factors, such as medication costs, side effect profile, and provider and patient preferences.

1Statistics Unit, QIMR Berghofer Medical Research Institute, Brisbane, Australia
2Faculty of Health, School of Biomedical Sciences, Queensland University of Technology, Brisbane, Australia
3Baker Heart and Diabetes Institute, Melbourne, Australia
4Cardiology Department, Royal Brisbane and Women’s Hospital, and University of Queensland School of Medicine, Brisbane, Australia
5Model Answers Pty Ltd, Brisbane, Australia
6Melbourne EpiCentre, University of Melbourne, Melbourne, Australia

Corresponding author: Sanjoy K. Paul, [email protected].

Received 14 July 2017 and accepted 25 September 2017.

This article contains Supplementary Data online at http://care.diabetesjournals.org/lookup/suppl/doi:10.2337/dc17-1414/-/DC1.

This article is featured in a podcast available at http://www.diabetesjournals.org/content/diabetes-core-update-podcasts.

© 2017 by the American Diabetes Association. Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at http://www.diabetesjournals.org/content/license.

Olga Montvida,1,2 Jonathan Shaw,3

John J. Atherton,4 Frances Stringer,5 and

Sanjoy K. Paul1,6

Diabetes Care Volume 41, January 2018 69

EPIDEMIOLOGY/HEALTH SERVICES RESEARCH


With the development of new classes of antidiabetes therapies since 2005, including incretin-based drugs and sodium–glucose cotransporter 2 inhibitors (SGLT2i), the paradigm of therapy options for patients with highly heterogeneous glycemic and cardiovascular risk factors has changed significantly. However, the way in which this has occurred in real-world practice, especially in the trade-off between older and new classes of ADDs as initial and intensification therapy options, has not been studied thoroughly.

The newer ADDs have been shown to be associated with significantly lower risk of hypoglycemia compared with the sulfonylureas (SU) and insulin (INS) (2). The weight neutrality or benefits of weight reductions have also been well established for new therapies including the incretins (3,4). Given the glycemic and extraglycemic benefits of these agents, one would expect a fall in the use of SU or INS as intensification therapies. However, studies evaluating the possible benefits of using newer ADDs in terms of delaying the need for INS are scarce (5,6). In this context, understanding the changing patterns of therapy initiation and intensifications with second- and third-line therapies, in conjunction with the heterogeneous patients’ characteristics, is a fundamental background requirement.

Cohen et al. (7) explored ADD-prescribing patterns in the U.S. from 1997 to 2000 and reported decreasing use of SU and increasing trends in metformin (MET) and thiazolidinedione (TZD) prescription over time. Utilizing a claims database, Desai et al. (8) reported an increasing proportional share of MET and decreasing prescriptions for TZD between 2006 and 2008. One of the reasons for the reduction in the use of TZD was the safety concerns (9–12). While Berkowitz et al. (13) evaluated treatment initiations with MET, SU, and dipeptidyl peptidase-4 inhibitors (DPP-4i) between 2009 and 2013 in the U.S., utilization patterns of other ADDs and the changing utilization trend over time were not explored (14). Lipska et al. (15) have evaluated the temporal trend in the use of ADDs from 2006 to 2013 using claims data from the U.S. A small number of studies have explored the clinical characteristics of patients according to the type of ADD prescribed, but only over relatively short time periods, and did not evaluate treatment intensification with second- or third-line ADDs over a long period of time (16,17).

To the best of our knowledge, the progressive changes in the proportional distributions across all new and old ADDs, and the patterns and determinants of therapy intensification with second- and third-line ADDs, have not been explored comprehensively in any study. With recognition of the growing disease burden and increasing volumes of dispensed medications (18–21), the primary aim of this study was to provide a comprehensive up-to-date exploration of the treatment pattern changes for type 2 diabetes in the U.S. using the nationally representative Centricity Electronic Medical Records (CEMR) from primary and secondary ambulatory care systems. Specifically, the aims were to 1) explore temporal changes in prescribing patterns from 2005 to 2016 with respect to the drug initiation order, 2) explore therapy intensification with second and third ADDs, 3) explore patient characteristics including risk factors and comorbidities according to ADD therapy prescribed, 4) explore the temporal patterns in the rates of intensification with SU and INS, and 5) evaluate whether use of incretin-based therapies as second-line therapy delays the need for intensification with third-line ADDs and with INS any time during follow-up.


Data Source

CEMR represents a variety of ambulatory and primary care medical practices, including solo practitioners, community clinics, academic medical centers, and large integrated delivery networks in the U.S. More than 34 million individuals’ longitudinal electronic medical records (EMRs) were available from 1995 to April 2016. More than 35,000 physicians and other providers from all U.S. states contribute to the CEMR, of whom ~75% are primary care providers. The database is generally representative of the U.S. population, with a diabetes prevalence of 7.1% (identified by diagnostic codes) that is similar to the national diabetes prevalence of 6.7% (diagnosed diabetes in 2014) (14). The CEMR has been used extensively for academic research worldwide (3,22,23).

This database contains comprehensive patient-level information on demographic, anthropometric, clinical, and laboratory variables including age, sex, ethnicity, and longitudinal measures of BMI, blood pressure, glycated hemoglobin (HbA1c), and lipids. All disease events along with dates are coded with ICD-9, ICD-10, or SNOMED CT codes. Medication data include brand names and doses for individual medications prescribed, along with start/stop dates and specific fields to track treatment alterations. This data set also contains patient-reported medications, including prescriptions received outside the EMR network and over-the-counter medications.

Methods

Eleven antidiabetes therapeutic classes were considered in this study: MET, SU, TZD, α-glucosidase inhibitors (AGI), amylin, dopamine receptor agonists (DOPRA), meglitinides (MEG), DPP-4i, glucagon-like peptide 1 receptor agonists (GLP-1RA), SGLT2i, and INS. For each patient, these ADDs were arranged chronologically according to the initiation dates. Same-day initiations (including combination therapies) were prioritized in the order as listed above, with highest order priority assigned to MET and lowest to INS. Additions or switches were defined by comparing stop dates and start dates of corresponding therapies. Details on the medication data structure, associated data-mining challenges, and description of an algorithm applied to extract and aggregate patient-level medication data from CEMR have recently been published (24).

For convenience, AGI, amylin, DOPRA, and MEG were combined into the “other” category. Saxenda (a version of liraglutide) was excluded from the GLP-1RA list, as it was approved in 2014 for weight lowering and not as an ADD (25). Although Welchol (colesevelam) was approved for the treatment of type 2 diabetes, it was mainly prescribed to reduce cholesterol levels; therefore, we did not include colesevelam in our analyses (18).

Patients with diabetes were identified on the basis of diagnostic codes; those with a diagnosis of type 1 diabetes or only gestational diabetes mellitus were identified and excluded. For identified patients with type 2 diabetes, the following inclusion criteria were applied: 1) age at diagnosis ≥18 and <80 years, 2) diagnosis date strictly after first registered activity in the database, 3) diagnosis date on or after 1 January 2005, and 4) initiation of any ADD.

Demographic variables included sex, age, and ethnicity. HbA1c values on the date of diagnosis and first-, second-, and third-line ADD therapy initiation were obtained as the closest observations within a 3-month window. Body weight, BMI, systolic/diastolic blood pressure (SBP/DBP), lipids (LDL, HDL, and triglycerides), and heart rate were calculated as the average of available measurements within a 3-month window of the diagnosis or ADD initiation date. Obesity was defined as BMI ≥30 kg/m2.

The presence of comorbidities prior to the first and second drug initiation was explored. Cardiovascular disease (CVD) was defined as ischemic heart disease (including myocardial infarction), peripheral vascular disease, heart failure, or stroke. Cancer was defined as any malignancy except malignant neoplasm of skin. Charlson Comorbidity Index was defined and calculated following the algorithm described by Quan et al. (26).
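The chronological ordering with same-day prioritization described in Methods can be illustrated with a minimal sketch. This is not the authors' published extraction algorithm (24): the function name `order_initiations`, the input layout of (class, start date) pairs, and the choice to keep only the earliest start date per class are assumptions made purely for illustration.

```python
from datetime import date

# Same-day priority in the order listed in Methods (MET highest, INS lowest).
PRIORITY = ["MET", "SU", "TZD", "AGI", "amylin", "DOPRA", "MEG",
            "DPP-4i", "GLP-1RA", "SGLT2i", "INS"]
RANK = {drug_class: i for i, drug_class in enumerate(PRIORITY)}


def order_initiations(initiations):
    """Return (class, first start date) pairs sorted by date, then priority.

    `initiations` is a hypothetical list of (drug_class, start_date) records
    for one patient; only the earliest start per class is kept (assumption).
    """
    first_start = {}
    for drug_class, start in initiations:
        if drug_class not in first_start or start < first_start[drug_class]:
            first_start[drug_class] = start
    # Ties on the start date are broken by the class priority rank.
    return sorted(first_start.items(), key=lambda kv: (kv[1], RANK[kv[0]]))


example = [
    ("INS", date(2010, 3, 1)),
    ("MET", date(2010, 3, 1)),     # same day as INS: MET is ranked first
    ("DPP-4i", date(2012, 6, 15)),
    ("MET", date(2011, 1, 10)),    # later MET record, not a new initiation
]
ordered = order_initiations(example)
print([drug_class for drug_class, _ in ordered])
```

In this invented example MET counts as the first-line ADD, INS as the second, and DPP-4i as the third, although MET and INS share a start date.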

Statistical Methods

The characteristics of patients were summarized by ADD classes: at first prescription and at second ADD initiation when added to MET. Separate analyses were conducted to explore the pattern of addition or switch to third ADD by major classes of second-line ADDs. Study variables were summarized as number (%), mean (SD), or median (first quartile [Q1], third quartile [Q3]) as appropriate. In patients who had a second-line ADD added after MET, and had at least 1 year of follow-up post–second-line initiation, the “restricted mean survival time” estimation approach was used to compare the mean time to the third ADD/INS initiation among major second-line ADD groups. This method computes survival time as time to third ADD/INS if initiated, and otherwise as time to the end of follow-up (date of patient’s last available record within the database). Standard life table methods were used to estimate rates per 100 person-years (95% CI) of second-line ADD, INS, and SU initiations in patients with a minimum of 1 year follow-up post–MET initiation.
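As a rough illustration of the restricted mean survival time idea, the sketch below computes the area under a Kaplan-Meier curve up to a truncation time `tau`. It assumes the Kaplan-Meier estimator as the underlying survival curve, which the text does not specify; the function names and the toy data are hypothetical and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  follow-up time per patient (time to third ADD/INS, or to the
            end of follow-up if no initiation was observed).
    events: 1 if the initiation was observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            surv *= 1.0 - observed / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving
    return curve


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function from 0 to tau."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)


# Toy data in years: two observed initiations, two censored patients.
print(rmst([2, 4, 6, 8], [1, 0, 1, 0], tau=8))
```

With no censoring and `tau` at the largest observed time, the result reduces to the ordinary mean of the event times, which is a convenient sanity check.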

RESULTS

From the 2,624,954 patients identified with type 2 diabetes, 2,590,853 were aged ≥18 and <80 years at the date of diagnosis with mean/median 3.9/2.7 years of follow-up. Of these patients, 1,305,686 (50%) were newly diagnosed after 1 January 2005, while 1,023,340 patients initiated any ADD (study cohort) (Supplementary Fig. 1) during the available mean/median 3.4/2.8 years of follow-up time. In the study cohort, 46% were male, mean (SD) age was 58 years (13), and 68% were white Caucasians, 12% blacks, and 2% Asians (Table 1).

First ADD

Prescription Pattern Changes Over Time

Figure 1A presents the proportional distribution of the first-line ADD by year of initiation. The proportional share of MET as the first choice increased consistently from 60% in 2005 to 77% in 2016 (first quarter of the year). SU’s share declined from 20 to 8%, while INS’s share ranged from 8 to 10%. Starting at 11% in 2005, TZD’s proportional share dropped progressively to 0.7% in 2016. Other drugs were chosen as first-line in 3% of cases or less.

Patients’ Characteristics

In the study cohort of 1,023,340 patients, the distribution of prescription patterns for individual ADDs at any time from January 2005 to April 2016 and as the first ADD is presented in Table 1. The demographic and clinical characteristics of the patients, along with the prevalence of comorbidities at the time of first ADD initiation, are also presented in Table 1.

In the study cohort, 79% received MET any time during the recorded follow-up, and 72% received MET as the first ADD. The mean time to initiation of MET as the first ADD and the available follow-up time since initiation were 3.7 months and 3.3 years, respectively. The proportions of patients with HbA1c level ≥7.5% (58 mmol/mol) and ≥8% (64 mmol/mol) at MET initiation were 48% and 37%, respectively. Those who initiated with GLP-1RA, DPP-4i, TZD, and SU had a similar mean HbA1c of 8.0, 7.7, 7.8, and 8.0%, respectively (64, 61, 62, and 64 mmol/mol). INS was initiated at an average HbA1c of 8.9% (74 mmol/mol), with 71% and 59% having HbA1c ≥7.5% (58 mmol/mol) and ≥8% (64 mmol/mol), respectively.

Patients who initiated treatment with MET were younger (mean age 57 years, with 19% ≥70 years) than those who initiated with SU (mean age 64 years, 43% ≥70 years), with INS (mean 60 years, 29% ≥70 years), with TZD (mean 62 years, 32% ≥70 years), or with DPP-4i (mean 64 years, 39% ≥70 years). Those who had GLP-1RA and SGLT2i as the first ADD were younger, more likely to be white Caucasian, female, and obese, as compared with those who initiated with MET, DPP-4i, INS, TZD, or SU.
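The paired HbA1c values quoted in % and mmol/mol are consistent with the standard NGSP-to-IFCC master equation, IFCC (mmol/mol) ≈ 10.929 × (NGSP % − 2.15). The source does not state which conversion was applied, so the following is only a sketch under that assumption, with a hypothetical function name.

```python
def ngsp_to_ifcc(hba1c_percent):
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol).

    Uses the widely cited master equation; assumed here, not stated
    in the source text.
    """
    return round(10.929 * (hba1c_percent - 2.15))


# Thresholds and group means quoted in the text.
for pct in (7.5, 7.7, 7.8, 8.0, 8.4, 8.9):
    print(f"{pct}% -> {ngsp_to_ifcc(pct)} mmol/mol")
```

Under this equation the 7.5% and 8% thresholds map to 58 and 64 mmol/mol, matching the values reported alongside the percentages.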

Comorbidities

The proportions of patients with CVD, chronic kidney disease (CKD), cancer, or depression at first ADD initiation were 19%, 4%, 4%, and 11%, respectively. Those patients initiating therapy with INS had a significantly higher prevalence of CVD (27%, P < 0.01), CKD (11%, P < 0.01), and higher Charlson Comorbidity Index with mean (SD) of 1.84 (1.31), compared with those initiating with MET, DPP-4i, GLP-1RA, or TZD (Table 1 for comparative estimates).

Discontinuation of First ADD

Among patients with at least 1 year of follow-up (n = 813,826), the proportions of patients discontinuing the first-line ADD within 1 year by individual ADDs are presented in Table 1. While only 8% of patients discontinued MET within a year, 20%, 17%, and 25% of patients discontinued GLP-1RA, DPP-4i, and SGLT2i within a year, respectively.

Second ADD

Among 740,478 patients who initiated therapy with MET, 357,482 (48%, subcohort 1) (Supplementary Fig. 1) initiated a second ADD, with an annual mean rate of 10.7 initiations per 100 person-years (minimum 10.2, maximum 14.0) during a mean 3.3 years of available follow-up, at an average HbA1c level of 8.4% (68 mmol/mol), with 60% and 48% having HbA1c ≥7.5% (58 mmol/mol) and ≥8.0% (64 mmol/mol), respectively. The proportional share of second-line ADD (post-MET) over time is presented in Fig. 1B. The demographic and clinical characteristics of the patients along with the time to second ADD, and the prevalence of comorbidities at the time of second drug initiation, are presented in Table 2.

Although the proportional share of SU as a second-line therapy gradually decreased from 60 to 46% over time (Fig. 1B), it remained the most popular choice (53%) of therapy intensification post–MET initiation across the whole time period. SU was initiated as second-line ADD at an average HbA1c level of 8.4% (68 mmol/mol), with 62% and 49% having HbA1c ≥7.5% (58 mmol/mol) and ≥8.0% (64 mmol/mol), respectively (Table 2). Among patients with a second ADD and a minimum 1 year of follow-up, only


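Table 1. Patient characteristics at the time of first ADD initiation, by drug class in the study cohort (N = 1,023,340)

Characteristic | GLP-1RA | SGLT2i | MET | INS | TZD | DPP-4i | SU | Other§ | All
Any time during follow-up
n (% of N) | 68,522 (7) | 39,549 (4) | 808,518 (79) | 270,432 (26) | 109,754 (11) | 182,457 (18) | 354,367 (35) | 25,358 (2) | 1,023,340 (100)
Time to first prescription, months* | 22.32 (27.23) | 39.64 (33.48) | 5.3 (14.16) | 13.03 (22.54) | 8.93 (17.72) | 19.57 (25.56) | 11.09 (20.25) | 14.97 (23.56) | -
At first ADD
n2 (% of N) | 9,494 (1) | 1,935 (0.2) | 740,478 (72) | 93,078 (9) | 28,004 (3) | 25,005 (2) | 116,435 (11) | 8,911 (1) | -
Time to first prescription, months* | 3.82 (11.92) | 7.94 (18.68) | 3.73 (11.78) | 3.74 (11.06) | 2.33 (8.2) | 4.98 (13.65) | 3.84 (11.51) | 2.45 (9.66) | 3.73 (11.66)
Follow-up from ADD initiation, years* | 3.22 (2.5) | 0.98 (0.62) | 3.31 (2.54) | 2.94 (2.45) | 5.06 (3.02) | 2.79 (2.1) | 3.72 (2.7) | 3.86 (2.71) | 3.36 (2.58)
Follow-up ≥1 year from first ADD initiation, n3 (% of n2) | 7,400 (78) | 889 (46) | 589,246 (80) | 68,385 (73) | 25,105 (90) | 19,278 (77) | 95,854 (82) | 7,669 (86) | 813,826 (80)
Discontinuation within 1 year, n (% of n3) | 1,516 (20) | 225 (25) | 44,485 (8) | 3,359 (5) | 4,795 (19) | 3,243 (17) | 9,765 (10) | 1,367 (18) | 68,755 (8)
Age (years)* | 55 (12) | 56 (11) | 57 (13) | 60 (13) | 62 (11) | 64 (11) | 64 (11) | 58 (16) | 58 (13)
Age ≥70 years, n (% of n2) | 1,217 (13) | 223 (12) | 144,210 (19) | 26,790 (29) | 8,838 (32) | 9,741 (39) | 50,280 (43) | 2,925 (33) | 244,224 (24)
Male, n (% of n2) | 3,205 (34) | 849 (44) | 332,206 (45) | 44,016 (47) | 14,075 (50) | 10,968 (44) | 59,208 (51) | 3,339 (37) | 467,866 (46)
White Caucasian, n (% of n2) | 7,005 (74) | 1,430 (74) | 512,521 (69) | 58,396 (63) | 18,342 (65) | 17,082 (68) | 77,533 (67) | 5,812 (65) | 698,121 (68)
Black, n (% of n2) | 899 (9) | 218 (11) | 83,767 (11) | 14,089 (15) | 2,795 (10) | 2,957 (12) | 13,988 (12) | 944 (11) | 119,657 (12)
HbA1c, %* | 8 (1.6) | 8.1 (1.7) | 8.1 (1.8) | 8.9 (2.1) | 7.8 (1.6) | 7.7 (1.5) | 8 (1.6) | 7.8 (1.5) | 8.2 (1.8)
HbA1c, mmol/mol‡ | 64 | 65 | 65 | 74 | 62 | 61 | 64 | 62 | 66
HbA1c ≥7.5% (58 mmol/mol), n (% of n2) | 831 (47) | 295 (51) | 108,114 (48) | 19,756 (71) | 2,249 (42) | 2,410 (40) | 15,109 (49) | 509 (45) | 149,273 (50)
HbA1c ≥8% (64 mmol/mol), n (% of n2) | 609 (34) | 209 (36) | 82,914 (37) | 16,284 (59) | 1,565 (29) | 1,656 (28) | 10,900 (36) | 348 (31) | 114,485 (39)
Weight, kg* | 107.6 (26.6) | 103.5 (24.8) | 98.2 (24.9) | 95.3 (25.3) | 96.7 (25.2) | 92.8 (24) | 93.6 (23.7) | 87.6 (24.6) | 97.3 (24.9)
BMI, kg/m2* | 38.1 (8.5) | 36.2 (8) | 34.6 (7.9) | 33.6 (8.3) | 34 (7.9) | 33 (7.6) | 33.1 (7.6) | 31.5 (8) | 34.3 (7.9)
Obese, n (% of n2) | 6,460 (85) | 1,281 (78) | 427,651 (70) | 43,030 (64) | 12,954 (66) | 12,106 (62) | 53,029 (62) | 3,476 (51) | 559,987 (68)
SBP, mmHg* | 129 (15) | 129 (14) | 131 (15) | 131 (18) | 131 (16) | 130 (16) | 132 (17) | 126 (17) | 131 (16)
SBP ≥140 mmHg, n (% of n2) | 1,538 (21) | 345 (22) | 153,062 (25) | 20,320 (29) | 5,400 (27) | 4,960 (25) | 26,435 (30) | 1,313 (19) | 213,373 (26)
DBP, mmHg* | 78 (10) | 78 (9) | 78 (10) | 75 (11) | 75 (10) | 75 (10) | 75 (10) | 74 (10) | 77 (10)
Heart rate, bpm* | 79 (11) | 79 (12) | 78 (12) | 78 (12) | 75 (12) | 76 (12) | 76 (12) | 76 (11) | 78 (12)
LDL, mg/dL* | 103 (37) | 106 (39) | 106 (37) | 98 (40) | 100 (38) | 99 (37) | 98 (37) | 98 (37) | 104 (37)
HDL, mg/dL* | 45 (13) | 45 (14) | 44 (13) | 44 (15) | 46 (14) | 45 (14) | 44 (14) | 48 (15) | 44 (13)
Triglycerides, mg/dL† | 140 (102, 187) | 144 (104, 195) | 143 (104, 193) | 133 (93, 184) | 131 (94, 182) | 141 (102, 188) | 142 (103, 192) | 123 (88, 174) | 142 (103, 192)
CVD, n (% of n2) | 1,422 (15) | 331 (17) | 118,342 (16) | 25,051 (27) | 5,446 (19) | 6,182 (25) | 31,293 (27) | 1,815 (20) | 189,882 (19)
CKD, n (% of n2) | 377 (4) | 47 (2) | 12,590 (2) | 10,329 (11) | 1,820 (6) | 2,371 (9) | 10,988 (9) | 643 (7) | 39,165 (4)
Cancer, n (% of n2) | 289 (3) | 64 (3) | 30,195 (4) | 3,636 (4) | 1,374 (5) | 1,340 (5) | 6,551 (6) | 451 (5) | 43,900 (4)
Depression, n (% of n2) | 1,266 (13) | 213 (11) | 88,673 (12) | 7,834 (8) | 2,210 (8) | 2,327 (9) | 8,931 (8) | 845 (9) | 112,299 (11)
Charlson Comorbidity Index* | 1.47 (0.9) | 1.45 (0.91) | 1.44 (0.89) | 1.84 (1.31) | 1.57 (1.06) | 1.76 (1.22) | 1.77 (1.23) | 1.66 (1.16) | 1.53 (1.0)

bpm, beats per minute; n2, number of study cohort patients prescribed each drug class as a first ADD; n3, number of n2 patients with ≥1 year follow-up after first ADD initiation. *Mean (SD); †median (interquartile range); ‡mean; §other: amylin, DOPRA, AGI, or MEG.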

ofN)

9,49

4(1)

1,93

5(0.2)

740,478

(72)

93,078

(9)

28,004

(3)

25,005

(2)

116,43

5(11)

8,91

1(1)

d

Timeto

firstprescription

,mon

ths*

3.82(11.92

)7.94

(18.68

)3.73(11.78

)3.74

(11.06

)2.33

(8.2)

4.98(13.65

)3.84(11.51

)2.45(9.66)

3.73(11.66

)Follow-upfrom

ADDinitiation

,years*

3.22(2.5)

0.98(0.62)

3.31

(2.54)

2.94(2.45)

5.06(3.02)

2.79(2.1)

3.72(2.7)

3.86(2.71)

3.36(2.58)

Follow-up$1year

from

firstADDinitiation

,n3(%

ofn2)

7,40

0(78)

889(46)

589,246

(80)

68,385

(73)

25,105

(90)

19,278

(77)

95,854

(82)

7,66

9(86)

813,826

(80)

Discontinuation

within1year,n

(%of

n3)

1,51

6(20)

225(25)

44,485

(8)

3,35

9(5)

4,79

5(19)

3,24

3(17)

9,76

5(10)

1,36

7(18)

68,755

(8)

Age

(years)*

55(12)

56(11)

57(13)

60(13)

62(11)

64(11)

64(11)

58(16)

58(13)

Age

$70

years,n(%

ofn2)

1,21

7(13)

223(12)

144,210

(19)

26,790

(29)

8,83

8(32)

9,74

1(39)

50,280

(43)

2,92

5(33)

244,224

(24)

Male,n(%

ofn2)

3,20

5(34)

849(44)

332,206

(45)

44,016

(47)

14,075

(50)

10,968

(44)

59,208

(51)

3,33

9(37)

467,866

(46)

White

Caucasian,n

(%of

n2)

7,00

5(74)

1,43

0(74)

512,521

(69)

58,396

(63)

18,342

(65)

17,082

(68)

77,533

(67)

5,81

2(65)

698,121

(68)

Black,n

(%of

n2)

899(9)

218(11)

83,767

(11)

14,089

(15)

2,79

5(10)

2,95

7(12)

13,988

(12)

944(11)

119,657

(12)

HbA

1c,%*

8(1.6)

8.1(1.7)

8.1(1.8)

8.9(2.1)

7.8(1.6)

7.7(1.5)

8(1.6)

7.8(1.5)

8.2(1.8)

HbA

1c,mmol/m

ol‡

6465

6574

6261

6462

66HbA

1c$7.5%

(58mmol/m

ol),n(%

ofn2)

831(47)

295(51)

108,114

(48)

19,756

(71)

2,24

9(42)

2,41

0(40)

15,109

(49)

509(45)

149,273

(50)

HbA

1c$8%

(64mmol/m

ol),n(%

ofn2)

609(34)

209(36)

82,914

(37)

16,284

(59)

1,56

5(29)

1,65

6(28)

10,900

(36)

348(31)

114,485

(39)

Weight,kg*

107.6

(26.6)

103.5(24.8)

98.2

(24.9)

95.3

(25.3)

96.7

(25.2)

92.8

(24)

93.6

(23.7)

87.6

(24.6)

97.3

(24.9)

BMI,kg/m

2*

38.1

(8.5)

36.2

(8)

34.6

(7.9)

33.6

(8.3)

34(7.9)

33(7.6)

33.1

(7.6)

31.5

(8)

34.3

(7.9)

Obese,n

(%of

n2)

6,46

0(85)

1,28

1(78)

427,651

(70)

43,030

(64)

12,954

(66)

12,106

(62)

53,029

(62)

3,47

6(51)

559,987

(68)

SBP,mmHg*

129(15)

129(14)

131(15)

131(18)

131(16)

130(16)

132(17)

126(17)

131(16)

SBP$140mmHg,n(%

ofn2)

1,53

8(21)

345(22)

153,062

(25)

20,320

(29)

5,40

0(27)

4,96

0(25)

26,435

(30)

1,31

3(19)

213,373

(26)

DBP,mmHg*

78(10)

78(9)

78(10)

75(11)

75(10)

75(10)

75(10)

74(10)

77(10)

Heartrate,bpm

*79

(11)

79(12)

78(12)

78(12)

75(12)

76(12)

76(12)

76(11)

78(12)

LDL,mg/dL*

103(37)

106(39)

106(37)

98(40)

100(38)

99(37)

98(37)

98(37)

104(37)

HDL,mg/dL*

45(13)

45(14)

44(13)

44(15)

46(14)

45(14)

44(14)

48(15)

44(13)

Triglycerides,mg/dL†

140(102

,187

)14

4(104

,195

)14

3(104

,193

)13

3(93,

184)

131(94,

182)

141(102

,188

)14

2(103

,192

)12

3(88,

174)

142(103

,192

)CVD,n

(%of

n2)

1,42

2(15)

331(17)

118,342

(16)

25,051

(27)

5,44

6(19)

6,18

2(25)

31,293

(27)

1,81

5(20)

189,882

(19)

CKD

,n(%

ofn2)

377(4)

47(2)

12,590

(2)

10,329

(11)

1,82

0(6)

2,37

1(9)

10,988

(9)

643(7)

39,165

(4)

Cancer,n(%

ofn2)

289(3)

64(3)

30,195

(4)

3,63

6(4)

1,37

4(5)

1,34

0(5)

6,55

1(6)

451(5)

43,900

(4)

Depression,n(%

ofn2)

1,26

6(13)

213(11)

88,673

(12)

7,83

4(8)

2,21

0(8)

2,32

7(9)

8,93

1(8)

845(9)

112,299

(11)

CharlsonCom

orbidity

Index*

1.47(0.9)

1.45(0.91)

1.44

(0.89)

1.84(1.31)

1.57(1.06)

1.76(1.22)

1.77(1.23)

1.66(1.16)

1.53(1.0)

bpm,beatsper

minute;n2,num

berof

stud

ycoho

rtpatientsprescribed

each

drug

classas

afirstADD;n

3,nu

mberof

n2patientswith$1year

follow-upafterfirstADDinitiation

.*Mean(SD);

†median

(interquartile

range);‡mean;

§other:amylin,D

OPR

A,A

GI,or

MEG

.



Figure 1. A: Proportional share of the first ADD by year of initiation in the study cohort. B: Proportional share of the second ADD by year of initiation in subcohort 1 and key studies listed. C: In patients with a minimum of 1 year of follow-up post-MET, annual rates (95% CI) of SU and INS initiations per 100 person-years. Subcohort 1: initiated second ADD and had MET as first-line treatment. *Other: amylin, DOPRA, AGI, or MEG. EMPA REG, BI 10773 (Empagliflozin) Cardiovascular Outcome Event Trial in Type 2 Diabetes Mellitus Patients; EXAMINE, Examination of Cardiovascular Outcomes with Alogliptin versus Standard of Care; FDA, U.S. Food and Drug Administration; LEADER, Liraglutide Effect and Action in Diabetes: Evaluation of cardiovascular outcome Results; PROactive, Prospective Pioglitazone Clinical Trial in Macrovascular Events; RECORD, Rosiglitazone Evaluated for Cardiovascular Outcomes in Oral Agent Combination Therapy for Type 2 Diabetes; SAVOR-TIMI, Saxagliptin Assessment of Vascular Outcomes Recorded in Patients with Diabetes Mellitus–Thrombolysis in Myocardial Infarction; TECOS, Trial Evaluating Cardiovascular Outcomes with Sitagliptin; UKPDS, UK Prospective Diabetes Study.
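For readers unfamiliar with rates expressed per 100 person-years, the sketch below shows one simple way to compute such a rate with an approximate 95% CI. It uses a crude events-over-person-time estimate with a log-scale normal approximation, which is an assumption for illustration and not the life table method used in the study; the function name and the example counts are hypothetical.

```python
import math


def rate_per_100py(events, person_years, z=1.96):
    """Crude incidence rate per 100 person-years with an approximate CI.

    The CI uses a normal approximation on the log rate, with
    standard error 1 / sqrt(events). Illustrative only.
    """
    rate = events / person_years
    se_log = 1.0 / math.sqrt(events)
    lower = rate * math.exp(-z * se_log)
    upper = rate * math.exp(z * se_log)
    return 100 * rate, 100 * lower, 100 * upper


# Hypothetical counts: 107 initiations over 1,000 person-years of follow-up.
rate, lower, upper = rate_per_100py(107, 1000)
print(f"{rate:.1f} per 100 person-years (95% CI {lower:.1f}, {upper:.1f})")
```

The interval narrows as the number of observed initiations grows, which is why rates estimated from very large cohorts carry tight confidence limits.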




Table 2. Patient characteristics at the time of the second ADD initiation, by drug class added in subcohort 1‖ (N = 357,482)

Characteristic | GLP-1RA | SGLT2i | INS | TZD | DPP-4i | SU | Other§ | All
n1 (% of N) | 15,448 (4) | 5,971 (2) | 49,939 (14) | 33,021 (9) | 61,508 (17) | 187,819 (53) | 3,776 (1) | 357,482 (100)
Time from first to second ADD (months)* | 11.1 (18.58) | 18.52 (23.73) | 5.74 (14.58) | 4.09 (11.19) | 11.15 (18.95) | 7.38 (16.08) | 7.35 (15.92) | 7.84 (16.5)
Follow-up from second ADD initiation (years)* | 2.97 (2.44) | 0.95 (0.66) | 2.71 (2.26) | 4.77 (2.98) | 2.68 (2.01) | 3.34 (2.56) | 3.59 (2.65) | 3.22 (2.53)
Follow-up ≥1 year from second ADD initiation, n2 (% of n1) | 11,431 (74) | 2,558 (43) | 36,337 (73) | 28,841 (87) | 46,822 (76) | 149,109 (79) | 3,090 (82) | 278,188 (78)
Discontinuation within 1 year, n (% of n2) | 2,407 (21) | 643 (25) | 2,537 (7) | 5,913 (21) | 8,234 (18) | 15,569 (10) | 724 (23) | 36,027 (13)
Age (years)* | 53 (12) | 54 (11) | 57 (13) | 58 (11) | 58 (12) | 60 (12) | 61 (12) | 59 (12)
Age ≥70 years, n (% of n1) | 1,203 (8) | 435 (7) | 9,576 (19) | 6,450 (20) | 11,595 (19) | 47,416 (25) | 1,116 (30) | 77,791 (22)
Male, n (% of n1) | 5,305 (34) | 2,909 (49) | 23,366 (47) | 17,301 (52) | 29,463 (48) | 97,730 (52) | 1,664 (44) | 177,738 (50)
White Caucasian, n (% of n1) | 11,698 (76) | 4,622 (77) | 33,215 (67) | 22,710 (69) | 43,076 (70) | 130,418 (69) | 2,575 (68) | 248,314 (69)
HbA1c, %* | 7.8 (1.6) | 8.1 (1.8) | 9.3 (2.3) | 7.9 (1.7) | 8.2 (1.7) | 8.4 (1.8) | 7.5 (1.6) | 8.4 (1.9)
HbA1c, mmol/mol‡ | 62 | 65 | 78 | 63 | 66 | 68 | 58 | 68
HbA1c ≥7.5% (58 mmol/mol), n (% of n1) | 3,890 (44) | 2,102 (59) | 19,083 (73) | 8,150 (46) | 20,960 (57) | 65,980 (62) | 587 (41) | 118,063 (60)
HbA1c ≥8% (64 mmol/mol), n (% of n1) | 2,913 (33) | 1,581 (44) | 16,871 (64) | 6,106 (34) | 15,768 (43) | 51,841 (49) | 418 (29) | 93,499 (48)
Weight, kg* | 108.1 (25.9) | 105.1 (25.2) | 99.8 (26) | 99.9 (24.3) | 98 (24.2) | 98 (24.6) | 94 (26.1) | 98.9 (24.9)
BMI, kg/m2* | 38.1 (8.2) | 36.2 (7.8) | 35 (8.3) | 34.8 (7.8) | 34.3 (7.6) | 34.3 (7.7) | 33.3 (7.9) | 34.6 (7.8)
Obese, n (% of n1) | 12,429 (86) | 4,387 (79) | 32,472 (71) | 20,920 (71) | 39,574 (69) | 117,232 (68) | 1,961 (61) | 222,627 (70)
SBP, mmHg* | 128 (14) | 129 (14) | 131 (16) | 130 (15) | 130 (14) | 132 (15) | 129 (15) | 131 (15)
SBP ≥140 mmHg, n (% of n1) | 2,565 (18) | 1,098 (20) | 11,642 (25) | 6,993 (24) | 12,446 (22) | 46,191 (27) | 703 (22) | 79,837 (25)
DBP, mmHg* | 78 (9) | 79 (9) | 76 (10) | 77 (9) | 78 (9) | 77 (10) | 75 (9) | 77 (9)
Heart rate, bpm* | 81 (11) | 80 (12) | 80 (12) | 78 (11) | 79 (11) | 79 (12) | 78 (12) | 79 (12)
LDL, mg/dL* | 96 (34) | 79 (9) | 98 (37) | 97 (35) | 98 (35) | 97 (35) | 75 (9) | 97 (35)
HDL, mg/dL* | 44 (12) | 43 (12) | 43 (13) | 45 (13) | 44 (12) | 43 (12) | 47 (15) | 43 (12)
Triglycerides, mg/dL† | 150 (109, 200) | 156 (115, 207) | 143 (103, 196) | 139 (100, 190) | 148 (109, 197) | 149 (109, 199) | 135 (97, 185) | 147 (107, 197)
CVD, n (% of n1) | 2,134 (14) | 1,004 (17) | 11,781 (24) | 5,894 (18) | 12,212 (20) | 41,220 (22) | 876 (23) | 75,121 (21)
CKD, n (% of n1) | 301 (2) | 128 (2) | 1,686 (3) | 836 (3) | 2,090 (3) | 6,806 (4) | 140 (4) | 11,987 (3)
Cancer, n (% of n1) | 606 (4) | 264 (4) | 2,554 (5) | 1,350 (4) | 3,359 (5) | 9,472 (5) | 237 (6) | 17,842 (5)
Depression, n (% of n1) | 2,979 (19) | 1,123 (19) | 7,294 (15) | 3,710 (11) | 9,073 (15) | 23,433 (12) | 465 (12) | 48,077 (13)
Charlson Comorbidity Index* | 1.52 (0.9) | 1.58 (1.00) | 1.71 (1.16) | 1.48 (0.92) | 1.63 (1.08) | 1.63 (1.08) | 1.69 (1.11) | 1.62 (1.07)

bpm, beats per minute; n1, number of subcohort 1 patients prescribed each drug class as a second ADD; n2, number of n1 patients with ≥1 year follow-up after second ADD initiation. *Mean (SD); †median (interquartile range); ‡mean; §other: amylin, DOPRA, AGI, or MEG; ‖subcohort 1: initiated second ADD and had MET as first-line treatment.



10% discontinued SU within 1 year compared with significantly higher discontinuation proportions in other second-line non-INS ADDs.

The proportional share of DPP-4i as a therapy intensification option post–MET initiation sharply increased from 0.4% in 2006 (approved in October 2006) to 20% in 2016 (Fig. 1B). DPP-4i were initiated at an average HbA1c of 8.2% (66 mmol/mol), with 57% and 43% having HbA1c ≥7.5% (58 mmol/mol) and ≥8.0% (64 mmol/mol), respectively. While 18% discontinued DPP-4i within a year of initiation, the proportions of patients discontinuing second-line GLP-1RA, TZD, or SGLT2i within a year were higher.

The proportional share of patients receiving GLP-1RA as a second ADD increased from 3% in 2006 to 7% in 2016. Initiation of GLP-1RA occurred at relatively lower HbA1c levels of 7.8% (62 mmol/mol) and at the highest BMI levels among second-line ADD groups. Twenty-one percent of patients discontinued GLP-1RA therapy within a year of commencing it as a second ADD. After approval of the first SGLT2i in 2013, the proportional share of those receiving it as a second ADD reached 7% in 2016. One-quarter of patients discontinued SGLT2i therapy within a year of adding it as second-line ADD. The proportional share of patients receiving TZD as a second-line therapy dropped from 30 to 4% (Fig. 1B), with 21% of patients discontinuing therapy within 1 year.

The proportional share of patients receiving INS as a second ADD post–MET initiation has consistently increased from 7% in 2005 to 17% in 2016 (Fig. 1B). The intensification with INS occurred at a 9.3% (78 mmol/mol) average HbA1c level, with 73% and 64% having HbA1c ≥7.5% (58 mmol/mol) and ≥8.0% (64 mmol/mol), respectively. Only 7% of patients discontinued INS within 1 year of initiation.
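HbA1c is quoted throughout in both percentage units and mmol/mol. As a minimal sketch, and assuming the standard IFCC-NGSP master equation applies (the text does not state which conversion was used), the paired values reported above can be reproduced as follows. The function name is illustrative only.

```python
def hba1c_percent_to_mmol_mol(percent: float) -> int:
    """Convert HbA1c from NGSP/DCCT units (%) to IFCC units (mmol/mol).

    Uses the IFCC-NGSP master equation, rounded to the nearest integer.
    Assumption: the paper's paired values follow this standard conversion.
    """
    return round(10.929 * (percent - 2.15))


# Paired values quoted in the text: 7.5% (58), 7.8% (62), 8.0% (64),
# 8.2% (66) and 9.3% (78 mmol/mol).
for pct in (7.5, 7.8, 8.0, 8.2, 9.3):
    print(pct, hba1c_percent_to_mmol_mol(pct))
```

The same conversion is a quick way to check any threshold pair in the tables, for example that 8.0% corresponds to 64 mmol/mol rather than 58 mmol/mol.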

Third ADD

Among patients in subcohort 1, 78% had at least 1 year of follow-up from the second ADD initiation (subcohort 2; n = 278,188). Of these patients, 144,106 (52%) initiated a third ADD, with an annual mean rate of 12.6 initiations per 100 person-years (minimum 11.4, maximum 14.9) during a mean follow-up of 4 years post–second-line initiation. Table 3 presents treatment intensification patterns by the major second-line ADDs. Most of the patients (84% [n = 121,559]) added a third drug on top of the second ADD, while 16% (n = 22,547) ceased the second ADD and switched to a third ADD. Addition of the third drug occurred at higher HbA1c levels (8.5% [69 mmol/mol]) compared with switching (8.2% [66 mmol/mol]).

Among patients with SU as the second ADD, 49% (n = 73,776) added and 6% (n = 8,204) switched to a third drug during a mean follow-up of 4.1 years. The most popular third ADD addition was DPP-4i (34% of those who added a third ADD), followed by INS (28%) and TZD (26%). Among those who switched, almost one-half (49%) switched to INS, while 30% and 8% switched to DPP-4i and GLP-1RA, respectively.

SU, DPP-4i, and GLP-1RA were added to INS in 32%, 26%, and 22% of patients (from those who added a third ADD), respectively. Only 3% of patients ceased INS therapy to switch to another ADD during a mean 3.6 years of follow-up. In the second-line DPP-4i group (n = 46,822), 40% added and 11% switched to a third drug during a mean 3.4 years of follow-up. The most popular third ADD addition was SU (40% of those who added a third ADD), followed by INS (29%) and GLP-1RA (9%). Of those who switched from DPP-4i, one-half of the patients moved to SU, followed by INS and GLP-1RA (17%).

Among those who had a GLP-1RA as the second ADD, 52% added INS (of those who added a third ADD) and 18% switched to INS (of those who switched to a third ADD) during a mean 3.9 years of follow-up; 11% added and 34% switched to DPP-4i. In the TZD group, 43% added and 22% switched to a third ADD during 5.4 years of follow-up. Among those who switched, 45% chose SU while 35% moved to DPP-4i.
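The third-ADD initiation rate quoted at the start of this section is expressed per 100 person-years. A minimal sketch of that calculation is given below, assuming each patient contributes follow-up time until third ADD initiation or censoring; the function name and the toy data are hypothetical and are not taken from the study.

```python
def rate_per_100_person_years(follow_up_years, had_event):
    """Incidence rate per 100 person-years.

    follow_up_years: time at risk for each patient (years), ending at the
                     event or at censoring, whichever comes first.
    had_event:       1 if the patient initiated a third ADD, else 0.
    """
    events = sum(had_event)
    person_years = sum(follow_up_years)
    return 100.0 * events / person_years


# Hypothetical toy cohort of five patients (not study data).
follow_up = [2.0, 5.0, 1.5, 4.0, 3.5]
initiated_third_add = [1, 0, 1, 0, 1]
rate = rate_per_100_person_years(follow_up, initiated_third_add)
print(rate)  # 3 events over 16 person-years
```

Because time at risk stops at the event, the rate cannot be recovered exactly from the number of initiations and the mean follow-up alone.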

Temporal Changes in Rates of Intensification With SU and INS

Among patients with first-line MET and a minimum 1 year of follow-up, the annual rates per 100 person-years of INS/SU initiation (irrespective of order of therapy intensification) are presented in Fig. 1C. The rates did not significantly decline from 2005 to 2014.

Do Novel ADDs Help Delay the Need for Therapy Intensification?

The Kaplan-Meier analyses, based on restricted mean years to adding or moving to a third ADD, in major second-line ADD groups are presented in Table 3. The mean time to intensification with a third ADD was marginally longer in incretin groups (DPP-4i 4.1 years [95% CI 4.1, 4.2] and GLP-1RA 4.2 years [4.1, 4.3]) compared with that in patients with SU as the second-line ADD (3.9 years [3.8, 3.9]; P = 0.04). The restricted mean times to intensification with INS any time during follow-up were 6.3, 7.1, and 6.6 years in the SU, DPP-4i, and GLP-1RA groups, respectively (all comparative P < 0.05).
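The restricted mean time reported here is the area under the Kaplan-Meier curve up to a truncation time. The sketch below illustrates the idea in plain Python on toy data; it is not the authors' implementation, and the truncation time `tau`, the function names and the example values are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  follow-up time for each patient.
    events: 1 if intensification was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_leaving
    return curve


def restricted_mean_time(times, events, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Last step runs from the final event time before tau up to tau.
    area += prev_s * (tau - prev_t)
    return area


# Hypothetical toy data: four patients, all intensified, truncated at 4 years.
print(restricted_mean_time([1, 2, 3, 4], [1, 1, 1, 1], tau=4))
```

Restricting the mean to a fixed horizon keeps the estimate defined when the survival curve never reaches zero, which is typical when many patients are censored before intensifying.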

CONCLUSIONS

This longitudinal exploratory study of a large cohort of patients with type 2 diabetes observed between 2005 and 2016 from primary and ambulatory care systems in the U.S. provides 1) a detailed account of glycemic states, clinical characteristics, and comorbidities at first-line and second-line therapy initiation by different drug classes, as well as new insights into 2) the changes in the choice of first- and second-line ADDs over the last 10 years, 3) patterns of therapy intensification with third-line ADDs and with INS, separately for major second-line ADDs, 4) changes in the annual rates of therapy intensification with SU and INS over time, and 5) possible benefits of using newer novel antidiabetes therapies in terms of delaying the need for third-line therapy intensification, including the need for initiating INS.

With 3.4 years of mean follow-up in more than one million patients with a diagnosis of type 2 diabetes from 2005, this study provides robust and detailed information on the changing clinical practices for the management of type 2 diabetes in a real-world setting. We are not aware of any study that simultaneously evaluated the changing prescribing patterns of old and new ADDs as first-line therapy and as intensification options at various levels of glycemia and comorbidities.

The proportional share of MET as the first-line therapy choice has increased from 60 to 77%, while that for SU has decreased from 20 to 8%, over the last decade. However, SU continue to be the most popular second-line therapy intensification option, although with a declining share (from 60 to 46% over the last decade). The discontinuation rate of SU was found to be the lowest among non-INS second-line ADDs. Among those who intensified with a third-line therapy, the ratio of addition to switching to third ADD




Table 3—Intensification of major second-line therapies in subcohort 2‡ (N = 278,188)

 | GLP-1RA | INS | TZD | DPP-4i | SU | All
N | 11,431 | 36,337 | 28,841 | 46,822 | 149,109 | 278,188
Follow-up from second ADD initiation, years* | 3.85 (2.24) | 3.56 (2.09) | 5.40 (2.66) | 3.37 (1.81) | 4.09 (2.35) | 4.02 (2.33)
Initiated third ADD, n (% of N) | 5,942 (52) | 10,677 (29) | 18,788 (65) | 23,840 (51) | 81,980 (55) | 144,106
Initiated INS, n (% of N) | 3,285 (29) | - | 8,223 (29) | 9,633 (21) | 45,293 (30) | 67,812
Restricted mean time to a third ADD, years§ | 4.23 (4.14, 4.32) | 6.15 (6.09, 6.21) | 3.53 (3.49, 3.58) | 4.12 (4.07, 4.17) | 3.91 (3.88, 3.93) | 4.18 (4.17, 4.20)
Restricted mean time to INS, years§ | 6.58 (6.49, 6.67) | - | 6.82 (6.78, 6.87) | 7.14 (7.09, 7.18) | 6.26 (6.23, 6.28) | 6.51 (6.49, 6.53)

Added third ADD, n1 (% of N) | 4,522 (40) | 9,675 (27) | 12,481 (43) | 18,881 (40) | 73,776 (49) | 121,559 (44)
HbA1c, %* | 8.2 (1.7) | 8.9 (2) | 8.1 (1.8) | 8.5 (1.8) | 8.6 (1.7) | 8.5 (1.8)
HbA1c, mmol/mol† | 66 | 74 | 65 | 69 | 70 | 69
HbA1c ≥7.5% (58 mmol/mol), n (% of n1) | 1,798 (61) | 4,953 (74) | 4,102 (55) | 8,968 (69) | 32,611 (72) | 53,100 (69)
HbA1c ≥8% (64 mmol/mol), n (% of n1) | 1,364 (47) | 4,160 (62) | 3,101 (42) | 6,974 (54) | 26,231 (58) | 42,330 (55)
GLP-1RA as third ADD, n (% of n1) | - | 2,125 (22) | 1,532 (12) | 1,643 (9) | 5,662 (8) | 11,230 (9)
INS as third ADD, n (% of n1) | 2,356 (52) | - | 3,338 (27) | 5,472 (29) | 20,483 (28) | 32,447 (27)
TZD as third ADD, n (% of n1) | 241 (5) | 688 (7) | - | 1,425 (8) | 19,010 (26) | 21,472 (18)
DPP-4i as third ADD, n (% of n1) | 481 (11) | 2,488 (26) | 3,804 (30) | - | 25,216 (34) | 32,696 (27)
SU as third ADD, n (% of n1) | 865 (19) | 3,090 (32) | 3,048 (24) | 7,556 (40) | - | 14,844 (12)
SGLT2i as third ADD, n (% of n1) | 521 (12) | 938 (10) | 138 (1) | 2,399 (13) | 2,077 (3) | 6,095 (5)

Switched to third ADD, n2 (% of N) | 1,420 (12) | 1,002 (3) | 6,307 (22) | 4,959 (11) | 8,204 (6) | 22,547 (8)
HbA1c, %* | 7.9 (1.6) | 8.2 (1.8) | 7.8 (1.7) | 8.1 (1.7) | 8.7 (2) | 8.2 (1.8)
HbA1c, mmol/mol† | 63 | 66 | 62 | 65 | 72 | 66
HbA1c ≥7.5% (58 mmol/mol), n (% of n2) | 470 (54) | 377 (60) | 1,846 (49) | 1,940 (59) | 3,380 (68) | 8,231 (59)
HbA1c ≥8% (64 mmol/mol), n (% of n2) | 335 (38) | 280 (44) | 1,290 (34) | 1,451 (44) | 2,780 (56) | 6,286 (45)
GLP-1RA as third ADD, n (% of n2) | - | 84 (8) | 383 (6) | 842 (17) | 677 (8) | 2,065 (9)
INS as third ADD, n (% of n2) | 262 (18) | - | 703 (11) | 849 (17) | 4,012 (49) | 5,931 (26)
TZD as third ADD, n (% of n2) | 61 (4) | 48 (5) | - | 273 (6) | 510 (6) | 924 (4)
DPP-4i as third ADD, n (% of n2) | 488 (34) | 269 (27) | 2,199 (35) | - | 2,436 (30) | 5,561 (25)
SU as third ADD, n (% of n2) | 390 (27) | 521 (52) | 2,829 (45) | 2,456 (50) | - | 6,450 (29)
SGLT2i as third ADD, n (% of n2) | 205 (14) | 66 (7) | 109 (2) | 464 (9) | 373 (5) | 1,228 (5)

n1, number of subcohort 2 patients for each drug class who added a third ADD; n2, number of subcohort 2 patients for each drug class who switched to a third ADD. *Mean (SD); †mean; §mean (95% CI). ‡Subcohort 2: those with MET as first-line treatment, who initiated second ADD, with a minimum of 1 year of follow-up post–second drug initiation.



was highest in the SU group (9.0), followed by DPP-4i (3.8) and GLP-1RA (3.2), during 4 years of mean follow-up post–second-line ADD initiation.

We observed that second-line SU users initiate a third ADD marginally sooner compared with incretin users. A study based on EMR data from the U.K. reported the opposite results, with an average of 1.6 and 2.4 years to third ADD initiation in the 3,080 and 15,508 patients treated with MET + DPP-4i and MET + SU, respectively (5).

The proportions of patients who added INS were similar between patients who had a DPP-4i and an SU as the second ADD. However, among those who switched to a third ADD, only 17% of patients in the DPP-4i group switched to INS, compared with almost 50% in the SU group. We also observed that the mean time to INS initiation was significantly shorter for second-line SU users (6.3 years) than in the DPP-4i group (7.1 years). This finding is similar to a study (2015) based on 3,864 matched pairs of patients treated with DPP-4i or SU when added to MET, where Inzucchi et al. (6) reported that those treated with DPP-4i were significantly less likely to initiate INS compared with those treated with SU.

We observed an increasing proportional share of INS as a second-line therapy option over the last 10 years, despite the availability of novel therapies that were found to have similar or better glycemic efficacy in clinical trials. Also, the annual rates of intensification with INS remained similar over the last decade. In a similar study, Lipska et al. (15) observed that the overall rate of severe hypoglycemia did not decline from 2006 to 2013. This may reflect pressure to achieve glycemic targets rapidly and an increasing recognition that for people with very poor glycemic control, INS may be the only drug likely to achieve targets.

Compared with rates for older ADDs, high discontinuation rates of new therapeutic classes, particularly of DPP-4i, are surprising. The higher cost of these newer drugs may be relevant and may also contribute to the fairly low rates of initiation of these drugs overall. More studies, utilizing additional data sources, are needed to specifically test hypotheses for the differences in initiation, adherence, and persistence between drug classes.

The HbA1c level at pharmacological therapy initiation was found to be 8.2% (66 mmol/mol), with 50% having HbA1c ≥7.5% (58 mmol/mol). The HbA1c levels at first-line ADD initiation were similar across all ADDs, except in those who initiated with INS, whose average HbA1c was 8.9% (74 mmol/mol). Although the mean time to second ADD post–MET initiation was only 8 months, it occurred at a high HbA1c level of 8.4% (68 mmol/mol), with 60% and 48% of patients having HbA1c ≥7.5% (58 mmol/mol) and ≥8% (64 mmol/mol), respectively. Among those with a minimum of 1 year of follow-up post–second ADD, approximately 52% intensified with a third ADD at an average HbA1c level of 8.5% (69 mmol/mol). These findings reflect the continued glycemic risk burden in patients with type 2 diabetes (27–30). While the persistent therapeutic inertia (28) and the long-term consequences of therapeutic inertia (27,29) for glycemic control in primary care systems have been evaluated, exploration of the glycemic state at therapy initiation and intensification by ADD classes is scarce. Our study provides a detailed account of the glycemic state in people with type 2 diabetes at therapy initiation and intensifications during a reasonable follow-up period in primary and ambulatory care settings.

The main strength of this study is the availability of data from the patients' medication lists that included prescribed medications within the EMR network and also information on medications that may have been prescribed outside of the EMR, as well as data on glycemic control and comorbidities. The CEMR database tracks longitudinal treatment adjustments and contains comprehensive clinical information, which is usually not available in claims databases.

The limitations of this study include the nonavailability of complete and reliable data on 1) medication adherence and side effects, 2) diet and exercise, 3) socioeconomic status, and 4) insurance type. We did not include dosage changes or brands' distribution in our analyses. The findings of this study should be interpreted with caution: EMR data are in general biased toward unhealthy populations and commercially insured individuals, white Caucasians are overrepresented in the CEMR, and the results are subject to limited follow-up.

Less popular ADDs such as MEG, AGI, DOPRA, and amylin were included in our study for two reasons: first, to assess utilization data of such medications, as these drugs are usually omitted, and second, to ensure market shares of other drugs are not overestimated. We observed that only 39,549 patients with type 2 diabetes were using SGLT2i during the available follow-up period, which is not surprising, given that they were first approved in 2013.

To conclude, while we have observed a significant increase in the use of MET as the first-line therapy over the last 10 years, the second- and third-line therapy intensification choices are highly heterogeneous. While increasing popularity of "new" drugs, especially DPP-4i and SGLT2i, was observed as the second- and third-line drug choices, SU remain the most popular therapy intensification choice and have a lower discontinuation rate compared with other non-INS ADDs. The proportional share of INS as a second-line therapy choice has also increased significantly over the last decade. Incretin-based therapies were found to delay the need for therapy intensification only marginally compared with other ADDs. Contrary to the guidelines for proactive glycemic management, pharmacological therapy initiation and the intensifications occurred at very high levels of HbA1c, with 48% of patients having HbA1c ≥8.0% (64 mmol/mol) at second-line therapy initiation.
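The addition-to-switching ratios quoted in this discussion (9.0, 3.8 and 3.2) follow directly from the Table 3 counts of patients who added (n1) versus switched to (n2) a third ADD. The short sketch below reproduces that arithmetic; the variable and function names are illustrative only.

```python
# (added n1, switched n2) counts for each second-line group, from Table 3.
table3_counts = {
    "SU": (73776, 8204),
    "DPP-4i": (18881, 4959),
    "GLP-1RA": (4522, 1420),
}


def addition_to_switching_ratio(added: int, switched: int) -> float:
    """Patients who added a third ADD per patient who switched to one."""
    return added / switched


ratios = {
    group: round(addition_to_switching_ratio(added, switched), 1)
    for group, (added, switched) in table3_counts.items()
}
print(ratios)  # {'SU': 9.0, 'DPP-4i': 3.8, 'GLP-1RA': 3.2}
```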

Acknowledgments. O.M. acknowledges her cosupervisors Ross Young and Louise Hafner of Queensland University of Technology.

Funding. J.J.A. and S.K.P. acknowledge project grant support provided by the Royal Brisbane and Women's Hospital Foundation. O.M. acknowledges a PhD scholarship from Queensland University of Technology. J.S. is supported by a National Health and Medical Research Council Research Fellowship. Melbourne EpiCentre received support from the National Health and Medical Research Council and the Australian Government's National Collaborative Research Infrastructure Strategy initiative through Therapeutic Innovation Australia.

Duality of Interest. J.S. has received speaker honoraria, consultancy fees, and/or travel sponsorship from AstraZeneca, Boehringer Ingelheim, Lilly, Sanofi, Mylan, Novo Nordisk, Merck Sharp & Dohme, and Novartis. J.J.A. has received speaker honoraria, consultancy fees, and/or travel sponsorship from AstraZeneca, Boehringer Ingelheim, Lilly, and Novartis. S.K.P. has acted as a consultant and/or speaker for Novartis, GI Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi Pharmaceutical, and Amylin Pharmaceuticals. S.K.P. has received grants in support of investigator and investigator-initiated clinical studies from Merck, Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi, and Pfizer. No other potential conflicts of interest relevant to this article were reported.

Author Contributions. O.M. and S.K.P. were responsible for the primary design of the study. J.S. and J.J.A. contributed significantly in the study design. O.M. conducted the data extraction. O.M. and S.K.P. jointly conducted the statistical analyses. F.S. contributed in the interpretation of the results. The first draft of the manuscript was developed by O.M. and S.K.P., and all authors contributed to the finalization of the manuscript. S.K.P. is the guarantor of this work and, as such, had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

References

1. American Diabetes Association. Standards of Medical Care in Diabetes-2017: summary of revisions. Diabetes Care 2017;40(Suppl. 1):S4–S5
2. Paul SK, Maggs D, Klein K, Best JH. Dynamic risk factors associated with non-severe hypoglycemia in patients treated with insulin glargine or exenatide once weekly. J Diabetes 2015;7:60–67
3. Paul SK, Shaw JE, Montvida O, Klein K. Weight gain in insulin-treated patients by body mass index category at treatment initiation: new evidence from real-world data in patients with type 2 diabetes. Diabetes Obes Metab 2016;18:1244–1252
4. Waldrop G, Zhong J, Peters M, et al. Incretin-based therapy in type 2 diabetes: an evidence based systematic review and meta-analysis. J Diabetes Complications. 28 August 2016 [Epub ahead of print]. https://doi.org/10.1016/j.jdiacomp.2016.08.018
5. Mamza J, Mehta R, Donnelly R, Idris I. Important differences in the durability of glycaemic response among second-line treatment options when added to metformin in type 2 diabetes: a retrospective cohort study. Ann Med 2016;48:224–234
6. Inzucchi SE, Tunceli K, Qiu Y, et al. Progression to insulin therapy among patients with type 2 diabetes treated with sitagliptin or sulphonylurea plus metformin dual therapy. Diabetes Obes Metab 2015;17:956–964
7. Cohen FJ, Neslusan CA, Conklin JE, Song X. Recent antihyperglycemic prescribing trends for U.S. privately insured patients with type 2 diabetes. Diabetes Care 2003;26:1847–1851
8. Desai NR, Shrank WH, Fischer MA, et al. Patterns of medication initiation in newly diagnosed diabetes mellitus: quality and cost implications. Am J Med 2012;125:302.e1-7
9. Tanne JH. FDA places "black box" warning on antidiabetes drugs. BMJ 2007;334:1237
10. Nissen SE, Wolski K. Effect of rosiglitazone on the risk of myocardial infarction and death from cardiovascular causes. N Engl J Med 2007;356:2457–2471
11. Woodcock J, Sharfstein JM, Hamburg M. Regulatory action on rosiglitazone by the U.S. Food and Drug Administration. N Engl J Med 2010;363:1489–1491
12. Home PD, Pocock SJ, Beck-Nielsen H, et al.; RECORD Study Team. Rosiglitazone evaluated for cardiovascular outcomes in oral agent combination therapy for type 2 diabetes (RECORD): a multicentre, randomised, open-label trial. Lancet 2009;373:2125–2135
13. Berkowitz SA, Krumme AA, Avorn J, et al. Initial choice of oral glucose-lowering medication for diabetes mellitus: a patient-centered comparative effectiveness study. JAMA Intern Med 2014;174:1955–1962
14. Segal JB, Maruthur NM. Initial therapy for diabetes mellitus. JAMA Intern Med 2014;174:1962–1963
15. Lipska KJ, Yao X, Herrin J, et al. Trends in drug utilization, glycemic control, and rates of severe hypoglycemia, 2006–2013. Diabetes Care 2017;40:468–475
16. Pantalone KM, Hobbs TM, Wells BJ, et al. Clinical characteristics, complications, comorbidities and treatment patterns among patients with type 2 diabetes mellitus in a large integrated health system. BMJ Open Diabetes Res Care 2015;3:e000093
17. Raebel MA, Xu S, Goodrich GK, et al. Initial antihyperglycemic drug therapy among 241 327 adults with newly identified diabetes from 2005 through 2010: a surveillance, prevention, and management of diabetes mellitus (SUPREME-DM) study. Ann Pharmacother 2013;47:1280–1291
18. Hampp C, Borders-Hemphill V, Moeny DG, Wysowski DK. Use of antidiabetic drugs in the U.S., 2003-2012. Diabetes Care 2014;37:1367–1374
19. Centers for Disease Control and Prevention. National Diabetes Statistics Report: Estimates of Diabetes and Its Burden in the United States, 2014. Atlanta, GA, U.S. Department of Health and Human Services, 2014
20. International Diabetes Federation. IDF Diabetes Atlas. 7th ed. Brussels, Belgium, 2015
21. Higgins V, Piercy J, Roughley A, et al. Trends in medication use in patients with type 2 diabetes mellitus: a long-term view of real-world treatment between 2000 and 2015. Diabetes Metab Syndr Obes 2016;9:371–380
22. Crawford AG, Cote C, Couto J, et al. Comparison of GE Centricity Electronic Medical Record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Popul Health Manag 2010;13:139–150
23. Brixner D, Said Q, Kirkness C, Oberg B, Ben-Joseph R, Oderda G. Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value Health 2007;10:S29–S36
24. Montvida O, Arandjelovic O, Reiner E, Paul SK. Data mining approach to estimate the duration of drug therapy from longitudinal electronic medical records. Open Bioinform J 2017;10:1–15
25. U.S. Food and Drug Administration. FDA approves weight-management drug [article online], 2014. Available from https://wayback.archive-it.org/7993/20170111160832/http://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/ucm427913.htm. Accessed 23 December 2014
26. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care 2005;43:1130–1139
27. Paul SK, Klein K, Thorsted BL, Wolden ML, Khunti K. Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovasc Diabetol 2015;14:100
28. Khunti K, Nikolajsen A, Thorsted BL, Andersen M, Davies MJ, Paul SK. Clinical inertia with regard to intensifying therapy in people with type 2 diabetes treated with basal insulin. Diabetes Obes Metab 2016;18:401–409
29. Paul SK, Klein K, Thorsted BL, Wolden ML, Khunti K. Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovasc Diabetol 2015;14:100
30. Montvida O, Klein K, Kumar S, Khunti K, Paul SK. Addition of or switch to insulin therapy in people treated with glucagon-like peptide-1 receptor agonists: a real-world study in 66 583 patients. Diabetes Obes Metab 2017;19:108–117






Chapter 8: Glycaemic Control and Sustainability


Paper




expertise;






academic unit, and




Olga Montvida, Jonathan Shaw, Lawrence Blonde, Sanjoy K Paul. Long-term sustainability

of glycaemic achievements with second-line anti-diabetic therapies in patients with type 2

diabetes: A real-world study. Accepted at Diabetes, Obesity, and Metabolism.






Jonathan Shaw Contributed in the study design and manuscript

development.

Lawrence Blonde Contributed in the manuscript development.

Sanjoy K. Paul Conceived the idea, and was responsible for the primary



of the manuscript.



Signature



authorship.


Name Signature Date



ORIGINAL ARTICLE

Long-term sustainability of glycaemic achievements with second-line antidiabetic therapies in patients with type 2 diabetes: A real-world study

Olga Montvida MSc1,2 | Jonathan E. Shaw MD3 | Lawrence Blonde MD4 |

Sanjoy K. Paul PhD1,5

1Statistics Unit, QIMR Berghofer Medical

Research Institute, Brisbane, Australia

2School of Biomedical Sciences, Faculty of

Health, Queensland University of Technology,

Brisbane, Australia

3Baker Heart and Diabetes Institute,

Melbourne, Australia

4Ochsner Diabetes Clinical Research Unit,

Department of Endocrinology, Frank Riddick

Diabetes Institute, Ochsner Medical Center,

New Orleans, Louisiana

5Melbourne EpiCentre, University of

Melbourne and Melbourne Health,


Correspondence

Sanjoy K. Paul PhD, The Royal Melbourne

Hospital, City Campus, 7 East, Main Building,

Grattan Street, Parkville, Victoria 3050,

Australia.

Email: [email protected]

Funding information

No separate funding was obtained for this

study.

Aims: To inform patients and their carers about both the probability of reducing glycated haemoglobin (HbA1c) to clinically desirable levels and the sustainability of such control over 2 years with major second-line antidiabetic therapies, in individual risk scenarios, with and without third-line intensification.

Materials and Methods: From US Centricity Electronic Medical Records, 163 081 patients with type 2 diabetes aged 18 to 80 years, who had initiated metformin, intensified their treatment with dipeptidyl peptidase-4 (DPP-4) inhibitors, glucagon-like peptide-1 (GLP-1) receptor agonists (GLP-1RAs), sulphonylureas (SUs), insulin or thiazolidinediones (TZDs), and continued second-line treatment for ≥6 months, were selected. Treatment groups were balanced with regard to baseline characteristics, and glycaemic achievements were estimated using logistic regression analysis.

Results: With HbA1c concentrations of 58–63.9 mmol/mol (7.5–7.9%) at second-line treatment initiation, the adjusted probabilities of achieving HbA1c <53 mmol/mol (<7%) at 6 months were 32%, 38%, 39%, 26% and 38% in the SU, DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respectively, while with baseline HbA1c concentrations of 64–75 mmol/mol (8–9%), the corresponding probabilities of reducing HbA1c to <58 mmol/mol (<7.5%) were 38%, 44%, 40%, 34% and 42%, respectively. In these baseline HbA1c categories, the adjusted probabilities of sustaining HbA1c achievements over 2 years were higher in the GLP-1RA and TZD groups, compared with the SU and insulin groups (P < .01). With baseline HbA1c concentrations of 75.1–108 mmol/mol (9.1–12%) 38% of patients achieved an HbA1c concentration <58 mmol/mol (<7.5%) at 6 months. The adjusted probability of sustaining this control over 2 years was higher in the incretin and TZD groups (range 62%-75%), while insulin and SUs offered lower chances of sustainable control (range 54%-56%).

Conclusions: Patients treated with second-line incretins and TZDs had a significantly higher probability of achieving and sustaining glycaemic control over 2 years without further intensification, compared with those treated with SUs or insulin.

KEYWORDS

antidiabetic drug, glycaemic control, therapeutic choice

1 | INTRODUCTION

Metformin is recommended as a first-line pharmacological treatment for patients with type 2 diabetes; however, most patients eventually require therapy intensification with multiple antidiabetic drugs (ADDs) to achieve glycaemic control.1–3 For second-line treatment intensification, the American Diabetes Association recommends sulphonylureas (SUs), thiazolidinediones (TZDs), dipeptidyl peptidase

Received: 21 January 2018 Revised: 28 February 2018 Accepted: 7 March 2018

DOI: 10.1111/dom.13288

Diabetes Obes Metab. 2018;1–10. wileyonlinelibrary.com/journal/dom © 2018 John Wiley & Sons Ltd 1


http://orcid.org/0000-0002-6187-2203

http://orcid.org/0000-0003-0492-6698

http://orcid.org/0000-0003-0848-7194

http://wileyonlinelibrary.com/journal/dom

(DPP-4) inhibitors, sodium-glucose co-transporter-2 (SGLT2) inhibitors, glucagon-like peptide-1 receptor agonists (GLP-1RAs) or insulin. Other drugs are recommended under specific conditions.1 The temporal patterns of the changes in the second-line ADD choices over the last decade in the United States have recently been explored by Montvida et al.4

While clinicians' and patients' decisions with regard to add-on agents have become more complicated,5,6 few studies have directly compared the glycaemic effectiveness of second-line therapies.7–11 A recent network meta-analysis reported similar glycaemic achievements with all second-line ADDs when added to metformin.9 Analogous results were discussed in two observational studies using electronic medical records (EMRs).7,8 In 7009 patients from Germany, Rathmann et al7 reported an unadjusted mean glycated haemoglobin (HbA1c) reduction of 0.7% to 1.1% after 6 months of treatment with major second-line ADDs, including insulin. A study from Denmark in 4734 patients by Thomsen et al8 reported median reductions of 0.8% to 1.3% at 12 months for non-insulin drugs (from baseline HbA1c of 60–64 mmol/mol [7.6–8.0%]) and 2.4% for insulin-treated patients (from 81 mmol/mol [9.6%]).8

Given the increasing complexity and challenges with regard to multiple risk factor management in patients with type 2 diabetes, and the availability of a number of new and older classes of ADDs, a population-level assessment of the likelihood of short- and long-term glycaemic achievements and their sustainability with the use of different second-line ADDs would be of great value. With evaluation of a reasonably large number of patients from primary and ambulatory care systems, probabilistic estimates of sustainable glycaemic achievements with different second-line ADDs for different risk paradigms would empower clinicians and their patients to make more informed therapeutic choices. To the best of our knowledge, no study has evaluated early glycaemic control and its sustainability among groups of patients with different HbA1c concentrations at the time of post-metformin second-line ADD intensification.

The newer classes of ADDs, including GLP-1RAs and SGLT2 inhibitors, potentially have both extra-glycaemic benefits, such as weight and blood pressure reductions, and possible associations with reduced risk of cardiovascular diseases. These therapies are costlier in comparison to the older ADDs; however, if the newer drugs have longer-lasting benefits with regard to glycaemic control than older ADDs, then over time, a lower rate of third-line therapy intensification across the whole population would be expected, as the use of newer drugs increases. In this context, evaluation of whether second-line therapy intensification with newer drug classes has been helpful in reducing the need for third-line therapy intensification over time would be of great interest, but there is a paucity of population-level studies addressing this question. A modelling study by Zhang et al12 reported a marginally shorter time to insulin treatment in patients treated with incretin-based therapies (GLP-1RAs and DPP-4 inhibitors) compared with those treated with SUs, but we are not aware of any population-level study that has evaluated the possible delay in the need for third-line therapy intensification in patients who choose incretin-based therapies as second-line therapy.

Taking into account the heterogeneous HbA1c levels among patients at second-line treatment initiation, the main aims of the present study were: to inform clinicians and patients on the likelihood of reducing HbA1c to a clinically desirable level over 6, 12 and 24 months of treatment with major second-line ADDs when added to metformin; to estimate the probability of sustaining early glycaemic control over 24 months of therapy continuation with and without the need for third-line ADD addition; and to determine whether the availability of newer ADDs has reduced the need for intensification with third-line therapy at the population level over time.

2 | MATERIALS AND METHODS

2.1 | Data source

The US Centricity Electronic Medical Record (CEMR) database was

used in the present study. The CEMR represents >35 000 solo practi-

tioners, community clinics, academic medical centres and large inte-

grated delivery networks across all of the United States. Patients in

the database are generally representative of the US population, with

a diabetes prevalence (7.1%, as identified by diagnostic codes) that is

similar to National Diabetes Statistics (6.7% with diagnosed diabetes

in 2014).13 The CEMR database has been extensively used for aca-

demic research worldwide.14–16

The research involved existing data from which patients could not be

identified, either directly or through identifiers linked to them.

Under US Department of Health and Human Services Exemption 4

[45 CFR 46.101(b)(4)], this study was therefore exempt from ethics

approval by an institutional review board and from informed consent.

For more than 34 million individuals, longitudinal EMRs were

available from 1995 until April 2016, with comprehensive patient-level

information on demographic, anthropometric, clinical and laboratory

variables. Medication data included brand names and doses for

individual medications prescribed, along with start/stop dates and

specific fields to track treatment alterations. The dataset also con-

tained patient-reported medications, including prescriptions received

outside the EMR network and over-the-counter medications.

2.2 | Study design

To obtain data on the first-, second-, and third-line ADDs for each

patient with type 2 diabetes, the following drug classes were

arranged chronologically according to the initial prescription dates:

metformin, SUs, TZDs, alpha glucosidase inhibitors, amylin, dopamine

receptor agonists, meglitinide, DPP-4 inhibitors, GLP-1RAs, SGLT2

inhibitors and insulin. Same-day initiations (including combination

therapies) were prioritized in the order listed above, from highest

to lowest. A robust methodology for extraction and assessment of

longitudinal patient-level medication data from the CEMR database

has recently been described by the authors.17
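
The ordering rule described above can be illustrated with a minimal sketch. It assumes a hypothetical input layout of (drug class, start date) pairs and is not the authors' extraction method (which is described in reference 17); the names `CLASS_PRIORITY`, `RANK` and `therapy_lines` are introduced here for illustration only. Classes are sorted by first prescription date, and same-day initiations are ordered by their position in the list given in the text.

```python
from datetime import date

# Drug classes in the priority order listed in the text (highest first).
CLASS_PRIORITY = [
    "metformin", "SU", "TZD", "alpha-glucosidase inhibitor", "amylin",
    "dopamine receptor agonist", "meglitinide", "DPP-4 inhibitor",
    "GLP-1RA", "SGLT2 inhibitor", "insulin",
]
RANK = {name: i for i, name in enumerate(CLASS_PRIORITY)}


def therapy_lines(prescriptions):
    """Return drug classes ordered by their first prescription date.

    `prescriptions` is an iterable of (drug_class, start_date) pairs
    (a hypothetical layout). Same-day initiations are ordered by RANK.
    """
    first_start = {}
    for drug_class, start in prescriptions:
        # Keep only the earliest start date seen for each class.
        if drug_class not in first_start or start < first_start[drug_class]:
            first_start[drug_class] = start
    return sorted(first_start, key=lambda c: (first_start[c], RANK[c]))


# Hypothetical patient record.
example = [
    ("insulin", date(2012, 3, 1)),
    ("metformin", date(2010, 5, 20)),
    ("DPP-4 inhibitor", date(2011, 1, 10)),
    ("SU", date(2011, 1, 10)),        # same day as the DPP-4 inhibitor
    ("metformin", date(2011, 6, 1)),  # repeat prescription, not a new line
]
lines = therapy_lines(example)
first_line, second_line, third_line = lines[:3]
```

In this hypothetical record the SU and the DPP-4 inhibitor start on the same day, so the SU is treated as the second-line drug because it appears earlier in the priority list.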

The study cohort included patients with: (1) age at diagnosis ≥18

and <80 years; (2) a diagnosis date strictly after first registered activ-

ity in the CEMR database; (3) a diagnosis date on or after January

1, 2005; (4) initiation of antidiabetic therapy with metformin; (5) initia-

tion of second-line ADD with SU, TZD, DPP-4 inhibitors, GLP-1RAs

or insulin; (6) available HbA1c measure at second-line ADD initiation

(baseline); and (7) second-line therapy duration ≥6 months. Additional

restrictions on the duration of second-line therapy were applied:

≥12 months (sub-cohort 1) and ≥24 months (sub-cohort 2).

Baseline body weight, body mass index (BMI), systolic/diastolic

blood pressure and lipids were calculated as the average of available

measurements within the 3 months before and 3 months after initia-

tion of therapy. HbA1c values at baseline, 6, 12, 18 and 24 months

were obtained as the nearest measure within 3 months either side of

the time point. Under condition of at least two non-missing HbA1c

measures over 24 months, the missing data were imputed using a

Markov chain Monte Carlo method, adjusting for age, diabetes dura-

tion and usage of concomitant ADDs.18 The following baseline

HbA1c categories were then created: (1) 53.0–63.9 mmol/mol (7.0–7.9%);

(2) 64.0–75.0 mmol/mol (8.0–9.0%); (3) 75.1–108.0 mmol/mol (9.1–12%);

(4) >108 mmol/mol (>12%).
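
The measurement-selection and categorization steps above can be sketched as follows. This is an illustration under stated assumptions, not the study code: "3 months" is approximated as 91 days, HbA1c values are assumed to be recorded in % to one decimal place, and the function names are hypothetical. The conversion between % and mmol/mol uses the standard IFCC-NGSP master equation. The Markov chain Monte Carlo imputation of missing values is not reproduced here.

```python
from datetime import date

WINDOW_DAYS = 91  # assumption: "3 months" approximated as 91 days


def ngsp_to_ifcc(percent):
    """Convert HbA1c from % (NGSP) to mmol/mol (IFCC) via the master equation."""
    return 10.929 * (percent - 2.15)


def nearest_measure(measurements, target_date, window_days=WINDOW_DAYS):
    """Return the HbA1c value measured closest to target_date, or None.

    `measurements` is a list of (date, hba1c_percent) pairs. Only values
    within `window_days` either side of the target date are considered.
    """
    in_window = [
        (abs((d - target_date).days), value)
        for d, value in measurements
        if abs((d - target_date).days) <= window_days
    ]
    if not in_window:
        return None
    # Smallest distance wins; equal distances fall back to the lower value.
    return min(in_window)[1]


def baseline_category(percent):
    """Map baseline HbA1c (%) to the four categories used in the text."""
    if percent < 7.0:
        return None  # below the lowest category
    if percent < 8.0:
        return "7.0-7.9%"
    if percent <= 9.0:
        return "8.0-9.0%"
    if percent <= 12.0:
        return "9.1-12%"
    return ">12%"


# Hypothetical patient: second-line ADD initiated on 1 March 2012.
hba1c = [(date(2012, 1, 5), 8.6), (date(2012, 3, 20), 8.9), (date(2012, 9, 2), 7.4)]
baseline = nearest_measure(hba1c, date(2012, 3, 1))
six_month = nearest_measure(hba1c, date(2012, 9, 1))
missing = nearest_measure(hba1c, date(2013, 9, 1))
```

Here the baseline value is the measurement taken 19 days after initiation rather than the one taken 56 days before it, and the 18-month time point has no measurement inside its window, so it would be a candidate for imputation.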

The presence of comorbidities prior to baseline was assessed

using the relevant disease identification codes. The Charlson comor-

bidity index (CCI) score was calculated using the algorithm described

by Quan et al.19 Cardiovascular disease was defined as ischaemic

heart disease, peripheral vascular disease, heart failure or stroke. Can-

cer was defined as any malignancy except malignant neoplasm

of skin.

2.3 | Statistical methods

Baseline characteristics were summarized as number (%), mean

(SD) or median (first quartile, third quartile) as appropriate. Patterns

of intensification with third-line ADDs were summarized according to

second-line ADDs in the study cohort, sub-cohort 1 and sub-cohort

2. Among patients with ≥2 years of follow-up in the study cohort,

the proportions (95% confidence interval [CI]) of those who initiated

a third ADD within 2 years of baseline were calculated according to

year of second-line treatment initiation.
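
As an illustration of the proportion estimates described above, the following sketch computes a proportion with a 95% CI per calendar year. The text does not state which interval method was used; the Wilson score interval is an assumption made here, and the yearly counts are hypothetical rather than study data.

```python
from math import sqrt

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wilson_interval(events, total, z=Z_95):
    """Return (proportion, lower, upper) using the Wilson score interval."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = events / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return p, centre - half, centre + half


# Hypothetical counts by year of second-line initiation:
# year -> (patients with >=2 years of follow-up, those who started a third ADD within 2 years)
counts = {2007: (1000, 480), 2010: (1200, 500)}
for year, (n, k) in sorted(counts.items()):
    prop, lower, upper = wilson_interval(k, n)
    print(f"{year}: {100 * prop:.1f}% (95% CI {100 * lower:.1f}, {100 * upper:.1f})")
```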

Propensity scores for multiple treatment levels20 were calculated

within each HbA1c category to account for heterogeneous baseline

characteristics among second-line ADD groups. Inverse probability of

these exposure weights21,22 was used to balance second-line treat-

ment groups with regard to age, sex, baseline HbA1c and baseline

CCI score. In patients without a history of cardiovascular disease,

chronic kidney disease or cancer at baseline, the probabilities (95%

CIs) of achieving glycaemic control (HbA1c <53 or <58 mmol/mol [<7% or

<7.5%]) at 6, 12 and 24 months after second-line treatment initiation

were estimated in the study cohort, sub-cohort 1 and sub-cohort

2, respectively. Three outcomes were assessed with multinomial

logistic regression: (1) no glycaemic control achievement at corre-

sponding time point; (2) glycaemic control achievement with a third

ADD addition within the analysis time window; and (3) glycaemic

achievement without a third ADD addition within the analysis time

window. Analyses were conducted by balancing the data as described

above, with additional covariate adjustments for age, sex, and time

from metformin to second-line treatment, separately for the HbA1c

categories of 58–63.9 mmol/mol (7.5–7.9%); 64–75 mmol/mol

(8–9%); and 75.1–108 mmol/mol (9.1–12%).
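
The weighting and outcome definitions above can be illustrated with a simplified sketch. It assumes that each patient already has estimated probabilities of receiving each second-line treatment (the propensity model itself is not fitted here), and it tabulates weighted outcome shares instead of fitting the multinomial logistic regression used in the study. All field names, function names and patient values are hypothetical.

```python
from collections import defaultdict


def classify_outcome(hba1c_at_t, target, third_add_in_window):
    """Assign one of the three outcome categories described in the text."""
    if hba1c_at_t >= target:
        return "no_control"
    if third_add_in_window:
        return "control_with_third_add"
    return "control_without_third_add"


def ip_weight(treatment, propensities):
    """Inverse of the estimated probability of the treatment actually received."""
    return 1.0 / propensities[treatment]


def weighted_outcome_shares(patients):
    """Inverse-probability-weighted share of each outcome per treatment group."""
    totals = defaultdict(float)
    by_outcome = defaultdict(lambda: defaultdict(float))
    for p in patients:
        w = ip_weight(p["treatment"], p["propensities"])
        outcome = classify_outcome(p["hba1c_6m"], p["target"], p["third_add"])
        totals[p["treatment"]] += w
        by_outcome[p["treatment"]][outcome] += w
    return {
        trt: {o: w / totals[trt] for o, w in outcomes.items()}
        for trt, outcomes in by_outcome.items()
    }


# Hypothetical patients with two treatment options and HbA1c in %.
patients = [
    {"treatment": "SU", "propensities": {"SU": 0.5, "GLP-1RA": 0.5},
     "hba1c_6m": 6.8, "target": 7.0, "third_add": False},
    {"treatment": "SU", "propensities": {"SU": 0.25, "GLP-1RA": 0.75},
     "hba1c_6m": 7.6, "target": 7.0, "third_add": False},
    {"treatment": "GLP-1RA", "propensities": {"SU": 0.5, "GLP-1RA": 0.5},
     "hba1c_6m": 6.5, "target": 7.0, "third_add": True},
]
shares = weighted_outcome_shares(patients)
```

The second SU patient had only a 25% estimated chance of receiving an SU, so that patient carries twice the weight of the first. The weighted SU shares are therefore one third with control and two thirds without, rather than the unweighted half and half.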

In patients with baseline HbA1c concentrations 58–63.9 mmol/

mol (7.5–7.9%) who achieved an HbA1c target of 53 mmol/mol (7%)

at 6 months without addition of a third ADD, the probabilities of sus-

taining HbA1c control over 24 months were estimated after the bal-

ancing and adjustments described above. Similarly, in patients with

baseline HbA1c of 64–75 mmol/mol (8–9%) who achieved HbA1c of

<58 mmol/mol (<7.5%) at 6 months without addition of a third ADD,

the adjusted probabilities of sustaining HbA1c control over

24 months were estimated. Finally, in patients with baseline HbA1c

concentration of 75.1–108 mmol/mol (9.1–12%) who achieved

HbA1c <58 mmol/mol (<7.5%) at 6 months with or without third-line

treatment intensification, the adjusted probabilities of sustaining

HbA1c control (irrespective of third-line ADD status) over 24 months

were estimated. The assessment of achieving HbA1c <53 mmol/mol

(<7%) in this category was considered clinically unrealistic.
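
The notion of sustaining an early glycaemic achievement can be illustrated with a raw (unadjusted) calculation. The sketch below makes two assumptions that the text does not spell out: sustainability is checked separately at each later time point, and a patient counts as sustaining control only if HbA1c is still below the target and no third ADD has been added by that time. The data layout and names are hypothetical, and the balancing and covariate adjustment used in the study are omitted.

```python
def sustained_share(patients, target, month):
    """Among 6-month achievers without a third ADD, share still controlled at `month`."""
    achievers = [
        p for p in patients
        if p["hba1c"][6] < target and not p["third_add_by"][6]
    ]
    if not achievers:
        return None
    kept = [
        p for p in achievers
        if p["hba1c"][month] < target and not p["third_add_by"][month]
    ]
    return len(kept) / len(achievers)


# Hypothetical cohort: HbA1c in % at 6, 12 and 24 months, plus third-ADD status.
no_add = {6: False, 12: False, 24: False}
cohort = [
    {"hba1c": {6: 6.7, 12: 6.8, 24: 6.9}, "third_add_by": no_add},
    {"hba1c": {6: 6.9, 12: 6.8, 24: 7.4}, "third_add_by": no_add},
    {"hba1c": {6: 6.8, 12: 6.6, 24: 6.5},
     "third_add_by": {6: False, 12: True, 24: True}},
    {"hba1c": {6: 7.3, 12: 7.1, 24: 6.9}, "third_add_by": no_add},
]
share_12 = sustained_share(cohort, 7.0, 12)
share_24 = sustained_share(cohort, 7.0, 24)
```

Three of the four hypothetical patients reach the target at 6 months; two of them are still controlled without a third ADD at 12 months and one at 24 months.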

Sensitivity analyses included an intention-to-treat evaluation and

separate assessment in patients with comorbidities at baseline.

3 | RESULTS

From 2 624 954 identified patients with type 2 diabetes, 195 720

initiated second-line ADD after metformin and had available HbA1c

measurements (Figure S1). Of these, 85%, 79%, 77%, 83% and 83%

in the SU, DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respec-

tively, continued therapy for at least 6 months. The study cohort

included 90 572, 29 308, 6696, 21 827 and 14 678 patients in the

SU, DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respectively

(Table 1). On average, the progression to a second ADD occurred

9 months after metformin initiation. Available follow-up years from

baseline were 4.0, 3.2, 3.7, 3.5 and 5.6 years in the SU, DPP-4 inhibi-

tor, GLP-1RA, insulin and TZD groups, respectively, and 84% of

patients continued therapy for at least 1 year. The distributions of

age, sex, BMI and comorbidities at baseline were significantly different

among the second-line ADDs (Table 1).

The distribution of HbA1c categories at baseline was heteroge-

neous among the treatment groups (Table 1). With a mean

(SD) cohort HbA1c level of 8.4 (1.9)% (68 mmol/mol) at second-line

therapy initiation, the proportions of patients with baseline HbA1c

<64 mmol/mol (<8%) were 52%, 58%, 67%, 36% and 66% in the SU,

DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respectively.

3.1 | Treatment intensification with a third drug

Overall, 52% in the cohort had a third ADD prescribed (either in addi-

tion to or as a switch from a second ADD) during the available

follow-up. On average, the progression to a third ADD occurred at

15 months after second-line treatment initiation (Table 2). Of those

who initiated a third drug, 88% added it to dual therapy (ranging from

70% in the insulin group to 94% in the GLP-1RA group), while only

12% ceased the second ADD and switched to a third agent.

By study design, patients who switched to a third agent within

6, 12 or 24 months were not included in the study cohort, sub-cohort

1 or sub-cohort 2, respectively. During 6 months of therapy post

baseline, 27%, 21%, 26%, 12% and 29% of patients in the SU, DPP-4

inhibitor, GLP-1RA, insulin and TZD groups, respectively, added a

third-line therapy (Table 2). Insulin was the most popular third-line

ADD, followed by DPP-4 inhibitors. Of those who added a third drug,

insulin was chosen by 26%, 36%, 69% and 32% of patients in the SU,

DPP-4 inhibitor, GLP-1RA and TZD groups, respectively (Table 2).

Among those who continued the second-line therapy for 12 months

(sub-cohort 1) and for 24 months (sub-cohort 2), 30% and 39% added

a third-line therapy, respectively.

3.2 | Temporal pattern of initiating third-line ADDs

Irrespective of the class of second-line ADD, the proportions of

patients who initiated a third ADD within 2 years of baseline are

shown in Figure 1A (“All”), stratified by calendar year of second-line

initiation. Figure 1 also shows those who intensified treatment with a

third ADD, excluding TZD as second-line group (“All without TZD”)

because a large proportion of patients ceased TZD treatment as a

result of cardiovascular safety concerns23–25 and not necessarily

TABLE 1 Characteristics of patients at initiation of second-line antidiabetic drug

| Characteristic | Metformin + SU | Metformin + DPP-4 inhibitor | Metformin + GLP-1RA | Metformin + insulin | Metformin + TZD | All |
|---|---|---|---|---|---|---|
| N | 90 572 | 29 308 | 6696 | 21 827 | 14 678 | 163 081 |
| Mean (SD) age, years | 59 (12) | 57 (12) | 53 (11) | 56 (13) | 57 (11) | 57 (12) |
| Men, n (%) | 46 005 (51) | 14 330 (49) | 2354 (35) | 9858 (45) | 7782 (53) | 80 329 (49) |
| White ethnicity, n (%) | 63 338 (70) | 20 366 (69) | 5100 (76) | 14 267 (65) | 10 256 (70) | 113 327 (69) |
| Black ethnicity, n (%) | 11 703 (13) | 3618 (12) | 616 (9) | 3690 (17) | 1434 (10) | 21 061 (13) |
| Mean (SD) time from metformin to second drug, mo | 8.9 (16.8) | 12.8 (19.3) | 12.1 (18.3) | 5.8 (14.2) | 4.8 (11.7) | 9.0 (16.8) |
| Mean (SD) follow-up from baseline, years | 4.03 (2.49) | 3.22 (1.95) | 3.66 (2.39) | 3.46 (2.24) | 5.57 (2.80) | 3.93 (2.47) |
| Mean (SD) therapy duration from baseline, mo | 38.3 (26.3) | 29.6 (20.1) | 28.4 (21.0) | 37.9 (26.0) | 35.8 (26.1) | 36.0 (25.3) |
| Therapy duration ≥12 mo, n (%) | 77 779 (86) | 23 327 (80) | 5061 (76) | 18 729 (86) | 12 040 (82) | 136 936 (84) |
| Therapy duration ≥24 mo, n (%) | 56 324 (62) | 14 746 (50) | 3090 (46) | 13 472 (62) | 8297 (57) | 95 929 (59) |
| Mean (SD) HbA1c, % | 8.4 (1.8) | 8.2 (1.7) | 7.8 (1.6) | 9.3 (2.3) | 7.9 (1.7) | 8.4 (1.9) |
| Mean HbA1c, mmol/mol | 68 | 66 | 62 | 78 | 63 | 68 |
| HbA1c category, n (%) | | | | | | |
| 53–63.9 mmol/mol (7–7.9%) | 26 493 (29) | 10 112 (35) | 1953 (29) | 4034 (18) | 4139 (28) | 46 731 (29) |
| 64–75 mmol/mol (8–9%) | 18 701 (21) | 5726 (20) | 1027 (15) | 3838 (18) | 2295 (16) | 31 587 (19) |
| 75.1–108 mmol/mol (9.1–12%) | 20 148 (22) | 5373 (18) | 989 (15) | 7432 (34) | 2183 (15) | 36 125 (22) |
| >108 mmol/mol (>12%) | 4695 (5) | 1227 (4) | 166 (2) | 2798 (13) | 504 (3) | 9390 (6) |
| Mean (SD) weight, kg | 98.3 (24.5) | 98.9 (24.2) | 109.5 (25.9) | 99.8 (26.2) | 100.2 (23.9) | 99.3 (24.8) |
| Mean (SD) BMI, kg/m2 | 34.5 (7.7) | 34.5 (7.6) | 38.5 (8.1) | 35.2 (8.4) | 34.8 (7.6) | 34.8 (7.8) |
| BMI category^a, n (%) | | | | | | |
| Normal | 5803 (7) | 1841 (6) | 89 (1) | 1712 (8) | 827 (6) | 10 272 (6) |
| Overweight | 20 477 (23) | 6567 (23) | 669 (10) | 4217 (20) | 3130 (22) | 35 060 (22) |
| Grade 1 | 25 568 (29) | 8570 (30) | 1661 (25) | 5697 (27) | 4029 (29) | 45 525 (29) |
| Grade ≥2 | 35 853 (41) | 11 788 (41) | 4144 (63) | 9587 (45) | 6012 (43) | 67 384 (43) |
| Mean (SD) SBP, mm Hg | 131 (15) | 129 (13) | 128 (13) | 130 (15) | 130 (14) | 130 (14) |
| SBP ≥140 mm Hg, n (%) | 22 164 (25) | 5807 (20) | 1084 (17) | 5022 (23) | 3088 (22) | 37 165 (23) |
| Mean (SD) LDL cholesterol, mg/dL | 98 (34) | 98 (35) | 95 (34) | 99 (37) | 97 (34) | 98 (35) |
| Mean (SD) HDL cholesterol, mg/dL | 43 (12) | 44 (12) | 44 (12) | 43 (13) | 45 (12) | 43 (12) |
| Median (IQR) triglycerides, mg/dL | 150 (109, 199) | 147 (108, 196) | 150 (109, 200) | 144 (103, 197) | 139 (101, 190) | 147 (107, 198) |
| CVD, CKD or cancer, n (%) | 23 281 (26) | 7223 (25) | 1205 (18) | 5870 (27) | 2982 (20) | 40 561 (25) |
| CVD, n (%) | 18 031 (20) | 5406 (18) | 852 (13) | 4675 (21) | 2276 (16) | 31 240 (19) |
| CKD, n (%) | 3750 (4) | 1205 (4) | 151 (2) | 811 (4) | 431 (3) | 6348 (4) |
| Cancer, n (%) | 4469 (5) | 1628 (6) | 285 (4) | 1103 (5) | 552 (4) | 8037 (5) |
| Neuropathy, n (%) | 7153 (8) | 2080 (7) | 519 (8) | 2305 (11) | 879 (6) | 12 936 (8) |
| Retinopathy, n (%) | 1329 (1) | 288 (1) | 76 (1) | 535 (2) | 166 (1) | 2394 (1) |
| Depression, n (%) | 14 925 (16) | 5427 (19) | 1576 (24) | 4200 (19) | 2145 (15) | 28 273 (17) |
| Mean (SD) CCI | 1.7 (1.1) | 1.7 (1.1) | 1.6 (0.9) | 1.8 (1.2) | 1.5 (0.9) | 1.7 (1.1) |

Abbreviations: BMI, body mass index; CCI, Charlson comorbidity index; CVD, cardiovascular disease; CKD, chronic kidney disease; DBP, diastolic blood pressure; DPP-4, dipeptidyl peptidase-4; GLP-1RA, glucagon-like peptide-1 receptor agonist; HbA1c, glycated haemoglobin; IQR, interquartile range; SBP, systolic blood pressure; SU, sulphonylurea; TZD, thiazolidinedione.

a BMI category: normal: <25 kg/m2; overweight: ≥25 and <30 kg/m2; Grade 1: ≥30 and <35 kg/m2; Grade ≥2: ≥35 kg/m2.

TABLE 2 Third-line anti-diabetic drug usage in the study cohort and two sub-cohorts^a

| Cohort | Measure | Statistic | Metformin + SU | Metformin + DPP-4 inhibitor | Metformin + GLP-1RA | Metformin + insulin | Metformin + TZD | All |
|---|---|---|---|---|---|---|---|---|
| Study cohort | | N | 90 572 | 29 308 | 6696 | 21 827 | 14 678 | 163 081 |
| | Initiated third drug | n (% from N) | 49 255 (54) | 15 248 (52) | 3513 (52) | 7275 (33) | 10 006 (68) | 85 297 (52) |
| | Time from second to third drug, mo | Mean (SD) | 14.3 (19.5) | 14.3 (16.0) | 13.2 (17.8) | 17.7 (19.3) | 18.4 (23.1) | 15.0 (19.4) |
| | Added third drug within 6 mo | n1 (% from N) | 24 600 (27) | 6053 (21) | 1725 (26) | 2627 (12) | 4260 (29) | 39 265 (24) |
| | Most popular third drug | Name; n (% from n1) | TZD; 8107 (33) | Insulin; 2200 (36) | Insulin; 1189 (69) | SU; 888 (34) | Insulin; 1352 (32) | Insulin; 11 054 (28) |
| | Second most popular third drug | Name; n (% from n1) | DPP-4 inhibitor; 7455 (30) | SU; 2073 (34) | SU; 193 (11) | DPP-4 inhibitor; 703 (27) | DPP-4 inhibitor; 1236 (29) | |
| Sub-cohort 1 | | N2 | 77 779 | 23 327 | 5061 | 18 729 | 12 040 | 136 936 |
| | Added third drug within 6 mo | n (% from N2) | 20 990 (27) | 4581 (20) | 1300 (26) | 2220 (12) | 3450 (29) | 32 541 (24) |
| | Added third drug within 6 to 12 mo | n2 (% from N2) | 4265 (5) | 1860 (8) | 293 (6) | 1076 (6) | 682 (6) | 8176 (6) |
| | Most popular third drug | Name; n (% from n2) | DPP-4 inhibitor; 1737 (41) | SU; 975 (52) | SU; 104 (35) | SU; 336 (31) | SU; 340 (50) | DPP-4 inhibitor; 2217 (27) |
| | Second most popular third drug | Name; n (% from n2) | Insulin; 1074 (25) | Insulin; 267 (14) | Insulin; 62 (21) | GLP-1RA; 269 (25) | | SU; 1755 (21) |
| Sub-cohort 2 | | N3 | 56 324 | 14 746 | 3090 | 13 472 | 8297 | 95 929 |
| | Added third drug within 6 mo | n (% from N3) | 15 074 (27) | 2549 (17) | 800 (26) | 1521 (11) | 2309 (28) | 22 253 (23) |
| | Added third drug within 6 to 12 mo | n (% from N3) | 2867 (5) | 1124 (8) | 168 (5) | 756 (6) | 471 (6) | 5386 (6) |
| | Added third drug within 12 to 24 mo | n3 (% from N3) | 5302 (9) | 1833 (12) | 319 (10) | 1070 (8) | 645 (8) | 9169 (10) |
| | Most popular third drug | Name; n (% from n3) | DPP-4 inhibitor; 2356 (44) | SU; 959 (52) | SU; 113 (35) | SU; 301 (28) | SU; 297 (46) | DPP-4 inhibitor; 2876 (31) |
| | Second most popular third drug | Name; n (% from n3) | Insulin; 1225 (23) | SGLT2; 274 (15) | Insulin; 63 (20) | | | SU; 1670 (18) |

Abbreviations: DPP-4, dipeptidyl peptidase-4; GLP-1RA, glucagon-like peptide-1 receptor agonist; SU, sulphonylurea; TZD, thiazolidinedione.

a Duration of second-line agent ≥6 months/≥12 months/≥24 months in the study cohort/sub-cohort 1/sub-cohort 2, respectively.

FIGURE 1 Among patients who had at least 2 years of follow-up in the study cohort, the proportion (95% confidence interval) of patients who initiated third antidiabetic drug (ADD) within 2 years of second ADD: A, irrespective of HbA1c level at second-line initiation; B, among those with HbA1c of 64.0–75.0 mmol/mol (8–9%) at second-line initiation; and C, among those with HbA1c of 75.1–108.0 mmol/mol (9.1–12%) at second-line initiation. HbA1c, glycated haemoglobin; INS, insulin; TZD, thiazolidinedione

because of efficacy issues. The figure also provides the data exclud-

ing those who had a TZD or insulin as second-line (“All without

TZD & insulin”) to explore the possible change in intensification rate

with non-insulin ADDs over time, accounting for decreasing popular-

ity of TZDs. Figure 1B and C focus on those who had baseline HbA1c

of 64–75 mmol/mol (8–9%) and 75.1–108 mmol/mol (9.1–12%),

respectively. Figure 1 shows that between 2007 and 2014 the pro-

portion of patients initiating a third ADD, within 2 years of adding

the second ADD, decreased; however, this decline started to reverse

in 2014, especially among those whose HbA1c was 75.1–108 mmol/

mol (9.1–12%) at initiation of the second ADD.

3.3 | Glycaemic achievements and sustainability

At 6 months, the mean unadjusted HbA1c reductions were 0.8%,

0.8%, 0.7%, 1.0% and 0.8% in the SU, DPP-4 inhibitor, GLP-1RA,

insulin and TZD groups, respectively. The mean adjusted reductions

at 6 months were 0.8%, 1.0%, 1.1%, 0.7% and 1.0% in the respective

treatment groups (significant for all groups, P < .01).

3.3.1 | Baseline HbA1c group 58–63.9 mmol/mol (7.5–7.9%)

Among patients with HbA1c concentrations of 58–63.9 mmol/mol

(7.5–7.9%) at baseline, 44%, 47%, 57%, 31% and 57% of patients in

the SU, DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respec-

tively, achieved HbA1c <53 mmol/mol (<7%) at 6 months without

third-line treatment addition. The corresponding adjusted probabili-

ties were 32%, 38%, 39%, 26% and 38% in the second-line treatment

groups (P < .01 for all groups [Figure 2A]); however, the probabilities

of reducing HbA1c below the target 53 mmol/mol (7%) without

third-line ADD intensification declined by 5%, 5%, 6%, 2% and 1% at

12 months and by 9%, 8%, 15%, 5% and 7% at 24 months in the SU,

DPP-4 inhibitor, GLP-1RA, insulin and TZD groups, respectively.

Among those who reduced HbA1c to <53 mmol/mol (<7%) with-

out a third ADD at 6 months, 68% and 58% of patients sustained this

glycaemic achievement at 12 and 24 months, respectively. The prob-

ability of sustaining this glycaemic achievement was higher and simi-

lar in the GLP-1RA and TZD groups at 12 months (range of 95% CI

of probability: 76%, 79%), compared with other second-line therapy

options (Figure 2B). While the probability of sustaining this glycaemic

control declined significantly by 24 months, GLP-1RAs, DPP-4 inhibi-

tors and TZDs provided significantly higher chances of sustainability

(range of 95% CI of probability: 53%, 58%) compared with patients

treated with insulin or SUs (range of 95% CI of probability:

46%, 50%).

3.3.2 | Baseline HbA1c 64–75 mmol/mol (8–9%)

Among patients with baseline HbA1c concentrations of 64–75

mmol/mol (8–9%), 55%, 58%, 66%, 41% and 67% of patients in the

SU, DPP-4 inhibitor, GLP-1RA, insulin and TZD groups achieved

HbA1c <58 mmol/mol (<7.5%) at 6 months without third-line ADD

addition, and the corresponding adjusted probabilities were 38%,

44%, 40%, 34% and 42%, respectively (Figure 2C). The probabilities

of this glycaemic achievement declined significantly by at least 5%

across all treatment groups at 12 months, and by at least 8% at

24 months.

Among those who achieved an HbA1c concentration <58 mmol/

mol (<7.5%) without a third ADD at 6 months, 76% and 67% sus-

tained this glycaemic achievement at 12 and 24 months, respectively,

without requiring third-line treatment intensification. The probability

of sustaining this glycaemic achievement was significantly higher in

the GLP-1RA and TZD groups at 12 months (range of 95% CI of

probability: 76%, 79%), compared with other second-line ADDs

(Figure 2D; P < .01). While the probability of sustaining this glycae-

mic control declined significantly by 24 months of therapy across all

groups, patients treated with insulin had the lowest probability of

sustaining the glycaemic control.

3.3.3 | Baseline HbA1c 75.1–108 mmol/mol (9.1–12%)

In the patients with 75.1–108 mmol/mol (9.1–12%) baseline HbA1c,

29%, 36% and 45% added a third ADD within 6, 12 and 24 months

of baseline, respectively. Irrespective of third ADD status, 37%, 45%,

38%, 21% and 43% of patients in the SU, DPP-4 inhibitor, GLP-1RA,

insulin and TZD groups, respectively, achieved HbA1c <58 mmol/mol

(<7.5%) at 6 months, with corresponding probabilities of 36%, 45%,

38%, 33% and 43% (Figure 2E). The probability of achieving an

HbA1c concentration <58 mmol/mol (<7.5%) at 24 months reduced

by 4% for insulin users, did not change in the SU and DPP-4 inhibitor

groups, and increased by 8% and 9% in the second-line GLP-1RA and

TZD groups (all P < .01). Among those who achieved an HbA1c con-

centration <58 mmol/mol (<7.5%) at 6 months, 72% and 58%,

respectively, sustained this glycaemic achievement at 12 and

24 months, irrespective of third-line treatment intensification status.

The probability of sustaining glycaemic control at <58 mmol/mol

(<7.5%) over 12 and 24 months of treatment was significantly higher

in the incretin and TZD groups, while insulin and SU offered lower

chances of sustainable control (Figure 2F).

3.3.4 | Baseline HbA1c >108 mmol/mol (>12%)

In patients with baseline HbA1c >108 mmol/mol (>12%), the proba-

bility of reducing HbA1c by at least 2% increased over time: 82% at

2 years of insulin therapy, and ~90% for other second-line choices.

The probabilities of reducing HbA1c by at least 1.5% in this baseline

HbA1c group were not significantly different among the ADD groups

over 2 years (results not shown).

3.4 | Sensitivity analyses

An intention-to-treat approach obtained similar results to those of

the main analyses. Patients with cardiovascular disease, chronic kid-

ney disease or cancer at baseline had marginally higher probabilities

of glycaemic achievements in all treatment groups, compared with

those without comorbidities (results not shown).

4 | DISCUSSION

The novelty of the present pharmaco-epidemiological study, with

real-world population-level data, is its evaluation of short- and long-

term glycaemic control with post-metformin major second-line ADDs,

and the comparison of the sustainability of such glycaemic goals over

FIGURE 2 At 6, 12, and 24 months of second-line initiation, adjusted probability (95% confidence interval) of A, reducing glycated haemoglobin (HbA1c) below 53 mmol/mol (7%) without adding third anti-diabetic drug (ADD), from baseline HbA1c of 58–63.9 mmol/mol (7.5–7.9%); B, sustaining 6-month achievement without adding a third ADD; C, reducing HbA1c below 58 mmol/mol (7.5%) without adding a third ADD, from baseline HbA1c of 64–75 mmol/mol (8–9%); D, sustaining 6-month achievement without adding a third ADD; E, reducing HbA1c below 58 mmol/mol (7.5%) (irrespective of third ADD), from baseline HbA1c of 75.1–108 mmol/mol (9.1–12%); and F, sustaining 6-month achievement (irrespective of third ADD). DPP-4, dipeptidyl peptidase-4; GLP-1RA, glucagon-like peptide-1 receptor agonist; INS, insulin; MET, metformin; SU, sulphonylurea; TZD, thiazolidinedione

24 months of continuous treatment. Among patients with HbA1c

concentrations of 58–63.9 mmol/mol (7.5–7.9%) at second-line ADD

initiation, the probabilities of achieving an HbA1c concentration of

<53 mmol/mol (<7%) without adding a third-line ADD at 6 and

12 months were significantly higher in the incretin and TZD groups,

compared with the insulin and SU groups. Treatment with incretins

or TZDs also offered a significantly higher probability of sustaining

this glycaemic achievement over 24 months of treatment without the

need for further therapy intensification. Among those who initiated a

second-line ADD at HbA1c levels of 64–75 mmol/mol (8–9%), DPP-

4 inhibitors and TZDs offered significantly higher and similar chances

of reducing HbA1c to <58 mmol/mol (<7.5%) over 24 months of

therapy without adding a third ADD, compared with other second-

line groups. GLP-1RAs and TZDs offered the highest chances of sus-

taining this control over 24 months, while treatment with SUs, insulin

and DPP-4 inhibitors provided significantly lower sustainability

chances.

In this real-world study, we observed similar performance by

DPP-4 inhibitors and GLP-1RAs in terms of the probability of reduc-

ing HbA1c to a clinically desirable glycaemic target over 24 months

of therapy, when added to metformin. In terms of sustaining the gly-

caemic achievements over 12 months, GLP-1RAs appear to offer

higher chances among patients with HbA1c <75 mmol/mol (<9%) at

second-line initiation (~76%-79% probability), compared with DPP-4

inhibitors (~68%-73% probability); however, this difference disap-

pears at 24 months of therapy. While SUs as second-line therapy

offer a higher probability of achieving desirable glycaemic control

across all HbA1c categories (<108 mmol/mol (<12%)) compared with

insulin over 2 years, the probability of sustaining the early glycaemic

achievement appears to be similar between these two therapy

options. We have seen that, across all HbA1c categories, treatment

with second-line TZDs provided better or similar glycaemic achieve-

ments and sustainability, compared with other therapy options. This

result supports the study by Mamza et al,26 who reported that treat-

ment with post-metformin TZD provides the most durable glycaemic

response compared with second-line SU and DPP-4 inhibitor treat-

ment. Recent results of the TOSCA.IT trial, providing cardiovascular

safety reassurance with pioglitazone, taken in conjunction with our

results may increase the popularity of TZDs as a therapeutic

option.27

Compared with sulphonylurea add-on to metformin, Thomsen

et al8 reported a higher likelihood of achieving HbA1c below 53

mmol/mol (7%) at 6 months for second-line GLP-1RA users (Relative

Risk (95% CI) of 1.10 (1.01, 1.19)), and lower likelihoods for DPP-4

inhibitor (Relative Risk (95% CI) 0.94 (0.89, 0.99)) and insulin users

(Relative Risk (95% CI) 0.88 (0.77, 0.99)). Our results are closer to

those of the study conducted by Rathmann et al,7 who reported odds

ratios (with SU as reference) of achieving HbA1c below 53 mmol/mol

(7%) of 1.2, 1.4, 1.7 and 0.7 for second-line DPP-4 inhibitors, GLP-

1RAs, TZDs and insulin, respectively.

Our findings are also in line with a study that used data from the

National Health and Nutrition Examination Survey, which reported

that only half of patients achieve HbA1c below 53 mmol/mol (7%).28

Furthermore, in patients with HbA1c <75 mmol/mol (<9%) at

second-line initiation, we observed that only 30% maintained

glycaemic control after 2 years of continuous treatment without fur-

ther intensification with a third ADD.

Comparatively poor performance of insulin as a second-line

agent may be surprising, as randomized controlled trial data show

that insulin can achieve at least as much HbA1c reduction as other

agents. A possible reason for this is that insulin is often chosen when

there are multiple comorbidities, and in such patients, the HbA1c tar-

get may be higher, and many other potential third-line ADDs may be

contra-indicated. In addition, the insulin dose may be inadequately

titrated because of adverse effects, such as hypoglycaemia and

weight gain, as well as inadequate healthcare professional support for

the regular titration of insulin doses. More work needs to be carried

out to determine how best to translate the clinical trial efficacy of

insulin into clinical practice effectiveness.

We observed that the proportion of patients who intensify treat-

ment with a third ADD has decreased only moderately during the last

decade, despite the increasing availability of newer agents. Lipska

et al29 reported that overall glycaemic control in the United States

did not change between 2006 and 2013.

A strength of the present study is the availability of data from

patients' medication lists that include prescribed medications within

the EMR network and also medications that could be prescribed out-

side of the EMR. Furthermore, the CEMR database tracks longitudinal

treatment adjustments, and contains comprehensive clinical informa-

tion, which is usually not available in claims databases. In addition, we

used advanced data mining and statistical methods. Given unequal

probabilities of receiving particular second-line agents in the real-

world scenario, we modelled treatment assignment with multinomial

propensity scores, and then assessed the adjusted outcomes of the

study.

The limitations of this study include the non-availability of data

on: (1) adherence and side-effects; (2) diet and exercise; (3) socio-

economic status; and (4) insurance type. Carls and colleagues31

highlighted alarmingly low rates of medication adherence as the main

cause of the disconnect between results of real-world studies and

clinical trials. Importantly, in the present study, we focused only on

those who continued the second-line therapy for a minimum of

6 months. Montvida et al4 recently reported higher discontinuation

rates for incretins, compared with older treatment alternatives.

To conclude, incretin-based therapies and TZDs offered a higher

probability of long-term glycaemic achievements and of sustaining

these, compared with SUs and insulin for metformin-treated patients

with type 2 diabetes. While the results of a large randomized

controlled trial (GRADE) comparing glycaemic efficacy of major

second-line therapies are not expected before 2020, the present

study provides much-needed information to patients and clinicians

with regard to the probability of sustainable glycaemic control with

different therapy options.30

ACKNOWLEDGMENTS

Melbourne EpiCentre gratefully acknowledges the support from the

Australian Government Department of Education's National Collabo-

rative Research Infrastructure Strategy (NCRIS) initiative through

Therapeutic Innovation Australia. O.M. acknowledges the PhD

scholarship from Queensland University of Technology, Australia, and

her co-supervisors Prof. Ross Young and Prof. Louise Hafner of the

same University. No separate funding was obtained for this study.

Conflict of interest

S.K.P. has acted as a consultant and/or speaker for Novartis, GI

Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi Pharmaceutical

and Amylin Pharmaceuticals LLC. He has received grants in support

of investigator and investigator-initiated clinical studies from Merck,

Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis

and Pfizer. O.M. has no conflict of interest to declare.

J.E.S. has received honoraria or grant support from Merck Sharp and

Dohme, Novo Nordisk, Eli Lilly, AstraZeneca, Sanofi-Aventis, Mylan

Pharmaceuticals and Boehringer Ingelheim.

Author contributions

O.M. and S.K.P. were responsible for the primary design of the study.

O.M. conducted the data extraction. O.M. and S.K.P. jointly conducted

the statistical analyses. The first draft of the manuscript was devel-

oped by O.M. and S.K.P., and all authors contributed to the finaliza-

tion of the manuscript. S.K.P. had full access to all the data in the

study and takes responsibility for the integrity of the data and the

accuracy of the data analysis.

ORCID

Jonathan E. Shaw http://orcid.org/0000-0002-6187-2203

Lawrence Blonde http://orcid.org/0000-0003-0492-6698

Sanjoy K. Paul http://orcid.org/0000-0003-0848-7194

REFERENCES

1. American Diabetes Association. Standards of medical care in diabetes—2017: summary of revisions. Diabetes Care. 2017;40:S4-S5.

2. Turner RC, Cull CA, Frighi V, Holman RR, Group UPDS. Glycemic control with diet, sulfonylurea, metformin, or insulin in patients with type 2 diabetes mellitus: progressive requirement for multiple therapies (UKPDS 49). JAMA. 1999;281:2005-2012.

3. Garber AJ, Abrahamson MJ, Barzilay JI, et al. Consensus statement by the American Association of Clinical Endocrinologists and American College of Endocrinology on the comprehensive type 2 diabetes management algorithm–2017 executive summary. Endocr Pract. 2017;23:207-238.

4. Montvida O, Shaw J, Atherton JJ, Stringer F, Paul SK. Long-term trends in antidiabetes drug usage in the US: real-world evidence in patients newly diagnosed with type 2 diabetes. Diabetes Care. 2018;41:69-78.

5. Giugliano D, Maiorino MI, Bellastella G, Esposito K. Comment on Edelman and Polonsky. Type 2 diabetes in the real world: the elusive nature of glycemic control. Diabetes Care. 2017;40:1425-1432. Diabetes Care 2018;41:e17.

6. McCarthy MI. Painting a new picture of personalised medicine for diabetes. Diabetologia. 2017;60:793-799.

7. Rathmann W, Bongaerts B, Kostev K. Change in glycated haemoglobin levels after initiating second-line therapy in type 2 diabetes: a primary care database study. Diabetes Obes Metab. 2016;18:840-843.

8. Thomsen RW, Baggesen LM, Søgaard M, et al. Early glycaemic control in metformin users receiving their first add-on therapy: a population-based study of 4,734 people with type 2 diabetes. Diabetologia. 2015;58:2247-2253.

9. Palmer SC, Mavridis D, Nicolucci A, et al. Comparison of clinical outcomes and adverse events associated with glucose-lowering drugs in patients with type 2 diabetes: a meta-analysis. JAMA. 2016;316:313-324.

10. Maruthur NM, Tseng E, Hutfless S, et al. Diabetes medications as monotherapy or metformin-based combination therapy for type 2 diabetes: a systematic review and meta-analysis. Ann Intern Med. 2016;164:740-751.

11. Bennett WL, Maruthur NM, Singh S, et al. Comparative effectiveness and safety of medications for type 2 diabetes: an update including new drugs and 2-drug combinations. Ann Intern Med. 2011;154:602-613.

12. Zhang Y, McCoy RG, Mason JE, Smith SA, Shah ND, Denton BT. Second-line agents for glycemic control for type 2 diabetes: are newer agents better? Diabetes Care. 2014;37:1338-1345.

13. Centers for Disease Control and Prevention. National Diabetes Statistics Report: estimates of diabetes and its burden in the United States. Atlanta, GA: US Department of Health and Human Services, 2014.

14. Crawford AG, Cote C, Couto J, et al. Comparison of GE Centricity electronic medical record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Popul Health Manag. 2010;13:139-150.

15. Brixner D, Said Q, Kirkness C, Oberg B, Ben-Joseph R, Oderda G. Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value in Health. 2007;10(s1).

16. Paul SK, Shaw J, Montvida O, Klein K. Weight gain in insulin treated patients by BMI categories at treatment initiation: new evidence from real-world data in patients with type 2 diabetes. Diabetes Obes Metab. 2016;18(12):1244-1252.

17. Montvida O, Arandjelović O, Reiner E, Paul SK. Data mining approach to estimate the duration of drug therapy from longitudinal electronic medical records. Open Bioinforma J. 2017;10:1-15.

18. Thomas G, Klein K, Paul S. Statistical challenges in analysing large longitudinal patient-level data: the danger of misleading clinical inferences with imputed data. J Indian Soc Agric Stat. 2014;68:39-54.

19. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43:1130-1139.

20. Ridgeway G, McCaffrey DF, Morral AR, Burgette LF, Griffin BA. Toolkit for Weighting and Analysis of Nonequivalent Groups: A Tutorial for the R TWANG Package. Santa Monica, CA: RAND Corporation, 2014. https://www.rand.org/pubs/tools/TL136z1.html.

21. Lunceford JK, Davidian M. Stratification and weighting via the propensity score in estimation of causal treatment effects: a comparative study. Stat Med. 2004;23:2937-2960.

22. McCaffrey DF, Ridgeway G, Morral AR. Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychol Methods. 2004;9:403.

23. Woodcock J, Sharfstein JM, Hamburg M. Regulatory action on rosiglitazone by the US Food and Drug Administration. N Engl J Med. 2010;363:1489-1491.

24. Tanne JH. FDA places "black box" warning on antidiabetes drugs. BMJ. 2007;334:1237.

25. Nissen SE, Wolski K. Effect of rosiglitazone on the risk of myocardial infarction and death from cardiovascular causes. N Engl J Med. 2007;356:2457-2471.

26. Mamza J, Mehta R, Donnelly R, Idris I. Important differences in the durability of glycaemic response among second-line treatment options when added to metformin in type 2 diabetes: a retrospective cohort study. Ann Med. 2016;48:224-234.

27. Vaccaro O, Masulli M, Nicolucci A, Bonora E, Del Prato S, Maggioni AP, Rivellese AA, Squatrito S, Giorda CB, Sesti G, Mocarelli P. Effects on the incidence of cardiovascular events of the addition of pioglitazone versus sulfonylureas in patients with type 2 diabetes inadequately controlled with metformin (TOSCA.IT): a randomised, multicentre trial. Lancet Diabetes Endocrinol. 2017;5:887-897.

28. Edelman SV, Polonsky WH. Type 2 diabetes in the real world: the elusive nature of glycemic control. Diabetes Care. 2017;40:1425-1432.

29. Lipska KJ, Yao X, Herrin J, et al. Trends in drug utilization, glycemic control, and rates of severe hypoglycemia, 2006–2013. Diabetes Care. 2017;40:468-475.

30. Nathan DM, Buse JB, Kahn SE, et al. Rationale and design of the glycemia reduction approaches in diabetes: a comparative effectiveness study (GRADE). Diabetes Care. 2013;36:2254-2261.

31. Carls GS, Tuttle E, Tan R-D, et al. Understanding the gap between efficacy in randomized controlled trials and effectiveness in real-world use of GLP-1 RA and DPP-4 therapies in patients with type 2 diabetes. Diabetes Care. 2017;40:1469-1478.

SUPPORTING INFORMATION

Additional Supporting Information may be found online in the sup-

porting information tab for this article.

How to cite this article: Montvida O, Shaw JE, Blonde L,

Paul SK. Long-term sustainability of glycaemic achievements

with second-line antidiabetic therapies in patients with type 2

diabetes: A real-world study. Diabetes Obes Metab. 2018;

1–10. https://doi.org/10.1111/dom.13288

Chapter 9: Cardio-metabolic Risk Factor Burden and Safety


Olga Montvida, Xiaoling Cai, Sanjoy K Paul. Cardio-metabolic risk factor burden and safety

in patients with type 2 diabetes receiving intensified anti-diabetic and cardio-protective

therapies.






Xiaoling Cai Contributed to the manuscript development.

Sanjoy K. Paul Conceived the idea, and was responsible for the primary



of the manuscript.

ABSTRACT

Background: Individualized treatment of patients with type 2 diabetes requires detailed

evaluation of risk factor dynamics at population level. The aim was to evaluate persistent

glycaemic and cardiovascular (CV) risk factor burden over 2 years post treatment

intensification (TI).

Methods: From US Centricity Electronic Medical Records, 276,884 patients with type 2

diabetes who intensified metformin were selected. SBP ≥ 130 / 140 mmHg and LDL ≥ 70 /

100 mg/dL were defined as uncontrolled for those with / without a history of CV disease

(CVD) at TI. Triglycerides (Trig) ≥ 150 mg/dL and HbA1c ≥ 7.5% were defined as

uncontrolled. Longitudinal measures over 2 years post TI were used to define continuously

uncontrolled patients.

Findings: With a mean follow-up of 3.7 years, patients were on average 59 years old, 70% were obese, 22%

had a history of CVD; 60 / 30 / 50 / 48% had uncontrolled HbA1c / SBP / LDL / Trig at TI;

81% and 69% were receiving therapies for blood pressure and lipid control respectively.

The proportion of patients with consistently uncontrolled HbA1c increased from 31% in 2005

to 41% in 2014. Among those on lipid-modifying drugs, 41% and 37% had consistently high

LDL and Trig over 2 years. Among those on blood pressure control therapies, 29% had continuously

uncontrolled SBP. Among patients receiving cardio-protective therapies, 62% failed to

achieve control in HbA1c + LDL, 62% in HbA1c + Trig, and 55% in HbA1c + SBP over 2

years post TI. Rates per 1000 person-years of major adverse cardiovascular events were lower

among those who intensified metformin with GLP-1RA, compared to other therapies.

Interpretation: Among patients on multiple therapies for risk factor control, more than a

third had uncontrolled HbA1c, lipid and SBP levels, and 3 out of 5 had uncontrolled HbA1c

and at least one CV risk factor over 2 years post TI.

INTRODUCTION

Cardiovascular (CV) disease in patients with type 2 diabetes has received considerable attention

during the last decade and remains in focus to date, being the most common cause of death and

comorbidity among patients with diabetes 1,2. The efficient management of these patients

requires a multifaceted approach to holistically control hyperglycaemia and CV risk factors

such as blood pressure, body weight, and lipids 3,4. A recent review by Khunti and colleagues

discusses current evidence on the CV benefits of early control of glucose, lipids, and blood

pressure 5.

While American and international guidelines consistently stress the importance of cardio-metabolic

risk factor control, population-level control has not improved during the last

decade in the US 6-8. Using data from the National Health and Nutrition Examination Survey,

Carls and colleagues reported that 57% of patients with diabetes achieved HbA1c < 7% during

2003-2006, compared with only 51% during 2011-2014 8. Similarly, among privately insured and

Medicare Advantage patients with type 2 diabetes, Lipska and colleagues reported a declining

proportion of patients with HbA1c < 7%, from 56% in 2006 to 54% in 2013 7. Ali and

colleagues reported that only 14% of patients with diabetes had simultaneous control of

glucose, blood pressure, cholesterol and non-smoking status during 1999-2010 in the US 6.

Another study of 530,747 patients from the Diabetes Collaborative Registry reported that 83%

and 81% of patients had hypertension and hyperlipidaemia, respectively 3.

A significant proportion of patients with type 2 diabetes eventually intensify first-line metformin

in addition to using multiple cardio-protective medications; nonetheless, poor cardio-metabolic

risk factor control is common in these patients. While previous studies used general

populations of individuals with diabetes, to the best of our knowledge, no study has

holistically explored the patterns of risk factor control post therapy intensification at

population level. In this context, assessment of those who continuously fail to control risk

factors would help to understand whether the increasing evidence of early control benefits and

the introduction of newer classes of anti-diabetic drugs (ADDs) have helped to improve population

health during the last decade.

In patients with type 2 diabetes identified from US primary and secondary ambulatory care

systems’ electronic medical records (EMRs), the aims of this study were to provide an up-to-date

exploration of the population-level (1) glycaemic and CV risk factor control at the time

of metformin intensification; (2) simultaneous control of glycaemic and CV risk factors post

metformin intensification; (3) persistent glycaemic and CV risk factor burden in those who

are using anti-diabetic and cardio-protective therapies; and (4) rates of major adverse

cardiovascular events by different second-line ADD choices.


Data Source

The Centricity EMRs (CEMR) were used in this study. The database represents more than 35,000 solo

practitioners, community clinics, academic medical centres, and large integrated delivery

networks across all US states. Patients in the database are generally representative of the

US population: among those who were active in the CEMR during 2015 and were older

than 18 years, 11.6% were identified as having any type of diabetes. This estimate is very

close to that of the US National Diabetes Statistics (NDS) report, which estimated that 12.2% of the adult

population had diabetes in 2015 9. The database has been extensively used for academic

research worldwide 10-12.

For more than 34 million individuals, longitudinal EMRs were available from 1995 until

April 2016, with comprehensive patient-level information on demographic, anthropometric,

clinical and laboratory variables.

Study Design

For each identified patient with type 2 diabetes, the following drug classes were arranged

chronologically according to the initial prescription dates: metformin, sulfonylurea (SU),

thiazolidinedione (TZD), alpha glucosidase inhibitor, amylin, dopamine receptor agonist,

meglitinide, dipeptidyl peptidase-4 inhibitor (DPP-4i), glucagon like peptide 1 receptor

agonist (GLP-1RA), sodium glucose cotransporter inhibitor, and insulin (INS). Next, data

on each patient’s first-, second-, and third-line ADDs were derived. A robust

methodology for extraction and assessment of longitudinal patient-level medication data from

the Centricity EMRs has recently been described by the authors 13.

The main study cohort included patients with: (1) age at diagnosis ≥18 and <80 years, (2)

diagnosis date strictly after first registered activity in the EMR database, (3) diagnosis date on

or after January 1, 2005, (4) initiated anti-diabetic therapy with metformin, (5) initiated

second-line ADD with SU, TZD, DPP-4i, GLP-1RA or INS, (6) available HbA1c, systolic

blood pressure (SBP), low density lipoprotein (LDL), or triglycerides measure at second-line

ADD initiation (baseline), (7) second-line therapy duration at least three months, and (8)

follow-up from baseline of at least six months. Additional restrictions on follow-up were

applied to define sub-cohorts: at least 12 months (sub-cohort 1) and at least 24 months (sub-cohort 2).

HbA1c measures at baseline, 6, 12, 18, and 24 months were obtained as the nearest measure

within 3 months either side of the time point. Baseline and longitudinal body weight, SBP,

and lipids were calculated as the average of available measures within 3 months either side of

the time point. Provided that at least two non-missing follow-up measures were available over 24 months,

missing data were imputed using a Markov chain Monte Carlo method adjusting for age,

diabetes duration and use of concomitant ADDs 14.
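
The two window rules described above (nearest measure for HbA1c, average of available measures for body weight, SBP and lipids) can be sketched as follows. This is a minimal illustration only, not the extraction code used in the study: the function names, the list of (date, value) pairs as input, the hypothetical patient and the approximation of "3 months" as 91 days are all assumptions made here.

```python
from datetime import date

# Assumption: "3 months either side" is approximated as +/- 91 days.
WINDOW_DAYS = 91


def nearest_in_window(measures, target, window_days=WINDOW_DAYS):
    """Return the value measured closest to `target` within the window (HbA1c rule).

    `measures` is a list of (date, value) pairs. Returns None when no measure
    falls inside the window, i.e. the time point is missing.
    """
    candidates = [(abs((d - target).days), v) for d, v in measures
                  if abs((d - target).days) <= window_days]
    if not candidates:
        return None
    # Ties in distance resolve to the first candidate in input order.
    return min(candidates, key=lambda c: c[0])[1]


def mean_in_window(measures, target, window_days=WINDOW_DAYS):
    """Return the average of all values within the window (weight, SBP, lipids rule)."""
    values = [v for d, v in measures if abs((d - target).days) <= window_days]
    return sum(values) / len(values) if values else None


# Hypothetical patient with baseline on 1 January 2010.
baseline = date(2010, 1, 1)
hba1c = [(date(2009, 11, 20), 8.9), (date(2010, 1, 15), 8.4), (date(2010, 6, 1), 7.6)]
sbp = [(date(2009, 12, 1), 140), (date(2010, 2, 1), 130), (date(2010, 8, 1), 120)]

baseline_hba1c = nearest_in_window(hba1c, baseline)
baseline_sbp = mean_in_window(sbp, baseline)
```

Time points for which these helpers return None correspond to the missing values that were subsequently imputed; the imputation step itself is not reproduced here.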

The presence of comorbidities prior to baseline was assessed by relevant disease

identification codes. Cardiovascular disease (CVD) was defined as ischaemic heart disease,

peripheral vascular disease, heart failure (HF), or stroke. Three-point major adverse

cardiovascular event (MACE) was defined as the occurrence of HF, myocardial infarction (MI), or

stroke.

Lipid modifying agents included all FDA-approved drugs classified under Anatomical

Therapeutic Chemical (ATC) code C10. Blood pressure lowering drugs

were defined by the ATC codes C02-C04 and C07-C09 (which include diuretics and

vasodilators).
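
As an illustration of the ATC-based grouping above, the sketch below maps a drug's ATC code to one of the two cardio-protective groups by its three-character prefix. The function name, the group labels and the example codes are assumptions made for illustration; the additional restriction to FDA-approved drugs is not modelled.

```python
# ATC prefixes taken from the definitions above.
LIPID_PREFIXES = {"C10"}
BP_PREFIXES = {"C02", "C03", "C04", "C07", "C08", "C09"}


def drug_group(atc_code):
    """Classify an ATC code as lipid modifying, blood pressure lowering, or other."""
    prefix = atc_code.strip().upper()[:3]
    if prefix in LIPID_PREFIXES:
        return "lipid_modifying"
    if prefix in BP_PREFIXES:
        return "blood_pressure_lowering"
    return "other"


# Example codes: atorvastatin, enalapril, metformin.
groups = [drug_group(code) for code in ("C10AA05", "C09AA02", "A10BA02")]
```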

SBP ≥ 130/ 140 mmHg for those with/ without CVD history at baseline was defined as

uncontrolled. Similarly, LDL ≥ 70/ 100 mg/dL for those with/ without CVD history at

baseline was defined as uncontrolled. Triglycerides ≥ 150 mg/dL and HbA1c ≥ 7.5% were

defined as uncontrolled.
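
The thresholds above can be expressed as a small lookup, sketched below. The function and key names are hypothetical; the cut-offs are those stated in the text, with SBP and LDL depending on CVD history at baseline.

```python
# (threshold with CVD history, threshold without CVD history)
THRESHOLDS = {
    "hba1c": (7.5, 7.5),   # %
    "sbp": (130, 140),     # mmHg
    "ldl": (70, 100),      # mg/dL
    "trig": (150, 150),    # mg/dL
}


def is_uncontrolled(factor, value, cvd_history):
    """Return True if `value` is at or above the threshold for this risk factor."""
    with_cvd, without_cvd = THRESHOLDS[factor]
    return value >= (with_cvd if cvd_history else without_cvd)
```

For example, an SBP of 135 mmHg counts as uncontrolled for a patient with a history of CVD but as controlled for a patient without one.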

Statistical Methods

Baseline characteristics were summarised as number (%), mean (SD) or median (first

quartile, third quartile). Longitudinal failure to control risk factors (individual and pairwise)

and risk factor burden were calculated irrespective of baseline control status. Failure to

control LDL and triglycerides at 6, 12, 24 months was calculated in those who were using a

lipid modifying drug prior to 6, 12, 24 months of baseline, respectively. Similarly, failure to

control SBP was calculated only in those who were using a blood pressure lowering drug

prior to 6, 12, 24 months of baseline. Sub-cohort 1 and sub-cohort 2 were used for one and

two year analyses, respectively. Pairwise failure to simultaneously control HbA1c plus (1)

LDL, (2) triglycerides, and (3) SBP was summarised as proportion (95% CI) at 6, 12, and 24

months from baseline. The probability (95% CI) of failure to control both risk factors by second-line ADD

group was calculated using a “Treatment Effects” modelling approach 15-17. Second-line

treatment groups were balanced on baseline risk factor measurements. The probit model for the

likelihood of failure was adjusted for sex, duration of diabetes, baseline age and body weight.
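
The unadjusted pairwise summary (proportion with 95% CI) can be illustrated with the sketch below. Two assumptions are made here that the text does not spell out: a pair is counted as a failure unless both risk factors are controlled, and the interval is a Wilson score interval. The adjusted probabilities from the "Treatment Effects" models are not reproduced, and the patient flags are hypothetical.

```python
import math


def pair_failure(hba1c_uncontrolled, other_uncontrolled):
    """Assumption: the pair fails unless BOTH risk factors are controlled."""
    return hba1c_uncontrolled or other_uncontrolled


def proportion_ci(failures, n, z=1.96):
    """Proportion with a Wilson score interval (assumed method; z=1.96 for 95%)."""
    p = failures / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, centre - half_width, centre + half_width


# Hypothetical (HbA1c uncontrolled, LDL uncontrolled) flags for five patients.
flags = [(True, False), (False, False), (False, True), (True, True), (False, False)]
failures = sum(pair_failure(h, l) for h, l in flags)
p, lower, upper = proportion_ci(failures, len(flags))
```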

Risk factor two-year burden was defined as uncontrolled measures (at 6 months OR at 12

months) AND (at 18 months OR at 24 months) for patients in sub-cohort 2. Two-year burden

for LDL and triglycerides was calculated among those who were using a lipid modifying drug

prior to 12 months of baseline. Two-year burden for SBP was calculated among those who

were using a blood pressure lowering drug prior to 12 months of baseline. Proportions of

continuously uncontrolled patients (two-year burden) were summarised by the year of

second-line ADD initiation and by the class of second-line ADD. Standard life-table methods

were used to estimate rates per 1000 person-years (95% CI) of MACE by class of second-line

ADDs.
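
The two-year burden definition given above, (uncontrolled at 6 OR 12 months) AND (uncontrolled at 18 OR 24 months), translates directly into a boolean rule. The sketch below uses hypothetical function names and made-up patient flags, and assumes that all four time points are available (observed or imputed).

```python
def two_year_burden(u6, u12, u18, u24):
    """True if uncontrolled in the first year AND in the second year of follow-up."""
    return (u6 or u12) and (u18 or u24)


def burden_proportion(patients):
    """Proportion of patients with two-year burden.

    `patients` is a list of (u6, u12, u18, u24) tuples of uncontrolled flags.
    """
    flags = [two_year_burden(*p) for p in patients]
    return sum(flags) / len(flags)


# Four hypothetical patients; only the first and the last meet the definition.
example = [
    (True, False, False, True),
    (True, True, False, False),
    (False, False, True, True),
    (False, True, True, False),
]
proportion = burden_proportion(example)
```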

RESULTS

Of 2,624,954 identified patients with T2DM, 276,884 met the inclusion criteria

(Supplementary Figure 1, Table 1). With a mean follow-up of 3.7 years, 89% of the cohort had at

least one year of follow-up. The majority of patients were obese (n=187,936, 70%),

and 60,317 (22%) had a history of CVD on or prior to baseline. Those with a history of CVD

were older (mean: 64 years) and more likely to be male (61%) than those without a history of

CVD (mean: 57 years; 46% male). With a mean (SD) HbA1c of 8.4 (1.9)% at the time of

second ADD initiation, 54/ 61% of patients with/ without a history of CVD had HbA1c

≥7.5% respectively. With mean (SD) LDL of 97 (35) mg/dL, 67% of those with a history of

CVD had LDL ≥70 mg/dL, while 46% of those without CVD history had LDL ≥100 mg/dL.

Baseline triglycerides were ≥ 150 mg/dL in 48% of patients. In sub-cohort 1, among those

with/ without a history of CVD, 90/ 74% were using a lipid modifying drug prior to or within 1

year of baseline (data not shown). With mean (SD) SBP of 131 (15) mmHg, 50% of those

with a history of CVD had SBP ≥130 mmHg, while 25% of those without CVD history had

SBP ≥140 mmHg. In sub-cohort 1, among those with/ without a history of CVD, 97/ 84%

were using a blood pressure lowering drug prior to or within 1 year of baseline (data not

shown).

Individual Risk Factor Failure

Irrespective of baseline control, 37, 39, and 42% of patients failed to achieve HbA1c below

7.5% at 6, 12, and 24 months post intensification with second-line ADD (Table 2). The

proportions of those who failed to control HbA1c were lower for those with a history of CVD

at baseline (32-38%), compared to those without a history of CVD at baseline (38-42%, data

not shown). Among patients who were using a lipid modifying drug, 43% had uncontrolled

LDL over two years post baseline (Table 2), whereas 64 / 36% of those with / without a

history of CVD failed to achieve LDL < 70 / 100 mg/dL (data not shown). Among patients

who were using a lipid modifying drug, 46% had uncontrolled triglycerides over two years

post baseline (Table 2); the proportions were similar among those with / without a history of

CVD at baseline. Among patients who were using a blood pressure lowering drug, 30%

failed to control SBP during two years post intensification with second-line ADD, whereas 49

/ 24% of those with / without a history of CVD failed to achieve SBP < 130/140 mmHg over

2 years.

Among patients with baseline HbA1c ≥7.5 and ≤ 9%, 43/ 46/ 48% failed to achieve HbA1c <

7.5% at 6/ 12/ 24 months, irrespective of additional therapy intensification (Supplementary

Figure 2). Among those who were using a lipid modifying drug and had uncontrolled LDL at

baseline, the proportions of those who were uncontrolled at 6/12/ 24 months were 71 /65/

60%. Similarly, more than 60% continued to have uncontrolled triglycerides, among those

who were uncontrolled at baseline. Among those who were using a blood pressure lowering

drug and had uncontrolled SBP at baseline, the proportions of those who were uncontrolled at

6/12/ 24 months were 60/ 55/ 51% (Supplementary Figure 2).

Pairwise Risk Factor Control

Among patients who were using a lipid modifying drug, in addition to being on an intensified ADD

regimen by design, around 62% failed to simultaneously control HbA1c+LDL over two years post

second-line ADD initiation (Table 2), whereas around 75 / 58% of those with / without a

history of CVD failed to control both risk factors simultaneously (data not shown). The

adjusted probability (95% CI) of failing to simultaneously control both risk factors at 24

months was the lowest in those who initiated second-line with GLP-1RA [0.55 (0.53, 0.58)]

and TZD [0.55 (0.54, 0.57)], followed by DPP-4i [0.59 (0.58, 0.60)], and significantly higher

for SU [0.65 (0.64, 0.65)] and INS [0.69 (0.68, 0.70)] users (Table 3).

Among patients who were using a lipid modifying drug, around 62% failed to simultaneously

control HbA1c+Triglycerides over two years post second-line ADD initiation (Table 2). The

adjusted probability (95% CI) of failing to simultaneously control both risk factors at 24

months was the lowest in those who initiated second-line with TZD [0.52 (0.50, 0.53)],

followed by DPP-4i [0.56 (0.55, 0.57)] and GLP-1RA [0.56 (0.52, 0.60)], and significantly

higher for SU [0.63 (0.62, 0.64)] and insulin [0.65 (0.63, 0.66)] users (Table 3).

Among patients who were using a blood pressure lowering drug, 53/ 55/ 57% failed to

simultaneously control HbA1c+SBP at 6, 12, and 24 months post second-line ADD initiation

(Table 2). Among those with/ without a history of CVD 64-67% and 50-54% of patients

failed to control both risk factors simultaneously (data not shown). The adjusted probability

(95% CI) of failing to simultaneously control both risk factors at 24 months was lower in

those who initiated second-line with TZD [0.48 (0.47, 0.50)], GLP-1RA [0.49 (0.47, 0.51)],

or DPP-4i [0.51 (0.50, 0.52)], compared to those who initiated with SU [0.60 (0.59, 0.60)] or

insulin [0.64 (0.63, 0.65)] (Table 3).

Continued Risk Factor Burden

Among those with follow-up for at least 24 months, 35% had continuously uncontrolled

HbA1c (≥ 7.5%). The two-year burden increased from 31% for those who

intensified first-line therapy in 2005 to 41% for those who intensified therapy in 2014 (Figure 1A).

Two-year burden increased from 28 to 36% and from 32 to 42% for those with/ without a

history of CVD at baseline (Figure 1B and C). The proportions of those with continuously

uncontrolled HbA1c were lower among those who initiated second-line with GLP-1RA (95%

CI: 24-26%) and TZD (95% CI: 23-24%), followed by DPP-4i (95% CI: 28-30%), and

significantly higher for SU (95% CI: 39-40%) and INS (95% CI: 50-51%) (Figure 1 D).

Among those who initiated a lipid modifying drug prior to 12 months of baseline and had at

least two years of follow-up, 41% had continuously uncontrolled LDL (Figure 1A). Among

those with/ without a history of CVD at baseline, 65/ 33% had continuously uncontrolled

LDL over 2 years of baseline (Figure 1B and C). Comparing by the CVD status at baseline,

the proportions of those with two-year LDL burden were similar among second-line ADD

classes (Figure 1 E-F).

Among those who initiated a lipid modifying drug prior to 12 months of baseline and had at

least two years of follow-up, 37% had continuously uncontrolled triglycerides

(≥ 150 mg/dL) (Figure 1A-C). Among those without CVD history at baseline, the proportion of

those with continuously uncontrolled triglycerides was lower in TZD (95% CI: 31-33%) and

INS (95% CI: 33-36%) groups, compared to other second-line ADDs (95% CIs: 37-41%)

(Figure 1E). Among those with CVD history at baseline, the lowest proportion of those with

continuously uncontrolled triglycerides was in TZD (95% CI: 28-32%) group, compared to

other second-line ADDs (95% CIs: 34-43%) (Figure 1F).

Among those who initiated a blood pressure lowering drug prior to 12 months of baseline and had

at least two years of follow-up, 27-33% had continuously uncontrolled SBP (Figure 1A).

Among those with/ without a history of CVD at baseline, 51/ 21% had continuously

uncontrolled SBP over 2 years of baseline (Figure 1B and C). Those who initiated second-

line ADD with GLP-1RA had the lowest two-year SBP burden (95% CI: 18-20%), followed

by DPP-4i (95% CI: 23-24%), TZD (95% CI: 25-26%), SU and INS (95% CIs: 30-32%)

(Figure 1 D-F).

Cardiovascular risk

Individual and composite rates per 1000 person-years of HF, MI, and stroke, along with the

number of failures, are presented in Table 4. In the primary prevention group, 18,438

3-point MACE events occurred during available follow-up. The lowest rate per 1000 person-years

was observed in those who initiated second-line with GLP-1RA (95% CI: 13-15),

followed by DPP-4i and TZD groups (95% CIs: 19-21), SU (95% CI: 26-27), and INS (95%

CI: 31-34). This pattern was preserved in individual analyses for HF, MI, and stroke as well.

In the secondary prevention group, 15,323 3-point MACE events occurred during available

follow-up. The lowest rate per 1000 person-years was observed in those who initiated

second-line with GLP-1RA (95% CI: 53-66), followed by DPP-4i and TZD groups (95% CIs:

66-80), SU (95% CI: 86-90), and INS (95% CI: 100-108). This pattern was preserved in

individual analyses for HF, MI, and stroke as well (Table 4).
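
To make the unit of these estimates concrete, the sketch below computes a crude event rate per 1000 person-years with an approximate 95% CI. It is a simplified stand-in, not the life-table method used in the study: it assumes that total events and total person-years are already known for a group, and it uses a normal approximation on the log-rate scale. The function name and the input figures are hypothetical.

```python
import math


def rate_per_1000_py(events, person_years, z=1.96):
    """Crude rate per 1000 person-years with an approximate 95% CI.

    Assumption: CI from a normal approximation on the log scale,
    SE(log rate) = 1 / sqrt(events); requires events > 0.
    """
    rate = 1000 * events / person_years
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor


# Hypothetical group: 140 events over 10,000 person-years of follow-up.
rate, lower, upper = rate_per_1000_py(140, 10_000)
```

A rate of 14 per 1000 person-years in this hypothetical group means that about 14 events would be expected if 1000 comparable patients were each followed for one year.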

CONCLUSIONS

In this longitudinal exploratory study of a large cohort of patients with type 2 diabetes from

primary and ambulatory care systems of the USA, we observed that (1) irrespective of

baseline control, more than 40% of patients do not meet the HbA1c target of 7.5% two years after

metformin intensification; (2) long-term glycaemic burden has increased over the last decade;

(3) around a third of patients have consistently uncontrolled lipids and SBP even though they

are using cardio-protective drugs; and (4) treatment with GLP-1RA was associated with lower

rates of major adverse cardiovascular events.

The results of this study clearly demonstrate persistent glycaemic and CV risk factor burden

among patients who are using multiple medications for glucose, lipid, and blood pressure

control. Three out of five patients who are already receiving intensified treatment are failing

to simultaneously control glucose level and at least one CV risk factor. Furthermore, the

proportions of patients who fail to control CV risk factors are not decreasing over time, and

glycaemic burden has increased during the last decade.

Population ageing, therapy non-adherence and inadequate treatment intensification when

needed (therapeutic inertia) may explain the observed patterns. While statin prescribing patterns

are increasing, using US Medical Expenditure Panel Survey data, Salami and colleagues

reported that use of high intensity statins was only 18-20% in patients with diabetes and no

atherosclerotic cardiovascular disease 18. Similarly, Abdallah and colleagues reported that

among 1,300 patients with diabetes, 88% were prescribed statins at the time of hospital

discharge for acute myocardial infarction, whereas only 22% were prescribed intensive statin

therapy 19. Further studies investigating intensification patterns for lipid and blood pressure

control and long-term consequences of not intensifying therapy when needed are required in

patients with diabetes.

We observed lower probabilities of simultaneous failure to control HbA1c and

LDL, triglycerides, or SBP in those who initiated second-line with GLP-1RA, DPP-4i, or

TZD, compared to those who were treated with SU or INS. A recent study by Montvida

and colleagues showed that incretin-based therapies and TZD provide a higher chance of

sustainable glycaemic control compared to SU and INS 20. While two-year glycaemic burden

differed significantly by ADD class, those who were treated with TZD had lower

triglycerides burden and those on GLP-1RA had lower SBP burden. The composite and

individual rates of HF, MI, and stroke were lowest for those who initiated second-line with

GLP-1RA, compared to other ADDs in cohorts with and without a history of CVD.

In general, the CEMR database is representative of the US population in terms of age and ethnic

subgroups; however, a higher proportion of patients from north-eastern and mid-western

states is represented in the CEMR 21. The distribution of CV risk factors was found to be

similar to that in prospective national health surveys 22. The large cohort size, with an average of 3.7

years of follow-up post metformin intensification, supports the reliability of the estimates reported in the

present study. Drug use data available from patients’ medication lists, along with prescribing

information and the robust data mining methodologies applied, bring additional value to this

study 13,23. Nonetheless, the findings should be interpreted with caution: EMR data are in

general biased towards unhealthy populations and commercially insured individuals, White

Caucasians are overrepresented in the CEMR, and the results are subject to limited follow-up.

To conclude, we observed alarmingly low rates of population-level glycaemic and CV risk

factor control, and the burden has not decreased during the last decade. While treatment

guidelines and clinician and population education are constantly improving, the CV burden and

associated costs of diabetes management are unlikely to fall in the near future.

ACKNOWLEDGEMENTS

OM and SKP were responsible for the primary design of the study. JHS contributed

significantly in the study design. OM conducted the data extraction. OM and SKP jointly

conducted the statistical analyses. The first draft of the manuscript was developed by OM and

SKP, and all authors contributed to the finalization of the manuscript. SKP had full access to

all the data in the study and takes responsibility for the integrity of the data and the accuracy

of the data analysis.

Melbourne EpiCentre gratefully acknowledges the support from the National Health and

Medical Research Council and the Australian Government’s National Collaborative Research

Infrastructure Strategy (NCRIS) initiative through Therapeutic Innovation Australia. OM

acknowledges the PhD scholarship from Queensland University of Technology, Australia,

and her co-supervisors Prof. Ross Young and Prof. Louise Hafner of the same University.

JHS is supported by a National Health and Medical Research Council Research Fellowship.

No separate funding was obtained for this study.

Declaration of interests

SKP has acted as a consultant and/or speaker for Novartis, GI Dynamics, Roche,

AstraZeneca, Guangzhou Zhongyi Pharmaceutical and Amylin Pharmaceuticals LLC. He has

received grants in support of investigator and investigator initiated clinical studies from

Merck, Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis and

Pfizer. OM has no conflict of interest to declare. JHS has received speaker honoraria,

consultancy fees and/or travel sponsorship from AstraZeneca, Boehringer Ingelheim, Lilly,

Sanofi, Mylan, Novo Nordisk, Merck Sharp and Dohme and Novartis.

REFERENCES

1. Fox CS, Coady S, Sorlie PD, et al. Increasing cardiovascular disease burden due to diabetes mellitus: the Framingham Heart Study. Circulation 2007; 115(12): 1544-50.

2. Turnbull FM, Abraira C, Anderson RJ, et al. Intensive glucose control and macrovascular outcomes in type 2 diabetes. Diabetologia 2009; 52(11): 2288-98.

3. Arnold SV, Kosiborod M, Wang J, Fenici P, Gannedahl G, LoCasale RJ. Burden of Cardio-Renal-Metabolic Conditions in Adults with Type 2 Diabetes within the Diabetes Collaborative Registry. Diabetes, Obesity and Metabolism 2018.

4. American Diabetes Association. Standards of Medical Care in Diabetes—2018. Diabetes Care 2018; 41(Supplement 1): S4.

5. Khunti K, Kosiborod M, Ray KK. Legacy benefits of blood glucose, blood pressure and lipid control in people with diabetes and cardiovascular disease: Time to overcome multifactorial therapeutic inertia? Diabetes, Obesity and Metabolism 2018.

6. Ali MK, Bullard KM, Saaddine JB, Cowie CC, Imperatore G, Gregg EW. Achievement of goals in US diabetes care, 1999–2010. New England Journal of Medicine 2013; 368(17): 1613-24.

7. Lipska KJ, Yao X, Herrin J, et al. Trends in drug utilization, glycemic control, and rates of severe hypoglycemia, 2006–2013. Diabetes Care 2017; 40(4): 468-75.

8. Carls G, Huynh J, Tuttle E, Yee J, Edelman SV. Achievement of glycated hemoglobin goals in the US remains unchanged through 2014. Diabetes Therapy 2017; 8(4): 863-73.

9. Centers for Disease Control and Prevention. National diabetes statistics report: estimates of diabetes and its burden in the United States, 2018. Atlanta, GA: US Department of Health and Human Services 2018.

10. Crawford AG, Cote C, Couto J, et al. Comparison of GE Centricity Electronic Medical Record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Population Health Management 2010; 13(3): 139-50.

11. Brixner D, Said Q, Kirkness C, Oberg B, Ben-Joseph R, Oderda G. Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value in Health 2007; 10(s1): S29-S36.

12. Paul SK, Shaw J, Montvida O, Klein K. Weight gain in insulin treated patients by BMI categories at treatment initiation: New evidence from real-world data in patients with type 2 diabetes. Diabetes, Obesity and Metabolism 2016.

13. Montvida O, Arandjelović O, Reiner E, Paul SK. Data Mining Approach to Estimate the Duration of Drug Therapy from Longitudinal Electronic Medical Records. Open Bioinformatics Journal 2017; 10: 1-15.

14. Thomas G, Klein K, Paul S. Statistical challenges in analysing large longitudinal patient-level data: the danger of misleading clinical inferences with imputed data. J Indian Soc Agric Stat 2014; 68: 39-54.

15. Rotnitzky A, Robins JM. Inverse probability weighting in survival analysis. Encyclopedia of Biostatistics 2005.

16. Austin PC, Stuart EA. The performance of inverse probability of treatment weighting and full matching on the propensity score in the presence of model misspecification when estimating the effect of treatment on survival outcomes. Statistical Methods in Medical Research 2015: 0962280215584401.

17. Austin PC. Variance estimation when using inverse probability of treatment weighting (IPTW) with survival analysis. Statistics in Medicine 2016; 35(30): 5642-55.

18. Salami JA, Warraich H, Valero-Elizondo J, et al. National trends in statin use and expenditures in the US adult population from 2002 to 2013: insights from the Medical Expenditure Panel Survey. JAMA Cardiology 2017; 2(1): 56-65.

19. Abdallah MS, Kosiborod M, Tang F, et al. Patterns and predictors of intensive statin therapy among patients with diabetes mellitus after acute myocardial infarction. American Journal of Cardiology 2014; 113(8): 1267-72.

20. Montvida O, Shaw J, Blonde L, Paul SK. Long-term sustainability of glycaemic achievements with second-line anti-diabetic therapies in patients with type 2 diabetes: A real-world study. Diabetes, Obesity and Metabolism 2018.

21. Brixner D, McAdam-Marx C, Ye X, et al. Six-month outcomes on A1C and cardiovascular risk factors in patients with type 2 diabetes treated with exenatide in an ambulatory care setting. Diabetes, Obesity and Metabolism 2009; 11(12): 1122-30.

22. Brixner D, Said Q, Kirkness C, Oberg B, Ben-Joseph R, Oderda G. Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value in Health 2007; 10: S29-S36.

23. Owusu Adjah ES, Montvida O, Agbeve J, Paul SK. Data Mining Approach to Identify Disease Cohorts from Primary Care Electronic Medical Records: A Case of Diabetes Mellitus. The Open Bioinformatics Journal 2017; 10(1).

Table 1: Cohort characteristics at the time of second-line anti-diabetic drug initiation.

Characteristic | All | No history of CVD | History of CVD
N | 276,884 | 216,567 | 60,317
Age, years* | 59 (12) | 57 (12) | 64 (9)
Male† | 136,918 (49) | 99,907 (46) | 37,011 (61)
White† | 194,758 (70) | 149,180 (69) | 45,578 (76)
Black† | 32,671 (12) | 27,274 (13) | 5,397 (9)
Time from metformin initiation, months* | 7.5 (15.7) | 7.1 (15.2) | 9.0 (17.4)
Follow-up, years* | 3.7 (2.4) | 3.7 (2.5) | 3.6 (2.4)
Follow-up ≥ 12 months† | 247,223 (89) | 193,092 (89) | 54,131 (90)
Follow-up ≥ 24 months† | 191,883 (69) | 149,833 (69) | 42,050 (70)
Therapy duration, months* | 33 (25) | 33 (25) | 33 (24)
HbA1c, %* | 8.4 (1.9) | 8.5 (1.9) | 8.1 (1.7)
HbA1c ≥ 7.5%† | 102,624 (60) | 84,835 (61) | 17,789 (54)
Weight, kg* | 99 (25) | 100 (25) | 97 (23)
BMI, kg/m2* | 35 (8) | 35 (8) | 33 (7)
BMI < 25 kg/m2† | 18,819 (7) | 13,735 (7) | 5,084 (9)
BMI ≥ 25 and < 30 kg/m2† | 60,575 (23) | 44,963 (22) | 15,612 (27)
BMI ≥ 30 kg/m2† | 187,936 (70) | 150,067 (72) | 37,869 (65)
SBP, mmHg* | 131 (15) | 131 (15) | 130 (16)
Uncontrolled SBP† | 82,837 (30) | 53,168 (25) | 29,669 (50)
DBP, mmHg* | 77 (9) | 78 (9) | 75 (9)
LDL, mg/dL* | 97 (35) | 100 (35) | 87 (34)
Uncontrolled LDL† | 71,424 (50) | 51,077 (46) | 20,347 (67)
HDL, mg/dL* | 43 (12) | 44 (12) | 42 (12)
Triglycerides, mg/dL‡ | 147 (107, 197) | 148 (107, 198) | 146 (107, 195)
Triglycerides ≥ 150 mg/dL† | 54,640 (48) | 43,240 (49) | 11,400 (48)
Chronic kidney disease† | 9,602 (3) | 5,793 (3) | 3,809 (6)
Cancer† | 13,750 (5) | 9,951 (5) | 3,799 (6)
Depression† | 38,444 (14) | 29,996 (14) | 8,448 (14)
Charlson Comorbidity Index* | 1.6 (1.1) | 1.4 (0.9) | 2.4 (1.4)
Any lipid modifying drug† | 188,272 (68) | 137,391 (63) | 50,881 (84)
Statin† | 168,485 (61) | 121,287 (56) | 47,198 (78)
Blood pressure lowering drug† | 224,086 (81) | 167,177 (77) | 56,909 (94)

*mean (sd); †n (%); ‡median (IQR);

§Uncontrolled SBP: ≥ 130/140 mmHg for those with/without a history of cardiovascular disease;

||Uncontrolled LDL: ≥ 70/100 mg/dL for those with/without a history of cardiovascular disease.

Table 2: Proportions (95% CI) of those who failed to control individual risk factors* and proportions (95% CI) of those who failed to control two risk factors simultaneously at 6, 12, and 24 months post second-line anti-diabetic drug initiation.

Risk factor | 6 months | 12 months | 24 months
Individual Failure
HbA1c | 37 (36, 37) | 39 (39, 39) | 42 (41, 42)
LDL | 43 (43, 43) | 43 (43, 43) | 42 (41, 42)
Triglycerides | 46 (45, 46) | 46 (45, 46) | 45 (44, 45)
SBP | 31 (30, 31) | 31 (30, 31) | 30 (30, 30)
Simultaneous Failure
HbA1c + LDL | 61 (60, 61) | 62 (62, 62) | 63 (62, 63)
HbA1c + Triglycerides | 61 (61, 61) | 62 (62, 62) | 63 (62, 63)
HbA1c + SBP | 53 (53, 54) | 55 (55, 55) | 57 (57, 57)

*Control: HbA1c < 7.5%; triglycerides < 150 mg/dL; SBP < 130/140 mmHg for those with/without a history of cardiovascular disease; LDL < 70/100 mg/dL for those with/without a history of cardiovascular disease. LDL, Triglycerides and SBP proportions are calculated among users of lipid modifying and blood pressure lowering drugs.

Table 3: By groups of second-line anti-diabetic drug, adjusted probability (95% CI) of failure to simultaneously control* both risk factors.

Second-line therapy | 6 months | 12 months | 24 months
HbA1c and LDL
MET+SU | 0.62 (0.62, 0.63) | 0.64 (0.64, 0.65) | 0.65 (0.64, 0.65)
MET+TZD | 0.57 (0.56, 0.58) | 0.55 (0.54, 0.56) | 0.55 (0.54, 0.57)
MET+INS | 0.66 (0.65, 0.67) | 0.67 (0.66, 0.68) | 0.69 (0.68, 0.70)
MET+GLP-1RA | 0.56 (0.54, 0.58) | 0.54 (0.52, 0.56) | 0.55 (0.53, 0.58)
MET+DPP4 | 0.58 (0.57, 0.59) | 0.58 (0.58, 0.59) | 0.59 (0.58, 0.60)
HbA1c and Triglycerides
MET+SU | 0.60 (0.59, 0.61) | 0.62 (0.61, 0.62) | 0.63 (0.62, 0.64)
MET+TZD | 0.53 (0.51, 0.54) | 0.52 (0.50, 0.53) | 0.52 (0.50, 0.53)
MET+INS | 0.62 (0.61, 0.64) | 0.65 (0.63, 0.66) | 0.65 (0.63, 0.66)
MET+GLP-1RA | 0.54 (0.51, 0.57) | 0.54 (0.50, 0.57) | 0.56 (0.52, 0.60)
MET+DPP4 | 0.54 (0.53, 0.55) | 0.56 (0.54, 0.57) | 0.56 (0.55, 0.57)
HbA1c and SBP
MET+SU | 0.55 (0.55, 0.56) | 0.57 (0.57, 0.58) | 0.60 (0.59, 0.60)
MET+TZD | 0.48 (0.47, 0.49) | 0.47 (0.46, 0.48) | 0.48 (0.47, 0.50)
MET+INS | 0.59 (0.58, 0.60) | 0.61 (0.60, 0.62) | 0.64 (0.63, 0.65)
MET+GLP-1RA | 0.47 (0.45, 0.49) | 0.48 (0.46, 0.49) | 0.49 (0.47, 0.51)
MET+DPP4 | 0.49 (0.48, 0.49) | 0.49 (0.48, 0.50) | 0.51 (0.50, 0.52)

*Control: HbA1c < 7.5%; triglycerides < 150 mg/dL; SBP < 130/140 mmHg for those with/without a history of cardiovascular disease; LDL < 70/100 mg/dL for those with/without a history of cardiovascular disease. LDL, Triglycerides and SBP proportions are calculated among users of lipid modifying and blood pressure lowering drugs. Second-line treatment groups were balanced on baseline risk factor measurements; analyses were adjusted for sex, duration of diabetes, baseline age and body weight.

Figure 1: Proportion of continuously uncontrolled patients over two years post second-line anti-diabetic drug initiation, by the year of drug initiation (A-C) and by drug class (D-F).

*Uncontrolled measures (at 6 months OR at 12 months) AND (at 18 months OR at 24 months). LDL, Triglycerides and SBP proportions are calculated among those who were using lipid modifying and blood pressure lowering drugs prior to or within 12 months of second-line anti-diabetic drug initiation.
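The footnote to Figure 1 defines "continuously uncontrolled" as a boolean rule over four follow-up time points. A minimal sketch of that rule is given below. The function names, the dictionary input, and the treatment of a missing measurement as "not uncontrolled" are assumptions made for illustration; the source does not state how missing time points were handled. The 7.5% HbA1c threshold is taken from the table footnotes.

```python
def hba1c_uncontrolled(hba1c_percent, threshold=7.5):
    """Flag an HbA1c value as uncontrolled (>= 7.5%, as in the table footnotes)."""
    return hba1c_percent >= threshold


def continuously_uncontrolled(uncontrolled_at):
    """Apply the Figure 1 rule to one patient and one risk factor.

    uncontrolled_at: dict mapping follow-up month (6, 12, 18, 24) to True/False.
    Assumption: a missing month is treated as "not uncontrolled".
    """
    early = bool(uncontrolled_at.get(6, False) or uncontrolled_at.get(12, False))
    late = bool(uncontrolled_at.get(18, False) or uncontrolled_at.get(24, False))
    # (6 months OR 12 months) AND (18 months OR 24 months)
    return early and late


# Hypothetical patient: HbA1c readings (%) at each follow-up point.
readings = {6: 8.1, 12: 7.2, 18: 7.4, 24: 7.9}
flags = {month: hba1c_uncontrolled(value) for month, value in readings.items()}
result = continuously_uncontrolled(flags)
```

Under this rule a patient who is uncontrolled only in the first year, or only in the second year, is not counted as continuously uncontrolled.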

Table 4: By groups of second-line anti-diabetic drug and by CVD status at the time of second-line anti-diabetic drug initiation, number of failures and rates (95% CIs) per 1000 person-years of heart failure, myocardial infarction, stroke and their composite.

Second-line therapy | Without baseline CVD: Failures | Without baseline CVD: Rate | With baseline CVD: Failures | With baseline CVD: Rate
Heart Failure, Myocardial Infarction, or Stroke
MET+SU | 10,781 | 26.1 (25.7, 26.6) | 9,157 | 87.9 (86.1, 89.7)
MET+TZD | 2,165 | 20.2 (19.4, 21.1) | 1,416 | 69.1 (65.6, 72.8)
MET+INS | 2,809 | 32.4 (31.2, 33.6) | 2,491 | 103.5 (99.5, 107.6)
MET+GLP-1RA | 480 | 14.1 (12.9, 15.4) | 307 | 59.4 (53.1, 66.4)
MET+DPP4 | 2,203 | 19.3 (18.5, 20.2) | 1,952 | 76.3 (73.0, 79.7)
Heart Failure
MET+SU | 4,835 | 11.3 (11.0, 11.6) | 4,666 | 39.9 (38.8, 41.0)
MET+TZD | 916 | 8.2 (7.7, 8.8) | 610 | 26.0 (24.0, 28.1)
MET+INS | 1,357 | 15.0 (14.2, 15.8) | 1,336 | 49.4 (46.8, 52.1)
MET+GLP-1RA | 179 | 5.1 (4.4, 6.0) | 134 | 24.0 (20.3, 28.4)
MET+DPP4 | 791 | 6.8 (6.3, 7.3) | 898 | 32.2 (30.1, 34.4)
Myocardial Infarction
MET+SU | 1,429 | 3.3 (3.1, 3.4) | 1,511 | 12.2 (11.6, 12.8)
MET+TZD | 284 | 2.5 (2.2, 2.8) | 238 | 9.7 (8.6, 11.0)
MET+INS | 366 | 4.0 (3.6, 4.4) | 386 | 13.4 (12.1, 14.8)
MET+GLP-1RA | 63 | 1.8 (1.4, 2.3) | 49 | 8.4 (6.4, 11.2)
MET+DPP4 | 229 | 1.9 (1.7, 2.2) | 307 | 10.6 (9.5, 11.9)
Stroke
MET+SU | 5,925 | 13.9 (13.6, 14.3) | 4,548 | 39.4 (38.3, 40.6)
MET+TZD | 1,232 | 11.2 (10.6, 11.8) | 815 | 36.3 (33.8, 38.8)
MET+INS | 1,415 | 15.7 (14.9, 16.6) | 1,210 | 44.8 (42.4, 47.4)
MET+GLP-1RA | 276 | 8.0 (7.1, 9.0) | 167 | 30.2 (26.0, 35.2)
MET+DPP4 | 1,318 | 11.4 (10.8, 12.0) | 1,011 | 36.8 (34.6, 39.1)
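The rates in Table 4 are event counts divided by accumulated follow-up time, scaled to 1000 person-years. A minimal sketch of that calculation follows. Two points are assumptions rather than statements from the source: the person-years figure used in the example is back-calculated from the reported 480 events and rate of 14.1 (the table does not report person-years), and the confidence interval uses a Wald interval on the log scale, which is one common choice and not necessarily the method the authors used.

```python
import math


def rate_per_1000_person_years(events, person_years, z=1.96):
    """Crude incidence rate per 1000 person-years with an approximate 95% CI.

    The interval is a Wald interval on the log scale:
    rate * exp(+/- z / sqrt(events)). This is an assumed method.
    """
    rate = 1000.0 * events / person_years
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor


# MET+GLP-1RA, composite outcome, no baseline CVD: 480 events.
# 34,043 person-years is a hypothetical value back-calculated from the table.
rate, lower, upper = rate_per_1000_person_years(480, 34043)
```

With these inputs the sketch gives values that round to the interval reported for that row to one decimal place, which suggests but does not confirm that a similar approximation was used.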

Supplementary Figure 1: Flowchart of study cohorts

Patients with non-missing sex and age (n=34,299,123)

Diabetes Mellitus (n=2,893,321)

Type 2 Diabetes (n=2,624,954)

Age at diagnosis ≥ 18 and < 80 (n=2,590,853)

Diabetes Diagnosis after entry to the EMR (n=1,412,938)

Diabetes Diagnosis on or after Jan 1, 2005 (n=1,305,686)

Metformin as first-line (n=740,478)

Initiated second-line (n=347,735): SU (n=187,819); DPP-4i (n=61,508); INS (n=49,939); TZD (n=33,021); GLP-1RA (n=15,448)

Baseline HbA1c, SBP, Triglycerides, or LDL (n=322,630)

STUDY COHORT: Therapy duration ≥ 3 months and follow-up ≥ 6 months (n=276,884)

STUDY SUB-COHORT 1: Follow-up at least 12 months (n=247,223)

STUDY SUB-COHORT 2: Follow-up at least 24 months (n=191,883); No history of CVD (n=149,833); History of CVD (n=42,050)



Supplementary Figure 2: Among uncontrolled* patients at baseline, proportion (95% CI)

of those who failed to control† individual risk factors at 6, 12, and 24 months post second-

line anti-diabetic drug initiation.

*Uncontrolled: HbA1c ≥ 7.5% and ≤ 9%; triglycerides ≥ 150 mg/dL; SBP ≥ 130/140 mmHg for those with/without a history of cardiovascular disease; LDL ≥ 70/100 mg/dL for those with/without a history of cardiovascular disease.

†Control: HbA1c < 7.5%; triglycerides < 150 mg/dL; SBP < 130/140 mmHg for those with/without a history of cardiovascular disease; LDL < 70/100 mg/dL for those with/without a history of cardiovascular disease. LDL, Triglycerides and SBP proportions are calculated among users of lipid modifying and blood pressure lowering drugs.

Chapter 10: Discussion and Conclusions

The novelty of this research project includes a holistic evaluation of large representative EMRs

to conduct much needed pharmaco-epidemiological comparative effectiveness and outcome

studies in patients with T2DM treated with different older and newer classes of anti-diabetic

therapies, while also addressing the methodological issues to ensure our ability to draw robust

inferences from such epidemiological studies based on EMRs. This thesis provides a detailed

exploration of real-world cardio-metabolic effects of treatment with incretin-based therapies in

patients with T2DM. Six pharmaco-epidemiological and three methodological studies were

conducted over three years, and multiple important findings were reported in prestigious

high-impact research journals.

Using large EMRs for clinical studies is a comparatively recent and rapidly developing

research direction that requires specialists in health informatics to have both outstanding data

management skills and a deep understanding of the clinical question being studied. Three

methodological studies conducted as part of this thesis have direct implications for the quality

of the data used for the dissertation’s clinical analyses, and can also potentially improve the quality

of the data in future EMR-based clinical and pharmaco-epidemiological research, leading to

more reliable inferences drawn from individual studies.

The first methodological study (Chapter 4) focused on the analytical challenges associated with

the dynamics of prescriptions with different drugs, and developed two algorithms to estimate

the duration of treatment with specific drugs of interest. These approaches were compared and

tested on their ability to capture interchanges between therapies during the course of treatment.

The proposed algorithm was shown to be a reliable and effective tool to extract and aggregate

information on medication data at an individual patient level, which makes it a valuable tool

for use in future research.
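The algorithms themselves are described in Chapter 4 and are not reproduced in this chapter. To illustrate the general kind of problem they address, the sketch below merges one patient's prescription records for one drug into continuous treatment episodes and sums their length. It is an illustrative interval-merging approach under stated assumptions, not the thesis's algorithm: the function names, the `(start_date, days_supplied)` record layout, and the `grace_days` gap allowance are all hypothetical.

```python
from datetime import date, timedelta


def treatment_episodes(prescriptions, grace_days=90):
    """Merge prescriptions into continuous treatment episodes.

    prescriptions: iterable of (start_date, days_supplied) for one patient
    and one drug (hypothetical layout).
    grace_days: assumed maximum gap between the end of one prescription and
    the start of the next that still counts as continuous treatment.
    """
    intervals = sorted(
        (start, start + timedelta(days=days)) for start, days in prescriptions
    )
    episodes = []
    for start, end in intervals:
        if episodes and start <= episodes[-1][1] + timedelta(days=grace_days):
            # Overlap or short gap: extend the current episode.
            episodes[-1][1] = max(episodes[-1][1], end)
        else:
            # Long gap: treatment was interrupted, start a new episode.
            episodes.append([start, end])
    return [(start, end) for start, end in episodes]


def total_duration_days(episodes):
    """Total days on treatment; gaps bridged by the grace period are included."""
    return sum((end - start).days for start, end in episodes)


# Hypothetical prescription history for one patient.
rx = [(date(2010, 1, 1), 30), (date(2010, 2, 15), 30), (date(2011, 1, 1), 30)]
episodes = treatment_episodes(rx, grace_days=30)
duration = total_duration_days(episodes)
```

The choice of grace period drives the result: a short allowance splits therapy into many episodes, while a long one can hide genuine discontinuations or switches between therapies.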

The second methodological study (Chapter 5 and Appendix B) discussed data mining

challenges associated with the extraction of diseased cohorts, some of which are not

straightforward and may go unnoticed or be detected only at the late stages of statistical

analyses. Diagnostic codes, clinically guided algorithms and machine learning approaches

were simultaneously employed to ensure the choice of a robust cohort of patients with diabetes

for observational studies with a reduced selection bias.

Missing data is a pervasive problem with all prospective and retrospective observational studies

(including EMRs), posing challenges in terms of efficient design and analyses of clinical

effectiveness studies. Compared to the missing data in RCTs, the patterns and mechanisms

behind the missing risk factor or outcome data in EMRs are very complex and difficult to

ascertain. Robust imputation of missing data relies on the understanding of the predictors of

missingness in the risk factor data, especially in patients with chronic diseases. The third

methodological study (Chapter 6) compared three approaches based on the Multiple Imputation

technique in terms of their robustness in imputing data in patients with diabetes. A novel

component of this study is the investigation of the association of the likelihood of missingness of follow-up risk

factor measures (HbA1c) with patients’ demographic and clinical characteristics (age, sex, pre-

existing comorbidities and disease severity). While all three imputation techniques were able

to provide consistent and reliable clinical inferences under unknown patterns of missingness,

this study demonstrated that complete case analyses were prone to bias by indication and

highlighted the importance of imputing missing risk factor data.
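Multiple Imputation produces several completed datasets, each analysed separately, and the results are then combined. The standard way to combine them is Rubin's rules, sketched below. This chapter does not say which pooling procedure or software the thesis used, so the function name and the example numbers are assumptions for illustration only.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: point estimate from each imputed dataset.
    variances: squared standard error of each estimate.
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


# Hypothetical HbA1c change estimates (%) from three imputed datasets.
pooled, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what a complete case analysis or a single imputation omits: it carries the extra uncertainty that comes from not having observed the missing values.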

This dissertation provides detailed explorations of the population-level disease management

patterns. Firstly, it provides detailed assessments of changes in the choices of first-, second-,

and third-line ADDs over the last 10 years (Chapter 7). It also explores the glycaemic state, clinical

characteristics and comorbidities at the time of first-line and second-line therapy initiation by

ADD classes. This dissertation clearly demonstrates that the therapeutic inertia problem exists

at the population level, with 50-60% of patients having HbA1c above 7.5% at first- and second-

line therapy initiation. The long-term consequences of not intensifying treatment when needed

on the glycaemic and CV risk were shown by Paul and Khunti [90, 178, 179]. In this context,

the study that explored the combination of GLP-1RA and insulin (Appendix A) is of high

importance. It was demonstrated that patients who are not adequately controlled on GLP-1RA

would benefit from an early addition of insulin (compared to switching) in terms of long-term

cardio-metabolic outcomes. In fact, those who intensified the therapy with insulin later ended

up at the same high HbA1c level 2 years after therapy initiation with GLP-1RA. The study

that reported that obese patients who initiate insulin do not increase body weight (Appendix C)

brings additional reassurance to these patients, as most of the patients who initiate with GLP-

1RA are obese with mean body weight of 109 kg (Chapters 7-9).

Due to complex study designs, results of RCTs are difficult to compare and individual patient

choices on therapy intensification become very complex. While only one large RCT has been

designed to compare the glycaemic outcomes of treatment with major ADDs (GRADE,

completion expected in 2020), the study presented in Chapter 8 provides much-needed estimates

of adjusted probabilities to achieve clinically desirable glycaemic control with major second-

line ADDs, and the probabilities to sustain such glycaemic achievements over 2 years, with

and without the need for third-line therapy intensification. A highly valuable contribution of

this study is the assessment of glycaemic control sustainability in patients treated with major

second-line ADDs. Notably, the sustainability of achieved control would not be ethical to

assess using an RCT. Similarly, the long-term glycaemic and CV risk factor burden in patients

with T2DM who are already using intensified anti-diabetic and cardio-protective therapies had

not been reported to date (Chapter 9). With more than a third of patients having consistently

uncontrolled HbA1c, lipid and SBP levels, 3 out of 5 have uncontrolled HbA1c and at least

one CV risk factor over 2 years post intensification. The results reported in chapter 9 provide

an explanation for the non-declining rates of CV events among patients with T2DM – an issue

that is of major concern for health and government authorities globally.

Extensions of this dissertation include both methodological and clinical directions. The data

quality and linkages of registry data are improving with time, and it becomes possible to

estimate a patient's adherence to prescribed medications in the real-world setting. Such

calculations are methodologically challenging and will provide only rough estimates.

Nonetheless, such studies are essential in order to understand population level patterns of

adherence - given the increased cardio-metabolic burden described in this dissertation and other

studies. Much needed methodological studies include exploring the variability of study

outcomes under different study designs and statistical methodologies. For example, assessing

supplementary hypotheses while working on this dissertation, it was observed that exposure to

insulin is associated with an increased risk of acute pancreatitis in patients with T2DM.

However, when events that occurred during the first 6 months after baseline were excluded, the

risk was no longer significant (Appendix D).

Exploration of efficient designs for pharmaco-epidemiological effectiveness studies in chronic

diseases is an important future direction, along with research towards improving statistical

techniques for advanced analyses to account for the longitudinal non-linear risk factor

interactions in real-world scenarios. While the “STrengthening the Reporting of OBservational

studies in Epidemiology” (STROBE) statement provides guidance on the methodological aspects

of observational studies, there are no standardised protocols for conducting such studies to date

[180].

The pharmaco-epidemiological extension directions include assessment of cardio-metabolic

outcomes of combining incretin-based therapies and SGLT-2i, the latest ADD class. This

dissertation was not designed to explore glycaemic and CV outcomes of treatment with SGLT-

2i alone, nor to assess outcomes of combining incretin-based therapies with SGLT-2i, due

to very limited data in the initial CEMR extract. As shown in Chapter 7, the popularity

of DPP-4i and SGLT-2i is dramatically increasing; therefore, assessments of novel drug

combinations are of particular interest. Chapter 7 also reports higher discontinuation rates of

novel therapies compared to the older alternatives, which may be attributable to side-effects

and also to higher medication costs. More detailed assessments of the underlying reasons for

medication cessation and the long-term outcomes of such discontinuations also present an

opportunity for future studies. As reported in this dissertation, even though most patients

with T2DM are using lipid modifying and blood pressure lowering drugs, many do not meet

LDL, Triglycerides, or SBP targets. There is a paucity of studies assessing outcomes of

therapeutic inertia associated with not intensifying lipid or blood pressure lowering therapies

when needed. The numerical estimates of the complications associated with therapeutic inertia

could motivate clinicians and patients towards more pro-active / aggressive CV risk factor

management. The direct extension of the studies reported in chapters 8-9 is the development

of an algorithm that would estimate the probabilities of achieving and sustaining cardio-

metabolic risk factor control with different glucose, lipid, and blood pressure lowering drug

intensification options under the given (current) risk factor profile of an individual patient. The

algorithm could be implemented as an open-source tool or integrated into the existing EMR

systems. Such a patient-centred approach would help health care professionals to make therapy

intensification choices in the most informed manner to maximise long-term cardio-metabolic

benefits while reducing diabetes related complications in a cost-effective way. The tool would

also help to involve and engage a patient in the decision making process without the need to

assess a huge amount of clinical literature. Finally, using this tool, health economists could

evaluate cost-benefits related to the therapy choice and risk-factor control at national levels.

Challenges and limitations of EMR-based studies were discussed in the introductory chapter

(Subsection 1.9.2), whereas particular concerns associated with individual studies were

discussed in each chapter separately. Representativeness of the CEMR database and of patients

with diabetes is discussed in Subsections 3.1 and 5.4, respectively. In general, it was

observed that the reported population-level processes are comparable to studies using national

survey data. Compared to surveys, EMR data do not suffer from selective nonresponse,

response and recall biases [99], but are biased towards unhealthy populations and commercially

insured individuals. Comparative analyses conducted during this dissertation were carefully

balanced on baseline characteristics between treatment groups and appropriately adjusted on

various confounders including demographics, existing comorbidities, and longitudinal clinical

and medication data. Nonetheless, the non-availability of data to control for patients’

socio-economic status, diet and exercise complicates direct causal interpretation.

To summarise, this thesis highlights the existing glycaemic and cardiovascular risk factor

burden at the population level. Treatment with incretin-based therapies and thiazolidinedione

provides a higher chance of achieving and sustaining glycaemic and CV risk factor control,

compared to sulfonylurea and insulin. A residual benefit on the risk of major adverse

cardiovascular events was observed among patients treated with GLP-1RA compared to other

major ADD choices. Nonetheless, patient-centred disease management to holistically control

for glycaemic and cardiovascular risk factors remains a key aspect to improve long-term

outcomes in patients with T2DM.

Bibliography

1. American Diabetes Association, Standards of Medical Care in Diabetes—2018. Diabetes Care, 2018. 41(Supplement 1): p. S4.

2. International Diabetes Federation, IDF Diabetes Atlas 6th edition. 2014, Brussels, Belgium: International Diabetes Federation.

3. World Health Organization, Definition, diagnosis and classification of diabetes mellitus and its complications: report of a WHO consultation. Part 1, Diagnosis and classification of diabetes mellitus. 1999.

4. Lloyd-Jones, D., et al., Executive summary: Heart disease and stroke statistics-2010 update: A report from the American Heart Association. Circulation, 2010. 121(7): p. e46-e215.

5. American Diabetes Association, Expert Committee on the Diagnosis and Classification of Diabetes Mellitus. Report of the expert committee on the diagnosis and classification of diabetes mellitus. Diabetes Care, 2003. 26(1): p. S5-S20.

6. Benjamin, M., Miller-Keane Encyclopedia and Dictionary of Medicine. Nursing and Allied Health. Philadelphia: Saunders, 1997.

7. Zimmet, P.Z., Diabetes and its drivers: the largest epidemic in human history? Clin Diabetes Endocrinol, 2017. 3: p. 1.

8. International Diabetes Federation, IDF Diabetes Atlas, 8 edn. Brussels, Belgium, 2017.

9. Riddle, M.C. and W.H. Herman, The Cost of Diabetes Care—An Elephant in the Room. Diabetes Care, 2018. 41(5): p. 929-932.

10. American Diabetes Association, Economic Costs of Diabetes in the US in 2017. Diabetes Care, 2018. 41(5): p. 917-928.

11. Magliano, D.J., et al., The productivity burden of diabetes at a population level. Diabetes care, 2018. 41(5): p. 979-984.

12. Bommer, C., et al., Global economic burden of diabetes in adults: projections from 2015 to 2030. Diabetes care, 2018. 41(5): p. 963-970.

13. Wild, S., et al., Global prevalence of diabetes: estimates for the year 2000 and projections for 2030. Diabetes Care, 2004. 27(5): p. 1047-53.

14. Zheng, Y., S.H. Ley, and F.B. Hu, Global aetiology and epidemiology of type 2 diabetes mellitus and its complications. Nat Rev Endocrinol, 2018. 14(2): p. 88-98.

15. Hu, F.B., Globalization of diabetes: the role of diet, lifestyle, and genes. Diabetes care, 2011. 34(6): p. 1249-1257.

16. Paul, S.K., et al., Comparison of body mass index at diagnosis of diabetes in a multi-ethnic population: A case-control study with matched non-diabetic controls. Diabetes Obes Metab, 2017. 19(7): p. 1014-1023.

17. Xu, Y., et al., Prevalence and control of diabetes in Chinese adults. Jama, 2013. 310(9): p. 948-59.

18. Rayanagoudar, G., et al., Quantification of the type 2 diabetes risk in women with gestational diabetes: a systematic review and meta-analysis of 95,750 women. Diabetologia, 2016. 59(7): p. 1403-1411.

19. Wendland, E.M., et al., Gestational diabetes and pregnancy outcomes--a systematic review of the World Health Organization (WHO) and the International Association of Diabetes in Pregnancy Study Groups (IADPSG) diagnostic criteria. BMC Pregnancy Childbirth, 2012. 12: p. 23.

20. Bellamy, L., et al., Type 2 diabetes mellitus after gestational diabetes: a systematic review and meta-analysis. Lancet, 2009. 373(9677): p. 1773-9.

21. Clausen, T.D., et al., High prevalence of type 2 diabetes and pre-diabetes in adult offspring of women with gestational diabetes mellitus or type 1 diabetes: the role of intrauterine hyperglycemia. Diabetes Care, 2008. 31(2): p. 340-6.

22. Turnbull, F.M., et al., Intensive glucose control and macrovascular outcomes in type 2 diabetes. Diabetologia, 2009. 52(11): p. 2288-98.

23. Stratton, I.M., et al., Association of glycaemia with macrovascular and microvascular complications of type 2 diabetes (UKPDS 35): prospective observational study. Bmj, 2000. 321(7258): p. 405-12.

24. Zoungas, S., et al., Effects of intensive glucose control on microvascular outcomes in patients with type 2 diabetes: a meta-analysis of individual participant data from randomised controlled trials. Lancet Diabetes Endocrinol, 2017. 5(6): p. 431-437.

25. Mannucci, E., et al., Is Glucose Control Important for Prevention of Cardiovascular Disease in Diabetes? Diabetes Care, 2013. 36(Suppl 2): p. S259-S263.

26. Holman, R.R., et al., 10-year follow-up of intensive glucose control in type 2 diabetes. NEJM, 2008. 359(15): p. 1577-89.

27. Holman, R.R., et al., Long-term follow-up after tight control of blood pressure in type 2 diabetes. NEJM, 2008. 359(15): p. 1565-76.

28. Zimmet, P., Preventing diabetic complications: a primary care perspective. Diabetes research and clinical practice, 2009. 84(2): p. 107-116.

29. Cade, W.T., Diabetes-related microvascular and macrovascular diseases in the physical therapy setting. Physical therapy, 2008. 88(11): p. 1322-1335.

30. Bourne, R.R., et al., Causes of vision loss worldwide, 1990–2010: a systematic analysis. The lancet global health, 2013. 1(6): p. e339-e349.

31. UK Prospective Diabetes Study (UKPDS) Group, Effect of intensive blood-glucose control with metformin on complications in overweight patients with type 2 diabetes (UKPDS 34). The Lancet, 1998. 352(9131): p. 854-865.

32. Patel, A., et al., Intensive blood glucose control and vascular outcomes in patients with type 2 diabetes. New England Journal of Medicine, 2008. 358(24): p. 2560-2572.

33. Skyler, J.S., et al., Intensive Glycemic Control and the Prevention of Cardiovascular Events: Implications of the ACCORD, ADVANCE, and VA Diabetes Trials. A Position Statement of the American Diabetes Association and a Scientific Statement of the American College of Cardiology Foundation and the American Heart Association. Journal of the American College of Cardiology, 2009. 53(3): p. 298-304.

34. de Boer, I.H., et al., Long-term renal outcomes of patients with type 1 diabetes mellitus and microalbuminuria: an analysis of the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications cohort. Arch Intern Med, 2011. 171(5): p. 412-20.

35. Fox, C.S., et al., Increasing cardiovascular disease burden due to diabetes mellitus: the Framingham Heart Study. Circulation, 2007. 115(12): p. 1544-50.

36. American Heart Association, Diabetes mellitus: a major risk factor for cardiovascular disease. 1999: American Heart Association.

37. Booth, G.L., et al., Relation between age and cardiovascular disease in men and women with diabetes compared with non-diabetic people: a population-based retrospective cohort study. Lancet, 2006. 368(9529): p. 29-36.

38. Booth, G.L., et al., Recent trends in cardiovascular complications among men and women with and without diabetes. Diabetes Care, 2006. 29(1): p. 32-7.

39. Benjamin, E.J., et al., Heart Disease and Stroke Statistics-2017 Update: A Report From the American Heart Association. Circulation, 2017. 135(10): p. e146-e603.

40. Goff, D.C., et al., Prevention of cardiovascular disease in persons with type 2 diabetes mellitus: current knowledge and rationale for the Action to Control Cardiovascular Risk in Diabetes (ACCORD) trial. American Journal of Cardiology, 2007. 99(12): p. S4-S20.

41. Shah, A.D., et al., Type 2 diabetes and incidence of cardiovascular diseases: a cohort study in 1·9 million people. The Lancet Diabetes & Endocrinology, 2015. 3(2): p. 105-113.

42. Del Prato, S., Megatrials in type 2 diabetes. From excitement to frustration? Diabetologia, 2009. 52(7): p. 1219-26.

43. Ismail-Beigi, F., et al., Effect of intensive treatment of hyperglycaemia on microvascular outcomes in type 2 diabetes: an analysis of the ACCORD randomised trial. The Lancet, 2010. 376(9739): p. 419-430.

44. Fox, C.S., et al., Update on prevention of cardiovascular disease in adults with type 2 diabetes mellitus in light of recent evidence: A scientific statement from the American Heart Association and the American Diabetes Association. Circulation, 2015. 132(8): p. 691-718.

45. Khunti, K., M. Kosiborod, and K.K. Ray, Legacy benefits of blood glucose, blood pressure and lipid control in people with diabetes and cardiovascular disease: Time to overcome multifactorial therapeutic inertia? Diabetes, Obesity and Metabolism, 2018.

46. Holman, R.R., et al., 10-year follow-up of intensive glucose control in type 2 diabetes. N Engl J Med, 2008. 359(15): p. 1577-89.

47. Diabetes Atlas, International diabetes federation. Press Release, Cape Town, South Africa, 2006. 4.

48. Inzucchi, S.E., et al., Management of hyperglycemia in type 2 diabetes: a patient-centered approach: position statement of the American Diabetes Association (ADA) and the European Association for the Study of Diabetes (EASD). Diabetes Care, 2012. 35(6): p. 1364-79.

49. Turner, R.C., et al., Glycemic control with diet, sulfonylurea, metformin, or insulin in patients with type 2 diabetes mellitus: progressive requirement for multiple therapies (UKPDS 49). UK Prospective Diabetes Study (UKPDS) Group. JAMA, 1999. 281(21): p. 2005-12.

50. Nathan, D.M., et al., Medical Management of Hyperglycemia in Type 2 Diabetes: A Consensus Algorithm for the Initiation and Adjustment of Therapy. Diabetes Care, 2009. 32(1): p. 193-203.

51. American Diabetes Association, Standards of Medical Care in Diabetes—2011. Diabetes Care, 2011. 34(Supplement 1): p. S11-S61.

52. Adler, A.I., et al., Newer agents for blood glucose control in type 2 diabetes: summary of NICE guidance. BMJ, 2009. 338.

53. Ross, S.A. and J.-M. Ekoé, Incretin agents in type 2 diabetes. Canadian Family Physician, 2010. 56(7): p. 639-648.

54. Hædersdal, S., et al. The Role of Glucagon in the Pathophysiology and Treatment of Type 2 Diabetes. in Mayo Clinic Proceedings. 2018. Elsevier.

55. Drucker, D.J. and A.B. Goldfine, Cardiovascular safety and diabetes drug development. The Lancet, 2011. 377(9770): p. 977-979.

56. Garber, A.J., Novel GLP-1 receptor agonists for diabetes. Expert Opinion on Investigational Drugs, 2012. 21(1): p. 45-57.

57. Holst, J.J., The physiology of glucagon-like peptide 1. Physiological Reviews, 2007. 87(4): p. 1409-1439.

58. Nauck, M.A., et al., Incretin-based therapies: viewpoints on the way to consensus. Diabetes Care, 2009. 32(Suppl 2): p. S223-231.

59. Stonehouse, A.H., T. Darsow, and D.G. Maggs, Incretin-Based Therapies. Journal of Diabetes, 2011: p. In press.

60. Baggio, L.L. and D.J. Drucker, Biology of Incretins: GLP-1 and GIP. Gastroenterology, 2007. 132(6): p. 2131-2157.

61. Charbonnel, B. and B. Cariou, Pharmacological management of type 2 diabetes: the potential of incretin-based therapies. Diabetes, Obesity and Metabolism, 2011. 13(2): p. 99-117.

62. Smilowitz, N.R., R. Donnino, and A. Schwartzbard, Glucagon-Like Peptide-1 Receptor Agonists for Diabetes Mellitus: A Role in Cardiovascular Disease. Circulation, 2014. 129(22): p. 2305-2312.

63. Scheen, A.J., Cardiovascular outcome studies with incretin-based therapies: comparison between DPP-4 inhibitors and GLP-1 receptor agonists. Diabetes Research and Clinical Practice, 2017. 127: p. 224-237.

64. Nauck, M.A., et al., Incretin-based therapies: viewpoints on the way to consensus. Diabetes care, 2009. 32(suppl 2): p. S223-S231.

65. Neumiller, J.J., Incretin-based therapies. Medical Clinics, 2015. 99(1): p. 107-129.

66. Stonehouse, A.H., T. Darsow, and D.G. Maggs, Incretin‐based therapies. Journal of diabetes, 2012. 4(1): p. 55-67.

67. Gough, S., Handbook of Incretin-based Therapies in Type 2 Diabetes. 2016: Springer.

68. Madsbad, S., Review of head‐to‐head comparisons of glucagon‐like peptide‐1 receptor agonists. Diabetes, Obesity and Metabolism, 2016. 18(4): p. 317-332.

69. Munir, K.M. and E.M. Lamos, Diabetes type 2 management: what are the differences between DPP-4 inhibitors and how do you choose? 2017, Taylor & Francis.

70. Craddy, P., H.-J. Palin, and K.I. Johnson, Comparative Effectiveness of Dipeptidylpeptidase-4 Inhibitors in Type 2 Diabetes: A Systematic Review and Mixed Treatment Comparison. Diabetes Therapy, 2014. 5(1): p. 1-41.

71. Keshavarz, K., et al., Linagliptin versus sitagliptin in patients with type 2 diabetes mellitus: a network meta-analysis of randomized clinical trials. DARU Journal of Pharmaceutical Sciences, 2017. 25(1): p. 23.

72. Sivertsen, J., et al., The effect of glucagon-like peptide 1 on cardiovascular risk. Nat Rev Cardiol, 2012. 9(4): p. 209-22.

73. Mora, P.F. and E.L. Johnson, Cardiovascular Outcome Trials Of The Incretin-Based Therapies: What Do We Know So Far? Endocr Pract, 2017. 23(1): p. 89-99.

74. Ha, S.J., et al., Preventive Effects of Exenatide on Endothelial Dysfunction Induced by Ischemia-Reperfusion Injury via KATP Channels. Arteriosclerosis, Thrombosis, and Vascular Biology, 2012. 32(2): p. 474-480.

75. Nikolaidis, L.A., et al., Effects of Glucagon-Like Peptide-1 in Patients With Acute Myocardial Infarction and Left Ventricular Dysfunction After Successful Reperfusion. Circulation, 2004. 109(8): p. 962-965.

76. Ban, K., et al., Cardioprotective and Vasodilatory Actions of Glucagon-Like Peptide 1 Receptor Are Mediated Through Both Glucagon-Like Peptide 1 Receptor–Dependent and –Independent Pathways. Circulation, 2008. 117(18): p. 2340-2350.

77. Chilton, R., et al., Cardiovascular Comorbidities of Type 2 Diabetes Mellitus: Defining the Potential of Glucagonlike peptide–1-Based Therapies. The American Journal of Medicine, 2011. 124(1, Supplement): p. S35-S53.

78. Song, X., et al., Anti-atherosclerotic effects of the glucagon-like peptide-1 (GLP-1) based therapies in patients with type 2 Diabetes Mellitus: A meta-analysis. Scientific reports, 2015. 5.

79. Noyan-Ashraf, M.H., et al., GLP-1R Agonist Liraglutide Activates Cytoprotective Pathways and Improves Outcomes After Experimental Myocardial Infarction in Mice. Diabetes, 2009. 58(4): p. 975-983.

80. Bunck, M.C., et al., One-year treatment with exenatide vs. Insulin Glargine: Effects on postprandial glycemia, lipid profiles, and oxidative stress. Atherosclerosis, 2010. 212(1): p. 223-229.

81. Schwartz, E.A., et al., Exenatide suppresses postprandial elevations in lipids and lipoproteins in individuals with impaired glucose tolerance and recent onset type 2 diabetes mellitus. Atherosclerosis, 2010. 212(1): p. 217-222.

82. Buse, J.B., et al., Switching to Once-Daily Liraglutide From Twice-Daily Exenatide Further Improves Glycemic Control in Patients With Type 2 Diabetes Using Oral Agents. Diabetes Care, 2010. 33(6): p. 1300-1303.

83. Vilsbøll, T., et al., Effects of glucagon-like peptide-1 receptor agonists on weight loss: systematic review and meta-analyses of randomised controlled trials. BMJ, 2012. 344.

84. Sun, F., et al., Effect of glucagon-like peptide-1 receptor agonists on lipid profiles among type 2 diabetes: a systematic review and network meta-analysis. Clin Ther, 2015. 37(1): p. 225-241.e8.

85. Katout, M., et al., Effect of GLP-1 mimetics on blood pressure and relationship to weight loss and glycemia lowering: results of a systematic meta-analysis and meta-regression. Am J Hypertens, 2014. 27(1): p. 130-9.

86. Food and Drug Administration. Use of Real-World Evidence To Support Regulatory Decision-Making for Medical Devices; Guidance for Industry and Food and Drug Administration Staff. 2017; Available from: https://www.fda.gov/downloads/medicaldevices/deviceregulationandguidance/guidancedocuments/ucm513027.pdf.

87. Sherman, R.E., et al., Real-world evidence—what is it and what can it tell us. N Engl J Med, 2016. 375(23): p. 2293-2297.

88. Franklin, J.M. and S. Schneeweiss, When and how can real world data analyses substitute for randomized controlled trials? Clinical Pharmacology & Therapeutics, 2017. 102(6): p. 924-933.

89. Paul, S.K., et al., Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovascular diabetology, 2015. 14(1): p. 100.

90. Khunti, K., et al., Clinical inertia with regard to intensifying therapy in people with type 2 diabetes treated with basal insulin. Diabetes, Obesity and Metabolism, 2016.

91. Giugliano, D., et al., Comment on Edelman and Polonsky. Type 2 Diabetes in the Real World: The Elusive Nature of Glycemic Control. Diabetes Care 2017;40:1425–1432. Diabetes Care, 2018. 41(2): p. e17.

92. Edelman, S.V. and W.H. Polonsky, Type 2 Diabetes in the Real World: The Elusive Nature of Glycemic Control. Diabetes Care, 2017. 40(11): p. 1425-1432.

93. Crapo, J., Big Data in Healthcare: Separating The Hype From The Reality. HealthCatalyst, 2015: p. 5.

94. Crawford, A.G., et al., Comparison of GE Centricity Electronic Medical Record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Popul Health Manag, 2010. 13(3): p. 139-50.

95. Grabenbauer, L., A. Skinner, and J. Windle, Electronic Health Record Adoption - Maybe It's not about the Money: Physician Super-Users, Electronic Health Records and Patient Care. Appl Clin Inform, 2011. 2(4): p. 460-71.

96. Birkhead, G.S., M. Klompas, and N.R. Shah, Uses of electronic health records for public health surveillance to advance public health. Annu Rev Public Health, 2015. 36: p. 345-59.

97. Coloma, P.M., et al., Combining electronic healthcare databases in Europe to allow for large‐scale drug safety monitoring: the EU‐ADR Project. Pharmacoepidemiology and drug safety, 2011. 20(1): p. 1-11.

98. Kosiborod, M., et al., Lower Risk of Heart Failure and Death in Patients Initiated on Sodium-Glucose Cotransporter-2 Inhibitors Versus Other Glucose-Lowering Drugs: The CVD-REAL Study (Comparative Effectiveness of Cardiovascular Outcomes in New Users of Sodium-Glucose Cotransporter-2 Inhibitors). Circulation, 2017. 136(3): p. 249-259.

99. Verheij, A.R., et al., Possible Sources of Bias in Primary Care Electronic Health Record Data Use and Reuse. J Med Internet Res, 2018. 20(5): p. e185.

100. Regier, E.E., M.V. Venkat, and K.L. Close, More than 7 years of hindsight: revisiting the FDA’s 2008 guidance on cardiovascular outcomes trials for Type 2 diabetes medications. Clinical Diabetes, 2016. 34(4): p. 173-180.

101. Nissen, S.E. and K. Wolski, Effect of rosiglitazone on the risk of myocardial infarction and death from cardiovascular causes. New England Journal of Medicine, 2007. 356(24): p. 2457-2471.

102. Scirica, B.M., et al., Saxagliptin and cardiovascular outcomes in patients with type 2 diabetes mellitus. New England Journal of Medicine, 2013. 369(14): p. 1317-1326.

103. McMurray, J.J.V., et al., Heart failure: a cardiovascular outcome in diabetes that can no longer be ignored. The Lancet Diabetes & Endocrinology, 2015. 2(10): p. 843-851.

104. Marso, S.P., et al., Liraglutide and cardiovascular outcomes in type 2 diabetes. New England Journal of Medicine, 2016. 375(4): p. 311-322.

105. Monami, M., et al., Dipeptidyl peptidase‐4 inhibitors and cardiovascular risk: a meta‐analysis of randomized clinical trials. Diabetes, Obesity and Metabolism, 2013. 15(2): p. 112-120.

106. Monami, M., et al., Effects of glucagon‐like peptide‐1 receptor agonists on cardiovascular risk: a meta‐analysis of randomized clinical trials. Diabetes, Obesity and Metabolism, 2014. 16(1): p. 38-47.

107. Wu, S., et al., The cardiovascular effect of incretin-based therapies among type 2 diabetes: a systematic review and network meta-analysis. Expert opinion on drug safety, 2018: p. 1-7.

108. Gamble, J.M., et al., Incretin-based medications for type 2 diabetes: an overview of reviews. Diabetes Obes Metab, 2015. 17(7): p. 649-58.

109. Patorno, E., et al., Comparative Cardiovascular Safety of Glucagon-Like Peptide-1 Receptor Agonists versus Other Antidiabetic Drugs in Routine Care: a Cohort Study. Diabetes, Obesity and Metabolism, 2016: p. n/a-n/a.

110. Kannan, S., et al., Risk of overall mortality and cardiovascular events in patients with type 2 diabetes on dual drug therapy including metformin: A large database study from the Cleveland Clinic. J Diabetes, 2016. 8(2): p. 279-85.

111. d’Agostino, R.B., Tutorial in biostatistics: propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Stat Med, 1998. 17(19): p. 2265-2281.

112. Rubin, D.B., Using multivariate matched sampling and regression adjustment to control bias in observational studies. ETS Research Report Series, 1978. 1978(2).

113. Hade, E.M. and B. Lu, Bias associated with using the estimated propensity score as a regression covariate. Statistics in medicine, 2014. 33(1): p. 74-87.

114. Zghebi, S., et al., Comparative risk of major cardiovascular events associated with second‐line antidiabetic treatments: a retrospective cohort study using UK primary care data linked to hospitalization and mortality records. Diabetes, Obesity and Metabolism, 2016. 18(9): p. 916-924.

115. Gamble, J.-M., et al., Comparative effectiveness of incretin-based therapies and the risk of death and cardiovascular events in 38,233 metformin monotherapy users. Medicine, 2016. 95(26).

116. Paul, S.K., et al., The association of the treatment with glucagon-like peptide-1 receptor agonist exenatide or insulin with cardiovascular outcomes in patients with type 2 diabetes: A retrospective observational study. Cardiovascular Diabetology, 2015. 14(1).

117. Best, J.H., et al., Risk of cardiovascular disease events in patients with type 2 diabetes prescribed the Glucagon-Like Peptide 1 (GLP-1) receptor agonist exenatide twice daily or other glucose-lowering therapies: A retrospective analysis of the lifelink database. Diabetes Care, 2011. 34(1): p. 90-95.

118. Velez, M., et al., Association of antidiabetic medications targeting the glucagon-like peptide 1 pathway and heart failure events in patients with diabetes. J Card Fail, 2015. 21(1): p. 2-8.

119. Mogensen, U.M., et al., Cardiovascular safety of combination therapies with incretin-based drugs and metformin compared with a combination of metformin and sulphonylurea in type 2 diabetes mellitus - a retrospective nationwide study. Diabetes Obes Metab, 2014.

120. Fu, A.Z., et al., Association Between Hospitalization for Heart Failure and Dipeptidyl Peptidase-4 Inhibitors in Patients With Type 2 Diabetes: An Observational Study. Diabetes care, 2016: p. dc150764.

121. Weir, D.L., et al., Sitagliptin use in patients with diabetes and heart failure: a population-based retrospective cohort study. JACC Heart Fail, 2014. 2(6): p. 573-82.

122. Scheller, N.M., et al., All-cause mortality and cardiovascular effects associated with the DPP-IV inhibitor sitagliptin compared with metformin, a retrospective cohort study on the Danish population. Diabetes Obes Metab, 2014. 16(3): p. 231-6.

123. Kim, S.C., et al., Dipeptidyl peptidase-4 inhibitors do not increase the risk of cardiovascular events in type 2 diabetes: a cohort study. Acta Diabetol, 2014. 51(6): p. 1015-23.

124. Wang, K.L., et al., Sitagliptin and the risk of hospitalization for heart failure: a population-based study. Int J Cardiol, 2014. 177(1): p. 86-90.

125. Morgan, C.L., et al., Combination therapy with metformin plus sulphonylureas versus metformin plus DPP-4 inhibitors: association with major adverse cardiovascular events and all-cause mortality. Diabetes Obes Metab, 2014. 16(10): p. 977-83.

126. GE Healthcare, Centricity Electronic Medical Record Brochure. 2011.

127. Brixner, D., et al., Six‐month outcomes on A1C and cardiovascular risk factors in patients with type 2 diabetes treated with exenatide in an ambulatory care setting. Diabetes, Obesity and Metabolism, 2009. 11(12): p. 1122-1130.

128. Brixner, D., et al., Assessment of cardiometabolic risk factors in a national primary care electronic health record database. Value in health, 2007. 10: p. S29-S36.

129. Inzucchi, S., et al., Progression to insulin therapy among patients with type 2 diabetes treated with sitagliptin or sulphonylurea plus metformin dual therapy. Diabetes, Obesity and Metabolism, 2015. 17(10): p. 956-964.

130. Levin, P., et al., Therapeutically interchangeable? A study of real‐world outcomes associated with switching basal insulin analogues among US patients with type 2 diabetes mellitus using electronic medical records data. Diabetes, Obesity and Metabolism, 2015. 17(3): p. 245-253.

131. Chitnis, A.S., et al., Clinical effectiveness of liraglutide across body mass index in patients with type 2 diabetes in the United States: a retrospective cohort study. Advances in therapy, 2014. 31(9): p. 986-999.

132. Davis, K.L., et al., Real-world comparative outcomes of US type 2 diabetes patients initiating analog basal insulin therapy. Current medical research and opinion, 2013. 29(9): p. 1083-1091.

133. Ashton, V., et al., LDL-C levels in US patients at high cardiovascular risk receiving rosuvastatin monotherapy. Clinical therapeutics, 2014. 36(5): p. 792-799.

134. Chopra, I. and K.M. Kamal, Factors associated with therapeutic goal attainment in patients with concomitant hypertension and dyslipidemia. Hospital Practice, 2014. 42(2): p. 77-88.

135. Saseen, J.J., et al., Maintaining goal blood pressures after switching from olmesartan to other angiotensin receptor blockers. The Journal of Clinical Hypertension, 2013. 15(12): p. 888-892.

136. Crawford, A.G., et al., Prevalence of obesity, type II diabetes mellitus, hyperlipidemia, and hypertension in the United States: findings from the GE Centricity Electronic Medical Record database. Population health management, 2010. 13(3): p. 151-161.

137. Brixner, D., et al., Evaluation of cardiovascular risk factors, events, and costs across four BMI categories. Obesity, 2013. 21(6): p. 1284-1292.

138. DerSarkissian, M., et al., Maintenance of weight loss or stability in subjects with obesity: a retrospective longitudinal analysis of a real-world population. Current Medical Research and Opinion, 2017. 33(6): p. 1105-1110.

139. Tandon, N., et al., PSY64 GE Centricity® Electronic Medical Records Study: Comorbidities And Biologic Experience Among Patients Receiving Golimumab. Value in Health, 2011. 14(3): p. A70-A71.

140. Wang, J., et al., New diagnosis of hypertension among celecoxib and nonselective NSAID users: a population-based cohort study. Annals of Pharmacotherapy, 2007. 41(6): p. 937-943.

141. Rajagopalan, V., et al., SAT0069 Performance of the Framingham Cardiovascular Risk Prediction Model with and without C-Reactive Protein or Erythrocyte Sedimentation Rate in RA: Analysis of US Electronic Medical Records Database. Annals of the Rheumatic Diseases, 2014. 73(Suppl 2): p. 615-615.

142. Paul, S.K., et al. Effectiveness of biologic and non-biologic antirheumatic drugs on anaemia markers in 153,788 patients with rheumatoid arthritis: new evidence from real-world data. in Seminars in arthritis and rheumatism. 2018. Elsevier.

143. Marelli, C., et al., Statins and risk of cancer: a retrospective cohort analysis of 45,857 matched pairs from an electronic medical records database of 11 million adult Americans. Journal of the American College of Cardiology, 2011. 58(5): p. 530-537.

144. Talal, A., et al., Absolute and relative contraindications to pegylated‐interferon or ribavirin in the US general patient population with chronic hepatitis C: results from a US database of over 45 000 HCV‐infected, evaluated patients. Alimentary pharmacology & therapeutics, 2013. 37(4): p. 473-481.

145. Unni, S., et al., Hypertension control and antihypertensive therapy in patients with chronic kidney disease. American journal of hypertension, 2014. 28(6): p. 814-822.

146. Patel, A., et al., Care Provision and Prescribing Practices of Physicians Treating Children and Adolescents With ADHD. Psychiatric Services, 2017: p. appi. ps. 201600130.

147. World Health Organization Collaborating Centre for Drug Statistics Methodology. ATC. 2011; Available from: https://www.whocc.no/atc/structure_and_principles/.

148. US Food and Drug Administration. FDA Approved Drug Products. 2017; Available from: https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm.

149. US Food and Drug Administration, FDA approves weight-management drug Saxenda. 2014.

150. Hampp, C., et al., Use of antidiabetic drugs in the US, 2003–2012. Diabetes care, 2014. 37(5): p. 1367-1374.

151. Yurkovich, M., et al., A systematic review identifies valid comorbidity indices derived from administrative health data. Journal of Clinical Epidemiology, 2015. 68(1): p. 3-14.

152. Needham, D.M., et al., A systematic review of the Charlson comorbidity index using Canadian administrative databases: a perspective on risk adjustment in critical care research. Journal of critical care, 2005. 20(1): p. 12-19.

153. Elixhauser, A., et al., Comorbidity measures for use with administrative data. Medical care, 1998. 36(1): p. 8-27.

154. Von Korff, M., E.H. Wagner, and K. Saunders, A chronic disease score from automated pharmacy data. Journal of clinical epidemiology, 1992. 45(2): p. 197-203.

155. Khan, N.F., et al., Adaptation and validation of the Charlson Index for Read/OXMIS coded databases. BMC family practice, 2010. 11(1): p. 1.

156. Bannay, A., et al., The best use of the charlson comorbidity index with electronic health care database to predict mortality. Medical care, 2016. 54(2): p. 188-194.

157. Charlson, M.E., et al., A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. Journal of chronic diseases, 1987. 40(5): p. 373-383.

158. Quan, H., et al., Updating and validating the Charlson comorbidity index and score for risk adjustment in hospital discharge abstracts using data from 6 countries. American journal of epidemiology, 2011. 173(6): p. 676-682.

159. Denti, L., et al., Validity of the modified Charlson comorbidity index as predictor of short-term outcome in older stroke patients. Journal of Stroke and Cerebrovascular Diseases, 2015. 24(2): p. 330-336.

160. Quan, H., et al., Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Medical care, 2005: p. 1130-1139.

161. Deyo, R.A., D.C. Cherkin, and M.A. Ciol, Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. Journal of clinical epidemiology, 1992. 45(6): p. 613-619.

162. U.S. National Library of Medicine. SNOMED CT to ICD-10-CM Map. 2016; Available from: https://www.nlm.nih.gov/research/umls/mapping_projects/snomedct_to_icd10cm.html.

163. Kamal, K.M., et al., Use of electronic medical records for clinical research in the management of type 2 diabetes. Res Social Adm Pharm, 2014. 10(6): p. 877-84.

164. Su, C.C., et al., Risk of diabetes in patients with rheumatoid arthritis: a 12-year retrospective cohort study. J Rheumatol, 2013. 40(9): p. 1513-8.

165. Seidu, S., et al., Prevalence and characteristics in coding, classification and diagnosis of diabetes in primary care. Postgraduate medical journal, 2014. 90(1059): p. 13-17.

166. Shivade, C., et al., A review of approaches to identifying patient phenotype cohorts using electronic health records. Journal of the American Medical Informatics Association, 2014. 21(2): p. 221-230.

167. Tapak, L., et al., Real-Data Comparison of Data Mining Methods in Prediction of Diabetes in Iran. Healthcare Informatics Research, 2013. 19(3): p. 177-185.

168. Mani, S., et al., Type 2 diabetes risk forecasting from EMR data using machine learning. AMIA Annu Symp Proc, 2012. 2012: p. 606-15.

169. American Diabetes Association, Standards of Medical Care in Diabetes—2017: Summary of Revisions. Diabetes Care, 2017. 40(Supplement 1): p. S4-S5.

170. Witten, I.H., et al., Data Mining: Practical machine learning tools and techniques. 2016: Morgan Kaufmann.

171. Leung, K.M., Naive bayesian classifier. Polytechnic University Department of Computer Science/Finance and Risk Engineering, 2007.

172. Ng, A.Y. and M.I. Jordan. On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. in Advances in neural information processing systems. 2002.

173. Cristianini, N. and J. Shawe-Taylor, An introduction to support vector machines and other kernel-based learning methods. 2000: Cambridge university press.

174. Murtagh, F., Multilayer perceptrons for classification and regression. Neurocomputing, 1991. 2(5): p. 183-197.

175. Bhargava, N., et al., Decision tree analysis on j48 algorithm for data mining. Proceedings of International Journal of Advanced Research in Computer Science and Software Engineering, 2013. 3(6).

176. Holte, R.C., Very simple classification rules perform well on most commonly used datasets. Machine learning, 1993. 11(1): p. 63-90.

177. Centers for Disease Control and Prevention, National diabetes statistics report: estimates of diabetes and its burden in the United States, 2018. Atlanta, GA: US Department of Health and Human Services, 2018.

178. Paul, S., J. Shaw, and K. Klein. Therapeutic inertia for glycaemic and blood pressure control in patients with type 2 diabetes mellitus and the cardiovascular consequences. in Diabetologia. 2015. Springer 233 Spring St, New York, NY 10013 USA.

179. Mata‐Cases, M., et al., Therapeutic inertia in patients treated with two or more antidiabetics in primary care: Factors predicting intensification of treatment. Diabetes, Obesity and Metabolism, 2018. 20(1): p. 103-112.

180. Von Elm, E., et al., The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: guidelines for reporting observational studies. International journal of surgery, 2014. 12(12): p. 1495-1499.

APPENDIX A


Addition of or switch to insulin therapy in people treated with glucagon-like peptide-1 receptor agonists: A real-world study in 66 583 patients

Olga Montvida MSc1,2 | Kerenaftali Klein PhD1 | Sudhesh Kumar MD3 | Kamlesh Khunti PhD4 | Sanjoy K. Paul PhD1

1Clinical Trials and Biostatistics Unit, QIMR Berghofer Medical Research Institute, Brisbane, Australia

2School of Biomedical Sciences, Institute of Health and Biomedical Innovation, Faculty of Health, Queensland University of Technology, Brisbane, Australia

3Warwick Medical School, University of Warwick, and University Hospitals Coventry and Warwickshire, Coventry, UK

4Diabetes Research Centre, Leicester Diabetes Centre, University of Leicester, Leicester, UK

Corresponding Author: Sanjoy K. Paul, PhD, Clinical Trials and Biostatistics Unit, QIMR Berghofer Medical Research Institute, 300 Herston Road, Herston, QLD 4006 Brisbane, Australia (Sanjoy. [email protected]).

Funding Information

No separate funding was obtained for this study.

Background: Real-world outcomes of addition of or switch to insulin therapy in type 2 diabetes (T2DM) patients on glucagon-like peptide-1 receptor agonist (GLP-1RA) with inadequately controlled hyperglycaemia are not known.

Materials and methods: Patients with T2DM (n = 66 583) with a minimum of 6 months of GLP-1RA treatment and without previous insulin treatment were selected. Those who added insulin (n = 39 599) or switched to insulin after GLP-1RA cessation (n = 4706) were identified. Adjusted changes in glycated haemoglobin (HbA1c), weight, systolic blood pressure (SBP) and LDL cholesterol were estimated over 24 months follow-up.

Results: Among those who continued with GLP-1RA treatment without adding or switching to insulin, the highest adjusted mean HbA1c change was achieved within 6 months, with no further glycaemic benefits observed during 24 months of follow-up. Addition of insulin within 6 months of GLP-1RA initiation was associated with 18% higher odds of achieving HbA1c <7% at 24 months, compared with adding insulin later. At 24 months, those who added insulin reduced HbA1c significantly by 0.55%, while no glycaemic benefit was observed in those who switched to insulin. Irrespective of intensification with insulin, weight, SBP and LDL cholesterol were significantly reduced by 3 kg, 3 mm Hg and 0.2 mmol/L, respectively, over 24 months.

Conclusions: Significant delay in intensification of treatment by addition of insulin is observed in patients with T2DM inadequately controlled with GLP-1RA. Earlier addition of insulin is associated with better glycaemic control, while switching to insulin is not clinically beneficial during 2 years of treatment. Non-responding patients on GLP-1RA would benefit from adding insulin therapy, rather than switching to insulin.

KEYWORDS

GLP-1 analogue, insulin therapy, pharmaco-epidemiology, type 2 diabetes

1 | INTRODUCTION

People with diabetes are at increased risk of developing disabling and life-threatening health problems, including microvascular and macrovascular complications.1,2 Good control of hyperglycaemia and the associated risk factors in type 2 diabetes (T2DM) has been shown to reduce the risk of these complications.1,3 Thus anti-hyperglycaemic treatment strategies should ideally also address the management of cardiovascular risk factors, including body weight, blood pressure and lipids.4 Novel antihyperglycaemic glucagon-like peptide-1 receptor agonist (GLP-1RA) therapies, including exenatide (EXE) and liraglutide (LIRA), have the potential to address these challenges.5–9

The combination of GLP-1RA and insulin treatments represents a

promising glycaemic management strategy because of the complementary

mechanisms of action of these therapies.10 Both therapies

affect body weight, but in opposite directions; while significant

weight reductions have been observed in patients treated with GLP-

1RA, insulin is known to significantly increase body weight.11,12

Received: 11 June 2016 Revised: 9 September 2016 Accepted: 11 September 2016

DOI 10.1111/dom.12790

Diabetes Obes Metab; 9999: •• wileyonlinelibrary.com/journal/dom © 2016 John Wiley & Sons Ltd 1



A meta-analysis of clinical trials conducted by Eng et al.13 showed

that the combination of a GLP-1RA with basal insulin resulted in

robust glycaemic control without increased risk of hypoglycaemia or

weight gain. The effectiveness of adding GLP-1RA to basal insulin

therapy on glucose and weight control in patients with T2DM has

also been evaluated in a number of observational and real-world

data-based studies.10,14–20 The intensification of insulin therapy by

addition of GLP-1RA, rather than adding mealtime insulin, has been

shown to be an attractive therapeutic option,13,21 and is now recom-

mended in international guidelines: “The available data now suggest

that either a GLP-1 receptor agonist or prandial insulin could be used

in this setting [in patients not achieving target glycated haemoglobin

(HbA1c)], with the former arguably safer, at least for short-term

outcomes.”21

A significant number of patients treated with GLP-1RAs also

intensify therapy by adding insulin, or switching to insulin therapy,

primarily because of sub-optimal glycaemic control; however, to the

best of our knowledge, only three observational studies (n = 44 to

n = 432) have evaluated the effectiveness of adding insulin to GLP-

1RA therapy.20 Moreover, these studies did not explore the HbA1c

trajectories over time to understand the longitudinal patterns of

glycaemic failure. Furthermore, no real-world study, to the best of

our knowledge, has explored the dynamics of changes in glycaemic

and cardiovascular risk factors from the time of GLP-1RA initiation

through to the transition phase of adding or switching to insulin ther-

apy. The real-world patterns of adding or switching to insulin therapy

from initial GLP-1RA treatment are also not well understood. Further,

amongst patients with sub-optimal glycaemic control who intensify

GLP-1RA therapy by addition of insulin, it is not known whether

earlier intensification is beneficial for sustainable glucose control.

The aims of the present longitudinal cohort study were to evalu-

ate, from the time of initiation of GLP-1RA therapy in patients with

T2DM: (1) changes in HbA1c, body weight, blood pressure and LDL

cholesterol over 6, 12 and 24 months of follow-up; (2) possible bene-

fits of adding or switching to insulin treatment; and (3) the likelihood

of clinically meaningful HbA1c reduction in those who intensified

GLP-1RA treatment with insulin earlier, compared with those who

added insulin later.


2.1 | Data source

The Centricity Electronic Medical Record (CEMR) database was used

for this study. The CEMR represents a variety of ambulatory medical

practices from 49 US states, including solo practitioners, community

clinics, academic medical centres and large integrated delivery net-

works. The CEMR database consists of >35 000 physicians and other

providers, of whom ~75% are primary care providers. The database

has been extensively used for academic research worldwide.5,22–24

The CEMR database contains detailed prescription information

with dates of prescription, including information on medications that

were purchased over the counter or prescribed outside of the EMR

network. The main medication information data set stores individual

records in the form of start/stop dates, along with several specific

fields to track treatment adjustments and alterations over time.

From >2.4 million patients with a confirmed diagnosis of T2DM,

a cohort of 134 268 patients who received treatment with GLP-1RAs

between April 2005 and October 2014 was identified. The final study

cohort of 66 583 patients was selected on the basis of the following

criteria: (1) diagnosis of diabetes from January 1990; (2) age

≥18 years at the diagnosis of diabetes; (3) no insulin therapy before

the initiation of GLP-1RA treatment; (4) no missing data on age, sex,

HbA1c and body weight at GLP-1RA initiation; and (5) minimum

6 months of continuous treatment with GLP-1RA from the first

recorded prescription date.

In the final study cohort, those treated with EXE and those trea-

ted with LIRA were identified. Those who added insulin therapy to

existing GLP-1RA treatment (GLP-1RA + INS), and those who

switched to insulin therapy on cessation of GLP-1RA treatment

(GLP-1RA → INS), were identified by comparing insulin initiation dates and

GLP-1RA cessation dates. Time to addition of/switch to insulin ther-

apy was calculated as the time difference between GLP-1RA and

insulin initiation dates. Insulin therapy was identified by any insulin

regimen (basal, biphasic or prandial). The majority of the patients in

the EXE group (94%) were treated with a twice-daily EXE regimen.
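
The grouping rule described above can be illustrated with a minimal sketch. It assumes that insulin started while GLP-1RA treatment was still active counts as an addition, and that insulin started on or after the GLP-1RA cessation date counts as a switch; the function names, the labels and the 30.4375-day month are illustrative assumptions, not the authors' implementation.

```python
from datetime import date
from typing import Optional


def classify_insulin_transition(
    glp1ra_stop: Optional[date],
    insulin_start: Optional[date],
) -> str:
    """Label a patient as 'GLP-1RA only', 'added' or 'switched'."""
    if insulin_start is None:
        return "GLP-1RA only"
    # Assumption: insulin begun before GLP-1RA cessation (or with no recorded
    # cessation) is an addition; insulin begun on or after cessation is a switch.
    if glp1ra_stop is None or insulin_start < glp1ra_stop:
        return "added"
    return "switched"


def months_to_insulin(glp1ra_start: date, insulin_start: date) -> float:
    """Time from GLP-1RA initiation to insulin initiation, in months."""
    # Assumption: average month length of 365.25 / 12 = 30.4375 days.
    return (insulin_start - glp1ra_start).days / 30.4375
```

For a hypothetical patient who starts a GLP-1RA on 1 January 2010 and begins insulin on 1 June 2010 without stopping the GLP-1RA, the sketch returns "added" with a time to insulin of about 5 months.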

Demographic, clinical and laboratory information included age,

sex, ethnicity and longitudinal measures of body weight, body mass

index, systolic blood pressure (SBP), diastolic blood pressure, HbA1c,

and lipids. Clinical and laboratory data at GLP-1RA treatment initiation

(index date) were included on the basis of a 3-month window,

on or prior to the index date. Follow-up clinical and laboratory mea-

sures were arranged longitudinally on the basis of non-overlapping

6-monthly windows which were defined progressively from the time

of GLP-1RA treatment initiation. Complete information on antihyper-

glycaemic agents, antihypertensive agents, cardio-protective medica-

tions (CPMs), weight-lowering and anti-depressant drugs was

obtained, along with dates of prescriptions. The CPMs included sta-

tins, angiotensin-converting enzyme inhibitors and angiotensin II

receptor blockers. The status of different medication intakes was

defined by whether it was taken during GLP-1RA treatment, and the

treatment durations for such medications were estimated.
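
The windowing of measurements described above can be sketched as follows. The day counts (91 days for the 3-month baseline window, 183 days for each 6-monthly follow-up window) and the function names are assumptions made for illustration; the paper does not state how window boundaries were computed.

```python
from datetime import date
from typing import Optional


def is_baseline(measure_date: date, index_date: date, window_days: int = 91) -> bool:
    """True if the measurement falls on or within ~3 months before the index date."""
    # Assumption: "3 months" is approximated as 91 days.
    delta = (index_date - measure_date).days
    return 0 <= delta <= window_days


def followup_window(
    measure_date: date, index_date: date, width_days: int = 183
) -> Optional[int]:
    """Return 1 for the first 6-monthly window after the index date, 2 for the next, etc."""
    delta = (measure_date - index_date).days
    if delta <= 0:
        return None  # on or before the index date: not a follow-up measurement
    # Windows are non-overlapping: days 1..183 -> 1, days 184..366 -> 2, and so on.
    return (delta - 1) // width_days + 1
```

Under these assumptions, windows 1, 2 and 4 correspond to the 6-, 12- and 24-month follow-up points used in the analyses.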


While complete data on HbA1c and body weight were available at

index date (by design), the proportion of patients with missing longi-

tudinal data on body weight, SBP, LDL cholesterol and HbA1c over

24 months of follow-up ranged from 9% to 19%. The missing longitu-

dinal follow-up data were imputed using the multiple imputation

approach, with adjustments for age at index date and use of oral anti-

hyperglycaemic drugs during follow-up. All primary analyses were

conducted using the imputed data, with additional analyses con-

ducted for sensitivity assessment based on complete cases.
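
The paper does not state how estimates from the imputed data sets were combined. A conventional choice is Rubin's rules, sketched below under that assumption; the function name and the input numbers in the usage note are hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(
    estimates: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Pool per-imputation estimates and their variances using Rubin's rules.

    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)           # average of the m point estimates
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total
```

With three hypothetical imputations giving HbA1c changes of −0.70, −0.74 and −0.72, each with variance 0.01, the pooled change is −0.72 and the total variance is slightly above 0.01 because of the between-imputation component.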

The mean (95% confidence interval [CI]) changes in HbA1c, body

weight, SBP and LDL cholesterol at 6, 12 and 24 months from index

date were estimated using multivariate regression models. Risk factor

changes were adjusted for age at index date, sex, ethnicity and con-

comitant antihyperglycaemic, antihypertensive and weight-lowering

2 MONTVIDA ET AL.


treatments, weighted by the respective baseline measures, as appropri-

ate. Separate analyses for HbA1c and weight changes were conducted

for patients who continued to receive only GLP-1RA treatment over

6, 12 and 24 months, and for those who added or switched to insulin

during follow-up. Robust estimates of the CIs were obtained.

For patients with HbA1c >7.5% at the time of GLP-1RA initia-

tion, the proportions of patients who achieved HbA1c below 7% at

6, 12 and 24 months of treatment were evaluated for all groups. Pro-

portions of those who achieved weight loss ≥5% from initial body

weight at 6, 12 and 24 months after GLP-1RA initiation were also

calculated. The characteristics of the patients who added or switched

to insulin were presented at the index date and at the time of transi-

tion to insulin.
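
The two response definitions used above can be written out as a minimal sketch. The thresholds are taken from the text (baseline HbA1c >7.5%, target HbA1c <7%, weight loss ≥5%); the function names and the data in the usage note are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def proportion_reaching_target(
    pairs: Iterable[Tuple[float, float]],
    baseline_threshold: float = 7.5,
    target: float = 7.0,
) -> Optional[float]:
    """Share of patients with baseline HbA1c above the threshold whose
    follow-up HbA1c is below the target. Each pair is (baseline, follow-up)."""
    eligible = [(b, f) for b, f in pairs if b > baseline_threshold]
    if not eligible:
        return None  # no patient meets the baseline criterion
    reached = sum(1 for _, f in eligible if f < target)
    return reached / len(eligible)


def lost_at_least_5_percent(baseline_kg: float, followup_kg: float) -> bool:
    """True if body weight fell by at least 5% of the baseline value."""
    return (baseline_kg - followup_kg) / baseline_kg >= 0.05
```

For example, a hypothetical patient weighing 109 kg at baseline (the cohort mean) would need to reach about 103.5 kg or less to count as having lost ≥5% of body weight.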

3 | RESULTS

In the cohort of 66 583 patients with minimum 6 months of treat-

ment with GLP-1RA, the mean (standard deviation) age was

56 (11) years, 28 959 (43%) were male, 45 291 (68%) were white,

51 719 (87%) were obese, 3404 (26%)/858 (7%) had micro-/macro-

albuminuria, and 17 415 (26%) had a history of hypertension at the

time of initiation of GLP-1RA treatment (Table 1). The use of differ-

ent medications during GLP-1RA therapy, along with their durations

of treatment, are shown in Table 1.

3.1 | Glycaemic control

With mean HbA1c of 8.2% at index date, among those who continued

with GLP-1RA treatment without adding or switching to insulin, the

highest adjusted mean HbA1c change was achieved within 6 months

(−0.73%; 95% CI −0.73, −0.71), with no further glycaemic benefits at

12 months (−0.65%; 95% CI −0.65, −0.62) or 24 months of follow-up

(−0.59%; 95% CI −0.60, −0.58; Table 2; Figure 1A; all P < 0.01). Among

patients with HbA1c ≥7.5% at index date, who did not add or switch

to insulin, and who continued with GLP-1RA treatment only for

12 and 24 months (n = 14 682 and n = 6825), 26% in each group achieved

HbA1c levels below 7% at 12 and 24 months, respectively (Table 2).

Among those who added insulin during follow-up (GLP-1RA +

INS, n = 39 599), the mean HbA1c values at index date and at the

time of adding insulin were 8.3% and 8.8%, respectively. The median

time to intensification with insulin was 3 months. Among these

patients, 84% and 71% had HbA1c levels above 7.5% and 8%,

respectively, at insulin initiation. Those who added insulin within

6 months of GLP-1RA initiation achieved significantly greater

(P < 0.001) adjusted HbA1c reduction at 24 months of follow-up

(−0.58%; 95% CI −0.61, −0.57), compared with those who added

insulin after 12 months (−0.41%; 95% CI −0.43, −0.40; Figure 1A).

Those who added insulin within 6 months of GLP-1RA initiation were

18% (odds ratio 1.18; 95% CI 1.09, 1.28; P < 0.001) more likely to

achieve HbA1c below 7% at 24 months of follow-up, compared with

those who added insulin treatment later.
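
An odds ratio of 1.18 describes a change in odds, not in probability. The sketch below converts an odds ratio into an implied probability for a given reference probability; the 23% reference value in the usage note is an illustrative assumption, not a figure reported for this comparison.

```python
def probability_from_odds_ratio(reference_probability: float, odds_ratio: float) -> float:
    """Probability in the comparison group implied by an odds ratio
    relative to a reference group with the given probability."""
    if not 0 < reference_probability < 1:
        raise ValueError("reference_probability must lie strictly between 0 and 1")
    reference_odds = reference_probability / (1 - reference_probability)
    comparison_odds = reference_odds * odds_ratio
    return comparison_odds / (1 + comparison_odds)
```

If, hypothetically, 23% of those who added insulin later reached HbA1c below 7%, an odds ratio of 1.18 would correspond to about 26% among early adders, that is, a relative increase in probability of roughly 13% rather than 18%.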

The 6-monthly trajectories (mean, 95% CI) of HbA1c levels for

those who switched to insulin therapy within 24 months (n = 2483;

mean time to insulin 14 months; Table 3) are shown in Figure 1B. In

these patients, the mean HbA1c increased to 9.3% at the time of

switching to insulin from a mean HbA1c of 8.5% at GLP-1RA initiation,

and 80% of them had HbA1c above 8% at insulin initiation (Table 3).

Notably, these patients did not achieve better glycaemic control at

24 months compared with their glycaemic status at index date

(Figure 1B,C). In contrast, patients who added insulin within 24 months

of follow-up (n = 36 113; mean time to insulin 3 months; Table 3)

experienced a significant HbA1c reduction of 0.55% (95% CI 0.54,

0.57) at 24 months compared with the index date (Figure 1C). The

adjusted means (95% CI) of change in HbA1c and body weight at

6, 12 and 24 months after GLP-1RA initiation, for those who ceased

GLP-1RA after 6 months of initiation and switched to insulin between

6 and 12, 12 and 18, and 18 and 24 months, are shown in Table S1.

3.2 | Weight change

With a baseline body weight of 109 kg (Table 1), patients with a minimum

12 months of treatment with GLP-1RA had a significant

adjusted weight reduction (mean reduction 2.5 kg [95% CI 2.50, 2.51]),

and 24% reduced their body weight by ≥5% (Table 4). Among those

who continued GLP-1RA treatment only (without addition or switch

to insulin) for 24 months, the average weight reduction from index

date was 3.31 kg (95% CI 3.30, 3.32), and a third of patients achieved

a weight loss of ≥5%. Patients who added insulin achieved marginally

higher weight reduction (adjusted) at 12 and 24 months (2.93 and

3.40 kg, respectively; Table 4), compared with those who did not add

or switch to insulin therapy (2.50 and 3.31 kg, respectively).

3.3 | Associations of glucose and weight loss

Among patients with a minimum of 12 months of GLP-1RA treat-

ment, 78% and 67% had reductions in HbA1c and body weight,

respectively, from the index date, while 53% had reductions in both

body weight and HbA1c (similar in patients treated with LIRA and

EXE; Figure 1D). At 12 months of follow-up, 8% and 7% of patients

had reductions in neither HbA1c nor weight in the EXE and

LIRA groups, respectively.
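
The percentages in this paragraph can be cross-checked with the inclusion-exclusion principle: the share with a reduction in at least one measure is the sum of the two individual shares minus the share with both. The helper below is an illustrative sketch; its name is not from the paper.

```python
def percent_with_neither(pct_a: float, pct_b: float, pct_both: float) -> float:
    """Percentage with neither outcome, given the percentages with
    outcome A, outcome B, and both (inclusion-exclusion)."""
    pct_at_least_one = pct_a + pct_b - pct_both
    return 100 - pct_at_least_one
```

With 78% reducing HbA1c, 67% reducing weight and 53% reducing both, 92% had at least one reduction and 8% had neither, in line with the 8% and 7% reported for the EXE and LIRA groups.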

3.4 | Cardiovascular risk factors

With a mean 129 mm Hg SBP level at index date, only 24% had SBP

≥140 mm Hg. The adjusted average reduction in SBP was ~3 mm Hg

consistently over 6, 12 and 24 months of follow-up, and was similar

across EXE and LIRA groups (Tables 1 and 4). Among those who

switched to insulin, the mean SBP levels at index date, at the time of

moving to insulin, and at 24 months of follow-up remained stable at

130 mm Hg.

In all, 92% of patients in the study cohort were on lipid-lowering

therapy. The average reduction in LDL cholesterol was ≥0.18 mmol/L

consistently during 6, 12 and 24 months of follow-up (range of CI of

reduction: 0.17-0.24 mmol/L; Table 4), starting with a baseline LDL

cholesterol level of 2.43 mmol/L. Among patients who did not

receive any statin (n = 15 949), mean reductions in LDL cholesterol

at 6, 12 and 24 months were 0.15 mmol/L, 0.14 mmol/L and

0.17 mmol/L, respectively (all P < 0.001). Among those who switched


TABLE 1 Basic statistics on study variables at the time of initiation of exenatide or liraglutide for patients who continued glucagon-like peptide-1 receptor agonist treatment for at least 6 months, those who added insulin treatment during the follow-up, and those who switched to insulin treatment during the follow-up

| | ALL | EXE | LIRA | GLP-1RA + INS | GLP-1RA → INS |
|---|---|---|---|---|---|
| N | 66 583 | 44 523 | 22 060 | 39 599 | 4706 |
| Age at GLP-1RA initiation, y | 56 (11) | 56 (11) | 56 (11) | 56 (11) | 56 (11) |
| Men, n (%) | 28 959 (43) | 18 917 (42) | 10 042 (46) | 17 531 (44) | 2130 (45) |
| Ethnicity, n (%) | | | | | |
| White | 45 291 (68) | 29 500 (66) | 15 791 (72) | 27 231 (69) | 3311 (70) |
| Black | 5021 (8) | 3118 (7) | 1903 (9) | 2971 (8) | 358 (8) |
| Hispanic | 1465 (2) | 943 (2) | 522 (2) | 1023 (3) | 89 (2) |
| Asian | 534 (1) | 326 (1) | 208 (1) | 312 (1) | 30 (1) |
| HbA1c at diagnosis of diabetes, % | 8 (1.4) | 8 (1.4) | 8.1 (1.5) | 8.1 (1.5) | 8.2 (1.5) |
| HbA1c at GLP-1RA initiation, % | 8.2 (1.3) | 8.1 (1.3) | 8.3 (1.4) | 8.3 (1.4) | 8.4 (1.3) |
| Median (IQR) HbA1c at GLP-1RA initiation, % | 7.8 (7, 8.8) | 7.8 (7, 8.7) | 7.9 (7.1, 8.9) | 8 (7.1, 9.0) | 8.2 (7.4, 9.0) |
| HbA1c ≥ 7% at GLP-1RA initiation, n (%) | 60 351 (91) | 40 180 (90) | 20 171 (91) | 36 130 (91) | 4388 (93) |
| HbA1c ≥ 7.5% at GLP-1RA initiation, n (%) | 41 045 (62) | 26 700 (60) | 14 345 (65) | 25 691 (65) | 3388 (72) |
| HbA1c ≥ 8% at GLP-1RA initiation, n (%) | 30 599 (46) | 19 777 (44) | 10 822 (49) | 19 859 (50) | 2628 (56) |
| Weight at GLP-1RA initiation, kg | 109 (25) | 110 (25) | 109 (25) | 110 (25) | 108 (24) |
| BMI at GLP-1RA initiation, kg/m2 | 38 (8) | 38 (8) | 38 (8) | 38 (8) | 37 (8) |
| Obese at GLP-1RA initiation, n (%) | 57 927 (87) | 38 753 (87) | 18 971 (86) | 34 847 (88) | 4000 (85) |
| SBP at GLP-1RA initiation, mm Hg | 129 (16) | 129 (16) | 129 (16) | 129 (16) | 130 (16) |
| SBP ≥ 140 mm Hg at GLP-1RA initiation, n (%) | 15 728 (24) | 10 628 (24) | 5100 (23) | 9487 (24) | 1184 (25) |
| DBP at GLP-1RA initiation, mm Hg | 77 (10) | 77 (10) | 77 (10) | 77 (10) | 77 (10) |
| LDL cholesterol at GLP-1RA initiation, mmol/L | 2.43 (0.72) | 2.44 (0.73) | 2.42 (0.72) | 2.42 (0.74) | 2.47 (0.74) |
| LDL cholesterol ≥ 3.37 mmol/L at GLP-1RA initiation, n (%) | 5780 (9) | 3951 (9) | 1829 (8) | 3652 (9) | 450 (10) |
| HDL cholesterol at GLP-1RA initiation, mmol/L | 1.10 (0.31) | 1.11 (0.30) | 1.10 (0.31) | 1.10 (0.30) | 1.09 (0.29) |
| Median (IQR) triglycerides at GLP-1RA initiation, mmol/L | 1.69 (1.23, 2.28) | 1.69 (1.23, 2.27) | 1.71 (1.24, 2.31) | 1.71 (1.24, 2.29) | 1.75 (1.25, 2.35) |
| Triglyceride ≥ 1.69 mmol/L at GLP-1RA initiation, n (%) | 15 060 (51) | 9920 (50) | 5140 (51) | 10 107 (51) | 967 (53) |
| Micro-albuminuria, n (%) | 3404 (26) | 2126 (25) | 1278 (29) | 2481 (26) | 167 (26) |
| Macro-albuminuria, n (%) | 858 (7) | 532 (6) | 326 (7) | 597 (6) | 49 (8) |
| Hypertension, n (%) | 17 415 (26) | 12 707 (29) | 4708 (21) | 11 020 (28) | 1340 (28) |
| Metformin taken during the GLP-1RA treatment, n (%) | 56 035 (84) | 37 645 (85) | 18 390 (83) | 33 837 (85) | 4129 (88) |
| Median (IQR) metformin duration, months | 52.7 (28.2, 84.3) | 62.6 (34.9, 91.9) | 36.6 (20.8, 61.1) | 55.9 (30.3, 87.5) | 71.8 (43.6, 97.6) |
| Sulphonylurea taken during the GLP-1RA treatment, n (%) | 38 003 (57) | 26 719 (60) | 11 284 (51) | 23 723 (60) | 3583 (76) |
| Median (IQR) sulphonylurea duration, months | 32.5 (15, 60.3) | 36.8 (17, 66) | 25.4 (12.2, 45.8) | 33.5 (15.2, 61.9) | 39.8 (20.2, 67.9) |
| Antihypertensive taken during the GLP-1RA treatment, n (%) | 53 821 (81) | 36 610 (82) | 17 211 (78) | 32 655 (82) | 4032 (86) |
| Median (IQR) antihypertensive duration, months | 46.5 (23.8, 80.1) | 54.7 (28.4, 86.9) | 33.2 (17.5, 59.8) | 48.7 (25.2, 82.3) | 61.4 (33.9, 91.8) |
| CPM taken during the GLP-1RA treatment, n (%) | 61 145 (92) | 41 273 (93) | 19 872 (90) | 36 861 (93) | 4462 (95) |
| Median (IQR) CPM duration, months | 51.1 (26.6, 84.9) | 59.9 (32.7, 91.6) | 35.7 (19.4, 64.5) | 53.8 (28.5, 87.5) | 68.3 (39, 96.3) |
| Weight-lowering drugs taken during the GLP-1RA treatment, n (%) | 4591 (7) | 3297 (7) | 1294 (6) | 2831 (7) | 328 (7) |
| Median (IQR) weight-lowering duration, months | 12.6 (5.1, 28.2) | 13.1 (5.2, 29.4) | 11.9 (4.9, 25.5) | 12.1 (4.8, 27.5) | 12.6 (5.3, 28.7) |
| Anti-depressants taken during the GLP-1RA treatment, n (%) | 28 133 (42) | 19 865 (45) | 8268 (37) | 16 950 (43) | 2296 (49) |
| Median (IQR) anti-depressants duration, months | 35.6 (15.4, 68.6) | 40.6 (17.4, 74.7) | 27.2 (12.6, 51.4) | 36.2 (15.5, 70.1) | 43.4 (19.5, 78.9) |

BMI, body mass index; CPM, cardio-protective medication; DBP, diastolic blood pressure; EXE, exenatide; GLP-1RA, glucagon-like peptide-1 receptor agonist; HbA1c, glycated haemoglobin; INS, insulin; IQR, interquartile range; LIRA, liraglutide; SBP, systolic blood pressure.

Values are mean (standard deviation) unless stated otherwise.


to insulin, the mean LDL cholesterol levels at index date, at the time

of moving to insulin, and at 24 months of follow-up were 2.47, 2.42

and 2.38 mmol/L, respectively.

4 | DISCUSSION

The present longitudinal cohort study of 66 583 patients with T2DM

treated with GLP-1RA suggests that: (1) significant HbA1c reductions

may be obtained within 6 months from GLP-1RA treatment initiation,

with no further glycaemic benefits likely over 24 months of follow-

up; (2) earlier intensification with insulin therapy by 6 months (when

added to GLP-1RA) is associated with 18% higher odds of lowering

HbA1c below 7% within 2 years of treatment; and (3) adding insulin,

rather than switching to it, is associated with significantly lower glucose

levels over the long term. We have also observed a clear

indication of therapeutic inertia among patients who failed to

respond to GLP-1RA therapy.

This study presents real-world evidence of significant reductions

in HbA1c, body weight, SBP and LDL cholesterol over 2 years of

follow-up in patients treated with GLP-1RAs. Initiation of GLP-1RA

treatment at lower HbA1c levels was associated with better glucose

control over 2 years of follow-up. The observed HbA1c reductions

were consistent with previous findings.15–19 While glycaemic

achievements observed within 6 months of GLP-1RA treatment initiation

were higher in clinical trials, it is recognized that effectiveness

studies based on real-world data generally provide lower

estimates of glycaemic reduction and of treatment effects overall.

TABLE 2 Adjusted mean (95% confidence interval) of change in glycated haemoglobin (HbA1c) at 6, 12 and 24 months after glucagon-like peptide-1 receptor agonist (GLP-1RA) initiation, for those who took a GLP-1RA for at least 6, 12 and 24 months, stratified by whether patients continued on GLP-1RA treatment only or added insulin therapy, and, for patients with HbA1c levels above 7.5% at GLP-1RA initiation, number and proportion of those whose HbA1c reduced below 7% at 6, 12 and 24 months after GLP-1RA initiation stratified by whether patients took GLP-1RA only or added insulin therapy

Column groups give the minimum duration on GLP-1RA (≥6, ≥12 or ≥24 months).

| | ≥6 mo, All | ≥6 mo, EXE | ≥6 mo, LIRA | ≥12 mo, All | ≥12 mo, EXE | ≥12 mo, LIRA | ≥24 mo, All | ≥24 mo, EXE | ≥24 mo, LIRA |
|---|---|---|---|---|---|---|---|---|---|
| N | 66 583 | 44 523 | 22 060 | 50 109 | 35 085 | 15 024 | 28 422 | 22 111 | 6311 |
| Δ HbA1c at 6 months, mean (95% CI) | | | | | | | | | |
| GLP-1RA only | −0.73 (−0.73, −0.71) | −0.70 (−0.71, −0.70) | −0.80 (−0.80, −0.79) | −0.75 (−0.76, −0.75) | −0.72 (−0.73, −0.72) | −0.83 (−0.81, −0.83) | −0.75 (−0.75, −0.74) | −0.73 (−0.73, −0.72) | −0.85 (−0.86, −0.85) |
| GLP-1RA + INS | −0.83 (−0.84, −0.83) | −0.75 (−0.77, −0.75) | −0.95 (−0.96, −0.94) | −0.82 (−0.82, −0.81) | −0.75 (−0.76, −0.75) | −0.95 (−0.96, −0.95) | −0.81 (−0.81, −0.80) | −0.77 (−0.77, −0.76) | −0.93 (−0.93, −0.92) |
| HbA1c < 7% at 6 months for those whose HbA1c was ≥7.5% at GLP-1RA initiation, n (%) | | | | | | | | | |
| GLP-1RA only | 5410 (25) | 3672 (24) | 1738 (27) | 3965 (27) | 2826 (26) | 1139 (29) | 2018 (30) | 1603 (29) | 415 (34) |
| GLP-1RA + INS | 4156 (21) | 2236 (19) | 1920 (24) | 3451 (22) | 1987 (20) | 1464 (25) | 2338 (24) | 1594 (22) | 744 (28) |
| Δ HbA1c at 12 months, mean (95% CI) | | | | | | | | | |
| GLP-1RA only | - | - | - | −0.65 (−0.65, −0.62) | −0.62 (−0.62, −0.61) | −0.71 (−0.72, −0.71) | −0.67 (−0.68, −0.67) | −0.65 (−0.67, −0.65) | −0.74 (−0.74, −0.72) |
| GLP-1RA + INS | - | - | - | −0.73 (−0.73, −0.72) | −0.67 (−0.67, −0.66) | −0.85 (−0.86, −0.85) | −0.74 (−0.75, −0.74) | −0.71 (−0.71, −0.70) | −0.84 (−0.84, −0.83) |
| HbA1c < 7% at 12 months for those whose HbA1c was ≥7.5% at GLP-1RA initiation, n (%) | | | | | | | | | |
| GLP-1RA only | - | - | - | 3829 (26) | 2770 (26) | 1059 (27) | 1960 (29) | 1577 (28) | 383 (32) |
| GLP-1RA + INS | - | - | - | 3462 (22) | 2083 (21) | 1379 (24) | 2401 (24) | 1679 (23) | 722 (27) |
| Δ HbA1c at 24 months, mean (95% CI) | | | | | | | | | |
| GLP-1RA only | - | - | - | - | - | - | −0.59 (−0.60, −0.58) | −0.58 (−0.58, −0.57) | −0.63 (−0.64, −0.63) |
| GLP-1RA + INS | - | - | - | - | - | - | −0.65 (−0.66, −0.65) | −0.63 (−0.64, −0.63) | −0.70 (−0.71, −0.70) |
| HbA1c < 7% at 24 months for those whose HbA1c was ≥7.5% at GLP-1RA initiation, n (%) | | | | | | | | | |
| GLP-1RA only | - | - | - | - | - | - | 1806 (26) | 1469 (26) | 337 (28) |
| GLP-1RA + INS | - | - | - | - | - | - | 2247 (23) | 1621 (23) | 626 (23) |

CI, confidence interval; EXE, exenatide; GLP-1RA, glucagon-like peptide-1 receptor agonist; HbA1c, glycated haemoglobin; INS, insulin; LIRA, liraglutide.


FIGURE 1 A, Mean (95% confidence interval [CI]) of longitudinal glycated haemoglobin (HbA1c) measurements by whether patient added insulin within 6 months/6 to 12 months/12 to 18 months/18 to 24 months from the GLP-1RA initiation, or remained on GLP-1RA treatment only. B, Mean (95% CI) of longitudinal HbA1c measurements by whether patient switched to insulin within 6 to 12 months/12 to 18 months/18 to 24 months from the GLP-1RA initiation. C, Mean (95% CI) of longitudinal HbA1c measurements by whether patient added or switched to insulin within 24 months from GLP-1RA initiation. D, Scatterplot of HbA1c change and body weight change at 6 and 12 months for patients treated with exenatide and liraglutide without adding or switching to insulin. The perpendicular dotted lines present the mean time to addition or switching to insulin by categories of timeline.


The mean HbA1c reduction in patients treated with LIRA was margin-

ally higher than in those treated with EXE, although the proportions

of patients who had reductions in HbA1c to below 7% over 12 and

24 months of follow-up (ranging from 26% to 28%) were similar for

each of these two therapies.

With regard to initial response to GLP-1RA within 6 months of

follow-up, patients who switched to insulin (by design after 6 months

of GLP-1RA treatment) experienced a significant rise in the glycaemic

level at different points of follow-up, as shown in Figure 1B. For

example, those who switched to insulin within 18 to 24 months after

the index date, clearly experienced rising HbA1c levels consistently

above 8% after 6 months of treatment with GLP-1RA. Furthermore,

we observed that although switching to insulin prevented any further

rise in HbA1c, no significant glycaemic reductions were achieved by

the end of the 24-month follow-up period, compared to the index

date (Figure 1B,C); however, those patients who added insulin had

significantly better glycaemic control by the end of the follow-up

period. After adjusting for the HbA1c levels at index date and at the

time of insulin initiation, those who added insulin achieved a signifi-

cantly higher HbA1c reduction at 24 months, by 0.48% (95% CI 0.47,

0.50%), compared with those who switched to insulin (Figure 1C).

This study showed that addition of insulin and switch to insulin

occurred at elevated HbA1c levels of 8.8% and 9.3%, respectively,

with a significant proportion of patients having HbA1c above 8%/

8.5% (71%/58% in the GLP-1RA + INS group and 80%/68% in the

GLP-1RA → INS group). This clearly raises the issue of therapeutic

inertia.25,26 Given the high glycaemic burden in this population, the

time to intensification of therapy requires further evaluation, in con-

junction with the factors that might prevent early intensification with

insulin therapy, including fear of weight gain and hypoglycaemia.

Notably, the distributions of body weight were similar between those

who switched to, or added insulin, and those who remained on GLP-

1RA treatment only (Table 1). Moreover, observed adjusted weight

reductions in patients who added insulin were consistently and mar-

ginally greater than in those treated with GLP-1RA only during the

follow-up period (Table 4). Our findings in terms of weight loss are

consistent with a recent study reporting no weight gain after initia-

tion of insulin in obese patients with T2DM27 and the systematic

review by Balena et al.20

The limitations of this study include non-availability of complete

and reliable data on: (1) medication adherence; (2) dose adjustments

in insulin-treated patients; (3) diet and exercise; (4) socio-economic

status; and (5) potential residual confounders. The selection of

patients with a minimum 6 months’ treatment with GLP-1RA, exclud-

ing those who initiated insulin therapy earlier, could lead to a poten-

tial selection bias; however, the large analysis cohort from the

validated CEMR database used in the study should be considered as

a representative sample, and as such, provides a reliable picture of

TABLE 3 Glycated haemoglobin levels at glucagon-like peptide-1 receptor agonist (GLP-1RA) initiation and at insulin initiation, and the time to insulin therapy, for those who added insulin to existing GLP-1RA (GLP-1RA + INS) and those who switched to insulin treatment (GLP-1RA → INS) within 2 years of follow-up

| | GLP-1RA + INS, All | GLP-1RA + INS, EXE | GLP-1RA + INS, LIRA | GLP-1RA → INS, All | GLP-1RA → INS, EXE | GLP-1RA → INS, LIRA |
|---|---|---|---|---|---|---|
| N | 36 113 | 22 703 | 13 410 | 2483 | 1856 | 627 |
| HbA1c at GLP-1RA initiation, % | 8.3 (1.4) | 8.3 (1.4) | 8.4 (1.4) | 8.5 (1.4) | 8.5 (1.4) | 8.5 (1.4) |
| Median (IQR) HbA1c at GLP-1RA initiation, % | 8 (7.1, 9) | 7.9 (7.1, 9) | 8.1 (7.2, 9.1) | 8.3 (7.5, 9.2) | 8.3 (7.5, 9.2) | 8.3 (7.6, 9.3) |
| HbA1c ≥ 7% at GLP-1RA initiation, n (%) | 33 032 (91) | 20 649 (91) | 12 383 (92) | 2351 (95) | 1761 (95) | 590 (94) |
| HbA1c ≥ 7.5% at GLP-1RA initiation, n (%) | 23 736 (66) | 14 507 (64) | 9229 (69) | 1892 (76) | 1404 (76) | 488 (78) |
| HbA1c ≥ 8% at GLP-1RA initiation, n (%) | 18 436 (51) | 11 202 (49) | 7234 (54) | 1489 (60) | 1119 (60) | 370 (59) |
| HbA1c at insulin initiation, % | 8.8 (1.3) | 8.8 (1.3) | 8.8 (1.3) | 9.3 (1.6) | 9.2 (1.5) | 9.5 (1.7) |
| Median (IQR) HbA1c at insulin initiation, % | 8.7 (7.8, 9.4) | 8.7 (7.8, 9.4) | 8.8 (7.9, 9.5) | 9.1 (8.2, 10) | 9 (8.1, 10) | 9.1 (8.4, 10.3) |
| HbA1c ≥ 7% at insulin initiation, n (%) | 36 045 (100) | 22 652 (100) | 13 393 (100) | 2482 (100) | 1855 (100) | 627 (100) |
| HbA1c ≥ 7.5% at insulin initiation, n (%) | 30 267 (84) | 18 848 (83) | 11 419 (85) | 2209 (89) | 1634 (88) | 575 (92) |
| HbA1c ≥ 8% at insulin initiation, n (%) | 25 649 (71) | 15 897 (70) | 9752 (73) | 1985 (80) | 1466 (79) | 519 (83) |
| Median (IQR) Δ HbA1c (insulin - GLP-1RA), % | 0.46 (0.45, 0.47) | 0.49 (0.47, 0.50) | 0.42 (0.4, 0.44) | 0.73 (0.67, 0.79) | 0.67 (0.60, 0.73) | 0.91 (0.80, 1.03) |
| Mean (range) time to insulin initiation, months | 3 (0, 24) | 3 (0, 24) | 2 (0, 24) | 14 (6, 24) | 14 (6, 24) | 14 (6, 24) |
| Median (IQR) time to insulin initiation, months | 0 (0, 3) | 0 (0, 4) | 0 (0, 2) | 13 (10, 18) | 14 (10, 19) | 12 (9, 18) |

EXE, exenatide; GLP-1RA, glucagon-like peptide-1 receptor agonist; HbA1c, glycated haemoglobin; INS, insulin; IQR, interquartile range; LIRA, liraglutide.

Statistics are mean (standard deviation) unless stated otherwise.


TABLE 4 Adjusted mean (95% confidence interval [CI]) of change in body weight at 6, 12 and 24 months after glucagon-like peptide-1 receptor agonist (GLP-1RA) initiation, for those who took a GLP-1RA for at least 6, 12 and 24 months, stratified by whether patients continued on GLP-1RA treatment only or added insulin therapy, number and proportion of those who lost ≥5% body weight during follow-up after GLP-1RA initiation, and adjusted mean (95% CI) of changes in SBP and LDL cholesterol at 6, 12 and 24 months after GLP-1RA initiation

Column groups give the minimum duration on GLP-1RA (≥6, ≥12 or ≥24 months).

| | ≥6 mo, All | ≥6 mo, EXE | ≥6 mo, LIRA | ≥12 mo, All | ≥12 mo, EXE | ≥12 mo, LIRA | ≥24 mo, All | ≥24 mo, EXE | ≥24 mo, LIRA |
|---|---|---|---|---|---|---|---|---|---|
| Δ Weight at 6 months, kg, adjusted mean (95% CI) | | | | | | | | | |
| GLP-1RA only | −1.87 (−1.88, −1.87) | −1.90 (−1.91, −1.90) | −1.80 (−1.81, −1.80) | −1.96 (−1.97, −1.96) | −2.00 (−2.00, −1.99) | −1.86 (−1.87, −1.85) | −2.24 (−2.24, −2.23) | −2.23 (−2.24, −2.23) | −2.25 (−2.26, −2.23) |
| GLP-1RA + INS | −2.32 (−2.32, −2.31) | −2.20 (−2.20, −2.19) | −2.51 (−2.52, −2.51) | −2.41 (−2.41, −2.40) | −2.26 (−2.26, −2.25) | −2.68 (−2.68, −2.67) | −2.50 (−2.50, −2.49) | −2.38 (−2.39, −2.37) | −2.84 (−2.85, −2.83) |
| Weight loss ≥5% at 6 months, n (%) | | | | | | | | | |
| GLP-1RA only | 5539 (15) | 3957 (15) | 1582 (15) | 3978 (15) | 2980 (15) | 998 (15) | 2050 (16) | 1684 (16) | 366 (17) |
| GLP-1RA + INS | 5241 (18) | 3085 (17) | 2156 (19) | 4413 (18) | 2729 (17) | 1684 (20) | 2985 (19) | 2124 (18) | 861 (21) |
| Δ Weight at 12 months, kg, adjusted mean (95% CI) | | | | | | | | | |
| GLP-1RA only | - | - | - | −2.50 (−2.51, −2.50) | −2.54 (−2.55, −2.54) | −2.39 (−2.39, −2.38) | −2.83 (−2.84, −2.82) | −2.82 (−2.83, −2.82) | −2.87 (−2.88, −2.85) |
| GLP-1RA + INS | - | - | - | −2.93 (−2.93, −2.92) | −2.80 (−2.81, −2.80) | −3.16 (−3.16, −3.15) | −3.08 (−3.08, −3.07) | −3.00 (−3.00, −2.99) | −3.31 (−3.32, −3.30) |
| Weight loss ≥5% at 12 months, n (%) | | | | | | | | | |
| GLP-1RA only | - | - | - | 6150 (24) | 4644 (24) | 1506 (23) | 3204 (25) | 2662 (26) | 542 (25) |
| GLP-1RA + INS | - | - | - | 6404 (26) | 4062 (26) | 2342 (28) | 4314 (27) | 3132 (27) | 1182 (29) |
| Δ Weight at 24 months, kg, adjusted mean (95% CI) | | | | | | | | | |
| GLP-1RA only | - | - | - | - | - | - | −3.31 (−3.32, −3.30) | −3.3 (−3.31, −3.29) | −3.36 (−3.38, −3.35) |
| GLP-1RA + INS | - | - | - | - | - | - | −3.40 (−3.41, −3.40) | −3.32 (−3.33, −3.31) | −3.63 (−3.64, −3.62) |
| Weight loss ≥5% at 24 months, n (%) | | | | | | | | | |
| GLP-1RA only | - | - | - | - | - | - | 3884 (31) | 3210 (31) | 674 (31) |
| GLP-1RA + INS | - | - | - | - | - | - | 5104 (32) | 3747 (32) | 1357 (33) |
| Blood pressure changes, mm Hg, adjusted mean (95% CI) | | | | | | | | | |
| Δ SBP at 6 months | −2.82 (−2.82, −2.81) | −2.78 (−2.78, −2.77) | −2.90 (−2.91, −2.89) | −2.95 (−2.96, −2.95) | −2.91 (−2.91, −2.90) | −3.05 (−3.06, −3.04) | −2.91 (−2.91, −2.90) | −2.87 (−2.87, −2.86) | −3.05 (−3.07, −3.04) |
| Δ SBP at 12 months | - | - | - | −2.79 (−2.79, −2.78) | −2.79 (−2.79, −2.78) | −2.79 (−2.80, −2.78) | −2.85 (−2.86, −2.85) | −2.78 (−2.78, −2.77) | −3.13 (−3.14, −3.11) |
| Δ SBP at 24 months | - | - | - | - | - | - | −2.64 (−2.64, −2.63) | −2.69 (−2.70, −2.69) | −2.44 (−2.46, −2.42) |
| LDL changes, mmol/L, adjusted mean (95% CI) | | | | | | | | | |
| Δ LDL cholesterol at 6 months | −0.19 (−0.20, −0.19) | −0.19 (−0.20, −0.19) | −0.19 (−0.19, −0.18) | −0.19 (−0.19, −0.18) | −0.19 (−0.20, −0.19) | −0.19 (−0.19, −0.18) | −0.19 (−0.21, −0.18) | −0.18 (−0.18, −0.17) | −0.20 (−0.21, −0.19) |
| Δ LDL cholesterol at 12 months | - | - | - | −0.18 (−0.19, −0.18) | −0.18 (−0.19, −0.18) | −0.19 (−0.19, −0.18) | −0.19 (−0.20, −0.19) | −0.18 (−0.18, −0.17) | −0.20 (−0.21, −0.18) |
| Δ LDL cholesterol at 24 months | - | - | - | - | - | - | −0.23 (−0.24, −0.23) | −0.23 (−0.24, −0.23) | −0.23 (−0.23, −0.22) |

CI, confidence interval; EXE, exenatide; GLP-1RA, glucagon-like peptide-1 receptor agonist; HbA1c, glycated haemoglobin; INS, insulin; IQR, interquartile range; LIRA, liraglutide; SBP, systolic blood pressure.


the state of risk factor management in routine practice. Complete risk factor data were available at index date, and imputations were conducted for only 9% to 19% missing longitudinal data. The results from complete case analyses and imputed data were very similar. Finally, the careful new-user design with a reasonable exposure time of 2 years, and appropriate adjustments for confounders are the primary strengths of the study.

In conclusion, this novel real-world study provides evidence of significant delays in intensification of treatment in patients with T2DM treated with a GLP-1RA. Among HbA1c non-responders, early addition of insulin with GLP-1RA therapy within 6 months resulted in better and sustainable glycaemic control over 2 years. The findings from the present study suggest that, in people requiring treatment intensification on GLP-1RA, the preferred option should be addition of insulin rather than stopping GLP-1RA and switching to insulin therapy.

ACKNOWLEDGMENTS

We gratefully acknowledge the support for the QIMR Berghofer Institute from the Australian Government Department of Education’s National Collaborative Research Infrastructure Strategy (NCRIS) initiative through Therapeutic Innovation Australia. No separate funding was obtained for this study. O. M. acknowledges the support from her associate supervisors Prof. Ross Young and Prof. Louise Hafner. K. Klein acknowledges support from the National Institute for Health Research Collaboration for Leadership in Applied Health Research and Care – East Midlands (NIHR CLAHRC – EM), and the NIHR Leicester Loughborough Diet, Lifestyle and Physical Activity Biomedical Research Unit.


S. K. P. has acted as a consultant and/or speaker for Novartis, GI



of investigator and investigator-initiated clinical studies from Merck,


Avensis and Pfizer. K. Khunti has acted as a consultant and speaker for Novartis, Novo Nordisk, Sanofi-Aventis, Lilly, Merck Sharp & Dohme, Janssen, Astra Zeneca and Boehringer Ingelheim. He has received grants in support of investigator and investigator-initiated trials from Novartis, Novo Nordisk, Sanofi-Aventis, Lilly, Pfizer, Boehringer Ingelheim, Merck Sharp & Dohme, Janssen and Roche, and funds for research, honoraria for speaking at meetings and has served on advisory boards for Lilly, Sanofi-Aventis, Merck Sharp & Dohme and Novo Nordisk, Boehringer Ingelheim, Janssen and Astra Zeneca. S. K. has received research grants and been on advisory boards for Novo Nordisk. O. M. and K. Klein have no conflict of interest to declare.


S. K. P. conceived the idea, and S. K. P. and O. M. were responsible for the primary design of the study. The design concept was discussed and agreed with S. K. and K. Khunti. O. M. and K. Klein conducted the data extraction. S. K. P., K. Klein and O. M. jointly conducted the statistical analyses. S. K. and K. Khunti worked on the analysis plan along with S. K. P. The first draft of the manuscript was developed by S. K. P. and O. M., and all authors contributed to the finalization of the manuscript. S. K. P. had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

REFERENCES

1. Turnbull FM, Abraira C, Anderson RJ, et al. Intensive glucose control and macrovascular outcomes in type 2 diabetes. Diabetologia. 2009;52(11):2288-2298.

2. Shah AD, Langenberg C, Rapsomaniki E, et al. Type 2 diabetes and incidence of cardiovascular diseases: a cohort study in 1·9 million people. Lancet Diabetes Endocrinol. 2015;3(2):105-113.

3. Fox CS, Golden SH, Anderson C, et al. Update on prevention of cardiovascular disease in adults with type 2 diabetes mellitus in light of recent evidence: a scientific statement from the American Heart Association and the American Diabetes Association. Circulation. 2015;132(8):691-718.

4. American Diabetes Association. Standards of medical care in diabetes—2015. Diabetes Care. 2015;38(suppl 1):S49-S57.

5. Paul SK, Klein K, Maggs D, Best JH. The association of the treatment with glucagon-like peptide-1 receptor agonist exenatide or insulin with cardiovascular outcomes in patients with type 2 diabetes: a retrospective observational study. Cardiovasc Diabetol. 2015;14:10, doi:10.1186/s12933-015-0178-3.

6. Smilowitz NR, Donnino R, Schwartzbard A. Glucagon-like peptide-1 receptor agonists for diabetes mellitus: a role in cardiovascular disease. Circulation. 2014;129(22):2305-2312.

7. Drucker DJ, Goldfine AB. Cardiovascular safety and diabetes drug development. Lancet. 2011;377(9770):977-979.

8. Garber AJ. Novel GLP-1 receptor agonists for diabetes. Expert Opin Investig Drugs. 2012;21(1):45-57.

9. Monami M, Dicembrini I, Nardini C, Fiordelli I, Mannucci E. Effects of glucagon-like peptide-1 receptor agonists on cardiovascular risk: a meta-analysis of randomized clinical trials. Diabetes Obes Metab. 2014;16(1):38-47.

10. Vora J. Combining incretin-based therapies with insulin: realizing the potential in type 2 diabetes. Diabetes Care. 2013;36(suppl 2):S226-S232.

11. Shaefer CF, Reid TS, Dailey G, et al. Weight change in patients with type 2 diabetes starting basal insulin therapy: correlates and impact on outcomes. Postgrad Med. 2014;126(6):93-105.

12. Balkau B, Home PD, Vincent M, Marre M, Freemantle N. Factors associated with weight gain in people with type 2 diabetes starting on insulin. Diabetes Care. 2014;37(8):2108-2113.

13. Eng C, Kramer CK, Zinman B, Retnakaran R. Glucagon-like peptide-1 receptor agonist and basal insulin combination treatment for the management of type 2 diabetes: a systematic review and meta-analysis. Lancet. 2014;384(9961):2228-2234.

14. Lee WC, Dekoven M, Bouchard J, Massoudi M, Langer J. Improved real-world glycaemic outcomes with liraglutide versus other incretin-based therapies in type 2 diabetes. Diabetes Obes Metab. 2014;16(9):819-826.

15. Li Q, Chitnis A, Hammer M, Langer J. Real-world clinical and economic outcomes of liraglutide versus sitagliptin in patients with type 2 diabetes mellitus in the United States. Diabetes Ther. 2014;5(2):579-590.

16. Rigato M, Avogaro A, Fadini GP. Effects of dose escalating liraglutide from 1.2 to 1.8 mg in clinical practice: a case-control study. J Endocrinol Invest. 2015;38(12):1357-1363.

17. Gautier JF, Martinez L, Penfornis A, et al. Effectiveness and persistence with liraglutide among patients with type 2 diabetes in routine clinical practice–EVIDENCE: a prospective, 2-year follow-up, observational, post-marketing study. Adv Ther. 2015;32(9):838-853.


18. Ryder B, Thong K. Findings from the Association of British Clinical Diabetologists (ABCD) nationwide exenatide and liraglutide audits. Hot topics in diabetes. 2012;5:49-61.

19. Thong KY, Jose B, Sukumar N, et al. Safety, efficacy and tolerability of exenatide in combination with insulin in the Association of British Clinical Diabetologists nationwide exenatide audit. Diabetes Obes Metab. 2011;13(8):703-710.

20. Balena R, Hensley IE, Miller S, Barnett AH. Combination therapy with GLP-1 receptor agonists and basal insulin: a systematic review of the literature. Diabetes Obes Metab. 2013;15(6):485-502.

21. Inzucchi SE, Bergenstal RM, Buse JB, et al. Management of hyperglycemia in type 2 diabetes, 2015: a patient-centered approach: update to a position statement of the American Diabetes Association and the European Association for the Study of Diabetes. Diabetes Care. 2015;38(1):140-149.

22. Kamal KM, Chopra I, Elliott JP, Mattei TJ. Use of electronic medical records for clinical research in the management of type 2 diabetes. Res Social Adm Pharm. 2014;10(6):877-884.

23. Davis KL, Tangirala M, Meyers JL, Wei W. Real-world comparative outcomes of US type 2 diabetes patients initiating analog basal insulin therapy. Curr Med Res Opin. 2013;29(9):1083-1091.

24. Crawford AG, Cote C, Couto J, et al. Comparison of GE centricity electronic medical record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Popul Health Manag. 2010;13(3):139-150.

25. Khunti K, Nikolajsen A, Thorsted BL, Andersen M, Davies MJ, Paul SK. Clinical inertia with regard to intensifying therapy in people with type 2 diabetes treated with basal insulin. Diabetes Obes Metab. 2016;18(4):401-409.

26. Paul SK, Klein K, Thorsted BL, Wolden ML, Khunti K. Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovasc Diabetol. 2015;14(1):100.

27. Paul SK, Shaw J, Montvida O, Klein K. Weight gain in insulin treated patients by BMI categories at treatment initiation: new evidence from real-world data in patients with type 2 diabetes. Diabetes Obes Metab. 2016, doi:10.1111/dom.12761. [Epub ahead of print].




How to cite this article: Montvida O, Klein K, Kumar S, Khunti K and Paul SK. Addition of or switch to insulin therapy in people treated with glucagon-like peptide-1 receptor agonists: A real-world study in 66 583 patients, Diabetes Obes Metab, 2016. doi: 10.1111/dom.12790


DIABETES

Incretin mimetics and insulin — closing the gap to normoglycaemia

Michael A. Nauck and Juris J. Meier

Treatment of type 2 diabetes mellitus with GLP1 receptor agonists can result in long-term glycaemic control or can fail over time, in which case insulin can be used as an alternative or as an additive treatment. New research shows that the latter is more likely to achieve glycaemic targets than the former.

Refers to Montvida, O. et al. Addition or switch to insulin therapy in people treated with GLP-1 receptor agonists: a real world study in 66,583 patients. Diabetes Obes. Metab. http://dx.doi.org/10.1111/dom.12790 (2016)

Using data obtained from the Centricity Electronic Medical Record database, which documents patient and treatment data from >35,000 USA-based physicians or other health-care providers, Montvida et al.1 present data on changes in HbA1c levels, body weight, systolic blood pressure and LDL cholesterol levels in patients with type 2 diabetes mellitus (T2DM) who received treatment with glucagon-like peptide 1 receptor agonists (GLP1-RAs; specifically, exenatide and liraglutide). Patient data were examined between 6 months and 24 months after the initiation of GLP1-RA treatment in ~67,000 individuals with T2DM. After a minimum of 6 months of treatment with GLP1-RAs, 33.5% of patients continued with GLP1-RA treatment alone for at least 24 months, 7.1% of patients were switched to insulin treatment and simultaneously discontinued GLP1-RA treatment and 59.9% of patients received insulin (of any preparation) in addition to GLP1-RA treatment.

In patients initially receiving GLP1-RA treatment and continuing with this treatment alone for up to 24 months, the greatest reduction in HbA1c levels was observed within 6 months of starting treatment, with negligible changes observed after 6 months. Patients in whom GLP1-RA treatment was discontinued and replaced with insulin treatment did not show improved glycaemic control by switching treatment. However, if insulin was added to GLP1-RA treatment, a net improvement in glycaemic control was observed. This net improvement was greater if insulin was added earlier rather than later, relative to the time point at which GLP1-RA treatment alone failed to provide sufficient glycaemic control1.

GLP1-RAs have been approved as glucose-lowering treatments for patients with T2DM. Usually, these agonists are added to one or more oral glucose-lowering agents (for example, metformin or sulfonylureas), as alternatives to insulin treatment. However, GLP1-RAs can also be used with insulin treatment, for which long-acting insulin preparations such as insulin glargine2, insulin detemir3 or insulin degludec4 are most often used. This combination can be achieved by adding GLP1-RAs to pre-existing insulin treatment2, by adding insulin to pre-existing GLP1-RA treatment3 or by introducing both treatments at the same time; in the latter case, fixed-dose combinations can be used, such as IDegLira (insulin degludec plus liraglutide)4 or LixiLan (insulin glargine plus lixisenatide)5. Prospective, randomized controlled trials have demonstrated achievement of HbA1c levels <7% in 60–85% of patients using these approaches2–5.

The findings of Montvida et al.1 confirm that results obtained from randomized, prospective trials can be replicated in the ‘real world’, as the database used contained results from diagnostic procedures (such as measuring HbA1c levels, body weight and LDL cholesterol levels) and the prescription of medication (including the date of initiation and the duration of continuous use of drugs, such as GLP1-RAs or insulin treatment). At least qualitatively, the results of Montvida et al.1 confirm the findings reported in clinical trials2–5 and in meta-analyses based on such trials6,7. Although about one-third of patients can achieve HbA1c concentrations in the target range with GLP1-RAs alone and will maintain treatment with this therapy for a long period of time, such long-term treatment efficacy is not observed in other patients. Furthermore, the necessity for treatment intensification can occur earlier or later during the course of treatment with GLP1-RAs. In head-to-head trials comparing GLP1-RAs with insulin (mainly basal insulin) treatment, both approaches lead to similar degrees of glycaemic control when used as separate treatment strategies8. Small but statistically significant differences in efficacy favouring GLP1-RAs over insulin treatments have been noted. This effect seems to persist even in patients with high baseline HbA1c levels, in whom one might intuitively assume a better response to insulin therapy than to GLP1-RAs9. As the clinical efficacy of GLP1-RA treatment or insulin regimens is similar, the lack of net changes in glycaemic control when switching from one to the other, as shown by Montvida et al., is not surprising1.

An unexpected result of the analysis by Montvida et al.1 is that adding insulin treatment to GLP1-RA treatment does not increase body weight; rather, weight loss was greater in patients adding insulin to their regimen. In studies directly comparing GLP1-RAs and insulin as single injectable agents and in combination4,5, weight reduction for the combination treatment was attenuated compared with the GLP1-RA treatment alone, with insulin treatment alone promoting weight gain. Whether body weight was assessed by using calibrated scales or less reliable information provided by the patient is not known, which points to some limitations of such ‘real world’ approaches.

Another interesting and novel aspect of the study by Montvida et al.1 is the change in HbA1c levels before intensification of the therapy was considered necessary. If a patient responded well to GLP1-RA treatment, the reduction in HbA1c levels was evident after 6 months of treatment, with little further change in HbA1c levels with time. However, treatment intensification (either a switch to insulin treatment or the addition of insulin treatment to GLP1-RA treatment) was preceded by an elevation in HbA1c levels that was usually only evident during the 6-month period before the requirement for treatment change. Thus, as with the trajectories of

NATURE REVIEWS | ENDOCRINOLOGY www.nature.com/nrendo

NEWS & VIEWS

© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved.


[Figure 1. Panels: Traditional approach; Novel approach 1; Novel approach 2. Axis: Potential for target achievement. Legend: Real-world target achievement; Target achievement in clinical trials; Current target achievement; Residual glycaemic burden; Future unknown strategies; Normoglycaemia.]

measures of insulin secretory capacity and insulin resistance before the manifestation of T2DM10, short-term dynamics in important determinants of glycaemic control are also evident during the transition from satisfactory glycaemic control with a glucose-lowering medication (in this case GLP1-RAs) to treatment failure, indicating the need for a rapid response when implementing a treatment intensification. Increased granularity, enabling the examination of these relationships at a higher temporal resolution, is not possible with the present data set, but should be an area of interest for future research.

One might also argue whether clinical inertia (that is, an undue delay in treatment intensification) is at work in these circumstances, or whether the hesitation to escalate treatment rather reflects physicians’ concerns about hypoglycaemia, other treatment-related adverse effects or monetary considerations. In accordance with these concerns, long-term adherence to GLP1-RAs is still suboptimal, probably owing to treatment-emergent adverse effects.

The study by Montvida et al.1 also shows that even with the combination of GLP1-RAs and insulin, the two drug classes considered to be most effective in terms of reducing HbA1c levels, only a minority of patients achieve an HbA1c target of <7%, in contrast with the higher efficacy reported in current short-term clinical trials. Addition of insulin to GLP1-RA treatment is, therefore, suggested to be more effective than switching from a GLP1-RA to insulin in reducing HbA1c levels, thus offering one effective way to narrow the gap to near-normoglycaemia. However, despite these clear advances in the treatment of T2DM, a large proportion of patients still fail to reach normoglycaemia with current glucose-lowering strategies (FIG. 1). Continued efforts will be needed to develop novel treatment strategies, reduce treatment-related adverse effects, optimize treatment adherence and refine current combination strategies.

Michael A. Nauck and Juris J. Meier are at the Division of Diabetology, Department of Medicine, St Josef-Hospital, Ruhr University Bochum, Gudrunstraße 56, D-44791, Bochum, Germany.

[email protected]; [email protected]

doi:10.1038/nrendo.2016.180 Published online 10 Nov 2016

1. Montvida, O., Klein, K., Kumar, S., Khunti, K. & Paul, S. K. Addition or switch to insulin therapy in people treated with GLP-1 receptor agonists: a real world study in 66,583 patients. Diabetes Obes. Metab. http://dx.doi.org/10.1111/dom.12790 (2016).

2. Buse, J. B. et al. Use of twice‑daily exenatide in basal insulin‑treated patients with type 2 diabetes: a randomized, controlled trial. Ann. Intern. Med. 154, 103–112 (2011).

3. DeVries, J. H. et al. Sequential intensification of metformin treatment in type 2 diabetes with liraglutide followed by randomized addition of basal insulin prompted by A1C targets. Diabetes Care 35, 1446–1454 (2012).

4. Gough, S. C. et al. Efficacy and safety of a fixed‑ratio combination of insulin degludec and liraglutide (IDegLira) compared with its components given alone: results of a phase 3, open‑label, randomised, 26‑week, treat‑to‑target trial in insulin‑naive patients with type 2 diabetes. Lancet Diabetes Endocrinol. 2, 885–893 (2014).

5. Rosenstock, J. et al. Benefits of LixiLan, a titratable fixed‑ratio combination of insulin glargine plus lixisenatide, versus insulin glargine and lixisenatide monocomponents in type 2 diabetes inadequately controlled with oral agents: the LixiLan‑O randomized trial. Diabetes Care http://dx.doi.org/10.2337/dc16‑0917 (2016).

6. Balena, R., Hensley, I. E., Miller, S. & Barnett, A. H. Combination therapy with GLP‑1 receptor agonists and basal insulin: a systematic review of the literature. Diabetes. Obes. Metab. 15, 485–502 (2013).

7. Eng, C., Kramer, C. K., Zinman, B. & Retnakaran, R. Glucagon‑like peptide‑1 receptor agonist and basal insulin combination treatment for the management of type 2 diabetes: a systematic review and meta‑analysis. Lancet 384, 2228–2234 (2014).

8. Abdul‑Ghani, M. A., Williams, K., Kanat, M., Altuntas, Y. & DeFronzo, R. A. Insulin versus GLP‑1 analogues in poorly controlled type 2 diabetic subjects on oral therapy: a meta‑analysis. J. Endocrinol. Invest. 36, 168–173 (2013).

9. Buse, J. B. et al. Is insulin the most effective injectable antihyperglycaemic therapy? Diabetes Obes. Metab. 17, 145–151 (2015).

10. Tabak, A. G. et al. Trajectories of glycaemia, insulin sensitivity, and insulin secretion before diagnosis of type 2 diabetes: an analysis from the Whitehall II study. Lancet 373, 2215–2221 (2009).

Competing interests statement

M.A.N. declares that he has received personal fees, grants, non-financial support or other support from AstraZeneca, Berlin Chemie-AG, Boehringer Ingelheim, Eli Lilly, GlaxoSmithKline, Hoffmann La Roche, Intarcia Therapeuticals, Janssen Global Services, Medscape LLC, Merck Sharp & Dohme, Novartis, Novo Nordisk, Sanofi-Aventis and Versartis. J.J.M. declares that he has received grants or personal fees from Astra Zeneca, Berlin-Chemie, Boehringer-Ingelheim, Eli Lilly, MSD, NovoNordisk, Sanofi and Servier.

Figure 1 | Glycaemic target achievement resulting from different treatment approaches. The achievement of glycaemic targets in clinical trials and in real world studies of patients with type 2 diabetes mellitus (T2DM) differs with conventional and novel treatment approaches, and at different stages of treatment escalation. The lack of achievement of normoglycaemia in all patients illustrates the need for further advances in the treatment of T2DM. The gap between the target achievement with current combination treatment approaches and normoglycaemia is called the ‘residual glycaemic burden’.





APPENDIX B


16 The Open Bioinformatics Journal , 2017, 10, 16-27

1875-0362/17 2017 Bentham Open


DOI: 10.2174/1875036201710010016

RESEARCH ARTICLE

Data Mining Approach to Identify Disease Cohorts from Primary Care Electronic Medical Records: A Case of Diabetes Mellitus

Ebenezer S. Owusu Adjah1,2, Olga Montvida1,3, Julius Agbeve1 and Sanjoy K. Paul4,*

1 QIMR Berghofer Medical Research Institute, Brisbane, Australia
2 Faculty of Medicine, The University of Queensland, Brisbane, Australia
3 School of Biomedical Sciences, Institute of Health and Biomedical Innovation, Faculty of Health, Queensland University of Technology, Brisbane, Australia
4 Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia

Received: August 17, 2017 Revised: November 28, 2017 Accepted: November 29, 2017

Abstract:

Background:

Identification of diseased patients from primary care based electronic medical records (EMRs) has methodological challenges that may impact epidemiologic inferences.

Objective:

To compare deterministic clinically guided selection algorithms with probabilistic machine learning (ML) methodologies for their ability to identify patients with type 2 diabetes mellitus (T2DM) from large population-based EMRs from a nationally representative primary care database.

Methods:

Four cohorts of patients with T2DM were defined by a deterministic approach based on disease codes. The database was mined for a set of best predictors of T2DM, and the performance of six ML algorithms was compared based on cross-validated true positive rate, true negative rate, and area under the receiver operating characteristic curve.

Results:

In the database of 11,018,025 research-suitable individuals, 379,657 (3.4%) were coded to have T2DM. The Logistic Regression classifier was selected as the best ML algorithm and resulted in a cohort of 383,330 patients with potential T2DM. Eighty-three percent (83%) of this cohort had a T2DM code, and 16% of the patients with a T2DM code were not included in this ML cohort. Of those in the ML cohort without a disease code, 52% had at least one measure of elevated glucose level and 22% had received at least one prescription for antidiabetic medication.

Conclusion:

Deterministic cohort selection based on disease coding potentially introduces a significant misclassification problem. ML techniques allow testing for potential disease predictors and, under meaningful data input, are able to identify diseased cohorts in a holistic way.

Keywords: Electronic Medical Records, Primary Care Database, Machine Learning Algorithm, Diabetes, Type 2 Diabetes, Cohort Identification.

* Address correspondence to this author at the Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne, Australia; Tel: +61-3-93428433; E-mail: [email protected]


1. INTRODUCTION

Recent advances in the design and implementation of large patient-level electronic medical records (EMRs) from national primary care databases have created opportunities in clinical, epidemiological and public health research [1, 2]. In a typical primary or ambulatory care setting, large volumes of data are generated as patients go through various phases of treatment. Individual patients’ longitudinal data on demographics, lifestyle, disease and treatment history, clinical and laboratory parameters, hospitalization statistics, and clinical events are typically organized and stored in the form of a relational database. Such databases present unique challenges in terms of efficient and effective extraction of data for various investigative interests [3]. One of the challenging aspects in this context is the identification of disease cohorts for retrospective or prospective clinical epidemiological studies [4, 5].

Diagnostic codes, such as the International Classification of Diseases (ICD) codes or Read codes [6], are generally used to identify disease cohorts from EMRs [4]. The reliability of diagnosis coding for various diseases has been extensively examined for many primary care databases including The Health Improvement Network (THIN) database from the United Kingdom [7-9]. However, there are four specific issues in relation to identifying cohorts by diagnostic codes: (1) differentiating between disease subtypes from high-level codes, (2) overlapping codes of disease subtypes longitudinally at individual patient level, (3) absence of codes for diseased patients (false negatives), and (4) presence of disease-specific codes for patients without the specific disease (false positives).

With regard to diabetes mellitus (DM), identification and appropriate classification of different types of diabetes in the primary care databases are particularly challenging [5, 10-13]. These challenges border mostly on inaccurate coding leading to misclassification, misdiagnosis, and undiagnosed diabetes [12]. Algorithms based on laboratory, clinical, and medication data have thus been proposed as tools for distinguishing between type 1 diabetes mellitus (T1DM) and type 2 diabetes mellitus (T2DM) [10, 14-16]. However, the overall accuracy and reliability of derived disease cohorts based on diagnostic codes can be improved by implementing advanced machine learning (ML) or statistical data mining techniques and clinically guided cohort selection algorithms that robustly capture comprehensive patient-level information available in the EMRs [4, 5, 12, 13].

Shivade and colleagues (2014) have conducted a systematic review of various techniques used for the identification of different disease cohorts from different sources of clinical databases [2]. Some of these proposed algorithms have been criticized for their appropriateness in the context of other studies [17]. While several studies compared or applied ML techniques to identify T2DM patients, to the best of our knowledge, there is no study that employed an extensive assessment of diagnostic codes, deterministic clinical selection algorithms, and ML algorithms simultaneously to identify T2DM cohorts from primary care EMRs.

The aims of this exploratory methodological study were to (1) explore technical challenges in the extraction of disease cohorts, (2) compare the ability of different clinically guided cohort selection algorithms to identify the disease cohorts, and (3) compare the disease cohorts identified by ML algorithms and clinically guided cohort selection algorithms using a large nationally representative primary care database from the UK.

2. MATERIALS AND METHODS

In this section, we introduce the primary care database, describe the challenges in identifying a cohort of patients with a specific disease (i.e. T2DM), and explain the clinically guided cohort selection algorithms and the data mining and computational processes leading to the comparison of different supervised ML techniques.

2.1. Data Source

Data from The Health Improvement Network (THIN), a patient-level primary care database from the UK, were used in this study. THIN is an ongoing primary care database of medical records of anonymized patients from general practitioners, covers over 600 UK general practices, and has been linked to the hospital episode statistics (HES) and other statistics from the National of Bureau of Statistics. Longitudinal patient-level records have been collected since 1990 and the current version of the database holds more than 13 million individual patient records. The patients included in this database are representative of the UK population by age, gender, medical conditions and death rates adjusted for demographics and social deprivation. The accuracy and completeness of the THIN database have been described previously [18, 19]. The THIN database is considered one of the most comprehensive patient-level databases available globally, and has been extensively used by researchers and government bodies for clinical, epidemiological and public health related studies [20]. The database contains extensive information on individuals’ demographic, clinical, laboratory, medications and event history data. The study protocol was approved by the Independent Scientific Review Committee for the THIN database (Protocol Number: 15THIN030) and the Institutional Review Board of QIMR Berghofer Medical Research Institute.

2.2. Challenges in Identifying Disease Cohort

THIN uses the UK’s standard Read code classification system, which is useful for hierarchical classification of patients’ specific circumstances and lifestyles, thereby enhancing scalability and retrieval [6]. However, the Read coding system is complex, as a disease or an encounter with a general practitioner can be coded in several ways, including use of existing codes or creation of new user-defined codes [21]. In this way, considerable variation and inconsistency are introduced into the coding system, as observed in the case of DM [11, 14, 22].

2.2.1. Differentiating Between Disease Subtypes

Typically, many diabetes-related codes are available for a single patient, some of which are high-level codes (e.g. C10 - “Diabetes mellitus”) or disease-related codes that are unspecific in the description of the diabetes type (e.g. C106.12 - “Diabetes mellitus with neuropathy”). Common practice has been to exclude any high-level codes [14, 23], which may lead to underestimation of the disease cohort. When it is impossible to identify the disease subtype (type 1 or type 2 diabetes) from the diagnostic codes, data on surrogate markers (such as glutamic acid decarboxylase) could be useful, but such information is not available in the THIN database. Nevertheless, combinations of available biomarkers (such as age, weight or HbA1c) and medication prescriptions have been used to distinguish types of diabetes in some studies [10, 14].

2.2.2. Longitudinally Overlapping Disease Subtypes

Patients may have different disease subtypes coded longitudinally as a result of data entry errors or biological progression of the disease. While the former can lead to any combination of subtypes, the latter may result in developing T1DM from T2DM or T2DM from gestational diabetes. To distinguish between contradictory codes, longitudinal exploratory techniques were applied in some studies [5]. The techniques described above that deal with unspecific codes may also be considered. To address the issue of longitudinally contradictory diagnostic codes, the following approach was adopted to distinguish between T1DM and T2DM.

i. Use of Read codes that uniquely distinguish between T1DM and T2DM.

ii. In patients with unspecific codes, or longitudinally overlapping subtypes, the following is used:

    a. If oral antidiabetic drug (ADD) is taken ≥ 2 months, then T2DM.
    b. Otherwise, if age at first available diagnosis date ≤ 18 years and insulin initiated within 1 year, then T1DM.
    c. Otherwise, if age at first available diagnosis date > 18 years and insulin initiated within 3 months, then T1DM.
    d. Else T2DM.

iii. Patients with codes for gestational diabetes and other forms of diabetes were not included in this study.
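The decision rules i-ii above can be sketched as a small function. This is a minimal illustration rather than the authors' implementation: the record fields (`specific_type`, `oral_add_months`, `age_at_diagnosis`, `months_to_insulin`) are hypothetical names, and the boundaries "within 1 year" and "within 3 months" are assumed to be inclusive (12 and 3 months).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiabetesRecord:
    """Hypothetical per-patient summary; field names are illustrative only."""
    specific_type: Optional[str]        # "T1DM"/"T2DM" if a unique Read code exists, else None
    oral_add_months: float              # months on an oral antidiabetic drug (ADD)
    age_at_diagnosis: float             # years, at first available diagnosis date
    months_to_insulin: Optional[float]  # months from diagnosis to insulin start; None if never


def classify_subtype(rec: DiabetesRecord) -> str:
    # Rule i: a Read code that uniquely identifies the subtype decides directly.
    if rec.specific_type in ("T1DM", "T2DM"):
        return rec.specific_type
    # Rule ii.a: oral ADD taken for at least 2 months.
    if rec.oral_add_months >= 2:
        return "T2DM"
    on_insulin = rec.months_to_insulin is not None
    # Rule ii.b: young at diagnosis and insulin within 1 year (assumed inclusive).
    if rec.age_at_diagnosis <= 18 and on_insulin and rec.months_to_insulin <= 12:
        return "T1DM"
    # Rule ii.c: adult at diagnosis and insulin within 3 months (assumed inclusive).
    if rec.age_at_diagnosis > 18 and on_insulin and rec.months_to_insulin <= 3:
        return "T1DM"
    # Rule ii.d: everything else.
    return "T2DM"
```

Rule iii (exclusion of gestational and other forms of diabetes) is assumed to be applied before this function is called.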

2.2.3. Absence of Codes for Patients with Disease and Presence of Codes for Patients without Disease

Data entry errors such as omissions, typing and communication errors, and patients’ temporary loss to follow-up in EMRs usually result in a relatively small number of false positive and a larger number of false negative patients identified by diagnostic codes. Earlier studies have addressed this complex issue by employing deterministic or probabilistic algorithms [2, 15, 16]. We further focus on this challenging aspect by comparing deterministic (clinically guided) and probabilistic (ML) cohort identification approaches.

2.3. Clinically Guided Cohort Selection Algorithms

Four separate cohorts were created by applying logical, clinically guided algorithms that select patients from those who have at least one record of a Read code for T2DM (Fig. 1). Specifically, the T2DM cohorts were selected on the basis of available records for:

i. Selection algorithm 1: T2DM Read code (Cohort 1);

ii. Selection algorithm 2: Lifestyle modification intervention + T2DM Read code (Cohort 2);

iii. Selection algorithm 3: At least one prescription for antidiabetic medication + lifestyle modification intervention + T2DM Read code (Cohort 3);

iv. Selection algorithm 4: (At least one prescription for antidiabetic medication or lifestyle modification intervention) + T2DM Read code (Cohort 4).

Selection algorithm 1: T2DM Read code only; Selection algorithm 2: T2DM Read code + lifestyle modification advice; Selection algorithm 3: T2DM Read code + antidiabetic medication + lifestyle modification advice; Selection algorithm 4: T2DM Read code + (antidiabetic medication or lifestyle modification advice).

Fig. (1). Flow chart for the selection of type 2 diabetes (T2DM) cohorts by clinically guided algorithms.
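The four selection algorithms reduce to boolean combinations of three per-patient flags. The sketch below is illustrative only; the flag names are hypothetical and how each flag is derived from the database records is not shown.

```python
from dataclasses import dataclass


@dataclass
class PatientFlags:
    """Hypothetical per-patient indicators; names are illustrative only."""
    has_t2dm_read_code: bool
    has_lifestyle_advice: bool
    has_add_prescription: bool  # at least one antidiabetic prescription


def cohorts(p: PatientFlags) -> set:
    """Return the cohort numbers (1-4) the patient belongs to."""
    if not p.has_t2dm_read_code:
        return set()  # every algorithm requires a T2DM Read code
    member = {1}  # algorithm 1: Read code only
    if p.has_lifestyle_advice:
        member.add(2)  # algorithm 2: Read code + lifestyle advice
    if p.has_add_prescription and p.has_lifestyle_advice:
        member.add(3)  # algorithm 3: Read code + medication + lifestyle advice
    if p.has_add_prescription or p.has_lifestyle_advice:
        member.add(4)  # algorithm 4: Read code + (medication or lifestyle advice)
    return member
```

By construction Cohort 3 is nested in Cohort 2, which is nested in Cohort 4, which is nested in Cohort 1, matching the ordering of the cohort sizes reported in Table 3.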

2.4. Supervised Machine Learning Techniques

The process of selecting the single most appropriate probabilistic algorithm to identify patients with T2DM is described below.

2.4.1. Feature Selection

The THIN database was mined to detect the most frequent medications, comorbidities, and laboratory and anthropometric measurements among patients with T2DM identified on the basis of Read codes. The resulting 280 variables were combined with current clinical considerations, practices and guidelines for T2DM management [24], and 11 potential disease predictors were obtained through an iterative process (Table 1). The Correlation-based Feature Selection (CFS) algorithm was applied to determine the best of these predictors [25, 26]. This scheme-independent attribute subset selection approach is particularly useful when attributes are correlated with one another and with the class attribute. Bi-directional, forward and backward greedy search methods were applied using 10-fold cross-validation [27], and they all agreed on the same seven features described in Table 1.
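To make the CFS idea concrete, the sketch below scores a feature subset with the CFS merit, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation, and runs a greedy forward search. It is a simplified illustration and not the software used in the study: absolute Pearson correlation is assumed as the correlation measure (other measures are common in practice), and the toy feature names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(features, target, subset):
    """CFS merit of a subset: rewards class correlation, penalises redundancy."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def forward_select(features, target):
    """Greedy forward search: add the best feature while the merit improves."""
    selected, best = [], 0.0
    remaining = list(features)
    while remaining:
        merit, name = max((cfs_merit(features, target, selected + [f]), f)
                          for f in remaining)
        if merit <= best:
            break
        selected.append(name)
        remaining.remove(name)
        best = merit
    return selected, best


# Toy data: "copy" duplicates "signal", so adding it cannot raise the merit.
toy = {"signal": [0, 0, 0, 1, 1, 1], "copy": [0, 0, 0, 1, 1, 1], "noise": [1, 0, 1, 0, 1, 0]}
label = [0, 0, 0, 1, 1, 1]
chosen, merit = forward_select(toy, label)
```

The redundancy penalty in the denominator is what makes the approach suitable when candidate predictors are correlated with one another, as noted above.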

2.4.2. Training Dataset

From the 11,018,025 patients in the THIN database, a training dataset of 150,000 instances, containing equal numbers of positive and negative representatives, was extracted. Positive instances were randomly selected from patients with (1) an available T2DM Read code, (2) at least one year of follow-up, and (3) age 18-90 years at the time of T2DM diagnosis.

[Fig. 1 flow chart contents: All patients with valid record (n=11,018,025), split into No record of DM (n=10,487,077) and Individuals with any type of DM (n=530,948). Excluded from the latter: 1. Type 1 Diabetes (n=46,238); 2. Gestational Diabetes (n=15,814); 3. Prediabetes (n=86,800); 4. Other Types (n=2,439). Remaining: T2DM (Selection algorithm 1), n=379,657; age, mean = 60 years; male, % = 55.]









Negative instances were also randomly selected from those without a Read code for any subtype of DM and with at least one year of follow-up (Fig. (2), training set).
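The balanced sampling described above can be sketched as follows. This is an assumed, simplified version: the eligibility filters (Read code, follow-up, age) are taken as already applied, the identifier lists are hypothetical inputs, and the fixed seed is only there to make the sketch reproducible.

```python
import random


def balanced_training_set(positive_ids, negative_ids, n_per_class, seed=0):
    """Draw an equal number of positive and negative patients without replacement.

    positive_ids / negative_ids: lists of eligible patient identifiers
    (hypothetical inputs). Returns shuffled (patient_id, label) pairs,
    with label 1 for T2DM and 0 for no record of DM.
    """
    rng = random.Random(seed)
    pos = rng.sample(positive_ids, n_per_class)
    neg = rng.sample(negative_ids, n_per_class)
    rows = [(pid, 1) for pid in pos] + [(pid, 0) for pid in neg]
    rng.shuffle(rows)  # avoid any ordering by class
    return rows


# Small-scale usage; the study drew 75,000 instances per class.
rows = balanced_training_set(list(range(100)), list(range(1000, 1200)), 10)
```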

Table 1. Features selected as best T2DM predictors.

No. | Feature name | Feature type | Selected for ML
1 | Two measurements of HbA1c > 6% or fasting blood glucose > 7 mmol/l or random blood glucose > 11.1 mmol/l within 1 year. | Binary | Yes
2 | Any antidiabetic drug prescriptions for at least 6 months. | Binary | Yes
3 | Average BMI. | Continuous | Yes
4 | Hypertension diagnosis or antihypertensive drug use for 6 months or more or beta blocker prescription for 6 months or more. | Binary | Yes
5 | Chronic kidney disease diagnosis. | Binary | Yes
6 | Retinopathy or neuropathy diagnosis. | Binary | Yes
7 | Average systolic blood pressure. | Continuous | Yes
8 | Lifestyle modification advice. | Binary | No
9 | Average HbA1c. | Continuous | No
10 | Average random glucose. | Continuous | No
11 | Heart failure or myocardial infarction or stroke or coronary artery disease. | Binary | No
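Feature 1 in Table 1 (two elevated glycaemic measurements within 1 year) can be derived from dated laboratory records roughly as sketched below. Several details are assumptions made for illustration: the record layout, the measurement labels (`hba1c`, `fbg`, `rbg`), counting measurements of different types together, and treating "within 1 year" as at most 365 days apart.

```python
from datetime import date

# Thresholds from Table 1: HbA1c in %, glucose in mmol/l.
THRESHOLDS = {"hba1c": 6.0, "fbg": 7.0, "rbg": 11.1}


def has_two_elevated_within_year(measurements):
    """measurements: iterable of (date, kind, value), kind being a THRESHOLDS key.

    Returns True if any two elevated measurements lie at most 365 days apart.
    Checking consecutive elevated dates is sufficient once they are sorted.
    """
    elevated = sorted(d for d, kind, value in measurements if value > THRESHOLDS[kind])
    return any((later - earlier).days <= 365
               for earlier, later in zip(elevated, elevated[1:]))


example = [(date(2010, 1, 1), "hba1c", 6.5), (date(2010, 6, 1), "fbg", 7.4)]
flag = has_two_elevated_within_year(example)
```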

Fig. (2). Flowchart of creating dataset for machine learning training, and of dataset for predicting diabetes status.

2.4.3. Classification Algorithm Selection

[Fig. 2 flow chart contents: All patients with valid record (n=11,018,025), split into No record of DM (n=10,487,077) and T2DM (Cohort 1) (n=379,657) after excluding: 1. Type 1 Diabetes (n=46,238); 2. Gestational Diabetes (n=15,814); 3. Prediabetes (n=86,800); 4. Other Types (n=2,439). T2DM with follow-up ≥ 1 yr and age at diagnosis 18 - 90 years (n=350,201): randomly select 75,000 (+ instances). No record of DM with follow-up ≥ 1 yr (n=9,587,202): randomly select 75,000 (- instances). Training set (n=150,000). Prediction set (n=9,937,403).]

Keeping the selected subset of 7 robust predictors of T2DM, six classification algorithms were applied to the training set. Ten-times repeated 10-fold cross-validation was applied to calculate true positive rate (sensitivity), true negative rate (specificity), and area under receiver operating characteristic curve (AUC). Percent of correctly classified instances and required central processing unit (CPU) time for training the algorithms were also derived. The algorithms for comparison were: Naïve Bayes [28, 29], Logistic regression [30], Support Vector Machine (SVM) [31, 32], Multilayer Perceptron (MP) [33], Decision Tree with J48 modification [34], and One Rule [35].
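For reference, the performance measures named above can be computed from labels, predicted classes and predicted scores as in the following sketch. It is a generic illustration on toy data, not the evaluation code used in the study; the AUC is obtained from its rank interpretation (the probability that a random positive scores higher than a random negative, ties counted as one half).

```python
def confusion_rates(y_true, y_pred):
    """Sensitivity (TPR), specificity (TNR) and percent correct for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "tpr": tp / (tp + fn),
        "tnr": tn / (tn + fp),
        "percent_correct": 100.0 * (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: scores thresholded at 0.5 to obtain predicted classes.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
preds = [1 if s >= 0.5 else 0 for s in scores]
rates = confusion_rates(labels, preds)
area = auc(labels, scores)
```

In a repeated cross-validation setting these quantities would be computed on each held-out fold and then averaged.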

The One Rule algorithm performed significantly worse. Except for differences in CPU time, the performance of the other algorithms was similar. Among them, Naïve Bayes had lower sensitivity, misclassifying approximately 500 additional patients compared to the other approaches. AUC was smaller for SVM and J48, while SVM and MP required significantly higher CPU time (Table 2). Interestingly, neither body mass index nor blood pressure contributed significantly to any model. Logistic regression was selected as the most appropriate model for predicting T2DM. The model obtained from the full training dataset was applied to all THIN database patients with no record of a Read code for a diabetes diagnosis other than T2DM, and with available follow-up of at least one year (Fig. (2), prediction set).

Table 2. Performance of machine learning algorithms on the training dataset.

Metric | Naïve Bayes | Logistic Regression | Multilayer Perceptron | Support Vector Machine | J48 Decision Tree | One Rule
Percent correct | 95.6 | 95.9 | 95.9 | 95.9 | 95.9 | 91.7
TPR | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99
TNR | 0.93 | 0.93 | 0.93 | 0.93 | 0.93 | 0.84
AUC | 0.98 | 0.98 | 0.98 | 0.96 | 0.96 | 0.92
CPU time | 0.09 | 3.36 | 68.03 | 191.9 | 1.78 | 0.21

TPR: True Positive Rate; TNR: True Negative Rate; AUC: Area Under receiver operating characteristic Curve; CPU: Central Processing Unit.
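Section 2.4.3 selects logistic regression as the classifier. As a reminder of what such a model does, the sketch below fits a logistic regression by plain batch gradient descent and returns a predicted probability of the positive class. It is a didactic illustration on toy data: the solver, learning rate and number of epochs are assumptions, and no coefficients from the study are reproduced.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights w and intercept b by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that x belongs to the positive class."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Toy data with a single hypothetical predictor; higher values indicate the positive class.
X_toy = [[0.0], [1.0], [2.0], [3.0]]
y_toy = [0, 0, 1, 1]
w_toy, b_toy = fit_logistic(X_toy, y_toy)
```

A patient would then be assigned to the positive class when the predicted probability exceeds a chosen cut-off; 0.5 is a common default, but the cut-off used in the study is not stated here.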

3. RESULTS

The distributions of basic characteristics of patients identified by all four clinically guided algorithms and the ML algorithm were similar (Table 3). Clinically guided algorithms 1-4 and the ML algorithm resulted in cohorts of 379,657; 243,597; 197,326; 346,993; and 383,330 patients with T2DM, respectively. For patients identified by the ML algorithm who did not have a Read code, the first available date of entry of the significant predictors was used as their date of diagnosis. At the time of diabetes diagnosis, identified patients were on average 60 years old and weighed 86 kg, and 55% were male. The proportions of those who had two elevated glucose level measurements within one year were 75, 86, 90, 79, and 82% in cohorts identified by selection algorithms 1-4 and ML, respectively. With a median of 11 years of follow-up post diagnosis, the proportions of those who received at least one prescription for antidiabetic medication were 79, 81, 100, 87, and 75% in cohorts identified by selection algorithms 1-4 and ML, respectively.

Among the cohort of T2DM patients identified by the ML algorithm, 319,979 (83% of 383,330) patients had a Read code for T2DM (Table 4). It is worth noting that 59,678 (16% of 379,657) patients with a record of a T2DM Read code were not selected by the ML approach. Almost a fifth (17% of 383,330) of the patients in the ML cohort were without a record of a T2DM Read code. Of them, 52% had at least one measure of elevated glucose level and 22% had received at least one prescription for antidiabetic medication (Table 4).
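The overlap figures in this paragraph follow from three counts: the Read code cohort (379,657; Table 3), the ML cohort (383,330) and the ML patients with a Read code (319,979; Table 4). A short worked check, with hypothetical function and key names:

```python
def overlap_summary(n_read_code, n_ml, n_both):
    """Derive the non-overlapping counts and rounded percentages from three totals."""
    read_code_only = n_read_code - n_both  # Read code patients not selected by ML
    ml_only = n_ml - n_both                # ML patients without a Read code
    return {
        "read_code_only": read_code_only,
        "ml_only": ml_only,
        "pct_ml_with_read_code": round(100 * n_both / n_ml),
        "pct_read_code_missed_by_ml": round(100 * read_code_only / n_read_code),
        "pct_ml_without_read_code": round(100 * ml_only / n_ml),
    }


summary = overlap_summary(n_read_code=379_657, n_ml=383_330, n_both=319_979)
```

The derived counts (59,678 and 63,351) and percentages (83, 16 and 17) agree with the values reported in the text and in Table 4.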

In order to assess the proportion of patients that remain undetected by the algorithms used in this study, a complement cohort-specific analysis was performed (data not shown). Among patients not selected by ML as T2DM, only 884 patients had at least two elevated glucose measurements (HbA1c > 6% or fasting blood glucose > 7 mmol/l or random blood glucose > 11.1 mmol/l) within 1 year, compared to 32,039, 106,671, 137,796, and 42,583 patients not selected by selection algorithms 1-4.

Table 3. Baseline characteristics of T2DM patients identified by selection algorithms and logistic regression classifier (ML).

Characteristic | Selection Algorithm 1 | Selection Algorithm 2 | Selection Algorithm 3 | Selection Algorithm 4 | ML
Patients, n | 379,657 | 243,597 | 197,326 | 346,993 | 383,330
Age at diagnosis (years) α | 60 (15) | 59 (14) | 58 (14) | 60 (15) | 59 (15)
Age at diagnosis (years) * | 61 (50,71) | 60 (50,69) | 58 (49,67) | 60 (50,70) | 60 (50,70)
  ≤40 | 32,644 (9) | 19,761 (8) | 17,969 (9) | 29,701 (9) | 71,752 (19)
  41-50 | 62,656 (17) | 43,872 (18) | 39,289 (20) | 59,608 (17) | 58,813 (15)
  51-60 | 90,464 (24) | 62,610 (26) | 54,006 (27) | 85,587 (25) | 84,277 (22)
  61+ | 193,893 (51) | 117,354 (48) | 86,062 (44) | 172,097 (50) | 168,488 (44)
Male # | 208,155 (55) | 134,393 (55) | 110,178 (56) | 191,107 (55) | 200,447 (52)
At least one prescription # | 300,722 (79) | 197,326 (81) | 197,326 (100) | 300,722 (87) | 287,095 (75)
Prescription duration ≥ 6 months # | 243,064 (64) | 171,800 (71) | 171,800 (87) | 243,064 (70) | 254,255 (66)
RBG (mmol/l) α § | 11.5 (5.1) | 11.4 (5.1) | 12.1 (5.3) | 11.6 (5.2) | 11.3 (5.2)
RBG (mmol/l) α ‡ | 9.5 (3.4) | 9.4 (3.3) | 9.9 (3.4) | 9.6 (3.4) | 9.1 (3.5)
FBG (mmol/l) α § | 8.4 (2.3) | 8.4 (2.3) | 8.9 (2.4) | 8.5 (2.3) | 8.3 (2.3)
FBG (mmol/l) α ‡ | 7.8 (2.1) | 7.7 (2.0) | 8.0 (2.1) | 7.8 (2.1) | 7.5 (2.1)
HbA1c (%) α § | 8.4 (2.1) | 8.4 (2.1) | 8.7 (2.2) | 8.5 (2.2) | 8.3 (2.1)
HbA1c (%) α ‡ | 7.5 (1.4) | 7.5 (1.3) | 7.7 (1.3) | 7.5 (1.4) | 7.4 (1.3)
Composite measure # ‡ | 283,419 (75) | 208,787 (86) | 177,689 (90) | 272,875 (79) | 314,574 (82)
Weight (kg) α § | 89.4 (20.8) | 90.3 (21.0) | 91.1 (21.1) | 89.6 (20.9) | 89.3 (21.0)
Weight (kg) α ‡ | 85.0 (19.8) | 86.6 (19.9) | 87.6 (20.0) | 85.5 (19.8) | 86.1 (20.6)
BMI (kg/m2) α § | 31.6 (6.7) | 32.0 (6.7) | 32.2 (6.7) | 31.7 (6.7) | 31.7 (6.8)
BMI (kg/m2) α ‡ | 30.2 (6.1) | 30.7 (6.1) | 31.0 (6.2) | 30.4 (6.1) | 30.7 (6.7)
Normal weight # | 22,311 (12) | 15,821 (11) | 12,339 (11) | 21,108 (12) | 24,453 (13)
Overweight # | 58,447 (32) | 44,283 (32) | 35,289 (31) | 55,885 (32) | 61,846 (32)
Grade 1 obese # | 52,465 (29) | 41,323 (30) | 33,669 (30) | 50,423 (29) | 55,684 (29)
Grade 2 obese # | 27,168 (15) | 22,163 (16) | 18,497 (16) | 26,336 (15) | 29,178 (15)
Any CVD # | 106,523 (28) | 67,011 (28) | 51,905 (26) | 96,147 (28) | 93,703 (24)
CKD # | 10,547 (3) | 8,035 (3) | 4,609 (2) | 9,445 (3) | 12,404 (3)
Cancer # | 24,159 (6) | 15,998 (7) | 11,084 (6) | 21,536 (6) | 22,112 (6)
Hypertension # | 149,752 (39) | 104,916 (43) | 79,193 (40) | 137,440 (40) | 140,341 (37)
Follow-up (years) * | 11 (6,17) | 10 (6,15) | 11 (6,16) | 11 (6,17) | 10 (5,16)

Legend: Selection algorithm 1: Read code only; Selection algorithm 2: Read code and lifestyle modification advice; Selection algorithm 3: Read code and medication and lifestyle modification advice; Selection algorithm 4: Read code and (medication or lifestyle modification advice); ML: Machine learned cohort; RBG: random blood glucose; FBG: fasting blood glucose; Composite measure: fasting blood glucose > 7 mmol/l or random blood glucose > 11.1 mmol/l or HbA1c > 6; BMI: Body Mass Index (kg/m2); Normal: (18.5-24.99); Overweight: (25-29.99); Grade 1 obese: (30-34.99); Grade 2 obese: (35-39.99); α: mean (SD); #: n (%); *: median (Q1,Q3); CKD: Chronic kidney disease; Any CVD: any cardiovascular disease defined as occurrence of angina, MI, coronary heart disease (CHD), HF, stroke, and peripheral artery disease (PAD) on or before diagnosis of T2DM; §: measured at diagnosis; ‡: an average over all available measurements.

Table 4. Baseline characteristics and distribution of glycaemic markers among patients identified by ML.

Machine Learned T2DM Cohort (n=383,330)

Characteristic | With Read Code | Without Read Code
Patients # | 319,979 (83) | 63,351 (17)
Age at diagnosis (years) α | 60 (14) | 54 (24)
Age at diagnosis (years) * | 60 (50, 70) | 56 (33, 73)
  ≤ 40 | 25,645 (8) | 46,107 (73)
  41-50 | 56,583 (18) | 2,230 (4)
  51-60 | 81,262 (25) | 3,015 (5)
  61+ | 156,489 (49) | 11,999 (19)
Male # | 176,568 (55) | 23,879 (38)
At least one prescription # | 273,272 (85) | 13,823 (22)
Prescription duration ≥ 6 months # | 241,517 (76) | 12,738 (20)
RBG > 11.1 mmol/l # | 101,135 (32) | 1,471 (2)
FBG > 7 mmol/l # | 50,446 (16) | 1,695 (3)
HbA1c > 6% # | 274,565 (86) | 29,793 (47)
Composite measure # | 274,565 (86) | 29,793 (47)

Legend: RBG: random blood glucose; FBG: fasting blood glucose; Composite measure: fasting blood glucose > 7 mmol/l or random blood glucose > 11.1 mmol/l or HbA1c > 6; *: median (Q1,Q3); #: n (%); α: mean (SD).




4. DISCUSSION

In this study we addressed a number of problems encountered by computer-based methods in the complex task of identifying a disease cohort from large EMR databases. Specifically, (1) we have defined and discussed common technical challenges in differentiating diabetes subtypes; (2) combining clinical, medication and morbidity information with database patterns, we selected a set of best predictors as inputs to ML algorithms that can be used to identify patients with T2DM in the absence of any disease code; and (3) we compared T2DM cohorts identified by the clinically guided selection algorithms and the ML algorithm. The results of this study are of particular interest to researchers who work with the THIN database; however, the methods explored in this study are generalizable to any EMR with a different disease coding system.

Although we have seen no difference in distributions of basic characteristics among cohorts obtained by deterministic and probabilistic approaches, ML algorithms were found to be superior. With the use of selected features, we could confirm that 83% of the patients identified by the ML algorithm had a Read code for T2DM (Table 4). Those without a Read code had comparably high risk as identified by the significant predictors. While 25% of patients with a Read code and 21% of those with a Read code plus (medication or lifestyle advice) for T2DM did not have at least two elevated measures of blood glucose within one year, only 18% of the ML-identified cohort did not have such measures. Among Read code-defined and ML-defined patients without an elevated composite glucose measure, 69% and 41%, respectively, did not receive an ADD for at least 6 months. It is important to note that patients without a Read code for diabetes are much less likely to have two elevated blood glucose measures within one year unless they were known to be diabetic or pre-diabetic.

Five of the six ML algorithms demonstrated similar performances in the training-testing data sets. The logistic regression approach was chosen as the best classifier for the THIN database; however, different feature patterns within other EMRs could potentially lead to better performance of other ML techniques in predicting a T2DM cohort. Tapak and colleagues [36] reported SVM as the better classifier, while Mani and colleagues [37] reported that decision trees outperformed other ML algorithms. In this context it is important to mention that ML algorithms cannot operate without meaningful input data (the “garbage in, garbage out” principle). Although the use of different datasets makes direct comparisons difficult, a critical part of the ML process is feature engineering or selection. Some recent studies have used large sets of variables associated with diabetes with the aim of enhancing predictive accuracy [38, 39]. However, this may be limited by the inclusion of irrelevant and redundant variables, and by model overfitting in cases where the number of observations is less than the number of variables. While earlier studies were primarily based on clinically guided feature selection, we adopted a more holistic approach, initially identifying data-driven candidates as potential predictors of T2DM from the whole database. Combining clinical knowledge and data-driven candidate predictors, we ensured selection of the most robust set of 7 predictors. Although the selected features were not surprising, we have seen that BMI, lifestyle modification advice and hypertension did not contribute to the models, while microvascular complications did.

We have compared the performances of six classification algorithms on a set of 150,000 instances, which was reconfirmed to be large enough by assessing the performance curves of several incremental classifiers. Nevertheless, the training dataset was small compared to the whole database; therefore, in order to ensure that our results are not prone to selection bias, we performed the same analyses on 2 other randomly selected training datasets and obtained almost identical results.

Unlike most ML applications, which focus on training to ensure the best fit for future predictions, in this study we have used various techniques to correct the available labelling, with the ultimate goal of improving the quality of the disease cohort (Type 2 Diabetes). It would be of great interest to compare ML error, rule-based error, and human error in terms of predicting disease from available data. For this task a “gold standard” dataset would consist of random patients whose true disease state was reconfirmed by approaching both the clinician and the patient. We were not able to conduct this task, as the THIN database contains de-identified patient-level data, which is true for all large EMR databases that are used for research purposes. The THIN database also does not have data on surrogate markers that could improve the quality of the cohort identification algorithms. Miscoding between type 1 and type 2 diabetes in primary care databases is not uncommon [40, 41]. It is important to mention that ML techniques may distinguish poorly between disease subtypes without incorporating additional classification rules. We have excluded patients with other diabetes Read codes from the dataset on which our ML algorithm was applied. Furthermore, for patients identified as T2DM without Read codes, the ML techniques are not able to provide an exact diagnosis date, and therefore require the incorporation of additional techniques.



CONCLUSION

Careful investigation of diagnostic code patterns within the databases is essential prior to conducting analyses on the disease cohort. Direct extraction of a disease cohort using diagnostic codes may lead to the inclusion of falsely diagnosed patients and the omission of patients with a true disease state. Rule-based techniques represent a conservative approach, which minimizes only false positive cases. ML techniques that minimize both false positive and false negative cases represent a more robust approach. However, ML techniques rely heavily on meaningful input and use diagnostic codes for training purposes. Combining human expertise and machine power represents the best strategy, as it allows hypotheses on potential disease predictors to be tested, lowers human intervention, and reduces the burden of selection bias.


LIST OF ABBREVIATIONS

ADD = Antidiabetic Drug

AUC = Area Under the Curve

BMI = Body Mass Index

CHD = Coronary Heart Disease

CPU = Central Processing Unit

CVD = Cardiovascular Disease

DM = Diabetes Mellitus

EMR = Electronic Medical Record

FBG = Fasting Blood Glucose

GP = General Practitioner

HbA1c = Glycated Haemoglobin

HES = Hospital Episode Statistics

HF = Heart Failure

ICD = International Classification of Diseases

MI = Myocardial Infarction

ML = Machine Learning

MP = Multilayer Perceptron

PAD = Peripheral Artery Disease

RBG = Random Blood Glucose

SD = Standard Deviation

SVM = Support Vector Machine

T1DM = Type 1 Diabetes Mellitus

T2DM = Type 2 Diabetes Mellitus

THIN = The Health Improvement Network

TNR = True Negative Rate

TPR = True Positive Rate

UK = United Kingdom

ETHICS APPROVAL AND CONSENT TO PARTICIPATE

The study protocol was approved by the Independent Scientific Review Committee for the THIN database (Protocol Number: 15THIN030) and the Institutional Review Board of QIMR Berghofer Medical Research Institute.

HUMAN AND ANIMAL RIGHTS

No animals/humans were used for studies that are the basis of this research.

CONSENT FOR PUBLICATION

Not applicable.




CONFLICT OF INTEREST

Sanjoy K. Paul has acted as a consultant and/or speaker for Novartis, GI Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi Pharmaceutical and Amylin Pharmaceuticals LLC. He has received grants in support of investigator and investigator-initiated clinical studies from Merck, Novo Nordisk, AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis and Pfizer. Ebenezer S. Owusu Adjah, Olga Montvida, and Julius Agbeve have no conflict of interest to declare.

ACKNOWLEDGEMENTS

Sanjoy K. Paul conceived the idea and was responsible for the primary design of the study. Ebenezer S. Owusu Adjah and Olga Montvida significantly contributed to the study design. Julius Agbeve conducted the primary raw data extraction. Ebenezer S. Owusu Adjah and Olga Montvida jointly conducted the data extraction, data manipulation and statistical analyses, and developed the first draft of the manuscript. Ebenezer S. Owusu Adjah, Olga Montvida, Sanjoy K. Paul, and Julius Agbeve contributed to the finalization of the manuscript. Sanjoy K. Paul had full access to all the data in the study and is the guarantor, taking responsibility for the integrity of the data and the accuracy of the data analysis. Ebenezer S. Owusu Adjah was supported by the QIMR Berghofer International Ph.D. Scholarship and The University of Queensland International Scholarship. Olga Montvida was supported by the Queensland University of Technology International Scholarship. No separate funding was obtained for this study. Melbourne EpiCentre gratefully acknowledges the support from the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS) initiative through Therapeutic Innovation Australia and the research project funding from the National Health and Medical Research Council of Australia (Project Number: GNT1063477). Olga Montvida acknowledges the support from her associate supervisors Prof. Ross Young and Prof. Louise Hafner.

REFERENCES

[1] Sagreiya H, Altman RB. The utility of general purpose versus specialty clinical databases for research: Warfarin dose estimation from extracted clinical variables. J Biomed Inform 2010; 43(5): 747-51. [http://dx.doi.org/10.1016/j.jbi.2010.03.014] [PMID: 20363365]

[2] Shivade C, Raghavan P, Fosler-Lussier E, et al. A review of approaches to identifying patient phenotype cohorts using electronic health records. J Am Med Inform Assoc 2014; 21(2): 221-30. [http://dx.doi.org/10.1136/amiajnl-2013-001935] [PMID: 24201027]

[3] Tate AR, Beloff N, Al-Radwan B, et al. Exploiting the potential of large databases of electronic health records for research using rapid search algorithms and an intuitive query interface. J Am Med Inform Assoc 2014; 21(2): 292-8. [http://dx.doi.org/10.1136/amiajnl-2013-001847] [PMID: 24272162]

[4] Kandula S, Zeng-Treitler Q, Chen L, Salomon WL, Bray BE. A bootstrapping algorithm to improve cohort identification using structured data. J Biomed Inform 2011; 44(Suppl. 1): S63-8. [http://dx.doi.org/10.1016/j.jbi.2011.10.013] [PMID: 22079803]

[5] Sadek AR, Van Vlymen J, Khunti K, De Lusignan S. Automated identification of miscoded and misclassified cases of diabetes from computer records. Diabet Med 2012; 29(3): 410-4. [http://dx.doi.org/10.1111/j.1464-5491.2011.03457.x] [PMID: 21916978]

[6] Read J. The Read clinical classification (Read codes). Br Homeopath J 1991; 80(1): 14-20. [http://dx.doi.org/10.1016/S0007-0785(05)80418-1]

[7] Herrett E, Thomas SL, Schoonen WM, Smeeth L, Hall AJ. Validation and validity of diagnoses in the General Practice Research Database: A systematic review. Br J Clin Pharmacol 2010; 69(1): 4-14. [http://dx.doi.org/10.1111/j.1365-2125.2009.03537.x] [PMID: 20078607]

[8] Hammad TA, Margulis AV, Ding Y, Strazzeri MM, Epperly H. Determining the predictive value of Read codes to identify congenital cardiac malformations in the UK Clinical Practice Research Datalink. Pharmacoepidemiol Drug Saf 2013; 22(11): 1233-8. [http://dx.doi.org/10.1002/pds.3511] [PMID: 24002995]

[9] Khan NF, Harrison SE, Rose PW. Validity of diagnostic coding within the General Practice Research Database: A systematic review. Br J Gen Pract 2010; 60(572): e128-36. [http://dx.doi.org/10.3399/bjgp10X483562]

[10] Stone MA, Camosso-Stefinovic J, Wilkinson J, de Lusignan S, Hattersley AT, Khunti K. Incorrect and incomplete coding and classification of diabetes: A systematic review. Diabet Med 2010; 27(5): 491-7. [http://dx.doi.org/10.1111/j.1464-5491.2009.02920.x] [PMID: 20536944]

[11] De Lusignan S, Sadek K, McDonald H, et al. Call for consistent coding in diabetes mellitus using the Royal College of General Practitioners and NHS pragmatic classification of diabetes. Inform Prim Care 2012; 20(2): 103-13. [PMID: 23710775]





[12] Seidu S, Davies MJ, Mostafa S, de Lusignan S, Khunti K. Prevalence and characteristics in coding, classification and diagnosis of diabetes in primary care. Postgrad Med J 2014; 90(1059): 13-7. [http://dx.doi.org/10.1136/postgradmedj-2013-132068] [PMID: 24225940]

[13] De Lusignan S, Liaw S-T, Dedman D, Khunti K, Sadek K, Jones S. An algorithm to improve diagnostic accuracy in diabetes in computerised problem orientated medical records (POMR) compared with an established algorithm developed in episode orientated records (EOMR). J Innov Health Inform 2015; 22(2): 255-64. [http://dx.doi.org/10.14236/jhi.v22i2.79] [PMID: 26245239]

[14] De Lusignan S, Khunti K, Belsey J, et al. A method of identifying and correcting miscoding, misclassification and misdiagnosis in diabetes: A pilot and validation study of routinely collected data. Diabet Med 2010; 27(2): 203-9. [http://dx.doi.org/10.1111/j.1464-5491.2009.02917.x] [PMID: 20546265]

[15] Holt TA, Gunnarsson CL, Cload PA, Ross SD. Identification of undiagnosed diabetes and quality of diabetes care in the United States: Cross-sectional study of 11.5 million primary care electronic records. CMAJ Open 2014; 2(4): E248-55. [http://dx.doi.org/10.9778/cmajo.20130095] [PMID: 25485250]

[16] Holt TA, Stables D, Hippisley-Cox J, O’Hanlon S, Majeed A. Identifying undiagnosed diabetes: cross-sectional survey of 3.6 million patients’ electronic records. Br J Gen Pract 2008; 58(548): 192-6. [http://dx.doi.org/10.3399/bjgp08X277302] [PMID: 18318973]

[17] Magliano DJ, Zimmet P, Shaw J. US trends for diabetes prevalence among adults. JAMA 2016; 315(7): 705. [http://dx.doi.org/10.1001/jama.2015.16455] [PMID: 26881376]

[18] Blak BT, Thompson M, Dattani H, Bourke A. Generalisability of The Health Improvement Network (THIN) database: Demographics, chronic disease prevalence and mortality rates. Inform Prim Care 2011; 19(4): 251-5. [PMID: 22828580]

[19] Denburg MR, Haynes K, Shults J, Lewis JD, Leonard MB. Validation of The Health Improvement Network (THIN) database for epidemiologic studies of chronic kidney disease. Pharmacoepidemiol Drug Saf 2011; 20(11): 1138-49. [http://dx.doi.org/10.1002/pds.2203] [PMID: 22020900]

[20] IMS Health Incorporated. The Health Improvement Network (THIN) database. London: IMS Health Incorporated 2017. Available at: http://www.csdmruk.imshealth.com/index.html

[21] Gray J, Orr D, Majeed A. Use of Read codes in diabetes management in a south London primary care group: Implications for establishing disease registers. BMJ 2003; 326(7399): 1130. [http://dx.doi.org/10.1136/bmj.326.7399.1130] [PMID: 12763987]

[22] Rollason W, Khunti K, De Lusignan S. Variation in the recording of diabetes diagnostic data in primary care computer systems: Implications for the quality of care. Inform Prim Care 2009; 17(2): 113-9. [PMID: 19807953]

[23] Lycett D, Nichols L, Ryan R, et al. The association between smoking cessation and glycaemic control in patients with type 2 diabetes: A THIN database cohort study. Lancet Diabetes Endocrinol 2015; 3(6): 423-30. [http://dx.doi.org/10.1016/S2213-8587(15)00082-0] [PMID: 25935880]

[24] American Diabetes Association. Standards of Medical Care in Diabetes-2015. Diabetes Care 2015; 38(Suppl. 1): S4. [http://dx.doi.org/10.2337/dc15-S003]

[25] Hall MA. Correlation-based feature selection for machine learning. PhD dissertation. Hamilton, NZ: University of Waikato 1999.

[26] Senliol B, Gulgezen G, Yu L, Cataltepe Z. Fast Correlation Based Filter (FCBF) with a different search strategy. Computer and Information Sciences, 2008. ISCIS'08. 23rd International Symposium. Istanbul, Turkey: IEEE 2008.

[27] Witten IH, Frank E. Data Mining: Practical machine learning tools and techniques. Burlington, MA: Morgan Kaufmann 2005.

[28] Friedman N, Geiger D, Goldszmidt M. Bayesian Network Classifiers. Mach Learn 1997; 29(2): 131-63.

[29] John GH, Langley P, Eds. Estimating continuous distributions in Bayesian classifiers. Proceedings of the Eleventh conference on Uncertainty in artificial intelligence. Burlington, MA: Morgan Kaufmann Publishers Inc. 338-45.

[30] Schmidt M, Roux NL, Bach F. Minimizing finite sums with the stochastic average gradient. Math Program 2017; 162(1-2): 83-112.

[31] Cortes C, Vapnik V. Support-vector networks. Mach Learn 1995; 20(3): 273-97. [http://dx.doi.org/10.1007/BF00994018]

[32] Wu T-F, Lin C-J, Weng RC. Probability estimates for multi-class classification by pairwise coupling. J Mach Learn Res 2004; 5: 975-1005.

[33] Ruck DW, Rogers SK, Kabrisky M. Feature selection using a multilayer perceptron. J Neural Netw Comput 1990; 2(2): 40-8.

[34] Loh W-Y. Improving the precision of classification trees. Ann Appl Stat 2009; 3(4): 1710-37. [http://dx.doi.org/10.1214/09-AOAS260]

[35] Holte RC. Very simple classification rules perform well on most commonly used datasets. Mach Learn 1993; 11(1): 63-90. [http://dx.doi.org/10.1023/A:1022631118932]

[36] Tapak L, Mahjub H, Hamidi O, Poorolajal J. Real-data comparison of data mining methods in prediction of diabetes in Iran. Healthc InformRes 2013; 19(3): 177-85.

189

http://dx.doi.org/10.1136/postgradmedj-2013-132068


http://dx.doi.org/10.14236/jhi.v22i2.79


http://dx.doi.org/10.1111/j.1464-5491.2009.02917.x


http://dx.doi.org/10.9778/cmajo.20130095




http://dx.doi.org/10.1001/jama.2015.16455





http://www.csdmruk.imshealth.com/index.html

http://dx.doi.org/10.1136/bmj.326.7399.1130



http://dx.doi.org/10.1016/S2213-8587(15)00082-0


http://dx.doi.org/10.2337/dc15-S003

http://dx.doi.org/10.1007/BF00994018

http://dx.doi.org/10.1214/09-AOAS260

http://dx.doi.org/10.1023/A:1022631118932


[http://dx.doi.org/10.4258/hir.2013.19.3.177] [PMID: 24175116]

[37] Mani S, Chen Y, Elasy T, Clayton W, Denny J. Type 2 diabetes risk forecasting from EMR data using machine learning. AMIA Annu Symp Proc 2012. 606-15.

[38] Zheng T, Xie W, Xu L, et al. A machine learning-based framework to identify type 2 diabetes through electronic health records. Int J Med Inform 2017; 97: 120-7. [http://dx.doi.org/10.1016/j.ijmedinf.2016.09.014] [PMID: 27919371]

[39] Razavian N, Blecker S, Schmidt AM, Smith-McLallen A, Nigam S, Sontag D. Population-level prediction of type 2 diabetes from claims data and analysis of risk factors. Big Data 2015; 3(4): 277-87. [http://dx.doi.org/10.1089/big.2015.0020] [PMID: 27441408]

[40] Thomas G, Klein K, Paul S. Statistical challenges in analysing large longitudinal patient-level data: The danger of misleading clinical inferences with imputed data. J Indian Soc Agric Stat 2014; 68(2): 39-54.

[41] Khunti K, Davies M, Majeed A, Thorsted BL, Wolden ML, Paul SK. Hypoglycemia and risk of cardiovascular disease and all-cause mortality in insulin-treated people with type 1 and type 2 diabetes: A cohort study. Diabetes Care 2015; 38(2): 316-22. [PMID: 25492401]

© 2017 Owusu Adjah et al.

This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: (https://creativecommons.org/licenses/by/4.0/legalcode). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.


APPENDIX C



Weight gain in insulin-treated patients by body mass index category at treatment initiation: new evidence from real-world data in patients with type 2 diabetes

S. K. Paul PhD1 | J. E. Shaw MSc2 | O. Montvida MSc1,3 | K. Klein PhD1

1Clinical Trials and Biostatistics Unit, QIMR Berghofer Medical Research Institute, Brisbane, Australia

2Baker IDI Heart and Diabetes Institute,

3School of Biomedical Sciences, Faculty of Health, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia

Corresponding Author: Prof. S. K. Paul, Clinical

Trials and Biostatistics Unit, QIMR Berghofer

Medical Research Institute, Brisbane,

Queensland 4006, Australia

([email protected] ).

Funding information

The QIMR Berghofer Medical Research

Institute gratefully acknowledges the support

from the National Health and Medical

Research Council and the Australian

Government’s National Collaborative Research

Infrastructure Strategy (NCRIS) initiative

through Therapeutic Innovation Australia.

Aims: To evaluate, in patients with type 2 diabetes (T2DM) treated with insulin, the extent of

weight gain over 2 years of insulin treatment, and the dynamics of weight gain in relation to

glycaemic achievements over time according to adiposity levels at insulin initiation.

Materials and methods: Patients with T2DM (n = 155 917), who commenced insulin therapy and continued it for at least 6 months, were selected from a large database of electronic medical records in the USA. Longitudinal changes in body weight and glycated haemoglobin (HbA1c) according to body mass index (BMI) category were estimated.

Results: Patients had a mean age of 59 years, a mean HbA1c level of 9.5%, and a mean BMI of 35 kg/m2 at insulin initiation. The HbA1c levels at insulin initiation were significantly lower (9.2-9.4%) in the obese patients than in patients with normal body weight (10.0%); however, the proportions of patients with HbA1c >7.5% or >8.0% were similar across the BMI categories. The adjusted weight gain fell progressively with increasing baseline BMI category over 6, 12 and 24 months (p < .01). The adjusted changes in HbA1c were similar across BMI categories. A 1% decrease in HbA1c was associated with progressively less weight gain as pretreatment BMI rose, ranging from a 1.24 kg gain in those with a BMI <25 kg/m2 to a 0.32 kg loss in those with a BMI > 40 kg/m2.

Conclusions: During 24 months of insulin treatment, obese patients gained significantly less

body weight than normal-weight and overweight patients, while achieving clinically similar

glycaemic benefits. These data provide reassurance with regard to the use of insulin in

obese patients.

KEYWORDS

body mass index, glycaemic control, insulin initiation, type 2 diabetes, weight change

1 | INTRODUCTION

Type 2 diabetes (T2DM) is a progressive disease in which β-cell function continues to decline over time, leading to the need for insulin treatment in a significant proportion of patients. Although studies suggest that early initiation of insulin supplementation may alter the progressive course of T2DM, insulin initiation is often delayed, mainly because of patients’ hesitation and physician barriers.1–7 This may significantly increase the risk of developing diabetic complications.7,8 Fear of weight gain is one of the common reasons for delaying insulin therapy, and concerns about potential weight gain are usually greatest for those who are already obese.8–11

While many studies have reported that significant weight gain is associated with insulin therapy, no study, to the best of our knowledge, has explored the weight gain with insulin therapy according to adiposity levels at the time of insulin initiation. If it were the case that insulin-related weight gain declines with increasing pretreatment adiposity, then significant reassurance could be provided to obese patients, and timely insulin therapy could be initiated.

In the present longitudinal study, using data from a comprehensive electronic medical record database, the aims were to evaluate the following, according to pretreatment body mass index (BMI) category: (1) change in body weight during the 2 years after initiation of insulin treatment; (2) glycaemic control in these patients in relation to weight change; and (3) weight gain associated with a 1 percentage point improvement in glycated haemoglobin (HbA1c).

Received: 2 June 2016 Revised: 31 July 2016 Accepted: 1 August 2016

DOI 10.1111/dom.12761

© 2016 John Wiley & Sons Ltd wileyonlinelibrary.com/journal/dom Diabetes Obes Metab December 2016; 18: 1244–1252


2 | MATERIALS AND METHODS

2.1 | Data source

The General Electric Centricity Electronic Medical Records (CEMR) database contains more than 40 million patients’ clinical/treatment records from 1995 to January 2015. The CEMR represents 49 US states and includes data from >35 000 healthcare providers, of which ~70% are primary care providers. The CEMR database stores data on medication prescriptions within the electronic medical records network, and also information on medications that may be used over the counter or prescribed outside of the electronic medical records network. This database includes insured and uninsured patients, and has been extensively used for academic research worldwide.12–16

From more than 1.6 million patients with T2DM, a cohort of 432 287 patients, who received at least one prescription of insulin on or after diagnosis of T2DM, was selected. Identification of T2DM was based on the International Classification of Diseases (ICD)-9 codes and at least two prescriptions for any antidiabetes drugs within 6 months of diagnosis of T2DM. Patients with incomplete description of the disease and with any record of type 1 diabetes longitudinally were excluded. Inclusion criteria were no missing data on age, sex or ethnicity at diagnosis of T2DM, age at insulin initiation between 18 and 75 years, first prescription date of insulin on or after 1 January 1995, no missing data on body weight and HbA1c at and within 3 months of the first date of insulin prescription, and no prescription of glucagon-like peptide-1 (GLP-1) receptor agonists during the follow-up period. The sizes of the cohorts that had 6, 12 and 24 months of insulin treatment duration were 155 917 (main cohort), 151 220 (sub-cohort 1) and 144 857 (sub-cohort 2), respectively.

Patients’ BMI at insulin initiation was categorized as: BMI < 25

kg/m2 (normal weight); BMI ≥ 25 and <30 kg/m2 (overweight);

BMI ≥ 30 and <35 kg/m2 (Grade 1 obesity); BMI ≥ 35 and <40 kg/

m2 (Grade 2 obesity); and BMI ≥ 40 kg/m2 (Grade 3 obesity).
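The BMI cut-offs above can be written as a small lookup. The following is a minimal Python sketch, not code from the study; the function name `bmi_category` and the returned label strings are illustrative assumptions.

```python
def bmi_category(bmi: float) -> str:
    """Map BMI (kg/m^2) at insulin initiation to the category used in the text.

    Hypothetical helper: lower bounds are inclusive and upper bounds exclusive,
    matching "BMI >= 25 and < 30" and so on.
    """
    if bmi <= 0:
        raise ValueError("BMI must be positive")
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    if bmi < 35:
        return "Grade 1 obesity"
    if bmi < 40:
        return "Grade 2 obesity"
    return "Grade 3 obesity"


if __name__ == "__main__":
    for value in (22.5, 27.7, 32.6, 37.3, 46.8):  # mean BMI per category in Table 2
        print(value, bmi_category(value))
```

Boundary values fall into the higher category, so a BMI of exactly 30 kg/m2 is classed as Grade 1 obesity.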

Baseline data included age, sex, ethnicity, body weight, BMI and blood pressure at the time of diagnosis of diabetes and at the time of insulin initiation (index date). Longitudinal clinical and laboratory measures were arranged on the basis of 6-monthly windows, progressively from 6 months before the index date, and only the latest measurement within each window was preserved. For example, the latest HbA1c value measured >6 and ≤12 months after the index date was kept as HbA1c at 12 months. Complete information on antidiabetes drugs, antihypertensive and cardioprotective medications over time was obtained, along with dates of prescriptions. For antidiabetes drugs, information was extracted on any medication that was prescribed after diagnosis of diabetes and after the index date. The treatment duration with individual medications was calculated.
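The 6-monthly windowing rule described above (keep only the latest measurement in each window, so that a value measured >6 and ≤12 months after the index date is stored as the 12-month value) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code: the function names are hypothetical, a month is approximated as 30.4375 days, and measurements taken in the 6 months up to and including the index date are labelled as window 0.

```python
import math
from datetime import date
from typing import Dict, List, Optional, Tuple

DAYS_PER_MONTH = 30.4375  # assumption: average calendar month length


def window_label(index_date: date, measured_on: date) -> Optional[int]:
    """Return the 6-monthly window (0, 6, 12, ...) a measurement falls into.

    Window k covers measurements taken more than k-6 and at most k months
    after the index date; anything 6 or more months before it is ignored.
    """
    months = (measured_on - index_date).days / DAYS_PER_MONTH
    if months <= -6:
        return None
    return 6 * math.ceil(months / 6)


def latest_per_window(
    index_date: date, measurements: List[Tuple[date, float]]
) -> Dict[int, float]:
    """Keep only the most recent value within each 6-monthly window."""
    latest: Dict[int, Tuple[date, float]] = {}
    for measured_on, value in measurements:
        label = window_label(index_date, measured_on)
        if label is None:
            continue
        if label not in latest or measured_on > latest[label][0]:
            latest[label] = (measured_on, value)
    return {label: value for label, (_, value) in sorted(latest.items())}


if __name__ == "__main__":
    # Made-up HbA1c readings for one hypothetical patient.
    readings = [
        (date(2009, 11, 1), 9.8),
        (date(2010, 3, 1), 9.0),
        (date(2010, 8, 1), 8.4),
        (date(2010, 11, 15), 8.1),
    ]
    print(latest_per_window(date(2010, 1, 1), readings))
```

In the example, the two readings taken between 6 and 12 months after the index date compete for the 12-month slot, and the later one (8.1) is kept.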


The proportions of patients with missing data on body weight

and HbA1c between 6 and 24 months of follow-up ranged between

9% – 16% and 8% – 17%, respectively. The missing data were

imputed using a multiple imputation technique, with adjustments for

age at insulin initiation, diabetes duration at insulin initiation and

usage of oral antidiabetes drugs (OAD) during follow-up. All primary

analyses were conducted using the imputed weight and HbA1c data,

with additional analyses based on complete cases for sensitivity

analyses.

The mean [95% confidence interval (CI)] changes in body weight and HbA1c at 6, 12 and 24 months for the main study cohort and the two sub-cohorts were obtained using weighted multivariate regression models, adjusting for age at index date, sex, diabetes duration at index date, OAD usage, and distribution of weight or HbA1c at index date, separately for BMI categories at index date. The regression models for weight and HbA1c change were weighted by baseline weight and HbA1c respectively. The mean (95% CI) values for the possible marginal contribution of sulphonylurea usage to weight and HbA1c changes were estimated using the same regression models, as appropriate. Separate sensitivity analyses were conducted using data from patients with a minimum of 2 years’ diabetes duration at index date, to verify the consistency of the findings on weight change according to BMI category at index date. Additional sensitivity analyses included evaluations of changes in body weight and HbA1c, with further adjustments for the insulin regimen at index date, and among those who did not undergo any bariatric surgery procedure before insulin initiation or during follow-up (n = 153 788).

To evaluate the possible association of a 1% reduction in HbA1c by insulin ± other antidiabetes drugs with weight gain over 6, 12 and 24 months of insulin treatment, multivariate factorial regression models were fitted. For example, to evaluate the possible association of a 1% reduction in HbA1c at 12 months of treatment, the fitted model was:

$$
\begin{aligned}
(\text{weight}_{12\,\text{m}} - \text{weight}_{\text{index date}}) \sim \text{function}\{ &\text{age}_{\text{index date}} + \text{sex} + \text{diabetes duration}_{\text{index date}} \\
&+ \text{use of any OAD on index date or during 12 months of follow-up} \\
&+ (\text{BMI categories}_{\text{index date}}) \times (\text{HbA1c}_{12\,\text{m}} - \text{HbA1c}_{\text{index date}}) \},
\end{aligned}
$$

weighted by weight at index date.

The regression-based approach described above was also used to

evaluate the possible differences in the patterns of association of

HbA1c change with weight change in people with different BMI

levels at insulin initiation.
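As a rough illustration of the model above, the sketch below fits, separately within each BMI category, a weighted least-squares line of weight change on HbA1c reduction, weighted by baseline weight. It is a deliberately simplified stand-in and not the authors' model: it omits the covariates (age, sex, diabetes duration, OAD use), expresses the HbA1c term as a reduction (index value minus 12-month value) so that a positive slope reads as kg gained per 1% reduction, and all function names and the record layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def weighted_slope(
    x: Sequence[float], y: Sequence[float], w: Sequence[float]
) -> Tuple[float, float]:
    """Weighted least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept); the weights w must be non-negative.
    """
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("x has no weighted variation")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def slopes_by_bmi_category(records: Iterable[dict]) -> Dict[str, float]:
    """Weight change (kg) per 1% HbA1c reduction, one slope per BMI category."""
    groups: Dict[str, Tuple[List[float], List[float], List[float]]] = defaultdict(
        lambda: ([], [], [])
    )
    for r in records:
        xs, ys, ws = groups[r["bmi_category"]]
        xs.append(r["hba1c_index"] - r["hba1c_12m"])    # HbA1c reduction, %
        ys.append(r["weight_12m"] - r["weight_index"])  # weight change, kg
        ws.append(r["weight_index"])                    # weighted by baseline weight
    return {cat: weighted_slope(xs, ys, ws)[0] for cat, (xs, ys, ws) in groups.items()}
```

Fitting each category separately mirrors the BMI category × HbA1c change interaction only in this covariate-free setting; a full analysis would estimate all terms in a single weighted model.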

3 | RESULTS

The demographic, clinical, laboratory and medications data at the time of insulin initiation are shown in Table 1, for the whole study cohort (n = 155 917) and the two sub-cohorts defined on the basis of minimum insulin treatment duration of 12 months (n = 151 220) and 24 months (n = 144 857). In the whole cohort, patients had a mean [standard deviation (s.d.)] age of 59 (11) years, 48% were male, 54% were white, and the median diabetes duration at index date was ~2 years. The mean HbA1c at insulin initiation (9.5%) and the proportions of patients with HbA1c >7.5% and >8% were similar in all cohorts.

Table 2 shows the baseline characteristics according to BMI category. Female and white patients were more likely to have a higher BMI at index date. The diabetes duration was similar among the BMI groups. The mean HbA1c levels at index date were significantly lower (9.2-9.4%) in the obese patients than in patients with normal body weight (10.0%); however, the proportions of patients with HbA1c >7.5% and >8.0% were similar across the BMI categories. The obese patients were more likely to be on concomitant metformin and/or sulphonylurea therapies than patients with normal weight.

3.1 | Weight change

Weight change over 24 months is shown in Figure 1 and Table 3. The weight gain was significantly and consistently lower in patients with a higher BMI, compared with that in patients with normal body weight. In Grade 1 and Grade 2 obese patients, the adjusted mean weight gain over 6 and 12 months of insulin treatment ranged between 0.1 and 0.9 kg (Table 3), combining the main cohort and sub-cohort 1. In Grade 3 obese patients, the adjusted reductions in body weight were 0.7, 1.1 and 2.2 kg over 6, 12 and 24 months of insulin treatment, respectively. The adjusted mean weight gain in the normal-weight patients ranged between 2 and 4 kg over 6-24 months of insulin treatment. In normal-weight and overweight patients, the mean weight gain was significantly higher than in obese patients. As evident from Figure 1C, normal-weight and overweight patients continuously gained weight over 2 years of follow-up, while declining body weight trajectories were observed in obese patients after 12 months of insulin initiation.

TABLE 1 Baseline characteristics of main cohort and two sub-cohorts

Main cohort (insulin treatment ≥ 6 months)   Sub-cohort 1 (insulin treatment ≥ 12 months)   Sub-cohort 2 (insulin treatment ≥ 24 months)

N 155 917 151 220 144 857
Male, n (%) 75 038 (48) 72 797 (48) 69 788 (48)
Ethnicity, n (%)
White 83 441 (54) 80 840 (54) 77 431 (54)
Black 21 658 (14) 21 086 (14) 20 274 (14)
Hispanic 6180 (4) 6004 (4) 5740 (4)
Asian 2789 (2) 2713 (2) 2616 (2)
Other/Unknown 41 489 (27) 40 577 (27) 38 796 (27)
Mean (s.d.) age at insulin initiation, years 59 (11) 59 (11) 59 (11)
Diabetes duration
Median (Q1, Q3), years 1.9 (0.2, 2.6) 1.9 (0.2, 2.6) 2.0 (0.2, 2.7)
<2 years, n (%) 102 038 (72) 98 911 (72) 94 655 (71)
2-5 years, n (%) 25 754 (18) 25 020 (18) 24 027 (18)
>5 years, n (%) 14 848 (10) 14 431 (10) 13 856 (11)
Mean (s.d.) weight, kg 98.8 (26.0) 98.6 (26.0) 98.6 (25.9)
Mean (s.d.) BMI, kg/m2 34.8 (8.4) 34.8 (8.5) 34.8 (8.4)
BMI category1, n (%)
<25 kg/m2 13 880 (9) 13 585 (9) 13 153 (9)
≥25 and <30 kg/m2 32 047 (21) 31 254 (21) 30 106 (21)
≥30 and <35 kg/m2 43 274 (28) 41 916 (28) 40 153 (28)
≥35 and <40 kg/m2 32 445 (21) 31 385 (21) 29 940 (21)
≥40 kg/m2 34 271 (22) 33 080 (22) 31 505 (22)
HbA1c
Mean (s.d.) HbA1c, % 9.5 (2.3) 9.5 (2.3) 9.5 (2.2)
Median (Q1, Q3) HbA1c, % 9.2 (7.7, 11.0) 9.2 (7.7, 10.9) 9.2 (7.7, 10.9)
HbA1c ≥ 7.5%, n (%) 123 802 (79) 120 111 (79) 114 965 (79)
HbA1c ≥ 8%, n (%) 111 368 (71) 108 052 (72) 103 379 (71)
Antidiabetes medication, n (%)
Metformin 108 377 (70) 105 084 (70) 100 531 (69)
Sulphonylurea 74 492 (48) 72 305 (48) 69 214 (48)
Insulin type at initiation, n (%)
Basal 99 299 (65) 96 343 (65) 92 212 (65)
Biphasic 21 136 (14) 20 548 (14) 19 843 (14)
Prandial 31 314 (21) 30 335 (21) 29 044 (21)

1 BMI categories at insulin initiation: <25 kg/m2 (normal weight); ≥25 and <30 kg/m2 (overweight); ≥30 and <35 kg/m2 (Grade 1 obesity); ≥35 and <40 kg/m2 (Grade 2 obesity); and ≥40 kg/m2 (Grade 3 obesity).

The proportions of patients with weight gain of ≥5 kg were significantly lower in obese groups (16% and 19%) than in normal-weight patients (28% and 37%) during 12 and 24 months of insulin treatment (Table 3). Sulphonylurea treatment only marginally contributed to the weight gain (0.17-0.27 kg over 6-24 months of insulin treatment).

Approximately 28% of patients (n = 53 879) had a minimum of 2 years’ diabetes duration at insulin initiation (Table S1). The patterns of weight change by BMI category in this subset of patients were similar to those observed in all patients across the insulin treatment duration categories. The observed weight changes in different BMI categories were similar after adjustments for insulin regimen, and also for those who did not undergo any bariatric procedure(s).

3.2 | Glycaemic control

The 6-monthly longitudinal measures [mean (95% CI)] of HbA1c over 24 months from index date are shown in Figure 1B. The adjusted changes in HbA1c at 6, 12 and 24 months over the BMI categories are shown in Table 4. Starting with a significantly higher HbA1c of 10% at the index date (compared with overweight and obese patients), the normal-weight patients had a mean reduction in HbA1c of ~1.4% over 6-24 months of insulin treatment. The overweight, Grade 1 and Grade 2 obese patients had similar glycaemic achievements over the follow-up time (1.0-1.3%). Although the HbA1c reductions in Grade 3 obese patients were statistically significantly lower compared with the other groups of overweight and obese patients, these were clinically marginal differences.

In the main study cohort, the proportions of patients on metformin, sulphonylureas or both medications were 70%, 48% and 39%, respectively. Among normal-weight patients with a minimum of 24 months of insulin treatment (13 153/144 857 patients), the proportions of patients receiving metformin, sulphonylurea or both were 60%, 41% and 33%, respectively. The usage of these drugs, alone or in combination, was similar among overweight and obese patients, and was significantly higher than that observed among the normal-weight patients (Table 2). The adjusted marginal HbA1c reductions, associated with metformin treatment, were 0.37%, 0.38% and 0.31% at 6, 12 and 24 months of follow-up. The adjusted changes in HbA1c level associated with sulphonylurea were marginal (Table 4). These estimates were similar across all BMI categories (results are not presented).

TABLE 2 Study variables and concomitant medication usage at insulin initiation according to BMI category in patients with a minimum 24 months of insulin treatment (sub-cohort 2)

BMI category   <25 kg/m2   ≥25 and <30 kg/m2   ≥30 and <35 kg/m2   ≥35 and <40 kg/m2   ≥40 kg/m2

N 13 153 30 106 40 153 29 940 31 505
Male, n (%) 7126 (54) 16 925 (56) 20 847 (52) 13 592 (45) 11 298 (36)
Ethnicity, n (%)
White 6200 (47) 14 959 (50) 21 491 (54) 16 727 (56) 18 054 (57)
Black 1851 (14) 4244 (14) 5414 (14) 3967 (13) 4798 (15)
Hispanic 584 (4) 1577 (5) 1641 (4) 1051 (4) 887 (3)
Asian 639 (5) 891 (3) 603 (2) 267 (1) 216 (1)
Other/Unknown 3879 (30) 8435 (28) 11 004 (28) 7928 (27) 7550 (24)
Mean (s.d.) age at insulin initiation, years 58 (12) 60 (11) 59 (11) 59 (11) 57 (12)
Diabetes duration, years
Median (Q1, Q3) 1.6 (0.1, 1.9) 1.8 (0.1, 2.4) 1.9 (0.1, 2.4) 1.8 (0.1, 2.5) 1.8 (0.1, 2.5)
<2 years, n (%) 9223 (76) 19 800 (71) 26 050 (71) 19 281 (71) 20 301 (71)
2–5 years, n (%) 1927 (16) 5004 (18) 6697 (18) 5076 (19) 5323 (19)
>5 years, n (%) 1015 (8) 2952 (11) 3937 (11) 2892 (11) 3060 (11)
Mean (s.d.) weight, kg 64.4 (10.5) 79.8 (11.1) 93.8 (13.9) 106.4 (15.0) 130.2 (23.1)
Mean (s.d.) BMI, kg/m2 22.5 (2.2) 27.7 (1.4) 32.6 (1.4) 37.3 (1.4) 46.8 (6.9)
HbA1c
Mean (s.d.), % 10.0 (2.8) 9.7 (2.4) 9.4 (2.3) 9.3 (2.2) 9.2 (2.1)
Median (Q1, Q3), % 9.6 (7.7, 12.0) 9.4 (7.8, 11.3) 9.2 (7.7, 10.9) 9.0 (7.7, 10.7) 9.0 (7.6, 10.5)
HbA1c ≥ 7.5%, n (%) 10 400 (79) 24 289 (81) 32 015 (80) 23 686 (79) 24 575 (78)
HbA1c ≥ 8%, n (%) 9 462 (72) 22 045 (73) 28 766 (72) 21 188 (71) 21 918 (70)
Antidiabetes medication, n (%)
Metformin only 7892 (60) 20 685 (69) 27 973 (70) 21 328 (71) 22 653 (72)
Sulphonylurea only 5416 (41) 14 339 (48) 19 378 (48) 14 683 (49) 15 398 (49)
Metformin + Sulphonylurea 4390 (33) 11 896 (40) 15 945 (40) 12 074 (40) 12 636 (40)
Insulin type at initiation, n (%)
Basal 8164 (64) 19 503 (66) 25 757 (66) 19 018 (65) 19 770 (64)
Biphasic 1756 (14) 4114 (14) 5449 (14) 4125 (14) 4399 (14)
Prandial 2901 (23) 5755 (20) 7847 (20) 6025 (21) 6516 (21)

3.3 | Association of weight change with improvement in glycaemic control

The adjusted estimates (regression coefficients) and 95% CIs of weight gain associated with 1% reduction in HbA1c by BMI categories over 6, 12 and 24 months of insulin treatment among patients in sub-cohort 2 are shown in Table 3 and Figure 1D. The estimated weight gains were significantly lower among obese patients than among normal- and overweight patients. While a 1% HbA1c reduction was associated with weight gains of 0.92 and 1.24 kg among normal-weight patients at 12 and 24 months of insulin treatment, weight gain was 0.13-0.46 kg among Grade 1 and Grade 2 obese patients during the same follow-up period. Among Grade 3 obese patients, a 1% HbA1c reduction was not associated with any increase in weight.

3.4 | Pattern of association of weight change with glycaemic control by BMI category

Figure 1D shows a difference in the association between HbA1c reduction and weight gain by BMI category. For example, at 24 months of treatment, with the slope of 1.24 (95% CI 1.18, 1.31) kg associated with a 1% reduction in HbA1c for normal-weight patients as reference, the differences in the slopes for obese categories (from slope for normal-weight category) were significantly different (p < .01) from zero, and were also significantly different among three obese categories. The differences in the slopes in the overweight, Grade 1 and Grade 2 categories were 0.43, 0.78 and 1.11 kg, respectively (p < .01). The difference in estimated slopes in the normal-weight and Grade 3 obese patients was 1.56 kg. The pattern of association of longitudinal changes in body weight with HbA1c was similar for different ethnic groups.
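The slope differences quoted above follow directly from the 24-month estimates in Table 3. As a worked check (a Python sketch; the dictionary and function names are illustrative and not part of the original analysis), subtracting each category's slope from the normal-weight reference gives:

```python
# 24-month weight change (kg) associated with a 1% HbA1c reduction, from Table 3.
SLOPES_24M = {
    "normal weight": 1.24,
    "overweight": 0.81,
    "Grade 1 obesity": 0.46,
    "Grade 2 obesity": 0.13,
    "Grade 3 obesity": -0.32,
}


def slope_differences(slopes: dict, reference: str = "normal weight") -> dict:
    """Difference between the reference slope and each other category's slope."""
    ref = slopes[reference]
    return {
        category: round(ref - slope, 2)
        for category, slope in slopes.items()
        if category != reference
    }


if __name__ == "__main__":
    for category, diff in slope_differences(SLOPES_24M).items():
        print(f"{category}: {diff:.2f} kg")
    # overweight: 0.43 kg
    # Grade 1 obesity: 0.78 kg
    # Grade 2 obesity: 1.11 kg
    # Grade 3 obesity: 1.56 kg
```

The Grade 3 difference exceeds the reference slope itself because the Grade 3 estimate is negative (a 0.32 kg loss per 1% HbA1c reduction).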

As a sensitivity analysis, all analyses (described above) were also conducted in patients with complete data on weight and HbA1c measures at 6, 12 and 24 months of follow-up. No difference in the estimates or inferences was observed between complete case analyses and analyses based on the imputed data.

FIGURE 1 A, Mean (95% CI) of longitudinal measures of body weight (kg) over 2 years from the time of insulin initiation by BMI category at index date; B, mean (95% CI) of longitudinal measures of HbA1c over 2 years from the time of insulin initiation by BMI category at index date; C, adjusted mean (95% CI) of change in body weight at 6, 12 and 24 months of treatment with insulin by BMI category at index date; D, adjusted mean (95% CI) change in body weight at 6, 12 and 24 months of treatment with insulin, associated with 1% reduction in HbA1c at these time points.

4 | DISCUSSION

The present exploratory clinical study, based on large-scale longitudinal real-world data, clearly suggests that: (1) weight gain associated with insulin treatment is significantly lower in obese patients with T2DM compared with that observed in patients with normal body weight; (2) the significantly lower weight gain in obese patients is consistent over 6, 12 and 24 months of treatment with insulin, adjusted for various factors including the use of concomitant antidiabetes medications; (3) the glycaemic control over 24 months of treatment with insulin is similar among patients with different BMI levels; and (4) the weight gain associated with a 1% reduction in HbA1c falls progressively as pretreatment BMI increases.

Our finding that obese patients gain significantly less weight with insulin treatment is robust, and is supported by consistent estimates of weight changes according to different BMI categories over 6-24 months of insulin treatment (Figure 1C). Separate analyses to explore possible confounders, through evaluation of various characteristics according to BMI category in patients with a minimum of 2 years’ insulin treatment (Table 2), provide a basis to support this robust finding. This is further supported by consistent findings in patients with a minimum of 2 years’ diabetes duration before insulin initiation (Table S1). It is likely that some of the normal-weight patients (BMI <25 kg/m2), and perhaps some with a BMI 25-30 kg/m2, had lost weight before commencing insulin, which may account for some of their weight gain after insulin initiation; however, we still see progressive reductions in body weight gain with progressive increases in baseline BMI >30 kg/m2.

TABLE 3 Adjusted weight change over 24 months of insulin treatment initiation, by baseline BMI category

Main cohort (n = 155 917)   Sub-cohort 1 (n = 151 220)   Sub-cohort 2 (n = 144 857)

Mean (95% CI) weight change1, kg   Mean (95% CI) weight change, kg, associated with a 1% reduction in HbA1c2

At 6 months

BMI < 25 kg/m2 2.0 (2.0, 2.1) 2.0 (2.0, 2.1) 2.0 (2.0, 2.1) 0.66 (0.61, 0.70)

BMI ≥ 25 and <30 kg/m2 1.2 (1.1, 1.2) 1.2 (1.1, 1.2) 1.1 (1.1, 1.2) 0.44 (0.41, 0.47)

BMI ≥ 30 and <35 kg/m2 0.6 (0.6, 0.7) 0.6 (0.6, 0.7) 0.6 (0.6, 0.7) 0.29 (0.26, 0.31)

BMI ≥ 35 and <40 kg/m2 0.1 (0.1, 0.2) 0.1 (0.1, 0.2) 0.1 (0.1, 0.2) 0.11 (0.08, 0.15)

BMI ≥40 kg/m2 −0.7 (−0.7, −0.6) −0.7 (−0.7, −0.6) −0.7 (−0.7, −0.6) −0.06 (−0.02, 0.0)

At 12 months

BMI < 25 kg/m2 3.0 (3.0, 3.1) 3.0 (3.0, 3.1) 0.92 (0.86, 0.97)

BMI ≥ 25 and <30 kg/m2 1.8 (1.7, 1.8) 1.7 (1.7, 1.8) 0.61 (0.58, 0.65)

BMI ≥ 30 and <35 kg/m2 0.9 (0.9, 1.0) 0.9 (0.9, 1.0) 0.39 (0.36, 0.43)

BMI ≥ 35 and <40 kg/m2 0.2 (0.2, 0.3) 0.2 (0.2, 0.3) 0.15 (0.11, 0.19)

BMI ≥ 40 kg/m2 −1.1 (−1.2, −1.1) −1.1 (−1.2, −1.1) −0.18 (−0.22, −0.14)

At 24 months

BMI < 25 kg/m2 3.9 (3.8, 3.9) 1.24 (1.18, 1.31)

BMI ≥ 25 and <30 kg/m2 2.0 (1.9, 2.0) 0.81 (0.76, 0.85)

BMI ≥ 30 and <35 kg/m2 0.7 (0.7, 0.8) 0.46 (0.41, 0.50)

BMI ≥ 35 and <40 kg/m2 −0.2 (−0.3, −0.2) 0.13 (0.08, 0.18)

BMI ≥ 40 kg/m2 −2.2 (−2.2, −2.1) −0.32 (−0.37, −0.27)

Percentage of patients who increased body weight ≥ 5 kg

BMI < 25 kg/m2 13 28 37

BMI ≥ 25 and < 30 kg/m2 12 19 24

BMI ≥ 30 and < 35 kg/m2 11 17 19

BMI ≥ 35 and < 40 kg/m2 11 16 18

BMI ≥ 40 kg/m2 11 17 18

Mean (95% CI) of marginal contribution towards changes in weight, kg

Sulphonylurea treatment 0.17 (0.11, 0.22) 0.25 (0.17, 0.32) 0.27 (0.18, 0.36)

1 Adjusted for sex, diabetes duration and metformin and sulphonylurea usage, according to BMI categories at insulin initiation.

2 Adjusted for sex, diabetes duration, concomitant antidiabetic medication usage, and baseline HbA1c, according to BMI category at insulin initiation.

BMI categories at insulin initiation: <25 kg/m2 (normal weight); ≥25 and <30 kg/m2 (overweight); ≥30 and <35 kg/m2 (Grade 1 obesity); ≥35 and <40 kg/m2 (Grade 2 obesity); and ≥40 kg/m2 (Grade 3 obesity).

The level of HbA1c reached over time was clinically similar in the different BMI groups, despite the fact that obese patients had significantly lower HbA1c levels (9.2-9.4%) at insulin initiation compared with those with normal body weight or who were overweight (9.7-10.0%; Table 2). Thus, patients arrive at approximately the same HbA1c level, independently of the starting value. This might suggest that the lower weight gain seen in obese patients was attributable to the use of less intensive insulin therapy. Although we did not have insulin dose data, we nevertheless showed that, even when corrected for the same HbA1c reduction, weight gain remained smaller in the obese group.

Based on a cohort of 2179 patients with a median diabetes duration of ~9 years, Balkau et al. reported 1.78 kg of average weight gain (unadjusted) over 1 year of treatment with insulin, with 24% of patients experiencing weight gain of ≥5 kg, and a significant inverse association of baseline BMI with weight gain.17 The adjusted mean weight gain in the present study cohort with a minimum of 1 year of treatment with insulin (cohort 2) ranged between 0.2 and 3.49 kg among patients with BMI ≤ 40 kg/m2 (combining sub-cohorts 1 and 2). A marginal weight loss was observed in patients with Grade 3 obesity. The proportion of patients gaining ≥5 kg body weight at 1 year was similar across obese patients (16-17%; Table 3), while 28% of patients with normal body weight gained ≥5 kg. Compared with patients with normal weight, patients in the Grade 1, 2 and 3 obesity categories were 48%, 51% and 50% less likely, respectively, to increase body weight by >5 kg.

We observed that the cost of glycaemic control in terms of weight gain is marginal in Grade 1 and Grade 2 obese patients. Among the Grade 3 obese patients, a 1% reduction in HbA1c was associated with a decrease in weight of ~0.3 kg after 24 months of insulin treatment. Balkau et al.17 reported that high baseline HbA1c level, insulin dose requirements and lower baseline BMI were independently associated with greater weight gain. In a meta-analysis, Pontiroli et al.10 found that intensity of treatment, insulin dose, final HbA1c level, change in HbA1c level and frequency of hypoglycaemia were significantly associated with weight increase as well as type of insulin regimen; however, these studies did not evaluate the possible association of glycaemic control with weight change in patients with different levels of adiposity at the time of insulin initiation. The present study also provides new information on the significant differences in the patterns of possible association between glycaemic control and weight change in insulin-treated patients by BMI category.

Electronic databases present challenges in terms of accuracy

and completeness of the required data. The limitations of this

TABLE 4 Adjusted HbA1c change over 24 months of insulin treatment initiation, by baseline BMI category

Main cohortn = 155 917



Mean (95% CI) HbA1c change (%)1

At 6 months

<25 kg/m2 −1.4 (−1.4, −1.3) −1.4 (−1.4, −1.3) −1.3 (−1.3, −1.2)

≥25 and <30 kg/m2 −1.3 (−1.3, −1.2) −1.3 (−1.3, −1.2) −1.2 (−1.2, −1.1)

≥30 and <35 kg/m2 −1.1 (−1.2, −1.1) −1.1 (−1.2, −1.1) −1.1 (−1.2, −1.1)

≥35 and <40 kg/m2 −1.1 (−1.2, −1.1) −1.1 (−1.1, −1.0) −1.1 (−1.1, −1.0)

≥40 kg/m2 −1.0 (−1.0, −0.9) −1.0 (−1.0, −0.9) −1.0 (−1.0, −0.9)

At 12 months

<25 kg/m2 −1.3 (−1.4, −1.3) −1.3 (−1.3, −1.2)

≥25 and <30 kg/m2 −1.2 (−1.2, −1.1) −1.2 (−1.2, −1.1)

≥30 and <35 kg/m2 −1.1 (−1.1, −1.0) −1.1 (−1.1, −1.0)

≥35 and <40 kg/m2 −1.0 (−1.0, −0.9) −1.0 (−1.1, −1.0)

≥40 kg/m2 −1.0 (−1.0, −0.9) −1.0 (−1.0, −0.9)

At 24 months

<25 kg/m2 −1.4 (−1.4, −1.3)

≥25 and <30 kg/m2 −1.2 (−1.2, −1.1)

≥30 and <35 kg/m2 −1.1 (−1.1, −1.0)

≥35 and <40 kg/m2 −1.0 (−1.1, −1.0)

≥40 kg/m2 −1.0 (−1.0, −0.9)

Metformin treatment

Yes −1.2 (−1.3, −1.2) −1.2 (−1.3, −1.2) −1.2 (−1.2, −1.1)

No −0.9 (−0.9, −0.8) −0.9 (−0.9, −0.8) −0.9 (−0.9, −0.8)

Mean (95% CI) of marginal contribution towards changes in HbA1c, %

Sulphonylurea treatment 0.03 (0.02, 0.03) 0.3 (0.1, 0.5) 0.7 (0.5, 0.9)

1 Adjusted for sex, diabetes duration and metformin and sulphonylurea usage. Estimates are provided by BMI categories at insulin initiation. BMI categories at insulin initiation: <25 kg/m2 (normal weight); ≥25 and <30 kg/m2 (overweight); ≥30 and <35 kg/m2 (Grade 1 obesity); ≥35 and <40 kg/m2 (Grade 2 obesity); and ≥40 kg/m2 (Grade 3 obesity).


study include non-availability of complete and reliable data on:

(1) medication adherence; (2) diet and exercise; (3) socio-economic

status; and (4) potential residual confounders. We believe the non-

availability of longitudinal insulin doses would not affect the

robustness of our findings, as the main interest was to evaluate

the observed weight change and glycaemic control at a population

level, reflecting the primary/ambulatory care disease risk factor

management. Our analysis of weight change in relation to change

in HbA1c confirms that the weight gain “cost” of achieving any

given improvement in glycaemic control is, in fact, less with

increase in pretreatment BMI.

Although we excluded patients treated with GLP-1 receptor ago-

nists, it is not possible to know to what extent the lower weight gain

with increasing BMI is attributable to pharmacological differences in

the effects of insulin or to other attempts to lose weight in the obese

groups. Patients may well intensify lifestyle efforts on commencing

insulin. Whether or not this happens more in the more obese is diffi-

cult to ascertain because of the non-availability of longitudinal life-

style intervention data. Thus, these results should not be interpreted

as indicating that lifestyle efforts to control weight gain are not nec-

essary for obese patients initiating insulin. Nevertheless, they do indi-

cate that within the facilities available in routine care, weight gain can

readily be limited when initiating insulin therapy in obese patients

with T2DM.

A large analysis cohort from the validated CEMR database

should be considered as a representative sample, and as such, pro-

vides a good picture of the state of weight and glycaemic control

in routine practice. We had complete data on weight and HbA1c

measured within 3 months of insulin initiation, and the 6-monthly

follow-up measures of weight and HbA1c were imputed for only

8-16% of missing cases. The results from complete case analyses

and imputed data were very similar. Finally, a careful new-user

design with a reasonable exposure time of 2 years and appropriate

adjustments for various aspects are the primary strengths of the

study.

In conclusion, we observed that, over 24 months of treatment

with insulin, obese patients gained significantly less weight than

normal- and overweight patients, while achieving clinically similar gly-

caemic benefits. These findings should provide important reassurance

that, among obese patients with T2DM in routine clinical practice,

meaningful improvements in glycaemic control can be achieved with

only small increases in weight.

ACKNOWLEDGMENTS

The QIMR Berghofer Medical Research Institute gratefully acknowl-

edges the support from the National Health and Medical Research

Council and the Australian Government’s National Collaborative

Research Infrastructure Strategy (NCRIS) initiative through Therapeu-

tic Innovation Australia.


S. K. P. has acted as a consultant and/or speaker for Novartis, GI



of investigator and investigator initiated clinical studies from Merck,


Avensis and Pfizer. J. S. has received honoraria for consultancies and

lectures from: Novartis, Novo Nordisk, Astra Zeneca, Sanofi, Merck

Sharp and Dohme, Abbott, Janssen Cilag, and Takeda. O. M. and

K. K. have no conflict of interest to declare.


S. K. P. conceived the idea and was responsible for the primary

design of the study. J. S. and K. K. significantly contributed to the

study design. K. K. conducted the data extraction, and S. K. P., O. M.

and K. K. jointly conducted the statistical analyses. The first draft of

the manuscript was developed by S. K. P., and all authors contributed

to the finalization of the manuscript. S. K. P. had full access to all the

data in the study and is the guarantor, taking responsibility for the

integrity of the data and the accuracy of the data analysis.

REFERENCES

1. Ishii H, Iwamoto Y, Tajima N. An exploration of barriers to insulin initiation for physicians in Japan: findings from the Diabetes Attitudes, Wishes and Needs (DAWN) JAPAN study. PLoS One. 2012;7:e36361.

2. Weng J, Li Y, Xu W, et al. Effect of intensive insulin therapy on beta-cell function and glycaemic control in patients with newly diagnosed type 2 diabetes: a multicentre randomised parallel-group trial. Lancet. 2008;371:1753–1760.

3. Alvarsson M, Sundkvist G, Lager I, et al. Effects of insulin vs. glibenclamide in recently diagnosed patients with type 2 diabetes: a 4-year follow-up. Diabetes Obes Metab. 2008;10:421–429.

4. Wang Z, York NW, Nichols CG, Remedi MS. Pancreatic beta cell dedifferentiation in diabetes and redifferentiation following insulin therapy. Cell Metab. 2014;19:872–882.

5. Khunti K, Davies M, Majeed A, Thorsted BL, Wolden ML, Paul SK. Hypoglycemia and risk of cardiovascular disease and all-cause mortality in insulin-treated people with type 1 and type 2 diabetes: a cohort study. Diabetes Care. 2015;38:316–322.

6. Russell-Jones D, Khan R. Insulin-associated weight gain in diabetes–causes, effects and coping strategies. Diabetes Obes Metab. 2007;9:799–812.

7. Paul SK, Klein K, Thorsted BL, Wolden ML, Khunti K. Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovasc Diabetol. 2015;14:1–10.

8. Khunti K, Nikolajsen A, Thorsted BL, Andersen M, Davies MJ, Paul SK. Clinical inertia in intensifying therapy among people with type 2 diabetes treated with basal insulin. Diabetes Obes Metab. 2016;18:401–409.

9. Wang H, Ni YF, Li HZ, Yang S, Feng B. Effects of insulin monotherapy on body weight, composition, and fat distribution in newly diagnosed patients with type 2 diabetes mellitus. J Diabetes. 2013;5:146–148.

10. Pontiroli AE, Miele L, Morabito A. Increase of body weight during the first year of intensive insulin treatment in type 2 diabetes: systematic review and meta-analysis. Diabetes Obes Metab. 2011;13:1008–1019.

11. Paul S, Thorsted BL, Wolden M, Klein K, Khunti K. Delay in treatment intensification increases the risks of cardiovascular events in patients with type 2 diabetes. Cardiovasc Diabetol. 2015;14:1.

12. Kamal KM, Chopra I, Elliott JP, Mattei TJ. Use of electronic medical records for clinical research in the management of type 2 diabetes. Res Social Adm Pharm. 2014;10:877–884.

13. Herrin J, da Graca B, Nicewander D, et al. The effectiveness of implementing an electronic health record on diabetes care and outcomes. Health Serv Res. 2012;47:1522–1540.


14. Hansen RA, Farley JF, Maciejewski ML, Ye X, Qian C, Powers B. Real-world utilization patterns and outcomes of colesevelam HCL in the GE electronic medical record. BMC Endocr Disord. 2013;13:24.

15. Levin P, Wei W, Miao R, et al. Therapeutically interchangeable? A study of real-world outcomes associated with switching basal insulin analogues among US patients with type 2 diabetes mellitus using electronic medical records data. Diabetes Obes Metab. 2015;17:245–253.

16. Davis KL, Tangirala M, Meyers JL, Wei W. Real-world comparative outcomes of US type 2 diabetes patients initiating analog basal insulin therapy. Curr Med Res Opin. 2013;29:1083–1091.

17. Balkau B, Home PD, Vincent M, Marre M, Freemantle N. Factors associated with weight gain in people with type 2 diabetes starting on insulin. Diabetes Care. 2014;37:2108–2113.




How to cite this article: Paul SK, Shaw J, Montvida O and

Klein K. Weight gain in insulin-treated patients by body mass

index category at treatment initiation: new evidence from

real-world data in patients with type 2 diabetes, Diabetes

Obes Metab 2016, 18, 1244–1252. DOI:10.1111/dom.12761


APPENDIX D


Research Article: Treatment

Treatment with incretins does not increase the

risk of pancreatic diseases compared to older

anti-hyperglycaemic drugs, when added to metformin:

real world evidence in people with Type 2 diabetes

O. Montvida1,2, J. B. Green3, J. Atherton4 and S. K. Paul1,5

1Statistics Unit, QIMR Berghofer Medical Research Institute, 2School of Biomedical Sciences, Queensland University of Technology, Brisbane, Australia, 3Division of

Endocrinology and Duke Clinical Research Institute, Duke University Medical Center, Durham, NC, USA, 4Cardiology Department, Royal Brisbane and Women’s

Hospital and University of Queensland School of Medicine, Brisbane and 5Melbourne EpiCentre, University of Melbourne and Melbourne Health, Melbourne,

Australia

Accepted 9 October 2018

Abstract

Aims In people with metformin-treated diabetes, to evaluate the risk of acute pancreatitis, pancreatic cancer and other

diseases of the pancreas post second-line anti-hyperglycaemic agent initiation.

Methods People with Type 2 diabetes diagnosed after 2004 who received metformin plus a dipeptidyl peptidase-4

inhibitor (DPP-4i, n = 50 095), glucagon-like peptide-1 receptor agonist (GLP-1RA, n = 12 654), sulfonylurea

(n = 110 747), thiazolidinedione (n = 17 597) or insulin (n = 34 805) for at least 3 months were identified in the US

Centricity Electronic Medical Records. Time to developing acute pancreatitis, other diseases of the pancreas and

pancreatic cancer was estimated, balancing and adjusting anti-hyperglycaemic drug groups for appropriate confounders.

Results In the DPP-4i group, the adjusted mean time to acute pancreatitis was 2.63 [95% confidence intervals (CI)

2.38, 2.88] years; time to pancreatic cancer was 2.70 (2.19, 3.21) years; and time to other diseases of the pancreas was

2.73 (2.33, 3.12) years. Compared with DPP-4i, the insulin group developed acute pancreatitis 0.48 years (P < 0.01)

earlier and the GLP-1RA group developed pancreatic cancer 3 years later (P < 0.01). However, with the constraint of no

event within 6 months of insulin initiation, the risk of acute pancreatitis in the insulin group was insignificant. No other

significant differences were observed between groups.

Conclusions No significant differences in the risk of developing pancreatic diseases in those treated with various

anti-hyperglycaemic drug classes were found.

Diabet. Med. 00, 1–8 (2018)

Introduction

Glucagon-like peptide-1 receptor agonists (GLP-1RAs) and

dipeptidyl peptidase-4 inhibitors (DPP-4i) represent incretin-

based therapeutic drug classes used to treat Type 2 diabetes.

These drugs have demonstrated efficacy in reducing blood

glucose levels with low risk of hypoglycaemia [1–3]. Treat-

ment with GLP-1RAs is associated with favourable changes

in metabolic measurements such as body weight, and some

agents in the class have been shown to reduce the risk of

cardiovascular events [3–5]. However, some recent clinical

observational studies have raised questions as to the possible

association of treatment with incretin-based therapies, par-

ticularly with DPP-4i, and the risk of acute pancreatitis or

pancreatic cancer [6–17].

Although a number of cohort studies and meta-analyses

reported no association between incretin-based therapies and

risk of acute pancreatitis and pancreatic cancer [9,12,13,

15,16,18], other studies have reported an increased risk of

acute pancreatitis with such agents [8,11,14,17]. In a meta-

analysis based on pooled data from the SAVOR-TIMI 53,

EXAMINE and TECOS trials, Tkáč and Raz [8] reported a

significant increase in the incidence of acute pancreatitis

[odds ratio (OR): 1.79; 95% confidence intervals (CI) 1.13,

2.82] in people treated with DPP-4i when compared with

placebo. The observed increase in the absolute risk of acute

Correspondence to: Sanjoy Ketan Paul. E-mail: [email protected]

© 2018 Diabetes UK. Diabetic Medicine. DOI: 10.1111/dme.13835

http://orcid.org/0000-0003-0848-7194

pancreatitis with DPP-4i therapy was 0.13%. In a cohort

study based on real-world primary care data from the UK,

Knapen and colleagues [17] reported a 1.5-fold increased risk

of any pancreatitis in incretin-based therapy users compared

with other non-insulin anti-hyperglycaemic drug users.

However, another study by Knapen and colleagues [16],

based on the same database, reported no association of

incretin-based therapies with the risk of pancreatic cancer.

Previously published cohort studies have generally assessed

pancreatic risk by comparing rates of pancreatic diseases in

users of incretin-based therapies with rates in users of any non-

incretin based anti-hyperglycaemic regimen. However, no

prior study of adequate size and duration has holistically

evaluated the risks of acute pancreatitis and pancreatic cancer

with incretin-based therapy compared with use of other

specific anti-hyperglycaemic drug classes. Furthermore, pre-

vious publications based on real-world data, in which baseline

risks differ significantly and are modified over time by

contrasting confounders, may not have utilized optimal

analytical approaches to assess risk. Using extensive person-

level longitudinal data from ambulatory and primary care

systems in the USA, the aims of this exploratory outcome study

were to evaluate the rates and risks of developing acute

pancreatitis, other diseases of the pancreas or pancreatic

cancer in people with metformin-treated Type 2 diabetes who

initiated second-line anti-hyperglycaemic therapy with a DPP-

4i, GLP-1RA, sulfonylurea, thiazolidinedione or insulin.

Materials and methods

Data source

Centricity Electronic Medical Record (CEMR) of the USA

represents a variety of ambulatory and primary care medical

practices, including solo practitioners, community clinics,

academic medical centres and large integrated delivery

networks. Over 35 000 physicians and other providers from

all US states contribute to the CEMR, wherein ~ 75% are

primary care providers. The database is generally represen-

tative of the US population: the diabetes prevalence (7.1% of

people with diabetes identified by diagnostic codes) is similar

to the US National Diabetes Statistics (6.7% diagnosed

diabetes in 2014) [19]. CEMR has been used extensively for

academic research worldwide [20,21].

For more than 34 million individuals, longitudinal EMRs

were available from 1995 to April 2016. This database

contains comprehensive person-level information on demo-

graphics, anthropometric, clinical and laboratory variables

including age, sex, ethnicity, smoking status, and longitudi-

nal measures of body weight, BMI, blood pressure, HbA1c,

full lipid profiles, urine albumin and creatinine, and serum

creatinine.

Medication data include brand names and doses for

individual medications prescribed (RxNorm), along with

start/stop dates and specific fields to track treatment alter-

ations. This data set also contains self-reported medications,

including prescriptions received outside the EMR network and

over-the-counter medications. All disease events along with

dates are coded with International Classification of Diseases

(ICD)-9, ICD-10 or SNOMED Clinical Terms (CT) codes.

Study design and study data

All individuals with a diagnosis of Type 2 diabetes were

included in this study with the conditions of no missing data

for age, sex or ethnicity; age ≥ 18 and < 80 years at the

diagnosis of Type 2 diabetes; and date of diagnosis of Type 2

diabetes after EMR registration date and after 1 January

2005. All those included also first began anti-hyperglycaemic

therapy with metformin, followed by second-line additional

treatment with a DPP-4i, GLP-1RA, insulin, thiazolidine-

dione or sulfonylurea for ≥ 3 months. Users of second-line

insulin, thiazolidinedione or sulfonylurea who had ever

received a DPP-4i or GLP-1RA were excluded, as were those

with other diseases of the pancreas or any type of cancer that

occurred prior to initiation of second-line anti-hyperglycae-

mic drug (index date).

Baseline (index date) data included age, sex, ethnicity,

body weight, BMI and blood pressure at the time of second

anti-hyperglycaemic drug initiation. Baseline HbA1c was

obtained as the closest observation to second drug initiation

within a [−3, +3] month window. Body weight, BMI, SBP

and lipids were calculated as the average of available

measurements within [−3, +3] months of baseline. Obesity

was defined as BMI ≥ 30 kg/m2.
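
The baseline rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's own code: the function names, the (date, value) input format and the approximation of 3 months as 91 days are all assumptions made here.

```python
from datetime import date, timedelta
from statistics import mean

# Assumption: "3 months" is approximated as 91 days either side of the index date.
WINDOW = timedelta(days=91)


def in_window(measure_date, index_date, window=WINDOW):
    """True if the measurement falls within [-3, +3] months of the index date."""
    return abs(measure_date - index_date) <= window


def closest_in_window(measurements, index_date):
    """Value recorded closest to the index date within the window (e.g. HbA1c)."""
    candidates = [
        (abs(d - index_date), value)
        for d, value in measurements
        if in_window(d, index_date)
    ]
    if not candidates:
        return None  # baseline value is missing
    return min(candidates, key=lambda c: c[0])[1]


def mean_in_window(measurements, index_date):
    """Average of all values within the window (e.g. weight, BMI, SBP, lipids)."""
    values = [value for d, value in measurements if in_window(d, index_date)]
    return mean(values) if values else None


def is_obese(bmi):
    """Obesity as defined in the text: BMI >= 30 kg/m2."""
    return bmi is not None and bmi >= 30.0


index = date(2010, 6, 1)
hba1c = [(date(2010, 1, 15), 9.1), (date(2010, 5, 20), 8.2), (date(2010, 7, 30), 7.9)]
bmi = [(date(2010, 4, 1), 31.0), (date(2010, 6, 15), 33.0), (date(2011, 1, 1), 40.0)]
baseline_hba1c = closest_in_window(hba1c, index)
baseline_bmi = mean_in_window(bmi, index)
```

In this toy example the January HbA1c and the 2011 BMI fall outside the window and are ignored.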

The presence of comorbidities prior/post index date and

the time to such events were also obtained. Acute pancreati-

tis, other diseases of the pancreas, cancer, cardiovascular

disease, chronic kidney disease and hypertension were

identified. Cancer was defined as any malignant neoplasm

or carcinoma in situ. Cancer of the pancreas was additionally

separated. Other diseases of the pancreas included specified

What’s new?

• Association of treatment with incretin-based therapies

and the risk of pancreatic diseases remains controver-

sial. However, no study explored the comparative

safety of different anti-hyperglycaemic drugs in this

context.

• This study provides a holistic population-level compar-

ative outcome evaluation of the risk of pancreatic

diseases from the time of receiving different second-line

anti-hyperglycaemic drugs post metformin.

• Although treatment with incretin-based therapies was

not found to be associated with an increased risk of

pancreatic diseases, people treated with insulin experi-

enced higher risk of such diseases.


(e.g. pancreatic cyst) and unspecified diseases of the pancreas

with appropriate clinical codes. Cardiovascular disease was

defined as ischaemic heart disease (includes myocardial

infarction), peripheral vascular/artery disease, heart failure

or stroke.

Tobacco use status included data on the use of cigars, pipe,

cigarettes, chewing tobacco, snuff and smokeless tobacco.

Occasional smokers were classified as ‘current’. In case of

discordant same-day statuses, priority was given to ‘current’,

then to ‘former’ and lastly to ‘never’ status. The last status

recorded prior to the index date was retained as tobacco use

status. Complete information on anti-hyperglycaemic drugs,

along with non-steroidal anti-inflammatory drugs, lipid-

modifying drugs, anti-hypertensive and cardioprotective

medications was obtained. Cardioprotective medications

included beta-blocking agents, angiotensin-converting

enzyme inhibitors, angiotensin II antagonists and statins.
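
The tobacco-status rule described in this paragraph can be sketched as follows. The status labels, the record format and the treatment of "prior to index date" as strictly before that date are assumptions for illustration; the ‘unknown’ fallback mirrors the unknown-status category reported in Table 1.

```python
from datetime import date

# Lower number = higher priority when several statuses are recorded on one day.
PRIORITY = {"current": 0, "former": 1, "never": 2}


def normalise(status):
    """Occasional smokers are classified as 'current', as stated in the text."""
    return "current" if status == "occasional" else status


def tobacco_status_at_index(records, index_date):
    """Return the last status recorded before the index date.

    records: iterable of (date, status) pairs (hypothetical input format).
    Same-day conflicts are resolved as current > former > never.
    """
    prior = [
        (d, normalise(s))
        for d, s in records
        if d < index_date and normalise(s) in PRIORITY
    ]
    if not prior:
        return "unknown"
    last_day = max(d for d, _ in prior)
    same_day = [s for d, s in prior if d == last_day]
    return min(same_day, key=PRIORITY.__getitem__)


records = [
    (date(2009, 1, 1), "never"),
    (date(2010, 3, 1), "former"),
    (date(2010, 3, 1), "occasional"),
    (date(2011, 1, 1), "never"),  # after the index date, so ignored
]
status = tobacco_status_at_index(records, date(2010, 6, 1))
```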

Ethical approval

Research involved existing data, in which individuals could

not be identified directly or through identifiers linked to

them. Thus, according to the US Department of Health and

Human Services Exemption 4 (45 CFR 46.101(b)(4)), this study

is exempt from ethics approval from an institutional review

board and informed consent.

Statistical methods

All analyses were performed by class of second-line anti-

hyperglycaemic drugs. Basic statistics were presented using

number (%), mean (SD) or median [interquartile range

(IQR)], as appropriate. The event rates per 1000 person-

years (95% CI) were estimated for acute pancreatitis, other

diseases of the pancreas and pancreatic cancer using the standard

life-table method.
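
As a rough illustration of what a rate per 1000 person-years expresses, the sketch below computes a crude rate with an approximate 95% CI. It is a deliberate simplification, not the life-table method used in the study: the CI relies on a normal approximation on the log-rate scale, and the function name is hypothetical.

```python
from math import exp, sqrt


def rate_per_1000_person_years(events, person_years, z=1.96):
    """Crude event rate per 1000 person-years with an approximate 95% CI.

    Assumed simplification: standard error of log(rate) = 1 / sqrt(events).
    This is not the life-table estimator named in the text.
    """
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = 1000.0 * events / person_years
    factor = exp(z / sqrt(events))
    return rate, rate / factor, rate * factor


# Illustration only: person-years approximated as cohort size x mean follow-up,
# using the overall figures reported in the Results (225 898 people, 3.2 years,
# 1049 acute pancreatitis events).
approx_person_years = 225_898 * 3.2
rate, low, high = rate_per_1000_person_years(1049, approx_person_years)
print(f"{rate:.2f} ({low:.2f}, {high:.2f}) per 1000 person-years")
```

With these approximate inputs the crude overall rate comes out at roughly 1.45 per 1000 person-years, of the same order as the group-specific rates reported in the Results.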

In the presence of significant differences in risk factors

between comparative treatment groups in observational

studies, standard Cox regression survival models after

propensity score adjustments are often used. Estimation of

hazard ratios is useful for population effects when they are

constant, which occurs when the treatment enters linearly,

and the distribution of the outcome has a proportional-

hazards form [22]. However, decisions on therapeutic

introductions and modifications are neither linear nor

conform to proportional-hazards form in the context of risk

of an event. Given the observational nature of the study, with

high likelihood of inherent differences in the comparator

treatment groups, we used a ‘treatment effect’ modelling

approach [23–25]. The parametric gamma time-to-event

model with inverse probability-weighted regression adjust-

ment for the confounders was used to evaluate the adjusted

mean (95% CI) time to event for the reference treatment

group (DPP-4i) and the adjusted time difference (95% CI) to

the occurrence of an event in the comparator treatment groups.

The robustness of choosing the gamma distribution

was tested on the basis of information criteria

estimates. Risk analyses were balanced on age and the

follow-up time by treatment groups, and were adjusted for

age, sex, smoking status, BMI and diabetes duration,

following the weighted propensity-score approach. Survival

time was computed as time to event (acute pancreatitis, other

diseases of the pancreas or pancreatic cancer) if an event

occurred, otherwise as time to the end of follow-up (date of a

person’s last available record within the database). The

robustness of risk modelling with balancing factors was

evaluated by estimating the weighted standardised differ-

ences in these factors by treatment groups.
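
Two pieces of the approach above lend themselves to a short sketch: the survival-time definition (time to event, otherwise time to the last available record) and the weighted standardised difference used as a balance check. The code below is illustrative only; the function names, the use of 365.25 days per year and the pooled-SD form of the standardised difference are assumptions, and the inverse probability weights are taken as given rather than estimated.

```python
from datetime import date
from math import sqrt


def survival_time_years(index_date, event_date, last_record_date):
    """Return (time in years, event observed?) for one person.

    Time runs to the event if one occurred, otherwise to the end of follow-up
    (date of the last available record). 365.25 days per year is an assumption.
    """
    if event_date is not None and event_date <= last_record_date:
        end, observed = event_date, True
    else:
        end, observed = last_record_date, False
    return (end - index_date).days / 365.25, observed


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_variance(values, weights):
    m = weighted_mean(values, weights)
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / sum(weights)


def weighted_standardised_difference(x_a, w_a, x_b, w_b):
    """Difference in weighted means divided by the pooled weighted SD.

    Values near 0 suggest the factor is balanced between the two groups.
    Assumes the pooled SD is non-zero.
    """
    pooled_sd = sqrt((weighted_variance(x_a, w_a) + weighted_variance(x_b, w_b)) / 2)
    return (weighted_mean(x_a, w_a) - weighted_mean(x_b, w_b)) / pooled_sd


censored = survival_time_years(date(2010, 1, 1), None, date(2012, 1, 1))
with_event = survival_time_years(date(2010, 1, 1), date(2011, 1, 1), date(2012, 1, 1))
age_diff = weighted_standardised_difference([50, 60], [1, 1], [60, 70], [1, 1])
```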

Two sensitivity analyses were conducted to evaluate the

robustness of the risk analyses in two sub-cohorts: (i) in all

people from the study cohort excluding those with acute

pancreatitis, other diseases of the pancreas or any type of

cancer within 6 months of the index date (sub-cohort 1); and

(ii) in all people from the study cohort with non-missing

baseline HbA1c (sub-cohort 2). Sub-cohort 2 was addition-

ally balanced on HbA1c and body weight for risk analyses.
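
The two sub-cohort definitions amount to simple filters on the main cohort. The sketch below assumes a hypothetical per-person record with the time to the first qualifying event and the baseline HbA1c; the field names and the use of 0.5 years for 6 months are illustrative choices.

```python
from dataclasses import dataclass
from typing import Optional

SIX_MONTHS_IN_YEARS = 0.5  # assumption: 6 months expressed as half a year


@dataclass
class Person:
    person_id: int
    # Years from index date to the first acute pancreatitis, other pancreatic
    # disease or any cancer; None if no such event was recorded.
    years_to_first_event: Optional[float]
    baseline_hba1c: Optional[float]


def sub_cohort_1(cohort):
    """Exclude people with a qualifying event within 6 months of the index date."""
    return [
        p for p in cohort
        if p.years_to_first_event is None
        or p.years_to_first_event > SIX_MONTHS_IN_YEARS
    ]


def sub_cohort_2(cohort):
    """Keep people with a non-missing baseline HbA1c."""
    return [p for p in cohort if p.baseline_hba1c is not None]


cohort = [
    Person(1, None, 7.9),
    Person(2, 0.3, 8.4),   # event within 6 months: dropped from sub-cohort 1
    Person(3, 2.1, None),  # missing HbA1c: dropped from sub-cohort 2
]
ids_1 = [p.person_id for p in sub_cohort_1(cohort)]
ids_2 = [p.person_id for p in sub_cohort_2(cohort)]
```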

Results

From 2 624 954 people identified as having Type 2 diabetes,

225 898 met the inclusion criteria for the study (Fig. 1). At

the index date, participants had a mean (SD) age of 59

(12) years, 49% were men, 69% had White European

ancestry, and had an overall mean follow-up time of

3.2 years. Anti-hyperglycaemic drug groups as defined

included 22% (n = 50 095) using DPP-4i; 6%

(n = 12 654) using GLP-1RA; 15% (n = 34 805) using

insulin; 49% (n = 110 747) using sulfonylurea; and 8%

(n = 17 597) using thiazolidinedione (Table 1). Follow-up

time of those in most of the treatment groups was similar,

except for those in the thiazolidinedione group which had a

longer mean follow-up of 4.6 years. The distributions of age,

diabetes duration and HbA1c were significantly different

between groups, as expected. The proportions of people

adding or moving to a third-line anti-hyperglycaemic drug

were similar in those receiving incretin-based therapy (47%

in both the DPP-4i and GLP-1RA groups), although other

groups had significantly lower proportions. The non-incretin

groups could not have received incretin-based therapies

during follow-up, by design.

Risk of acute pancreatitis

Only 1049 (0.46%) people developed acute pancreatitis

during the mean 3.2 years of follow-up (Table 1). The rates

per 1000 person-years of acute pancreatitis were similar in

the DPP-4i (1.31; 95% CI 1.21, 1.59), GLP-1RA (1.49; 1.16,

1.92) and sulfonylurea (1.45; 1.33, 1.58) groups, whereas

those treated with insulin had significantly higher acute

pancreatitis rate (2.01; 1.75, 2.31) and those treated with



thiazolidinedione had significantly lower acute pancreatitis

rate (0.89; 0.70, 1.12) compared with DPP-4i group

(Table 2). The adjusted mean (95% CI) time to acute

pancreatitis in people treated with DPP-4i was 2.63 (2.38,

2.88) years. Those treated with insulin were likely to develop

acute pancreatitis 0.48 years (P < 0.01) earlier. The adjusted

mean time to acute pancreatitis was similar in other

treatment groups.

Risk of cancer of the pancreas

In the cohort, 357 (0.16%) people developed cancer of the

pancreas, and there was no significant difference in the rate of

pancreatic cancer per 1000 person-years among the treatment

groups. Among those with acute pancreatitis, 17 (2%)

developed pancreatic cancer, of whom four and two individuals

belonged to the DPP-4i and GLP-1RA groups, respectively.

The adjusted mean (95% CI) time to outcome in the DPP-4i

group was 2.70 (2.19, 3.21) years, with no significant

differences in time to event in the insulin, sulfonylurea and

thiazolidinedione groups (Table 2). However, people treated

with GLP-1RA were likely to develop pancreatic cancer

~ 3 years later (P < 0.01) compared with the DPP-4i group.

Risk of other diseases of the pancreas

Only 0.33% (n = 752) of the cohort experienced other

diseases of the pancreas during follow-up. Among those with

Patients with non-missing sex and age (n = 34 299 123)

Diabetes mellitus (n = 2 893 321)

Type 2 diabetes (n = 2 624 954)

Age at diagnosis ≥ 18 and < 80 (n = 2 590 853)

Diabetes diagnosis after entry to the EMR (n = 1 412 938)

Diabetes diagnosis on or after Jan 1 2005 (n = 1 305 686)

Metformin as first line (n = 740 478)

Initiated second line (n = 357 482): DPP-4i (n = 61 508); GLP-1RA (n = 15 448); INS (n = 49 939); TZD (n = 33 021); SU (n = 187 819)

No record of acute pancreatitis, or other disease of pancreas, or any type of cancer prior to second-line initiation (n = 320 754): DPP-4i (n = 56 327); GLP-1RA (n = 14 498); INS (n = 45 936); TZD (n = 30 856); SU (n = 173 137)

On treatment for at least 3 months (n = 289 434): DPP-4i (n = 50 095); GLP-1RA (n = 12 654); INS (n = 40 846); TZD (n = 28 337); SU (n = 157 502)

MAIN COHORT: No DPP-4i or GLP-1RA ever taken in TZD, INS, or SU groups (n = 225 898): DPP-4i (n = 50 095); GLP-1RA (n = 12 654); INS (n = 34 805); TZD (n = 17 597); SU (n = 110 747)

SUB-COHORT 1: No record of acute pancreatitis, or other disease of pancreas, or any type of cancer 6 months post second-line initiation (n = 221 882): DPP-4i (n = 49 419); GLP-1RA (n = 12 518); INS (n = 34 098); TZD (n = 17 294); SU (n = 108 553)

SUB-COHORT 2: Non-missing HbA1c at the time of second-line initiation (n = 131 482): DPP-4i (n = 31 618); GLP-1RA (n = 7 580); INS (n = 18 924); TZD (n = 9 274); SU (n = 64 086)

FIGURE 1 Flow-chart of the study cohort. EMR, electronic medical record; DPP-4i, dipeptidyl peptidase-4 inhibitor; GLP-1RA, glucagon-like peptide-1 receptor agonist; INS, insulin; SU, sulfonylurea; TZD, thiazolidinedione.
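
The counts in the flow chart can be checked arithmetically: at each step, the five drug-group counts should add up to the step total. The short sketch below does this. The step labels are abbreviated, and the GLP-1RA count assigned to each step is an assumption, chosen as the value from the figure that makes that step reconcile.

```python
# Step total and per-group counts, as read from Figure 1.
# Assumption: each GLP-1RA count is matched to the step it reconciles with.
steps = {
    "No prior pancreatic disease or cancer": (
        320_754,
        {"DPP-4i": 56_327, "GLP-1RA": 14_498, "INS": 45_936, "TZD": 30_856, "SU": 173_137},
    ),
    "On treatment for at least 3 months": (
        289_434,
        {"DPP-4i": 50_095, "GLP-1RA": 12_654, "INS": 40_846, "TZD": 28_337, "SU": 157_502},
    ),
    "Main cohort": (
        225_898,
        {"DPP-4i": 50_095, "GLP-1RA": 12_654, "INS": 34_805, "TZD": 17_597, "SU": 110_747},
    ),
    "Sub-cohort 1": (
        221_882,
        {"DPP-4i": 49_419, "GLP-1RA": 12_518, "INS": 34_098, "TZD": 17_294, "SU": 108_553},
    ),
    "Sub-cohort 2": (
        131_482,
        {"DPP-4i": 31_618, "GLP-1RA": 7_580, "INS": 18_924, "TZD": 9_274, "SU": 64_086},
    ),
}


def unattributed(total, groups):
    """People in the step total who are not counted in any listed drug group."""
    return total - sum(groups.values())


gaps = {name: unattributed(total, groups) for name, (total, groups) in steps.items()}

# The earlier "Initiated second line" box does not reconcile in the same way.
initiated_gap = unattributed(
    357_482,
    {"DPP-4i": 61_508, "GLP-1RA": 15_448, "INS": 49_939, "TZD": 33_021, "SU": 187_819},
)
```

All five steps from the exclusion of prior pancreatic disease or cancer onwards reconcile exactly. The "Initiated second line" box leaves 9 747 people outside the five listed groups; the figure does not say how these are accounted for.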




Table 1 Basic characteristics at the time of second-line anti-hyperglycaemic therapy initiation (index date)

Characteristic | Dipeptidyl peptidase-4 inhibitor | Glucagon-like peptide-1 receptor agonist | Insulin | Sulfonylurea | Thiazolidinedione | Total
N | 50 095 | 12 654 | 34 805 | 110 747 | 17 597 | 225 898
Age, years* | 58 (12) | 53 (12) | 57 (13) | 60 (12) | 59 (12) | 59 (12)
Men† | 24 034 (48) | 4346 (34) | 16 302 (47) | 57 876 (52) | 9174 (52) | 111 732 (49)
White European ancestry† | 34 989 (70) | 9613 (76) | 23 229 (67) | 76 430 (69) | 11 791 (67) | 156 052 (69)
Black† | 5852 (12) | 1083 (9) | 5083 (15) | 12 971 (12) | 1581 (9) | 26 570 (12)
Current tobacco use† | 4872 (10) | 979 (8) | 3929 (11) | 9980 (9) | 797 (5) | 20 557 (9)
Former tobacco use† | 8086 (16) | 1989 (16) | 5729 (16) | 16 790 (15) | 1232 (7) | 33 826 (15)
Never tobacco use† | 15 265 (30) | 3839 (30) | 8828 (25) | 26 797 (24) | 2482 (14) | 57 211 (25)
Tobacco use – unknown status† | 21 872 (44) | 5847 (46) | 16 319 (47) | 57 180 (52) | 13 086 (74) | 114 304 (51)
Diabetes duration prior to index date, months* | 14.35 (21.47) | 13.50 (20.85) | 7.76 (16.61) | 11.13 (19.92) | 6.34 (14.27) | 11.09 (19.64)
Treatment duration, months* | 26.22 (20.07) | 24.85 (20.34) | 31.15 (24.77) | 31.59 (24.94) | 32.49 (26.13) | 30.03 (23.91)
Follow-up, years‡ | 2.54 (1.3, 4.11) | 2.67 (1.25, 4.67) | 2.27 (1.13, 3.99) | 2.67 (1.31, 4.6) | 4.33 (1.99, 6.79) | 2.66 (1.3, 4.54)
Follow-up, years* | 2.91 (1.96) | 3.24 (2.39) | 2.81 (2.13) | 3.24 (2.37) | 4.56 (2.89) | 3.20 (2.34)
HbA1c, mmol/mol‡ | 61 (52, 74) | 56 (48, 68) | 74 (56, 93) | 63 (53, 77) | 55 (48, 68) | 62 (52, 78)
HbA1c, %‡ | 7.7 (6.9, 8.9) | 7.3 (6.5, 8.4) | 8.9 (7.3, 10.7) | 7.9 (7.0, 9.2) | 7.2 (6.5, 8.4) | 7.8 (6.9, 9.3)
SBP, mmHg* | 130 (14) | 128 (13) | 131 (16) | 132 (16) | 130 (15) | 131 (15)
LDL, mmol/l* | 2.53 (0.91) | 2.48 (0.88) | 2.53 (0.96) | 2.53 (0.91) | 2.51 (0.88) | 2.53 (0.91)
Triglycerides, mmol/l‡ | 1.66 (1.22, 2.21) | 1.68 (1.22, 2.24) | 1.61 (1.15, 2.20) | 1.67 (1.22, 2.24) | 1.55 (1.12, 2.11) | 1.65 (1.21, 2.21)
Weight, kg* | 98 (24) | 108 (26) | 100 (26) | 97 (25) | 99 (24) | 99 (25)
BMI, kg/m2* | 34.4 (7.6) | 38.2 (8.3) | 35.2 (8.3) | 34.1 (7.7) | 34.6 (7.9) | 34.6 (7.9)
Obese† | 32 567 (70) | 10 201 (87) | 22 600 (71) | 67 899 (68) | 10 763 (70) | 144 030 (70)
Hypertension† | 28 063 (56) | 6675 (53) | 17 477 (50) | 60 207 (54) | 8434 (48) | 120 856 (54)
Cardiovascular disease† | 8796 (18) | 1531 (12) | 7745 (22) | 22 995 (21) | 2958 (17) | 44 025 (19)
Chronic kidney disease† | 1525 (3) | 229 (2) | 1129 (3) | 3910 (4) | 447 (3) | 7240 (3)
Received third anti-hyperglycaemic agent† | 23 318 (47) | 5986 (47) | 4482 (13) | 34 905 (32) | 7015 (40) | 75 706 (34)
Received cardio-protective medication† | 46 395 (93) | 11 336 (90) | 31 862 (92) | 103 553 (94) | 16 390 (93) | 209 536 (93)
Received non-steroidal anti-inflammatory drugs | 37 265 (74) | 9269 (73) | 25 121 (72) | 82 963 (75) | 13 207 (75) | 167 825 (74)
Received anti-hypertensive† | 3613 (7) | 824 (7) | 3821 (11) | 10 518 (10) | 1627 (9) | 20 403 (9)
Sub-cohort 1† | 49 419 (99) | 12 518 (99) | 34 098 (98) | 108 553 (98) | 17 294 (98) | 221 882 (98)
Sub-cohort 2† | 31 618 (63) | 7580 (60) | 18 924 (54) | 64 086 (58) | 9274 (53) | 131 482 (58)

Values are given as *mean (SD), †n (%) or ‡median (IQR).

Sub-cohort 1, patients from the study cohort excluding those with acute pancreatitis, other disease of pancreas or any type of cancer within 6 months of index date.

Sub-cohort 2, patients from the study cohort with non-missing HbA1c measure at index date.
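
Table 1 reports HbA1c in both mmol/mol and %. The two scales are linked by the standard IFCC-NGSP master equation, which is general background rather than something stated in the article. The sketch below applies it to the median % values in the table; the function name is hypothetical.

```python
def hba1c_percent_to_mmol_mol(percent):
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol).

    Uses the commonly quoted form of the IFCC-NGSP master equation:
    mmol/mol = 10.929 * (% - 2.15).
    """
    return 10.929 * (percent - 2.15)


# Median HbA1c (%) by group, in Table 1 column order:
# DPP-4i, GLP-1RA, insulin, sulfonylurea, thiazolidinedione, total.
medians_percent = [7.7, 7.3, 8.9, 7.9, 7.2, 7.8]
medians_mmol_mol = [round(hba1c_percent_to_mmol_mol(p)) for p in medians_percent]
```

The rounded results agree with the mmol/mol medians in the table (61, 56, 74, 63, 55 and 62).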




acute pancreatitis, 101 (10%) also had other diseases of the

pancreas. Except for the insulin group, the rate per 1000

person-years for other diseases of the pancreas was similar

across treatment groups (Table 2). The mean (95% CI) time

to develop other diseases of the pancreas in DPP-4i group

was 2.73 (2.33, 3.12) years, with no significant difference

from other treatment groups.

In all survival regression models, the balances for defined

confounders were achieved among the treatment groups. The

weighted standardized differences in confounding factors

across the treatment groups were similar while evaluating the

risk of acute pancreatitis, other diseases of the pancreas and

pancreatic cancer separately (Table S1). The sensitivity

analyses with sub-cohorts 1 and 2 showed similar estimates

of time to events. However, after removing those who

developed acute pancreatitis or other diseases of the pancreas

or any cancer within 6 months of the index date (sub-cohort

1), the adjusted time to acute pancreatitis among people

treated with insulin was no longer different from the other

groups, as otherwise observed in the primary cohort.

Discussion

A potential relationship between the use of incretin-based

anti-hyperglycaemic drugs and adverse pancreatic outcomes

has been suggested by various pre-clinical and clinical

studies. In particular, meta-analysis of data obtained from

cardiovascular safety trials of several DPP-4i medications

suggests that exposure to those medications significantly

increased the risk of acute pancreatitis compared with

placebo, although the absolute increase in risk was small

[8]. Interestingly, although people with Type 2 diabetes and

obesity are known to be at increased risk for pancreatitis and

pancreatic cancer, rates of those complications have been

low in recent clinical trials. Low rates of these outcomes of

interest, as well as the likely extended duration of exposure

and follow-up needed to ascertain a relationship between

anti-hyperglycaemic drug exposure and the development of

malignancy, pose significant challenges in determining the

pancreatic safety of drugs commonly used in diabetes

management.

In this retrospective, longitudinal, real-world study of a

large cohort of people with metformin-treated Type 2

diabetes, we analysed the effects of second-line anti-

hyperglycaemic drugs upon rates and times to clinically

important pancreatic complications. Our analyses have

shown that the rate of acute pancreatitis was higher in

the group treated with insulin, lower in the thiazolidine-

dione group, and similar in the groups receiving GLP-1RA

or sulfonylurea therapy when compared with the group

treated with DPP-4i. Time to development of pancreatic

cancer was longer in the GLP-1RA group than in DPP-4i

users, but did not differ significantly between any other

anti-hyperglycaemic drug group and the DPP-4i group.

Rates of any other disease of the pancreas were not

higher among people who received additional therapy with

a DPP-4i compared with other classes of anti-hyperglycae-

mic drugs.

The increased risk of pancreatitis in insulin users is perhaps

not surprising, because insulin users often tend to have a

greater burden of comorbidities and risks for adverse

outcomes that cannot be fully adjusted for in a retrospective

analysis. However, time to acute pancreatitis was no longer

significantly different between the insulin and DPP-4i groups

after removing individuals who developed acute pancreatitis

or other diseases of the pancreas or any cancer within

6 months of the index date (sub-cohort 1). The lower rate

per 1000 person-years of acute pancreatitis noted in

Table 2 Event rates (95% CI) per 1000 person-years; adjusted mean time to events (95% CI) in dipeptidyl peptidase-4 (DPP-4) group and adjusted difference in time to events in other treatment groups with DPP-4 inhibitor as a reference

                                 Acute pancreatitis        Pancreatic cancer       Other disease of pancreas
                                 (95% CI)                  (95% CI)                (95% CI)
DPP-4
  Rate per 1000 person-years     1.38 (1.21, 1.59)         0.46 (0.36, 0.59)       0.93 (0.78, 1.10)
  Mean time to event (years)     2.63 (2.38, 2.88)         2.70 (2.19, 3.21)       2.73 (2.33, 3.12)
Glucagon-like peptide-1 receptor agonist
  Rate per 1000 person-years     1.49 (1.16, 1.92)         0.17 (0.08, 0.36)       0.78 (0.55, 1.10)
  Time difference (years)        −0.18 (−0.72, 0.37)       3.00 (0.84, 5.16)*      0.52 (−0.60, 1.65)
Insulin
  Rate per 1000 person-years     2.01 (1.75, 2.31)         0.59 (0.46, 0.77)       1.48 (1.26, 1.75)
  Time difference (years)        −0.48 (−0.90, −0.06)*     −0.70 (−1.56, 0.17)     −0.49 (−1.01, 0.03)
Sulfonylurea
  Rate per 1000 person-years     1.45 (1.33, 1.58)         0.55 (0.47, 0.63)       1.04 (0.94, 1.15)
  Time difference (years)        −0.01 (−0.51, 0.50)       −0.57 (−1.26, 0.11)     −0.43 (−1.13, 0.28)
Thiazolidinedione
  Rate per 1000 person-years     0.89 (0.70, 1.12)         0.36 (0.25, 0.52)       0.85 (0.67, 1.08)
  Time difference (years)        −0.25 (−0.56, 0.05)       −0.09 (−0.74, 0.56)     −0.28 (−0.74, 0.18)

*P < 0.01.
†Analyses were balanced on age and the follow-up time by treatment groups, and were adjusted for age, sex, BMI, smoking status and diabetes duration.




thiazolidinedione users is perhaps more unexpected. Inter-

estingly, studies in rodent models suggest that rosiglitazone

exposure may limit the severity of pancreatitis and shorten

recovery from pancreatic inflammation in the setting of

induced pancreatic injury [26,27]. However, the thiazo-

lidinedione class is associated with a number of adverse side

effects that have significantly limited clinical use of these

medications [28]. The findings of this analysis provide

reassurance to prescribers and users of DPP-4i that these

medications do not significantly increase the risk of adverse

pancreatic outcomes compared with other commonly pre-

scribed second-line therapies.

The strengths of this analysis include the large number

of people who met the inclusion criteria for analysis, the

robust amount of data collected and the reasonable duration

of follow-up after addition of the second-line anti-hypergly-

caemic drug. Although the overall numbers of people expe-

riencing acute pancreatitis and pancreatic cancer were small,

the numbers exceed those available for inclusion in the

previously cited meta-analysis [8]. The unique and novel

aspects of this study include the careful choice of study cohort

without any history of pancreatic diseases or cancer. This

approach reduces the likelihood of events attributable to pre-

existing conditions such as biliary disease, structural pancre-

atic disorders or an autoimmune/genetic predisposition to

pancreatic diseases. The analyses also include a holistic

evaluation of the risks associated with use of different anti-

hyperglycaemic drugs rather than a one drug vs. all approach;

a detailed evaluation of the treatment patterns, ensuring non-

exposure to incretin-based therapies in other comparator

treatment groups; and careful choice of statistical methodol-

ogy to evaluate the risk while robustly addressing the

challenging issues of imbalances in important risk factors

and confounders across treatment groups. Our finding that

people treated with a DPP-4i are not at higher risk of

pancreatic diseases is robust, supported by the sensitivity

analyses in a large number of people with Type 2 diabetes.

Electronic databases do present challenges in terms of the

accuracy and completeness of the required data. As a result,

limitations of this study include non-availability of complete

and reliable longitudinal data related to medication adherence,

tobacco and alcohol consumption, socio-economic status and

potential residual confounders. In particular, the inability to

quantify alcohol exposure does not permit adjustment or

balancing for this known risk factor for pancreatitis. The

analyses were not adjusted for conditions such as hypertriglyc-

eridaemia, hypercalcaemia or non-anti-hyperglycaemic drug

exposures that have been associated with pancreatitis; how-

ever, these are considered responsible for only a small

percentage of overall cases of acute pancreatitis [29]. Further-

more, other known risk factors for pancreatic cancer including

dietary composition, inactivity or family history/genetic pre-

disposition are not available in routinely collected electronic

health records. There may also have been inherent medical,

socio-economic or other differences that affected the types of

anti-hyperglycaemic drug prescribed. These differences could

not be readily determined or adjusted for in the analyses, and

the impact of these individual characteristics upon pancreatic

outcomes is unknown. Future pharmaco-epidemiologic stud-

ies with longer-term follow up and more robust medication

exposure and outcomes ascertainment will further comple-

ment our understanding of the pancreatic safety of anti-

hyperglycaemic therapies.

Funding sources

None.

Competing interests

SKP has acted as a consultant and/or speaker for Novartis,

GI Dynamics, Roche, AstraZeneca, Guangzhou Zhongyi

Pharmaceutical and Amylin Pharmaceuticals LLC. He has

received grants in support of investigator and investigator-

initiated clinical studies from Merck, Novo Nordisk,

AstraZeneca, Hospira, Amylin Pharmaceuticals, Sanofi-Aventis

and Pfizer. JA has received honoraria for consultancies

and lectures from Novartis, Novo Nordisk, AstraZeneca,

Sanofi, Merck Sharp and Dohme, Abbott, Janssen

Cilag and Takeda. OM has no conflict of interest to declare.

JG has received research grants from AstraZeneca, Boehrin-

ger Ingelheim, GlaxoSmithKline, Intarcia and Sanofi, and

personal fees for consultative work from AstraZeneca,

Daiichi, Merck Sharp & Dohme, Novo Nordisk and

Boehringer-Ingelheim.

Acknowledgements

SKP, OM and JA acknowledge a grant provided by the Royal

Brisbane and Women’s Hospital Foundation. Melbourne EpiCentre

gratefully acknowledges support from the National

Health and Medical Research Council and the Australian

Government’s National Collaborative Research Infrastruc-

ture Strategy (NCRIS) initiative through Therapeutic Inno-

vation Australia. The authors are grateful to all contributors

in the CEMR database. OM gratefully acknowledges the

PhD scholarship from Queensland University of Technology,

Australia, and her co-supervisors Prof. Ross Young and Prof.

Louise Hafner of the same university.


Author contributions

SKP conceived the idea and was responsible for the primary

design of the study. OM, JA and JG contributed to the study

design. OM conducted the data extraction, and SKP and OM

jointly conducted the statistical analyses. The first draft of

the manuscript was developed by OM and SKP, and all

authors contributed to the finalisation of the manuscript.

SKP had full access to all the data in the study and is the

guarantor, taking responsibility for the integrity of the data




and the accuracy of the data analysis. Aggregated data are

available upon request.

References

1 Deacon CF, Mannucci E, Ahrén B. Glycaemic efficacy of glucagon-like peptide-1 receptor agonists and dipeptidyl peptidase-4 inhibitors as add-on therapy to metformin in subjects with type 2 diabetes—a review and meta analysis. Diabetes Obes Metab 2012; 14: 762–767.

2 Paul SK, Agbeve J, Maggs D, Best JH. Comparison of trajectories of self-monitored glucose levels by hypoglycemia status over 52 weeks of treatment with insulin glargine or exenatide once weekly. J Diabetes 2016; 8: 148–157.

3 American Diabetes Association. Standards of Medical Care in Diabetes—2018. Diabetes Care 2018; 41(Suppl 1): S4.

4 Paul SK, Klein K, Maggs D, Best JH. The association of the treatment with glucagon-like peptide-1 receptor agonist exenatide or insulin with cardiovascular outcomes in patients with type 2 diabetes: a retrospective observational study. Cardiovasc Diabetol 2015; 14: 10.

5 Wu S, Cipriani A, Yang Z, Yang J, Cai T, Xu Y et al. The cardiovascular effect of incretin-based therapies among type 2 diabetes: a systematic review and network meta-analysis. Expert Opin Drug Saf 2018; 17: 243–249.

6 Azoulay L, Filion KB, Platt RW, Dahl M, Dormuth CR, Clemens KK et al. Incretin based drugs and the risk of pancreatic cancer: international multicentre cohort study. BMJ 2016; 352: i581.

7 Azoulay L. Incretin-based drugs and adverse pancreatic events: almost a decade later and uncertainty remains. Diabetes Care 2015; 38: 951–953.

8 Tkáč I, Raz I. Combined analysis of three large interventional trials with gliptins indicates increased incidence of acute pancreatitis in patients with type 2 diabetes. Diabetes Care 2017; 40: 284–286.

9 Thomsen RW, Pedersen L, Møller N, Kahlert J, Beck-Nielsen H, Sørensen HT. Incretin-based therapy and risk of acute pancreatitis: a nationwide population-based case-control study. Diabetes Care 2015; 38: 1089–1098.

10 Faillie J-L, Azoulay L, Patenaude V, Hillaire-Buys D, Suissa S. Incretin based drugs and risk of acute pancreatitis in patients with type 2 diabetes: cohort study. BMJ 2014; 348: g2780.

11 Roshanov PS, Dennis BB. Incretin-based therapies are associated with acute pancreatitis: meta-analysis of large randomized controlled trials. Diabetes Res Clin Pract 2015; 110: e13–e17.

12 Wang T, Wang F, Gou Z, Tang H, Li C, Shi L et al. Using real-world data to evaluate the association of incretin-based therapies with risk of acute pancreatitis: a meta-analysis of 1 324 515 patients from observational studies. Diabetes Obes Metab 2015; 17: 32–41.

13 Li L, Shen J, Bala MM, Busse JW, Ebrahim S, Vandvik PO et al. Incretin treatment and risk of pancreatitis in patients with type 2 diabetes mellitus: systematic review and meta-analysis of randomised and non-randomised studies. BMJ 2014; 348: g2366.

14 Chou H-C, Chen W-W, Hsiao F-Y. Acute pancreatitis in patients with type 2 diabetes mellitus treated with dipeptidyl peptidase-4 inhibitors: a population-based nested case-control study. Drug Saf 2014; 37: 521–528.

15 Chang C-H, Lin J-W, Chen S-T, Lai M-S, Chuang L-M, Chang Y-C. Dipeptidyl peptidase-4 inhibitor use is not associated with acute pancreatitis in high-risk type 2 diabetic patients: a nationwide cohort study. Medicine 2016; 95: e2603.

16 Knapen L, van Dalem J, Keulemans Y, van Erp N, Bazelier M, De Bruin M et al. Use of incretin agents and risk of pancreatic cancer: a population-based cohort study. Diabetes Obes Metab 2016; 18: 258–265.

17 Knapen LM, de Jong RG, Driessen JH, Keulemans YC, van Erp NP, De Bruin ML et al. The use of incretin agents and risk of acute and chronic pancreatitis: a population-based cohort study. Diabetes Obes Metab 2017; 19: 401–411.

18 Chen H, Zhou X, Chen T, Liu B, Jin W, Gu H et al. Incretin-based therapy and risk of pancreatic cancer in patients with Type 2 diabetes mellitus: a meta-analysis of randomized controlled trials. Diabetes Ther 2016; 7: 725–742.

19 Centers for Disease Control and Prevention. National Diabetes Statistics Report: Estimates of Diabetes and its Burden in the United States, 2014. Atlanta, GA: US Department of Health and Human Services, 2014.

20 Crawford AG, Cote C, Couto J, Daskiran M, Gunnarsson C, Haas K et al. Comparison of GE Centricity Electronic Medical Record database and National Ambulatory Medical Care Survey findings on the prevalence of major conditions in the United States. Popul Health Manag 2010; 13: 139–150.

21 Paul SK, Shaw J, Montvida O, Klein K. Weight gain in insulin-treated patients by body mass index category at treatment initiation: new evidence from real-world data in patients with type 2 diabetes. Diabetes Obes Metab 2016; 18: 1244–1252.

22 ElHafeez SA, Torino C, D’Arrigo G, Bolignano D, Provenzano F, Mattace-Raso F et al. An overview on standard statistical methods for assessing exposure–outcome link in survival analysis (Part II): the Kaplan-Meier analysis and the Cox regression method. Aging Clin Exp Res 2012; 24: 203–206.

23 Rotnitzky A, Robins JM. Inverse probability weighting in survival analysis. Encyclopedia of Biostatistics. Chichester: Wiley, 2005.

24 Austin PC, Stuart EA. The performance of inverse probability of treatment weighting and full matching on the propensity score in the presence of model misspecification when estimating the effect of treatment on survival outcomes. Stat Methods Med Res 2017; 26: 1654–1670.

25 Austin PC. Variance estimation when using inverse probability of treatment weighting (IPTW) with survival analysis. Stat Med 2016; 35: 5642–5655.

26 Pini M, Rhodes DH, Castellanos KJ, Cabay RJ, Grady EF, Fantuzzi G. Rosiglitazone improves survival and hastens recovery from pancreatic inflammation in obese mice. PLoS One 2012; 7: e40944.

27 Wan H, Yuan Y, Liu J, Chen G. Pioglitazone, a PPAR-γ activator, attenuates the severity of cerulein-induced acute pancreatitis by modulating early growth response-1 transcription factor. Transl Res 2012; 160: 153–161.

28 Woodcock J, Sharfstein JM, Hamburg M. Regulatory action on rosiglitazone by the US Food and Drug Administration. N Engl J Med 2010; 363: 1489–1491.

29 Forsmark CE, Vege SS, Wilcox CM. Acute pancreatitis. N Engl J Med 2016; 375: 1972–1981.

Supporting Information

Additional supporting information may be found online in

the Supporting Information section at the end of the article.

Table S1. Weighted standardized differences in balanced

groups.


