Statistical approaches for
understanding the aetiology
of psoriatic arthritis:
genetics, environment and
comorbidities
A thesis submitted to The University of Manchester for the degree of
Doctor of Philosophy
in the Faculty of Biology, Medicine and Health
2018
Eftychia Bellou
School of Biological Sciences
Division of Musculoskeletal and Dermatological Sciences
Table of Contents
List of Tables..................................................................................................................................................... 7
List of Figures ................................................................................................................................................. 10
Abbreviations.................................................................................................................................................. 12
Abstract ........................................................................................................................................................... 15
Declaration...................................................................................................................................................... 17
Copyright Statement .................................................................................................................................... 18
Publications Arising During this PhD ........................................................................................................ 19
About the Author.......................................................................................................................................... 21
Dedication ....................................................................................................................................................... 22
Acknowledgements ....................................................................................................................................... 22
Introduction .................................................................................................................................................... 24
1.1 Common complex diseases ...................................................................................................... 24
1.2 Epidemiology in complex diseases .......................................................................................... 25
1.2.1 Risk factors .......................................................................................................................... 25
1.2.2 Study designs ....................................................................................................................... 26
1.2.3 Causation versus association .......................................................................................... 29
1.2.4 Bias ........................................................................................................................................ 29
1.2.5 Basic epidemiological concepts ....................................................................................... 31
1.2.6 Observational epidemiology: Investigating environmental/lifestyle factors in
complex diseases ................................................................................................................................. 36
1.2.7 Genetic epidemiology: Investigating the genetic basis of complex diseases ........ 38
1.3 Investigating risk factors for PSO and PsA ............................................................................ 44
1.3.1 Epidemiology of PSO and PsA ........................................................................................ 45
1.3.2 Clinical manifestations of the diseases .......................................................................... 46
1.3.3 Immunopathogenesis of PSO and PsA .......................................................................... 58
1.3.4 Comorbid diseases ............................................................................................................ 60
1.3.5 Environmental risk factors for PsA ................................................................................ 73
1.3.6 Genetic risk factors for PSO and PsA .......................................................................... 81
1.4 Overall aims and objectives ...................................................................................................... 95
1.5 Outline of thesis .......................................................................................................................... 95
Environmental risk factors .......................................................................................................................... 97
2.1 Introduction ................................................................................................................................. 97
2.1.1 UK Biobank ......................................................................................................................... 98
2.2 Aims and objectives .................................................................................................................. 101
2.2.1 Aims and objectives of first study ................................................................................ 101
2.2.2 Aim and objectives of second study ............................................................................ 101
2.3 Contribution of the candidate ............................................................................................... 102
2.4 Methods ...................................................................................................................................... 103
2.4.1 Identifying lifestyle factors and comorbidities associated with PSO without
arthritis and PsA compared to the general population ............................................................ 103
2.4.2 Comorbidities in rheumatic diseases and their effect on physical activity ........ 114
2.5 Results ......................................................................................................................................... 119
2.5.1 Identifying lifestyle factors and comorbidities associated with PSO without
arthritis and PsA compared to the general population in the UK Biobank ........................ 119
2.5.2 Comorbidities in rheumatic diseases and their effect on physical activity ........ 130
2.6 Discussion ................................................................................................................................... 139
2.6.1 Review of objectives ....................................................................................................... 140
2.6.2 Study design ...................................................................................................................... 148
2.6.3 Conclusion ........................................................................................................................ 149
Genetics of PsA ........................................................................................................................................... 151
3.1 Introduction ............................................................................................................................... 151
3.2 Aims and Objectives ................................................................................................................ 153
3.3 Contribution of the candidate ............................................................................................... 153
3.4 Methods ...................................................................................................................................... 154
3.4.1 GWAS summary statistics datasets............................................................................. 154
3.4.2 Pre-processing .................................................................................................................. 155
3.4.3 Statistical analysis ............................................................................................................. 157
3.5 Results .......................................................................................................................................... 163
3.5.1 Genetic overlap between the diseases ....................................................................... 163
3.5.2 cFDR analysis .................................................................................................................... 164
3.5.3 MTAG ................................................................................................................................. 169
3.5.4 Subset-based analysis (ASSET) ........................................................................................ 174
3.6 Discussion ................................................................................................................................... 180
Mendelian Randomization.......................................................................................................................... 187
4.1 Introduction ................................................................................................................................ 187
4.1.1 General Overview of MR ............................................................................................... 188
4.2 Aims and objectives .................................................................................................................. 197
4.2.1 Aim ...................................................................................................................................... 197
4.2.2 Objectives .......................................................................................................................... 197
4.3 Contribution of the candidate ............................................................................................... 197
4.4 Methods ....................................................................................................................................... 198
4.4.1 Data sources and choice of IVs .................................................................................... 198
4.4.2 Statistical analysis ............................................................................................................. 200
4.5 Results .......................................................................................................................................... 201
4.5.1 Effect of BMI upon PsA and vice versa ....................................................................... 201
4.5.2 Effect of smoking initiation upon PsA and vice versa .............................................. 208
4.5.3 Effect of alcohol frequency consumption upon PsA and vice versa ..................... 209
4.6 Discussion ................................................................................................................................... 210
4.6.1 Strengths and weaknesses of the study ...................................................................... 211
4.6.2 Future work ...................................................................................................................... 212
4.6.3 Conclusion ......................................................................................................................... 213
Discussion of thesis ..................................................................................................................................... 215
5.1 Conclusion .................................................................................................................................. 218
References ..................................................................................................................................................... 219
Appendix........................................................................................................................................................ 249
List of Tables
Table 1 | Advantages and disadvantages of the main observational study designs ............... 28
Table 2 | Types of variables used in epidemiology .............................................................................. 31
Table 3 | Types of studies in genetic epidemiology and their use ................................................. 38
Table 4 | Characteristics of the screening tools at their development phase ............................ 53
Table 5 | Comparison of psoriatic arthritis screening tools by different studies .................... 57
Table 6 | Cardiovascular events in psoriasis and psoriatic arthritis ............................................. 62
Table 7 | Hypertension in psoriasis and psoriatic arthritis .............................................................. 64
Table 8 | Obesity in psoriasis and psoriatic arthritis ........................................................................... 66
Table 9 | Liver disease in psoriasis and PsA ............................................................................................ 68
Table 10 | Chronic obstructive pulmonary disease in psoriasis patients ................................... 69
Table 11 | Psychological disorders in patients with psoriasis and psoriatic arthritis .......... 71
Table 12 | Other environmental factors associated with psoriasis and psoriatic arthritis . 77
Table 13 | Twin studies conducted to establish the genetic basis of psoriasis......................... 82
Table 14 | Epidemiological studies estimating familial aggregation in psoriatic arthritis .. 82
Table 15 | Non-MHC PSO susceptibility loci identified by association studies in the
European population (Adapted from (Ray-Jones, Eyre et al. 2016)) .............................................. 85
Table 16 | Non-MHC PSO susceptibility loci identified by association studies in the Chinese
population (Adapted from (Ray-Jones, Eyre et al. 2016)) .................................................................... 89
Table 17 | Data collection of lifestyle factors by the UK Biobank and their categorisation for
the current study ..............................................................................................................................................105
Table 18 | Methods for controlling confounding effects in statistical modelling ..................108
Table 19 | Morbidities with their codes included in the current study and categorisation
used ........................................................................................................................................................................111
Table 20 | Baseline characteristics of the study populations .........................................................120
Table 21 | Adjusted analysis for identifying the exposures that were associated with
disease status .....................................................................................................................................................122
Table 22 | Association between lifestyle/environmental factors and disease status (final,
multivariable analysis)...................................................................................................................................124
Table 23 | Univariate regression analysis investigating the association of prevalent
comorbidities with disease status .............................................................................................................127
Table 24 | Multivariable regression analysis investigating the association of prevalent
comorbidities with disease status .............................................................................................................128
Table 25 | Baseline characteristics of the cohorts .............................................................................. 131
Table 26 | Prevalence of comorbidities in participants with a rheumatic disease ............... 134
Table 27 | Prevalence of comorbidities in participants with a rheumatic disease (self-
reported rheumatic disease and use of a DMARD) ............................................................................ 135
Table 28 | Association between comorbidities and physical activity in participants with a
rheumatic disease ............................................................................................................................................ 138
Table 29 | Shared pathways among immune-mediated diseases (Adapted from (Sun and
Zhang 2014)) ...................................................................................................................................................... 152
Table 30 | Sample sizes of the GWAS summary statistics datasets of the five
musculoskeletal diseases .............................................................................................................................. 154
Table 31 | Loci associated with PsA after applying cFDR analysis using as conditional
phenotypes RA, AS and JIA ........................................................................................................................... 168
Table 32 | Power gain when using MTAG approach .......................................................................... 170
Table 33 | MTAG results for PsA (presented for original PsA p-value≤0.05).......................... 172
Table 34 | MTAG results for PsA (original PsA p-value>0.05)....................................................... 173
Table 35 | Loci associated with AS, JIA, PsA, RA and SLE after applying the ASSET subset-
based approach ................................................................................................................................................. 175
Table 36 | Assumptions regarding pleiotropy of the Mendelian Randomization methods
.................................................................................................................................................................................. 195
Table 37 | Methods used to address MR limitations.......................................................................... 196
Table 38 | Characteristics of the GIANT consortium and the UK Biobank ............................... 199
Table 39 | Number of genetic instruments used for the MR analysis for each exposure-
outcome ................................................................................................................................................................ 202
Table 40 | Results of Mendelian randomization with BMI as exposure and PsA as the
outcome ................................................................................................................................................................ 203
Table 41 | Results of Mendelian randomization with smoking initiation from the UK
Biobank as the exposure and PsA as the outcome ............................................................................ 208
Table 42 | Results of Mendelian randomization with alcohol intake frequency from the UK
Biobank as the exposure and PsA as the outcome ............................................................................. 209
Appendix Table 1 | The sequence of the assessment visit (table taken from
http://www.ukbiobank.ac.uk/) .................... 249
Appendix Table 2 | Genetic correlations between PsA, JIA and RA and SLE using LD
Hub .................... 253
Appendix Table 3 | Loci associated with JIA after applying cFDR analysis using as
conditional phenotypes AS, PsA, RA and SLE .................... 257
Appendix Table 4 | Loci associated with SLE after applying cFDR analysis using as a
conditional phenotype RA and JIA .................... 264
Appendix Table 5 | Loci associated with RA after applying cFDR analysis using as a
conditional phenotype SLE, JIA and PsA .................... 271
Appendix Table 6 | MTAG results for JIA (results presented for original JIA
p-value<0.05) .................... 275
Appendix Table 7 | MTAG results for JIA (original JIA p-value>0.05) .................... 277
Appendix Table 8 | MTAG results for SLE .................... 281
Appendix Table 9 | MTAG results for RA .................... 286
Appendix Table 10 | MTAG results for AS .................... 289
List of Figures
Figure 1 | Liability-threshold model presented as a normal (Gaussian) distribution. .......... 25
Figure 2 | Example of confounding bias. ................................................................................................... 30
Figure 3 | Skin manifestations of psoriasis .............................................................................................. 47
Figure 4 | Nail changes in patients with psoriasis ................................................................................ 48
Figure 5 | Manifestations of psoriatic arthritis ...................................................................................... 49
Figure 6 | Joint with the enthesis and synovial lining being points of inflammation in
psoriatic arthritis. Adapted from Wikipedia (https://en.wikipedia.org) ................................... 60
Figure 7 | Locations of the 22 assessment centres in the UK ........................................................... 99
Figure 8 | Association of lifestyle factors with disease status (adjusted model) adjusting for
age, sex and ethnicity ...................................................................................................................................... 123
Figure 9 | Association of lifestyle factors with disease status (multivariable model)
adjusting for age, sex and ethnicity; ......................................................................................................... 125
Figure 10 | Association of prevalent comorbidities with disease status (multivariable
model) adjusting for age, sex, ethnicity, smoking and alcohol consumption, BMI and
Townsend deprivation index; ..................................................................................................................... 129
Figure 11 | Number of participants included in the study .............................................................. 130
Figure 12 | Prevalence and incidence rates of comorbidities ........................................................ 136
Figure 13 | Association between presence/absence of rheumatic disease, (co)morbidity
and physical activity ........................................................................................................................................ 138
Figure 14 | Genetic correlation for each pair of the five musculoskeletal disorders. .......... 163
Figure 15 | Dendrogram clustering the diseases on correlation “distances”. ......................... 164
Figure 16 | Q-Q plots for PsA conditional on RA (top), AS (left) and JIA (right). ................... 166
Figure 17 | cFDR results for PsA conditioned on RA (top), AS (bottom left) and JIA (bottom
right). ..................................................................................................................................................................... 167
Figure 18 | Manhattan plot of association results for PsA. ............................................................. 171
Figure 19 | Novel loci identified by ASSET subset-based analysis by frequency of disease
clusters. ................................................................................................................................................................ 179
Figure 20 | All loci identified by ASSET subset-based approach by frequency of disease
clusters. ................................................................................................................................................................ 179
Figure 21 | Scatterplot for comparison of methods of BMI (GIANT) upon PsA. .................... 204
Figure 22 | Scatterplot for comparison of methods of BMI (UK Biobank) upon PsA. .......... 205
Figure 23 | Funnel plot displaying the causal effect estimate of each IV against its precision
for MR analysis of BMI (GIANT) on PsA. .................................................................................................206
Figure 24 | Funnel plot displaying the causal effect estimate of each IV against its precision
for MR analysis of BMI (UK Biobank) on PsA. ......................................................................................207
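Appendix Figure 1 | Short version of the International Physical Activity Questionnaire
(IPAQ) ....................................................................................................................................................................250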
Appendix Figure 2 | Scoring protocol for International Physical Activity Questionnaire
(IPAQ) ....................................................................................................................................................................252
Appendix Figure 3 | Q-Q plots for JIA conditional on AS (top left), PsA (top right), RA
(bottom left) and SLE (bottom right). ......................................................................................................254
Appendix Figure 4 | cFDR results for JIA conditioned on AS (top left), PsA (top right), RA
(bottom left). ......................................................................................................................................................256
Appendix Figure 5 | Q-Q plots for SLE conditional on RA (left) and JIA (right). ...................262
Appendix Figure 6 | cFDR results for SLE conditioned on RA (left) and JIA (right). ............263
Appendix Figure 7 | Q-Q plots for RA conditional on SLE (top), PsA (bottom left) and JIA
(bottom right). ...................................................................................................................................................269
Appendix Figure 8 | cFDR results for RA conditioned on SLE (top), PsA (bottom left) and
JIA (bottom right). ............................................................................................................................................270
Appendix Figure 9 | Manhattan plot of association results for JIA. .............................................279
Appendix Figure 10 | Manhattan plot of association results for SLE. .........................................283
Appendix Figure 11 | Manhattan plot of association results for RA. ..........................................285
Appendix Figure 12 | Manhattan plot of association results for AS. ...........................................288
Appendix Figure 13 | Forest plot of BMI (GIANT) on PsA using Wald ratio for each IVW.
..................................................................................................................................................................................290
Appendix Figure 14 | Leave-one-out-plot for BMI (GIANT) on PsA. ............................................291
Appendix Figure 15 | Forest plot of BMI (UK Biobank) on PsA using Wald ratio for each
IVW. ........................................................................................................................................................................292
Appendix Figure 16 | Leave-one-out-plot for BMI (UK Biobank) on PsA. ................................293
Abbreviations
1KG 1000 Genomes
2SLS Two-Stage Least Squares
AS Ankylosing Spondylitis
BIA Bioelectrical Impedance Analysis
BMI Body Mass Index
BSA Body Surface Area
CASPAR ClASsification of Psoriatic ARthritis
ccFDR conjunctional conditional False Discovery Rate
CD Crohn’s Disease
CD4 Cluster of Differentiation 4
cFDR conditional False Discovery Rate
CHD Coronary Heart Disease
CHIAG Community Health Index Advisory Group
CI Confidence Interval
COPD Chronic Obstructive Pulmonary Disease
CPMA Cross-phenotype meta-analysis
CRP C-Reactive Protein
CVD Cardiovascular Disease
DC Dendritic Cell
DIP Distal Interphalangeal Joint
DM Diabetes Mellitus
DMARD Disease-modifying Anti-Rheumatic Drug
DNA Deoxyribonucleic acid
EARP Early ARthritis for Psoriatic Patients
ERAP1 Endoplasmic Reticulum Aminopeptidase 1
FDR False Discovery Rate
gcp genetic causality proportion
GIANT Genetic Investigation of ANthropometric Traits
GP General Practitioner
GPP Generalised Palmoplantar Pustulosis
GWAS Genome-Wide Association Study
HCV Hepatitis C Virus
HIV Human Immunodeficiency Virus
HLA Human Leukocyte Antigen
HPA Hypothalamic-Pituitary-Adrenal
HR Hazard Ratio
IBD Inflammatory Bowel Disease
IFN Interferon
IgA Immunoglobulin A
IgG Immunoglobulin G
IL Interleukin
IL-23R Interleukin 23 receptor
IPAQ International Physical Activity Questionnaire
IQR Interquartile Range
IV Instrumental Variable
JIA Juvenile Idiopathic Arthritis
KP Koebner Phenomenon
LCE Late Cornified Envelope
LCV Latent Causal Variable
LD Linkage Disequilibrium
MBE Mode-Based Estimate
MDD Major Depressive Disorder
MHC Major Histocompatibility Complex
MI Myocardial Infarction
mPAQ modified Psoriasis and Arthritis Questionnaire
MR Mendelian Randomization
MREC Multi-centre Research Ethics Committee
MRI Magnetic Resonance Imaging
MS Multiple Sclerosis
MTAG Multi-Trait Analysis of GWAS
NAFLD Non-Alcoholic Fatty Liver Disease
NASH Non-Alcoholic SteatoHepatitis
NHS Nurses’ Health Study
NIGB National Information Governance Board
NSAID NonSteroidal Anti-Inflammatory Drug
OR Odds Ratio
PAQ Psoriasis and Arthritis Questionnaire
PASE Psoriatic Arthritis Screening and Evaluation
PASI Psoriasis Area and Severity Index
PASQ Psoriasis and Arthritis Screening Questionnaire
PBC Primary Biliary Cholangitis
pDC plasmacytoid Dendritic Cell
PEST Psoriasis Epidemiology Screening Tool
PsA Psoriatic Arthritis
PSO Psoriasis
Q-Q Quantile-Quantile
RA Rheumatoid Arthritis
RANK Receptor Activator of Nuclear factor Kappa-B
RANKL RANK ligand
RR Relative Risk
SBM Subset-based Method
SD Standard Deviation
SF-36 36-item Short Form
SLE Systemic Lupus Erythematosus
SMR Standardised Morbidity Ratio
SNP Single Nucleotide Polymorphism
SPR Standardised Prevalence Rate
ST Systemic Therapy
T1D Type 1 Diabetes
T2D Type 2 Diabetes
TAG Tobacco, Alcohol and Genetics consortium
Tc1 T cytotoxic 1
Th1 T helper 1
THIN The Health Improvement Network
TIA Transient Ischaemic Attack
TNF Tumour Necrosis Factor
ToPAS Toronto Psoriatic Arthritis Screen
UC Ulcerative Colitis
UK United Kingdom
USA United States of America
WHO World Health Organisation
WTCCC Wellcome Trust Case Control Consortium
ZEMPA Zero Modal Pleiotropy Assumption
Abstract
Background: Psoriatic arthritis (PsA) is a seronegative inflammatory arthritis affecting
patients with psoriasis. Early identification of PsA could reduce joint damage, improve
outcomes and highlight potential clinical targets. Several studies have tried to
elucidate the aetiology of PsA by investigating its genetic basis using genome-wide
association studies, the contribution of environmental and lifestyle factors to its
development and the prevalence of comorbidities in patients with psoriasis and/or PsA.
However, the small sample sizes used in these studies, along with unclear
phenotypic characterisation, have led to the identification of only a handful of
PsA-specific risk factors.
Aims: The broad aim of this study was to improve the understanding of the
pathogenesis of PsA by investigating the genetic and the environmental contribution,
along with the prevalence of multi-morbidity that has an impact on clinical outcomes.
Firstly, the study aimed to explore the association and causality of environmental
factors with PsA and the prevalence of comorbidities using the wealth of data UK
Biobank offers. Secondly, the study aimed to identify novel genetic variants
underpinning PsA using state-of-the-art techniques that leverage power from genetic
studies performed in other correlated musculoskeletal diseases.
Methods: The association of PsA with known environmental factors and
comorbidities was investigated using logistic regression in the UK Biobank. To further
define the genetic variants underpinning PsA, GWAS data from other musculoskeletal
diseases were tested for correlation with PsA using LD score regression and cross-
trait analysis was subsequently performed. Conditional False Discovery Rate analysis
and two alternative meta-analysis methods (Multi-Trait analysis of GWAS and subset-
based analysis) were used because of their ability to exploit the pleiotropy among
correlated traits and increase the power of polymorphism detection. Finally, the causal
role of the statistically significant environmental factors was determined using
Mendelian Randomisation.
Results: Body mass index was confirmed to play a causal role in the development of
PsA in patients with psoriasis. In addition, using LD score regression rheumatoid
arthritis, systemic lupus erythematosus, ankylosing spondylitis and juvenile idiopathic
arthritis were found to be genetically correlated with PsA. Twenty-one novel SNPs
were found by all three methods to be associated with PsA, the majority of which are
mapped to genes that have not previously been associated with PsA.
Summary: This work has carried forward the research of detecting PsA risk factors.
It includes the first cross-trait study investigating PsA along with other musculoskeletal
diseases, the first study to explore UK Biobank data for associations of the disease
with lifestyle risk factors and known comorbidities and finally the first study to assess
the causal role of obesity, smoking status and frequency of alcohol consumption in the
onset of PsA. All this evidence can be taken forward for further functional and clinical
applications.
Declaration
I declare that no portion of the work referred to in the thesis has been submitted in
support of an application for another degree or qualification of this or any other
university or other institute of learning.
Copyright Statement
I. The author of this thesis (including any appendices and/or schedules to this
thesis) owns certain copyright or related rights in it (the “Copyright”) and she
has given The University of Manchester certain rights to use such Copyright,
including for administrative purposes.
II. Copies of this thesis, either in full or in extracts and whether in hard or
electronic copy, may be made only in accordance with the Copyright, Designs
and Patents Act 1988 (as amended) and regulations issued under it or, where
appropriate, in accordance with licensing agreements which the University has
from time to time. This page must form part of any such copies made.
III. The ownership of certain Copyright, patents, designs, trademarks and other
intellectual property (the “Intellectual Property”) and any reproductions of
copyright works in the thesis, for example graphs and tables (“Reproductions”),
which may be described in this thesis, may not be owned by the author and
may be owned by third parties. Such Intellectual Property and Reproductions
cannot and must not be made available for use without the prior written
permission of the owner(s) of the relevant Intellectual Property and/or
Reproductions.
Further information on the conditions under which disclosure, publication and
commercialisation of this thesis, the Copyright and any Intellectual Property and/or
Reproductions described in it may take place is available in the University IP Policy (see
http://documents.manchester.ac.uk/DocuInfo.aspx?DocID=24420), in any relevant
Thesis restriction declarations deposited in the University Library, The University
Library’s regulations (see http://www.library.manchester.ac.uk/about/regulations/) and
in The University’s policy on Presentation of Theses.
Publications Arising During this PhD
Manuscripts
Bowes J., Ashcroft J., Dand N., Jalali-Najafabadi F., Bellou E., Ho P., Marzo-Ortega H.,
Helliwell P.S., Feletar M., Ryan A.W., Kane D.J., Korendowych E., Simpson M.A.,
Packham J., McManus R., Brown M.A., Smith C.H., Barker J.N., McHugh N., FitzGerald
O., Warren R.B., Barton A. Cross-phenotype association mapping of the MHC
identifies genetic variants that differentiate psoriatic arthritis from psoriasis. Ann Rheum
Dis. 2017; 76(10):1774-1779
Cook M.*, Bellou E.*, Bowes J., Sergeant J.C., O’Neill T.W., Barton A., Verstappen
S.M.M. Impact of (co-)morbidities on physical activity in people with and without
inflammatory rheumatic diseases: results from the UK Biobank. Rheumatology
(Accepted/In press) *equal contribution
Conference Abstracts
The American College of Rheumatology (San Francisco, November 2015)
Bellou E., Cook M., Bowes J., Sergeant J.C., Barton A., O’Neill T.W., Verstappen
S.M.M. Prevalence of chronic comorbidities in patients with rheumatoid arthritis,
psoriatic arthritis, ankylosing spondylitis and systemic lupus erythematosus: an analysis
of UK Biobank Data (oral).
Cook M.J., Bellou E., Sergeant J.C., Bowes J., Barton A., O’Neill T.W., Verstappen
S.M.M. The impact of cardiovascular and lung disorder morbidities on physical activities
in people with inflammatory arthritis compared to the general population in the UK
(poster).
British Society of Investigative Dermatology (Dundee, April 2016)
Bellou E., Bowes J., Verstappen S.M.M, Cook M., Sergeant J.C., Barton A., Warren R.B.
A study from the UK Biobank of lifestyle habits and cardiovascular disease in psoriasis,
psoriatic arthritis and controls (poster).
British Society of Rheumatology (Glasgow, April 2016)
Bellou E., Verstappen S.M.M., Cook M., Sergeant J.C., Warren R.B., Barton A., Bowes J.
Increased rates of hypertension in patients with psoriatic arthritis compared to
psoriasis alone: results from the UK Biobank (oral).
Cook M.J., Bellou E., Sergeant J.C., Bowes J., Barton A., O’Neill T.W., Verstappen
S.M.M. Higher prevalence of chronic cardiovascular and pulmonary morbidities in
people with inflammatory arthritis is associated with a lower level of physical activity:
results from the UK Biobank (poster).
The American College of Rheumatology (Washington D.C., November 2016)
Bellou E., Verstappen S.M.M., Cook M., Sergeant J.C., Warren R.B., Barton A., Bowes J.
Increased rates of hypertension in patients with psoriatic arthritis compared to
psoriasis alone: results from the UK Biobank (poster)
About the Author
My background is in computer science, having completed a 5-year MEng in Computer
and Communications Engineering at the University of Thessaly, Greece. During the
final year of my MEng I became fascinated with bioinformatics and decided to pursue an
MSc in this field. I graduated from the University of Newcastle in 2014 with an MSc in
Bioinformatics with distinction. For my master’s year project, I embarked upon a
project on the investigation of fatigue in patients with primary Sjogren’s syndrome using
Machine Learning. During this time I developed a passion for research and I became
increasingly interested in epidemiology and decided to pursue a PhD at the Arthritis
Research UK Centre for Genetics and Genomics in Manchester.
During my PhD, I enjoyed developing my existing programming skills and learning
statistical techniques for analysing large datasets. In particular I was keen on using both
novel and “traditional” methods in (genetic) epidemiology to further our understanding
of the aetiology of autoimmune diseases such as psoriasis and psoriatic arthritis. In
addition, I have enjoyed presenting my research at international and national
conferences including ACR and BSR.
Currently, I am a Research Associate in Bioinformatics at the Division of Psychological
Medicine and Clinical Neurosciences at Cardiff University working on the development
and implementation of polygenic risk algorithms for stratifying individuals for future
cognitive decline due to Alzheimer’s disease.
Dedication
This thesis is dedicated to my parents, Sofia Georgoudi and Evangelos Bellos and to the
loving memory of my grandparents, Dimitra and Georgios Georgoudis.
Acknowledgements
First of all, I want to express my gratitude to my supervisor, Dr. John Bowes, for introducing
me to the field of genetic epidemiology, for his guidance and his endless support throughout the
PhD. He has helped me to develop invaluable skills for my future career and for that, I am
extremely grateful. I would also like to thank Professor Anne Barton and Professor Richard
Warren for always being there to offer a new insight and ideas and for providing constructive
feedback.
During my PhD, I was incredibly fortunate to collaborate with brilliant researchers within the
ARUK Centre for Epidemiology. Many thanks to Dr. Suzanne Verstappen, Dr. Jamie Sergeant
and Michael Cook for their generous help and guidance with various analyses. I would also
like to thank Professor Goran Nenadic for his advice during the implementation of the
“misspelling” algorithm and James Liley for help with the cFDR method. Finally, I am grateful to
everyone within the Arthritis Research UK Centre for Genetics and Genomics who have
provided training whenever needed.
I could not have survived this PhD without the support of my close friends and family. Endless
thanks to my friends for life and fellow students in 2.706 for the great moments we have
shared, the morning hashtag deep conversations, the unstoppable laughter and the vast amount
of cookies during stressful periods. Special thanks to my friends outside of the University for
putting up with me during our endless phone calls and the great memories we have created
travelling. Alex, Dimitra, Jo, Marina, Mpou and Xara thanks for always being there. Last but not
least, I wish to thank my parents for being supportive of my decisions, and believing in me.
Finally, I wish to thank the Psoriasis Association for funding this PhD and Sofoklis Achillopoulos
Foundation for their support during my studies.
Chapter 1
Introduction
1.1 Common complex diseases
Modern genetics has had a major impact on medicine by defining diseases that are
caused by alterations in one gene and are called “Mendelian” or “monogenic” diseases.
They run in families; the majority are rare and their transmission pattern can be
dominant or recessive, autosomal or sex-linked.
In contrast, common complex diseases do not follow the standard Mendelian patterns
of inheritance but are caused by the interplay of genetic, environmental and lifestyle
factors. Such conditions include Alzheimer’s and Parkinson’s disease, various types of
cancer, mental health disorders and autoimmune diseases.
Complex diseases show polygenic inheritance, in which many gene loci each have a
small effect (Mitchell 2012). The liability-threshold model rests on two assumptions:
i) all members of a population have a normally distributed genetic liability for a
particular trait and ii) there is a threshold value on the liability continuum, and all
individuals whose liability exceeds this threshold are affected by the trait (Figure 1). An
individual’s liability is the sum of his or her genetic and lifestyle risk factors, with each
additional risk factor moving the individual closer to the threshold (Haegert 2004).
Figure 1 | Liability-threshold model presented as a normal (Gaussian) distribution. The arrows present the potential range of liabilities.
This model highlights the importance of studying the contribution from all risk factors
to fully understand susceptibility to disease.
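The relationship between the threshold and the proportion of affected individuals in the liability-threshold model can be illustrated with a minimal sketch. It assumes that liability follows a standard normal distribution (mean 0, SD 1); the function names and the 1% prevalence are hypothetical and chosen purely for illustration.

```python
from statistics import NormalDist

# Assumption: liability is standard normal (mean 0, SD 1).
LIABILITY = NormalDist(mu=0.0, sigma=1.0)


def prevalence_from_threshold(threshold: float) -> float:
    """Proportion of the population whose liability exceeds the threshold."""
    return 1.0 - LIABILITY.cdf(threshold)


def threshold_from_prevalence(prevalence: float) -> float:
    """Liability threshold that yields the given population prevalence."""
    return LIABILITY.inv_cdf(1.0 - prevalence)


# Hypothetical trait affecting 1% of the population.
example_threshold = threshold_from_prevalence(0.01)
example_prevalence = prevalence_from_threshold(example_threshold)
```

Under these assumptions, a rarer trait corresponds to a higher threshold, so more risk factors must accumulate before an individual is affected.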
1.2 Epidemiology in complex diseases
Epidemiology is concerned with the distribution and the determinants of health-related
states or events in specific populations. It is one of the core disciplines used to
investigate the associations between environmental and genetic factors and health
outcomes. More specifically epidemiology focuses on i) the definition of the disease ii)
the aetiology of the disease iii) the prevalence and incidence of the disease in a specific
population iv) the identification of risk factors and v) the control and prevention of the
disease.
1.2.1 Risk factors
Risk factors are aspects of lifestyle, environmental exposure, biological characteristics
and/or genetic predisposition that are associated with the frequency of occurrence of a
health-related condition. Examples include tobacco and alcohol consumption, high blood pressure
and body mass index (BMI) (Fletcher and Fletcher 2005). There are various types of
risk factors, including:
Inherited (predisposition) such as carriage of certain HLA alleles that
increases the risk of autoimmune diseases.
Environmental determinants that lie outside the individual’s immediate
control such as air pollutants, infectious agents and water pollution. There
are others that are part of the social environment; for example, loss of a
relative or unemployment.
Determinants associated with the individual’s lifestyle and behaviour
including tobacco and alcohol consumption.
The exposure to a risk factor can occur either at a single point in time, as when an
individual is injured in a car accident, or over a period of time (e.g. asbestos
exposure) with the risk of the disease associated with the exposure time. Recognising
risk factors can be challenging because the associations between exposure and disease
are not obvious due to:
the long latency between exposure to a risk factor and the onset of the disease
the frequency of exposure to a risk factor
the low incidence of the disease or the small risk that the exposure can confer,
which may necessitate a large number of cases to observe a relationship between
the exposure and the disease outcome
various determinants may be related and their combination might be associated
with the onset of the disease (Fletcher and Fletcher 2005).
1.2.2 Study designs
Optimal study design is essential in order to investigate the nature of the relationship
between a risk factor and a health outcome and it depends on the study population,
the outcome of interest and the aim of the study. There are two basic approaches to
measure this relationship; the experimental and the observational approach. The
effects of most risk factors can be studied with observational studies in which the
researcher gathers data by simply observing and without interfering in the process.
1.2.2.1 Cohort studies
In a cohort design, a closed group of subjects is classified based on their exposure to a
factor of interest and then it is observed over a meaningful period of time to note the
incidence of any new cases of a trait (Song and Chung 2010). This design helps in
establishing a timeline of events occurring as well as in evaluating many outcomes. The
cohort studies can be either prospective or retrospective. In a prospective study
design, the subjects are followed over time into the future, whereas in a retrospective
study the data from the subjects were recorded at some point in the past and their
current status with respect to the outcome of interest is determined.
1.2.2.1.1 Population-based cohort studies
Population-based studies are a type of cohort design; however, the cohort is not a
fixed group of subjects but an entire target population. This type of study tries to
reflect the variety of demographic, epidemiological and clinical characteristics of a well-
defined population with the results being generalised to the whole population (Ethgen
and Standaert 2012). This type can include a range of other study designs such as case-
control and cross-sectional studies.
1.2.2.2 Case-control studies
Case-control studies are a type of retrospective design, where two groups are
compared based on past exposure to putative risk factors; the case group containing
subjects with the outcome of interest and the control group including subjects without
the outcome. The difference between this and the cohort study is in the selection of
the subjects; in a cohort study the subjects are free of the outcome of interest and
then are monitored over a sensible period of time, whereas in the case-control
approach, the subjects are selected based on whether the outcome is present or not
(Lewallen and Courtright 1998).
1.2.2.3 Nested case-control studies
Nested case-control studies are a variant of the conventional case-control and cohort
study and can also be described as a case-control study within a cohort study. With
this approach, a defined cohort is created, followed and cases are identified either as
they occur (prospective approach) or after occurring (retrospective approach). Then
for each case, a number of controls are selected among those who have not developed
the outcome (Ernster 1994).
1.2.2.4 Cross-sectional studies
In the cross-sectional study, the selection of subjects is made from an existing defined
population and at a specific point in time information is simultaneously obtained for all
the subjects on both the exposure(s) and outcome(s) of interest (Song and Chung
2010).
The main advantages and disadvantages of the three approaches are described in Table
1.
Table 1 | Advantages and disadvantages of the main observational study designs
Study design | Advantages | Disadvantages
Cohort study | 1. Assures that exposure occurred before the outcome of interest | 1. Not applicable to rare diseases as a large cohort will be needed; 2. Large cohorts are expensive and time consuming to be formed; 3. Follow-up issues; 4. Susceptible to selection bias
Case-control study | 1. Suitable for rare diseases; 2. Suitable for studying diseases with long induction period; 3. Smaller number of subjects needed so they are inexpensive to carry out | 1. More interpretation difficulties compared to the cohort approach; 2. Controls and cases should be selected from the same population; 3. Unsure whether exposure(s) preceded the studied outcome(s)
Cross-sectional study | 1. Estimation of prevalence of conditions; 2. Investigation of the distribution and the determinants of behavioural risk factors; 3. Easy and quick to implement | 1. Unsure whether exposure(s) preceded the studied outcome(s); 2. Susceptible to selection bias and misclassification issues
1.2.3 Causation versus association
One of the main goals of epidemiology is to establish whether a causal
relationship exists between a risk factor and a health outcome. Understanding the difference
between association and causation is the key to accurate interpretation of
epidemiological findings. For that reason, Hill proposed nine criteria that must be taken
into account in assessing whether causation exists (Fedak, Bernal et al. 2015). These
are:
strength of the association
consistency of the association (which is the repeated observation of the
association in different settings)
specificity (meaning a specific disease results from a given exposure and not
from other exposures under a given association)
temporality (which means that the exposure must be observed before the
effect)
biological gradient (meaning the existence of a dose-response relationship between the
two variables)
biological plausibility
coherence among studies about the nature of the association
experimental evidence when possible, and
analogy (similar exposures are known to produce similar effects).
1.2.4 Bias
Epidemiological studies can be subject to a number of biases at any research stage that
can lead to an inaccurate result. The term bias refers to the systematic deviation of the
estimated statistic of the association between an exposure and a disease from the true
value. Most biases occur during the design of the study, the data collection and during
the estimation of an effect influenced by many determinants (Delgado-Rodriguez and
Llorca 2004). Biases can fall into the following broad categories:
Selection bias is the type of error introduced when the study population
is not representative of the target population. It occurs when the
compared groups of patients are dissimilar in determinants of the health
outcome, such as age and sex, compared to the target population.
Measurement bias is the result of systematic erroneous measurements
because of imprecise tools, faulty measurement procedure or human
error.
Information bias describes the recording of either a risk factor or the
outcome being studied in a different way leading to misclassification. For
example, if the interviewer knows the status of the subjects before the
interview, he/she may examine the exposures in a different way if the
subjects are cases.
Confounding bias occurs when a risk factor, which is associated with
the exposure under study, is also associated with the outcome of
interest without being an intermediate step of the causal pathway. For
example, in a study of whether alcohol consumption causes mouth
cancer, smoking can be a confounder if it is a well-known risk factor for
mouth cancer and if it is associated with alcohol consumption without
being a result of alcohol consumption (Figure 2). This type of bias can be
dealt with during the analysis of the data if the confounding variables
have been recorded; otherwise this leads to spurious associations
between the investigated risk factor and the outcome of interest.
Figure 2 | Example of confounding bias. Another exposure exists (smoking) in the study population besides the one being studied (alcohol consumption) and is associated both with disease (mouth cancer) and the exposure being studied. If the confounder (smoking), which is a determinant of or a risk factor for the disease, is unequally distributed between the exposure subgroups (alcohol drinkers are more likely to smoke), it can lead to confounding.
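The alcohol, smoking and mouth cancer example can be made concrete with a small numerical sketch. The counts below are entirely hypothetical, and stratification with the Mantel-Haenszel pooled odds ratio is used here only as one common way of adjusting for a recorded confounder; it is an assumption of this illustration and not necessarily the approach used elsewhere in this thesis.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table.

    a, b: exposed cases and exposed non-cases
    c, d: unexposed cases and unexposed non-cases
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata of the confounder."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts: (drinker cases, drinker non-cases,
#                       non-drinker cases, non-drinker non-cases)
smokers = (40, 60, 10, 15)
non_smokers = (5, 45, 20, 180)

# Crude table: ignore smoking and add the two strata together.
crude_table = tuple(s + n for s, n in zip(smokers, non_smokers))
crude_or = odds_ratio(*crude_table)

# Adjusted estimate: compare drinkers and non-drinkers within each stratum.
adjusted_or = mantel_haenszel_or([smokers, non_smokers])
```

In this constructed example alcohol has no effect within either smoking stratum, yet the crude odds ratio is well above one because drinkers are more likely to smoke; the adjusted estimate removes that spurious association.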
1.2.5 Basic epidemiological concepts
Before describing the methods for measuring effects for each study design, it is useful
to list a number of fundamental epidemiological terms used in the current thesis for
better comprehension of the methods used and the reasons they were chosen.
1.2.5.1 Types of variables
The following table (Table 2) summarises the types of variables that can be
encountered in an epidemiological study.
Table 2 | Types of variables used in epidemiology
Type | Scale | Definition
Categorical or Qualitative | Nominal | The values are categories without ranking
Categorical or Qualitative | Ordinal | The values can be ranked
Continuous or Quantitative | Interval | Values are measured in equally spaced units with no zero point
Continuous or Quantitative | Ratio | Values can have a zero point
1.2.5.2 Distribution and measures
Frequency distributions have two main properties: central location (where the
distribution peaks) and spread (how far the values are dispersed around the central value).
Regarding central location, the most common measures are the mean and the median
and they can summarise the entire distribution. The selection of the measure to be
used depends on the shape of the distribution. The mean is equal to the sum of all the
values in the dataset divided by the number of values in the same dataset. It is used to
summarise continuous variables that follow a normal distribution and is affected by the
presence of extreme values. In contrast, the median is the middle value of a set of
data that has been arranged in order of magnitude. It is used for reporting continuous
variables that have a skewed or asymmetrical distribution and it is a robust measure, as
it is not affected by extreme observations.
Regarding the spread, the measures that are most frequently reported are the
interquartile range (IQR) and the standard deviation (SD). The SD is used in
conjunction with the mean and it shows how widely or tightly the observations are
distributed from the centre. The IQR is jointly used with the median and it conveys the
portion of the distribution from the 25th percentile to the 75th percentile (Dicker
2006).
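The contrast between the two pairs of measures can be seen in a short sketch using a small, made-up sample containing one extreme value. The data are hypothetical, and the "inclusive" quartile method is just one of several conventions for computing percentiles.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical sample with a single extreme observation.
values = [1, 2, 3, 4, 100]

# Mean and SD: both are pulled upwards by the extreme value.
centre_mean = mean(values)
spread_sd = stdev(values)

# Median and IQR: robust to the extreme value.
centre_median = median(values)
q1, _, q3 = quantiles(values, n=4, method="inclusive")
spread_iqr = q3 - q1
```

Here the mean and SD are dominated by the single value of 100, whereas the median and IQR describe the bulk of the observations, which is why the latter pair is preferred for skewed variables.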
1.2.5.3 Measures of frequency
Frequency measures compare parts of the same distribution or a part to the entire
distribution. The most common measures are ratio, proportion and rate (Dicker
2006).
$$\text{Ratio} = \frac{\text{Number of events or persons in one group}}{\text{Number of events or persons in another group}}$$
In a ratio, the two compared groups need not be related. It is mainly used to estimate
the occurrence of an event (as described later).
The proportion is suitable when the intended use is the comparison of a part to the
whole.
$$\text{Proportion} = \frac{\text{Number of events or persons with a specific trait or exposure}}{\text{Total number of events or persons}} \times 100$$
In proportion, the numerator should always be a subset of the denominator. It is
mainly used as a descriptive measure and it is expressed as a percentage.
Finally, the rate measures the frequency of an event (the risk of an event occurring) in a
specific population during a particular period of time and it is useful when the
frequency of an event needs to be compared in different times or different groups of
subjects from different sized populations.
1.2.5.4 Measures of morbidity occurrence
Measuring the occurrence of morbidity depends on the period during which the
population was at risk. There are two main measures; prevalence and incidence (dos
Santos Silva 1999).
(Point) Prevalence measures how many cases there are in a population at a specific
point in time.
$$\text{Prevalence} = \frac{\text{Number of cases in a defined population at a particular point in time}}{\text{Number of people in the defined population at the same point in time}}$$
Prevalence can also be presented as a percentage (multiplying the ratio by 100) or as
the number of cases per 100,000 of the population.
The incidence measures the occurrence of new cases in a population over a particular
period of time. The two most frequent types of incidence used are the incidence risk
and the incidence rate.
$$\text{Incidence risk} = \frac{\text{Number of new cases of morbidity in a defined population during a specific period}}{\text{Number of morbidity-free subjects in the same population at the start of that period}}$$
$$\text{Incidence rate} = \frac{\text{Number of new cases of morbidity in a defined population during a specific period}}{\text{Time each subject was followed up, totalled for all subjects}}$$
The difference between the two types of incidence relates to follow-up time. The estimation of
risk needs a population that is followed up in its entirety for a specific period,
whereas in the case of incidence rate the population can be dynamic, meaning that not
all individuals have been followed up for the same amount of time.
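The three measures of occurrence defined above can be sketched as follows. All counts and follow-up times are hypothetical; the point of the example is that the incidence rate uses the person-time actually contributed by each subject, so unequal follow-up is handled naturally.

```python
def point_prevalence(cases, population):
    """Existing cases divided by the population at the same point in time."""
    return cases / population


def incidence_risk(new_cases, at_risk_at_start):
    """New cases divided by the morbidity-free subjects at the start."""
    return new_cases / at_risk_at_start


def incidence_rate(new_cases, follow_up_times):
    """New cases divided by the total person-time of follow-up."""
    return new_cases / sum(follow_up_times)


# Hypothetical closed population: 50 existing cases among 1,000 people,
# and 10 new cases among 200 morbidity-free subjects followed for a fixed period.
prevalence = point_prevalence(50, 1000)
risk = incidence_risk(10, 200)

# Hypothetical dynamic population: years of follow-up differ per subject.
follow_up_years = [5.0, 5.0, 2.5, 1.0, 4.5, 2.0]
rate = incidence_rate(2, follow_up_years)  # cases per person-year
```

Multiplying `rate` by a convenient constant (for example 1,000) gives the more familiar "cases per 1,000 person-years".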
1.2.5.5 Measures of exposure effect
The main purpose of epidemiology is to quantify the association between the exposure
and the outcome of interest among two groups. The main measures are:
$$\text{Risk ratio} = \frac{\text{Risk of outcome in the exposed group}}{\text{Risk of outcome in the unexposed group}}$$
$$\text{Rate ratio} = \frac{\text{Incidence rate in the exposed group}}{\text{Incidence rate in the unexposed group}}$$
$$\text{Odds ratio (OR)} = \frac{\text{Odds of outcome in the exposed group}}{\text{Odds of outcome in the unexposed group}}$$
The first two ratios are also referred to as “relative risk (RR)”. A value of 1.0 indicates
that both the exposed and the unexposed group have identical incidence; thus, there is
no association between the exposure and the outcome of interest. A value greater
than one indicates that the exposed group has an increased risk of developing the
outcome of interest compared to the unexposed group (positive association); a value
less than one indicates that the exposed group has a decreased risk of developing the
outcome compared to the unexposed group (negative association) (Dicker 2006).
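As a worked illustration of these measures, the sketch below computes the risk ratio and the odds ratio from a single 2x2 table. The counts are hypothetical and the cell labels (a, b, c, d) are a common convention assumed here, not notation taken from this thesis.

```python
def risk_ratio(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed.

    a, b: exposed subjects with and without the outcome
    c, d: unexposed subjects with and without the outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


def odds_ratio(a, b, c, d):
    """Odds in the exposed divided by odds in the unexposed."""
    return (a / b) / (c / d)


# Hypothetical table: 30 of 100 exposed and 15 of 100 unexposed develop the outcome.
rr = risk_ratio(30, 70, 15, 85)
or_value = odds_ratio(30, 70, 15, 85)
```

Both values exceed one, indicating a positive association, and the odds ratio lies further from one than the risk ratio, as is typical when the outcome is not rare.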
1.2.5.5.1 Regression analysis models
The most common method to estimate the OR and the RR is regression analysis. This
technique is used for prediction and investigation of the relationship between a
dependent variable and one or more independent (predictor) variables. It indicates significant
associations and the strength of the effect the independent variables have on the
dependent one. Moreover, it is used to control for any potential confounders. There
are various types of regression analysis that can be used depending on the number of
independent variables, the type of the dependent variable and the shape of the
regression line. In general, regression analysis can be either simple or
multiple/multivariable depending on the number of independent variables and
univariate or multivariate depending on the number of dependent variables included in
the model. Then, depending on the type of the dependent variable (continuous,
categorical, counts of an event or patient’s hazard rate), either linear/non-linear or
logistic or Poisson or Cox proportional hazards regression is used, respectively.
In regression, the dependent variable is modelled as a function of the independent
variables, fixed coefficients (also called parameters, which represent the mean increase in the
dependent variable per unit increase in the independent variable) and an error term (a random
variable which represents the unexplained variation in the dependent variable). The most basic
regression model is the univariate simple linear regression method described by the equation
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
where $y_i$ denotes the response for subject $i$, $x_i$ denotes the
predictor value for subject $i$, $\beta_0$ is the intercept, $\beta_1$ is the slope (the average
increase of the outcome per unit increase of the predictor) and $\varepsilon_i$ is the error
term.
When there is more than one independent/predictor variable, the model is called
univariate multiple/multivariable linear regression
$$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_N x_{iN} + \varepsilon_i$$
where $N$ is the number of independent variables in the model, $x_{ij}$ is the value of the
$j$-th independent variable for subject $i$ and $\varepsilon_i$ is the error term.
In the case where there is more than one measured response, the regression model is
called multivariate (simple/multiple) regression and the multiple type has the form
$$y_{ik} = \beta_{0k} + \beta_{1k} x_{i1} + \dots + \beta_{Nk} x_{iN} + \varepsilon_{ik}, \quad k = 1, \dots, K$$
where $K$ is the number of dependent variables/responses for subject $i$.
It should be noted that the terms multiple/multivariable and multivariate are often used
interchangeably in the literature, although they are two different types of analysis
models (Hidalgo and Goodman 2013).
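The univariate simple linear regression model described above can be fitted by ordinary least squares, for which the intercept and slope have closed-form estimates. The sketch below is a minimal illustration on hypothetical data; the function name is an assumption, and in practice a statistical package would be used.

```python
def fit_simple_linear_regression(x, y):
    """Least-squares estimates (intercept, slope) for y = b0 + b1 * x + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical predictor and response values.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.9, 5.1, 7.0, 9.2, 10.8]

intercept, slope = fit_simple_linear_regression(x, y)

# Residuals play the role of the error term for each subject.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
```

The estimated slope is the average change in the response per unit increase in the predictor, and the residuals sum to (approximately) zero by construction.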
In the case where the dependent variable is categorical with two levels (as described in 1.2.5.1),
binary logistic regression is used, which estimates the probability that a trait is present
given the values of the independent/predictor variables; $\pi = \Pr(Y = 1 \mid X = x)$ in the
case of a single predictor. Thus,
$$\pi_i = \Pr(Y_i = 1 \mid X_i = x_i)$$
and
$$\operatorname{logit}(\pi_i) = \log\left(\frac{\pi_i}{1 - \pi_i}\right) = \beta_0 + \beta_1 x_i$$
where $Y_i = 1$ if the trait is present in subject $i$, $Y_i = 0$ if it is absent in
subject $i$, $X$ is the independent variable, $x$ is the observed value of the
independent variable and $i$ indexes the $i$-th subject.
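The link between the logistic model and the odds ratio can be shown with a minimal sketch. It assumes a single binary predictor (exposed or unexposed), in which case the maximum likelihood estimates have a closed form: the intercept is the log odds in the unexposed group and the slope is the log odds ratio. The counts are hypothetical.

```python
import math


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Probability corresponding to a log odds value."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical counts: trait present / absent in each exposure group.
exposed_present, exposed_absent = 30, 70
unexposed_present, unexposed_absent = 15, 85

# Closed-form estimates for a model with one binary predictor (x = 0 or 1).
beta0 = logit(unexposed_present / (unexposed_present + unexposed_absent))
beta1 = logit(exposed_present / (exposed_present + exposed_absent)) - beta0

# Exponentiating the slope gives the odds ratio for exposure.
odds_ratio = math.exp(beta1)

# Predicted probability of the trait for an exposed subject (x = 1).
predicted_exposed = inverse_logit(beta0 + beta1 * 1)
```

With additional predictors the coefficients no longer have a closed form and are estimated iteratively, but each exponentiated coefficient is still interpreted as an odds ratio adjusted for the other variables in the model.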
When the subject’s hazard rate (the instantaneous risk of the event occurring in a subject
who has yet to develop the event) is needed, Cox regression is used (Klein, Rizzo et al.
2001). The hazard rate for a subject $i$ can be presented as
$$h_i(t) = h_0(t) \exp\{\beta Z_i\}$$
where $h_0(t)$ is the baseline hazard rate, $Z_i$ is the $i$-th subject’s covariate and $\beta$
is the risk or the regression coefficient.
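A useful consequence of this form is that the ratio of the hazards of two subjects does not depend on the baseline hazard. The sketch below illustrates this with a hypothetical coefficient and hypothetical baseline hazard values; it evaluates the model equation only and does not estimate the coefficient from data.

```python
import math


def hazard(baseline_hazard, beta, covariate):
    """Hazard h_i(t) = h_0(t) * exp(beta * Z_i) at a given time point."""
    return baseline_hazard * math.exp(beta * covariate)


def hazard_ratio(beta, covariate_a, covariate_b):
    """Ratio of hazards for two subjects; the baseline hazard cancels out."""
    return math.exp(beta * (covariate_a - covariate_b))


# Hypothetical coefficient: exposure (Z = 1) doubles the hazard.
beta = math.log(2.0)

# The ratio is the same whatever the baseline hazard is at that time.
ratio_low_baseline = hazard(0.01, beta, 1) / hazard(0.01, beta, 0)
ratio_high_baseline = hazard(0.50, beta, 1) / hazard(0.50, beta, 0)
ratio_direct = hazard_ratio(beta, 1, 0)
```

This is why the exponentiated coefficient is reported as a hazard ratio that applies at every time point, under the proportional hazards assumption.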
1.2.5.6 Significance and confidence intervals
The significance level denoted as alpha or α is the probability of rejecting the null
hypothesis when it is true. Usually the significance level of 0.05 (Fisher 1925) is used
which corresponds to a chance of error of 1 in 20.
The p-value is used in epidemiological studies to determine whether the null
hypothesis should be rejected or not and it assists in the recognition of
statistically significant findings (du Prel, Hommel et al. 2009). The smaller the p-value
(compared to a predefined threshold alpha), the stronger the evidence that the
observed association or difference did not occur by chance. If the p-value is less than a
set alpha level (usually 0.05), then the finding of the statistical hypothesis test is
designated as “statistically significant”.
The confidence interval (CI) indicates the range in which the true value lies with a
predefined degree of probability (the 95% CI is usually used). The size of the range
depends on the sample size and the standard deviation of the groups being compared.
Unlike the p-value, the CI also provides information about the direction and the
strength of the effect. When the CI does not include the value of no effect (zero for a
difference, one for a ratio), the finding can be assumed to be “statistically significant”.
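As an illustration of how a CI is read for a ratio measure, the sketch below computes an approximate 95% CI for an odds ratio from a 2x2 table. It uses the large-sample (Woolf) standard error of the log odds ratio, which is an assumption of this example rather than a method described in this section, and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_with_ci(a, b, c, d, level=0.95):
    """Odds ratio and approximate CI from a 2x2 table.

    a, b: exposed subjects with and without the outcome
    c, d: unexposed subjects with and without the outcome
    """
    estimate = (a * d) / (b * c)
    # Large-sample standard error of the log odds ratio.
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower = math.exp(math.log(estimate) - z * standard_error)
    upper = math.exp(math.log(estimate) + z * standard_error)
    return estimate, lower, upper


# Hypothetical counts.
estimate, lower, upper = odds_ratio_with_ci(30, 70, 15, 85)

# For a ratio, the value of no effect is 1.
excludes_no_effect = lower > 1.0 or upper < 1.0
```

In this example the whole interval lies above one, so the association would be regarded as statistically significant at the 5% level, while the width of the interval conveys how imprecise the estimate is.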
Observational epidemiology: Investigating environmental/lifestyle factors in complex diseases 1.2.6
Statistical analysis 1.2.6.1
Summarising data 1.2.6.1.1
For summarising data, the measures of central location and spread are used depending
on the type of the variable as described in section 1.2.5.2. At this point, various
statistical tests can be applied depending on the intended use. For example, the
chi-squared (𝜒2) test can be applied to categorical variables from the same population to
investigate whether any difference between the observed and the expected frequencies is due to
chance. The general formula is:
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
where 𝑂𝑖 is the observed value and 𝐸𝑖 the expected value in category 𝑖.
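The formula can be illustrated for a contingency table as follows. This is a minimal sketch that assumes the expected counts are those implied by independence of rows and columns; the counts are hypothetical.

```python
def chi_squared(observed):
    """Pearson chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count if row and column variables were independent.
            e = row_totals[i] * col_totals[j] / total
            statistic += (o - e) ** 2 / e
    return statistic

# Hypothetical 2x2 table: rows are exposed/unexposed, columns are affected/unaffected.
table = [[30, 70], [10, 90]]
chi2 = chi_squared(table)
```

For a 2x2 table the statistic is compared with a chi-squared distribution with one degree of freedom.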
When the compared variables are continuous, either the t-test or the Mann-Whitney
U-test can be used, depending on whether the variables are normally distributed or not,
respectively. The t-test compares the means of the two groups, whereas the Mann-Whitney
U-test compares their rank distributions, to
investigate whether there is a statistically significant difference between them.
Longitudinal cohort studies 1.2.6.1.2
In cohort studies the incidence rate and risk can be estimated as described in section
1.2.5.4 and the risk ratio (or RR) as presented in 1.2.5.5. Another method that can be
applied is the standardised ratio to compare either the incidence (standardised
incidence ratio) or the morbidity (standardised morbidity ratio) in the cohort
compared to the general population (dos Santos Silva 1999). The number of new cases
that would be expected in the cohort, if the incidence or the morbidity was the same
as in the general population, is estimated and compared with the number observed. Standardisation is one of the most common
approaches used to adjust for the effect of age and/or sex and it can be either direct or
indirect. In summary, the direct method requires that stratum-specific rates are
available for all the populations studied. The indirect approach requires only the total
number of cases that occurred in each population.
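A standardised incidence ratio obtained by indirect standardisation can be sketched as below. The example assumes that stratum-specific reference rates from the general population and the person-years of the cohort in each stratum are available; the strata, rates and counts are hypothetical.

```python
def standardised_incidence_ratio(observed_cases, person_years, reference_rates):
    """Observed cases divided by the cases expected under the reference rates."""
    expected = sum(person_years[s] * reference_rates[s] for s in person_years)
    return observed_cases / expected, expected

# Hypothetical cohort person-years and general-population rates per person-year, by age band.
person_years = {"<40": 5000.0, "40-59": 3000.0, "60+": 2000.0}
reference_rates = {"<40": 0.001, "40-59": 0.002, "60+": 0.005}

sir, expected = standardised_incidence_ratio(30, person_years, reference_rates)
```

A ratio above 1 indicates more cases in the cohort than would be expected if the cohort experienced the general-population rates.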
Regarding regression modelling, two techniques are mostly used in cohort studies:
Cox regression and Poisson regression. In Cox regression analysis, the target
parameter is the time until the occurrence of the outcome of interest. Cox regression
uses a proportional hazard model to calculate the hazard ratio (HR). Poisson
regression is used when the target parameter is the number of observations of a rare
event; for example, the number of ovarian cancer cases within a certain period of time.
Case-control studies 1.2.6.1.3
The measure of association between the exposure and the outcome of interest used in
case-control studies is the OR. The best statistical method to estimate the OR for the
binary variable (outcome yes or no) is the logistic regression analysis.
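For a single binary exposure, the OR can also be computed directly from the 2x2 table, which gives the same point estimate as an unadjusted logistic regression. The sketch below uses hypothetical counts and Woolf's approximation for the CI, which is an assumption of this example rather than a method described in the text.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio with an approximate 95% CI based on the log odds ratio (Woolf)."""
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    or_estimate = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_estimate) - 1.96 * se_log_or)
    upper = math.exp(math.log(or_estimate) + 1.96 * se_log_or)
    return or_estimate, lower, upper

# Hypothetical study: 40 of 100 cases and 20 of 100 controls were exposed.
or_estimate, lower, upper = odds_ratio(40, 60, 20, 80)
```

Logistic regression is still needed when the OR has to be adjusted for confounders.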
Cross-sectional studies 1.2.6.1.4
In cross-sectional studies the prevalence is estimated as a measure of frequency and
the prevalence OR is a measure of association (or effect). The prevalence OR
compares the odds of the prevalence of the outcome in the exposed group with the
odds of the prevalence of the outcome in the unexposed group.
Genetic epidemiology: Investigating the genetic basis of complex diseases 1.2.7
Genetic epidemiology focuses on the role of genes and their interplay with
environmental factors in the development of a disease in families and in populations
(Kaprio 2000). The flow of research in genetic epidemiology is summarised in Table 3
and described in more detail in the following sections.
Table 3 | Types of studies in genetic epidemiology and their use
Aim | Analytical study design
Familial clustering | Familial aggregation study
Genetic or environmental basis | Twin studies
Mode of inheritance | Segregation analysis
Disease susceptibility loci | Linkage analysis
Disease susceptibility variants | Association study
Disease susceptibility variants | Candidate-gene association study
Refining disease true causality | Fine-mapping study
Familial aggregation studies 1.2.7.1
The initial step in determining the potential genetic basis of a trait is to investigate
whether the trait appears in families more often than expected (familial clustering)
without any specific model in mind (Matthews, Finkelstein et al. 2008). The analysis for
dichotomous traits, such as psoriatic arthritis (PsA), is based on familial sampling in
which affected subjects and healthy controls are identified and the disease status of
their relatives is assessed. By comparing the prevalence of a trait in relatives (e.g.
siblings) of cases with that in the general population, the potential increased risk of having the
trait when having relatives with the same trait is determined. Thus, a possible genetic
component of the trait can be identified. The measure used is termed the relative
recurrence risk
$$\lambda_R = \frac{\lambda}{K}$$
where 𝑅 is the type of relatives (siblings, first degree relatives), 𝐾 is the prevalence of
the trait in the population and λ is the probability that a subject has the disease given that a
relative also has the disease. Higher values of the recurrence risk suggest that a greater
proportion of the risk clusters in families compared to the general population.
Logistic regression analysis can be used to assess the familial aggregation, adjusting for
potential confounders such as environmental risk factors for each relative.
Twin studies 1.2.7.2
Aggregation studies are not sufficient to demonstrate a genetic basis for a trait, as
aggregation can be the result of other factors including environmental determinants.
Hence, the next step is the estimation of heritability (ℎ2) via twin studies (Sahu and
Prasuna 2016). Heritability is the proportion of the variation in a trait that is due to genetic
differences.
In twin studies, monozygotic (identical) twins, who share all of their genes, are
compared with dizygotic (fraternal) twins, who share on average 50% of their genes; both types
of twins have common environmental exposures. The measure used in this design is the concordance rate,
which is defined as the probability that a pair of subjects will both have a certain trait,
given that one of them has the trait. The concordance rate is calculated as follows:
$$\text{Concordance rate} = \frac{\text{both twins affected}}{\text{one twin affected} + \text{both twins affected}} \times 100$$
If the disease is genetic, the concordance rate will be higher for identical twins
compared to fraternal ones.
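The comparison of concordance rates can be sketched as follows, using the pairwise definition given above. The pair counts are hypothetical and serve only to show the calculation.

```python
def pairwise_concordance(both_affected_pairs, one_affected_pairs):
    """Percentage of twin pairs with both twins affected, among pairs with at least one affected."""
    return 100.0 * both_affected_pairs / (one_affected_pairs + both_affected_pairs)

# Hypothetical counts of twin pairs in which at least one twin is affected.
mz = pairwise_concordance(both_affected_pairs=35, one_affected_pairs=65)
dz = pairwise_concordance(both_affected_pairs=12, one_affected_pairs=88)

# A higher rate in monozygotic than in dizygotic pairs points towards a genetic contribution.
suggests_genetic_component = mz > dz
```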
Linkage studies 1.2.7.3
Genetic linkage analysis is used to detect loci in the genome that contain disease
predisposing genes. There are two methods used for such analysis: parametric and
non-parametric linkage analysis. Parametric analysis is also called model-based as it
firstly requires the construction of the model for explaining the disease inheritance in a
family with both diseased and non-diseased individuals and then the estimation of the
recombination rate for a given pedigree. Non-parametric analysis does not require
knowledge of the mode of inheritance, which is why it is preferred in
multifactorial diseases in which the inheritance pattern is not clear. The idea behind
this method is that affected siblings will share susceptibility alleles and nearby markers
more often than expected by chance (Risch 1990).
Genetic association studies 1.2.7.4
The most efficient method to identify susceptibility loci for diseases, in which common
variants are causal, is the genome-wide association study (GWAS) which analyses
DNA sequence variations such as single nucleotide polymorphisms (SNPs) across the
human genome in order to identify genetic risk factors for diseases that are common
in the population. The conduct of GWAS is feasible because of several factors. Firstly,
the International HapMap Project (International HapMap 2003) identified the
commonly occurring SNPs for testing in genetic studies. A variety of sequencing
methods were used and SNPs were discovered in the European population, the
Yoruba population of African descent, Han Chinese and Japanese from Tokyo. GWAS
were also made possible by the advance in genotyping technology as chip-based
microarrays for assaying one million or more SNPs were developed. Finally, the
development of statistical methods to assist in the data mining and analyse the genetic
data and the international collaborations that formed to explore the genetic basis of
common diseases by combining well-phenotyped cohorts, contributed to the wide
expansion of GWAS. Usually, hundreds of thousands of markers are used to achieve
genome-wide coverage. However, the large number of statistical tests conducted in
GWAS requires a genome-wide threshold of significance, protecting against false-
positive results that will occur when multiple tests are performed at the level of 0.05.
The first threshold of 5 × 10⁻⁸ was proposed in 1996 (Risch and Merikangas 1996) and is
widely used³ (Hoggart, Clark et al. 2008).
3 A GWAS involves approximately 1 million independent tests, thus the significance threshold that is
widely used has been Bonferroni corrected for the multiple tests ($P = 0.05 / 10^{6} = 5 \times 10^{-8}$).
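The arithmetic behind the genome-wide threshold can be sketched in a few lines. The figure of one million independent tests is the approximation stated in the footnote; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

n_tests = 1_000_000  # approximate number of independent tests in a GWAS

threshold = bonferroni_threshold(0.05, n_tests)

# Without correction, and if every null hypothesis were true, this many
# tests would be expected to reach p < 0.05 by chance alone.
expected_false_positives_uncorrected = 0.05 * n_tests
```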
GWAS have successfully identified numerous associations for complex diseases,
exploiting the linkage disequilibrium (LD)⁴ between nearby genetic variants. However,
the majority of these strongly associated SNPs are most likely to be in LD with the
causal variant, rather than playing a biological role themselves. In order to identify truly
causal variants, fine-mapping of the associated locus is required, where all variants in
the region are densely genotyped. This is carried out in large independent studies,
usually large international consortia that design custom genotyping arrays. These arrays
containing approximately 200,000 variants provide dense genotyping of previously
discovered GWAS regions for fine-mapping (Spain and Barrett 2015). For instance, the
Immunochip is an Illumina Infinium custom array containing 196,524 polymorphisms
designed to replicate and fine-map established GWAS significant associations with
autoimmune diseases. An initiative of the Wellcome Trust Case-Control
Consortium (WTCCC), Immunochip was designed to incorporate loci from 12
inflammatory disorders including rheumatoid arthritis (RA), psoriasis (PSO), Crohn’s
disease (CD), ulcerative colitis (UC), ankylosing spondylitis (AS), systemic lupus
erythematosus (SLE), type 1 diabetes (T1D), thyroid disease, celiac disease, multiple
sclerosis (MS), primary biliary cirrhosis (PBC) and immunoglobulin A (IgA) deficiency
(Cortes and Brown 2011). The Immunochip array allows dense genotyping across 186
regions of the genome with evidence for association with autoimmune diseases.
Use of summary statistics and pleiotropy methods 1.2.7.5
GWAS have been successful in identifying genetic variants that are associated with
susceptibility to complex traits and in highlighting candidate underlying biological
mechanisms. However, these variants explain only a proportion of the trait’s
heritability, as there are many variants that have a low penetrance that GWAS cannot
statistically associate with a trait (Maher 2008). These studies have produced extensive
genetic variation databases whose analysis could shed more light into the genetics of
complex diseases, but the individual-level genotype and phenotype data are often
inaccessible due to confidentiality concerns. For that reason, GWAS summary statistics
data from large consortia can be used as they are publicly available and advantageous in
computational cost. They usually contain per-allele SNP effect sizes along with their
standard errors, which can be used to compute the z-score (the effect size divided
by the standard error).
4 The term is used to describe the non-random association of alleles at two or more loci and it can be
produced by natural selection, mutation, random drift and gene flow.
A variety of methods have been developed for the analyses of GWAS summary
statistics including methods focusing on single-variant association analysis (meta-
analysis, conditional association), fine-mapping causal SNPs, polygenic risk scores
construction for disease prediction, joint analysis of multiple traits and causal inference.
In the current thesis, I will concentrate on meta-analysis, cross-trait analysis methods
and causal inference via Mendelian randomization.
Cross-trait analyses 1.2.7.5.1
Most complex diseases, such as autoimmune disorders, have a shared genetic aetiology
which can be due to pleiotropy; that is when shared genetic variants with non-zero
effects influence multiple traits. Methods have been developed that exploit pleiotropic
effects in order to identify novel genetic associations and investigate the underlying
biological pathways (Andreassen, Thompson et al. 2013; Liley and Wallace 2015;
Pickrell, Berisa et al. 2016).
Pickrell et al. applied a Bayesian approach to summary statistics data from 42 traits,
whose relationships to one another were not known in advance, and performed a scan for SNPs that influence pairs of traits
at each locus in the genome, including a correction for overlapping subjects in the
model (Pickrell, Berisa et al. 2016). Another method, exploiting pleiotropic effects
among diseases known or suspected to be related, can leverage the increased power
from combining GWAS and detect novel common variants that could not be identified
in the original GWAS analysis because of the stringent significance threshold. The Bayesian
conditional False Discovery Rate (cFDR) constitutes an upper bound on the expected
false discovery rate (FDR) across a set of SNPs whose p-values for two diseases are
both less than two disease-specific thresholds. This model-free statistical analysis is
based on the notion that if two diseases share common genetic risk factors, a degree
of association of a locus with one disease may increase the likelihood of detecting an
association with the other (Andreassen, Thompson et al. 2013). This method has also been
extended to include studies with overlapping control subjects, strengthening the
power of the technique (Liley and Wallace 2015).
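The idea of conditioning on a second disease can be sketched with a commonly used empirical estimator: the p-value for the principal disease divided by the proportion of SNPs that are at least as significant for it, among SNPs that are at least as significant for the conditional disease. The estimator form, function name and p-values below are assumptions made for illustration and are not taken from the text.

```python
def conditional_fdr(p_principal, p_conditional, index):
    """Empirical conditional FDR for one SNP, given p-value lists for two diseases."""
    pi = p_principal[index]
    pj = p_conditional[index]
    # SNPs at least as significant as this one for the conditional disease.
    conditioned = [k for k, p in enumerate(p_conditional) if p <= pj]
    # Of those, SNPs that are also at least as significant for the principal disease.
    both = sum(1 for k in conditioned if p_principal[k] <= pi)
    # 'both' is at least 1 because the SNP itself always qualifies.
    return min(1.0, pi * len(conditioned) / both)

# Hypothetical p-values for five SNPs in two diseases.
p_disease1 = [1e-4, 0.20, 0.03, 0.50, 0.80]
p_disease2 = [1e-3, 0.60, 0.02, 0.01, 0.90]

cfdr_third_snp = conditional_fdr(p_disease1, p_disease2, 2)
```

With genome-wide data, SNPs enriched for association with the second disease obtain a conditional FDR well below their unconditional FDR, which is the source of the gain in power.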
An alternative approach to assessing the overlap between two complex diseases is to
estimate the genetic correlation between effect sizes across the two traits (Bulik-Sullivan,
Finucane et al. 2015; Palla and Dudbridge 2015). Palla and Dudbridge developed a fast
method based on polygenic risk scores for estimating the proportion of variants
affecting each trait and the genetic correlation between a pair of related traits, using
only summary statistics. However, this method requires independent datasets and the
use of uncorrelated markers, so “LD pruning” (selecting SNPs with limited pairwise
correlation) is essential (Palla and Dudbridge 2015). Another recent study developed
cross-trait LD score regression, which uses LD to estimate the
genetic covariance between traits. This method is robust to overlapping samples and adjusts for
population stratification in its calculation (Bulik-Sullivan, Finucane et al. 2015).
Meta-analysis 1.2.7.5.2
Meta-analysis is a statistical method that integrates the results of multiple
GWAS of a single trait to boost power for identifying SNP associations with small
effects (Evangelou and Ioannidis 2013). The advantage of performing meta-analysis is
the use of aggregated data, which are readily available and do not incur any additional
cost. A meta-analysis is usually performed using fixed-effect approaches, which assume that
the true effect size is constant across studies and that any differences between observed effect sizes are due to
sampling error. However, when the observed differences are also due to variation in
the true effects (which is called heterogeneity), a random-effects meta-analysis
model should be used (Kelley and Kelley 2012). Although random-effects models
reflect the heterogeneous nature of the complex diseases, they tend to be less
powerful than fixed-effects models (Evangelou and Ioannidis 2013).
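The difference between the two models can be sketched for a single SNP using inverse-variance weighting. The random-effects version below uses the DerSimonian-Laird estimate of the between-study variance, which is one common choice and an assumption of this example; the effect sizes and standard errors are hypothetical.

```python
import math

def fixed_effect(betas, ses):
    """Inverse-variance-weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    return beta, math.sqrt(1.0 / sum(weights))

def random_effects(betas, ses):
    """Pooled effect with DerSimonian-Laird between-study variance tau^2."""
    weights = [1.0 / se ** 2 for se in ses]
    beta_fe, _ = fixed_effect(betas, ses)
    # Cochran's Q measures the heterogeneity of the study estimates.
    q = sum(w * (b - beta_fe) ** 2 for w, b in zip(weights, betas))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (len(betas) - 1)) / c)
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    beta = sum(w * b for w, b in zip(re_weights, betas)) / sum(re_weights)
    return beta, math.sqrt(1.0 / sum(re_weights)), tau2

# Hypothetical per-allele effects (log odds ratios) for one SNP in three studies.
betas = [0.10, 0.30, 0.20]
ses = [0.05, 0.05, 0.05]

beta_fe, se_fe = fixed_effect(betas, ses)
beta_re, se_re, tau2 = random_effects(betas, ses)
z_fe = beta_fe / se_fe  # z-score: pooled effect divided by its standard error
```

When heterogeneity is present the random-effects standard error is larger than the fixed-effect one, which illustrates the loss of power mentioned above.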
The use of single-trait analyses cannot exploit information provided by correlated
traits, thus methods have been developed which jointly analyse GWAS results from
several related diseases (Cotsapas, Voight et al. 2011; Bhattacharjee, Rajaraman et al.
2012). These approaches boost the statistical power to detect genetic associations for
each disease and investigate the underlying biological pathways.
Cross-phenotype meta-analysis (CPMA) is a statistical approach that assesses whether
a SNP has multiple phenotypic associations across different diseases that may be
genetically similar, such as autoimmunity (Cotsapas, Voight et al. 2011; Turley, Walters
et al. 2018). CPMA is agnostic to the direction of the effect and it examines the
deviation in the distribution of association p-values, thus it can detect variants that are
associated with at least a subset of, and not necessarily all, diseases. A major disadvantage
of this method is that it cannot be applied to studies that share the same control
samples.
In contrast, both multi-trait analysis of GWAS (MTAG) by Turley et al. and the
subset-based method (SBM) by Bhattacharjee et al. are robust to sharing the same
controls, which is essential when summary statistics data come from large consortia.
MTAG is a generalised inverse-variance-weighted meta-analysis method that is based
on the key assumption that all markers share the same variance-covariance matrix of
effect sizes across diseases; even when this assumption is violated, MTAG has proven to be a consistent estimator
(Turley, Walters et al. 2018). Its main advantage is that it can be particularly useful for a
disease of interest that is underpowered but shows strong genetic correlation with
other diseases. However, the application of MTAG to a large number of low-powered
studies or to GWAS with a substantial difference in power could cause large inflation
of the FDR. Regarding the subset-based method, it is a generalisation of the basic
fixed-effects meta-analysis that allows some subset of the studies to have no effect or
the effect of susceptibility loci to manifest in different directions for different traits.
More specifically, this method explores all possible subsets for non-null associations to
identify the strongest one and then evaluates the significance of the association while
accounting for multiple testing (Bhattacharjee, Rajaraman et al. 2012).
Investigating risk factors for PSO and PsA 1.3
PSO is a chronic, immune-mediated disorder with variable manifestations, severity and
course. It mainly affects the skin and is associated with both a physical and a
psychological burden, comparable to other major chronic disorders (Rapp, Feldman et
al. 1999). Up to 30% of patients can develop chronic inflammatory arthritis, called PsA
(Gladman, Antoni et al. 2005). Certain clinical features such as negative testing for
rheumatoid factor differentiate PsA from RA, which classes PsA as a seronegative
spondyloarthropathy (Moll and Wright 1973). Both PSO and PsA present a large
degree of clinical overlap, as patients with PsA usually present with skin manifestations
as well. They are both complex diseases, like the majority of immune related disorders,
whose onset and progression is influenced by the individual’s genetic predisposition
and various environmental and lifestyle factors.
Epidemiology of PSO and PsA 1.3.1
Prevalence and incidence of PSO 1.3.1.1
PSO affects approximately 0.91% (United States) (Robinson, Hackett et al. 2006) to 8.5%
(Norway) (Bo, Thoresen et al. 2008) of the population worldwide. The occurrence
varies according to age and geographic location, with countries further away from
the equator having higher prevalence rates. In the United Kingdom (UK) it is estimated to
occur in 2-3% of the general population (Parisi, Symmons et al. 2013). Moreover, a
recent study showed that latitude may significantly influence PSO, with 6.5 new PSO
cases per 100,000 person-years for every degree increase in latitude in the UK
(Springate, Parisi et al. 2017). The incidence of PSO in adults varied from 78.9 per
100,000 person-years in the United States (USA) to 230 per 100,000 person-years in
Italy (Parisi, Symmons et al. 2013).
PSO equally affects women and men and it can develop at any age; however, two
peaks of incidence have been reported at the ages of 16 or 22 and 60 or 57 (Henseler
and Christophers 1985), describing a bimodal distribution. This dichotomises the
disease into Type 1 (early-onset PSO before 40 years of age) and Type 2 (late-onset
PSO after the age of 40).
PSO in childhood is less prevalent compared to adulthood, ranging from 0% in Taiwan
to 2.1% in Italy; whereas, the incidence rate reported in the USA was 40.8 per 100,000
person-years (Parisi, Symmons et al. 2013).
Prevalence and incidence of PsA 1.3.1.2
The prevalence of PsA is difficult to estimate as until recently there was a lack of
widely accepted classification criteria. Nevertheless, prevalence estimates in the USA
range from 0.06 to 0.25% and from 0.05% to 0.21% in Europe (Ogdie and Weiss 2015).
The incidence of PsA in the general population ranges from 0.1 to 23.1 per 100,000
person-years according to a systematic review (Alamanos, Voulgari et al. 2008).
Despite the low prevalence in the general population, PsA is the most frequent
comorbidity in patients with PSO, with prevalence ranging from 6% to 41% depending
again on the definition of the disease and the methodology used per study (Ogdie and
Weiss 2015). In a population-based study, the cumulative incidence of PsA over time in
patients with PSO was assessed and 1.7%, 3.1% and 5.1% had developed PsA at 5, 10
and 20 years, respectively, after being diagnosed with PSO (Wilson, Icen et al. 2009).
In a prospective study of 313 psoriatic patients, an annual incidence of 1.87% was
reported (Eder, Chandran et al. 2011).
Clinical manifestations of the diseases 1.3.2
PSO is a diverse disease that can manifest as various phenotypes as seen in Figure 3.
PSO vulgaris is the most common type accounting for 90% of all cases and is
characterised by round to oval, raised plaques covered with silvery white scales to
well-defined, erythematous areas at the knees, elbows, scalp and lower back. Guttate
PSO manifests as smaller, less scaly patches, has an onset in childhood or young
adulthood (age<30 years old) and is typically triggered by streptococcal infection.
Inverse PSO presents at the folds of the body as erythematous, non-scaly plaques,
whereas erythrodermic PSO is a relatively rare and rather severe type that
appears as a widespread erythema covering 90% of the patient’s body and can be
life-threatening (Greb, Goldminz et al. 2016). Until recently, pustular PSO (including
generalised pustular PSO (GPP) and palmoplantar pustulosis), was referred to as a type
of PSO; however, evidence suggests that it is likely to be a distinct entity. Genetic
studies of GPP have found an association of a small number of cases with mutations in
CARD14 and AP1S3 (Navarini, Burden et al. 2017). In addition, GPP has been observed
in patients without a history of PSO and it has been reported that interleukin-36RN
(IL36RN) mutations are more prevalent in patients with GPP alone compared to those
with PSO as well (Sugiura, Takemoto et al. 2013).
Figure 3 | Skin manifestations of psoriasis a) psoriasis vulgaris b) guttate psoriasis c) inverse psoriasis and d) erythrodermic psoriasis. Picture e) shows pustular psoriasis, previously reported to be a PSO phenotype. Picture adapted from (Greb, Goldminz et al. 2016) – used with permission.
Furthermore, a common feature of PSO is the involvement of the nail which presents a
lifetime incidence of 80-90% in patients with PSO (Reich 2009) and PsA (Tan, Chong et
al. 2012). The nail changes include the involvement of the nail matrix which causes nail
pitting and nail dystrophy, and the nail bed involvement that leads to subungual
hyperkeratosis and onycholysis that appear as yellow, keratinous material under the
nail plate (Sobolewski, Walecka et al. 2017) (Figure 4).
Figure 4 | Nail changes in patients with psoriasis. These include discoloration and dystrophy.
The nail matrix is anatomically connected to the enthesis of the extensor tendon of the distal interphalangeal
(DIP) joint, with the latter being the joint most often affected in PsA
and thus, potentially explaining the higher prevalence of nail changes in those patients
(Tan, Chong et al. 2012).
The age of onset of PsA is usually between 30-55 years and both sexes are equally
affected. Despite the fact that most patients (~70%) with PsA suffer from PSO at the
time of diagnosis, in 30% of cases PsA precedes PSO or a simultaneous development is
observed (Gottlieb, Korman et al. 2008). PsA is an inflammatory disease causing pain
and joint damage, leading to disability. The clinical manifestations of the disease can be
diverse in severity and involvement; patients may develop axial and peripheral joint
inflammation, nail dystrophies as those seen in PSO, enthesitis or dactylitis (Figure 5).
Patients with PsA are at high risk of developing spondylitis (40%); therefore, the
disorder is classified with the spondyloarthropathies. However, PsA can be
distinguished from the other spondyloarthropathies by the development of peripheral arthritis and
asymmetrical joint involvement (Gladman, Antoni et al. 2005).
Figure 5 | Manifestations of psoriatic arthritis a) nail changes b) swollen joint (left knee) c) swollen Achilles tendon (enthesitis) d) swollen/sausage digit (dactylitis). [Picture reprinted from http://www.aad.org]
Classification and diagnostic criteria for PsA 1.3.2.1
The discrimination between disorders with similar manifestations poses a great
challenge for specialists. Thus, the development of criteria for use in clinical care and
research is an important aspect in rheumatology. Although classification and diagnostic
criteria can be very similar, especially in well-defined diseases such as gout, in reality
disease features are usually different among patients. Thus, classification criteria do not
perform 100% accurately leading to misclassification, so they cannot be used for
diagnosis. The primary aim of classification criteria is to create a well-defined cohort
capturing the majority of patients with the key features of the disease for research
purposes, whereas the diagnostic criteria aim to effectively identify as many subjects
with the disease as possible by incorporating the various features of the disease
(Aggarwal, Ringold et al. 2015).
In 1973, Moll and Wright proposed a set of classification criteria for PsA that was
widely used for some time (Moll and Wright 1973):
• An inflammatory arthritis (peripheral arthritis and/or sacroiliitis or spondylitis)
• The presence of PSO
• Blood test negative for rheumatoid factor, which is an autoantibody against the fragment crystallisable region of immunoglobulin G (IgG) and is detected in patients with autoimmune diseases including RA (Song and Kang 2010).
Using these criteria, PsA was classified into five major subtypes based on the clinical
features of the disease: polyarthritis, asymmetrical oligoarthritis, DIP joint, spondylitis
and arthritis mutilans. However, due to the overlap observed among the groups the
Classification of Psoriatic Arthritis (CASPAR) group suggested new classification
criteria, which have since been routinely used by most researchers (Taylor, Gladman
et al. 2006). According to CASPAR, inflammation of either joints, spine or entheses is
needed along with a score of three or more in the following:
• Current PSO (score 2), personal or family history of PSO (score 1 each)
• Psoriatic nail dystrophy, including onycholysis, pitting and hyperkeratosis (score 1)
• Negative rheumatoid factor presence test (score 1)
• Current dactylitis or personal history of dactylitis (score 1)
• Radiographic evidence of juxta-articular new bone formation (score 1).
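The scoring rule above can be sketched as a small function. This example assumes that only the highest-scoring PSO item counts (current PSO, otherwise a personal or family history), which is how the published criteria are generally applied but is an interpretation of the wording here; all parameter names are hypothetical.

```python
def caspar_score(current_pso=False, personal_history_pso=False,
                 family_history_pso=False, nail_dystrophy=False,
                 rheumatoid_factor_negative=False, dactylitis=False,
                 juxta_articular_new_bone=False):
    """Sum the CASPAR item scores for one patient."""
    # Assumption: the PSO items are mutually exclusive and the highest one counts.
    if current_pso:
        score = 2
    elif personal_history_pso or family_history_pso:
        score = 1
    else:
        score = 0
    # Each remaining item contributes one point when present.
    score += sum([nail_dystrophy, rheumatoid_factor_negative,
                  dactylitis, juxta_articular_new_bone])
    return score

def meets_caspar(inflammatory_disease, **items):
    """Inflammation of joints, spine or entheses plus a score of three or more."""
    return inflammatory_disease and caspar_score(**items) >= 3

example = meets_caspar(True, current_pso=True, rheumatoid_factor_negative=True)
```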
The above classification criteria are used in research but not for diagnostic purposes.
Instead, screening questionnaires have been developed to assist dermatologists in
identifying individuals with possible PsA in routine clinical care settings. PsA is
estimated to be undiagnosed in approximately 10.1-15.5% of patients with PSO
because of a lack of awareness among patients and dermatologists about the
relationship between skin and joint symptoms and the lack of a commonly accepted
and validated diagnostic/screening tool. Screening tools currently used include i) the
PSO and Arthritis Questionnaire (PAQ) (Peloso, Behl et al. 1997) and the modified
PAQ (mPAQ) (Alenius, Stenberg et al. 2002) ii) the Psoriatic Arthritis Screening and
Evaluation (PASE) questionnaire (Husni, Meyer et al. 2007) iii) the Toronto Psoriatic
Arthritis Screen (ToPAS) questionnaire (Gladman, Schentag et al. 2009) and ToPAS2
iv) the PSO Epidemiology Screening Tool (PEST) (Ibrahim, Buch et al. 2009) v) the PSO
and Arthritis Screening Questionnaire (PASQ) (Khraishi, Landells et al. 2010) and vi)
the Early Arthritis for Psoriatic Patients (EARP) (Tinazzi, Adami et al. 2012). The
characteristics of the screening tools can be reviewed in Table 4.
A number of studies have compared the performance of these screening instruments
in different settings and populations (Coates, Aslam et al. 2013; Haroon, Kirby et al.
2013; Walsh, Callis Duffin et al. 2013; Mease, Gladman et al. 2014; Karreman, Weel et
al. 2016; Mishra, Kancharla et al. 2017) (Table 5). While the screening tools performed
well in the training datasets, they demonstrated low sensitivity and specificity in
validation datasets. For example, Haroon et al. compared the performance of PASE,
PEST and ToPAS and found that they performed poorly in identifying patients with
non-polyarticular manifestations of PsA, resulting in low sensitivities (Haroon, Kirby et
al. 2013). In a different study, the low specificity of the same tools reflects the fact that
they identify many cases with other musculoskeletal diseases. However, that study
recruited patients from PSO clinics rather than general dermatology clinics; thus, many
patients would already have been diagnosed with PsA and excluded from the study.
This may have resulted in a reduction of the specificity (Coates, Aslam et al. 2013).
Finally, lower specificities compared to the original validation of the same tools were
presented by Walsh et al. The lower specificities were probably caused by the high
prevalence of musculoskeletal diseases in the study population which have similar
manifestations as those caused by PsA. In addition, it was noted that the diversity in
PsA’s phenotypes may have resulted in lower sensitivities as patients who did not fulfil
the CASPAR criteria were included. Finally, it was shown that these tools did not
adequately differentiate PsA from osteoarthritis or fibromyalgia (Walsh, Callis Duffin et
al. 2013). In contrast, Mease et al., comparing the performance of PASQ, PEST
and ToPAS, showed that these tools can effectively identify patients with arthritis who
could benefit from a rheumatological evaluation (Mease, Gladman et al. 2014).
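The two performance measures compared across these studies can be sketched as follows, taking the clinical diagnosis of PsA as the reference standard. The counts are hypothetical and do not correspond to any of the cited studies.

```python
def screening_performance(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity and specificity of a screening tool against a reference diagnosis."""
    # Sensitivity: proportion of patients with PsA who screen positive.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: proportion of patients without PsA who screen negative.
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Hypothetical validation sample: 40 patients with PsA and 160 without.
sensitivity, specificity = screening_performance(
    true_pos=24, false_neg=16, true_neg=120, false_pos=40)
```

Patients with other musculoskeletal diseases who screen positive increase the false positives and therefore lower the specificity, as described for the validation studies above.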
Interpreting the findings from the different studies is problematic because of the
existence of substantial differences in patient characteristics such as age, PSO severity,
presence and duration of PsA, treatments received and study setting (different
recruitment sites) and methods used. In general, the screening tools appeared unable
to differentiate between PsA and other musculoskeletal disorders. In addition,
differences in the wording of the questions between the tools could contribute to
their performance. For example, PASE asks about painful joints, PEST about swollen
joints and ToPAS asks about red and swollen joints. Furthermore, in the case of
patients not having any musculoskeletal symptoms, their score would be negative for
PsA. However, the questionnaires could help patients realise the connection between skin and joint
involvement and make them more “open” in revealing to their physicians any signs or
symptoms they had (such as back pain that is usually believed to be part of the
everyday life).
A high proportion of patients with PSO also have undiagnosed PsA, and raising
awareness of the association between PSO and arthritis could help to address
that issue. These tools demonstrate good sensitivity and specificity in the development
stages, but fail to perform to the same high standards in validation attempts. It is clear
that clinical symptoms will be important classifiers for the identification
of PSO patients at the very early stages of developing PsA, but they are not sufficient alone
to provide accurate prediction.
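The practical effect of the drop in performance between development and validation can be illustrated with a short calculation. The sketch below is not taken from any of the cited studies: it assumes a hypothetical PsA prevalence of 20% among screened PSO patients (the constant ASSUMED_PREVALENCE) and compares the positive predictive value implied by the PEST development-phase figures in Table 4 (sensitivity 0.92, specificity 0.78) with that implied by one validation estimate in Table 5 (Coates, Aslam et al. 2013: 0.77 and 0.37). The function names are illustrative only.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true PsA cases flagged by the tool
    specificity = tn / (tn + fp)  # share of non-PsA patients correctly not flagged
    return sensitivity, specificity


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a screen-positive patient truly has PsA (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical prevalence of PsA among screened PSO patients (an assumption).
ASSUMED_PREVALENCE = 0.20

# PEST at development (Table 4) and in one validation study (Table 5).
ppv_development = positive_predictive_value(0.92, 0.78, ASSUMED_PREVALENCE)
ppv_validation = positive_predictive_value(0.77, 0.37, ASSUMED_PREVALENCE)

print(f"PPV at development: {ppv_development:.2f}")
print(f"PPV at validation:  {ppv_validation:.2f}")
```

Under this assumed prevalence, the lower validation specificity roughly halves the proportion of screen-positive patients who would truly have PsA, which is consistent with the conclusion that the tools alone do not provide accurate prediction.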
Table 4 | Characteristics of the screening tools at their development phase
Characteristics PAQ (pilot) mPAQ by Alenius PASE ToPAS PEST PASQ EARP
ToPAS 2 ePASQ
Setting Community and
hospital based
register
Combined
dermatology-
rheumatology
clinic
PsA clinic, PSO
clinic, general
dermatology
clinic, general
rheumatology
clinic (without
PsA patients) and
family medicine
clinic
Community
sample (two
general
practitioners
(GPs)) and
hospital
rheumatology
clinic
Dermatology
and
rheumatology
clinic
Dermatology-
rheumatology
combined clinic
Community-
based
Based on De novo
dermatological
input
PAQ De novo
dermatological
and
rheumatological
input using the
Delphi method
Dermatology,
rheumatology
and methodology
input
mPAQ PAQ typical
symptoms and
signs in PsA
patients
ToPAS PASQ
Date 1997 2002 2007 2009
2009 2010 2012
2011
PAQ: Psoriasis and Arthritis Questionnaire; mPAQ: modified PAQ; PASE: Psoriatic Arthritis Screening and Evaluation; ToPAS: Toronto Psoriatic Arthritis Screen;
PEST: Psoriasis Epidemiology Screening Tool; PASQ: Psoriasis and Arthritis Screening Questionnaire; EARP: Early Arthritis for Psoriatic patients; PsA: Psoriatic Arthritis;
GP: General Practitioner; PSO: Psoriasis
Table 4 (continued) | Characteristics of the screening tools at their development phase
Tool | Initial patients administrated | Cut-off score
PAQ (pilot) | 108 PSO patients | 7
mPAQ by Alenius | 202 psoriatic patients not knowing whether they had arthritis | 4
PASE | 69 PSO patients naïve to systemic therapy | 47
ToPAS | 134 (PsA clinic), 123 (PSO clinic), 118 (dermatology), 135 (rheumatology), 178 (family medicine) | 8
ToPAS 2 | 131 (with PsA), 336 (with PSO only), 89 (healthy controls) | 7 (8)
PEST | 93 with unknown PsA (GP) and 21 diagnosed with PsA (rheumatology clinic) | 3
PASQ | group A: 87 with either PSO or PsA, group B: 42 with early PsA | 9 (group A), 7 (group B)
ePASQ | 54 with suspected early PsA (with or without known PSO) | 7
EARP | 228 PSO patients with unknown PsA naïve to systemic therapy with a disease-modifying anti-rheumatic drug (DMARD) | 3
Table 4 (continued) | Characteristics of the screening tools at their development phase
Tool | Sensitivity (%) | Specificity (%)
PAQ (pilot) | 85 | 88
mPAQ by Alenius | 60 | 62
PASE | 82 | 73
ToPAS | 89 (PSO and PsA), 92 (Dermatology and PsA), 93 (Rheumatology and PsA), 90 (Family medicine and PsA) | 86 (PSO and PsA), 95 (Dermatology and PsA), 86 (Rheumatology and PsA), 100 (Family medicine and PsA)
ToPAS 2 | 92 (87) (PsA vs rest), 92 (87) (PsA vs PSO), 92 (87) (PsA vs healthy) | 77 (83) (PsA vs rest), 74 (80) (PsA vs PSO), 90 (92) (PsA vs healthy)
PEST | 92 | 78
PASQ | 86 (group A), 93 (group B) | 89 (group A), 75 (group B)
ePASQ | 98 | 75
EARP | 85 | 92
Table 4 | Characteristics of the screening tools at their development phase
Characteristics PAQ
(pilot)
mPAQ by Alenius PASE ToPAS PEST PASQ EARP
ToPAS 2 ePASQ
Axial
involvement
Yes Yes Yes (being developed) Both Yes Yes Both Yes Yes
Skin/nail
involvement
Yes Yes No Both Yes Yes Both Yes No
Unique features 7-item symptom
subscale and 8-item
function subscale, PsA
and osteoarthritis
symptom distinction,
tracks patients’
response to
treatment, refers only
to current status
pictures of skin/nail
involvement, PsA
can be screened in
any population, use
of direct questions
manikin for areas
of tenderness
manikin for joint
involvement
Addition of pictures
of dactylitis and
arthritic joints,
rephrasing
questions about
axial disease
Physician’s
involvement was
not needed as it is
electronic and
self-scoring
Table 5 | Comparison of psoriatic arthritis screening tools by different studies
Studies PASE47 PASE44 PEST ToPAS ToPAS 2 EARP PASQ
Sensitivity Specificity Sensitivity Specificity Sensitivity Specificity Sensitivity Specificity Sensitivity Specificity Sensitivity Specificity Sensitivity Specificity
(Haroon, Kirby
et al. 2013)
0.24
0.62*
0.94 0.28
0.86*
0.98 0.41
0.83*
0.90
(Coates, Aslam
et al. 2013)
0.75 0.39 - - 0.77 0.37 0.77 0.30 - - - - - -
(Walsh, Callis
Duffin et al.
2013)
0.68 0.50 0.78 0.40 0.85 0.45 0.75 0.72 - - - - - -
(Mease,
Gladman et al.
2014)
- - - - 0.84 0.75 0.77 0.72 - - - - 0.67 0.64
(Mishra,
Kancharla et
al. 2017)
0.76 0.95 0.80 0.95 0.53 0.95 - - 0.44 0.97 0.91 0.88 - -
(Karreman,
Weel et al.
2016)
0.59 0.66 0.66 0.57 0.68 0.71 - - - - 0.87 0.34 - -
* group 2 (= confirmed psoriatic arthritis)
PASE: Psoriatic Arthritis Screening and Evaluation; PEST: Psoriasis Epidemiology Screening Tool; ToPAS: Toronto Psoriatic Arthritis Screen;
EARP: Early Arthritis for Psoriatic patients; PASQ: Psoriasis and Arthritis Screening Questionnaire
Immunopathogenesis of PSO and PsA 1.3.3
The pathogenesis of PSO is the result of the interplay between skin cells and the innate
and adaptive immune systems. PSO was considered to be a disease characterised by
the hyper-proliferation of keratinocytes that manifested in the characteristic scaling
plaques, until the crucial role of T-cells was demonstrated (Gottlieb, Gilleaudeau et al.
1995). Initially, the concept of an immune-mediated pathogenesis was supported by the
presence of T-cells in the psoriatic plaques, including T helper 1 (Th1; CD4+) and T
cytotoxic (Tc1; CD8+) subsets (Krueger 2002); however, the pathogenic paradigm has
shifted and a key role of the IL-23/IL-17 axis has now been recognised (Martin,
Towne et al. 2013).
The pathogenesis of PSO can be divided into two stages: initiation and maintenance.
The activation of the dermal dendritic cells (DCs) by a trigger, often environmental
such as infection or trauma, causes the production of cytokines IL-12 and IL-23 which
in turn induce the differentiation of T cells into effector cells such as Th1 or Tc1 and
Th17 or Tc17, respectively. The effector cells (Th and Tc) recirculate and travel into
the skin tissue in response to the signals from key chemokines and cytokines
(Nograles, Davidovici et al. 2010). There they produce IL-17 and IL-22 which lead to
further keratinocyte activation and proliferation, creating the skin lesions and
maintaining the disease. More recently, a distinct subset of Th cells, known as Th22,
was found in psoriatic lesions. Th22 cells reside in normal skin and are
enriched in the lesions, where they produce IL-22 but not IL-17
or interferon γ (IFN-γ) (Duhen, Geiger et al. 2009; Fujita 2013). DCs also produce
tumor necrosis factor alpha (TNFα) and IL-23. TNFα amplifies inflammation by
regulating the antigen-presenting cells and by activating IL-23 production in DCs,
whereas IL-23 is responsible for the expansion of Th17 cells that produce IL-17
(Lowes, Kikuchi et al. 2008). Finally, the secretion of chemokines, cytokines and
antimicrobial peptides from the keratinocytes contributes to the feedback loop that
exists between the immune system and the epidermal cells in PSO by attracting further
immune cells (Lowes, Kikuchi et al. 2008). Bergboer et al. suggested an alternative
model for the pathogenesis of PSO, based on genetic evidence indicating that alterations
in skin barrier function contribute to PSO along with the innate and adaptive
immune systems (Bergboer, Zeeuwen et al. 2012).
The pathogenesis of PsA is not clearly understood due to the complexity of the
disease. Inflammation can appear within the entheses, the synovium and the spine,
affecting both soft tissue and bone with a range of immune cells playing a role in the
onset and progression of the disease. When the immune system is triggered, two main
events lead to the development of PsA: T cells and B cells infiltrate the enthesis and
the synovium of the joint, and the entheseal and synovial tissues respond to the
infiltration, with the presence of CD8+ T-cell clones implicating the adaptive immune
system and CD4+ T-cells infiltrating the synovial fluid and the epidermis of
the skin. Results from experiments using mouse and human tissues showed that
IL-23-induced Th17 cytokines (IL-17 and IL-22) can contribute to all four pathological
features in PsA: development of psoriatic plaque, pannus formation in the joint, joint
erosion, and new bone formation (Raychaudhuri, Saxena et al. 2015).
Analysis of synovial fluid has shown elevated levels of pro-inflammatory cytokines
including IL-1, IL-2, IL-6, IL-8, TNFα and IFN-γ. Moreover, within the synovium
angiogenesis has been observed due to functional changes in the infiltrating immune
cells. In the joint, osteoclastogenesis and bone resorption are induced by the infiltration
of T-cells. In addition, inflammatory changes have been seen extending from the
enthesis into the adjacent tissues and synovium. The enthesis is the insertion of ligament,
tendon or joint capsule to bone (Figure 6), including the underlying bone at attachment
sites, and enthesitis is a focal insertional inflammation (Benjamin and McGonagle 2001).
Based on imaging results, McGonagle et al. suggested that PsA is primarily an enthesis-
centred disease with enthesitis appearing at sites of high mechanical stress and
compression forces with synovitis arising later (McGonagle 2005).
Figure 6 | Joint with the enthesis and synovial lining being points of inflammation in psoriatic arthritis. Adapted from Wikipedia (https://en.wikipedia.org)
Comorbid diseases 1.3.4
PSO is not just “skin deep” but is linked to many comorbidities (Ni and Chiu 2014).
The “PSO march” or “inflammatory march” model has been used to describe the
systemic expansion of the cutaneous inflammation across a variety of organs (Furue
and Kadono 2017). According to this hypothesis, PSO produces a variety of pro-
inflammatory cytokines and chemokines not only in the lesions but also in circulation,
which can trigger chronic inflammatory responses in other tissues and induce low-
grade systemic inflammation. In a review, Krueger and Brunner described the role of
the IL-23/Th17 axis, which plays an important role in the immunopathogenesis of PSO as
described in 1.3.3, and specifically of IL-17, in the genesis of many morbidities associated
with PSO (Krueger and Brunner 2017). According to them, IL-17 acts synergistically
with TNFα and interferons to alter the response of various cell types that contribute
to the onset of comorbidities. Thus, they propose that chronic inflammation induces
insulin resistance and metabolic abnormalities (Gottlieb, Dann et al. 2008), endothelial
dysfunction and cardiovascular disorders (CVDs) (Hu and Lan 2017). In addition,
psychological (Ferreira, Abreu et al. 2016) and gastrointestinal disorders (Pietrzak,
Pietrzak et al. 2017), along with liver disease (Prussick and Miele 2017), lead patients
with PSO to a diminished health-related quality of life (de Korte, Sprangers et al.
2004). The abovementioned comorbidities also co-occur in patients with PsA (Kotsis,
Voulgari et al. 2012; Zhu, Li et al. 2012; Dreiher, Freud et al. 2013). It has been
suggested that the coexistence of cutaneous disease with joint involvement may cause
an overwhelming inflammatory status that could provoke these comorbidities (Ogdie,
Schwartzman et al. 2015). Although it is still unclear if PSO and PsA are a consequence
of these morbidities or predisposing factors, evidence shows that their co-appearance
is based on shared biological pathways.
Cardiovascular disease 1.3.4.1
The presence of both PSO and PsA has been linked to an increased prevalence of
cardiovascular diseases (Table 6).
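Table 6 | Cardiovascular events in psoriasis and psoriatic arthritis
Outcome | Study type | Patients (n) | HR/RR/OR/SPR (95% CI) | Confounders controlled
MI | cohort study (Mehta, Yu et al. 2011) | severe PSO: 3,603; controls: 14,330 | HR 1.53 (1.26-1.85) | Age, gender, hypertension, diabetes and hyperlipidaemia
MI | meta-analysis (Armstrong, Harskamp et al. 2013) | mild PSO: 201,239 | RR 1.29 (1.02-1.63) | Primary adjustment accounting for comorbidities
Stroke | meta-analysis (Armstrong, Harskamp et al. 2013) | mild PSO: 201,239 | RR 1.12 (1.08-1.16) | Primary adjustment accounting for comorbidities
MI | meta-analysis (Armstrong, Harskamp et al. 2013) | severe PSO: 17,415 | RR 1.70 (1.32-2.18) | Primary adjustment accounting for comorbidities
Stroke | meta-analysis (Armstrong, Harskamp et al. 2013) | severe PSO: 17,415 | RR 1.56 (1.32-1.84) | Primary adjustment accounting for comorbidities
MI | meta-analysis (Samarasekera, Neilson et al. 2013) | mild PSO | HR or IRR 1.34 (1.07-1.68) | Primary adjustments
Stroke | meta-analysis (Samarasekera, Neilson et al. 2013) | mild PSO | HR or IRR 1.15 (0.98-1.35) | Primary adjustments
MI | meta-analysis (Samarasekera, Neilson et al. 2013) | severe PSO | HR or IRR 3.04 (0.65-14.35) | Primary adjustments
Stroke | meta-analysis (Samarasekera, Neilson et al. 2013) | severe PSO | HR or IRR 1.59 (1.34-1.89) | Primary adjustments
MI | cross-sectional (Lai and Yew 2016) | PSO: 520; controls: 19,065 | OR 2.23 (1.27-3.95) | Smoking, alcohol consumption, metabolic syndrome and hyperuricemia
CHD | cross-sectional (Lai and Yew 2016) | PSO: 520; controls: 19,065 | OR 1.90 (1.18-3.05) | Smoking, alcohol consumption, metabolic syndrome and hyperuricemia
Stroke | cross-sectional (Lai and Yew 2016) | PSO: 520; controls: 19,065 | OR 1.01 (0.48-2.16) | Smoking, alcohol consumption, metabolic syndrome and hyperuricemia
MI | meta-analysis (Raaby, Ahlehoff et al. 2017) | mild PSO | HR 1.20 (1.06-1.35) | Primary adjustments
Stroke | meta-analysis (Raaby, Ahlehoff et al. 2017) | mild PSO | HR 1.10 (1.00-1.19) | Primary adjustments
MI | meta-analysis (Raaby, Ahlehoff et al. 2017) | severe PSO | HR 1.70 (1.18-2.43) | Primary adjustments
Stroke | meta-analysis (Raaby, Ahlehoff et al. 2017) | severe PSO | HR 1.38 (1.20-1.60) | Primary adjustments
MI | population-based cohort (Ogdie, Yu et al. 2015) | PsA: 8,706 | No DMARD: HR 1.36 (1.04-1.77); DMARD: HR 1.36 (1.01-1.84) | Age, sex, hypertension, diabetes, hyperlipidaemia, smoking and start year in the cohort
Stroke | population-based cohort (Ogdie, Yu et al. 2015) | PsA: 8,706 | No DMARD: HR 1.33 (1.03-1.71); DMARD: HR 1.13 (0.83-1.55) | Age, sex, hypertension, diabetes, hyperlipidaemia, smoking and start year in the cohort
MI | population-based cohort (Ogdie, Yu et al. 2015) | PSO: 138,424 | No DMARD: HR 1.08 (0.98-1.18); DMARD: HR 1.26 (0.92-1.72) | Age, sex, hypertension, diabetes, hyperlipidaemia, smoking and start year in the cohort
Stroke | population-based cohort (Ogdie, Yu et al. 2015) | PSO: 138,424 | No DMARD: HR 1.08 (0.99-1.17); DMARD: HR 1.45 (1.10-1.92) | Age, sex, hypertension, diabetes, hyperlipidaemia, smoking and start year in the cohort
Ref | population-based cohort (Ogdie, Yu et al. 2015) | Healthy controls: 81,573 | ref | Age, sex, hypertension, diabetes, hyperlipidaemia, smoking and start year in the cohort
Any MI | population-based cohort (Egeberg, Thyssen et al. 2017) | mild cutaneous PSO: 46,085 | HR 1.01 (0.94-1.08) | Age, sex, socioeconomic status, smoking, alcohol abuse, previous cardiovascular diseases, diabetes, hypertension, statin use and health care consumption
Any MI | population-based cohort (Egeberg, Thyssen et al. 2017) | severe cutaneous PSO: 7,369 | HR 1.19 (1.03-1.39) | Age, sex, socioeconomic status, smoking, alcohol abuse, previous cardiovascular diseases, diabetes, hypertension, statin use and health care consumption
Any MI | population-based cohort (Egeberg, Thyssen et al. 2017) | PsA: 8,149 | HR 1.22 (1.05-1.43) | Age, sex, socioeconomic status, smoking, alcohol abuse, previous cardiovascular diseases, diabetes, hypertension, statin use and health care consumption
Any MI | population-based cohort (Egeberg, Thyssen et al. 2017) | general population: 4,300,085 | Ref | Age, sex, socioeconomic status, smoking, alcohol abuse, previous cardiovascular diseases, diabetes, hypertension, statin use and health care consumption
MI | meta-analysis (Polachek, Touma et al. 2017) | PsA: 32,973; controls | OR 1.68 (1.31-2.15) | Primary adjustment
Heart Failure | meta-analysis (Polachek, Touma et al. 2017) | PsA: 32,973; controls | OR 1.32 (1.11-1.57) | Primary adjustment
Cerebrovascular events | meta-analysis (Polachek, Touma et al. 2017) | PsA: 32,973; controls | OR 1.22 (1.05-1.41) | Primary adjustment
HR: Hazard Ratio; RR: Risk Ratio; OR: Odds Ratio; SPR: Standardised Prevalence Rate; CI: Confidence Interval; MI: Myocardial Infarction; PSO: Psoriasis;
PsA: Psoriatic Arthritis; IRR: Incidence Rate Ratio; CHD: Coronary/Ischaemic Heart Disease; DMARD: Disease-Modifying Anti-Rheumatic Drug; Ref: reference group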
Hypertension 1.3.4.2
Hypertension, along with obesity, insulin resistance and hypercholesterolemia, is
one of the components of the metabolic syndrome and is a traditional risk
factor for CVDs and diabetes. Increased prevalence and incidence of hypertension have
been reported in both PSO and PsA (Table 7). Interestingly, a study among 611
patients with PsA and 449 with PSO without arthritis showed that hypertension was
significantly more prevalent in PsA (adjusted OR 2.17; 95% CI: 1.22-3.83) (Husted,
Thavaneswaran et al. 2011).
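Ratio estimates such as the adjusted OR above are usually reported with intervals that are symmetric on the logarithmic scale. As a minimal sketch, assuming a Wald-type 95% interval (an assumption; the cited study may have derived its interval differently), the standard error of the log odds ratio can be recovered from the reported limits, and the geometric midpoint of the limits should lie close to the point estimate. The helper names are illustrative only.

```python
import math


def log_se_from_ci(lower, upper, z=1.96):
    """Standard error of log(ratio), assuming a symmetric Wald interval on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def geometric_midpoint(lower, upper):
    """Point estimate implied by the interval limits under the same assumption."""
    return math.sqrt(lower * upper)


# Adjusted OR for hypertension in PsA versus PSO without arthritis, as quoted in the text.
reported_or, ci_low, ci_high = 2.17, 1.22, 3.83

se = log_se_from_ci(ci_low, ci_high)
midpoint = geometric_midpoint(ci_low, ci_high)
z_stat = math.log(reported_or) / se  # approximate Wald statistic for OR = 1

print(f"SE of log(OR): {se:.3f}")
print(f"Geometric midpoint of CI: {midpoint:.2f}")
print(f"Approximate z statistic: {z_stat:.2f}")
```

The same check can be applied to any ratio in Tables 6 to 11; a midpoint far from the reported estimate would suggest a transcription error or a non-Wald interval.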
Table 7 | Hypertension in psoriasis and psoriatic arthritis
Study type | Patients (n) | HR/RR/OR (95% CI) | Confounders controlled
meta-analysis (Armstrong, Harskamp et al. 2013) | mild PSO: 127,706; controls: 465,252 (ref) | OR 1.03 (1.01-1.06) | Age, sex, person-years and diabetes, lipids, smoking and BMI
meta-analysis (Armstrong, Harskamp et al. 2013) | severe PSO: 3,854; controls: 14,065 (ref) | OR 1.00 (0.87-1.14) | Age, sex, person-years and diabetes, lipids, smoking and BMI
Prospective (Qureshi, Choi et al. 2009) | females with PSO: 386; controls: 15,338 (ref) | RR 1.17 (1.06-1.30) | Age, BMI, smoking, alcohol intake and physical activity
case-control (Cohen, Weitzman et al. 2010) | PSO: 12,502; controls: 24,285 (ref) | OR 1.37 (1.29-1.46) | Age, sex, smoking, obesity, diabetes, use of NSAIDs and Cox-2 inhibitors
cross-sectional (Neimann, Shin et al. 2006) | PSO; controls (ref) | OR 1.58 (1.42-1.76) |
cross-sectional (Neimann, Shin et al. 2006) | mild PSO; controls (ref) | OR 1.30 (1.15-1.47) |
cross-sectional (Neimann, Shin et al. 2006) | severe PSO; controls (ref) | OR 1.49 (1.20-1.86) |
cross-sectional (Neimann, Shin et al. 2006) | PsA; controls (ref) | OR 2.07 (1.41-3.04) |
cohort study (Jafri, Bartels et al. 2017) | PsA: 9,741; controls: 307,278 (ref) | HR 1.31 (1.23-1.39) | Age, sex, other cardiovascular risk factors, heart disease, health care utilization
HR: Hazard Ratio; RR: Risk Ratio; OR: Odds Ratio; CI: Confidence Interval; PSO: psoriasis;
ref: reference group; BMI: Body Mass Index; NSAID: non-steroid anti-inflammatory drugs;
PsA: Psoriatic Arthritis
Diabetes mellitus 1.3.4.3
Type 2 diabetes mellitus (DM) is a metabolic disorder characterised by insulin
resistance and hyperglycaemia and is one of the main contributors to increased
cardiovascular morbidity and mortality (Armstrong, Harskamp et al. 2013).
In a case-control study in 1,835 PSO patients (mild cases=1,661 and severe cases=129)
from the Middle East, the prevalence of DM in mild PSO, severe PSO and controls was
37.4%, 41% and 16%, respectively (p-value=0.00001) (Al-Mutairi, Al-Farag et al. 2010).
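To put these prevalences on the same scale as the odds ratios reported elsewhere in this section, they can be converted into crude (unadjusted) odds ratios. The sketch below is purely illustrative arithmetic on the three percentages quoted above; the resulting values are not estimates reported by the cited study and ignore any matching or adjustment. The function names are hypothetical.

```python
def odds(prevalence):
    """Convert a prevalence (proportion strictly between 0 and 1) into odds."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    return prevalence / (1.0 - prevalence)


def crude_odds_ratio(prevalence_group, prevalence_reference):
    """Unadjusted odds ratio of the outcome in a group versus a reference group."""
    return odds(prevalence_group) / odds(prevalence_reference)


# Prevalence of DM as quoted in the text: mild PSO, severe PSO and controls.
or_mild = crude_odds_ratio(0.374, 0.16)
or_severe = crude_odds_ratio(0.41, 0.16)

print(f"Crude OR, mild PSO vs controls:   {or_mild:.2f}")
print(f"Crude OR, severe PSO vs controls: {or_severe:.2f}")
```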
In a population-based cohort study among 108,132 patients with PSO and 430,716
controls, the hazard ratio for incident DM was 1.14 (95% CI:1.10-1.18) in the PSO
cohort adjusted for BMI, hypertension, hyperlipidaemia, age and sex (Azfar, Seminara
et al. 2012).
A high prevalence of DM in patients with PsA has been found in the majority of studies
(Han, Robinson et al. 2006; Tam, Tomlinson et al. 2008; Solomon, Love et al. 2010;
Eder, Chandran et al. 2017), but not in all (Khraishi, MacDonald et al. 2011).
Obesity 1.3.4.4
Obesity is thought to be a chronic inflammatory condition (Monteiro and Azevedo
2010). For that reason, obesity could trigger PSO and/or PsA or it could be the
consequence of the latent diseases, arising from metabolic disorders and low quality of
life (eating habits, physical inactivity). High BMI has been found to be prevalent in both
PSO and PsA compared to the general population and the risk of developing either
disease is elevated in obese individuals (Table 8). In addition, individuals with PsA have
higher BMI (OR 1.60 (95% CI: 1.09-2.39)) compared with those with PSO after
controlling for potential confounders (Bhole, Choi et al. 2012).
The majority of the studies evaluating either the metabolic risk or adiposity in patients
with PSO have used BMI as a simple marker of obesity. However, BMI cannot provide
information about the distribution of body fat such as the proportion of visceral
adiposity in the abdominal area, which has been more closely associated with
cardiovascular risk than the total fat mass (Despres 2012). Measurement of visceral
adiposity is more challenging compared to BMI as it requires imaging techniques;
however, the use of anthropometric measures such as waist circumference has
been suggested as it could be used in a clinical setting.
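Table 8 | Obesity in psoriasis and psoriatic arthritis
Study type | Patients (n) | HR/RR/OR (95% CI) | Confounders controlled
meta-analysis (Armstrong, Harskamp et al. 2012) | PSO: 201,831; controls (ref) | OR 1.66 (1.46-1.89) | primary adjustment
cross-sectional (Bhole, Choi et al. 2012) | PSO: 644; controls: 115,787 (ref) | OR 1.84 (1.50-2.26) | Age and sex
cross-sectional (Bhole, Choi et al. 2012) | PsA: 448; controls: 115,787 (ref) | OR 2.71 (2.31-3.18) | Age and sex
population-based (Snekvik, Smith et al. 2017) | BMI (18.5-24.9): 13,904 | ref | Age, sex, education and smoking
population-based (Snekvik, Smith et al. 2017) | BMI (25.0-29.9): 15,010 | PSO RR: 1.45 (1.15-1.84) | Age, sex, education and smoking
population-based (Snekvik, Smith et al. 2017) | BMI (≥30): 4,820 | PSO RR: 1.87 (1.38-2.52) | Age, sex, education and smoking
cohort (Love, Zhu et al. 2012) | PSO with BMI <25: 26,263 | ref | Age, sex, smoking status, alcohol intake and history of trauma
cohort (Love, Zhu et al. 2012) | PSO with BMI 25-29.9: 27,147 | PsA RR: 1.09 (0.93-1.28) | Age, sex, smoking status, alcohol intake and history of trauma
cohort (Love, Zhu et al. 2012) | PSO with BMI 30-34.9: 14,088 | PsA RR: 1.22 (1.02-1.47) | Age, sex, smoking status, alcohol intake and history of trauma
cohort (Love, Zhu et al. 2012) | PSO with BMI ≥35.0: 7,897 | PsA RR: 1.48 (1.20-1.81) | Age, sex, smoking status, alcohol intake and history of trauma
HR: Hazard Ratio; RR: Risk Ratio; OR: Odds Ratio; CI: Confidence Interval; PSO: psoriasis;
ref: reference group; BMI: Body Mass Index; PsA: Psoriatic Arthritis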
However, waist circumference is highly correlated with BMI at population level and the
latter cannot distinguish visceral from subcutaneous adiposity (Despres 2012). For that
reason, other techniques have been suggested, such as whole-body bioelectrical
impedance analysis (BIA), with conflicting results about their advantages over the
routinely used measurements (Elia 2013).
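For reference, BMI is body weight in kilograms divided by the square of height in metres. The sketch below computes it and assigns the strata used for the PsA risk estimates of Love, Zhu et al. in Table 8 (<25, 25-29.9, 30-34.9 and ≥35). The function names and the example weight and height are hypothetical and are not taken from any of the cited studies.

```python
def body_mass_index(weight_kg, height_m):
    """BMI in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_stratum(bmi):
    """Assign a BMI value to the strata used by Love, Zhu et al. in Table 8."""
    if bmi < 25.0:
        return "<25"
    if bmi < 30.0:
        return "25-29.9"
    if bmi < 35.0:
        return "30-34.9"
    return ">=35"


# Hypothetical individual: 95 kg and 1.75 m tall.
example_bmi = body_mass_index(95.0, 1.75)
print(f"BMI {example_bmi:.1f} kg/m^2 falls in stratum {bmi_stratum(example_bmi)}")
```

As the surrounding text notes, such a classification says nothing about where the fat is distributed, which is why waist circumference or imaging is needed to characterise visceral adiposity.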
Hypercholesterolemia 1.3.4.5
Hypercholesterolemia or increased cholesterol is defined as excessively high plasma
cholesterol levels and it is one of the main risk factors for CVDs.
Several studies have reported the increased prevalence of hypercholesterolemia
among patients with PSO or PsA compared to the general population (Wu, Mills et al.
2008; Warnecke, Manousaridis et al. 2011). Moreover, using the Nurses’ Health Study
II which is a cohort of US women, Wu et al. showed that hypercholesterolemia was
associated with an increased risk of incident PSO (HR 1.25; 95% CI: 1.04-1.50) and PsA
(HR 1.58; 95% CI: 1.13-2.23) and more specifically patients having
hypercholesterolemia for seven years or more were at higher risk of developing PSO
and PsA (Wu, Li et al. 2014).
Liver disorders 1.3.4.6
In recent years, studies have tried to elucidate the association of liver disease with
PSO and PsA (Table 9). As described in the previous section, the metabolic syndrome
is prevalent in patients with PSO and it is also a known contributor to the
development of non-alcoholic fatty liver disease (NAFLD), a precursor of fibrosis and
cirrhosis (Paschos and Paletas 2009).
Some medications used to treat arthritis such as methotrexate and leflunomide are
known to increase the risk of cirrhosis (Tilling, Townsend et al. 2006; Curtis,
Beukelman et al. 2010). The combination of existing NAFLD with alcohol consumption
and/or cumulative methotrexate dose can lead to cirrhosis (Bath, Brar et al. 2014) in
patients with PSO and PsA; however, it is not clear to what degree methotrexate
induces hepatotoxicity in the absence of the other risk factors.
Respiratory disorders 1.3.4.7
Chronic obstructive pulmonary disease (COPD), which encompasses emphysema and
chronic obstructive bronchitis, is a tobacco-related disease (Tuder and Petrache 2012)
that affects 10%-12% of the population (Adeloye, Chua et al. 2015). There have been a
few studies assessing the prevalence of COPD in patients with PSO compared with
healthy controls; however, there is no study comparing prevalence among patients
with PsA and controls or patients with PSO without arthritis (Table 10).
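Table 9 | Liver disease in psoriasis and PsA
Outcome | Study type | Patients (n) | HR/OR (95% CI) | Confounders controlled
NAFLD | cross-sectional (van der Voort, Koehler et al. 2014) | PSO: 118; controls: 2,174 (ref) | OR 1.7 (1.1-2.6) | Age, gender, alcohol consumption, smoking, presence of metabolic syndrome and alanine aminotransferase
Advanced liver fibrosis | cross-sectional (van der Voort, Koehler et al. 2016) | PSO: 74; controls: 1,461 (ref) | OR 2.57 (1.00-6.63) | Demographics, lifestyle variables and laboratory findings
NAFLD | meta-analysis (Candia, Ruiz et al. 2015) | PSO: 581 vs. controls: 2,764 | OR 2.07 (1.62-2.64) (good quality papers) | Primary adjustments
NAFLD | meta-analysis (Candia, Ruiz et al. 2015) | PsA: 117 vs. PSO without PsA: 388 | OR 2.25 (1.37-3.71) | Primary adjustments
NAFLD | meta-analysis (Candia, Ruiz et al. 2015) | mild PSO: 9,134 vs. moderate to severe PSO: 42,795 | OR 2.07 (1.59-2.71) | Primary adjustments
Liver Disease (NAFLD, cirrhosis) | population-based (Ogdie, Grewal et al. 2017) | PsA (no ST): 5,786 | HR 1.38 (1.02-1.86) | Age at start date, sex, smoking status, alcohol intake, BMI category, use of oral corticosteroids and NSAIDs in baseline
Liver Disease (NAFLD, cirrhosis) | population-based (Ogdie, Grewal et al. 2017) | PsA (ST): 6,522 | HR 1.67 (1.29-2.15) | Age at start date, sex, smoking status, alcohol intake, BMI category, use of oral corticosteroids and NSAIDs in baseline
Liver Disease (NAFLD, cirrhosis) | population-based (Ogdie, Grewal et al. 2017) | PSO (no ST): 186,006 | HR 1.37 (1.29-1.45) | Age at start date, sex, smoking status, alcohol intake, BMI category, use of oral corticosteroids and NSAIDs in baseline
Liver Disease (NAFLD, cirrhosis) | population-based (Ogdie, Grewal et al. 2017) | PSO (ST): 11,124 | HR 1.97 (1.63-2.38) | Age at start date, sex, smoking status, alcohol intake, BMI category, use of oral corticosteroids and NSAIDs in baseline
Liver Disease (NAFLD, cirrhosis) | population-based (Ogdie, Grewal et al. 2017) | controls: 1,279,754 | ref | Age at start date, sex, smoking status, alcohol intake, BMI category, use of oral corticosteroids and NSAIDs in baseline
HR: Hazard Ratio; OR: Odds Ratio; CI: Confidence Interval; NAFLD: Non-Alcoholic Fatty Liver Disease; PSO: Psoriasis; ref: reference group;
BMI: Body Mass Index; PsA: Psoriatic Arthritis; ST: systemic therapy; NSAID: Non-Steroid Anti-Inflammatory Drug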
Table 10 | Chronic obstructive pulmonary disease in psoriasis patients
Study type | Patients (n) | HR/OR (95% CI) | Confounders controlled
case-control (Dreiher, Weitzman et al. 2008) | PSO: 12,502; controls: 24,287 (ref) | OR 1.27 (1.13-1.42) | Age, sex, socioeconomic status, smoking and obesity
case-control (Al-Mutairi, Al-Farag et al. 2010) | mild-moderate PSO: 1,661; controls: 1,835 (ref) | OR 1.35 (0.98-1.85) | Age, gender
case-control (Al-Mutairi, Al-Farag et al. 2010) | severe PSO: 129; controls: 1,835 (ref) | OR 1.78 (0.88-3.65) | Age, gender
population-based (Chiang and Lin 2012) | PSO: 2,096; controls: 8,384 (ref) | HR 2.35 (1.42-3.89) | Sociodemographics and comorbidities
meta-analysis (Li, Kong et al. 2015) | PSO: 42,150; controls: 163,174 (ref) | OR 1.90 (1.36-2.65) | Primary adjustment
meta-analysis (Li, Kong et al. 2015) | mild-moderate PSO: 3,241; controls: 10,177 (ref) | OR 1.66 (1.00-2.76) | Primary adjustment
meta-analysis (Li, Kong et al. 2015) | severe PSO: 620; controls: 10,177 (ref) | OR 2.20 (1.29-3.75) | Primary adjustment
HR: Hazard Ratio; OR: Odds Ratio; CI: Confidence Interval; PSO: psoriasis; ref: reference group
Gastrointestinal disorders 1.3.4.8
PSO often co-exists with disorders affecting the gastrointestinal tract. The
association of PSO with inflammatory bowel disease (IBD), an umbrella term that includes
ulcerative colitis (UC) and Crohn’s disease (CD), has been investigated by epidemiological
studies, although clearer insights into their pathological overlap have been gained
via genetic studies (Skroza, Proietti et al. 2013).
In a case-control study of 12,502 PSO patients and 24,287 age- and sex-matched
controls, UC and CD were found to be significantly more prevalent in patients with
PSO compared with the controls (OR 1.64; 95% CI: 1.15-2.33 and OR 2.49; 95% CI:
1.71-3.62, respectively) after adjusting for TNFα therapy (Cohen, Dreiher et al. 2009).
In a population-based study involving 8,072 IBD cases and a matched control cohort,
an increased prevalence of PSO was found in both UC and CD cases (Bernstein,
Wajda et al. 2005). In a study of 174,476 women from the Nurses’ Health Study (NHS)
and NHS2, the PSO group had an elevated risk of developing CD (pooled analysis RR
3.86; 95% CI: 2.23-6.67) but not UC (RR 1.17; 95% CI: 0.41-3.36). Also, there was a
pronounced risk of CD in patients with PsA (RR 6.54; 95% CI: 2.07-20.65) (Li, Han et
al. 2013).
Psychological disorders 1.3.4.9
PSO and PsA can have profound physical, emotional and social effects and negative
impacts on many aspects of quality of life (Weiss, Kimball et al. 2002). Patients suffer
from high levels of anxiety and stress as the visible skin lesions can cause
embarrassment (Tejada Cdos, Mendoza-Sassi et al. 2011). A study showed that 83% of
patients with moderate and severe PSO felt that they ‘often’ or ‘always’ had to hide
their skin lesions and avoid social activities such as swimming (Weiss, Kimball et al.
2002). About 10% of patients with PSO have suicidal feelings (Gupta and Gupta 1998).
The majority of studies have evaluated the prevalence or the incidence of
psychological disorders in patients with PSO compared to healthy controls (Table
11). However, one study compared patients with PsA to those with PSO without
arthritis in terms of depression and anxiety prevalence; the prevalence of
anxiety and depression was significantly higher in patients with PsA (36.6% and 22.2%,
respectively) compared to those with PSO only (24.4% and 9.6%, p-value < 0.05)
(McDonough, Ayearst et al. 2014).
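Table 11 | Psychological disorders in patients with psoriasis and psoriatic arthritis
Outcome | Study type | Patients (n) | HR/RR/OR (95% CI) | Confounders controlled
Depression | population-based cohort (Kurd, Troxel et al. 2010) | mild PSO: 146,042 vs. controls: 746,930 | HR 1.38 (1.35-1.40) | Age and sex
Depression | population-based cohort (Kurd, Troxel et al. 2010) | severe PSO: 3,956 vs. controls: 20,020 | HR 1.72 (1.57-1.88) | Age and sex
Anxiety | population-based cohort (Kurd, Troxel et al. 2010) | mild PSO: 146,042 vs. controls: 746,930 | HR 1.31 (1.29-1.34) | Age and sex
Anxiety | population-based cohort (Kurd, Troxel et al. 2010) | severe PSO: 3,956 vs. controls: 20,020 | HR 1.29 (1.15-1.43) | Age and sex
Clinical Depression | meta-analysis of 5 studies (Dowlatshahi, Wakkee et al. 2014) | PSO; controls (ref) | OR 1.57 (1.40-1.76) | Primary adjustment
Depression | cross-sectional (Dalgard, Gieler et al. 2015) | PSO: 626 vs. controls: 1,359 | OR 3.02 (1.86-4.90) | Age, gender, socio-economics status, stress and comorbidity
Anxiety | cross-sectional (Dalgard, Gieler et al. 2015) | PSO: 626 vs. controls: 1,359 | OR 2.91 (2.01-4.21) | Age, gender, socio-economics status, stress and comorbidity
Depression | cross-sectional (McDonough, Ayearst et al. 2014) | PsA: 306 vs. PSO: 135 | 36.6% vs. 24.4% | No adjustment to the prevalence estimation
Anxiety | cross-sectional (McDonough, Ayearst et al. 2014) | PsA: 306 vs. PSO: 135 | 22.2% vs. 9.6% | No adjustment to the prevalence estimation
Depression | prospective cohort (Dommasch, Li et al. 2015) | PSO without PsA: 126 | RR 1.25 (1.05-1.49) | Age, smoking, alcohol intake, BMI, cancer, angina, diabetes, snoring, hypertension, high cholesterol, menopausal status, hormone use, RA, sleeping duration, stroke
Depression | prospective cohort (Dommasch, Li et al. 2015) | PsA: 30 | RR 1.52 (1.06-2.19) | Age, smoking, alcohol intake, BMI, cancer, angina, diabetes, snoring, hypertension, high cholesterol, menopausal status, hormone use, RA, sleeping duration, stroke
Depression | prospective cohort (Dommasch, Li et al. 2015) | controls: 5,144 | ref | Age, smoking, alcohol intake, BMI, cancer, angina, diabetes, snoring, hypertension, high cholesterol, menopausal status, hormone use, RA, sleeping duration, stroke
PsA | population-based (Lewinson, Vallerand et al. 2017) | PSO with depression: 5,216 | HR 1.37 (1.05-1.80) | Age, sex, BMI, smoking, alcohol use, Charlson comorbidity index, Townsend deprivation index, PSO severity
PsA | population-based (Lewinson, Vallerand et al. 2017) | PSO with no depression: 68,231 | ref | Age, sex, BMI, smoking, alcohol use, Charlson comorbidity index, Townsend deprivation index, PSO severity
HR: Hazard Ratio; RR: Risk Ratio; OR: Odds Ratio; CI: Confidence Interval; PSO: psoriasis; ref: reference group; BMI: Body Mass Index; PsA: Psoriatic Arthritis; RA: Rheumatoid
Arthritis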
Fatigue and chronic pain 1.3.4.10
Although fatigue and chronic pain cannot be classified as comorbidities, they are among
the most important outcomes reported by many patients with arthritis (Hewlett,
Cockshott et al. 2005; Gudu, Etcheto et al. 2016). However, only a few studies have
explored their association with PSO and PsA.
Pain and fatigue, which is defined as “an overwhelming sense of tiredness, lack of
energy, and a feeling of exhaustion” (Mills and Young 2008), are highly subjective and
due to their complex nature they are difficult to assess objectively. Both can be
influenced by the underlying disease, genetic predisposition, lifestyle and psychological
factors (Husted, Tom et al. 2009).
In a study assessing the prevalence of symptoms such as itch, pain and fatigue in
patients with dermatological conditions in general practice, 51.8% of patients with PSO
experienced fatigue, with 27.7% experiencing it relatively severely, whereas pain was
reported by 25% of the patients, with severe pain being less frequent, affecting
approximately 13.5% (Verhoeven, Kraaimaat et al. 2007). In PsA, the prevalence of
moderate fatigue is approximately 50%, with almost 30% of patients reporting severe
fatigue (Husted, Tom et al. 2009). In a recent study, fatigue was assessed using three
different instruments in 84 PSO patients and 84 age- and sex-matched controls (Skoie,
Dalen et al. 2017). Concomitant depression and bodily pain were also measured. On
all three instruments, patients with PSO scored higher in terms of fatigue compared to
controls. Fatigue severity was not associated with disease activity and inflammatory
variables. In addition, fatigue was associated with depression and pain, a finding in
concordance with a study by Evers et al. in which higher levels of fatigue were found to
be related to psychological distress (Evers, Lu et al. 2005). In a study by Rosen et al.,
patients with PSO without arthritis were less fatigued (3.4 vs 4.3, p-value=0.0007) and
experienced less pain compared with patients with PsA (Rosen, Mussani et al. 2012). In
a prospective study comparing patients with PSO and PsA, patients with PsA had
higher levels of fatigue compared to the PSO patients receiving phototherapy or
systemic treatment (p-value<0.009) (Tobin, Sadlier et al. 2017). Moreover, fatigue was
higher in female compared to male patients (4.2 vs 2.8, p-value<0.001). Finally, fatigue
was associated with depression (correlation r=0.3, p-value<0.001).
Limitations of current research 1.3.4.11
One of the limitations of the undertaken studies lies in the use of patients with severe
PSO recruited from hospital settings which may bias the estimates of comorbidities.
Thus, research should focus on patients from primary care settings who have also been
assessed for the co-existence of arthritis. The latter could help researchers better
evaluate the additional burden of the inflammatory joint disorder in certain
comorbidities.
Better understanding of the relationship between PSO, PsA and comorbidities could
help separate cause from effect and highlight targets for clinical intervention.
Environmental risk factors for PsA 1.3.5
PSO and PsA are multifactorial diseases in which the interplay between hereditary
factors, lifestyle and environmental influences is thought to be of major importance. It
is suggested that PSO patients with genetic susceptibility to arthritis develop PsA
following an environmental trigger.
A small number of studies have been conducted investigating the association between
environmental and lifestyle factors and the onset of PsA in patients with PSO. The first
took place in Rochester, Minnesota, and studied 60 PsA and 120 PSO patients. It
showed that corticosteroid use was associated with a higher risk of PsA (OR 4.33, 95%
CI 1.34-14.02), while pregnancy had a protective role against PsA (OR 0.19, 95% CI
0.04-0.95). No association was found with ethnicity, trauma/infection, severity of PSO
and the type of therapy used to treat PSO (Thumboo, Uramoto et al. 2002). The
second was performed in the UK among 98 PsA and 163 PSO patients using a
self-completed questionnaire. It was found that immunization, especially for rubella;
infection by the human immunodeficiency virus (HIV); conjunctivitis and oral ulceration;
and physical/psychological trauma were more common in the years preceding disease
onset in patients with PsA compared to PSO (Pattison, Harrison et al. 2008).
However, it should be noted that both studies were retrospective case-control studies and
thus subject to recall bias. In the second study, the authors tried to minimize
recall errors by recruiting patients with recent onset of PsA (up to five years).
Moreover, both studies had limited statistical power because of their small sample
sizes. The potential risk factors, identified in epidemiological studies, are discussed in
the following sections along with more recent evidence about their association with
the onset of PSO and/or PsA.
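As a rough plausibility check on the odds ratios quoted above for the Rochester study, the sketch below assumes that each 95% CI is a Wald interval, symmetric around the estimate on the log scale. This is an assumption: the source does not state how the intervals were derived. Under that assumption the point estimate should sit at the geometric mean of the two bounds, and the standard error of log(OR) can be recovered from the interval width. The function name is illustrative only.

```python
import math


def summarize_ratio_ci(lower, upper, z=1.96):
    """Return the implied point estimate and SE of log(OR) for a 95% CI.

    Assumes a Wald interval of the form log(OR) +/- z * SE, which is an
    assumption made for illustration, not something stated in the source.
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    implied_estimate = math.exp((log_lower + log_upper) / 2)  # geometric mean
    se_log = (log_upper - log_lower) / (2 * z)
    return implied_estimate, se_log


# Intervals quoted in the text (Thumboo, Uramoto et al. 2002)
corticosteroids = summarize_ratio_ci(1.34, 14.02)  # reported OR 4.33
pregnancy = summarize_ratio_ci(0.04, 0.95)         # reported OR 0.19

print(f"corticosteroids: implied OR {corticosteroids[0]:.2f}, "
      f"SE(log OR) {corticosteroids[1]:.2f}")
print(f"pregnancy: implied OR {pregnancy[0]:.2f}, "
      f"SE(log OR) {pregnancy[1]:.2f}")
```

Both implied estimates agree with the reported odds ratios to two decimal places, and the large standard errors on the log scale are one way of seeing the limited statistical power noted above.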
Physical trauma and the “deep Koebner phenomenon” 1.3.5.1
There have been a few case series and a handful of retrospective studies implicating
physical trauma or injury as a trigger of PsA among patients with PSO. However, it is
unclear whether trauma is an actual trigger or a coincidental event.
The role of trauma in the onset of PSO is not a new concept and dates back to the
19th century when Heinrich Koebner reported the formation of PSO-like lesions in
unaffected skin of patients with PSO after cutaneous trauma. The Koebner
phenomenon (KP) appears in other skin conditions but it has been studied more
widely in PSO. Approximately 25% of patients with PSO will exhibit the Koebner
response after various injuries such as burns, surgical incisions and tattooing (Boyd and
Neldner 1990). According to Boyd and Neldner, the KP can develop in any anatomic
site and usually during the winter. The period from the actual trauma to the lesion
appearance is between 10 and 20 days but it can be as long as two years depending on
the patient’s skin sensitivity (Boyd and Neldner 1990).
Proposed mechanisms 1.3.5.1.1
The pathogenic mechanism in PSO underlying the Koebner response is not well
understood. It is assumed that an inflammatory response leads to the production of
various cytokines, stress proteins and adhesion molecules (Sagi and Trau 2011).
Several mechanisms have been proposed by which KP could affect deeper tissues like
the enthesis and cause arthritis. McGonagle et al. introduced the synovio-entheseal
complex theory in which mechanical stress in the entheses is hypothesised to lead to
enthesitis and in turn enthesitis could be the initiator of the innate immune activation.
Via the innate immune system an inflammatory reaction could be induced in the
juxtaposed synovium causing synovitis (McGonagle, Lories et al. 2007). Imaging studies
have supported this theory, as they have reported a higher prevalence of entheseal and
bone abnormalities in PSO patients without arthritis (McGonagle, Ash et al. 2011).
An alternative theory is that local trauma could result in the release of neuropeptides
like substance P whose expression levels have been reported to be increased in
psoriatic lesions and in the synovium. Substance P is thought to induce prolonged
inflammation by triggering the proliferation of synoviocytes (Hsieh, Kadavath et al.
2014).
Epidemiological evidence 1.3.5.1.2
There have been some case reports followed by two case series supporting the
hypothesis of KP initiating PsA among patients with PSO. In 1992, Scarpa et al.
reviewed the medical records of 138 patients with PsA and 138 patients with RA, used
as controls, for any acute event other than PSO that occurred less than ten days prior
to the onset of PsA. Three patients developed arthritis following articular trauma and in
only one case did arthritis occur at the same site affected by the local trauma (Scarpa,
Del Puente et al. 1992). Punzi et al. reported a higher prevalence of trauma that
occurred less than three months prior to the onset of arthritis in patients with PsA
(8%) compared to patients with RA (1.7%) or AS (0.7%). Higher levels of IL-6 were
also observed in patients with PsA (p-value<0.0005) (Punzi, Pianon et al. 1998). These
case series can only suggest a possible association between a risk factor and the onset
of PsA; they cannot establish causality and are prone to selection bias.
Three case-control studies (Thumboo, Uramoto et al. 2002; Pattison, Harrison et al.
2008; Eder, Law et al. 2011) tried to assess the relationship between trauma and PsA
with conflicting results, probably due to different definitions of trauma, the different
diagnostic criteria used for PsA (only Eder et al. used CASPAR) and the different time
frames preceding the trigger of disease. As previously mentioned (section 1.3.5),
Thumboo et al. found no significant association between trauma and PsA (OR 1.58;
95% CI: 0.73-3.41) compared to controls, whereas Pattison et al. and Eder et al.
reported significant associations (trauma leading to medical care OR 2.53; 95% CI: 1.1-
6.0 and injuries OR 2.1; 95% CI: 1.11-4.01, respectively). Finally, data from The Health
Improvement Network (THIN) showed that patients with PSO exposed to trauma had
an increased risk of developing PsA compared to controls (adjusted HR 1.32; 95% CI:
1.13-1.54) with only bone and joint trauma being associated with PsA occurrence (HR
1.46; 95% CI: 1.04-2.04 and HR 1.50; 95% CI: 1.19-1.90, respectively) (Thorarensen, Lu
et al. 2017).
Stress 1.3.5.2
Stress has been defined along three categories: a) stressful events such as moving
house, financial problems and unemployment; b) psychological difficulties; and c) lack of
social support (Gupta, Gupta et al. 1989). Regardless of how stress is defined, it has
been associated with a higher severity of PSO and it has been suggested that PsA
occurs more frequently in patients with more severe PSO. One study showed that
patients with PsA reported having moved house (considered an exposure to
psychological trauma) more frequently than PSO patients (30.3% vs. 18.1%; OR 2.29,
95% CI: 1.21-4.4) (Pattison, Harrison et al. 2008). However, as moving house also demands
physical activity, the true association could be with physical trauma. In addition, in
a study of 2,000 psoriatic patients, a significant increase in PSO exacerbations was
noted during stressful periods (Farber and Nall 1993).
It is believed that one of the most important cells in the pathogenesis of PSO is the T
cell (Kryczek, Bruce et al. 2008). Psychological stressors, which have been reported to
increase the level of T-cells (Buske-Kirschbaum, Kern et al. 2007), cause skin flares in
88% of PSO patients and may be associated with the onset of the disease in 40% of
cases (Al'Abadie, Kent et al. 1994; Griffiths and Richards 2001). As psychological
disorders have a prevalence of 30% among patients with skin disease (Shenefelt 2011),
more studies should be conducted to determine the role of stress in both diseases and
whether stress is a causal factor or a consequence of PSO.
Infections, Vaccinations, Medication, Diet and Hormonal changes 1.3.5.3
Several studies have reported associations between other factors and the triggering or
induction of PSO and/or PsA; these are detailed in Table 12. More longitudinal studies are
needed to verify the role of these environmental factors in the pathogenesis of both
diseases.
Table 12 | Other environmental factors associated with psoriasis and psoriatic arthritis
Environmental factor | Study | Findings
Infections
Mouth ulceration | PsA: 98 vs. PSO-only: 163 (Pattison, Harrison et al. 2008) | PsA vs. PSO: OR 4.20 (95% CI: 1.96-9.00)
HCV | Prospective study; PSO: 118 (Chouela, Abeldano et al. 1996) | anti-HCV prevalence; PSO vs. controls: 7.6% vs 1.2%, respectively
HCV | PsA: 50, PSO: 50 vs. controls: 76 (Taglione, Vatteroni et al. 1999) | anti-HCV prevalence; PsA vs. controls: 12% vs 5.2%, p<0.05; PSO vs. controls: 10% vs 5.2%, p>0.05
HCV | PSO: 12,502 vs. controls: 24,287 (Cohen, Weitzman et al. 2010) | Hepatitis C prevalence; PSO vs. controls: 1.03% vs. 0.56%, p<0.001; OR 1.86 (95% CI 1.46-2.38)
HIV | PSO with HIV: 50 (Obuch, Maurer et al. 1992) | In HIV-positive patients 2.5% developed PSO, comparable to that of the controls
HIV | PSO with HIV: 56 (Kassi, Mienwoley et al. 2013) | 28.8% prevalence of severe PSO in HIV Black African patients
HIV | HIV patients: 52 and 1,100 respectively (Buskila, Gladman et al. 1990; Solinger and Hess 1993) | 5.7% and 0.4% PsA prevalence in HIV patients vs 0.25% in general USA population
Medication
Lithium | review (Jafferany 2008) | Cause of flares in PSO patients and trigger to patients without familial history of PSO
NSAIDs | prospective cohort study; 95,540 women (NHS II) (Wu, Han et al. 2015) | Regular vs. non-regular users: PSO HR 1.12 (95% CI 0.94-1.33); PsA HR 1.35 (95% CI 0.98-1.88)
Aspirin | prospective cohort study; 95,540 women (NHS II) (Wu, Han et al. 2015) | Regular vs. non-regular users: PSO HR 0.97 (95% CI 0.79-1.20); PsA HR 0.94 (95% CI 0.64-1.39)
Paracetamol | prospective cohort study; 95,540 women (NHS II) (Wu, Han et al. 2015) | Regular vs. non-regular users: PSO HR 1.17 (95% CI 0.97-1.39); PsA HR 1.78 (95% CI 1.28-3.96)
Systemic steroids | non-systematic literature research (Mrowietz and Domm 2013) | Not recommended for PSO due to deterioration of disease after withdrawal from the drug
Antimalarial drugs | review (Basavaraj, Ashok et al. 2010) | Reported to trigger or induce PSO in susceptible patients
Vaccinations
Rubella | PsA: 98 vs. PSO-only: 163 (Pattison, Harrison et al. 2008) | PsA vs. PSO: OR 12.4 (95% CI 1.20-122.14)
Tetanus | PsA: 98 vs. PSO-only: 163 (Pattison, Harrison et al. 2008) | PsA vs. PSO: OR 1.91 (95% CI 1.0-3.7)
Diet
α-carotene | PSO: 156 vs. controls: 6,104 (Johnson, Ma et al. 2014) | PSO vs. controls: OR 1.02 (95% CI 1.01-1.04)
Vitamin A intake | PSO: 156 vs. controls: 6,104 (Johnson, Ma et al. 2014) | PSO vs. controls: OR 1.01 (95% CI 1.00-1.02), p=0.03
Lower sugar consumption | PSO: 156 vs. controls: 6,104 (Johnson, Ma et al. 2014) | PSO vs. controls: OR 0.998 (95% CI 0.996-1.00), p=0.04
Variety of food | PSO: 1,206 vs. population control data from NHANES 2009-2010: 5,103 (Afifi, Danesh et al. 2017) | PSO vs. controls: significantly lower consumption of sugar, dairy, whole grain fiber and calcium, and significantly higher consumption of fruits, vegetables and legumes. 53.8% of PSO patients reported skin improvement after reducing alcohol, 53.4% after reducing gluten, 44.6% after increasing omega-3 intake and 41% after adding vitamin D
Hormonal changes
Estrogen oral contraception | Cohort study; 17,032 women (Vessey, Painter et al. 2000) | Users vs. non-users: RR 1.07 (95% CI 1.0-2.9) for hospital referral due to PSO
Pregnancy | Pregnant women with PSO: 47 vs. menstruating women with PSO: 27 (Murase, Chan et al. 2005) | 55% of PSO patients reported improvement vs. 23% who reported worsening. Postpartum, 65% reported worsening of PSO. 84% decrease of the lesions in women with body surface involvement of PSO>10%
Pregnancy | PsA: 60 vs. PSO: 120 (Thumboo, Uramoto et al. 2002) | Decreased risk of developing PsA with OR 0.16 (95% CI 0.02-0.99)
PsA: Psoriatic Arthritis; PSO: Psoriasis; HCV: Hepatitis C Virus; HIV: Human Immunodeficiency Virus; OR:
Odds Ratio; HR: Hazard Ratio; RR: Relative Risk; vs.: versus; CI: Confidence Interval; NHS: Nurses’ Health Study;
NHANES: National Health And Nutrition Examination Survey
Obesity 1.3.5.4
As described in section 1.3.4.4 obesity can be classified as a comorbid condition in
PSO and PsA; however, three studies have supported the association of increased BMI
with the onset of PsA among patients with PSO (Soltani-Arabshahi, Wong et al. 2010;
Li, Han et al. 2012; Love, Zhu et al. 2012). Interestingly, Love et al. and Li et al.
reported a dose-effect of BMI on the development of PsA. Although these studies
reinforce the hypothesis that obesity is linked with the onset of PsA, it should be
mentioned that in the study by Love et al. the diagnosis of PsA was made by a primary
care physician and it was not validated by a rheumatologist; therefore bias because of
disease misclassification could be present (Canete and Mease 2012).
Smoking and alcohol consumption 1.3.5.5
Smoking and alcohol consumption are two of the lifestyle factors that have been
reported to be associated with an increased risk of PSO. A systematic literature
review was conducted by Brenaut et al. to assess whether alcohol consumption is
prevalent in PSO patients and whether it is a trigger factor of the disease (Brenaut,
Horreau et al. 2013). Out of the 23 studies investigating the association of PSO and
prevalent alcohol consumption, 18 concluded that the latter is significantly higher in
PSO compared to the general population, whereas five reported that this was not the
case. The hypothesis that alcohol is a risk factor for PSO development was supported by four
studies. However, only one had a prospective design (Qureshi, Dominguez et al. 2010)
in which the RR of PSO was 1.72 (95% CI: 1.15-2.57) for a consumption of 2.3 drinks
per week or more, compared with women who did not drink alcohol. In a later study
the same group examined whether alcohol intake was also a risk factor for the onset
of PsA (Wu, Cho et al. 2015). An excessive alcohol consumption of 30 grams per day
was associated with an increased risk of PsA in women (HR 4.45; 95% CI: 2.07-9.59). A
possible mechanism underlying the interaction between alcohol and PSO could be the
up-regulation of pro-inflammatory cytokines (Ockenfels, Keim-Maas et al. 1996) or the
increase of lymphocyte proliferation (Schopf, Ockenfels et al. 1996) by ethanol.
The role of smoking in the disease risk is unclear. It has been suggested that smoking
can activate the nicotinic cholinergic receptors in keratinocytes which in turn enhance
cell differentiation (Grando, Horton et al. 1996). In addition, smoking is linked to
oxidative stress (Morrow, Frei et al. 1995) that may induce chronic inflammation and
activate signalling pathways implicated in PSO (Sopori 2002). In a meta-analysis of
studies assessing the prevalence of smoking among patients with PSO, the pooled OR
was 1.78 (95% CI: 1.53-2.06). Moreover, current smokers were at higher risk of
developing PSO (pooled adjusted OR 1.94; 95% CI: 1.64-2.28) compared to non-
smokers. In a recent nationwide study from Korea, the adjusted incidence ratio of
developing PSO among current smokers was 1.14 (95% CI: 1.13-1.15) and among
former smokers was 1.11 (95% CI: 1.10-1.12) compared to non-smokers, with the risk
of developing PSO being higher in those smoking more than two packs per day (Lee,
Han et al. 2017).
The relationship between smoking and PsA has been examined by researchers with
conflicting results. Eder et al. found an inverse association between smoking and PsA
(Eder, Law et al. 2011) which held only among patients negative for HLA-Cw6 (Eder,
Shanmugarajah et al. 2012). Li et al. reported that the RR of PsA was 3.13 (95% CI:
2.08-4.71) among current smokers and 1.54 (95% CI: 1.06-2.24) for former smokers
compared with non-smokers. This relationship was dose-dependent when assessing
the risk for PsA in the entire population (Li, Han et al. 2012). The protective effect of
smoking in the development of PsA among patients with PSO reported by Eder et al.
could be the result of a type of selection bias called index event bias. Nguyen et al.
tried to explain this “paradoxical” phenomenon using the THIN database (Nguyen,
Zhang et al. 2018). According to this study, smoking was associated with an increased
risk of PsA (HR 1.27; 95% CI: 1.19-1.36) in the general population but with a
decreased risk among patients with PSO (HR 0.91; 95% CI: 0.84-0.99). Performing
mediation analysis, they showed the effect of smoking on the risk of PsA was mediated
through its effect on PSO. In correspondence, Lee and Song commented that the study
by Nguyen et al. failed to adjust for factors such as physical activity, drugs, diet and
others, suggesting a possible confounding effect of additional unmeasured variables
(Lee and Song 2017). Nonetheless, “paradoxical” findings should always be interpreted
with caution because of bias and unidentified confounding. Therefore, further research
is needed to clarify the effect of smoking on the onset of PsA among patients with
PSO.
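The index event bias argument above can be made concrete with a small exact calculation. The sketch below is a toy model, not a reanalysis of any cited study: every probability in it is hypothetical and chosen only for illustration. It assumes that smoking and an unmeasured PsA risk factor U both raise the risk of PSO, that PsA arises only in people with PSO, and that, given PSO, the risk of PsA depends on U alone (so smoking has no direct effect on PsA).

```python
# Hypothetical probabilities chosen only to illustrate index event bias;
# none of these numbers come from the studies cited in the text.
P_U = 0.2  # prevalence of an unmeasured risk factor U, independent of smoking

# P(PSO | smoking, U): smoking and U both raise the risk of PSO
P_PSO = {(0, 0): 0.01, (1, 0): 0.03, (0, 1): 0.10, (1, 1): 0.12}

# P(PsA | PSO, U): PsA arises only in people with PSO and depends on U only
P_PSA_GIVEN_PSO = {0: 0.05, 1: 0.40}


def prob_u(u):
    return P_U if u == 1 else 1 - P_U


def risk_psa_general(smoker):
    """P(PsA | smoking status) in the whole population."""
    return sum(prob_u(u) * P_PSO[(smoker, u)] * P_PSA_GIVEN_PSO[u]
               for u in (0, 1))


def risk_pso(smoker):
    """P(PSO | smoking status) in the whole population."""
    return sum(prob_u(u) * P_PSO[(smoker, u)] for u in (0, 1))


def risk_psa_among_pso(smoker):
    """P(PsA | smoking status, PSO): analysis restricted to the index event."""
    return risk_psa_general(smoker) / risk_pso(smoker)


rr_general = risk_psa_general(1) / risk_psa_general(0)
rr_among_pso = risk_psa_among_pso(1) / risk_psa_among_pso(0)
print(f"RR of PsA for smokers, general population: {rr_general:.2f}")
print(f"RR of PsA for smokers, among PSO patients: {rr_among_pso:.2f}")
```

With these assumed inputs the risk ratio is above 1 in the whole population and below 1 among PSO patients. The reversal arises because non-smokers who nevertheless develop PSO are more likely to carry U, so restricting the analysis to PSO patients makes smoking look protective even though it has no direct effect on PsA in this model.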
Lastly, two clinical factors have been suggested to be associated with the development
of PsA: PSO severity (Wilson, Icen et al. 2009; Soltani-Arabshahi, Wong et al. 2010;
Tey, Ee et al. 2010) and nail involvement (Wilson, Icen et al. 2009; Soltani-Arabshahi,
Wong et al. 2010). Further prospective cohort studies are required to confirm the
associations with the above-mentioned environmental and lifestyle factors and explore
differences between PSO and PsA. The identification of risk factors could help identify
those patients with PSO that are likely to develop PsA and it could allow clinicians to
intervene early or even prevent the development of the disease.
Genetic risk factors for PSO and PsA 1.3.6
The genetic basis of PSO and PsA is not fully understood. However, with the advent of
GWAS many SNPs have been found to be associated with both conditions. Discovery
of disease-associated SNPs and genes can lead to two clinical benefits:
• Improved prediction of disease risk
• Improved treatment, by identifying novel therapeutic targets and informing the
repurposing of existing drugs
Although the pathogenesis of both PSO and PsA remains unclear, it is certain that genetic
predisposition plays a crucial role in the individual’s susceptibility and disease
expression. Evidence for this genetic predisposition has been ascertained through the
analysis of heritability in twin and genealogical studies.
Twin studies 1.3.6.1
In the case of PSO, as depicted in Table 13, the concordance in all conducted studies is
much higher in identical than in non-identical twins, supporting a genetic component to
disease. Moreover, the majority of these studies have estimated the heritability (the
proportion of trait variance attributable to genetic variance) and found that it ranges
between 66 and 90%.
As far as PsA is concerned, its genetics have not been thoroughly investigated. The
only twin study, conducted in Denmark among 36 twins, did not have the
statistical power to detect genetic effects on PsA, although it confirmed the
importance of genes in PSO (Pedersen, Svendsen et al. 2008).
Table 13 | Twin studies conducted to establish the genetic basis of psoriasis
Country | Cohort | Concordance of monozygotic twins % | Concordance of dizygotic twins % | Heritability %
US (Farber, Nall et al. 1974) | 61 | 73 | 20 | -
Denmark (Brandrup, Hauge et al. 1978) | 36 | 64 | 14 | 90
Australia (Duffy, Spelman et al. 1993) | 77 | 35 | 12 | 80
Norway (Grjibovski, Olsen et al. 2007) | 273 | 22 | 6 | 66
Denmark (Lonnberg, Skov et al. 2013) | 804 | 20 | 9 | 68
Familial aggregation studies 1.3.6.2
Familial aggregation studies in PSO showed that the recurrence ratio in first degree
relatives is 7.6 (in one study) and the λs has been estimated between 4 and 12 (Myers,
Kay et al. 2005; Rahman and Elder 2005; Chandran, Schentag et al. 2009). A number of
epidemiological studies have estimated familial aggregation in PsA, as seen in Table 14,
consistently reporting that its genetic burden is higher compared to PSO. In particular,
in the Icelandic study the risk ratios for first- to fourth-degree relatives
were 39, 12, 3.6 and 2.3, respectively (Karason, Love et al. 2009).
Table 14 | Epidemiological studies estimating familial aggregation in psoriatic arthritis
Country | λ1 | Prevalence in FDRs, % | λs | Prevalence in siblings, %
UK (Moll and Wright 1973) | 55 | 5.5 | - | -
UK (Myers, Kay et al. 2005) | - | - | 47 | 14.3
Canada (Chandran, Schentag et al. 2009) | 30.4 | 7.6 | 30.8 | 7.7
Iceland (Karason, Love et al. 2009) | 39 | - | - | -
λ1: recurrence risk ratio in first degree relatives; FDRs: First Degree Relatives;
λs: recurrence risk ratio in siblings
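To show how the columns of Table 14 relate to one another, the sketch below assumes the usual definition of the recurrence risk ratio, λ = prevalence in relatives / prevalence in the general population. This definition is an assumption about how the tabulated values were derived, since the source only names the quantity. Under it, each row that reports both a λ and a prevalence in relatives implies a population prevalence; the function name is illustrative.

```python
def implied_population_prevalence(prevalence_in_relatives_pct, recurrence_risk_ratio):
    """Population prevalence (%) implied by lambda = K_relatives / K_population."""
    return prevalence_in_relatives_pct / recurrence_risk_ratio


# (prevalence in relatives in %, recurrence risk ratio), taken from Table 14
rows = {
    "UK (Moll and Wright 1973), first degree relatives": (5.5, 55),
    "UK (Myers, Kay et al. 2005), siblings": (14.3, 47),
    "Canada (Chandran, Schentag et al. 2009), first degree relatives": (7.6, 30.4),
    "Canada (Chandran, Schentag et al. 2009), siblings": (7.7, 30.8),
}

for label, (prevalence, lam) in rows.items():
    implied = implied_population_prevalence(prevalence, lam)
    print(f"{label}: implied population prevalence {implied:.2f}%")
```

The implied population prevalences differ between studies, which illustrates that λ depends on the population prevalence assumed in each study as well as on the observed risk in relatives.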
Association studies 1.3.6.3
PSO 1.3.6.3.1
In 2007, the first PSO GWAS was undertaken by Cargill et al. and consisted of 467
cases and 500 controls and 25,215 SNPs (Cargill, Schrodi et al. 2007). In the following
years, another four GWAS were carried out (Nair, Duffin et al. 2009; Ellinghaus,
Ellinghaus et al. 2010; Genetic Analysis of Psoriasis, the Wellcome Trust Case Control
et al. 2010), one exome-wide association study (Dand, Mucha et al. 2017), two meta-
analyses including only GWAS datasets (Stuart, Nair et al. 2010; Ellinghaus, Ellinghaus
et al. 2012) and three meta-analyses including both GWAS and Immunochip datasets
(Tsoi, Spain et al. 2012; Tsoi, Spain et al. 2015; Tsoi, Stuart et al. 2017), identifying 63
susceptibility loci in total in the European population (Table 15). In the Han Chinese
population (Table 16) GWAS have indicated some novel associations along with shared
loci with the European findings. The candidate genes identified so far are involved in
four broad biological pathways: NF-κB signaling, skin barrier function, antigen
presentation and IL-23/Th17 signaling.
The most significant association reported in different populations is with SNPs
located in the MHC class I region, which harbors the HLA genes. Among these, the HLA-C
gene is involved in the presentation of antigens to T lymphocytes, a function that is
important for the immune system. The most strongly associated SNP tags the HLA-Cw6
allele, whose contribution to the aetiology of PSO has been the subject of various studies.
It has been suggested that HLA-Cw6 may present a melanocytic autoantigen,
ADAMTS-like protein 5, to CD8+ T cells (Arakawa, Siewert et al. 2015). Moreover,
HLA-Cw6 has been shown to have a high binding affinity for LL-37, which has been
described as a T-cell autoantigen in PSO (Mabuchi and Hirayama 2016). Finally,
ERAP1 in 5q15, which encodes endoplasmic reticulum aminopeptidase 1, takes part in
antigen presentation and in the shedding of pro-inflammatory cytokine receptors (Haroon
and Inman 2010), and its interaction with HLA-C could regulate PSO susceptibility
(Genetic Analysis of Psoriasis, the Wellcome Trust Case Control et al. 2010).
The skin barrier function also plays a role in PSO pathogenesis, as supported by several
GWAS findings. A deletion of the late cornified envelope (LCE) genes LCE3B and LCE3C,
and LCE3A, have been significantly associated with the disease (de Cid, Riveira-Munoz et
al. 2009). The LCE genes play a crucial role in epidermal terminal differentiation as they
encode the stratum corneum proteins of the cornified envelope (Mischke, Korge et al.
1996).
Finally, IL-23 signaling, which regulates Th17 cells, is a key pathway in the
immunopathogenesis of PSO. Polymorphisms within IL12B, IL23A (Nair, Duffin et al. 2009)
and IL23R (Cargill, Schrodi et al. 2007) have been identified in both European and Asian
populations and, along with TRAF3IP2 (which encodes an adaptor molecule driving NF-κB
signal transduction downstream of IL-17) (Ellinghaus, Ellinghaus et al. 2010) and NFKBIZ
(a target of IL-17 signaling in keratinocytes) (Tsoi, Spain et al. 2015), confirm the
involvement of T-cell signaling in PSO susceptibility.
Table 15 | Non-MHC PSO susceptibility loci identified by association studies in the European population (adapted from Ray-Jones, Eyre et al. 2016)
Locus | Notable gene(s) | Index SNPⱡ | Index SNP annotation | P-value | Risk allele | OR [reference] | Sample size (cases/controls)
1p36.23 | SLC45A1, TNFRSF9 | rs11121129 | Intergenic | 1.7 × 10^-8 | A | 1.13 [1] | 10,588/22,806
1p36 | IL-28RA | rs7552167 | 4.2kb 5' of IL-28RA | 8.5 × 10^-12 | G | 1.21 [1] | 10,588/22,806
1p36.11 | RUNX3 | rs7536201 | 1.5kb 5' of RUNX3 | 2.3 × 10^-12 | C | 1.13 [1] | 10,588/22,806
1p31.3 | IL-23R | rs9988642 | 441bp 3' of IL-23R | 1.1 × 10^-26 | T | 1.52 [1] | 10,588/22,806
1p31.1 | FUBP1 | rs34517439 | Intronic: DNAJB4 | 4.43 × 10^-9 | A | 1.18 [2] | Up to 11,988/275,334
1q21.3 | LCE3B, LCE3D | rs6677595 | 3.6kb 3' of LCE3B | 2.1 × 10^-33 | T | 1.26 [1] | 10,588/22,806
1q24.3 | FASLG | rs12118303 | Intergenic | 3.02 × 10^-10 | C | 1.12 [2] | Up to 11,988/275,334
1q31.1 | LRRC7 | rs10789285 | Intergenic | 1.43 × 10^-8 | G | 1.12 [3] | 15,295/27,578
1q31.3 | DENND1B | rs2477077 | Intronic: DENND1B | 3.05 × 10^-8 (meta) | T | NR [4] | 1,962/8,923
1q32.1 | IKBKE | rs41298997 | Intronic: IKBKE | 2.37 × 10^-8 | T | 1.13 [2] | Up to 11,988/275,334
2p16.1 | FLJ16341, REL | rs62149416 | Intronic: FLJ16341 | 1.8 × 10^-17 | T | 1.17 [1] | 10,588/22,806
2p15 | B3GNT2 | rs10865331 | Intergenic | 4.7 × 10^-10 | A | 1.12 [1] | 10,588/22,806
2q24.2 | KCNH7, IFIH1 | rs17716942 | Intronic: KCNH7 | 3.3 × 10^-18 | T | 1.27 [1] | 10,588/22,806
3p24.3 | PLCL2 | rs4685408 | Intronic: PLCL2 | 8.58 × 10^-9 | G | 1.12 [3] | 15,295/27,578
3q11.2 | TP63 | rs28512356 | 400bp 3' of TP63 | 4.31 × 10^-8 | C | 1.17 [5] | 3,496/5,186
3q12.3 | NFKBIZ | rs7637230 | Intronic: RP11-221J22.1 | 2.07 × 10^-9 | A | 1.14 [3] | 15,295/27,578
5p13.1 | PTGER4, CARD6 | rs114934997 | Intergenic | 1.27 × 10^-8 | C | 1.17 [3] | 15,295/27,578
5q15 | ERAP1, LNPEP | rs27432 | Intronic: ERAP1 | 1.9 × 10^-20 | A | 1.20 [1] | 10,588/22,806
5q31 | IL13, IL4 | rs1295685 | 3'-UTR: IL13 | 3.4 × 10^-10 | G | 1.18 [1] | 10,588/22,806
5q33.1 | TNIP1 | rs2233278 | 5'-UTR: TNIP1 | 2.2 × 10^-42 | C | 1.59 [1] | 10,588/22,806
5q33.3 | IL12B | rs12188300 | Intergenic | 3.2 × 10^-53 | T | 1.58 [1] | 10,588/22,806
6p25.3 | EXOC2, IRF4 | rs9504361 | Intronic: EXOC2 | 2.1 × 10^-11 | A | 1.12 [1] | 10,588/22,806
6p22.3 | CDKAL1 | rs4712528 | Intronic: CDKAL1 | 8.4 × 10^-11 | C | 1.16 [6] | 9,293/13,670
6q21 | TRAF3IP2 | rs33980500 | Missense: TRAF3IP2 | 4.2 × 10^-45 | T | 1.52 [1] | 10,588/22,806
6q23.3 | TNFAIP3 | rs582757 | Intronic: TNFAIP3 | 2.2 × 10^-25 | C | 1.23 [1] | 10,588/22,806
6q25.3 | TAGAP | rs2451258 | Intergenic | 3.4 × 10^-8 | C | 1.12 [1] | 10,588/22,806
7p14.1 | ELMO1 | rs2700987 | Intronic: ELMO1 | 4.3 × 10^-9 | A | 1.11 [1] | 10,588/22,806
9p21.1 | DDX58 | rs11795343 | Intronic: DDX58 | 8.4 × 10^-11 | T | 1.11 [1] | 10,588/22,806
9q31.2 | KLF4 | rs10979182 | Intergenic | 2.3 × 10^-8 | A | 1.12 [1] | 10,588/22,806
9q32 | TNFSF15 | rs6478108 | Intronic: TNFSF15 | 1.50 × 10^-8 | C | 1.10 [7] | 11,861/28,610
10q21.2 | ZNF365 | rs2944542 | Intronic: ZNF365 | 1.76 × 10^-8 | G | 1.08 [2] | Up to 11,988/275,334
10q22.2 | CAMK2G, FUT11 | rs2675662 | Intronic: CAMK2G | 7.35 × 10^-9 | A | 1.12 [3] | 15,295/27,578
10q22.3 | ZMIZ1 | rs1250544 | Intronic: ZMIZ1 | 3.53 × 10^-8 | G | 1.16 [8] | 8,644/15,055
10q23.31 | PTEN, KLLN, SNORD74 | rs76959677 | Intergenic | 2.75 × 10^-8 | G | 1.28 [2] | Up to 11,988/275,334
10q24.31 | CHUK | rs61871342 | Intronic: BLOC1S2 | 1.56 × 10^-9 | G | 1.10 [2] | Up to 11,988/275,334
11q13 | RPS6KA4, PRDX5 | rs694739 | 256bp 5' of AP003774.1 | 3.71 × 10^-9 | A | 1.12 [8] | 8,644/15,055
11q13.1 | CFL1, FIBP, FOSL1 | rs118086960 | Intronic: CFL1 | 6.89 × 10^-9 | T | 1.12 [2] | Up to 11,988/275,334
11q22.3 | ZC3H12C | rs4561177 | 1.7kb 5' of ZC3H12C | 7.7 × 10^-13 | A | 1.14 [1] | 10,588/22,806
11q24.3 | ETS1 | rs3802826 | Intronic: ETS1 | 9.5 × 10^-10 | A | 1.12 [1] | 10,588/22,806
12p13.2 | KLRK1, KLRC4 | rs11053802 | Intronic: KLRC1 | 4.17 × 10^-9 | T | 1.11 [2] | Up to 11,988/275,334
12q13.3 | IL-23A, STAT2 | rs2066819 | Intronic: STAT2 | 5.4 × 10^-17 | C | 1.39 [1] | 10,588/22,806
12q24.12 | BRAP, MAPKAPK5 | rs11065979 | Intergenic | 1.67 × 10^-8 | T | 1.08 [2] | Up to 11,988/275,334
12q24.31 | IL31 | rs11059675 | Intronic: LRRC43 | 1.50 × 10^-8 | A | 1.10 [2] | Up to 11,988/275,334
13q14.11 | COG6 | rs34394770 | Intronic: COG6 | 2.65 × 10^-8 | T | 1.16 [5] | 3,496/5,186
13q14.11 | LOC144817 | rs9533962 | within LOC144817 | 1.93 × 10^-8 | C | 1.14 [5] | 3,496/5,186
13q32.3 | UBAC2, RN7SKP9 | rs9513593 | Intronic: UBAC2 | 3.60 × 10^-8 | G | 1.12 [2] | Up to 11,988/275,334
14q13.2 | NFKBIA | rs8016947 | Intronic: RP11-56B11.3 | 2.5 × 10^-17 | G | 1.16 [2] | 10,588/22,806
14q32.2 | RP11-61O1.1 | rs142903734 | Intronic: RP11-61O1.1 | 7.15 × 10^-9 | AAG | 1.12 [2] | Up to 11,988/275,334
15q13.3 | KLF13 | rs28624578 | Intronic: KLF13 | 9.22 × 10^-10 | T | 1.18 [2] | Up to 11,988/275,334
16p13.13 | PRM3, SOCS1 | rs367569 | 1.6kb 3' of PRM3 | 4.9 × 10^-8 | C | 1.13 [1] | 10,588/22,806
16p11.2 | FBXL19, PRSS53 | rs12445568 | Intronic: STX1B | 1.2 × 10^-16 | C | 1.16 [1] | 10,588/22,806
17q11.2 | NOS2 | rs28998802 | Intronic: NOS2 | 3.3 × 10^-16 | A | 1.22 [1] | 10,588/22,806
17q21.2 | PTRF, STAT3, STAT5A/B | rs963986 | Intronic: PTRF | 5.3 × 10^-9 | C | 1.15 [1] | 10,588/22,806
17q25.1 | TRIM47, TRIM65 | rs55823223 | Intronic: TRIM65 | 1.06 × 10^-8 | A | 1.15 [2] | Up to 11,988/275,334
17q25.3 | CARD14 | rs11652075 | Missense: CARD14 | 3.4 × 10^-8 | C | 1.11 [1] | 10,588/22,806
18p11.21 | PTPN2 | rs559406 | Intronic: PTPN2 | 1.19 × 10^-10 | G | 1.10 [2] | Up to 11,988/275,334
18q21.2 | POL1, STARD6, MBD2 | rs545979 | Intronic: POL1 | 3.5 × 10^-10 | T | 1.12 [1] | 10,588/22,806
19p13.2 | TYK2 | rs34536443 | Missense: TYK2 | 9.1 × 10^-31 | G | 1.88 [1] | 10,588/22,806
19p13.2 | ILF3, CARM1 | rs892085 | Intronic: QTRT1 | 3 × 10^-17 | A | 1.17 [1] | 10,588/22,806
19q13.33 | FUT2 | rs492602 | Synonymous: FUT2 | 6.57 × 10^-13 | G | 1.11 [2] | Up to 11,988/275,334
20q13.13 | RNF114 | rs1056198 | Intronic: RNF114 | 1.5 × 10^-14 | C | 1.16 [1] | 10,588/22,806
21q22 | RUNX1 | rs8128234 | Intronic: RUNX1 | 3.74 × 10^-8 | T | 1.17 [5] | 3,496/5,186
22q11.21 | UBE2L3, YDJC | rs4821124 | 1kb 3' of UBE2L3 | 3.8 × 10^-8 | C | 1.13 [1] | 10,588/22,806
ⱡ The most recently reported GWAS index SNP in each locus at genome wide significance (p-value ≤ 5 × 10^-8), excluding any secondary signals in the locus
[1] (Tsoi, Spain et al. 2012); [2] (Tsoi, Stuart et al. 2017); [3] (Tsoi, Spain et al. 2015); [4] (Bowes, Budu-Aggrey et al. 2015); [5] (Yin, Low et al. 2015); [6] (Stuart, Nair et al. 2015);
[7] (Dand, Mucha et al. 2017); [8] (Ellinghaus, Ellinghaus et al. 2012)
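The genome-wide significance threshold of 5 × 10^-8 quoted in the Table 15 footnote is conventionally motivated as a Bonferroni correction of α = 0.05 for roughly one million independent common variants. The sketch below works through that arithmetic. The figure of one million tests is the conventional assumption rather than something stated in the source, and applying the same correction to the 25,215 SNPs of the first PSO GWAS is purely illustrative (the source does not say which correction that study used). Function names are hypothetical.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    return alpha / n_tests


def is_significant(p_value, threshold):
    """True when a p-value meets the given threshold."""
    return p_value <= threshold


# Conventional assumption: about one million independent common variants
genome_wide = bonferroni_threshold(1_000_000)

# Illustrative only: the number of SNPs in the first PSO GWAS mentioned in the text
candidate_panel = bonferroni_threshold(25_215)

print(f"genome-wide threshold: {genome_wide:.1e}")
print(f"threshold for 25,215 tests: {candidate_panel:.1e}")

# Weakest signal reported in Table 15 (16p13.13, p = 4.9 x 10^-8)
print(is_significant(4.9e-8, genome_wide))
```

Because the threshold scales inversely with the number of tests, a panel of 25,215 SNPs would have a far less stringent per-test threshold than a genome-wide scan under the same correction.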
Table 16 | Non-MHC PSO susceptibility loci identified by association studies in the Chinese population (Adapted from (Ray-Jones, Eyre et al. 2016))
Locus | Notable gene(s) | Index SNPⱡ | Index SNP annotation | P-value | Risk allele | OR (ref.) | Sample size (cases/controls)
1p36.3 | MTHFR | rs2274976 | Missense: MTHFR | 2.33 x 10^-10 | G | 1.21 (1) | 11,245/11,177
1p36 | IL-28RA | rs4649203 | 5.5kb 5' of IL-28RA | 9.74 x 10^-11 | A | 1.19 (2) | 8,339/12,725
1p36.11 | ZNF683 | rs10794532 | Missense: ZNF683 | 4.18 x 10^-8 | A | 1.11 (1) | 11,245/11,177
1p31.3 | IL-23R | chr1: 67,421,184 (build hg18) | Nonsynonymous: IL-23R | 1.94 x 10^-11 | G | 1.28 (3) | 10,727/10,582
1p31.3 | C1orf141 | rs72933970 | Missense: C1orf141 | 1.23 x 10^-8 | G | 1.16 (1) | 11,245/11,177
1q21.3 | LCE3B, LCE3D | rs10888501 | 175bp 3' of LCE3E | 6.48 x 10^-13 | A | 1.14 (1) | 11,245/11,177
1q22 | AIM2 | rs2276405 | Stop-gained: AIM2 | 3.22 x 10^-9 | G | 1.17 (1) | 11,245/11,177
2q12.1 | IL1RL1 | rs1420101 | Intronic: IL1RL1 | 1.71 x 10^-10 | G | 1.12 (1) | 11,245/11,177
2q24.2 | KCNH7, IFIH1 | rs13431841 | Intronic: IFIH1 | 2.96 x 10^-9 | G | 1.17 (4) | 15,207/17,103
3q13 | CASR | rs1042636 | Missense: CASR | 1.88 x 10^-10 | A | 1.09 (1) | 11,245/11,177
3q26.2-q27 | GPR160 | rs6444895 | Intronic: GPR160 | 1.44 x 10^-12 | G | 1.11 (1) | 11,245/11,177
4q24 | NFKB1 | rs1020760 | Intronic: NFKB1 | 2.19 x 10^-8 | G | 1.12 (4) | 15,207/17,103
5q14 | ZFYVE16 | rs249038 | Missense: ZFYVE16 | 2.14 x 10^-8 | G | 1.16 (1) | 11,245/11,177
5q15 | ERAP1, LNPEP | rs27043 | Intronic: ERAP1 | 6.50 x 10^-12 | G | 1.13 (4) | 15,207/17,103
5q33.1 | TNIP1 | rs10036748 | Intronic: TNIP1 | 4.26 x 10^-9 | G | 1.10 (1) | 11,245/11,177
5q33.3 | IL12B | rs10076782 | Intronic: RNF145 | 4.11 x 10^-11 | G | 1.12 (1) | 11,245/11,177
5q33.3 | PTTG1 | rs2431697 | Intergenic | 1.11 x 10^-8 | C | 1.20 (5) | 8,312/12,919
7p14.3 | CCDC129 | rs4141001 | Missense: CCDC129 | 1.84 x 10^-11 | A | 1.14 (1) | 11,245/11,177
8p23.2 | CSMD1 | rs10088247 | Intronic: CSMD1 | 4.54 x 10^-9 | C | 1.17 (5) | 8,312/12,919
11p15.4 | ZNF143 | rs10743108 | Missense: ZNF143 | 1.70 x 10^-8 | C | 1.14 (1) | 11,245/11,177
11q13.1 | AP5B1 | rs610037 | Synonymous: AP5B1 | 4.29 x 10^-11 | C | 1.11 (1) | 11,245/11,177
12p13.3 | CD27, LAG3 | rs758739 | Intronic: NCAPD2 | 4.08 x 10^-8 | C | 1.09 (4) | 15,207/17,103
13q12.11 | GJB2 | rs72474224 | Missense: GJB2 | 7.46 x 10^-11 | T | 1.34 (3) | 10,727/10,582
13q14.11 | LOC144817 | rs12884468 | Intergenic | 1.05 x 10^-8 | G | 1.12 (1) | 11,245/11,177
14q23.2 | SYNE2 | rs2781377 | Stop-gained: SYNE2 | 4.21 x 10^-11 | G | 1.15 (1) | 11,245/11,177
17q12 | IKZF3 | rs10852936 | Intronic: ZPBP2 | 1.96 x 10^-8 | T | 1.10 (4) | 15,207/17,103
17q21.2 | PTRF, STAT3, STAT5A/B | rs11652075 | Missense: CARD14 | 3.46 x 10^-9 | C | 1.09 (3) | 10,727/10,582
17q25.3 | TMC6 | rs12449858 | Missense: TMC6 | 2.28 x 10^-8 | A | 1.12 (1) | 11,245/11,177
18q22.1 | SERPINB8 | rs514315 | 3' of SERPINB8 | 5.92 x 10^-9 | T | 1.13 (5) | 8,312/12,919
19q13.41 | ZNF816A | rs12459008 | Missense: ZNF816 | 2.25 x 10^-9 | A | 1.12 (3) | 10,727/10,582
21q22.11 | IFNGR2 | rs9808753 | Missense: IFNGR2 | 2.75 x 10^-8 | A | 1.08 (1) | 11,245/11,177
21q22.11 | SON | rs3174808 | Missense: SON | 1.15 x 10^-8 | G | 1.10 (1) | 11,245/11,177
ⱡ The most recently reported GWAS index SNP in each locus at genome wide significance (p-value ≤ 5 x 10-8), excluding any secondary signals in the locus
1 (Zuo, Sun et al. 2015); 2 (Cheng, Li et al. 2014); 3 (Tang, Jin et al. 2014); 4 (Sheng, Jin et al. 2014); 5 (Sun, Cheng et al. 2010)
PsA 1.3.6.3.2
HLA-Cw6 has also been found to be associated with PsA (Lopez-Larrea, Torre Alonso
et al. 1990; Chandran, Bull et al. 2013). Eder et al. showed that the frequency of this
allele was lower in patients with PsA than in patients with PSO only, although both
cohorts harboured a higher frequency of the allele than the controls (Eder, Chandran
et al. 2012). Ho et al. also reported that the HLA-Cw6 allele was strongly associated
with PSO and with PsA, the latter only in patients with type 1 PSO (Ho, Barton et al.
2008). A recent study assessed the “paradoxical” finding of HLA-Cw6 being protective
against PsA among PSO patients by comparing 2808 patients with PSO without arthritis,
1945 PsA cases and 8920 controls. After controlling for the age of onset of PSO, it
found no significant association of PsA with the allele; instead, PsA was significantly
associated with amino acid 97 of HLA-B (Bowes, Ashcroft et al. 2017).
Aside from HLA-Cw6, studies have provided evidence of other HLA associations. The
frequencies of HLA-B*38, B*08 and B*39 have been found to be significantly elevated in
PsA compared to PSO alone (Eder, Chandran et al. 2012; Winchester, Minevich et al.
2012). In addition, a large-scale fine-mapping study of GWAS and Immunochip datasets
reported associations with other HLA-C alleles along with HLA-A and HLA-B alleles
(Okada, Han et al. 2014). The HLA genes play a significant role in antigen
presentation and T-cell signalling, and alteration of the signalling pathways can lead to
inappropriate targeting and destruction of cells, thus contributing to the pathogenesis
of PsA (O'Rielly and Rahman 2014).
In PsA, a small number of GWAS have been published, most of which included patients
with PSO as well. Liu et al. performed a GWAS with 223 PSO cases (91 of whom had PsA)
and 519 controls in the “discovery” phase, and the results were then replicated in a UK
cohort of 576 PsA cases (Liu, Helms et al. 2008). They showed that the MHC is a major
factor in PsA susceptibility and confirmed the associations with IL23 and IL12B.
Finally, a novel region for PsA was revealed on chromosome 4q27, which harbours the
genes for the IL2 and IL21 cytokines. It should be noted that the findings of this study
did not reach the genome-wide significance threshold (p < 5 x 10^-8), probably because
of its small sample size. By genotyping SNPs that were strongly correlated with PsA in
a PSO cohort, Nair et al. found three genes to be significantly associated with PsA
susceptibility (HLA-C, IL12B, TNIP1). Moreover, associations were detected with
TNFAIP3, IL13, IL23A, TSC1 and SMARCA4, but these did not reach genome-wide
significance (Nair, Duffin et al. 2009). The association of TNIP1 was confirmed by
Bowes et al. in 1057 PsA patients. They also showed convincing evidence of association
with IL23A and nominal evidence with TNFAIP3 and TSC1 (Bowes, Orozco et al. 2011).
These findings implicate NF-κB and IL-23 signalling, which are key effectors of the
adaptive and innate immune responses, in the pathogenesis of PsA. Furthermore, the
association of IL13 with PsA was confirmed in a study including 1,057 PsA patients and
778 type 1 PSO cases (Bowes, Eyre et al. 2011). IL13 encodes an immunoregulatory
cytokine that inhibits inflammation by down-regulating macrophage activity.
The study published by Huffmeier et al. was the first GWAS focused solely on PsA and
involved 609 PsA cases and 990 healthy controls. Not only did it confirm the
associations with HLA-C and IL12B, but it also detected and replicated associations with
three polymorphisms at the TRAF3IP2 locus. The TRAF3IP2 gene is involved in the
regulation of the adaptive immune system (Huffmeier, Uebe et al. 2010). It codes for
Act1, which connects the adaptive immune responses mediated by IL-17 with the innate
NF-κB pathway, controlling the transcription of a range of pro-inflammatory
cytokines (Hoffmann and Baltimore 2006). Another study, after performing a stratified
analysis restricted to PsA cases (1922 PsA patients compared to 8037 controls),
confirmed the association of TRAF3IP2 with PsA, suggesting a shared
susceptibility with PSO (Ellinghaus, Ellinghaus et al. 2010).
In 2012, Ellinghaus et al. carried out a genome-wide meta-analysis using PsA data (535
cases) from their previous study, from Nair et al. and from an independent Canadian study.
A significant association was found between PsA and variants at the REL gene, and this
association was stronger than that found with PSO (Ellinghaus, Stuart et al. 2012).
In 2013, further analysis was conducted on 17 SNPs that were not significantly
associated with PsA in the study by Huffmeier et al., including independent cohorts of
1398 PsA cases and 6389 controls (Apel, Uebe et al. 2013). The only variant that
reached genome-wide significance, when combining the original GWAS dataset with
the replication study, was rs4649038 in the RUNX3 region. RUNX3 encodes a
transcription factor that promotes the differentiation of T-cells to CD8+ T cells which
predominate in the synovial fluid of PsA patients (Costello, Bresnihan et al. 1999).
In 2015, a meta-analysis of 3061 PsA cases, which also included PSO cases, reported
three novel association signals at the IFNLR1, IFIH1 and NFKBIA gene loci
(Stuart, Nair et al. 2015). Regarding the IFIH1 gene, a recent study (Budu-Aggrey,
Bowes et al. 2017) using the exome chip discovered a rare coding allele that is
protective for PsA and is independent of the variant reported by Stuart et al.
Finally, Bowes et al., using the Immunochip genotyping array in 1962 cases
and 8923 controls, reported a novel association at chromosome 5q31 and seven loci
with confirmed associations with PSO as well. These are IL23R, TNIP1, IL12B,
HLA-B/HLA-C, TRAF3IP2, STAT2/IL23A and TYK2, and they shed more light on the
importance of the pathways involved in susceptibility to PsA (Bowes, Budu-Aggrey et al.
2015).
The common risk loci that have been identified for both PSO and PsA emphasize the
genetic overlap between the two diseases, as well as the shared underlying pathways
involved in their pathogenesis, illustrating that the PSO experienced by patients with
PsA is genetically similar to cutaneous-only PSO.
PsA-specific risk loci 1.3.6.3.3
Although both diseases share a large genetic overlap, the fact that PsA has a higher
genetic burden as discussed in 1.3.6.2 suggests the existence of PsA-specific risk loci.
Among the HLA alleles, HLA-B27 has been reported to be a genetic marker for PsA
(Eder, Chandran et al. 2012), a finding that was reported in previous studies
(Gladman, Anhorn et al. 1986). In addition HLA-B*08 and HLA-B*38 have been
suggested to be PsA-specific and HLA-B*39 a potential risk allele for axial PsA (Eder,
Chandran et al. 2012). In another study three independent effects were identified in
HLA-C*0602, HLA-B and HLA-A*0201 (Bowes, Budu-Aggrey et al. 2015). However, a
more recent study revealed that the age of onset of PSO confounds the HLA
associations when comparing PsA with PSO patients. After controlling for age, the
HLA-C*0602 was no longer significantly associated with PsA; instead, the amino acid at
position 97 of HLA-B was associated with PsA (Bowes, Ashcroft et al. 2017).
Moreover, Glu at HLA-B position 45 confers risk of PsA, and the fact that HLA-B27,
HLA-B38 and HLA-B39 all carry Glu at this position is consistent with the stronger
association of these alleles with PsA (Okada, Han et al. 2014). However, it is difficult
to detect specific risk loci for these two correlated phenotypes in the MHC region
because of its extensive LD.
Outside of the MHC region, a number of loci have been reported to have larger effects in
PsA, such as variants in TRAF3IP2, REL and FBXL19 (Ellinghaus, Ellinghaus et al. 2010;
Huffmeier, Uebe et al. 2010; Nair, x..c. et al. 2013). In addition, a number of studies
have suggested the IL13 gene as a potential marker of PsA (Duffin, Freeny et al. 2009;
Bowes, Orozco et al. 2011; Eder, Chandran et al. 2011). However, PSO studies have
also reported an association with the same genetic variant (Genetic Analysis of
Psoriasis, the Wellcome Trust Case Control et al. 2010; Tsoi, Spain et al. 2012). An
Immunochip study reported a novel PsA-specific association at chromosome 5q31
that is independent of the IL13 PSO-associated SNP, and functional annotation
identified SLC22A5 as the candidate gene (Bowes, Budu-Aggrey et al. 2015). The same
study revealed a distinct PsA-associated variant at the IL23R locus (rs12044149). This
SNP was replicated in a different population, confirming its independence of the
PSO-associated variant (rs9988642) (Budu-Aggrey, Bowes et al. 2016). Finally, the
rs2476601 variant at PTPN22 has been reported for the first time to be PsA-specific
(Bowes, Loehr et al. 2015).
Overall aims and objectives 1.4
This project aims to improve understanding of the aetiology of PsA by using a wide
range of statistical methods to identify environmental, lifestyle and genetic PsA-specific
risk factors. In addition, assessing which comorbidities are prevalent in both PSO and
PsA and in other musculoskeletal diseases can shed some light on the shared biological
mechanisms underlying these disorders. These aims will be achieved by applying a
number of established methods along with some novel pleiotropic techniques to data
collected as part of the UK Biobank, in-house PsA genetic data and GWAS summary
statistics data from other musculoskeletal disorders such as RA, SLE, AS and juvenile
idiopathic arthritis (JIA).
In order to achieve this, known environmental risk factors and prevalent comorbidities
will be identified in both PSO and PsA compared to the healthy population using the
UK Biobank. In parallel, the association of prevalent comorbidities in PSO, PsA and
other musculoskeletal diseases will be assessed. Then, using genome-wide summary
statistics of other musculoskeletal diseases and statistical methods that exploit the
pleiotropy among traits, potential disease risk loci will be investigated. Finally,
bi-directional Mendelian Randomization (MR) analysis will be performed between PsA and
the significant risk factors found to identify any possible causative relationship and its
direction of effect.
Outline of thesis 1.5
The rest of the thesis is presented in four chapters. The second chapter presents two
studies: the first focuses on the environmental factors that are associated with both
PSO and PsA and on the prevalent comorbidities found in these diseases, and the second
investigates the prevalent comorbidities in additional musculoskeletal diseases and
their effect on physical activity. The third chapter presents the application of
statistical methods that exploit the phenomenon of pleiotropy to PsA, RA, SLE, AS and
JIA. The fourth chapter investigates the causal role of the environmental factors
identified in chapter 2 on PsA onset and vice versa, using MR. The final discussion
chapter brings together all three results chapters (environmental, genetic and
causality via MR) to discuss insights into the pathogenesis of PsA.
Chapter 2
Environmental risk factors
Introduction 2.1
The onset of PsA and PSO is influenced by the individual's genetic predisposition and
by environmental and lifestyle factors (Chandran 2010). A number of
environmental exposures have been identified as potential risk factors (Ogdie and
Gelfand 2015) for PsA in patients with PSO, including smoking, alcohol
consumption, BMI and trauma (both physical and psychological).
Previous studies suggest that various comorbidities including CVDs (Gelfand, Neimann
et al. 2006; Husted, Thavaneswaran et al. 2011), metabolic abnormalities (Haroon,
Gallagher et al. 2014) and depression (McDonough, Ayearst et al. 2014) are more
prevalent in PSO and PsA leading to a diminished health-related quality of life (de
Korte, Sprangers et al. 2004). In addition, patients with arthritis often find chronic pain
and fatigue more debilitating than comorbidities affecting their everyday life (Rosen,
Mussani et al. 2012). This is not unique to PSO and PsA; patients with all types of
inflammatory arthritis such as RA, AS and SLE, also experience similar comorbidities
and more pain and fatigue. However, there is a lack of studies comparing the
prevalence and incidence of comorbidities across these rheumatic diseases. In addition,
there is some evidence that patients are less physically active compared to the general
population but it is unclear how that relates to comorbidities (cause or effect) and the
contribution of comorbidities to physical inactivity has not been investigated in detail.
UK Biobank 2.1.1
Overview and aim 2.1.1.1
The UK Biobank is a large-scale national health resource established by the Wellcome
Trust medical charity, Medical Research Council, Department of Health, Scottish
Government and the Northwest Regional Development Agency. It is hosted by the
University of Manchester and it is also funded by the Welsh Assembly Government,
British Heart Foundation and Diabetes UK. Its main goal is to collect population level
data to facilitate epidemiological studies to assess the causes of a wide range of medical
conditions including cancer, PSO, diabetes, CVDs and dementia. It is also a valuable
resource to investigate susceptibility factors for particular diseases because it has
included thousands of participants who have completed detailed health, activity,
medical, medication and demographic questionnaires and have provided biological
samples; it is therefore possible to identify individuals with specific diseases, thus
providing enough statistical power to address questions about the risk factors and the
morbidities that co-exist with each disease.
Ethical Approval 2.1.1.2
The UK Biobank has been approved by the North West Multi-centre Research Ethics
committee (MREC) in the UK, by the National Information Governance Board for
Health and Social Care (NIGB) and by the Community Health Index Advisory Group
(CHIAG) in Scotland. Moreover, all participants provided written informed consent.
Access to UK Biobank data was provided to the Centre for Musculoskeletal Research
following submission of a study proposal.
Study design and patient recruitment 2.1.1.3
The UK Biobank is a prospective study of British participants aged 37-73, with good
representation across the UK, who were voluntarily recruited between 2006 and 2010.
Participants were identified from their National Health Service records and an
invitation was sent to them. After confirmation of attendance, a pre-visit questionnaire
was mailed to the volunteers to record information that could be easily forgotten, such
as medication, family medical history and surgeries. Twenty-two assessment centres
(Figure 7) were used, which were located within ten miles of areas with sufficient
population in the preferred age range. Each centre recruited volunteers for a period of
six months to one year before relocating to another area.
Figure 7 | Locations of the 22 assessment centres in the UK: 1. Edinburgh, 2. Glasgow, 3. Newcastle-upon-Tyne, 4. Middlesbrough, 5. Leeds, 6. Bury, 7. Manchester, 8. Altrincham, 9. Liverpool, 10. Sheffield, 11. Nottingham, 12. Stoke-on-Trent, 13. Birmingham, 14. Oxford, 15. Bristol, 16. Reading, 17. London, 18. Hounslow, 19. Croydon, 20. Cardiff, 21. Swansea, 22. Wrexham
Data collection 2.1.1.4
Baseline data 2.1.1.4.1
Lifestyle, environmental information and medical history were collected through a
computer-based, self-completed questionnaire and follow-up interview conducted by a
research nurse. In addition, physical measurements and biological samples were
collected at the first assessment visit (Appendix Table 1).
The baseline questionnaire was designed based on:
- known and potential environmental risk exposures for disorders that are
  prevalent in the adult population
- current knowledge about risk factor-disease relationships
- the importance of each disease
- the reliability of the questionnaire measures
- a prevalence of exposures of at least 15%.
The baseline questions can be categorised into:
- sociodemographics
- lifestyle exposures including dietary habits, smoking and alcohol consumption
  and physical activity
- psychological and cognitive state
- family history and early childhood exposures
- medical history and general health.
The questions were drawn from questionnaires used in previous studies and were
reviewed by experts.
Aims and objectives 2.2
The work described in this chapter comprises two independent studies using UK
Biobank data.
Aims and objectives of first study 2.2.1
Aim 2.2.1.1
The aim of the first study was twofold:
1. To investigate the association of known environmental factors with the
prevalence of PsA and PSO
2. To identify the association of prevalent comorbidities with disease status.
Objectives 2.2.1.2
- Create three distinct study cohorts (the PsA cohort, the PSO without arthritis
  cohort and the controls) using the UK Biobank participants
- Perform regression analysis to identify the environmental factors that are
  associated with the disease status compared to controls
- Perform regression analysis to identify the prevalent comorbidities in both PsA
  and PSO without arthritis and compare rates between groups.
Aim and objectives of second study 2.2.2
Aim 2.2.2.1
The second study aimed:
1. To identify any associations between known prevalent comorbidities and
rheumatic diseases, including RA, PsA, AS and SLE
2. To evaluate the contribution of these comorbidities to the physical activity
levels of patients with a rheumatic disease.
Objectives 2.2.2.2
- Create the study cohorts (RA, PsA, AS, SLE and controls)
  using the UK Biobank participants
- Create an algorithm to identify and correct misspelled medication names
  reported by participants during the interview and recorded as free text
- Determine the prevalence and the incidence rates of comorbidities in these
  types of inflammatory arthritis compared to people without a rheumatic
  disease
- Perform a sensitivity analysis comparing the prevalence of comorbidities in
  participants with each self-reported rheumatic disease who were taking DMARDs to
  those without a rheumatic disease
- Create a modified functional comorbidity index based on the physical function
  domain of the 36-item Short Form Health Survey (SF-36)
- Determine whether the prevalent comorbidities are associated with physical
  activity among patients with a rheumatic disease using the above-mentioned
  index.
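The objective of correcting misspelled free-text medication names can be illustrated with a minimal Python sketch. It assumes a hypothetical reference list of drug names (KNOWN_DRUGS) and uses approximate string matching from the Python standard library. The function name, the similarity cutoff of 0.8 and the drug list are illustrative assumptions and do not describe the algorithm developed for this thesis.

```python
import difflib

# Hypothetical reference list; a real analysis would use a full drug dictionary.
KNOWN_DRUGS = [
    "methotrexate", "sulfasalazine", "leflunomide",
    "hydroxychloroquine", "adalimumab", "etanercept",
]


def correct_medication(free_text, known=KNOWN_DRUGS, cutoff=0.8):
    """Return the closest known drug name, or None if nothing is similar enough."""
    token = free_text.strip().lower()
    if token in known:
        return token  # already spelled correctly
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio
    matches = difflib.get_close_matches(token, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(correct_medication("Methotrexat"))     # close to "methotrexate"
print(correct_medication("sulphasalazine"))  # spelling variant of "sulfasalazine"
print(correct_medication("xyz"))             # no plausible match
```

A stricter cutoff reduces false corrections at the cost of leaving more entries unmatched, so unmatched entries would still need manual review.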
Contribution of the candidate 2.3
The candidate (EB) was not involved in the acquisition of data for the UK Biobank.
However, for the first study the data preparation, the planning, statistical analysis and
interpretation of the results were performed by EB.
For the second study, the estimation of the prevalence and creation of the misspelling
algorithm were performed by the candidate (EB). Additional analysis was performed by
Michael Cook, a research assistant at the Arthritis Research UK Centre for
Epidemiology.
Methods 2.4
Identifying lifestyle factors and comorbidities associated with PSO without arthritis and PsA compared to the general population 2.4.1
Defining the study design 2.4.1.1
For the purpose of the current study, a cross-sectional study design was used (see
1.2.2.4 and Table 1).
Defining the study population 2.4.1.2
Participants were included in the analysis if they self-reported a diagnosis of PsA or
PSO when answering the touch-screen question “Has a doctor ever told you that you
have any other serious medical conditions or disabilities” or during the follow-up interview
with the research nurse.
For the purpose of this study three cohorts were created:
- The PsA cohort, which included participants who reported a diagnosis of
  PsA
- The PSO-only cohort, which included participants with PSO who had not
  reported any type of arthritis, including PsA, AS, RA, SLE and non-specified
  arthritis
- The control cohort, which included all remaining participants not belonging to the
  previous two groups.
Comparisons were performed between these three cohorts in order to assess
whether an association emerged because of the presence of PSO in the PsA cohort or
was specific to PsA.
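The cohort definitions above can be expressed as a short rule set. The following Python sketch is illustrative only: the condition labels and the function name are hypothetical and do not correspond to UK Biobank coding. As in the definitions above, a participant with PSO who also reports another type of arthritis falls into the control group.

```python
# Hypothetical labels for self-reported conditions (not UK Biobank codes).
ARTHRITIS_TYPES = {"PsA", "AS", "RA", "SLE", "arthritis (unspecified)"}


def assign_cohort(conditions):
    """Assign a participant to one of the three study cohorts."""
    conditions = set(conditions)
    if "PsA" in conditions:
        return "PsA"
    # PSO only: PSO reported and no arthritis of any type
    if "PSO" in conditions and not (conditions & ARTHRITIS_TYPES):
        return "PSO only"
    # everyone else, as in the definition of the control population
    return "control"


print(assign_cohort(["PSO", "PsA"]))
print(assign_cohort(["PSO"]))
print(assign_cohort(["PSO", "RA"]))
```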
Identifying lifestyle factors associated with the disease status 2.4.1.3
Environmental risk factors that may potentially trigger disease onset in a susceptible
individual were defined a priori, following a comprehensive review of the literature
(section 1.3.5). The variables selected included the Townsend deprivation index, BMI,
current smoking status, frequency of alcohol consumption and fractures or muscle
trauma. Information about these lifestyle habits was collected at the baseline
assessment visit during the completion of the computer-based, self-reported
questionnaire.
The postcode of residence was used to estimate the Townsend index, a measure of
area-based deprivation. This index incorporates the following variables: unemployment,
non-car and/or non-home ownership and household overcrowding. Positive values of
the index indicate areas with high deprivation, whereas zero indicates an area with
mean values. The ethnic background variable is categorised into White, Asian, Black,
Chinese, other and mixed by the UK Biobank. Due to the insufficient number of
participants in some categories, all groups except White were clustered together. The
self-reported smoking status was classified as never, previous and current, and the
frequency of alcohol intake as never, special occasions only, one to three times a
month, once or twice a week, three or four times a week and daily or almost daily.
For this study, alcohol consumption status was categorised, based on the frequency of
alcohol intake, as daily (daily or almost daily intake), frequent (once or twice a week,
or three or four times a week) and low-frequency (never, special occasions only, one to
three times a month). In addition, weight and height were measured and BMI was derived.
Finally, physical activity was measured based on the International Physical Activity
Questionnaire (IPAQ) (Appendix Figure 1). Using the participants’ responses about
the number of days per week and the duration per day spent in physical
activity, processed according to IPAQ guidelines (Appendix Figure 2),
three categories were created: “low intensity”, “medium intensity” and “high intensity”
physical activity. Table 17 provides further details.
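The recoding of the questionnaire answers can be sketched as follows. This Python example is illustrative and is not the code used for the thesis: the function names and output labels are assumptions, while the answer strings follow the categories described in the text. BMI is computed with the standard formula, weight in kilograms divided by the square of height in metres.

```python
# Mapping of alcohol intake frequency answers to the three study categories.
ALCOHOL_GROUPS = {
    "Daily or almost daily": "daily",
    "Three or four times a week": "frequent",
    "Once or twice a week": "frequent",
    "One to three times a month": "low-frequency",
    "Special occasions only": "low-frequency",
    "Never": "low-frequency",
}


def recode_alcohol(answer):
    """Return the study category, or None for unusable answers."""
    return ALCOHOL_GROUPS.get(answer)


def recode_ethnicity(answer):
    """Collapse all groups except White into a single 'Other' category."""
    if answer is None:
        return None
    return "White" if answer == "White" else "Other"


def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


print(recode_alcohol("Once or twice a week"))
print(recode_ethnicity("Chinese"))
print(round(bmi(70, 175), 1))
```

Answers such as "Prefer not to answer" map to None here, so they can be dropped before the analysis.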
Table 17 | Data collection of lifestyle factors by the UK Biobank and their categorisation for the current study
Data-Field | Question asked by the UK Biobank | Possible answers provided by the UK Biobank | Categorisation for current study
Townsend deprivation index (189) | Townsend deprivation index was estimated before assessment visit and is based on postcode of residence. | N/A | Used as a continuous variable
Ethnic background (21000) | “What is your ethnic group?” | White; Mixed; Asian or Asian British; Black or Black British; Chinese; Other ethnic group | White background; Other ethnic background
Smoking status (20116) | “Do you smoke tobacco now and in the past how often have you smoked tobacco?” | Never; Previous; Current | As categorised by the UK Biobank
Alcohol intake frequency (1558) | “About how often do you drink alcohol?” | Daily or almost daily | Daily drinker
 | | Three or four times a week; Once or twice a week | Frequent drinker
 | | One to three times a month; Special occasions only; Never | Low frequency drinker
BMI (21001) | BMI was estimated from height and weight manual measurements during assessment visit | N/A | Used as a continuous variable
Fractured/broken bones in the last 5 years (2463) or muscle injuries | “Have you ever fractured/broken any bones in last 5 years?” Data on muscle injuries were reported during the interview | Yes; No; Do not know; Prefer not to answer | As categorised by the UK Biobank (Do not know and prefer not to answer were excluded from the analysis)
Physical activity (864, 874, 884, 894, 904, 914) | (864) “In a typical week, how many days did you walk for at least 10 minutes at a time?”; (874) “How many minutes did you usually spend walking on a typical day?”; (884) “In a typical week, on how many days did you do 10 minutes or more of moderate physical activities like carrying light loads, cycling at normal pace?”; (894) “How many minutes did you usually spend doing moderate activities on a typical day?”; (904) “In a typical week, how many days did you do 10 or more minutes of vigorous activity”; (914) “How many minutes did you usually spend doing vigorous activities on a typical day?” | N/A | Using IPAQ guidelines, three categories of physical activity were used: “low intensity”, “medium intensity” and “high intensity”
N/A: Not Applicable; BMI: Body Mass Index; IPAQ: International Physical Activity Questionnaire.
The number in () is the number of the Data-Field used in the UK Biobank data.
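The six physical activity responses can be combined into the three activity categories along the following lines. This Python sketch is an assumption-laden illustration: the MET weights (3.3 for walking, 4.0 for moderate and 8.0 for vigorous activity) and the thresholds are recalled from the published IPAQ short-form scoring protocol, the rule for moderate activity and walking is simplified, and the data cleaning and truncation steps of the protocol are omitted. The exact processing used in the thesis is the one given in Appendix Figure 2 and may differ.

```python
# MET weights as recalled from the IPAQ short-form scoring protocol (assumption).
MET = {"walking": 3.3, "moderate": 4.0, "vigorous": 8.0}


def ipaq_category(walk_days, walk_min, mod_days, mod_min, vig_days, vig_min):
    """Classify weekly activity as low, medium or high intensity."""
    total_met = (
        MET["walking"] * walk_days * walk_min
        + MET["moderate"] * mod_days * mod_min
        + MET["vigorous"] * vig_days * vig_min
    )  # MET-minutes per week
    total_days = walk_days + mod_days + vig_days

    if (vig_days >= 3 and total_met >= 1500) or (
        total_days >= 7 and total_met >= 3000
    ):
        return "high intensity"
    if (
        (vig_days >= 3 and vig_min >= 20)
        or (mod_days >= 5 and mod_min >= 30)    # simplified: moderate alone
        or (walk_days >= 5 and walk_min >= 30)  # simplified: walking alone
        or (total_days >= 5 and total_met >= 600)
    ):
        return "medium intensity"
    return "low intensity"


print(ipaq_category(0, 0, 0, 0, 0, 0))
print(ipaq_category(5, 30, 0, 0, 0, 0))
print(ipaq_category(7, 60, 0, 0, 3, 60))
```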
Identification of confounders 2.4.1.3.1
The aim of epidemiological studies is to estimate the associations between diseases of
interest and risk factors. This is done by comparing the effect that the risk factor has
on the diseased group versus the healthy cohort. However, there may be other factors
that are related to the exposure and also affect the development of a disease leading
to a distortion in the estimated measurement of the association between exposure and
disease. These factors are called confounding variables and it is essential to account for
them.
There is no standard, agreed method for determining which variables can act as
confounders. Some investigators decide this by inspecting the data and checking
whether there is a clinically meaningful association between the variable (potential
confounder) and the risk factor and between the variable and the outcome of interest.
Others compare the crude estimate (the association before adjusting for a
confounder) with the adjusted estimate, and a difference of more than 10% is taken
to indicate confounding (Skelly, Dettori et al. 2012). Adjustment for confounders
can be achieved either at the study design stage or during the statistical analysis.
The methods used at each stage are described in Table 18.
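The comparison of crude and adjusted estimates can be illustrated with a small Python sketch. It uses the Mantel-Haenszel odds ratio across strata of a single confounder as the adjusted estimate, and it measures the change relative to the adjusted estimate; both choices are assumptions made for illustration, as are the function names and the counts, which are invented and not taken from the UK Biobank.

```python
def odds_ratio(a, b, c, d):
    """OR from a 2x2 table: a, b = exposed cases/controls; c, d = unexposed."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Adjusted OR pooled over confounder strata, each given as (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


def change_in_estimate(strata, threshold=0.10):
    """Return crude OR, adjusted OR and whether they differ by more than 10%."""
    a, b, c, d = (sum(col) for col in zip(*strata))  # collapse the strata
    crude = odds_ratio(a, b, c, d)
    adjusted = mantel_haenszel_or(strata)
    relative_change = abs(crude - adjusted) / adjusted
    return crude, adjusted, relative_change > threshold


# Invented counts: no association within either stratum, but the strata differ
# in both exposure and disease frequency.
strata = [(10, 20, 40, 80), (80, 40, 20, 10)]
print(change_in_estimate(strata))
```

In this invented example the exposure is unrelated to disease within each stratum, yet the crude odds ratio is well above one, which is the pattern the 10% rule is meant to detect.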
In summary, adjustment at the design stage happens before data gathering, and in
registry-based observational studies like the UK Biobank it can be difficult or
insufficient to control for confounding only during the design of the study. In
epidemiological studies, many confounders usually need to be accounted for; this
cannot be done with restriction, as it could result in very small cohorts, and it
may not be possible to find control subjects to match with cases for the
comparisons. It should be noted that control during design and control through
analysis can be combined in the same study, for example case-control matching
followed by multivariable analysis.
For the current study, confounding was controlled at the analysis stage using
multivariable analysis, described below. Age, sex and ethnic background
were included as confounding variables, as is routine for epidemiological studies.
Table 18 | Methods for controlling confounding effects in statistical modelling
Stage of the study | Method | Description | Advantages (+) & Disadvantages (-)
Design | Restriction | Participation in the study of individuals who are similar according to a confounder | - Difficult to generalise to the rest of the population
Design | Randomization | Randomly allocating individuals to exposure categories | + Similarly distributed known and unknown confounders for each cohort being compared; - Use in clinical trials
Design | Matching | Selection of controls according to the distribution of confounders among the cases | - Use in case-control studies
Analysis | Stratification | Evaluation of association between exposure and disease within confounder’s different strata where the confounder does not vary | + Good for small number of strata and with one or two confounders to control; - Strata with more subjects provide more precise estimation of the association compared to those with fewer (use of Mantel-Haenszel weighting method)
Analysis | Standardisation | As different populations of interest may be significantly different with respect to age and gender, this method compares age and/or gender specific rates | + Comprehensive comparison with the increased strata of specific rates; - Used to control for age and gender; - Choice between direct or indirect; - Used for mortality and morbidity rates; - Difficult to use when doing large number of comparisons
Analysis | Multivariate analysis | Evaluation of association between exposure(s) and disease and controlling for many confounders | + Good for more than two confounders and for confounders with many grouping levels to simultaneously handle; - Multicollinearity, linearity, normality must be taken into account
Analysis | Propensity Score | Measurement of probability of an exposure based on the subject’s observed baseline characteristics | + Creation of a single score based on all confounders; + Robust when the outcome is rare; - Exposure must be categorical variable; - Information loss when balancing the comparison groups
2.4.1.3.2 Statistical analysis
All data analysis was performed using R statistical analysis software (R Development
Core Team 2008).
2.4.1.3.2.1 Descriptive analysis
All continuous variables were initially examined using histograms (see Supplementary
data) to assess whether they followed a normal distribution. Non-normally
distributed variables were presented as medians with IQR. The significance of
between-group differences was examined using the Mann-Whitney U-test for non-normally
distributed variables and the chi-squared test for categorical variables. A two-tailed
p-value<0.05 was considered statistically significant.
2.4.1.3.2.2 Missing data
In observational studies like the UK Biobank it is common to encounter missing data.
Within the dataset, however, missing data were minimal. For all the
environmental/lifestyle exposures, <1% was missing (participants who responded “I do
not know” or “Prefer not to answer”); thus excluding these subjects and performing a
complete-case analysis would not result in a significant loss of statistical power or bias
the results.
2.4.1.3.2.3 Association of lifestyle factors with the prevalence of PSO and PsA
Analysis was performed in two stages. During the initial “screening” stage, logistic
regression analysis was used to build one statistical model per environmental factor,
with disease status as the outcome and adjustment for age, sex and ethnic background;
this is referred to as the adjusted model. During this stage, the factors that were
significantly associated with disease status were identified by pairwise comparison of
all three cohorts. For the second and final stage of the analysis, three multivariable
models were built with the same outcome variable and the same confounders as in the
previous stage, with the addition of those factors that were found to be statistically
significant in that stage; this is referred to as the multivariable model.
2.4.1.4 Investigating the association of prevalent comorbidities with disease status
The choice of co-existing morbidities to be investigated was made a priori, after an
extensive literature review as presented in section 1.3.4. During the computer-based
questionnaire, participants were asked if they had ever been diagnosed by a physician
with specific disorders including CVDs, diabetes and pulmonary diseases. The stated
diseases were verified during the subsequent interview.
The CVDs that were analysed in the current study were i) heart attack or myocardial
infarction (MI) ii) angina iii) stroke or transient ischaemic attack (TIA) iv) hypertension
and v) high cholesterol. The last two morbidities do not belong to the CVD category;
rather, they are traditional risk factors for developing CVD. For this analysis, they
were included in the CVDs. The definition of depression was complicated in the UK
Biobank due to the number and content of the questions included in the questionnaire.
For the current study, cases were diagnosed by a specialist or a general practitioner
(GP) for depression, nerves, tension or anxiety. More specifically, the
participants were characterised as depressed if i) they self-reported ever feeling
depressed or down, or uninterested in things they once used to enjoy, for at least a
week and ii) the duration of this feeling lasted for at least two weeks and iii) this
episode occurred more than once and iv) they had seen a GP or a psychiatrist for
nerves, anxiety, tension or depression. In the case of chronic pain, participants had to
have experienced persistent pain for more than three months in any of the sites listed,
such as headache, back pain, knee pain and pain all over the body, to be categorised as
suffering from chronic pain. Finally, participants who reported feeling tired or lacking
energy in the last two weeks at least several days per week were classed as fatigued.
More details about the comorbidities included per disease category and the clustering
used for this study can be found in Table 19. Comorbidities with a frequency of more
than 1% were reported.
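The depression and fatigue rules described above can be sketched as two small predicates. This is only an illustrative sketch, not the code used in the thesis: the function names, argument names and answer strings are hypothetical, and the inputs are assumed to have been decoded already from the questionnaire fields listed in Table 19. The minimum number of episodes is exposed as a parameter and defaults to the "at least one episode" wording of Table 19.

```python
# Hypothetical answer labels for the fatigue question (field 2080 in Table 19).
FATIGUED_ANSWERS = {"several days", "more than half the days", "nearly every day"}


def is_fatigued(answer):
    """True if the participant reported tiredness at least several days in the last two weeks."""
    return answer.strip().lower() in FATIGUED_ANSWERS


def is_depressed(ever_low_mood, low_mood_weeks, low_mood_episodes,
                 ever_uninterested, uninterested_weeks, uninterested_episodes,
                 seen_gp, seen_psychiatrist, min_episodes=1):
    """Apply the two-arm rule: (low mood OR loss of interest) AND help-seeking."""
    # Arm 1: depressed/down for a week, lasting >= 2 weeks, with enough episodes.
    mood_arm = (ever_low_mood and low_mood_weeks >= 2
                and low_mood_episodes >= min_episodes)
    # Arm 2: uninterested in things once enjoyed, same duration and episode rule.
    interest_arm = (ever_uninterested and uninterested_weeks >= 2
                    and uninterested_episodes >= min_episodes)
    # Either arm must be combined with having seen a GP or a psychiatrist.
    sought_help = seen_gp or seen_psychiatrist
    return bool((mood_arm or interest_arm) and sought_help)
```

Setting `min_episodes=2` would instead encode the "occurred more than once" wording used in the text.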
2.4.1.4.1 Selection of exposure/independent variables
For assessing the prevalence of co-existing morbidities in PsA and PSO without
arthritis compared to the control group and to each other, disease status (PsA, PSO
without arthritis or healthy controls) was used as the independent variable.
Table 19 | Morbidities with their codes included in the current study and categorisation used
Morbidity UK Biobank Codes Categorisation for current
study
Heart attack 6150 heart attack ¥
1075 heart attack/myocardial infarction
Angina 6150 angina ¥
1074 angina
Stroke 6150 stroke ¥
1081 stroke
1082 transient ischaemic attack (TIA)
1083 subdural haemorrhage/haematoma
1086 subarachnoid haemorrhage
1425 cerebral aneurysm
1583 ischaemic stroke
Hypertension 6150 high blood pressure ¥
1065 hypertension
1072 essential hypertension
High cholesterol 1473 high cholesterol
Pulmonary disease 6152 blood clot in the lung,
emphysema/chronic bronchitis,
asthma¥
1093 pulmonary embolism
1111 asthma
1112 chronic obstructive airways disease/copd
1113 emphysema/chronic bronchitis
1114 bronchiectasis
1123 sleep apnoea
1412 bronchitis
1472 emphysema
Diabetes 2443 diabetes ¥
1220 diabetes
1222 type 1 diabetes
1223 type 2 diabetes
1521 diabetes insipidus
Liver disease 1136 liver/biliary/pancreas problem
1155 hepatitis
1156 infective/viral hepatitis
1157 non-infective hepatitis
1158 liver failure/cirrhosis
1506 primary biliary cirrhosis
1578 hepatitis a
1579 hepatitis b
1580 hepatitis c
1581 hepatitis d
1582 hepatitis e
1604 alcoholic liver disease/alcoholic cirrhosis
Fatigue in last 2 weeks 2080 “Over the past two weeks, how often have you felt tired or had little energy?”
Categorised as fatigued if participants reported feeling tired: several days OR more than half the days OR nearly every day
Gastrointestinal disease 1154 irritable bowel syndrome
1462 Crohn’s disease
1463 ulcerative colitis
¥ computer-based question
Table 19 (continued)
Depression
4598 “Ever depressed/down for at least a whole week?”
4609 “How many weeks was the longest period when you were feeling depressed/down?”
4620 “How many periods have you had when you were feeling depressed/down for at least a whole week?”
4631 “Have you ever had a time when you were uninterested in things or unable to enjoy the things you used to for at least a whole week?”
5375 “How many weeks was the longest period when you were uninterested in things or unable to enjoy the things you used to?”
5386 “How many periods have you had when you were uninterested in things or unable to enjoy the things you used to for at least a whole week?”
2090 “Have you ever seen a general practitioner for nerves, anxiety, tension or depression?”
2100 “Have you ever seen a psychiatrist for nerves, anxiety, tension or depression?”
Categorised as depressed if participants:
(ever depressed/down for at least a week AND at least two weeks duration AND at least one episode AND ever seen a GP OR a psychiatrist for nerves, anxiety, tension or depression)
OR
(ever uninterested for things once used to enjoy AND at least two weeks duration AND at least one episode AND ever seen a general practitioner OR a psychiatrist for nerves, anxiety, tension or depression)
Chronic pain (more than three months)
6159 “In the last month have you experienced any of the following that interfered with your usual activities?” Headache, facial pain, neck/shoulder pain, back pain, stomach/abdominal pain, hip pain, knee pain, pain all over the body, none of the above
3799 “Have you had headaches for more than three months?”
4067 “Have you had facial pains for more than three months?”
3404 “Have you had neck/shoulder pain for more than three months?”
3571 “Have you had back pains for more than three months?”
3741 “Have you had stomach/abdominal pains for more than three months?”
3414 “Have you had hip pains for more than three months?”
3773 “Have you had knee pains for more than three months?”
2956 “Have you had pains all over your body for more than three months?”
Categorised as having chronic pain if: experienced pain in the last month in any of the sites listed AND pain persisted more than three months
2.4.1.4.2 Identification of confounders
As described in section 2.4.1.3.1, adjustment for confounders was performed during
the multivariable statistical analysis. Potential confounders were selected based on
existing knowledge of clinically meaningful associations between comorbidity outcome
and confounder; for example, smoking is known to be associated with CVDs and is
therefore a potential confounder. The confounders that were included were age, sex,
ethnic background, Townsend deprivation index, current smoking status, frequency of
alcohol consumption and BMI. Furthermore, as the duration of existing inflammation in
the body could be a potential confounder, it was used to verify the significant
associations found during the analysis. It was estimated using information provided by
the participants about their current age and year of diagnosis with either PSO or PsA.
2.4.1.4.3 Statistical analysis
2.4.1.4.3.1 Descriptive analysis
All comorbidities were coded as binary categorical variables, where 1 indicates the
presence of the comorbidity and 0 its absence. The association between each comorbidity
and disease status was tested using the chi-squared test. A two-tailed p-value<0.05 was
considered statistically significant.
2.4.1.4.3.2 Missing data
A complete-case analysis was performed by excluding participants who had replied “I
don’t know” or “Prefer not to answer” to the relevant questions.
2.4.1.4.3.3 Association of prevalent comorbidities with disease status
To investigate the association between prevalent comorbidities and disease status,
both disease cohorts were compared with the controls and with each other using, firstly,
a univariate logistic regression to assess the crude odds ratio and, secondly, a
multivariable logistic regression analysis adjusting for possible confounders including
age, gender, ethnicity, Townsend deprivation index, BMI, current smoking status and
alcohol consumption status.
2.4.2 Comorbidities in rheumatic diseases and their effect on physical activity
2.4.2.1 Defining the study design
The design of this study is a population-based cohort study, in which a case-control
analysis was also performed when the incident cases of comorbidity that developed
after the diagnosis of arthritis were identified.
2.4.2.2 Defining the study population
2.4.2.2.1 From the self-reported data during the interview
Participants who self-reported as having RA, AS, PsA or SLE during their interview
were clustered into the four respective disease cohorts. Participants who reported not
having any medical condition, who reported having non-specified arthritis and those
who reported having more than one form of arthritis were excluded from the study.
For sensitivity analysis purposes, participants were also identified as having one type of
rheumatic disease from their reports during the interview with the research nurse and
whether they were using synthetic or biologic DMARDs. Medications were recorded
during the interview by selecting from a pre-specified list. When a drug was not
included in the list, it was recorded in free-text format. This method led to the
insertion of text varying from one word to many, including drugs with misspelled
names.
The coding of medication is an on-going process by UK Biobank and during the
implementation of the current study the coding of the DMARDs had not been
completed (as of December 2016). For that reason, an algorithm was developed to
recognise drug names in free-text data, whether correctly spelled or misspelled. The
process included three stages:
i) Identification of current medications prescribed to patients with rheumatic
diseases from the British National Formulary (musculoskeletal chapter)
ii) Creation of a “dictionary” that included both generic and brand drug names by
downloading an xml version of the majority of the drugs from Drugbank
(www.drugbank.ca)
iii) Use of the spellchecking library PyEnchant for Python (www.python.org) to
create a text mining algorithm to correct misspelled medication names.
Briefly, PyEnchant tokenises the free-text sentences into single words and then cross-
references each word against the provided dictionary. If a word is not included in the
dictionary of drugs, the algorithm returns a number of suggestions for the misspelled
word based on the Levenshtein distance between the two words (Haldar and
Mukhopadhyay 2011). The latter is used as a measurement of similarity between two
words, in which distance is the number of deletions, insertions and substitutions
needed to transform one word into another. For example, the words “ward” and “world”
have a distance of 2, as one substitution (a to o) and one insertion (the letter l) are
needed to get the second word from the first. If the distance between two words is
larger than a specified threshold, meaning the words are too different, the algorithm
does not provide any suggestions.
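The distance calculation and the thresholded suggestion step can be illustrated with a short standard-library sketch. It does not call PyEnchant and is not the algorithm used in the thesis; the function names, the four-drug mini-dictionary and the default threshold of 2 are assumptions made purely for illustration.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def suggest(word, dictionary, max_distance=2):
    """Return dictionary entries within max_distance of word, closest first."""
    word = word.lower()
    scored = sorted((levenshtein(word, entry.lower()), entry) for entry in dictionary)
    return [entry for distance, entry in scored if distance <= max_distance]


# Hypothetical mini-dictionary; the thesis built its dictionary from DrugBank.
DRUGS = ["methotrexate", "sulfasalazine", "leflunomide", "adalimumab"]


def normalise_free_text(text, dictionary=DRUGS, max_distance=2):
    """Tokenise free text and keep the closest dictionary entry for each token, if any."""
    matches = []
    for token in text.lower().split():
        candidates = suggest(token, dictionary, max_distance)
        if candidates:
            matches.append(candidates[0])
    return matches
```

For instance, `levenshtein("ward", "world")` reproduces the distance of 2 from the example above, and tokens that are further than the threshold from every dictionary entry yield no suggestion.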
2.4.2.3 Comparison of the prevalence and incidence of comorbidities in rheumatic diseases
2.4.2.3.1 Comorbidities investigated
The comorbidities that were investigated were decided a priori and they were those
that are considered to be more likely to co-exist with rheumatic diseases based on
previous reports (Burner and Rosenthal 2009; Edwards, Cahalan et al. 2011;
Nurmohamed, Heslinga et al. 2015; Doyle and Dellaripa 2017). These were:
Myocardial disorders such as angina and heart attack
Vascular diseases including stroke and hypertension
Pulmonary disease which includes COPD or emphysema or bronchitis
Diabetes
Depression
2.4.2.3.2 Identification of incident cases of comorbidity
In order to identify the incident cases of comorbidities whose onset was after the
diagnosis of arthritis, the age or year of diagnosis was used and the number of years
between the diagnosis of arthritis and that of each comorbidity was estimated.
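As a minimal sketch of this step, the gap in years can be computed from the two diagnosis years and a comorbidity flagged as incident when it follows the arthritis diagnosis. The function names are hypothetical, and treating a comorbidity diagnosed in the same year as the arthritis as not incident is an assumption of this sketch rather than a rule stated in the text.

```python
def years_to_comorbidity(arthritis_year, comorbidity_year):
    """Years from arthritis diagnosis to comorbidity diagnosis, or None if either is unknown."""
    if arthritis_year is None or comorbidity_year is None:
        return None
    return comorbidity_year - arthritis_year


def is_incident(arthritis_year, comorbidity_year):
    """True if the comorbidity was diagnosed strictly after the arthritis (assumed tie rule)."""
    gap = years_to_comorbidity(arthritis_year, comorbidity_year)
    return gap is not None and gap > 0
```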
2.4.2.3.3 Statistical analysis
All statistical analyses were performed in R 3.3.2 and Stata V.13.1.
2.4.2.3.3.1 Descriptive analysis
Non-normally distributed continuous variables were presented as medians with IQR
and categorical variables as number of participants (%). The significance of group
differences with the control group was tested using the Mann-Whitney U-test for the
non-normally distributed variables and the chi-squared (χ2) test for the categorical
variables. A two-tailed p-value<0.05 was considered statistically significant.
2.4.2.3.3.2 Prevalence of comorbidities
To estimate prevalence and standardised morbidity ratios (SMRs) adjusted for sex and
5-year age band, indirect standardisation was used. Indirect standardisation is usually
used to estimate the expected mortality rate for the index population, given age-specific
mortality rates from a reference population.
The standardised mortality ratio is expressed in ratio and integer (ratio*100) formats
along with a confidence interval. So,
\[ \text{Standardised mortality ratio} = \frac{\text{number of observed deaths}}{\text{number of expected deaths}} \times 100 \]
and
\[ \text{Number of expected deaths} = \sum_{i=1}^{k} n_i R_i \]
where $n_i$ is the person-time for the $i$th study group stratum and $R_i$ is the reference
population rate for the $i$th stratum.
In the current study, indirect standardisation takes the age- and sex- specific rates from
the standard or reference population and applies them to the corresponding numbers
of people in the age groups in the population of interest. Summing these gives the
total number of events expected in the special population if the age- and sex-specific
rates were the same as in the standard population. Here the standard or reference
population was the control group.
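The calculation described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the code used for the analysis: the function name, the dictionary layout keyed by (sex, age band) and the counts in the usage note are all hypothetical.

```python
def indirect_standardisation(reference, study_counts, observed):
    """Return (expected events, ratio * 100) using stratum-specific reference rates.

    reference:    {stratum: (events in reference group, people in reference group)}
    study_counts: {stratum: people in the population of interest}
    observed:     total events observed in the population of interest
    """
    expected = 0.0
    for stratum, n_i in study_counts.items():
        ref_events, ref_people = reference[stratum]
        rate_i = ref_events / ref_people   # R_i: reference rate for this stratum
        expected += n_i * rate_i           # n_i * R_i, summed over strata
    return expected, observed / expected * 100
```

With made-up reference counts of 10/100 and 20/100 in two strata, 50 study participants in each stratum and 30 observed events, the expected number of events is 15 and the standardised ratio is 200.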
A sensitivity analysis was performed by restricting the inclusion of participants to those
who were classified in one of the rheumatic disease cohorts and also reported taking
a synthetic or biologic DMARD. Prevalence estimates based on fewer than 10 cases were
not included.
2.4.2.3.3.3 Risk of incident comorbidities occurring after arthritis diagnosis
Each participant with arthritis was matched to four control participants by age and sex
in order to estimate the risk of incident comorbidities developing after the diagnosis of
arthritis. Then, Cox regression analysis was used to determine the hazard ratio of
developing comorbidities compared to controls. The time of diagnosis of a rheumatic
disease was used as the index date in all cohorts, including the matched controls. The
proportional hazards assumption was assessed using the Schoenfeld residuals test.
The latter is used to detect any violation of the Cox model’s assumption that the
effect of a change in a covariate on the hazard rate of an event occurrence is stable
over time. Incidence estimates based on fewer than 10 cases were not reported.
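The matching step can be sketched as follows. This is a simplified illustration, not the procedure used in the thesis: it assumes exact matching on age and sex without replacement, takes controls in the order they happen to be stored rather than at random, and uses hypothetical field names (`id`, `age`, `sex`, `diagnosis_year`). The Cox regression itself is not shown.

```python
from collections import defaultdict


def match_controls(cases, controls, ratio=4):
    """Match each case to up to `ratio` controls of the same age and sex, without replacement."""
    pool = defaultdict(list)
    for control in controls:
        pool[(control["age"], control["sex"])].append(control)

    matched = {}
    for case in cases:
        available = pool[(case["age"], case["sex"])]
        chosen = [available.pop() for _ in range(min(ratio, len(available)))]
        # Matched controls inherit the case's diagnosis year as their index date.
        matched[case["id"]] = [dict(c, index_year=case["diagnosis_year"]) for c in chosen]
    return matched
```

Cases for which fewer than four eligible controls remain simply receive fewer matches in this sketch; a real analysis would need an explicit rule for that situation.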
2.4.2.4 Association of prevalent comorbidities and physical activity in rheumatic diseases
2.4.2.4.1 Physical activity
Section 2.4.1.3 and Table 17 describe the categorisation used in the current study.
2.4.2.4.2 Creation of a modified functional comorbidity index
As physical activity is one of the measures of success of a medical intervention (along
with health status and quality of life), adjustment for comorbid conditions is essential in
epidemiological studies. For that reason, a self-administered comorbidity index with
physical function as the outcome, the Functional Comorbidity Index (FCI), was
developed (Groll, To et al. 2005). For the current study, a modified version of the FCI
was used. The index was created by summing comorbidities that correlate with the
physical function subscale of the SF-36, including asthma, angina, heart failure, heart
attack, osteoporosis, COPD, neurological disease, stroke, peripheral vascular disease,
diabetes (both types), upper gastrointestinal disease, depression, anxiety, visual and
hearing impairment and degenerative disc disease.
2.4.2.4.3 Statistical analysis
Multinomial logistic regression was used to estimate the association between
comorbidities and physical activity level, controlling for age, sex, BMI, smoking and
alcohol consumption. First, the association between physical activity and comorbidity
(defined as a modified functional comorbidity index larger than 0) was investigated in
participants with a rheumatic disease compared to those without. Four groups were
studied:
i) Participants with one of the rheumatic diseases and without any comorbid
disease
ii) Participants with a rheumatic disease and a comorbidity
iii) Participants without rheumatic disease but with a comorbidity
iv) Participants without a rheumatic disease and without a comorbidity (referent
group)
Secondly, the relationship between physical activity and individual comorbidities in the
diseased cohorts was assessed. Finally, the association between physical activity and
the functional comorbidity index in people with one of the four rheumatic diseases was
estimated. The categories of the index were 0, 1-2, 3-4 and ≥5, where higher values
indicate higher comorbidity burden. The proportion of participants with and without
one of the investigated rheumatic diseases who carried out the World Health
Organisation (WHO) recommended amount of physical activity was compared using a
chi-square test.
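The index and the grouping used in these analyses can be sketched as below. The function names and group labels are hypothetical, and the index is assumed to be a simple count of the comorbidities recorded as present; the label ">=5" stands for the ≥5 category.

```python
def comorbidity_index(flags):
    """Sum of comorbidities present; `flags` maps comorbidity name to True/False."""
    return sum(1 for present in flags.values() if present)


def index_category(score):
    """Map an index value to the categories 0, 1-2, 3-4 and >=5."""
    if score == 0:
        return "0"
    if score <= 2:
        return "1-2"
    if score <= 4:
        return "3-4"
    return ">=5"


def exposure_group(has_rheumatic_disease, score):
    """Assign one of the four study groups; a comorbidity means an index larger than 0."""
    has_comorbidity = score > 0
    if has_rheumatic_disease:
        return "rheumatic + comorbidity" if has_comorbidity else "rheumatic only"
    return "comorbidity only" if has_comorbidity else "referent"
```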
2.5 Results
2.5.1 Identifying lifestyle factors and comorbidities associated with PSO without arthritis and PsA compared to the general population in the UK Biobank
Of the 502,643 individuals that participated in UK Biobank, 939 (0.2%) of them
self-reported having been diagnosed by a physician with PsA and 4,991 (1.0%) with PSO
without any type of inflammatory arthritis. Their demographic and clinical
characteristics are shown in Table 20.
The median age of the three cohorts was similar; however, the proportion of males was
significantly higher in the PSO without arthritis cohort than in the controls (53.5% vs.
45.5%, respectively, p<0.01). Moreover, the proportion of
participants with white ethnic background was significantly higher for both PsA and
PSO without arthritis (97% and 96.5%, respectively, vs. 94%, p<0.01) and the median
BMI for both disease groups was higher (28.0 and 27.5, respectively) compared to
participants without PSO and/or PsA (26.3). Those who reported PsA were more
likely to be previous smokers (38.1%) but less likely to consume alcohol (frequent
drinker 46.5%, daily drinker 16.3%), whilst those in the PSO-only group were more
likely to be smokers (previous smoker 41.1%, current smoker 16.3%) and daily
drinkers (22.3%) compared to the controls (previous smoker 34.4%, current smoker
10.5%, frequent drinker 48.7%, daily drinker 20.2%). Finally, participants who reported
either PsA or PSO without arthritis were less likely to engage in moderate (41.3% and
42.8%, respectively) or high (20.4% and 23.9%, respectively) intensity physical activity
compared to those without either condition (moderate 43%, high intensity 25.5%).
The most prevalent comorbid conditions in both the PsA and PSO without arthritis groups
were chronic pain (67% and 45.4%, respectively) and fatigue (64.6% and 54.1%,
respectively) (Table 20). The proportion of patients that reported at least one CVD was
41.2% and 32.7%, respectively: 2.9% and 3.0% self-reported heart attack, 4.8%
and 4.2% angina, and 1.7% and 1.9% stroke, while 38.8% of those with PsA and 29.7% of
the PSO-only group were hypertensive. In addition, the
prevalence of high cholesterol, pulmonary disease, diabetes, chronic depression, liver
disease and gastrointestinal disease in PsA was 13.1%, 13.6%, 6.8%, 6.5%, 2.0% and
16.0%, respectively. In the PSO-only cohort the prevalence of the aforementioned
comorbidities was 13.7%, 14.5%, 6.9%, 7.2%, 1.2% and 17.9%, respectively.
2.5.1.1 Identifying lifestyle factors associated with the disease status
As an initial “screening test”, adjusted analysis using logistic regression was performed
to determine the environmental and lifestyle exposures that are significantly associated
with the prevalence of the two conditions (Table 21, Figure 8). When PsA was
compared to the PSO-only cohort, BMI (OR per unit increase 1.03, 95% CI 1.01-1.04),
smoking status (previous smoker: OR 0.78, 95% CI 0.67-0.91 and current smoker: OR
0.53, 95% CI 0.42-0.67) and alcohol intake frequency (frequent drinker: OR 0.80, 95%
CI 0.68-0.94 and daily drinker: OR 0.61, 95% CI 0.49-0.75) were found to be
statistically significant determinants.
The significant factors found in the adjusted analysis step were included in the final
multivariable logistic regressions. The analysis of lifestyle factors between PsA and the
PSO-only cohort found BMI to be associated with increased odds of PsA (OR per unit
increase 1.02, 95% CI 1.01-1.04), while alcohol consumption frequency was associated
with decreased odds of PsA (frequent drinker: OR 0.82, 95% CI 0.70-0.96 and daily
drinker: OR 0.67, 95% CI 0.54-0.83), as was smoking status (previous smoker: OR 0.80,
95% CI 0.68-0.93 and current smoker: OR 0.54, 95% CI 0.42-0.68) (Table 22 and
Figure 9).
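The selection rule that links the screening stage to the multivariable stage can be illustrated with the p-values reported in Table 21 for the PsA versus PSO without arthritis comparison. The sketch below is illustrative only: the names are hypothetical, and treating a factor with several levels as selected when any of its levels is significant is an assumption of this sketch, although it agrees with the factors shown in Table 22.

```python
ALPHA = 0.05

# Adjusted-model p-values, PsA vs. PSO without arthritis (Table 21).
# Factors with two levels list one p-value per level.
adjusted_p_values = {
    "Townsend deprivation index": [0.30],
    "BMI": [1.9e-04],
    "Smoking status": [0.002, 9.8e-08],
    "Alcohol consumption status": [0.006, 2.8e-06],
    "Fractures in last 5 years/muscle injury": [0.60],
}


def screen(p_values, alpha=ALPHA):
    """Keep factors for which at least one level has p < alpha (assumed rule)."""
    return [factor for factor, ps in p_values.items() if min(ps) < alpha]


selected = screen(adjusted_p_values)
```

Here `selected` contains BMI, smoking status and alcohol consumption status, the three exposures carried forward into the multivariable model for this comparison.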
In comparison with the control population, the following variables were independently
associated with PsA: BMI (OR per unit increase 1.05, 95% CI 1.04-1.06), previous
smokers compared to non-smokers (OR 1.17, 95% CI 1.01-1.35), frequent and daily
drinkers (OR 0.78, 95% CI 0.67-0.90 and OR 0.65, 95% CI 0.53-0.80, respectively)
compared to low-frequency drinkers. In addition the exposures that remained
significantly associated with the prevalence of PSO were: Townsend deprivation index
(OR per unit increase 1.01, 95% CI 1.01-1.02), BMI (OR per unit increase 1.03, 95% CI
1.02-1.03) and previous and current smokers (OR 1.46, 95% CI 1.37-1.55 and OR 1.89,
95% CI 1.74-2.06) compared to non-smokers (Table 22 and Figure 9).
Table 20 | Baseline characteristics of the study populations
Characteristics | PsA (N=939) | PSO without arthritis (N=4,991) | Controls (N=496,536)
Age, Median (IQR) | 57 (11) | 58 (13) | 58 (13)
Gender, Male N (%) | 453 (48.2) | 2,670 (53.5)* | 225,977 (45.5)
Ethnic background, Missing data N (%) | 4 (0.4) | 25 (0.5) | 2,748 (0.6)
Ethnic background, White N (%) | 911 (97.0)* | 4,817 (96.5)* | 466,929 (94.0)
BMI, Missing data N (%) | 5 (0.5) | 17 (0.38) | 3,077 (0.62)
BMI, Median (IQR) | 28 (6.5)* | 27.5 (6)* | 26.3 (5.8)
Townsend deprivation index, Missing data N (%) | 3 (0.32) | 1 (0.02) | 623 (0.13)
Townsend deprivation index, Median (IQR) | -2.1 (4.5) | -1.9 (4.5)* | -2.1 (4.2)
Smoking status, Missing data N (%) | 5 (0.45) | 21 (0.4) | 2,925 (0.6)
Smoking status, Never | 478 (50.9) | 2,103 (42.1) | 270,957 (54.6)
Smoking status, Previous | 358 (38.1)* | 2,053 (41.1)* | 170,600 (34.4)
Smoking status, Current | 98 (10.4) | 814 (16.3)* | 52,054 (10.5)
Frequency of alcohol intake, Missing data N (%) | 2 (0.2) | 13 (0.3) | 1,488 (0.3)
Frequency of alcohol intake, Low frequency | 347 (37.0) | 1,484 (29.7) | 152,648 (30.7)
Frequency of alcohol intake, Frequent | 437 (46.5)* | 2,381 (47.7) | 241,899 (48.7)
Frequency of alcohol intake, Daily | 153 (16.3)* | 1,113 (22.3)* | 100,501 (20.2)
Fractures or muscle trauma, Missing data N (%) | 3 (0.3) | 23 (0.5) | 3,778 (0.8)
Fractures or muscle trauma, Yes | 89 (9.5) | 493 (9.9) | 46,867 (9.4)
Intensity of physical activity (IPAQ), Missing data N (%) | 89 (9.5) | 441 (8.8) | 50,048 (10.1)
Intensity of physical activity (IPAQ), Low | 270 (28.6) | 1,222 (24.5) | 106,314 (21.4)
Intensity of physical activity (IPAQ), Moderate | 388 (41.3)* | 2,137 (42.8)* | 213,691 (43.0)
Intensity of physical activity (IPAQ), High | 192 (20.4)* | 1,191 (23.9)* | 126,483 (25.5)
Fatigue in last 2 weeks, Missing data N (%) | 30 (3.2) | 173 (3.5) | 17,050 (3.4)
Fatigue in last 2 weeks, Yes | 607 (64.6)* | 2,699 (54.1)* | 255,856 (51.5)
Chronic pain (more than 3 months), Missing data N (%) | 97 (10.3) | 860 (17.2) | 85,790 (17.3)
Chronic pain (more than 3 months), Yes | 629 (67.0)* | 2,266 (45.4)* | 215,705 (43.4)
Heart attack, Yes N (%) | 27 (2.9) | 149 (3.0)* | 11,587 (2.3)
Angina, Yes N (%) | 45 (4.8)* | 209 (4.2)* | 16,226 (3.3)
Stroke, Yes N (%) | 16 (1.7) | 94 (1.9) | 9,156 (1.8)
Hypertension, Yes N (%) | 364 (38.8)* | 1,484 (29.7)* | 136,319 (27.5)
High cholesterol, Yes N (%) | 123 (13.1) | 685 (13.7)* | 60,801 (12.2)
Pulmonary disease, Yes N (%) | 128 (13.6) | 725 (14.5) | 71,349 (14.3)
Diabetes, Missing data N (%) | 3 (0.3) | 19 (0.4) | 2,467 (0.5)
Diabetes, Yes N (%) | 64 (6.8)* | 345 (6.9)* | 26,275 (5.3)
Chronic depression, Missing data N (%) | 769 (81.9) | 3,931 (78.8) | 380,061 (76.5)
Chronic depression, Yes N (%) | 61 (6.5)* | 361 (7.2)* | 32,204 (6.5)
Liver disease, Yes N (%) | 19 (2.0)* | 58 (1.2)* | 3,782 (0.8)
Gastrointestinal disease, Yes N (%) | 150 (16.0) | 892 (17.9)* | 74,206 (14.9)
PsA: Psoriatic Arthritis; IQR: Interquartile Range; BMI: Body Mass Index; * statistically significant (p<0.05) with controls as a referent group
Table 21 | Adjusted analysis for identifying the exposures that were associated with disease status
Exposures | PsA vs. PSO without arthritis: OR (95% CI), p-value | PSO without arthritis vs. controls: OR (95% CI), p-value | PsA vs. controls: OR (95% CI), p-value
Townsend deprivation index | 0.99 (0.97-1.01), 0.30 | 1.03 (1.02-1.04), 4e-12* | 1.02 (1.00-1.04), 0.05
BMI | 1.03 (1.01-1.04), 1.9e-04* | 1.03 (1.02-1.04), <2e-16* | 1.05 (1.04-1.07), <2e-16*
Smoking status: Previous smoker vs. non-smoker | 0.78 (0.67-0.91), 0.002* | 1.50 (1.41-1.60), <2e-16* | 1.17 (1.02-1.35), 0.02*
Smoking status: Current smoker vs. non-smoker | 0.53 (0.42-0.67), 9.8e-08* | 1.93 (1.78-2.10), <2e-16* | 1.03 (0.82-1.28), 0.78
Alcohol consumption status: Frequent vs. low-frequency drinker | 0.80 (0.68-0.94), 0.006* | 0.92 (0.86-0.98), 0.01* | 0.73 (0.63-0.85), 2.1e-04*
Alcohol consumption status: Daily vs. low-frequency drinker | 0.61 (0.49-0.75), 2.8e-06* | 1.01 (0.93-1.10), 0.77 | 0.61 (0.50-0.75), 5.2e-07*
Fractures in last 5 years/muscle injury: Yes vs. No | 0.94 (0.73-1.18), 0.60 | 1.05 (0.95-1.15), 0.33 | 0.99 (0.79-1.22), 0.90
PsA: Psoriatic Arthritis; PSO: Psoriasis; vs.: versus; OR: Odds Ratio; CI: Confidence Interval; BMI: Body Mass Index
Models are adjusted for age, sex and ethnic background
* statistically significant (p-value < 0.05)
Figure 8 | Association of lifestyle factors with disease status (adjusted model) adjusting for age, sex and ethnicity a. Results from logistic regression. Disease status is the dependent variable. The referent group is comprised of participants without psoriatic arthritis (PsA) and psoriasis b. Results from logistic regression. The referent group is the participants with psoriasis without arthritis
Table 22 | Association between lifestyle/environmental factors and disease status (final, multivariable analysis)
Exposures | PsA vs. PSO without arthritis: OR (95% CI), p-value | PSO without arthritis vs. controls: OR (95% CI), p-value | PsA vs. controls: OR (95% CI), p-value
Townsend deprivation index | - | 1.01 (1.00-1.02), 0.002* | -
BMI | 1.02 (1.01-1.04), 0.002* | 1.03 (1.02-1.03), <2e-16* | 1.05 (1.04-1.06), 6.3e-16*
Smoking status: Previous smoker vs. non-smoker | 0.80 (0.68-0.93), 0.004* | 1.46 (1.37-1.55), <2e-16* | 1.17 (1.01-1.35), 0.03*
Smoking status: Current smoker vs. non-smoker | 0.54 (0.42-0.68), 2.3e-07* | 1.89 (1.74-2.05), <2e-16* | 1.04 (0.83-1.29), 0.75
Alcohol consumption status: Frequent vs. Low-frequency drinker | 0.82 (0.70-0.96), 0.01* | 0.96 (0.90-1.03), 0.22 | 0.78 (0.67-0.90), 7.9e-04*
Alcohol consumption status: Daily vs. Low-frequency drinker | 0.67 (0.54-0.83), 2.5e-04* | 1.01 (0.93-1.10), 0.76 | 0.65 (0.53-0.80), 3.1e-05*
PsA: Psoriatic Arthritis; vs.: versus; PSO: Psoriasis; OR: Odds Ratio; CI: Confidence Interval; BMI: Body Mass Index
Multivariable model including significant factors from the adjusted analysis and adjusted for age, gender and ethnic background
* statistically significant (p-value < 0.05)
Figure 9 | Association of lifestyle factors with disease status (multivariable model) adjusting for age, sex and ethnicity; a. Results from logistic regression. Disease status is the dependent variable. The referent group is comprised of participants without psoriatic arthritis (PsA) and psoriasis b. Results from logistic regression. The referent group is the participants with psoriasis without arthritis
2.5.1.2 Investigating the association of prevalent comorbidities with disease status
To assess the crude ratios of prevalent comorbidities in both diseases compared to
controls and to each other, univariate logistic regression analysis was used (Table 23).
In the univariate analysis the prevalence of fatigue, chronic pain, hypertension and liver
disease was significantly elevated in PsA compared to the PSO without arthritis
cohort, with the ORs ranging from 1.50 to 2.43. Compared to the controls, the PSO-
only cohort was associated with the prevalence of the majority of morbidities that
were investigated, with the exception of stroke and pulmonary disease. Comparing PsA
with controls, fatigue, chronic pain, hypertension, angina and liver disease were more
likely to be reported by participants with PsA.
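For a single binary predictor, the odds ratio from a univariate logistic regression equals the cross-product ratio of the corresponding 2x2 table, so a crude estimate can be checked directly from the counts in Table 20. The sketch below does this for hypertension in PsA versus controls; the function name is hypothetical, the full cohort sizes are assumed to be the denominators (no missing data are listed for hypertension), and the Wald interval it computes need not coincide exactly with the interval reported from the regression.

```python
import math


def crude_odds_ratio(group_with, group_n, ref_with, ref_n, z=1.96):
    """Cross-product odds ratio with a Wald confidence interval from a 2x2 table."""
    a, b = group_with, group_n - group_with   # group: with / without the comorbidity
    c, d = ref_with, ref_n - ref_with         # referent: with / without the comorbidity
    odds_ratio = (a / b) / (c / d)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypertension: 364 of 939 participants with PsA vs. 136,319 of 496,536 controls (Table 20).
or_psa, ci_low, ci_high = crude_odds_ratio(364, 939, 136319, 496536)
```

The resulting odds ratio rounds to 1.67, which agrees with the crude estimate for hypertension in PsA versus controls in Table 23.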
During the multivariable analysis, in which BMI, smoking status, frequency of alcohol
consumption and Townsend deprivation index were controlled for, participants with PsA
were more likely to report fatigue (OR 1.55, 95% CI 1.33-1.81), chronic pain (OR 2.39,
95% CI 2.02-2.85), at least one CVD (OR 2.39, 95% CI 2.02-2.85), hypertension (OR
1.57, 95% CI 1.34-1.84) and liver disease (OR 1.84, 95% CI 1.06-3.09) compared to
participants with PSO (Table 24 and Figure 10). PSO-only patients were more likely to
report diabetes, chronic depression and gastrointestinal disease compared to controls;
whilst these differences were not statistically significant in PsA compared with controls,
no detectable differences between PSO and PsA were observed, suggesting that
reduced power may explain the lack of association with PsA versus controls. The
results remained significant after including duration of inflammation.
Table 23 | Univariate regression analysis investigating the association of prevalent comorbidities with disease status
| Comorbidity (Yes vs. No) | PsA vs. PSO without arthritis: OR | 95% CI | p-value | PSO without arthritis vs. controls: OR | 95% CI | p-value | PsA vs. controls: OR | 95% CI | p-value |
|---|---|---|---|---|---|---|---|---|---|
| Fatigue in last 2 weeks | 1.58 | 1.36-1.83 | 2.1e-09* | 1.11 | 1.05-1.18 | 2.3e-04* | 1.76 | 1.53-2.02 | 1.3e-15* |
| Chronic pain (more than 3 months) | 2.43 | 2.06-2.88 | <2e-16* | 1.10 | 1.03-1.17 | 0.003* | 2.67 | 2.29-3.13 | <2e-16* |
| Hypertension | 1.50 | 1.29-1.73 | 4.9e-08* | 1.12 | 1.05-1.19 | 4.4e-04* | 1.67 | 1.46-1.90 | 2.2e-14* |
| Angina | 1.15 | 0.82-1.59 | 0.40 | 1.29 | 1.12-1.48 | 3.2e-04* | 1.49 | 1.09-1.98 | 0.01* |
| Heart attack/MI | 0.96 | 0.62-1.43 | 0.85 | 1.29 | 1.09-1.51 | 0.003* | 1.24 | 0.82-1.77 | 0.29 |
| Stroke/TIA | 0.90 | 0.51-1.50 | 0.71 | 1.02 | 0.83-1.24 | 0.85 | 0.92 | 0.54-1.46 | 0.74 |
| Liver disease | 1.76 | 1.02-2.91 | 0.03* | 1.53 | 1.17-1.97 | 0.001* | 2.69 | 1.65-4.12 | 2.0e-05* |
| High cholesterol | 0.95 | 0.77-1.16 | 0.61 | 1.14 | 1.05-1.24 | 0.002* | 1.08 | 0.89-1.30 | 0.43 |
| Pulmonary disease | 0.93 | 0.76-1.13 | 0.48 | 1.01 | 0.93-1.09 | 0.78 | 0.94 | 0.78-1.13 | 0.51 |
| Diabetes (either type 1 or type 2) | 0.98 | 0.74-1.29 | 0.91 | 1.33 | 1.19-1.48 | 4.6e-07* | 1.31 | 1.01-1.67 | 0.04* |
| Chronic depression | 1.08 | 0.77-1.51 | 0.64 | 1.35 | 1.19-1.53 | 3.8e-06* | 1.46 | 1.06-2.00 | 0.02* |
| Gastrointestinal disease | 0.87 | 0.72-1.05 | 0.16 | 1.24 | 1.15-1.33 | 8.6e-09* | 1.08 | 0.91-1.28 | 0.38 |
PsA: Psoriatic Arthritis; PSO: Psoriasis; OR: Odds Ratio; CI: Confidence Interval; vs.: versus; MI: Myocardial Infarction; TIA: Transient Ischaemic Attack *statistically significant (p-value<0.05)
Table 24 | Multivariable regression analysis investigating the association of prevalent comorbidities with disease status
| Comorbidity (Yes vs. No) | PsA vs. PSO without arthritis: OR | 95% CI | p-value | PSO without arthritis vs. controls: OR | 95% CI | p-value | PsA vs. controls: OR | 95% CI | p-value |
|---|---|---|---|---|---|---|---|---|---|
| Fatigue in last 2 weeks | 1.55 | 1.33-1.81 | 2.7e-08* | 1.08 | 1.01-1.14 | 0.02 | 1.66 | 1.44-1.92 | 3.3e-12* |
| Chronic pain (more than 3 months) | 2.39 | 2.02-2.85 | <2e-16* | 1.03 | 0.96-1.10 | 0.39 | 2.48 | 2.12-2.91 | <2e-16* |
| Hypertension | 1.57 | 1.34-1.84 | 2.5e-08* | 1.01 | 0.95-1.08 | 0.70 | 1.57 | 1.36-1.80 | 7e-10* |
| Angina | 1.28 | 0.90-1.80 | 0.16 | 1.11 | 0.96-1.28 | 0.16 | 1.43 | 1.04-1.93 | 0.02* |
| Heart attack/MI | 1.15 | 0.73-1.73 | 0.53 | 1.06 | 0.89-1.24 | 0.53 | 1.22 | 0.81-1.77 | 0.31 |
| Stroke/TIA | 0.95 | 0.53-1.58 | 0.84 | 0.90 | 0.73-1.11 | 0.34 | 0.90 | 0.53-1.43 | 0.69 |
| Liver disease | 1.84 | 1.06-3.09 | 0.02* | 1.40 | 1.06-1.81 | 0.01 | 2.64 | 1.62-4.05 | 3e-05* |
| High cholesterol | 0.99 | 0.79-1.23 | 0.94 | 1.03 | 0.95-1.12 | 0.45 | 1.04 | 0.85-1.26 | 0.69 |
| Pulmonary disease | 0.92 | 0.75-1.13 | 0.45 | 0.96 | 0.89-1.04 | 0.36 | 0.89 | 0.74-1.07 | 0.23 |
| Diabetes (either type 1 or type 2) | 0.93 | 0.68-1.24 | 0.62 | 1.17 | 1.04-1.31 | 0.009* | 1.11 | 0.84-1.45 | 0.43 |
| Chronic depression | 1.11 | 0.77-1.58 | 0.58 | 1.31 | 1.15-1.50 | 5.3e-05* | 1.39 | 1.00-1.92 | 0.05 |
| Gastrointestinal disease | 0.87 | 0.71-1.06 | 0.18 | 1.15 | 1.07-1.25 | 1.8e-04* | 1.03 | 0.86-1.23 | 0.76 |
PsA: Psoriatic Arthritis; PSO: Psoriasis; OR: Odds Ratio; CI: Confidence Interval; vs.: versus; MI: Myocardial Infarction; TIA: Transient Ischaemic Attack Multivariable adjusted model included age, gender, ethnic background, Townsend deprivation index, Body mass index, smoking status, alcohol frequency consumption *statistically significant (p-value<0.05)
Figure 10 | Association of prevalent comorbidities with disease status (multivariable model) adjusting for age, sex, ethnicity, smoking and alcohol consumption, BMI and Townsend deprivation index; a. Results from logistic regression. Comorbidities is the dependent variable. The referent group is comprised of participants without psoriatic arthritis (PsA) and psoriasis b. Results from logistic regression. The referent group is the participants with psoriasis without arthritis.
2.5.2 Comorbidities in rheumatic diseases and their effect on physical activity
Of the 502,643 individuals recruited by the UK Biobank, 488,991 were eligible to be
included in the study (Figure 11). Of the latter, 5,315 (1.1%) had self-reported RA, 865
(0.2%) had PsA, 1,254 (0.3%) had reported being diagnosed with AS and 559 (0.1%)
had SLE. The rest of UK Biobank’s population (98.4%) formed the controls cohort.
Figure 11 | Number of participants included in the study
The median age of the participants that reported having one of the rheumatic diseases
was 61.0 years for RA, 57.0 for PsA, 59.0 for AS and 56.0 for SLE; these were all
significantly different from the controls’ median age of 58.0 years. The proportion of
participants using synthetic DMARDs varied between the different types of inflammatory
arthritis, ranging from 48.4% in RA to 7.7% in AS. In the controls cohort, 0.4% of
participants also reported taking synthetic DMARDs and 1.1% reported taking corticosteroids.
Table 25 | Baseline characteristics of the cohorts
| Characteristic | RA (N=5,315) | PsA (N=865) | AS (N=1,254) | SLE (N=559) | Controls (N=480,998) |
|---|---|---|---|---|---|
| Age, median (IQR) | 61.0 (55.0-65.0)* | 57.0 (51.0-62.0)* | 59.0 (52.0-63.0)* | 56.0 (49.0-62.0)* | 58.0 (50.0-63.0) |
| Age at onset of rheumatic disease, median (IQR) | 48.3 (37.9-56.0) | 44.7 (35.0-52.2) | 36.1 (25.5-46.9) | 42.0 (23.8-51.3) | - |
| Gender: female, N (%) | 3,713 (69.9)* | 445 (51.4) | 459 (36.6)* | 499 (89.3)* | 259,915 (54.0) |
| BMI, median (IQR) | 27.4 (24.4-31.0)* | 28.0 (25.1-31.6)* | 26.9 (24.4-30.0) | 26.2 (23.5-30.2) | 26.7 (24.1-20.8) |
| Smoking status: current | 659 (12.5)* | 88 (10.2) | 179 (14.3)* | 69 (12.4) | 50,083 (10.5) |
| Smoking status: past | 2,137 (40.5)* | 325 (37.8) | 512 (41.0)* | 194 (34.8) | 165,240 (34.5) |
| Smoking status: never | 2,479 (47.0)* | 447 (52.0) | 557 (44.7)* | 294 (52.8) | 263,115 (55.0) |
| Alcohol: daily or almost daily | 766 (14.4)* | 139 (16.1)* | 296 (23.6)* | 82 (14.7)* | 98,407 (20.5) |
| Alcohol: three or four times a week | 889 (16.8)* | 197 (22.8)* | 265 (21.2)* | 75 (13.4)* | 111,691 (23.3) |
| Alcohol: once or twice a week | 1,308 (24.7)* | 213 (24.7)* | 311 (24.8)* | 114 (20.4)* | 124,130 (25.9) |
| Alcohol: one to three times a month | 661 (12.5)* | 99 (11.5)* | 123 (9.8)* | 84 (15.1)* | 53,330 (11.1) |
| Alcohol: special occasions only | 881 (16.6)* | 122 (14.1)* | 153 (12.2)* | 111 (19.7)* | 54,443 (11.4) |
| Alcohol: never | 800 (15.1)* | 93 (10.8)* | 105 (8.4)* | 93 (16.7)* | 37,722 (7.9) |
| Medication: using synthetic DMARD | 2,574 (48.4)* | 418 (48.3)* | 97 (7.7)* | 226 (40.4)* | 2,050 (0.4) |
| Medication: using biologic DMARD | 327 (6.2)* | 53 (6.1)* | 40 (3.2)* | 0 (0.0) | 66 (0.01) |
| Medication: using corticosteroids | 522 (9.8)* | 44 (5.1)* | 48 (3.8)* | 114 (20.4)* | 5,064 (1.1) |
| Physical activity (IPAQ group): low | 1,010 (23.0)* | 164 (22.0)* | 239 (21.4)* | 87 (18.6)* | 67,394 (15.4) |
| Physical activity (IPAQ group): moderate | 1,811 (41.2)* | 320 (43.0)* | 447 (40.1)* | 208 (44.5)* | 182,781 (42.2) |
| Physical activity (IPAQ group): high | 1,575 (35.8)* | 261 (35.0)* | 429 (38.5)* | 172 (36.8)* | 183,505 (42.3) |
| Functional comorbidity index: 0 | 1,962 (37.0)* | 372 (43.1)* | 537 (42.9)* | 215 (38.5)* | 235,831 (49.1) |
| Functional comorbidity index: 1-2 | 2,688 (50.7)* | 410 (47.5)* | 593 (47.3)* | 269 (48.1)* | 217,179 (45.2) |
| Functional comorbidity index: 3-4 | 575 (10.8)* | 76 (8.8)* | 99 (7.9)* | 68 (12.2)* | 24,786 (5.2) |
| Functional comorbidity index: ≥5 | 80 (1.5)* | 6 (0.7)* | 24 (1.9)* | 7 (1.3)* | 2,292 (0.5) |

Values are N (%) unless stated otherwise. IQR: Interquartile Range; BMI: Body Mass Index; IPAQ: International Physical Activity Questionnaire; DMARD: Disease-Modifying Anti-Rheumatic Drug *Statistically significant difference from the controls group (P<0.05), using Mann-Whitney U-test for continuous variables and chi-square test for categorical variables
2.5.2.1 Comparison of the prevalence and incidence of comorbidities in rheumatic diseases
2.5.2.1.1 Prevalence of morbid conditions
The prevalence of each comorbidity, expressed as a standardised morbidity ratio (SMR),
was found to be increased in the majority of rheumatic diseases (Table 26 and Figure 12a).
In RA and SLE almost all studied comorbidities were increased compared to the controls and,
in the case of SLE, the increase was considerable. More specifically, angina (SMR: 3.1, 95% CI 2.2-4.2),
heart attack (SMR: 3.3, 95% CI 2.1-4.9) and stroke (SMR: 4.9, 95% CI 3.6-6.6) were
more prevalent in SLE compared to controls. The following comorbidities were
more prevalent in all four disease cohorts compared to controls: angina (RA: SMR 1.9, 95%
CI 1.7-2.1, PsA: SMR 1.5, 95% CI 1.1-2.0, AS: SMR 1.4, 95% CI 1.1-1.7 and SLE: SMR
3.1, 95% CI 2.2-4.2), hypertension (RA: SMR 1.2, 95% CI 1.2-1.3, PsA: SMR 1.4, 95% CI
1.3-1.6, AS: SMR 1.2, 95% CI 1.1-1.3 and SLE: SMR 1.4, 95% CI 1.3-1.6) and depression
(RA: SMR 1.2, 95% CI 1.1-1.3, PsA: SMR 1.3, 95% CI 1.0-1.7, AS: SMR 1.5, 95% CI 1.2-1.8
and SLE: SMR 1.4, 95% CI 1.0-1.8). Only participants with RA had an increased
prevalence of diabetes compared to the controls (SMR: 1.5, 95% CI 1.4-1.6).
The sensitivity analysis including only participants who self-reported a rheumatic
disease and who were also taking synthetic and/or biologic DMARD compared to the
controls showed similar results (Table 27).
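The indirect standardisation behind an SMR can be sketched as follows. Within each age and sex stratum, the prevalence observed in the reference population is applied to the number of participants in the disease cohort to give an expected count, and the SMR is the ratio of observed to expected cases. The strata, counts and reference prevalences below are hypothetical, and the confidence interval uses a simple log-normal approximation that is assumed here for illustration; it is not necessarily the method used to produce Table 26.

```python
import math


def indirect_smr(strata, z=1.96):
    """Indirectly standardised morbidity ratio with an approximate CI.

    strata: iterable of (n_in_cohort, observed_cases, reference_prevalence),
    one tuple per age-by-sex stratum.
    """
    observed = sum(obs for _, obs, _ in strata)
    # Expected cases if the cohort had the reference prevalence in every stratum.
    expected = sum(n * prevalence for n, _, prevalence in strata)
    smr = observed / expected
    # Log-normal approximation: SE of log(SMR) is about 1 / sqrt(observed).
    se_log = 1 / math.sqrt(observed)
    return smr, smr * math.exp(-z * se_log), smr * math.exp(z * se_log)


# Hypothetical strata (e.g. women <60, women >=60, men <60, men >=60).
example_strata = [
    (200, 6, 0.020),
    (300, 15, 0.030),
    (150, 4, 0.015),
    (250, 10, 0.025),
]
smr, smr_low, smr_high = indirect_smr(example_strata)
print(f"SMR {smr:.1f} (95% CI {smr_low:.1f}-{smr_high:.1f})")
```

An SMR above 1 with an interval excluding 1 indicates that the comorbidity is more prevalent in the disease cohort than expected from the reference population of the same age and sex structure.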
2.5.2.1.2 Incidence of morbid conditions
Cox regression analysis of incident cases of comorbidities occurring after the diagnosis
of a rheumatic disease gave results similar to those of the SMR analysis.
Participants with RA were at increased risk of developing all of the comorbidities
considered compared to controls over the same period of time (Figure 12b).
Participants with PsA had a statistically significant increased risk of developing
hypertension only (HR 1.5, 95% CI 1.3-1.8) compared to controls. Participants with AS
were at increased risk of having a stroke (HR 1.6, 95% CI 1.1-2.5), developing
pulmonary disease (HR 2.0, 95% CI 1.3-3.1) and depression (HR 1.5, 95% CI 1.1-2.0).
The risk of developing myocardial, vascular and pulmonary comorbidities was
increased in participants with SLE, with particularly increased risk of incident angina
and stroke.
Table 26 | Prevalence of comorbidities in participants with a rheumatic disease
| Comorbidity | RA: n (%) | RA: SMR¥ | PsA: n (%) | PsA: SMR¥ | AS: n (%) | AS: SMR¥ | SLE: n (%) | SLE: SMR¥ |
|---|---|---|---|---|---|---|---|---|
| Angina | 350 (0.06) | 1.9 (1.7-2.1)* | 42 (0.05) | 1.5 (1.1-2.0)* | 71 (0.05) | 1.4 (1.1-1.7)* | 41 (0.06) | 3.1 (2.2-4.2)* |
| Heart attack/MI | 224 (0.04) | 1.9 (1.6-2.1)* | 26 (0.03) | 1.3 (0.8-1.9) | 53 (0.04) | 1.3 (1.0-1.7)* | 23 (0.04) | 3.3 (2.1-4.9)* |
| Stroke/Ischaemic stroke | 180 (0.03) | 1.6 (1.4-1.9)* | 16 (0.02) | 1.0 (0.6-1.6) | 40 (0.03) | 1.5 (1.0-2.0)* | 45 (0.07) | 4.9 (3.6-6.6)* |
| Hypertension | 2043 (36.5) | 1.2 (1.2-1.3)* | 345 (38.7) | 1.4 (1.3-1.6)* | 462 (34.8) | 1.2 (1.1-1.3)* | 218 (39.2) | 1.4 (1.3-1.6)* |
| Pulmonary disease | 284 (5.1) | 2.1 (1.9-2.4)* | 25 (2.8) | 1.3 (0.8-1.9) | 63 (4.8) | 2.0 (1.6-2.6)* | 30 (4.7) | 2.4 (1.6-3.4)* |
| Diabetes | 443 (7.9) | 1.5 (1.4-1.6)* | 57 (6.4) | 1.2 (0.9-1.6) | 84 (6.3) | 1.0 (0.8-1.3) | 36 (5.6) | 1.4 (1.0-2.0) |
| Depression | 367 (6.5) | 1.2 (1.1-1.3)* | 62 (7.0) | 1.3 (1.0-1.7)* | 94 (7.1) | 1.5 (1.2-1.8)* | 56 (8.7) | 1.4 (1.0-1.8)* |
SMR: Standardised Morbidity Ratio; MI: Myocardial Infarction; COPD: Chronic Obstructive Pulmonary Disease Pulmonary disease includes COPD, emphysema and bronchitis ¥Age- and sex-standardised morbidity ratio. The reference population comprised participants without any of the four rheumatic diseases being studied *p<0.05
Table 27 | Prevalence of comorbidities in participants with a rheumatic disease (self-reported rheumatic disease and use of a DMARD)
| Comorbidity | RA: n (%) | RA: SMR¥ | PsA: n (%) | PsA: SMR¥ | AS: n (%) | AS: SMR¥ | SLE: n (%) | SLE: SMR¥ |
|---|---|---|---|---|---|---|---|---|
| Angina | 134 (0.05) | 1.5 (1.3-1.8)* | 21 (0.04) | 1.6 (1.0-2.4) | 8 (0.07) | 1.9 (0.8-3.8) | | 5.5 (1.3-4.0)* |
| Heart attack/MI | 103 (0.04) | 1.8 (1.5-2.2)* | 12 (0.02) | 1.2 (0.6-2.1) | -+ | -+ | | 2.0 (0.7-4.4) |
| Stroke/Ischaemic stroke | 73 (0.03) | 1.4 (1.1-1.8)* | -+ | -+ | -+ | -+ | 17 (0.06) | 4.4 (2.6-7.1)* |
| Hypertension | 967 (0.4) | 1.2 (1.2-1.3)* | 176 (0.4) | 1.5 (1.3-1.7)* | 60 (0.5) | 1.8 (1.4-2.3)* | 93 (0.3) | 1.4 (1.2-1.8)* |
| Pulmonary disease | 133 (0.05) | 2.1 (1.7-2.4)* | -+ | -+ | -+ | -+ | 16 (0.06) | 2.9 (1.7-4.8)* |
| Diabetes | 188 (0.07) | 1.3 (1.2-1.5)* | 27 (0.06) | 1.2 (0.8-1.7) | 13 (0.1) | 1.9 (1.0-3.2)* | 14 (0.05) | 1.3 (0.7-2.1) |
| Depression | 145 (0.05) | 1.0 (0.8-1.1) | 37 (0.08) | 1.6 (1.1-2.2)* | 15 (0.1) | 2.5 (1.4-4.2)* | 25 (0.09) | 1.4 (0.9-2.0) |
SMR: Standardised Morbidity Ratio; MI: Myocardial Infarction; COPD: Chronic Obstructive Pulmonary Disease Pulmonary disease includes COPD, emphysema and bronchitis ¥ Age- and sex-standardised morbidity ratio. The reference population comprised participants without any of the four rheumatic/musculoskeletal diseases being studied. + Results are not presented where the number of cases is <10 * p<0.05
Figure 12 | Prevalence and incidence rates of comorbidities a. Indirect age- and sex- standardised morbidity ratios for the four rheumatic diseases. The referent group includes participants with none of the rheumatic diseases being analysed b. Hazard ratios from a Cox proportional hazard model. Each participant with a rheumatic disease was age- and sex- matched with four participants from the controls group.
2.5.2.2 Association of prevalent comorbidities and physical activity
A significantly lower proportion of people with one of the four diseases reported
performing a high level of physical activity, compared to the control population (Table
25). The proportion of participants meeting the WHO recommended amount of
physical activity was 64% for people with rheumatic diseases and 74% for people
without a rheumatic disease (p<0.001, chi-square test). The presence of (co)morbidity
was associated with reduced odds of reporting a moderate or high level of physical
activity in participants with a rheumatic disease and in controls, with low physical
activity as the referent group (Figure 13). Participants with a rheumatic disease and no
comorbidity were less likely to report a high (OR 0.61, 95% CI 0.55-0.69) or moderate
(OR 0.72, 95% CI 0.64-0.80) level of physical activity than participants with no
rheumatic disease and a morbidity (high: OR 0.80, 95% CI 0.79-0.82, moderate: OR
0.87, 95% CI 0.85-0.89), with the referent group comprising participants with no
rheumatic disease and no morbidity (Figure 13).
In people with one of the four rheumatic diseases, most of the comorbidities
considered were individually associated with physical activity level (Table 28). In
particular, cardiovascular comorbidities and depression were associated with reduced
odds of reporting a moderate or high level of physical activity. There was evidence of
a dose-response relationship between increasing level of comorbid burden, measured
using a modified functional comorbidity index, and reduced odds of reporting a
moderate or high level of physical activity (Table 28).
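To make the relative risk ratio (RRR) scale concrete, the sketch below computes crude RRRs from the physical activity counts reported for RA and controls in Table 25, with low physical activity as the referent category. These are unadjusted figures comparing RA with controls, computed here purely for illustration; they are not the adjusted comorbidity estimates of Table 28, which come from a multinomial logistic model. The function name is hypothetical.

```python
def crude_rrr(group_counts, reference_counts, base="low"):
    """Crude relative risk ratios for each outcome level versus a base level.

    For a single binary predictor, these equal the exponentiated coefficients
    of an unadjusted multinomial logistic model.
    """
    ratios = {}
    for level in group_counts:
        if level == base:
            continue
        # Relative risk of this level versus the base level, within each group.
        risk_group = group_counts[level] / group_counts[base]
        risk_reference = reference_counts[level] / reference_counts[base]
        ratios[level] = risk_group / risk_reference
    return ratios


# Counts by IPAQ group taken from Table 25.
ra_counts = {"low": 1010, "moderate": 1811, "high": 1575}
control_counts = {"low": 67394, "moderate": 182781, "high": 183505}

rrr = crude_rrr(ra_counts, control_counts)
for level, value in rrr.items():
    print(f"{level}: crude RRR {value:.2f}")
```

An RRR below 1 for both the moderate and high categories means that the group of interest is relatively more concentrated in the low-activity category than the reference group.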
Figure 13 | Association between presence/absence of rheumatic disease, (co)morbidity and physical activity Results from the multinomial logistic regression. Physical activity group (referent=low) is the dependent variable. Study group: no rheumatic disease and no morbidity, no rheumatic disease and morbidity, rheumatic disease and no comorbidity, and rheumatic disease and comorbidity is the independent variable. Adjusted for age, sex, smoking, alcohol consumption and BMI.
Table 28 | Association between comorbidities and physical activity in participants with a rheumatic disease
| Comorbidity | Moderate physical activity: RRR (95% CI)a | High physical activity: RRR (95% CI)a |
|---|---|---|
| Angina | 0.60 (0.46-0.78)* | 0.54 (0.4-0.71)* |
| MI / heart attack | 0.68 (0.50-0.92)* | 0.54 (0.39-0.75)* |
| Stroke / Ischaemic stroke | 0.55 (0.39-0.78)* | 0.65 (0.46-0.92)* |
| Hypertension | 0.75 (0.66-0.86)* | 0.71 (0.62-0.82)* |
| Pulmonary disease | 0.86 (0.64-1.16) | 0.72 (0.53-0.99)* |
| Depression | 0.67 (0.52-0.85)* | 0.77 (0.60-0.98)* |
| Functional comorbidity index: 0 | referent | referent |
| Functional comorbidity index: 1-2 | 0.72 (0.63-0.83)* | 0.68 (0.59-0.78)* |
| Functional comorbidity index: 3-4 | 0.48 (0.38-0.60)* | 0.48 (0.38-0.60)* |
| Functional comorbidity index: ≥5 | 0.32 (0.20-0.54)* | 0.22 (0.13-0.40)* |
RRR: Relative Risk Ratio; CI: Confidence Interval; MI: myocardial infarction; COPD: Chronic Obstructive Pulmonary Disease a Relative risk ratio from a multinomial logistic model with physical activity level as the dependent group. Low physical activity was the referent group. Adjusted for age and sex. *p<0.05
2.6 Discussion
As the field of biobanking has evolved over recent decades,
population-based repositories have been created worldwide for collecting, storing and
analysing phenotypic and genetic information on large samples of their source
population (De Souza and Greenspan 2013). Large prospective studies such as the
UK Biobank, which collect an extensive range of data (including pre-occurrent exposures
that could affect the onset of a disease and pre-existing or subsequent comorbidities
that could burden quality of life), are needed to shed light on the causes of diseases
such as PSO and rheumatic diseases and on the greater functional impairment that
patients with comorbidities experience.
A major advantage of the UK Biobank is that the participants recruited were
registered with a GP in the National Health Service. As the latter keeps detailed
medical records, linkage with each participant’s UK Biobank profile allows the
cross-validation of the self-reported information provided during the assessment visit as well
as the participant’s follow-up health outcomes. These extensive phenotypic, genetic and
clinical data, along with the large sample size, enable the in-depth investigation of
exposures and outcomes of diseases in order to improve the prevention, diagnosis and
treatment of diseases (Allen 2013; Sudlow, Gallacher et al. 2015).
However, UK Biobank is not representative of the general population as a result of the
“volunteer” effect or bias, in which the participants who volunteer to take part in
studies tend to be women, of higher socioeconomic status, married and healthier
(Galea and Tracy 2007). This was exemplified by the findings of Fry et al., where
women, older individuals and those living in less socioeconomically deprived areas
were more likely to participate in UK Biobank. Moreover, UK Biobank participants
were less likely to be daily drinkers, obese or smokers and less likely to report health
outcomes, and had lower all-cause mortality rates compared to the general population
of the same age group (Fry, Littlejohns et al. 2017). This makes UK Biobank
unsuitable for estimating generalizable prevalence and incidence rates. However,
because of its large size and the wide range of exposure levels among its participants, it
can provide reliable estimates of the associations between exposures and health outcomes,
for which non-representativeness is not a caveat (Collins 2012; Ebrahim and Davey Smith
2013; Richiardi, Pizzi et al. 2013). For example, whilst the proportion of participants with
a higher BMI is lower compared to the general population, there are enough obese
participants to estimate the association of high BMI with various diseases.
Finally, the data collected in UK Biobank are self-reported and were gathered via the
touchscreen questionnaire and the face-to-face interview with a nurse. Collecting
self-reported data is faster and cheaper than extracting the information from medical
records, but such data might not accurately capture the exposures and phenotypes they
represent and there is a chance of bias and misclassification (Fadnes, Taube et al. 2008).
However, research has shown that patients can accurately report medical conditions
(Barber, Muller et al. 2010), with the accuracy varying depending on the condition
(Okura, Urban et al. 2004) and the age and gender of the interviewees (Pakhomov,
Jacobsen et al. 2008). Previous validation of self-reported PsA cases from the THIN
database showed limited misclassification (Ogdie, Langan et al. 2013). Still, there is a
possibility of undiagnosed cases of PsA among participants with PSO; however, the
frequency cannot be estimated in a population-based cohort. This misclassification will
lead to underestimation of the observed differences (bias towards the null); thus, if
differences are seen, they are more likely to be real.
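The direction of this bias can be illustrated with a small worked calculation. The sketch below assumes a hypothetical exposure that is present in 50% of true PsA cases and 30% of true PSO-only cases, and supposes that 10% of the group labelled as PSO without arthritis actually has undiagnosed PsA. All of these figures are invented for illustration and are not estimates from UK Biobank.

```python
def odds_ratio(p_exposed_group, p_exposed_reference):
    """Odds ratio comparing exposure prevalence in two groups."""
    odds_group = p_exposed_group / (1 - p_exposed_group)
    odds_reference = p_exposed_reference / (1 - p_exposed_reference)
    return odds_group / odds_reference


def observed_odds_ratio(p_exposed_psa, p_exposed_pso, undiagnosed_fraction):
    """OR when a fraction of the labelled PSO group is undiagnosed PsA."""
    # The labelled PSO group is a mixture of true PSO and undiagnosed PsA.
    p_exposed_labelled_pso = (
        (1 - undiagnosed_fraction) * p_exposed_pso
        + undiagnosed_fraction * p_exposed_psa
    )
    return odds_ratio(p_exposed_psa, p_exposed_labelled_pso)


true_or = odds_ratio(0.5, 0.3)
attenuated_or = observed_odds_ratio(0.5, 0.3, undiagnosed_fraction=0.10)
print(f"true OR {true_or:.2f}, observed OR {attenuated_or:.2f}")
```

Under these assumptions the observed odds ratio lies between 1 and the true odds ratio, which is the sense in which contamination of the PSO group with undiagnosed PsA biases the comparison towards the null.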
As UK Biobank is a large longitudinal cohort established to investigate susceptibility to
a variety of diseases, the primary objective was to exploit this rich resource of data
and provide a baseline, mainly descriptive analysis of the characteristics of participants
that reported PsA or PSO. More specifically, the aim was to identify associations of
lifestyle determinants with PsA and to investigate the prevalence of comorbidities in
participants with PSO, PsA and other inflammatory rheumatic diseases. The
relationship between comorbidities and physical (in)activity was also investigated.
2.6.1 Review of objectives
2.6.1.1 First study
2.6.1.1.1 First aim
There is an increased interest in identifying environmental and lifestyle risk factors for
the onset of PsA among patients with PSO as they could assist in understanding the
causal pathways of PsA and in potentially preventing the development of the disease.
The identification of these factors is challenging in PsA due to: i) the small sample sizes
studied, which lead to insufficient power to detect significant associations; ii) the
undiagnosed cases of PsA which are wrongly categorised as PSO without arthritis; iii)
the lack of universal consensus in the diagnostic criteria used; iv) the confusion between
osteoarthritis and PsA in early stages, as they can both start in the entheses
(McGonagle, Hermann et al. 2015); v) the uncertainty of the PsA onset, as patients delay
seeking medical advice; and vi) the difference in study designs, with the majority being
case-control studies, which are prone to selection and recall bias (Kopec and Esdaile 1990).
A number of studies (Thumboo, Uramoto et al. 2002; Pattison, Harrison et al. 2008;
Soltani-Arabshahi, Wong et al. 2010; Tey, Ee et al. 2010; Eder, Law et al. 2011; Li, Han
et al. 2012; Li, Han et al. 2012; Love, Zhu et al. 2012; Eder, Haddad et al. 2015; Wu,
Cho et al. 2015) have investigated the environmental and lifestyle risk factors that are
associated with the development of PsA; however for the abovementioned reasons the
findings are often conflicting.
In the current study, participants with PsA had a higher BMI compared with both the
PSO without arthritis cohort and the controls. As the study design is cross-sectional, it
is impossible to determine whether elevated BMI contributes to the development of PsA or
whether the increased BMI is a consequence of reduced physical activity among patients with
arthritis. BMI has been reported to be associated with a higher risk of PSO in a
prospective study in women (Kumar, Han et al. 2013) and it is the only risk factor
whose association with the onset of PsA has been replicated across three studies
(Soltani-Arabshahi, Wong et al. 2010; Li, Han et al. 2012; Love, Zhu et al. 2012). Li et
al. and Love et al. using prospective cohort studies reported a dose-response relation
between increasing BMI and increasing risk of incident PsA after adjusting for potential
confounders. Finally, Soltani-Arabshahi et al. reported that BMI at age 18 was predictive
of PsA (OR 1.06, p-value<0.01) while current BMI was not significantly associated with
the risk of PsA. That study had some limitations: the definition of
PsA cases was based on self-reported data, and the accuracy of BMI at age 18
could be affected by recall bias. One possible explanation for the association could be the
chronic inflammation that is common to both obesity and PsA. More specifically, obesity
has been found to be related to an overproduction of inflammatory cytokines which
are in turn associated with adiposity (Hamminga, van der Lely et al. 2006). However,
adjusting for duration of disease (and thus inflammatory burden) in the current UK
Biobank study did not materially alter the findings.
Regarding alcohol consumption, findings are mixed due to differences in alcohol intake
assessments and time of recording. The current study suggested an inverse association
between PsA and alcohol consumption compared to both the PSO without arthritis
group and controls after adjusting for potential confounders. The association of
frequency of alcohol consumption with PSO without arthritis did not reach
significance. An inverse association between alcohol consumption and the prevalence
of PsA (OR 0.34, 95% CI 0.23-0.62) has been reported elsewhere (Huidekoper, van
der Woude et al. 2013). The inverse association could be explained by a potential
change in drinking behaviour after the diagnosis of PsA (Wang, Kay et al. 2009) as
advised by the physician due to the use of certain disease modifying anti-rheumatic
drugs that could lead to abnormal liver function (Curtis, Beukelman et al. 2010). In two
case-control studies, Tey et al. and Eder et al. found no association with PsA, whereas
Wu et al. in their prospective study reported that excessive alcohol intake in women
may be associated with increased risk of developing PsA (fully adjusted HR: 4.45, 95%
CI 2.07-9.59) compared to non-drinkers among all participants (Wu, Cho et al. 2015).
This association did not reach the significance threshold among participants with
confirmed PSO. However, excessive drinkers had an increased risk of developing PsA
compared to moderate drinkers (fully adjusted HR: 2.79, 95% CI 1.24-6.26) among
participants with PSO. In a previous prospective study the same authors suggested an
association between excessive alcohol intake and the risk of incident PSO (fully-
adjusted HR: 2.53, 95% CI 1.45-4.40) in women (Qureshi, Dominguez et al. 2010).
The role of smoking in the onset of PsA is unclear. It has been suggested that acute
cigarette smoking activates neutrophils and macrophages and it is associated with
oxidative stress that could stimulate inflammation (van der Vaart, Postma et al. 2004).
In a large cohort of women, current and past smokers were at increased risk of
developing PSO compared to non-smokers (current smokers: RR 1.78, 95% CI 1.46-
2.16, past smokers: RR 1.37, 95% CI 1.17-1.59). The risk increased with the duration,
intensity and pack-years of smoking (Setty, Curhan et al. 2007). Conversely, a
suppressive effect of smoking on several inflammatory cytokines has been suggested
probably due to the presence of anti-inflammatory carbon monoxide and nicotine
(Chapman, Otterbein et al. 2001; Bencherif, Lippiello et al. 2011). Regarding PsA, the
studies that have been conducted have reported conflicting results. Eder et al. (2011)
found an inverse association between smoking and PsA. In a later study, using a larger
sample size and stratifying by the HLA-C*06, this inverse association was present only
among patients who were HLA-C*06 negative (Eder, Shanmugarajah et al. 2012). By
contrast, Li et al. reported an elevated risk of PsA in both current (RR 3.13, 95% CI
2.08-4.71) and past smokers (RR 1.54, 95% CI 1.06-2.24) among all the participants,
with an increase in risk of PsA as the duration of smoking (pack-years) increased.
Among participants with PSO, currently smoking more than 15 cigarettes per day
(RR 1.93, 95% CI 1.09-3.40) and a smoking
duration of more than 25 years (RR 1.90, 95% CI 1.09-3.33) were associated with an increased risk of
developing PsA. In the current study, the findings were in line with Eder et al. (2011);
smoking is a risk factor when compared to control populations but protective for PsA
when compared to PSO. However, this finding is probably a paradox as the observed
protective effect of smoking on the development of PsA within patients with PSO has
been shown to be almost completely mediated by smoking’s direct effect on PSO
(Nguyen, Zhang et al. 2015). This spurious association is due to index event bias
(collider stratification bias) caused by conditioning on an outcome, in this case
restricting the analysis to participants with PSO (inclusive of PsA), and inducing
dependence between risk factors (Nguyen, Zhang et al. 2018).
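The mechanism of index event bias can be made concrete with a small, fully specified toy model. In the sketch below, smoking and a second, unmeasured risk factor `u` both increase the risk of PSO, while only `u` affects progression from PSO to PsA, so smoking has no direct effect on PsA at all. All probabilities are hypothetical and chosen for illustration; they are not estimates from UK Biobank or from Nguyen et al.

```python
from itertools import product

P_SMOKER = 0.3  # assumed prevalence of smoking
P_U = 0.2       # assumed prevalence of an unmeasured risk factor


def p_pso(smoker, u):
    """Risk of PSO: raised by smoking and by the unmeasured factor."""
    return 0.02 + 0.04 * smoker + 0.10 * u


def p_psa_given_pso(u):
    """Risk of PsA among those with PSO: depends on u only, not on smoking."""
    return 0.10 + 0.40 * u


def joint():
    """Yield (smoker, pso, psa, probability) for every possible combination."""
    for smoker, u in product((0, 1), repeat=2):
        p_su = (P_SMOKER if smoker else 1 - P_SMOKER) * (P_U if u else 1 - P_U)
        pso, psa = p_pso(smoker, u), p_psa_given_pso(u)
        yield smoker, 1, 1, p_su * pso * psa
        yield smoker, 1, 0, p_su * pso * (1 - psa)
        yield smoker, 0, 0, p_su * (1 - pso)  # PsA only occurs with PSO here


def odds_ratio_smoking_psa(restrict_to_pso):
    """Odds ratio of PsA for smokers versus non-smokers."""
    cells = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
    for smoker, pso, psa, prob in joint():
        if restrict_to_pso and not pso:
            continue
        cells[(smoker, psa)] += prob
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])


or_population = odds_ratio_smoking_psa(restrict_to_pso=False)
or_within_pso = odds_ratio_smoking_psa(restrict_to_pso=True)
print(f"whole population: OR {or_population:.2f}")
print(f"restricted to PSO: OR {or_within_pso:.2f}")
```

In this toy model smoking appears as a risk factor for PsA in the whole population but as apparently protective once the analysis is restricted to participants with PSO, even though it has no effect on progression. Among people with PSO, smokers are less likely to carry the unmeasured factor, because smoking alone is sufficient to explain their PSO.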
Socioeconomic status was assessed using the Townsend deprivation index, which
is a measure of deprivation based on unemployment, non-car and non-home
ownership and household overcrowding. A negative value represents a high
socioeconomic status. Evidence about the association of socioeconomic
status with PSO and PsA is limited. Eder et al. (2015) reported that
participants with university or college education were at lower risk of developing PsA
compared to participants with incomplete high school education (RR 0.20, 95% CI
0.06-0.62). As they stated, a lower level of education is a marker of lower
socioeconomic status, which is linked with lifestyle habits that may increase the risk of
PsA. The current study, using another measure of socioeconomic status, found that
people with lower socioeconomic status (higher Townsend deprivation index) had
higher odds of reporting PSO without arthritis compared to the controls, supporting
the finding of Eder et al. Clearly, this association will require further and more
comprehensive assessment in future studies.
Finally, fractures and muscle injuries were not associated with any disease status in the
adjusted analysis. A few studies have addressed this potential risk factor of PsA.
Associations between PsA and trauma that leads to medical consultation (Pattison, Harrison et
al. 2008), heavy weight lifting (Eder, Law et al. 2011) and PSO onset due to the Koebner
phenomenon (Soltani-Arabshahi, Wong et al. 2010) have been suggested
previously.
In summary, this study verified the association of increased BMI with PsA compared to
the PSO-only cohort and clarified the smoking paradox that has previously been reported
as a limitation of cross-sectional studies.
2.6.1.1.2 Second aim
The majority of patients with PSO with or without PsA have at least one comorbid
condition, which may interfere with treatment selection. Thus, it is essential that these
comorbidities are recognised and addressed. To fulfil the second aim of the study, the
prevalence of self-reported comorbid conditions in both diseases compared to the
general population was assessed. Hypertension was found to be more prevalent in PsA
compared to the PSO-only cohort and the controls, which is similar to the estimated
prevalence reported previously (Husted, Thavaneswaran et al. 2011). This association
could be a result of the interaction between systemic inflammation which is increased
in arthritis because of joint involvement, traditional risk factors which are prevalent in
arthritis and medication effects like corticosteroids (Nurmohamed, Heslinga et al.
2015). No information was available about the severity of the diseases in the current
study; however, Husted et al. reported a significant association (OR 2.17, 95% CI 1.22-3.83)
even after controlling for disease severity and medication history.
There are limited studies investigating the prevalence of liver disease, specifically non-
alcoholic fatty liver, in both PSO and PsA (Gisondi, Targher et al. 2009; Lindsay, Fraser
et al. 2009; Miele, Vallone et al. 2009). Moreover, some medications used in the
treatment of PSO and PsA, such as methotrexate and leflunomide, have been associated
with abnormalities in liver function tests (Tilling, Townsend et al. 2006; Curtis,
Beukelman et al. 2010). In this study, a higher prevalence of liver disease in PsA
compared to PSO without arthritis and to controls was found, although the number of
cases in both cohorts was too small to draw firm conclusions. Husted et al. reported
similar findings, with patients with PsA being more likely to report liver disease
compared to PSO without arthritis group (OR 7.74, 95% CI 1.35-44.29).
Finally, the prevalence of fatigue and chronic pain was examined, as patients think these
symptoms play a leading role in reducing their quality of everyday life compared to
other comorbidities. The results have shown significant associations between these
symptoms and PsA compared to the PSO and control cohorts. In a study investigating
the quality of life in both diseases, patients with PsA had reduced quality of life
compared to PSO (Rosen, Mussani et al. 2012). Patients with PsA were significantly
more fatigued (measured by the Fatigue Severity Scale) than patients with PSO (4.3 vs
3.4, p-value=0.0007) and experienced more body pain as measured by the SF-36 (61.8
vs. 78.9, p-value<0.0001), where lower scores indicate worse outcome. Both fatigue
and bodily pain were correlated with the number of actively inflamed joints.
Notably, the prevalence of the aforementioned comorbidities in PSO compared to
the controls did not reach significance, supporting the hypothesis that their higher
prevalence in PsA could be the result of the increased systemic inflammation.
Regarding the PSO cohort, the prevalence of diabetes (both types one and two) was
higher compared to the controls. Various cross-sectional studies have also reported a
significant association (Neimann, Shin et al. 2006; Brauchli, Jick et al. 2008; Qureshi,
Choi et al. 2009). Furthermore, a study reported a significant correlation between
insulin secretion, serum resistin levels (which are increased in insulin resistance) and the
Psoriasis Area and Severity Index (PASI), an assessment of PSO severity (Boehncke,
Thaci et al. 2007).
PSO can have profound emotional and social effects and a negative impact on many
aspects of quality of life (Weiss, Kimball et al. 2002). Patients suffer from high levels of
anxiety and stress as the visible skin lesions can cause embarrassment (Tejada Cdos,
Mendoza-Sassi et al. 2011). An increased prevalence of chronic depression was
reported among patients with PSO without arthritis in the current study, a finding that
supports the outcome of a population-based cohort study (Kurd, Troxel et al. 2010),
in which the adjusted relative risks of depression (RR 1.39, 95% CI 1.37-1.41), anxiety
(RR 1.31, 95% CI 1.29-1.34) and suicidality (RR 1.44, 95% CI 1.32-1.57) were higher in
the PSO group compared to the general population.
Finally, a significant association was shown with the prevalence of gastrointestinal
disease including IBD, UC and CD. A study conducted on 12,502 patients with PSO
and 24,287 controls showed that the prevalence of UC and CD was significantly
higher in the PSO group (OR 1.64, 95% CI 1.15-2.33 and OR 2.49, 95% CI 1.71-3.62,
respectively). The associations remained statistically significant even after excluding
patients treated with anti-TNFα drugs (Cohen, Dreiher et al. 2009). It is known that
PSO and inflammatory bowel disease are strongly genetically linked (Skroza, Proietti et
al. 2013).
In conclusion, the current study demonstrated higher prevalence of fatigue, chronic
pain and hypertension in participants with PsA compared to both PSO and the
controls, indicating that the additional inflammatory burden could lead to a higher
prevalence of comorbidities.
Second study 2.6.1.2
Rheumatic diseases, including RA, PsA, AS and SLE, are associated with an increased
risk of comorbid conditions compared to the general population (Ursum, Korevaar et
al. 2013; Ursum, Nielen et al. 2013). In UK Biobank, an increased prevalence and
incidence of chronic myocardial, vascular and pulmonary comorbidities and depression
were found in people with a range of chronic rheumatic diseases compared to those without these
conditions. The results are similar to previous cross-sectional studies
showing an increased prevalence of chronic comorbidities in people with rheumatic
diseases. Data from the Netherlands Information Network of General Practice showed
an increased prevalence of COPD (40% increase), cardiovascular disease (40%
increase) and depression (20% increase) at the time of diagnosis of rheumatic diseases
compared to age- and sex-matched controls (Ursum, Korevaar et al. 2013). Similar
results have been found in people with AS (Kang, Chen et al. 2010), PsA (Khraishi,
MacDonald et al. 2011), and RA (Symmons and Gabriel 2011).
Two previous meta-analyses showed that patients with RA have almost a twofold risk
of developing COPD (Ungprasert, Srivali et al. 2016) and a 70% increased risk of
having a myocardial infarction compared to controls (Avina-Zubieta, Thomas et al.
2012). Data from the Dutch Primary Care Database have also shown that patients with
rheumatic diseases have a 40% increased risk of developing depression compared to
controls without arthritis (Ursum, Nielen et al. 2013). Moreover, a cross-sectional
analysis of a medical service and prescription drug claims database from the US found a
30% increased prevalence of hypertension in people with either RA, PsA, or AS,
compared to controls without any of these conditions (Han, Robinson et al. 2006),
which is in line with the current study.
Physical activity has many benefits for patients with inflammatory rheumatic diseases,
including reducing disease activity and pain, increasing functional capacity, and
improving psychological health (Tierney, Fraser et al. 2012), as well as potentially
reducing the incidence of some comorbidities including cardiovascular disease,
diabetes, and osteoporosis (Warburton, Nicol et al. 2006). This study’s results are
consistent with other studies showing that people with rheumatic diseases are less
physically active compared to the general population (Henchoz, Bastardot et al. 2012),
and that a significant proportion do not carry out the recommended level of physical
activity (Manning, Hurley et al. 2012). Two previous studies did not see an association
between comorbidity and physical activity in people with a rheumatic disease (Greene,
Haldeman et al. 2006). This discordance may be explained by differences in the study
populations: subjects included in the previous two studies also had other forms of
rheumatic disease, including osteoarthritis and gout.
The current study compares the prevalence and incidence of common comorbidities
across a range of rheumatic diseases in a single, large, national UK cohort with a
consistent study design. The age- and sex-standardised prevalence rate ratio of
each of the comorbidities considered was increased in at least one rheumatic disease
compared to people without these conditions. Particularly high standardised
prevalence rate ratios for angina, stroke and myocardial infarction were seen in
participants with SLE. Compared to participants with no morbidity, participants with a
rheumatic disease and no comorbidity were less likely to have a moderate or high level
of physical activity. In addition, compared to participants with a rheumatic disease and
no comorbidity, those with a rheumatic disease and comorbidity were less likely to
have a moderate or high level of physical activity.
Study design 2.6.2
Strengths 2.6.2.1
The obvious strength of these two studies is that the prevalent chronic rheumatic
diseases were studied in a single large national cohort, with detailed demographic and
lifestyle data, as well as details about chronic diseases and medication collected in a
consistent way.
Regarding the first study, another strength is the use of a “healthy” cohort in the
analysis, which can help clarify whether an observed association is a result of the
cutaneous part of the disease or the joint involvement. The majority of the studies
compare PsA with PSO as a referent group, often without assessing the latter for
arthritis. However, the coexisting skin and joint involvement may cause an
overwhelming inflammatory status and alter the risk of comorbidity or the lifestyle
factor effect.
Limitations 2.6.2.2
Despite the obvious strengths of using a large population cohort, there are some
limitations related mostly to the design of the UK Biobank. Due to the self-reported
nature of the data, there is a possibility of misclassification. However, the prevalence of
the rheumatic diseases used in the two studies matches closely with previously
published estimates (Gabriel and Michaud 2009). There has been limited validation of
self-reported medical conditions in UK Biobank to date; however, one study has
suggested that the prevalence of overall pain and musculoskeletal-specific pain in UK
Biobank closely matches estimates from large population studies with much higher
participation rates (Macfarlane, Beasley et al. 2015). In addition, some participants
included in the study reported being diagnosed with an inflammatory
arthritis before the age of 18; these participants may have developed juvenile
arthritis that persisted into adulthood. Data on disease activity and severity were not
available. However, Husted et al. reported prevalences similar to the findings of the
first study after controlling for disease severity. At the same time, of the six studies
that have looked at the association of disease activity with physical activity, only one
found a modest association (Larkin and Kennedy 2014). Finally, because data on
physical activity and environmental factors were collected at a single point in time, it
was not possible to determine any temporal relationships. Another limitation of the
cross-sectional study design used is that it is prone to bias, such as the index
event bias that caused the spurious association between smoking and PsA.
Conclusion 2.6.3
Patients with inflammatory and rheumatic diseases have an increased burden of chronic
comorbidities compared to the general population, which contributes to the further
reduced quality of life reported by the patients. Early detection and optimal
management of comorbid conditions in patients with a rheumatic disease may help to
reduce the impact of the increased comorbid burden seen in these patients. Patients
with a rheumatic disease should be encouraged to meet physical activity guidelines
where possible, which may help to reduce the risk of incident cardiovascular disease.
Longitudinal studies are needed to investigate the association of environmental factors
with the development of complex diseases.
Chapter 3 Genetics of PsA
Introduction 3.1
A key challenge in the discovery of genetic risk loci for PsA is reaching sufficient
sample sizes to adequately power the analysis to detect modest effect sizes. A number
of methods have now emerged that exploit pleiotropy between correlated traits to
improve statistical power. In the current chapter, I explore the use of these statistical
methods for estimating genetic correlation between PsA and related musculoskeletal
diseases, including RA, SLE, AS and JIA, and for the discovery of novel PsA associated
loci. These diseases are thought to be immune-mediated and are characterized by joint
inflammation, with multiple genetic variants contributing to their susceptibility,
many of which have been found to be common or “pleiotropic” among them, as seen in
Table 29 (Cotsapas, Voight et al. 2011; Parkes, Cortes et al. 2013; Solovieff, Cotsapas
et al. 2013). This genetic overlap can encompass:
a common locus for which the same SNP confers increased risk for more than
one disease
a common locus for which the same haplotype confers increased risk for one
disease but is protective for another, or
a common locus for which different haplotypes are implicated.
In contrast to pleiotropy, which focuses on particular regions, genetic correlation
estimates the genome-wide correlation of all SNPs for two traits. Genetic correlation
can only exist if the directions of effects are consistently aligned between the traits
(Bulik-Sullivan, Finucane et al. 2015).
Table 29 | Shared pathways among immune-mediated diseases (Adapted from (Sun and Zhang 2014))
Region | Reported gene(s) | Biological annotations | Associated diseases
1q23 | FCGR2A | Antigen processing and presentation | AS, RA, SLE
1p13 | PTPN22 | T-cell receptor signaling pathway | PsA, RA, SLE, JIA
1p31 | IL23R | IL-23/Th17 axis | PSO, PsA, AS, JIA
1p36 | RUNX3 | CD8+ T lymphocyte differentiation | PsA, AS
2q24.2 | IFIH1 | Interferon signaling pathway | PsA, PSO, SLE
2q32 | STAT4 | IL-23/Th17 axis | RA, SLE, JIA, PSO
2p16. | REL | Rel/NF-κB family | PsA, PSO, RA
5q33 | IL12B | Th1 cell differentiation | PsA, PSO, AS
5q33 | TNIP1 | NF-κB pathway | PsA, PSO, SLE
5q31.1 | IL13 | Th2 cell differentiation | PsA, PSO
5q15 | ERAP1, ERAP2 | MHC class I processing | AS, PSO, JIA
6q21 | PRDM1 | Type III interferon responses regulation | RA, SLE
6q23 | TNFAIP3 | NF-κB pathway | PsA, PSO, RA, SLE
6q25 | TAGAP | Signal transduction | PSO, RA
10p15 | IL2RA | IL-2R signaling pathway | JIA, RA
11q24.3 | ETS1 | Regulation of Th17 and B cells | SLE, RA
16q24 | IRF8 | IRF family | RA, SLE
16p13.13 | SOCS1 | IL-7RA/IL-7 pathway | SLE, RA
16p11 | IL27 | IL-23/Th17 axis | AS, SLE
18p11 | PTPN2 | JAK/STAT pathway regulation | PSO, JIA
19p13 | TYK2 | IL-23/Th17 signaling | RA, PsA, PSO, JIA, AS
22q11 | UBE2L3 | Ubiquitylation | SLE, RA, AS, PSO, JIA
22q11 | YDJC | Ubiquitylation | PSO, RA, SLE, JIA
22q13 | IL2RB | IL-2R signaling pathway | JIA, RA
PsA: psoriatic arthritis; PSO: Psoriasis; AS: ankylosing spondylitis; SLE: systemic lupus
erythematosus; RA: rheumatoid arthritis; JIA: juvenile idiopathic arthritis; IL: Interleukin;
Th: T helper; MHC: Major Histocompatibility Complex;
All methods used to exploit the pleiotropy between diseases require only GWAS
summary statistics data and account for the use of common controls.
Aims and Objectives 3.2
The aim of this chapter is to identify novel PsA associated variants by leveraging power
from other musculoskeletal traits and, by extension, the common underlying pathways
among the musculoskeletal diseases. While the primary motivation is the discovery of
novel PsA associations, the methods employed will also identify correlations among the
other four traits used in the analysis; these are described in the Appendix.
The objectives of this chapter are:
Harmonize the associations of the GWAS summary data to the same effect
allele using as a reference the 1000 Genomes Project
Explore the genetic correlation among the studied musculoskeletal diseases
Identify novel associations by applying cFDR analysis to the genetically correlated
diseases identified by the previous objective
Combine the datasets in a meta-analysis exploring two different methods
Contribution of the candidate 3.3
The candidate (EB) performed the data preparation, the planning, the statistical analysis
and the interpretation of the results.
Methods 3.4
GWAS summary statistics datasets 3.4.1
The summary statistics data were obtained from five studies on musculoskeletal
diseases. Both RA (Okada, Wu et al. 2014)
(http://plaza.umin.ac.jp/~yokada/datasource/software.htm) and SLE (Bentham, Morris et
al. 2015) (www.immunobase.org) datasets were publicly available, whereas the AS data
(Australo-Anglo-American Spondyloarthritis, Reveille et al. 2010) were provided upon
request. The PsA and JIA summary datasets are based on GWAS from the Arthritis
Research UK Centre for Genetics and Genomics (personal communication Dr. Anne
Hinks and Dr. John Bowes). Inclusion and exclusion criteria for RA, SLE and AS are
described in the original study publications. The sample numbers of the GWAS
summary datasets for the five diseases can be found in Table 30. The control groups
were partially overlapping as they were obtained from common data sources (e.g.
WTCCC2).
Table 30 | Sample sizes of the GWAS summary statistics datasets of the five musculoskeletal diseases
Disease dataset | Number of Cases | Number of Controls
RA | 14,361 | 43,923
SLE | 4,036 | 6,959
AS | 2,951 | 6,658
PsA | 2,443 | 5,129
JIA | 1,472 | 5,181
RA: Rheumatoid Arthritis; SLE: Systemic Lupus Erythematosus; AS: Ankylosing
Spondylitis; PsA: Psoriatic Arthritis; JIA: Juvenile Idiopathic Arthritis
In addition, the 1000 Genomes phase 3 allele frequencies dataset
(ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz) was
downloaded from the International Genome Sample Resource
(ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3). This dataset contains the allele
frequencies for every SNP per population.
Pre-processing 3.4.2
Datasets 3.4.2.1
Due to inconsistencies in the format of publicly available GWAS summary association
statistics data, conversion into the same format is necessary to prevent any pitfalls in
the post-GWAS analyses. Initially, it is essential to bring all genetic analyses to the
same reference build. For that reason, genome positions in NCBI build 36 (UCSC hg18)
from the AS summary data were converted to NCBI build 37 (UCSC hg19) using
the online tool Batch Coordinate Conversion (liftOver) from UCSC Genome Browser
(https://genome.ucsc.edu/cgi-bin/hgLiftOver). In addition, the summary statistics must
contain the following columns, which are needed by the post-GWAS analysis methods
used in the current chapter: SNP: the rs identification of
SNPs, CHR: chromosome number, BP: base pair positions, A1: effect allele, A2: other
allele, Z: z-score with respect to the allele A1, BETA: beta coefficient with respect to
allele A1, SE: standard error and N: sample size. Only the SLE data contained an OR
column, instead of BETA, along with its 95% CI. Thus, the BETA and SE were
computed using the formulas:
\[ \mathrm{beta} = \log(\mathrm{OR}) \tag{1} \]
\[ \mathrm{se} = \frac{CI^{L}_{\log \mathrm{OR}} - \mathrm{beta}}{-1.96} \tag{2} \]
where $CI^{L}_{\log \mathrm{OR}}$ is the lower confidence bound on the log OR scale (the upper bound
can also be used, but with 1.96 instead of -1.96). Moreover, none of the datasets included the z-score,
so it was estimated with the following formula:
\[ z = \frac{\mathrm{beta}}{\mathrm{se}} \tag{3} \]
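Equations (1) to (3) can be illustrated with a minimal Python sketch. The function name `or_to_beta_se_z` and the constant `Z_975` are hypothetical names chosen for this illustration and do not come from the original analysis code.

```python
import math

Z_975 = 1.96  # normal quantile corresponding to a 95% confidence interval


def or_to_beta_se_z(odds_ratio, ci_lower):
    """Convert an odds ratio and its lower 95% bound into (beta, se, z)."""
    beta = math.log(odds_ratio)                 # equation (1)
    se = (math.log(ci_lower) - beta) / -Z_975   # equation (2), lower bound
    z = beta / se                               # equation (3)
    return beta, se, z
```

Using the upper bound instead would only change the sign of the denominator in equation (2), as noted above.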
It should be noted that the AS dataset did not include any allele columns; thus, the
alleles were imputed from the reference panel during the harmonization of the alleles
described in the next section. Finally, quality control was performed on the SNPs:
The MHC region, which exhibits both strong LD and strong association
with musculoskeletal diseases, was excluded from every summary statistics
dataset.
All non-biallelic SNPs were removed
All SNPs without rs ID or with duplicated rs ID were removed
All SNPs on chromosome X, Y and mitochondrial SNPs were removed.
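The quality control filters listed above could be sketched in Python as follows. This is an illustration only: the MHC boundaries are not stated in the text, so the hg19 coordinates used here are an assumption, and the function names and the dictionary-based row format are hypothetical.

```python
from collections import Counter

AUTOSOMES = {str(c) for c in range(1, 23)}
BASES = {"A", "C", "G", "T"}
# Assumed, illustrative hg19 bounds for the MHC region on chromosome 6;
# the exact coordinates used in the analysis are not given in the text.
MHC_CHR, MHC_START, MHC_END = "6", 25_000_000, 34_000_000


def passes_qc(snp, rsid_counts):
    """Return True if one summary-statistics row survives all QC filters."""
    rsid = snp["SNP"]
    if not rsid.startswith("rs") or rsid_counts[rsid] > 1:
        return False  # missing or duplicated rs ID
    if snp["CHR"] not in AUTOSOMES:
        return False  # X, Y and mitochondrial SNPs
    if snp["A1"] not in BASES or snp["A2"] not in BASES or snp["A1"] == snp["A2"]:
        return False  # not a single-base, bi-allelic SNP
    in_mhc = snp["CHR"] == MHC_CHR and MHC_START <= snp["BP"] <= MHC_END
    return not in_mhc


def apply_qc(rows):
    """Keep only the rows that pass every filter."""
    rsid_counts = Counter(row["SNP"] for row in rows)
    return [row for row in rows if passes_qc(row, rsid_counts)]
```

Counting rs IDs before filtering means that every copy of a duplicated rs ID is dropped, which is one reading of the duplicate rule above.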
Regarding the 1000 Genomes (1KG) dataset, variants that were not SNPs, not bi-allelic
or not autosomal were removed, and only the allele frequencies of the European
population (EUR_AF) were kept in the dataset. Three additional columns, named
minor_allele, major_allele and euro_allele_frequency, were then created according to
the following pseudo-code, so that the retained frequency refers to the minor_allele
column (EUR_AF initially refers to the ALT column):
IF EUR_AF IS LESS THAN OR EQUAL TO 0.5 THEN:
    minor_allele IS EQUAL TO ALT;
    major_allele IS EQUAL TO REF;
    euro_allele_frequency IS EQUAL TO EUR_AF;
ELSE:
    minor_allele IS EQUAL TO REF;
    major_allele IS EQUAL TO ALT;
    euro_allele_frequency IS EQUAL TO 1-EUR_AF;
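The pseudo-code above translates directly into a short Python function. The function name `orient_to_minor_allele` is hypothetical; the logic mirrors the pseudo-code line by line.

```python
def orient_to_minor_allele(ref, alt, eur_af):
    """Return (minor_allele, major_allele, euro_allele_frequency).

    eur_af is the European frequency of the ALT allele, as stored in the
    EUR_AF field of the 1KG sites file.
    """
    if eur_af <= 0.5:
        # ALT is the minor allele, so its frequency is kept unchanged.
        return alt, ref, eur_af
    # REF is the minor allele, so the frequency is complemented.
    return ref, alt, 1 - eur_af
```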
Harmonisation of datasets 3.4.2.2
Summary statistics data from different studies often suffer from allele coding
discordance; aligning the alleles of each SNP in every dataset against
those of a reference panel is therefore essential. The process requires reversing the signs of the
betas and Z-scores whenever the alleles of a SNP in the summary statistics are the reverse of the
alleles in the reference panel. Each study dataset was therefore merged with the updated
version of the 1KG dataset created as described in the previous section. If the
minor_allele from 1KG differed from the study’s A1 allele, the signs of the betas
and Z-scores were flipped.
Statistical analysis 3.4.3
Estimation of genome-wide genetic correlation 3.4.3.1
Elucidating the complex relationships and underlying pathways among diseases is a
primary aim of epidemiology. Genetic variants can help shed light on
cause and effect, as they are more robust to confounding and reverse causality. This
can be achieved by looking for correlations in effect sizes from summary data of
GWAS analyses among complex diseases.
In the current study, all five musculoskeletal conditions present a polygenic architecture
(in which inheritance is affected by thousands of SNPs with small effects); the
pairwise genetic correlation between them was therefore estimated using cross-trait LD
Score regression (Bulik-Sullivan, Finucane et al. 2015). This method is used to test for
genetic overlap among traits and diseases using GWAS summary statistics and is not
affected by sample overlap. The key assumption behind this approach is that the
variants that have high LD scores - a measure of the extent a variant is in LD with its
neighbour variants - are more likely to tag causal SNPs and have a higher χ2 statistic on
average compared to those with low LD scores. LD Score regression can also be used
to control for population stratification and estimate the genetic heritability as it
exploits the expected relationships between true association signals and local LD
around them.
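For reference, the regression equations behind this approach can be written out as follows, in the notation commonly used for LD Score regression. The symbols are introduced here for illustration and are not defined elsewhere in this chapter: $\ell_j$ is the LD Score of SNP $j$, $M$ the number of SNPs, $N$ the sample size, $h^2$ the SNP heritability, $a$ the contribution of confounding, $\rho_g$ the genetic covariance, $\rho$ the phenotypic correlation among the $N_s$ overlapping samples, and $r_g$ the genetic correlation.

```latex
% Single-trait LD Score regression
E[\chi^2_j] = \frac{N h^2}{M}\,\ell_j + N a + 1

% Cross-trait LD Score regression for traits 1 and 2
E[z_{1j} z_{2j}] = \frac{\sqrt{N_1 N_2}\,\rho_g}{M}\,\ell_j
                   + \frac{\rho\, N_s}{\sqrt{N_1 N_2}}

% Genetic correlation
r_g = \frac{\rho_g}{\sqrt{h_1^2\, h_2^2}}
```

In the cross-trait equation, sample overlap enters only through the intercept term and not through the slope, which is why the estimate of genetic correlation is not biased by shared samples.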
In the current study, pre-computed LD Scores were downloaded for HapMap3 SNPs
from the LDScore website https://data.broadinstitute.org/alkesgroup/LDSCORE/ (File
eur_w_ld_chr.tar.bz2). Moreover, as imputation quality is correlated with LD Score
and low imputation quality (INFO) yields lower test statistics, it is suggested that SNPs
with INFO<0.9 should be removed from the analysis. Due to the lack of the INFO
column in the datasets that were used, the filtering was performed using HapMap3
SNPs, as recommended by Bulik-Sullivan et al. (Bulik-Sullivan, Finucane et al. 2015). The file containing
the HapMap3 SNPs was downloaded from the above-mentioned website (File
w_hm3.snplist.bz2).
The genetic correlation ranges from -1 to +1, with the sign indicating whether the
correlation is negative or positive.
To technically validate a subset of the findings, LD Hub
(http://ldsc.broadinstitute.org/ldhub/) was used. LD Hub
is a web interface that contains summary-level GWAS data for 173 traits, including RA
and SLE, and automates the LD score regression analysis; it was applied to the PsA
and JIA data (Zheng, Erzurumluoglu et al. 2017). A full analysis on LD Hub was not
possible due to the limited public availability of some of the datasets.
cFDR analysis 3.4.3.2
In GWAS, the parallel testing of millions of potential markers with a comparatively low
number of samples requires the use of a stringent significance threshold in order to
limit Type 1 errors (false positives). As a result, the identification of variants with small
effect sizes requires large sample sizes, which in turn are costly and time-consuming to collect.
Leveraging power from genetically related diseases can improve the detection of
associated variants without requiring larger samples. The Bayesian cFDR analysis
establishes an upper bound on the FDR across a set of variants whose p-values for the
two diseases both fall below the respective disease-specific thresholds (Andreassen, Djurovic et al. 2013; Liley
and Wallace 2015).
Genomic control 3.4.3.2.1
No additional genomic control was performed due to the fact that all the studies had
already been corrected for genomic inflation, as can be seen in the original publications
(Australo-Anglo-American Spondyloarthritis, Reveille et al. 2010; Okada, Wu et al.
2014; Bentham, Morris et al. 2015).
Pleiotropic enrichment estimation 3.4.3.2.2
A Quantile-Quantile (Q-Q) plot is a graph indicating the observed distribution of a
random variable against the expected distribution. In GWAS, Q-Q plots are often used
to present the observed association across SNPs with the expected distribution of
association test statistics under the null hypothesis. A deviation of the observed
distribution from the identity line indicates more association signal than expected under the null. Thus, to assess pleiotropic enrichment
of association, Q-Q plots conditioned on “pleiotropic” effects with varying strengths of
association in the conditional trait were used, following Andreassen et al. (Andreassen,
Djurovic et al. 2013). The conditional Q-Q graphs were plotted for quantiles of
nominal $-\log_{10}(P)$ values for association of the subset of SNPs below each
significance threshold in the conditional disorder. The nominal P-values $-\log_{10}(P)$
were plotted on the y-axis and the empirical quantiles $-\log_{10}(q)$ were plotted on the
x-axis. Any leftward shift from the identity line as the principal phenotype is
successively conditioned on more stringent significance criteria indicates a pleiotropic
enrichment. Greater spacing between the curves implies a stronger trend of
enrichment shared between the two traits (Andreassen, Djurovic et al. 2013).
Q-Q plots were constructed for each pair of diseases that were significantly correlated
as shown by the LD Score regression analysis, using each trait both as principal and as
conditional.
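The construction of one conditional Q-Q curve can be sketched as follows. This is a simplified illustration, not the plotting code used for the figures: the function name `conditional_qq_points` is hypothetical, p-values are assumed to be strictly positive, and tied p-values are simply given consecutive ranks.

```python
import math


def conditional_qq_points(p_principal, p_conditional, threshold):
    """Return (x, y) Q-Q coordinates for SNPs with conditional p < threshold.

    x is -log10 of the empirical quantile q and y is -log10 of the nominal
    p-value in the principal trait.
    """
    subset = sorted(
        p for p, pc in zip(p_principal, p_conditional) if pc < threshold
    )
    n = len(subset)
    points = []
    for rank, p in enumerate(subset, start=1):
        q = rank / n  # share of the subset with a p-value at or below p
        points.append((-math.log10(q), -math.log10(p)))
    return points
```

Calling the function once per threshold in the conditional trait, and overlaying the resulting curves, gives the family of curves whose successive leftward shifts are read as enrichment.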
Estimation of the cFDR 3.4.3.2.3
The enrichment seen in the Q-Q plots can be interpreted in terms of FDR. The FDR is the
expected proportion of features called significant that are truly null and is given by the formula
\[ \mathrm{FDR} = E\left[\frac{\text{Number of false predictions}}{\text{Number of total predictions}}\right] \]
For example, an FDR of 5% means that among 100 features called significant, five are
expected to be truly null.
FDR can also be approximated as
\[ \mathrm{FDR}(p) \approx \frac{p}{q} \]
which is the nominal p-value divided by the empirical quantile
described in the previous section. Taking the $-\log_{10}$ transformation gives
\[ -\log_{10}(\mathrm{FDR}(p)) \approx \log_{10}(q) - \log_{10}(p) \]
a value that corresponds to the horizontal shift of the curves in the Q-Q
plots, with a larger shift corresponding to a smaller FDR.
Finally, the conditional FDR is the posterior probability that a given variant is null for the
first phenotype given that the p-values for both phenotypes are as small as or smaller than the observed
p-values (Andreassen, Djurovic et al. 2013).
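A minimal sketch of the empirical cFDR estimate, in the spirit of the original estimator of Andreassen et al., is given below. It does not implement the shared-control adjustment of the extended method, and the function and argument names are hypothetical.

```python
def empirical_cfdr(p_i, p_j, principal, conditional):
    """Estimate cFDR(p_i | p_j) from paired lists of p-values.

    principal[k] and conditional[k] are the p-values of SNP k in the
    principal and conditional traits respectively.
    """
    n_cond = sum(1 for pc in conditional if pc <= p_j)
    n_both = sum(
        1 for pp, pc in zip(principal, conditional) if pp <= p_i and pc <= p_j
    )
    if n_both == 0:
        raise ValueError("no SNP satisfies both thresholds")
    q = n_both / n_cond       # empirical conditional quantile of p_i
    return min(1.0, p_i / q)  # FDR is approximated by p / q, capped at 1
```

Setting `p_j` to 1 places every SNP in the conditioning set, so the estimate reduces to the unconditional approximation of the FDR as p divided by q.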
In the current study, the extended cFDR method by Liley et al. was used as it allows
the use of shared controls between the studies (Liley and Wallace 2015). This
improves the accuracy of the effect size estimates as no splitting of the control cohort
among studies is required. The cFDR was calculated for each SNP where the
significantly correlated pairs of musculoskeletal diseases served either as the principal
phenotype or as the conditional one. To assess whether the cFDR method leads to the
enrichment of a specific locus, a less conservative significance cut-off was used; the
method chooses the cFDR a SNP would have if it had a principal p-value of 5e-08 and a
conditional p-value of 1. The value of the cut-off is a bound on the FDR for the GWAS
analysis (that is, rejecting the null whenever p-value<5e-08).
Conjunctional cFDR (ccFDR) 3.4.3.2.4
To identify the pleiotropic loci, that is, the SNPs that were associated with both
diseases, a conjunctional cFDR (ccFDR) was used (Andreassen, Thompson et al. 2015). The
ccFDR was estimated as the maximum cFDR of the two traits
\[ \mathrm{ccFDR} = \max\{\mathrm{cFDR}_{\text{trait1} \mid \text{trait2}},\ \mathrm{cFDR}_{\text{trait2} \mid \text{trait1}}\} \]
Again, the most significant SNP in each LD block was reported based on the minimum
ccFDR.
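The two steps above, taking the maximum of the two cFDR values and then reporting the SNP with the minimum ccFDR in each LD block, can be sketched as follows. Both function names and the `(rsid, ccfdr)` tuple layout are hypothetical.

```python
def conjunctional_cfdr(cfdr_1_given_2, cfdr_2_given_1):
    """ccFDR of one SNP: the larger of its two conditional FDR values."""
    return max(cfdr_1_given_2, cfdr_2_given_1)


def lead_snp(block):
    """Return the (rsid, ccfdr) pair with the smallest ccFDR in an LD block."""
    return min(block, key=lambda item: item[1])
```

Taking the maximum is conservative: a SNP receives a small ccFDR only when it has a small cFDR in both directions.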
MTAG analysis 3.4.3.3
Another method used in the current study to identify novel variants is MTAG, a joint
analysis of GWAS summary statistics from different traits that is robust to sample overlap.
MTAG can be applied to a cluster of traits with known or suspected genetic
correlation, such as autoimmune diseases, with resulting gains in statistical power.
The effect estimates for each trait can therefore be improved by incorporating
the information contained in the GWAS estimates for the other related traits (Turley,
Walters et al. 2018).
MTAG is a generalised version of the inverse-variance-weighted meta-analysis that
produces trait-specific association statistics. It uses bivariate LD score regression to
account for the overlapping samples and is based on the key assumption that all SNPs
share the same variance-covariance matrix of effect sizes across traits. As shown by Turley et al., even when some
SNPs are not associated with some traits (a violation of the key assumption), MTAG is
still a consistent estimator (Turley, Walters et al. 2018).
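For orientation, the standard fixed-effects inverse-variance-weighted meta-analysis that MTAG generalises can be sketched as below. This is not MTAG itself: it assumes independent samples and a single common effect across inputs, and the function name `ivw_meta` is hypothetical.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effects inverse-variance-weighted estimate for one SNP.

    Returns the pooled (beta, se, z) across the input studies.
    """
    weights = [1.0 / se ** 2 for se in ses]  # precision of each estimate
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se
```

MTAG replaces the scalar weights with weights derived from the estimated genetic and sampling covariance matrices, so that each trait receives its own updated estimate.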
MTAG was applied to the intersection of SNPs from the summary statistics of all five
musculoskeletal diseases to exploit the expected gains in power. This method requires
filtering of effect allele frequency (EAF) to exclude rare variants. As the EAFs were not
available in the summary statistics datasets, the 1KG allele frequencies (updated
variable euro_allele_frequency) were used.
Manhattan plots, using R package “qqman”, were created to illustrate the localisation
of the genetic markers associated with the traits, plotting all SNPs within an LD block
in relation to their chromosomal location.
Subset-based analysis (ASSET) 3.4.3.4
An alternative meta-analysis method used to conduct the association analysis is the
association analysis based on subsets (ASSET) methodology. The main advantage of this method is that it can account for
subset-specific and bidirectional effects of individual SNPs; thus, it gains substantial
power compared to the basic fixed-effects meta-analysis (Bhattacharjee,
Rajaraman et al. 2012).
The subset-based meta-analysis (SBM) is a generalised fixed-effects meta-analysis model
which investigates all possible subsets of traits for the presence of true associations by
incorporating a multiple-testing adjustment procedure. The method that has been
implemented in the R package “ASSET” performs both one-sided and two-sided
analysis. The one-sided method identifies the traits that have associations in the same
direction, whereas the two-sided method applies one-sided subset search separately
for positively and negatively associated traits for a given SNP and then combines the
signals from the two directions into a single combined χ2 statistic. Both methods
account for correlation among the studied traits due to shared subjects.
For the application of the methods, the case-control overlap matrices N11, N00
and N10 are required; these specify, respectively, the numbers of cases, of controls and of cases that
served as controls that are shared among studies.
In the current study the following matrices were used:
N11
RA SLE AS PsA JIA
RA 14,361 0 0 0 0
SLE 0 4,036 0 0 0
AS 0 0 2,951 0 0
PsA 0 0 0 2,443 0
JIA 0 0 0 0 1,472
The table denotes the number of cases that are shared between the disease studies. The
diagonal contains the number of cases in each disease study. The zeros indicate the absence of
overlap among the studies.
N00
RA SLE AS PsA JIA
RA 43,923 231 5,847 5,129 5,181
SLE 231 6,959 231 231 231
AS 5,847 231 6,658 3,000 3,000
PsA 5,129 231 3,000 5,129 5,129
JIA 5,181 231 3,000 5,129 5,181
The table denotes the number of controls that are shared between the disease studies (non-
diagonal elements of the table). The diagonal contains the number of controls in each disease
study.
N10
controls
cases RA SLE AS PsA JIA
RA 0 0 0 0 0
SLE 0 0 0 0 0
AS 0 0 0 0 0
PsA 0 0 0 0 0
JIA 0 0 0 0 0
The table denotes the number of cases in a disease study that were used as controls in
another disease study. By definition, the diagonal is zero since cases cannot serve as controls
and vice versa in the same study.
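To illustrate how such overlap counts translate into correlation between study-level test statistics, the sketch below uses a commonly quoted approximation for overlapping case-control studies. Whether the ASSET package applies exactly this expression internally is an assumption, and the function and argument names are hypothetical.

```python
import math


def overlap_correlation(n1_k, n0_k, n1_l, n0_l, n11=0, n00=0, n10=0, n01=0):
    """Approximate correlation between the z-statistics of studies k and l.

    n1_*, n0_*: numbers of cases and controls in each study
    n11, n00:   shared cases and shared controls
    n10:        cases of study k used as controls in study l
    n01:        controls of study k that are cases in study l
    """
    scale = math.sqrt(
        (n1_k * n0_k / (n1_k + n0_k)) * (n1_l * n0_l / (n1_l + n0_l))
    )
    return scale * (
        n11 / (n1_k * n1_l)
        + n00 / (n0_k * n0_l)
        - n10 / (n1_k * n0_l)
        - n01 / (n0_k * n1_l)
    )
```

For example, the RA and PsA entries of the matrices above would be passed as `overlap_correlation(14361, 43923, 2443, 5129, n00=5129)`, since only controls are shared between those two studies.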
Results 3.5
Genetic overlap between the diseases 3.5.1
The genome-wide correlation for each pair of musculoskeletal disorders was assessed
using LD Score regression (Figure 14, Figure 15). The pairs that showed significant
genetic overlap were RA-SLE (rg=0.49, p-value=1.93e-14), RA-PsA (rg=0.30, p-value=0.002),
RA-JIA (rg=0.49, p-value=4.3e-05), SLE-JIA (rg=0.60, p-value=2.00e-04) and PsA-JIA
(rg=0.67, p-value=5.3e-06). In addition, AS presented a weak negative correlation with both
PsA (rg=-0.14, p-value=0.005) and JIA (rg=-0.12, p-value=0.045). The weak correlation
between PsA and AS could be due to the absence of the axial sub-phenotype in the PsA
cohort. The findings were verified for both PsA and JIA by using LD Hub (Appendix
Table 2).
Figure 14 | Genetic correlation for each pair of the five musculoskeletal disorders. The pairs RA-SLE, RA-PsA, RA-JIA, SLE-JIA, AS-PsA, AS-JIA and PsA-JIA presented a statistically significant correlation. Red colour indicates negative correlation, blue indicates positive correlation and white indicates no correlation. P-values are also presented.
Figure 15 | Dendrogram clustering the diseases on correlation “distances”. The dissimilarity measure 1-abs(correlation) was used to discriminate all correlated pairs and is presented as the clustering height (y-axis). PsA and JIA are clustered together, then RA and SLE create their own cluster. AS is in its own branch.
cFDR analysis 3.5.2
The pleiotropy-informed cFDR analysis was applied to the pairs of diseases that
presented a significant genetic correlation in the LD score regression analysis,
namely RA-SLE, RA-PsA, RA-JIA, SLE-JIA, PsA-JIA, PsA-AS and AS-JIA. Initially, a
Q-Q plot-based enrichment analysis was performed for each pair and then all SNPs
identified were reported. Finally, pleiotropic SNPs affecting both diseases in a pair
were also identified. In the main body of the thesis only the cFDR analysis using PsA as
the principal disease is described. The analysis for the rest of the pairs of diseases is
described in detail in the Appendix.
cFDR analysis using PsA as the principal disease 3.5.2.1
Enrichment plots 3.5.2.1.1
Q-Q plots of nominal p-values from GWAS summary statistic data can be used to
visualise the enrichment of the observed statistical association relative to that
expected under the null hypothesis. Figure 16 depicts the conditional Q-Q plots for
PsA given nominal p-values of association with RA (PsA|RA), AS (PsA|AS) and JIA
(PsA|JIA). It shows enrichment across different significance values for the three
conditional diseases. The successive leftward shifts for decreasing nominal p-values of
each of RA, AS and JIA indicate that the proportion of non-null effects in PsA
varies considerably across the distinct levels of association with RA, AS or JIA.
The slopes of the Q-Q plots of PsA associations increase as the plotted SNP sets
become more strongly associated with each of the conditional diseases, providing
evidence of pleiotropy.
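The construction of a conditional Q-Q plot can be sketched as follows. This is a minimal illustration and not the code used to produce Figure 16: the function name, the plotting-position convention (rank - 0.5)/n and the toy p-values are all assumptions made for the example.

```python
import math


def conditional_qq(p_principal, p_conditional, cutoff):
    """Return (x, y) points for the Q-Q plot of the principal trait,
    restricted to SNPs whose conditional-trait p-value is <= cutoff.

    x = -log10(empirical quantile), y = -log10(nominal p-value).
    """
    subset = sorted(
        pp for pp, pc in zip(p_principal, p_conditional) if pc <= cutoff
    )
    n = len(subset)
    points = []
    for rank, p in enumerate(subset, start=1):
        quantile = (rank - 0.5) / n  # assumed plotting position
        points.append((-math.log10(quantile), -math.log10(p)))
    return points


# Hypothetical p-values for eight SNPs (not thesis data).
p_psa = [2e-7, 3e-4, 0.04, 0.2, 0.5, 0.7, 0.01, 0.9]
p_ra = [1e-9, 2e-5, 0.3, 0.8, 0.6, 0.02, 1e-3, 0.4]

all_snps = conditional_qq(p_psa, p_ra, cutoff=1.0)      # unconditional curve
strong_ra = conditional_qq(p_psa, p_ra, cutoff=0.01)    # SNPs with p_RA <= 0.01
```

Plotting one curve per cutoff on the same axes gives the family of coloured curves described above; under enrichment, curves for stricter cutoffs lie further to the left of the diagonal.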
Figure 16 | Q-Q plots for PsA conditional on RA (top), AS (left) and JIA (right). Y axes show log10(P’) for each principal disease and X axes show the log quantile of p-values in sets of SNPs. The degree of leftward shift of a black point from the diagonal is proportional to the unconditional FDR of that p-value for the principal phenotype, and the degree of leftward shift of a coloured point is proportional to the conditional FDR of the p-value for the principal phenotype and the p-cutoff corresponding to the colour for the conditional phenotype. Each colour corresponds to the Q-Q plot for $p_{PsA}$ amongst a subset of SNPs with $p_{RA}$, $p_{AS}$ or $p_{JIA}$ less than the indicated cutoff. A leftward shift with decreasing $p_{RA}$, $p_{AS}$ or $p_{JIA}$ cut-off indicates that SNPs which are associated with the conditional phenotype (RA, AS or JIA) are more likely to be associated with the principal phenotype (PsA), presumably due to pleiotropic effects on phenotypes.
PsA loci identified with cFDR 3.5.2.1.2
As shown in Figure 17, conditioning PsA on RA led to the identification of 61
significant SNPs (orange colour, top plot left of the vertical line), while 37 (orange
colour, bottom left) and 30 (orange colour, bottom right) were identified when
conditioned on AS and JIA, respectively. The identified SNPs map to eight independent
loci and the list of index SNPs for each region can be seen in Table 31. All novel
associations identified in this analysis for PsA have been previously reported to be
associated with PSO or PsA, either directly or indirectly as a proxy SNP being in LD
with the SNP associated with the disease.
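The effect of conditioning can be illustrated with the commonly used empirical cFDR estimator, in which the conditional FDR of a SNP is its principal p-value divided by the empirical proportion of SNPs that are at least as significant for the principal trait among those at least as significant for the conditional trait. The sketch below is an assumption-laden illustration: it is a quadratic-time toy implementation on hypothetical p-values, it does not correct for LD or shared controls, and it is not the implementation used in this thesis. Only the PsA|RA cut-off is taken from Table 31.

```python
def empirical_fdr(p_principal):
    """Unconditional empirical FDR: p * N / #{SNPs with p' <= p}."""
    n = len(p_principal)
    return [
        min(1.0, p * n / sum(1 for q in p_principal if q <= p))
        for p in p_principal
    ]


def empirical_cfdr(p_principal, p_conditional):
    """Empirical conditional FDR of the principal trait given the
    conditional trait, evaluated at each SNP's own pair of p-values."""
    pairs = list(zip(p_principal, p_conditional))
    cfdr = []
    for p_i, p_j in pairs:
        n_cond = sum(1 for _, pc in pairs if pc <= p_j)
        n_both = sum(1 for pp, pc in pairs if pp <= p_i and pc <= p_j)
        # n_both >= 1 because the SNP itself always satisfies both conditions.
        cfdr.append(min(1.0, p_i * n_cond / n_both))
    return cfdr


# Hypothetical p-values for four SNPs (not thesis data).
p_principal = [1e-6, 0.5, 0.03, 0.8]
p_conditional = [1e-8, 0.6, 1e-4, 0.9]

fdr = empirical_fdr(p_principal)
cfdr = empirical_cfdr(p_principal, p_conditional)

CUTOFF_PSA_GIVEN_RA = 6.13e-04  # PsA|RA cut-off reported under Table 31
selected = [i for i, value in enumerate(cfdr) if value < CUTOFF_PSA_GIVEN_RA]
```

For the third toy SNP the conditional estimate is lower than the unconditional one because both SNPs with a small conditional p-value also have a small principal p-value, which is the enrichment that the method exploits.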
Figure 17 | cFDR results for PsA conditioned on RA (top), AS (bottom left) and JIA (bottom right). The black vertical line signifies the GWAS significance threshold 5e-08. The red dots signify the genome-wide significant SNPs for the principal disease (here, PsA), whereas the orange dots (on the left side of the vertical line) signify the SNPs identified as significant for PsA after conditioning on the conditional disease (RA, AS or JIA). Black dots show a random sample of the observed p-value pairs. Note the leftward shift of colours, corresponding to an increased p-value threshold for association with PsA for SNPs with low p-values for the conditional diseases.
Table 31 | Loci associated with PsA after applying cFDR analysis using RA, AS and JIA as conditional phenotypes

| Chr | Position | rsid | Effect allele | Other allele | MAF | Conditional phenotype | Principal p-value | Conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 114377568 | rs2476601 | A | G | 0.09 | RA | 2.66e-04 | 1.56e-144 | 4.63e-04 | PTPN22 | missense variant | RA, T1D, CD; p: JIA; g: PsA |
| 2 | 163110536 | rs2111485 | A | G | 0.40 | RA | 5.47e-08 | 2.57e-02 | 2.50e-04 | IFIH1 | intergenic variant | IBD, vitiligo; p: T1D, IgA, PSO; mixed |
| 6 | 138199417 | rs610604 | G | T | 0.34 | RA | 2.51e-07 | 6.59e-06 | 3.56e-05 | TNFAIP3 | intron variant | PSO |
| 19 | 10469975 | rs12720356 | C | A | 0.09 | RA | 9.42e-08 | 8.80e-07 | 1.18e-05 | TYK2 | missense variant | PSO, IBD, CD |
| 1 | 25294345 | rs7536848 | A | C | 0.46 | AS | 3.15e-07 | 8.23e-09 | 1.47e-05 | RUNX3 | upstream gene variant | p: PSO, AS |
| 2 | 62559205 | rs6759003 | T | C | 0.38 | AS | 5.17e-07 | 1.52e-21 | 1.80e-05 | | | p: PSO, AS, CD |
| 1 | 25293941 | rs4648890 | G | A | 0.46 | JIA | 3.15e-07 | 6.61e-05 | 1.42e-04 | RUNX3 | upstream gene variant | p: PSO, Celiac disease, AS |
| 16 | 11354091 | rs413024 | G | A | 0.32 | JIA | 2.24e-06 | 1.88e-05 | 6.00e-04 | SOCS1 | upstream gene variant | PBC; p: PSO |

Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; RA: Rheumatoid Arthritis; T1D: Type 1 Diabetes;
CD: Crohn’s Disease; p: proxy SNP to the reported SNP associated with the trait; JIA: Juvenile Idiopathic Arthritis; IBD: Inflammatory Bowel Disease; PSO: Psoriasis; AS: Ankylosing Spondylitis;
g: gene associated with the trait; PBC: Primary Biliary Cirrhosis/Cholangitis; mixed: mixed population (Europeans and Asians)
PsA|RA cut-off: 6.13e-04; PsA|AS cut-off: 4.54e-04; PsA|JIA cut-off: 6.23e-04
The Associated Traits for the reported SNP have been detected using PhenoScanner with parameters; catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0
Pleiotropic loci identified with conjunctional cFDR (ccFDR) 3.5.2.2
To identify pleiotropic variants within each genetically correlated pair, a conjunctional
FDR analysis was performed. The pair PsA and JIA shares a pleiotropic SNP, rs413024
(SOCS1), on chromosome 16 with ccFDR<2.40e-03. Moreover, a pleiotropic SNP,
rs16903065 (RP11-89M16.1), was detected that is associated with both JIA and RA with
ccFDR<1.29e-03. For both variants the direction of the effect (z-scores) was the same
for the two diseases in the pair.
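The conjunctional step can be sketched in a few lines, assuming the usual definition in which the conjunctional cFDR of a SNP is the larger of its two conditional FDR values (trait A given B, and trait B given A). The input values, z-scores and function names below are hypothetical and serve only to show the logic, including the check that the effect has the same direction in both diseases.

```python
def conjunctional_cfdr(cfdr_a_given_b, cfdr_b_given_a):
    """ccFDR per SNP: the maximum of the two conditional FDR values."""
    return [max(x, y) for x, y in zip(cfdr_a_given_b, cfdr_b_given_a)]


def same_direction(z_a, z_b):
    """True when both z-scores are non-zero and share the same sign."""
    return z_a * z_b > 0


# Hypothetical values for three SNPs (not thesis data).
cfdr_a_given_b = [6.0e-04, 2.0e-02, 1.0e-05]
cfdr_b_given_a = [2.4e-03, 1.0e-04, 3.0e-01]
z_a = [-4.7, 3.1, 5.2]
z_b = [-4.3, -3.9, 0.8]

CCFDR_THRESHOLD = 0.01  # illustrative threshold, an assumption

ccfdr = conjunctional_cfdr(cfdr_a_given_b, cfdr_b_given_a)
pleiotropic = [
    i
    for i, value in enumerate(ccfdr)
    if value < CCFDR_THRESHOLD and same_direction(z_a[i], z_b[i])
]
```

Taking the maximum is conservative: a SNP is reported as pleiotropic only if it passes the threshold in both conditioning directions.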
MTAG 3.5.3
MTAG was applied to summary statistics from the five single-trait analyses described
above.
Table 32 presents the gain in average power for each trait using MTAG. The resulting
GWAS-equivalent sizes were 11686 for PsA, 22311 for JIA, 67303 for RA, 13200 for
SLE and 9883 for AS, yielding gains equivalent to increasing the original samples by
54%, 235%, 16%, 20% and 3%, respectively. The use of MTAG resulted in an impressive
gain in power for JIA, whereas the power increase for detecting novel loci for AS was
minimal and reflects the lack of genetic correlation with the other traits.
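The percentage gains can be checked directly from the original and GWAS-equivalent sample sizes listed in Table 32. The short calculation below assumes that the gain is defined as the relative increase of the GWAS-equivalent sample size over the original one; under that assumption it reproduces the reported figures to within about one percentage point.

```python
# Sample sizes taken from Table 32.
original_n = {"PsA": 7572, "JIA": 6653, "RA": 58284, "SLE": 10995, "AS": 9609}
gwas_equivalent_n = {
    "PsA": 11686,
    "JIA": 22311,
    "RA": 67303,
    "SLE": 13200,
    "AS": 9883,
}

# Relative gain in percent: 100 * (N_equivalent / N_original - 1).
gain_pct = {
    trait: 100.0 * (gwas_equivalent_n[trait] / original_n[trait] - 1.0)
    for trait in original_n
}

for trait, gain in gain_pct.items():
    print(f"{trait}: {gain:.1f}%")
```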
Manhattan plots were used to represent the p-values of disease variants (in both the
original GWAS and the MTAG analysis) on a genomic scale. The p-values are presented
in genomic order by chromosome and position on the chromosome (x-axis). The y-axis
represents the $-\log_{10}$ of the p-value (equivalent to the number of zeros after the
decimal point plus one).
In this chapter, only the MTAG results for PsA are presented and the findings for the
remaining musculoskeletal diseases are presented in the Appendix.
Table 32 | Power gain when using MTAG approach

| Trait | Sample size | SNPs used | GWAS equivalent (max.) sample size |
|---|---|---|---|
| PsA | 7,572 | 2,167,678 | 11,686 |
| JIA | 6,653 | 2,167,678 | 22,311 |
| SLE | 10,995 | 2,167,678 | 13,200 |
| RA | 58,284 | 2,167,678 | 67,303 |
| AS | 9,609 | 2,167,678 | 9,883 |

SNP: Single Nucleotide Polymorphism; GWAS: Genome-Wide Association Studies; max.: maximum; PsA: Psoriatic Arthritis; JIA: Juvenile Idiopathic Arthritis; SLE: Systemic Lupus Erythematosus;
RA: Rheumatoid Arthritis; AS: Ankylosing spondylitis
PsA loci identified with MTAG 3.5.3.1
SNPs reaching significance with MTAG that had at least a marginal association in the original
PsA summary statistics (p-value ≤ 0.05) are presented in Table 33, with the novel loci shown in
bright purple, and Table 34 presents the SNPs with original p-value>0.05. The
Manhattan plot (Figure 18) presents the PsA-associated variants before and after
applying MTAG.
Sixteen loci passed the significance threshold of 5e-08, of which 11 were novel for
PSO/PsA. Two of the novel SNPs (rs2135755 and rs12990970) were intergenic
variants associated with celiac disease and RA, and were found to be protective
of PsA (OR 0.91 with p=7e-12 and OR 0.92 with p=2.01e-09, respectively).
Furthermore, four new signals were found to contribute to the susceptibility of PsA
including AC006460.2 (rs744600, OR 1.08, p = 1.18e-08), RP4-590F24.1 (rs12563513,
OR 1.16, p = 2.63e-10), ITPR3 (rs2296330, OR 1.09, p = 4.61e-08) and PTPN2
(rs2542151, OR 1.12, p = 6.16e-09). The remaining five associations were found to be
protective of PsA: IL12RB2 (rs6693065, OR 0.92, p = 4.61e-08), ANKRD55 (rs6859219,
OR 0.91, p = 9.48e-09), IRF5 (rs3807306, OR 0.90, p = 6.99e-15), RP11-279F6.3
(rs12899564, OR 0.85, p = 6.10e-10) and ICAM3 (rs2278442, OR 0.90, p = 1.40e-13).
Finally, the strongest evidence of association was with the PTPN22 locus (rs6679677,
OR 1.45, p = 2.11e-57), which has previously been reported to be associated with
PsA. In addition, STAT4 (rs11889341, OR 1.16, p = 2.17e-20), SOCS1 (rs243325, OR
0.92, p = 5.91e-09), IFIH1 (rs2111485, OR 0.92, p = 9.96e-10) and YDJC (rs11089637,
OR 1.14, p = 6.93e-13) have been previously associated with PSO and/or PsA.
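The relationship between an odds ratio, its 95% confidence interval and the p-value can be sketched as a consistency check. The calculation assumes a Wald test on the log odds ratio with a symmetric confidence interval on the log scale; because the OR and CI values in Table 33 are rounded to two decimals, the recovered p-values only approximate the reported ones. The function name is illustrative.

```python
import math


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the z-score and two-sided p-value from an OR and its 95% CI.

    Assumes the CI is exp(log(OR) +/- z_crit * SE).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability; erfc stays accurate for large |z|.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Rounded values from Table 33.
z_ptpn22, p_ptpn22 = wald_from_or_ci(1.45, 1.38, 1.51)  # rs6679677 (PTPN22)
z_inter, p_inter = wald_from_or_ci(0.91, 0.89, 0.94)    # rs2135755 (intergenic)
```

A positive z-score corresponds to a risk allele (OR above 1) and a negative z-score to a protective allele (OR below 1).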
Figure 18 | Manhattan plot of association results for PsA. Each circle presents the $-\log_{10}(p)$ of a variant. The thresholds of suggestive (p-value = 1e-06) and genome-wide significance (p-value = 5e-08) are delineated with blue and red lines, respectively. The plot includes SNPs that were significant in GWAS and MTAG.
Table 33 | MTAG results for PsA (presented for original PsA p-value≤0.05)

| Chr | Position | rsid | Effect allele | Other allele | MAF | PsA p-value | MTAG p-value | MTAG OR | MTAG 95% CI | Gene | Consequence | Associated Trait |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 25302495 | rs2135755 | G | A | 0.46 | 2.30e-07 | 7.00e-12 | 0.91 | 0.89-0.94 | | intergenic variant | p: Celiac, IgAD |
| 1 | 114303808 | rs6679677 | A | C | 0.09 | 3.11e-04 | 2.11e-57 | 1.45 | 1.38-1.51 | PTPN22 | upstream gene variant | CD, RA, T1D, JIA; p: PsA |
| 1 | 67800018 | rs6693065 | G | A | 0.24 | 8.84e-04 | 4.61e-08 | 0.92 | 0.89-0.95 | IL12RB2 | intron variant | |
| 2 | 191943742 | rs11889341 | T | C | 0.23 | 7.63e-03 | 2.17e-20 | 1.16 | 1.12-1.19 | STAT4 | intron variant | RA; g: PSO; p: SLE, MS, PsA |
| 2 | 204700689 | rs12990970 | T | C | 0.45 | 3.52e-02 | 2.01e-09 | 0.92 | 0.90-0.95 | | intergenic variant | RA |
| 2 | 163110536 | rs2111485 | A | G | 0.40 | 5.47e-08 | 9.96e-10 | 0.92 | 0.90-0.94 | IFIH1 | intergenic variant | IBD; p: PSO, T1D, IgAD |
| 2 | 191564757 | rs744600 | G | T | 0.39 | 1.63e-02 | 1.18e-08 | 1.08 | 1.05-1.11 | AC006460.2 | intron & non coding transcript variant | Height |
| 5 | 55438580 | rs6859219 | A | C | 0.22 | 5.68e-02 | 9.48e-09 | 0.91 | 0.88-0.94 | ANKRD55 | intron variant | RA; p: JIA, CD |
| 7 | 128580680 | rs3807306 | G | T | 0.49 | 1.94e-03 | 6.99e-15 | 0.90 | 0.88-0.93 | IRF5 | intron variant | RA, PBC, MI |
| 15 | 69985284 | rs12899564 | G | C | 0.07 | 1.10e-03 | 6.10e-10 | 0.85 | 0.81-0.90 | RP11-279F6.3 | intron & non coding transcript variant | RA |
| 16 | 11354497 | rs243325 | C | T | 0.34 | 7.00e-06 | 5.91e-09 | 0.92 | 0.90-0.95 | SOCS1 | upstream gene variant | CD, PBC; p: PSO |
| 19 | 10444826 | rs2278442 | G | A | 0.34 | 1.64e-06 | 1.40e-13 | 0.90 | 0.88-0.93 | ICAM3 | intron variant | RA |
| 22 | 21979096 | rs11089637 | C | T | 0.17 | 7.56e-05 | 6.93e-13 | 1.14 | 1.10-1.18 | YDJC | downstream gene variant | RA, HDL, IBD; p: PSO |

Chr: Chromosome; MAF: Minor Allele Frequency; MTAG: Multi-Trait Analysis of GWAS; OR: Odds Ratio; CI: Confidence Interval; IgAD: Immunoglobulin A Deficiency;
p: proxy SNP; CD: Crohn’s Disease; RA: Rheumatoid Arthritis; T1D: Type 1 Diabetes; JIA: Juvenile Idiopathic Arthritis; IBD: Inflammatory Bowel Disease; PSO: Psoriasis;
PBC: Primary Biliary Cirrhosis/Cholangitis; MS: Multiple Sclerosis
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
The novel loci are presented with bright purple.
Table 34 | MTAG results for PsA (original PsA p-value>0.05)

| Chr | Position | rsid | Effect allele | Other allele | MAF | PsA p-value | MTAG p-value | MTAG OR | MTAG 95% CI | Gene | Consequence | Associated Trait |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 114547798 | rs12563513 | A | G | 0.09 | 6.66e-01 | 2.63e-10 | 1.16 | 1.11-1.21 | RP4-590F24.1 | upstream gene variant | RA |
| 6 | 33650621 | rs2296330 | A | G | 0.24 | 6.89e-01 | 4.61e-08 | 1.09 | 1.06-1.12 | ITPR3 | intron variant | RA, Height |
| 18 | 12779947 | rs2542151 | G | T | 0.14 | 1.10e-01 | 6.16e-09 | 1.12 | 1.08-1.16 | PTPN2 | upstream gene variant | CD, IBD, RA, T1D, IgAD; p: JIA |

Chr: Chromosome; MAF: Minor Allele Frequency; MTAG: Multi-Trait Analysis of GWAS; OR: Odds Ratio; CI: Confidence Interval; RA: Rheumatoid Arthritis; CD: Crohn’s Disease;
IBD: Inflammatory Bowel Disease; T1D: Type 1 Diabetes; IgAD: Immunoglobulin A Deficiency; p: proxy SNP; JIA: Juvenile Idiopathic Arthritis
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Subset-based analysis (ASSET) 3.5.4
The SBM detects susceptibility loci together with the cluster of traits that share each
association signal, and was used to investigate any novel SNPs
contributing to the susceptibility of PsA. Table 35 shows the loci per disease identified
by the method and the subset of traits for each association signal with the same
direction. Three independent SNPs remained for each of PsA and AS, 16 for JIA, 6 for RA and
29 for SLE. The three associated genes for PsA were IFIH1 (rs1990760, OR 0.85, psubset
= 5.6e-10), ICAM3 (rs2278442, OR 0.90, psubset = 5.3e-15) and CCDC116 (rs5754467,
OR 1.26, psubset = 3.1e-18). The latter SNP has been found to be associated with SLE
and CD and the gene with PSO. In addition, a rare allele in IFIH1 (rs35667974) has
been previously reported to be protective of PsA (Budu-Aggrey, Bowes et al. 2017);
however, it is independent from the SNP found in the current analysis. Both SBM and
MTAG identified ICAM3 (rs2278442) and IFIH1 (rs2111485 and rs1990760 are in LD) to
be associated with PsA. The larger number of associations identified by MTAG could
be due to its use of heritability estimates based on a genome-wide set of
SNPs, making MTAG more powerful when the diseases under study share a strong
genetic correlation.
Figure 19 and Figure 20 depict the frequency of the appearance of disease clusters for
only the novel loci and the total identified association SNPs, respectively. The most
frequent subset of diseases sharing novel loci was the RA-SLE (n=12), followed by the
JIA-RA-SLE (n=6). For rs2278442 (ICAM3), PsA, JIA, RA and SLE had the same
association signal direction (Figure 19). In addition, SBM identified a significant
association between rs16903065 and both JIA and RA. This SNP was also identified by
ccFDR to be pleiotropic for JIA and RA (section 3.5.2.2).
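The subset search at the core of the SBM can be sketched for a single SNP. This is a deliberately simplified illustration and not the ASSET implementation: it combines per-trait z-scores with sample-size weights, ignores the correction for overlapping samples, and omits the multiple-testing adjustment for the subset search. The sample sizes are taken from Table 32, whereas the z-scores and the function name are hypothetical.

```python
import math
from itertools import combinations


def best_subset(z_scores, sample_sizes):
    """Enumerate all non-empty subsets of traits and return the subset whose
    sample-size-weighted meta-analysis z-statistic is largest in magnitude."""
    traits = list(z_scores)
    best_traits, best_z = None, 0.0
    for size in range(1, len(traits) + 1):
        for subset in combinations(traits, size):
            numerator = sum(
                math.sqrt(sample_sizes[t]) * z_scores[t] for t in subset
            )
            denominator = math.sqrt(sum(sample_sizes[t] for t in subset))
            z = numerator / denominator
            if abs(z) > abs(best_z):
                best_traits, best_z = subset, z
    return best_traits, best_z


# Sample sizes from Table 32; z-scores are hypothetical.
sample_sizes = {"PsA": 7572, "JIA": 6653, "RA": 58284, "SLE": 10995, "AS": 9609}
z_scores = {"PsA": -4.8, "JIA": -2.6, "RA": -3.1, "SLE": -5.4, "AS": 0.4}

subset, z_meta = best_subset(z_scores, sample_sizes)
```

With five traits there are 31 non-empty subsets, so exhaustive enumeration is cheap; a trait with an effect in the opposite direction (AS in this toy example) is dropped from the selected subset because including it would reduce the combined statistic.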
Table 35 | Loci associated with AS, JIA, PsA, RA and SLE after applying the ASSET subset-based approach

| Chr | Position | rsid | Allele A | Allele B | Subset | Subset p-value | Trait | Trait p-value | Gene | Consequence |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 161478810 | rs6671847 | A | G | RA,AS | 4.10e-09 | AS | 3.93e-04 | FCGR2A | intron variant |
| 3 | 169500487 | rs3772190 | A | G | SLE,AS | 1.14e-08 | AS | 1.91e-04 | MYNN | intron variant |
| 12 | 112007756 | rs653178 | C | T | RA,SLE,AS,JIA | 7.49e-12 | AS | 3.96e-03 | ATXN2 | intron variant |
| 1 | 2524915 | rs10752747 | T | G | RA,SLE,JIA | 2.81e-09 | JIA | 0.10 | MMEL1 | intron variant |
| 2 | 100825367 | rs9653442 | C | T | RA,JIA | 1.61e-10 | JIA | 0.02 | LINC01104 | intron variant & non coding transcript variant |
| 2 | 191862398 | rs16833157 | A | G | SLE,JIA | 1.97e-11 | JIA | 4.22e-03 | STAT1 | intron variant |
| 5 | 102608924 | rs2561477 | A | G | RA,SLE,JIA | 1.21e-10 | JIA | 7.45e-03 | C5orf30 | intron variant |
| 5 | 133429471 | rs10077437 | A | G | SLE,JIA | 4.37e-09 | JIA | 0.03 | | intergenic variant |
| 6 | 34640870 | rs13207858 | T | C | SLE,JIA | 3.59e-09 | JIA | 6.893-04 | C6orf106 | intron variant |
| 6 | 138005515 | rs17264332 | G | A | RA,SLE,JIA | 4.37e-24 | JIA | 0.02 | | intergenic variant |
| 8 | 11070721 | rs7000141 | A | G | SLE,JIA | 1.47e-10 | JIA | 0.01 | | intergenic variant |
| 8 | 129540464 | rs16903065 | A | C | RA,SLE,JIA | 3.35e-08 | JIA | 1.50e-03 | RP11-89M16.1 | intron variant & non coding transcript variant |
| 10 | 6098949 | rs706778 | T | C | RA,JIA | 6.38e-11 | JIA | 2.91e-03 | IL2RA | intron variant |
| 10 | 8106502 | rs570613 | C | T | RA,JIA | 4.85e-09 | JIA | 0.10 | GATA3 | intron variant |
| 12 | 112007756 | rs653178 | C | T | RA,SLE,AS,JIA | 7.49e-12 | JIA | 6.89e-04 | ATXN2 | intron variant |
| 13 | 40342557 | rs12875311 | A | G | RA,JIA | 1.60e-09 | JIA | 4.61e-03 | COG6 | intron variant |
| 15 | 38834033 | rs8032939 | C | T | RA,SLE,JIA | 7.83e-13 | JIA | 0.06 | RASGRP1 | intron variant |
| 16 | 86019087 | rs13330176 | A | T | RA,SLE,JIA | 3.78e-09 | JIA | 0.02 | RP11-542M13.2 | downstream gene variant |
| 19 | 10444826 | rs2278442 | G | A | RA,SLE,PsA,JIA | 4.32e-15 | JIA | 8.60e-03 | ICAM3 | intron variant |
| 2 | 163124051 | rs1990760 | C | T | SLE,PsA | 5.58e-10 | PsA | 5.04e-08 | IFIH1 | missense variant |
| 19 | 10444826 | rs2278442 | G | A | RA,SLE,PsA,JIA | 4.32e-15 | PsA | 1.64e-06 | ICAM3 | intron variant |
| 22 | 21985094 | rs5754467 | G | A | SLE,PsA | 3.10e-18 | PsA | 6.14e-05 | CCDC116 | upstream gene variant |
| 1 | 161478810 | rs6671847 | A | G | RA,AS | 4.10e-09 | RA | 1.20e-07 | FCGR2A | intron variant |
| 6 | 36350605 | rs881648 | T | C | RA,SLE | 4.76e-11 | RA | 5.30e-08 | ETV7 | intron variant |
| 8 | 129540464 | rs16903065 | A | C | RA,SLE,JIA | 3.35e-08 | RA | 5.50e-07 | RP11-89M16.1 | intron variant & non coding transcript variant |
| 11 | 128499574 | rs7927748 | T | G | RA,SLE | 8.75e-10 | RA | 5.90e-07 | RP11-744N12.3 | non coding transcript variant |
| 12 | 112007756 | rs653178 | C | T | RA,SLE,AS,JIA | 7.49e-12 | RA | 5.60e-07 | ATXN2 | intron variant |
| 18 | 12779947 | rs2542151 | G | T | RA,JIA | 5.78e-10 | RA | 5.80e-08 | RP11-973H7.1 | upstream gene variant |
| 1 | 2524915 | rs10752747 | T | G | RA,SLE,JIA | 2.81e-09 | SLE | 0.02 | MMEL1 | intron variant |
| 1 | 114547798 | rs12563513 | A | G | RA,SLE | 5.84e-27 | SLE | 3.44e-03 | RP4-590F24.1 | upstream gene variant |
| 2 | 65635688 | rs2661798 | T | A | RA,SLE | 1.33e-12 | SLE | 8.67e-06 | SPRED2 | intron variant |
| 2 | 135046984 | rs4954125 | T | G | SLE | 1.14e-10 | SLE | 8.80e-08 | MGAT5 | intron variant |
| 2 | 163124051 | rs1990760 | C | T | SLE,PsA | 5.58e-10 | SLE | 2.48e-06 | IFIH1 | missense variant |
| 2 | 204610396 | rs1980422 | C | T | RA,SLE | 9.79e-11 | SLE | 0.01 | | intergenic variant |
| 2 | 204738919 | rs3087243 | A | G | RA,SLE | 2.00e-19 | SLE | 8.97e-03 | CTLA4 | downstream gene variant |
| 3 | 169500487 | rs3772190 | A | G | SLE,AS | 1.14e-08 | SLE | 1.67e-05 | MYNN | intron variant |
| 4 | 6607460 | rs7672421 | T | C | SLE | 3.25e-09 | SLE | 8.12e-05 | MAN2B2 | intron variant |
| 4 | 56971271 | rs10030686 | A | G | SLE | 1.24e-08 | SLE | 5.49e-04 | | intergenic variant |
| 4 | 181310395 | rs4293824 | T | G | SLE | 1.12e-09 | SLE | 5.77e-04 | | intergenic variant |
| 5 | 102608924 | rs2561477 | A | G | RA,SLE,JIA | 1.21e-10 | SLE | 7.15e-03 | C5orf30 | intron variant |
| 6 | 34640870 | rs13207858 | T | C | SLE,JIA | 3.59e-09 | SLE | 1.12e-07 | C6orf106 | intron variant |
| 6 | 36350605 | rs881648 | T | C | RA,SLE | 4.76e-11 | SLE | 3.54e-06 | ETV7 | intron variant |
| 6 | 138005515 | rs17264332 | G | A | RA,SLE,JIA | 4.37e-24 | SLE | 1.17e-05 | | intergenic variant |
| 6 | 159514778 | rs654690 | T | C | RA,SLE | 8.29e-11 | SLE | 0.02 | | intergenic variant |
| 6 | 167540842 | rs1571878 | C | T | RA,SLE | 1.13e-14 | SLE | 0.02 | CCR6 | intron variant |
| 8 | 11070721 | rs7000141 | A | G | SLE,JIA | 1.47e-10 | SLE | 6.50e-08 | | intergenic variant |
| 8 | 129540464 | rs16903065 | A | C | RA,SLE,JIA | 3.35e-08 | SLE | 0.002301 | RP11-89M16.1 | intron variant & non coding transcript variant |
| 10 | 8480044 | rs10905371 | G | A | SLE | 4.24e-10 | SLE | 2.06e-07 | RP11-543F8.2 | intron variant & non coding transcript variant |
| 10 | 63800004 | rs12764378 | A | G | RA,SLE | 3.48e-14 | SLE | 0.005879 | ARID5B | intron variant |
| 11 | 118741842 | rs4938573 | C | T | RA,SLE | 1.44e-15 | SLE | 4.27e-04 | | intergenic variant |
| 11 | 128499574 | rs7927748 | T | G | RA,SLE | 8.75e-10 | SLE | 1.39e-06 | RP11-744N12.3 | non coding transcript variant |
| 12 | 112007756 | rs653178 | C | T | RA,SLE,AS,JIA | 7.49e-12 | SLE | 1.2e-07 | ATXN2 | intron variant |
| 15 | 38834033 | rs8032939 | C | T | RA,SLE,JIA | 7.83e-13 | SLE | 0.001027 | RASGRP1 | intron variant |
| 16 | 86019087 | rs13330176 | A | T | RA,SLE,JIA | 3.78e-09 | SLE | 0.007345 | RP11-542M13.2 | downstream gene variant |
| 17 | 38066267 | rs1008723 | G | T | RA,SLE | 2.61e-11 | SLE | 8.29e-07 | GSDMB | intron variant |
| 19 | 10444826 | rs2278442 | G | A | RA,SLE,PsA,JIA | 4.32e-15 | SLE | 6.52e-08 | ICAM3 | intron variant |
| 22 | 39740078 | rs137687 | A | G | RA,SLE | 6.08e-12 | SLE | 0.004826 | | intergenic variant |

RA: Rheumatoid Arthritis; AS: Ankylosing Spondylitis; SLE: Systemic Lupus Erythematosus; PsA: Psoriatic Arthritis; JIA: Juvenile Idiopathic Arthritis
Figure 19 | Novel loci identified by ASSET subset-based analysis by frequency of disease clusters.
Figure 20 | All loci identified by ASSET subset-based approach by frequency of disease clusters.
Discussion 3.6
The aim of this study was to identify novel PsA-specific loci using a new category of
statistical methods that exploit the pleiotropy among musculoskeletal diseases. These methods
require only GWAS summary statistic data and can account for any potential overlap
between samples, thereby maximising the statistical power of the analysis. They
helped to identify novel loci associated not only with PsA but also with the other
musculoskeletal diseases, including RA, SLE, AS and JIA. In addition, they provided
further information about the common biological mechanisms underlying these
diseases, assisting with the identification of common and/or discrete therapeutic
targets. In this section, discussion is limited to the loci identified to be associated with
PsA as this disease was the basis of my PhD.
A Bayesian approach called cFDR was the first method applied to identify potential novel
loci in the pairs of musculoskeletal diseases that showed a statistically significant
genetic overlap by LD Score regression. Thus, RA, JIA and AS were used as
conditional phenotypes to boost the statistical power to search for loci
contributing to PsA susceptibility. Eight new SNPs were identified, mapping to genes
already known to be related to either PsA or PSO such as PTPN22, IFIH1, TNFAIP3,
TYK2, RUNX3 and SOCS1, that would not have been identified from the PsA data alone. In
addition, a pleiotropic SNP (rs413024) in SOCS1 was shared between PsA and JIA. SOCS1
is a cytokine signalling inhibitor gene that regulates IFN signal transduction. A
previous study reported changes in SOCS1 levels in systemic JIA monocytes, providing
evidence of inhibition of IFN signalling in these cells (Macaubas, Wong et al. 2016).
The application of a meta-analysis method called MTAG led to the identification of 16 SNPs,
11 of them novel, with a gain in power equivalent to a 54% increase in the PsA sample size. Among
these newly associated loci was ITPR3 (rs2296330), which has been shown to participate
in induction of apoptosis in T cells and other types of cells (Blackshaw, Sawa et al.
2000) and in susceptibility to autoimmune diseases such as SLE (Oishi, Iida et al. 2008)
and T1D (Roach, Deutsch et al. 2006). Moreover, PTPN2 (rs2542151) was found to be
associated with an increased risk of developing PsA. There is evidence that rs2542151
is associated with a higher risk of developing joint erosions in patients with RA
(Ciccacci, Conigliaro et al. 2016) and increases the risk of developing CD and UC
(Glas, Wagner et al. 2012). In JIA, rs2542151 is involved in the epistasis (gene-gene
interaction) amongst PTPN2 and vitamin D genes contributing to risk of JIA (Ellis,
Scurrah et al. 2015).
By contrast, IL12RB2 (rs6693065) was found to be protective for PsA which is of
interest as IL12RB2 is involved in IL12 signalling, is upregulated by gamma interferon in
Th1 cells and plays a role in Th1 differentiation (Chang, Shevach et al. 1999). In
addition, in animal models the lack of Il12rb2 signalling contributes to the
predisposition to autoimmunity (Airoldi, Di Carlo et al. 2005). Another SNP protective
of PsA was rs6859219 in ANKRD55, a gene of unknown function. The same locus has
been previously reported to be protective for RA as well (Stahl, Raychaudhuri et al.
2010). Another study tried to shed some light on this gene and its role in MS and
other immune-mediated disease susceptibility. They reported a correlation of
rs6859219 with expression of ANKRD55 in CD4+ cells and higher expression of the
gene in the risk allele carriers indicating the gene’s role in the pro-inflammatory state
(Lopez de Lapuente, Feliu et al. 2016; Lopez de Lapuente, Feliu et al. 2016). IRF5 has
been associated with the pathogenesis of SLE and RA and its function includes the
induction of type 1 interferons and pro-inflammatory cytokines (Stahl, Raychaudhuri et
al. 2010; Cham, Ko et al. 2012). In addition, IRF5 is involved in the generation of
effective Th1 and Th17 T cell responses (Krausgruber, Blazek et al. 2011). The
protective role of IRF5 for PsA in this study needs further investigation. Finally, the
protein encoded by ICAM3 is over-expressed in all leucocytes and plays an important
role in the initiation of the immune-response. There has been evidence of the presence
of ICAM3-positive naïve T cells in the synovium of RA patients, leading to the
hypothesis that it contributes to the onset of RA (van Lent, Figdor et al. 2003). The
rs2278442 in ICAM3 was also identified using SBM to be protective of PsA with SLE,
RA and JIA having the same direction of association. SBM also identified two additional
SNPs to be associated with PsA: rs1990760 (IFIH1) and rs5754467 (CCDC116).
Thus, 21 novel SNPs were identified for PsA/PSO, which would not have been
detected without leveraging power from other traits. The
impressive power increase in identifying novel loci in JIA with the MTAG method, which led
to 42 SNPs including IL12RB2 and ICAM3 variants (Appendix), should also be noted.
This study exploits the phenomenon of pleiotropy, which has been widely researched
in genetic epidemiology owing to the public availability of summary statistic data from
consortia investigating common diseases. Solovieff et al. described
three categories of pleiotropy: a) biological pleiotropy, where causal variants for
different traits tag the same gene; b) mediated pleiotropy, where a variant affects a trait
and this trait in turn affects another trait; and c) spurious pleiotropy, whereby causal
variants for two traits fall into different loci but are in LD with a SNP associated with
both traits (Solovieff, Cotsapas et al. 2013). An evaluation of the NHGRI GWAS catalogue
indicated that 4.6% of the variants were associated with two or more traits, a
number that has probably increased over the years. The genetic overlap among
immune-mediated diseases, a broader category including musculoskeletal diseases, was
first indicated by observational studies which reported the co-occurrence of these diseases
in the same individual or among family members. Later, it was confirmed by
cross-phenotype studies utilising one of the methods described in the
current chapter or other relevant methods. For example, Cotsapas et al. assessed the
existence of a common genetic basis among celiac disease, CD, MS, PSO, RA, SLE and
T1D using a cross-phenotype meta-analysis (CPMA) method that examines whether a
variant is associated with two or more diseases. They found that 44% of the examined
variants were associated with multiple diseases (Cotsapas, Voight et al. 2011).
Furthermore, Ellinghaus et al. conducted a cross-trait meta-analysis study using the
SBM described in this chapter to investigate the common pathogenesis among AS, CD,
PSO, PSC and UC. They were able to identify 244 multi-disease signals and 27 novel
susceptibility loci (Ellinghaus, Jostins et al. 2016).
The current cross-trait study used four methods that leverage the power of GWAS
and the phenomenon of pleiotropy to identify novel PsA loci and to construct
biological hypotheses about the underlying mechanisms of pathogenesis. All four
methods require only GWAS summary statistics and can adjust for sample
overlap. The first method was a univariate, genome-wide method called LD Score
regression that performs genetic correlation analysis. An online database and tool,
LD Hub, has also been created to accumulate the summary statistics
data from various GWAS and to systematically perform the correlation analysis across
the traits. The authors suggest that the use of data from targeted genotyping arrays
such as Immunochip can affect the performance of the method. The data analysed in the current
study contained a small number of Immunochip samples but this should not distort the
findings. In addition, it is suggested that the LD score regression should be applied only
in datasets with over 5000 samples to avoid noisy results. Finally, the most essential
consideration when interpreting the genetic correlation among traits is that the genetic
correlation is not the same as pleiotropy. For example, zero correlation does not
mean that the two traits do not share any risk loci, as the shared loci may lack a
consistent direction of effect. Recently, a new method for the identification
of regions that contribute to the genetic correlation of two traits has been proposed
named ρ-HESS (Shi, Mancuso et al. 2017). This method hypothesizes that at a region-
level two traits may present significant genetic covariance even if the genome-wide
genetic correlation is not significant. This method could be used to verify the LD score
regression findings in this study, investigate whether the pairs that did not present a
genome-wide correlation present any local correlation and find the specific regions
with strong correlation that could serve as putative causal models between the
diseases.
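The earlier caveat that zero genetic correlation does not exclude shared risk loci can be illustrated with a toy calculation. The per-locus effect sizes below are hypothetical: every locus affects both traits, but the directions of effect agree at half of the loci and disagree at the other half, so the correlation of effects across loci is zero.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical effect sizes at four loci shared by two traits.
beta_trait_1 = [0.2, 0.2, -0.2, -0.2]
beta_trait_2 = [0.2, -0.2, 0.2, -0.2]

effect_correlation = pearson(beta_trait_1, beta_trait_2)
n_shared_loci = sum(
    1 for b1, b2 in zip(beta_trait_1, beta_trait_2) if b1 != 0 and b2 != 0
)
```

Here all four loci are pleiotropic although the correlation of their effects is zero, which is why a local or locus-level analysis can reveal sharing that a genome-wide correlation misses.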
Exploiting any pleiotropic effects among the genetically correlated diseases can
improve the statistical power to detect risk-associated loci. This is the basis of the
three methods used in the current study, which also account for the existence of
overlapping samples among the datasets. The Bayesian cFDR analysis detects variants
associated with the principal trait given that the p-values of association with both the
principal and the conditional traits are below specified thresholds. In addition, the
detection of SNPs is not weakened when there is no extensive pleiotropy between the
two traits. MTAG and SBM are extensions of the classical meta-analysis in which effect
sizes or p-values are combined across numerous studies of the same trait, with effect
sizes assumed to be either consistent (fixed effects meta-analysis) or varied (random
effects meta-analysis). SBM evaluates all possible subsets of traits in order to identify
the one with the maximum z-statistic at a SNP. The method improves the
interpretation of the findings by presenting the cluster of traits that show the same
effect direction. Another advantage is its flexibility to use a restricted subset of SNPs for
the search, based on previous knowledge of potential grouping of traits. Its major caveat is
the loss of power compared to the standard meta-analysis method when a large
proportion of the studies contain association signals with the same effect direction.
This loss increases with the number of studies involved in the analysis and the multiple
testing penalty. The authors recommend the use of both standard and subset-based
meta-analysis to account for the loss of power. On the other hand, the MTAG method can
incorporate the effect estimates from all the traits included in the analysis and present
adjusted estimates per trait. The use of MTAG is more fruitful when the traits involved
present a high genetic correlation. Caution should be taken when applying the method
in underpowered GWAS as the FDR can become substantial. FDR calculation should
be performed, a process that is not included in the existing pipeline provided by the
authors. It should be noted that the current study lacks these FDR estimations, a step
that could be addressed in future work. Finally, an extensive description of the
methods used currently to detect pleiotropy can be found in the review by Hackinger
and Zeggini (Hackinger and Zeggini 2017).
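For reference, the classical fixed effects meta-analysis that MTAG and SBM extend can be sketched with inverse-variance weighting. The effect sizes and standard errors below are hypothetical log odds ratios from two studies of the same trait, and the function name is illustrative.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance weighted fixed effects meta-analysis.

    Returns the pooled effect, its standard error and the z-statistic.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical log odds ratios and standard errors from two studies.
betas = [0.10, 0.14]
standard_errors = [0.04, 0.05]

pooled_beta, pooled_se, pooled_z = fixed_effects_meta(betas, standard_errors)
```

The pooled standard error is smaller than that of either study, which is the source of the power gain; the multi-trait methods discussed above generalise this idea to different but genetically correlated traits.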
The current chapter presents an exploration of statistical methods that can be used to
identify genetic variants that increase the risk of PsA, and it has several merits. The
strength of the study lies in the use of publicly available summary data, most of which
have been used in other studies, thus confirming their quality. It is the first time this
cluster of musculoskeletal diseases has been systematically analysed to exploit the
power of pleiotropic effects, to identify association signals for each disease and to
assess their common genetic basis.
Limitations should also be considered. The summary statistics for PsA were from a
GWAS comparing patients with PsA to healthy controls. Because these patients also
have PSO, it was not possible to determine whether the PsA associations observed at
the novel loci were due to PsA or to the presence of PSO. Another limitation was the
absence of alleles in the AS GWAS summary data, which led to the use of the alleles
from 1000G for each SNP. In addition, the absence of strand information from most of
the studies made it difficult to verify findings for markers with complementary alleles;
this issue was addressed by harmonizing the datasets with 1000G. Moreover, the MHC
region, which has shown evidence of pleiotropic effects, was excluded from the analysis
because of the extensive LD present in the region. Thus, only the shared genetic basis
and the existence of novel susceptibility loci outside this region were assessed. Finally,
this study focused on SNPs, but gene-based pleiotropy is also of interest (Wagner and
Zhang 2011).
The current study identified numerous SNPs associated with PsA and the other
musculoskeletal diseases. The results need to be replicated in well-powered,
independent studies to verify their validity as credible associations. GWAS results
require further extensive interrogation before meaningful conclusions can be drawn.
Usually, the next step is to perform dense genotyping to investigate the association of
all variants in LD with the lead variant in order to gain some insight into the SNPs’
causality. Functional experiments should then be conducted to directly assess the
mechanism by which the putative lead SNP affects gene expression.
Chapter 4 Mendelian Randomization
Introduction 4.1
The aim of epidemiology is to determine the causes of a disease, with many studies
trying to identify the environmental and lifestyle determinants that could modify the
risk of a disease. However, observational studies, as described in section 1.2.2, suffer
from confounding bias and reverse causation⁵. Thus, causal inference cannot be
drawn from the association between an exposure and a disease unless all the
potential confounders of the association have been recognized, correctly measured
and adjusted for. To overcome these limitations, genetic epidemiology can provide
unconfounded surrogates for exposures, which can be analysed by
Mendelian Randomization (MR) (Smith and Ebrahim 2004).
Genetic variants that explain variation in the exposure and are not associated with the
outcome (except through the exposure) can be used as proxies to estimate a causal
effect, as they can be thought of as biological exposures being present from
conception. This is the basic premise of MR in which such genetic variants are termed
instrumental variables (IVs) (Greenland 2000). MR can be thought of as analogous to a randomized
clinical trial, in which individuals have been randomly assigned to receive a different
level of exposure, depending on whether they carry an allele associated with the
exposure or not.
5 Reverse causation refers to the situation when the outcome or disease precedes and causes the
exposure instead of the other way around.
General Overview of MR 4.1.1
Instrumental Variables 4.1.1.1
Initially, MR was performed using a small number of genetic variants which explained a
small proportion of the exposure’s variation limiting the power to investigate the
causal role of the exposure on the outcome (Smith and Ebrahim 2004). The
proliferation of GWAS led to the use of a large number of genetic variants which
meant increasing power for inferring causality when the variants explained a larger
proportion of the exposure’s variance. In addition, the availability of summary
association statistics by large consortia allows the interrogation of many causal
hypotheses without the administrative burden needed for the individual-level data
analyses (Burgess, Butterworth et al. 2013).
Three assumptions must be satisfied for a genetic variant to be a valid IV (Greenland 2000):
1. The genetic variant is associated with the exposure.
2. The genetic variant is not associated with any of the confounders of the exposure-outcome relationship.
3. The genetic variant is only associated with the outcome through the exposure.
These assumptions suggest that there is only one causal pathway from the genetic
variant to the outcome and that is via the exposure. The first assumption can easily
be tested; however, the other two assumptions are unlikely to hold, especially when
using summary data. The use of many instruments increases the risk of including at
least one invalid IV which could bias the result.
More specifically,
Assumption 1: Typically, SNPs that have been found to be significantly associated (p-
value<5e-08) with the exposure in GWAS and subsequently have been replicated in
independent studies are used as IVs. However, the inclusion of SNPs that have not
reached the significance level may improve the prediction power as they could be
variants with small effect sizes that could not reach significance due to lack of power.
When the IVs are “weak”, which means they explain little variation of the exposure
under analysis, they can lead to inflated type 1 error rates and can bias the causal
estimates (Burgess, Thompson et al. 2011). At this point the use of all variants
combined into an allelic score as an IV could be efficient by increasing power and
avoiding weak instrument bias (Burgess and Thompson 2013).
Assumption 2: Although it is impossible to prove the validity of this assumption in an MR
analysis, it might be possible to check whether the IVs are associated with known
confounders of the exposure-outcome relationship.
Assumption 3: This is known as the exclusion restriction criterion and it requires the
absence of horizontal pleiotropy for the IV to be valid. Horizontal pleiotropy
occurs when a variant affects multiple outcomes through separate pathways. Although
directly testing this assumption is impossible, methods have been developed
(described later) that provide accurate estimates even when the assumption is
violated.
Design strategies for Mendelian Randomization 4.1.1.2
Single-sample/One sample MR 4.1.1.2.1
This design is the basic implementation of MR in which the SNPs, exposure and
outcome are from individuals in the same sample. In this design, the causal effect is
estimated by using 2-stage least-squares (2SLS) regression. In the first stage of the 2SLS
method, the exposure is regressed on the IVs. In the second stage, the outcome under
study is regressed over the predicted values of the exposure (estimated in the first
stage) using either linear or logistic regression, depending on the nature of the
outcome. Then, the β coefficient or the log 𝑂𝑅 can be interpreted as the change in the
outcome per unit increase in the exposure due to IVs (Haycock, Burgess et al. 2016).
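The two stages described above can be sketched in a few lines of Python. This is only an illustration on simulated data for a single instrument and a continuous outcome; the function names, the allele frequency (0.3) and the true effect (0.5) are assumptions made for the example and are not taken from the thesis.

```python
import random


def ols(x, y):
    """Return (intercept, slope) of a simple least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, x, y):
    """2SLS with one instrument z, exposure x and continuous outcome y."""
    # Stage 1: regress the exposure on the instrument and keep the fitted values.
    intercept, slope = ols(z, x)
    x_hat = [intercept + slope * zi for zi in z]
    # Stage 2: regress the outcome on the genetically predicted exposure.
    return ols(x_hat, y)[1]


def simulate(n=20000, true_effect=0.5, seed=1):
    """Simulate a SNP, a confounded exposure and an outcome (all hypothetical)."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        g = int(rng.random() < 0.3) + int(rng.random() < 0.3)  # allele count 0/1/2
        u = rng.gauss(0, 1)  # unmeasured confounder of exposure and outcome
        xi = 0.5 * g + u + rng.gauss(0, 1)
        yi = true_effect * xi + u + rng.gauss(0, 1)
        z.append(g)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
naive_estimate = ols(x, y)[1]  # confounded observational estimate
iv_estimate = two_stage_least_squares(z, x, y)  # should lie close to true_effect
print(round(naive_estimate, 2), round(iv_estimate, 2))
```

In this simulation the naive regression of outcome on exposure is pulled away from the true effect by the confounder, whereas the 2SLS estimate should recover it up to sampling error. For a binary outcome the second stage would use logistic regression, which is not shown here.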
Two-sample MR 4.1.1.2.2
The two-sample MR is an extension of the 2SLS allowing for greater statistical power
due to the ability to use larger sample sizes. In this setting, variants-exposure and
variants-outcome associations should be estimated in different, non-overlapping
samples. The two-sample approach has gained popularity because GWAS
summary statistics from large consortia are publicly available. The advantages of the
two-sample approach compared with the single-sample approach are i) there is no requirement
to measure exposure and outcome in the same sample, which can be difficult and
expensive, and ii) the weak instrument bias is towards the null, whereas in the one-sample
approach it is towards the confounded observational association (Burgess, Scott et al.
2015).
Bidirectional MR 4.1.1.2.3
In this design, IVs for both the exposure and the outcome are used to assess whether
the exposure causes the outcome and vice versa. More specifically, if exposure causes
the outcome, then the instrument Z_exposure will be associated with both the exposure
and the outcome. However, the instrument Z_outcome will be associated with the outcome
and not with the exposure. The main assumption of this method is that the causal
association occurs in one direction only; it cannot address
complexities in biological systems such as feedback loops between
exposure and outcome (Davey Smith and Hemani 2014).
Two-step MR 4.1.1.2.4
This method is used to assess whether there is mediation in the causal pathway; that
is, whether an intermediary trait is a mediator between exposure and outcome. In the
first step, IVs for the exposure are used to assess the causal effect of the exposure
on the intermediate factor. In the second step, IVs for the intermediate factor are used
to assess its causal effect on the outcome. Association in both steps implies the
existence of mediation between the exposure and the outcome by the intermediary
factor (Haycock, Burgess et al. 2016).
Multivariable MR 4.1.1.2.5
In some situations, the lack of variants that are solely associated with the exposure of
interest leads to the use of pleiotropic variants. Although horizontal pleiotropy leads
to the violation of the third assumption, the development of multivariable MR
overcomes this issue by using IVs associated with multiple exposures to jointly assess
the independent causal role of each exposure on the outcome (Burgess, Dudbridge et
al. 2015; Burgess and Thompson 2015).
Multifactorial MR 4.1.1.2.6
Risk factors usually cluster together to contribute to the increasing burden of a
disease. For example, increased BMI combined with heavy alcohol consumption
significantly increases the risk of liver disease (Hart, Morrison et al. 2010). However, it
is difficult to estimate the effect of confounded exposures without the risk of
confounding bias. For that reason, factorial MR can be used to identify the combined,
unconfounded causal effects of the co-occurrence of two or more exposures for an
outcome (Zheng, Baird et al. 2017).
Pitfalls in MR studies 4.1.1.3
Even if the assumptions hold, there are still limitations in MR studies (Zheng, Baird et
al. 2017):
Weak instrument bias: As described previously, IV estimates can be
biased when many of the genetic variants are only modestly associated with the
exposure. In practice, a “strong” IV explains enough of the difference in
exposure that any difference in the outcome can be attributed to the difference in
exposure. By contrast, a “weak” IV explains little variation in the exposure,
so the difference in the outcome could be due to chance differences in
confounders. The ‘rule of thumb’ to avoid bias is that the F-statistic⁶ should be
at least 10, which means that the bias of the IV estimator is approximately 10% of the bias of the
observational estimator (Burgess, Thompson et al. 2011). In a one-sample
setting, the causal estimates can be biased towards the observational estimate,
whereas in a two-sample setting the bias is towards the null. There are a few
ways to minimize this type of bias, including a) increasing the F-statistic by
increasing the sample size with the use of large GWAS (summary statistics)
datasets and b) adjusting for measured confounders that are not on the
causal pathway between exposure and outcome, which will lead to increased
precision. Although it is better to address any type of bias prior to data
collection in order to ensure large F-statistics, it is possible to conduct
sensitivity analyses to assess any effects on causal estimates (Burgess,
Thompson et al. 2011).
Lack of genetic variants for exposure: Finding genetic variants is not always feasible
even with the proliferation of GWAS. A suggestion would be the use of
polygenic risk scores; however their use could introduce horizontal pleiotropy.
6 The F-statistic indicates the strength of an IV and its formula is $F = \left(\frac{n-k-1}{k}\right)\left(\frac{R^2}{1-R^2}\right)$, where $n$ is the sample
size, $k$ the number of IVs and $R^2$ the proportion of variance in the phenotype explained by the variants.
Population stratification: This phenomenon can induce spurious associations due
to ancestry differences between study subjects. For that reason, MR should
always be conducted using genetic associations from homogenous populations
or from GWAS that have adjusted for population structure.
Low power: Genetic variants usually explain only a small proportion of the total
variance of the exposure which results in the lack of statistical power to detect
a causal effect. The use of larger sample sizes from GWAS consortia could lead
to the increase of power and to more precise estimation of causal effects.
Horizontal pleiotropy: In the case of horizontal pleiotropy, the IV is associated
with the outcome via a pathway that does not pass through the study exposure
(Davey Smith and Hemani 2014). Erroneous causal estimates can be limited by
choosing IVs that act directly on the trait. However, when less
well-characterized IVs are used, there are methods to detect the effect of
pleiotropy on the causal inference, as well as estimation methods that are
robust to pleiotropy because they relax the IV assumptions (Davey Smith and Hemani
2014); these are described later in the chapter. In the context of MR, there is
also vertical pleiotropy, in which the IVs are associated with other risk factors
downstream of the exposure of interest. However, this type of pleiotropy does
not invalidate the assumptions of MR.
Trait heterogeneity: SNPs can be associated with multiple aspects of a single trait.
For example, rs1051730 is associated with the number of cigarettes smoked per
day. However, smoking behavior differs among smokers in aspects such as the
number and depth of inhalations per cigarette and the years of smoking. This
heterogeneity does not invalidate rs1051730 as an IV, but it makes it difficult to
estimate the precise magnitude of the causal effect (Haycock, Burgess et al.
2016).
LD: When a variant is in LD with the IV, confounding bias can be introduced to
the analysis as there will be a pathway from the IV to the outcome other than
the one including the exposure of interest. One solution is the use of the SNP
with the smallest p-value as IV and the removal of the other SNPs that are in
LD with the IV.
Winner’s curse⁷: In the single-sample MR setting, the use of the same sample for
GWAS discovery and MR study leads to upward bias in the estimation of the
SNP-exposure association. A possible sensitivity analysis would be the use
of an unweighted allelic score of several variants as the IV. In the case of two-sample
MR, where the GWAS and MR analysis are independent, the winner’s curse will
bias the MR estimates towards the null.
Collider bias: As mentioned in the previous chapter, conditioning on a variable
that is independently affected by both exposure and outcome may cause
selection bias. Collider bias usually occurs in MR studies where the IVs are
chosen from a GWAS which conditions one phenotype on another, for
example waist circumference on CVD adjusted for BMI.
Binary outcomes: When the genetic associations with the outcome are estimated
in a case-control study, the causal effect estimates may be imprecise (Dai
and Zhang 2015). If the exposure is measured after disease diagnosis, then the
genetic associations could be biased by reverse causation. Moreover, there is
the phenomenon of “noncollapsibility” of ORs, which means that ORs can
estimate the population-averaged causal effect but not the impact on specific
subgroups (Harbord, Didelez et al. 2013). The above considerations affect the
ability of MR to precisely estimate the magnitude of the causal effect.
Methods dealing with the pitfalls of MR 4.1.1.4
Inverse variance weighted (IVW) 4.1.1.4.1
The IVW is the traditional method used for the estimation of the causal effect of the
exposure on the outcome under study, when all IVs are valid. The estimate is
equivalent to the slope of a linear regression of the variant-outcome association
estimates on the variant-exposure association estimates with the intercept term
constrained to zero (Burgess, Butterworth et al. 2013).
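As a rough illustration of the regression-through-the-origin view of IVW, the sketch below computes the estimate from per-SNP summary statistics. The function name and the four sets of SNP associations are hypothetical, and the standard error shown is the simple fixed-effect one, which is an assumption of this example.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Weighted regression of SNP-outcome on SNP-exposure estimates, no intercept."""
    # Each SNP is weighted by the inverse variance of its SNP-outcome estimate.
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)  # fixed-effect standard error
    return estimate, standard_error


# Hypothetical summary statistics in which every SNP implies the same ratio (0.4).
beta_exposure = [0.10, 0.05, 0.08, 0.12]
beta_outcome = [0.040, 0.020, 0.032, 0.048]
se_outcome = [0.010, 0.012, 0.015, 0.011]

estimate, standard_error = ivw_estimate(beta_exposure, beta_outcome, se_outcome)
print(estimate, standard_error)
```

Because the intercept is fixed at zero, any average pleiotropic effect is absorbed into the slope, which is why the estimate is only reliable when all IVs are valid.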
7 A phenomenon first described in auction theory, where the winner is likely to overpay for the item
they are bidding for. In GWAS, it is the systematic overestimation of effects due to chance noise.
MR-Egger 4.1.1.4.2
In contrast to IVW, this method can provide a consistent estimate of the causal effect even if
all SNP-outcome associations are affected by horizontal pleiotropy. Proposed by Bowden
et al., it is based on a technique for identifying publication bias in meta-analysis studies
and performs a weighted regression of the gene-outcome
coefficients on the gene-exposure coefficients, with the weights being the inverse
variances of the gene-outcome associations. In essence, MR-Egger replaces the third
assumption with a weaker one, the InSIDE (Instrument Strength
Independent of Direct Effect) assumption, according to which the
genetic associations with the exposure are independent of the direct (pleiotropic)
effects of the variants on the outcome. The intercept is an estimate of the average
pleiotropic effect across all IVs; if it differs from zero, directional pleiotropy could be
biasing the estimate (Bowden, Davey Smith et al. 2015). MR-Egger can only detect
“directional” pleiotropy (pleiotropic effects with a non-zero mean) and not “balanced”
pleiotropy, where the SNPs present pleiotropic effects that cancel out. Finally, the slope
of the regression (beta coefficient) indicates the causal
effect of the exposure on the outcome adjusted for pleiotropy (Bowden, Del Greco et
al. 2016).
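A minimal sketch of the MR-Egger regression is given below: the same weighted regression as IVW, but with the intercept left free. The function name and the five SNP associations are hypothetical; the data are constructed so that every SNP has a direct effect of 0.02 on the outcome and the causal slope is 0.3. SNPs are first oriented so that the exposure associations are positive, which is the usual convention but is stated here as an assumption of the example.

```python
def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Weighted regression with a free intercept; returns (intercept, slope)."""
    oriented = []
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        # Orient each SNP so that its association with the exposure is positive.
        sign = 1.0 if bx >= 0 else -1.0
        oriented.append((sign * bx, sign * by, 1.0 / se ** 2))
    total_weight = sum(w for _, _, w in oriented)
    mean_x = sum(w * x for x, _, w in oriented) / total_weight
    mean_y = sum(w * y for _, y, w in oriented) / total_weight
    sxx = sum(w * (x - mean_x) ** 2 for x, _, w in oriented)
    sxy = sum(w * (x - mean_x) * (y - mean_y) for x, y, w in oriented)
    slope = sxy / sxx  # causal estimate adjusted for directional pleiotropy
    intercept = mean_y - slope * mean_x  # average pleiotropic effect
    return intercept, slope


# Hypothetical data: outcome association = 0.02 + 0.3 * exposure association.
beta_exposure = [0.10, -0.05, 0.08, 0.12, 0.03]
beta_outcome = [0.050, -0.035, 0.044, 0.056, 0.029]
se_outcome = [0.010, 0.012, 0.015, 0.011, 0.020]

intercept, slope = mr_egger(beta_exposure, beta_outcome, se_outcome)
print(intercept, slope)
```

An intercept close to zero would be compatible with no directional pleiotropy; in practice its standard error and p-value would also be needed, which this sketch does not compute.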
Weighted median 4.1.1.4.3
Although the IVW is an efficient method when all IVs are valid, it will give biased
estimates even if only one IV is invalid. This means that IVW has a 0% breakdown level.
However, there is an estimator with a 50% breakdown level: the median estimator, which
gives consistent estimates when up to half the IVs are not valid. When the precision of
the individual estimates varies, the weighted median can be used, in which the weight
given to the ratio estimate of each IV is proportional to the inverse of its variance, so
that more precise estimates receive more weight (Bowden, Davey Smith et al. 2016).
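The weighted median can be sketched as follows. The per-SNP ratio estimates are ordered, each is placed at the midpoint of its share of the cumulative weight, and the value at the 50% point is found by linear interpolation. The function name and the data are hypothetical, and the first-order inverse-variance weights are an assumption of this example.

```python
def weighted_median(beta_exposure, beta_outcome, se_outcome):
    """Weighted median of the per-SNP ratio estimates."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    # First-order inverse variance of each ratio estimate.
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    total = sum(weights)
    ordered = sorted(zip(ratios, weights))
    # Cumulative weight up to the midpoint of each ordered estimate.
    midpoints = []
    running = 0.0
    for _, w in ordered:
        midpoints.append((running + w / 2.0) / total)
        running += w
    if midpoints[0] >= 0.5:
        return ordered[0][0]
    for k in range(1, len(ordered)):
        if midpoints[k] >= 0.5:
            lower, upper = ordered[k - 1][0], ordered[k][0]
            fraction = (0.5 - midpoints[k - 1]) / (midpoints[k] - midpoints[k - 1])
            return lower + (upper - lower) * fraction
    return ordered[-1][0]


# Hypothetical data: three valid SNPs (ratio 0.5) and two pleiotropic outliers.
beta_exposure = [0.1, 0.1, 0.1, 0.1, 0.1]
beta_outcome = [0.05, 0.05, 0.05, 0.20, 0.30]
se_outcome = [0.01, 0.01, 0.01, 0.01, 0.01]

median_estimate = weighted_median(beta_exposure, beta_outcome, se_outcome)
mean_of_ratios = sum(by / bx for bx, by in zip(beta_exposure, beta_outcome)) / 5
print(median_estimate, mean_of_ratios)
```

In this toy example the two outlying SNPs pull the average of the ratio estimates well above 0.5, whereas the weighted median stays at the value shared by the majority of the weight.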
Mode-based estimate 4.1.1.4.4
Another method that has been proposed to offer robustness to horizontal pleiotropy
is the Mode-based Estimator (MBE). It uses a “relaxed” assumption called the Zero Modal
Pleiotropy Assumption (ZEMPA), under which the causal effect estimate is consistent if the most
common pleiotropy value across the IVs is zero. The weighted MBE is more precise
than the simple MBE, but more susceptible to bias due to violations of the
InSIDE assumption (Hartwig, Davey Smith et al. 2017).
The assumptions regarding pleiotropy that need to hold in order to obtain
accurate estimates from each MR method are summarised in Table 36. In addition to the
aforementioned methods, which estimate the causal effect of an exposure on the
outcome of interest even in the presence of pleiotropic genetic polymorphisms,
further methods can be used to deal with the limitations of MR; these
are listed in Table 37.
Table 36 | Assumptions regarding pleiotropy of the Mendelian Randomization methods
Method | Assumptions regarding pleiotropy
IVW | No pleiotropic effects and InSIDE holds
MR-Egger regression | Consistent even if all IVs are invalid, provided InSIDE holds
Simple median | Consistent if less than 50% of IVs are invalid
Weighted median | Consistent if less than 50% of the weight is contributed by invalid IVs
Simple MBE | Consistent if the most common horizontal pleiotropy value is zero
Weighted MBE | Consistent if the largest weights among the k subsets are contributed by valid instruments
IVW: Inverse-Variance Weighting; InSIDE: Instrument Strength Independent of Direct Effect;
IV: Instrumental Variable; MBE: Mode-Based Estimator
Table 37 | Methods used to address MR limitations
Method | Use | Description
MR-Egger intercept | Testing for pleiotropy | It is the intercept of the MR-Egger regression and captures the pleiotropy across all SNPs (Bowden, Davey Smith et al. 2015).
Leave-one-out analysis | Testing for pleiotropy | Removal of one IV at a time from the MR analysis to identify any outliers that influence the estimate of the causal effect.
Cochran Q | Testing for pleiotropy | It is used along with IVW to estimate the heterogeneity between the IVs that could indicate the presence of pleiotropy. It can be calculated using only summarized data.
F-statistic | Assessing the strength of the IV | It is used in IVW to measure the strength of IVs. A value of less than 10 is considered problematic and can lead to weak instrument bias.
I-squared (I²) | Assessing the strength of the IV | It is used in MR-Egger to measure the degree of regression dilution bias in the two-sample setting. It lies between 0 and 1, with values close to 1 indicating that the bias is negligible (Bowden, Del Greco et al. 2016).
Funnel plot | Data visualization | It is a plot of the IV’s precision against the IV’s estimates and it should be symmetrical, as precise estimates are less variable. It is used to detect the existence of pleiotropy in MR-Egger (Burgess, Bowden et al. 2017).
Scatter plot | Data visualization | It is a plot of the genetic associations with the outcome against the genetic associations with the exposure. Each point is an IV and if any point deviates from the straight line through the origin under the null, then it should be investigated for pleiotropy. Thus, it is used to assess heterogeneity and also to compare regression slopes from different MR methods (Burgess, Bowden et al. 2017).
Forest plot | Data visualization | It compares the MR estimates for each IV to detect the existence of pleiotropy.
IV: Instrumental Variable; SNP: Single Nucleotide Polymorphism; IVW: Inverse Variance Weighted
Aims and objectives 4.2
Aim 4.2.1
The aim of this chapter is to assess the causal role of BMI, smoking status and
alcohol consumption, which were found to be associated with PsA in chapter 2, in
the development of PsA, and vice versa, using bidirectional, two-sample Mendelian
randomization.
Objectives 4.2.2
Find the IVs for each exposure from publicly available GWAS summary
statistics data
Perform bidirectional, two-sample MR between each exposure and PsA
Perform relevant sensitivity analyses to ensure the accuracy of the outcome.
Contribution of the candidate 4.3
The data acquisition and preparation, the planning, statistical analysis and interpretation
of the results were performed by the candidate (EB).
Methods 4.4
Two-sample, bidirectional MR was used to assess the potential causal role of BMI,
smoking status and alcohol consumption identified in chapter 2 on PsA. In addition, a
range of sensitivity analysis methods were applied to identify any pleiotropy that could
bias the results. In the analysis, the main outcome variable was either diagnosed PsA or
one of the lifestyle factors under study.
Data sources and choice of IVs 4.4.1
For the application of MR, the use of SNPs that have been found to be significantly
associated with the traits in large consortia is recommended to avoid weak instrument
bias and to have the necessary power to address the causal role of exposures. Thus,
IVs for the lifestyle factors were identified from publicly available GWAS summary
data which included both sexes and were restricted to European populations to avoid
any population stratification bias. The IVs for PsA were taken from our in-house
GWAS described in the previous chapter. For each trait, all IVs achieved
genome-wide significance (p-value<5e-08).
Defining IVs for BMI 4.4.1.1
The IVs for BMI were identified using results from the Genetic Investigation of
ANthropometric Traits (GIANT) consortium (Table 38), a large collaborative GWAS
on human body size and shape (Locke, Kahali et al. 2015). Using GWAS data on
339,224 individuals, GIANT identified 97 genetic markers independently associated
with BMI. In the GIANT study population, these 97 SNPs explained 2.7% of the
variance in BMI, and the effect estimates were expressed in SD units of
BMI, where a 1 SD change in BMI equaled 4.65 kg/m². To generate the corresponding
SNP-outcome (here, PsA) association, the beta estimates and standard errors were
taken from our in-house PsA cohort.
Defining IVs for alcohol frequency and smoking status 4.4.1.2
GWAS summary data were available from the Tobacco, Alcohol and Genetic (TAG)
consortium (Tobacco and Genetics 2010) for smoking initiation assessed as a binary
ever/never variable ascertained in 74,053 individuals; however for alcohol consumption
no GWAS associations were publicly available.
Defining IVs for PsA 4.4.1.3
For PsA, a genetic instrument was constructed using the PsA-associated SNPs and
their effect sizes as estimated in our in-house PsA cohort. To generate the
corresponding SNP-outcome (here BMI, alcohol frequency and smoking initiation)
association, the effect estimates and the standard errors were taken from GIANT
consortium, TAG and the UK Biobank.
Validation of findings using the UK Biobank 4.4.1.4
To validate the findings, the causal relationships between PsA and the lifestyle factors were
also assessed using summary statistics from the UK Biobank (Table 38). The summary
statistics were released by the Neale Lab, who performed a univariate analysis on
approximately 337,000 participants (http://www.nealelab.is/blog/2017/7/19/rapid-gwas-of-thousands-of-phenotypes-for-337000-samples-in-the-uk-biobank). The data used for
BMI, alcohol intake frequency and smoking initiation (ever smoker) correspond to
UK Biobank field IDs 21001, 1558 and 20160, respectively. According to the UK
Biobank classification, an individual is classified as an ever smoker if they currently
smoke tobacco, either on most days or occasionally, or if they used to smoke.
Table 38 | Characteristics of the GIANT consortium and the UK Biobank
Cohort | Design | Aim | Participants | Summary statistics used
GIANT | Consortium of different groups, countries and studies | Identify genetic loci that are associated with human body size and shape | More than half a million individuals of European and non-European descent | Taken from a meta-analysis of 125 studies (Locke, Kahali et al. 2015) including 339,224 individuals
UK Biobank | Prospective study | Facilitate epidemiological studies to assess the causes of a range of conditions and their susceptibility factors | Half a million British participants registered with the National Health Service | Taken from the Neale Lab (http://www.nealelab.is/blog/2017/7/19/rapid-gwas-of-thousands-of-phenotypes-for-337000-samples-in-the-uk-biobank) after a basic association test was performed on approximately 337,000 individuals
GIANT: Genetic Investigation of Anthropometric Traits consortium
Statistical analysis 4.4.2
Bidirectional, two-sample MR was used to evaluate the causal role of lifestyle factors
on PsA and vice versa. Analyses and data visualization were performed using the R
package “TwoSampleMR” (Hemani, Zheng et al. 2016).
Initially, the significant SNPs from each summary dataset were clumped using
the function “clump_data” to ensure that the IVs for the exposure are independent.
During clumping, the provided SNPs are extracted from 1000G data, the LD is
calculated among them and, amongst those SNPs (within a clumping distance of 10,000 kb)
with $r^2 > 0.01$, only the SNP with the lowest p-value is kept. The SNP-exposure
and SNP-outcome association effects were then harmonised so that they
correspond to the same allele. This was achieved with the “harmonise_data” function.
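The idea behind harmonisation can be illustrated with a simplified sketch. It is not the implementation of the “harmonise_data” function: the function names, SNP identifiers and effect sizes below are hypothetical, and dropping all palindromic (A/T and C/G) SNPs is a conservative assumption made to keep the example short.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_palindromic(allele_1, allele_2):
    """A/T and C/G SNPs look identical on both strands."""
    return COMPLEMENT[allele_1] == allele_2


def harmonise(exposure, outcome):
    """Align outcome effects to the exposure effect allele.

    Both inputs map SNP id -> (effect_allele, other_allele, beta).
    Returns SNP id -> (beta_exposure, beta_outcome).
    """
    harmonised = {}
    for snp, (effect, other, beta_x) in exposure.items():
        if snp not in outcome or is_palindromic(effect, other):
            continue  # missing in the outcome data, or strand cannot be resolved
        out_effect, out_other, beta_y = outcome[snp]
        # Try the reported strand first, then the complementary strand.
        candidates = ((out_effect, out_other),
                      (COMPLEMENT[out_effect], COMPLEMENT[out_other]))
        for cand_effect, cand_other in candidates:
            if (cand_effect, cand_other) == (effect, other):
                harmonised[snp] = (beta_x, beta_y)
                break
            if (cand_effect, cand_other) == (other, effect):
                harmonised[snp] = (beta_x, -beta_y)  # alleles swapped: flip sign
                break
    return harmonised


exposure = {
    "snp1": ("A", "G", 0.10),  # same alleles in both datasets
    "snp2": ("C", "T", 0.08),  # effect and other allele swapped in the outcome
    "snp3": ("A", "T", 0.05),  # palindromic, dropped
    "snp4": ("A", "C", 0.07),  # outcome reported on the opposite strand
    "snp5": ("G", "T", 0.09),  # not available in the outcome data
    "snp6": ("A", "G", 0.04),  # incompatible alleles, dropped
}
outcome = {
    "snp1": ("A", "G", 0.030),
    "snp2": ("T", "C", -0.020),
    "snp3": ("A", "T", 0.010),
    "snp4": ("T", "G", 0.025),
    "snp6": ("A", "C", 0.015),
}

harmonised = harmonise(exposure, outcome)
print(harmonised)
```

The loss of SNPs at this step is one reason why the number of instruments after harmonisation can be smaller than the number available in the outcome data.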
All associations were combined using the IVW method. This method is suitable even
for exposures with fewer than ten IVs due to its increased power even when the number
of the associated SNPs is limited. This approach produces a causal estimate of the
effect of the exposure on the outcome, which is equal to the coefficient from a
weighted regression of the SNP-outcome on the SNP-exposure association estimates,
where the weights are the inverse of the variance of the SNP-outcome coefficients
and the intercept is constrained to zero.
The degree of weak instrument bias for the IVW can be quantified with the F-statistic.
Given that the chosen IVs are statistically significant SNPs, the F-statistic will
necessarily be high.
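As a worked illustration, the F-statistic formula given earlier in the chapter can be evaluated with the GIANT figures quoted above (339,224 individuals, 97 SNPs, 2.7% of variance explained). This is only indicative, since fewer SNPs were retained after clumping and harmonisation. The per-SNP approximation, F ≈ (beta/SE)², is a commonly used shortcut when only summary statistics are available and is an assumption of this sketch; the function names and the example beta and SE are hypothetical.

```python
def f_statistic(n, k, r_squared):
    """F = ((n - k - 1) / k) * (R^2 / (1 - R^2))."""
    return ((n - k - 1) / k) * (r_squared / (1 - r_squared))


def approximate_snp_f(beta, se):
    """Per-SNP approximation from summary statistics: squared z-score."""
    return (beta / se) ** 2


# Figures quoted in the text for the GIANT BMI instruments.
giant_f = f_statistic(n=339224, k=97, r_squared=0.027)

# Hypothetical single SNP with beta = 0.06 and SE = 0.01.
single_snp_f = approximate_snp_f(0.06, 0.01)
print(round(giant_f, 1), round(single_snp_f, 1))
```

Under the per-SNP approximation, a SNP that reaches p-value<5e-08 has a squared z-score of roughly 30, which is consistent with the statement that genome-wide significant instruments exceed the threshold of 10.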
Sensitivity analysis 4.4.2.1
MR-Egger regression, weighted median analysis and weighted MBE were used to assess
the robustness of the findings to potential horizontal pleiotropy. All three methods
provide consistent results, even if there is a violation of the third assumption. In
addition, the intercept of MR-Egger regression provides the degree of pleiotropy
present in the data based on the degree it departs from zero. Therefore, the intercept
should be zero in the absence of pleiotropy. For the validity of the results, the absence
of overlap between the IVs associated with the pair of traits investigated each time is
essential. For example, it is possible that an instrument for PsA includes SNPs that are
directly associated with BMI if there is a causal effect of BMI on PsA. Therefore, any
identical or strongly correlated SNPs between the pair of traits under study were
removed and the analyses were repeated.
In addition, Cochran’s Q test for heterogeneity was used to assess any inconsistencies
in the effect estimates across the IVs (IVs presenting unexpectedly large or small
effects on the outcome, given the magnitude of their effect on the exposure) which
would indicate potential pleiotropy.
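Cochran’s Q can be sketched from summary data as the weighted sum of squared deviations of the per-SNP ratio estimates from the IVW estimate, compared against a chi-squared distribution with (number of IVs − 1) degrees of freedom. The function name, the data and the first-order weights are assumptions of this example; the p-value is not computed here.

```python
def cochran_q(beta_exposure, beta_outcome, se_outcome):
    """Return (Q, degrees of freedom) for the per-SNP ratio estimates."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    # First-order inverse variance of each ratio estimate.
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - ivw) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1


beta_exposure = [0.10, 0.05, 0.08, 0.12]
se_outcome = [0.010, 0.012, 0.015, 0.011]

# Hypothetical homogeneous case: every SNP implies a ratio of 0.4.
q_same, df = cochran_q(beta_exposure, [0.040, 0.020, 0.032, 0.048], se_outcome)

# Hypothetical heterogeneous case: the last SNP implies a ratio of 1.0.
q_outlier, _ = cochran_q(beta_exposure, [0.040, 0.020, 0.032, 0.120], se_outcome)
print(q_same, q_outlier, df)
```

A Q value much larger than its degrees of freedom points to more variability between the IV estimates than expected by chance, which is the pattern that would prompt a closer look for pleiotropic variants.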
Finally, four types of plots were used to visually assess the presence of pleiotropy.
Scatterplots provide the causal effect estimates for each IV by plotting SNP-outcome
associations against SNP-exposure associations. In funnel plots, larger spread suggests
higher heterogeneity and asymmetry indicates horizontal pleiotropy. Forest plots
compare the MR estimates obtained from each IV. With the
leave-one-out analysis and plot, any associations that are being disproportionately
influenced by a single variant can be assessed.
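The leave-one-out analysis amounts to recomputing the IVW estimate with each SNP removed in turn. A minimal sketch under hypothetical data is shown below; the SNP labels, effect sizes and function names are illustrative only.

```python
def ivw(beta_exposure, beta_outcome, se_outcome):
    """IVW estimate: weighted regression through the origin."""
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    return numerator / denominator


def leave_one_out(snps, beta_exposure, beta_outcome, se_outcome):
    """Return a list of (omitted SNP, IVW estimate without that SNP)."""
    results = []
    for i, snp in enumerate(snps):
        keep = [j for j in range(len(snps)) if j != i]
        estimate = ivw([beta_exposure[j] for j in keep],
                       [beta_outcome[j] for j in keep],
                       [se_outcome[j] for j in keep])
        results.append((snp, estimate))
    return results


# Hypothetical data: four SNPs imply a ratio of 0.5, the last one a ratio of 2.0.
snps = ["snp1", "snp2", "snp3", "snp4", "snp5"]
beta_exposure = [0.10, 0.05, 0.08, 0.12, 0.09]
beta_outcome = [0.050, 0.025, 0.040, 0.060, 0.180]
se_outcome = [0.010, 0.012, 0.015, 0.011, 0.010]

loo = leave_one_out(snps, beta_exposure, beta_outcome, se_outcome)
for snp, estimate in loo:
    print(snp, round(estimate, 3))
```

An estimate that changes markedly when one particular SNP is omitted, as happens for the last SNP in this toy example, suggests that the overall result is driven by that variant.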
Results 4.5
The number of instruments used in each MR analysis can be seen in Table 39. In the
case of smoking, no significant SNP was found in the published GWAS from TAG;
the analysis was therefore performed using the UK Biobank data only. Moreover, no publicly
available GWAS was found for alcohol intake frequency, thus the analysis was limited
to the UK Biobank data.
Effect of BMI upon PsA and vice versa 4.5.1
MR analysis performed with published GWAS data and the UK Biobank provided evidence that
higher BMI increases the risk of PsA. Using the 65 of the 69 BMI-associated SNPs from the
GIANT study that were available in the in-house PsA GWAS summary results, there was a 0.65
(95% CI 0.25, 1.05) increase in the log odds of PsA per SD (4.5 kg/m²) increase in BMI
(p-value=0.001), or a 92% ($e^{0.65} = 1.92$) increase in the odds of developing PsA (Table 40).
The causal effect of BMI on PsA was confirmed using the UK Biobank GWAS summary
results, where a 0.53 (95% CI 0.27, 0.79) increase in the log odds of PsA was reported per SD
increase in BMI (p-value=5.76e-05).
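The conversion from the log odds scale to odds ratios can be checked with a short calculation. The sketch below exponentiates the reported estimates and confidence limits and back-calculates an approximate standard error and two-sided p-value from the width of the 95% CI, assuming a normal approximation; because the reported figures are rounded, the back-calculated p-values are only approximate. The function name is hypothetical.

```python
import math


def summarise_log_odds(estimate, ci_low, ci_high):
    """Convert a log odds estimate with 95% CI into an OR, SE and p-value."""
    standard_error = (ci_high - ci_low) / (2 * 1.96)  # normal approximation
    z = estimate / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return {
        "odds_ratio": math.exp(estimate),
        "or_ci": (math.exp(ci_low), math.exp(ci_high)),
        "standard_error": standard_error,
        "p_value": p_value,
    }


giant = summarise_log_odds(0.65, 0.25, 1.05)       # IVW, GIANT instruments
uk_biobank = summarise_log_odds(0.53, 0.27, 0.79)  # IVW, UK Biobank instruments

print(round(giant["odds_ratio"], 2), round(uk_biobank["odds_ratio"], 2))
```

The GIANT-based estimate corresponds to an odds ratio of about 1.92 per SD increase in BMI, in line with the 92% increase in odds stated in the text.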
Table 39 | Number of genetic instruments used for the MR analysis for each exposure-outcome
Exposure | Outcome | Identified (after clumping) | Available in outcome | After harmonization
BMI (GIANT) | PsA | 69 | 66 | 65
BMI (UK Biobank) * | PsA | 315 | 286 | 251
smoking initiation (TAG) | PsA | - | - | -
smoking initiation (UK Biobank) ⱡ | PsA | 40 | 40 | 35
alcohol intake frequency (UK Biobank) ¥ | PsA | 44 | 40 | 36
PsA | BMI (GIANT) | 6 | 2 | 2
PsA | BMI (UK Biobank) | 6 | 6 | 4
PsA | smoking initiation (UK Biobank) | 6 | 6 | 4
PsA | alcohol intake frequency (UK Biobank) | 6 | 6 | 4
BMI: Body Mass Index; PsA: Psoriatic Arthritis; UK: United Kingdom; GIANT: Genetic Investigation
of Anthropometric Traits consortium; TAG: Tobacco, Alcohol and Genetics consortium
* UK Biobank code 21001
ⱡ UK Biobank code 20160
¥ UK Biobank code 1558
In the reverse direction, IVW did not show any evidence of a causal effect of PsA on BMI
using the two and four PsA-associated SNPs available in the GIANT and UK Biobank BMI data, respectively.
Only the IVW method was used, as it is the only method with sufficient power to
estimate a causal effect when the number of IVs is less than ten. No sensitivity analyses
were performed because there was no evidence of a causal effect of PsA on BMI.
Table 40 | Results of Mendelian randomization with BMI as exposure and PsA as the outcome
Exposure  Dataset     Method           Estimate ¥  95% CI        p-value
BMI       GIANT       IVW              0.65        0.25, 1.05    0.001
BMI       GIANT       MR-Egger         1.03        -0.17, 2.23   0.09
BMI       GIANT       Weighted median  0.57        0.01, 1.13    0.04
BMI       GIANT       Weighted MBE     1.36        0.36, 2.36    0.008
BMI       UK Biobank  IVW              0.53        0.27, 0.79    5.76e-05
BMI       UK Biobank  MR-Egger         0.53        -0.21, 1.27   0.15
BMI       UK Biobank  Weighted median  0.48        0.01, 0.96    0.04
BMI       UK Biobank  Weighted MBE     0.39        -0.33, 1.11   0.29
IVW: Inverse Variance Weighted; CI: Confidence Interval; MBE: Mode-Based Estimator;
GIANT: Genetic Investigation of Anthropometric Traits consortium
¥ log-odds increase in PsA per SD (4.5 kg/m²) increase in BMI
Sensitivity analysis 4.5.1.1
Assessing the instrumental variable assumptions 4.5.1.1.1
For assessing heterogeneity, scatter plots are used for visual inspection along with
the Cochran Q test. The Q test relies on the assumption that all valid IVs identify the same
causal parameter; otherwise the heterogeneity test might over-reject the null. Figure
21 and Figure 22 suggest some heterogeneity, with a few outliers; the Q statistic
was 74.96 (64 degrees of freedom, p-value=0.16) for IVs taken from the GIANT
consortium and 281.7 (250 degrees of freedom, p-value=0.08) for IVs taken from the
UK Biobank, although neither result was statistically significant.
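As an illustration of how the Q statistic relates to the IVW estimate, the following minimal Python sketch computes per-variant ratio estimates, their fixed-effect IVW mean and Cochran's Q. The function names and the four-variant input are hypothetical and are not taken from the thesis data; first-order standard errors are assumed for the ratio estimates.

```python
import math


def wald_ratios(beta_exposure, beta_outcome, se_outcome):
    """Per-variant causal estimates (outcome/exposure) with first-order standard errors."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    ratio_ses = [se / abs(bx) for bx, se in zip(beta_exposure, se_outcome)]
    return ratios, ratio_ses


def ivw_estimate(ratios, ratio_ses):
    """Fixed-effect inverse-variance weighted mean of the ratio estimates."""
    weights = [1.0 / se ** 2 for se in ratio_ses]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    return estimate, math.sqrt(1.0 / sum(weights))


def cochran_q(ratios, ratio_ses):
    """Cochran's Q: weighted squared deviations from the IVW estimate; df = J - 1."""
    estimate, _ = ivw_estimate(ratios, ratio_ses)
    q = sum(((r - estimate) / se) ** 2 for r, se in zip(ratios, ratio_ses))
    return q, len(ratios) - 1


# Hypothetical summary statistics for four variants (not thesis data)
bx = [0.10, 0.08, 0.12, 0.05]
by = [0.06, 0.05, 0.07, 0.05]
se_by = [0.02, 0.02, 0.03, 0.03]

ratios, ratio_ses = wald_ratios(bx, by, se_by)
estimate, se = ivw_estimate(ratios, ratio_ses)
q, df = cochran_q(ratios, ratio_ses)
print(f"IVW estimate = {estimate:.3f} (SE {se:.3f}); Q = {q:.2f} on {df} df")
```

Under the null of no heterogeneity, Q is compared with a chi-squared distribution on J - 1 degrees of freedom, which is how the degrees of freedom quoted above (64 and 250) arise from 65 and 251 instruments.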
Figure 21 | Scatterplot for comparison of methods of BMI (GIANT) upon PsA. It presents the genetic associations with PsA against the genetic associations with BMI (lines represent 95% confidence intervals). The slope of each line represents the causal association and each method has a different line. IVW, MR-Egger, weighted median and weighted MBE estimates are indicated by the light blue, blue, light green and green lines respectively. There is evidence of heterogeneity with a few outliers.
Figure 22 | Scatterplot for comparison of methods of BMI (UK Biobank) upon PsA. It presents the genetic associations with PsA against the genetic associations with BMI (lines represent 95% confidence intervals). The slope of each line represents the causal association and each method has a different line. IVW, MR-Egger, weighted median and weighted MBE estimates are indicated by the light blue, blue, light green and green lines respectively. There is evidence of heterogeneity with a few outliers.
For assessing directional pleiotropy, funnel plots of the IVs' precisions against the IVs'
estimates are used, with any asymmetry being evidence of horizontal pleiotropy:
in that case the pleiotropic effects do not average to zero and causal estimates from weaker variants
tend to be skewed in one direction. There is no sign of departure from symmetry for
BMI taken from either the GIANT consortium (Figure 23) or the UK Biobank (Figure 24).
In addition, the intercept of MR-Egger regression indicates the average
pleiotropic effect; if it differs from zero, then there is evidence of pleiotropy. Here the
GIANT intercept was -0.01 (p-value=0.50) and the UK Biobank intercept was -9.4e-05
(p-value=0.99), suggesting there was no strong directional horizontal pleiotropy under the
InSIDE assumption.
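The role of the MR-Egger intercept can be illustrated with a short Python sketch of the underlying weighted regression. This is a simplified, hypothetical implementation that returns point estimates only (no standard errors or p-values), and the input data are constructed for illustration with a built-in intercept of 0.01 and slope of 0.5.

```python
def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Weighted regression of outcome on exposure associations with a free intercept.

    Returns (intercept, slope); standard errors are omitted in this sketch.
    """
    # Orient every variant so that its exposure association is positive
    xs, ys = [], []
    for bx, by in zip(beta_exposure, beta_outcome):
        sign = -1.0 if bx < 0 else 1.0
        xs.append(sign * bx)
        ys.append(sign * by)
    ws = [1.0 / se ** 2 for se in se_outcome]

    # Closed-form weighted least squares for y = intercept + slope * x
    sw = sum(ws)
    swx = sum(w * x for w, x in zip(ws, xs))
    swy = sum(w * y for w, y in zip(ws, ys))
    swxx = sum(w * x * x for w, x in zip(ws, xs))
    swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return intercept, slope


# Hypothetical data: every variant carries the same pleiotropic effect of 0.01
bx = [0.10, -0.20, 0.30, 0.15]
by = [0.06, -0.11, 0.16, 0.085]
se_by = [0.02, 0.03, 0.02, 0.025]

intercept, slope = mr_egger(bx, by, se_by)
print(f"Egger intercept = {intercept:.3f}, slope = {slope:.3f}")
```

In this constructed example the intercept recovers the shared pleiotropic effect and the slope recovers the causal effect, whereas forcing the regression through the origin (as IVW does) would fold the pleiotropic effect into the causal estimate.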
Figure 23 | Funnel plot displaying the causal effect estimate of each IV against its precision for MR analysis of BMI (GIANT) on PsA. Asymmetry is indicative of horizontal pleiotropy, meaning that the pleiotropic effects of genetic variants are not balanced about the null. Here the plot is symmetrical.
Figure 24 | Funnel plot displaying the causal effect estimate of each IV against its precision for MR analysis of BMI (UK Biobank) on PsA. Asymmetry is indicative of horizontal pleiotropy, meaning that the pleiotropic effects of genetic variants are not balanced about the null. Here the plot is symmetrical.
Using robust methods 4.5.1.1.2
The MR-Egger regression, weighted median and MBE are used as alternative methods
to assess the causal effect as they rely on weaker assumptions than the standard IVW. In
summary, MR-Egger regression can provide a consistent estimate even when all IVs
exhibit some pleiotropy (provided the InSIDE assumption holds), the weighted median is
consistent under the assumption that the valid IVs represent over 50% of the weight in the
analysis, and the weighted MBE is consistent if the most common pleiotropy value is zero.
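To make the 50% weight condition concrete, the sketch below implements a weighted median of ratio estimates in Python, following the usual interpolation between the two ordered estimates that straddle the 50th weighted percentile. The function name and the input values are hypothetical; inverse-variance weights are assumed.

```python
def weighted_median(ratios, ratio_ses):
    """Weighted median of per-variant ratio estimates, using inverse-variance weights."""
    if len(ratios) == 1:
        return ratios[0]
    order = sorted(range(len(ratios)), key=lambda j: ratios[j])
    betas = [ratios[j] for j in order]
    weights = [1.0 / ratio_ses[j] ** 2 for j in order]
    total = sum(weights)

    # Percentile of each ordered estimate: cumulative weight up to its midpoint
    cumulative, percentiles = 0.0, []
    for w in weights:
        cumulative += w
        percentiles.append((cumulative - 0.5 * w) / total)

    # Interpolate linearly between the two estimates that straddle 50%
    below = max(j for j, p in enumerate(percentiles) if p < 0.5)
    span = percentiles[below + 1] - percentiles[below]
    step = (0.5 - percentiles[below]) / span
    return betas[below] + (betas[below + 1] - betas[below]) * step


# Hypothetical example: three concordant variants and one pleiotropic outlier
ratios = [0.5, 0.5, 0.5, 5.0]
ratio_ses = [0.1, 0.1, 0.1, 0.1]

median_estimate = weighted_median(ratios, ratio_ses)
mean_estimate = sum(ratios) / len(ratios)  # equals IVW here because weights are equal
print(f"weighted median = {median_estimate:.3f}, IVW = {mean_estimate:.3f}")
```

Because the outlier carries only a quarter of the total weight, the weighted median stays at the value shared by the other variants, while the IVW mean is pulled towards the outlier.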
In the case of BMI from the GIANT consortium, the alternative methods suggest an effect
in the same direction as the IVW, with results being significant for all of them except
MR-Egger (p-value=0.09). This indicates that the IVs might all be valid and the
conclusion that an increase in BMI causes an increase in the risk of PsA is reliable. When
assessing the results of UK Biobank BMI upon PsA, all methods indicate a positive
causal effect, but among the robust methods the result is statistically significant only
with the weighted median method (Table 40).
Effect of smoking initiation upon PsA and vice versa 4.5.2
Using 35 SNPs significantly associated with ever smoker status that remained after
harmonization with the PsA summary data, the IVW method gave a
0.36 (95% CI -2.5, 1.78) decrease in the log odds of PsA for ever smokers compared to those
who have never smoked, but the result was not statistically significant. In contrast, the rest of the
methods suggest a positive effect of ever smoking on PsA (Table 41). The
Cochran Q test gave no statistically significant evidence of heterogeneity (Q=60, 34
degrees of freedom, p-value=0.08), and the MR-Egger intercept was consistent
with the null (intercept=-0.06 with 95% CI -0.14, 0.02 and p-value=0.11). Leave-one-out
analysis, performed to assess whether any outlier SNPs were affecting the
result, showed that when SNP rs9468350 (gene ZSCAN31) was removed from the IVs,
the IVW method also gave a positive effect of smoking initiation on PsA
(estimate=0.02, 95% CI -1.98, 2.02). This SNP has been found to be associated with
other diseases such as RA and schizophrenia, suggesting possible pleiotropy biasing the
causal effect.
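The logic of the leave-one-out analysis can be sketched in Python as follows. The helper names, variant identifiers and numbers are hypothetical and chosen only to mimic the situation described above, in which a single variant reverses the sign of the IVW estimate.

```python
def ivw(ratios, ratio_ses):
    """Fixed-effect inverse-variance weighted mean of ratio estimates."""
    weights = [1.0 / se ** 2 for se in ratio_ses]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


def leave_one_out(snp_ids, ratios, ratio_ses):
    """IVW estimate recomputed with each variant dropped in turn."""
    results = {}
    for j, snp in enumerate(snp_ids):
        kept_ratios = ratios[:j] + ratios[j + 1:]
        kept_ses = ratio_ses[:j] + ratio_ses[j + 1:]
        results[snp] = ivw(kept_ratios, kept_ses)
    return results


# Hypothetical ratio estimates; "snp_d" plays the role of an influential outlier
snp_ids = ["snp_a", "snp_b", "snp_c", "snp_d"]
ratios = [0.2, 0.3, 0.25, -3.0]
ratio_ses = [0.5, 0.5, 0.5, 0.5]

full_estimate = ivw(ratios, ratio_ses)
loo = leave_one_out(snp_ids, ratios, ratio_ses)

print(f"all variants: {full_estimate:.3f}")
for snp, estimate in loo.items():
    print(f"without {snp}: {estimate:.3f}")
```

A variant whose removal changes the sign or magnitude of the estimate much more than any other, as "snp_d" does here, is a candidate for closer inspection for pleiotropy.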
In the reverse direction, using only four SNPs as IVs, the IVW method indicated no
significant causal role of PsA on smoking initiation (estimate=0.02, 95% CI -0.04, 0.08;
p-value=0.23).
Table 41 | Results of Mendelian randomization with smoking initiation from the UK Biobank as the exposure and PsA as the outcome
Exposure                          Dataset     Method           Estimate ¥  95% CI         p-value
smoking initiation (ever smoker)  UK Biobank  IVW              -0.36       -2.5, 1.78     0.74
smoking initiation (ever smoker)  UK Biobank  MR-Egger         7.42        -2.34, 17.18   0.14
smoking initiation (ever smoker)  UK Biobank  Weighted median  0.31        -2.19, 2.81    0.80
smoking initiation (ever smoker)  UK Biobank  Weighted MBE     0.52        -3.24, 4.28    0.78
IVW: Inverse Variance Weighted; CI: Confidence Interval; MBE: Mode-Based Estimator;
MR: Mendelian Randomization
¥ log-odds increase in PsA for ever smokers
Effect of alcohol frequency consumption upon PsA and vice versa 4.5.3
Using 36 SNPs significantly associated with alcohol intake frequency in UK Biobank
that were present in the PsA summary data, no causal effect on PsA was observed
(Table 42). There was no evidence of pleiotropy (MR-Egger intercept=0.02, 95% CI 0,
0.04 and p-value=0.10) or heterogeneity (Q=41, p-value=0.22).
In addition, no causal effect of PsA on alcohol intake frequency was observed using the
IVW method (estimate=0.01, 95% CI -0.03, 0.05; p-value=0.53).
Table 42 | Results of Mendelian randomization with alcohol intake frequency from the UK Biobank as the exposure and PsA as the outcome
Exposure                  Dataset     Method           Estimate  95% CI        p-value
alcohol intake frequency  UK Biobank  IVW              0.18      -0.31, 0.65   0.45
alcohol intake frequency  UK Biobank  MR-Egger         -0.53     -1.49, 0.43   0.28
alcohol intake frequency  UK Biobank  Weighted median  0.02      -0.66, 0.66   0.99
alcohol intake frequency  UK Biobank  Weighted MBE     -0.27     -1.09, 0.55   0.52
IVW: Inverse Variance Weighted; CI: Confidence Interval; MBE: Mode-Based Estimator;
MR: Mendelian Randomization
Discussion 4.6
In this chapter, the association between BMI, smoking initiation and alcohol intake
frequency and PsA was evaluated by MR using summary-level data. BMI, with IVs taken
from the GIANT consortium, presented a consistent causal effect in terms of direction
estimated using the MR methods, although statistical evidence was weak when using
the MR-Egger regression method. BMI data from the UK Biobank gave an
equivalent causal effect; however, the result was significant in only two of the MR
methods. There was no evidence of horizontal pleiotropy as assessed by MR-Egger and
funnel plots; however, there was some evidence of heterogeneity. Heterogeneity,
where some IVs present disproportionately large or small causal effect estimates,
might be a sign of pleiotropy. In the reverse direction, the results were not significant,
suggesting that the observed association is explained by the causal role of BMI on PSO
and/or PsA. There was no evidence of a causal role in either direction between smoking
initiation and PsA or between alcohol intake frequency and PsA.
These findings suggest that the positive association between BMI and both
PSO and PsA commonly reported in observational studies (as described in 1.3.5.4) may
correspond to a causal risk-increasing effect. Obesity is thought to be a chronic
inflammatory condition (Monteiro and Azevedo 2010). Macrophages in adipose tissue
induce the secretion of inflammatory mediators, establishing the inflammatory state.
The adipose tissue secretes adipocytokines including TNFα, IL-6 and leptin which
contribute to an ongoing inflammatory status and probably to the pathogenesis of PSO
and PsA (Hamminga, van der Lely et al. 2006). Leptin plays a key role in the irregular
deposit of fat and development of insulin resistance (Gisondi, Tessari et al. 2007). For
that reason, obesity could trigger PSO and/or PsA or it could be the consequence of
the latent diseases, arising from metabolic disorders and low quality of life (eating
habits, physical inactivity) (Carrascosa, Rocamora et al. 2014). Nonetheless, further
studies are needed to understand the role of inflammation and other biological
pathways shared by obesity and PsA.
MR analysis has emerged as a powerful tool for causal inference between
exposures and disease outcomes. MR has a number of advantages that have helped it
gain popularity among epidemiologists. The key feature is the avoidance of confounding
and selection bias, as the genetic markers of interest are randomly allocated. In
addition, reverse causation, which is one of the limitations of observational studies, is
avoided as genetic variants are allocated at conception, which precedes disease onset.
Finally, MR is cost-effective, especially compared with RCTs or prospective cohort
studies, and ethically acceptable.
However, MR has limitations that should be considered. With the use of summary-level
data from large consortia, the probability of using invalid IVs due to pleiotropy
increases. Pleiotropy can distort MR analyses; for that reason various methods are
being developed to account for pleiotropy, including MR-Egger regression. In addition, LD
between the IVs can induce pleiotropy or confounding; however, sometimes this is
advantageous as it allows the effect of an unmeasured variant to be estimated through a
proxy variant. Perhaps the most important consideration is the statistical power to infer
causality; genetic variants usually have a small effect on the exposure, and this could be
tiny if the IV is weak. This means that large sample sizes, such as those assembled by
consortia to investigate SNP-exposure associations, are required to adequately test causal
hypotheses. In general, the difficulty in assessing MR's second and third
assumptions could lead to some uncertainty in inferring causality between exposure and
outcome, especially when it comes to the precision of the effect estimate.
Nevertheless, MR is an important tool which, coupled with GWAS genetic variants,
has contributed important insights into the potential causality of factors associated with
CVD and mental disorders, and can also be used to inform drug development by
identifying causal risk factors that could be pharmacologically modulated.
Strengths and weaknesses of the study 4.6.1
Several assumptions are required for MR to provide consistent estimates of the
causality of a putative risk factor on a disease outcome. In the current analysis, BMI-
associated variants from a GWAS consortium (GIANT) and a population-based study
(UK Biobank) were used to exploit their large sample sizes to test the causal
hypothesis. The use of only independent, statistically significant SNPs helps to limit
weak instrument bias and any confounding induced by LD. In the GIANT study
population, the identified associations explained 2.7% of the BMI variance. In addition,
the analysis was restricted to participants of European origin to minimise the risk of
bias due to population stratification. Finally, the use of a two-sample design allowed the
application of methods that make different assumptions regarding pleiotropy, thus
relaxing the initial assumptions for valid causal inference.
Conversely, the current analysis suffers from some limitations. First, the PsA cohort
consists of patients who also have PSO; thus, it is not feasible to assess whether
BMI is causal for PSO or for PsA. In that case, two-step MR analysis could be used to
test for causal mediation in the pathway between BMI and PsA. Also, the use of non-MHC
SNPs as IVs for PsA, which explain only a small proportion of the variance in PsA, could
lead to limited statistical power to assess the causal role of PsA on BMI. Further
similar analyses need to be conducted using the largest GWAS in both PSO and PsA to
test this reverse causation. In addition, the causal role of smoking initiation and alcohol
consumption frequency on PsA could only be investigated using UK Biobank data due
to the lack of publicly available GWAS summary data for alcohol consumption and the
lack of significant variants for smoking initiation in the TAG study. A possible solution to
this could be the construction and application of polygenic risk scores for each factor
using variants that do not reach significance; however, their use increases the risk of
pleiotropy. Thus, publicly available summary data from large studies are needed for both
exposures.
Future work 4.6.2
One of the challenges in MR is the possibility of pleiotropic effects of the IVs which can
induce a genetic correlation among traits because of shared aetiology. A new approach
has been suggested using the latent variable modelling that can identify full or partial
causation among genetically correlated, polygenic traits (O'Connor and Price 2018).
The latent causal variable (LCV) model introduces a latent variable that has a causal
effect on both traits and the genetic correlation between the traits is mediated by this
latent variable. It is used to account for potential pleiotropy; it distinguishes between
genetic correlation and genetic causation and quantifies the magnitude of causality
using the genetic causality proportion (gcp). The authors showed that LCV can avoid
confounding when there is a difference in power or polygenicity between two traits,
unlike bidirectional MR. They also confirmed that MR can produce false positives in the
presence of genetic correlation. Thus, LCV should be used in the future to confirm the
causal role of BMI on PsA found in the current study.
Conclusion 4.6.3
A higher prevalence of obesity in patients with PSO with or without arthritis
compared to the general population has been reported in various observational
studies. An increased OR for BMI was also reported in chapter two, where I investigated
the association of environmental factors with both PSO without arthritis and PsA using
data from the UK Biobank. However, due to the limitations of observational studies,
alternative low-cost methods have been developed that can effectively and quickly
assess the causal role of lifestyle choices on the onset of disease. One promising
method which is now widely used is Mendelian randomization. In this study
bidirectional MR was applied and identified a possible causal role of BMI on PsA/PSO,
a finding that could help prevent disease by targeting adiposity levels along with the use of
immune-oriented medication. This is the first study, to my knowledge, that has utilised
MR to investigate the causal role of obesity on PSO and/or PsA.
Chapter 5 Discussion of thesis
During the last decade, GWAS have been applied to hundreds of complex disorders
yielding thousands of genetic markers and increasing our knowledge of disease
aetiology and underlying biological pathways. In PSO more than 80 risk loci have been
discovered, whereas in PsA only a handful of PsA-specific variants (i.e. associated with
PsA but not with PSO) have been identified, owing to the smaller sample sizes in PsA
research and its complicated clinical overlap with PSO and other inflammatory
arthritides. In parallel, research on environmental and lifestyle risk factors
that affect disease onset has mainly been conducted in cross-sectional studies, which do
not allow investigation of the causal role of those factors in PsA. It is apparent that
only the combination of environmental and genetic risk factors with other -omics data
will provide the full picture of disease heritability and pathogenesis, which would help
the effective recognition of PSO patients at risk of developing PsA and the
advancement of therapies.
The application of GWAS in complex traits has proved that many genetic loci
contribute to the genetic variation; thus, the proportion of variance explained by each
SNP is small and larger cohort sizes are needed to detect additional markers. This
realization led the research community to establish large consortia and biobanks in an
effort to detect a proportion of the “missing heritability” of complex traits.
Simultaneously, advances in biostatistics have enabled researchers to explore various
ways of using the results of GWAS for further investigating the genetic architecture of
traits. Therefore, to address the challenges that PsA research faces, a novel cross-phenotype
study was conducted using state-of-the-art statistical methods applied to
GWAS data that exploit the phenomenon of pleiotropy. To date, the identification of
PsA genetic risk factors has been performed using traditional GWAS. However, multiple lines
of evidence suggest the existence of extensive pleiotropy for complex traits, especially
in autoimmune diseases, such that the same loci may influence predisposition to many of
them. The latter has
been the central idea in the development of a specific group of techniques which use
the observed pleiotropy to leverage more power to detect novel associations without
increasing sample sizes. In chapter three, I presented three methods of this category
which led to the detection of 21 novel loci in PsA as well as novel loci in PsA-
correlated diseases including RA, SLE, AS and JIA. Independent replication studies are
needed to establish their association with the disease susceptibility and subsequent
functional studies to confirm their role in causation.
Cross-phenotype GWAS can be challenging compared to the meta-analysis of a single
trait. First, a specific locus can affect only a subset of the analyzed traits and in some
cases this locus might be protective for one disease and increase the risk of another.
Second, it is important to distinguish between truly heterogeneous effects and
statistical noise in cases where the studies are of different power and design. Finally,
the existence of overlapping subjects leads to the inflation of false positive associations.
MTAG and ASSET used in the current thesis are two meta-analysis methods used in a
cross-phenotype context which were developed to address these challenges, with
some unresolved issues still remaining. A recent study compared a number of meta-
analysis methods for cross-phenotype GWAS studies including ASSET (Zhu, Anttila et
al. 2018). The fixed-effect methods like the ASSET outperformed other methods in the
presence of diverse heterogeneity. More specifically, ASSET which exhaustively
explores all subsets of disease combinations for the presence of association performs
best when the number of traits with non-null effects is small. It also presents the best
sensitivity in the presence of directionally opposite effects and the best specificity
under most settings. Finally, it can adequately adjust for known sample overlap. A
crucial improvement would be adjustment for unknown sample overlap, as implemented in
MTAG, which addresses this by using LD score regression analysis (Zhu, Anttila et al.
2018). Although MTAG was not included in the comparison, it is known that this
method does not take into account any subset-specific effects and performs well when
all variants share the same genetic correlation across all traits. However, the latter
assumption usually does not hold in a cross-phenotype design.
In the pathogenesis of complex diseases, lifestyle choices and environmental factors
play a key role in the development of the disease. The establishment of biobanks and
repositories helps researchers to elucidate these risk factors for diseases. The UK
Biobank is part of this effort and is a rich resource of data including genetic, lifestyle,
biomarker and imaging data. In this thesis, I investigated the association of known lifestyle
factors using data from 500,000 participants. The study showed the role of obesity in
both PSO and PsA compared to the general population and its increased prevalence in
patients with PsA compared to PSO, which is in line with previous literature. The
involvement of both alcohol intake frequency and smoking has been unclear. In
this study, smoking and alcohol intake frequency were found to be less prevalent in PsA
compared to PSO, where collider bias probably influenced the association with
smoking. The use of cross-sectional data allows only the estimation of the prevalence
rates of those factors; therefore MR was applied to GWAS summary data to elucidate
their causality on PsA. Only BMI was found to play a causal role in the development of
PsA, a finding that could help clinicians motivate patients with PSO to change their
nutritional habits or adopt healthy eating to decrease the odds of developing PsA.
The key question is “How can these factors, both genetic and lifestyle, be useful in clinical
practice for each patient?”. Personalised medicine is not a new concept. For years
clinicians have tried to tailor health care to an individual’s needs, however the
identification of individuals at risk of developing a disease or more likely to respond to
certain treatments has not yet been sufficiently predictive to be clinically useful in many
cases. To address these challenges, researchers use the vast amount of genome-wide
data that is available to create genomic risk prediction models. The approach is
straightforward; each individual’s DNA is genotyped using one of the various
genotyping arrays and after passing quality control tests and performing imputation, an
algorithm calculates a risk score utilising the weights of a list of genetic markers. As
genomic risk models only show the heritable component of risk, integrating
environmental factors, biomarkers and electronic health records could increase the
predictive ability of these models. It should be noted that the role of predictive
models is not to substitute clinical judgement but to provide insights to the
progression of a disease and stratify individuals to those at risk of developing a disease.
In the case of PSO patients, predictive risk modelling can stratify patients into
appropriate treatment plans and help implement accurate screening techniques. The
transition from PSO to the development of PsA is known; psoriatic individuals with
genetic risk factors for PsA will be exposed to relevant environmental and lifestyle risk
factors and some of them will eventually develop PsA. The prevention of PsA would
involve interventions to halt the onset of PsA in the phase of exposure to
environmental determinants, while underlying systemic autoimmunity is already present; for
example, a healthy diet and frequent exercise campaign could be implemented for first-degree
relatives of patients with PsA. It is obvious that relevant interventions could
take place in the case of the development of PSO by preventing any PSO-related
systemic autoimmunity. However, targeted prevention could be challenging regarding
the “screening” for potential cases in the general population. PsA presents us with a
unique opportunity for preventative intervention, as patients with PSO represent a
high-risk group in which the prevalence of PsA is approximately 30%. This is not the case for
PSO and other diseases where the general population is the potential pool of subjects
which could lead to increased risk of false positives and false negatives. Furthermore,
predictive modelling is not without its challenges when it comes to efficient
manipulation, storage and protection of the exponentially increasing data and the
knowledge gaps on “mining” useful information from large datasets. In addition, the
lack of standards for bioinformatics processing, storage and assistance with clinical-
decision making means the incorporation of genomic data into clinical practice remains
challenging. Despite these concerns, genomic data will play an essential role in
personalised medicine and will further help patient-oriented care.
Conclusion 5.1
In conclusion, this thesis has carried forward the research in detecting PsA risk factors.
The thesis includes the first cross-phenotype study of PsA along with four other
musculoskeletal diseases, the first study of environmental factors and comorbidities in
PsA using the UK Biobank and the first MR analysis performed in a PsA cohort to
detect potential causality between environmental factors and the disease. While these data
do not provide the opportunity for immediate clinical application, they can form the basis
for further studies.
References
Adeloye, D., S. Chua, et al. (2015). "Global and regional estimates of COPD prevalence: Systematic review and meta-analysis." J Glob Health 5(2): 020415.
Afifi, L., M. J. Danesh, et al. (2017). "Dietary Behaviors in Psoriasis: Patient-Reported
Outcomes from a U.S. National Survey." Dermatol Ther (Heidelb) 7(2): 227-
242.
Aggarwal, R., S. Ringold, et al. (2015). "Distinctions between diagnostic and
classification criteria?" Arthritis Care Res (Hoboken) 67(7): 891-897.
Airoldi, I., E. Di Carlo, et al. (2005). "Lack of Il12rb2 signaling predisposes to
spontaneous autoimmunity and malignancy." Blood 106(12): 3846-3853.
Al'Abadie, M. S., G. G. Kent, et al. (1994). "The relationship between stress and the
onset and exacerbation of psoriasis and other skin conditions." Br J Dermatol
130(2): 199-203.
Al-Mutairi, N., S. Al-Farag, et al. (2010). "Comorbidities associated with psoriasis: an
experience from the Middle East." J Dermatol 37(2): 146-155.
Alamanos, Y., P. V. Voulgari, et al. (2008). "Incidence and prevalence of psoriatic
arthritis: a systematic review." J Rheumatol 35(7): 1354-1358.
Alenius, G. M., B. Stenberg, et al. (2002). "Inflammatory joint manifestations are
prevalent in psoriasis: prevalence study of joint and axial involvement in
psoriatic patients, and evaluation of a psoriatic and arthritic questionnaire." J
Rheumatol 29(12): 2577-2582.
Allen, N. (2013). "UK Biobank: current status and what it means for epidemiology."
Health Policy and Technology 1(3): 123-126.
Andreassen, O. A., S. Djurovic, et al. (2013). "Improved detection of common variants
associated with schizophrenia by leveraging pleiotropy with cardiovascular-
disease risk factors." Am J Hum Genet 92(2): 197-209.
Andreassen, O. A., W. K. Thompson, et al. (2013). "Improved detection of common
variants associated with schizophrenia and bipolar disorder using pleiotropy-
informed conditional false discovery rate." PLoS Genet 9(4): e1003455.
Andreassen, O. A., W. K. Thompson, et al. (2015). "Correction: Improved Detection
of Common Variants Associated with Schizophrenia and Bipolar Disorder
Using Pleiotropy-Informed Conditional False Discovery Rate." PLoS Genet
11(11): e1005544.
Apel, M., S. Uebe, et al. (2013). "Variants in RUNX3 contribute to susceptibility to
psoriatic arthritis, exhibiting further common ground with ankylosing
spondylitis." Arthritis Rheum 65(5): 1224-1231.
Arakawa, A., K. Siewert, et al. (2015). "Melanocyte antigen triggers autoimmunity in
human psoriasis." J Exp Med 212(13): 2203-2212.
Armstrong, A. W., C. T. Harskamp, et al. (2012). "The association between psoriasis
and obesity: a systematic review and meta-analysis of observational studies."
Nutr Diabetes 2: e54.
Armstrong, A. W., C. T. Harskamp, et al. (2013). "The association between psoriasis
and hypertension: a systematic review and meta-analysis of observational
studies." J Hypertens 31(3): 433-443.
Armstrong, A. W., C. T. Harskamp, et al. (2013). "Psoriasis and the risk of diabetes mellitus: a systematic review and meta-analysis." JAMA Dermatol 149(1): 84-91.
Armstrong, E. J., C. T. Harskamp, et al. (2013). "Psoriasis and major adverse
cardiovascular events: a systematic review and meta-analysis of observational
studies." J Am Heart Assoc 2(2): e000062.
Australo-Anglo-American Spondyloarthritis, C., J. D. Reveille, et al. (2010). "Genome-
wide association study of ankylosing spondylitis identifies non-MHC
susceptibility loci." Nat Genet 42(2): 123-127.
Avina-Zubieta, J. A., J. Thomas, et al. (2012). "Risk of incident cardiovascular events in
patients with rheumatoid arthritis: a meta-analysis of observational studies."
Ann Rheum Dis 71(9): 1524-1529.
Azfar, R. S., N. M. Seminara, et al. (2012). "Increased risk of diabetes mellitus and
likelihood of receiving diabetes mellitus treatment in patients with psoriasis."
Arch Dermatol 148(9): 995-1000.
Barber, J., S. Muller, et al. (2010). "Measuring morbidity: self-report or health care
records?" Fam Pract 27(1): 25-30.
Basavaraj, K. H., N. M. Ashok, et al. (2010). "The role of drugs in the induction and/or
exacerbation of psoriasis." Int J Dermatol 49(12): 1351-1361.
Bath, R. K., N. K. Brar, et al. (2014). "A review of methotrexate-associated
hepatotoxicity." J Dig Dis 15(10): 517-524.
Baum, P. R., R. B. Gayle, 3rd, et al. (1994). "Molecular characterization of murine and
human OX40/OX40 ligand systems: identification of a human OX40 ligand as
the HTLV-1-regulated protein gp34." EMBO J 13(17): 3992-4001.
Bencherif, M., P. M. Lippiello, et al. (2011). "Alpha7 nicotinic receptors as novel
therapeutic targets for inflammation-based diseases." Cell Mol Life Sci 68(6):
931-949.
Benjamin, M. and D. McGonagle (2001). "The anatomical basis for disease localisation
in seronegative spondyloarthropathy at entheses and related sites." J Anat
199(Pt 5): 503-526.
Bentham, J., D. L. Morris, et al. (2015). "Genetic association analyses implicate aberrant
regulation of innate and adaptive immunity genes in the pathogenesis of
systemic lupus erythematosus." Nat Genet 47(12): 1457-1464.
Bergboer, J. G. M., P. Zeeuwen, et al. (2012). "Genetics of psoriasis: evidence for
epistatic interaction between skin barrier abnormalities and immune deviation." J Invest Dermatol 132(10): 2320-2331.
Bernstein, C. N., A. Wajda, et al. (2005). "The clustering of other chronic inflammatory
diseases in inflammatory bowel disease: a population-based study."
Gastroenterology 129(3): 827-836.
Bhattacharjee, S., P. Rajaraman, et al. (2012). "A subset-based approach improves
power and interpretation for the combined analysis of genetic association
studies of heterogeneous traits." Am J Hum Genet 90(5): 821-835.
Bhole, V. M., H. K. Choi, et al. (2012). "Differences in body mass index among
individuals with PsA, psoriasis, RA and the general population." Rheumatology
(Oxford) 51(3): 552-556.
Blackshaw, S., A. Sawa, et al. (2000). "Type 3 inositol 1,4,5-trisphosphate receptor
modulates cell death." FASEB J 14(10): 1375-1379.
Bo, K., M. Thoresen, et al. (2008). "Smokers report more psoriasis, but not atopic
dermatitis or hand eczema: results from a Norwegian population survey among
adults." Dermatology 216(1): 40-45.
Boehncke, S., D. Thaci, et al. (2007). "Psoriasis patients show signs of insulin
resistance." Br J Dermatol 157(6): 1249-1251.
Bowden, J., G. Davey Smith, et al. (2015). "Mendelian randomization with invalid
instruments: effect estimation and bias detection through Egger regression." Int
J Epidemiol 44(2): 512-525.
Bowden, J., G. Davey Smith, et al. (2016). "Consistent Estimation in Mendelian
Randomization with Some Invalid Instruments Using a Weighted Median
Estimator." Genet Epidemiol 40(4): 304-314.
Bowden, J., M. F. Del Greco, et al. (2016). "Assessing the suitability of summary data
for two-sample Mendelian randomization analyses using MR-Egger regression:
the role of the I² statistic." Int J Epidemiol 45(6): 1961-1974.
Bowes, J., J. Ashcroft, et al. (2017). "Cross-phenotype association mapping of the MHC
identifies genetic variants that differentiate psoriatic arthritis from psoriasis."
Ann Rheum Dis 76(10): 1774-1779.
Bowes, J., A. Budu-Aggrey, et al. (2015). "Dense genotyping of immune-related
susceptibility loci reveals new insights into the genetics of psoriatic arthritis."
Nat Commun 6: 6046.
Bowes, J., S. Eyre, et al. (2011). "Evidence to support IL-13 as a risk locus for psoriatic arthritis but not psoriasis vulgaris." Ann Rheum Dis 70(6): 1016-1019.
Bowes, J., S. Loehr, et al. (2015). "PTPN22 is associated with susceptibility to psoriatic
arthritis but not psoriasis: evidence for a further PsA-specific risk locus." Ann
Rheum Dis 74(10): 1882-1885.
Bowes, J., G. Orozco, et al. (2011). "Confirmation of TNIP1 and IL23A as susceptibility
loci for psoriatic arthritis." Ann Rheum Dis 70(9): 1641-1644.
Boyd, A. S. and K. H. Neldner (1990). "The isomorphic response of Koebner." Int J
Dermatol 29(6): 401-410.
Brandrup, F., M. Hauge, et al. (1978). "Psoriasis in an unselected series of twins." Arch
Dermatol 114(6): 874-878.
Brauchli, Y. B., S. S. Jick, et al. (2008). "Psoriasis and the risk of incident diabetes
mellitus: a population-based study." Br J Dermatol 159(6): 1331-1337.
Brenaut, E., C. Horreau, et al. (2013). "Alcohol consumption and psoriasis: a systematic
literature review." J Eur Acad Dermatol Venereol 27 Suppl 3: 30-35.
Budu-Aggrey, A., J. Bowes, et al. (2016). "Replication of a distinct psoriatic arthritis risk
variant at the IL23R locus." Ann Rheum Dis 75(7): 1417-1418.
Budu-Aggrey, A., J. Bowes, et al. (2017). "A rare coding allele in IFIH1 is protective for
psoriatic arthritis." Ann Rheum Dis 76(7): 1321-1324.
Bulik-Sullivan, B., H. K. Finucane, et al. (2015). "An atlas of genetic correlations across
human diseases and traits." Nat Genet 47(11): 1236-1241.
Burgess, S., J. Bowden, et al. (2017). "Sensitivity Analyses for Robust Causal Inference
from Mendelian Randomization Analyses with Multiple Genetic Variants."
Epidemiology 28(1): 30-42.
Burgess, S., A. Butterworth, et al. (2013). "Mendelian randomization analysis with
multiple genetic variants using summarized data." Genet Epidemiol 37(7): 658-
665.
Burgess, S., F. Dudbridge, et al. (2015). "Re: "Multivariable Mendelian randomization:
the use of pleiotropic genetic variants to estimate causal effects"." Am J
Epidemiol 181(4): 290-291.
Burgess, S., R. A. Scott, et al. (2015). "Using published data in Mendelian randomization:
a blueprint for efficient identification of causal risk factors." Eur J Epidemiol
30(7): 543-552.
Burgess, S. and S. G. Thompson (2013). "Use of allele scores as instrumental variables
for Mendelian randomization." Int J Epidemiol 42(4): 1134-1144.
Burgess, S. and S. G. Thompson (2015). "Multivariable Mendelian randomization: the
use of pleiotropic genetic variants to estimate causal effects." Am J Epidemiol
181(4): 251-260.
Burgess, S., S. G. Thompson, et al. (2011). "Avoiding bias from weak instruments in
Mendelian randomization studies." Int J Epidemiol 40(3): 755-764.
Burner, T. W. and A. K. Rosenthal (2009). "Diabetes and rheumatic diseases." Curr
Opin Rheumatol 21(1): 50-54.
Buske-Kirschbaum, A., S. Kern, et al. (2007). "Altered distribution of leukocyte subsets
and cytokine production in response to acute psychosocial stress in patients
with psoriasis vulgaris." Brain Behav Immun 21(1): 92-99.
Buskila, D., D. D. Gladman, et al. (1990). "Rheumatologic manifestations of infection
with the human immunodeficiency virus (HIV)." Clin Exp Rheumatol 8(6): 567-
573.
Candia, R., A. Ruiz, et al. (2015). "Risk of non-alcoholic fatty liver disease in patients
with psoriasis: a systematic review and meta-analysis." J Eur Acad Dermatol
Venereol 29(4): 656-662.
Canete, J. D. and P. Mease (2012). "The link between obesity and psoriatic arthritis."
Ann Rheum Dis 71(8): 1265-1266.
Cargill, M., S. J. Schrodi, et al. (2007). "A large-scale genetic association study confirms
IL12B and leads to the identification of IL23R as psoriasis-risk genes." Am J
Hum Genet 80(2): 273-290.
Carrascosa, J. M., V. Rocamora, et al. (2014). "Obesity and psoriasis: inflammatory
nature of obesity, relationship between psoriasis and obesity, and therapeutic
implications." Actas Dermosifiliogr 105(1): 31-44.
Cham, C. M., K. Ko, et al. (2012). "Interferon regulatory factor 5 in the pathogenesis of
systemic lupus erythematosus." Clin Dev Immunol 2012: 780436.
Chandran, V. (2010). "Genetics of psoriasis and psoriatic arthritis." Indian J Dermatol
55(2): 151-156.
Chandran, V., S. B. Bull, et al. (2013). "Human leukocyte antigen alleles and
susceptibility to psoriatic arthritis." Hum Immunol 74(10): 1333-1338.
Chandran, V., C. T. Schentag, et al. (2009). "Familial aggregation of psoriatic arthritis."
Ann Rheum Dis 68(5): 664-667.
Chang, J. T., E. M. Shevach, et al. (1999). "Regulation of interleukin (IL)-12 receptor
beta2 subunit expression by endogenous IL-12: a critical step in the
differentiation of pathogenic autoreactive T cells." J Exp Med 189(6): 969-978.
Chapman, J. T., L. E. Otterbein, et al. (2001). "Carbon monoxide attenuates
aeroallergen-induced inflammation in mice." Am J Physiol Lung Cell Mol Physiol
281(1): L209-216.
Cheng, H., Y. Li, et al. (2014). "Identification of a missense variant in LNPEP that
confers psoriasis risk." J Invest Dermatol 134(2): 359-365.
Chiang, Y. Y. and H. W. Lin (2012). "Association between psoriasis and chronic
obstructive pulmonary disease: a population-based study in Taiwan." J Eur Acad
Dermatol Venereol 26(1): 59-65.
Chouela, E., A. Abeldano, et al. (1996). "Hepatitis C virus antibody (anti-HCV):
prevalence in psoriasis." Int J Dermatol 35(11): 797-799.
Ciccacci, C., P. Conigliaro, et al. (2016). "Polymorphisms in STAT-4, IL-10, PSORS1C1,
PTPN2 and MIR146A genes are associated differently with prognostic factors in
Italian patients affected by rheumatoid arthritis." Clin Exp Immunol 186(2):
157-163.
Coates, L. C., T. Aslam, et al. (2013). "Comparison of three screening tools to detect
psoriatic arthritis in patients with psoriasis (CONTEST study)." Br J Dermatol
168(4): 802-807.
Cohen, A. D., J. Dreiher, et al. (2009). "Psoriasis associated with ulcerative colitis and
Crohn's disease." J Eur Acad Dermatol Venereol 23(5): 561-565.
Cohen, A. D., D. Weitzman, et al. (2010). "Psoriasis associated with hepatitis C but not
with hepatitis B." Dermatology 220(3): 218-222.
Cohen, A. D., D. Weitzman, et al. (2010). "Psoriasis and hypertension: a case-control
study." Acta Derm Venereol 90(1): 23-26.
Collins, R. (2012). "What makes UK Biobank special?" Lancet 379(9822): 1173-1174.
Cortes, A. and M. A. Brown (2011). "Promise and pitfalls of the Immunochip." Arthritis
Res Ther 13(1): 101.
Costello, P., B. Bresnihan, et al. (1999). "Predominance of CD8+ T lymphocytes in
psoriatic arthritis." J Rheumatol 26(5): 1117-1124.
Cotsapas, C., B. F. Voight, et al. (2011). "Pervasive sharing of genetic effects in
autoimmune disease." PLoS Genet 7(8): e1002254.
Curtis, J. R., T. Beukelman, et al. (2010). "Elevated liver enzyme tests among patients
with rheumatoid arthritis or psoriatic arthritis treated with methotrexate
and/or leflunomide." Ann Rheum Dis 69(1): 43-47.
Dai, J. Y. and X. C. Zhang (2015). "Mendelian randomization studies for a continuous
exposure under case-control sampling." Am J Epidemiol 181(6): 440-449.
Dalgard, F. J., U. Gieler, et al. (2015). "The psychological burden of skin diseases: a
cross-sectional multicenter study among dermatological out-patients in 13
European countries." J Invest Dermatol 135(4): 984-991.
Dand, N., S. Mucha, et al. (2017). "Exome-wide association study reveals novel
psoriasis susceptibility locus at TNFSF15 and rare protective alleles in genes
contributing to type I IFN signalling." Hum Mol Genet 26(21): 4301-4313.
Davey Smith, G. and G. Hemani (2014). "Mendelian randomization: genetic anchors for
causal inference in epidemiological studies." Hum Mol Genet 23(R1): R89-98.
de Cid, R., E. Riveira-Munoz, et al. (2009). "Deletion of the late cornified envelope
LCE3B and LCE3C genes as a susceptibility factor for psoriasis." Nat Genet
41(2): 211-215.
de Korte, J., M. A. Sprangers, et al. (2004). "Quality of life in patients with psoriasis: a
systematic literature review." J Investig Dermatol Symp Proc 9(2): 140-147.
De Souza, Y. G. and J. S. Greenspan (2013). "Biobanking past, present and future:
responsibilities and benefits." AIDS 27(3): 303-312.
Delgado-Rodriguez, M. and J. Llorca (2004). "Bias." J Epidemiol Community Health
58(8): 635-641.
Despres, J. P. (2012). "Body fat distribution and risk of cardiovascular disease: an
update." Circulation 126(10): 1301-1313.
Dicker, R. C. (2006). Principles of Epidemiology in Public Health Practice: An
Introduction to Applied Epidemiology and Biostatistics, Centers for Disease
Control and Prevention (CDC).
Dommasch, E. D., T. Li, et al. (2015). "Risk of depression in women with psoriasis: a
cohort study." Br J Dermatol 173(4): 975-980.
dos Santos Silva, I. (1999). Cancer Epidemiology: Principles and Methods. France,
International Agency for Research on Cancer.
Dowlatshahi, E. A., M. Wakkee, et al. (2014). "The prevalence and odds of depressive symptoms and clinical depression in psoriasis patients: a systematic review and
meta-analysis." J Invest Dermatol 134(6): 1542-1551.
Doyle, T. J. and P. F. Dellaripa (2017). "Lung Manifestations in the Rheumatic Diseases."
Chest 152(6): 1283-1295.
Dreiher, J., T. Freud, et al. (2013). "Psoriatic arthritis and diabetes: a population-based
cross-sectional study." Dermatol Res Pract 2013: 580404.
Dreiher, J., D. Weitzman, et al. (2008). "Psoriasis and chronic obstructive pulmonary
disease: a case-control study." Br J Dermatol 159(4): 956-960.
du Prel, J. B., G. Hommel, et al. (2009). "Confidence interval or p-value?: part 4 of a
series on evaluation of scientific publications." Dtsch Arztebl Int 106(19): 335-
339.
Duffin, K. C., I. C. Freeny, et al. (2009). "Association between IL13 polymorphisms and
psoriatic arthritis is modified by smoking." J Invest Dermatol 129(12): 2777-
2783.
Duffy, D. L., L. S. Spelman, et al. (1993). "Psoriasis in Australian twins." J Am Acad
Dermatol 29(3): 428-434.
Duhen, T., R. Geiger, et al. (2009). "Production of interleukin 22 but not interleukin 17
by a subset of human skin-homing memory T cells." Nat Immunol 10(8): 857-
863.
Ebrahim, S. and G. Davey Smith (2013). "Commentary: Should we always deliberately
be non-representative?" Int J Epidemiol 42(4): 1022-1026.
Eder, L., V. Chandran, et al. (2017). "The Risk of Developing Diabetes Mellitus in
Patients with Psoriatic Arthritis: A Cohort Study." J Rheumatol 44(3): 286-291.
Eder, L., V. Chandran, et al. (2012). "Human leucocyte antigen risk alleles for psoriatic
arthritis among patients with psoriasis." Ann Rheum Dis 71(1): 50-55.
Eder, L., V. Chandran, et al. (2011). "IL13 gene polymorphism is a marker for psoriatic
arthritis among psoriasis patients." Ann Rheum Dis 70(9): 1594-1598.
Eder, L., V. Chandran, et al. (2011). "Incidence of arthritis in a prospective cohort of
psoriasis patients." Arthritis Care Res (Hoboken) 63(4): 619-622.
Eder, L., A. Haddad, et al. (2015). "The incidence and risk factors for psoriatic arthritis
in patients with psoriasis - a prospective cohort study." Arthritis Rheumatol.
Eder, L., T. Law, et al. (2011). "Association between environmental factors and onset of psoriatic arthritis in patients with psoriasis." Arthritis Care Res (Hoboken)
63(8): 1091-1097.
Eder, L., S. Shanmugarajah, et al. (2012). "The association between smoking and the
development of psoriatic arthritis among psoriasis patients." Ann Rheum Dis
71(2): 219-224.
Edwards, R. R., C. Cahalan, et al. (2011). "Pain, catastrophizing, and depression in the
rheumatic diseases." Nat Rev Rheumatol 7(4): 216-224.
Egeberg, A., J. P. Thyssen, et al. (2017). "Risk of Myocardial Infarction in Patients with
Psoriasis and Psoriatic Arthritis: A Nationwide Cohort Study." Acta Derm
Venereol 97(7): 819-824.
Elia, M. (2013). "Body composition by whole-body bioelectrical impedance and
prediction of clinically relevant outcomes: overvalued or underused?" Eur J Clin
Nutr 67 Suppl 1: S60-70.
Ellinghaus, D., E. Ellinghaus, et al. (2012). "Combined analysis of genome-wide
association studies for Crohn disease and psoriasis identifies seven shared
susceptibility loci." Am J Hum Genet 90(4): 636-647.
Ellinghaus, D., L. Jostins, et al. (2016). "Analysis of five chronic inflammatory diseases
identifies 27 new associations and highlights disease-specific patterns at shared
loci." Nat Genet 48(5): 510-518.
Ellinghaus, E., D. Ellinghaus, et al. (2010). "Genome-wide association study identifies a
psoriasis susceptibility locus at TRAF3IP2." Nat Genet 42(11): 991-995.
Ellinghaus, E., P. E. Stuart, et al. (2012). "Genome-wide meta-analysis of psoriatic
arthritis identifies susceptibility locus at REL." J Invest Dermatol 132(4): 1133-
1140.
Ellis, J. A., K. J. Scurrah, et al. (2015). "Epistasis amongst PTPN2 and genes of the
vitamin D pathway contributes to risk of juvenile idiopathic arthritis." J Steroid
Biochem Mol Biol 145: 113-120.
Ernster, V. L. (1994). "Nested case-control studies." Prev Med 23(5): 587-590.
Ethgen, O. and B. Standaert (2012). "Population- versus cohort-based modelling
approaches." Pharmacoeconomics 30(3): 171-181.
Evangelou, E. and J. P. Ioannidis (2013). "Meta-analysis methods for genome-wide
association studies and beyond." Nat Rev Genet 14(6): 379-389.
Evers, A. W., Y. Lu, et al. (2005). "Common burden of chronic skin diseases?
Contributors to psychological distress in adults with psoriasis and atopic
dermatitis." Br J Dermatol 152(6): 1275-1281.
Fadnes, L. T., A. Taube, et al. (2008). "How to identify information bias due to self-
reporting in epidemiological research." The Internet Journal of Epidemiology
7(2).
Farber, E. M. and L. Nall (1993). "Psoriasis: a stress-related disease." Cutis 51(5): 322-
326.
Farber, E. M., M. L. Nall, et al. (1974). "Natural history of psoriasis in 61 twin pairs."
Arch Dermatol 109(2): 207-211.
Fedak, K. M., A. Bernal, et al. (2015). "Applying the Bradford Hill criteria in the 21st
century: how data integration has changed causal inference in molecular
epidemiology." Emerg Themes Epidemiol 12: 14.
Ferreira, B. I., J. L. Abreu, et al. (2016). "Psoriasis and Associated Psychiatric Disorders:
A Systematic Review on Etiopathogenesis and Clinical Correlation." J Clin
Aesthet Dermatol 9(6): 36-43.
Fisher, R. A. (1925). Statistical methods for research workers. Edinburgh: Oliver and
Boyd.
Fletcher, R. H. and S. W. Fletcher (2005). Clinical epidemiology: the essentials.
Philadelphia, Lippincott Williams & Wilkins.
Fry, A., T. J. Littlejohns, et al. (2017). "Comparison of Sociodemographic and Health-
Related Characteristics of UK Biobank Participants with the General
Population." Am J Epidemiol.
Fujita, H. (2013). "The role of IL-22 and Th22 cells in human skin diseases." J Dermatol
Sci 72(1): 3-8.
Furue, M. and T. Kadono (2017). ""Inflammatory skin march" in atopic dermatitis and
psoriasis." Inflamm Res 66(10): 833-842.
Gabriel, S. E. and K. Michaud (2009). "Epidemiological studies in incidence, prevalence,
mortality, and comorbidity of the rheumatic diseases." Arthritis Res Ther 11(3):
229.
Galea, S. and M. Tracy (2007). "Participation rates in epidemiologic studies." Ann
Epidemiol 17(9): 643-653.
Gelfand, J. M., A. L. Neimann, et al. (2006). "Risk of myocardial infarction in patients
with psoriasis." JAMA 296(14): 1735-1741.
Genetic Analysis of Psoriasis Consortium, the Wellcome Trust Case Control Consortium, et al. (2010).
"A genome-wide association study identifies new psoriasis susceptibility loci and
an interaction between HLA-C and ERAP1." Nat Genet 42(11): 985-990.
Gisondi, P., G. Targher, et al. (2009). "Non-alcoholic fatty liver disease in patients with
chronic plaque psoriasis." J Hepatol 51(4): 758-764.
Gisondi, P., G. Tessari, et al. (2007). "Prevalence of metabolic syndrome in patients
with psoriasis: a hospital-based case-control study." Br J Dermatol 157(1): 68-
73.
Gladman, D. D., K. A. Anhorn, et al. (1986). "HLA antigens in psoriatic arthritis." J
Rheumatol 13(3): 586-592.
Gladman, D. D., C. Antoni, et al. (2005). "Psoriatic arthritis: epidemiology, clinical
features, course, and outcome." Ann Rheum Dis 64 Suppl 2: ii14-17.
Gladman, D. D., C. T. Schentag, et al. (2009). "Development and initial validation of a
screening questionnaire for psoriatic arthritis: the Toronto Psoriatic Arthritis
Screen (ToPAS)." Ann Rheum Dis 68(4): 497-501.
Glas, J., J. Wagner, et al. (2012). "PTPN2 gene variants are associated with
susceptibility to both Crohn's disease and ulcerative colitis supporting a
common genetic disease background." PLoS One 7(3): e33682.
Gottlieb, A., N. J. Korman, et al. (2008). "Guidelines of care for the management of
psoriasis and psoriatic arthritis: Section 2. Psoriatic arthritis: overview and
guidelines of care for treatment with an emphasis on the biologics." J Am Acad
Dermatol 58(5): 851-864.
Gottlieb, A. B., F. Dann, et al. (2008). "Psoriasis and the metabolic syndrome." J Drugs
Dermatol 7(6): 563-572.
Gottlieb, S. L., P. Gilleaudeau, et al. (1995). "Response of psoriasis to a lymphocyte-
selective toxin (DAB389IL-2) suggests a primary immune, but not keratinocyte,
pathogenic basis." Nat Med 1(5): 442-447.
Grando, S. A., R. M. Horton, et al. (1996). "Activation of keratinocyte nicotinic
cholinergic receptors stimulates calcium influx and enhances cell
differentiation." J Invest Dermatol 107(3): 412-418.
Greb, J. E., A. M. Goldminz, et al. (2016). "Psoriasis." Nat Rev Dis Primers 2: 16082.
Greene, B. L., G. F. Haldeman, et al. (2006). "Factors affecting physical activity behavior
in urban adults with arthritis who are predominantly African-American and
female." Phys Ther 86(4): 510-519.
Greenland, S. (2000). "An introduction to instrumental variables for epidemiologists."
Int J Epidemiol 29(4): 722-729.
Griffiths, C. E. and H. L. Richards (2001). "Psychological influences in psoriasis." Clin
Exp Dermatol 26(4): 338-342.
Grjibovski, A. M., A. O. Olsen, et al. (2007). "Psoriasis in Norwegian twins:
contribution of genetic and environmental effects." J Eur Acad Dermatol
Venereol 21(10): 1337-1343.
Groll, D. L., T. To, et al. (2005). "The development of a comorbidity index with
physical function as the outcome." J Clin Epidemiol 58(6): 595-602.
Gudu, T., A. Etcheto, et al. (2016). "Fatigue in psoriatic arthritis - a cross-sectional
study of 246 patients from 13 countries." Joint Bone Spine 83(4): 439-443.
Gupta, M. A. and A. K. Gupta (1998). "Depression and suicidal ideation in dermatology
patients with acne, alopecia areata, atopic dermatitis and psoriasis." Br J
Dermatol 139(5): 846-850.
Gupta, M. A., A. K. Gupta, et al. (1989). "A psychocutaneous profile of psoriasis
patients who are stress reactors. A study of 127 patients." Gen Hosp Psychiatry
11(3): 166-173.
Hackinger, S. and E. Zeggini (2017). "Statistical methods to detect pleiotropy in human
complex traits." Open Biol 7(11).
Haegert, D. G. (2004). "Analysis of the threshold liability model provides new
understanding of causation in autoimmune diseases." Med Hypotheses 63(2):
257-261.
Haldar, R. and D. Mukhopadhyay (2011). "Levenshtein distance technique in dictionary
lookup methods: An improved approach." Web intelligence & distributed
computing research lab.
Hamminga, E. A., A. J. van der Lely, et al. (2006). "Chronic inflammation in psoriasis
and obesity: implications for therapy." Med Hypotheses 67(4): 768-773.
Han, C., D. W. Robinson, Jr., et al. (2006). "Cardiovascular disease and risk factors in
patients with rheumatoid arthritis, psoriatic arthritis, and ankylosing
spondylitis." J Rheumatol 33(11): 2167-2172.
Harbord, R. M., V. Didelez, et al. (2013). "Severity of bias of a simple estimator of the
causal odds ratio in Mendelian randomization studies." Stat Med 32(7): 1246-
1258.
Haroon, M., P. Gallagher, et al. (2014). "High prevalence of metabolic syndrome and of
insulin resistance in psoriatic arthritis is associated with the severity of
underlying disease." J Rheumatol 41(7): 1357-1365.
Haroon, M., B. Kirby, et al. (2013). "High prevalence of psoriatic arthritis in patients
with severe psoriasis with suboptimal performance of screening
questionnaires." Ann Rheum Dis 72(5): 736-740.
Haroon, N. and R. D. Inman (2010). "Endoplasmic reticulum aminopeptidases: Biology
and pathogenic potential." Nat Rev Rheumatol 6(8): 461-467.
Hart, C. L., D. S. Morrison, et al. (2010). "Effect of body mass index and alcohol
consumption on liver disease: analysis of data from two prospective cohort
studies." BMJ 340: c1240.
Hartwig, F. P., G. Davey Smith, et al. (2017). "Robust inference in summary data
Mendelian randomization via the zero modal pleiotropy assumption." Int J
Epidemiol 46(6): 1985-1998.
Haycock, P. C., S. Burgess, et al. (2016). "Best (but oft-forgotten) practices: the design,
analysis, and interpretation of Mendelian randomization studies." Am J Clin
Nutr 103(4): 965-978.
Hemani, G., J. Zheng, et al. (2016). "MR-Base: a platform for systematic causal
inference across the phenome using billions of genetic associations." bioRxiv.
Henchoz, Y., F. Bastardot, et al. (2012). "Physical activity and energy expenditure in
rheumatoid arthritis patients and matched controls." Rheumatology (Oxford)
51(8): 1500-1507.
Henseler, T. and E. Christophers (1985). "Psoriasis of early and late onset:
characterization of two types of psoriasis vulgaris." J Am Acad Dermatol 13(3):
450-456.
Hewlett, S., Z. Cockshott, et al. (2005). "Patients' perceptions of fatigue in rheumatoid
arthritis: overwhelming, uncontrollable, ignored." Arthritis Rheum 53(5): 697-
702.
Hidalgo, B. and M. Goodman (2013). "Multivariate or multivariable regression?" Am J
Public Health 103(1): 39-40.
Ho, P. Y., A. Barton, et al. (2008). "Investigating the role of the HLA-Cw*06 and HLA-
DRB1 genes in susceptibility to psoriatic arthritis: comparison with psoriasis
and undifferentiated inflammatory arthritis." Ann Rheum Dis 67(5): 677-682.
Hoffmann, A. and D. Baltimore (2006). "Circuitry of nuclear factor kappaB signaling."
Immunol Rev 210: 171-186.
Hoggart, C. J., T. G. Clark, et al. (2008). "Genome-wide significance for dense SNP and
resequencing data." Genet Epidemiol 32(2): 179-185.
Hsieh, J., S. Kadavath, et al. (2014). "Can traumatic injury trigger psoriatic arthritis? A
review of the literature." Clin Rheumatol 33(5): 601-608.
Hu, S. C. and C. E. Lan (2017). "Psoriasis and Cardiovascular Comorbidities: Focusing
on Severe Vascular Events, Cardiovascular Risk Factors and Implications for
Treatment." Int J Mol Sci 18(10).
Huffmeier, U., S. Uebe, et al. (2010). "Common variants at TRAF3IP2 are associated
with susceptibility to psoriatic arthritis and psoriasis." Nat Genet 42(11): 996-
999.
Huidekoper, A. L., D. van der Woude, et al. (2013). "Patients with early arthritis
consume less alcohol than controls, regardless of the type of arthritis."
Rheumatology (Oxford) 52(9): 1701-1707.
Husni, M. E., K. H. Meyer, et al. (2007). "The PASE questionnaire: pilot-testing a
psoriatic arthritis screening and evaluation tool." J Am Acad Dermatol 57(4):
581-587.
Husted, J. A., A. Thavaneswaran, et al. (2011). "Cardiovascular and other comorbidities
in patients with psoriatic arthritis: a comparison with patients with psoriasis."
Arthritis Care Res (Hoboken) 63(12): 1729-1735.
Husted, J. A., B. D. Tom, et al. (2009). "Occurrence and correlates of fatigue in
psoriatic arthritis." Ann Rheum Dis 68(10): 1553-1558.
Ibrahim, G. H., M. H. Buch, et al. (2009). "Evaluation of an existing screening tool for
psoriatic arthritis in people with psoriasis and the development of a new
instrument: the Psoriasis Epidemiology Screening Tool (PEST) questionnaire."
Clin Exp Rheumatol 27(3): 469-474.
International HapMap Consortium (2003). "The International HapMap Project." Nature
426(6968): 789-796.
Jafferany, M. (2008). "Lithium and psoriasis: what primary care and family physicians
should know." Prim Care Companion J Clin Psychiatry 10(6): 435-439.
Jafri, K., C. M. Bartels, et al. (2017). "Incidence and Management of Cardiovascular Risk
Factors in Psoriatic Arthritis and Rheumatoid Arthritis: A Population-Based
Study." Arthritis Care Res (Hoboken) 69(1): 51-57.
Johnson, J. A., C. Ma, et al. (2014). "Diet and nutrition in psoriasis: analysis of the
National Health and Nutrition Examination Survey (NHANES) in the United
States." J Eur Acad Dermatol Venereol 28(3): 327-332.
Kang, J. H., Y. H. Chen, et al. (2010). "Comorbidity profiles among patients with
ankylosing spondylitis: a nationwide population-based study." Ann Rheum Dis
69(6): 1165-1168.
Kaprio, J. (2000). "Science, medicine, and the future. Genetic epidemiology." BMJ
320(7244): 1257-1259.
Karason, A., T. J. Love, et al. (2009). "A strong heritability of psoriatic arthritis over
four generations--the Reykjavik Psoriatic Arthritis Study." Rheumatology
(Oxford) 48(11): 1424-1428.
Karreman, M. C., A. E. Weel, et al. (2016). "Performance of screening tools for
psoriatic arthritis: a cross-sectional study in primary care." Rheumatology
(Oxford).
Kassi, K., O. A. Mienwoley, et al. (2013). "Severe skin forms of psoriasis in black
africans: epidemiological, clinical, and histological aspects related to 56 cases."
Autoimmune Dis 2013: 561032.
Kelley, G. A. and K. S. Kelley (2012). "Statistical models for meta-analysis: A brief
tutorial." World J Methodol 2(4): 27-32.
Khraishi, M., I. Landells, et al. (2010). "The self-administered Psoriasis and Arthritis
Screening Questionnaire (PASQ): A sensitive and specific tool for the diagnosis
of early and established psoriatic arthritis." Psoriasis Forum 16(2): 9-16.
Khraishi, M., D. MacDonald, et al. (2011). "Prevalence of patient-reported
comorbidities in early and established psoriatic arthritis cohorts." Clin
Rheumatol 30(7): 877-885.
Klein, J. P., J. D. Rizzo, et al. (2001). "Statistical methods for the analysis and
presentation of the results of bone marrow transplants. Part 2: Regression
modeling." Bone Marrow Transplant 28(11): 1001-1011.
Kopec, J. A. and J. M. Esdaile (1990). "Bias in case-control studies. A review." J
Epidemiol Community Health 44(3): 179-186.
Kotsis, K., P. V. Voulgari, et al. (2012). "Anxiety and depressive symptoms and illness
perceptions in psoriatic arthritis and associations with physical health-related
quality of life." Arthritis Care Res (Hoboken) 64(10): 1593-1601.
Krausgruber, T., K. Blazek, et al. (2011). "IRF5 promotes inflammatory macrophage
polarization and TH1-TH17 responses." Nat Immunol 12(3): 231-238.
Krueger, J. G. (2002). "The immunologic basis for the treatment of psoriasis with new
biologic agents." J Am Acad Dermatol 46(1): 1-23; quiz 23-26.
Krueger, J. G. and P. M. Brunner (2017). "Interleukin-17 alters the biology of many cell
types involved in the genesis of psoriasis, systemic inflammation and associated
comorbidities." Exp Dermatol.
Kryczek, I., A. T. Bruce, et al. (2008). "Induction of IL-17+ T cell trafficking and
development by IFN-gamma: mechanism and pathological relevance in
psoriasis." J Immunol 181(7): 4733-4741.
Kumar, S., J. Han, et al. (2013). "Obesity, waist circumference, weight change and the
risk of psoriasis in US women." J Eur Acad Dermatol Venereol 27(10): 1293-
1298.
Kurd, S. K., A. B. Troxel, et al. (2010). "The risk of depression, anxiety, and suicidality
in patients with psoriasis: a population-based cohort study." Arch Dermatol
146(8): 891-895.
Lai, Y. C. and Y. W. Yew (2016). "Psoriasis as an Independent Risk Factor for
Cardiovascular Disease: An Epidemiologic Analysis Using a National Database."
J Cutan Med Surg 20(4): 327-333.
Larkin, L. and N. Kennedy (2014). "Correlates of physical activity in adults with
rheumatoid arthritis: a systematic review." J Phys Act Health 11(6): 1248-1261.
Lee, E. J., K. D. Han, et al. (2017). "Smoking and risk of psoriasis: A nationwide cohort
study." J Am Acad Dermatol 77(3): 573-575.
Lee, Y. H. and G. G. Song (2017). "Smoking paradox in the development of psoriatic
arthritis among patients with psoriasis." Ann Rheum Dis.
Lewallen, S. and P. Courtright (1998). "Epidemiology in practice: case-control studies."
Community Eye Health 11(28): 57-58.
Lewinson, R. T., I. A. Vallerand, et al. (2017). "Depression Is Associated with an
Increased Risk of Psoriatic Arthritis among Patients with Psoriasis: A
Population-Based Study." J Invest Dermatol 137(4): 828-835.
Li, W., J. Han, et al. (2012). "Obesity and risk of incident psoriatic arthritis in US
women." Ann Rheum Dis 71(8): 1267-1272.
Li, W., J. Han, et al. (2012). "Smoking and risk of incident psoriatic arthritis in US
women." Ann Rheum Dis 71(6): 804-808.
Li, W. Q., J. L. Han, et al. (2013). "Psoriasis, psoriatic arthritis and increased risk of
incident Crohn's disease in US women." Ann Rheum Dis 72(7): 1200-1205.
Li, X., L. Kong, et al. (2015). "Association between Psoriasis and Chronic Obstructive
Pulmonary Disease: A Systematic Review and Meta-analysis." PLoS One 10(12):
e0145221.
Liley, J. and C. Wallace (2015). "A pleiotropy-informed Bayesian false discovery rate
adapted to a shared control design finds new disease associations from GWAS
summary statistics." PLoS Genet 11(2): e1004926.
Lindsay, K., A. D. Fraser, et al. (2009). "Liver fibrosis in patients with psoriasis and
psoriatic arthritis on long-term, high cumulative dose methotrexate therapy."
Rheumatology (Oxford) 48(5): 569-572.
Liu, Y., C. Helms, et al. (2008). "A genome-wide association study of psoriasis and
psoriatic arthritis identifies new disease loci." PLoS Genet 4(3): e1000041.
Locke, A. E., B. Kahali, et al. (2015). "Genetic studies of body mass index yield new
insights for obesity biology." Nature 518(7538): 197-206.
Lonnberg, A. S., L. Skov, et al. (2013). "Heritability of psoriasis in a large twin sample."
Br J Dermatol 169(2): 412-416.
Lopez-Larrea, C., J. C. Torre Alonso, et al. (1990). "HLA antigens in psoriatic arthritis
subtypes of a Spanish population." Ann Rheum Dis 49(5): 318-319.
Lopez de Lapuente, A., A. Feliu, et al. (2016). "Correction: Novel Insights into the
Multiple Sclerosis Risk Gene ANKRD55." J Immunol 197(10): 4177.
Lopez de Lapuente, A., A. Feliu, et al. (2016). "Novel Insights into the Multiple Sclerosis
Risk Gene ANKRD55." J Immunol 196(11): 4553-4565.
Love, T. J., Y. Zhu, et al. (2012). "Obesity and the risk of psoriatic arthritis: a
population-based study." Ann Rheum Dis 71(8): 1273-1277.
Lowes, M. A., T. Kikuchi, et al. (2008). "Psoriasis vulgaris lesions contain discrete
populations of Th1 and Th17 T cells." J Invest Dermatol 128(5): 1207-1211.
Mabuchi, T. and N. Hirayama (2016). "Binding Affinity and Interaction of LL-37 with
HLA-C*06:02 in Psoriasis." J Invest Dermatol 136(9): 1901-1903.
Macaubas, C., E. Wong, et al. (2016). "Altered signaling in systemic juvenile idiopathic
arthritis monocytes." Clin Immunol 163: 66-74.
Macfarlane, G. J., M. Beasley, et al. (2015). "Can large surveys conducted on highly
selected populations provide valid information on the epidemiology of common
health conditions? An analysis of UK Biobank data on musculoskeletal pain." Br
J Pain 9(4): 203-212.
Maher, B. (2008). "Personal genomes: The case of the missing heritability." Nature
456(7218): 18-21.
Manning, V. L., M. V. Hurley, et al. (2012). "Are patients meeting the updated physical
activity guidelines? Physical activity participation, recommendation, and
preferences among inner-city adults with rheumatic diseases." J Clin Rheumatol
18(8): 399-404.
Martin, D. A., J. E. Towne, et al. (2013). "The emerging role of IL-17 in the
pathogenesis of psoriasis: preclinical and clinical findings." J Invest Dermatol
133(1): 17-26.
Matthews, A. G., D. M. Finkelstein, et al. (2008). "Analysis of familial aggregation
studies with complex ascertainment schemes." Stat Med 27(24): 5076-5092.
McDonough, E., R. Ayearst, et al. (2014). "Depression and anxiety in psoriatic disease:
prevalence and associated factors." J Rheumatol 41(5): 887-896.
McGonagle, D. (2005). "Imaging the joint and enthesis: insights into pathogenesis of
psoriatic arthritis." Ann Rheum Dis 64 Suppl 2: ii58-60.
McGonagle, D., Z. Ash, et al. (2011). "The early phase of psoriatic arthritis." Ann
Rheum Dis 70 Suppl 1: i71-76.
McGonagle, D., K. G. Hermann, et al. (2015). "Differentiation between osteoarthritis
and psoriatic arthritis: implications for pathogenesis and treatment in the
biologic therapy era." Rheumatology (Oxford) 54(1): 29-38.
McGonagle, D., R. J. Lories, et al. (2007). "The concept of a "synovio-entheseal
complex" and its implications for understanding joint inflammation and damage
in psoriatic arthritis and beyond." Arthritis Rheum 56(8): 2482-2491.
Mease, P. J., D. D. Gladman, et al. (2014). "Comparative performance of psoriatic
arthritis screening tools in patients with psoriasis in European/North American
dermatology clinics." J Am Acad Dermatol 71(4): 649-655.
Mehta, N. N., Y. Yu, et al. (2011). "Attributable risk estimate of severe psoriasis on
major cardiovascular events." Am J Med 124(8): 775 e771-776.
Miele, L., S. Vallone, et al. (2009). "Prevalence, characteristics and severity of non-
alcoholic fatty liver disease in patients with chronic plaque psoriasis." J Hepatol
51(4): 778-786.
Mills, R. J. and C. A. Young (2008). "A medical definition of fatigue in multiple
sclerosis." QJM 101(1): 49-60.
Mischke, D., B. P. Korge, et al. (1996). "Genes encoding structural proteins of
epidermal cornification and S100 calcium-binding proteins form a gene complex
("epidermal differentiation complex") on human chromosome 1q21." J Invest
Dermatol 106(5): 989-992.
Mishra, S., H. Kancharla, et al. (2017). "Comparison of four validated psoriatic arthritis
screening tools in diagnosing psoriatic arthritis in patients with psoriasis
(COMPAQ Study)." Br J Dermatol 176(3): 765-770.
Mitchell, K. J. (2012). "What is complex about complex disorders?" Genome Biol
13(1): 237.
Moll, J. M. and V. Wright (1973). "Familial occurrence of psoriatic arthritis." Ann
Rheum Dis 32(3): 181-201.
Moll, J. M. and V. Wright (1973). "Psoriatic arthritis." Semin Arthritis Rheum 3(1): 55-
78.
Monteiro, R. and I. Azevedo (2010). "Chronic inflammation in obesity and the
metabolic syndrome." Mediators Inflamm 2010.
Morrow, J. D., B. Frei, et al. (1995). "Increase in circulating products of lipid
peroxidation (F2-isoprostanes) in smokers. Smoking as a cause of oxidative
damage." N Engl J Med 332(18): 1198-1203.
Mrowietz, U. and S. Domm (2013). "Systemic steroids in the treatment of psoriasis:
what is fact, what is fiction?" J Eur Acad Dermatol Venereol 27(8): 1022-1025.
Murase, J. E., K. K. Chan, et al. (2005). "Hormonal effect on psoriasis in pregnancy and
post partum." Arch Dermatol 141(5): 601-606.
Myers, A., L. J. Kay, et al. (2005). "Recurrence risk for psoriasis and psoriatic arthritis
within sibships." Rheumatology (Oxford) 44(6): 773-776.
Nair, R. P., K. C. Duffin, et al. (2009). "Genome-wide scan reveals association of psoriasis with IL-23 and NF-kappaB pathways." Nat Genet 41(2): 199-204.
Nair, R. P., x..c., et al. (2013). "Meta-analysis of psoriasis and psoriatic arthritis
identifies three new susceptibility loci." J Invest Dermatol 133(S136).
Navarini, A. A., A. D. Burden, et al. (2017). "European consensus statement on
phenotypes of pustular psoriasis." J Eur Acad Dermatol Venereol 31(11): 1792-
1799.
Neimann, A. L., D. B. Shin, et al. (2006). "Prevalence of cardiovascular risk factors in
patients with psoriasis." J Am Acad Dermatol 55(5): 829-835.
Nguyen, U. D. T., Y. Zhang, et al. (2018). "Smoking paradox in the development of
psoriatic arthritis among patients with psoriasis: a population-based study." Ann
Rheum Dis 77(1): 119-123.
Nguyen, U. S. D. T., Y. Zhang, et al. (2015). "The Smoking Paradox in the Development
of Psoriatic Arthritis Among Psoriasis Patients [abstract]." Arthritis Rheumatol
67(suppl 10).
Ni, C. and M. W. Chiu (2014). "Psoriasis and comorbidities: links and risks." Clin
Cosmet Investig Dermatol 7: 119-132.
Nograles, K. E., B. Davidovici, et al. (2010). "New insights in the immunologic basis of
psoriasis." Semin Cutan Med Surg 29(1): 3-9.
Nurmohamed, M. T., M. Heslinga, et al. (2015). "Cardiovascular comorbidity in
rheumatic diseases." Nat Rev Rheumatol 11(12): 693-704.
O'Connor, L. J. and A. L. Price (2018). "Distinguishing genetic correlation from
causation across 52 diseases and complex traits." bioRxiv.
O'Rielly, D. D. and P. Rahman (2014). "Genetics of psoriatic arthritis." Best Pract Res
Clin Rheumatol 28(5): 673-685.
Obuch, M. L., T. A. Maurer, et al. (1992). "Psoriasis and human immunodeficiency virus
infection." J Am Acad Dermatol 27(5 Pt 1): 667-673.
Ockenfels, H. M., C. Keim-Maas, et al. (1996). "Ethanol enhances the IFN-gamma, TGF-
alpha and IL-6 secretion in psoriatic co-cultures." Br J Dermatol 135(5): 746-
751.
Ogdie, A. and J. M. Gelfand (2015). "Clinical Risk Factors for the Development of
Psoriatic Arthritis Among Patients with Psoriasis: A Review of Available
Evidence." Curr Rheumatol Rep 17(10): 64.
Ogdie, A., S. K. Grewal, et al. (2017). "Risk of incident liver disease in patients with
psoriasis, psoriatic arthritis, and rheumatoid arthritis: a population-based
study." J Invest Dermatol.
Ogdie, A., S. Langan, et al. (2013). "Prevalence and treatment patterns of psoriatic
arthritis in the UK." Rheumatology (Oxford) 52(3): 568-575.
Ogdie, A., S. Schwartzman, et al. (2015). "Recognizing and managing comorbidities in
psoriatic arthritis." Curr Opin Rheumatol 27(2): 118-126.
Ogdie, A. and P. Weiss (2015). "The Epidemiology of Psoriatic Arthritis." Rheum Dis
Clin North Am 41(4): 545-568.
Ogdie, A., Y. Yu, et al. (2015). "Risk of major cardiovascular events in patients with
psoriatic arthritis, psoriasis and rheumatoid arthritis: a population-based cohort
study." Ann Rheum Dis 74(2): 326-332.
Oishi, T., A. Iida, et al. (2008). "A functional SNP in the NKX2.5-binding site of ITPR3
promoter is associated with susceptibility to systemic lupus erythematosus in
Japanese population." J Hum Genet 53(2): 151-162.
Okada, Y., B. Han, et al. (2014). "Fine mapping major histocompatibility complex
associations in psoriasis and its clinical subtypes." Am J Hum Genet 95(2): 162-
172.
Okada, Y., D. Wu, et al. (2014). "Genetics of rheumatoid arthritis contributes to
biology and drug discovery." Nature 506(7488): 376-381.
Okura, Y., L. H. Urban, et al. (2004). "Agreement between self-report questionnaires
and medical record data was substantial for diabetes, hypertension, myocardial
infarction and stroke but not for heart failure." J Clin Epidemiol 57(10): 1096-
1103.
Pakhomov, S. V., S. J. Jacobsen, et al. (2008). "Agreement between patient-reported
symptoms and their documentation in the medical record." Am J Manag Care
14(8): 530-539.
Palla, L. and F. Dudbridge (2015). "A Fast Method that Uses Polygenic Scores to
Estimate the Variance Explained by Genome-wide Marker Panels and the
Proportion of Variants Affecting a Trait." Am J Hum Genet 97(2): 250-259.
Parisi, R., D. P. Symmons, et al. (2013). "Global epidemiology of psoriasis: a systematic review of incidence and prevalence." J Invest Dermatol 133(2): 377-385.
Parkes, M., A. Cortes, et al. (2013). "Genetic insights into common pathways and
complex relationships among immune-mediated diseases." Nat Rev Genet
14(9): 661-673.
Paschos, P. and K. Paletas (2009). "Non alcoholic fatty liver disease and metabolic
syndrome." Hippokratia 13(1): 9-19.
Pattison, E., B. J. Harrison, et al. (2008). "Environmental risk factors for the
development of psoriatic arthritis: results from a case-control study." Ann
Rheum Dis 67(5): 672-676.
Pedersen, O. B., A. J. Svendsen, et al. (2008). "On the heritability of psoriatic arthritis.
Disease concordance among monozygotic and dizygotic twins." Ann Rheum Dis
67(10): 1417-1421.
Peloso, P., M. Behl, et al. (1997). "The psoriasis and arthritis questionnaire (PAQ) in
detection of arthritis among patients with psoriasis [abstract]." Arthritis Rheum
40(Suppl:S64).
Pickrell, J. K., T. Berisa, et al. (2016). "Detection and interpretation of shared genetic
influences on 42 human traits." Nat Genet 48(7): 709-717.
Pietrzak, D., A. Pietrzak, et al. (2017). "Digestive system in psoriasis: an update." Arch
Dermatol Res 309(9): 679-693.
Polachek, A., Z. Touma, et al. (2017). "Risk of Cardiovascular Morbidity in Patients
With Psoriatic Arthritis: A Meta-Analysis of Observational Studies." Arthritis
Care Res (Hoboken) 69(1): 67-74.
Prussick, R. B. and L. Miele (2017). "Nonalcoholic fatty liver disease in patients with
psoriasis: A consequence of systemic inflammatory burden?" Br J Dermatol.
Punzi, L., M. Pianon, et al. (1998). "Clinical, laboratory and immunogenetic aspects of
post-traumatic psoriatic arthritis: a study of 25 patients." Clin Exp Rheumatol
16(3): 277-281.
Qureshi, A. A., H. K. Choi, et al. (2009). "Psoriasis and the risk of diabetes and
hypertension: a prospective study of US female nurses." Arch Dermatol 145(4):
379-382.
Qureshi, A. A., P. L. Dominguez, et al. (2010). "Alcohol intake and risk of incident
psoriasis in US women: a prospective study." Arch Dermatol 146(12): 1364-
1369.
R Development Core Team (2008). "R: A language and environment for statistical
computing."
Raaby, L., O. Ahlehoff, et al. (2017). "Psoriasis and cardiovascular events: updating the
evidence." Arch Dermatol Res 309(3): 225-228.
Rahman, P. and J. T. Elder (2005). "Genetic epidemiology of psoriasis and psoriatic
arthritis." Ann Rheum Dis 64 Suppl 2: ii37-39; discussion ii40-41.
Rapp, S. R., S. R. Feldman, et al. (1999). "Psoriasis causes as much disability as other
major medical diseases." J Am Acad Dermatol 41(3 Pt 1): 401-407.
Ray-Jones, H., S. Eyre, et al. (2016). "One SNP at a Time: Moving beyond GWAS in
Psoriasis." J Invest Dermatol.
Raychaudhuri, S. K., A. Saxena, et al. (2015). "Role of IL-17 in the pathogenesis of
psoriatic arthritis and axial spondyloarthritis." Clin Rheumatol 34(6): 1019-
1023.
Reich, K. (2009). "Approach to managing patients with nail psoriasis." J Eur Acad
Dermatol Venereol 23 Suppl 1: 15-21.
Richiardi, L., C. Pizzi, et al. (2013). "Commentary: Representativeness is usually not
necessary and often should be avoided." Int J Epidemiol 42(4): 1018-1022.
Risch, N. (1990). "Linkage strategies for genetically complex traits. III. The effect of
marker polymorphism on analysis of affected relative pairs." Am J Hum Genet
46(2): 242-253.
Risch, N. and K. Merikangas (1996). "The future of genetic studies of complex human
diseases." Science 273(5281): 1516-1517.
Roach, J. C., K. Deutsch, et al. (2006). "Genetic mapping at 3-kilobase resolution
reveals inositol 1,4,5-triphosphate receptor 3 as a risk factor for type 1
diabetes in Sweden." Am J Hum Genet 79(4): 614-627.
Robinson, D., Jr., M. Hackett, et al. (2006). "Co-occurrence and comorbidities in
patients with immune-mediated inflammatory disorders: an exploration using
US healthcare claims data, 2001-2002." Curr Med Res Opin 22(5): 989-1000.
Rosen, C. F., F. Mussani, et al. (2012). "Patients with psoriatic arthritis have worse
quality of life than those with psoriasis alone." Rheumatology (Oxford) 51(3):
571-576.
Sagi, L. and H. Trau (2011). "The Koebner phenomenon." Clin Dermatol 29(2): 231-
236.
Sahu, M. and J. G. Prasuna (2016). "Twin Studies: A Unique Epidemiological Tool."
Indian J Community Med 41(3): 177-182.
Samarasekera, E. J., J. M. Neilson, et al. (2013). "Incidence of cardiovascular disease in
individuals with psoriasis: a systematic review and meta-analysis." J Invest
Dermatol 133(10): 2340-2346.
Scarpa, R., A. Del Puente, et al. (1992). "Interplay between environmental factors,
articular involvement, and HLA-B27 in patients with psoriatic arthritis." Ann
Rheum Dis 51(1): 78-79.
Schopf, R. E., H. M. Ockenfels, et al. (1996). "Ethanol enhances the mitogen-driven
lymphocyte proliferation in patients with psoriasis." Acta Derm Venereol 76(4):
260-263.
Setty, A. R., G. Curhan, et al. (2007). "Smoking and the risk of psoriasis in women:
Nurses' Health Study II." Am J Med 120(11): 953-959.
Shenefelt, P. D. (2011). "Psychodermatological disorders: recognition and treatment."
Int J Dermatol 50(11): 1309-1322.
Sheng, Y., X. Jin, et al. (2014). "Sequencing-based approach identified three new
susceptibility loci for psoriasis." Nat Commun 5: 4331.
Shi, H., N. Mancuso, et al. (2017). "Local Genetic Correlation Gives Insights into the
Shared Genetic Architecture of Complex Traits." Am J Hum Genet 101(5):
737-751.
Skelly, A. C., J. R. Dettori, et al. (2012). "Assessing bias: the importance of considering
confounding." Evid Based Spine Care J 3(1): 9-12.
Skoie, I. M., I. Dalen, et al. (2017). "Fatigue in psoriasis: a controlled study." Br J
Dermatol 177(2): 505-512.
Skroza, N., I. Proietti, et al. (2013). "Correlations between psoriasis and inflammatory
bowel diseases." Biomed Res Int 2013: 983902.
Smith, G. D. and S. Ebrahim (2004). "Mendelian randomization: prospects, potentials,
and limitations." Int J Epidemiol 33(1): 30-42.
Snekvik, I., C. H. Smith, et al. (2017). "Obesity, Waist Circumference, Weight Change,
and Risk of Incident Psoriasis: Prospective Data from the HUNT Study." J Invest
Dermatol 137(12): 2484-2490.
Sobolewski, P., I. Walecka, et al. (2017). "Nail involvement in psoriatic arthritis."
Reumatologia 55(3): 131-135.
Solinger, A. M. and E. V. Hess (1993). "Rheumatic diseases and AIDS--is the association
real?" J Rheumatol 20(4): 678-683.
Solomon, D. H., T. J. Love, et al. (2010). "Risk of diabetes among patients with
rheumatoid arthritis, psoriatic arthritis and psoriasis." Ann Rheum Dis 69(12):
2114-2117.
Solovieff, N., C. Cotsapas, et al. (2013). "Pleiotropy in complex traits: challenges and
strategies." Nat Rev Genet 14(7): 483-495.
Soltani-Arabshahi, R., B. Wong, et al. (2010). "Obesity in early adulthood as a risk
factor for psoriatic arthritis." Arch Dermatol 146(7): 721-726.
Song, J. W. and K. C. Chung (2010). "Observational studies: cohort and case-control
studies." Plast Reconstr Surg 126(6): 2234-2242.
Song, Y. W. and E. H. Kang (2010). "Autoantibodies in rheumatoid arthritis:
rheumatoid factors and anticitrullinated protein antibodies." QJM 103(3): 139-
146.
Sopori, M. (2002). "Effects of cigarette smoke on the immune system." Nat Rev
Immunol 2(5): 372-377.
Spain, S. L. and J. C. Barrett (2015). "Strategies for fine-mapping complex traits." Hum
Mol Genet 24(R1): R111-119.
Springate, D. A., R. Parisi, et al. (2017). "Incidence, prevalence and mortality of patients
with psoriasis: a U.K. population-based cohort study." Br J Dermatol 176(3):
650-658.
Stahl, E. A., S. Raychaudhuri, et al. (2010). "Genome-wide association study meta-
analysis identifies seven new rheumatoid arthritis risk loci." Nat Genet 42(6):
508-514.
Stuart, P. E., R. P. Nair, et al. (2010). "Genome-wide association analysis identifies three
psoriasis susceptibility loci." Nat Genet 42(11): 1000-1004.
Stuart, P. E., R. P. Nair, et al. (2015). "Genome-wide Association Analysis of Psoriatic
Arthritis and Cutaneous Psoriasis Reveals Differences in Their Genetic
Architecture." Am J Hum Genet 97(6): 816-836.
Sudlow, C., J. Gallacher, et al. (2015). "UK biobank: an open access resource for
identifying the causes of a wide range of complex diseases of middle and old
age." PLoS Med 12(3): e1001779.
Sugiura, K., A. Takemoto, et al. (2013). "The majority of generalized pustular psoriasis
without psoriasis vulgaris is caused by deficiency of interleukin-36 receptor
antagonist." J Invest Dermatol 133(11): 2514-2521.
Sun, L. and X. Zhang (2014). "The immunological and genetic aspects in psoriasis."
Applied Informatics 1(3).
Sun, L. D., H. Cheng, et al. (2010). "Association analyses identify six new psoriasis
susceptibility loci in the Chinese population." Nat Genet 42(11): 1005-1009.
Symmons, D. P. and S. E. Gabriel (2011). "Epidemiology of CVD in rheumatic disease,
with a focus on RA and SLE." Nat Rev Rheumatol 7(7): 399-408.
Taglione, E., M. L. Vatteroni, et al. (1999). "Hepatitis C virus infection: prevalence in
psoriasis and psoriatic arthritis." J Rheumatol 26(2): 370-372.
Tam, L. S., B. Tomlinson, et al. (2008). "Cardiovascular risk profile of patients with
psoriatic arthritis compared to controls--the role of inflammation."
Rheumatology (Oxford) 47(5): 718-723.
Tan, E. S., W. S. Chong, et al. (2012). "Nail psoriasis: a review." Am J Clin Dermatol
13(6): 375-388.
Tang, H., X. Jin, et al. (2014). "A large-scale screen for coding variants predisposing to
psoriasis." Nat Genet 46(1): 45-50.
Taylor, W., D. Gladman, et al. (2006). "Classification criteria for psoriatic arthritis:
development of new criteria from a large international study." Arthritis Rheum
54(8): 2665-2673.
Tejada Cdos, S., R. A. Mendoza-Sassi, et al. (2011). "Impact on the quality of life of
dermatological patients in southern Brazil." An Bras Dermatol 86(6): 1113-
1121.
Tey, H. L., H. L. Ee, et al. (2010). "Risk factors associated with having psoriatic arthritis
in patients with cutaneous psoriasis." J Dermatol 37(5): 426-430.
Thompson, S. D., M. C. Marion, et al. (2012). "Genome-wide association analysis of
juvenile idiopathic arthritis identifies a new susceptibility locus at chromosomal
region 3q13." Arthritis Rheum 64(8): 2781-2791.
Thorarensen, S. M., N. Lu, et al. (2017). "Physical trauma recorded in primary care is
associated with the onset of psoriatic arthritis among patients with psoriasis."
Ann Rheum Dis 76(3): 521-525.
Thumboo, J., K. Uramoto, et al. (2002). "Risk factors for the development of psoriatic
arthritis: a population based nested case control study." J Rheumatol 29(4):
757-762.
Tierney, M., A. Fraser, et al. (2012). "Physical activity in rheumatoid arthritis: a
systematic review." J Phys Act Health 9(7): 1036-1048.
Tilling, L., S. Townsend, et al. (2006). "Methotrexate and hepatic toxicity in rheumatoid
arthritis and psoriatic arthritis." Clin Drug Investig 26(2): 55-62.
Tinazzi, I., S. Adami, et al. (2012). "The early psoriatic arthritis screening questionnaire:
a simple and fast method for the identification of arthritis in patients with
psoriasis." Rheumatology (Oxford) 51(11): 2058-2063.
Tiosano, S., A. Farhi, et al. (2017). "Schizophrenia among patients with systemic lupus
erythematosus: population-based cross-sectional study." Epidemiol Psychiatr Sci
26(4): 424-429.
Tobacco and Genetics Consortium (2010). "Genome-wide meta-analyses identify multiple loci
associated with smoking behavior." Nat Genet 42(5): 441-447.
Tobin, A. M., M. Sadlier, et al. (2017). "Fatigue as a symptom in psoriasis and psoriatic
arthritis: an observational study." Br J Dermatol 176(3): 827-828.
Tsoi, L. C., S. L. Spain, et al. (2015). "Enhanced meta-analysis and replication studies
identify five new psoriasis susceptibility loci." Nat Commun 6: 7001.
Tsoi, L. C., S. L. Spain, et al. (2012). "Identification of 15 new psoriasis susceptibility loci
highlights the role of innate immunity." Nat Genet 44(12): 1341-1348.
Tsoi, L. C., P. E. Stuart, et al. (2017). "Large scale meta-analysis characterizes genetic
architecture for common psoriasis associated variants." Nat Commun 8: 15382.
Tuder, R. M. and I. Petrache (2012). "Pathogenesis of chronic obstructive pulmonary
disease." J Clin Invest 122(8): 2749-2755.
Turley, P., R. K. Walters, et al. (2018). "Multi-trait analysis of genome-wide association
summary statistics using MTAG." Nat Genet 50(2): 229-237.
Ungprasert, P., N. Srivali, et al. (2016). "Risk of incident chronic obstructive pulmonary
disease in patients with rheumatoid arthritis: A systematic review and meta-
analysis." Joint Bone Spine 83(3): 290-294.
Ursum, J., J. C. Korevaar, et al. (2013). "Prevalence of chronic diseases at the onset of inflammatory arthritis: a population-based study." Fam Pract 30(6): 615-620.
Ursum, J., M. M. Nielen, et al. (2013). "Increased risk for chronic comorbid disorders in
patients with inflammatory arthritis: a population based study." BMC Fam Pract
14: 199.
van der Vaart, H., D. S. Postma, et al. (2004). "Acute effects of cigarette smoke on
inflammation and oxidative stress: a review." Thorax 59(8): 713-721.
van der Voort, E. A., E. M. Koehler, et al. (2014). "Psoriasis is independently associated
with nonalcoholic fatty liver disease in patients 55 years old or older: Results
from a population-based study." J Am Acad Dermatol 70(3): 517-524.
van der Voort, E. A., E. M. Koehler, et al. (2016). "Increased Prevalence of Advanced
Liver Fibrosis in Patients with Psoriasis: A Cross-sectional Analysis from the
Rotterdam Study." Acta Derm Venereol 96(2): 213-217.
van Lent, P. L., C. G. Figdor, et al. (2003). "Expression of the dendritic cell-associated
C-type lectin DC-SIGN by inflammatory matrix metalloproteinase-producing
macrophages in rheumatoid arthritis synovium and interaction with intercellular
adhesion molecule 3-positive T cells." Arthritis Rheum 48(2): 360-369.
Verhoeven, E. W., F. W. Kraaimaat, et al. (2007). "Prevalence of physical symptoms of
itch, pain and fatigue in patients with skin diseases in general practice." Br J
Dermatol 156(6): 1346-1349.
Vessey, M. P., R. Painter, et al. (2000). "Skin disorders in relation to oral contraception
and other factors, including age, social class, smoking and body mass index.
Findings in a large cohort study." Br J Dermatol 143(4): 815-820.
Wagner, G. P. and J. Zhang (2011). "The pleiotropic structure of the genotype-
phenotype map: the evolvability of complex organisms." Nat Rev Genet 12(3):
204-213.
Walsh, J. A., K. Callis Duffin, et al. (2013). "Limitations in screening instruments for
psoriatic arthritis: a comparison of instruments in patients with psoriasis." J
Rheumatol 40(3): 287-293.
Wang, J., A. B. Kay, et al. (2009). "Alcohol consumption is not protective for systemic
lupus erythematosus." Ann Rheum Dis 68(3): 345-348.
Warburton, D. E., C. W. Nicol, et al. (2006). "Health benefits of physical activity: the
evidence." CMAJ 174(6): 801-809.
Warnecke, C., I. Manousaridis, et al. (2011). "Cardiovascular and metabolic risk profile
in German patients with moderate and severe psoriasis: a case control study."
Eur J Dermatol 21(5): 761-770.
Weiss, S. C., A. B. Kimball, et al. (2002). "Quantifying the harmful effect of psoriasis on
health-related quality of life." J Am Acad Dermatol 47(4): 512-518.
Wilson, F. C., M. Icen, et al. (2009). "Incidence and clinical predictors of psoriatic
arthritis in patients with psoriasis: a population-based study." Arthritis Rheum
61(2): 233-239.
Winchester, R., G. Minevich, et al. (2012). "HLA associations reveal genetic
heterogeneity in psoriatic arthritis and in the psoriasis phenotype." Arthritis
Rheum 64(4): 1134-1144.
Wu, S., E. Cho, et al. (2015). "Alcohol intake and risk of incident psoriatic arthritis in
women." J Rheumatol 42(5): 835-840.
Wu, S., J. Han, et al. (2015). "Use of aspirin, non-steroidal anti-inflammatory drugs, and
acetaminophen (paracetamol), and risk of psoriasis and psoriatic arthritis: a
cohort study." Acta Derm Venereol 95(2): 217-223.
Wu, S., W. Q. Li, et al. (2014). "Hypercholesterolemia and risk of incident psoriasis
and psoriatic arthritis in US women." Arthritis Rheumatol 66(2): 304-310.
Wu, Y., D. Mills, et al. (2008). "Psoriasis: cardiovascular risk factors and other disease
comorbidities." J Drugs Dermatol 7(4): 373-377.
Yin, X., H. Q. Low, et al. (2015). "Genome-wide meta-analysis identifies multiple novel
associations and ethnic heterogeneity of psoriasis susceptibility." Nat Commun
6: 6916.
Zheng, J., D. Baird, et al. (2017). "Recent Developments in Mendelian Randomization
Studies." Curr Epidemiol Rep 4(4): 330-345.
Zheng, J., A. M. Erzurumluoglu, et al. (2017). "LD Hub: a centralized database and web
interface to perform LD score regression that maximizes the potential of
summary level GWAS data for SNP heritability and genetic correlation
analysis." Bioinformatics 33(2): 272-279.
Zhu, T. Y., E. K. Li, et al. (2012). "Cardiovascular risk in patients with psoriatic
arthritis." Int J Rheumatol 2012: 714321.
Zhu, Z., V. Anttila, et al. (2018). "Statistical power and utility of meta-analysis methods
for cross-phenotype genome-wide association studies." PLoS One 13(3):
e0193256.
Zuo, X., L. Sun, et al. (2015). "Whole-exome SNP array identifies 15 new susceptibility
loci for psoriasis." Nat Commun 6: 6793.
Appendix
Appendix Table 1 describes the process followed during the assessment visit at a UK
Biobank centre.
Appendix Table 1 | The sequence of the assessment visit (table taken from http://www.ukbiobank.ac.uk/).
Visit station                  Assessments undertaken
Reception                      1. Registration
                               2. A USB key was provided to each participant
Touch screen questionnaire     1. Consent
                               2. Questionnaire
                               3. Hearing test
                               4. Cognitive function tests
Interview and blood pressure   1. Interview with a research nurse
                               2. Blood pressure measurement
                               3. Measurement of arterial stiffness
Eye measurements               1. Visual acuity
                               2. Auto-refraction
                               3. Intraocular pressure
                               4. Retinal image
Physical measurements          1. Height (standing and sitting)
                               2. Hip and waist measurement
                               3. Weight and body composition measurement
                               4. Hand-grip strength
                               5. Ultrasound bone densitometry
                               6. Lung function test (spirometry)
Physical fitness/cardio test   1. Cycling
Sample collection              1. Blood sample
                               2. Urine sample
                               3. Saliva sample
                               4. Consent and result summary printed
Web-based diet questions       1. Dietary questionnaire
Appendix Figure 1 and Appendix Figure 2 depict the IPAQ questionnaire along with
the scoring protocol used in the current thesis
(https://sites.google.com/site/theipaq/questionnaire_links).
Appendix Figure 1 | Short version of the International Physical Activity Questionnaire (IPAQ)
Appendix Table 2 presents the genetic correlations of PsA and JIA with RA and SLE
obtained using LD Hub (http://ldsc.broadinstitute.org/ldhub/), as a technical validation
of a subset of the results presented in the main body of the thesis. The results are
identical, confirming the successful harmonization of the datasets and the correct
application of LD score regression.
Appendix Table 2 | Genetic correlations between PsA, JIA and RA and SLE using LD Hub
Trait 1   Trait 2   Genetic correlation (rg)   p-value
PsA       RA        0.30                       0.002
PsA       SLE       0.14                       0.25
JIA       RA        0.49                       8.22e-05
JIA       SLE       0.59                       0.0002
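As a rough consistency check on the table above, the standard error implied by each reported pair (rg, p-value) can be recovered under the assumption that the p-value comes from a two-sided Wald test, z = rg / SE. The function name `implied_se` is illustrative and is not part of LD Hub or LD score regression software; this is a minimal sketch under that stated assumption, not the code used in the thesis.

```python
from statistics import NormalDist


def implied_se(rg: float, p_value: float) -> float:
    """Standard error implied by rg and a two-sided Wald p-value (assumption)."""
    # Find |z| such that 2 * (1 - Phi(|z|)) equals the reported p-value.
    z = NormalDist().inv_cdf(1.0 - p_value / 2.0)
    return abs(rg) / z


# Values copied from Appendix Table 2.
table2 = {
    ("PsA", "RA"): (0.30, 0.002),
    ("PsA", "SLE"): (0.14, 0.25),
    ("JIA", "RA"): (0.49, 8.22e-05),
    ("JIA", "SLE"): (0.59, 0.0002),
}

for (trait1, trait2), (rg, p) in table2.items():
    print(f"{trait1}-{trait2}: rg={rg:.2f}, implied SE={implied_se(rg, p):.3f}")
```

Because the tabulated values are rounded, the implied standard errors are approximate and are useful only for sanity-checking the order of magnitude.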
cFDR analysis using JIA as the principal disease
Enrichment plots
In Appendix Figure 3, strong enrichment of JIA-associated SNPs was observed when
conditioning on RA and SLE (bottom plots): the proportion of true effects in JIA varied
considerably with the level of association for RA and SLE, and the separation between
the different curves was greater. The two top plots show a less robust enrichment
pattern for JIA conditioned on AS and PsA.
Appendix Figure 3 | Q-Q plots for JIA conditional on AS (top left), PsA (top right), RA (bottom left) and SLE (bottom right). Y axes show log10(P’) for the principal disease and X axes show the log quantile of p-values in sets of SNPs. The degree of leftward shift of a black point from the diagonal is proportional to the unconditional FDR of that p-value for the principal phenotype, and the degree of leftward shift of a coloured point is proportional to the conditional FDR of the p-value for the principal phenotype, given the p-value cut-off for the conditional phenotype that corresponds to the colour. Each colour corresponds to the Q-Q plot for p_JIA amongst the subset of SNPs with p_AS, p_PsA, p_RA or p_SLE less than the indicated cut-off. A leftward shift with decreasing cut-off indicates that SNPs associated with the conditional phenotype (AS, PsA, RA or SLE) are more likely to be associated with the principal phenotype (JIA), presumably due to pleiotropic effects on the phenotypes.
JIA loci identified with cFDR
The greater enrichment seen when conditioning on RA and SLE is reflected in
Appendix Figure 4 (bottom plots) by the larger number of newly identified
significant loci for JIA. More specifically, 1083 significant SNPs were identified at a
significance threshold of cFDR<1.43e-02 and 348 at a significance threshold of
cFDR<0.014 when JIA was conditioned on RA and SLE, respectively.
Moreover, 83 significant SNPs (cFDR<2.24e-02) and 161 significant SNPs
(cFDR<1.45e-02) were identified when JIA was conditioned on AS and PsA,
respectively. The final list of independent loci can be seen in Appendix Table 3.
Associations with known genes were identified (for example STAT4, RUNX3,
ANKRD55, SH2B3, TYK2); however, the identified SNPs may differ from those
reported in previous studies, as this method works as a SNP prioritization tool and
can recognise additional variants. Among the novel SNPs identified, 16 were located in
intergenic regions, some of which have been associated with susceptibility to
immune-mediated diseases such as RA, celiac disease and MS. Notable
novel SNPs include rs6922111 (ZKSCAN3), rs78264909 (ZSCAN12) and rs13215804
(ZSCAN23), which have been found to be associated with both RA and schizophrenia.
In addition, rs413024 (SOCS1), a susceptibility variant for primary biliary cirrhosis and
PSO, was associated with JIA when leveraging power from the PsA cohort. SOCS1
encodes a cytokine signalling inhibitor that regulates IFN signal transduction. A previous
study reported changes in SOCS1 levels in systemic JIA monocytes, which provides
evidence of inhibition of IFN signalling in these cells (Macaubas, Wong et al. 2016). An
association was also found with rs2661798 (SPRED2), a gene associated with RA, SLE and
IBD susceptibility. There is evidence of association at 2p for JIA, but it has not been
replicated in other studies (Thompson, Marion et al. 2012).
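The SNP selection described above can be illustrated with the basic empirical conditional FDR estimator, in which the cFDR of a SNP with principal p-value p_i and conditional p-value q_i is estimated as p_i multiplied by the number of SNPs with conditional p-value at most q_i, divided by the number of SNPs with both principal p-value at most p_i and conditional p-value at most q_i. The sketch below is a simplified illustration only: the function names and the toy p-values are hypothetical, the adjustment for shared controls is not reproduced, and the cut-off is simply passed in rather than derived as in the thesis.

```python
from typing import List, Sequence


def empirical_cfdr(p_principal: Sequence[float],
                   p_conditional: Sequence[float]) -> List[float]:
    """Basic empirical cFDR estimate for every SNP (simplified sketch)."""
    if len(p_principal) != len(p_conditional):
        raise ValueError("p-value vectors must have the same length")
    pairs = list(zip(p_principal, p_conditional))
    estimates = []
    for p_i, q_i in pairs:
        # SNPs at least as strongly associated with the conditional phenotype.
        n_cond = sum(1 for _, q in pairs if q <= q_i)
        # Of those, SNPs at least as strongly associated with the principal one.
        # This count is never zero because the SNP itself is always included.
        n_both = sum(1 for p, q in pairs if p <= p_i and q <= q_i)
        estimates.append(min(1.0, p_i * n_cond / n_both))
    return estimates


def select_snps(snp_ids: Sequence[str],
                p_principal: Sequence[float],
                p_conditional: Sequence[float],
                cutoff: float) -> List[str]:
    """Return the SNPs whose estimated cFDR falls below the given cut-off."""
    cfdr = empirical_cfdr(p_principal, p_conditional)
    return [snp for snp, value in zip(snp_ids, cfdr) if value < cutoff]


# Toy p-values made up for illustration; these are not results from the thesis.
ids = ["snp1", "snp2", "snp3", "snp4"]
p_jia = [1e-4, 0.5, 0.02, 0.8]
p_ra = [1e-6, 0.9, 0.3, 0.01]
print(select_snps(ids, p_jia, p_ra, cutoff=1.43e-02))
```

The quadratic loop keeps the estimator easy to read; a genome-wide analysis would need a sorted or vectorised implementation.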
Appendix Figure 4 | cFDR results for JIA conditioned on AS (top left), PsA (top right), RA (bottom left) and SLE (bottom right). The black vertical line marks the GWAS significance threshold of 5e-08. The red dots are the genome-wide significant SNPs for the principal disease (here, JIA), whereas the orange dots (on the left side of the vertical line) are the SNPs identified as significant for JIA after conditioning on the conditional disease (AS, PsA, RA or SLE). Black dots show a random sample of the observed p-value pairs. Note the leftward shift of colours, corresponding to an increased p-value threshold for association with JIA for SNPs with low p-values for the conditional diseases.
Appendix Table 3 | Loci associated with JIA after applying cFDR analysis using AS, PsA, RA and SLE as conditional phenotypes
Chr  Position   rsid        Effect allele  Other allele  MAF   Conditional phenotype  Principal p-value  Conditional p-value  cFDR princ.|cond  Gene      Consequence              Associated trait
1    25293356   rs4265380   C              T             0.46  AS                     1.51e-05           8.23e-09             6.19e-04          RUNX3     upstream gene variant    p: PSO, AS, JIA
2    191973034  rs10174238  G              A             0.24  AS                     5.56e-08           0.28                 8.75e-03          STAT4     intron variant           RA, JIA, p: SLE
3    119017382  rs7640033   T              A             0.17  AS                     1.86e-05           4.28e-03             1.70e-02          ARHGAP31  intron variant           g: celiac disease
5    40385790   rs12523160  T              A             0.35  AS                     1.08e-04           4.53e-04             1.58e-02          -         intergenic variant       CD, IBD
6    28325308   rs6922111   T              C             0.20  AS                     2.12e-03           1.93e-30             2.34e-03          ZKSCAN3   intron variant           SCZ, p: RA
8    106471210  rs4734866   C              A             0.37  AS                     1.91e-04           1.87e-04             1.69e-02          ZFPM2     intron variant           g: platelet count
9    117629689  rs7048073   A              G             0.28  AS                     2.47e-06           9.68e-03             1.93e-02          -         intergenic variant       -
1    25291010   rs6672420   A              T             0.50  PsA                    2.02e-05           6.75e-07             1.99e-03          RUNX3     missense variant         p: PSO, AS, JIA
2    191966452  rs7568275   G              C             0.23  PsA                    8.35e-08           7.75e-03             7.45e-03          STAT4     intron variant           RA, p: SLE, JIA
3    21477153   rs73045433  A              G             0.02  PsA                    3.92e-05           2.70e-04             1.39e-02          ZNF385D   intron variant           g: gut microbiome
6    28368106   rs78264909  C              T             0.09  PsA                    9.07e-05           1.09e-05             7.09e-03          ZSCAN12   upstream gene variant    RA, g: SCZ
6    162124704  rs73597197  C              T             0.02  PsA                    2.14e-05           4.79e-04             1.40e-02          PARK2     intron variant           g: BMI
11   74273590   rs75409031  A              C             0.01  PsA                    2.58e-05           4.20e-04             1.42e-02          POLD3     intron variant           g: cancer
16   11354091   rs413024    G              A             0.32  PsA                    1.88e-05           2.24e-06             2.40e-03          SOCS1     upstream gene variant    PBC, p: PSO
22   21999292   rs5749600   G              A             0.25  PsA                    1.58e-04           1.82e-05             1.24e-02          SDF2L1    downstream gene variant  g: CD, IBD, UC
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; p: proxy SNP to reported SNP associated with;
PSO: Psoriasis; AS: Ankylosing Spondylitis; IgAD: immunoglobulin A Deficiency; RA: Rheumatoid Arthritis; JIA: Juvenile Idiopathic Arthritis; SLE: Systemic Lupus Erythematosus;
g: gene associated with; CD: Crohn’s Disease; IBD: Inflammatory Bowel Disease; SCZ: Schizophrenia; PBC: Primary Biliary Cirrhosis/Cholangitis
JIA|AS cut-off = 2.24e-02; JIA|PsA cut-off = 1.45e-02; JIA|RA cut-off = 1.43e-02; JIA|SLE cut-off = 0.014
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Appendix Table 3 | Loci associated with JIA after applying cFDR analysis using as conditional phenotypes AS, PsA, RA and SLE
| Chr | Position | rsid | effect allele | other allele | MAF | conditional phenotype | principal p-value | conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 172715702 | rs78037977 | G | A | 0.12 | RA | 3.67e-06 | 4.44e-04 | 1.55e-03 | SLC25A38P1 | upstream gene variant | g: PSO |
| 1 | 113828107 | rs773560 | A | G | 0.28 | RA | 3.56e-03 | 9.18e-16 | 7.79e-03 | | intergenic variant | RA |
| 2 | 191969341 | rs8179673 | C | T | 0.23 | RA | 7.97e-08 | 9.45e-12 | 1.55e-05 | STAT4 | intron variant | RA, p: SLE, g: JIA |
| 2 | 191538562 | rs10931468 | A | C | 0.15 | RA | 2.35e-05 | 9.20e-07 | 1.81e-03 | NAB1 | intron variant | PBC |
| 2 | 65635688 | rs2661798 | T | A | 0.45 | RA | 1.10e-03 | 1.13e-09 | 6.55e-03 | SPRED2 | intron variant | RA, g: SLE, IBD |
| 2 | 204769054 | rs3116504 | G | A | 0.30 | RA | 2.34e-03 | 9.42e-09 | 1.15e-02 | | intergenic variant | RA, alopecia |
| 2 | 100764004 | rs13415465 | G | T | 0.38 | RA | 2.24e-03 | 1.31e-08 | 1.17e-02 | AFF3 | upstream gene variant | RA |
| 3 | 159094888 | rs1375406 | T | C | 0.01 | RA | 2.77e-06 | 1.12e-02 | 6.99e-03 | IQCJ-SCHIP1 | intron variant | |
| 4 | 123026426 | rs13144652 | A | G | 0.16 | RA | 6.66e-05 | 4.67e-05 | 4.45e-03 | | intergenic variant | p: IgAD, celiac |
| 5 | 55444683 | rs7731626 | A | G | 0.38 | RA | 7.04e-08 | 7.93e-23 | 9.79e-06 | ANKRD55 | intron variant | RA, g: JIA |
| 6 | 28415572 | rs13215804 | G | A | 0.31 | RA | 2.23e-06 | 2.83e-13 | 2.77e-04 | ZSCAN23 | upstream gene variant | RA, SCZ mixed |
| 6 | 33546837 | rs210142 | T | C | 0.27 | RA | 7.33e-06 | 1.77e-07 | 5.92e-04 | BAK1 | intron variant | platelet count |
| 6 | 135709760 | rs3827780 | G | A | 0.42 | RA | 4.01e-07 | 2.92e-02 | 3.39e-03 | AHI1 | intron variant | g: IgAD, MS |
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; g: gene associated with; PSO: Psoriasis; p: proxy SNP to
reported SNP associated with; RA: Rheumatoid Arthritis; SLE: Systemic Lupus Erythematosus; JIA: Juvenile Idiopathic Arthritis; CD: Crohn’s Disease; IBD: Inflammatory Bowel Disease;
SCZ: Schizophrenia; PBC: Primary Biliary Cirrhosis/Cholangitis; IgAD: immunoglobulin A Deficiency; mixed: mixed population (Europeans and Asians); MS: Multiple Sclerosis
JIA|AS cut-off = 2.24e-02; JIA|PsA cut-off = 1.45e-02; JIA|RA cut-off = 1.43e-02; JIA|SLE cut-off = 0.014
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Appendix Table 3 | Loci associated with JIA after applying cFDR analysis using as conditional phenotypes AS, PsA, RA and SLE
| Chr | Position | rsid | effect allele | other allele | MAF | conditional phenotype | principal p-value | conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 6 | 433061 | rs1567901 | A | C | 0.46 | RA | 3.60e-04 | 5.96e-06 | 5.89e-03 | | intergenic variant | RA mixed |
| 6 | 33689534 | rs2967 | A | G | 0.26 | RA | 3.26e-03 | 1.15e-15 | 7.27e-03 | IP6K3 | 3 prime UTR variant | RA, g: BMI, CD |
| 6 | 27546448 | rs116724532 | G | A | 0.02 | RA | 1.31e-04 | 5.26e-04 | 1.05e-02 | | intergenic variant | |
| 7 | 128576086 | rs3757387 | C | T | 0.45 | RA | 6.21e-04 | 2.01e-11 | 2.92e-03 | IRF5 | upstream gene variant | RA, PBC, p: SLE, UC, MS |
| 8 | 129540464 | rs16903065 | A | C | 0.10 | RA | 1.50e-03 | 5.51e-07 | 1.29e-02 | RP11-89M16.1 | intron & non-coding transcript variant | p: ovarian cancer |
| 9 | 117353464 | rs10124511 | G | A | 0.31 | RA | 3.70e-07 | 0.19 | 1.36e-02 | ATP6V1G1 | intron variant | |
| 10 | 6178614 | rs1983890 | T | C | 0.40 | RA | 1.45e-04 | 1.72e-06 | 3.96e-03 | | intergenic variant | |
| 10 | 6100725 | rs3134883 | A | G | 0.29 | RA | 9.94e-04 | 3.47e-09 | 6.77e-03 | IL2RA | intron variant | RA, p: JIA |
| 11 | 134180440 | rs113825217 | A | G | 0.03 | RA | 3.29e-07 | 4.72e-02 | 4.11e-03 | GLB1L3 | intron variant | g: BMI traits |
| 12 | 111865049 | rs7310615 | C | G | 0.47 | RA | 4.42e-04 | 3.69e-07 | 5.36e-03 | SH2B3 | intron variant | CAD, MI, g: JIA |
| 12 | 113030227 | rs233724 | A | G | 0.49 | RA | 1.58e-05 | 3.15e-03 | 8.59e-03 | RPH3A | intron variant | g: platelet count |
| 13 | 40300328 | rs9603603 | G | T | 0.36 | RA | 8.25e-04 | 3.40e-10 | 4.61e-03 | COG6 | intron variant | RA, p: PSO, JIA |
| 13 | 43009008 | rs1924415 | C | G | 0.22 | RA | 2.39e-04 | 8.55e-06 | 5.66e-03 | | intergenic variant | |
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; RA: Rheumatoid Arthritis; g: gene associated with;
BMI: Body Mass Index; CD: Crohn’s Disease; PBC: Primary Biliary Cirrhosis/Cholangitis; p: proxy SNP to reported SNP associated with; SLE: Systemic Lupus Erythematosus;
UC: Ulcerative Colitis; MS: Multiple Sclerosis; CAD: Coronary Artery Disease; MI: Myocardial Infarction; JIA: Juvenile Idiopathic Arthritis; PSO: Psoriasis
JIA|AS cut-off = 2.24e-02; JIA|PsA cut-off = 1.45e-02; JIA|RA cut-off = 1.43e-02; JIA|SLE cut-off = 0.014
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Appendix Table 3 | Loci associated with JIA after applying cFDR analysis using as conditional phenotypes AS, PsA, RA and SLE
| Chr | Position | rsid | effect allele | other allele | MAF | conditional phenotype | principal p-value | conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 14 | 74297983 | rs8006139 | G | A | 0.46 | RA | 5.48e-05 | 1.23e-03 | 1.08e-02 | RP5-1021I20.2 | upstream gene variant | |
| 15 | 38915313 | rs56279249 | C | G | 0.17 | RA | 1.14e-03 | 1.48e-08 | 8.20e-03 | | intergenic variant | RA |
| 19 | 10463118 | rs34536443 | C | G | 0.03 | RA | 2.79e-04 | 4.56e-16 | 1.36e-03 | TYK2 | missense variant | RA, JIA, PSO |
| 22 | 37558356 | rs2051582 | A | G | 0.21 | RA | 1.02e-04 | 2.96e-05 | 5.02e-03 | RP1-151B14.6 | | |
| 1 | 172715702 | rs78037977 | G | A | 0.12 | SLE | 3.67e-06 | 1.81e-03 | 2.01e-03 | SLC25A38P1 | upstream gene variant | |
| 1 | 25297184 | rs11249215 | G | A | 0.48 | SLE | 3.87e-06 | 3.31e-02 | 1.26e-02 | RP11-84D1.2 | non coding transcript variant | AS, p: PSO, IgAD, celiac disease |
| 2 | 191960109 | rs113429865 | T | A | 0.24 | SLE | 1.44e-07 | 1.03e-64 | 2.20e-07 | STAT4 | intron variant | g: JIA |
| 2 | 214085179 | rs2371887 | G | A | 0.44 | SLE | 8.05e-07 | 4.48e-05 | 3.65e-04 | | intergenic variant | |
| 2 | 191486081 | rs72917118 | T | C | 0.16 | SLE | 2.79e-06 | 6.57e-06 | 5.35e-04 | | intergenic variant | |
| 5 | 55444683 | rs7731626 | A | G | 0.38 | SLE | 7.04e-08 | 1.84e-04 | 1.77e-04 | ANKRD55 | intron variant | RA, g: JIA |
| 5 | 62250627 | rs139135162 | T | C | 0.01 | SLE | 1.33e-06 | 7.78e-02 | 1.38e-02 | | intergenic variant | |
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; RA: Rheumatoid Arthritis; g: gene associated with;
p: proxy SNP to reported SNP associated with; JIA: Juvenile Idiopathic Arthritis; PSO: Psoriasis;
JIA|AS cut-off = 2.24e-02; JIA|PsA cut-off = 1.45e-02; JIA|RA cut-off = 1.43e-02; JIA|SLE cut-off = 0.014
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Appendix Table 3 | Loci associated with JIA after applying cFDR analysis using as conditional phenotypes AS, PsA, RA and SLE
| Chr | Position | rsid | effect allele | other allele | MAF | conditional phenotype | principal p-value | conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 6 | 28411244 | rs13190937 | A | G | 0.31 | SLE | 2.08e-06 | 2.30e-09 | 2.76e-04 | ZSCAN23 | 5 prime UTR variant | RA, SCZ |
| 6 | 135841056 | rs9647635 | C | A | 0.36 | SLE | 3.15e-07 | 1.39e-03 | 6.63e-04 | LINC00271 | intron & non coding transcript variant | p: MS, g: SLE |
| 6 | 33479774 | rs6907702 | T | C | 0.12 | SLE | 9.58e-06 | 5.58e-93 | 7.54e-03 | | intergenic variant | |
| 6 | 33627077 | rs2296342 | G | A | 0.37 | SLE | 4.88e-06 | 1.53e-02 | 8.41e-03 | ITPR3 | intron variant | g: CD |
| 6 | 76528438 | rs13194998 | T | C | 0.05 | SLE | 2.24e-06 | 4.33e-02 | 1.11e-02 | MYO6 | intron variant | g: high BP |
| 7 | 128570026 | rs12706860 | G | C | 0.34 | SLE | 2.37e-05 | 2.26e-18 | 1.03e-03 | | intergenic variant | SLE |
| 10 | 6094697 | rs61839660 | T | C | 0.07 | SLE | 3.67e-06 | 3.23e-03 | 2.45e-03 | IL2RA | intron variant | T1D, p: JIA, g: CD, JIA |
| 11 | 134180440 | rs113825217 | A | G | 0.03 | SLE | 3.29e-07 | 3.56e-02 | 6.49e-03 | GLB1L3 | intron variant | g: BMI traits |
| 12 | 112553032 | rs10850001 | A | T | 0.45 | SLE | 2.39e-04 | 1.35e-06 | 1.11e-02 | | intergenic variant | p: CAD, MI |
| 16 | 58415897 | rs9926887 | C | T | 0.29 | SLE | 3.63e-06 | 2.39e-02 | 9.50e-03 | RNU6-269P | upstream gene variant | |
| 19 | 10463118 | rs34536443 | C | G | 0.03 | SLE | 2.79e-04 | 2.03e-11 | 1.07e-02 | TYK2 | missense variant | RA, JIA, PSO |
| 21 | 36665202 | rs9305565 | G | A | 0.28 | SLE | 1.26e-05 | 7.37e-04 | 5.01e-03 | RUNX1 | intron variant | p: JIA |
| 22 | 21999292 | rs5749600 | G | A | 0.25 | SLE | 1.58e-04 | 1.07e-06 | 7.56e-03 | SDF2L1 | downstream gene variant | g: CD, IBD |
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; RA: Rheumatoid Arthritis; g: gene associated with;
p: proxy SNP to reported SNP associated with; JIA: Juvenile Idiopathic Arthritis; PSO: Psoriasis;
JIA|AS cut-off = 2.24e-02; JIA|PsA cut-off = 1.45e-02; JIA|RA cut-off = 1.43e-02; JIA|SLE cut-off = 0.014
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
cFDR analysis using SLE as the principal disease
Enrichment plots
The Q-Q plots in Appendix Figure 5 show a robust enrichment pattern for SLE
conditioned on RA and JIA, with distinctly separated curves, thus providing evidence
of pleiotropy.
Appendix Figure 5 | Q-Q plots for SLE conditional on RA (left) and JIA (right). Y axes show log10(P’) for each principal disease and X axes show the log quantile of p-values in sets of SNPs. The degree of leftward shift of a black point from the diagonal is proportional to the unconditional FDR of that p-value for the principal phenotype, and the degree of leftward shift of a coloured point is proportional to the conditional FDR of the p-value for the principal phenotype and the p-cutoff corresponding to the colour for the conditional phenotype. Each colour corresponds to the Q-Q plot for p_SLE amongst a subset of SNPs with p_RA or p_JIA less than the indicated cutoff. A leftward shift with decreasing cut-off indicates that SNPs which are associated with the conditional phenotype (RA or JIA) are more likely to be associated with the principal phenotype (SLE), presumably due to pleiotropic effects on phenotypes.
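The stratification described in this caption can be sketched as follows. This is a minimal illustration on synthetic p-values, not the code used to draw the figure: the function names, the cut-off grid and the simulated data are all assumptions. For each cut-off on the conditional p-value, it keeps the SNPs at or below that cut-off and pairs each sorted principal p-value with its empirical quantile within the subset.

```python
import math
import random


def stratified_qq(p_principal, p_conditional, cutoffs=(1.0, 0.1, 0.01, 0.001)):
    """Return, per cut-off, the points of a Q-Q curve for the principal trait.

    Each point is (-log10 empirical quantile, -log10 observed p-value), computed
    only among SNPs whose conditional p-value is at or below the cut-off.
    """
    curves = {}
    for cutoff in cutoffs:
        subset = sorted(p for p, q in zip(p_principal, p_conditional) if q <= cutoff)
        n = len(subset)
        curves[cutoff] = [
            (-math.log10(rank / n), -math.log10(p))
            for rank, p in enumerate(subset, start=1)
        ]
    return curves


def fraction_below(p_principal, p_conditional, cutoff, level=1e-3):
    """Share of SNPs with principal p <= level among SNPs with conditional p <= cutoff."""
    subset = [p for p, q in zip(p_principal, p_conditional) if q <= cutoff]
    if not subset:
        return float("nan")
    return sum(p <= level for p in subset) / len(subset)


# Synthetic p-values only: 100 SNPs associated with both traits, 2000 null SNPs.
random.seed(1)
n_shared, n_null = 100, 2000
p_princ = [1e-3 * (1.0 - random.random()) for _ in range(n_shared)]
p_cond = [1e-4 * (1.0 - random.random()) for _ in range(n_shared)]
p_princ += [1.0 - random.random() for _ in range(n_null)]
p_cond += [1.0 - random.random() for _ in range(n_null)]

curves = stratified_qq(p_princ, p_cond)
fractions = {c: fraction_below(p_princ, p_cond, c) for c in curves}
for cutoff, share in fractions.items():
    print(f"p_cond <= {cutoff}: {len(curves[cutoff])} SNPs, "
          f"share with p_princ <= 1e-3 = {share:.3f}")
```

When the two traits share associated SNPs, the share of small principal p-values grows as the conditional cut-off becomes stricter, which is what the separation of the coloured curves reflects.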
SLE loci identified with cFDR
The enrichment observed in Appendix Figure 5 led to the identification of additional
SNPs, shown in orange in Appendix Figure 6, that were significantly associated with
SLE after leveraging power from the RA and JIA GWAS. In total, 849 SNPs were identified
when RA was used as the conditional trait (significance threshold cFDR<1.25e-04) and
315 when JIA was the conditional trait (cFDR<1.19e-04).
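Because the significance threshold differs for each conditional trait, the selection step can be illustrated with a small sketch. The cut-offs and the first four cFDR values are copied from the text and Appendix Table 4; the last row is hypothetical and is included only to show a SNP that would pass one threshold but not the other. The variable and function names are illustrative assumptions.

```python
# Pair-specific cFDR cut-offs for SLE as the principal disease (values from the text).
CUTOFFS = {"RA": 1.25e-04, "JIA": 1.19e-04}

# (rsid, conditional phenotype, cFDR) taken from Appendix Table 4, plus one
# hypothetical row that is NOT from the thesis, added only to show a rejection.
rows = [
    ("rs6659932", "RA", 1.61e-05),
    ("rs4954125", "RA", 6.59e-05),
    ("rs3845622", "JIA", 8.81e-05),
    ("rs453301", "JIA", 1.01e-04),
    ("hypothetical_snp", "JIA", 1.22e-04),
]


def is_significant(conditional_phenotype, cfdr):
    """True when the cFDR is below the cut-off for that conditional phenotype."""
    return cfdr < CUTOFFS[conditional_phenotype]


significant = [rsid for rsid, trait, cfdr in rows if is_significant(trait, cfdr)]
print(significant)
```

The hypothetical value of 1.22e-04 falls between the two cut-offs, so it would be retained when conditioning on RA but rejected when conditioning on JIA.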
Appendix Figure 6 | cFDR results for SLE conditioned on RA (left) and JIA (right). The black vertical line signifies the GWAS significance threshold 5e-08. The red dots signify the genome-wide significant SNPs for the principal disease (herein, SLE), whereas the orange dots (on the left side of the vertical line) signify the SNPs identified as significant for SLE after conditioning on the conditional disease (RA and JIA). Black dots show a random sample of the observed p-value pairs. Note the leftward shift of colours, which corresponds to an increased p-value threshold for association with SLE for SNPs with low p-values for the conditional diseases.
Thirty novel loci were identified as being associated with SLE, six of which lie in
intergenic regions, as seen in Appendix Table 4. Among the novel loci, three
(rs4954125, rs1444766 and rs183779130) mapped to genes associated with
schizophrenia and two (rs7844895 and rs453301) have been reported to be associated
with neuroticism. Observational studies have shown an increased co-occurrence of
schizophrenia and SLE (Tiosano, Farhi et al. 2017). Moreover, two of the newly
identified SNPs (rs6659932 and rs7764323) have been reported in previous GWAS to
be associated with IBD and PBC, and with RA, respectively. The rest of the novel SNPs
are in LD with SNPs which have been reported to contribute to the susceptibility to
various autoimmune diseases and to monocyte count levels.
Appendix Table 4 | Loci associated with SLE after applying cFDR analysis using as a conditional phenotype RA and JIA
| Chr | Position | rsid | effect allele | other allele | MAF | conditional phenotype | principal p-value | conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 67802371 | rs6659932 | A | C | 0.16 | RA | 1.51e-06 | 5.83e-06 | 1.61e-05 | IL12RB2 | intron variant | IBD, PBC |
| 1 | 92665899 | rs12753920 | G | A | 0.34 | RA | 9.35e-06 | 1.97e-04 | 1.14e-04 | RP4-775D17.1 | upstream gene variant | |
| 2 | 65576306 | rs113947673 | A | C | 0.20 | RA | 3.37e-07 | 1.16e-04 | 4.57e-06 | SPRED2 | intron variant | RA mixed, g: SLE |
| 2 | 135046984 | rs4954125 | T | G | 0.33 | RA | 8.80e-08 | 0.16 | 6.59e-05 | MGAT5 | intron variant | g: SCZ |
| 2 | 192008203 | rs35672585 | A | G | 0.16 | RA | 1.54e-06 | 7.52e-03 | 9.78e-05 | STAT4 | intron variant | g: SLE |
| 3 | 159747815 | rs2647928 | G | A | 0.36 | RA | 2.01e-07 | 7.97e-04 | 4.68e-06 | LINC01100 | downstream gene variant | p: PBC |
| 3 | 123925271 | rs1444766 | G | A | 0.26 | RA | 2.00e-07 | 2.05e-02 | 3.09e-05 | KALRN | intron variant | g: SCZ |
| 3 | 159533769 | rs1965998 | T | G | 0.30 | RA | 1.60e-07 | 7.46e-02 | 6.71e-05 | IQCJ-SCHIP1 | intron variant | |
| 3 | 58429135 | rs62259783 | A | G | 0.38 | RA | 8.28e-08 | 0.34 | 9.59e-05 | | intergenic variant | |
| 6 | 33901603 | rs142476835 | C | T | 0.02 | RA | 6.11e-07 | 9.24e-05 | 7.74e-06 | | intergenic variant | |
| 6 | 36345840 | rs7764323 | A | G | 0.12 | RA | 2.71e-06 | 3.55e-07 | 2.38e-05 | ETV7 | intron variant | RA |
| 6 | 25244395 | rs183779130 | A | G | 0.03 | RA | 6.00e-08 | 0.40 | 8.10e-05 | KATNBL1P5 | upstream gene variant | SCZ |
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; IBD: Inflammatory Bowel Disease; SCZ: Schizophrenia;
PBC: Primary Biliary Cirrhosis/Cholangitis; RA: Rheumatoid Arthritis; g: gene associated with; SLE: Systemic Lupus Erythematosus; p: proxy SNP to reported SNP associated with;
SLE|RA cut-off = 1.25e-04; SLE|JIA cut-off = 1.19e-04
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Appendix Table 4 | Loci associated with SLE after applying cFDR analysis using as a conditional phenotype RA and JIA
| Chr | Position | rsid | effect allele | other allele | MAF | conditional phenotype | principal p-value | conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 6 | 33819571 | rs72896160 | T | C | 0.10 | RA | 6.53e-06 | 2.89e-04 | 8.89e-05 | | intergenic variant | |
| 6 | 137900027 | rs79689527 | A | G | 0.02 | RA | 1.42e-05 | 2.39e-06 | 1.12e-04 | | intergenic variant | |
| 7 | 128782112 | rs74549660 | G | A | 0.03 | RA | 9.46e-08 | 2.23e-03 | 3.49e-06 | TSPAN33 | upstream gene variant | g: UC |
| 7 | 73605165 | rs150727739 | T | C | 0.01 | RA | 1.07e-06 | 4.15e-05 | 1.14e-05 | EIF4H | intron variant | |
| 7 | 73811948 | rs12537907 | G | T | 0.01 | RA | 1.21e-06 | 2.07e-04 | 1.82e-05 | CLIP2 | intron variant | |
| 7 | 73866009 | rs2097926 | T | A | 0.01 | RA | 1.45e-06 | 2.02e-04 | 2.12e-05 | GTF2IRD1 | upstream gene variant | g: SLE |
| 7 | 73434106 | rs115021831 | A | G | 0.01 | RA | 2.90e-06 | 3.39e-05 | 2.79e-05 | | intergenic variant | |
| 7 | 42121585 | rs866417 | T | C | 0.50 | RA | 5.75e-08 | 0.14 | 4.01e-05 | GLI3 | intron variant | |
| 7 | 74096144 | rs192479202 | A | G | 0.01 | RA | 5.73e-06 | 5.58e-05 | 5.62e-05 | GTF2I | intron variant | g: SLE |
| 8 | 10955225 | rs7844895 | G | C | 0.49 | RA | 2.84e-07 | 0.06 | 9.34e-05 | XKR6 | intron variant | Neuroticism |
| 8 | 11448328 | rs1478891 | C | G | 0.34 | RA | 5.72e-07 | 0.03 | 1.10e-04 | | intergenic variant | Neuroticism |
| 10 | 8472876 | rs10905367 | C | G | 0.38 | RA | 1.18e-07 | 0.11 | 6.66e-05 | RP11-543F8.2 | intron & non coding transcript variant | g: monocyte count |
| 11 | 128499000 | rs7941765 | C | T | 0.50 | RA | 1.14e-06 | 6.21e-07 | 1.05e-05 | RP11-744N12.3 | downstream gene variant | RA mixed |
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; g: gene associated with; UC: Ulcerative Colitis;
SLE: Systemic Lupus Erythematosus; RA: Rheumatoid Arthritis; mixed: mixed population (Europeans and Asians included)
SLE|RA cut-off = 1.25e-04; SLE|JIA cut-off = 1.19e-04
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Appendix Table 4 | Loci associated with SLE after applying cFDR analysis using as a conditional phenotype RA and JIA
| Chr | Position | rsid | effect allele | other allele | MAF | conditional phenotype | principal p-value | conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 11 | 128324869 | rs12575600 | G | C | 0.10 | RA | 3.55e-06 | 1.43e-04 | 4.32e-05 | ETS1 | downstream gene variant | RA (mixed), p: SLE (Asian) |
| 16 | 86021505 | rs35703946 | A | G | 0.14 | RA | 5.53e-08 | 0.03 | 1.06e-05 | RP11-542M13.2 | downstream gene variant | g: monocyte count |
| 16 | 58329828 | rs10852562 | T | C | 0.22 | RA | 1.99e-07 | 0.09 | 9.42e-05 | PRSS54 | upstream gene variant | |
| 17 | 38068043 | rs869402 | C | T | 0.48 | RA | 5.29e-07 | 8.20e-07 | 5.11e-06 | GSDMB | intron variant | RA (mixed), PBC, g: SLE |
| 17 | 7234112 | rs3809822 | G | C | 0.21 | RA | 1.29e-07 | 0.005 | 7.87e-06 | NEURL4 | upstream gene variant | g: monocytes |
| 1 | 159171603 | rs3845622 | A | C | 0.12 | JIA | 1.12e-07 | 0.07 | 8.81e-05 | CADM3 | 3 prime UTR variant | g: BD |
| 2 | 135072001 | rs7575908 | G | A | 0.33 | JIA | 1.31e-07 | 0.05 | 8.78e-05 | MGAT5 | intron variant | g: SCZ |
| 3 | 159747815 | rs2647928 | G | A | 0.36 | JIA | 2.01e-07 | 0.002 | 2.07e-05 | LINC01100 | downstream gene variant | p: PBC |
| 3 | 159533769 | rs1965998 | T | G | 0.30 | JIA | 1.60e-07 | 0.03 | 7.72e-05 | IQCJ-SCHIP1 | intron variant | |
| 8 | 9030387 | rs453301 | G | T | 0.47 | JIA | 1.16e-07 | 0.09 | 1.01e-04 | RP11-10A14.4 | downstream gene variant | Neuroticism |
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; RA: Rheumatoid Arthritis; SCZ: Schizophrenia;
SLE: Systemic Lupus Erythematosus; mixed: mixed population (Europeans and Asians included); g: gene associated with; PBC: Primary Biliary Cirrhosis/Cholangitis; BD: Bipolar Disease;
SLE|RA cut-off = 1.25e-04; SLE|JIA cut-off = 1.19e-04
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Appendix Table 4 | Loci associated with SLE after applying cFDR analysis using as a conditional phenotype RA and JIA
| Chr | Position | rsid | effect allele | other allele | MAF | conditional phenotype | principal p-value | conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | 8472876 | rs10905367 | C | G | 0.38 | JIA | 1.18e-07 | 0.07 | 9.34e-05 | RP11-543F8.2 | intron & non coding transcript variant | g: monocyte count |
| 16 | 86021505 | rs35703946 | A | G | 0.14 | JIA | 5.53e-08 | 0.36 | 9.54e-05 | RP11-542M13.2 | downstream gene variant | g: monocyte count |
| 17 | 37993352 | rs12938617 | A | T | 0.03 | JIA | 5.12e-08 | 0.68 | 1.04e-04 | IKZF3 | intron variant | p: SLE |
| 18 | 55813873 | rs117647127 | T | A | 0.05 | JIA | 2.32e-07 | 1.90e-03 | 2.50e-05 | BRSK1 | intron variant | g: menopause |
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; g: gene associated with;
p: proxy SNP to the reported SNP associated with; SLE: Systemic Lupus Erythematosus
SLE|RA cut-off = 1.25e-04; SLE|JIA cut-off = 1.19e-04
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
cFDR analysis using RA as the principal disease
Enrichment plots
Appendix Figure 7 presents the Q-Q enrichment analysis for the principal disease RA
conditioned on SLE (top), PsA (bottom left) and JIA (bottom right). The least robust
enrichment pattern was in the pair RA-PsA, where the slope of the Q-Q plots increased
only slightly when SNPs of increasingly strong association with PsA were plotted. The
other two plots present evidence of pleiotropy, especially in the first four
significance levels.
RA loci identified with cFDR
Appendix Figure 8 presents the newly identified SNPs associated with RA upon
conditioning on the correlated traits SLE, PsA and JIA. In total, 1,023 SNPs were
associated with RA (cFDR<1.25e-04) when conditioning on SLE and 247 when leveraging
power from the PsA summary data, with significance threshold cFDR<2.15e-05. Finally,
607 were identified with cFDR<1.22e-04 when JIA was used as the conditional trait.
The association of RA with most of the loci identified in previous studies was
replicated; however, 16 novel loci were identified, of which eight were intergenic and
the rest have been reported in other autoimmune diseases as shown by PhenoScanner.
For example, rs6679356 (IL12RB2) found here is in LD with a variant associated with
IBD and MS. IL12RB2 encodes IL-12Rβ2, a subunit of the IL-12 receptor that promotes
the proliferation of T-cells; lack of IL-12Rβ2 signalling promotes autoimmunity in
animal models (Airoldi, Di Carlo et al. 2005). No association of rs1234313, which maps
to TNFSF4, with RA predisposition has been reported previously. TNFSF4 encodes a
cytokine that is expressed on CD40-stimulated B-cells and antigen-presenting cells and
has been associated with SLE and MS (Baum, Gayle et al. 1994). An interesting finding
is the association with SNP rs4958880 in the TNIP1 region; TNIP1 variants are
associated with PSO, PsA, SLE and myasthenia gravis, and TNIP1 inhibits NF-κB
transcriptional activity. Novel loci were also found in the genes CCR1 (rs3176953) and
ZFP36L1 (rs10443). These genes are associated with susceptibility to a number of
autoimmune diseases including IBD, CD and UC. The list of novel associations for RA
can be found in Appendix Table 5.
Appendix Figure 7 | Q-Q plots for RA conditional on SLE (top), PsA (bottom left) and JIA (bottom right). Y axes show log10(P’) for each principal disease and X axes show the log quantile of p-values in sets of SNPs. The degree of leftward shift of a black point from the diagonal is proportional to the unconditional FDR of that p-value for the principal phenotype, and the degree of leftward shift of a coloured point is proportional to the conditional FDR of the p-value for the principal phenotype and the p-cutoff corresponding to the colour for the conditional phenotype. Each colour corresponds to the Q-Q plot for p_RA amongst a subset of SNPs with p_SLE, p_PsA or p_JIA less than the indicated cutoff. A leftward shift with decreasing cut-off indicates that SNPs which are associated with the conditional phenotype (SLE, PsA, JIA) are more likely to be associated with the principal phenotype (RA), presumably due to pleiotropic effects on phenotypes.
Appendix Figure 8 | cFDR results for RA conditioned on SLE (top), PsA (bottom left) and JIA (bottom right). The black vertical line signifies the GWAS significance threshold 5e-08. The red dots signify the genome-wide significant SNPs for the principal disease (herein, RA), whereas the orange dots (on the left side of the vertical line) signify the SNPs identified as significant for RA after conditioning on the conditional disease (SLE, PsA, JIA). Black dots show a random sample of the observed p-value pairs. Note the leftward shift of colours, which corresponds to an increased p-value threshold for association with RA for SNPs with low p-values for the conditional disease.
Appendix Table 5 | Loci associated with RA after applying cFDR analysis using as a conditional phenotype SLE, JIA and PsA
| Chr | Position | rsid | effect allele | other allele | MAF | conditional phenotype | principal p-value | conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 161478859 | rs4657041 | T | C | 0.49 | SLE | 9.84e-08 | 7.64e-11 | 1.32e-06 | FCGR2A | intron variant | IBD, UC, g: RA |
| 1 | 67820194 | rs6679356 | C | T | 0.172 | SLE | 4.51e-06 | 1.12e-05 | 4.60e-05 | IL12RB2 | intron variant | p: IBD, PBC |
| 1 | 173166247 | rs1234313 | A | G | 0.31 | SLE | 9.34e-06 | 1.03e-06 | 6.97e-05 | TNFSF4 | intron variant | g: SLE, CD, MS |
| 2 | 191516020 | rs6733720 | G | C | 0.20 | SLE | 4.32e-07 | 3.12e-08 | 3.52e-06 | NAB1 | intron variant | g: PBC |
| 2 | 202154397 | rs6715284 | G | C | 0.10 | SLE | 2.93e-07 | 2.95e-02 | 7.00e-05 | ALS2CR12 | intron variant | RA (mixed) |
| 3 | 58318477 | rs185407974 | A | G | 0.05 | SLE | 5.68e-08 | 1.45e-04 | 1.81e-06 | PXK | upstream gene variant | p: RA |
| 5 | 150438477 | rs4958880 | A | C | 0.19 | SLE | 8.15e-07 | 3.76e-15 | 5.60e-06 | TNIP1 | intron variant | g: PsA, PSO |
| 5 | 133423616 | rs244687 | A | G | 0.16 | SLE | 6.35e-06 | 1.31e-08 | 3.86e-05 | | intergenic variant | |
| 6 | 27616489 | rs4711160 | C | T | 0.14 | SLE | 1.24e-07 | 9.32e-06 | 2.08e-06 | RP1-15D7.1 | downstream gene variant | |
| 6 | 426268 | rs6930468 | A | G | 0.35 | SLE | 1.61e-07 | 0.12 | 9.83e-05 | | intergenic variant | RA (mixed) |
| 8 | 11351912 | rs922483 | T | C | 0.28 | SLE | 1.76e-07 | 7.23e-15 | 1.45e-06 | BLK | 5 prime UTR variant | RA (mixed) |
| 8 | 129540464 | rs16903065 | A | C | 0.10 | SLE | 5.51e-07 | 2.30e-03 | 3.84e-05 | RP11-89M16.1 | intron & non coding transcript variant | p: ovarian cancer |
| 10 | 63910344 | rs148672683 | C | T | 0.01 | SLE | 5.98e-08 | 2.73e-04 | 2.32e-06 | | intergenic variant | |
| 12 | 111833788 | rs10774624 | G | A | 0.47 | SLE | 2.36e-07 | 1.49e-07 | 2.30e-06 | RP3-473L9.4 | intron & non coding transcript variant | RA (mixed), p: T1D |
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; g: gene associated with;
p: proxy SNP to the reported SNP associated with; SLE: Systemic Lupus Erythematosus
RA|SLE cut-off = 1.25e-04; RA|JIA cut-off = 1.22e-04; RA|PsA cut-off = 2.15e-05
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Appendix Table 5 | Loci associated with RA after applying cFDR analysis using as a conditional phenotype SLE, JIA and PsA
| Chr | Position | rsid | effect allele | other allele | MAF | conditional phenotype | principal p-value | conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 12 | 56384804 | rs705699 | A | G | 0.39 | SLE | 9.84e-08 | 4.86e-02 | 3.61e-05 | RAB5B | intron variant | RA (mixed) |
| 14 | 68760141 | rs1950897 | C | T | 0.35 | SLE | 2.51e-07 | 9.05e-03 | 3.14e-05 | RAD51B | intron variant | RA |
| 16 | 86009760 | rs12232384 | A | C | 0.22 | SLE | 5.85e-07 | 2.73e-02 | 1.23e-04 | | intergenic variant | RA (mixed), MS |
| 22 | 21979096 | rs11089637 | C | T | 0.17 | SLE | 5.55e-07 | 2.73e-12 | 5.19e-06 | YDJC | downstream gene variant | CD, IBD, RA (mixed) |
| 1 | 114588810 | rs139977996 | C | T | 0.01 | PsA | 1.59e-07 | 3.12e-02 | 1.07e-04 | | intergenic variant | |
| 10 | 6177894 | rs71479758 | G | A | 0.25 | PsA | 1.58e-07 | 2.25e-02 | 9.07e-05 | | intergenic variant | |
| 12 | 56384804 | rs705699 | A | G | 0.39 | PsA | 9.84e-08 | 5.92e-02 | 9.24e-05 | RAB5B | intron variant | RA (mixed) |
| 22 | 21979096 | rs11089637 | C | T | 0.17 | PsA | 5.55e-07 | 7.56e-05 | 2.06e-05 | YDJC | downstream gene variant | CD, IBD, RA (mixed) |
| 1 | 114588810 | rs139977996 | C | T | 0.01 | JIA | 1.59e-07 | 1.76e-02 | 3.08e-05 | | intergenic variant | |
| 3 | 58318477 | rs185407974 | A | G | 0.05 | JIA | 5.68e-08 | 8.55e-02 | 3.03e-05 | PXK | upstream gene variant | p: RA |
| 3 | 46243718 | rs3176953 | A | T | 0.14 | JIA | 4.69e-07 | 8.68e-03 | 5.70e-05 | CCR1 | 3 prime UTR variant | g: UC, IBD |
| 6 | 33820658 | rs78861422 | T | C | 0.03 | JIA | 5.92e-08 | 0.25 | 6.34e-05 | | intergenic variant | |
| 6 | 27714052 | rs142306808 | G | C | 0.04 | JIA | 4.53e-07 | 0.01 | 7.23e-05 | | intergenic variant | |
| 6 | 426268 | rs6930468 | A | G | 0.35 | JIA | 1.61e-07 | 8.88e-02 | 8.10e-05 | | intergenic variant | RA |
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; g: gene associated with;
p: proxy SNP to the reported SNP associated with; SLE: Systemic Lupus Erythematosus
RA|SLE cut-off = 1.25e-04; RA|JIA cut-off = 1.22e-04; RA|PsA cut-off = 2.15e-05
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: 1000G; r-squared: 0.8
Appendix Table 5 | Loci associated with RA after applying cFDR analysis using as a conditional phenotype SLE, JIA and PsA
| Chr | Position | rsid | effect allele | other allele | MAF | conditional phenotype | principal p-value | conditional p-value | cFDR (princ. given cond.) | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 8 | 11341880 | rs2736337 | C | T | 0.24 | JIA | 1.60e-07 | 0.02 | 3.11e-05 | | intergenic variant | RA, p: SLE |
| 8 | 129540464 | rs16903065 | A | C | 0.10 | JIA | 5.51e-07 | 0.002 | 3.16e-05 | RP11-89M16.1 | intron & non coding transcript variant | p: RA, CD |
| 10 | 6178941 | rs11598494 | C | T | 0.37 | JIA | 1.16e-06 | 3.89e-04 | 3.21e-05 | | intergenic variant | |
| 10 | 9049253 | rs12413578 | T | C | 0.11 | JIA | 3.27e-07 | 0.01 | 5.37e-05 | | intergenic variant | RA (mixed) |
| 12 | 111833788 | rs10774624 | G | A | 0.47 | JIA | 2.36e-07 | 6.80e-04 | 9.31e-06 | RP3-473L9.4 | intron & non coding transcript variant | RA (mixed), p: RA |
| 12 | 58108052 | rs1633360 | C | T | 0.40 | JIA | 9.11e-08 | 0.11 | 5.34e-05 | OS9 | intron variant | RA |
| 14 | 69260290 | rs10443 | T | C | 0.25 | JIA | 6.97e-07 | 0.009 | 8.42e-05 | ZFP36L1 | upstream gene variant | g: T1D, MS, CD |
Chr: Chromosome; MAF: Minor Allele Frequency; cFDR: conditional False Discovery Rate; princ.: principal; cond.: conditional; g: gene associated with;
p: proxy SNP to the reported SNP associated with; SLE: Systemic Lupus Erythematosus
RA|SLE cut-off = 1.25e-04; RA|JIA cut-off = 1.22e-04; RA|PsA cut-off = 2.15e-05
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: 1000G; r-squared: 0.8
JIA loci identified with MTAG
The noteworthy power gain observed for JIA can be seen in the number of novel
associations detected by MTAG (Appendix Table 6 and Appendix Table 7). Thirty-nine
signal peaks were identified, including 14 intergenic SNPs; 15 provided evidence of
contribution to the predisposition to JIA, including: RP4-590F24.1 (rs12563513, OR
1.16, p = 3.26e-18), AFF3 (rs12712065, OR 1.06, p = 1.51e-08), AC006460.2 (rs744600,
OR 1.08, p = 1.07e-16), C6orf106 (rs13207858, OR 1.12, p = 1.53e-08), AHI1
(rs2614266, OR 1.05, p = 3.64e-08), ITPR3 (rs749338, OR 1.07, p = 4.79e-13), TNPO3
(rs17338998, OR 1.21, p = 1.85e-32), BLK (rs4840568, OR 1.09, p = 9.55e-15) and
RASGRP1 (rs8043085, OR 1.08, p = 1.74e-11); the remainder are listed in Appendix
Table 7. The remaining nine associations were found to be protective against JIA:
IL12RB2 (rs6693065, OR 0.94, p = 6.28e-09), LINC01100 (rs485499, OR 0.94, p = 6.69e-10),
C5orf30 (rs411648, OR 0.94, p = 1.74e-10), RP11-89M16.1 (rs16903081, OR 0.92, p =
1.45e-08), ICAM3 (rs2278442, OR 0.92, p = 2.69e-16) and RP11-279F6.3 (rs12899564, OR
0.88, p = 1.17e-12); the other three protective SNPs can be found in
Appendix Table 7.
Finally, six gene associations were replicated in this study: ANKRD55, STAT4,
IL2RA, SH2B3, RUNX1 and COG6 (Appendix Table 6). In addition, a novel independent
locus in ANKRD55 was also found to be protective against JIA (rs13186299, OR 0.90, p
= 2.81e-13). The Manhattan plot in Appendix Figure 9 presents the SNPs for the
JIA GWAS and MTAG analyses on a genomic scale.
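The distinction drawn above between risk-increasing and protective associations follows from the odds ratio of the effect allele and its confidence interval. A minimal sketch, using four rows copied from Appendix Table 6, is given below; the function name and the "inconclusive" label for intervals that include 1 are illustrative assumptions rather than part of the MTAG output.

```python
import math

# (rsid, gene, MTAG OR, lower 95% CI, upper 95% CI) copied from Appendix Table 6.
hits = [
    ("rs12563513", "RP4-590F24.1", 1.16, 1.12, 1.19),
    ("rs17338998", "TNPO3", 1.21, 1.17, 1.25),
    ("rs6693065", "IL12RB2", 0.94, 0.92, 0.96),
    ("rs13186299", "ANKRD55", 0.90, 0.88, 0.93),
]


def classify(ci_low, ci_high):
    """Label the effect allele by the side of 1 on which the whole 95% CI lies."""
    if ci_low > 1.0:
        return "risk"
    if ci_high < 1.0:
        return "protective"
    return "inconclusive"


labels = []
for rsid, gene, odds_ratio, ci_low, ci_high in hits:
    label = classify(ci_low, ci_high)
    labels.append(label)
    # log(OR) is positive for risk-increasing and negative for protective alleles.
    print(f"{rsid} ({gene}): log(OR) = {math.log(odds_ratio):+.3f}, {label}")
```

The classification refers to the effect allele listed in the table; swapping the effect and other alleles would invert the odds ratio and therefore the label.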
Appendix Table 6 | MTAG results for JIA (results presented for original JIA p-value<0.05)
| Chr | Position | rsid | effect allele | other allele | MAF | JIA p-value | MTAG p-value | MTAG OR | MTAG 95% CI | Gene | Consequence | Associated Trait |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 25304552 | rs10794667 | C | T | 0.46 | 9.00e-06 | 1.76e-08 | 0.95 | 0.93-0.97 | | intergenic variant | |
| 1 | 114547798 | rs12563513 | A | G | 0.09 | 2.19e-02 | 3.26e-18 | 1.16 | 1.12-1.19 | RP4-590F24.1 | upstream gene variant | RA |
| 1 | 67800018 | rs6693065 | G | A | 0.24 | 2.40e-02 | 6.28e-09 | 0.94 | 0.92-0.96 | IL12RB2 | intron variant | |
| 2 | 100761105 | rs12712065 | C | G | 0.49 | 1.81e-02 | 1.51e-08 | 1.06 | 1.04-1.08 | AFF3 | upstream gene variant | RA |
| 2 | 191564757 | rs744600 | G | T | 0.39 | 9.13e-03 | 1.07e-16 | 1.08 | 1.06-1.11 | AC006460.2 | intron & non coding transcript variant | Height |
| 2 | 191970120 | rs7582694 | C | G | 0.23 | 7.36e-08 | 1.05e-53 | 1.19 | 1.16-1.22 | STAT4 | intron variant | JIA, RA, p: SLE |
| 3 | 159745863 | rs485499 | C | T | 0.35 | 2.27e-03 | 6.69e-10 | 0.94 | 0.92-0.96 | LINC01100 | downstream gene variant | PBC |
| 4 | 123402195 | rs6534349 | G | A | 0.09 | 2.32e-03 | 7.19e-09 | 1.10 | 1.07-1.14 | | intergenic variant | |
| 5 | 55438851 | rs10065637 | T | C | 0.22 | 2.89e-05 | 4.72e-15 | 0.91 | 0.89-0.93 | ANKRD55 | intron variant | JIA, CD, RA |
| 5 | 55455645 | rs13186299 | C | G | 0.14 | 2.05e-03 | 2.81e-13 | 0.90 | 0.88-0.93 | ANKRD55 | intron variant | RA |
| 5 | 133425735 | rs17167255 | A | G | 0.07 | 2.63e-02 | 2.24e-09 | 1.12 | 1.08-1.16 | | intergenic variant | |
| 5 | 102602902 | rs411648 | T | A | 0.30 | 7.19e-03 | 1.74e-10 | 0.94 | 0.92-0.95 | C5orf30 | intron variant | RA, g: PBC |
| 6 | 34640870 | rs13207858 | T | C | 0.06 | 6.89e-04 | 1.53e-08 | 1.12 | 1.08-1.17 | C6orf106 | intron variant | g: CAD, high BP |
| 6 | 137973068 | rs2327832 | G | A | 0.17 | 2.29e-02 | 1.89e-15 | 1.11 | 1.08-1.14 | | intergenic variant | UC, RA, IgAD |
Chr: Chromosome; MAF; Minor Allele Frequency; MTAG: Multi-Trait Analysis of GWAS; OR: Odds Ratio; CI: Confidence Interval; RA: Rheumatoid Arthritis;
IBD: Inflammatory Bowel Disease; PBC: Primary Biliary Cirrhosis/Cholangitis; p: proxy SNP to the reported SNP associated with; SLE: Systemic Lupus Erythematosus;
IgAD: Immunoglobulin A Deficiency; JIA: Juvenile Idiopathic Arthritis; CD: Crohn’s Disease; CAD: Coronary Artery Disease; BP: Blood Pressure; UC: Ulcerative Colitis
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Novel loci are presented in bright purple
276
Appendix Table 6 | MTAG results for JIA (results presented for original JIA p-value<0.05)
Chr Position rsid effect
allele
other
allele
MAF JIA
p-value
MTAG
p-value
MTAG
OR
MTAG
95% CI
Gene Consequence Associated Trait
6 135716532 rs2614266 A T 0.42 4.61e-07 3.64e-08 1.05 1.03-1.08 AHI1 intron variant g: IgAD, MS
6 33653448 rs749338 T C 0.46 3.42e-02 4.79e-13 1.07 1.05-1.09 ITPR3 synonymous variant RA, Height
7 128618559 rs17338998 T C 0.10 4.04e-03 1.85e-32 1.21 1.17-1.25 TNPO3 intron variant RA, p: SLE,MS
8 129548309 rs16903081 C T 0.10 2.12e-03 1.45e-08 0.92 0.89-0.94 RP11-
89M16.1
intron & non coding
transcript variant
8 11351019 rs4840568 A G 0.27 2.97e-02 9.55e-15 1.09 1.06-1.11 BLK upstream gene variant RA, p:SLE, g:
Neurotism, Sjogren’s
10 6100725 rs3134883 A G 0.29 9.94e-04 5.03e-08 1.06 1.04-1.08 IL2RA intron variant T1D, p: JIA
12 111884608 rs3184504 T C 0.46 5.81e-04 1.61e-13 1.07 1.05-1.09 SH2B3 missense variant JIA, MI, T1D
13 40300328 rs9603603 G T 0.36 8.25e-04 3.96e-08 0.95 0.93-0.97 COG6 intron variant RA, p: JIA,PSO
15 69985284 rs12899564 G C 0.07 4.82e-02 1.17e-12 0.88 0.84-0.91 RP11-279F6.3 intron & non coding
transcript variant
RA, Height
15 38828140 rs8043085 T G 0.22 3.01e-02 1.74e-11 1.08 1.06-1.11 RASGRP1 intron variant RA
19 10444826 rs2278442 G A 0.34 8.59e-03 2.69e-16 0.92 0.90-0.94 ICAM3 intron variant RA, g: IBD, UC
21 36715761 rs9979383 C T 0.36 1.03e-02 3.87e-08 0.95 0.93-0.97 RUNX1 intron variant JIA, RA
Chr: Chromosome; MAF; Minor Allele Frequency; MTAG: Multi-Trait Analysis of GWAS; OR: Odds Ratio; CI: Confidence Interval; RA: Rheumatoid Arthritis;
IBD: Inflammatory Bowel Disease; PBC: Primary Biliary Cirrhosis/Cholangitis; p: proxy SNP to the reported SNP associated with; SLE: Systemic Lupus Erythematosus;
IgAD: Immunoglobulin A Deficiency; JIA: Juvenile Idiopathic Arthritis; MI: Myocardial Infraction; PSO: Psoriasis; CD: Crohn’s Disease; BP: Blood Pressure; UC: Ulcerative Colitis
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Novel loci are presented in bright purple
277
Appendix Table 7 | MTAG results for JIA (original JIA p-value > 0.05)
| Chr | Position | rsid | Effect allele | Other allele | MAF | JIA p-value | MTAG p-value | MTAG OR | MTAG 95% CI | Gene | Consequence | Associated Trait |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 173353881 | rs1557121 | T | C | 0.24 | 6.46e-02 | 8.80e-15 | 0.92 | 0.90-0.94 | | intergenic variant | RA |
| 1 | 173191475 | rs2205960 | T | G | 0.23 | 3.02e-01 | 2.43e-09 | 1.07 | 1.05-1.09 | | intergenic variant | SLE (Asian), IgAD |
| 2 | 113829869 | rs13019891 | T | G | 0.45 | 4.16e-01 | 1.76e-11 | 0.94 | 0.92-0.96 | IL1F10 | upstream gene variant | |
| 2 | 163110536 | rs2111485 | A | G | 0.40 | 1.63e-01 | 8.22e-09 | 0.95 | 0.93-0.96 | | intergenic variant | IBD, p: PSO, T1D |
| 2 | 233288667 | rs2573219 | C | A | 0.09 | 9.34e-01 | 7.13e-14 | 1.14 | 1.10-1.17 | AC068134.5 | upstream gene variant | |
| 3 | 12481375 | rs4498025 | T | C | 0.24 | 1.44e-02 | 2.06e-06 | 1.05 | 1.03-1.08 | | intergenic variant | |
| 3 | 129084581 | rs9852014 | G | A | 0.07 | 9.12e-02 | 5.33e-11 | 1.13 | 1.09-1.17 | | intergenic variant | |
| 4 | 102714254 | rs4518254 | G | T | 0.44 | 5.66e-02 | 1.43e-14 | 0.93 | 0.91-0.95 | BANK1 | intron variant | |
| 5 | 150438988 | rs1422673 | T | C | 0.19 | 8.09e-01 | 2.15e-12 | 1.09 | 1.06-1.11 | TNIP1 | intron variant | Myasthenia Gravis |
| 5 | 159879978 | rs2431697 | C | T | 0.43 | 1.94e-01 | 8.64e-09 | 0.95 | 0.93-0.96 | | intergenic variant | PSO, SLE |
| 6 | 26309908 | rs10484439 | A | G | 0.07 | 7.88e-01 | 3.70e-10 | 1.12 | 1.08-1.17 | | intergenic variant | SCZ |
| 6 | 26582035 | rs13198716 | T | C | 0.07 | 8.70e-01 | 1.37e-12 | 1.15 | 1.10-1.19 | | intergenic variant | SCZ |
| 6 | 27868792 | rs13199649 | T | C | 0.07 | 4.64e-01 | 2.06e-13 | 1.14 | 1.10-1.19 | RNU7-26P | upstream gene variant | SCZ |
| 6 | 25983010 | rs13212534 | A | G | 0.06 | 4.39e-01 | 1.61e-09 | 1.12 | 1.08-1.17 | TRIM38 | intron variant | SCZ |
| 6 | 167540842 | rs1571878 | C | T | 0.42 | 3.51e-01 | 2.05e-08 | 1.06 | 1.04-1.08 | CCR6 | intron variant | RA |
| 6 | 33557225 | rs430655 | A | G | 0.36 | 6.19e-01 | 2.15e-10 | 0.94 | 0.92-0.96 | GGNBP1 | downstream gene variant | RA |
| 6 | 138230389 | rs7749323 | A | G | 0.02 | 1.11e-01 | 1.05e-17 | 1.34 | 1.25-1.43 | | intergenic variant | RA, SLE |
| 22 | 21979096 | rs11089637 | C | T | 0.17 | 7.36e-02 | 4.44e-19 | 1.12 | 1.09-1.15 | YDJC | downstream gene variant | RA, IBD |
| 22 | 39740078 | rs137687 | A | G | 0.44 | 7.89e-01 | 1.48e-08 | 0.95 | 0.93-0.96 | | intergenic variant | RA |
Chr: Chromosome; MAF: Minor Allele Frequency; MTAG: Multi-Trait Analysis of GWAS; OR: Odds Ratio; CI: Confidence Interval; RA: Rheumatoid Arthritis;
SLE: Systemic Lupus Erythematosus; IgAD: Immunoglobulin A Deficiency; PBC: Primary Biliary Cirrhosis/Cholangitis; IBD: Inflammatory Bowel Disease;
p: proxy SNP to the reported SNP associated with; PSO: Psoriasis; T1D: Type 1 Diabetes; SCZ: Schizophrenia
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Novel loci are presented in bright purple
Appendix Figure 9 | Manhattan plot of association results for JIA. Each circle represents the −log10(p) of a variant. The thresholds for suggestive (p-value = 1e-06) and genome-wide significance (p-value = 5e-08) are delineated with blue and red lines, respectively. The plot includes SNPs that were significant in both GWAS and MTAG.
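As a brief illustration of the scale used in the Manhattan plots, the sketch below converts p-values to −log10(p) and places them relative to the two thresholds named in the caption. The function names are hypothetical, and treating the thresholds as strict inequalities (p < threshold) is an assumption of this sketch. On this scale the suggestive line sits at 6 and the genome-wide line at approximately 7.30.

```python
import math

SUGGESTIVE_P = 1e-06    # blue line in the Manhattan plots
GENOME_WIDE_P = 5e-08   # red line in the Manhattan plots


def neg_log10(p: float) -> float:
    """Height of a variant on the Manhattan plot's vertical axis."""
    return -math.log10(p)


def significance_band(p: float) -> str:
    """Place a p-value relative to the two thresholds (strict '<' assumed)."""
    if p < GENOME_WIDE_P:
        return "genome-wide"
    if p < SUGGESTIVE_P:
        return "suggestive"
    return "below suggestive"


suggestive_line = neg_log10(SUGGESTIVE_P)
genome_wide_line = neg_log10(GENOME_WIDE_P)

# MTAG p-values copied from Appendix Tables 6 and 7.
examples = {"rs13186299": 2.81e-13, "rs3134883": 5.03e-08, "rs4498025": 2.06e-06}
bands = {rsid: significance_band(p) for rsid, p in examples.items()}
print(round(suggestive_line, 2), round(genome_wide_line, 2), bands)
```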
SLE loci identified with MTAG
Appendix Table 8 and Appendix Figure 10 present the 26 regions with genome-wide
significant association for SLE. Two of these were previously known (ARID5B and
ATXN2/SH2B3) and the other 24 were newly established at genome-wide significance.
Thirteen of the novel associations were found to contribute to disease susceptibility:
RP4-590F24.1 (rs10494164, OR 1.17, p = 2.53e-11), AC012370.3 (rs1866050, OR 1.09,
p = 3.84e-08), KALRN (rs1444766, OR 1.09, p = 9.43e-09), C6orf106 (rs13207858, OR
1.19, p = 7.78e-10), CLIP2 (rs12537907, OR 1.41, p = 2.07e-08), XKR6 (rs2001433, OR
1.07, p = 1.60e-08), RP11-744N12.3 (rs7945677, OR 1.07, p = 1.62e-08), RASGRP1
(rs8043085, OR 1.09, p = 1.50e-08), PRSS54 (rs11644244, OR 1.10, p = 2.34e-09),
NEURL4 (rs8081264, OR 1.10, p = 8.69e-10), intergenic rs2327832 (OR 1.16, p =
1.52e-16), intergenic rs13274269 (OR 1.08, p = 2.48e-08) and intergenic rs9308364
(OR 1.08, p = 2.34e-09).
The remaining 11 novel associations were found to be protective for SLE:
CTLA4 (rs3087243, OR 0.92, p = 3.12e-09), MGAT5 (rs4954125, OR 0.91, p =
2.89e-11), LINC01100 (rs564976, OR 0.92, p = 4.64e-09), ITPR3 (rs4259245, OR 0.91,
p = 4.72e-12), ETV7 (rs881648, OR 0.89, p = 4.26e-10), RP11-543F8.2 (rs10905371,
OR 0.91, p = 2.66e-11), GSDMB (rs7224129, OR 0.93, p = 8.08e-09), CACNA1I
(rs12170452, OR 0.92, p = 4.86e-10), intergenic variant rs7000141 (OR 0.91, p =
2.86e-11), intergenic rs10152590 (OR 0.87, p = 2.08e-08) and intergenic rs137687
(OR 0.92, p = 3.31e-10).
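The odds ratios, confidence intervals and p-values reported for each locus are linked through the log-odds scale. The sketch below shows how an approximate z-score and two-sided p-value can be recovered from an OR and its 95% CI, using rs12537907 (CLIP2; OR 1.41, 95% CI 1.25-1.59) as input. It assumes the interval is a Wald interval that is symmetric on the log-odds scale, which the thesis does not state explicitly, and the function names are hypothetical. Because the tabulated values are rounded to two decimals, the result should only be of the same order of magnitude as the reported MTAG p-value, not identical to it.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def log_or_and_se(odds_ratio: float, ci_low: float, ci_high: float):
    """Recover the log-odds effect and its standard error from OR and 95% CI.

    Assumes a Wald interval: log(OR) +/- Z_95 * SE.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def two_sided_p(z: float) -> float:
    """Two-sided p-value of a z-score under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


# rs12537907 (CLIP2), values copied from Appendix Table 8.
beta, se = log_or_and_se(1.41, 1.25, 1.59)
z = beta / se
p = two_sided_p(z)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.2e}")
```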
Appendix Table 8 | MTAG results for SLE
| Chr | Position | rsid | Effect allele | Other allele | MAF | SLE p-value | MTAG p-value | MTAG OR | MTAG 95% CI | Gene | Consequence | Associated Trait |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 114546528 | rs10494164 | G | A | 0.09 | 2.67e-03 | 2.53e-11 | 1.17 | 1.12-1.22 | RP4-590F24.1 | upstream gene variant | RA |
| 2 | 65677729 | rs1866050 | G | A | 0.24 | 5.38e-06 | 3.84e-08 | 1.09 | 1.06-1.12 | AC012370.3 | non coding transcript variant | |
| 2 | 204738919 | rs3087243 | A | G | 0.47 | 8.97e-03 | 3.12e-09 | 0.92 | 0.90-0.95 | CTLA4 | downstream gene variant | RA, T1D, IgAD |
| 2 | 135046984 | rs4954125 | T | G | 0.33 | 8.80e-08 | 2.89e-11 | 0.91 | 0.89-0.94 | MGAT5 | intron variant | g: SCZ |
| 3 | 123925271 | rs1444766 | G | A | 0.26 | 2.00e-07 | 9.43e-09 | 1.09 | 1.06-1.12 | KALRN | intron variant | g: SCZ |
| 3 | 159729059 | rs564976 | A | G | 0.35 | 9.49e-07 | 4.64e-09 | 0.92 | 0.90-0.95 | LINC01100 | upstream gene variant | p: PBC |
| 6 | 34640870 | rs13207858 | T | C | 0.06 | 1.12e-07 | 7.78e-10 | 1.19 | 1.13-1.26 | C6orf106 | intron variant | g: BMI, CAD, high BP |
| 6 | 137973068 | rs2327832 | G | A | 0.17 | 3.40e-06 | 1.52e-16 | 1.16 | 1.12-1.20 | | intergenic variant | RA, UC, IgAD |
| 6 | 33624221 | rs4259245 | G | A | 0.39 | 2.46e-06 | 4.72e-12 | 0.91 | 0.89-0.93 | ITPR3 | intron variant | RA, g: CD, asthma |
| 6 | 36350605 | rs881648 | T | C | 0.14 | 3.54e-06 | 4.26e-10 | 0.89 | 0.86-0.92 | ETV7 | intron variant | RA |
| 7 | 73811948 | rs12537907 | G | T | 0.01 | 1.21e-06 | 2.07e-08 | 1.41 | 1.25-1.59 | CLIP2 | intron variant | |
| 8 | 11449325 | rs13274269 | T | G | 0.34 | 7.28e-07 | 2.48e-08 | 1.08 | 1.05-1.11 | | intergenic variant | Neuroticism |
| 8 | 10903475 | rs2001433 | T | A | 0.49 | 3.60e-07 | 1.60e-08 | 1.07 | 1.05-1.11 | XKR6 | intron variant | Neuroticism |
| 8 | 11070721 | rs7000141 | A | G | 0.33 | 6.50e-08 | 1.86e-11 | 0.91 | 0.88-0.94 | | intergenic variant | |
| 10 | 8480044 | rs10905371 | G | A | 0.32 | 2.06e-07 | 2.66e-11 | 0.91 | 0.88-0.94 | RP11-543F8.2 | intron & non coding transcript variant | g: monocyte count |
| 10 | 63813744 | rs16916931 | T | A | 0.37 | 1.21e-07 | 3.90e-12 | 1.10 | 1.07-1.13 | ARID5B | intron variant | p: RA, p: SLE |
| 11 | 128499905 | rs7945677 | C | T | 0.50 | 1.34e-06 | 1.62e-08 | 1.07 | 1.05-1.11 | RP11-744N12.3 | intron & non coding transcript variant | RA (mixed) |
| 12 | 112007756 | rs653178 | C | T | 0.47 | 1.20e-07 | 1.31e-11 | 1.09 | 1.07-1.12 | ATXN2/SH2B3 | intron variant | CAD, MI, p: SLE, JIA |
| 15 | 70048116 | rs10152590 | T | A | 0.07 | 1.86e-03 | 2.08e-08 | 0.87 | 0.83-0.91 | | intergenic variant | RA, Height |
| 15 | 38828140 | rs8043085 | T | G | 0.22 | 1.02e-03 | 1.50e-08 | 1.09 | 1.06-1.13 | RASGRP1 | intron variant | RA, g: T2D, CD |
| 16 | 58322851 | rs11644244 | G | A | 0.20 | 3.10e-06 | 2.56e-08 | 1.10 | 1.06-1.13 | PRSS54 | intron variant | |
| 16 | 86003446 | rs9308364 | C | T | 0.49 | 3.41e-07 | 2.34e-09 | 1.08 | 1.05-1.11 | | intergenic variant | |
| 17 | 38075426 | rs7224129 | A | G | 0.48 | 1.95e-06 | 8.08e-09 | 0.93 | 0.90-0.95 | GSDMB | upstream gene variant | UC, RA (mixed) |
| 17 | 7235316 | rs8081264 | C | G | 0.21 | 1.66e-07 | 8.69e-10 | 1.10 | 1.07-1.14 | NEURL4 | upstream gene variant | g: monocyte % of white cells |
| 22 | 40019773 | rs12170452 | A | G | 0.42 | 4.00e-04 | 4.86e-10 | 0.92 | 0.90-0.94 | CACNA1I | intron variant | p: SCZ |
| 22 | 39740078 | rs137687 | A | G | 0.44 | 4.83e-03 | 3.31e-10 | 0.92 | 0.90-0.94 | | intergenic variant | RA |
Chr: Chromosome; MAF: Minor Allele Frequency; MTAG: Multi-Trait Analysis of GWAS; OR: Odds Ratio; CI: Confidence Interval; &: and; g: gene associated with;
p: proxy SNP to the reported SNP associated with; RA: Rheumatoid Arthritis; SLE: Systemic Lupus Erythematosus; mixed: mixed populations (Europeans and Asians);
IBD: Inflammatory Bowel Disease; PBC: Primary Biliary Cirrhosis/Cholangitis; IgAD: Immunoglobulin A Deficiency; JIA: Juvenile Idiopathic Arthritis;
CAD: Coronary Artery Disease; MI: Myocardial Infarction; PSO: Psoriasis; T1D: Type 1 Diabetes; T2D: Type 2 Diabetes; CD: Crohn’s Disease; UC: Ulcerative Colitis;
SCZ: Schizophrenia; BMI: Body Mass Index; BP: Blood Pressure
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Novel loci are presented in bright purple
Appendix Figure 10 | Manhattan plot of association results for SLE. Each circle represents the −log10(p) of a variant. The thresholds for suggestive (p-value = 1e-06) and genome-wide significance (p-value = 5e-08) are delineated with blue and red lines, respectively. The plot includes SNPs that were significant in both GWAS and MTAG.
RA loci identified with MTAG
Appendix Table 9 and Appendix Figure 11 present the 18 regions with genome-wide
significant association for RA. Seven of these were previously reported, mainly in
mixed-population cohorts (ALS2CR12, ETV7, BLK, SH2B3, RAD51B, RP11-973H7.1 and YDJC),
and the other 11 were newly established at genome-wide significance. Seven of the
novel associations were found to contribute to disease susceptibility: MANEAL
(rs2306627, OR 1.04, p = 9.73e-10), IL12RB2 (rs6679356, OR 1.04, p = 1.94e-08),
KIAA1109 (rs7677168, OR 1.06, p = 1.95e-08), TNIP1 (rs4958880, OR 1.05, p =
7.78e-10), intergenic variant rs244689 (OR 1.04, p = 6.65e-09), intergenic rs12215241
(OR 1.04, p = 3.36e-08) and intergenic rs802791 (OR 1.03, p = 1.12e-08).
The remaining four novel associations were found to be protective for RA:
TNFSF4 (rs1234313, OR 0.97, p = 3.40e-09), BANK1 (rs4572884, OR 0.97, p =
3.64e-09), RP11-89M16.1 (rs16903065, OR 0.95, p = 1.48e-09) and UBASH3A
(rs9980184, OR 0.94, p = 2.09e-14). The latter is independent of the locus that has
previously been reported to be associated with RA.
Appendix Figure 11 | Manhattan plot of association results for RA. Each circle represents the −log10(p) of a variant. The thresholds for suggestive (p-value = 1e-06) and genome-wide significance (p-value = 5e-08) are delineated with blue and red lines, respectively. The plot includes SNPs that were significant in both GWAS and MTAG.
Appendix Table 9 | MTAG results for RA
| Chr | Position | rsid | Effect allele | Other allele | MAF | RA p-value | MTAG p-value | MTAG OR | MTAG 95% CI | Gene | Consequence | Associated Trait |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 173166247 | rs1234313 | A | G | 0.31 | 9.34e-06 | 3.40e-09 | 0.97 | 0.96-0.98 | TNFSF4 | intron variant | g: SLE, CD, MS |
| 1 | 38260503 | rs2306627 | T | C | 0.28 | 5.98e-08 | 9.73e-10 | 1.04 | 1.02-1.05 | MANEAL | intron variant | |
| 1 | 67820194 | rs6679356 | C | T | 0.17 | 4.51e-06 | 1.94e-08 | 1.04 | 1.03-1.05 | IL12RB2 | intron variant | |
| 2 | 202171573 | rs13408294 | G | C | 0.10 | 4.36e-07 | 1.17e-08 | 1.05 | 1.03-1.07 | ALS2CR12 | intron variant | p: RA |
| 4 | 102783351 | rs4572884 | T | C | 0.38 | 2.55e-04 | 3.64e-09 | 0.97 | 0.96-0.98 | BANK1 | intron variant | g: SLE, CD, SCZ |
| 4 | 123134158 | rs7677168 | A | G | 0.08 | 4.63e-06 | 1.95e-08 | 1.06 | 1.04-1.08 | KIAA1109 | intron variant | g: IBD, T1D, UC |
| 5 | 133422816 | rs244689 | A | G | 0.16 | 5.99e-06 | 6.65e-09 | 1.04 | 1.03-1.06 | | intergenic variant | |
| 5 | 150438477 | rs4958880 | A | C | 0.19 | 8.15e-07 | 1.97e-13 | 1.05 | 1.04-1.06 | TNIP1 | intron variant | PSO, PsA |
| 6 | 27023081 | rs12215241 | A | G | 0.20 | 3.61e-04 | 3.36e-08 | 1.04 | 1.02-1.05 | | intergenic variant | SCZ, p: T1D |
| 6 | 106569270 | rs802791 | T | C | 0.31 | 3.78e-05 | 1.12e-08 | 1.03 | 1.02-1.05 | | intergenic variant | IgAD, p: SLE |
| 6 | 36350605 | rs881648 | T | C | 0.14 | 5.28e-08 | 5.56e-11 | 0.95 | 0.94-0.97 | ETV7 | intron variant | p: RA |
| 8 | 129540464 | rs16903065 | A | C | 0.10 | 5.51e-07 | 1.48e-09 | 0.95 | 0.93-0.97 | RP11-89M16.1 | intron & non coding transcript variant | p: ovarian cancer |
| 8 | 11351019 | rs4840568 | A | G | 0.27 | 8.89e-07 | 3.18e-14 | 1.05 | 1.03-1.06 | BLK | upstream gene variant | RA (mixed), p: SLE |
| 12 | 111884608 | rs3184504 | T | C | 0.46 | 3.02e-07 | 5.25e-12 | 1.04 | 1.03-1.05 | SH2B3 | missense variant | JIA, T1D, p: RA (mixed) |
| 14 | 68747868 | rs7148416 | T | A | 0.34 | 5.31e-07 | 2.37e-09 | 0.97 | 0.96-0.98 | RAD51B | intron variant | RA (mixed), height |
| 18 | 12779947 | rs2542151 | G | T | 0.14 | 5.83e-08 | 4.57e-10 | 1.05 | 1.03-1.06 | RP11-973H7.1 | upstream gene variant | RA (mixed), T1D, IBD |
| 21 | 43843391 | rs9980184 | A | G | 0.06 | 2.83e-07 | 2.15e-08 | 0.94 | 0.92-0.96 | UBASH3A | intron variant | g: RA, T1D, PBC |
| 22 | 21979096 | rs11089637 | C | T | 0.17 | 5.55e-07 | 2.09e-14 | 1.06 | 1.04-1.07 | YDJC | downstream gene variant | RA, CD |
Chr: Chromosome; MAF: Minor Allele Frequency; MTAG: Multi-Trait Analysis of GWAS; OR: Odds Ratio; CI: Confidence Interval; &: and; g: gene associated with;
p: proxy SNP to the reported SNP associated with; SLE: Systemic Lupus Erythematosus; CD: Crohn’s Disease; MS: Multiple Sclerosis; IBD: Inflammatory Bowel Disease;
PBC: Primary Biliary Cirrhosis/Cholangitis; RA: Rheumatoid Arthritis; SCZ: Schizophrenia; T1D: Type 1 Diabetes; T2D: Type 2 Diabetes; UC: Ulcerative Colitis;
PSO: Psoriasis; PsA: Psoriatic Arthritis; IgAD: Immunoglobulin A Deficiency; mixed: mixed populations (Europeans and Asians); JIA: Juvenile Idiopathic Arthritis;
CAD: Coronary Artery Disease; MI: Myocardial Infarction
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Novel loci are presented in bright purple
AS loci identified with MTAG
The power gain for detecting additional genome-wide significant loci in AS using MTAG
was only 3%, which is reflected in Appendix Table 10. Only one novel locus was
identified, TDRD10 (rs4845639, OR 0.92, p = 2.15e-08), which was found to be
protective for AS.
The Manhattan plot depicting the association results for both the AS GWAS and MTAG
analyses can be seen in Appendix Figure 12.
Appendix Figure 12 | Manhattan plot of association results for AS. Each circle represents the −log10(p) of a variant. The thresholds for suggestive (p-value = 1e-06) and genome-wide significance (p-value = 5e-08) are delineated with blue and red lines, respectively. The plot includes SNPs that were significant in both GWAS and MTAG.
Appendix Table 10 | MTAG results for AS
| Chr | Position | rsid | Effect allele | Other allele | MAF | AS p-value | MTAG p-value | MTAG OR | MTAG 95% CI | Gene | Consequence | Associated Trait |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 154490352 | rs4845639 | C | T | 0.41 | 1.25e-07 | 2.15e-08 | 0.92 | 0.90-0.95 | TDRD10 | intron variant | p: IL 6 levels |
Chr: Chromosome; MAF: Minor Allele Frequency; MTAG: Multi-Trait Analysis of GWAS; OR: Odds Ratio; CI: Confidence Interval; p: proxy SNP to the reported SNP associated with;
IL: Interleukin
The Associated Traits have been detected using PhenoScanner (version 1.1) with parameters catalogue: GWAS, p-value cut-off: 5e-08, proxies: Yes; r-squared: 0.8
Novel association is presented in bright purple
Appendix Figure 13, Appendix Figure 14, Appendix Figure 15 and Appendix Figure 16
are forest and leave-one-out plots produced when assessing the causal role of BMI on
PsA using both GIANT and UK Biobank datasets for BMI. The plots are used in
Mendelian Randomization to visually check the validity of the instrumental variables,
the existence of any outliers and the presence of pleiotropy.
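For orientation, the quantities displayed in these plots can be sketched as follows: each SNP contributes a Wald ratio (SNP-outcome effect divided by SNP-exposure effect), and the ratios are pooled with inverse-variance weights (IVW). The summary statistics below are hypothetical and chosen only to keep the arithmetic easy to follow; they are not taken from the GIANT, UK Biobank or PsA data. The first-order standard error of the Wald ratio, which ignores uncertainty in the SNP-exposure effect, and the fixed-effect pooling are simplifying assumptions of this sketch.

```python
import math


def wald_ratio(beta_exposure: float, beta_outcome: float, se_outcome: float):
    """Per-SNP causal estimate and its first-order standard error."""
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw(ratios):
    """Fixed-effect inverse-variance weighted mean of the Wald ratios."""
    weights = [1.0 / se ** 2 for _, se in ratios]
    total = sum(weights)
    estimate = sum(w * est for w, (est, _) in zip(weights, ratios)) / total
    return estimate, math.sqrt(1.0 / total)


# Hypothetical (beta_exposure, beta_outcome, se_outcome) triples.
snps = [
    (0.10, 0.050, 0.020),
    (0.20, 0.080, 0.020),
    (0.05, 0.030, 0.010),
]

ratios = [wald_ratio(bx, by, se_y) for bx, by, se_y in snps]
pooled, pooled_se = ivw(ratios)
print(ratios, pooled, pooled_se)
```

In a forest plot, each element of `ratios` would be drawn as one point with its interval, and `pooled` would be the summary estimate.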
Appendix Figure 13 | Forest plot of BMI (GIANT) on PsA using the Wald ratio for each instrumental variable. The MR estimate combining all SNPs using IVW is shown in red.
Appendix Figure 14 | Leave-one-out plot for BMI (GIANT) on PsA. Each black point represents the MR analysis using IVW excluding the particular SNP. The overall effect using all SNPs is shown in red.
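A leave-one-out analysis of this kind can be sketched as below: the IVW estimate is recomputed once per SNP with that SNP removed, and a large shift away from the all-SNP estimate points to an influential instrument. The per-SNP Wald ratios and standard errors are hypothetical, as are the SNP labels and function names; a fixed-effect IVW is assumed.

```python
def ivw_estimate(ratios):
    """Fixed-effect IVW mean of (estimate, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in ratios]
    weighted = sum(w * est for w, (est, _) in zip(weights, ratios))
    return weighted / sum(weights)


def leave_one_out(ratios_by_snp):
    """IVW estimate recomputed with each SNP excluded in turn."""
    return {
        left_out: ivw_estimate(
            [pair for snp, pair in ratios_by_snp.items() if snp != left_out]
        )
        for left_out in ratios_by_snp
    }


# Hypothetical per-SNP Wald ratios: snp -> (estimate, standard error).
ratios_by_snp = {
    "snp_a": (0.5, 0.2),
    "snp_b": (0.4, 0.1),
    "snp_c": (0.6, 0.2),
}

overall = ivw_estimate(list(ratios_by_snp.values()))
loo = leave_one_out(ratios_by_snp)

# The SNP whose removal moves the estimate furthest from the overall value.
most_influential = max(loo, key=lambda snp: abs(loo[snp] - overall))
print(overall, loo, most_influential)
```

In the plot, each value of `loo` corresponds to one black point and `overall` to the red point.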
Appendix Figure 15 | Forest plot of BMI (UK Biobank) on PsA using the Wald ratio for each instrumental variable. The MR estimate combining all SNPs using IVW is shown in red.